ICode9

dropna: handling missing data

2021-01-21 19:33:52  Source: Internet

Tags: 25 toy name Batman dropna NaT data processing missing values


pandas official API

  1. Function signature
DataFrame.dropna(axis=0, how='any', thresh=None, subset=None, inplace=False)
  2. Parameters
  • axis : {0 or ‘index’, 1 or ‘columns’}, default 0

    • Determine if rows or columns which contain missing values are removed.
      • 0, or ‘index’ : Drop rows which contain missing values.
      • 1, or ‘columns’ : Drop columns which contain missing values.
    • Changed in version 1.0.0: Passing a tuple or list to drop on multiple axes is no longer supported. Only a single axis is allowed.
  • how : {‘any’, ‘all’}, default ‘any’

    • Determine if row or column is removed from DataFrame, when we have at least one NA or all NA.
      • ‘any’ : If any NA values are present, drop that row or column.
      • ‘all’ : If all values are NA, drop that row or column.
  • thresh : int, optional

    • Keep only the rows (or columns) that have at least this many non-NA values.
  • subset : array-like, optional

    • Labels along other axis to consider, e.g. if you are dropping rows these would be a list of columns to include.
  • inplace : bool, default False

    • If True, modify the DataFrame in place and return None.
  • Returns

    • DataFrame or None
    • DataFrame with NA entries dropped from it or None if inplace=True.
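Since only a single axis is allowed per call, cleaning along both axes takes two chained calls. The following is a minimal sketch of that idea; the frame `frame` and its column names are hypothetical and not part of the original article.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: row 1 and column "c" are entirely missing.
frame = pd.DataFrame({
    "a": [1.0, np.nan, 3.0],
    "b": [4.0, np.nan, np.nan],
    "c": [np.nan, np.nan, np.nan],
})

# One axis per call: drop all-NA rows first, then all-NA columns.
cleaned = frame.dropna(axis=0, how="all").dropna(axis=1, how="all")
print(cleaned)
```

Because how='all' is used on both axes, partially filled rows and columns survive; only the fully empty row and the fully empty column are removed.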
  3. Examples
import numpy as np
import pandas as pd

df = pd.DataFrame({"name": ['Alfred', 'Batman', 'Catwoman'],
                   "toy": [np.nan, 'Batmobile', 'Bullwhip'],
                   "born": [pd.NaT, pd.Timestamp("1940-04-25"),
                            pd.NaT]})
df
       name        toy       born
0    Alfred        NaN        NaT
1    Batman  Batmobile 1940-04-25
2  Catwoman   Bullwhip        NaT

Default behavior: drop every row that contains at least one missing value

df.dropna()
     name        toy       born
1  Batman  Batmobile 1940-04-25

Drop every column that contains at least one missing value

df.dropna(axis='columns')
       name
0    Alfred
1    Batman
2  Catwoman

Drop only the rows in which every column is missing (no such row exists here, so nothing is removed)

df.dropna(how='all')
       name        toy       born
0    Alfred        NaN        NaT
1    Batman  Batmobile 1940-04-25
2  Catwoman   Bullwhip        NaT
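The sample frame has no fully empty row, so how='all' leaves it unchanged. To make the effect visible, the sketch below rebuilds the same data with one extra, hypothetical fourth row in which every column is missing; the name `extended` is an assumption introduced for illustration.

```python
import numpy as np
import pandas as pd

# Same data as the article's df, plus a hypothetical all-missing row (index 3).
extended = pd.DataFrame({
    "name": ["Alfred", "Batman", "Catwoman", None],
    "toy": [np.nan, "Batmobile", "Bullwhip", np.nan],
    "born": [pd.NaT, pd.Timestamp("1940-04-25"), pd.NaT, pd.NaT],
})

# how="all" removes only the row where every value is missing.
result = extended.dropna(how="all")
print(result)
```

Row 0 (Alfred) is kept even though two of its three values are missing, because at least one value is present.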

Keep only the rows that have at least 2 non-NA values

df.dropna(thresh=2)
       name        toy       born
1    Batman  Batmobile 1940-04-25
2  Catwoman   Bullwhip        NaT
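One way to check how thresh counts values is to compute the per-row number of non-NA entries by hand. This is a rough sketch of an equivalent filter, not how pandas implements it internally; `non_na_per_row` and `manual` are names introduced here.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"name": ["Alfred", "Batman", "Catwoman"],
                   "toy": [np.nan, "Batmobile", "Bullwhip"],
                   "born": [pd.NaT, pd.Timestamp("1940-04-25"), pd.NaT]})

# Count the non-NA values in each row.
non_na_per_row = df.notna().sum(axis=1)

# Keep the rows with at least 2 non-NA values, mirroring thresh=2.
manual = df[non_na_per_row >= 2]

print(manual.equals(df.dropna(thresh=2)))
```

Alfred has a single non-NA value (the name), so that row falls below the threshold and is dropped.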

Drop the rows that have a missing value in the name or toy column

df.dropna(subset=['name', 'toy'])
       name        toy       born
1    Batman  Batmobile 1940-04-25
2  Catwoman   Bullwhip        NaT

Drop rows in place: the call returns None and df itself is modified

df.dropna(inplace=True)
df
     name        toy       born
1  Batman  Batmobile 1940-04-25
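The difference between the default call and inplace=True can be seen side by side. The sketch below is illustrative only; the variable names `original`, `working`, `cleaned`, and `returned` are assumptions.

```python
import numpy as np
import pandas as pd

original = pd.DataFrame({"name": ["Alfred", "Batman", "Catwoman"],
                         "toy": [np.nan, "Batmobile", "Bullwhip"],
                         "born": [pd.NaT, pd.Timestamp("1940-04-25"), pd.NaT]})

# Default: a new DataFrame is returned and the original is left untouched.
cleaned = original.dropna()

# inplace=True: the frame itself is modified and the call returns None.
working = original.copy()
returned = working.dropna(inplace=True)

print(returned is None)
print(len(original), len(cleaned), len(working))
```

A common slip is writing `df = df.dropna(inplace=True)`, which rebinds `df` to None; either reassign the result of the default call or use inplace=True without assignment.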

Source: https://blog.csdn.net/weixin_43745072/article/details/112969660
