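A detailed guide to the parameters of pandas dropna()
DataFrame.dropna(axis=0, how='any', thresh=None, subset=None, inplace=False)
1. The axis parameter determines whether rows or columns containing missing values are dropped
axis=0 or axis='index' drops rows that contain missing values.
axis=1 or axis='columns' drops columns that contain missing values.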
import pandas as pd
import numpy as np
df = pd.DataFrame({
    "name": ['Alfred', 'Batman', 'Catwoman'],
    "toy": [np.nan, 'Batmobile', 'Bullwhip'],
    "born": [pd.NaT, pd.Timestamp("1940-04-25"), pd.NaT]})
df
| | name | toy | born |
|---|---|---|---|
| 0 | Alfred | NaN | NaT |
| 1 | Batman | Batmobile | 1940-04-25 |
| 2 | Catwoman | Bullwhip | NaT |
df.dropna()  # axis=0 is the default
| | name | toy | born |
|---|---|---|---|
| 1 | Batman | Batmobile | 1940-04-25 |
df.dropna(axis=1)  # output: only columns with no missing values remain
| | name |
|---|---|
| 0 | Alfred |
| 1 | Batman |
| 2 | Catwoman |
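2. The how parameter determines how many missing values a row or column must contain before it is dropped
how accepts either 'all' or 'any'.
how='all' drops only the rows (or columns) in which every value is missing.
how='any' drops every row (or column) that contains at least one missing value.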
df.dropna(how='all')
| | name | toy | born |
|---|---|---|---|
| 0 | Alfred | NaN | NaT |
| 1 | Batman | Batmobile | 1940-04-25 |
| 2 | Catwoman | Bullwhip | NaT |
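In the example above no row is entirely missing, so how='all' removes nothing. The following is a minimal sketch that makes the difference visible. The frame `demo` and its columns `a` and `b` are hypothetical and not part of the original example.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: row 1 is entirely missing, row 2 is only partly missing.
demo = pd.DataFrame({
    "a": [1.0, np.nan, 3.0],
    "b": [4.0, np.nan, np.nan],
})

dropped_all = demo.dropna(how="all")  # drops only the fully missing row 1
dropped_any = demo.dropna(how="any")  # drops rows 1 and 2

print(dropped_all.index.tolist())  # [0, 2]
print(dropped_any.index.tolist())  # [0]
```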
3. thresh=n keeps only the rows (or columns, with axis=1) that contain at least n non-NA values
df.dropna(thresh=2)
| | name | toy | born |
|---|---|---|---|
| 1 | Batman | Batmobile | 1940-04-25 |
| 2 | Catwoman | Bullwhip | NaT |
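One way to read thresh is as a filter on the count of non-NA values. The sketch below rebuilds the same frame, checks that thresh=2 matches a manual count with notna(), and also applies thresh along the columns. The names `manual` and `kept_cols` are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "name": ["Alfred", "Batman", "Catwoman"],
    "toy": [np.nan, "Batmobile", "Bullwhip"],
    "born": [pd.NaT, pd.Timestamp("1940-04-25"), pd.NaT],
})

# Count the non-NA values in each row: 1, 3 and 2 respectively.
non_na_per_row = df.notna().sum(axis=1)

# Keeping rows with at least 2 non-NA values by hand ...
manual = df[non_na_per_row >= 2]
# ... gives the same result as thresh=2.
print(manual.equals(df.dropna(thresh=2)))  # True

# With axis=1 the threshold applies to columns: 'born' has only one
# non-NA value, so it is dropped.
kept_cols = df.dropna(axis=1, thresh=2)
print(kept_cols.columns.tolist())  # ['name', 'toy']
```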
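4. subset specifies which columns are checked for missing values when dropping rows
df.dropna(subset=['name', 'born'])  # drop rows with a missing value in the 'name' or 'born' column
| | name | toy | born |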
|---|---|---|---|
| 1 | Batman | Batmobile | 1940-04-25 |
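subset can also be combined with how. As a small sketch on the same data, the call below drops a row only when both 'toy' and 'born' are missing, so only Alfred's row is removed. The variable name `both_missing_dropped` is an assumption made for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "name": ["Alfred", "Batman", "Catwoman"],
    "toy": [np.nan, "Batmobile", "Bullwhip"],
    "born": [pd.NaT, pd.Timestamp("1940-04-25"), pd.NaT],
})

# Only 'toy' and 'born' are inspected; a row is dropped only if both are NA.
both_missing_dropped = df.dropna(subset=["toy", "born"], how="all")

print(both_missing_dropped["name"].tolist())  # ['Batman', 'Catwoman']
```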
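5. inplace=True modifies the original DataFrame directly, and dropna() then returns None. With the default inplace=False, the original is left unchanged and a new DataFrame is returned.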
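The difference between the two modes can be sketched as follows, again on the same frame. The names `result` and `ret` are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "name": ["Alfred", "Batman", "Catwoman"],
    "toy": [np.nan, "Batmobile", "Bullwhip"],
    "born": [pd.NaT, pd.Timestamp("1940-04-25"), pd.NaT],
})

# Default (inplace=False): a new DataFrame is returned, df keeps all 3 rows.
result = df.dropna()
print(len(df), len(result))  # 3 1

# inplace=True: df itself is modified and the call returns None.
ret = df.dropna(inplace=True)
print(ret)                 # None
print(df.index.tolist())   # [1]
```

Note that the surviving row keeps its original index label 1. If a fresh 0-based index is wanted, df.reset_index(drop=True) can be applied afterwards.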
Published by: 全栈程序员-站长. Please credit the source when reposting: https://javaforall.net/220217.html. Original link: https://javaforall.net
