na_filter=False ignored when index_col set #5239

cancan101 · 2013-10-16T02:24:42Z

Given the following CSV file:

u1,u2,u3,d1,d2,d3,d4
Good Things,C,,1,1,1,1
Good Things,R,,1,1,1,1
Bad Things,C,,1,1,1,1
Bad Things,T,,1,1,1,1
Okay Things,N,B,1,1,1,1
Okay Things,N,D,1,1,1,1
Okay Things,B,,1,1,1,1
Okay Things,D,,1,1,1,1

First I parse with na_filter=True:

In [13]: pd.read_csv("/home/alex/nan_issue.csv", na_filter=True)
Out[13]: 
            u1 u2   u3  d1  d2  d3  d4
0  Good Things  C  NaN   1   1   1   1
1  Good Things  R  NaN   1   1   1   1
2   Bad Things  C  NaN   1   1   1   1
3   Bad Things  T  NaN   1   1   1   1
4  Okay Things  N    B   1   1   1   1
5  Okay Things  N    D   1   1   1   1
6  Okay Things  B  NaN   1   1   1   1
7  Okay Things  D  NaN   1   1   1   1

then I parse with na_filter=False:

In [12]: pd.read_csv("/home/alex/nan_issue.csv", na_filter=False)
Out[12]: 
            u1 u2 u3  d1  d2  d3  d4
0  Good Things  C      1   1   1   1
1  Good Things  R      1   1   1   1
2   Bad Things  C      1   1   1   1
3   Bad Things  T      1   1   1   1
4  Okay Things  N  B   1   1   1   1
5  Okay Things  N  D   1   1   1   1
6  Okay Things  B      1   1   1   1
7  Okay Things  D      1   1   1   1

then with index_col set, where the NaN values come back even though na_filter=False:

In [11]: pd.read_csv("/home/alex/nan_issue.csv", na_filter=False,index_col=[0,1,2],)
Out[11]: 
                    d1  d2  d3  d4
u1          u2 u3                 
Good Things C  NaN   1   1   1   1
            R  NaN   1   1   1   1
Bad Things  C  NaN   1   1   1   1
            T  NaN   1   1   1   1
Okay Things N  B     1   1   1   1
               D     1   1   1   1
            B  NaN   1   1   1   1
            D  NaN   1   1   1   1

Finally setting na_values=[], keep_default_na=False seems to fix the issue:

In [14]: pd.read_csv("/home/alex/nan_issue.csv", na_filter=False,index_col=[0,1,2],na_values=[], keep_default_na=False)
Out[14]: 
                   d1  d2  d3  d4
u1          u2 u3                
Good Things C       1   1   1   1
            R       1   1   1   1
Bad Things  C       1   1   1   1
            T       1   1   1   1
Okay Things N  B    1   1   1   1
               D    1   1   1   1
            B       1   1   1   1
            D       1   1   1   1
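For anyone who wants to check this without the file on disk, here is a minimal sketch of the same steps. It assumes an in-memory copy of the CSV above; `CSV_TEXT` and `count_nan_in_u3` are names invented for this illustration and are not part of pandas.

```python
from io import StringIO

import pandas as pd

# Same rows as the CSV file in this report, held in memory instead of on disk.
CSV_TEXT = """\
u1,u2,u3,d1,d2,d3,d4
Good Things,C,,1,1,1,1
Good Things,R,,1,1,1,1
Bad Things,C,,1,1,1,1
Bad Things,T,,1,1,1,1
Okay Things,N,B,1,1,1,1
Okay Things,N,D,1,1,1,1
Okay Things,B,,1,1,1,1
Okay Things,D,,1,1,1,1
"""


def count_nan_in_u3(**read_csv_kwargs):
    """Hypothetical helper: parse CSV_TEXT with u1, u2, u3 as the index and
    count how many labels in the u3 index level came back as NaN."""
    frame = pd.read_csv(StringIO(CSV_TEXT), index_col=[0, 1, 2], **read_csv_kwargs)
    return int(frame.index.get_level_values("u3").isna().sum())


# Default parsing: empty u3 fields are treated as missing.
default_count = count_nan_in_u3()
# The call this report is about.
na_filter_off_count = count_nan_in_u3(na_filter=False)
# The workaround from the last step above.
workaround_count = count_nan_in_u3(
    na_filter=False, na_values=[], keep_default_na=False
)

print("defaults:", default_count)
print("na_filter=False:", na_filter_off_count)
print("workaround:", workaround_count)
```

On a pandas version affected by this report, the second count should be non-zero. The third call is the workaround and should report no NaN labels.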

jreback · 2013-10-16T12:30:26Z

There are very limited tests with na_filter=False and it's a pretty silly parameter, so moving to low-priority. You are welcome to do a PR if you'd like.

cancan101 · 2013-10-16T13:11:05Z

TBI I can work around for now. My observation is that there are probably too many parameters on the method having to do with handling of NaNs. It would be great to clean this up:
keep_default_na, na_filter, na_values

There is also interaction between these parameters.
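As a rough illustration of that interaction, the sketch below parses a small made-up CSV under several combinations of the three parameters and counts the NaN entries in one column. The data, `CSV_TEXT`, and the `count_nan` helper are assumptions made up for this example. Only `read_csv` and its `na_values`, `keep_default_na`, and `na_filter` parameters come from pandas.

```python
from io import StringIO

import pandas as pd

# Made-up data: the "value" column holds a default NA token ("NA"),
# an empty field, a custom token ("missing"), and an ordinary string.
CSV_TEXT = "key,value\na,NA\nb,\nc,missing\nd,ok\n"


def count_nan(**read_csv_kwargs):
    """Hypothetical helper: count NaN entries in the "value" column."""
    frame = pd.read_csv(StringIO(CSV_TEXT), **read_csv_kwargs)
    return int(frame["value"].isna().sum())


counts = {
    "defaults": count_nan(),
    "na_values=['missing']": count_nan(na_values=["missing"]),
    "keep_default_na=False, na_values=['missing']": count_nan(
        keep_default_na=False, na_values=["missing"]
    ),
    "na_filter=False": count_nan(na_filter=False),
    "na_filter=False, na_values=['missing']": count_nan(
        na_filter=False, na_values=["missing"]
    ),
}

for label, n_missing in counts.items():
    print(f"{label}: {n_missing} NaN")
```

The point of the comparison is that na_values adds to the default list unless keep_default_na=False is also passed, while na_filter=False turns NA detection off for the column no matter what the other two say.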

jreback · 2013-10-16T13:23:49Z

sure.....though to be honest, easiest just to drop na_filter...but if you come up with a better API, gr8

gfyoung added a commit to forking-repos/pandas that referenced this issue Nov 5, 2017

BUG: Don't parse NA-values in index when requested

c75ffb6

Closes gh-5239.

gfyoung added Bug Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate and removed Prio-low labels Nov 5, 2017

gfyoung modified the milestones: Someday, Next Major Release Nov 5, 2017

gfyoung mentioned this issue Nov 5, 2017

BUG: Don't parse NA-values in index when requested #18127

Merged

jreback modified the milestones: Next Major Release, 0.21.1 Nov 6, 2017

TomAugspurger mentioned this issue Nov 6, 2017

BUG from_csv na_filter=False not working with index #18143

Closed

gfyoung added a commit to forking-repos/pandas that referenced this issue Nov 6, 2017

BUG: Don't parse NA-values in index when requested

fdb6ef3

Closes gh-5239.

gfyoung closed this as completed in #18127 Nov 6, 2017

gfyoung added a commit that referenced this issue Nov 6, 2017

BUG: Don't parse NA-values in index when requested (#18127)

c176a3c

Closes gh-5239.

watercrossing pushed a commit to watercrossing/pandas that referenced this issue Nov 10, 2017

BUG: Don't parse NA-values in index when requested (pandas-dev#18127)

82ff239

Closes gh-5239.

No-Stream pushed a commit to No-Stream/pandas that referenced this issue Nov 28, 2017

BUG: Don't parse NA-values in index when requested (pandas-dev#18127)

1ad0e4a

Closes gh-5239.

TomAugspurger pushed a commit to TomAugspurger/pandas that referenced this issue Dec 8, 2017

BUG: Don't parse NA-values in index when requested (pandas-dev#18127)

91639e5

Closes gh-5239. (cherry picked from commit c176a3c)

TomAugspurger pushed a commit that referenced this issue Dec 11, 2017

BUG: Don't parse NA-values in index when requested (#18127)

d2c86a9

Closes gh-5239. (cherry picked from commit c176a3c)

gfyoung mentioned this issue Dec 22, 2017

read_csv() ignores na_filter=False for index columns #7518

Closed
