Skip to content

make null lowercase a missing value #16534

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Jun 1, 2017

Conversation

OlegShteynbuk
Copy link
Contributor

@OlegShteynbuk OlegShteynbuk commented May 29, 2017

xref #16079
closes #16471

@gfyoung
Copy link
Member

gfyoung commented May 29, 2017

  1. You also need to add it to parsers.pyx
  2. There's a test (also in na_values.py) that checks whether each NaN value you specified gets interpreted as such. You should add null to that test.

@OlegShteynbuk
Copy link
Contributor Author

OlegShteynbuk commented May 29, 2017 via email

@gfyoung
Copy link
Member

gfyoung commented May 29, 2017

@OlegShteynbuk : Yeah, you need to run pytest test_parsers.py instead (and yeah, that was the test I was referring to. Good catch 😄)

@OlegShteynbuk
Copy link
Contributor Author

OlegShteynbuk commented May 29, 2017 via email

@gfyoung
Copy link
Member

gfyoung commented May 29, 2017

@OlegShteynbuk : Nope, not an issue.

@OlegShteynbuk
Copy link
Contributor Author

added it to parsers.pyx thanks to gfyoung

@codecov
Copy link

codecov bot commented May 29, 2017

Codecov Report

Merging #16534 into master will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master   #16534   +/-   ##
=======================================
  Coverage   90.79%   90.79%           
=======================================
  Files         161      161           
  Lines       51063    51063           
=======================================
  Hits        46365    46365           
  Misses       4698     4698
Flag Coverage Δ
#multiple 88.63% <ø> (ø) ⬆️
#single 40.15% <ø> (ø) ⬆️
Impacted Files Coverage Δ
pandas/io/common.py 69.91% <ø> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e60dc4c...4a91508. Read the comment docs.

@codecov
Copy link

codecov bot commented May 29, 2017

Codecov Report

Merging #16534 into master will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master   #16534   +/-   ##
=======================================
  Coverage   90.79%   90.79%           
=======================================
  Files         161      161           
  Lines       51063    51063           
=======================================
  Hits        46365    46365           
  Misses       4698     4698
Flag Coverage Δ
#multiple 88.63% <ø> (ø) ⬆️
#single 40.15% <ø> (ø) ⬆️
Impacted Files Coverage Δ
pandas/io/common.py 69.91% <ø> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e60dc4c...4a91508. Read the comment docs.

@codecov
Copy link

codecov bot commented May 29, 2017

Codecov Report

Merging #16534 into master will increase coverage by <.01%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #16534      +/-   ##
==========================================
+ Coverage   90.74%   90.75%   +<.01%     
==========================================
  Files         161      161              
  Lines       51074    51073       -1     
==========================================
+ Hits        46349    46350       +1     
+ Misses       4725     4723       -2
Flag Coverage Δ
#multiple 88.59% <ø> (ø) ⬆️
#single 40.16% <ø> (ø) ⬆️
Impacted Files Coverage Δ
pandas/io/common.py 69.91% <ø> (ø) ⬆️
pandas/core/generic.py 92.31% <0%> (+0.04%) ⬆️
pandas/core/indexes/datetimes.py 95.33% <0%> (+0.09%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e437ad5...da74438. Read the comment docs.

@gfyoung
Copy link
Member

gfyoung commented May 29, 2017

@OlegShteynbuk : See the checklist in the PR you submitted? Just run through those and make sure you can check all of them off. Then it should be good to merge!

@OlegShteynbuk
Copy link
Contributor Author

OlegShteynbuk commented May 29, 2017

@gfyoung looks ok, not sure if the last test change to parsers.pyx is included, or need a new pull request.

@gfyoung
Copy link
Member

gfyoung commented May 29, 2017

I'm not sure you fully understood what I was asking you to do. The code changes are fine, but there are several other things you need to do (i.e whatsnew) before this will be merged. That's why we have the checkboxes in the PR you submitted.

@OlegShteynbuk
Copy link
Contributor Author

where are the checkboxes located, can see checks for tests they all fine, i can write whatsnew but where is it?

@gfyoung
Copy link
Member

gfyoung commented May 29, 2017

If you look at the web version of GitHub, the very first text box under the PR title has the checkboxes. Also, the whatsnew can be found under doc/source/whatsnew (probably want 0.21.0 as this is technically an API change)

@OlegShteynbuk
Copy link
Contributor Author

there are checkboxes on the top, first comment, there is whatsnew checkbox don't see any links, need to add whatsnew somewhere

@gfyoung
Copy link
Member

gfyoung commented May 29, 2017

there are checkboxes on the top, first comment

Correct. Make sure you've satisfied everything listed on those checkboxes.

need to add whatsnew somewhere

Yes, I gave you the file-path that you should follow in your local repository: pandas/doc/source/whatsnew

@OlegShteynbuk
Copy link
Contributor Author

probably need a new .txt file under pandas/doc/source/whatsnew

@gfyoung
Copy link
Member

gfyoung commented May 29, 2017

No, you shouldn't. You should add it to the 0.21 version.

@OlegShteynbuk
Copy link
Contributor Author

ok, will add 0.21

@OlegShteynbuk
Copy link
Contributor Author

OlegShteynbuk commented May 29, 2017

added to whatsnew 0.21, commited to my branch

@OlegShteynbuk
Copy link
Contributor Author

all other checkboxes looks fine, will be away for about an hour, do i need to do anything else before leaving?

@OlegShteynbuk
Copy link
Contributor Author

  • The na_values argument for :func:read_csv function now has 'null' (in lower case) as a default value that is interpreted as NaN: (:issue:16471)

@OlegShteynbuk
Copy link
Contributor Author

probably need to modify na_values in io.rst file too as it goes into docs
added 'null' to io.rst line 226, looks like the one that is going into read_csv docs, on my branch;
there is also a heading NA Values (line 1023) describing default values, not sure about this one, most likely need to be modified too.

@jreback jreback added IO CSV read_csv, to_csv Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate labels May 30, 2017
@@ -38,6 +38,7 @@ Other Enhancements
- :func:`read_feather` has gained the ``nthreads`` parameter for multi-threaded operations (:issue:`16359`)
- :func:`DataFrame.clip()` and :func: `Series.cip()` have gained an inplace argument. (:issue: `15388`)
- :func:`crosstab` has gained a ``margins_name`` parameter to define the name of the row / column that will contain the totals when margins=True. (:issue:`15972`)
- The ``na_values`` argument for :func:`read_csv` function now has 'null' (in lower case) as a default value that is interpreted as NaN: (:issue:`16471`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:func:`read_csv has gained 'null' as an additional default missing value.

na_values is not involved here, rather the default has changed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed; yes - na_values is just a way to add additional strings to those default values for one function call, thanks for the correction.

@jreback jreback added this to the 0.21.0 milestone Jun 1, 2017
@jreback jreback merged commit fc4408b into pandas-dev:master Jun 1, 2017
@jreback
Copy link
Contributor

jreback commented Jun 1, 2017

thanks @OlegShteynbuk

@jreback
Copy link
Contributor

jreback commented Jun 1, 2017

@gfyoung do we have an issue about consolidating the duplicate definitions of NA values (in parsers & IO), maybe just remove from parsers and leave in IO is best. if not, can you create one?

@OlegShteynbuk OlegShteynbuk deleted the null_lower_case branch June 3, 2017 01:08
@OlegShteynbuk
Copy link
Contributor Author

consolidated the duplicate definitions of NA values (in parsers & IO), removed from parsers and left in IO, assert removed too; probably need to create a different issue.
#16589

@OlegShteynbuk
Copy link
Contributor Author

consolidated NA values in parsers.pyx

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
IO CSV read_csv, to_csv Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add 'null' in lower case to na_values defaults in pandas.read_csv
3 participants