-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
DOC: update docs for read_csv().na_values and keep_default_na #14030
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@@ -25,6 +25,7 @@ | |||
import pandas.core.common as com | |||
from warnings import warn | |||
from distutils.version import LooseVersion | |||
from parsers import _NA_VALUES |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add the full import path. Actually we should move these to pandas.io.common
.
Current coverage is 85.25% (diff: 100%)@@ master #14030 diff @@
==========================================
Files 139 139
Lines 50380 50383 +3
Methods 0 0
Messages 0 0
Branches 0 0
==========================================
+ Hits 42953 42955 +2
- Misses 7427 7428 +1
Partials 0 0
|
Do I need add Also I am not sure about this error:
|
whatsnew is not needed. And don't worry about the travis error, that is something unrelated. |
na_values : str or list-like or dict, default None | ||
Additional strings to recognize as NA/NaN. If dict passed, specific | ||
per-column NA values. By default the following values are interpreted | ||
as NaN: `'""" + "'`, `'".join(sorted(_NA_VALUES)) + """'`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am afraid this does not work with docstrings. You will have to hardcode them like it is done in the read_csv
docstring
@jorisvandenbossche Hey I made some changes. Is there anyway I can preview it? |
You can just look at the docstring in the console to see if it looks ok (with ipython: |
def read_excel(io, sheetname=0, header=0, skiprows=None, skip_footer=0, | ||
index_col=None, names=None, parse_cols=None, parse_dates=False, | ||
date_parser=None, na_values=None, thousands=None, | ||
convert_float=True, has_index_names=None, converters=None, | ||
engine=None, squeeze=False, **kwds): | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think its necessary to make this an Appender doc-string. We generally only do that if you need it more than once. @jorisvandenbossche ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, it's to be able to automatically inject the NA_VALUES instead of manually writing it in the docstring.
It's a bit a trade-off here: is the added complexity of moving the docstring away from the actual function worth it for the 1 line of na values that is written manually (and has to be updated if the default recognized NaN values are changed).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh I c. Then this is ok. We want to have the common NA values in a single place
na_values : str or list-like or dict, default None | ||
Additional strings to recognize as NA/NaN. If dict passed, specific | ||
per-column NA values. By default the following values are interpreted | ||
as NaN: `'""" + "'`, `'".join(sorted(_NA_VALUES)) + """'`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the backticks are not needed around the strings. For the rest, the docstring looks fine in the console!
I squashed and removed the backticks. I did test: In [12]: _NA_VALUES = set([
....: '-1.#IND', '1.#QNAN', '1.#IND', '-1.#QNAN', '#N/A N/A', '#N/A',
....: 'N/A', 'NA', '#NA', 'NULL', 'NaN', '-NaN', 'nan', '-nan', ''
....: ])
In [13]: s = """as NaN: '""" + "', '".join(sorted(_NA_VALUES)) + """'."""
In [14]: print s
as NaN: '', '#N/A', '#N/A N/A', '#NA', '-1.#IND', '-1.#QNAN', '-NaN', '-nan', '1.#IND', '1.#QNAN', 'N/A', 'NA', 'NULL', 'NaN', 'nan'. I didn't check the doc with installation and |
Okay I fixed it. Something's wrong with Cython. |
thanks! Can you do a followup where we make this wrap on the NaNs (they generate an extra long line)
|
Sure I'll try! |
git diff upstream/master | flake8 --diff