Commit 7d68422

Merge pull request #4374 from jreback/GH4318
BUG: Fixed passing keep_default_na=False when na_values=None (GH4318)
2 parents 8ac0e11 + 1663fde commit 7d68422

4 files changed, +74 -2 lines changed

doc/source/io.rst

+47
@@ -546,6 +546,53 @@ The ``thousands`` keyword allows integers to be parsed correctly
 
    os.remove('tmp.csv')
 
+.. _io.na_values:
+
+NA Values
+~~~~~~~~~
+
+To control which values are parsed as missing values (which are signified by ``NaN``), specify a
+list of strings in ``na_values``. If you specify a number (a ``float`` like ``5.0`` or an integer like ``5``),
+the corresponding equivalent values will also imply a missing value (in this case, both ``5.0`` and ``5``
+are effectively recognized as ``NaN``).
+
+To completely override the default values that are recognized as missing, specify ``keep_default_na=False``.
+The values recognized as ``NaN`` by default are ``['-1.#IND', '1.#QNAN', '1.#IND', '-1.#QNAN', '#N/A N/A', 'NA',
+'#NA', 'NULL', 'NaN', 'nan']``.
+
+.. code-block:: python
+
+   read_csv(path, na_values=[5])
+
+The default values, plus ``5`` and ``5.0`` when interpreted as numbers, are recognized as ``NaN``.
+
+.. code-block:: python
+
+   read_csv(path, keep_default_na=False, na_values=[""])
+
+Only an empty field is recognized as ``NaN``.
+
+.. code-block:: python
+
+   read_csv(path, keep_default_na=False, na_values=["NA", "0"])
+
+Only the strings ``NA`` and ``0`` are recognized as ``NaN``.
+
+.. code-block:: python
+
+   read_csv(path, na_values=["Nope"])
+
+The default values, plus the string ``"Nope"``, are recognized as ``NaN``.
+
+.. _io.infinity:
+
+Infinity
+~~~~~~~~
+
+``inf``-like values are parsed as ``np.inf`` (positive infinity), and ``-inf`` as ``-np.inf`` (negative infinity).
+Parsing ignores case, so ``Inf`` is also parsed as ``np.inf``.
+
+
 .. _io.comments:
 
 Comments
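
The documentation added in this hunk can be tried out with a short script. This is a minimal sketch, assuming a recent pandas release where ``read_csv`` is imported from the top-level ``pandas`` namespace; the ``CSV`` sample data and the variable names are hypothetical and do not come from the commit.

```python
from io import StringIO

import pandas as pd

# Hypothetical sample data, not taken from the commit.
CSV = "key,value\nNA,5\nNope,5.0\n,inf\nx,7\n"

# Defaults kept, plus the number 5: "NA", the empty field, "5" and "5.0"
# are all read as NaN, while "inf" is parsed as positive infinity.
with_five = pd.read_csv(StringIO(CSV), na_values=[5])

# Defaults dropped, only "Nope" counts as missing: "NA" and the empty
# field survive as ordinary strings.
only_nope = pd.read_csv(StringIO(CSV), keep_default_na=False, na_values=["Nope"])

print(with_five)
print(only_nope)
```

Comparing the two frames side by side shows how ``keep_default_na`` decides whether the built-in list is merged with ``na_values`` or replaced by it.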

doc/source/release.rst

+1
@@ -82,6 +82,7 @@ pandas 0.13
     local variable was undefined (:issue:`4381`)
   - In ``to_json``, raise if a passed ``orient`` would cause loss of data because
     of a duplicate index (:issue:`4359`)
+  - Fixed passing ``keep_default_na=False`` when ``na_values=None`` (:issue:`4318`)
 
 pandas 0.12
 ===========

pandas/io/parsers.py

+5-2
@@ -1774,8 +1774,11 @@ def _try_convert_dates(parser, colspec, data_dict, columns):
 
 def _clean_na_values(na_values, keep_default_na=True):
 
-    if na_values is None and keep_default_na:
-        na_values = _NA_VALUES
+    if na_values is None:
+        if keep_default_na:
+            na_values = _NA_VALUES
+        else:
+            na_values = []
         na_fvalues = set()
     elif isinstance(na_values, dict):
         if keep_default_na:

pandas/io/tests/test_parsers.py

+21
@@ -108,6 +108,27 @@ def test_empty_string(self):
                                   np.nan, 'seven']})
         tm.assert_frame_equal(xp.reindex(columns=df.columns), df)
 
+
+        # GH4318, passing na_values=None and keep_default_na=False yields 'None' as a na_value
+        data = """\
+One,Two,Three
+a,1,None
+b,2,two
+,3,None
+d,4,nan
+e,5,five
+nan,6,
+g,7,seven
+"""
+        df = self.read_csv(
+            StringIO(data), keep_default_na=False)
+        xp = DataFrame({'One': ['a', 'b', '', 'd', 'e', 'nan', 'g'],
+                        'Two': [1, 2, 3, 4, 5, 6, 7],
+                        'Three': ['None', 'two', 'None', 'nan', 'five', '',
+                                  'seven']})
+        tm.assert_frame_equal(xp.reindex(columns=df.columns), df)
+
+
     def test_read_csv(self):
         if not compat.PY3:
             if 'win' in sys.platform:
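
The regression test above can also be reproduced outside the test suite. This is a minimal sketch, assuming a recent pandas release and using the public ``pd.read_csv`` in place of the test class's ``self.read_csv`` helper; the CSV text is copied from the test, while the ``missing_cells`` name is hypothetical.

```python
from io import StringIO

import pandas as pd

# Same CSV text as in the regression test for GH4318.
data = """\
One,Two,Three
a,1,None
b,2,two
,3,None
d,4,nan
e,5,five
nan,6,
g,7,seven
"""

# With the defaults disabled and no na_values given, no cell should be
# converted to NaN: 'None', 'nan' and empty fields stay as strings.
df = pd.read_csv(StringIO(data), keep_default_na=False)
missing_cells = int(df.isna().sum().sum())

print(missing_cells)
print(df["Three"].tolist())
```

If the fix is in place, the count of missing cells should be zero and the ``Three`` column should still contain the literal strings ``'None'`` and ``'nan'``.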
