Commit 5f312da

chrisgorgo authored and jreback committed
Adding 'n/a' to list of strings denoting missing values (#16079)
1 parent 8d092d9 commit 5f312da

5 files changed, +8 -6 lines changed

doc/source/io.rst

+1 -1
@@ -226,7 +226,7 @@ NA and Missing Data Handling
 na_values : scalar, str, list-like, or dict, default ``None``
 Additional strings to recognize as NA/NaN. If dict passed, specific per-column
 NA values. By default the following values are interpreted as NaN:
-``'-1.#IND', '1.#QNAN', '1.#IND', '-1.#QNAN', '#N/A N/A', '#N/A', 'N/A', 'NA',
+``'-1.#IND', '1.#QNAN', '1.#IND', '-1.#QNAN', '#N/A N/A', '#N/A', 'N/A', 'n/a', 'NA',
 '#NA', 'NULL', 'null', 'NaN', '-NaN', 'nan', '-nan', ''``.
 keep_default_na : boolean, default ``True``
 If na_values are specified and keep_default_na is ``False`` the default NaN

doc/source/whatsnew/v0.21.0.txt

+3 -1
@@ -38,7 +38,7 @@ Other Enhancements
 - :func:`read_feather` has gained the ``nthreads`` parameter for multi-threaded operations (:issue:`16359`)
 - :func:`DataFrame.clip()` and :func: `Series.cip()` have gained an inplace argument. (:issue: `15388`)
 - :func:`crosstab` has gained a ``margins_name`` parameter to define the name of the row / column that will contain the totals when margins=True. (:issue:`15972`)
-- :func:`read_csv` has gained 'null' as an additional default missing value.(:issue:`16471`)
+
 .. _whatsnew_0210.api_breaking:

 Backwards incompatible API changes
@@ -49,6 +49,8 @@ Backwards incompatible API changes

 - Accessing a non-existent attribute on a closed :class:`HDFStore` will now
 raise an ``AttributeError`` rather than a ``ClosedFileError`` (:issue:`16301`)
+- :func:`read_csv` now treats ``'null'`` strings as missing values by default (:issue:`16471`)
+- :func:`read_csv` now treats ``'n/a'`` strings as missing values by default (:issue:`16078`)

 .. _whatsnew_0210.api:

pandas/_libs/parsers.pyx

+1 -1
@@ -277,7 +277,7 @@ DEFAULT_CHUNKSIZE = 256 * 1024
 # no longer excluding inf representations
 # '1.#INF','-1.#INF', '1.#INF000000',
 _NA_VALUES = [b'-1.#IND', b'1.#QNAN', b'1.#IND', b'-1.#QNAN',
-b'#N/A N/A', b'NA', b'#NA', b'NULL', b'null', b'NaN',
+b'#N/A N/A', b'n/a', b'NA', b'#NA', b'NULL', b'null', b'NaN',
 b'nan', b'']


pandas/io/common.py

+1 -1
@@ -31,7 +31,7 @@
 # '1.#INF','-1.#INF', '1.#INF000000',
 _NA_VALUES = set([
 '-1.#IND', '1.#QNAN', '1.#IND', '-1.#QNAN', '#N/A N/A', '#N/A',
-'N/A', 'NA', '#NA', 'NULL', 'null', 'NaN', '-NaN', 'nan', '-nan', ''
+'N/A', 'n/a', 'NA', '#NA', 'NULL', 'null', 'NaN', '-NaN', 'nan', '-nan', ''
 ])

 try:

pandas/tests/io/parser/na_values.py

+2 -2
@@ -70,8 +70,8 @@ def test_non_string_na_values(self):

 def test_default_na_values(self):
 _NA_VALUES = set(['-1.#IND', '1.#QNAN', '1.#IND', '-1.#QNAN',
-'#N/A', 'N/A', 'NA', '#NA', 'NULL', 'null', 'NaN',
-'nan', '-NaN', '-nan', '#N/A N/A', ''])
+'#N/A', 'N/A', 'n/a', 'NA', '#NA', 'NULL', 'null',
+'NaN', 'nan', '-NaN', '-nan', '#N/A N/A', ''])
 assert _NA_VALUES == parsers._NA_VALUES
 nv = len(_NA_VALUES)

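A minimal sketch of what the change means for callers, assuming missing-value detection is an exact, case-sensitive membership test against the default set. The set is copied from the `pandas/io/common.py` hunk above. `mask_missing` is a hypothetical helper written for this illustration and is not part of pandas; the real parser applies its own logic.

```python
# Default NA strings after this commit, copied from the pandas/io/common.py hunk.
NA_VALUES_AFTER = {
    '-1.#IND', '1.#QNAN', '1.#IND', '-1.#QNAN', '#N/A N/A', '#N/A',
    'N/A', 'n/a', 'NA', '#NA', 'NULL', 'null', 'NaN', '-NaN', 'nan', '-nan', '',
}

# The same set as it stood before the commit: identical, minus 'n/a'.
NA_VALUES_BEFORE = NA_VALUES_AFTER - {'n/a'}


def mask_missing(tokens, na_values):
    """Hypothetical helper: replace tokens that exactly match an NA string with None."""
    return [None if tok in na_values else tok for tok in tokens]


# A raw column as it might appear in a CSV file, before type inference.
column = ['1.5', 'n/a', 'N/A', 'N/a', '']

before = mask_missing(column, NA_VALUES_BEFORE)
after = mask_missing(column, NA_VALUES_AFTER)

print(before)  # 'n/a' survives as a string
print(after)   # 'n/a' is now masked; mixed-case 'N/a' is still not in the set
```

In this sketch only the exact spellings listed in the set are matched, so `'N/a'` stays a string in both cases. Callers who relied on `'n/a'` being read as a literal string can, per the `io.rst` text in the diff, combine `na_values` with `keep_default_na=False` to control the set themselves.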