pandas.io.json.dumps raises OverflowError on float32 and float16 NaNs #16686

Kiv · 2017-06-12T22:31:56Z

Code Sample, a copy-pastable example if possible

In [3]: from pandas.io.json import dumps

In [4]: import numpy as np

In [5]: dumps(np.float64('NaN'))
Out[5]: 'null'

In [6]: dumps(np.float32('NaN'))
---------------------------------------------------------------------------
OverflowError                             Traceback (most recent call last)
----> 1 dumps(np.float32('NaN'))

OverflowError: Invalid Nan value when encoding double

In [7]: dumps(np.float16('NaN'))
---------------------------------------------------------------------------
OverflowError                             Traceback (most recent call last)
----> 1 dumps(np.float16('NaN'))

OverflowError: Maximum recursion level reached

Problem description

Float64 does the sensible thing. Float32 and float16 should behave consistently but they error out instead.

Expected Output

'null' for all 3 cases

Output of `pd.show_versions()`

INSTALLED VERSIONS
------------------
commit: None
python: 3.6.1.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.0-79-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_CA.UTF-8
LOCALE: en_CA.UTF-8

pandas: 0.20.2
pytest: None
pip: 9.0.1
setuptools: 27.2.0
Cython: None
numpy: 1.13.0
scipy: 0.19.0
xarray: None
IPython: 5.3.0
sphinx: 1.5.3
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2017.2
blosc: None
bottleneck: None
tables: 3.3.0
numexpr: 2.6.2
feather: None
matplotlib: 2.0.2
openpyxl: None
xlrd: 1.0.0
xlwt: 1.2.0
xlsxwriter: None
lxml: None
bs4: 4.6.0
html5lib: 0.999
sqlalchemy: 1.1.9
pymysql: None
psycopg2: 2.7.1 (dt dec pq3 ext lo64)
jinja2: 2.9.6
s3fs: None
pandas_gbq: None
pandas_datareader: None


jreback · 2017-06-13T10:31:37Z

If you want to do a PR for this, that would be great. Note that this is actually tricky to hit in practice, as an array of floats (whether float32 or float64, with NaNs) is serialized by a slightly different implementation. Further, we have very limited support for float16.

Kiv · 2017-06-13T10:45:46Z

I can try, should I be looking in pandas/_libs/src/ujson/python/objToJSON.c? Or is it somewhere else?

Kiv · 2017-06-13T11:11:29Z

I saw in ultrajsonenc.c that the error comes from line 784:

    if (!(value == value)) {
        SetError(obj, enc, "Invalid Nan value when encoding double");
        return FALSE;
    }

Are you suggesting that this code be changed to print the null instead? Or that a different code path be taken and we don't reach here?
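
For illustration, the two options in the question above can be sketched in plain Python. This is not the ujson code: `encode_double` and its `nan_as_null` flag are hypothetical names, and infinities are deliberately left unhandled. It only shows the same self-comparison test either raising the error or emitting `null`.

```python
def encode_double(value: float, nan_as_null: bool = True) -> str:
    """Hypothetical stand-in for the double-encoding step (not the real ujson API)."""
    # NaN is the only float that is not equal to itself,
    # so this is the same test as `!(value == value)` in the C snippet.
    if not (value == value):
        if nan_as_null:
            # Option 1: emit JSON null instead of failing.
            return "null"
        # Option 2: current behaviour, reject the value.
        raise OverflowError("Invalid Nan value when encoding double")
    # Finite values only; infinities are outside the scope of this sketch.
    return repr(value)


print(encode_double(float("nan")))  # prints: null
print(encode_double(1.5))           # prints: 1.5
```

The alternative raised in the question, taking a different code path so the NaN never reaches this check, would live in the caller and is not shown here.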

Kiv · 2017-06-13T11:54:51Z

Let me know if this is the right approach. It fixes float32, and I don't care about float16:

Kiv@1457ac6

chris-b1 · 2017-06-13T20:36:41Z

@Kiv - that looks reasonable, although I'm not sure why float 32 isn't already hitting this path that float 64 does? (I'm not especially familiar with this code)
https://github.com/Kiv/pandas/blob/1457ac60656046b1e4ff71a24a225cba8517f3e1/pandas/_libs/src/ujson/python/objToJSON.c#L545

jreback · 2017-06-14T23:20:57Z

@chris-b1 if you look at the line before, there is a cast to <double>. I don't actually know what this does to a np.float32('nan'), but it's possible that it is then not detected by isnan.
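
One way to probe the cast question without touching the C code is to widen a float32 NaN bit pattern to a double using only the standard library. This is a sketch under assumptions: `widen_bits` is a hypothetical helper, the bit patterns are the usual IEEE 754 quiet NaNs, and it says nothing about which branch objToJSON.c actually takes for a NumPy scalar. It only checks whether NaN survives the widening itself.

```python
import math
import struct

# Quiet-NaN bit patterns in IEEE 754 binary32 and binary16.
FLOAT32_QNAN_BITS = 0x7FC00000
FLOAT16_QNAN_BITS = 0x7E00


def widen_bits(bits: int, int_fmt: str, float_fmt: str) -> float:
    """Reinterpret an integer bit pattern as a narrow float, widened to a Python float.

    Python floats are C doubles, so unpacking "f" or "e" performs the
    same narrow-to-double conversion as a C cast.
    """
    return struct.unpack(float_fmt, struct.pack(int_fmt, bits))[0]


from_f32 = widen_bits(FLOAT32_QNAN_BITS, "<I", "<f")
from_f16 = widen_bits(FLOAT16_QNAN_BITS, "<H", "<e")

# NaN stays NaN after widening, and still fails the self-comparison test.
print(math.isnan(from_f32), from_f32 == from_f32)
print(math.isnan(from_f16), from_f16 == from_f16)
```

If the widened value is still NaN, as IEEE 754 requires, the cast alone would not hide it from an isnan check, which suggests the float32 scalar may be reaching the encoder through a different branch.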

jreback added Difficulty Intermediate IO JSON read_json, to_json, json_normalize Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate labels Jun 13, 2017

jreback added this to the Next Major Release milestone Jun 13, 2017

datapythonista mentioned this issue Jan 12, 2018

np.float16 support #9220

Closed

jbrockmendel removed Difficulty Intermediate labels Oct 21, 2019

mroeschke added Bug Dtype Conversions Unexpected or buggy dtype conversions labels May 8, 2020

jreback closed this as completed Nov 20, 2020
