Inexplicable KeyError in pd.io.json.json_normalize(dic) #21352

Closed
fera0013 opened this issue Jun 7, 2018 · 3 comments
Labels
IO JSON read_json, to_json, json_normalize

Comments

@fera0013

fera0013 commented Jun 7, 2018

Code Sample, a copy-pastable example if possible

import pandas as pd

dic = {'_id': '5436e3abbae478396759f0cf', 'meta': {'clinical': {'benign_malignant': 'benign', 'age_approx': 55, 'sex': 'female', 'diagnosis': 'nevus', 'diagnosis_confirm_type': None, 'anatom_site_general': 'anterior torso', 'melanocytic': True}, 'acquisition': {'image_type': 'dermoscopic', 'pixelsX': 1022, 'pixelsY': 767}}, 'name': 'ISIC_0000000'}

frame = pd.io.json.json_normalize(dic)

Problem description

pandas 0.23.0 raises a

KeyError: 'diagnosis_confirm_type'

which 0.22.0 did not.

Expected Output

No exception

Output of pd.show_versions()


INSTALLED VERSIONS

commit: None
python: 3.6.5.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 63 Stepping 2, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None

pandas: 0.23.0
pytest: 3.5.1
pip: 10.0.1
setuptools: 39.1.0
Cython: 0.28.2
numpy: 1.14.3
scipy: 1.1.0
pyarrow: None
xarray: None
IPython: 6.4.0
sphinx: 1.7.4
patsy: 0.5.0
dateutil: 2.7.3
pytz: 2018.4
blosc: None
bottleneck: 1.2.1
tables: 3.4.3
numexpr: 2.6.5
feather: None
matplotlib: 2.2.2
openpyxl: 2.5.3
xlrd: 1.1.0
xlwt: 1.3.0
xlsxwriter: 1.0.4
lxml: 4.2.1
bs4: 4.6.0
html5lib: 0.9999999
sqlalchemy: 1.2.7
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None

@fera0013 fera0013 changed the title Unexplicable KeyError in pd.io.json.json_normalize(dic) Inexplicable KeyError in pd.io.json.json_normalize(dic) Jun 7, 2018
@uds5501
Contributor

uds5501 commented Jun 7, 2018

@fera0013 Now this is awkward. Apparently the problem here is that json_normalize can't handle a None value inside a nested dict. This should not be happening.

PS: the error may lie in the .pop() part
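To illustrate the .pop() suspicion, here is a minimal sketch of how popping the same key twice produces a KeyError that names the key rather than the value. The helper move_then_drop_none is hypothetical and written only for this illustration; it is not the pandas source, and the actual code path in 0.23.0 may differ.

```python
def move_then_drop_none(record, key, new_key):
    """Hypothetical helper (not pandas code): rename a key, then drop it if its value is None."""
    value = record.pop(key)   # first pop: `key` is no longer in the dict
    record[new_key] = value
    if value is None:
        record.pop(key)       # second pop of the same key raises KeyError(key)
    return record


# A non-None value is renamed without trouble.
renamed = move_then_drop_none({"sex": "female"}, "sex", "meta.clinical.sex")

# A None value triggers the second pop and fails on the already-removed key.
try:
    move_then_drop_none(
        {"diagnosis_confirm_type": None},
        "diagnosis_confirm_type",
        "meta.clinical.diagnosis_confirm_type",
    )
    missing_key = None
except KeyError as exc:
    missing_key = exc.args[0]

print(missing_key)  # diagnosis_confirm_type
```

If something along these lines is what happens, it would explain why the message points at the key even though the trigger is the None value.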

@fera0013
Author

fera0013 commented Jun 7, 2018

The dict I am passing obviously contains values that are not valid in JSON, such as True or None without quotation marks, so I would expect a JSON method to fail. The error message, however, is misleading, because the problem is actually a value and not a key. Besides, the previous version was able to parse the same dict without error.
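For reference, a quick check with the standard-library json module shows how Python's True and None relate to JSON: they are serialized as the JSON literals true and null and read back unchanged. This sketch uses only two keys taken from the dict above and says nothing about how json_normalize treats them internally.

```python
import json

# Two of the values from the reported dict that are not quoted strings.
values = {"diagnosis_confirm_type": None, "melanocytic": True}

encoded = json.dumps(values)   # Python None -> JSON null, True -> JSON true
decoded = json.loads(encoded)  # and back again

print(encoded)  # {"diagnosis_confirm_type": null, "melanocytic": true}
```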

So json_normalize should either raise an error such as "invalid json" or "invalid value", or (what I prefer) be robust to values without quotation marks, as long as the overall dict structure is syntactically correct.

If the method accepts a Python dictionary (which it actually does), it should be able to deal with any syntactically correct Python dict.
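Until a fixed release is available, one possible workaround is to flatten the dict by hand. The sketch below is an assumption-laden stand-in, not the pandas implementation: flatten_record is a hypothetical helper that joins nested keys with "." (the separator json_normalize uses by default) and keeps None values as they are.

```python
def flatten_record(record, parent_key="", sep="."):
    """Hypothetical helper: flatten nested dicts into dotted keys, keeping None values."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else str(key)
        if isinstance(value, dict):
            # Recurse into nested dicts and merge their flattened keys.
            flat.update(flatten_record(value, full_key, sep))
        else:
            # Leaf values (including None and True) are copied unchanged.
            flat[full_key] = value
    return flat


dic = {'_id': '5436e3abbae478396759f0cf',
       'meta': {'clinical': {'benign_malignant': 'benign', 'age_approx': 55,
                             'sex': 'female', 'diagnosis': 'nevus',
                             'diagnosis_confirm_type': None,
                             'anatom_site_general': 'anterior torso',
                             'melanocytic': True},
                'acquisition': {'image_type': 'dermoscopic',
                                'pixelsX': 1022, 'pixelsY': 767}},
       'name': 'ISIC_0000000'}

flat = flatten_record(dic)
print(flat['meta.clinical.diagnosis_confirm_type'])  # None
```

Passing the result to pd.DataFrame([flat]) should then give a one-row frame with one column per dotted key.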

@WillAyd
Member

WillAyd commented Jun 7, 2018

This is fixed on master via #21164

@WillAyd WillAyd closed this as completed Jun 7, 2018
@WillAyd WillAyd added the IO JSON read_json, to_json, json_normalize label Jun 7, 2018