read_json Raises AttributeError with Valid JSON as Input #14358

Open
craymichael opened this issue Oct 5, 2016 · 2 comments
Labels
Bug IO JSON read_json, to_json, json_normalize

Comments

@craymichael
craymichael commented Oct 5, 2016

A small, complete example of the issue

import pandas as pd

pd.read_json('[{}, null]')  # This raises `AttributeError`
pd.read_json('[null, {}]')  # This does not

This isn't a big problem, but the output is not what I expected. The only place I see it causing trouble for me is when parsing JSON received on a server: the raised error neither signals bad JSON (the input is valid) nor is it an appropriate ValueError. I don't think this should raise an error at all; correct me if I'm wrong.
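
For the server-side case described above, one option is to check the payload's shape before it reaches pandas, so that callers only ever see a ValueError. This is a minimal sketch using only the standard library; `parse_records_strict` is a hypothetical helper name, not a pandas API, and it assumes the server expects a JSON array of objects.

```python
import json


def parse_records_strict(payload):
    """Hypothetical pre-check: return a list of dicts or raise ValueError."""
    # json.JSONDecodeError subclasses ValueError, so malformed JSON
    # surfaces as the same exception type as a wrong shape.
    records = json.loads(payload)
    if not isinstance(records, list):
        raise ValueError("expected a JSON array of objects")
    for position, record in enumerate(records):
        if not isinstance(record, dict):
            raise ValueError(
                "element {} is {!r}, not a JSON object".format(position, record)
            )
    return records


try:
    parse_records_strict('[{}, null]')
except ValueError as exc:
    rejection = str(exc)
```

Rejecting null entries is only one possible policy; a server that prefers to tolerate them could filter them out instead.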

Expected Output

# pd.read_json('[{}, null]') currently raises `AttributeError`; expected:
      0
0    {}
1  None
# pd.read_json('[null, {}]') already gives the expected result:
      0
0  None
1    {}

Actual Output

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/pandas/io/json.py", line 211, in read_json
    date_unit).parse()
  File "/usr/local/lib/python2.7/dist-packages/pandas/io/json.py", line 279, in parse
    self._parse_no_numpy()
  File "/usr/local/lib/python2.7/dist-packages/pandas/io/json.py", line 496, in _parse_no_numpy
    loads(json, precise_float=self.precise_float), dtype=None)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/frame.py", line 263, in __init__
    arrays, columns = _to_arrays(data, columns, dtype=dtype)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/frame.py", line 5355, in _to_arrays
    coerce_float=coerce_float, dtype=dtype)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/frame.py", line 5471, in _list_of_dict_to_arrays
    columns = lib.fast_unique_multiple_list_gen(gen)
  File "pandas/lib.pyx", line 504, in pandas.lib.fast_unique_multiple_list_gen (pandas/lib.c:10190)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/frame.py", line 5470, in <genexpr>
    gen = (list(x.keys()) for x in data)
AttributeError: 'NoneType' object has no attribute 'keys'
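
The last frames of the traceback show where this goes wrong: the column names are gathered with `(list(x.keys()) for x in data)`, which assumes every element is a dict. Below is a minimal sketch of that failure outside pandas, together with a hypothetical None-tolerant variant. The function names are made up for illustration and are not pandas internals, and skipping nulls is only one possible behavior (the expected output above keeps them as rows).

```python
def collect_keys(records):
    """Mimic the expression from the traceback: call .keys() on every record."""
    return [list(x.keys()) for x in records]


def collect_keys_skipping_null(records):
    """Hypothetical variant that ignores records that are None."""
    return [list(x.keys()) for x in records if x is not None]


# What json.loads('[{"a": 1}, null]') returns in Python.
records = [{"a": 1}, None]

try:
    collect_keys(records)
except AttributeError as exc:
    error_message = str(exc)

safe_keys = collect_keys_skipping_null(records)
```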

Output of pd.show_versions()

## INSTALLED VERSIONS

commit: None
python: 2.7.11.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.0-38-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8

pandas: 0.18.1
nose: None
pip: 8.1.2
setuptools: 25.1.1
Cython: None
numpy: 1.11.1
scipy: 0.18.0
statsmodels: None
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.5.3
pytz: 2016.6.1
blosc: None
bottleneck: None
tables: None
numexpr: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.8
boto: None
pandas_datareader: None

@jreback jreback changed the title read_json Raises AttributeError with Valid JSON as Input read_json Raises AttributeError with Valid JSON as Input Oct 6, 2016
@jreback jreback added the IO JSON read_json, to_json, json_normalize label Oct 6, 2016
@jreback
Contributor

jreback commented Oct 6, 2016

I suppose. If someone wanted to submit more tests / fix for odd cases then sure.

@jreback jreback added this to the Next Major Release milestone Oct 6, 2016
@mroeschke mroeschke added the Bug label Oct 22, 2019
@sshankar-rks

sshankar-rks commented May 3, 2020

Hit this today. I downloaded a bunch of JSON files from somewhere, and it turns out some of them have this pattern. While I can work around this by cleaning the data manually, I believe this should work as expected (skip null rows). In case you're interested, my full repro sample is below:

import json

import pandas as pd


def is_json(myjson):
    """Return True if `myjson` parses as JSON, otherwise False."""
    try:
        json.loads(myjson)
    except ValueError:
        return False
    return True


pd.show_versions()

# Control case: a list of objects with no null entries.
test_json_without_null = '[{"text":"one", "number":1 }, {"text":"two", "number":2 }]'
print("test_json_without_null is valid: {}".format(is_json(test_json_without_null)))
df2 = pd.read_json(path_or_buf=test_json_without_null)
print("Read test_json_without_null successfully")

# Problem case: the same data with a null entry between the objects.
test_json_with_null = '[{"text":"one", "number":1 }, null, {"text":"two", "number":2 }]'
print("test_json_with_null is valid: {}".format(is_json(test_json_with_null)))
df2 = pd.read_json(path_or_buf=test_json_with_null)  # the read that hits this issue
print("Read test_json_with_null successfully")
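
The manual cleanup mentioned above can be sketched as follows: parse with the standard `json` module, drop top-level null entries, and only then build the frame. `load_records_dropping_nulls` is a hypothetical helper name, not part of pandas, and dropping the rows (rather than keeping them as empty rows) is an assumption about the desired behavior.

```python
import json


def load_records_dropping_nulls(text):
    """Parse a JSON array and drop top-level null entries (hypothetical helper)."""
    parsed = json.loads(text)
    if not isinstance(parsed, list):
        raise ValueError("expected a JSON array at the top level")
    return [row for row in parsed if row is not None]


test_json_with_null = '[{"text":"one", "number":1 }, null, {"text":"two", "number":2 }]'
records = load_records_dropping_nulls(test_json_with_null)
```

The resulting list of dicts can then be passed to `pd.DataFrame(records)` instead of calling `pd.read_json` on the raw string.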

5 participants