Using to_json/read_json with orient='table' on a DataFrame with a single level MultiIndex does not work #29928

larrymouse · 2019-11-29T17:36:52Z

Code Sample, a copy-pastable example if possible

import pandas

#Scenario1
df=pandas.DataFrame(data=[[1,1],[2,2],[3,3],[4,4]],columns=['A','B'], index=pandas.MultiIndex.from_tuples([(1,),(2,),(3,),(4,)]))
js=df.to_json()
pandas.read_json(js) #ok
js=df.to_json(orient='table')
new_df=pandas.read_json(js, orient='table') # this runs but creates NaN indexes

#Scenario2
df=pandas.DataFrame(data=[[1,1],[2,2],[3,3],[4,4]],columns=['A','B'], index=pandas.MultiIndex.from_tuples([(1,),(2,),(3,),(4,)], names=['ind1']))
js=df.to_json()
pandas.read_json(js) #ok
js=df.to_json(orient='table')
try:
    new_df==pandas.read_json(js, orient='table')  # this fails with KeyError: 'None of [None] are in the columns'
except KeyError as e:
    pass

#Scenario3
df=pandas.DataFrame(data=[[1,1],[2,2],[3,3],[4,4]],columns=['A','B'], index=pandas.MultiIndex.from_tuples([(1,1),(1,2),(2,1),(2,2)], names=['ind1', 'ind2']))
js=df.to_json()
pandas.read_json(js) #ok
js=df.to_json(orient='table')
new_df=pandas.read_json(js, orient='table')  # this works as is is multi level multiindex

#Scenario4
df=pandas.DataFrame(data=[[1,1],[2,2],[3,3],[4,4]],columns=['A','B'], index=pandas.MultiIndex.from_tuples([(1,),(2,),(3,),(4,)]))
df.index=df.index.to_flat_index() # a workaround?
js=df.to_json()
pandas.read_json(js) #ok
js=df.to_json(orient='table')
new_df=pandas.read_json(js, orient='table') # This is a workaround that does produce sensible results

Problem description

I have a DataFrame that has a MultiIndex, even though that index only has one level and could easily be a normal index, I am receiving this table from ipyaggrid, so am not in control of that.

Saving this table to json with df.to_json(orient='table) and then loading from pandas.read_json(js, orient='table) leads to either an exception or the index being changed to NaN values.

Neither of these outcomes seems desirable or correct.

I have a workaround to intercept DataFrames with this feature and convert their index so a flat Index, but it would be good if the table serialisation just worked.

I was drawn to orient='Table' by the comments on this similar issue, my main requirement here is to save/load the DataFrame preserving its row and column order. You don't get this with the standard to_json/from_json modes
#4889

Expected Output

new_df for scenario1 and scenario2 to should have a MultiIndex, in the first scenario with a level that has no name and in the second scenario with level called 'ind1'

Output of `pd.show_versions()`

pandas.show_versions()
INSTALLED VERSIONS

commit : None
python : 3.6.7.final.0
python-bits : 64
OS : Windows
OS-release : 7
machine : AMD64
processor : Intel64 Family 6 Model 62 Stepping 4, GenuineIntel
byteorder : little
LC_ALL : None
LANG : None
LOCALE : None.None
pandas : 0.25.0
numpy : 1.17.0
pytz : 2019.2
dateutil : 2.8.0
pip : 10.0.1
setuptools : 39.1.0
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 2.10.3
IPython : 7.9.0
pandas_datareader: None
bs4 : None
bottleneck : None
fastparquet : None
gcsfs : None
lxml.etree : None
matplotlib : 3.1.1
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pytables : None
s3fs : None
scipy : 1.3.1
sqlalchemy : None
tables : None
xarray : None
xlrd : 1.2.0
xlwt : None
xlsxwriter : None

The text was updated successfully, but these errors were encountered:

janheindejong · 2020-04-17T07:41:49Z

I guess I have the same problem.

Creating two DFs, one with multiindex, and one without:

import pandas as pd
import numpy as np
from datetime import datetime 

index = [datetime(2000, 1, 1, 0, 0), datetime(2000, 1, 1, 0, 1), datetime(2000, 1, 1, 0, 2)]
arrays = [['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux'],
          ['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two']]

tuples = list(zip(*arrays))
columns_mi = pd.MultiIndex.from_tuples(tuples, names=['first', 'second'])
columns = ["A", "B", "C", "D", "E", "F", "G", "H"]

df = pd.DataFrame(data=np.random.randn(3, 8), index=index, columns=columns)
df_mi = pd.DataFrame(data=np.random.randn(3, 8), index=index, columns=columns_mi)

Converting the 'normal' DataFrame to json and back works fine (both in split and table orientation):

s = df.to_json(orient="split")
pd.read_json(s, orient="split")
s = df.to_json(orient="table")
pd.read_json(s, orient="table")

But doing the same with the multiindex one doesn't.

s = df_mi.to_json(orient="split")
pd.read_json(s, orient="split")

Raises: TypeError: <class 'tuple'> is not convertible to datetime

s = df_mi.to_json(orient="table")
pd.read_json(s, orient="table")

Raises: NotImplementedError: orient='table' is not supported for MultiIndex.

I'm trying to send multiindexed dataframes over HTTP; any suggestions on how I can do that?

…e level MultiIndex. Index field name in written json was incorrect, so applying read_json resulted in NaN index values. Dataframe to_json with 'table' orient now treats single level MultiIndex like single index.

…e level MultiIndex. Index field name in written json was incorrect, so applying read_json resulted in NaN index values. Dataframe to_json with 'table' orient now treats single level MultiIndex like single Index.

LucasG0 · 2020-05-26T15:10:32Z

I submitted a PR for the author issue.

About janheindejong issues,

Raises: TypeError: <class 'tuple'> is not convertible to datetime

This may be a dupe of #4889, MultiIndex Dataframes and Series do not seem to be handled by read_json(orient='split').

NotImplementedError: orient='table' is not supported for MultiIndex

orient='table' does not support MultiIndex columns on Dataframes.

…e level MultiIndex. Index field name in written json was incorrect, so applying read_json resulted in NaN index values. Dataframe to_json with 'table' orient now treats single level MultiIndex like single Index.

Interesting6 · 2023-11-20T01:32:28Z

Same question, I'm trying to send multiindexed dataframes over HTTP.

When i serialise multi-index dataframe to an json by split or table orient, tuple inside the index would be converted to list. But when i try to deserialise it, it raise TypeError: unhashable type: 'list' ;

Any suggestions on how I can do that?

developer992 · 2024-03-13T08:25:37Z

it's year 2024 and there is still no generic support to dump dataframe to json and get it back ?

NotImplementedError: orient='table' is not supported for MultiIndex columns

jbrockmendel added the IO JSON read_json, to_json, json_normalize label Nov 30, 2019

mroeschke added Bug MultiIndex labels May 8, 2020

LucasG0 mentioned this issue May 26, 2020

BUG:#29928 Fix to_json output 'table' orient for single level MultiIndex. #34375

Closed

5 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Using to_json/read_json with orient='table' on a DataFrame with a single level MultiIndex does not work #29928

Using to_json/read_json with orient='table' on a DataFrame with a single level MultiIndex does not work #29928

larrymouse commented Nov 29, 2019 •

edited

Loading

pandas.show_versions()
INSTALLED VERSIONS

janheindejong commented Apr 17, 2020

LucasG0 commented May 26, 2020

Interesting6 commented Nov 20, 2023 •

edited

Loading

developer992 commented Mar 13, 2024 •

edited

Loading

Using to_json/read_json with orient='table' on a DataFrame with a single level MultiIndex does not work #29928

Using to_json/read_json with orient='table' on a DataFrame with a single level MultiIndex does not work #29928

Comments

larrymouse commented Nov 29, 2019 • edited Loading

Code Sample, a copy-pastable example if possible

Problem description

Expected Output

Output of pd.show_versions()

pandas.show_versions() INSTALLED VERSIONS

janheindejong commented Apr 17, 2020

LucasG0 commented May 26, 2020

Interesting6 commented Nov 20, 2023 • edited Loading

developer992 commented Mar 13, 2024 • edited Loading

larrymouse commented Nov 29, 2019 •

edited

Loading

Output of `pd.show_versions()`

pandas.show_versions()
INSTALLED VERSIONS

Interesting6 commented Nov 20, 2023 •

edited

Loading

developer992 commented Mar 13, 2024 •

edited

Loading