-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
BUG:#29928 Fix to_json output 'table' orient for single level MultiIndex. #34375
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
pandas/io/json/_json.py
Outdated
@@ -286,6 +286,9 @@ def __init__( | |||
) | |||
raise ValueError(msg) | |||
|
|||
if obj.index.nlevels == 1 and isinstance(obj.index, MultiIndex): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you do this in build_table_schema instead
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure
is there an associated issue? |
Yes, #29928 |
pandas/io/json/_table_schema.py
Outdated
@@ -230,6 +230,10 @@ def build_table_schema(data, index=True, primary_key=None, version=True): | |||
'pandas_version': '0.20.0', | |||
'primaryKey': ['idx']} | |||
""" | |||
|
|||
if data.index.nlevels == 1 and isinstance(data.index, MultiIndex): | |||
data.index = data.index.get_level_values(0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is data
copied somewhere or is this mutating the frame that the user supplies?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
data
is indeed modified, thanks.
The simplest solution is to make a copy of data
in this particular case.
However, a copy of data
is made after the call of build_table_schema
.
If we definitely want to avoid the copy, it is possible to make it by adding modifications in build_table_schema
, and replacing index in constructor of JSONTableWriter
once the copy is created.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's try that for now. We usually want to avoid extra copies
@@ -435,6 +435,23 @@ def test_to_json_categorical_index(self): | |||
|
|||
assert result == expected | |||
|
|||
@pytest.mark.parametrize("name", [None, "foo"]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you replicate all of the tests in the OP (there are 4 cases)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I provided tests for cases 1 and 2 in the OP.
Case 3 is the standard behavior of orient='table'
on MultiIndex with several levels, which is tested in test_read_json_table_orient
.
Case 4 is a kind of workaround for single level MultiIndex.
) | ||
result = df.to_json(orient="table") | ||
|
||
assert result == expected |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do these fully round-trip? can you test that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They should, I will complete tests.
doc/source/whatsnew/v1.1.0.rst
Outdated
@@ -951,9 +951,12 @@ I/O | |||
- :func:`pandas.read_hdf` has a more explicit error message when loading an | |||
unsupported HDF file (:issue:`9539`) | |||
- Bug in :meth:`~DataFrame.read_feather` was raising an `ArrowIOError` when reading an s3 or http file path (:issue:`29055`) | |||
- Bug in :meth:`read_parquet` was raising a ``FileNotFoundError`` when passed an s3 directory path. (:issue:`26388`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don’t think mean to include this line or the next
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In deed, I will remove it.
98df8b7
to
689d50c
Compare
doc/source/whatsnew/v1.1.0.rst
Outdated
@@ -956,6 +956,7 @@ I/O | |||
- Bug in :meth:`~DataFrame.to_excel` could not handle the column name `render` and was raising an ``KeyError`` (:issue:`34331`) | |||
- Bug in :meth:`~SQLDatabase.execute` was raising a ``ProgrammingError`` for some DB-API drivers when the SQL statement contained the `%` character and no parameters were present (:issue:`34211`) | |||
- Bug in :meth:`~pandas.io.stata.StataReader` which resulted in categorical variables with difference dtypes when reading data using an iterator. (:issue:`31544`) | |||
- Bug in :meth:`~DataFrame.to_json` with 'table' orient was writting wrong index field name for MultiIndex Dataframe with a single level. (:issue:`29928`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
writting -> writing
Considering I have no news on this PR, is it suitable, or is there anything I can do to improve it ? |
@LucasG0 can you fix merge conflicts and move note to 1.2? Someone will take a look again after that |
b079e0e
to
d9ac451
Compare
@WillAyd Done, I can restore test for timedeltas if needed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm though I think should merge master and repush to fix CI. @jreback
doc/source/whatsnew/v1.2.0.rst
Outdated
@@ -284,6 +284,7 @@ MultiIndex | |||
^^^^^^^^^^ | |||
|
|||
- Bug in :meth:`DataFrame.xs` when used with :class:`IndexSlice` raises ``TypeError`` with message `Expected label or tuple of labels` (:issue:`35301`) | |||
- Bug in :meth:`~DataFrame.to_json` with 'table' orient was writting wrong index field name for MultiIndex Dataframe with a single level (:issue:`29928`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually can you move this to the I/O section?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, done
d8bbabd
to
5d43564
Compare
lgtm @jreback |
cbbe8de
to
350be02
Compare
…e level MultiIndex. Index field name in written json was incorrect, so applying read_json resulted in NaN index values. Dataframe to_json with 'table' orient now treats single level MultiIndex like single Index.
@jreback maybe you have an opinion on this ? :) |
Thanks I will check it. |
This pull request is stale because it has been open for thirty days with no activity. Please update or respond to this comment if you're still interested in working on this. |
closing as stale. if you want to continue, pls ping and can re-open. |
dataframe.to_json()
was writing incorrect index field name, so applying read_json resulted in NaN index values.dataframe.to_json()
now converts single level MultiIndex into single Index before encoding.black pandas
git diff upstream/master -u -- "*.py" | flake8 --diff