pd.read_json and orient="index" sorts results #28557

Closed
WillAyd opened this issue Sep 21, 2019 · 4 comments · Fixed by #28606
Labels
IO JSON read_json, to_json, json_normalize

Comments

@WillAyd
Member

WillAyd commented Sep 21, 2019

Note that the order of input is not maintained with orient="index"

>>> df = pd.DataFrame(range(3), index=range(3, 0, -1))
>>> df
   0
3  0
2  1
1  2
>>> df.to_json(orient="index")
'{"3":{"0":0},"2":{"0":1},"1":{"0":2}}'
>>> pd.read_json(df.to_json(orient="index"), orient="index")
   0
1  2
2  1
3  0

This is not necessarily a problem, since JSON objects are unordered by definition, but preserving the order here would be a logical follow-up to #27309.
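For context on why preserving order is feasible at all, here is a minimal sketch using only the standard library `json` module (not the parser that `read_json` uses internally). It assumes Python 3.7+, where plain dicts keep insertion order, and shows that the key order of the string above survives parsing, so any reordering comes from an explicit sort:

```python
import json

# The JSON text produced by df.to_json(orient="index") in the example above.
payload = '{"3":{"0":0},"2":{"0":1},"1":{"0":2}}'

# json.loads builds plain dicts; on Python 3.7+ they keep the order in which
# keys appear in the input text.
parsed = json.loads(payload)
labels = list(parsed)

print(labels)          # ['3', '2', '1']  (input order)
print(sorted(labels))  # ['1', '2', '3']  (what an explicit sort produces)
```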

@WillAyd WillAyd added the IO JSON read_json, to_json, json_normalize label Sep 21, 2019
@KimDoubleB
Contributor

KimDoubleB commented Sep 21, 2019

Hi!
If this issue is not assigned, I would like to make a PR for this :)

This is the result you expected, right?

...
>>> pd.read_json(df.to_json(orient="index"), orient="index")
   0
3  0
2  1
1  2

@WillAyd
Member Author

WillAyd commented Sep 23, 2019

Yes, that's correct, at least for Py36 and above. The line where this is happening is here:

elif orient == "index":

So for now look in pandas.compat for the PY35 directive and only sort for that. Py36 and above won't need extra handling.

We'll eventually be ripping out Py35 by the time this gets released, but it's good to do it that way so the subsequent cleanup is easily recognizable.
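As a rough illustration of the "only sort on Py35" idea, here is a sketch in plain Python. `IS_PY35` and `row_labels` are hypothetical names defined locally for this example; they are not the pandas.compat flag or the actual read_json code path:

```python
import sys

# Assumption: a local stand-in for a PY35-style flag, not imported from pandas.
IS_PY35 = sys.version_info[:2] == (3, 5)


def row_labels(parsed, legacy_sort=IS_PY35):
    """Return the row labels of an orient="index"-style mapping.

    Hypothetical helper: sorts only when legacy_sort is True, i.e. on
    interpreters where dict order cannot be relied upon.
    """
    labels = list(parsed)
    if legacy_sort:
        labels = sorted(labels)
    return labels


records = {"3": {"0": 0}, "2": {"0": 1}, "1": {"0": 2}}
print(row_labels(records, legacy_sort=False))  # ['3', '2', '1']
print(row_labels(records, legacy_sort=True))   # ['1', '2', '3']
```

Keeping the sort behind a single flag means the later cleanup amounts to deleting the flag and the branch that uses it.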

@WillAyd WillAyd added this to the Contributions Welcome milestone Sep 23, 2019
@KimDoubleB
Contributor

I have completed the code modification to resolve the issue you mentioned. However, an error occurs because the existing test file is checking the sorting at Py36 and above.

if not numpy and (orient == "index" or (PY35 and orient == "columns")):
    # TODO: debug why sort is required
    expected = expected.sort_index()

Therefore, can I modify the test file to solve the errors, and then move on to PR?

@WillAyd
Member Author

WillAyd commented Sep 24, 2019

Yep, the test will need to change to only sort on Py35.
