
to_json on multiindex does not escape strings properly #15273

Closed
rs2 opened this issue Jan 31, 2017 · 3 comments · Fixed by #41493
Labels: good first issue, Needs Tests (unit test(s) needed to prevent regressions)

Comments

@rs2
Contributor

rs2 commented Jan 31, 2017

Code Sample, a copy-pastable example if possible

import pandas as pd
import datetime

pd.read_json(pd.DataFrame(True, index=pd.date_range(datetime.date(2017, 1, 20), datetime.date(2017, 1, 23)), columns=['foo', 'bar']).stack().to_json())

Problem description

The result of pd.DataFrame(True, index=pd.date_range(datetime.date(2017, 1, 20), datetime.date(2017, 1, 23)), columns=['foo', 'bar']).stack().to_json() contains keys whose inner quotes are not escaped:

r'{"[1484870400000,"foo"]":true,"[1484870400000,"bar"]"'

Expected Output

Whereas it should be:
r'{"[1484870400000,\"foo\"]":true,"[1484870400000,\"bar\"]"'
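To make the difference concrete, here is a minimal sketch using only the standard library json module. The two strings are shortened, closed-off versions of the fragments above (one key each), so they are illustrative rather than the exact to_json output; is_valid_json is a hypothetical helper name.

```python
import json

# Shortened, closed-off versions of the two fragments from the report.
# The first has bare quotes inside the key; the second escapes them.
unescaped = r'{"[1484870400000,"foo"]":true}'
escaped = r'{"[1484870400000,\"foo\"]":true}'


def is_valid_json(text):
    """Return True if `text` parses as JSON, False otherwise."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


unescaped_ok = is_valid_json(unescaped)  # parser stops at the bare quote before foo
escaped_ok = is_valid_json(escaped)

# With escaping, the key survives as a string that is itself a JSON array.
key = next(iter(json.loads(escaped)))
decoded_key = json.loads(key)

print(unescaped_ok, escaped_ok, decoded_key)
```

The unescaped form ends the key string early at the quote before foo, which is why read_json cannot round-trip it.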

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.4.5.final.0
python-bits: 32
OS: Windows
machine: AMD64
LC_ALL: None
LANG: None
LOCALE: None.None

pandas: 0.19.1
Cython: 0.25.1
numpy: 1.11.2
pandas_datareader: None

@jreback
Contributor

jreback commented Jan 31, 2017

So this does seem wrong for the default of orient='columns', though orient='split' is OK.

In [18]: df.to_json(orient='split')
Out[18]: '{"name":null,"index":[[1484870400000,"foo"],[1484870400000,"bar"],[1484956800000,"foo"],[1484956800000,"bar"],[1485043200000,"foo"],[1485043200000,"bar"],[1485129600000,"foo"],[1485129600000,"bar"]],"data":[true,true,true,true,true,true,true,true]}'

In [19]: df.to_json(orient='columns')
Out[19]: '{"[1484870400000,"foo"]":true,"[1484870400000,"bar"]":true,"[1484956800000,"foo"]":true,"[1484956800000,"bar"]":true,"[1485043200000,"foo"]":true,"[1485043200000,"bar"]":true,"[1485129600000,"foo"]":true,"[1485129600000,"bar"]":true}'
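As a rough check of why orient='split' is fine, the Out[18] string can be parsed with the standard library alone. This sketch only inspects that literal string; it does not call pandas, and the variable names are illustrative.

```python
import json
from datetime import datetime, timezone

# The orient='split' output quoted above, copied verbatim.
split_json = (
    '{"name":null,"index":[[1484870400000,"foo"],[1484870400000,"bar"],'
    '[1484956800000,"foo"],[1484956800000,"bar"],[1485043200000,"foo"],'
    '[1485043200000,"bar"],[1485129600000,"foo"],[1485129600000,"bar"]],'
    '"data":[true,true,true,true,true,true,true,true]}'
)

payload = json.loads(split_json)

# Each index entry is a real JSON array [epoch_ms, label], not a quoted string,
# so no escaping is needed and the MultiIndex levels stay separate.
pairs = [
    (datetime.fromtimestamp(ms / 1000, tz=timezone.utc).date().isoformat(), label)
    for ms, label in payload["index"]
]

print(pairs[0], pairs[-1], len(payload["data"]))
```

Because the index tuples are emitted as nested arrays rather than as object keys, the split layout avoids the quoting problem entirely.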

@jreback added the Bug, IO JSON (read_json, to_json, json_normalize), MultiIndex, and Difficulty Intermediate labels on Jan 31, 2017
@jreback added this to the Next Major Release milestone on Jan 31, 2017
@jreback
Contributor

jreback commented Jan 31, 2017

cc @kawochen
cc @Komnomnomnom

Pull requests to fix this are welcome!

@mroeschke
Member

This looks correct on master now. It could use a test.

In [39]: pd.DataFrame(True, index=pd.date_range(datetime.date(2017, 1, 20), datetime.date(2017, 1, 23)), columns=
    ...: ['foo', 'bar']).stack().to_json()
Out[39]: '{"(Timestamp(\'2017-01-20 00:00:00\'), \'foo\')":true,"(Timestamp(\'2017-01-20 00:00:00\'), \'bar\')":true,"(Timestamp(\'2017-01-21 00:00:00\'), \'foo\')":true,"(Timestamp(\'2017-01-21 00:00:00\'), \'bar\')":true,"(Timestamp(\'2017-01-22 00:00:00\'), \'foo\')":true,"(Timestamp(\'2017-01-22 00:00:00\'), \'bar\')":true,"(Timestamp(\'2017-01-23 00:00:00\'), \'foo\')":true,"(Timestamp(\'2017-01-23 00:00:00\'), \'bar\')":true}'
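A regression test along these lines might look like the sketch below. It assumes a pandas version in which the behaviour shown above holds, and it checks only that the output parses and has the expected shape rather than pinning the exact key text, which is an assumption of this sketch and not necessarily how the merged test was written. Some pandas versions may emit a FutureWarning for stack(); the name stacked is illustrative.

```python
import json

import pandas as pd

# Same data as the original report: a boolean frame stacked into a
# Series with a (date, column name) MultiIndex.
stacked = pd.DataFrame(
    True,
    index=pd.date_range("2017-01-20", "2017-01-23"),
    columns=["foo", "bar"],
).stack()

result = stacked.to_json()

# The core regression check: the output must be parseable JSON.
parsed = json.loads(result)

# 4 dates x 2 columns gives 8 entries, all true.
print(len(parsed), all(value is True for value in parsed.values()))
```

Asserting on json.loads succeeding keeps the check focused on the escaping bug itself, independent of how the index keys are rendered.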

@mroeschke added the good first issue and Needs Tests (unit test(s) needed to prevent regressions) labels and removed the Bug, IO JSON (read_json, to_json, json_normalize), and MultiIndex labels on May 8, 2021
@mroeschke modified the milestones: Contributions Welcome, 1.3 on May 16, 2021