ENH: json_normalize should allow a different separator than . #14883
Comments
The dot notation for accessing columns is just a convenience. You can still get the column normally using
I understand. I'm merely offering the observation that columns with
Sure thing. Just wanted to point that out in case it was keeping you from doing something you needed to do now since you said that you were new to pandas.
Yeah, it's the vega-lite bug I filed (vega/vega-lite#1775) that's my proximate difficulty here.
this would be quite easy to add; PR's welcome.
closes pandas-dev#14883
Author: Jeff Reback <[email protected]>
Author: John Owens <[email protected]>
Closes pandas-dev#14950 from jowens/json_normalize-separator and squashes the following commits:
0327dd1 [Jeff Reback] compare sorted columns
bc5aae8 [Jeff Reback] CLN: fixup json_normalize with sep
8edc40e [John Owens] ENH: json_normalize now takes a user-specified separator
Problem description
The above snippet shows that it's not ideal to have `.` as a character in a column name. (I'm running into this when using Vega for data visualization, vega/vega-lite#1775.) When `json_normalize` flattens a nested input JSON, it separates the nesting levels with a `.`. I believe this happens on this line: pandas/pandas/io/json.py, line 831 in 7d8bc0d.

I'd like to see an additional argument to `json_normalize`, `separator`, with default `.`, that specifies the character (string) that separates nesting levels. In the line of code above, `'.'.join(val)` would be replaced by `separator.join(val)` (if I'm reading what that line does correctly). I could use, say, `_` to use underscore instead of period.

n00b at pandas, please correct me if I'm doing anything wrong.
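
To make the request concrete, here is a minimal pure-Python sketch of the idea. It is not the pandas implementation: `flatten`, its `separator` argument, and the sample record are hypothetical names chosen for illustration. It only shows how joining the key path with a caller-supplied string instead of a hard-coded `'.'` changes the resulting column names.

```python
def flatten(record, separator=".", _prefix=()):
    """Flatten nested dicts, joining each key path with `separator`.

    Illustrative only; this is an assumption about the general technique,
    not a copy of what json_normalize does internally.
    """
    flat = {}
    for key, value in record.items():
        path = _prefix + (str(key),)
        if isinstance(value, dict):
            # Recurse into nested dicts, carrying the key path along.
            flat.update(flatten(value, separator, path))
        else:
            # The one place the separator matters: building the flat name.
            flat[separator.join(path)] = value
    return flat


record = {"id": 1, "name": {"first": "Ada", "last": "Lovelace"}}

print(flatten(record))                 # keys: id, name.first, name.last
print(flatten(record, separator="_"))  # keys: id, name_first, name_last
```

With `separator="_"` the flattened names contain no period, which is what would sidestep the Vega problem described above.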
Expected Output
Output of `pd.show_versions()`
INSTALLED VERSIONS
commit: None
python: 2.7.12.final.0
python-bits: 64
OS: Darwin
OS-release: 16.3.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.US-ASCII
LOCALE: None.None
pandas: 0.19.1
nose: 1.3.7
pip: 9.0.1
setuptools: 30.3.0
Cython: 0.25.2
numpy: 1.11.2
scipy: 0.18.1
statsmodels: None
xarray: None
IPython: 5.1.0
sphinx: 1.5.1
patsy: None
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: 1.2.0
tables: 3.2.3.1
numexpr: 2.6.1
matplotlib: 1.5.3
openpyxl: 2.4.1
xlrd: 1.0.0
xlwt: None
xlsxwriter: None
lxml: 3.6.0
bs4: 4.5.1
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.8
boto: None
pandas_datareader: None