-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
ENH: GH14883: json_normalize now takes a user-specified separator #14950
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
you don't need to open/close new PR's. simply push new commits to this one. |
doc/source/whatsnew/v0.20.0.txt
Outdated
@@ -109,6 +109,7 @@ Other enhancements | |||
- ``pandas.io.json.json_normalize()`` gained the option ``errors='ignore'|'raise'``; the default is ``errors='raise'`` which is backward compatible. (:issue:`14583`) | |||
|
|||
- ``.select_dtypes()`` now allows the string 'datetimetz' to generically select datetimes with tz (:issue:`14910`) | |||
+- ``pandas.io.json.json_normalize()`` gained ``sep`` option that accepts ``str``, default is ".", which is backward compatible. (:issue:`14883`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a sep
option
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you don't need to say accepts 'str'
, explain the purpose instead.
pandas/io/json.py
Outdated
errors='raise'): | ||
|
||
errors='raise', | ||
sep='.'): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can put this before errors
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jreback Comment on the previous pull request indicated that I should put this last in order to not break any current code that uses positional arguments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah but I don't think its a big deal and a bit cleaner this way
@@ -63,6 +63,21 @@ def test_simple_normalize(self): | |||
|
|||
tm.assert_frame_equal(result, expected) | |||
|
|||
def test_simple_normalize_with_default_separator(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add the issue number as a comment
you can make this all 1 test (with 3 sub-tests).
@jreback Thank you for all the kind comments! Um, can you take a minute or three to look at the actual code in json_normalize and clue me into where I'm going wrong? ('Cause it doesn't work.) |
@jowens no idea, you'll have to step thru. my guess is that |
can you update |
I have no update, sorry to say. Someone who understands this code better is gonna be necessary to make this fix, I'm afraid. |
@jowens I pushed a commit here. 3 issues.
I think this might benefit from a deeper nested example as well. |
Codecov Report
@@ Coverage Diff @@
## master #14950 +/- ##
==========================================
- Coverage 90.99% 90.97% -0.03%
==========================================
Files 143 143
Lines 49420 49422 +2
==========================================
- Hits 44972 44963 -9
- Misses 4448 4459 +11
Continue to review full report at Codecov.
|
@jowens can you update? (for the questions I had above) |
Is the update you want "write a better test for 2 and 3, and add a deeper example"? |
yep |
can you rebase / add those addtl tests? |
I'd like to, I just haven't had the opportunity to write them yet. Sorry. |
@jowens np. just moving PR's along. pls ping when you are ready for review. |
closes pandas-dev#14883 Author: Jeff Reback <[email protected]> Author: John Owens <[email protected]> Closes pandas-dev#14950 from jowens/json_normalize-separator and squashes the following commits: 0327dd1 [Jeff Reback] compare sorted columns bc5aae8 [Jeff Reback] CLN: fixup json_normalize with sep 8edc40e [John Owens] ENH: json_normalize now takes a user-specified separator
test_simple_normalize_with_{default, user_specified, user_specified_unicode}_separator
)git diff upstream/master | flake8 --diff
However, this doesn't work, even after making the fixes suggested in #14891. I thought replacing
'.'
inmeta_keys = ['.'.join(val) for val in meta]
withsep
would do the trick. It doesn't, so I'm a little puzzled. Happy to take a suggestion.