-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
Added errors{'raise','ignore'} for keys not found in meta for json_normalize #14583
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added errors{'raise','ignore'} for keys not found in meta for json_normalize #14583
Conversation
Current coverage is 85.26% (diff: 100%)@@ master #14583 diff @@
==========================================
Files 144 144
Lines 50980 50985 +5
Methods 0 0
Messages 0 0
Branches 0 0
==========================================
+ Hits 43470 43474 +4
- Misses 7510 7511 +1
Partials 0 0
|
Cleaning it up, should be better now. Is there a new whatsnew file so I can add the comments? What's the next version number? |
Why is nobody merging this? |
@@ -740,6 +742,8 @@ def json_normalize(data, record_path=None, meta=None, | |||
If True, prefix records with dotted (?) path, e.g. foo.bar.field if | |||
path to records is ['foo', 'bar'] | |||
meta_prefix : string, default None | |||
error: {'raise', 'ignore'}, default 'raise' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
need a version in 0.20.0 tag
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
need a version added tag
@@ -80,3 +80,4 @@ Performance Improvements | |||
|
|||
Bug Fixes | |||
~~~~~~~~~ | |||
- Enhancement in ``pandas.io.json.json_normalize``If meta keys are not always present a new option to set errors='ignore' has been implemented (:issue:`14505`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can put in enhancements section
@@ -80,3 +80,4 @@ Performance Improvements | |||
|
|||
Bug Fixes | |||
~~~~~~~~~ | |||
- Enhancement in ``pandas.io.json.json_normalize``If meta keys are not always present a new option to set errors='ignore' has been implemented (:issue:`14505`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
space after quotes and before If. put errors='ignore'
in double-backticks.
meta_val = np.nan | ||
else: | ||
raise KeyError( | ||
"Try running with errors='ignore' as the following key may not always be present: " + str(e)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this won't pass linting
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not? What do I need to change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We check for PEP8, so this line will be too long. Normally you can set your editor to check for those things, but can also see the output of the check on travis (3d build): https://travis-ci.org/pandas-dev/pandas/jobs/174180562 (at the bottom)
pandas/io/json.py:746:80: E501 line too long (85 > 79 characters)
pandas/io/json.py:853:80: E501 line too long (129 > 79 characters)
pandas/io/tests/json/test_json_norm.py:229:5: E303 too many blank lines (2)
pandas/io/tests/json/test_json_norm.py:271:80: E501 line too long (81 > 79 characters)
pandas/io/tests/json/test_json_norm.py:272:80: E501 line too long (104 > 79 characters)
pandas/io/tests/json/test_json_norm.py:273:17: E225 missing whitespace around operator
pandas/io/tests/json/test_json_norm.py:274:10: E128 continuation line under-indented for visual indent
pandas/io/tests/json/test_json_norm.py:275:10: E128 continuation line under-indented for visual indent
pandas/io/tests/json/test_json_norm.py:276:10: E128 continuation line under-indented for visual indent
pandas/io/tests/json/test_json_norm.py:277:10: E128 continuation line under-indented for visual indent
@@ -225,6 +225,59 @@ def test_nested_flattens(self): | |||
|
|||
self.assertEqual(result, expected) | |||
|
|||
|
|||
def test_json_normalise_fix(self): | |||
# issue 14505 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
put a comment explaining what this is. rename test -> test_json_normalize_errors
'price': {0: '0', 1: '0', 2: '0', 3: '0'}, | ||
'symbol': {0: 'AAPL', 1: 'GOOG', 2: 'AAPL', 3: 'GOOG'}} | ||
|
||
self.assertEqual(j.fillna('').to_dict(), expected) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
test the same with errors='raise'
@dickreuter we have 107 open PR's. very few people do code review. simply ping if you want someone to look. |
@dickreuter just so you know: we don't get notifications when you push new code, only from comments. So it's always good to put a comment after you pushed new code to indicated that you updated. And as @jreback says, pinging like you now did always helps |
Ok great. Please have a look. I think this can now be merged. |
@dickreuter Did you push the changes? As there is still a merge conflict and there is no new commit since @jreback 's comments |
Just pushed the changes as instructed. |
Anything else I need to fix in this or can this be merged? |
Just did another rebase and pushed again. Can this be merged now? thanks |
@@ -49,6 +49,7 @@ Other enhancements | |||
^^^^^^^^^^^^^^^^^^ | |||
|
|||
- ``pd.read_excel`` now preserves sheet order when using ``sheetname=None`` (:issue:`9930`) | |||
- ``pandas.io.json.json_normalize`` If meta keys are not always present a new option to set errors="ignore" (:issue:`14583`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In pandas.io.json.json_normalize
gained the option errors='ignore'|raise
(in double-backticks); the default is raise
and is backward compatible.
@@ -104,4 +105,4 @@ Performance Improvements | |||
.. _whatsnew_0200.bug_fixes: | |||
|
|||
Bug Fixes | |||
~~~~~~~~~ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
revert this
@@ -740,6 +742,8 @@ def json_normalize(data, record_path=None, meta=None, | |||
If True, prefix records with dotted (?) path, e.g. foo.bar.field if | |||
path to records is ['foo', 'bar'] | |||
meta_prefix : string, default None | |||
error: {'raise', 'ignore'}, default 'raise' | |||
* ignore: will ignore keyErrors if keys listed in meta are not always present |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add an entry for raise
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will ignore KeyError
, if the keys passed in meta are not all present
meta_val = np.nan | ||
else: | ||
raise \ | ||
KeyError( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
better comment please
@@ -225,6 +225,65 @@ def test_nested_flattens(self): | |||
|
|||
self.assertEqual(result, expected) | |||
|
|||
|
|||
def test_json_normalize_errors(self): | |||
# If meta keys are not always present a new option to set errors='ignore' has been implemented (:issue:`14583`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
GH14583
just some doc comments, otherwise lgtm. |
Implemented doc amendments as suggested. Let me know if anything else needs to be changed. thanks |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some PEP8 style issues, otherwise looks good!
@@ -740,6 +742,9 @@ def json_normalize(data, record_path=None, meta=None, | |||
If True, prefix records with dotted (?) path, e.g. foo.bar.field if | |||
path to records is ['foo', 'bar'] | |||
meta_prefix : string, default None | |||
error: {'raise', 'ignore'}, default 'raise' | |||
* ignore: will ignore KeyError if keys listed in meta are not always present | |||
* raise: will raise KeyError if keys listed in meta are not always present |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suspect those two lines above are too long (> 80, we check for PEP8, which will let travis fail)
if errors == 'ignore': | ||
meta_val = np.nan | ||
else: | ||
raise KeyError("Try running with errors='ignore' as key %s is not always present.", e) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here, for PEP8 line too long
@@ -225,6 +225,65 @@ def test_nested_flattens(self): | |||
|
|||
self.assertEqual(result, expected) | |||
|
|||
|
|||
def test_json_normalize_errors(self): | |||
# GH14583: If meta keys are not always present a new option to set errors='ignore' has been implemented |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here and in other places below line length is also a problem
@dickreuter You still have some style issues (that's the reason that travis is failing). You can always look in the log of the third travis build to see the issues (https://travis-ci.org/pandas-dev/pandas/jobs/182143127, scroll to the bottom):
(it also a good idea to set up your IDE to warn for this) |
I believe this is fixed now. Some checks are still not successful but I can't find any reason why. Any suggestions why? All it says is: The command "ci/lint.sh" exited with 1. |
git diff master | flake8 -diff |
@dickreuter You can also always look at the third travis build (https://travis-ci.org/pandas-dev/pandas/jobs/183014087, scroll to bottom), which says "pandas/io/json.py:857:45: E211 whitespace before '('" |
Linting still seems to fail, even though when I do "git diff master | flake8 --diff" I only get problems in v0.20.0.txt, all the rest is fine. Any suggestions how I can find the remaining issues? |
@dickreuter It was not linting this time, the mac build failed. |
And it is related to my code? I can't find any details. |
@dickreuter you need to rebase on master
|
… meta parameter that don't always occur in every item of the list Added documentation and test for issue #14505 Added keyword errors {'raise'|'ignore} Shortened what's new Removed commas in dictionary for linting compatibility Updated doc
Now all checked have passed. |
thanks! |
…on_normalize Author: dickreuter <[email protected]> Closes pandas-dev#14583 from dickreuter/json_normalize_enhancement and squashes the following commits: 701c140 [dickreuter] adjusted formatting 3c94206 [dickreuter] shortened lines to pass linting 2028924 [dickreuter] doc changes d298588 [dickreuter] Fixed as instructed in pull request page bcfbf18 [dickreuter] Avoids exception when pandas.io.json.json_normalize
Continuation of #14505
Added errors{'raise','ignore'} for keys not found in meta for json_normalize