Avoids exception when pandas.io.json.json_normalize contains items in… #14505
… meta parameter that don't always occur in every item of the list
Current coverage is 85.27% (diff: 90.00%)
@@             master   #14505   diff @@
==========================================
  Files         140      140
  Lines       50670    50698    +28
  Methods         0        0
  Messages        0        0
  Branches        0        0
==========================================
+ Hits        43205    43233    +28
  Misses       7465     7465
  Partials        0        0
|
what exactly is this fixing? would need a test. |
…wers xref numpy/numpy#8127 closes #14489 Author: Jeff Reback <[email protected]> Closes #14498 from jreback/compat and squashes the following commits: 882872e [Jeff Reback] COMPAT/TST: fix test for range testing of negative integers to neg powers
Title is self-explanatory. Affects Python 2.x only. Closes #14477. Author: gfyoung <[email protected]> Closes #14492 from gfyoung/quotechar-unicode-2.x and squashes the following commits: ec9f59a [gfyoung] BUG: Accept unicode quotechars again in pd.read_csv
The above will fail because trade_version is only available in one of the two items in the list, so there is no way to output it if not all elements are exactly the same. With my change it will simply ignore it and output nan instead of throwing an error. |
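A minimal sketch of the situation described here, written without pandas so it runs on its own. It is a simplified stand-in, not the pandas source: `pull_field` is a hypothetical helper modeled on the `_pull_field` call that appears in the diff, `missing_meta` is made up for illustration, and the sample records are invented.

```python
# Hypothetical sample data: only the first trade carries 'trade_version'.
trades = [
    {"general": {"tradeid": 100, "trade_version": 1,
                 "stocks": [{"symbol": "AAPL"}, {"symbol": "GOOG"}]}},
    {"general": {"tradeid": 101,
                 "stocks": [{"symbol": "IBM"}]}},
]


def pull_field(obj, path):
    """Follow a list of keys into nested dicts (illustrative helper)."""
    result = obj
    for key in path:
        result = result[key]  # raises KeyError if a key is absent
    return result


def missing_meta(records, meta_path):
    """Return the indices of records that lack the given meta path."""
    missing = []
    for i, record in enumerate(records):
        try:
            pull_field(record, meta_path)
        except KeyError:
            missing.append(i)
    return missing


print(missing_meta(trades, ["general", "tradeid"]))        # []
print(missing_meta(trades, ["general", "trade_version"]))  # [1]
```

The second record has no `trade_version`, so a strict lookup raises `KeyError` for it; that is the case the change turns into a NaN value.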
When the driver was not installed, but sqlalchemy itself was, when passing a URI string, you got an error indicating that SQLAlchemy was not installed, instead of the driver not being installed. This was because the import error for the driver was captured as import error for sqlalchemy.
@dickreuter Can you add that as a test and add a release note? See http://pandas.pydata.org/pandas-docs/stable/contributing.html#contributing-to-the-code-base |
This would be the test, but unclear where I should store it. Any suggestions? from unittest import TestCase
|
Looks like those tests are all in https://github.com/pandas-dev/pandas/blob/master/pandas/io/tests/json/test_json_norm.py You could add it as a test method under |
Test and documentation are now added. |
@@ -78,3 +78,4 @@ Bug Fixes | |||
|
|||
|
|||
- Bug in ``pd.pivot_table`` may raise ``TypeError`` or ``ValueError`` when ``index`` or ``columns`` is not scalar and ``values`` is not specified (:issue:`14380`) | |||
- Bug in ``pandas.io.json.json_normalize``When parsing a nested json and convert it to a dataframe, the meta parameter can be used to use fields as metadata for each record in resulting table. In some cases, not all items may contain all of the specified meta fields. This change will avoid throwing an error and output np.nan instead. (:issue '14505') |
pls simplify. a user just wants to know does this issue pertain to them, and a short expl. make the issue (:issue:`14505`)
meta_val = _pull_field(obj, val[level:]) | ||
try: | ||
meta_val = _pull_field(obj, val[level:]) | ||
except: |
don't use a bare except, list a specific exception, KeyError?
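To illustrate the reviewer's point, here is a small sketch (not the pandas implementation; `lookup` is a hypothetical helper). Catching only `KeyError` converts the "field not present" case to NaN, while an unrelated bug still surfaces instead of being silently swallowed, which a bare `except:` would do.

```python
import math


def lookup(obj, path):
    """Walk nested dicts along `path`; return NaN only when a key is missing."""
    try:
        for key in path:
            obj = obj[key]
    except KeyError:
        # Only the "field not present" case is converted to NaN.
        return math.nan
    return obj


record = {"general": {"tradeid": 100}}
print(lookup(record, ["general", "tradeid"]))                    # 100
print(math.isnan(lookup(record, ["general", "trade_version"])))  # True

# A genuine mistake (indexing into an int) still raises TypeError
# instead of being turned into NaN.
try:
    lookup(record, ["general", "tradeid", "oops"])
except TypeError as exc:
    print("TypeError:", exc)
```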
@@ -225,6 +225,51 @@ def test_nested_flattens(self): | |||
|
|||
self.assertEqual(result, expected) | |||
|
|||
def test_json_normalise_fix(self): | |||
j = { |
add the issue number as a comment
|
||
} | ||
] | ||
}, |
this prob does not pass linting. make sure it does.
} | ||
j = json_normalize(data=j['Trades'], record_path=[['general', 'stocks']], | ||
meta=[['general', 'tradeid'], ['general', 'trade_version']]) | ||
self.assertEqual(len(j), 4) |
construct the expected frame and use assert_frame_equal
@@ -792,7 +792,10 @@ def _recursive_extract(data, path, seen_meta, level=0): | |||
if level + 1 > len(val): | |||
meta_val = seen_meta[key] | |||
else: | |||
meta_val = _pull_field(obj, val[level:]) |
I think this should be a keyword, call it errors='raise'|'ignore'. You are defining ignore. Please leave the default as raise (which is the current behavior).
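A rough sketch of what such a keyword could look like, assuming a standalone helper rather than the real `json_normalize` internals (`pull_meta` is a hypothetical name; only the `errors='raise'|'ignore'` values and the `raise` default come from the comment above).

```python
import math


def pull_meta(obj, path, errors="raise"):
    """Fetch a nested meta value.

    errors='raise'  -> propagate KeyError (the default, existing behavior)
    errors='ignore' -> return NaN when a key along `path` is missing
    """
    if errors not in ("raise", "ignore"):
        raise ValueError("errors must be 'raise' or 'ignore'")
    try:
        for key in path:
            obj = obj[key]
        return obj
    except KeyError:
        if errors == "ignore":
            return math.nan
        raise


record = {"general": {"tradeid": 101}}

# Opting in to the lenient behavior yields NaN for the missing field.
print(pull_meta(record, ["general", "trade_version"], errors="ignore"))

# The default keeps the strict behavior and raises KeyError.
try:
    pull_meta(record, ["general", "trade_version"])
except KeyError as exc:
    print("KeyError:", exc)
```

Keeping `raise` as the default means existing callers see no change; only callers who pass `errors='ignore'` get NaN for missing meta fields.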
Added documentation. Shortened what's new. Removed commas in dictionary for linting compatibility.
Added keyword errors {'raise'|'ignore'} |
pandas.core.common.array_equivalent was removed without a deprecation warning. This commit adds it back to the core.common namespace with a deprecation warning.
* BUG/API: Index.append with mixed object/Categorical indices * Only coerce to object if the calling index is not categorical * Add test for the df.info() case (GH14298)
… meta parameter that don't always occur in every item of the list
Added documentation. Shortened what's new. Removed commas in dictionary for linting compatibility.
# Conflicts: # doc/source/whatsnew/v0.19.1.txt
you need to rebase on master
|
I did that earlier today. It now says: "This branch is 8 commits ahead of pandas-dev:master.". There should currently be no more conflicts. |
maybe you didn't push it |
My fork on github seems up to date with what I have locally, so I assume it has been pushed. Are there any further changes you expected me to implement that are not present? |
@dickreuter it's impossible to see until you rebase on master. this should have just your commits https://github.com/pandas-dev/pandas/pull/14505/commits |
I see, isn't it showing those commits of others only because I did a rebase of my fork (and then a local rebase of my local copy instead of a merge?). If that's a problem I could delete my fork and create a new one, then make the changes again and create a new pull request, unless you have a better suggestion. |
you prob just need something like
|
git fetch origin --> doesn't fetch anything as my local copy is in I think all my changes can be seen What may be confusing is that I also did an automatic reformatting
|
could also be called
it doesn't matter if your branch is in sync with YOUR upstream, rather it needs to be in sync with pandas master (and on top of it), that's what a rebase is. you need to rebase to remove all of the merges of master. you shouldn't do that, instead rebase. |
This seems to be the problem: Will try to fix it, or if it's too complicated just delete and redo. |
@jreback Regarding your comment above (#14505 (comment)): pandas-dev repo is typically called 'upstream', and your own fork 'origin' (that's how our contributor guide also says it), so you need |
Follow-up in #14583 |
Continued in #14583
When using pandas.io.json.json_normalize to parse nested JSON and convert it to a DataFrame, the meta parameter can be used to attach fields as metadata to each record in the resulting table. In some cases, not all items contain all of the specified meta fields. This change avoids throwing an error and outputs np.nan instead.