Ability to Unnest Series of Dicts #30300

Closed
bnaul opened this issue Dec 17, 2019 · 5 comments

Comments

@bnaul
Contributor

bnaul commented Dec 17, 2019

Code Sample, a copy-pastable example if possible

import pandas as pd
s = pd.Series([{'a': {'b': 1, 'c': 2}}, {'a': {'b': 3, 'c': 4}}], index=[1, 3])
s

1    {'a': {'b': 1, 'c': 2}}
3    {'a': {'b': 3, 'c': 4}}
dtype: object


from pandas.io.json import json_normalize  # import path in pandas 0.25.x
json_normalize(s)

   a.b  a.c
0    1    2
1    3    4

Problem description

Maybe I'm missing some nuance of nesting or something that would potentially result in a different number of rows...? But it seems like if the rows of the output correspond to those of the input then the index should match as well.

Output of pd.show_versions()

pandas==0.25.3

@WillAyd
Member

WillAyd commented Dec 17, 2019

IIUC you want the resulting index to be 1 and 3 as well? json_normalize works on generic sequences (it is actually only documented for list), so it doesn't attempt to preserve any index information.
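
One way to get the original labels back is to normalize the plain records and then reassign the index. This is only a minimal sketch: it assumes pandas 1.0 or later, where the function is exposed as `pd.json_normalize` (on 0.25.x it would be imported from `pandas.io.json` instead), and it assumes every input element yields exactly one output row.

```python
import pandas as pd

s = pd.Series([{'a': {'b': 1, 'c': 2}}, {'a': {'b': 3, 'c': 4}}], index=[1, 3])

# Hand json_normalize a plain list so it only ever sees the records,
# which is the input type it is documented for.
flat = pd.json_normalize(s.tolist())

# Put the original labels back. This is only valid when the output has
# one row per input element (no record_path expansion).
flat.index = s.index

print(flat)
```

With this, the rows of `flat` are labelled 1 and 3, matching the input Series.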

@bnaul
Contributor Author

bnaul commented Dec 17, 2019

Ah I see, I guess that makes sense. What I really had in mind was something like Series.explode but for (potentially nested) dicts instead of lists, and json_normalize seemed to be pretty close to that already. Applying pd.Series does this sort of "box"ing but only for one level:

s = pd.Series([{'a': 1, 'b': 2}, {'a': 2, 'b': 3}], index=[1, 3])
s

1    {'a': 1, 'b': 2}
3    {'a': 2, 'b': 3}
dtype: object


s.apply(pd.Series)

   a  b
1  1  2
3  2  3

I guess if json_normalize wasn't intended for this then what I'm really asking is whether there's another place where this kind of transformation would belong? Would enhancing .explode to handle dicts as well be a bad idea?
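
For illustration, the "explode, but for nested dicts" behaviour described above could be sketched in plain Python. The names `flatten` and `unnest_dicts` are hypothetical helpers, not pandas API, and the dotted column names are simply chosen to match what json_normalize produces.

```python
import pandas as pd


def flatten(record, parent_key="", sep="."):
    """Flatten one nested dict into a single-level dict with joined keys."""
    items = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else str(key)
        if isinstance(value, dict):
            # Recurse into nested dicts; an empty dict contributes no keys.
            items.update(flatten(value, full_key, sep))
        else:
            items[full_key] = value
    return items


def unnest_dicts(s, sep="."):
    """Hypothetical helper: one column per leaf key, same index as `s`."""
    return pd.DataFrame([flatten(d, sep=sep) for d in s], index=s.index)


s = pd.Series([{'a': {'b': 1, 'c': 2}}, {'a': {'b': 3, 'c': 4}}], index=[1, 3])
wide = unnest_dicts(s)
print(wide)
```

Unlike `s.apply(pd.Series)`, this handles more than one level of nesting, and it keeps the index because the DataFrame is built with `index=s.index`.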

@WillAyd
Member

WillAyd commented Dec 17, 2019

Perhaps a DictArray like #29557 with an unnest method would work here but let's see what others think
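
Until something like that exists, a rough idea of how an `unnest` method might feel can be had with a custom Series accessor. This is not the DictArray from #29557, only a sketch built on `pd.api.extensions.register_series_accessor`; the accessor name `dicts` and the class `DictAccessor` are made up for this example, and `pd.json_normalize` assumes pandas 1.0 or later.

```python
import pandas as pd


@pd.api.extensions.register_series_accessor("dicts")  # hypothetical accessor name
class DictAccessor:
    def __init__(self, series):
        self._series = series

    def unnest(self):
        """Flatten each (possibly nested) dict into columns, keeping the index."""
        flat = pd.json_normalize(self._series.tolist())
        flat.index = self._series.index
        return flat


s = pd.Series([{'a': {'b': 1, 'c': 2}}, {'a': {'b': 3, 'c': 4}}], index=[1, 3])
out = s.dicts.unnest()
print(out)
```

An accessor keeps the behaviour out of the Series constructor and out of json_normalize itself, at the cost of working on object dtype rather than a dedicated extension array.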

@WillAyd changed the title from "json_normalize resets index" to "Ability to Unnest Series of Dicts" on Dec 17, 2019
@jbrockmendel
Member

Is the request for a change in how the Series constructor behaves, or something else?

@bnaul
Contributor Author

bnaul commented Dec 18, 2019

@jbrockmendel it was originally a request for a change to json_normalize, but it sounds like that function wasn't intended to handle inputs that are already in Series form.

I think it probably makes sense to close this as a standalone issue for now because I don't think changing the Series constructor makes sense; @WillAyd's proposal for a DictArray would definitely be a great place for this kind of functionality though 👍

@bnaul closed this as completed on Dec 18, 2019