np.array([pd.Series({'a':'a'})]) works but np.array([pd.Series({1:1})]) doesn't #38543

Closed
soerenwolfers opened this issue Dec 17, 2020 · 11 comments · Fixed by #42939
Labels
good first issue Needs Tests Unit test(s) needed to prevent regressions Upstream issue Issue related to pandas dependency

Comments

@soerenwolfers

The last line here throws a KeyError: 0

import numpy as np
import pandas as pd
np.array([pd.Series({0:0})])
np.array([pd.Series({'a':'a'})])
np.array([pd.Series({1:1})])

Is that a bug?

@jorisvandenbossche
Member

Can you show the output of pd.show_versions() ?

@MarcoGorelli MarcoGorelli added the Needs Info Clarification about behavior needed to assess issue label Dec 18, 2020
@MarcoGorelli
Member

I can't reproduce this on v1.1.5

@soerenwolfers
Author

Sorry,

pandas : 1.1.4
numpy : 1.17.4

@MarcoGorelli
Member

Thanks, can confirm this reproduces.

If you upgrade pandas to 1.1.5 you'll find it fixed. It would be good to run a git bisect to find out when this was fixed and, if it was fixed accidentally, to add a test.

@MarcoGorelli MarcoGorelli added Needs Tests Unit test(s) needed to prevent regressions and removed Needs Info Clarification about behavior needed to assess issue labels Dec 18, 2020
@simonjayhawkins
Member

can't reproduce with 0.25.3, 1.0.5, 1.1.3, 1.1.4, 1.1.5, master (all envs using numpy 1.19+)

@MarcoGorelli
Member

Same, just tried in 1.1.4 with numpy 1.19.4 and it works fine.

Conversely, on master and with numpy 1.17.4, it still fails. I think we can close this as an upstream bug anyway

@MarcoGorelli MarcoGorelli added Upstream issue Issue related to pandas dependency and removed Needs Tests Unit test(s) needed to prevent regressions labels Dec 18, 2020
@jorisvandenbossche
Member

jorisvandenbossche commented Dec 18, 2020

But is it an identified / reported upstream issue?
I think it's still good to add a test (it's not a given that this was an intentional/tested fix on numpy's side)

@MarcoGorelli
Member

Sure - @soerenwolfers do you feel like contributing a test for this issue? If so, please check the contributing guide
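A regression check along the lines requested above might look like the sketch below. It is not the test that was eventually merged; the helper name `check_single_element_series` is hypothetical, and the expected 1x1 shape assumes a NumPy version on which the thread reports the conversion working (1.19+). On NumPy 1.17.4 the last input is the one reported to raise `KeyError: 0`.

```python
import numpy as np
import pandas as pd


def check_single_element_series(data):
    """Wrap a one-element Series in a list and convert it with np.array.

    Hypothetical helper for illustration only.
    """
    result = np.array([pd.Series(data)])
    # One Series of length 1 should become a 1x1 array holding its value.
    assert result.shape == (1, 1)
    assert result[0, 0] == next(iter(data.values()))
    return result


# The three inputs from the original report.
results = [check_single_element_series(d) for d in ({0: 0}, {"a": "a"}, {1: 1})]
```

Parametrizing over all three inputs keeps the previously failing case (`{1: 1}`) next to the two that always worked, so a regression would show up as a single failing parameter.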

@seberg
Contributor

seberg commented Dec 18, 2020

Wow, there was an additional layer of confusion to unravel here :) (for myself). If there is only a single item in the series, NumPy used to use series[0] to get that item. But if there were more items, it effectively used list(series). Since list() probably uses __iter__ internally (and that goes by position, not by index), the only case that gives different results on new NumPy seems to be a series with a single element and a non-zero index.
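The difference between the two access paths described above can be seen on the pandas side alone. The following is a minimal sketch, not NumPy's actual internals; the variable names are illustrative.

```python
import pandas as pd

s = pd.Series({1: 1})  # one element, integer label 1 (there is no label 0)

# Iteration walks the values by position, so the label is irrelevant.
by_iteration = list(s)

# Explicit positional access also ignores the label.
by_position = s.iloc[0]

# Plain [] on an integer index is a label lookup, and label 0 does not exist.
try:
    s[0]
    lookup_error = None
except KeyError as exc:
    lookup_error = exc

print(by_iteration, by_position, repr(lookup_error))
```

Any code path that asks the Series for `series[0]` therefore fails on this input, while one that iterates over it succeeds.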

@horaceklai
Contributor

take

@horaceklai
Contributor

Hi, I'm not sure what to do now. I made a PR and made the requested changes, but when I run the tests, they throw the KeyError: 0 described at the beginning of this issue.

#42939


8 participants