Add test for MultiIndex Construction with pd.NA #31883

Closed
WillAyd opened this issue Feb 11, 2020 · 5 comments · Fixed by #33506 · May be fixed by devjeetr/pandas#1
Labels: Constructors (Series/DataFrame/Index/pd.array Constructors); good first issue; Missing-data (np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate); MultiIndex; Needs Tests (Unit test(s) needed to prevent regressions)

Comments

@WillAyd
Member

WillAyd commented Feb 11, 2020

Discovered in #31799

>>> pd.MultiIndex.from_product([np.array([0., np.nan], dtype="object"), ["a", "B"]])
MultiIndex([(0.0, 'a'),
            (0.0, 'B'),
            (nan, 'a'),
            (nan, 'B')],
           )
>>> pd.MultiIndex.from_product([np.array([0., pd.NA], dtype="object"), ["a", "B"]])
*** TypeError: boolean value of NA is ambiguous

May be the same root cause as #31881

@WillAyd WillAyd added the Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate label Feb 11, 2020
@jorisvandenbossche
Member

> May be the same root cause as #31881

I think it is rather due to the factorization/hashtable not handling pd.NA, similar to what happens when creating a Categorical: #31927
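
To make the factorization point concrete, here is a minimal sketch (not taken from the thread) that runs the same object-dtype values through `pd.factorize` and `pd.Categorical`. It assumes a pandas version in which the fixes mentioned later in this thread are present; on the affected releases these calls may fail or return different results. Missing values are expected to receive the sentinel code -1 instead of an entry in the uniques.

```python
import numpy as np
import pandas as pd

# Same kind of input as in the report: an object-dtype array holding pd.NA.
values = np.array([0.0, pd.NA], dtype="object")

# factorize returns integer codes plus the unique non-missing values.
codes, uniques = pd.factorize(values)
print(codes)
print(uniques)

# Categorical construction goes through a similar factorization step.
cat = pd.Categorical(values)
print(cat.codes)
print(cat.categories)
```

MultiIndex.from_product builds its levels and codes in a comparable way, which is why a hashtable that cannot handle pd.NA would surface as a constructor error.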

@WillAyd WillAyd changed the title from "MultiIndex Construction from pd.NA Broken in numpy array broken" to "MultiIndex Construction from pd.NA in numpy array broken" Feb 12, 2020
@jreback jreback added this to the 1.0.2 milestone Feb 16, 2020
@jbrockmendel jbrockmendel added Constructors Series/DataFrame/Index/pd.array Constructors MultiIndex labels Feb 25, 2020
@TomAugspurger
Contributor

TomAugspurger commented Mar 9, 2020

This was fixed by #31939 and #31799. #31939 was backported, so this should be working on 1.0.2.

I think we have a test for it at https://github.com/pandas-dev/pandas/blob/master/pandas/tests/indexes/multi/test_indexing.py#L412-L414, which is xfailed. Removing the xfail marker from that test will give us the coverage we want.
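
For illustration only, a regression test for the constructor case from the original report could look roughly like the sketch below. The test name `test_from_product_with_pd_na` is hypothetical, and this is not the xfailed test linked above; it only assumes the constructor behaves as described for 1.0.2 and later.

```python
import numpy as np
import pandas as pd


def test_from_product_with_pd_na():
    # Hypothetical test name; not the existing xfailed test in test_indexing.py.
    level = np.array([0.0, pd.NA], dtype="object")

    # This call raised "TypeError: boolean value of NA is ambiguous" in the report.
    result = pd.MultiIndex.from_product([level, ["a", "B"]])

    assert result.nlevels == 2
    assert len(result) == 4
    # The first level varies slowest, so the last two entries hold the missing value.
    assert result.get_level_values(0).isna().tolist() == [False, False, True, True]
```

A test written this way runs under pytest without any marker, so it fails loudly if the constructor regresses.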

@TomAugspurger TomAugspurger added the Needs Tests Unit test(s) needed to prevent regressions label Mar 9, 2020
@TomAugspurger TomAugspurger modified the milestones: 1.0.2, Contributions Welcome Mar 9, 2020
@TomAugspurger TomAugspurger changed the title from "MultiIndex Construction from pd.NA in numpy array broken" to "Add test for MultiIndex Construction with d.NA" Mar 9, 2020
@TomAugspurger TomAugspurger changed the title from "Add test for MultiIndex Construction with d.NA" to "Add test for MultiIndex Construction with pd.NA" Mar 9, 2020
@WillAyd
Member Author

WillAyd commented Mar 10, 2020

Thanks for the insights @TomAugspurger. Adding good first issue label as well to see if a new contributor wants to remove the xfail

@MillanSharma

take

@sumanau7
Contributor

@MillanSharma are you still working on it?

devjeetr added a commit to devjeetr/pandas that referenced this issue Apr 12, 2020

7 participants