
Added test to confirm categorical preservation with merge #52810

Closed
wants to merge 1 commit into from

agodinez01

@rhshadrach added the Reshaping, Categorical, and Needs Tests labels Apr 21, 2023

def test_merge_perserve_categorical_index():
# gh37480
df1 = pd.DataFrame({'x': [i for i in range(10)], 'y': [i for i in range(10)],
Member

You can do range(10) instead of [i for i in range(10)]

'z': [i for i in range(10)], 'd': [i for i in range(10)]})
df2 = df1.astype({'x':'category', 'y':'category', 'z':'category'})

df3 = df2.iloc[:10, :].groupby(['z', 'x'], observed=True).agg({'d': 'sum'})
Member

Why are you doing a groupby here?


result = pd.merge(df3, df4, left_index=True, right_index=True, how='outer')

assert(result.index.dtypes == 'category').all()
Member

Please always check the whole object with tm.assert_frame_equal()
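
To illustrate the point of this review comment, here is a minimal sketch, not the PR's actual test. The frames `expected`, `same`, and `different` and the helper `index_is_categorical` are hypothetical names invented for this example, and the public `pandas.testing` module stands in for the `tm` alias (`pandas._testing`) used inside the pandas test suite. It suggests why a dtype-only assertion on the index can pass even when the data is wrong, while a whole-object comparison does not.

```python
import pandas as pd
import pandas.testing as tm  # assumption: public stand-in for pandas._testing

# Hypothetical two-level categorical index, named after the keys in the test.
index = pd.MultiIndex.from_arrays(
    [pd.Categorical([0, 1, 2]), pd.Categorical([0, 1, 2])],
    names=["z", "x"],
)

expected = pd.DataFrame({"d": [0, 1, 2]}, index=index)
same = pd.DataFrame({"d": range(3)}, index=index)  # range works as column data
different = pd.DataFrame({"d": [9, 9, 9]}, index=index)  # wrong values on purpose


def index_is_categorical(df):
    """Dtype-only check: every index level is categorical."""
    return all(isinstance(dtype, pd.CategoricalDtype) for dtype in df.index.dtypes)


# The dtype-only check passes for both frames, including the wrong one.
assert index_is_categorical(same)
assert index_is_categorical(different)

# The whole-object check accepts the matching frame...
tm.assert_frame_equal(same, expected)

# ...and rejects the frame whose values differ.
try:
    tm.assert_frame_equal(different, expected)
except AssertionError:
    caught = True
else:
    caught = False
```

In a real test, `expected` would be built by hand to match what `pd.merge` should return, so that index dtypes, values, and column dtypes are all verified in one assertion.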

@phofl added the Sprints label Apr 21, 2023
@github-actions
Contributor

This pull request is stale because it has been open for thirty days with no activity. Please update and respond to this comment if you're still interested in working on this.

@github-actions bot added the Stale label May 22, 2023
@mroeschke
Member

Thanks for the pull request, but it appears to have gone stale. If interested in continuing, please merge in the main branch, address any review comments and/or failing tests, and we can reopen.

@mroeschke closed this May 24, 2023
Labels
Categorical, Needs Tests, Reshaping, Sprints, Stale
Development

Successfully merging this pull request may close these issues.

BUG: category index levels casted to non-category dtype in merge
4 participants