Skip to content

Add test for unique aggregation returning different dtypes #47603

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Jul 26, 2022

Conversation

srotondo
Copy link
Contributor

@srotondo srotondo commented Jul 5, 2022

@phofl phofl changed the title Unique test Add test for unique aggregation returning different dtypes Jul 6, 2022
agg1 = df1.agg(aggregation)
agg2 = df2.agg(aggregation)

tm.assert_class_equal(agg1, agg2)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please define expected as a DataFrame and then use tm.assert_frame_equal. Also, this is not a groupby test, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, this shouldn't be in groupby. I've made the changes and moved it to a more appropriate file.

({"a": [2, 2, 2], "b": [1, 1, 1]}, {"a": 2, "b": 1}),
],
)
def test_unique_agg_type_is_series(test, constant):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is still in the wrong place. Please check the frame folder for example. We have specific aggregations tests where this belongs

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see some aggregation tests for DataFrames in apply/test_frame_apply.py. Would that be an appropriate place for this test?

@phofl phofl mentioned this pull request Jul 7, 2022
5 tasks
@jreback jreback added Testing pandas testing functions or related to the test suite Dtype Conversions Unexpected or buggy dtype conversions Algos Non-arithmetic algos: value_counts, factorize, sorting, isin, clip, shift, diff labels Jul 8, 2022
@srotondo
Copy link
Contributor Author

@phofl I've moved my test to a new file, is this a better place for it?

"test, constant",
[
({"a": [1, 2, 3], "b": [1, 1, 1]}, {"a": [1, 2, 3], "b": 1}),
({"a": [2, 2, 2], "b": [1, 1, 1]}, {"a": 2, "b": 1}),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the scalars in constant should be contained in lists e.g.

# result of agg
First aggregation: <class 'pandas.core.series.Series'>
a    [1, 2, 3]
b          [1]
dtype: object
Second aggregation: <class 'pandas.core.series.Series'>
a    [2]
b    [1]
dtype: object

In [3]: Series(data={"a": [1, 2, 3], "b": 1}, index=["a", "b"], dtype="object")
Out[3]:
a    2
b    1
dtype: object

In [4]: constant = {"a": [1, 2, 3], "b": 1}

In [6]: Series(data=constant, index=["a", "b"], dtype="object")
Out[6]:
a    [1, 2, 3]
b            1
dtype: object

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I can do that. Either way will work, but I think you are right that it is more accurate for it to be one-element lists rather than scalars.

@mroeschke mroeschke added this to the 1.5 milestone Jul 22, 2022
@mroeschke mroeschke merged commit 6b7578b into pandas-dev:main Jul 26, 2022
@mroeschke
Copy link
Member

Thanks @srotondo (CI failure unrelated)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Algos Non-arithmetic algos: value_counts, factorize, sorting, isin, clip, shift, diff Dtype Conversions Unexpected or buggy dtype conversions Testing pandas testing functions or related to the test suite
Projects
None yet
Development

Successfully merging this pull request may close these issues.

unique aggregation unexpectedly returning different type
5 participants