Skip to content

ENH: Select numeric ExtensionDtypes with DataFrame.select_dtypes #35341

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 17 commits into from

Conversation

andrewgsavage
Copy link

@andrewgsavage andrewgsavage commented Jul 19, 2020

@pep8speaks
Copy link

pep8speaks commented Jul 19, 2020

Hello @andrewgsavage! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2020-10-19 13:14:33 UTC

Copy link
Member

@simonjayhawkins simonjayhawkins left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @andrewgsavage for the PR.

]
if np.number in dtypes_set:
extracted_dtypes.extend([])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not looked at this in detail, but from the name of the function, i'm assuming that this is not the correct location for this special casing.

There is another issue regarding extension arrays and dtypes, #9581. maybe a compat function for np.issubdtype could help, where we could add the special cases for EAs.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK I've added function to pandas\compat\numpy_init_.py

if numeric:
assert df.select_dtypes(np.number).shape == df.shape
else:
assert df.select_dtypes(np.number).shape != df.shape
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the extension tests are predominantly testing specific EAs. could you add a test to pandas\tests\extension\base\dtype.py instead and override as necessary for the different EAs.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My intention isn't to test the different EAs, but to test a dummy EA that emulates an external EA, like PintArray.

The most similar tests I could find were in pandas\tests\extension\test_common.py. Perhaps I should add the tests there?

@simonjayhawkins simonjayhawkins added Dtype Conversions Unexpected or buggy dtype conversions ExtensionArray Extending pandas with custom dtypes or arrays. labels Jul 19, 2020
@@ -5,6 +5,8 @@

import numpy as np

from pandas.core.dtypes.common import is_extension_array_dtype
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if there's any viable way to avoid this, we'd rather not import from pandas.core here

@@ -62,6 +63,17 @@ def np_array_datetime64_compat(arr, *args, **kwargs):
return np.array(arr, *args, **kwargs)


def np_issubclass_compat(unique_dtype, dtypes_set):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this wouldn't g in here, better in pandas/core/dtypes/common.py

with a full doc-string & examples

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is almost is_numeric, but with the twist about np.number

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've moved it and added doc string and examples. I'm not aware of any pandas dtypes that have _is_numeric = True and don't inherit from np.number.

Should is_numeric return True for ExtensionDtypes that have _is_numeric = True and don't inherit from np.number?

pass


def test_select_dtypes_numeric():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

all of this should go in tests/frame/methods/test_select_dtypes.py

@github-actions
Copy link
Contributor

This pull request is stale because it has been open for thirty days with no activity. Please update or respond to this comment if you're still interested in working on this.

@github-actions github-actions bot added the Stale label Nov 19, 2020
@simonjayhawkins
Copy link
Member

@andrewgsavage can you merge master to resolve conflicts. Also needs a release note. (put in doc\source\whatsnew\v1.2.0.rst for now, but my need to moved as 1.2rc is scheduled before the end of the month)

@arw2019
Copy link
Member

arw2019 commented Dec 2, 2020

Closing in favor of #38246

@arw2019 arw2019 closed this Dec 2, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Dtype Conversions Unexpected or buggy dtype conversions ExtensionArray Extending pandas with custom dtypes or arrays. Stale
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ENH: Select numeric ExtensionDtypes with DataFrame.select_dtypes
6 participants