API: DataFrame.select_dtypes should accept scalar #16855

Closed
chris-b1 opened this issue Jul 7, 2017 · 3 comments · Fixed by #16860

@chris-b1
Contributor

chris-b1 commented Jul 7, 2017

In [164]: df = pd.DataFrame({'a': [1, 2, 3], 'b': ['a', 'b', 'c']})

In [165]: df.select_dtypes(include='object')
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-165-04044faa1a5a> in <module>()
----> 1 df.select_dtypes(include='object')

~\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\core\frame.py in select_dtypes(self, include, exclude)
   2355         include, exclude = include or (), exclude or ()
   2356         if not (is_list_like(include) and is_list_like(exclude)):
-> 2357             raise TypeError('include and exclude must both be non-string'
   2358                             ' sequences')
   2359         selection = tuple(map(frozenset, (include, exclude)))

TypeError: include and exclude must both be non-string sequences

In [166]: df.select_dtypes(include=['object'])
Out[166]: 
   b
0  a
1  b
2  c

Problem description

This is only a convenience, but basically anywhere else we take list-likes we also accept a single string, and I think we should do the same here.

pandas 0.20.2
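
For illustration, here is a minimal user-side sketch of the requested behaviour: wrap a scalar selector in a tuple before handing it to `select_dtypes`. The helper name `select_dtypes_scalar` and its inner `as_tuple` are hypothetical, and this is not necessarily how pandas itself resolved the issue; it only assumes that `pandas.api.types.is_list_like` returns False for strings and for types such as `np.number`.

```python
import numpy as np
import pandas as pd
from pandas.api.types import is_list_like


def select_dtypes_scalar(df, include=None, exclude=None):
    """Hypothetical helper: accept a scalar or a list-like for include/exclude."""

    def as_tuple(value):
        # Leave "not given" untouched so select_dtypes applies its own checks.
        if value is None:
            return None
        # Strings and types are not list-like, so wrap them in a 1-tuple.
        return tuple(value) if is_list_like(value) else (value,)

    return df.select_dtypes(include=as_tuple(include), exclude=as_tuple(exclude))


df = pd.DataFrame({'a': [1, 2, 3], 'b': ['a', 'b', 'c']})

by_string = select_dtypes_scalar(df, include='number')      # single string
by_type = select_dtypes_scalar(df, include=np.number)       # single type
without_numbers = select_dtypes_scalar(df, exclude='number')
as_list = select_dtypes_scalar(df, include=['number'])      # list still works
```

The sketch treats `include` and `exclude` identically, so a scalar is accepted for either argument.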

@chris-b1 chris-b1 added this to the Next Major Release milestone Jul 7, 2017
@TomAugspurger
Contributor

+100 :) We should do the same for exclude

@Ffisegydd

I was looking at picking this up as one of my first contributions. A quick question for clarity though.

select_dtypes allows strings ('category', 'datetimetz', etc) but also allows numpy.number. For example:

df.select_dtypes(include=['category', numpy.number])

is valid.

My question is: what's the expected behaviour for df.select_dtypes(include=np.number)? The original issue only mentions allowing strings, but it seems silly to exclude np.number. I'll proceed on the assumption that it should be accepted too, but it would be good to get some clarity.

@TomAugspurger
Contributor

TomAugspurger commented Jul 8, 2017 via email

@chris-b1 chris-b1 changed the title from "API: DataFrame.select_dtypes should accept single string" to "API: DataFrame.select_dtypes should accept scalar" Jul 8, 2017
@jreback jreback modified the milestones: 0.21.0, Next Major Release Jul 10, 2017