DataFrame.apply() with function that return category series #9573

ruoyu0088 · 2015-03-03T00:53:08Z

import pandas as pd
df = pd.DataFrame({"c0":["A","A","B","B"], "c1":["C","C","D","D"]})
df.apply(lambda s:s.astype("category"))

The result is a Series with a Series as each element, not a DataFrame:

c0    [A, A, B, B]
Categories (2, object): [A < B]
c1    [C, C, D, D]
Categories (2, object): [C < D]
dtype: object

Here is the output of `show_versions()`:

INSTALLED VERSIONS
------------------
commit: None
python: 2.7.9.final.0
python-bits: 32
OS: Windows
OS-release: 7
machine: x86
processor: x86 Family 6 Model 42 Stepping 7, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None

pandas: 0.15.2.dev
nose: 1.3.4
Cython: 0.21.2
numpy: 1.9.1
scipy: 0.15.0
statsmodels: 0.6.1
IPython: 3.0.0
sphinx: 1.2.3
patsy: 0.3.0
dateutil: 2.3
pytz: 2014.10
bottleneck: None
tables: 3.1.1
numexpr: 2.4
matplotlib: 1.4.2
openpyxl: None
xlrd: 0.9.3
xlwt: None
xlsxwriter: 0.6.5
lxml: 3.4.1
bs4: 4.3.2
html5lib: None
httplib2: None
apiclient: None
rpy2: 2.5.4
sqlalchemy: 0.9.8
pymysql: None
psycopg2: None


jreback · 2015-03-03T01:01:53Z

Hmm, this is not behaving properly. As a workaround, you can do this:

In [8]: DataFrame(dict([(c,col.astype('category')) for c, col in df.iteritems()]))
Out[8]: 
  c0 c1
0  A  C
1  A  C
2  B  D
3  B  D

In [9]: DataFrame(dict([(c,col.astype('category')) for c, col in df.iteritems()])).dtypes
Out[9]: 
c0    category
c1    category
dtype: object
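
For readers on newer pandas releases, where `iteritems()` is no longer available, the same workaround can be sketched with `items()`. This is a minimal sketch, not the fix that was eventually merged; the variable names `converted` and `converted_direct` are illustrative, and the whole-frame `astype("category")` call is assumed to be supported by the installed version.

```python
import pandas as pd

df = pd.DataFrame({"c0": ["A", "A", "B", "B"], "c1": ["C", "C", "D", "D"]})

# Convert column by column and rebuild the frame from a dict,
# so each column keeps the category dtype it was given.
converted = pd.DataFrame(
    {name: col.astype("category") for name, col in df.items()}
)

# Assumption: the installed pandas accepts a dtype string for the whole frame.
converted_direct = df.astype("category")

print(converted.dtypes)
```

Both variants bypass `apply()` entirely, so they do not depend on how `apply()` reassembles the per-column results.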

sebp · 2015-03-31T12:19:21Z

This problem is still present in 0.16.0: `apply` does not retain category dtypes.

import pandas

df = pandas.DataFrame({'col0': [1, 2, 3, 4, 5],
                       'col1': ['yes', 'no', 'no', 'yes', 'yes'],
                       'col2': ['small', 'large', 'medium', 'large', 'small']})
df['col2'] = pandas.Categorical(df['col2'], categories=['small', 'medium', 'large'],
                                ordered=True)

other = df.apply(lambda x: x)

# returns category
print(df['col2'].dtype)
# returns object
print(other['col2'].dtype)
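
A small helper can make this kind of silent dtype change easier to spot across a whole frame. The sketch below is only an illustration: `dtype_changes` and `degraded` are hypothetical names, not part of pandas, and the object-dtype column is produced explicitly with `astype` because the result of `apply()` differs between pandas versions.

```python
import pandas as pd
from pandas.api.types import is_dtype_equal


def dtype_changes(before, after):
    """Return {column: (old_dtype, new_dtype)} for shared columns whose dtype differs."""
    changes = {}
    for name in before.columns.intersection(after.columns):
        old, new = before[name].dtype, after[name].dtype
        if not is_dtype_equal(old, new):
            changes[name] = (old, new)
    return changes


df = pd.DataFrame({
    "col0": [1, 2, 3],
    "col2": pd.Categorical(["small", "large", "medium"],
                           categories=["small", "medium", "large"],
                           ordered=True),
})

# Simulate the reported symptom explicitly rather than relying on apply().
degraded = df.astype({"col2": object})
print(dtype_changes(df, degraded))

# The same check against an identity apply(); the result depends on the pandas version.
print(dtype_changes(df, df.apply(lambda x: x)))
```

An empty dict from the second call would indicate that the installed version preserves the category dtype through `apply()`.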

pandas.show_versions():

INSTALLED VERSIONS
------------------
commit: None
python: 3.4.3.final.0
python-bits: 64
OS: Linux
OS-release: 3.19.1-201.fc21.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: de_DE.utf8

pandas: 0.16.0
nose: 1.3.4
Cython: 0.21.2
numpy: 1.9.2
scipy: 0.15.1
statsmodels: 0.6.1
IPython: 3.0.0
sphinx: 1.2.3
patsy: 0.3.0
dateutil: 2.4.1
pytz: 2015.2
bottleneck: None
tables: 3.1.1
numexpr: 2.3.1
matplotlib: 1.4.3
openpyxl: 1.8.5
xlrd: 0.9.3
xlwt: None
xlsxwriter: 0.5.7
lxml: 3.4.0
bs4: 4.3.2
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None

jreback · 2015-03-31T12:30:11Z

@sebp well, this is still an open issue; issues are closed once a fix is merged.
Pull requests are welcome.

jreback added the Bug, Reshaping (Concat, Merge/Join, Stack/Unstack, Explode), and Categorical (Categorical Data Type) labels Mar 3, 2015

jreback added this to the 0.16.1 milestone Mar 3, 2015

jreback added Difficulty Advanced labels Apr 4, 2015

jreback modified the milestones: 0.17.0, 0.16.1 Apr 28, 2015

behzadnouri mentioned this issue Jun 15, 2015

BUG: closes bug in apply when function returns categorical #10354

Merged

jreback closed this as completed in #10354 Jun 15, 2015

jreback mentioned this issue Sep 30, 2015

DataFrame.apply() silently converting columns to non-categorical type #11208

Closed
