BUG: Series.map using categorical Series raises AttributeError #10464
get_values is fine
Thanks. Another point I forgot to mention is that the case below results in
@@ -2010,7 +2010,7 @@ def map_f(values, f):
             arg = self._constructor(arg, index=arg.keys())

         indexer = arg.index.get_indexer(values)
-        new_values = com.take_1d(arg.values, indexer)
+        new_values = com.take_1d(arg.get_values(), indexer)
so the way to do this is actually arg.values.take_nd(indexer), which will return a Categorical.
so maybe make this mod in core.common.take_nd (which is aliased to take_1d). if it's a categorical_dtype, then short-circuit as I show above. (it ultimately calls take_1d as well, but with the codes)
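A minimal sketch of why taking directly on the Categorical preserves the dtype. It uses the public `Categorical.take` from later pandas releases rather than the internal `take_nd` discussed here, and the sample data is made up for illustration.

```python
import numpy as np
import pandas as pd

cat = pd.Categorical(["a", "b", "c"])
# -1 marks "no match", as returned by Index.get_indexer
indexer = np.array([2, 0, -1])

# Taking on the Categorical keeps the categories; -1 becomes a missing value
taken = cat.take(indexer, allow_fill=True)

# The same result expressed on the integer codes: gather, then restore -1
expected_codes = np.where(indexer == -1, -1, cat.codes[indexer])
```

The result is still a Categorical with the original categories, which is the behavior the short-circuit is meant to keep.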
(force-pushed from 130786d to 5566933)
Applied a fix to
if is_categorical(arr):
    result = take_nd(arr.get_values(), indexer, axis=axis, out=out,
                     fill_value=fill_value, mask_info=mask_info,
more like return arr.values.take_1d(...)
@jreback Categorical doesn't have values, only get_values().
import pandas as pd
c = pd.Categorical(['A', 'B', 'A', 'A'])
c
# [A, B, A, A]
# Categories (2, object): [A, B]
c.values
# AttributeError: 'Categorical' object has no attribute 'values'
no, you send this from a higher level (e.g. arr is actually a Series/CategoricalIndex object when you start).
You must use the take_1d in Categorical, and NOT create one here; that moves too much specific knowledge into this function.
something like:
if is_categorical_dtype(arr):
    arr = getattr(arr, 'values', arr)
    return arr.take_1d(....)
...
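The dispatch suggested above can be sketched with public APIs from later pandas releases. This is only an illustration under stated assumptions: `take_1d_sketch` is a hypothetical name, and `Categorical.take` and `pandas.api.extensions.take` stand in for the internal `take_1d`/`take_nd` helpers being reviewed.

```python
import numpy as np
import pandas as pd
from pandas.api.extensions import take


def take_1d_sketch(arr, indexer, fill_value=None):
    """Hypothetical helper: short-circuit categorical input, else generic take."""
    if isinstance(getattr(arr, "dtype", None), pd.CategoricalDtype):
        # Unwrap a Series/CategoricalIndex to its Categorical; a bare
        # Categorical is passed through unchanged.
        arr = getattr(arr, "values", arr)
        # Delegate to the Categorical so the generic function needs no
        # knowledge of codes or categories.
        return arr.take(indexer, allow_fill=True, fill_value=fill_value)
    # Non-categorical path: -1 in the indexer is filled with a missing value.
    return take(np.asarray(arr), indexer, allow_fill=True, fill_value=fill_value)


ser = pd.Series(["a", "b", "a"], dtype="category")
cat_result = take_1d_sketch(ser, np.array([1, -1, 0]))
plain_result = take_1d_sketch(np.array([10.0, 20.0]), np.array([1, -1]))
```

The point of the design is that the generic function only decides where to send the data; everything specific to categoricals stays in the Categorical itself.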
@jreback I think I understand. Changed to use take_nd, as Categorical doesn't have take_1d.
BTW, I've noticed
So another point of the original issue is the behavior when the mapped category doesn't contain all the mappings. Should it raise, or pad with np.nan (creating a new category implicitly)?
These should be the same as a post-convert to category
I see. Then, current impl works as expected. I'll consider if it can be better.
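A small sketch of the "same as a post-convert to category" expectation, written against a recent pandas release. The sample keys and values are invented for illustration, and exact results may differ in the pandas version this PR targeted.

```python
import pandas as pd

s = pd.Series(["a", "b", "c"])

# Categorical mapper that has no entry for "c"
cat_mapper = pd.Series(pd.Categorical(["x", "y"]), index=["a", "b"])
mapped = s.map(cat_mapper)

# Reference: map with a plain object mapper, then convert afterwards
obj_mapper = pd.Series(["x", "y"], index=["a", "b"])
post_converted = s.map(obj_mapper).astype("category")

# Unmapped keys become missing values rather than raising,
# and no extra category is created for them.
same_missing = mapped.isna().tolist() == post_converted.isna().tolist()
same_categories = list(mapped.cat.categories) == list(post_converted.cat.categories)
```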
if is_categorical(arr):
    return arr.take_nd(indexer, fill_value=fill_value,
                       allow_fill=allow_fill)
exactly!
looks good. go ahead and merge when green.
@jreback Thanks for the review. Merging.
BUG: Series.map using categorical Series raises AttributeError
Closes #10324. Closes #10460.
Based on #9848, using .get_values should be avoided?