-
-
Notifications
You must be signed in to change notification settings - Fork 18.5k
BUG: List indexer on PeriodIndex doesn't coerce strings #30515
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there are specific conversions for index types
this is adding a large special case
pls find a more existing code friendly way
indexing code is already very complex
we need to reduce that not add to it
0c9ee8c
to
2cbd26b
Compare
2cbd26b
to
5e86611
Compare
a414a4f
to
fa51eb7
Compare
pandas/core/indexes/datetimes.py
Outdated
@@ -1006,6 +1006,16 @@ def indexer_between_time( | |||
|
|||
return mask.nonzero()[0] | |||
|
|||
def _convert_arr_indexer(self, keyarr): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
where exactly are these hit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these function call from "._convert_listlike_indexer" and get returned value from "common.asarray_tuple"
pandas/core/indexes/period.py
Outdated
@@ -839,6 +839,16 @@ def memory_usage(self, deep=False): | |||
result += self._int64index.memory_usage(deep=deep) | |||
return result | |||
|
|||
def _convert_arr_indexer(self, keyarr): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why would these not share the impl from the superclass?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At first, I Implemented on the base
But,
would it be simpler to override DatetimeIndex._convert_arr_indexer and PeriodIndex._convert_arr_indexer?
So I changed. Okay I change again.
@jbrockmendel if you'd have a look |
2805c8d
to
8b9d712
Compare
pandas/core/common.py
Outdated
if isinstance(values, list) and ( | ||
is_datetime64_any_dtype(dtype) or is_period_dtype(dtype) | ||
): | ||
return array(values, dtype) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does timedelta64 ever go through here? could just check needs_i8_conversion(dtype)
What about other EA dtypes? isnt this kind of "anything that we dont want to pass to np.asarray below"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jbrockmendel
Okay. I change code
8b9d712
to
91f70fc
Compare
pandas/core/common.py
Outdated
@@ -225,6 +225,10 @@ def asarray_tuplesafe(values, dtype=None): | |||
if isinstance(values, list) and dtype in [np.object_, object]: | |||
return construct_1d_object_array_from_listlike(values) | |||
|
|||
if isinstance(values, list) and hasattr(values, "__array__"): | |||
# avoid converting extension array to numpy array |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so we're looking at some custom list subclass?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jbrockmendel
"asarray_tuplesafe" and "_convert_arr_indexer" both return "array-like" object. before, "asarray_tuplesafe" and "_convert_arr_indexer" return numpy array type. But i think, both can return EA type also
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why can't you use use is_list_like?
cb53b55
to
fa76233
Compare
fa76233
to
3ae8f48
Compare
3ae8f48
to
b138f94
Compare
b138f94
to
dbb0cdd
Compare
@jbrockmendel over to you if any comments. |
pandas/core/indexes/datetimelike.py
Outdated
if lib.infer_dtype(keyarr) == "string": | ||
# Weak reasoning that indexer is a list of strings | ||
# representing datetime or timedelta or period | ||
extension_arr = pd_array(keyarr, self.dtype) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we have strings that cant be parsed to self.dtype this raises DateParseError (at least in the Period[D] case). should we catch and re-raise as KeyError?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jbrockmendel
Okay, It makes sense. I add exception handling and test code
f9f246b
to
3efb005
Compare
ac86f36
to
c143ddb
Compare
looks find. can you merge master and ping o green. |
7e57054
to
9f2db9a
Compare
ping @jreback |
thanks @proost for sticking with this |
BUG: List indexer on PeriodIndex doesn't coerce strings (pandas-dev#30515) (#30…
black pandas
git diff upstream/master -u -- "*.py" | flake8 --diff
Description:
In single object indexer case,
datetime string converted to datetime-like object. Because indexing use ".xs" which is ".get_loc" that has type casting labels code.
But in list indexer,
there is no type check and casting code. So i add them.
this bug happen not only series but also dataframe And "datetimeindex" as well as "periodIndex"