Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
BUG/TST: Fix infer_dtype for Period array-likes and general ExtensionArrays #37367
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BUG/TST: Fix infer_dtype for Period array-likes and general ExtensionArrays #37367
Changes from 1 commit
a3b7ad7
6aa4475
d6f6c0a
50d7425
ed36be2
ce64b64
0e15a9a
688104f
f5ea183
4c539e3
0504ad1
88dcd81
b53f46e
100b83b
fbb713d
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we explicity add the Period[D] types to the type_map (there aren't that many)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Going from the
PeriodDtypeCode
enum, there are actually quite some?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we were to get that specific, at some point it makes more sense to just return a dtype object
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
right here why can't we try to convert to an EA type using
registry.find()
?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because we already have a
dtype
here, there is no need to look anything up in the registry. We don't want to infer a type, we have an EAdtype
that we want to find in the TYPE_MAP to infer the category returned byinfer_dtype
, by checking the name, kind or base attributes of the dtype.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to import those in lib.pyx ? (might be fine, eg there are already imports from tslibs.period, but at the moment only
cimport
s)If that's OK, I am fine with changing it. I don't necessarily find it less "hacky" than the current solution, but I just want some solution that is acceptable for all
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we not already import all of the scalar EA types?
why is this any different
+1 on using the existing machinery
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
100% agree on both points
not ideal, but i dont think it will harm anything ATM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
exactly you are missing the entire point of the dtype abstraction
you avoid parsing strings in the first place
i will be blocking this util / unless a good soln is done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lib.pyx doesn't know anything about EAs. It only imports helper functions like
is_period_object
different than what?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we have tests that get here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The base extension test and the decimal test that I added in this PR get here (those were raising errors before)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note I am not fully sure about this change. Before it raised an error, now it will typically return "mixed" for random python objects. Both don't seem very useful (or in other words, both are an indication that nothing could be inferred). But I found it a bit inconsistent to raise in this case, while if you would pass a list of the same objects, we would actually infer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we are already doing this below. This does a full conversion of the object, I think this will simply kill performance in some cases making this routine very unpredictable. would rather not make this change here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
random python objects will be marked as 'mixed' in any event w/o the performance panelty below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if you are going to remove the exception then you can remove this L1337,1338 entirely (as np.asarray is called on L1341)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed the duplicate
asarray
, but so we should still decide if we want to keep that exception or not