REF: Simplify Datetimelike constructor dispatching #23140
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -141,7 +141,7 @@ def __new__(cls, values, freq=None, dtype=None, **kwargs): | |
|
||
elif is_object_dtype(values) or isinstance(values, (list, tuple)): | ||
shouldn't this be is_list_like? (for the isinstance check)

This is specifically for object dtype (actually, I need to add

specifically what happens if other non ndarray list likes hit this path? do they need handling?

They do need handling, but we're not there yet. The thought process for implementing these constructors piece-by-piece is a) The DatetimeIndex/TimedeltaIndex/PeriodIndex constructors are overgrown; let's avoid that in the Array subclasses.

Other question: where was this handled previously?

It's hard for me to say what's better in the abstract. From the WIP PeriodArray PR, I found that having to think carefully about what type of data I had forced some clarity in the code. I liked having to explicitly reach for that
Regardless, I think our two goals with the array constructors should be
If you think we're likely to end up in a situation where being able to pass an array of objects to the main

i am a bit puzzled why you would handle lists and ndarray differently (tom and joris); these are clearly doing the same thing and we have a very similar handling for list likes throughout pandas
separating these is a non starter - even having a separate constructor is also not very friendly. pandas does inference on the construction which is one of the big selling points. trying to change this, esp at the micro level is a huge mental disconnect. if you want to propose something like that pls do it in other issues.

I don't think we are. But, my only argument was
If that's not persuasive then I'm not going to argue against handling them in the init.

+1
+1
+1
Yes, I think we should be pretty forgiving about what gets accepted into

It's not about lists vs arrays, it's about arrays of Period objects vs arrays of ordinal integers, which is something very different.
Being forgiving is exactly what led to the complex Period/DatetimeIndex constructors. I think we should not make the same choice for our Array classes. I personally also think it makes the code clearer to even separate those two concepts (basically what we also did with IntegerArray), but maybe let's open an issue to further discuss that instead of here in a hidden review comment thread? (I can only open one later today) |
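To make the "Period objects vs ordinal integers" distinction in this thread concrete, here is a minimal pure-Python sketch. It is not pandas code: `ToyPeriod`, `ToyPeriodArray`, `from_periods`, and `NAT_ORDINAL` are hypothetical names invented for illustration, and the frequency handling is deliberately simplistic. It only shows the shape of the design being argued for, where the main constructor accepts ordinals only and object inputs go through an explicit alternate constructor.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence

# Hypothetical stand-in for a "missing value" ordinal sentinel.
NAT_ORDINAL = -(2 ** 63)


@dataclass(frozen=True)
class ToyPeriod:
    """Toy scalar: an integer ordinal tagged with a frequency string."""
    ordinal: int
    freq: str


class ToyPeriodArray:
    """Toy container whose __init__ is strict: integer ordinals only."""

    def __init__(self, ordinals: Sequence[int], freq: str):
        if not all(isinstance(v, int) for v in ordinals):
            raise TypeError("ordinals must be integers; use from_periods()")
        self.ordinals: List[int] = list(ordinals)
        self.freq = freq

    @classmethod
    def from_periods(cls, periods: Sequence[Optional[ToyPeriod]],
                     freq: Optional[str] = None) -> "ToyPeriodArray":
        """Explicit path for object inputs (None plays the role of NaT)."""
        if freq is None:
            # Infer the frequency from the first non-missing element.
            freq = next((p.freq for p in periods if p is not None), None)
            if freq is None:
                raise ValueError("freq not specified and cannot be inferred")
        ordinals = []
        for p in periods:
            if p is None:
                ordinals.append(NAT_ORDINAL)
            elif p.freq != freq:
                raise ValueError("mismatched freq: {}".format(p.freq))
            else:
                ordinals.append(p.ordinal)
        return cls(ordinals, freq)


arr = ToyPeriodArray.from_periods([ToyPeriod(360, "M"), None])
print(arr.ordinals, arr.freq)
```

The alternative position in the thread (inference inside the main constructor) would amount to moving the body of `from_periods` behind a type check in `__init__`.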
||
# e.g. array([Period(...), Period(...), NaT]) | ||
values = np.array(values) | ||
values = np.array(values, dtype=object) | ||
if freq is None: | ||
freq = libperiod.extract_freq(values) | ||
values = libperiod.extract_ordinals(values, freq) | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -430,10 +430,7 @@ def min(self, axis=None, *args, **kwargs): | |
-------- | ||
numpy.ndarray.min | ||
""" | ||
if axis is not None and axis >= self.ndim: | ||
raise ValueError("`axis` must be fewer than the number of " | ||
"dimensions ({ndim})".format(ndim=self.ndim)) | ||
|
||
_validate_minmax_axis(axis) | ||
not what i mean

I see, done. |
||
nv.validate_min(args, kwargs) | ||
Is there reason not to add the axis validation to the existing

exactly I don't want another function, rather you can simply check this in side the function which is already there.

Done. I'm not wild about the fact that the nv.validate_(min|max|argmin|argmax) functions now implicitly assume they are only being called on 1-dim objects, but at least the assumption is correct for now.

Hmm, yeah, that makes sense. |
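A rough sketch of what this thread asks for, with the axis check folded into the existing validation helper instead of living in a separate function. This is an assumption-laden illustration, not the pandas implementation: `validate_min_sketch` and `ToyIndex` are hypothetical names, the `axis` and `ndim` parameters are assumed, and the real `nv.validate_*` helpers handle numpy-compat arguments in more detail than shown here.

```python
def validate_min_sketch(args, kwargs, axis=None, ndim=1):
    """Hypothetical validator that also checks `axis`.

    `ndim` defaults to 1, which encodes the "only called on 1-dim
    objects" assumption mentioned in the review.
    """
    # Simplification: reject any extra numpy-compat arguments outright.
    if args or kwargs:
        raise TypeError("min() got unexpected numpy-compat arguments")
    if axis is not None and axis >= ndim:
        raise ValueError("`axis` must be fewer than the number of "
                         "dimensions ({ndim})".format(ndim=ndim))


class ToyIndex:
    """Minimal 1-dim container used only to exercise the validator."""

    def __init__(self, values):
        self._values = list(values)

    def min(self, axis=None, *args, **kwargs):
        # One call now covers both argument and axis validation.
        validate_min_sketch(args, kwargs, axis=axis)
        return min(self._values)


idx = ToyIndex([3, 1, 2])
print(idx.min(), idx.min(axis=0))
```

With this shape, each of `min`, `max`, `argmin`, and `argmax` keeps a single validation call, at the cost of the validator silently assuming one dimension.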
||
|
||
try: | ||
|
@@ -462,10 +459,7 @@ def argmin(self, axis=None, *args, **kwargs): | |
-------- | ||
numpy.ndarray.argmin | ||
""" | ||
if axis is not None and axis >= self.ndim: | ||
raise ValueError("`axis` must be fewer than the number of " | ||
"dimensions ({ndim})".format(ndim=self.ndim)) | ||
|
||
_validate_minmax_axis(axis) | ||
nv.validate_argmin(args, kwargs) | ||
|
||
i8 = self.asi8 | ||
|
@@ -486,10 +480,7 @@ def max(self, axis=None, *args, **kwargs): | |
-------- | ||
numpy.ndarray.max | ||
""" | ||
if axis is not None and axis >= self.ndim: | ||
raise ValueError("`axis` must be fewer than the number of " | ||
"dimensions ({ndim})".format(ndim=self.ndim)) | ||
|
||
_validate_minmax_axis(axis) | ||
nv.validate_max(args, kwargs) | ||
|
||
try: | ||
|
@@ -518,10 +509,7 @@ def argmax(self, axis=None, *args, **kwargs): | |
-------- | ||
numpy.ndarray.argmax | ||
""" | ||
if axis is not None and axis >= self.ndim: | ||
raise ValueError("`axis` must be fewer than the number of " | ||
"dimensions ({ndim})".format(ndim=self.ndim)) | ||
|
||
_validate_minmax_axis(axis) | ||
nv.validate_argmax(args, kwargs) | ||
|
||
i8 = self.asi8 | ||
|
@@ -722,6 +710,25 @@ def _time_shift(self, periods, freq=None): | |
return result | ||
|
||
|
||
def _validate_minmax_axis(axis): | ||
""" | ||
Ensure that the axis argument passed to min, max, argmin, or argmax is | ||
zero or None, as otherwise it will be incorrectly ignored. | ||
|
||
Parameters | ||
---------- | ||
axis : int or None | ||
|
||
Raises | ||
------ | ||
ValueError | ||
""" | ||
see my comment above |
||
ndim = 1 # hard-coded for Index | ||
if axis is not None and axis >= ndim: | ||
raise ValueError("`axis` must be fewer than the number of " | ||
"dimensions ({ndim})".format(ndim=ndim)) | ||
|
||
|
||
def _ensure_datetimelike_to_i8(other, to_utc=False): | ||
""" | ||
helper for coercing an input scalar or array to i8 | ||
|
why wouldn’t u just pop the kwarg for key and pass it directly?
Sure.
Hmm actually that ends up being appreciably more verbose. We have to do separate cls._generate_range calls for TimedeltaArray vs DatetimeArray