-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
REF: Simplify Period/Datetime Array/Index constructors #23093
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 1 commit
0e40ed4
dc63072
ee210e4
fc88e59
69a8f7e
04c75ca
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -27,6 +27,7 @@ | |
from pandas.tseries import frequencies | ||
from pandas.tseries.offsets import Tick, DateOffset | ||
|
||
from pandas.core.arrays import datetimelike as dtl | ||
from pandas.core.arrays.datetimelike import DatetimeLikeArrayMixin | ||
|
||
|
||
|
@@ -132,7 +133,7 @@ def __new__(cls, values, freq=None, **kwargs): | |
# TODO: what if it has tz? | ||
values = dt64arr_to_periodarr(values, freq) | ||
|
||
return cls._simple_new(values, freq, **kwargs) | ||
return cls._simple_new(values, freq=freq, **kwargs) | ||
|
||
@classmethod | ||
def _simple_new(cls, values, freq=None, **kwargs): | ||
|
@@ -141,21 +142,27 @@ def _simple_new(cls, values, freq=None, **kwargs): | |
Ordinals in an ndarray are fastpath-ed to `_from_ordinals` | ||
""" | ||
|
||
if is_period_dtype(values): | ||
freq = dtl.validate_dtype_freq(values.dtype, freq) | ||
values = values.asi8 | ||
|
||
if not is_integer_dtype(values): | ||
values = np.array(values, copy=False) | ||
if len(values) > 0 and is_float_dtype(values): | ||
raise TypeError("{cls} can't take floats" | ||
.format(cls=cls.__name__)) | ||
return cls(values, freq=freq) | ||
return cls(values, freq=freq, **kwargs) | ||
|
||
return cls._from_ordinals(values, freq) | ||
return cls._from_ordinals(values, freq=freq, **kwargs) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. was this causing bugs that kwargs was not passed? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is necessary in order to allow The PeriodIndex constructors in particular (but also DatetimeIndex and TimedeltaIndex) are really messy and complicated. Inheriting methods and deleting as much as possible makes it easier for me to simplify them, even if some of that will end up getting copy/pasted down the road if/when inheritance is replaced. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can you then move the implementation and share it by calling the other class instead of inheriting? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Do you mean moving the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I meant in PeriodIndex._simple_new calling PeriodArrayMixin._simple_new, similar as you did in one of the previous PRs (didn't look at the code, so don't know if that is possible in this case) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think I get what you're saying, will give it a shot There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Did anything come of this, one way or the other? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes. In an earlier commit I deleted PeriodIndex._simple_new and retained PeriodIndex._from_ordinals. Now that has been reversed, and its non-trivially nicer. |
||
|
||
@classmethod | ||
def _from_ordinals(cls, values, freq=None): | ||
def _from_ordinals(cls, values, freq=None, **kwargs): | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ideally I'd like to get rid of |
||
""" | ||
Values should be int ordinals | ||
`__new__` & `_simple_new` cooerce to ordinals and call this method | ||
""" | ||
# **kwargs are included so that the signature matches PeriodIndex, | ||
# letting us share _simple_new | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why do you want to share this? Once we actually don't inherit it anymore, we will need to add it back to PeriodIndex There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think the signature should match across index classes, and hopefully the implementation can be shared. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. But this one is about There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think we're moving towards getting rid of |
||
|
||
values = np.array(values, dtype='int64', copy=False) | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -131,15 +131,6 @@ def __new__(cls, values, freq=None, start=None, end=None, periods=None, | |
|
||
freq, freq_infer = dtl.maybe_infer_freq(freq) | ||
|
||
if values is None: | ||
# TODO: Remove this block and associated kwargs; GH#20535 | ||
if freq is None and com._any_none(periods, start, end): | ||
raise ValueError('Must provide freq argument if no data is ' | ||
'supplied') | ||
periods = dtl.validate_periods(periods) | ||
return cls._generate_range(start, end, periods, freq, | ||
closed=closed) | ||
|
||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is still present in the TimedeltaIndex constructor, but its not too late to get it out of the TimedeltaArray constructor. |
||
result = cls._simple_new(values, freq=freq) | ||
if freq_infer: | ||
inferred = result.inferred_freq | ||
|
@@ -151,6 +142,12 @@ def __new__(cls, values, freq=None, start=None, end=None, periods=None, | |
@classmethod | ||
def _generate_range(cls, start, end, periods, freq, closed=None, **kwargs): | ||
# **kwargs are for compat with TimedeltaIndex, which includes `name` | ||
|
||
periods = dtl.validate_periods(periods) | ||
if freq is None and any(x is None for x in [periods, start, end]): | ||
raise ValueError('Must provide freq argument if no data is ' | ||
'supplied') | ||
|
||
if com.count_not_none(start, end, periods, freq) != 3: | ||
raise ValueError('Of the four parameters: start, end, periods, ' | ||
'and freq, exactly three must be specified') | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -181,15 +181,7 @@ def __new__(cls, data=None, ordinal=None, freq=None, start=None, end=None, | |
if name is None and hasattr(data, 'name'): | ||
name = data.name | ||
|
||
if dtype is not None: | ||
dtype = pandas_dtype(dtype) | ||
if not is_period_dtype(dtype): | ||
raise ValueError('dtype must be PeriodDtype') | ||
if freq is None: | ||
freq = dtype.freq | ||
elif freq != dtype.freq: | ||
msg = 'specified freq and dtype are different' | ||
raise IncompatibleFrequency(msg) | ||
freq = dtl.validate_dtype_freq(dtype, freq) | ||
|
||
# coerce freq to freq object, otherwise it can be coerced elementwise | ||
# which is slow | ||
|
@@ -218,7 +210,7 @@ def __new__(cls, data=None, ordinal=None, freq=None, start=None, end=None, | |
# not array / index | ||
if not isinstance(data, (np.ndarray, PeriodIndex, | ||
DatetimeIndex, Int64Index)): | ||
if is_scalar(data) or isinstance(data, Period): | ||
if is_scalar(data): | ||
cls._scalar_data_error(data) | ||
|
||
# other iterable of some kind | ||
|
@@ -248,21 +240,7 @@ def _engine(self): | |
return self._engine_type(lambda: self, len(self)) | ||
|
||
@classmethod | ||
def _simple_new(cls, values, name=None, freq=None, **kwargs): | ||
""" | ||
Values can be any type that can be coerced to Periods. | ||
Ordinals in an ndarray are fastpath-ed to `_from_ordinals` | ||
""" | ||
if not is_integer_dtype(values): | ||
values = np.array(values, copy=False) | ||
if len(values) > 0 and is_float_dtype(values): | ||
raise TypeError("PeriodIndex can't take floats") | ||
return cls(values, name=name, freq=freq, **kwargs) | ||
|
||
return cls._from_ordinals(values, name, freq, **kwargs) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If keeping this now, I think it will be clearer what the changes are once we actually split index/array There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't think it matters too much either way. @jbrockmendel does reverting this change cause issues with master? Or does the "old" There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It has changed since joris's comment above, but this is just a matter of code duplication.
I'm kind of hoping I can simplify these a good deal further before you finish up with SparseEA |
||
|
||
@classmethod | ||
def _from_ordinals(cls, values, name=None, freq=None, **kwargs): | ||
def _from_ordinals(cls, values, freq=None, name=None): | ||
""" | ||
Values should be int ordinals | ||
`__new__` & `_simple_new` cooerce to ordinals and call this method | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved these checks here from inside
__new__
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, as long as we're OK with ignoring gibberish arguments when there not actually used (which seems to be what we do right now).