-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
CLN: Move to_period, to_perioddelta up to EA subclasses #23113
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 9 commits
e2c6727
5c32348
ba5c2c1
e1e05a9
d3ca5e9
b4c2496
192170b
b8c2647
a99b1f1
edea5ef
a4ee53d
eaf364f
0a5bb77
b38ad9b
28bcd5a
033960a
91aaef4
44c8162
ef5c1e6
3b31903
0ce415b
5d8bc1a
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -13,6 +13,7 @@ | |
|
||
from pandas.errors import NullFrequencyError, PerformanceWarning | ||
from pandas import compat | ||
from pandas.compat.numpy import function as nv | ||
|
||
from pandas.tseries import frequencies | ||
from pandas.tseries.offsets import Tick, DateOffset | ||
|
@@ -207,6 +208,24 @@ def astype(self, dtype, copy=True): | |
return self._box_values(self.asi8) | ||
return super(DatetimeLikeArrayMixin, self).astype(dtype, copy) | ||
|
||
def repeat(self, repeats, *args, **kwargs): | ||
""" | ||
Analogous to ndarray.repeat | ||
""" | ||
nv.validate_repeat(args, kwargs) | ||
if is_period_dtype(self): | ||
freq = self.freq | ||
else: | ||
freq = None | ||
return self._shallow_copy(self.asi8.repeat(repeats), | ||
freq=freq) | ||
|
||
def tolist(self): | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. same for There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. this is ok, again a common / useful method. actually I would move these up to array.base (and if needed subclass here for example repeat) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. But then let's first have a general discussion about adding it to the EA interface, and what the exact expected semantics are. But unless we have a clear use case for pandas (and have we? or is this on the Index rather for users?), I think we should a bit hesitant to add a lot. (and also the same comment applies here: even if we add it to the Array object, it should not be removed from the Index) |
||
""" | ||
return a list of the underlying data | ||
""" | ||
return list(self.astype(object)) | ||
|
||
# ------------------------------------------------------------------ | ||
# Null Handling | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -829,6 +829,24 @@ def to_period(self, freq=None): | |
|
||
return PeriodArrayMixin(self.values, freq=freq) | ||
|
||
def to_perioddelta(self, freq): | ||
""" | ||
Calculate TimedeltaArray of difference between index | ||
values and index converted to PeriodArray at specified | ||
freq. Used for vectorized offsets | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I find this a rather "obscure" method (meaning, not directly sure how it would be useful for me as a user). But I am also not a frequent user of periods/offsets, so difficult to say (and maybe a good example in the docstring might already help). So what I want to say: in case this is indeed rather for internal use, I would not make it a public method on the array class (but can still do the move but make it for example private here) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is used indirectly in DatetimeIndex + DateOffset, so will be needed for DatetimeArray + DateOffset. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. we might want to make this private, i dont' think we need to expose to the user There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I privatized this earlier today, but have now changed my mind. For DateOffset arithmetic to work "for free" it needs to have the same name. We can revisit this later if needbe. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. OK, can you then add a comment like |
||
|
||
Parameters | ||
---------- | ||
freq: Period frequency | ||
|
||
Returns | ||
------- | ||
TimedeltaArray/Index | ||
""" | ||
from pandas.core.arrays.timedeltas import TimedeltaArrayMixin | ||
i8delta = self.asi8 - self.to_period(freq).to_timestamp().asi8 | ||
return TimedeltaArrayMixin(i8delta) | ||
|
||
# ----------------------------------------------------------------- | ||
# Properties - Vectorized Timestamp Properties/Methods | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -42,7 +42,6 @@ | |
from pandas.tseries.offsets import ( | ||
generate_range, CDay, prefix_mapping) | ||
|
||
from pandas.core.tools.timedeltas import to_timedelta | ||
from pandas.util._decorators import Appender, cache_readonly, Substitution | ||
import pandas.core.common as com | ||
import pandas.tseries.offsets as offsets | ||
|
@@ -621,13 +620,6 @@ def to_series(self, keep_tz=False, index=None, name=None): | |
|
||
return Series(values, index=index, name=name) | ||
|
||
@Appender(DatetimeArrayMixin.to_period.__doc__) | ||
def to_period(self, freq=None): | ||
from pandas.core.indexes.period import PeriodIndex | ||
|
||
result = DatetimeArrayMixin.to_period(self, freq=freq) | ||
return PeriodIndex(result, name=self.name) | ||
|
||
def snap(self, freq='S'): | ||
""" | ||
Snap time stamps to nearest occurring frequency | ||
|
@@ -699,23 +691,6 @@ def union(self, other): | |
result.freq = to_offset(result.inferred_freq) | ||
return result | ||
|
||
def to_perioddelta(self, freq): | ||
""" | ||
Calculate TimedeltaIndex of difference between index | ||
values and index converted to periodIndex at specified | ||
freq. Used for vectorized offsets | ||
|
||
Parameters | ||
---------- | ||
freq: Period frequency | ||
|
||
Returns | ||
------- | ||
y: TimedeltaIndex | ||
""" | ||
return to_timedelta(self.asi8 - self.to_period(freq) | ||
.to_timestamp().asi8) | ||
|
||
def union_many(self, others): | ||
""" | ||
A bit of a hack to accelerate unioning a collection of indexes | ||
|
@@ -1247,6 +1222,9 @@ def slice_indexer(self, start=None, end=None, step=None, kind=None): | |
is_year_end = wrap_field_accessor(DatetimeArrayMixin.is_year_end) | ||
is_leap_year = wrap_field_accessor(DatetimeArrayMixin.is_leap_year) | ||
|
||
to_perioddelta = wrap_array_method(DatetimeArrayMixin.to_perioddelta, | ||
False) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Does anyone know why the result's |
||
to_period = wrap_array_method(DatetimeArrayMixin.to_period, True) | ||
normalize = wrap_array_method(DatetimeArrayMixin.normalize, True) | ||
to_julian_date = wrap_array_method(DatetimeArrayMixin.to_julian_date, | ||
False) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is a
repeat
method needed? For now, we don't have that in our EA interface, so if avoidable, I would keep it that way (and keep it on the index).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
repeat
is also moved to PeriodArray in #22862.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So we might need to discuss it there as well.
But also, in that PR, the index method still exists. So in this PR, you are removing something from the index, which will need to be added back in a next PR.