Make _freq/freq/tz/_tz/dtype/_dtype/offset/_offset all inherit reliably #24517
Codecov Report
@@ Coverage Diff @@
## master #24517 +/- ##
==========================================
+ Coverage 31.88% 31.89% +<.01%
==========================================
Files 166 166
Lines 52434 52445 +11
==========================================
+ Hits 16718 16726 +8
- Misses 35716 35719 +3
Continue to review full report at Codecov.
Codecov Report
@@ Coverage Diff @@
## master #24517 +/- ##
==========================================
+ Coverage 31.88% 31.88% +<.01%
==========================================
Files 166 166
Lines 52434 52412 -22
==========================================
- Hits 16718 16714 -4
+ Misses 35716 35698 -18
Continue to review full report at Codecov.
result.name = name
# For groupby perf. See note in indexes/base about _index_data
# TODO: make sure this is updated correctly if edited
This TODO is for _index_data? In theory that shouldn't happen, since DatetimeIndex is immutable.
In _libs.reduction there is a line:
object.__setattr__(cached_ityp, '_index_data', islider.buf)
which makes me wary. Is this just never relevant for DTI?
That's the intent of all this, to directly mutate the buffer in place. The .reset method undoes all this stuff. Nobody else should be messing with it.
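For readers unfamiliar with the pattern, here is a minimal sketch of how `object.__setattr__` can swap an attribute on an object whose own `__setattr__` refuses writes. `FrozenIndex` is a hypothetical stand-in rather than the pandas class, and a plain list stands in for the slider buffer.

```python
class FrozenIndex:
    """Hypothetical stand-in for an immutable index; not the pandas class."""

    def __init__(self, data):
        # Bypass our own guard once, during construction.
        object.__setattr__(self, "_index_data", data)

    def __setattr__(self, name, value):
        # Ordinary attribute assignment is refused.
        raise AttributeError("FrozenIndex is immutable")


idx = FrozenIndex([1, 2, 3])

try:
    idx._index_data = [9, 9, 9]
    blocked = False
except AttributeError:
    blocked = True

# object.__setattr__ skips the class-level guard entirely, which is why
# code that does this has to restore the original state itself afterwards.
buf = [4, 5, 6]
object.__setattr__(idx, "_index_data", buf)
```

The point of the sketch is only that immutability enforced through `__setattr__` is a convention, so any code that bypasses it is responsible for undoing its changes.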
@TomAugspurger I think all your other comments have been addressed, not sure about this one. Should this TODO comment be removed? Some other action taken?
Yeah this todo isn't necessary AFAICT.
Also, I really don't think we should have inverted the relationship between the eadata and data attributes.
Just because that change isn't going to last through #24024, so I think it was unnecessary.
Also, I really don't think we should have inverted the relationship between the eadata and data attributes. [...] so I think it was unnecessary.
We can't have both 1) _eadata being a property that depends on self.freq and 2) freq being a property that depends on _eadata.
Definitely agree that 24024 should return things to the old pattern.
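The circularity described above can be sketched with toy classes. `ToyArray` and `CircularIndex` are hypothetical stand-ins for DatetimeArray and DatetimeIndex, not the pandas implementation. Each property calls the other, so accessing either one never terminates normally.

```python
class ToyArray:
    """Hypothetical stand-in for DatetimeArray."""

    def __init__(self, data, freq=None):
        self._data = data
        self.freq = freq


class CircularIndex:
    """Hypothetical index where _eadata and freq depend on each other."""

    def __init__(self, data):
        self._data = data

    @property
    def _eadata(self):
        # Building the array needs self.freq ...
        return ToyArray(self._data, freq=self.freq)

    @property
    def freq(self):
        # ... and self.freq needs the array, so neither can be computed.
        return self._eadata.freq


def hits_recursion(index):
    """Return True if reading index.freq recurses without end."""
    try:
        index.freq
    except RecursionError:
        return True
    return False


circular = hits_recursion(CircularIndex([1, 2, 3]))
```

One of the two has to be stored state, which is the choice the thread is debating.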
def _eadata(self):
    return DatetimeArray._simple_new(self._data,
                                     tz=self.tz, freq=self.freq)
def _data(self):
I don't think that _data is a property anywhere else. Just an attribute that's set in _simple_new.
Actually... how does this even work? If you don't have a setter (which I don't see) then simple_new should fail.
Ah I see you set eadata.
It doesn't really matter since we're removing it soon anyway, but I'd prefer that _eadata always reference _data. That keeps all the index classes consistent, in that ._data is the actual array, not a property.
I don't think that _data is a property anywhere else. Just an attribute that's set in _simple_new.
That's right. In this PR we set _eadata inside _simple_new and make _data a property returning _eadata._data. The freq/tz passthrough doesn't work with _eadata as a property (as in master), so the only question is whether to also set _data in _simple_new. I chose to make it a property to prevent any shenanigans where they become untied.
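A minimal sketch of the arrangement described in this comment, assuming toy classes: `ToyArray` and `ToyIndex` are hypothetical stand-ins, and the `freq` setter is an illustration rather than something confirmed by the diff. The array is stored once in `_simple_new`, and `_data`, `freq`, and `tz` are read through it so they cannot drift apart.

```python
class ToyArray:
    """Hypothetical stand-in for DatetimeArray."""

    def __init__(self, data, freq=None, tz=None):
        self._data = data
        self.freq = freq
        self.tz = tz


class ToyIndex:
    """Hypothetical index that stores the array and aliases its attributes."""

    @classmethod
    def _simple_new(cls, values, freq=None, tz=None):
        result = object.__new__(cls)
        # The array is the single source of truth; nothing else is stored.
        result._eadata = ToyArray(values, freq=freq, tz=tz)
        return result

    @property
    def _data(self):
        # Read-only view of the array's buffer, so the two cannot diverge.
        return self._eadata._data

    @property
    def freq(self):
        return self._eadata.freq

    @freq.setter
    def freq(self, value):
        # Writes go to the stored array, so index and array always agree.
        self._eadata.freq = value

    @property
    def tz(self):
        return self._eadata.tz


idx = ToyIndex._simple_new([1, 2, 3], freq="D", tz="UTC")
idx.freq = "H"
```

Because `_data` has no setter here, assigning to it raises `AttributeError`, which is the trade-off raised earlier in the thread.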
I think https://github.com/pandas-dev/pandas/pull/24024/files#diff-26a6d2ca7adfca586aabbb1c9dd8bf36R74 is what we want for eadata & freq (and if we can do it here, instead of that PR, then that's best).
The freq/tz passthrough doesn't work with _eadata as a property (as in master),
Why's that? I suppose I could see why setting doesn't work, since IIUC we create a new DatetimeArray on each invocation of _eadata.
Should we just hold off on these changes til #24024 then?
Ah, of course it won't work, since we call .freq when creating the _eadata instance.
While you're at it, could you add
@property
def _ndarray_values(self):
    return self._eadata._ndarray_values
@@ -231,8 +231,13 @@ def _simple_new(cls, values, freq=None, tz=None):
result = object.__new__(cls)
result._data = values
result._freq = freq
tz = timezones.maybe_get_tz(tz)
result._tz = timezones.tz_standardize(tz)
if tz is None:
Is it cheap to detect when you need to call maybe_get_tz & tz_standardize? Ideally these could be called in the DatetimeTZDtype constructor; alternatively, there could be a dedicated constructor, DatetimeTZDtype.construct_from_tztype, which would pave the way for a unified dtype as well.
just a small comment on future directions about dtypes.
ping on green.
ping
thanks!
 """
 # GH 18595
-return self._tz
+return getattr(self._dtype, "tz", None)
Is there any reason this uses the private attribute instead of the dtype property?
No, feel free to change
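As a rough illustration of the `getattr(..., "tz", None)` pattern under discussion, the sketch below reads the timezone off the dtype and falls back to `None` when the dtype has no `tz` attribute. `NaiveDtype`, `AwareDtype`, and `ToyArray` are hypothetical stand-ins, and the `tz` property goes through the public `dtype` property as the reviewer suggests.

```python
class NaiveDtype:
    """Hypothetical stand-in for a dtype that carries no tz attribute."""


class AwareDtype:
    """Hypothetical stand-in for a timezone-aware dtype."""

    def __init__(self, tz):
        self.tz = tz


class ToyArray:
    """Hypothetical array whose timezone lives on its dtype."""

    def __init__(self, dtype):
        self._dtype = dtype

    @property
    def dtype(self):
        return self._dtype

    @property
    def tz(self):
        # A dtype without a tz attribute means the data is timezone-naive.
        return getattr(self.dtype, "tz", None)


naive_tz = ToyArray(NaiveDtype()).tz
aware_tz = ToyArray(AwareDtype("UTC")).tz
```

Storing the timezone only on the dtype means there is no separate `_tz` attribute that could fall out of sync with it.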
There was an issue in rebasing #24024 in which index._freq didn't match index._eadata._freq. This PR ensures that can never happen by removing _freq/freq from the Index subclasses and making them properties aliasing the _eadata attrs.
To do this, we have to make _eadata not-a-property for the time being.
Also make the _tz --> _dtype transition from #24024, and fix a couple of places in DatetimeIndex that set _tz manually and would otherwise be missed.
Remove PeriodIndex.shift, since it can now use the DatetimeIndexOpsMixin version (also done in #24024).