-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
BUG: preserve timezone info when writing empty tz-aware series to HDF5 #37072
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
11f00bb
to
869bf8c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok i am not sure how to grok this one as i think its on top of the other; in the future put them both in 1 PR is ok for this size change, or now, wait till the other is merged then rebase this one (or make this orthogonal)
pandas/io/pytables.py
Outdated
@@ -2960,11 +2960,17 @@ def write_array_empty(self, key: str, value: ArrayLike): | |||
node._v_attrs.value_type = str(value.dtype) | |||
node._v_attrs.shape = value.shape | |||
|
|||
def write_array(self, key: str, value: ArrayLike, items: Optional[Index] = None): | |||
# TODO: we only have one test that gets here, the only EA | |||
def write_array(self, key: str, obj, items: Optional[Index] = None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is the typing changed? can you be explicit here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a type hint.
AFAIU for series to_numpy()
drops timezone info so we need to pass the original object in SeriesFixed.read(...)
and special case here on timezone-aware data
xref (from stale PR that worked on this) https://github.com/pandas-dev/pandas/pull/20595/files#r179816447
pandas/io/pytables.py
Outdated
elif is_datetime64tz_dtype(obj): | ||
value = obj.dt._get_values() | ||
else: | ||
value = obj.values |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use .to_numpy()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
Ok will do! For this one I'll wait until the other is merge (since this is a dependant) |
d30f5bf
to
2ea93bb
Compare
@jreback this is ready for re-review
|
pandas/io/pytables.py
Outdated
# that gets passed is DatetimeArray, and we never have | ||
# both self._filters and EA | ||
assert isinstance(value, (np.ndarray, ABCExtensionArray)), type(value) | ||
|
||
if isinstance(obj, (np.ndarray, ABCExtensionArray)): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is the annotation of obj as a Series accurate? if so, this branch is never hit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No it's not. Should be SeriesOrFrame (will fix)
pandas/io/pytables.py
Outdated
if isinstance(obj, (np.ndarray, ABCExtensionArray)): | ||
value = obj | ||
elif is_datetime64tz_dtype(obj): | ||
value = obj.dt._get_values() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i dont know off the top of my head what _get_values returns, is it a DatetimeArray? The standard pattern would be to use extract_array(obj, extract_numpy=True)
at the beginning of this method
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_get_values
returns a DatetimeIndex
But actually the extract_array
pattern does the job of this entire if/elif/else block here - thanks for suggesting! (I'll remember for future)
CI Green + ready for re-review |
thanks @arw2019 |
black pandas
git diff upstream/master -u -- "*.py" | flake8 --diff
xref #20595 (stale PR that looked into this)