BUG: Retain tz-aware dtypes with melt (#15785) #20292
pandas/tests/reshape/test_melt.py
Outdated
@pytest.mark.parametrize("col", [
    pd.Series(pd.date_range('2010', periods=5, tz='US/Pacific')),
    pd.Series(["a", "b", "c", "a", "d"], dtype="category")])
def test_pandas_dtypes_id_var(self, col):
If you want to reduce code duplication further, you could parametrize something like "as_val" with True and False, and then add a conditional at the top of the function to set attr2 to either the col or [0, 1, 0, 0, 0].
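A minimal sketch of what this suggestion could look like, assuming a single test parametrized over both "col" and "as_val". The test name, the melt call, and the final assertion are illustrative assumptions, not the code from this PR:

```python
import pandas as pd
import pytest


@pytest.mark.parametrize("as_val", [True, False])
@pytest.mark.parametrize("col", [
    pd.Series(pd.date_range('2010', periods=5, tz='US/Pacific')),
    pd.Series(["a", "b", "c", "a", "d"], dtype="category")])
def test_pandas_dtypes_sketch(col, as_val):
    # Hypothetical test body: attr2 is either the pandas-dtype column
    # (so it also appears among the value columns) or plain integers.
    attr2 = col if as_val else [0, 1, 0, 0, 0]
    df = pd.DataFrame({'klass': range(5),
                       'col': col,
                       'attr1': [1, 0, 0, 0, 0],
                       'attr2': attr2})
    result = df.melt(id_vars=['klass', 'col'])
    # The id column should keep its tz-aware / categorical dtype.
    assert result['col'].dtype == col.dtype
```

The trade-off raised later in this review applies here: the conditional hides which cases are actually exercised, so it shortens the code at some cost to readability.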
Ahh, I see this is what @WillAyd suggested. In any event, this makes the test much harder to read.
Codecov Report
@@            Coverage Diff             @@
##           master   #20292      +/-   ##
==========================================
+ Coverage   91.72%   91.73%   +<.01%
==========================================
  Files         150      150
  Lines       49165    49174       +9
==========================================
+ Hits        45099    45108       +9
  Misses       4066     4066
pandas/core/reshape/melt.py
Outdated
id_data = frame.pop(col)
if is_extension_type(id_data):
    # Preserve pandas dtype by not converting to a numpy array
    id_data = concat([id_data] * K, ignore_index=True)
@TomAugspurger do we have this in ExtensionArray ATM, i.e. a .tile()? Or could we emulate it like this?
No .tile. We do have _concat_same_type.
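For illustration, the concat-based emulation of tiling discussed here can be sketched with public pandas API only. The value of K and the sample Series are assumptions chosen for the example; the private _concat_same_type is deliberately not called:

```python
import pandas as pd

K = 3  # number of repetitions; hypothetical value for illustration

tz_series = pd.Series(pd.date_range('2010', periods=2, tz='US/Pacific'))
cat_series = pd.Series(["a", "b"], dtype="category")

# Concatenating K copies repeats the data end to end, like np.tile would,
# but never leaves pandas, so the dtype of the input is carried through.
tiled_tz = pd.concat([tz_series] * K, ignore_index=True)
tiled_cat = pd.concat([cat_series] * K, ignore_index=True)

print(tiled_tz.dtype)
print(tiled_cat.dtype)
```

ignore_index=True gives the result a fresh 0..N-1 index, which matches the positional layout a tiled array would have.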
pandas/core/reshape/melt.py
Outdated
mdata[col] = np.tile(frame.pop(col).values, K)
id_data = frame.pop(col)
if is_extension_type(id_data):
    # Preserve pandas dtype by not converting to a numpy array
comment is not needed
pandas/tests/reshape/test_melt.py
Outdated
df = DataFrame({'klass': range(5),
                'col': col,
                'attr1': [1, 0, 0, 0, 0]})
if pandas_dtype_value:
Can you move what this does into the parametrize, maybe by using a fixture? This defeats the purpose of being able to look at the parametrize and see what cases are being tested.
doc/source/whatsnew/v0.23.0.txt
Outdated
@@ -896,6 +896,7 @@ Timezones
- Bug in :func:`Timestamp.tz_localize` where localizing a timestamp near the minimum or maximum valid values could overflow and return a timestamp with an incorrect nanosecond value (:issue:`12677`)
- Bug when iterating over :class:`DatetimeIndex` that was localized with fixed timezone offset that rounded nanosecond precision to microseconds (:issue:`19603`)
- Bug in :func:`DataFrame.diff` that raised an ``IndexError`` with tz-aware values (:issue:`18578`)
- Bug in :func:`melt` that coverted tz-aware dtypes to tz-naive (:issue:`15785`)
coverted --> converted
thanks!
git diff upstream/master -u -- "*.py" | flake8 --diff
The .values call was converting tz-aware data to tz-naive data (by casting to a numpy array). Added an additional test for Categorical data as well.
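The cast described above can be reproduced in isolation. This is a small sketch with an assumed sample Series, not code taken from the PR:

```python
import pandas as pd

s = pd.Series(pd.date_range('2010', periods=3, tz='US/Pacific'))

# .values on a tz-aware Series returns a plain numpy datetime64 array;
# numpy has no timezone-aware dtype, so the tz information is dropped.
as_numpy = s.values

# Wrapping the array back into a Series therefore yields tz-naive data.
round_trip = pd.Series(as_numpy)

print(s.dtype)
print(as_numpy.dtype)
print(round_trip.dtype)
```

Any code path that passes such a column through a numpy array, as np.tile does, hits the same loss.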