BUG: melt changes type of tz-aware columns #15785

stigviaene · 2017-03-23T13:43:32Z

Code Samples

import pandas as pd
frame = pd.DataFrame({'klass':range(5), 'ts': [pd.Timestamp('2017-03-23 08:22:42.173378+01'), pd.Timestamp('2017-03-23 08:22:42.178578+01'), pd.Timestamp('2017-03-23 08:22:42.173578+01'), pd.Timestamp('2017-03-23 08:22:42.178378+01'), pd.Timestamp('2017-03-23 08:22:42.163378+01')], 'attribute':['att1', 'att2', 'att3', 'att4', 'att5'], 'value': ['a', 'b', 'c', 'd', 'd']})
# At this point, frame.ts is of dtype datetime64[ns, pytz.FixedOffset(60)]
frame.set_index(['ts', 'klass'], inplace=True)
queried_index = frame.query('value=="d"').index
pivoted_frame = frame.reset_index().pivot_table(index=['klass', 'ts'], columns='attribute', values='value', aggfunc='first')
melted_frame = pd.melt(pivoted_frame.reset_index(), id_vars=['klass', 'ts'], var_name='attribute', value_name='value')
# At this point, melted_frame.ts is of dtype datetime64[ns]
queried_after_melted_index = melted_frame.query('value=="d"').set_index(['ts', 'klass']).index
frame.loc[queried_index]  # Works
frame.loc[queried_index] = 'test'  # Works
frame.loc[queried_after_melted_index]  # Works
frame.loc[queried_after_melted_index] = 'test'  # Breaks

The last statement gives:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.5/dist-packages/pandas/core/indexing.py", line 140, in __setitem__
    indexer = self._get_setitem_indexer(key)
  File "/usr/local/lib/python3.5/dist-packages/pandas/core/indexing.py", line 127, in _get_setitem_indexer
    return self._convert_to_indexer(key, is_setter=True)
  File "/usr/local/lib/python3.5/dist-packages/pandas/core/indexing.py", line 1230, in _convert_to_indexer
    raise KeyError('%s not in index' % objarr[mask])
KeyError: "MultiIndex(levels=[[2017-03-23 07:22:42.163378, 2017-03-23 07:22:42.173378, 2017-03-23 07:22:42.173578, 2017-03-23 07:22:42.178378, 2017-03-23 07:22:42.178578], [0, 1, 2, 3, 4]],\n           labels=[[3, 0], [3, 4]],\n           names=['ts', 'klass']) not in index"
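
The naive values in the KeyError above (07:22:42 rather than 08:22:42+01:00) suggest that the melted column holds UTC wall times with the timezone dropped. Assuming that is the case, a possible workaround on affected versions is to re-attach the timezone after melting. The helper name `restore_tz` below is hypothetical and not part of pandas; the loss of timezone is simulated here so the sketch runs on any pandas version.

```python
import pandas as pd


def restore_tz(column, tz):
    """Hypothetical helper: re-attach `tz` to a datetime column.

    Assumes that a tz-naive column holds UTC wall times, which is what
    the KeyError in this report suggests but is not guaranteed in general.
    """
    if column.dt.tz is not None:
        # Already tz-aware: only convert to the requested timezone.
        return column.dt.tz_convert(tz)
    # Naive values are interpreted as UTC, then converted to the target tz.
    return column.dt.tz_localize("UTC").dt.tz_convert(tz)


original = pd.Series([
    pd.Timestamp("2017-03-23 08:22:42.173378+01:00"),
    pd.Timestamp("2017-03-23 08:22:42.178378+01:00"),
])

# Simulate the reported dtype loss: same instants, timezone stripped.
stripped = original.dt.tz_convert("UTC").dt.tz_localize(None)

restored = restore_tz(stripped, original.dt.tz)
print(restored.dtype)
```

With the timezone restored, an index built from the melted frame should again be comparable with the original tz-aware index.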

Problem description

It is counter-intuitive that an operation alters the dtype of a column unless its documentation explicitly says so.
It is also counter-intuitive that frame.loc behaves differently when reading (the lookup succeeds) than when assigning (the assignment raises a KeyError).

Expected Output

melted_frame.ts and frame.ts have the same dtype.
DataFrame.loc either fails in both cases (lookup and assignment) or succeeds in both.
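
The first expectation can be checked with a much smaller frame than the one in the report. The following is a minimal sketch, with illustrative column names `att1` and `att2` that do not come from the original code. It compares the dtype of a tz-aware `id_vars` column before and after `pd.melt`. On the affected version (0.19.2) the comparison is expected to be False; after the fix referenced in this issue it should be True.

```python
import pandas as pd

frame = pd.DataFrame({
    "klass": range(3),
    "ts": [
        pd.Timestamp("2017-03-23 08:22:42.173378+01:00"),
        pd.Timestamp("2017-03-23 08:22:42.178578+01:00"),
        pd.Timestamp("2017-03-23 08:22:42.173578+01:00"),
    ],
    "att1": ["a", "b", "c"],
    "att2": ["d", "e", "f"],
})

# Melt the two attribute columns, keeping the tz-aware column as an id variable.
melted = pd.melt(
    frame,
    id_vars=["klass", "ts"],
    var_name="attribute",
    value_name="value",
)

# True when melt keeps the tz-aware dtype of the id column.
dtype_preserved = melted["ts"].dtype == frame["ts"].dtype
print(frame["ts"].dtype, melted["ts"].dtype, dtype_preserved)
```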

Output of `pd.show_versions()`

INSTALLED VERSIONS
------------------
commit: None
python: 3.5.2.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.0-66-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.19.2
nose: None
pip: 9.0.1
setuptools: 20.7.0
Cython: None
numpy: 1.12.0
scipy: None
statsmodels: None
xarray: None
IPython: 5.3.0
sphinx: None
patsy: None
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: None
tables: None
numexpr: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: 0.7.3
lxml: 3.5.0
bs4: 4.4.1
html5lib: 0.999
httplib2: 0.9.1
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: 2.6.1 (dt dec pq3 ext lo64)
jinja2: 2.8
boto: None
pandas_datareader: None

jreback · 2017-03-23T14:15:39Z

@stigviaene .melt doesn't have the battery of tests that most other things have, so it's not surprising that this doesn't convert correctly. You're welcome to submit a patch to fix it, or at least see if you can locate the problem.

Your comments on indexing are orthogonal. If you have a specific bug or comment, you can raise it in another issue.

jreback added the Bug, Difficulty Intermediate, Dtype Conversions, Reshaping, and Timezones labels Mar 23, 2017

jreback added this to the Next Major Release milestone Mar 23, 2017

jreback changed the title ~~melt changes type of timestamp columns~~ BUG: melt changes type of timestamp columns Mar 23, 2017

jreback changed the title ~~BUG: melt changes type of timestamp columns~~ BUG: melt changes type of tz-aware columns Mar 23, 2017

jreback mentioned this issue Mar 31, 2017

BUG: melt should preserve Categorical id_vars #15853

Closed

mroeschke mentioned this issue Mar 12, 2018

BUG: Retain tz-aware dtypes with melt (#15785) #20292

Merged

jreback modified the milestones: Next Major Release, 0.23.0 Mar 12, 2018

jreback closed this as completed in #20292 Mar 13, 2018

jreback pushed a commit that referenced this issue Mar 13, 2018

BUG: Retain tz-aware dtypes with melt (#15785) (#20292)

53bf291
