Empty dataframe with a multi-index loses the type information about the index types #23842

Closed
haroldfox opened this issue Nov 21, 2018 · 3 comments
Labels
Dtype Conversions: Unexpected or buggy dtype conversions; Duplicate Report: Duplicate issue or pull request; Indexing: Related to indexing on series/frames, not to indexes themselves; MultiIndex

Comments

@haroldfox

Code Sample, a copy-pastable example if possible

import pandas as pd

test = pd.DataFrame({
    'time': [pd.Timestamp('2010-01-01'), pd.Timestamp('2011-01-01')],
    'id': ['X', 'Y'],
    'val': [0, 1],
}).set_index(['time', 'id'])

# Take zero rows, then move the index levels back into columns
test2 = test.head(0).reset_index()
[(col, test2[col].dtype) for col in test2.columns]

Problem description

The two former index columns of test2, time and id, both come back as float64 (val keeps int64):
[('time', dtype('float64')), ('id', dtype('float64')), ('val', dtype('int64'))]

I am on pandas 0.23.4 (the version reported by pd.show_versions() below).

The index levels should keep their dtypes (datetime64[ns] for time, object for id). With a single-level index the dtype is preserved, so this looks like a MultiIndex-specific problem.
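To make the single-index versus MultiIndex comparison concrete, here is a minimal sketch that runs the same two steps (head(0), then reset_index()) on both kinds of index and prints the resulting dtypes next to the original ones. The variable names df, single, and multi are illustrative and not from the report. On the pandas 0.23 series described here, the multi line is where float64 shows up; other pandas versions may print something different.

```python
import pandas as pd

df = pd.DataFrame({
    "time": [pd.Timestamp("2010-01-01"), pd.Timestamp("2011-01-01")],
    "id": ["X", "Y"],
    "val": [0, 1],
})

# Single-level index: truncate to zero rows, then move the index back to a column.
single = df.set_index("time").head(0).reset_index()

# Two-level index: the same steps, as in the report.
multi = df.set_index(["time", "id"]).head(0).reset_index()

print("original:", df.dtypes.to_dict())
print("single:  ", single.dtypes.to_dict())
print("multi:   ", multi.dtypes.to_dict())
```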

Expected Output

[('time', dtype('<M8[ns]')), ('id', dtype('O')), ('val', dtype('int64'))]
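One possible workaround, which nobody in this thread proposed and which is only a sketch, is to reverse the order of the two steps: call reset_index() while the frame still has rows, and only then take head(0). The index levels become ordinary columns before the frame is emptied, so the empty result carries the column dtypes along. The names df, indexed, and empty are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "time": [pd.Timestamp("2010-01-01"), pd.Timestamp("2011-01-01")],
    "id": ["X", "Y"],
    "val": [0, 1],
})
indexed = df.set_index(["time", "id"])

# Reset the index while rows still exist, then drop all rows.
empty = indexed.reset_index().head(0)

print([(col, empty[col].dtype) for col in empty.columns])
```

This only helps when the non-empty frame is still available at the point where the empty one is needed; it does not address the underlying behaviour of reset_index() on an already-empty MultiIndex.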

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.5.6.final.0
python-bits: 64
OS: Linux
OS-release: 4.1.35-pv-ts2
machine: x86_64
processor:
byteorder: little
LC_ALL: en_US.utf8
LANG: en_US.utf8
LOCALE: en_US.UTF-8

pandas: 0.23.4
pytest: 3.8.0
pip: 10.0.1
setuptools: 40.2.0
Cython: 0.28.5
numpy: 1.13.3
scipy: 1.1.0
pyarrow: 0.8.0
xarray: None
IPython: 6.5.0
sphinx: 1.7.9
patsy: 0.5.0
dateutil: 2.7.3
pytz: 2018.5
blosc: None
bottleneck: 1.2.1
tables: 3.4.4
numexpr: 2.6.8
feather: None
matplotlib: 2.2.3
openpyxl: 2.5.6
xlrd: 1.1.0
xlwt: None
xlsxwriter: 1.1.0
lxml: None
bs4: 4.6.3
html5lib: 0.9999999
sqlalchemy: 1.1.13
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None

@gfyoung added the Dtype Conversions, Indexing, and MultiIndex labels Nov 21, 2018
@gfyoung
Member

gfyoung commented Nov 21, 2018

Unusual use case, but given that it already works for a single Index (confirmed), I don't see why it shouldn't work with a MultiIndex.

cc @toobaz

@TomAugspurger
Contributor

Duplicate of #19602 I think.

@TomAugspurger added the Duplicate Report label Nov 21, 2018
@TomAugspurger added this to the No action milestone Nov 21, 2018
@haroldfox
Author

yep, it's a dupe. thanks for looking into it
