DataFrame should allow to hand in dtypes for every column #26305

Closed
dwt opened this issue May 7, 2019 · 1 comment
Labels
Dtype Conversions (Unexpected or buggy dtype conversions), Duplicate Report (Duplicate issue or pull request)

Comments

dwt commented May 7, 2019

Currently it's quite hard to create an empty data frame that has specific columns and dtypes. I run into this on a regular basis, which is why I'm filing this feature request.

Code Sample, a copy-pastable example if possible

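import numpy as np
import pandas as pd
from datetime import datetime  # assumed import; the original sample does not show it

# state of the art / workaround: build the frame from an empty structured array
empty_result = pd.DataFrame(np.empty((0,),
    dtype=[
        ('time', datetime),
        ('ability', float),
        ('error', float),
        ('index', int),
        ('index_error', float)
    ]))

# how it should be (proposed; the constructor does not accept this as of pandas 0.24.2)
empty_result = pd.DataFrame(
    dtype=[
        ('time', datetime),
        ('ability', float),
        ('error', float),
        ('index', int),
        ('index_error', float)
    ])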

Problem description

It's just very unintuitive to fully specify a data frame when it's empty. I need this regularly when writing APIs that deal with empty incoming data: the empty input would make other pandas operations fail, so a correctly typed empty data frame has to be constructed and returned instead.

Expected Output

An empty data frame should be constructed.
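
One way to get that result with the existing API is to build each column as an empty Series with an explicit dtype. This is a minimal sketch, not something proposed in the issue: the `schema` dict and the `empty_frame` helper are hypothetical names, and `datetime64[ns]` is an assumed stand-in for the `datetime` column type in the sample above.

```python
import pandas as pd

# Hypothetical schema mirroring the columns from the code sample;
# 'datetime64[ns]' is an assumed choice for the 'time' column.
schema = {
    "time": "datetime64[ns]",
    "ability": "float64",
    "error": "float64",
    "index": "int64",
    "index_error": "float64",
}


def empty_frame(schema):
    """Return a zero-row DataFrame with one typed column per schema entry."""
    return pd.DataFrame(
        {name: pd.Series(dtype=dtype) for name, dtype in schema.items()}
    )


empty_result = empty_frame(schema)
print(empty_result.dtypes)
```

Building the columns one Series at a time keeps each dtype explicit and avoids going through a NumPy structured array.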

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 2.7.10.final.0
python-bits: 64
OS: Darwin
OS-release: 18.5.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: de_DE.utf-8
LOCALE: None.None

pandas: 0.24.2
pytest: 3.4.2
pip: 19.1
setuptools: 40.6.3
Cython: None
numpy: 1.14.6
scipy: 1.2.1
pyarrow: None
xarray: None
IPython: 5.8.0
sphinx: None
patsy: None
dateutil: 2.8.0
pytz: 2018.9
blosc: None
bottleneck: 1.2.1
tables: None
numexpr: 2.6.9
feather: None
matplotlib: 2.2.4
openpyxl: 2.6.1
xlrd: 1.2.0
xlwt: None
xlsxwriter: 1.1.5
lxml.etree: 4.3.2
bs4: 4.7.1
html5lib: 1.0.1
sqlalchemy: 1.2.18
pymysql: 0.9.3
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
gcsfs: None

jreback (Contributor) commented May 7, 2019

duplicate of #9133 and #4464

We already accept this in .astype(), so we would take it for the constructor. pandas gets updated by pull requests from the community; you are welcome to do this.
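
As a rough illustration of the `.astype()` route mentioned here, a per-column dtype mapping can be applied to an empty frame that only has column labels. This is a sketch under assumptions: the `dtypes` dict is a hypothetical name, and `datetime64[ns]` is an assumed dtype for the `time` column from the original sample.

```python
import pandas as pd

# Hypothetical column -> dtype mapping based on the issue's example columns.
dtypes = {
    "time": "datetime64[ns]",
    "ability": "float64",
    "error": "float64",
    "index": "int64",
    "index_error": "float64",
}

# Create the empty frame with column labels only (all object dtype),
# then cast every column in one call via the dict form of astype().
empty_result = pd.DataFrame(columns=list(dtypes)).astype(dtypes)
print(empty_result.dtypes)
```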

jreback closed this as completed May 7, 2019
jreback added the Dtype Conversions and Duplicate Report labels May 7, 2019
jreback added this to the No action milestone May 7, 2019