ERR: Can't initialise DataFrame using empty Series and empty columns #34977

Open
3 tasks done
MarcoGorelli opened this issue Jun 24, 2020 · 1 comment
Labels: Constructors, Enhancement, Error Reporting

Comments

@MarcoGorelli
Member

MarcoGorelli commented Jun 24, 2020

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • (optional) I have confirmed this bug exists on the master branch of pandas.


Noticed while working on #30858.


Code Sample, a copy-pastable example

This raises an error:

>>> import pandas as pd
>>> pd.DataFrame(pd.Series([]), columns=[])
<stdin>:1: DeprecationWarning: The default dtype for empty Series will be 'object' instead of 'float64' in a future version. Specify a dtype explicitly to silence this warning.
Traceback (most recent call last):
  File "/home/marco/pandas-dev/pandas/core/internals/managers.py", line 1613, in create_block_manager_from_blocks
    make_block(values=blocks[0], placement=slice(0, len(axes[0])))
  File "/home/marco/pandas-dev/pandas/core/internals/blocks.py", line 2728, in make_block
    return klass(values, ndim=ndim, placement=placement)
  File "/home/marco/pandas-dev/pandas/core/internals/blocks.py", line 121, in __init__
    raise ValueError(
ValueError: Wrong number of items passed 1, placement implies 0

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/marco/pandas-dev/pandas/core/frame.py", line 488, in __init__
    mgr = init_ndarray(data, index, columns, dtype=dtype, copy=copy)
  File "/home/marco/pandas-dev/pandas/core/internals/construction.py", line 234, in init_ndarray
    return create_block_manager_from_blocks(block_values, [columns, index])
  File "/home/marco/pandas-dev/pandas/core/internals/managers.py", line 1623, in create_block_manager_from_blocks
    raise construction_error(tot_items, blocks[0].shape[1:], axes, e)
ValueError: Empty data passed with indices specified.

But this doesn't:

>>> pd.DataFrame([], columns=[])
Empty DataFrame
Columns: []
Index: []
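The two calls above can be compared side by side with a small script. This is only a sketch: `try_construct` is a hypothetical helper introduced for illustration, and the empty Series is given an explicit dtype purely to avoid the `DeprecationWarning` shown in the traceback. The outcome of the Series case may depend on the installed pandas version, so the script reports whatever happens rather than assuming a result.

```python
import pandas as pd


def try_construct(data, columns):
    """Hypothetical helper: report whether pd.DataFrame(data, columns=columns) succeeds.

    Returns ("ok", shape) on success or ("error", message) if a ValueError is raised.
    """
    try:
        return "ok", pd.DataFrame(data, columns=columns).shape
    except ValueError as exc:
        return "error", str(exc)


# Explicit dtype so the empty-Series default-dtype warning is not triggered.
empty_series = pd.Series([], dtype="float64")

# Case from the report: empty Series plus empty columns.
series_status, series_detail = try_construct(empty_series, [])

# Case that works in the report: empty list plus empty columns.
list_status, list_detail = try_construct([], [])

print("Series input:", series_status, series_detail)
print("list input:  ", list_status, list_detail)
```

On the development version used in this report, the first line would be expected to show the `ValueError` message and the second an empty frame of shape `(0, 0)`.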

#### Problem description

Both calls ask for an empty DataFrame with no columns, yet passing an empty Series raises a `ValueError` while passing an empty list succeeds.

#### Expected Output

Empty DataFrame
Columns: []
Index: []


#### Output of ``pd.show_versions()``

<details>
INSTALLED VERSIONS
------------------
commit           : 6a6faf596bc1e3bf4078d4837f654d0f2f754820
python           : 3.8.3.final.0
python-bits      : 64
OS               : Linux
OS-release       : 5.4.0-37-generic
Version          : #41-Ubuntu SMP Wed Jun 3 18:57:02 UTC 2020
machine          : x86_64
processor        : x86_64
byteorder        : little
LC_ALL           : None
LANG             : en_GB.UTF-8
LOCALE           : en_GB.UTF-8

pandas           : 1.1.0.dev0+1944.g6a6faf596
numpy            : 1.18.5
pytz             : 2020.1
dateutil         : 2.8.1
pip              : 20.1.1
setuptools       : 47.3.1.post20200616
Cython           : 0.29.20
pytest           : 5.4.3
hypothesis       : 5.16.1
sphinx           : 3.1.1
blosc            : None
feather          : None
xlsxwriter       : 1.2.9
lxml.etree       : 4.5.1
html5lib         : 1.0.1
pymysql          : None
psycopg2         : None
jinja2           : 2.11.2
IPython          : 7.15.0
pandas_datareader: None
bs4              : 4.9.1
bottleneck       : 1.3.2
fsspec           : 0.7.4
fastparquet      : 0.4.0
gcsfs            : None
matplotlib       : 3.2.1
numexpr          : 2.7.1
odfpy            : None
openpyxl         : 3.0.3
pandas_gbq       : None
pyarrow          : 0.17.1
pytables         : None
pyxlsb           : None
s3fs             : 0.4.2
scipy            : 1.4.1
sqlalchemy       : 1.3.17
tables           : 3.6.1
tabulate         : 0.8.7
xarray           : 0.15.1
xlrd             : 1.2.0
xlwt             : 1.3.0
numba            : 0.48.0

</details>
MarcoGorelli added the Bug and Needs Triage labels on Jun 24, 2020
TomAugspurger removed the Needs Triage label on Sep 4, 2020
@TomAugspurger
Contributor

@MarcoGorelli I'm not sure this is a bug, since you're passing a 1-D input to something that's supposed to be 2-D. This looks like the right way to do it:

In [68]: pd.DataFrame([np.array([])], columns=[])
Out[68]:
Empty DataFrame
Columns: []
Index: [0]

(certainly the error message can be improved).
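The 1-D versus 2-D point can be illustrated with shapes. The sketch below is not taken from the thread beyond the calls already shown: it assumes that a Series passed on its own becomes a single column, so an empty Series implies one column while `columns=[]` implies none. The `index=`-only construction at the end is one possible way to get an empty frame that reuses the Series index; it is offered as an assumption, not as a recommended fix.

```python
import numpy as np
import pandas as pd

# Explicit dtype avoids the empty-Series default-dtype warning.
empty_series = pd.Series([], dtype="float64")

# A Series on its own is treated as one column: zero rows, one column.
as_frame = pd.DataFrame(empty_series)
print(as_frame.shape)  # (0, 1)

# Genuinely 2-D empty input agrees with columns=[]: zero rows, zero columns.
two_d = pd.DataFrame(np.empty((0, 0)), columns=[])
print(two_d.shape)  # (0, 0)

# Possible alternative (assumption, not from the thread): pass only the
# index and the empty column list, with no data at all.
workaround = pd.DataFrame(index=empty_series.index, columns=[])
print(workaround.shape)  # (0, 0)
```

Seen this way, the original call asks pandas to fit one column of data into zero column labels, which is why a shape mismatch is reported.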

TomAugspurger added the Constructors label on Sep 4, 2020
MarcoGorelli added the Error Reporting label on Sep 4, 2020
MarcoGorelli changed the title from "BUG: Can't initialise DataFrame using empty Series and empty columns" to "ERR: Can't initialise DataFrame using empty Series and empty columns" on Sep 4, 2020
MarcoGorelli removed the Bug label on Sep 4, 2020