BUG: pd.concat raises if called on mixture of empty and non-empty dataframes #18178

Closed
JosephWagner opened this issue Nov 8, 2017 · 4 comments · Fixed by #18191
Labels
Regression Functionality that used to work in a prior pandas version Reshaping Concat, Merge/Join, Stack/Unstack, Explode
Comments

@JosephWagner
Contributor

I noticed a change in how pd.concat works between 0.20.3 and 0.21.0:

import pandas as pd

df1 = pd.DataFrame({'foo': [1]})
df2 = pd.DataFrame({'foo': []})

res = pd.concat([df1, df2])

This example does not raise an exception in 0.20.3. In 0.21.0, it raises the following error:

Traceback (most recent call last):
  File "test.py", line 6, in <module>
    res = pd.concat([df1, df2])
  File "/homes/joewag/miniconda3/envs/py3/lib/python3.6/site-packages/pandas/core/reshape/concat.py", line 213, in concat
    return op.get_result()
  File "/homes/joewag/miniconda3/envs/py3/lib/python3.6/site-packages/pandas/core/reshape/concat.py", line 408, in get_result
    copy=self.copy)
  File "/homes/joewag/miniconda3/envs/py3/lib/python3.6/site-packages/pandas/core/internals.py", line 5202, in concatenate_block_managers
    return BlockManager(blocks, axes)
  File "/homes/joewag/miniconda3/envs/py3/lib/python3.6/site-packages/pandas/core/internals.py", line 3028, in __init__
    self._verify_integrity()
  File "/homes/joewag/miniconda3/envs/py3/lib/python3.6/site-packages/pandas/core/internals.py", line 3239, in _verify_integrity
    construction_error(tot_items, block.shape[1:], self.axes)
  File "/homes/joewag/miniconda3/envs/py3/lib/python3.6/site-packages/pandas/core/internals.py", line 4603, in construction_error
    passed, implied))
ValueError: Shape of passed values is (1, 1), indices imply (1, 0)

Expected Output

I would expect no error, and all(res==df1) to be true
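The expectation above can be checked with a small script that runs on any pandas version. This is only a sketch: it catches the `ValueError` reported for 0.21.0 instead of crashing, and it silences warnings because some later pandas releases may warn about empty entries in `concat`. It compares values through `tolist()` because `all()` applied to a DataFrame iterates over column labels, not cell values.

```python
import warnings

import pandas as pd

df1 = pd.DataFrame({'foo': [1]})
df2 = pd.DataFrame({'foo': []})

try:
    with warnings.catch_warnings():
        # Assumption: newer pandas versions may emit a warning about
        # empty entries; it is irrelevant to this check.
        warnings.simplefilter("ignore")
        res = pd.concat([df1, df2])
except ValueError as exc:
    # Behaviour reported in this issue for pandas 0.21.0.
    res = None
    print("concat raised:", exc)
else:
    # Expected behaviour: the single row of df1 survives.
    print("rows:", len(res), "values:", res['foo'].tolist())
```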

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.6.2.final.0
python-bits: 64
OS: Linux
OS-release: 2.6.32-696.1.1.el6.centos.plus.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.21.0
pytest: 3.2.3
pip: 9.0.1
setuptools: 36.2.2.post20170724
Cython: 0.26
numpy: 1.13.3
scipy: 0.19.1
pyarrow: None
xarray: None
IPython: 6.1.0
sphinx: 1.6.4
patsy: 0.4.1
dateutil: 2.6.1
pytz: 2017.2
blosc: None
bottleneck: None
tables: 3.4.0
numexpr: 2.6.2
feather: None
matplotlib: None
openpyxl: None
xlrd: 1.1.0
xlwt: None
xlsxwriter: 1.0.2
lxml: None
bs4: 4.6.0
html5lib: None
sqlalchemy: 1.1.14
pymysql: 0.7.11.None
psycopg2: None
jinja2: 2.9.6
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None

@jreback
Contributor

jreback commented Nov 8, 2017

This check: https://github.com/pandas-dev/pandas/blob/master/pandas/core/reshape/concat.py#L293 is a little bogus; if it were np.prod(obj.shape) instead, it would filter properly.

want to give this a shot?
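To illustrate the difference between the two expressions, here is a small sketch that evaluates `sum(shape)` and `np.prod(shape)` for a few frame shapes. The shapes and labels are chosen for illustration only; `(0, 1)` is the shape of `df2` in the report, which has one column and no rows.

```python
import numpy as np

# Illustrative shapes: (rows, columns)
shapes = {
    "non-empty": (1, 1),
    "no rows": (0, 1),       # shape of pd.DataFrame({'foo': []})
    "no columns": (1, 0),
    "no rows, no columns": (0, 0),
}

results = {}
for label, shape in shapes.items():
    by_sum = sum(shape)            # zero only if every dimension is zero
    by_prod = int(np.prod(shape))  # zero if any dimension is zero
    results[label] = (by_sum, by_prod)
    print(f"{label:20s} shape={shape} sum={by_sum} prod={by_prod}")
```

A truthiness test on `sum(shape)` treats the `(0, 1)` frame as non-empty, while the product is zero whenever any dimension is zero.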

@jreback jreback added Bug Difficulty Intermediate Reshaping Concat, Merge/Join, Stack/Unstack, Explode labels Nov 8, 2017
@jreback jreback added this to the Next Major Release milestone Nov 8, 2017
@jreback jreback changed the title pd.concat raises if called on mixture of empty and non-empty dataframes BUG: pd.concat raises if called on mixture of empty and non-empty dataframes Nov 8, 2017
@SmokinCaterpillar
Contributor

While trying to fix this, I stumbled upon #18187. There is also unexpected behavior when an empty and a non-empty Series are concatenated: the call does not fail, but simply returns an empty Series.
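A sketch of the Series case, for checking what a given pandas version does. The concrete values and the explicit `float64` dtype of the empty Series are assumptions made for the example; the comment above does not say which inputs were used.

```python
import warnings

import pandas as pd

s1 = pd.Series([1])
s2 = pd.Series([], dtype="float64")  # assumed dtype, for illustration

with warnings.catch_warnings():
    # Assumption: some pandas versions may warn about empty entries.
    warnings.simplefilter("ignore")
    res = pd.concat([s1, s2])

# A length of 0 would match the behaviour described for 0.21.0;
# a length of 1 is the expected result.
print("length of result:", len(res))
```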

@jorisvandenbossche jorisvandenbossche added Regression Functionality that used to work in a prior pandas version and removed Bug labels Nov 9, 2017
@jorisvandenbossche jorisvandenbossche modified the milestones: Next Major Release, 0.21.1 Nov 9, 2017
@SmokinCaterpillar
Contributor

SmokinCaterpillar commented Nov 9, 2017

@jreback the straightforward fix via np.prod(obj.shape) does not seem to work. Simply replacing sum(obj.shape) with it leads to a failure of test_append_length0_frame at https://github.com/pandas-dev/pandas/blob/master/pandas/tests/reshape/test_concat.py#L760

Edit: I'm not so sure if "Effort Low" is the correct tag here :-D.
Probably someone has to solve this and #18187 simultaneously by fixing some changes from 0.20.x to 0.21.0. I'll try to look into it, but I don't know if I am able to find a solution in a reasonable amount of time.

@jreback
Contributor

jreback commented Nov 9, 2017

I'm not so sure if "Effort Low" is the correct tag here :-D.

Sure it is; it's not 0 effort, rather an hour or 2.

SmokinCaterpillar pushed a commit to flix-tech/pandas that referenced this issue Nov 9, 2017
…of empty RangeIndex

The `_concat_rangeindex_same_dtype` now keeps track of the last non-empty RangeIndex to extract the new stop value.

This fixes two issues with concatenating non-empty and empty DataFrames and Series.

Two regression tests were added as well.
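
The idea in the commit message can be pictured with built-in `range` objects standing in for `RangeIndex`. This is a pure-Python sketch, not the pandas implementation: `naive_concat` and `concat_ranges` are hypothetical names, and the naive variant models an assumed failure mode (taking the stop value from the last input even when it is empty), chosen because it would be consistent with the "indices imply (1, 0)" error above.

```python
def naive_concat(ranges):
    """Assumed failure mode: stop is taken from the last input, even if empty."""
    return range(ranges[0].start, ranges[-1].stop)


def concat_ranges(ranges):
    """Concatenate contiguous ranges, taking stop from the last non-empty one.

    Returns None when the inputs cannot be expressed as a single range.
    """
    non_empty = [r for r in ranges if len(r) > 0]
    if not non_empty:
        return range(0)

    first = non_empty[0]
    step = first.step
    expected_start = first.start
    for r in non_empty:
        if r.step != step or r.start != expected_start:
            return None  # gap or step mismatch: not a single range
        expected_start = r[-1] + step

    last = non_empty[-1]  # the last non-empty range supplies the stop value
    return range(first.start, last[-1] + step, step)


inputs = [range(0, 1), range(0, 0)]  # non-empty followed by empty
print(len(naive_concat(inputs)))     # length implied by the naive version
print(len(concat_ranges(inputs)))    # length implied by the tracked version
```

With a non-empty range followed by an empty one, the naive version yields an empty range, while the version that tracks the last non-empty range keeps the single element.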
No-Stream pushed a commit to No-Stream/pandas that referenced this issue Nov 28, 2017
TomAugspurger pushed a commit to TomAugspurger/pandas that referenced this issue Dec 8, 2017
TomAugspurger pushed a commit that referenced this issue Dec 11, 2017
4 participants