BUG: pd.concat raises if called on mixture of empty and non-empty dataframes #18178

JosephWagner · 2017-11-08T19:27:33Z

I noticed a change in how pd.concat works between 0.20.3 and 0.21.0:

import pandas as pd

df1 = pd.DataFrame({'foo': [1]})
df2 = pd.DataFrame({'foo': []})

res = pd.concat([df1, df2])

This example does not raise an exception in 0.20.3. In 0.21.0, it raises the following error:

Traceback (most recent call last):
  File "test.py", line 6, in <module>
    res = pd.concat([df1, df2])
  File "/homes/joewag/miniconda3/envs/py3/lib/python3.6/site-packages/pandas/core/reshape/concat.py", line 213, in concat
    return op.get_result()
  File "/homes/joewag/miniconda3/envs/py3/lib/python3.6/site-packages/pandas/core/reshape/concat.py", line 408, in get_result
    copy=self.copy)
  File "/homes/joewag/miniconda3/envs/py3/lib/python3.6/site-packages/pandas/core/internals.py", line 5202, in concatenate_block_managers
    return BlockManager(blocks, axes)
  File "/homes/joewag/miniconda3/envs/py3/lib/python3.6/site-packages/pandas/core/internals.py", line 3028, in __init__
    self._verify_integrity()
  File "/homes/joewag/miniconda3/envs/py3/lib/python3.6/site-packages/pandas/core/internals.py", line 3239, in _verify_integrity
    construction_error(tot_items, block.shape[1:], self.axes)
  File "/homes/joewag/miniconda3/envs/py3/lib/python3.6/site-packages/pandas/core/internals.py", line 4603, in construction_error
    passed, implied))
ValueError: Shape of passed values is (1, 1), indices imply (1, 0)

Expected Output

I would expect no error, and all(res==df1) to be true
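A minimal sketch of how the expectation above could be checked, assuming a pandas version in which the call does not raise. Note that `all(res == df1)` iterates over column labels rather than cell values, so this sketch compares shape and values explicitly. The names `same_shape` and `same_values` are illustrative only. Dtypes are deliberately not compared, since the empty frame may influence the result dtype.

```python
import pandas as pd

df1 = pd.DataFrame({'foo': [1]})
df2 = pd.DataFrame({'foo': []})

# On 0.21.0 this line raised the ValueError shown in the traceback above.
res = pd.concat([df1, df2])

# Compare shape and cell values; dtype is intentionally left out of the check.
same_shape = res.shape == df1.shape
same_values = res['foo'].tolist() == df1['foo'].tolist()
print(same_shape, same_values)
```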

Output of `pd.show_versions()`

INSTALLED VERSIONS

commit: None
python: 3.6.2.final.0
python-bits: 64
OS: Linux
OS-release: 2.6.32-696.1.1.el6.centos.plus.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.21.0
pytest: 3.2.3
pip: 9.0.1
setuptools: 36.2.2.post20170724
Cython: 0.26
numpy: 1.13.3
scipy: 0.19.1
pyarrow: None
xarray: None
IPython: 6.1.0
sphinx: 1.6.4
patsy: 0.4.1
dateutil: 2.6.1
pytz: 2017.2
blosc: None
bottleneck: None
tables: 3.4.0
numexpr: 2.6.2
feather: None
matplotlib: None
openpyxl: None
xlrd: 1.1.0
xlwt: None
xlsxwriter: 1.0.2
lxml: None
bs4: 4.6.0
html5lib: None
sqlalchemy: 1.1.14
pymysql: 0.7.11.None
psycopg2: None
jinja2: 2.9.6
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None


jreback · 2017-11-08T20:00:18Z

This check: https://github.com/pandas-dev/pandas/blob/master/pandas/core/reshape/concat.py#L293 is a little bogus. If it used np.prod(obj.shape) instead, it would filter properly.

Want to give this a shot?
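To illustrate the arithmetic behind that suggestion, here is a small sketch. It only compares `sum` and `np.prod` on the two shapes from the report; it is not the actual code in `concat.py`, and the dictionary and list names are hypothetical.

```python
import numpy as np

# Shapes of the two frames from the report:
# pd.DataFrame({'foo': [1]}) is (1, 1), pd.DataFrame({'foo': []}) is (0, 1).
shapes = {"non_empty": (1, 1), "empty": (0, 1)}

# A sum-based test treats the empty frame as non-empty, because its
# single column still contributes 1 to the sum.
kept_by_sum = [name for name, shape in shapes.items() if sum(shape) > 0]

# A product-based test is zero whenever any dimension has length zero.
kept_by_prod = [name for name, shape in shapes.items() if np.prod(shape) > 0]

print(kept_by_sum)
print(kept_by_prod)
```

This shows only why the two tests disagree on a frame with columns but no rows; whether swapping one for the other is a correct fix is a separate question.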

SmokinCaterpillar · 2017-11-09T10:59:51Z

While trying to fix this, I stumbled upon #18187. There is also unexpected behavior when an empty and a non-empty Series are concatenated: the call does not fail, but simply returns an empty Series.
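A rough sketch of the Series case, assuming the expected behavior is that the non-empty values survive the concatenation. The exact inputs used in #18187 are not given here, so `s1` and `s2` below are made-up examples.

```python
import pandas as pd

s1 = pd.Series([1, 2])
s2 = pd.Series([], dtype="int64")  # explicit dtype for the empty Series

# The expectation is that the empty Series contributes nothing,
# regardless of its position in the list.
res_a = pd.concat([s1, s2])
res_b = pd.concat([s2, s1])

print(res_a.tolist(), res_b.tolist())
```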

SmokinCaterpillar · 2017-11-09T11:29:59Z

@jreback the straightforward fix via np.prod(obj.shape) does not seem to work. Simply replacing sum(obj.shape) with it causes test_append_length0_frame to fail at https://github.com/pandas-dev/pandas/blob/master/pandas/tests/reshape/test_concat.py#L760

Edit: I'm not so sure if "Effort Low" is the correct tag here :-D.
Probably someone has to solve this and #18187 simultaneously by fixing some changes from 0.20.x to 0.21.0. I'll try to look into it, but I don't know if I am able to find a solution in a reasonable amount of time.

jreback · 2017-11-09T12:40:02Z

> I'm not so sure if "Effort Low" is the correct tag here :-D.

Sure it is. It's not 0 effort, rather an hour or 2.

…of empty RangeIndex

The `_concat_rangeindex_same_dtype` now keeps track of the last non-empty RangeIndex to extract the new stop value. This fixes two issues with concatenating non-empty and empty DataFrames and Series. Two regression tests were added as well.
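The idea in that commit message can be sketched with plain Python `range` objects. This is not the pandas implementation: `concat_ranges` is a hypothetical helper, and the fallback to a plain list for non-contiguous input is an assumption made for the sake of a self-contained example.

```python
def concat_ranges(ranges):
    """Merge contiguous ranges into one range, skipping empty ones.

    Falls back to a plain list of values when the non-empty ranges
    do not line up into a single range.
    """
    start = step = None
    last_non_empty = None
    for r in ranges:
        if len(r) == 0:
            continue  # an empty range must not influence the result
        if last_non_empty is None:
            start, step = r.start, r.step
        elif r.step != step or r.start != last_non_empty[-1] + step:
            return [value for rng in ranges for value in rng]
        last_non_empty = r
    if last_non_empty is None:
        return range(0)  # every input was empty
    # Take the stop from the last NON-EMPTY range, not from the last range.
    return range(start, last_non_empty[-1] + step, step)


merged = concat_ranges([range(1), range(0)])
print(merged, list(merged))
```

Taking the stop value from a trailing empty range instead would give a stop of 0 here and therefore an empty result, which is the kind of outcome the thread describes for the Series case.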


jreback added Bug Difficulty Intermediate Reshaping Concat, Merge/Join, Stack/Unstack, Explode labels Nov 8, 2017

jreback added this to the Next Major Release milestone Nov 8, 2017

jreback changed the title ~~pd.concat raises if called on mixture of empty and non-empty dataframes~~ BUG: pd.concat raises if called on mixture of empty and non-empty dataframes Nov 8, 2017

SmokinCaterpillar mentioned this issue Nov 9, 2017

BUG: pd.concat returns empty series if called on mixture of empty and non-empty series #18187

Closed

jorisvandenbossche added Regression Functionality that used to work in a prior pandas version and removed Bug labels Nov 9, 2017

jorisvandenbossche modified the milestones: Next Major Release, 0.21.1 Nov 9, 2017

SmokinCaterpillar mentioned this issue Nov 9, 2017

Fix for #18178 and #18187 by changing the concat of empty RangeIndex #18191

Merged

jreback closed this as completed in #18191 Nov 10, 2017

jreback pushed a commit that referenced this issue Nov 10, 2017

Fix for #18178 and #18187 by changing the concat of empty RangeIndex (…

6b3641b

…#18191)

jreback mentioned this issue Nov 11, 2017

inconsistency in concat behavior in pandas 0.21.0 #18227

Closed

No-Stream pushed a commit to No-Stream/pandas that referenced this issue Nov 28, 2017

Fix for pandas-dev#18178 and pandas-dev#18187 by changing the concat…

f7e931b

… of empty RangeIndex (pandas-dev#18191)

TomAugspurger pushed a commit to TomAugspurger/pandas that referenced this issue Dec 8, 2017

Fix for pandas-dev#18178 and pandas-dev#18187 by changing the concat…

de4be1f

… of empty RangeIndex (pandas-dev#18191) (cherry picked from commit 6b3641b)

TomAugspurger pushed a commit that referenced this issue Dec 11, 2017

Fix for #18178 and #18187 by changing the concat of empty RangeIndex (…

fc1c2b8

…#18191) (cherry picked from commit 6b3641b)
