inconsistency in concat behavior in pandas 0.21.0 #18227
Labels: Duplicate Report, Reshaping
Code Sample, a copy-pastable example if possible
Rewriting it in copy/pastable form...
This works:
And this does not:
Problem description
This is also described here.
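A note on the df2.equals(df4) observation in this report: DataFrame.equals compares dtypes as well as labels and values, so two empty frames with identical column names can still compare unequal. The sketch below shows this with hypothetical frames empty_obj and empty_int. They are not the frames from this report, and whether a dtype difference is what separates df2 from df4 is an assumption, not a diagnosis.

```python
import pandas as pd

# Hypothetical frames for illustration only; not the df2/df4 from this report.
# Built with no data, so the columns default to object dtype.
empty_obj = pd.DataFrame(columns=["a", "b"])

# Built by filtering an integer frame down to zero rows, so the columns stay int64.
source = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
empty_int = source[source["a"] > 10]

# Both frames are empty and share the same column names...
same_columns = list(empty_obj.columns) == list(empty_int.columns)

# ...but equals() also takes dtypes into account.
frames_equal = empty_obj.equals(empty_int)

print(same_columns)
print(empty_obj.dtypes.tolist(), empty_int.dtypes.tolist())
print(frames_equal)
```

Comparing the .dtypes of the two empty frames is a quick way to check whether this applies to a given pair.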
In the above code, df1 equals df3, and df2 and df4 are both empty dataframes with the same column names (although, strangely, df2 and df4 aren't equal according to df2.equals(df4)), but pd.concat([df1, df2]) works while pd.concat([df3, df4]) results in a ValueError. This did not happen in previous versions of pandas, but when I upgraded to 0.21.0, it started happening.
Oddly, as the stackoverflow link notes, using the drop_duplicates() method on either df3 or df4 (or both) results in the concat() working, even though neither of them contains any duplicates.
Expected Output
The expected output of pd.concat([df3, df4]) is
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 2.7.13.final.0
python-bits: 64
OS: Darwin
OS-release: 16.7.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: None.None
pandas: 0.21.0
pytest: 3.2.1
pip: 9.0.1
setuptools: 32.1.0
Cython: 0.23.4
numpy: 1.13.3
scipy: 0.19.1
pyarrow: None
xarray: None
IPython: 5.3.0
sphinx: None
patsy: 0.4.1
dateutil: 2.6.1
pytz: 2017.3
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 2.0.2
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: 4.6.0
html5lib: 0.999999999
sqlalchemy: 1.1.14
pymysql: 0.7.11.None
psycopg2: 2.7.3.1 (dt dec pq3 ext lo64)
jinja2: 2.9.6
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: 0.5.0