concatenating frame and series with identical keys returns "int() argument must be a string" #33114

Closed
MarcoGorelli opened this issue Mar 29, 2020 · 1 comment

MarcoGorelli commented Mar 29, 2020

Code Sample, a copy-pastable example if possible

>>> import pandas as pd
>>> s = pd.Series([1, 2])
>>> df = pd.DataFrame([[1, 2], [3, 4]])
>>> pd.concat([df, s], axis=1, keys=['a', 'a'])

Problem description

This raises a TypeError:

Traceback
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-4-262b5243c6cf> in <module>
----> 1 pd.concat([df, s], axis=1, keys=['a', 'a'])

~/pandas/pandas/core/reshape/concat.py in concat(objs, axis, join, ignore_index, keys, levels, names, verify_integrity, sort, copy)
    278         verify_integrity=verify_integrity,
    279         copy=copy,
--> 280         sort=sort,
    281     )
    282 

~/pandas/pandas/core/reshape/concat.py in __init__(self, objs, axis, join, keys, levels, names, ignore_index, verify_integrity, copy, sort)
    447         self.copy = copy
    448 
--> 449         self.new_axes = self._get_new_axes()
    450 
    451     def get_result(self):

~/pandas/pandas/core/reshape/concat.py in _get_new_axes(self)
    510         return [
    511             self._get_concat_axis() if i == self.axis else self._get_comb_axis(i)
--> 512             for i in range(ndim)
    513         ]
    514 

~/pandas/pandas/core/reshape/concat.py in <listcomp>(.0)
    510         return [
    511             self._get_concat_axis() if i == self.axis else self._get_comb_axis(i)
--> 512             for i in range(ndim)
    513         ]
    514 

~/pandas/pandas/core/reshape/concat.py in _get_concat_axis(self)
    566         else:
    567             concat_axis = _make_concat_multiindex(
--> 568                 indexes, self.keys, self.levels, self.names
    569             )
    570 

~/pandas/pandas/core/reshape/concat.py in _make_concat_multiindex(indexes, keys, levels, names)
    648 
    649         return MultiIndex(
--> 650             levels=levels, codes=codes_list, names=names, verify_integrity=False
    651         )
    652 

~/pandas/pandas/core/indexes/multi.py in __new__(cls, levels, codes, sortorder, names, dtype, copy, name, verify_integrity, _set_identity)
    281         # we've already validated levels and codes, so shortcut here
    282         result._set_levels(levels, copy=copy, validate=False)
--> 283         result._set_codes(codes, copy=copy, validate=False)
    284 
    285         result._names = [None] * len(levels)

~/pandas/pandas/core/indexes/multi.py in _set_codes(self, codes, level, copy, validate, verify_integrity)
    864             new_codes = FrozenList(
    865                 _coerce_indexer_frozen(level_codes, lev, copy=copy).view()
--> 866                 for lev, level_codes in zip(self._levels, codes)
    867             )
    868         else:

~/pandas/pandas/core/indexes/multi.py in <genexpr>(.0)
    864             new_codes = FrozenList(
    865                 _coerce_indexer_frozen(level_codes, lev, copy=copy).view()
--> 866                 for lev, level_codes in zip(self._levels, codes)
    867             )
    868         else:

~/pandas/pandas/core/indexes/multi.py in _coerce_indexer_frozen(array_like, categories, copy)
   3597         Non-writeable.
   3598     """
-> 3599     array_like = coerce_indexer_dtype(array_like, categories)
   3600     if copy:
   3601         array_like = array_like.copy()

~/pandas/pandas/core/dtypes/cast.py in coerce_indexer_dtype(indexer, categories)
    856     length = len(categories)
    857     if length < _int8_max:
--> 858         return ensure_int8(indexer)
    859     elif length < _int16_max:
    860         return ensure_int16(indexer)

~/pandas/pandas/_libs/algos_common_helper.pxi in pandas._libs.algos.ensure_int8()
     59             return arr
     60         else:
---> 61             return arr.astype(np.int8, copy=copy)
     62     else:
     63         return np.array(arr, dtype=np.int8)
TypeError: int() argument must be a string, a bytes-like object or a number, not 'slice'
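
The final message suggests that a slice object ended up among the MultiIndex codes. The sketch below is one plausible reading of how that could happen, not a confirmed diagnosis: it assumes that the key lookup is done with `Index.get_loc`, that the lookup result is repeated with `np.repeat` to build the codes, and that the codes are then cast to `int8`. The variable names are illustrative only.

```python
import numpy as np
import pandas as pd

# Duplicate keys as in the report: looking up "a" matches two positions.
keys = pd.Index(["a", "a"])
loc = keys.get_loc("a")
print(type(loc).__name__)

# Assumption: the lookup result is repeated to form the codes for one level.
codes = np.repeat(loc, 2)
print(codes.dtype)

# Casting those codes to int8 is where a TypeError would surface.
error = None
try:
    codes.astype(np.int8)
except TypeError as exc:
    error = exc
print(type(error).__name__)
```

With unique keys the lookup would give a plain integer position, so the cast to `int8` would succeed.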

Expected Output

I would expect this either to work or to raise

InvalidIndexError: Reindexing only valid with uniquely valued Index objects

which is what happens when you do

pd.concat([df, df], axis=1, keys=['a', 'a'])
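
Until the behaviour is settled, a caller could check the keys up front and fail with a readable message. This is a minimal sketch of that idea, assuming a `ValueError` is acceptable. `concat_with_unique_keys` is a hypothetical helper written for this example, not part of pandas.

```python
import pandas as pd


def concat_with_unique_keys(objs, keys, axis=1):
    """Hypothetical helper: reject duplicate keys before calling pd.concat."""
    key_index = pd.Index(keys)
    if not key_index.is_unique:
        duplicated = key_index[key_index.duplicated()].unique().tolist()
        raise ValueError(f"keys must be unique, got duplicates: {duplicated}")
    return pd.concat(objs, axis=axis, keys=keys)


s = pd.Series([1, 2])
df = pd.DataFrame([[1, 2], [3, 4]])

# Unique keys go through to pd.concat unchanged.
combined = concat_with_unique_keys([df, s], keys=["a", "b"])
print(combined.shape)

# Duplicate keys fail early with a readable message.
message = None
try:
    concat_with_unique_keys([df, s], keys=["a", "a"])
except ValueError as exc:
    message = str(exc)
print(message)
```

The check runs before `pd.concat` is called, so the duplicate-key case never reaches the MultiIndex construction shown in the traceback.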

Output of pd.show_versions()

INSTALLED VERSIONS

commit : 36d6583
python : 3.7.6.final.0
python-bits : 64
OS : Linux
OS-release : 4.15.0-91-generic
Version : #92-Ubuntu SMP Fri Feb 28 11:09:48 UTC 2020
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_GB.UTF-8
LOCALE : en_GB.UTF-8

pandas : 0.26.0.dev0+2746.g36d6583cf
numpy : 1.18.1
pytz : 2019.3
dateutil : 2.8.1
pip : 20.0.2
setuptools : 45.1.0.post20200119
Cython : 0.29.16
pytest : 5.4.1
hypothesis : 5.8.0
sphinx : 2.4.4
blosc : None
feather : None
xlsxwriter : 1.2.8
lxml.etree : 4.5.0
html5lib : 1.0.1
pymysql : None
psycopg2 : None
jinja2 : 2.11.1
IPython : 7.13.0
pandas_datareader: None
bs4 : 4.8.2
bottleneck : 1.3.2
fastparquet : 0.3.3
gcsfs : None
matplotlib : 3.1.3
numexpr : 2.7.1
odfpy : None
openpyxl : 3.0.1
pandas_gbq : None
pyarrow : 0.16.0
pytables : None
pyxlsb : None
s3fs : 0.4.0
scipy : 1.4.1
sqlalchemy : 1.3.15
tables : 3.6.1
tabulate : 0.8.7
xarray : 0.15.0
xlrd : 1.2.0
xlwt : 1.3.0
numba : 0.48.0

@MarcoGorelli

duplicate of #33654
