BUG: Printing of MultiIndex fails for longer strings #53537

Closed
2 of 3 tasks
hagenw opened this issue Jun 6, 2023 · 7 comments
Labels: Bug, Needs Triage (issue that has not been reviewed by a pandas team member)

hagenw commented Jun 6, 2023

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas as pd

# Short string, works when printing
index = pd.MultiIndex.from_arrays((['a' * 10], [0.1], [0.9]), names=['file', 'start', 'end'])
print(index)

# Long string, fails when printing
index = pd.MultiIndex.from_arrays((['a' * 58], [0.1], [0.9]), names=['file', 'start', 'end'])
print(index)

Issue Description

The last print statement fails with:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/audeering.local/hwierstorf/.envs/test/lib/python3.8/site-packages/pandas/core/indexes/base.py", line 1233, in __repr__
    data = self._format_data()
  File "/home/audeering.local/hwierstorf/.envs/test/lib/python3.8/site-packages/pandas/core/indexes/base.py", line 1275, in _format_data
    return format_object_summary(
  File "/home/audeering.local/hwierstorf/.envs/test/lib/python3.8/site-packages/pandas/io/formats/printing.py", line 440, in format_object_summary
    summary, line = _extend_line(summary, line, tail[-1], display_width - 2, space2)
  File "/home/audeering.local/hwierstorf/.envs/test/lib/python3.8/site-packages/pandas/io/formats/printing.py", line 355, in _extend_line
    if adj.len(line.rstrip()) + adj.len(value.rstrip()) >= display_width:
AttributeError: 'tuple' object has no attribute 'rstrip'
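
The traceback shows _extend_line calling value.rstrip() on tail[-1], which is annotated as str but is evidently a tuple at this point. A minimal sketch of that failure mode follows. It uses a simplified stand-in, not the pandas source: the plain len in place of adj.len and the final append step are assumptions made for illustration.

```python
def extend_line(summary, line, value, display_width, next_line_prefix):
    """Simplified stand-in for the helper seen in the traceback (not pandas code)."""
    # Same shape as the failing check: both operands are expected to be str.
    if len(line.rstrip()) + len(value.rstrip()) >= display_width:
        summary += line.rstrip()
        line = next_line_prefix
    # Assumption: the value is then appended to the current line.
    line += value
    return summary, line


# A string value passes through the width check without trouble.
ok_summary, ok_line = extend_line("", "[", "('aaa', ...)", 80, " ")

# A tuple value (what tail[-1] appears to have been) fails as in the report.
try:
    extend_line("", "[", ("aaa", 0.1, 0.9), 80, " ")
    error_message = None
except AttributeError as exc:
    error_message = str(exc)
```

If this reading is right, the error comes from an unformatted tuple reaching a helper that expects an already formatted string, not from the string length as such.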

Expected Behavior

It should have returned something like:

MultiIndex([('aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa', ...)],
           names=['file', 'start', 'end'])

Installed Versions

INSTALLED VERSIONS

commit : 965ceca
python : 3.8.16.final.0
python-bits : 64
OS : Linux
OS-release : 5.4.0-149-generic
Version : #166~18.04.1-Ubuntu SMP Fri Apr 21 16:42:44 UTC 2023
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_GB.UTF-8
LOCALE : en_GB.UTF-8

pandas : 2.0.2
numpy : 1.24.3
pytz : 2023.3
dateutil : 2.8.2
setuptools : 67.7.2
pip : 23.1.2
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : None
IPython : None
pandas_datareader: None
bs4 : None
bottleneck : None
brotli : None
fastparquet : None
fsspec : None
gcsfs : None
matplotlib : None
numba : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pyreadstat : None
pyxlsb : None
s3fs : None
scipy : None
snappy : None
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
zstandard : None
tzdata : 2023.3
qtpy : None
pyqt5 : None

hagenw added the Bug and Needs Triage labels Jun 6, 2023

hagenw commented Jun 6, 2023

When trying to replicate this on the main branch of pandas, importing pandas fails for me after installing it with python setup.py install:

>>> import pandas as pd
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/audeering.local/hwierstorf/git/pandas/pandas/__init__.py", line 46, in <module>
    from pandas.core.api import (
  File "/home/audeering.local/hwierstorf/git/pandas/pandas/core/api.py", line 28, in <module>
    from pandas.core.arrays import Categorical
  File "/home/audeering.local/hwierstorf/git/pandas/pandas/core/arrays/__init__.py", line 12, in <module>
    from pandas.core.arrays.interval import IntervalArray
  File "/home/audeering.local/hwierstorf/git/pandas/pandas/core/arrays/interval.py", line 89, in <module>
    from pandas.core.arrays.timedeltas import TimedeltaArray
  File "/home/audeering.local/hwierstorf/git/pandas/pandas/core/arrays/timedeltas.py", line 33, in <module>
    from pandas._libs.tslibs.fields import (
ImportError: cannot import name 'get_timedelta_days' from 'pandas._libs.tslibs.fields' (/home/audeering.local/hwierstorf/git/pandas/pandas/_libs/tslibs/fields.cpython-38-x86_64-linux-gnu.so)

Charlie-XIAO (Contributor) commented

@hagenw This bug does exist in the 2.0.2 release, but has been fixed in #53004. Now on the main branch, the output of your example would be

MultiIndex([('aaaaaaaaaa', 0.1, 0.9)],
           names=['file', 'start', 'end'])
MultiIndex([('aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa', ...)],
           names=['file', 'start', 'end'])

as expected.

mroeschke (Member) commented

Yeah, since this is fixed on main and tested, I'm going to close this.


hagenw commented Jun 7, 2023

Cool, thanks for fixing.


hagenw commented Jul 13, 2023

This issue should have been fixed with pandas 2.0.3, but it isn't for me:

>>> import pandas as pd

>>> pd.__version__
'2.0.3'

>>> index = pd.MultiIndex.from_arrays((['a' * 58], [0.1], [0.9]), names=['file', 'start', 'end'])

>>> print(index)
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[11], line 1
----> 1 print(index)

File ~/.envs/analysis/lib/python3.8/site-packages/pandas/core/indexes/base.py:1233, in Index.__repr__(self)
   1229 """
   1230 Return a string representation for this object.
   1231 """
   1232 klass_name = type(self).__name__
-> 1233 data = self._format_data()
   1234 attrs = self._format_attrs()
   1235 space = self._format_space()

File ~/.envs/analysis/lib/python3.8/site-packages/pandas/core/indexes/base.py:1275, in Index._format_data(self, name)
   1272     if is_object_dtype(self.categories):
   1273         is_justify = False
-> 1275 return format_object_summary(
   1276     self,
   1277     self._formatter_func,
   1278     is_justify=is_justify,
   1279     name=name,
   1280     line_break_each_value=self._is_multi,
   1281 )

File ~/.envs/analysis/lib/python3.8/site-packages/pandas/io/formats/printing.py:440, in format_object_summary(obj, formatter, is_justify, name, indent_for_name, line_break_each_value)
    437     summary, line = _extend_line(summary, line, word, display_width, space2)
    439 # last value: no sep added + 1 space of width used for trailing ','
--> 440 summary, line = _extend_line(summary, line, tail[-1], display_width - 2, space2)
    441 summary += line
    443 # right now close is either '' or ', '
    444 # Now we want to include the ']', but not the maybe space.

File ~/.envs/analysis/lib/python3.8/site-packages/pandas/io/formats/printing.py:355, in format_object_summary.<locals>._extend_line(s, line, value, display_width, next_line_prefix)
    352 def _extend_line(
    353     s: str, line: str, value: str, display_width: int, next_line_prefix: str
    354 ) -> tuple[str, str]:
--> 355     if adj.len(line.rstrip()) + adj.len(value.rstrip()) >= display_width:
    356         s += line.rstrip()
    357         line = next_line_prefix

AttributeError: 'tuple' object has no attribute 'rstrip'
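
For anyone stuck on an affected release, one possible workaround is to avoid the failing repr and print the raw values instead. The sketch below is hedged: safe_repr and BrokenIndex are hypothetical names invented for this illustration, and the fallback assumes the object offers tolist(), as pandas Index objects do. The fallback output does not include the level names.

```python
def safe_repr(obj):
    """Return repr(obj), falling back to a plain list of values if repr fails."""
    try:
        return repr(obj)
    except AttributeError:
        # Assumption: obj offers tolist(), as pandas Index objects do.
        return f"{type(obj).__name__}({obj.tolist()!r})"


class BrokenIndex:
    """Hypothetical stand-in whose repr fails like the affected MultiIndex."""

    def __repr__(self):
        raise AttributeError("'tuple' object has no attribute 'rstrip'")

    def tolist(self):
        return [("a" * 58, 0.1, 0.9)]


text = safe_repr(BrokenIndex())
print(text)
```

With a real MultiIndex the call would be safe_repr(index), and index.names could be printed separately if the level names matter.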

Charlie-XIAO (Contributor) commented

I think the fix is for 2.1.0.


hagenw commented Jul 13, 2023

Ah, this might be true. I just assumed that "being fixed in main" would include it automatically in the next release.
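
Code that depends on the corrected output could check the installed version first. This is a minimal sketch that assumes the fix first ships in 2.1.0, as suggested above but not confirmed in this thread; includes_fix is a hypothetical helper name.

```python
def includes_fix(pandas_version: str) -> bool:
    """Return True if the version is at least 2.1 (assumed first fixed release)."""
    # Only major and minor are compared, so suffixes such as "2.1.0.dev0" also parse.
    major, minor = (int(part) for part in pandas_version.split(".")[:2])
    return (major, minor) >= (2, 1)


checks = {version: includes_fix(version) for version in ("2.0.2", "2.0.3", "2.1.0")}
print(checks)
```

In practice this would be called as includes_fix(pd.__version__) after importing pandas as pd.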
