pd.io.formats.format.DataFrameFormatter issue: columns not truncating properly #16911

bsolomon1124 · 2017-07-13T20:08:01Z

In several IDEs, DataFrames with long column names (not necessarily a large number of columns) do not seem to truncate properly. I originally posted this on SO thinking there was a pd.set_option that I had overlooked, but one answer pointed out that the cause may be pd.io.formats.format.DataFrameFormatter checking max_cols against the number of columns, not the total width of the columns, when deciding whether to truncate.

Problem description

I would like to keep
pd.set_option('expand_frame_repr', False)
but still truncate the view of DataFrames as shown in "Expected Output" below. I've noticed that this seems to be dependent on the length of columns rather than number of columns. For instance, this df displays in a readable way:

import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.randn(1000, 1000),
                  columns=['col' + str(i) for i in range(1000)])

but this one is unreadable:

df.add_prefix('really_long_column_name')

producing the "Messy output" below. For users wanting to keep pd.set_option('expand_frame_repr', False) but still have a truncated view, shouldn't pd.io.formats.format.DataFrameFormatter check the total width of all columns (or somehow consider the effect of both column width and number of columns)?
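To make the distinction concrete, here is a minimal sketch of a width-based check, as opposed to a count-based one. It is not the pandas implementation: `columns_that_fit`, `cell_width`, `max_width`, and `sep` are hypothetical names, and the layout rule (each column takes the wider of its header and its cells, plus fixed padding) is a simplifying assumption.

```python
def columns_that_fit(names, cell_width, max_width, sep=2):
    """Return how many leading columns fit in max_width characters.

    Assumption: each column occupies max(len(name), cell_width)
    characters plus `sep` characters of padding. This is a
    simplification, not the layout logic pandas actually uses.
    """
    used = 0
    count = 0
    for name in names:
        width = max(len(name), cell_width) + sep
        if used + width > max_width:
            break
        used += width
        count += 1
    return count


# Same number of columns, very different header widths.
short = ['col' + str(i) for i in range(1000)]
long_ = ['really_long_column_name' + name for name in short]

n_short = columns_that_fit(short, cell_width=9, max_width=80)
n_long = columns_that_fit(long_, cell_width=9, max_width=80)
print(n_short, n_long)
```

Both lists have 1000 entries, so a check based only on the number of columns treats them identically, while a width-based check lets far fewer of the long-named columns through.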

Messy output
https://i.stack.imgur.com/yzZUI.png

Expected output
https://i.stack.imgur.com/arvRm.png

Output of `pd.show_versions()`

INSTALLED VERSIONS

commit: None
python: 3.6.0.final.0
python-bits: 64
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 60 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None

pandas: 0.20.1
nose: 1.3.7
pip: 9.0.1
setuptools: 27.2.0
Cython: 0.25.2
numpy: 1.11.3
scipy: 0.18.1
statsmodels: 0.6.1
xarray: None
IPython: 5.1.0
sphinx: 1.5.1
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: 1.2.0
tables: 3.2.2
numexpr: 2.6.1
matplotlib: 2.0.0
openpyxl: 2.4.1
xlrd: 1.0.0
xlwt: 1.2.0
xlsxwriter: 0.9.6
lxml: 3.7.2
bs4: 4.5.3
html5lib: 0.9999999
httplib2: None
apiclient: None
sqlalchemy: 1.1.5
pymysql: None
psycopg2: None
jinja2: 2.9.4
boto: 2.45.0
pandas_datareader: 0.4.0


gfyoung · 2017-07-14T06:17:42Z

@bsolomon1 : I notice that you're using 0.19.2, and the code for that section pointed out in your SO post has been expanded since then (see current implementation here).

Try installing 0.20.3 first and see if that fixes your issue. If not, try installing master. Regardless, your question seems pretty reasonable, but as I don't use DataFrame printing all too often, I would want to defer to others who have used it more than I have.

bsolomon1124 · 2017-07-14T19:12:36Z

Thanks for pointing that out. I'm on 0.19 for compatibility with some other packages, but I've tried updating manually to 0.20 and had the same issue.

gfyoung · 2017-07-14T19:15:47Z

Okay, good to know. Could you update your pandas version in the issue? I would wait for a day or two (or maybe until Monday since it's the weekend) to see if there's any other feedback. Otherwise, you are more than welcome to take a shot at implementing this if you're interested.

jreback · 2017-07-14T20:02:12Z

this is a duplicate issue IIRC. if someone would have a look.

gfyoung · 2017-07-14T21:32:00Z

Is it #7059? We can close that and then move forward with this one then.

jreback · 2017-07-14T22:23:23Z

it looks similar. @bsolomon1 if you'd have a look as well.

bsolomon1124 · 2017-07-24T12:24:04Z

Sorry I am late getting back to this @jreback. I'm working in 0.20.3 and still seeing the same issue. (See the "Messy output" link in my initial comment.) I do agree this is similar to 7059 and that they are talking about the same bug, and I think an ellipsis either in the middle or at the end of long column names would be a nice solution.
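As a rough illustration of the ellipsis idea, the sketch below shortens a single column name with the marker either at the end or in the middle. `shorten_name` and its parameters are hypothetical, not an existing pandas function or option.

```python
def shorten_name(name, max_len, where="end", marker="..."):
    """Shorten `name` to at most `max_len` characters.

    Hypothetical helper: `where` selects whether the marker replaces
    the end of the name or its middle.
    """
    if max_len <= len(marker):
        raise ValueError("max_len must be longer than the marker")
    if len(name) <= max_len:
        return name  # already short enough, leave untouched
    keep = max_len - len(marker)  # characters of the name that survive
    if where == "end":
        return name[:keep] + marker
    if where == "middle":
        head = (keep + 1) // 2
        tail = keep - head
        # name[-0:] would return the whole string, so guard tail == 0
        return name[:head] + marker + (name[-tail:] if tail else "")
    raise ValueError("where must be 'end' or 'middle'")


print(shorten_name('really_long_column_namecol0', 12))
print(shorten_name('really_long_column_namecol0', 12, where='middle'))
```

With a shared prefix like the one in this issue, the middle variant keeps the distinguishing suffix (`col0`, `col1`, ...), whereas the end variant would make every shortened name identical.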

jreback · 2017-07-24T13:02:50Z

> I'm working in 0.20.3 and still seeing the same issue.

why would you think this is fixed? this is an open issue.

bsolomon1124 · 2017-07-26T20:41:16Z

@jreback I was just addressing @gfyoung 's comment. ("Try installing 0.20.3 first and see if that fixes your issue" from early in the thread.)

gfyoung added Output-Formatting __repr__ of pandas objects, to_string Usage Question labels Jul 14, 2017

gfyoung mentioned this issue Jul 18, 2017

Make DataFrame.to_html output full content #17004

Closed

emsems mentioned this issue Mar 5, 2020

DataFrame output too wide / not truncated properly #32461

Open

mroeschke added Enhancement and removed Usage Question labels Apr 30, 2020

This was referenced Aug 11, 2022

tex labels in MultiIndex columns handley-lab/anesthetic#214

Closed

tex labels in MultiIndex columns handley-lab/anesthetic#215

Merged
