
Justification is broken with to_string(index=False) #13032


Closed
evanpw opened this issue Apr 29, 2016 · 7 comments · Fixed by #22505
Labels
Bug Output-Formatting __repr__ of pandas objects, to_string
Milestone

Comments

@evanpw
Contributor

evanpw commented Apr 29, 2016

With 0.17.1, justification is handled correctly:

In [3]: df = pd.DataFrame(np.random.randn(3, 3), columns=['one', 'two', 'three'])

In [4]: print df.to_string(index=False)
      one       two     three
 0.096065 -0.529437 -2.058535
 0.588013  0.221911 -2.400129
 0.848234 -0.093931  1.221708

But in 0.18.0, it's broken:

In [4]: print df.to_string(index=False)
one       two     three
1.164170 -0.027236 -0.114962
-1.405931  0.019395  0.743320
0.598321  0.655802 -1.061834

This is caused by PR #11942. The issue it was meant to fix (#11833) is actually caused by using the format string "% d" for formatting integers, which produces a leading space if the integer is positive (where the minus sign would go for a negative integer). My preferred solution would be to change the formatter to "%d" (and to revert #11942), which would have the added bonus of eliminating double spaces between some columns (see the first to_string output in #11833, for example). A bunch of formatting tests would have to be adjusted, though. Is there anyone who prefers the extra space?
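For illustration, the difference between the two format strings can be seen in plain Python, without pandas. This is only a minimal sketch of the printf-style space flag, not the pandas formatter itself; the sample values are arbitrary.

```python
# "% d" uses the space flag: positive numbers get a leading space
# in the slot where a minus sign would otherwise go.
with_space = ["% d" % n for n in (5, -5, 123)]

# "%d" emits a sign only when the number is negative.
without_space = ["%d" % n for n in (5, -5, 123)]

print(with_space)     # [' 5', '-5', ' 123']
print(without_space)  # ['5', '-5', '123']
```

When columns are then joined with a separator space, the extra character from "% d" is what shows up as a double space before positive integers.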

@jreback jreback added the Output-Formatting __repr__ of pandas objects, to_string label Apr 30, 2016
@jreback jreback added this to the 0.18.2 milestone Apr 30, 2016
@jreback
Contributor

jreback commented Apr 30, 2016

Yeah, looks like the prior fix only dealt with integers. A more comprehensive fix would be OK.

@hawkeyej

The formatting issue seems more broad to me. This example demonstrates:

import pandas as pd

NAMES = ['Short', 'Longer', 'Much Longer name to the Max -----------']
VALUES = [1, 9374518, 32432]

d = pd.DataFrame({'Name': NAMES, 'Value': VALUES})

print("With index")
print(d.to_string())

print("\n\nWithout index")
print(d.to_string(index=False))

Produces this:

With index
                                      Name    Value
0                                    Short        1
1                                   Longer  9374518
2  Much Longer name to the Max -----------    32432

Without index
Name    Value
                                  Short        1
                                 Longer  9374518
Much Longer name to the Max -----------    32432
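The symptom can be mimicked without pandas. The sketch below is an illustration in plain Python, not pandas' formatting code; it assumes the header line loses the left padding that right-justification gave it, which is consistent with the output above.

```python
names = ["Short", "Longer", "Much Longer name to the Max -----------"]
width = max(len(s) for s in names + ["Name"])

# Right-justify the header and every cell to the same width.
header = "Name".rjust(width)
rows = [s.rjust(width) for s in names]

aligned = "\n".join([header] + rows)

# Simulate the symptom: only the header loses its leading padding.
misaligned = "\n".join([header.lstrip()] + rows)

print(aligned)
print()
print(misaligned)
```

In the second printout the data rows still line up with each other, but "Name" sits at the left margin instead of over the right edge of its column.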

@jreback
Contributor

jreback commented May 31, 2016

The display.max_colwidth option might affect this:

display.max_colwidth : int
    The maximum width in characters of a column in the repr of
    a pandas data structure. When the column overflows, a "..."
    placeholder is embedded in the output.
    [default: 50] [currently: 50]

@hawkeyej

hawkeyej commented Jun 1, 2016

Note that only the header line is shifted incorrectly.

My current workaround is this (data is the DataFrame):

# Find longest index string
longest = max([len(str(s)) for s in data.index]) + 1

# Print to lines and manually remove 'index' from each line
lines = data.to_string().split('\n')
lines = [l[longest:] for l in lines]
print('\n'.join(lines))
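The same idea can be wrapped in a reusable function. This is a sketch only: the helper name to_string_without_index is hypothetical, and it assumes a non-empty DataFrame with a single-level, unnamed index (a named index or a MultiIndex changes the header layout, so the fixed-width cut would no longer be right).

```python
import pandas as pd


def to_string_without_index(df: pd.DataFrame) -> str:
    """Render df with its index, then cut the index column off every line.

    Hypothetical helper; assumes a non-empty, single-level, unnamed index.
    """
    # Width of the widest index label, plus one separator space.
    prefix = max(len(str(label)) for label in df.index) + 1
    lines = df.to_string().split("\n")
    # The header and data rows share the same prefix width, so a
    # fixed-width cut keeps them aligned with each other.
    return "\n".join(line[prefix:] for line in lines)


df = pd.DataFrame({
    "Name": ["Short", "Longer", "Much Longer name to the Max -----------"],
    "Value": [1, 9374518, 32432],
})
result = to_string_without_index(df)
print(result)
```

Because the header is cut by the same number of characters as the data rows, it keeps whatever alignment to_string() gave it with the index present.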

@jreback
Contributor

jreback commented Jun 1, 2016

@hawkeyej we'd certainly love a PR to fix this! Just write up a test of the correct results and use your fix, then see what else breaks :)

@jorisvandenbossche
Member

There is initial work in #14196 that could possibly be used as a basis (it needs tests).

@kuraga

kuraga commented May 5, 2017

Ping. Annoying bug, can I help? Though I'm not a pandas expert...

verena-neufeld pushed a commit to hande-qmc/hande that referenced this issue Jan 17, 2018
With index=False the column labels are not correctly aligned
with the data, as leading spaces are stripped.  Remove the
index from the formatted table ourself.
See pandas-dev/pandas#13032
@jreback jreback modified the milestones: Contributions Welcome, 0.24.0 Sep 23, 2018