Skip to content

ENH: Removing unnecessary whitespace when formatting to a HTML table. #5012

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 2 commits into from
Closed

Conversation

alefnula
Copy link
Contributor

Closes #4987.

The problem is that all formatters always call _make_fixed_width. So, the only solution I found is just to strip the unnecessary whitespace.

@@ -734,7 +734,7 @@ def _write_body(self, indent):

fmt_values = {}
for i in range(len(self.columns)):
fmt_values[i] = self.fmt._format_col(i)
fmt_values[i] = [item.strip() for item in self.fmt._format_col(i)]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it possible to pass an argument to _format_col that controls this? e.g. in the HTMLFormatter it will not include the whitespace, all other formatters will be unchanged.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jreback It uses the format_array function, which uses an appropriate ArrayFormatter. They all call _make_fixed_width which can justify either right or left. A third value can be added justify=None, but then every ArrayFormatter should be changed to support justify=None. I can add that, no problem :), but it will be a lot of changes, and I didn't wanted to do that just for this small enhancement that could be fixed in one line. If you think that the more general solution is better, I'll do that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jreback But maybe this isn't such a bad idea since then I can add justify=center too. Currently only left and right are supported.

@jreback
Copy link
Contributor

jreback commented Sep 28, 2013

yep

@alefnula
Copy link
Contributor Author

@jreback OK, I'll implement the additional justifications.

@alefnula
Copy link
Contributor Author

It turned out easier than I thought. Hope I haven't missed something though...

justfunc = lambda self, x: self.rjust(x)
elif justify == 'center':
justfunc = lambda self, x: self.center(x)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can these be of the form str.center, str.rjust etc?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, because it can also be a unicode.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry i meant compat.text_type.center ... or are we allowing str and unicode? if allowing both then disregard my suggestion

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's just unicode in python 2 and will throw an error if receives a str:

>>> pd.compat.text_type.center('test', 30)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: descriptor 'center' requires a 'unicode' object but received a 'str'

And as I said in the previous comment, the same thing applies for the other way round:

>>> pd.compat.binary_type.center(u'test', 30)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: descriptor 'center' requires a 'str' object but received a 'unicode'

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay forget it then :)

@cpcloud
Copy link
Member

cpcloud commented Sep 28, 2013

@alefnula can u add tests for the new justification options

@alefnula
Copy link
Contributor Author

@cpcloud I changed the HTML tests so the None option is working... For the other, will something like this be enough?

In [2]: df = pd.DataFrame({'foo': [1, 2, 3], 'bar': [4, 5, 6], 'baz': [7, 8, 9]})
In [3]: print(df.to_string(justify='left'))
   bar  baz  foo
0  4    7    1
1  5    8    2
2  6    9    3
In [4]: print(df.to_string(justify='right'))
   bar  baz  foo
0    4    7    1
1    5    8    2
2    6    9    3
In [5]: print(df.to_string(justify='center'))
   bar  baz  foo
0   4    7    1
1   5    8    2
2   6    9    3

@alefnula
Copy link
Contributor Author

Although, maybe align is a better parameter name than justify :-/

@alefnula
Copy link
Contributor Author

There was some really weird code here... The alignment of the columns was done twice (sometimes even 3 times, which is ridiculous - recursive calls...) I really couldn't find the reason why so I removed that.

Also there is a bug in the current string column formatting. The column header is not aligned. The bug is also on the master branch. I'll try to fix it here if this patch can be accepted.

Example of the bug:

import numpy as np
import pandas as pd
ind = pd.date_range('2013-01-01', freq='H', periods=10)
df = pd.DataFrame(np.random.randn(10, 3), index=ind)
df = df.reset_index()
df['test'] = 'foo bar baz'
print(df.to_string(justify='left'))

Produces:

  index                0         1         2        test        
0 2013-01-01 00:00:00 -0.295496 -1.622709 -2.330210  foo bar baz
1 2013-01-01 01:00:00 -0.663977 -0.415800 -1.928071  foo bar baz
2 2013-01-01 02:00:00 -0.468839  0.106345  1.344761  foo bar baz
3 2013-01-01 03:00:00 -0.154564  0.146896 -0.322325  foo bar baz
4 2013-01-01 04:00:00  0.695505  0.900301  0.261876  foo bar baz
5 2013-01-01 05:00:00 -1.142544 -0.089581  2.180599  foo bar baz
6 2013-01-01 06:00:00 -0.806229 -0.766914 -0.852616  foo bar baz
7 2013-01-01 07:00:00  0.037000 -0.093492  1.208734  foo bar baz
8 2013-01-01 08:00:00  1.578699  0.067581  0.855523  foo bar baz
9 2013-01-01 09:00:00  1.707266  1.333138  1.185469  foo bar baz

You can see that the test column header is shifted...

Finally, maybe align is a better parameter name than justify :-/

@cpcloud
Copy link
Member

cpcloud commented Sep 29, 2013

@alefnula i would put the column alignment in a separate issue.

@alefnula
Copy link
Contributor Author

@cpcloud The problem is that I don't even know if that's a bug or a feature :D That's why I asked.

@jreback
Copy link
Contributor

jreback commented Oct 2, 2013

@cpcloud ok to merge?

@jreback
Copy link
Contributor

jreback commented Oct 2, 2013

@alefnula relize no release notes...can you add?

@alefnula
Copy link
Contributor Author

alefnula commented Oct 2, 2013

@jreback I don't think this should be merged yet. The justfy=None doesn't actually work on to_string because it will realise that it's None and turn it to right (the default), so maybe it should be justify='none' instead of justify=None? justify=None should probably just revert to whatever is the default... Probably right should be the default as it was before these changes.

@jreback
Copy link
Contributor

jreback commented Oct 2, 2013

you can just put justify = 'right' as the default which iirc was what it was

@alefnula
Copy link
Contributor Author

alefnula commented Oct 2, 2013

That's fine with me :) I'll do it that way then. I should probably change the docs also. Just to add that the 'center' option is also available.

@jreback
Copy link
Contributor

jreback commented Oct 3, 2013

@cpcloud this looks ok....so when you are ready can merge (as I believe you are waiting to merge something else)

@alefnula
Copy link
Contributor Author

alefnula commented Oct 3, 2013

@jreback I just rebased it, didn't changed anything yet. :) There is one more problem with this (I'm really sorry to be such a hassle :( but I just don't want to break anything), docs:

justify : {'left', 'right'}, default None
    Left or right-justify the column labels. If None uses the option from
    the print configuration (controlled by set_option), 'right' out
    of the box.

But now justify justifies the whole columns, not just the headers :( And also I cannot use the None and just default to right because if it's None it will read the:

if justify is None:
    self.justify = get_option("display.colheader_justify")

If it's ok to change the functionality to justify the whole columns I need a sentinel for no justification. And probably read get_option("display.column_justify") instead of display.colheader_justify.

display.colheader_justify is mentioned in the docs so it will be a backward incompatible change..

@jreback
Copy link
Contributor

jreback commented Oct 4, 2013

@cpcloud what do you think?
I think columns should be left alone and this just should apply to the header

@alefnula
Copy link
Contributor Author

alefnula commented Oct 4, 2013

@jreback In that case this whole pull request should be discarded since it justifies the whole columns... I could maybe separate the column and header justification and use justify for header and align parameter for columns?

@jreback
Copy link
Contributor

jreback commented Oct 4, 2013

do you want to defer this to 0.14....as templates hopefully will solve these types of things?
@cpcloud ?

@jreback
Copy link
Contributor

jreback commented Oct 9, 2013

defering this to 0.14

@ghost
Copy link

ghost commented Feb 4, 2014

Stale, Please open an updated PR if the problems mentioned are sorted out.

@ghost ghost closed this Feb 4, 2014
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ENH: to_html improvement
3 participants