Skip to content

BUG: to_html misses truncation indicators (...) when index=False #22786

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 12 commits into from
Nov 15, 2018

Conversation

simonjayhawkins
Copy link
Member

@simonjayhawkins simonjayhawkins commented Sep 20, 2018

@pep8speaks
Copy link

Hello @simonjayhawkins! Thanks for submitting the PR.

@WillAyd
Copy link
Member

WillAyd commented Sep 20, 2018

@simonjayhawkins I know you have a few overlapping PRs open at this point - is this the one you think should go in first? Want to make sure I prioritize review accordingly

@simonjayhawkins
Copy link
Member Author

@WillAyd : Yes Will. Although not printing the column names #22747 is a probably a simpler fix and could go first. I've not raised a PR for that yet, as there's been no affirmation it's an issue.

@simonjayhawkins
Copy link
Member Author

@WillAyd : Will, the refactoring PR, #22726 is independent to keep style changes and bug fixes separate. So there is no reason that #22726 can't be progressed in parallel.

@codecov
Copy link

codecov bot commented Sep 21, 2018

Codecov Report

Merging #22786 into master will increase coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #22786      +/-   ##
==========================================
+ Coverage   92.24%   92.24%   +<.01%     
==========================================
  Files         161      161              
  Lines       51336    51338       +2     
==========================================
+ Hits        47357    47359       +2     
  Misses       3979     3979
Flag Coverage Δ
#multiple 90.64% <100%> (ø) ⬆️
#single 42.34% <0%> (-0.01%) ⬇️
Impacted Files Coverage Δ
pandas/io/formats/html.py 93.18% <100%> (+0.04%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c55057f...15bd352. Read the comment docs.

@jreback jreback added Bug Output-Formatting __repr__ of pandas objects, to_string IO HTML read_html, to_html, Styler.apply, Styler.applymap labels Sep 23, 2018
Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same comment as @WillAyd about the multiple PRs

ncols = len(self.fmt.tr_frame.columns)
nrows = len(self.fmt.tr_frame)

row = []
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you give some comments on what is happening here, would not object to a self.write_row?

Copy link
Member Author

@simonjayhawkins simonjayhawkins Sep 23, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jreback
The added code is copy and pasted from self._write_regular_rows()

But for the index=False case, the (row) index value is not added to row. The assignment to dot_col_ix and the value of nindex_levels in the self.write_tr call is also changed to account for this.

Currently the truncation tests in tests\io\formats\test_to_html.py are being skipped. Without test coverage the values of dot_col_ix and nindex_levels in self._write_regular_rows() can take on any value and pass the tests.

To avoid potential regression on the notebook display codepath (index=True), i have duplicated the code rather than make any changes to self._write_regular_rows()

Tests could be added but I wanted to avoid going out of scope on this PR.

The existing code for the index=False case was not in a function unlike the code for the index=True cases. However it was only 3 lines.

So I would agree that a self.write_row function would now be a good idea if it was unlikely that a future refactoring was not to use self._write_regular_rows() instead.

How should I proceed?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests could be added but I wanted to avoid going out of scope on this PR.

can you do a pre-cursor PR which locks down the tests first?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@simonjayhawkins i mean comments in the code

@simonjayhawkins
Copy link
Member Author

simonjayhawkins commented Sep 23, 2018

@jreback : This PR is currently blocking #22655, hence the overlap. I've marked it as WIP for the moment.

@jreback
Copy link
Contributor

jreback commented Oct 18, 2018

can you merge master

@jreback
Copy link
Contributor

jreback commented Oct 28, 2018

@WillAyd can you have another look here

@@ -306,6 +306,8 @@ def _column_header():
align = self.fmt.justify

if truncate_h:
if self.fmt.index is False:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of doing is False can you leverage the implicit truthiness here and do if not self.fmt.index? While not documented I believe index=0 and index=None are acceptable in this and other parsers, so would be ideal to handle those consistently with False

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you update this

@@ -306,6 +306,8 @@ def _column_header():
align = self.fmt.justify

if truncate_h:
if self.fmt.index is False:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you update this

ncols = len(self.fmt.tr_frame.columns)
nrows = len(self.fmt.tr_frame)

row = []
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@simonjayhawkins i mean comments in the code

@@ -1881,6 +1903,22 @@ def test_to_html_multiindex_max_cols(self):
</table>""")
assert result == expected

def test_to_html_truncation_index_false_max_rows(self, datapath):
# GH 15019
np.random.seed(seed=0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do really need to set the seed?

# GH 15019
np.random.seed(seed=0)
df = pd.DataFrame(np.random.randn(5, 2))
result = df.to_html(max_rows=4, index=False)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you parameterize over index=False and index=0

@TomAugspurger
Copy link
Contributor

@WillAyd @jreback I think all your comments have been addressed, right?

@jreback jreback added this to the 0.24.0 milestone Nov 14, 2018
def expected_html(datapath, name):
"""
Read HTML file from formats data directory.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

might be nice to change some of the existing tests to use this function (future PR though)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR #23747 to change existing tests to use this function.

@WillAyd
Copy link
Member

WillAyd commented Nov 15, 2018

@simonjayhawkins code lgtm. Can you resolve conflicts? Can merge thereafter on green

@jreback jreback merged commit 8af7637 into pandas-dev:master Nov 15, 2018
@simonjayhawkins simonjayhawkins deleted the truncation-inicators branch November 15, 2018 14:18
thoo added a commit to thoo/pandas that referenced this pull request Nov 15, 2018
* upstream/master:
  BUG: to_html misses truncation indicators (...) when index=False (pandas-dev#22786)
  API/DEPR: replace "raise_conflict" with "errors" for df.update (pandas-dev#23657)
  BUG: Append DataFrame to Series with dateutil timezone (pandas-dev#23685)
  CLN/CI: Catch that stderr-warning! (pandas-dev#23706)
  ENH: Allow for join between two multi-index dataframe instances (pandas-dev#20356)
  Ensure Index._data is an ndarray (pandas-dev#23628)
  DOC: flake8-per-pr for windows users (pandas-dev#23707)
  DOC: Handle exceptions when computing contributors. (pandas-dev#23714)
  DOC: Validate space before colon docstring parameters pandas-dev#23483 (pandas-dev#23506)
  BUG-22984 Fix truncation of DataFrame representations (pandas-dev#22987)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug IO HTML read_html, to_html, Styler.apply, Styler.applymap Output-Formatting __repr__ of pandas objects, to_string
Projects
None yet
5 participants