-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
REGR: Notebook (html) repr of DataFrame no longer follows min_rows/max_rows settings #37363
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@jorisvandenbossche can you please confirm it works? I am still unable to install dev environment with jupyter (even conda). |
I can confirm this works, thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the quick follow-up!
else: | ||
max_rows = self.max_rows | ||
|
||
return self._adjust_max_rows(max_rows) | ||
|
||
def _adjust_max_rows(self, max_rows: Optional[int]) -> Optional[int]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it is worth creating a separate function for those few lines, I think it fits fine in the function above, as it was before
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought that some explanation should be added why there are still some changes to max_rows and comparison with frame length. Could be a comment, but a separate function with the docstring is better, isn't it.
return height - self._get_number_of_auxillary_rows() | ||
|
||
max_rows: Optional[int] | ||
if self._is_screen_short(height): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is this screen_short doing? I think it is rather about the dataframe being longer than the screen, and that we want to show all rows possible
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def _is_screen_short(self, max_height) -> bool:
return bool(self.max_rows == 0 and len(self.frame) > max_height)
If in terminal (max_rows==0) and if screen height (terminal height) is not enough to accommodate the dataframe entirely, then the screen is short.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the logic can be improved, then please suggest. But my refactoring mainly consisted of removing comments and making methods with the same/similar names. Apparently got lost in some logic, but that was not covered by the tests properly.
I think that we can make a system test with the string representation as well, once we see that the present behavior is correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code snippet before refactoring (regarding _is_screen_short).
# Format only rows and columns that could potentially fit the
# screen
if max_cols == 0 and len(self.frame.columns) > w:
max_cols = w
if max_rows == 0 and len(self.frame) > h:
max_rows = h
min_rows=min_rows, | ||
) | ||
result = formatter.max_rows_fitted | ||
assert result == expected |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is actually already tested (see #37359 (comment), so not sure it is necessarily needed to add additional tests). But so the problem is that it was not catching the regression (and I suppose those tests above will also not have catched it?)
I will try to think about another way to test it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These tests do fail on master
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pandas\tests\io\formats\test_format.py:2059: AssertionError
============================================================== short test summary info ===============================================================
FAILED pandas/tests/io/formats/test_format.py::TestDataFrameFormatting::test_max_rows_fitted[50-30-10-10] - assert 30 == 10
FAILED pandas/tests/io/formats/test_format.py::TestDataFrameFormatting::test_max_rows_fitted[100-60-10-10] - assert 60 == 10
FAILED pandas/tests/io/formats/test_format.py::TestDataFrameFormatting::test_max_rows_fitted[61-60-10-10] - assert 60 == 10
=========================================================== 3 failed, 206 passed in 4.35s ============================================================
pandas/io/formats/format.py
Outdated
# rows available to fill with actual data | ||
return height - self._get_number_of_auxillary_rows() | ||
|
||
max_rows: Optional[int] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you type this at the top to make it less cluttering
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also think that splitting even further would be better.
if self._is_in_terminal():
return self._get_max_rows_fitted_in_terminal()
else:
return self._adjust_max_rows(self.max_rows)
Probably other PR?
also pls rebase |
@ivanovmg can you merge master. @jorisvandenbossche if any comments. |
@ivanovmg can you merge master one more time and this can be merged cc @jorisvandenbossche |
Done, it's green. |
thanks @ivanovmg |
black pandas
git diff upstream/master -u -- "*.py" | flake8 --diff
Fix display logic in number of rows,
explained here https://pandas.pydata.org/docs/dev/user_guide/options.html#frequently-used-options