BUG-22984 Fix truncation of DataFrame representations #22987
Hello @JustinZhengBC! Thanks for updating the PR.
Comment last updated on October 07, 2018 at 01:59 Hours UTC
Codecov Report
@@            Coverage Diff             @@
##           master   #22987      +/-   ##
==========================================
- Coverage   92.24%   92.24%   -0.01%
==========================================
  Files         161      161
  Lines       51340    51315      -25
==========================================
- Hits        47361    47336      -25
  Misses       3979     3979
doc/source/whatsnew/v0.24.0.txt
@@ -194,6 +194,7 @@ Other Enhancements
- :meth:`Index.to_frame` now supports overriding column name(s) (:issue:`22580`).
- New attribute :attr:`__git_version__` will return git commit sha of current build (:issue:`21295`).
- Compatibility with Matplotlib 3.0 (:issue:`22790`).
- Representation of :class:`DataFrame` fills up the terminal window better
Can you add the issue number here? I would call this more of a bug fix, no?
@@ -616,11 +616,6 @@ def to_string(self):
else: # max_cols == 0. Try to fit frame to terminal
text = self.adj.adjoin(1, *strcols).split('\n')
max_len = Series(text).str.len().max()
headers = [ele[0] for ele in strcols]
We have a number of tests for this; surprised this didn't break anything. Can you add a test if we have no coverage for this case now? (e.g. use option_context to set the width, then check the output string)
lgtm. @TomAugspurger a glance if you can.
doc/source/whatsnew/v0.24.0.txt
@@ -1312,6 +1312,7 @@ Notice how we now instead output ``np.nan`` itself instead of a stringified form
- :func:`read_sas()` will correctly parse sas7bdat files with data page types having also bit 7 set (so page type is 128 + 256 = 384) (:issue:`16615`)
- Bug in :meth:`detect_client_encoding` where potential ``IOError`` goes unhandled when importing in a mod_wsgi process due to restricted access to stdout. (:issue:`21552`)
- Bug in :func:`to_string()` that broke column alignment when ``index=False`` and width of first column's values is greater than the width of first column's header (:issue:`16839`, :issue:`13032`)
- Bug in :func:`to_string()` that caused representations of :class:`DataFrame` to not take up the whole window (:issue:`22984`)
This should be :class:`DataFrame.to_string` right? We don't have a top-level pandas.to_string.
@@ -343,6 +343,16 @@ def test_repr_truncates_terminal_size(self):

assert df2.columns[0] in result.split('\n')[0]
nitpick: we have a pytest fixture for mock now, that does the try / except / skip done above.
I think it'd be cleanest to split this into a new test here, and accept the mock parameter.
def test_repr_truncates_terminal_size_full(self, mock):
...
And if you're feeling adventurous, you could change the try / except / skip mock import above to use the fixture as well. Not a big deal though.
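For readers unfamiliar with the pattern being discussed (fake the terminal size, then assert on the rendered string), here is a minimal stdlib-only sketch. It assumes nothing about pandas internals: `fit_header` is a hypothetical stand-in for the repr code, and the test patches `shutil.get_terminal_size` directly rather than using the pandas `mock` fixture.

```python
import os
import shutil
from unittest import mock


def fit_header(names):
    """Hypothetical stand-in for repr code: keep the leading names that fit."""
    width = shutil.get_terminal_size().columns
    kept = []
    used = 0
    for name in names:
        # One space separates adjacent names; the first name needs none.
        needed = len(name) + (1 if kept else 0)
        if used + needed > width:
            break
        kept.append(name)
        used += needed
    return " ".join(kept)


def test_fit_header_fills_terminal():
    names = ["col0", "col1", "col2", "col3"]
    # Pretend the terminal is exactly 14 characters wide.
    fake_size = os.terminal_size((14, 24))
    with mock.patch("shutil.get_terminal_size", return_value=fake_size):
        result = fit_header(names)
    # Three 4-character names plus two separators use all 14 columns.
    assert result == "col0 col1 col2"
    assert len(result) == 14
```

Patching the size lookup keeps the test independent of whatever terminal the CI job happens to run in.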
* Easy bits of pandas-dev#23382
* Easy parts of pandas-dev#23368
Re-attempt of pandas-devgh-15506. Closes pandas-devgh-15475.
* Add documentation line with example for the ambiguous parameter of tz_localize
* Updating 'ambiguous'-param doc + update it on Timestamp, DatetimeIndex and NaT
  This is following the discussion at pandas-dev#23408 (comment)
* BUG: Identify SparseDataFrame as sparse
  The is_sparse function checks whether an array-like is sparse by checking whether it is an instance of ABCSparseArray or ABCSparseSeries. This commit adds ABCSparseDataFrame to that list, so it can detect that a DataFrame (which is an array-like object) is sparse. Added a test for this.
* Revert "BUG: Identify SparseDataFrame as sparse"
  This reverts commit 10dffd1. The previous commit's change was not necessary. Will add a docstring to clarify the behaviour of the method.
* DOC: Revise is_sparse docstring
  Clean up the docstring for is_sparse so it conforms to the documentation style guide. Add additional examples and clarify that is_sparse expects a 1-dimensional array-like.
* DOC: Adjust is_sparse docstring.
  Responding to pull request comments.
139235a
to
f452c40
Compare
Codecov Report
@@            Coverage Diff             @@
##           master   #22987      +/-   ##
==========================================
- Coverage   92.24%   92.24%   -0.01%
==========================================
  Files         161      161
  Lines       51339    51336       -3
==========================================
- Hits        47360    47357       -3
  Misses       3979     3979
@JustinZhengBC need to merge master
Another hypothesis failure on azure. OK to merge @jreback? I can investigate the failing test separately.
yep
Merged master to fix the conflict. May as well let the CI run. ping on green.
@TomAugspurger green. Also thanks for fixing the merge issue
I caused it by merging my own PR, so it's the least I could do :) Thanks!
* upstream/master:
  BUG: to_html misses truncation indicators (...) when index=False (pandas-dev#22786)
  API/DEPR: replace "raise_conflict" with "errors" for df.update (pandas-dev#23657)
  BUG: Append DataFrame to Series with dateutil timezone (pandas-dev#23685)
  CLN/CI: Catch that stderr-warning! (pandas-dev#23706)
  ENH: Allow for join between two multi-index dataframe instances (pandas-dev#20356)
  Ensure Index._data is an ndarray (pandas-dev#23628)
  DOC: flake8-per-pr for windows users (pandas-dev#23707)
  DOC: Handle exceptions when computing contributors. (pandas-dev#23714)
  DOC: Validate space before colon docstring parameters pandas-dev#23483 (pandas-dev#23506)
  BUG-22984 Fix truncation of DataFrame representations (pandas-dev#22987)
* BUG-22984 Fix truncation of DataFrame representations
* BUG-22984 Fix truncation of DataFrame representations
* BUG-22984 Fix truncation of DataFrame representations
git diff upstream/master -u -- "*.py" | flake8 --diff
When printing a DataFrame to terminal, an extra column's worth of space is added to the calculated width of the DataFrame. This is presumably to help edge cases, but the calculated difference between the DataFrame width and the terminal window width is incremented by 1 a few lines later, seemingly to fix the same problem. Do any more experienced developers know of a reason to pad the DataFrame width even more?
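To make the effect of the extra padding concrete, here is a simplified model of the fitting calculation described above. It is a sketch under stated assumptions, not the pandas implementation: `columns_that_fit` and its `extra_padding` parameter are hypothetical names, columns are assumed to be joined by a single space, and columns are dropped from the middle one at a time until the row fits.

```python
def columns_that_fit(col_widths, terminal_width, extra_padding=0):
    """Return how many columns remain after dropping until the row fits.

    Hypothetical model: `extra_padding` stands in for the extra column's
    worth of space questioned in the description above.
    """
    # Rendered width: all column widths plus one separator between columns.
    total = sum(col_widths) + max(len(col_widths) - 1, 0)
    total += extra_padding
    # The "+ 1" mirrors the safety margin applied a few lines later.
    overflow = total - terminal_width + 1
    remaining = list(col_widths)
    while overflow > 0 and len(remaining) > 1:
        mid = len(remaining) // 2
        # Dropping a column frees its width plus its separator.
        overflow -= remaining.pop(mid) + 1
    return len(remaining)


widths = [5] * 20  # twenty columns, each five characters wide
without_padding = columns_that_fit(widths, 80)
with_padding = columns_that_fit(widths, 80, extra_padding=5)
print(without_padding, with_padding)  # 13 12
```

In this model, padding by one column's width on top of the `+ 1` margin costs a visible column: 13 columns occupy 77 characters and fit in an 80-character terminal, yet only 12 are kept when the padding is applied.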