BUG-22984 Fix truncation of DataFrame representations #22987

JustinZhengBC · 2018-10-04T00:10:03Z

closes BUG: wrong detection if truncated repr is needed #22984
tests added / passed
passes git diff upstream/master -u -- "*.py" | flake8 --diff
whatsnew entry

When printing a DataFrame to terminal, an extra column's worth of space is added to the calculated width of the DataFrame. This is presumably to help edge cases, but the calculated difference between the DataFrame width and the terminal window width is incremented by 1 a few lines later, seemingly to fix the same problem. Do any more experienced developers know of a reason to pad the DataFrame width even more?
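To make the double padding concrete, here is a simplified model of the width check, not the actual code in pandas/io/formats/format.py. The function name `needs_truncation` and the `extra_pad` parameter are hypothetical, and the "+ 1" stands in for the later increment mentioned above.

```python
def needs_truncation(lines, terminal_width, extra_pad=0):
    """Return True if the rendered text is judged too wide for the terminal.

    Simplified illustration only; names and structure are assumptions.
    """
    # Width of the widest rendered line.
    max_len = max(len(line) for line in lines)
    # Hypothetical stand-in for the extra column's worth of space.
    max_len += extra_pad
    dif = max_len - terminal_width
    # Second safety margin, applied a few lines later.
    adj_dif = dif + 1
    return adj_dif > 0


row = "x" * 75          # a rendered row that is 75 characters wide
terminal_width = 80

padded = needs_truncation([row], terminal_width, extra_pad=8)
unpadded = needs_truncation([row], terminal_width)
print(padded, unpadded)
```

In this model a 75-character row is flagged for truncation in an 80-column terminal only when the extra padding is applied, which is the kind of unused space the linked issue describes.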

pep8speaks · 2018-10-04T00:10:06Z

Hello @JustinZhengBC! Thanks for updating the PR.

There are no PEP8 issues in the file pandas/io/formats/format.py !
There are no PEP8 issues in the file pandas/tests/io/formats/test_format.py !

Comment last updated on October 07, 2018 at 01:59 Hours UTC

codecov · 2018-10-04T20:43:17Z

Codecov Report

Merging #22987 into master will decrease coverage by <.01%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master   #22987      +/-   ##
==========================================
- Coverage   92.24%   92.24%   -0.01%     
==========================================
  Files         161      161              
  Lines       51340    51315      -25     
==========================================
- Hits        47361    47336      -25     
  Misses       3979     3979

Flag	Coverage Δ
#multiple	`90.63% <ø> (-0.01%)`	⬇️
#single	`42.31% <ø> (-0.04%)`	⬇️

Impacted Files	Coverage Δ
pandas/io/formats/format.py	`97.88% <ø> (-0.01%)`	⬇️
pandas/core/arrays/timedeltas.py	`95.08% <0%> (-0.56%)`	⬇️
pandas/core/dtypes/concat.py	`96.26% <0%> (-0.41%)`	⬇️
pandas/core/arrays/datetimelike.py	`95.92% <0%> (-0.22%)`	⬇️
pandas/core/arrays/datetimes.py	`98.44% <0%> (-0.04%)`	⬇️
pandas/core/indexes/datetimes.py	`96.12% <0%> (ø)`	⬆️
pandas/core/arrays/period.py	`98.49% <0%> (+0.04%)`	⬆️
pandas/tseries/offsets.py	`97.07% <0%> (+0.08%)`	⬆️
pandas/core/indexes/datetimelike.py	`98.01% <0%> (+0.27%)`	⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update fb4405d...139235a. Read the comment docs.

jreback · 2018-10-06T15:14:28Z

doc/source/whatsnew/v0.24.0.txt

@@ -194,6 +194,7 @@ Other Enhancements
 - :meth:`Index.to_frame` now supports overriding column name(s) (:issue:`22580`).
 - New attribute :attr:`__git_version__` will return git commit sha of current build (:issue:`21295`).
 - Compatibility with Matplotlib 3.0 (:issue:`22790`).
+- Representation of :class:`DataFrame` fills up the terminal window better


can you add the issue number here, I would call this more of a bug fix, no?

jreback · 2018-10-06T15:15:08Z

pandas/io/formats/format.py

@@ -616,11 +616,6 @@ def to_string(self):
            else:  # max_cols == 0. Try to fit frame to terminal
                text = self.adj.adjoin(1, *strcols).split('\n')
                max_len = Series(text).str.len().max()
-                headers = [ele[0] for ele in strcols]


we have a number of tests for this, surprised this didn't break anything, can you make a test if we have no coverage for this case now? (e.g. you use option_context to set the width, then check the output string)

jreback · 2018-10-06T15:15:19Z

cc @jorisvandenbossche

jreback · 2018-11-14T14:33:32Z

lgtm. @TomAugspurger a glance if you can.

TomAugspurger · 2018-11-14T14:48:55Z

doc/source/whatsnew/v0.24.0.txt

@@ -1312,6 +1312,7 @@ Notice how we now instead output ``np.nan`` itself instead of a stringified form
 - :func:`read_sas()` will correctly parse sas7bdat files with data page types having also bit 7 set (so page type is 128 + 256 = 384) (:issue:`16615`)
 - Bug in :meth:`detect_client_encoding` where potential ``IOError`` goes unhandled when importing in a mod_wsgi process due to restricted access to stdout. (:issue:`21552`)
 - Bug in :func:`to_string()` that broke column alignment when ``index=False`` and width of first column's values is greater than the width of first column's header (:issue:`16839`, :issue:`13032`)
+- Bug in :func:`to_string()` that caused representations of :class:`DataFrame` to not take up the whole window (:issue:`22984`)


This should be :class:`DataFrame.to_string` right? We don't have a top-level pandas.to_string.

TomAugspurger · 2018-11-14T14:53:22Z

pandas/tests/io/formats/test_format.py

@@ -343,6 +343,16 @@ def test_repr_truncates_terminal_size(self):

        assert df2.columns[0] in result.split('\n')[0]



nitpick: we have a pytest fixture for mock now, that does the try / except / skip done above.

I think it'd be cleanest to split this into a new test here, and accept the mock parameter.

def test_repr_truncates_terminal_size_full(self, mock): ...

And if you're feeling adventurous, you could change the try / except / skip mock import above to use the fixture as well. Not a big deal though.
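For readers unfamiliar with the pattern, here is a minimal stdlib-only sketch of mocking the terminal size in a test. It is not the pandas test or the pandas fixture: `fits_terminal` and `check_fits_with_width` are hypothetical helpers, and the sketch patches `shutil.get_terminal_size` rather than any pandas-internal function.

```python
import os
import shutil
from unittest import mock


def fits_terminal(text):
    """Return True if no line of `text` is wider than the terminal."""
    columns = shutil.get_terminal_size().columns
    return all(len(line) <= columns for line in text.split("\n"))


def check_fits_with_width(text, width):
    """Run fits_terminal with the terminal width pinned to `width`."""
    # Build a fake (columns, lines) result so the check is deterministic.
    fake_size = os.terminal_size((width, 24))
    with mock.patch("shutil.get_terminal_size", return_value=fake_size):
        return fits_terminal(text)


print(check_fits_with_width("a" * 80, 80))
print(check_fits_with_width("a" * 81, 80))
```

Pinning the size this way keeps the assertion independent of whatever terminal, if any, the test suite happens to run in.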


codecov · 2018-11-14T17:01:07Z

Codecov Report

Merging #22987 into master will decrease coverage by <.01%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master   #22987      +/-   ##
==========================================
- Coverage   92.24%   92.24%   -0.01%     
==========================================
  Files         161      161              
  Lines       51339    51336       -3     
==========================================
- Hits        47360    47357       -3     
  Misses       3979     3979

Flag	Coverage Δ
#multiple	`90.64% <ø> (-0.01%)`	⬇️
#single	`42.34% <ø> (ø)`	⬆️

Impacted Files	Coverage Δ
pandas/io/formats/format.py	`97.88% <ø> (-0.01%)`	⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e413c49...82fa50c. Read the comment docs.

jreback · 2018-11-14T17:08:58Z

@JustinZhengBC need to merge master

TomAugspurger · 2018-11-14T20:05:23Z

Another hypothesis failure on azure. OK to merge @jreback? I can investigate the failing test separately.

jreback · 2018-11-14T21:18:51Z

yep


TomAugspurger · 2018-11-14T21:20:16Z

Merged master to fix the conflict. May as well let the CI run.

ping on green.

JustinZhengBC · 2018-11-14T23:42:49Z

@TomAugspurger green. Also thanks for fixing the merge issue

TomAugspurger · 2018-11-15T03:02:11Z

I caused it by merging my own PR, so it's the least I could do :)

Thanks!

* upstream/master:
  BUG: to_html misses truncation indicators (...) when index=False (pandas-dev#22786)
  API/DEPR: replace "raise_conflict" with "errors" for df.update (pandas-dev#23657)
  BUG: Append DataFrame to Series with dateutil timezone (pandas-dev#23685)
  CLN/CI: Catch that stderr-warning! (pandas-dev#23706)
  ENH: Allow for join between two multi-index dataframe instances (pandas-dev#20356)
  Ensure Index._data is an ndarray (pandas-dev#23628)
  DOC: flake8-per-pr for windows users (pandas-dev#23707)
  DOC: Handle exceptions when computing contributors. (pandas-dev#23714)
  DOC: Validate space before colon docstring parameters pandas-dev#23483 (pandas-dev#23506)
  BUG-22984 Fix truncation of DataFrame representations (pandas-dev#22987)


jreback requested changes Oct 6, 2018


jreback added the Output-Formatting (__repr__ of pandas objects, to_string) and Bug labels Oct 6, 2018

jreback added this to the 0.24.0 milestone Nov 14, 2018

jreback approved these changes Nov 14, 2018


TomAugspurger reviewed Nov 14, 2018


JustinZhengBC and others added 20 commits November 14, 2018 08:58

BUG-22984 Fix truncation of DataFrame representations

eb4a239

BUG-22984 Fix flake8 issues

8e82c82

BUG-22984 Fix whatsnew and add test

448153d

BUG-22984 Fix whatsnew and add test

244b295

BUG-22984 Fix linting issue

aa867b0

DOC: Updating the docstring of Series.dot (pandas-dev#22890)

34b464f

Fixing flake8 problems new to flake8 3.6.0 (pandas-dev#23472)

4629504

strictness and checks for Timedelta _simple_new (pandas-dev#23433)

696e8c7

REF: cython cleanup, typing, optimizations (pandas-dev#23464)

3faf1a9

* Easy bits of pandas-dev#23382 * Easy parts of pandas-dev#23368

ENH: Add FrozenList.union and .difference (pandas-dev#23394)

126edd9

Re-attempt of pandas-devgh-15506. Closes pandas-devgh-15475.

BUG: Allow freq conversion from dt64 to period (pandas-dev#23460)

da08eeb

Closes pandas-dev#23438

add number of Errors, Warnings to scripts/validate_docstrings.py (pan…

29239ad

…das-dev#23150)

DOC: Add cookbook entry for triangular correlation matrix (GH22840) (p…

90be7b3

…andas-dev#23032)

CLN: doc string (pandas-dev#23469)

defff22

STYLE: Standardize cython spacing for casting, with linting (pandas-d…

7aed9e6

…ev#23474)

DOC: Adding documentation for pandas.core.indexes.api internal functi…

f1768c7

…ons (pandas-dev#22980)

DOC: Validate in docstrings that numpy and pandas are not imported (p…

cb51a02

…andas-dev#23161)

DOC: Updated docstrings related to DateTimeIndex. GH22459 (pandas-dev…

ae938fd

…#22504)

DOC: Rephrased doc for Series.asof. Added examples (pandas-dev#21034)

8c29ede

benoxoft and others added 13 commits November 14, 2018 08:59

CI: Allow to compile docs with ipython 7.11 pandas-dev#22990 (pandas-…

951041e

…dev#23655)

TST: IntervalTree.get_loc_interval should return platform int (pandas…

d5d6d91

…-dev#23660)

CLN: Move to_excel to generic.py (pandas-dev#23656)

4242077

Add to_flat_index method to MultiIndex (pandas-dev#22866)

c1640c6

BUG: Fix read_excel w/parse_cols & empty dataset (pandas-dev#23661)

8e4bf4c

Closes pandas-devgh-9208.

DOC: Surface / doc mangle_dupe_cols in read_excel (pandas-dev#23678)

7dab45f

xref pandas-devgh-10523.

Fix errorbar visualization (pandas-dev#23674)

991547e

DOC: Accessing files from a S3 bucket. (pandas-dev#23639)

c8ac3bf

REF: Move Excel names parameter handling to CSV (pandas-dev#23690)

f9563ea

BUG: Fix Series/DataFrame.rank(pct=True) with more than 2**24 rows (p…

2688cbe

…andas-dev#23688)

CI: raise clone depth limit on CI

d0adfb0

Implement _most_ of the EA interface for DTA/TDA (pandas-dev#23643)

f452c40

JustinZhengBC force-pushed the BUG-22984 branch from 139235a to f452c40 Compare November 14, 2018 17:00

JustinZhengBC added 3 commits November 14, 2018 09:10

Merge master

36f0608

reapply changes

8459936

r

0f7aa4b

Merge remote-tracking branch 'upstream/master' into JustinZhengBC-BUG…

82fa50c

…-22984

TomAugspurger merged commit 6920363 into pandas-dev:master Nov 15, 2018

tm9k1 pushed a commit to tm9k1/pandas that referenced this pull request Nov 19, 2018

BUG-22984 Fix truncation of DataFrame representations (pandas-dev#22987)

32a50dc

* BUG-22984 Fix truncation of DataFrame representations

Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019

BUG-22984 Fix truncation of DataFrame representations (pandas-dev#22987)

d0862a9

* BUG-22984 Fix truncation of DataFrame representations

Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019

BUG-22984 Fix truncation of DataFrame representations (pandas-dev#22987)

87d3a3d

* BUG-22984 Fix truncation of DataFrame representations
