BUG: DataFrame.to_string with formatters, header and index False #13350

chiroptical · 2016-06-02T23:15:57Z

tests added / passed - added test specific to format bug
passes pep8radius master --diff
whatsnew entry - not needed

Found this bug experimenting with formatters. First pull request to pandas, but I believe guidelines are quite clear. I can explain what was happening in more detail if that is necessary.

jreback · 2016-06-02T23:25:29Z

is this related to #13032 ?

can you show an example of before / after

jreback · 2016-06-02T23:26:15Z

pandas/tests/formats/test_printing.py

@@ -9,6 +9,17 @@
 _multiprocess_can_split_ = True


+def test_to_string_formatters_index_header():
+    from pandas import DataFrame


add the issue number (this PR since there is no number) as a comment

if you need the import put at the top of the file

chiroptical · 2016-06-03T00:52:02Z

Turns out this is exactly related to #13032. I did see that issue, but didn't make the connection initially. In my testing, the lines should be removed, not modified. To be honest, I didn't understand what these lines accomplished when I was reviewing the code originally.

Minimal code to reproduce (stripped down from #13032, easier to see with formatters):

>>> import pandas as pd
>>> frame = pd.DataFrame(data={0: 0, 1: 0}, index=[0])
>>> formatter = lambda x: '{:10.3f}'.format(x)
>>> print(frame.to_string(formatters=[formatter, formatter], index=False, header=False))

Before (adding slashes to make space counting easier):
\ \ \ \ 0.000\ \ \ \ \ \ 0.000
After:
\ \ \ \ \ 0.000\ \ \ \ \ \ 0.000
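
The hand-inserted slashes above can be replaced by a small helper that makes padding visible. This is only a sketch: `show_spaces` and `leading_spaces` are hypothetical names, not part of pandas, and the two sample strings are copied from the Before/After lines above rather than generated by `to_string`.

```python
def show_spaces(text, marker="\u00b7"):
    """Return text with every space replaced by a visible marker."""
    return text.replace(" ", marker)


def leading_spaces(line):
    """Count the spaces in front of the first non-space character."""
    return len(line) - len(line.lstrip(" "))


# Strings copied from the Before/After lines (slashes removed).
before = "    0.000      0.000"
after = "     0.000      0.000"

for label, line in (("before", before), ("after", after)):
    print(label, leading_spaces(line), show_spaces(line))
```

Comparing `leading_spaces` per line is usually enough to tell whether a change shifted a whole column or only the first one.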

Output from PR for #13032 (without formatter):

>>> print(df.to_string(index=False))
      one       two     three
 1.722364  0.846757  0.094394
-0.578834  0.836656  0.665414
 0.345460  1.782786  1.760175

Output from PR for #13032 (with formatter 10.3f, slashes added once for column count):

>>> print(df.to_string(index=False,formatters=[formatter,formatter,formatter]))
       one        two      three
\ \ \ \ \ 1.722      0.847      0.094
    -0.579      0.837      0.665
     0.345      1.783      1.760

I will add the PR and Issues numbers plus make corrections shortly.

codecov-io · 2016-06-03T01:00:15Z

Current coverage is 84.23% (diff: 100%)

Merging #13350 into master will increase coverage by <.01%

@@             master     #13350   diff @@
==========================================
  Files           138        138          
  Lines         50713      50721     +8   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
+ Hits          42715      42723     +8   
  Misses         7998       7998          
  Partials          0          0

Powered by Codecov. Last update ce56542...73d7b7e

chiroptical · 2016-06-03T01:02:24Z

Additionally, the failed Travis CI checks don't appear to be related to my subtractions/additions. Lastly, the unit test must use assert because of its location in the code. Should it be moved?

jreback · 2016-06-03T01:02:57Z

@barrymoo can you also read through the comments on that issue (and the referenced one) and check that this covers all the bases.

chiroptical · 2016-06-03T01:09:21Z

@jreback Of course,

From comment #13032 (comment):
Before:

Name    Value
                                  Short        1
                                 Longer  9374518
Much Longer name to the Max -----------    32432

After (desired output, correct?):

                                    Name    Value
                                   Short        1
                                  Longer  9374518
 Much Longer name to the Max -----------    32432

I think the other comments would be unaffected. To be fair though, I am very new to pandas.

chiroptical · 2016-06-03T01:42:31Z

I see what you mean now. From #11833:

Code:

>>> import pandas as pd
>>> df = pd.DataFrame({'a':range(5)})
>>> df.to_string(index=False)
u' a\n 0\n 1\n 2\n 3\n 4'
>>> formatter = lambda x: '{:1d}'.format(x)
>>> df.to_string(formatters=[formatter], index=False)
u'a\n0\n1\n2\n3\n4'

I should be able to fix this issue quickly.

chiroptical · 2016-06-03T02:45:42Z

A different perspective: where are those spaces coming from in the first place? (I will try to track this down)

chiroptical · 2016-06-03T03:40:55Z

Code:

#!/usr/bin/env python
import pandas as pd
import numpy as np

# Test 1
frame = pd.DataFrame(data={0: 0, 1: 0}, index=[0])
formatter = lambda x: '{:10.3f}'.format(x)
string = frame.to_string(index=False, header=False)
print('--> Begin Test 1 <--')
print(string)
print('-->  End Test 1  <--')

# Test 2
df = pd.DataFrame(np.random.randn(3, 3), columns=['one', 'two', 'three'])
string = df.to_string(index=False)
print('--> Begin Test 2 <--')
print(string)
print('-->  End Test 2  <--')

# Test 3
df = pd.DataFrame({'a':range(5)})
string = df.to_string(index=False)
print('--> Begin Test 3 <--')
print(string)
print('-->  End Test 3  <--')

# Test 4
NAMES = ['Short', 'Longer', 'Much Longer name to the Max -----------']
VALUES = [1, 9374518, 32432]
d = pd.DataFrame({'Name': NAMES, 'Value': VALUES})
string = d.to_string(index=False)
print('--> Begin Test 4 <--')
print(string)
print('-->  End Test 4  <--')

Produces:

$ python quick-test.py
--> Begin Test 1 <--
0 0
-->  End Test 1  <--
--> Begin Test 2 <--
      one       two     three
-0.117275 -0.410192 -2.170441
 0.194766  0.521318  0.936951
-0.923841 -1.829388  1.078478
-->  End Test 2  <--
--> Begin Test 3 <--
a
0
1
2
3
4
-->  End Test 3  <--
--> Begin Test 4 <--
                                   Name   Value
                                  Short       1
                                 Longer 9374518
Much Longer name to the Max -----------   32432
-->  End Test 4  <--

I believe this is the desired output. Unfortunately, this code fails about 20 tests. Hopefully, it is because the spacing in the expected output has changed slightly.

chiroptical · 2016-06-03T04:59:18Z

About 30 failed checks from nosetests pandas/tests/formats/test_format.py. I will be checking the expected outputs this weekend. Note, I edited #13350 (comment) to include the new code edits.

chiroptical · 2016-06-03T15:42:06Z

I have created barrymoo/pandas-pr-13350-supplement to document the test failures (and eventually generate new expected strings). This is a work-in-progress.

chiroptical · 2016-06-06T14:47:51Z

I have added some more tests to the supplement. There is one test I am having some difficulty with; it starts at https://github.com/barrymoo/pandas-pr-13350-supplement/blob/master/tests.py#L474. I think the frame is supposed to exceed the terminal size, but it doesn't.

chiroptical · 2016-06-16T13:38:25Z

Hey @jreback I finished my supplement but I need some opinions about output formatting especially concerning the test_*_east_asian* tests (I don't know what these should look like). Is there anyone else we could pull in to review this? That way I can fix the rest of the formatting concerns and fix everything with one PR.

For the supplement, clone it, activate the dev environment (tested with 2 & 3, but have not examined diffs of the output), run python tests.py, and review formatting. Submit issues to the other repo with concerns of specific tests.

jreback · 2016-06-16T13:51:45Z

@barrymoo not sure what you mean by supplement. simply update this PR, comments can just be done here.
cc @sinhrks

sinhrks · 2016-06-16T14:11:41Z

@barrymoo I live in Japan and am willing to check the output format and provide test cases:)

chiroptical · 2016-06-16T14:18:16Z

@jreback I ripped out the failing tests so one can easily print the results out on the command line. That way I can get some input from the community on how people want things formatted and make additional changes.

chiroptical · 2016-06-21T16:57:38Z

Here's a great example for why I need the supplement. For test_datetimelike_frame, my changes lead to the following output

                          dt  x
0  2011-01-01 00:00:00-05:00  1
1  2011-01-01 00:00:00-05:00  2
..                       ... ..
8                        NaT  9
9                        NaT 10

But, do you like:

                          dt  x
 0 2011-01-01 00:00:00-05:00  1
 1 2011-01-01 00:00:00-05:00  2
..                       ... ..
 8                       NaT  9
 9                       NaT 10

or...

                         dt  x
0 2011-01-01 00:00:00-05:00  1
1 2011-01-01 00:00:00-05:00  2
.                       ... ..
8                       NaT  9
9                       NaT 10

I can easily generate all of these outputs. Or would you rather I pick what I like and get all of the tests working?
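
The difference between the first two layouts comes down to how the index labels are justified within their column. The sketch below shows only that choice, using plain string methods; `render` is a hypothetical helper, not pandas code, and it does not reproduce pandas' exact spacing or the header row.

```python
def render(index, columns, index_justify="left", sep=" "):
    """Join index labels and column cells into lines.

    index: list of label strings.
    columns: list of columns, each a list of cell strings.
    Cells are right-justified to their column's widest cell;
    index labels are left- or right-justified to the widest label.
    """
    index_width = max(len(label) for label in index)
    pad = str.ljust if index_justify == "left" else str.rjust
    col_widths = [max(len(cell) for cell in col) for col in columns]
    lines = []
    for i, label in enumerate(index):
        cells = [col[i].rjust(w) for col, w in zip(columns, col_widths)]
        lines.append(sep.join([pad(label, index_width)] + cells))
    return lines


index = ["0", "1", "..", "8", "9"]
x = ["1", "2", "..", "9", "10"]

print("\n".join(render(index, [x], index_justify="left")))
print("\n".join(render(index, [x], index_justify="right")))
```

The third layout would additionally need the truncation marker shortened to the index width, which this sketch leaves out.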

evanpw · 2016-06-26T14:56:07Z

I also worked on this; getting all of the tests to pass afterward is a nightmare. It looks like this change removes the leading space on integers but leaves it on floats. Is that true?

chiroptical · 2016-06-26T15:20:25Z

@evanpw it's very time consuming. I am still working through all of the tests, but I don't have a ton of free time. If you're looking at all positive numbers there is an extra space for the nonexistent "-" sign. There is an example in one of the above comments.

evanpw · 2016-06-26T15:25:13Z

After this change there won't be an extra leading space for a column of positive integers, but there will still be one for a column of positive floats, right?

chiroptical · 2016-06-26T15:30:52Z

That's correct, but I can likely fix that too. Again the majority of this work is fixing the tests.
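
The "extra space for the nonexistent - sign" can be illustrated with Python's own format specification, where a space in the sign position reserves one column for non-negative numbers. This is an analogy for the behaviour discussed here, not the code path pandas uses; `with_sign_slot` and `without_sign_slot` are hypothetical names.

```python
def with_sign_slot(value, spec):
    """Format value, reserving one column for the sign.

    Non-negative values get a leading space; negative values get '-'.
    """
    return format(value, " " + spec)


def without_sign_slot(value, spec):
    """Format value with no column reserved for a sign."""
    return format(value, spec)


ints = [0, 1, 2]
floats = [1.722364, -0.578834, 0.345460]

print([with_sign_slot(v, "d") for v in ints])
print([without_sign_slot(v, "d") for v in ints])
print([with_sign_slot(v, ".3f") for v in floats])
```

With a sign slot, an all-positive column still lines up with a column containing negatives, at the cost of one leading space per value.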

jreback · 2016-09-09T22:40:07Z

can you rebase / update?

chiroptical · 2016-09-10T02:12:57Z

I didn't do this correctly; sorry, I have not mastered this bit of git yet. I will submit a different pull request.

jreback added the Output-Formatting __repr__ of pandas objects, to_string label Jun 2, 2016

jreback reviewed Jun 2, 2016

jreback added the Bug label Jun 3, 2016

sinhrks and others added 5 commits July 18, 2016 18:08

TST: assert message shows unnecessary diff (#13676)

6b9cd15

ENH: Series.append now has ignore_index kw

694fe61

Author: sinhrks <[email protected]> Closes #13677 from sinhrks/append_series and squashes the following commits: 4bc7b54 [sinhrks] ENH: Series.append now has ignore_index kw

BUG: Cast a key to NaT before get loc from Index

9f635cd

closes #13603 Author: yui-knk <[email protected]> Closes #13687 from yui-knk/fix_13603 and squashes the following commits: 0960395 [yui-knk] BUG: Cast a key to NaT before get loc from Index

jreback and others added 19 commits September 5, 2016 18:02

DOC: issue typo in v0.19.0

f7506c6

BLD: add in build conflict resolution to appeveyor.yml

3110a72

TST: skipping xref #14120, locale separator in parser tests of unsupp…

1a8273c

…orted engines

API/DEPR: Remove +/- as setops for Index (GH8227) (#14127)

8023029

Fix trivial typo in comment (#14174)

844d5fb

API/DEPR: Remove +/- as setops for DatetimeIndex/PeriodIndex (GH9630) (…

e88ad28

…#14164) API/DEPR: Remove +/- as setops for DatetimeIndex/PeriodIndex (GH9630) xref #13777, deprecations put in place in #9630

DEPR: Deprecate pandas.core.datetools (#14105)

3f3839b

* MAINT: Replace datetools import in tests * MAINT: Replace datetools import internally * DOC: Replace datetools import in docs * MAINT: Remove datetool imports from scripts * DEPR: Deprecate pandas.core.datetools Closes gh-14094.

ENH: concat and append now can handle unordered categories (#13767)

ab4bd36

Concatting categoricals with non-matching categories will now return object dtype instead of raising an error. * ENH: concat and append now can handleunordered categories * reomove union_categoricals kw from concat

Add steps to run gbq integration testing to the contributing docs (#1…

9b7efd6

…4144)

DOC: cleanup build warnings (#14172)

1ace12b

* DOC: remove examples on Panel4D (caused warnings) and refer to older docs * DOC: fix build warnings * resolve comments

DOC: clean-up 0.19.0 whatsnew file (#14176)

957eaa4

* DOC: clean-up 0.19.0 whatsnew file * further clean-up * Update highlights * consistent use of behaviour/behavior * s/favour/favor

RLS: v0.19.0rc1

497a3bc

BUG : bug in setting a slice of a Series with a np.timedelta64

ff435ba

closes #14155 closes #14160

DOC: minor typo in 0.19.0 whatsnew file (#14185)

02df7b6

TST: Make encoded sep check more locale sensitive (#14161)

8af6264

Closes gh-14140.

Barry E. Moore II and others added 5 commits September 9, 2016 22:04

BUG: df.to_string with formatters, header and index False

678b636

BUG: Fix issue #13032, annotate test

5a1b743

BUG: spacing issue complete

4d7559d

BUG: hunt down remaining leading whitespace

aa91bcd

Merge branch 'dataframe-to_string-minor-bug-fix' of https://github.co…

ee2481c

…m/barrymoo/pandas into dataframe-to_string-minor-bug-fix

chiroptical closed this Sep 10, 2016

chiroptical deleted the dataframe-to_string-minor-bug-fix branch September 10, 2016 02:25

chiroptical mentioned this pull request Sep 10, 2016

Fix minor spacing issue #14196

Closed

4 tasks

jorisvandenbossche added this to the No action milestone Sep 14, 2016
