To string with encoding #28951

mohitanand001 · 2019-10-13T06:12:47Z

closes #28766 (to_string() does not have an "encoding" parameter)
tests added / passed
passes black pandas
passes git diff upstream/master -u -- "*.py" | flake8 --diff
whatsnew entry
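
As a rough illustration of why an explicit encoding matters when rendered text is written to a file, here is a standard-library-only sketch. It does not call pandas; the file name is made up for the example, and the sample string is the one that appears in this PR's test data.

```python
import tempfile
from pathlib import Path

# Sample text; the same string appears in this PR's test data.
text = "造成输出中文显示乱码"

with tempfile.TemporaryDirectory() as tmp:
    # "frame.txt" is a hypothetical file name used only for this sketch.
    path = Path(tmp) / "frame.txt"

    # Write with an explicit encoding, as a caller of to_string might want.
    path.write_text(text, encoding="gbk")

    # Reading with the same encoding recovers the text.
    round_trip = path.read_text(encoding="gbk")

    # Reading with a different encoding either fails or yields other text.
    try:
        mismatched = path.read_text(encoding="utf-8")
    except UnicodeDecodeError:
        mismatched = None

print(round_trip == text)
print(mismatched == text)
```

The point of the sketch is only that writer and reader have to agree on the encoding, which is hard to arrange when the writing side offers no way to choose one.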

simonjayhawkins

@farziengineer Thanks for the PR.

I think it's best to put this on hold until #28692 is merged, to ensure consistency.

doc/source/whatsnew/v1.0.0.rst

pandas/core/frame.py

pandas/io/formats/format.py

jreback · 2019-10-16T12:29:52Z

can you merge master

mohitanand001 · 2019-10-18T18:13:53Z

@simonjayhawkins please have a look.

simonjayhawkins

@farziengineer

The three methods to_html, to_latex and to_string have a lot of code in common and I'd prefer to not have separate tests.

Although that was done in #28692, now that encoding has been added to to_html, that just leaves to_string (i.e. this PR).

Can you look at adding encoding to test_filepath_or_buffer_arg in test_format.py instead of the test added here (if you can)?

Otherwise, can you make the test consistent with #28692?

jreback · 2019-10-18T21:27:29Z

pandas/tests/io/formats/test_to_string.py

@@ -0,0 +1,6 @@
+def test_to_string_encoding(float_frame,):


These all belong in test_format.py; please move the test there.

It would take a follow-up to split test_format.py into two parts, though (e.g. move the to_string tests out to a separate file).

pandas/tests/io/formats/test_format.py

mohitanand001 · 2019-10-20T06:12:04Z

Hi, the CI is failing for reasons I do not understand: https://travis-ci.org/pandas-dev/pandas/jobs/600239741
It says:

No output has been received in the last 10m0s, this potentially indicates a stalled build or something wrong with the build itself

simonjayhawkins

@farziengineer lgtm @WillAyd @jreback

mohitanand001 · 2019-10-21T16:13:05Z

Hi @WillAyd @jreback can you please review.

pandas/io/formats/format.py

pandas/tests/io/formats/test_format.py

WillAyd · 2019-10-21T16:44:01Z

pandas/tests/io/formats/test_format.py

+            ValueError, match="buf is not a file name and encoding is specified."
+        ):
+            getattr(df, method)(buf=filepath_or_buffer, encoding=encoding)
+    elif encoding == "foo":


I think you can remove the invalid encoding; this doesn't test any function pandas provides, only builtin Python functionality.

@simonjayhawkins thoughts?

The reason I asked for it to be added was so that the precedence of the Exceptions was checked and to confirm the encoding parameter was passed to the builtin function.
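
The precedence described here can be sketched without pandas. The helper `write_rendered` below is hypothetical and is not the pandas implementation; it only assumes the behaviour the test expects: a buffer combined with an encoding is rejected with the ValueError message quoted in the diff above, while a file path hands the encoding to the builtin open(), which raises LookupError for an unknown name such as "foo".

```python
import io
import os
import tempfile


def write_rendered(text, buf, encoding=None):
    """Hypothetical helper (not pandas code): write text to a path or buffer."""
    if isinstance(buf, (str, os.PathLike)):
        # The encoding is handed to the builtin open(); an unknown
        # name such as "foo" makes open() raise LookupError.
        with open(buf, "w", encoding=encoding) as f:
            f.write(text)
    elif encoding is not None:
        # A buffer cannot take an encoding here, so this is rejected.
        raise ValueError("buf is not a file name and encoding is specified.")
    else:
        buf.write(text)


def raised_by(buf, encoding):
    """Return the exception type raised for this combination, or None."""
    try:
        write_rendered("abc", buf, encoding)
    except (ValueError, LookupError) as exc:
        return type(exc)
    return None


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "out.txt")
    results = {
        "buffer + foo": raised_by(io.StringIO(), "foo"),
        "path + foo": raised_by(path, "foo"),
        "path + utf-8": raised_by(path, "utf-8"),
    }

print(results)
```

Seeing LookupError for the path case is what shows that the encoding argument actually reached open(), which is the check being asked for.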

Oh OK my mistake - just didn't see that was asked for previously (lost in GH comments)

so agree with #28951 (comment) but this sort of compensates. If we confirm the encoding is passed, then reading back in is only testing Python functionality.

float_frame will probably work with any encoding, so maybe best to modify float_frame if encoding == "gbk".

this should work...

diff --git a/pandas/tests/io/formats/test_format.py b/pandas/tests/io/formats/test_format.py
index 096fc6cb4..490cecb41 100644
--- a/pandas/tests/io/formats/test_format.py
+++ b/pandas/tests/io/formats/test_format.py
@@ -73,17 +73,19 @@ def filepath_or_buffer(filepath_or_buffer_id, tmp_path):
 
 
 @pytest.fixture
-def assert_filepath_or_buffer_equals(filepath_or_buffer, filepath_or_buffer_id):
+def assert_filepath_or_buffer_equals(
+    filepath_or_buffer, filepath_or_buffer_id, encoding
+):
     """
     Assertion helper for checking filepath_or_buffer.
     """
 
     def _assert_filepath_or_buffer_equals(expected):
         if filepath_or_buffer_id == "string":
-            with open(filepath_or_buffer) as f:
+            with open(filepath_or_buffer, encoding=encoding) as f:
                 result = f.read()
         elif filepath_or_buffer_id == "pathlike":
-            result = filepath_or_buffer.read_text()
+            result = filepath_or_buffer.read_text(encoding=encoding)
         elif filepath_or_buffer_id == "buffer":
             result = filepath_or_buffer.getvalue()
         assert result == expected
@@ -3250,6 +3252,8 @@ def test_filepath_or_buffer_arg(
     filepath_or_buffer_id,
 ):
     df = float_frame
+    if encoding == "gbk":
+        float_frame.iloc[0, 0] = "造成输出中文显示乱码"
 
     if filepath_or_buffer_id not in ["string", "pathlike"] and encoding is not None:
         with pytest.raises(
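
A small sketch of why float data alone would not exercise the encoding, assuming the rendered float output contains only ASCII characters. The float-like text below is made up for illustration; the Chinese string is the one used in the suggested diff.

```python
# Made-up ASCII-only text standing in for rendered float values.
float_text = "0.25  -1.5"
# String used in the suggested diff.
chinese_text = "造成输出中文显示乱码"

# ASCII characters map to the same bytes in both encodings.
ascii_same = float_text.encode("utf-8") == float_text.encode("gbk")

# Non-ASCII characters do not, so the chosen encoding becomes observable.
chinese_same = chinese_text.encode("utf-8") == chinese_text.encode("gbk")

print(ascii_same, chinese_same)
```

With ASCII-only content the file bytes are identical under either encoding, so a round-trip test would pass even if the encoding argument were ignored.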

pandas/tests/io/formats/test_format.py

WillAyd · 2019-10-22T00:49:40Z

pandas/tests/io/formats/test_format.py

+    filepath_or_buffer,
+    assert_filepath_or_buffer_equals,
+    encoding,
+    filepath_or_buffer_id,
 ):
    df = float_frame


Is there a reason to use float_frame here? I think you can just construct a DataFrame using the data you have on the line below instead of modifying this fixture; as is, the intent of using this fixture is not clear.

@simonjayhawkins float_frame fixture has been uniformly used in all the tests, hence it is used here.

maybe..

diff --git a/pandas/tests/io/formats/test_format.py b/pandas/tests/io/formats/test_format.py
index 096fc6cb4..dc2784a7b 100644
--- a/pandas/tests/io/formats/test_format.py
+++ b/pandas/tests/io/formats/test_format.py
@@ -3240,16 +3240,19 @@ def test_repr_html_ipython_config(ip):
 
 
 @pytest.mark.parametrize("method", ["to_string", "to_html", "to_latex"])
-@pytest.mark.parametrize("encoding", [None, "utf-8", "gbk", "foo"])
+@pytest.mark.parametrize(
+    "encoding, data",
+    [(None, "abc"), ("utf-8", "abc"), ("gbk", "造成输出中文显示乱码"), ("foo", "abc")],
+)
 def test_filepath_or_buffer_arg(
-    float_frame,
     method,
     filepath_or_buffer,
     assert_filepath_or_buffer_equals,
     encoding,
+    data,
     filepath_or_buffer_id,
 ):
-    df = float_frame
+    df = DataFrame([data])
 
     if filepath_or_buffer_id not in ["string", "pathlike"] and encoding is not None:
         with pytest.raises(
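
The paired (encoding, data) cases can be walked through with the standard library alone. This is only a sketch of the idea behind the parametrization, not the pandas test: it writes each data string straight to a temporary file instead of rendering a DataFrame, and the file names are invented for the example.

```python
import tempfile
from pathlib import Path

# (encoding, data) pairs copied from the suggested parametrization.
cases = [
    (None, "abc"),
    ("utf-8", "abc"),
    ("gbk", "造成输出中文显示乱码"),
    ("foo", "abc"),
]

outcomes = {}
with tempfile.TemporaryDirectory() as tmp:
    for i, (encoding, data) in enumerate(cases):
        # Hypothetical file name, one per case.
        path = Path(tmp) / f"case_{i}.txt"
        try:
            path.write_text(data, encoding=encoding)
            # Read back with the same encoding and compare.
            outcomes[encoding] = path.read_text(encoding=encoding) == data
        except LookupError:
            # "foo" is not a registered codec, so the write is rejected.
            outcomes[encoding] = "unknown encoding"

print(outcomes)
```

Pairing each encoding with its own data keeps the non-ASCII string tied to the one case that needs it, instead of mutating a shared fixture.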

Looks fine, I'll make the changes and push.

pandas/tests/io/formats/test_format.py

jreback · 2019-10-22T12:37:46Z

lgtm. merge when @WillAyd and @simonjayhawkins good.

WillAyd

If you can address the outstanding suggestion from @simonjayhawkins on float_frame replacement then lgtm

pandas/tests/io/formats/test_format.py

mohitanand001 · 2019-10-23T14:37:45Z

@WillAyd updated the branch with the required changes, please have a look.

WillAyd · 2019-10-23T15:14:34Z

pandas/tests/io/formats/test_format.py

+    else:
+        expected = getattr(df, method)()
+        getattr(df, method)(buf=filepath_or_buffer, encoding=encoding)
+        assert_filepath_or_buffer_equals(expected)


Is there a reason for this to be a fixture instead of just a global function? This way of invoking the function seems very magical; I think easier if not a fixture

Shall I make this change as a part of this PR itself?

Hmm sorry thought it was a part of this PR. I think OK to do here but let's see what @simonjayhawkins thinks

this is fine as is.
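
For readers unfamiliar with the two shapes being compared, here is a minimal sketch without pytest. Both function names are hypothetical; the first mimics a fixture that returns an inner function bound to its inputs, the second is the plain-function alternative that was raised in review.

```python
import io


def make_assert_buffer_equals(buf):
    """Mimics the fixture shape: returns an inner function bound to buf."""

    def _assert_buffer_equals(expected):
        assert buf.getvalue() == expected

    return _assert_buffer_equals


def assert_buffer_equals(buf, expected):
    """Plain-function shape: every input is passed explicitly."""
    assert buf.getvalue() == expected


buf = io.StringIO()
buf.write("abc")

make_assert_buffer_equals(buf)("abc")  # closure style
assert_buffer_equals(buf, "abc")       # plain-function style
```

The closure style hides where `buf` comes from at the call site, which is the "magical" quality mentioned above; the plain function makes it explicit at the cost of a longer call.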

WillAyd · 2019-10-23T17:48:13Z

Thanks @farziengineer

mohitanand001 · 2019-10-23T17:50:48Z

Thanks @simonjayhawkins @WillAyd for all the reviews.

Mohit Anand and others added 7 commits October 13, 2019 09:34

Added code in core/frame.py to include encoding in to_string

ae76e46

Modified io/formats/format.py to include encoding parameter in to_strin

3985040

Added encoding to pandas/core/format.py for to_string

8eea18e

Changed formatting with black

d2c70ee

Added test for to_string encoding

835b22d

Removed spaces from test_to_string.py file

4e30d7c

Added whatsnew for to_string with encoding param.

78ba34b

mohitanand001 mentioned this pull request Oct 13, 2019

to_html() has no encoding parameter? This causes garbled Chinese characters in the output #28663

Closed

simonjayhawkins added Enhancement Output-Formatting __repr__ of pandas objects, to_string labels Oct 13, 2019

simonjayhawkins requested changes Oct 13, 2019

doc/source/whatsnew/v1.0.0.rst (outdated)

doc/source/whatsnew/v1.0.0.rst (outdated)

pandas/core/frame.py (outdated)

mohitanand001 requested a review from simonjayhawkins October 13, 2019 11:37

mohitanand001 added 4 commits October 13, 2019 20:57

Modified whatsnew with to_string encoding note in Others section.

56fdad9

Modified func to meth in for to_string note added in whatsnew.

83ccffd

Added full stop at end of line in to_string docstring.

19801bb

Moved whatsnew note from Other to Other enhancements section.

d44afa7

WillAyd requested changes Oct 14, 2019

pandas/core/frame.py (outdated)

pandas/io/formats/format.py

Added annotation to encoding parameter in to_string in pandas/core/fr…

b4b983b

…ame.py

Merge branch 'master' into to_string_with_encoding

acb2c6e

mohitanand001 requested a review from WillAyd October 16, 2019 14:56

simonjayhawkins reviewed Oct 18, 2019

jreback added this to the 1.0 milestone Oct 18, 2019

jreback reviewed Oct 18, 2019

mohitanand001 added 2 commits October 19, 2019 09:15

Added encoding in test_format.py and removed test_to_string.py

236db38

Merge branch 'to_string_with_encoding' of https://github.com/farzieng…

93e8b11

…ineer/pandas into to_string_with_encoding

simonjayhawkins reviewed Oct 19, 2019

pandas/tests/io/formats/test_format.py (outdated)

mohitanand001 added 2 commits October 19, 2019 14:46

Added encoding in pytest parameter in test_format.py

1f3e55d

Fixed encoding paramter placement in test_format.py

b0364d2

Removed commented code from format.py

1a35eeb

mohitanand001 requested a review from simonjayhawkins October 20, 2019 06:12

mohitanand001 added 2 commits October 20, 2019 15:04

Removed commented code from format.py

f865ac3

Merge branch 'to_string_with_encoding' of https://github.com/farzieng…

757a1f5

…ineer/pandas into to_string_with_encoding

simonjayhawkins approved these changes Oct 21, 2019

WillAyd requested changes Oct 21, 2019

mohitanand001 requested a review from WillAyd October 21, 2019 17:53

Added encoding in assert_filepath_or_buffer_equals fixture.

835cdb8

WillAyd requested changes Oct 22, 2019

mohitanand001 requested a review from WillAyd October 22, 2019 05:10

WillAyd reviewed Oct 22, 2019

pandas/tests/io/formats/test_format.py

mohitanand001 added 3 commits October 22, 2019 22:41

Modified test_filepath_or_buffer_arg to have custom data(removed floa…

648fa55

…t_frame).

Modified test_filepath_or_buffer_arg to have custom data(removed floa…

3698a2a

…t_frame).

Merge branch 'to_string_with_encoding' of https://github.com/farzieng…

f8917de

…ineer/pandas into to_string_with_encoding

mohitanand001 requested a review from WillAyd October 22, 2019 19:50

WillAyd reviewed Oct 23, 2019

mohitanand001 requested a review from WillAyd October 23, 2019 17:46

WillAyd approved these changes Oct 23, 2019

WillAyd merged commit ee59b7d into pandas-dev:master Oct 23, 2019

HawkinsBA pushed a commit to HawkinsBA/pandas that referenced this pull request Oct 29, 2019

To string with encoding (pandas-dev#28951)

0aa913e

Reksbril pushed a commit to Reksbril/pandas that referenced this pull request Nov 18, 2019

To string with encoding (pandas-dev#28951)

e46699d

proost pushed a commit to proost/pandas that referenced this pull request Dec 19, 2019

To string with encoding (pandas-dev#28951)

851bd42

proost pushed a commit to proost/pandas that referenced this pull request Dec 19, 2019

To string with encoding (pandas-dev#28951)

39317dd

bongolegend pushed a commit to bongolegend/pandas that referenced this pull request Jan 1, 2020

To string with encoding (pandas-dev#28951)

4957f23
