DOC: read_excel doc - fixed formatting and added examples #18753

JanLauGe · 2017-12-12T23:36:07Z

Fixes a formatting bug in the read_excel docs that caused a line break and bold print in the list of _NA_VALUES. Adds examples to the read_excel docstring.

closes read_excel kwarg for comment? #18735
tests added & passed
passes git diff master -u -- "*.py" | flake8 --diff
whatsnew entry

JanLauGe · 2017-12-12T23:37:12Z

First pandas PR, please let me know about anything that should be done differently.
Thank you!

chris-b1 · 2017-12-13T00:20:59Z

Thanks @JanLauGe! To close #18735 I'd like to see the following:

add comment to the explicit named keyword arguments of the read_excel function, and to subsequent internal calls
add comment to the docstring as well
add a few tests for the comment functionality
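
To illustrate the distinction being asked for, here is a minimal sketch in plain Python. The functions parse_rows, read_via_kwds and read_named are hypothetical stand-ins, not pandas code; they only show the difference between forwarding comment through **kwds and exposing it as an explicit named parameter.

```python
import inspect


def parse_rows(rows, comment=None):
    """Hypothetical stand-in for an internal parser: strip commented text."""
    parsed = []
    for row in rows:
        if comment is not None:
            # Keep only the text before the comment character
            row = row.split(comment, 1)[0]
        if row:
            parsed.append(row)
    return parsed


def read_via_kwds(rows, **kwds):
    # comment works here, but only because it is forwarded blindly;
    # it does not appear in the signature.
    return parse_rows(rows, **kwds)


def read_named(rows, comment=None):
    # comment is an explicit parameter and is passed on by name.
    return parse_rows(rows, comment=comment)


rows = ["one", "two # note", "# only a comment"]
result_kwds = read_via_kwds(rows, comment="#")
result_named = read_named(rows, comment="#")

# Parameter names visible to introspection and documentation tools
named_params = list(inspect.signature(read_named).parameters)
kwds_params = list(inspect.signature(read_via_kwds).parameters)
```

Both variants behave the same at call time; the difference is that only the named version advertises comment in its signature, which is what makes it discoverable and documentable.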

codecov · 2017-12-13T01:36:33Z

Codecov Report

Merging #18753 into master will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master   #18753   +/-   ##
=======================================
  Coverage   91.58%   91.58%           
=======================================
  Files         150      150           
  Lines       48972    48972           
=======================================
  Hits        44851    44851           
  Misses       4121     4121

Flag	Coverage Δ
#multiple	`89.94% <ø> (ø)`	⬆️
#single	`41.72% <ø> (ø)`	⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update faeac49...6afed06.

JanLauGe · 2017-12-13T09:52:18Z

Thanks @chris-b1 ! To make sure I understand: You'd like comment to be a named argument instead of a keyword argument?

jreback

nice examples. pls add a whatsnew note in 0.22.0, you can put this in other api changes section, just say comment arg is exposed as a named parameter in read_excel.

jreback · 2017-12-13T14:25:30Z

pandas/tests/io/test_excel.py

@@ -1862,6 +1862,62 @@ def test_invalid_columns(self):
            with pytest.raises(KeyError):
                write_frame.to_excel(path, 'test1', columns=['C', 'D'])

+    def test_comment_arg(self):
+        # Test the comment argument functionality to read_excel


add issue number here as a comment

jreback · 2017-12-13T14:27:00Z

pandas/tests/io/test_excel.py

+            # Read file without comment arg
+            read_frame = read_excel(path, 'test_c')
+            read_frame_commented = read_excel(path, 'test_c', comment='#')
+            tm.assert_class_equal(read_frame, read_frame_commented)


be consistent with other tests on naming, e.g.
write_frame -> df
result = read_excel

use assert_frame_equal

jreback · 2017-12-13T14:27:10Z

pandas/tests/io/test_excel.py

+            # Create file to read in
+            write_frame = DataFrame({'A': ['one', '#one', 'one'],
+                                     'B': ['two', 'two', '#two']})
+            write_frame.to_excel(path, 'test_c')


same on naming

jreback · 2017-12-13T14:27:33Z

pandas/tests/io/test_excel.py

+
+            # Test that all-comment lines at EoF are ignored
+            read_frame_short = read_excel(path, comment='#')
+            assert (read_frame_short.shape == write_frame.iloc[0:1, :].shape)


compare using assert_frame_equal
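
A small sketch of why a shape comparison is weaker than a full frame comparison. The frames below are made-up illustration data, and the example assumes a pandas version that provides pandas.testing.assert_frame_equal (the test suite in this PR uses the tm alias instead).

```python
import pandas as pd
from pandas.testing import assert_frame_equal

# Illustrative data: same shape, different contents
expected = pd.DataFrame({"A": ["one"], "B": ["two"]})
result = pd.DataFrame({"A": ["one"], "B": ["#two"]})

# A shape check passes even though the values differ
shapes_match = result.shape == expected.shape

# assert_frame_equal also compares values, dtypes, index and columns
try:
    assert_frame_equal(result, expected)
    frames_match = True
except AssertionError:
    frames_match = False
```

Here the shape check would let a wrong result through, while assert_frame_equal raises.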

chris-b1

lgtm with @jreback's comments addressed

JanLauGe · 2017-12-13T18:58:27Z

Thanks for the comments @jreback, that should be all done now.
Please let me know if there is anything else outstanding.

alysivji · 2017-12-14T14:07:24Z

I spent a bunch of time lining up the arguments in read_excel to match read_csv

Would it be possible to keep following this pattern? Looks like comment can go between thousands and skip_footer

JanLauGe · 2017-12-14T14:18:43Z

Good point @alysivji, correct now?

alysivji · 2017-12-14T15:24:30Z

LGTM!

jreback

lgtm. just a small doc comment. ping on green.

jreback · 2017-12-18T12:49:25Z

doc/source/whatsnew/v0.22.0.txt

@@ -193,6 +193,7 @@ Other API Changes
 - Rearranged the order of keyword arguments in :func:`read_excel()` to align with :func:`read_csv()` (:issue:`16672`)
 - :func:`pandas.merge` now raises a ``ValueError`` when trying to merge on incompatible data types (:issue:`9780`)
 - :func:`wide_to_long` previously kept numeric-like suffixes as ``object`` dtype. Now they are cast to numeric if possible (:issue:`17627`)
+- comment arg is exposed as a named parameter in :func:`read_excel`


reverse this statement: in :func:`read_excel`, the comment argument is now exposed as a named parameter (and it needs double-backticks)

JanLauGe · 2017-12-20T08:43:00Z

uhoh, seems I mucked something up.
Sorry @jreback, can you explain to me why AppVeyor is suddenly failing now?

chris-b1 · 2017-12-21T15:09:43Z

pandas/io/excel.py

@@ -223,7 +301,8 @@ def read_excel(io,
               parse_dates=False,
               date_parser=None,
               thousands=None,
-               skipfooter=0,
+               comment=None,
+               skip_footer=0,


tests are failing because this parameter got renamed, should still be skipfooter
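
As a rough illustration of how a renamed keyword parameter can break existing callers, consider the sketch below. read_current and read_renamed are hypothetical stand-ins with a simplified signature; the real read_excel has more parameters and may fail in a different way.

```python
def read_current(path, thousands=None, comment=None, skipfooter=0):
    # Hypothetical stand-in for the signature existing callers rely on
    return {"path": path, "comment": comment, "skipfooter": skipfooter}


def read_renamed(path, thousands=None, comment=None, skip_footer=0):
    # Same function with the parameter accidentally renamed
    return {"path": path, "comment": comment, "skip_footer": skip_footer}


# Existing call sites keep working against the original name
ok = read_current("tmp.xlsx", skipfooter=1)

# The same call fails once the parameter has a different name
try:
    read_renamed("tmp.xlsx", skipfooter=1)
    error = None
except TypeError as exc:
    error = str(exc)
```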

jreback · 2017-12-21T17:37:49Z

doc/source/whatsnew/v0.22.0.txt

@@ -198,6 +198,7 @@ Other API Changes
 - Rearranged the order of keyword arguments in :func:`read_excel()` to align with :func:`read_csv()` (:issue:`16672`)
 - :func:`pandas.merge` now raises a ``ValueError`` when trying to merge on incompatible data types (:issue:`9780`)
 - :func:`wide_to_long` previously kept numeric-like suffixes as ``object`` dtype. Now they are cast to numeric if possible (:issue:`17627`)


rebase on master. can you move to 0.23 (docs were renamed), prob easiest to just check out this file from master and paste in the new one

jorisvandenbossche · 2017-12-22T10:18:24Z

pandas/io/excel.py

+>>> df_out = pd.DataFrame([('string1', 1),
+...                        ('string2', 2),
+...                        ('string3', 3)],
+...                       columns=('Name', 'Value'))


please use a list for the column names (both are OK, but a list is IMO more idiomatic, as tuples are also used for single labels of multi-indexed columns)

jorisvandenbossche · 2017-12-22T10:19:35Z

pandas/io/excel.py

@@ -137,7 +137,7 @@
 na_values : scalar, str, list-like, or dict, default None
    Additional strings to recognize as NA/NaN. If dict passed, specific
    per-column NA values. By default the following values are interpreted
-    as NaN: '""" + fill("', '".join(sorted(_NA_VALUES)), 70) + """'.
+    as NaN: '""" + fill("', '".join(sorted(_NA_VALUES)), 999) + """'.


why this change?

Ah, I see you mentioned this in the very first post of the PR. But, I don't think this is the fully correct solution, as this will make the line too long.
Apparently for the read_csv docstring this is working properly, so maybe check how they did it there?

I think that the long line is not a problem, as line breaks in the input doc string do not translate to line breaks in the compiled output. I have tested compiling the docs locally and the output looked fine for me. However, since you referred to the read_csv doc string, I had a look at that. It uses , subsequent_indent=" ", so I will change it here to keep things consistent.
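
For reference, a minimal sketch of the three wrapping variants discussed here, using only textwrap.fill from the standard library. The na_values set is a made-up subset rather than the real _NA_VALUES, and the four-space indent is an assumption about what the docstring needs.

```python
from textwrap import fill

# Made-up subset of NA markers, for illustration only
na_values = {"#N/A", "NA", "NULL", "NaN", "n/a", "nan", "null",
             "-NaN", "-nan", "1.#IND", "1.#QNAN"}

joined = "', '".join(sorted(na_values))

# Original behaviour: wrap at 70 columns; continuation lines start at column 0
wrapped = fill(joined, 70)

# First attempt in this PR: a very large width keeps everything on one line
one_line = fill(joined, 999)

# Approach borrowed from read_csv: wrap, but indent the continuation lines
indented = fill(joined, 70, subsequent_indent="    ")
```

With subsequent_indent the continuation lines stay aligned with the docstring body, which avoids both the overlong line and the unindented wrap.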

jorisvandenbossche · 2017-12-22T10:21:14Z

pandas/io/excel.py

+
+Index and header can be specified via the `index_col` and `header` arguments
+
+>>> pd.read_excel(open('tmp.xlsx','rb'), index_col=None, header=None)


Can you use here (and in the following ones as well) just the "'tmp.xlsx'" (without the open). It is fine to show this ability (above), but I would use the simpler case for the rest of the examples

jorisvandenbossche · 2017-12-22T10:22:20Z

pandas/io/excel.py

+as strings or lists of strings!
+
+>>> pd.read_excel(open('tmp.xlsx','rb'),
+...               true_values='2',


This has no effect on the example?

You're right! true_values is not working as I expected. Did I misunderstand the argument or is it broken?

pep8speaks · 2017-12-29T08:55:37Z

Hello @JanLauGe! Thanks for updating the PR.

Cheers ! There are no PEP8 issues in this Pull Request. 🍻

Comment last updated on December 30, 2017 at 12:36 Hours UTC

jreback · 2017-12-29T16:52:25Z

pandas/io/excel.py

@@ -132,12 +132,13 @@
 nrows : int, default None
    Number of rows to parse

-    .. versionadded:: 0.23.0
+    .. versionadded:: 0.22.0


revert this (0.22 is a special release)

jreback · 2017-12-29T16:52:32Z

pandas/io/excel.py

 skip_footer : int, default 0

-    .. deprecated:: 0.23.0
+    .. deprecated:: 0.22.0


jreback · 2017-12-30T12:36:37Z

pushed a fix for the lint issue. ping on green.

jreback · 2017-12-30T13:33:48Z

thanks @JanLauGe

jreback added the IO Excel read_excel, to_excel label Dec 13, 2017

jreback requested changes Dec 13, 2017

chris-b1 approved these changes Dec 13, 2017

jreback added the Docs label Dec 13, 2017

vertan mentioned this pull request Dec 14, 2017

new line char not showing up correctly in docs #18736

Closed

jreback approved these changes Dec 18, 2017

jreback added this to the 0.22.0 milestone Dec 18, 2017

chris-b1 reviewed Dec 21, 2017

jreback requested changes Dec 21, 2017

jorisvandenbossche reviewed Dec 22, 2017

JanLauGe added 9 commits December 29, 2017 07:40

DOC: read_excel - added examples and fixed formatting bug

1157d72

read_excel - added comment as named argument comment and test_comment…

53a61db

…_* tests

added whatsnew entry

d2f3123

modified tests as requested

cc8a5c2

changed order of arguments

5d4be77

trigger travisCI build

e7ca7e6

modified whatsnew entry

128e148

rebase on master

c56da89

DOC: read_excel doc - fixed formatting and added examples

fda8fa2

JanLauGe force-pushed the read_excel_doc branch from 610073a to fda8fa2 Compare December 29, 2017 08:55

Merge branch 'master' into read_excel_doc

74ca2d1

jreback requested changes Dec 29, 2017

JanLauGe and others added 4 commits December 30, 2017 02:35

DOC: read_excel doc - fixed formatting and added examples

1096655

DOC: read_excel doc - fixed formatting and added examples

642910b

Merge branch 'master' into PR_TOOL_MERGE_PR_18753

4a930ef

lint

6afed06

jreback approved these changes Dec 30, 2017

jreback merged commit e92d788 into pandas-dev:master Dec 30, 2017

hexgnu pushed a commit to hexgnu/pandas that referenced this pull request Jan 1, 2018

DOC: read_excel doc - fixed formatting and added examples (pandas-dev…

21b154c

…#18753)

