DOC: Improved the docstring of errors.ParserWarning #20076

joaoavf · 2018-03-09T14:17:11Z

Checklist for the pandas documentation sprint (ignore this if you are doing
an unrelated PR):

- PR title is "DOC: update the docstring"
- The validation script passes: scripts/validate_docstrings.py <your-function-or-method>
- The PEP8 style check passes: git diff upstream/master -u -- "*.py" | flake8 --diff
- The html version looks good: python doc/make.py --single <your-function-or-method>
- It has been proofread on language by another sprint participant

Please include the output of the validation script below between the "```" ticks:

################################################################################
################### Docstring (pandas.errors.ParserWarning)  ###################
################################################################################

Warning raised when reading a file that doesn't use the default parser.

Thrown by `pd.read_csv` and `pd.read_table` when it is necessary to
change parsers, generally from 'c' to 'python'.

It happens due to lack of support or functionality for parsing
particular attributes of a CSV file with the requested engine.

Currently, C-unsupported options include the following parameters:

1. `sep` other than a single character (e.g. regex separators)
2. `skipfooter` higher than 0
3. `sep=None` with `delim_whitespace=False`

The warning can be avoided by adding `engine='python'` as a parameter
in `pd.read_csv` and `pd.read_table` methods.

See Also
--------
pd.read_csv : Read CSV (comma-separated) file into DataFrame.
pd.read_table : Read general delimited file into DataFrame.

Examples
--------
Using a `sep` in `pd.read_csv` other than a single character:

>>> import io
>>> csv = u'''a;b;c
...           1;1,8
...           1;2,1'''
>>> df = pd.read_csv(io.StringIO(csv), sep='[;,]')
Traceback (most recent call last):
...
ParserWarning: Falling back to the 'python' engine...

Adding `engine='python'` to `pd.read_csv` removes the Warning:

>>> df = pd.read_csv(io.StringIO(csv), sep='[;,]', engine='python')
scripts/validate_docstrings.py:1: ParserWarning: Falling back to the 'python' engine because the 'c' engine does not support regex separators (separators > 1 char and different from '\s+' are interpreted as regex); you can avoid this warning by specifying engine='python'.
  #!/usr/bin/env python

################################################################################
################################## Validation ##################################
################################################################################

Errors found:
        No returns section found
        Examples do not pass tests

################################################################################
################################### Doctests ###################################
################################################################################

**********************************************************************
Line 32, in pandas.errors.ParserWarning
Failed example:
    df = pd.read_csv(io.StringIO(csv), sep='[;,]')
Expected:
    Traceback (most recent call last):
    ...
    ParserWarning: Falling back to the 'python' engine...
Got nothing

I am documenting a Warning and I could not find a better way to display the warning in the html example other than using a "Traceback (most recent call last):" followed by "ParserWarning: Falling back to the 'python' engine..." in the docstring.

The script also reports the error "No returns section found". As far as I understand, this is not relevant to the docstring at hand.
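
For reference, the behaviour described in the docstring can be checked outside of a doctest by recording warnings explicitly. The following is a minimal sketch and not part of the PR: the helper name `parser_warnings` and the `CSV_TEXT` sample are made up for illustration, and it assumes a pandas version in which the default 'c' engine still falls back to 'python' for regex separators.

```python
import io
import warnings

import pandas as pd
from pandas.errors import ParserWarning

# Illustrative sample data: fields are separated by either ';' or ','.
CSV_TEXT = "a;b;c\n1;1,8\n1;2,1\n"


def parser_warnings(**read_csv_kwargs):
    """Return the ParserWarning messages emitted while reading CSV_TEXT.

    Hypothetical helper: it forwards its keyword arguments to pd.read_csv
    and records every warning raised during the call.
    """
    with warnings.catch_warnings(record=True) as caught:
        # Make sure no earlier filter hides the warning.
        warnings.simplefilter("always")
        pd.read_csv(io.StringIO(CSV_TEXT), **read_csv_kwargs)
    return [
        str(w.message) for w in caught if issubclass(w.category, ParserWarning)
    ]


# Regex separator with the default engine: a fallback warning is expected.
default_engine_messages = parser_warnings(sep="[;,]")

# Same call with engine='python' requested explicitly: no fallback is needed.
python_engine_messages = parser_warnings(sep="[;,]", engine="python")
```

Under those assumptions, `default_engine_messages` should hold the "Falling back to the 'python' engine" message and `python_engine_messages` should be empty.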

pep8speaks · 2018-03-09T14:17:13Z

Hello @joaoavf! Thanks for updating the PR.

Cheers ! There are no PEP8 issues in this Pull Request. 🍻

Comment last updated on March 15, 2018 at 19:28 Hours UTC

datapythonista

Looks great, added a couple of comments.

datapythonista · 2018-03-09T14:53:50Z

pandas/errors/__init__.py

+    >>> df = pd.read_csv(io.StringIO(csv), sep='[;,]')
+    Traceback (most recent call last):
+    ...
+    ParserWarning: Falling back to the 'python' engine...


Did you check why the validation says that this test didn't pass, and that `read_csv` returned nothing?

When I ran the code in my console, this warning was displayed: "ParserWarning: Falling back to the 'python' engine..."

I thought it might have something to do with it being a warning and not an error: the output generated by an error can be matched by the "Traceback" form, but the output of a warning cannot.

Any ideas on how to fix and approach this?
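
One possible explanation for the "Got nothing" result, sketched here with the standard library only: doctest matches the "Traceback" form against raised exceptions, and a warning is not an exception unless a filter escalates it. In this sketch `read_with_fallback` is a hypothetical stand-in for the `read_csv` call, and the built-in `UserWarning` stands in for `ParserWarning`; it is an illustration, not the approach the PR ended up using.

```python
import doctest
import warnings


def read_with_fallback():
    """Hypothetical stand-in for read_csv: warn, then return normally."""
    warnings.warn("Falling back to the 'python' engine", UserWarning)
    return 42


# Same shape as the failing example: the expected output is a traceback.
DOCSTRING = """
>>> value = read_with_fallback()
Traceback (most recent call last):
...
UserWarning: Falling back to the 'python' engine
"""


def count_failures(docstring, escalate):
    """Run the docstring as a doctest and return the number of failed examples.

    With escalate=False the warning is silenced, so the call simply returns.
    With escalate=True the warning is turned into a raised exception.
    """
    globs = {"read_with_fallback": read_with_fallback}
    test = doctest.DocTestParser().get_doctest(docstring, globs, "demo", None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    with warnings.catch_warnings():
        warnings.simplefilter("error" if escalate else "ignore", UserWarning)
        # Discard the failure report; only the counts are of interest here.
        results = runner.run(test, out=lambda text: None)
    return results.failed


failures_as_warning = count_failures(DOCSTRING, escalate=False)
failures_as_error = count_failures(DOCSTRING, escalate=True)
```

When the warning is not escalated, the example produces no exception and no output, so the traceback expectation cannot match. Escalating it with an "error" filter is one way to make the traceback form pass, at the cost of the docstring no longer showing what a user sees by default.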

datapythonista · 2018-03-09T14:56:33Z

pandas/errors/__init__.py

-    parsing particular attributes of a CSV file with the requested engine.
+    Warning raised in `pd.read_csv` and `pd.read_table` when it is
+    necessary to change parsers, generally from 'c' to 'python'.
+


The first line needs to fit in a single line. Can you write something more concise, please? This paragraph is really useful, and it surely needs to be in the description, but the first line is used in some summaries that should be shorter. Something like "Warning raised when reading a table does not use the default parser". Not sure if it's accurate or fits in one line, but it should give you an idea.

Great Mark! Thanks for the suggestion. I have already committed my version of it.

datapythonista · 2018-03-09T14:57:28Z

pandas/errors/__init__.py

+
+    The warning can be avoided by adding `engine='python'` as a parameter
+    in `pd.read_csv` and `pd.read_table` methods.
+


I think read_csv and read_table are good candidates for a See Also section, as you're mentioning them.

Added a See Also section with read_csv and read_table.

codecov · 2018-03-09T16:34:19Z

Codecov Report

Merging #20076 into master will increase coverage by <.01%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master   #20076      +/-   ##
==========================================
+ Coverage    91.7%    91.7%   +<.01%     
==========================================
  Files         150      150              
  Lines       49122    49152      +30     
==========================================
+ Hits        45045    45074      +29     
- Misses       4077     4078       +1

| Flag | Coverage Δ | |
|---|---|---|
| #multiple | `90.08% <ø> (ø)` | ⬆️ |
| #single | `41.84% <ø> (-0.02%)` | ⬇️ |

| Impacted Files | Coverage Δ | |
|---|---|---|
| pandas/errors/__init__.py | `92.3% <ø> (ø)` | ⬆️ |
| pandas/core/base.py | `96.78% <0%> (-0.02%)` | ⬇️ |
| pandas/core/indexes/datetimes.py | `95.64% <0%> (-0.01%)` | ⬇️ |
| pandas/core/series.py | `93.85% <0%> (-0.01%)` | ⬇️ |
| pandas/core/groupby.py | `92.14% <0%> (-0.01%)` | ⬇️ |
| pandas/core/indexes/base.py | `96.66% <0%> (-0.01%)` | ⬇️ |
| pandas/core/generic.py | `95.84% <0%> (ø)` | ⬆️ |
| pandas/core/indexes/multi.py | `95.06% <0%> (ø)` | ⬆️ |
| pandas/core/strings.py | `98.32% <0%> (ø)` | ⬆️ |
| pandas/core/indexes/timedeltas.py | `91.03% <0%> (ø)` | ⬆️ |
... and 3 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 731d971...17687b5. Read the comment docs.

jreback · 2018-03-10T12:09:12Z

pandas/errors/__init__.py

-    parsing particular attributes of a CSV file with the requested engine.
+    Warning raised when reading a file that doesn't use the default parser.
+
+    Thrown by `pd.read_csv` and `pd.read_table` when it is necessary to


Thrown -> Raised

Jeff, I made all the changes you requested. Thank you for the feedback.

jreback · 2018-03-10T12:09:27Z

pandas/errors/__init__.py

-    to change parsers (generally from 'c' to 'python') contrary to the
-    one specified by the user due to lack of support or functionality for
-    parsing particular attributes of a CSV file with the requested engine.
+    Warning raised when reading a file that doesn't use the default parser.


say default is the c parser

jreback · 2018-03-10T12:10:21Z

pandas/errors/__init__.py

+    Thrown by `pd.read_csv` and `pd.read_table` when it is necessary to
+    change parsers, generally from 'c' to 'python'.
+
+    It happens due to lack of support or functionality for parsing


due to a lack

for parsing a particular attribute

jreback · 2018-03-10T12:10:35Z

pandas/errors/__init__.py

+    It happens due to lack of support or functionality for parsing
+    particular attributes of a CSV file with the requested engine.
+
+    Currently, C-unsupported options include the following parameters:


'c' unsupported options

[ci skip]

TomAugspurger · 2018-03-15T19:29:17Z

#20309 (comment) for doctesting warnings. Thanks @joaoavf

DOC: Improved the docstring of errors.ParserWarning

abdc710

Correcting whitespace issue on line 58

0607602

jorisvandenbossche added the Docs label Mar 9, 2018

datapythonista reviewed Mar 9, 2018

Converting the short summary to one line.

850a3fe

joaoavf added 2 commits March 9, 2018 16:41

minor formatting changes

815bf0e

Added a 'See Also' section

b12907e

jreback requested changes Mar 10, 2018

jreback added the Error Reporting (incorrect or improved errors from pandas) and IO CSV (read_csv, to_csv) labels Mar 10, 2018

joaoavf added 2 commits March 10, 2018 13:36

Making changes request by jreback and minor formatting changes

8fe1c04

Adding the default 'c' parser to the short summary

37c9f96

jreback added this to the 0.23.0 milestone Mar 10, 2018

jreback approved these changes Mar 10, 2018

Pass doctests [ci skip]

17687b5

[ci skip]

TomAugspurger merged commit 30e0006 into pandas-dev:master Mar 15, 2018
