-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
DOC: Fix flake8 problems in io.rst #23855
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report
@@ Coverage Diff @@
## master #23855 +/- ##
=======================================
Coverage 42.46% 42.46%
=======================================
Files 161 161
Lines 51557 51557
=======================================
Hits 21892 21892
Misses 29665 29665
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm, but there are couple of unrelated things, that if you can fix here, that would be great.
doc/source/io.rst
Outdated
import pandas as pd | ||
import pandas.util.testing as tm | ||
from pandas.compat import StringIO, BytesIO | ||
ExcelWriter = pd.ExcelWriter |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's somehow unrelated to your changes, but if you don't mind removing this line, and replacing ExcelWriter
by pd.ExcelWriter
wherever is used, that would be great. I think reassigning this just make things difficult for the users to understand.
And if import pandas.util.testing as tm
is used just in one code block or section, would be nice to move it there, as it may not be obvious for users to know what tm
is without the import. If it's used in many places leave it here, and we'll see what we can do later on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couple of comments
doc/source/io.rst
Outdated
data = '# empty\n# second empty line\n# third empty' \ | ||
'line\nX,Y,Z\n1,2,3\nA,B,C\n1,2.,4.\n5.,NaN,10.0' | ||
data = """# empty\n# second empty line\n# third empty \ | ||
line\nX,Y,Z\n1,2,3\nA,B,C\n1,2.,4.\n5.,NaN,10.0""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
those lines are not equivalent
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am unable to resolve an error here. On my local build the the second line is not appended to the variable data. I am quoting the code here. Need your help.
`
In [68]: data = '# empty\n# second empty line\n# third empty' \
In [69]: 'line\nX,Y,Z\n1,2,3\nA,B,C\n1,2.,4.\n5.,NaN,10.0'
Out[69]: 'line\nX,Y,Z\n1,2,3\nA,B,C\n1,2.,4.\n5.,NaN,10.0'
In [70]: print(data)
# empty
# second empty line
# third empty
In [71]: pd.read_csv(StringIO(data), comment='#', skiprows=4, header=1)
EmptyDataError Traceback (most recent call last)
in
----> 1 pd.read_csv(StringIO(data), comment='#', skiprows=4, header=1)
`
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that if I copy paste these lines to a ipython terminal. It works well. I tried combination of spaces etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@datapythonista - Can you please confirm the above? Otherwise everything is ready for a commit and push.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I missed that.
I didn't quite understand what's the problem, but checking it again, I think much better ways to write that are:
data = '\n'.join(['# empty',
'# second empty line',
'# third empty',
'line',
'X,Y,Z',
'1,2,3',
'A,B,C',
'1,2.,4.',
'5.,NaN,10.0'])
or
data = '''# empty
# second empty line
# third empty
line
X,Y,Z
1,2,3
A,B,C
1,2.,4.
5.,NaN,10.0'''
which makes possible to read the content
doc/source/io.rst
Outdated
import numpy as np | ||
np.random.seed(123456) | ||
import pandas as pd | ||
import pandas.io.date_converters as conv |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be left where it was. Any reason to move it here?
or even better, I'd use pd..io.date_converters...
, no extra import is needed, and the code is more explicit
doc/source/io.rst
Outdated
pd.options.display.max_rows = 15 | ||
clipdf = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': ['p', 'q', 'r']}, | ||
index=['x', 'y', 'z']) | ||
|
||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are those needed for the validation? I'd say flake8-rst should fail with 3 blank lines
doc/source/io.rst
Outdated
@@ -718,7 +715,8 @@ result in byte strings being decoded to unicode in the result: | |||
|
|||
.. ipython:: python | |||
|
|||
data = b'word,length\nTr\xc3\xa4umen,7\nGr\xc3\xbc\xc3\x9fe,5'.decode('utf8').encode('latin-1') | |||
data = (b'word,length\nTr\xc3\xa4umen,7\nGr\xc3\xbc\xc3\x9fe,5'. | |||
decode('utf8').encode('latin-1')) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it'll be more readable as:
data = b'...'
data = data.decode(...
doc/source/io.rst
Outdated
@@ -2186,6 +2179,7 @@ A few notes on the generated table schema: | |||
|
|||
.. ipython:: python | |||
|
|||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this new blank line is correct
doc/source/io.rst
Outdated
|
||
.. ipython:: python | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same, blank line not needed
@saurav2608 in general try to not open the pull requests until you're done. You can push to your remote branch, but if you open a PR, someone will review it, and it's a waste of time if your changes are not finished (unless you need a review to make sure what you're doing is correct...) |
@datapythonista - I will keep this in mind henceforth. I will edit the .rst file over the weekend. |
@saurav2608 do you have time to address the comments? |
I got swamped by some other work. I can take this up on Wednesday. |
HI @datapythonista - travis fails with the below error.
I believe you have fixed this in #23975 . Is there a way to re-initiate the test? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I rerun travis. But reviewing this again, I saw that we're using different ways of creating multiline strings. If you don't mind, would be great to standardize them
doc/source/io.rst
Outdated
'line\nX,Y,Z\n1,2,3\nA,B,C\n1,2.,4.\n5.,NaN,10.0' | ||
print(data) | ||
pd.read_csv(StringIO(data), comment='#', skiprows=4, header=1) | ||
data = '\n'.join(['# empty', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry I didn't see it before, but looks like mostly everywhere on this file, what is being used is:
data = ('#empty\n'
'#second empty line'
...
I think it would be nice to keep the same for consistency. Or was this the format that was failing in this case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes it does. I wonder what I was thinking here. :(
doc/source/io.rst
Outdated
@@ -718,7 +724,8 @@ result in byte strings being decoded to unicode in the result: | |||
|
|||
.. ipython:: python | |||
|
|||
data = b'word,length\nTr\xc3\xa4umen,7\nGr\xc3\xbc\xc3\x9fe,5'.decode('utf8').encode('latin-1') | |||
data = b'word,length\nTr\xc3\xa4umen,7\nGr\xc3\xbc\xc3\x9fe,5' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you mind breaking this in lines in the way previously suggested?
doc/source/io.rst
Outdated
@@ -992,7 +998,7 @@ DD/MM/YYYY instead. For convenience, a ``dayfirst`` keyword is provided: | |||
|
|||
data = "date,value,cat\n1/6/2000,5,a\n2/6/2000,10,b\n3/6/2000,15,c" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here
doc/source/io.rst
Outdated
@@ -1166,7 +1175,7 @@ options as follows: | |||
|
|||
.. ipython:: python | |||
|
|||
data= 'a,b,c\n1,Yes,2\n3,No,4' | |||
data = 'a,b,c\n1,Yes,2\n3,No,4' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any reason to not split this in several lines as in the rest?
can you rebase |
@jreback : Unfortunately, I am not able to. I messed up at some point and now rebase and squash generates conflict errors that I am not resolve it with my knowledge of git. I am open to ideas on how to squash the commits. In the worst case, I can delete the branch and start again. |
just do a |
I am not sure if this worked. |
It did not. Somehow an old version of the file was pushed. |
Can you take a look at the first section of: https://datapythonista.github.io/blog/useful-git-commands.html |
@datapythonista - I followed your post. But if I try to push I get below error. Should I pull one before pushing? |
This is done I think. I forced a push on my forked repo. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm, thanks @saurav2608
@saurav2608 can you do in this PR branch: I'm not sure why the CI is failing, I think it shouldn't, but hopefully updating the branch will fix it. |
Hello @saurav2608! Thanks for updating the PR.
|
@saurav2608 looks like the PR contains unrelated changes again. Not sure what is causing this, I don't think the command I wrote causes it. But in any case, can you repeat what you did before to fix the history and leave only your changes in the PR please. |
I redid the PR. this look okay now. Please review once and let me know. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks @saurav2608, just couple of small things if you don't mind
doc/source/io.rst
Outdated
@@ -1166,7 +1175,7 @@ options as follows: | |||
|
|||
.. ipython:: python | |||
|
|||
data= 'a,b,c\n1,Yes,2\n3,No,4' | |||
data = 'a,b,c\n1,Yes,2\n3,No,4' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any reason to not split this in several lines as in the rest?
doc/source/io.rst
Outdated
@@ -2186,6 +2237,7 @@ A few notes on the generated table schema: | |||
|
|||
.. ipython:: python | |||
|
|||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this is giving an error, but that will be fixed in flake8-rst, as the error is wrong. So just one blank line please.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
thanks a lot for fixing all these @saurav2608
@datapythonista - no problem. It took long for me to fix this. Thank for your guidance. There are still some issues to resolve. For example suppressing |
No worries @saurav2608, we all do what we can, and it also takes time for us to review. I don't think we want to fix the And don't worry much if not everything is fixed. There are so many things to fix at this point, that any fixes are very useful. Once most of the stuff is fixed, then we'll check more in detail the outstanding problems. If you're looking for other things to contribute, it's probably better to wait until #23847 is merged to work on the rest of the One related issue that can be address and would be very useful to have implemented soon would be #23952. |
@saurav2608 can you rebase and remove io.rst from the setup.cfg |
@jreback if you want to merge this, I'll open a PR with the fixed files, and I'll add this one too. |
thanks @saurav2608 @datapythonista go for it |
Opened #24051, will ping you when is green. |
git diff upstream/master -u -- "*.py" | flake8 --diff
Pushing this code to check if this clears CI. Made a few changes towards issue #23791