Skip to content

DOC: Fix flake8 problems in io.rst #23855

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Dec 2, 2018
Merged

Conversation

saurav2608
Copy link

@saurav2608 saurav2608 commented Nov 22, 2018

Pushing this code to check if this clears CI. Made a few changes towards issue #23791

@codecov
Copy link

codecov bot commented Nov 22, 2018

Codecov Report

Merging #23855 into master will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master   #23855   +/-   ##
=======================================
  Coverage   42.46%   42.46%           
=======================================
  Files         161      161           
  Lines       51557    51557           
=======================================
  Hits        21892    21892           
  Misses      29665    29665
Flag Coverage Δ
#single 42.46% <ø> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b45eb26...d8b555a. Read the comment docs.

Copy link
Member

@datapythonista datapythonista left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, but there are couple of unrelated things, that if you can fix here, that would be great.

import pandas as pd
import pandas.util.testing as tm
from pandas.compat import StringIO, BytesIO
ExcelWriter = pd.ExcelWriter
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's somehow unrelated to your changes, but if you don't mind removing this line, and replacing ExcelWriter by pd.ExcelWriter wherever is used, that would be great. I think reassigning this just make things difficult for the users to understand.

And if import pandas.util.testing as tm is used just in one code block or section, would be nice to move it there, as it may not be obvious for users to know what tm is without the import. If it's used in many places leave it here, and we'll see what we can do later on.

Copy link
Member

@datapythonista datapythonista left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple of comments

data = '# empty\n# second empty line\n# third empty' \
'line\nX,Y,Z\n1,2,3\nA,B,C\n1,2.,4.\n5.,NaN,10.0'
data = """# empty\n# second empty line\n# third empty \
line\nX,Y,Z\n1,2,3\nA,B,C\n1,2.,4.\n5.,NaN,10.0"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

those lines are not equivalent

Copy link
Author

@saurav2608 saurav2608 Nov 27, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am unable to resolve an error here. On my local build the the second line is not appended to the variable data. I am quoting the code here. Need your help.
`
In [68]: data = '# empty\n# second empty line\n# third empty' \

In [69]: 'line\nX,Y,Z\n1,2,3\nA,B,C\n1,2.,4.\n5.,NaN,10.0'
Out[69]: 'line\nX,Y,Z\n1,2,3\nA,B,C\n1,2.,4.\n5.,NaN,10.0'

In [70]: print(data)
# empty
# second empty line
# third empty

In [71]: pd.read_csv(StringIO(data), comment='#', skiprows=4, header=1)

EmptyDataError Traceback (most recent call last)
in
----> 1 pd.read_csv(StringIO(data), comment='#', skiprows=4, header=1)
`

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that if I copy paste these lines to a ipython terminal. It works well. I tried combination of spaces etc.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@datapythonista - Can you please confirm the above? Otherwise everything is ready for a commit and push.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I missed that.

I didn't quite understand what's the problem, but checking it again, I think much better ways to write that are:

data = '\n'.join(['# empty',
                  '# second empty line',
                  '# third empty',
                  'line',
                  'X,Y,Z',
                  '1,2,3',
                  'A,B,C',
                  '1,2.,4.',
                  '5.,NaN,10.0'])

or

data = '''# empty
# second empty line
# third empty
line
X,Y,Z
1,2,3
A,B,C
1,2.,4.
5.,NaN,10.0'''

which makes possible to read the content

import numpy as np
np.random.seed(123456)
import pandas as pd
import pandas.io.date_converters as conv
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be left where it was. Any reason to move it here?

or even better, I'd use pd..io.date_converters..., no extra import is needed, and the code is more explicit

pd.options.display.max_rows = 15
clipdf = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': ['p', 'q', 'r']},
index=['x', 'y', 'z'])



Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are those needed for the validation? I'd say flake8-rst should fail with 3 blank lines

@@ -718,7 +715,8 @@ result in byte strings being decoded to unicode in the result:

.. ipython:: python

data = b'word,length\nTr\xc3\xa4umen,7\nGr\xc3\xbc\xc3\x9fe,5'.decode('utf8').encode('latin-1')
data = (b'word,length\nTr\xc3\xa4umen,7\nGr\xc3\xbc\xc3\x9fe,5'.
decode('utf8').encode('latin-1'))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it'll be more readable as:

data = b'...'
data = data.decode(...

@@ -2186,6 +2179,7 @@ A few notes on the generated table schema:

.. ipython:: python


Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this new blank line is correct


.. ipython:: python


Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same, blank line not needed

@datapythonista datapythonista changed the title DOC: Reordered import to conform to PEP8 standards. DOC: Fix flake8 problems in io.rst Nov 22, 2018
@datapythonista
Copy link
Member

@saurav2608 in general try to not open the pull requests until you're done. You can push to your remote branch, but if you open a PR, someone will review it, and it's a waste of time if your changes are not finished (unless you need a review to make sure what you're doing is correct...)

@gfyoung gfyoung added Docs Code Style Code style, linting, code_checks labels Nov 22, 2018
@saurav2608
Copy link
Author

@saurav2608 in general try to not open the pull requests until you're done.

@datapythonista - I will keep this in mind henceforth. I will edit the .rst file over the weekend.

@datapythonista
Copy link
Member

@saurav2608 do you have time to address the comments?

@saurav2608
Copy link
Author

I got swamped by some other work. I can take this up on Wednesday.

@saurav2608
Copy link
Author

saurav2608 commented Nov 29, 2018

HI @datapythonista - travis fails with the below error.

[create env]
Solving environment: ...working... failed
ResolvePackageNotFound:

  • flake8-rst=0.4.2

I believe you have fixed this in #23975 . Is there a way to re-initiate the test?

Copy link
Member

@datapythonista datapythonista left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I rerun travis. But reviewing this again, I saw that we're using different ways of creating multiline strings. If you don't mind, would be great to standardize them

'line\nX,Y,Z\n1,2,3\nA,B,C\n1,2.,4.\n5.,NaN,10.0'
print(data)
pd.read_csv(StringIO(data), comment='#', skiprows=4, header=1)
data = '\n'.join(['# empty',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry I didn't see it before, but looks like mostly everywhere on this file, what is being used is:

data = ('#empty\n'
        '#second empty line'
...

I think it would be nice to keep the same for consistency. Or was this the format that was failing in this case?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it does. I wonder what I was thinking here. :(

@@ -718,7 +724,8 @@ result in byte strings being decoded to unicode in the result:

.. ipython:: python

data = b'word,length\nTr\xc3\xa4umen,7\nGr\xc3\xbc\xc3\x9fe,5'.decode('utf8').encode('latin-1')
data = b'word,length\nTr\xc3\xa4umen,7\nGr\xc3\xbc\xc3\x9fe,5'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you mind breaking this in lines in the way previously suggested?

@@ -992,7 +998,7 @@ DD/MM/YYYY instead. For convenience, a ``dayfirst`` keyword is provided:

data = "date,value,cat\n1/6/2000,5,a\n2/6/2000,10,b\n3/6/2000,15,c"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

@@ -1166,7 +1175,7 @@ options as follows:

.. ipython:: python

data= 'a,b,c\n1,Yes,2\n3,No,4'
data = 'a,b,c\n1,Yes,2\n3,No,4'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any reason to not split this in several lines as in the rest?

@jreback
Copy link
Contributor

jreback commented Nov 29, 2018

can you rebase

@saurav2608
Copy link
Author

saurav2608 commented Nov 30, 2018

can you rebase

@jreback : Unfortunately, I am not able to. I messed up at some point and now rebase and squash generates conflict errors that I am not resolve it with my knowledge of git.

I am open to ideas on how to squash the commits. In the worst case, I can delete the branch and start again.

@datapythonista
Copy link
Member

just do a git fetch upstream && git merge upstream/master edit the file with the conflict, keep the lines that arr ok, and leave the file how it has to be, do a git add of the file, commit and push

@saurav2608
Copy link
Author

I am not sure if this worked.

@saurav2608
Copy link
Author

I am not sure if this worked.

It did not. Somehow an old version of the file was pushed.

@datapythonista
Copy link
Member

Can you take a look at the first section of: https://datapythonista.github.io/blog/useful-git-commands.html

@saurav2608
Copy link
Author

@datapythonista - I followed your post. But if I try to push I get below error. Should I pull one before pushing?
! [rejected] io-doc -> io-doc (non-fast-forward) error: failed to push some refs to 'https://github.com/saurav2608/pandas.git' hint: Updates were rejected because the tip of your current branch is behind hint: its remote counterpart. Integrate the remote changes (e.g. hint: 'git pull ...') before pushing again. hint: See the 'Note about fast-forwards' in 'git push --help' for details.

@saurav2608
Copy link
Author

@datapythonista - I followed your post. But if I try to push I get below error. Should I pull one before pushing?
! [rejected] io-doc -> io-doc (non-fast-forward) error: failed to push some refs to 'https://github.com/saurav2608/pandas.git' hint: Updates were rejected because the tip of your current branch is behind hint: its remote counterpart. Integrate the remote changes (e.g. hint: 'git pull ...') before pushing again. hint: See the 'Note about fast-forwards' in 'git push --help' for details.

This is done I think. I forced a push on my forked repo.

Copy link
Member

@datapythonista datapythonista left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, thanks @saurav2608

@datapythonista
Copy link
Member

@saurav2608 can you do in this PR branch: git fetch upstream && git merge upstream/master && git push please

I'm not sure why the CI is failing, I think it shouldn't, but hopefully updating the branch will fix it.

@pep8speaks
Copy link

Hello @saurav2608! Thanks for updating the PR.

@datapythonista
Copy link
Member

@saurav2608 looks like the PR contains unrelated changes again. Not sure what is causing this, I don't think the command I wrote causes it. But in any case, can you repeat what you did before to fix the history and leave only your changes in the PR please.

@saurav2608
Copy link
Author

I redid the PR. this look okay now. Please review once and let me know.

Copy link
Member

@datapythonista datapythonista left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @saurav2608, just couple of small things if you don't mind

@@ -1166,7 +1175,7 @@ options as follows:

.. ipython:: python

data= 'a,b,c\n1,Yes,2\n3,No,4'
data = 'a,b,c\n1,Yes,2\n3,No,4'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any reason to not split this in several lines as in the rest?

@@ -2186,6 +2237,7 @@ A few notes on the generated table schema:

.. ipython:: python


Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this is giving an error, but that will be fixed in flake8-rst, as the error is wrong. So just one blank line please.

Copy link
Member

@datapythonista datapythonista left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

thanks a lot for fixing all these @saurav2608

@saurav2608
Copy link
Author

saurav2608 commented Dec 2, 2018

@datapythonista - no problem. It took long for me to fix this. Thank for your guidance. There are still some issues to resolve. For example suppressing E402 module level import not at top of file type errors where the import is deliberately kept at that location. I can work on those once these changes are merged.

@datapythonista
Copy link
Member

No worries @saurav2608, we all do what we can, and it also takes time for us to review.

I don't think we want to fix the E402, but make flake8-rst not report them as errors. I think it's better for users to have the imports in the block where they are used.

And don't worry much if not everything is fixed. There are so many things to fix at this point, that any fixes are very useful. Once most of the stuff is fixed, then we'll check more in detail the outstanding problems.

If you're looking for other things to contribute, it's probably better to wait until #23847 is merged to work on the rest of the .rst files, as it's a bit difficult to follow right now what is done, what's in progress, and what needs to be done. After that PR is merged, I'll create issues for the pages that need fixes, and it'll be easier.

One related issue that can be address and would be very useful to have implemented soon would be #23952.

@jreback
Copy link
Contributor

jreback commented Dec 2, 2018

@saurav2608 can you rebase and remove io.rst from the setup.cfg

@datapythonista
Copy link
Member

@jreback if you want to merge this, I'll open a PR with the fixed files, and I'll add this one too.

@jreback jreback added this to the 0.24.0 milestone Dec 2, 2018
@jreback jreback merged commit 6dd130a into pandas-dev:master Dec 2, 2018
@jreback
Copy link
Contributor

jreback commented Dec 2, 2018

thanks @saurav2608

@datapythonista go for it

@datapythonista
Copy link
Member

Opened #24051, will ping you when is green.

@saurav2608 saurav2608 deleted the io-doc branch December 3, 2018 02:52
Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019
Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Code Style Code style, linting, code_checks Docs
Projects
None yet
Development

Successfully merging this pull request may close these issues.

DOC: Fix format of io.rst
6 participants