Skip to content

DOC: add sep argument to read_clipboard signature #14537

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Nov 7, 2016
Merged

DOC: add sep argument to read_clipboard signature #14537

merged 2 commits into from
Nov 7, 2016

Conversation

keshavramaswamy
Copy link
Contributor

@keshavramaswamy keshavramaswamy commented Oct 29, 2016

@codecov-io
Copy link

codecov-io commented Oct 29, 2016

Current coverage is 85.26% (diff: 100%)

Merging #14537 into master will decrease coverage by <.01%

@@             master     #14537   diff @@
==========================================
  Files           140        140          
  Lines         50693      50672    -21   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
- Hits          43229      43208    -21   
  Misses         7464       7464          
  Partials          0          0          

Powered by Codecov. Last update e1cdc4b...18c30eb

Copy link
Contributor

@TomAugspurger TomAugspurger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could verify that we have tests for all the cases of sep and delim_whitespace, and maybe add a few if we don't?

"""
Read text from clipboard and pass to read_table. See read_table for the
full argument list

If unspecified, `sep` defaults to '\s+'
If unspecified, `sep` defaults to '\s+' and delim_whitespace defaults
Copy link
Contributor

@TomAugspurger TomAugspurger Oct 30, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Take a look at the NumPy docstring reference, which we try to follow. So this would be something like like

sep : str, optional
    A string or regex delimiter. Default `'\s+'` 
delim_whitespace : bool, optional
    Whether to .... Default False


return read_table(StringIO(text), **kwargs)
return read_table(StringIO(text), sep, delim_whitespace, **kwargs)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you make these keyword arguments (sep=sep, delim_whitespace=delim_whitespace)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@TomAugspurger Thanks for your review. Will push in the required changes.

@jorisvandenbossche jorisvandenbossche added Docs IO Data IO issues that don't fit into a more specific label labels Oct 30, 2016
@jorisvandenbossche
Copy link
Member

I would not list delim_whitespace specifically, as this is actually a equivalent for sep='\s+' (that is the reason it is checked for in the code, as it is an alternative way to specify the sep).

So would only list sep of the two, and explain clearly what the default sep values means (you can even say it is equivalent to delim_whitespace=True).

@keshavramaswamy
Copy link
Contributor Author

@jorisvandenbossche Thanks for your review. While that makes sense, as issue #13726 mentions, with both the arguments added, it'll be helpful for argument completion in interactive use. Thoughts?

@jreback jreback changed the title DOC:added arguments to read_clipboard signature DOC: added arguments to read_clipboard signature Oct 31, 2016
@jorisvandenbossche
Copy link
Member

with both the arguments added, it'll be helpful for argument completion in interactive use

The point I was trying to make is that in the current way they are specified, having both arguments is not useful: sep='\s+' is the default, and setting delim_whitespace to True is equivalent with this, so in practice you never need this.

Towards the users, I would maybe rather make delim_whitespace=True the default, as that is easier to understand what it means compared to sep='\s+'. But that is maybe a bit more complex to implement.
So to conclude, I would just document sep, and leave delim_whitespace out of the signature.

@keshavramaswamy
Copy link
Contributor Author

@jorisvandenbossche ah, got it. Will remove delim_whitespace from the signature.

@keshavramaswamy
Copy link
Contributor Author

@TomAugspurger Can this be merged?

@@ -71,6 +71,7 @@ def check_round_trip_frame(self, data_type, excel=None, sep=None):
def test_round_trip_frame_sep(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think you can add round-trip tests for slightly unusual delimiters instead of the standard white-space one? One is probably sufficient.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This actually just used read_csv under the hood, so I don't know to what extent this is needed? (apart from checking that the keyword is passed correctly)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, but if the implementation changes, we would like to make sure such behaviour does not break. In any case, couldn't hurt, right? 😄

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gfyoung not sure what you exactly mean by 'round-trip tests', I added a test for pipe-delimiter though.

Copy link
Member

@gfyoung gfyoung Nov 4, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's what the test method name is describing, which is why I put the comment under the name. What you did should be fine.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @gfyoung - got it.

Parameters
----------
sep : str, default '\s+'
A string or regex delimiter.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add here an explanation of what \s+ means ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added that! @jorisvandenbossche

@jorisvandenbossche jorisvandenbossche added this to the 0.20.0 milestone Nov 2, 2016
@keshavramaswamy
Copy link
Contributor Author

Looks like the build fails due to this? pypa/pip#4059

@jorisvandenbossche jorisvandenbossche changed the title DOC: added arguments to read_clipboard signature DOC: add sep argument to read_clipboard signature Nov 7, 2016
@jorisvandenbossche jorisvandenbossche merged commit 2e276fb into pandas-dev:master Nov 7, 2016
@jorisvandenbossche
Copy link
Member

@keshavramaswamy Thanks a lot! (I made a small adjustment before merging, and made the docstring a raw string, which is needed for sphinx to interpret the \ literally)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Docs IO Data IO issues that don't fit into a more specific label
Projects
None yet
Development

Successfully merging this pull request may close these issues.

DOC: give read_clipboard a signature
5 participants