DOC: add sep argument to read_clipboard signature #14537

Merged: 2 commits, Nov 7, 2016
20 changes: 12 additions & 8 deletions pandas/io/clipboard.py
@@ -3,12 +3,16 @@
 from pandas.compat import StringIO


-def read_clipboard(**kwargs):  # pragma: no cover
-    """
+def read_clipboard(sep='\s+', **kwargs):  # pragma: no cover
+    r"""
     Read text from clipboard and pass to read_table. See read_table for the
     full argument list

-    If unspecified, `sep` defaults to '\s+'
+    Parameters
+    ----------
+    sep : str, default '\s+'.
+        A string or regex delimiter. The default of '\s+' denotes
+        one or more whitespace characters.

     Returns
     -------
@@ -29,7 +33,7 @@ def read_clipboard(**kwargs):  # pragma: no cover
         except:
             pass

-    # Excel copies into clipboard with \t seperation
+    # Excel copies into clipboard with \t separation
     # inspect no more then the 10 first lines, if they
     # all contain an equal number (>0) of tabs, infer
     # that this came from excel and set 'sep' accordingly
@@ -43,12 +47,12 @@ def read_clipboard(**kwargs):  # pragma: no cover

     counts = set([x.lstrip().count('\t') for x in lines])
     if len(lines) > 1 and len(counts) == 1 and counts.pop() != 0:
-        kwargs['sep'] = '\t'
+        sep = '\t'

-    if kwargs.get('sep') is None and kwargs.get('delim_whitespace') is None:
-        kwargs['sep'] = '\s+'
+    if sep is None and kwargs.get('delim_whitespace') is None:
+        sep = '\s+'

-    return read_table(StringIO(text), **kwargs)
+    return read_table(StringIO(text), sep=sep, **kwargs)


def to_clipboard(obj, excel=None, sep=None, **kwargs): # pragma: no cover
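The separator handling in the hunks above can be sketched without a clipboard. This is a minimal illustration, not the pandas implementation: `infer_clipboard_sep` is a hypothetical helper, the line-splitting step is an assumption (the diff does not show how `lines` is built), and the `delim_whitespace` check is left out.

```python
def infer_clipboard_sep(text, sep=r'\s+'):
    """Hypothetical helper: return the separator the hunks above would pass on."""
    # Assumed step (not shown in the diff): inspect at most the first
    # 10 complete lines of the first 10000 characters.
    lines = text[:10000].split('\n')[:-1][:10]
    # Leading whitespace is stripped before counting tabs, as in the diff.
    counts = {line.lstrip().count('\t') for line in lines}
    # More than one line, all with the same non-zero tab count: assume Excel.
    if len(lines) > 1 and len(counts) == 1 and counts.pop() != 0:
        sep = '\t'
    # Mirrors the fallback in the diff; only reached when sep=None is passed.
    if sep is None:
        sep = r'\s+'
    return sep


excel_like = "a\tb\tc\n1\t2\t3\n4\t5\t6\n"
plain = "a b c\n1 2 3\n4 5 6\n"

print(repr(infer_clipboard_sep(excel_like)))  # '\t'
print(repr(infer_clipboard_sep(plain)))       # '\\s+'
```

As the sketch is written, a consistent tab count replaces whatever `sep` the caller supplied, which follows the order of the statements in the hunk.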
2 changes: 2 additions & 0 deletions pandas/io/tests/test_clipboard.py
@@ -71,6 +71,8 @@ def check_round_trip_frame(self, data_type, excel=None, sep=None):
     def test_round_trip_frame_sep(self):
Member

Do you think you can add round-trip tests for slightly unusual delimiters instead of the standard whitespace one? One is probably sufficient.

Member

This actually just uses read_csv under the hood, so I don't know to what extent this is needed (apart from checking that the keyword is passed correctly).

Member

True, but if the implementation changes, we would like to make sure such behaviour does not break. In any case, couldn't hurt, right? 😄

Contributor Author

@gfyoung not sure what exactly you mean by 'round-trip tests'; I added a test for the pipe delimiter though.

Member

gfyoung, Nov 4, 2016

That's what the test method name is describing, which is why I put the comment under the name. What you did should be fine.

Contributor Author

Thanks @gfyoung - got it.

         for dt in self.data_types:
             self.check_round_trip_frame(dt, sep=',')
+            self.check_round_trip_frame(dt, sep='\s+')
+            self.check_round_trip_frame(dt, sep='|')

     def test_round_trip_frame_string(self):
         for dt in self.data_types:
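The round-trip idea from the review thread (write with a delimiter, read back with the same delimiter, compare) can be sketched as follows. This is only an illustration under stated assumptions: it uses the standard-library `csv` module and an in-memory buffer as stand-ins for pandas and the system clipboard, and `round_trip` is a hypothetical helper, not the test suite's `check_round_trip_frame`. Regex separators such as `'\s+'` are not covered here, since `csv` only accepts single-character delimiters.

```python
import csv
import io


def round_trip(rows, sep):
    """Hypothetical helper: write rows using `sep`, then read them back."""
    buf = io.StringIO()  # stands in for the clipboard
    csv.writer(buf, delimiter=sep, lineterminator='\n').writerows(rows)
    buf.seek(0)
    return [row for row in csv.reader(buf, delimiter=sep)]


rows = [['a', 'b', 'c'], ['1', '2', '3'], ['x y', '4', '5,6']]

# The data should survive unchanged for both the usual and the pipe delimiter.
for sep in (',', '|'):
    assert round_trip(rows, sep) == rows
```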