Skip to content

BUG in clipboard (linux, python2) with unicode and separator (GH13747) #14599

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 26 commits into from

Conversation

aileronajay
Copy link
Contributor

@aileronajay aileronajay commented Nov 7, 2016

@sinhrks sinhrks added Bug IO Data IO issues that don't fit into a more specific label Unicode Unicode strings labels Nov 7, 2016
@pijucha
Copy link
Contributor

pijucha commented Nov 8, 2016

I believe this fix should also close #12529 (at least it worked for me a few months ago on linux and windows).

Good practice is to add some tests for a fix, for example as in pandas/io/tests/test_clipboard.py.

xref #12580

@aileronajay
Copy link
Contributor Author

aileronajay commented Nov 8, 2016

@pijucha , i thought we already have one unicode character test in test_clipboard.py (GH 13747) that used to fail earliers (in OSX) and it passes with this change, do i need to add more?

@pijucha
Copy link
Contributor

pijucha commented Nov 8, 2016

Ah, you're right. I forgot.

I played with some non-latin characters and also °C for the other issue - just to be sure. The existing test contains only 'en español' and I wondered whether it could still pass with locale set to iso8859-1 or the like. I don't remember now, maybe it's not an issue.

@aileronajay
Copy link
Contributor Author

@pijucha and none of the existing tests failed post this change so i think we can prolly merge it

@aileronajay
Copy link
Contributor Author

I ran this test case in local post changes and it succeeds

df = pd.DataFrame({'a':['µasd','Ωœ∑´'], 'b':['øπ∆˚¬','œ∑´®']})
df.to_clipboard(excel=None, sep =sep)
result = read_clipboard(sep = sep, index_col = 0)
tm.assert_frame_equal(df, result, check_dtype=False)

@jreback
Copy link
Contributor

jreback commented Nov 8, 2016

wouldn't hurt to add that as a test

@aileronajay
Copy link
Contributor Author

should i add this as a individual test in pandas/io/tests/test_clipboard.py itself?

@jreback
Copy link
Contributor

jreback commented Nov 8, 2016

yep

@codecov-io
Copy link

codecov-io commented Nov 8, 2016

Current coverage is 85.20% (diff: 30.09%)

Merging #14599 into master will decrease coverage by 0.08%

@@             master     #14599   diff @@
==========================================
  Files           140        143     +3   
  Lines         50706      50773    +67   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
+ Hits          43248      43262    +14   
- Misses         7458       7511    +53   
  Partials          0          0          

Powered by Codecov. Last update 45543ec...2aafb66

@aileronajay
Copy link
Contributor Author

aileronajay commented Nov 8, 2016

@jreback i have added the test case in the file

# unicode round trip test for GH 13747
def test_round_trip_frame_unicode(self):
sep = ','
df = pd.DataFrame({'a':['µasd','Ωœ∑´'], 'b':['øπ∆˚¬','œ∑´®']})
Copy link
Contributor

@pijucha pijucha Nov 8, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aileronajay
Just a thought: adding this

cls.data['utf8'] = pd.DataFrame({'a': ['µasd', 'Ωœ∑´'], 'b': ['øπ∆˚¬', 'œ∑´®']})

to the method setUpClass above would have the same (or better) effect as the separate test but the code would be neater.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pijucha that makes sense and i thought of doing that initially, but wouldn't that make the code similar to en espanol test that is already present. Though i can make change if that seems to be the way to go :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Placing it in setUpClass just allows to run all tests automatically. Whether it's similar to en espanol is another thing (Chinese or cyrillic characters might be less similar but it's probably ok).

@aileronajay
Copy link
Contributor Author

@pijucha I have refactored the test case to be in sync with the rest of the file

@jorisvandenbossche jorisvandenbossche changed the title changes for GH 13747 BUG in clipboard (linux, python2) with unicode and separator (GH13747) Nov 8, 2016
@aileronajay
Copy link
Contributor Author

@jreback @pijucha i can see that travis ci check has errored but i cant see any error message related to the changes that i have made. The logs just say that no output was received in last 10 minutes. Could it be an issue with the travis at the time?

@jreback
Copy link
Contributor

jreback commented Nov 8, 2016

that might just be a stalled build it happens

in any event i would ideally like to have a test that actually fails on linux (and os x) before this change (so that once this fix is applied the test passes)

iow this is not currently tested on linux (though it fails for me locally as well)

@jreback
Copy link
Contributor

jreback commented Nov 8, 2016

I think we just disabled the test in the linked issue - can u re enable

@aileronajay
Copy link
Contributor Author

@jreback if i remove the "@disabled" tag from the class (pandas/io/tests/test_clipboard.py) , will that be sufficient?

@jreback
Copy link
Contributor

jreback commented Nov 8, 2016

yes - ideally see if that fails w/o your change then passes with / that was out there a long time ago (so the test might not work well anyhow)

@aileronajay
Copy link
Contributor Author

@jreback , when i run all tests in my local machine, using the script, test.sh (at root directory), it does run these clipboard tests (that is how i came across this issue in the first place). So now i wonder if the disabled tag works at all, as the test.sh was able to call test_clipboard.py even with the disabled tag in place and now that i have removed the disabled tag, it would definitely execute the clipboard tests

@jreback
Copy link
Contributor

jreback commented Nov 8, 2016

what i mean is travis doesn't run the disabled tests by default
that's what i want to see is s test that fails there (create a new branch and just remove the disabled ) thrn push up to test

@TomAugspurger
Copy link
Contributor

FYI I think they will fail on travis. In this job which is under my pytest branch, I had messed up the disabled, so the clipboard tests got run and they failed.

@jreback
Copy link
Contributor

jreback commented Nov 8, 2016

@aileronajay ok looks like if u can get these to pass on your other PR would be great
closing this as was a test

@jreback jreback closed this Nov 8, 2016
@aileronajay
Copy link
Contributor Author

@jreback i have implemented the code review comments

@aileronajay
Copy link
Contributor Author

@jreback can you pls take a look

@jreback
Copy link
Contributor

jreback commented Nov 17, 2016

@jreback
Copy link
Contributor

jreback commented Nov 17, 2016

@aileronajay changes look fine. just need to nail down testing.

@aileronajay
Copy link
Contributor Author

@jreback the job that had failed earlier has this log message (pasted below). I think it was due to an intermittent network issue

$ /home/travis/miniconda/envs/pandas/bin/conda install -n pandas --file=ci/requirements-2.7.run
Traceback (most recent call last):
File "/home/travis/miniconda/lib/python2.7/site-packages/conda/exceptions.py", line 479, in conda_exception_handler
return_value = func(_args, *_kwargs)
File "/home/travis/miniconda/lib/python2.7/site-packages/conda/cli/main.py", line 145, in _main
exit_code = args.func(args, p)
File "/home/travis/miniconda/lib/python2.7/site-packages/conda/cli/main_install.py", line 80, in execute
install(args, parser, 'install')
File "/home/travis/miniconda/lib/python2.7/site-packages/conda/cli/install.py", line 420, in install
raise CondaRuntimeError('RuntimeError: %s' % e)
CondaRuntimeError: Runtime error: RuntimeError: Runtime error: Could not open u'/home/travis/miniconda/pkgs/glib-2.43.0-1.tar.bz2.part' for writing (HTTPSConnectionPool(host='repo.continuum.io', port=443): Read timed out.).

@jreback
Copy link
Contributor

jreback commented Nov 17, 2016

@aileronajay yes 1 job failed. however 3 of the bottom jobs failed: https://travis-ci.org/pandas-dev/pandas/builds/176283955. These must pass as well.

@aileronajay
Copy link
Contributor Author

@jreback the 3 jobs that have failed at the bottom, have one common error

ERROR: test_round_trip_frame (pandas.io.tests.test_clipboard.TestClipboard)

Traceback (most recent call last):
File "/home/travis/build/pandas-dev/pandas/pandas/io/tests/test_clipboard.py", line 86, in test_round_trip_frame
self.check_round_trip_frame(dt)
File "/home/travis/build/pandas-dev/pandas/pandas/io/tests/test_clipboard.py", line 67, in check_round_trip_frame
data.to_clipboard(excel=excel, sep=sep, encoding=encoding)
File "/home/travis/build/pandas-dev/pandas/pandas/core/generic.py", line 1238, in to_clipboard
clipboard.to_clipboard(self, excel=excel, sep=sep, **kwargs)
File "/home/travis/build/pandas-dev/pandas/pandas/io/clipboard.py", line 120, in to_clipboard
clipboard_set(objstr)
File "/home/travis/build/pandas-dev/pandas/pandas/util/clipboard/clipboards.py", line 125, in call
raise PyperclipException(EXCEPT_MSG)
pandas.util.clipboard.exceptions.PyperclipException:
Pyperclip could not find a copy/paste mechanism for your system.
For more information, please visit https://pyperclip.readthedocs.org

In the latest version of pyperclip, it decides how to get the clipboard using this query

def determine_clipboard():
# Determine the OS/platform and set
# the copy() and paste() functions accordingly.
if 'cygwin' in platform.system().lower():
# FIXME: pyperclip currently does not support Cygwin,
# see asweigart/pyperclip#55
pass
elif os.name == 'nt' or platform.system() == 'Windows':
return init_windows_clipboard()
if os.name == 'mac' or platform.system() == 'Darwin':
return init_osx_clipboard()
if HAS_DISPLAY:
# Determine which command/module is installed, if any.
try:
import gtk # check if gtk is installed
except ImportError:
pass
else:
return init_gtk_clipboard()

    try:
        import PyQt4  # check if PyQt4 is installed
    except ImportError:
        pass
    else:
        return init_qt_clipboard()

    if _executable_exists("xclip"):
        return init_xclip_clipboard()
    if _executable_exists("xsel"):
        return init_xsel_clipboard()
    if _executable_exists("klipper") and _executable_exists("qdbus"):
        return init_klipper_clipboard()

return init_no_clipboard()

where the HAS_DISPLAY is retrieved using HAS_DISPLAY = os.getenv("DISPLAY", False). So in a unix env if the HAS_DISPLAY resolved to false, we will get the exception that we are getting ( pandas.util.clipboard.exceptions.PyperclipException:
Pyperclip could not find a copy/paste mechanism for your system.). Is that possible that the travis configuration is different for the failing envs so that the HAS_DISPLAY resolves to false or none of gtk, xsel, xcopy and klipper are present

@jreback
Copy link
Contributor

jreback commented Nov 17, 2016

ok this is easy then. we don't test this on every system.

change the top portion

try:
    import pandas.util.clipboard  # noqa
except OSError:
    raise nose.SkipTest("no clipboard found")

to something that triggers the exception (and then catch the error, so we can skip the tests if its not installed)

e.g.

try:
    DataFrame({'A' : [1,2]}).to_clipboard()
   pd.read_clipboard()
except (that exception):
    raise nose.SkipTest("clipboard primitives not installed")

@@ -43,3 +43,4 @@ Bug Fixes


- Explicit check in ``to_stata`` and ``StataWriter`` for out-of-range values when writing doubles (:issue:`14618`)
- BUG in clipboard (linux, python2) with unicode and separator (:issue:`13747`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

list all of the isues this is expected to close

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

implemented this change now

@@ -115,3 +119,16 @@ def test_read_clipboard_infer_excel(self):
exp = pd.read_clipboard()

tm.assert_frame_equal(res, exp)

# test case for testing invalid encoding
def test_invalid_encoding(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

put comments inside the test
put the issue number as a comment

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

implemented the suggested change

@@ -115,3 +119,16 @@ def test_read_clipboard_infer_excel(self):
exp = pd.read_clipboard()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add tests for other issues that we mentioned (or attach the issue number to an existing tests as appropriate)

Copy link
Contributor Author

@aileronajay aileronajay Nov 17, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#14362 , says that that upgrading to latest should fix, we already have round trip tests to test this (that have now been enabled)
#12807, existing tests fail,( that now pass)
#12529, i have added unicode test to test this

@aileronajay
Copy link
Contributor Author

aileronajay commented Nov 18, 2016

@jreback after handling of the cases where clipboard primitives are absent, tests are not failing anymore. Also added additional bugs that are being fixed to whatsnew file and tagged them to tests as appropriate

@aileronajay
Copy link
Contributor Author

@jreback I have addressed the review comments given earlier and we dont have any failures related to this change

@jreback jreback closed this in 4a1a330 Nov 18, 2016
@jreback
Copy link
Contributor

jreback commented Nov 18, 2016

thanks @aileronajay really nice effort!

@aileronajay
Copy link
Contributor Author

Thanks @jreback

jorisvandenbossche pushed a commit to jorisvandenbossche/pandas that referenced this pull request Dec 14, 2016
vendered updated version of Pyperclip

closes pandas-dev#13747
closes pandas-dev#14362
closes pandas-dev#12807
closes pandas-dev#12529

Author: Ajay Saxena <[email protected]>
Author: Ajay Saxena <[email protected]>

Closes pandas-dev#14599 from aileronajay/master and squashes the following commits:

2aafb66 [Ajay Saxena] moved comment inside test and added github issue labels to test
b74fbc1 [Ajay Saxena] ignore lint test for pyperclip files
9db42d8 [Ajay Saxena] whatsnew conflict
1dca292 [Ajay Saxena] conflict resolution
98b61e8 [Ajay Saxena] merge conflict
cedb690 [Ajay Saxena] merge conflict in whats new file
7af95da [Ajay Saxena] merging lastest changes
ac8ae60 [Ajay Saxena] skip clipboard test if clipboard primitives are absent
b03ed56 [Ajay Saxena] changed whatsnew file
c0aafd7 [Ajay Saxena] Merge branch 'test_branch'
9946fb7 [Ajay Saxena] Merge branch 'master' of https://github.com/pandas-dev/pandas into test_branch
ed1375f [Ajay Saxena] Merge branch 'test_branch'
0665fd4 [Ajay Saxena] fixed linting and test case as per code review
d202fd0 [Ajay Saxena] added test for valid encoding, modified setup.py so that pandas/util/clipboard can be found
dd57ae3 [Ajay Saxena] code review changes and read clipboard invalid encoding test
71d58d0 [Ajay Saxena] testing encoding in kwargs to to_clipboard and test case for the same
02f87b0 [Ajay Saxena] removed duplicate files
825bbe2 [Ajay Saxena] all files related to pyperclip are under pandas.util.clipboard
c5a87d8 [Ajay Saxena] Merge branch 'test_branch' of https://github.com/aileronajay/pandas into test_branch
f708c2e [Ajay Saxena] Merge branch 'master' of https://github.com/aileronajay/pandas
d565b1f [Ajay Saxena] updated pyperclip to the latest version
14d94a0 [Ajay Saxena] changed the pandas util clipboard file to return unicode if the python version is 2, else str
66d8ebf [Ajay Saxena] removed the disabled tag for clipboard test so that we can check if they pass after this change
edb8553 [Ajay Saxena] refactored the new unicode test to be in sync with the rest of the file
c83d000 [Ajay Saxena] added test case for unicode round trip
fb922d6 [Ajay Saxena] changes for GH 13747

(cherry picked from commit 4a1a330)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug IO Data IO issues that don't fit into a more specific label Unicode Unicode strings
Projects
None yet
7 participants