BUG: to_clipboard raises an error with null values in data frame #43402

BabakAmini opened this issue Sep 4, 2021 · 13 comments
Labels
Bug, IO Data, Missing-data, Windows

Comments

@BabakAmini

BabakAmini commented Sep 4, 2021

With the latest version of pandas installed on Windows, running the following code raises an error:

import pandas as pd

df = pd.DataFrame(['\x00'])
df.to_clipboard(index=False, excel=False)

Here is the traceback:

ArgumentError Traceback (most recent call last)

in
1 df = pd.DataFrame(['\x00'])
----> 2 df.to_clipboard(index=False, excel=False)

e:\Anaconda\lib\site-packages\pandas\core\generic.py in to_clipboard(self, excel, sep, **kwargs)
3023 from pandas.io import clipboards
3024
-> 3025 clipboards.to_clipboard(self, excel=excel, sep=sep, **kwargs)
3026
3027 @final

e:\Anaconda\lib\site-packages\pandas\io\clipboards.py in to_clipboard(obj, excel, sep, **kwargs)
143 else:
144 objstr = str(obj)
--> 145 clipboard_set(objstr)

e:\Anaconda\lib\site-packages\pandas\io\clipboard\__init__.py in lazy_load_stub_copy(text)
631 global copy, paste
632 copy, paste = determine_clipboard()
--> 633 return copy(text)
634
635

e:\Anaconda\lib\site-packages\pandas\io\clipboard\__init__.py in copy_windows(text)
454 # the object must have been allocated using the
455 # function with the GMEM_MOVEABLE flag.
--> 456 count = wcslen(text) + 1
457 handle = safeGlobalAlloc(GMEM_MOVEABLE, count * sizeof(c_wchar))
458 locked_handle = safeGlobalLock(handle)

e:\Anaconda\lib\site-packages\pandas\io\clipboard\__init__.py in __call__(self, *args)
306
307 def __call__(self, *args):
--> 308 ret = self.f(*args)
309 if not ret and get_errno():
310 raise PyperclipWindowsException("Error calling " + self.f.__name__)

ArgumentError: argument 1: <class 'ValueError'>: embedded null character
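
The last frame shows ctypes rejecting the argument during conversion, before wcslen itself runs, because the text contains \x00. A minimal workaround sketch, assuming it is acceptable to drop NUL characters from whatever is copied; strip_nul is a hypothetical helper and not part of pandas:

```python
def strip_nul(text: str, replacement: str = "") -> str:
    """Return text with every NUL character replaced (hypothetical helper)."""
    return text.replace("\x00", replacement)


# Stand-in for the string that would be handed to the OS clipboard.
rendered = "iSensVarP\n\x00\n\x00"
cleaned = strip_nul(rendered)
assert "\x00" not in cleaned
```

With pandas, the same idea could be applied to the frame before copying, for example df.replace('\x00', '', regex=True).to_clipboard(index=False, excel=False). This has not been tested against the setup reported in this issue.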

@BabakAmini added the Bug and Needs Triage labels Sep 4, 2021
@mzeitlin11
Member

Thanks for reporting this @BabakAmini. Can you please post a reproducible example (https://matthewrocklin.com/blog/work/2018/02/28/minimal-bug-reports)?

Can you also please try the latest version (1.3.2)? Some recent fixes (such as #41109) might have resolved your issue.

@mzeitlin11 added the IO Data and Needs Info labels and removed the Needs Triage label Sep 5, 2021
@BabakAmini
Author

BabakAmini commented Sep 5, 2021

Thanks for the reply @mzeitlin11. I've upgraded to the latest version but still got the same error. Below is the code to reproduce it:

import json
import pandas as pd
import requests

r = requests.get("http://cdn.tsetmc.com/api/Trade/GetTradeHistory/23086515493897579/20210831/false")
js = json.loads(r.text.replace("'", "\""))
df = pd.json_normalize(js['tradeHistory'])
df.to_clipboard(index=False, excel=False)

The problem is with the column named iSensVarP. This column contains NULL (\x00) in all rows.
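
To check which fields carry the NUL character before building the frame, something like the following could be used. It is a sketch on made-up rows: fields_with_nul is a hypothetical helper, and apart from iSensVarP the field names and values are invented for illustration rather than taken from the real API response.

```python
from typing import Any, Dict, List, Set


def fields_with_nul(records: List[Dict[str, Any]]) -> Set[str]:
    """Return the keys whose string values contain a NUL character."""
    affected = set()
    for row in records:
        for key, value in row.items():
            if isinstance(value, str) and "\x00" in value:
                affected.add(key)
    return affected


# Made-up rows shaped like entries of js['tradeHistory']; values are illustrative only.
sample = [
    {"price": 100, "iSensVarP": "\x00"},
    {"price": 101, "iSensVarP": "\x00"},
]
print(fields_with_nul(sample))  # {'iSensVarP'}
```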

@mzeitlin11
Member

Thanks for checking! I can't reproduce this on macOS, so perhaps it is a Windows-specific issue? Would you mind simplifying the failing example so that it does not use requests or json? For example, something along the lines of

df = pd.DataFrame(['\x00'])
df.to_clipboard(index=False, excel=False)

would be a nicer reproducer (if that or something similar fails), to confirm that the presence of \x00 alone triggers the issue. But I can't check this since it doesn't fail for me :)

@BabakAmini
Author

I just copied your sample code and got the same error.

@mzeitlin11
Member

Thanks! Would you mind editing your issue body with this failing example and the traceback? Further investigations to fix welcome!

@mzeitlin11 added the Missing-data and Windows labels and removed the Needs Info label Sep 5, 2021
@mzeitlin11 added this to the Contributions Welcome milestone Sep 5, 2021
@Hossein1399

Hi! I'm using Windows 10 with Python 3.8 and PyCharm Professional 2020.3. My computer has an Intel Core i5-9600K CPU and 8 GB of RAM.
I ran the code above without any error and could print the data.

@BabakAmini
Author

@mzeitlin11 I asked a friend to run the code on Ubuntu and he didn't get any error. Also, as @Hossein1399 mentioned above, he ran the code successfully on Windows. So it seems that this issue isn't specific to the operating system. I hope someone can help me get rid of this error.

@mzeitlin11
Member

Might still be specific to different Windows OSes - see an issue like #38527 which only occurred for WSL 2.0

@Liam3851
Contributor

Might be Python-version related rather than OS-related? I can reproduce on Windows 10 with Python 3.7.11 and pandas 1.3.3, with identical traceback to OP.

@mzeitlin11
Member

Can anyone reproduce this with Python >= 3.8? pandas 1.4 will officially support only Python >= 3.8, so if this occurs only on 3.7 or below, there might not be anything to do here.

@BabakAmini
Author

BabakAmini commented Sep 17, 2021

@mzeitlin11 I'm using Python 3.8.8 and pandas 1.3.2, and I still get the same error.

INSTALLED VERSIONS

commit : 5f648bf
python : 3.8.8.final.0
python-bits : 64
OS : Windows
OS-release : 10
Version : 10.0.19041
machine : AMD64
processor : Intel64 Family 6 Model 142 Stepping 10, GenuineIntel
byteorder : little
LC_ALL : en_US.UTF-8
LANG : en_US.UTF-8
LOCALE : Persian_Iran.1256

pandas : 1.3.2
numpy : 1.20.1
pytz : 2021.1
dateutil : 2.8.1
pip : 21.0.1
setuptools : 52.0.0.post20210125
Cython : 0.29.23
pytest : 6.2.3
hypothesis : None
sphinx : 4.0.1
blosc : None
feather : None
xlsxwriter : 1.3.8
lxml.etree : 4.6.3
html5lib : 1.1
pymysql : None
psycopg2 : 2.8.6 (dt dec pq3 ext lo64)
jinja2 : 2.11.3
IPython : 7.22.0
pandas_datareader: None
bs4 : 4.9.3
bottleneck : 1.3.2
fsspec : 0.9.0
fastparquet : None
gcsfs : None
matplotlib : 3.3.4
numexpr : 2.7.3
odfpy : None
openpyxl : 3.0.7
pandas_gbq : None
pyarrow : None
pyxlsb : None
s3fs : None
scipy : 1.6.2
sqlalchemy : 1.4.7
tables : 3.6.1
tabulate : None
xarray : None
xlrd : 2.0.1
xlwt : 1.3.0
numba : 0.53.1

@Liam3851
Contributor

I'm not sure how @Hossein1399 got this to work; maybe PyCharm patches some C calls, or maybe the Windows builds differ. But I get the same error when calling wcslen on a string containing an embedded null using raw ctypes, without going through our to_clipboard function:

In [1]: import ctypes

In [2]: msvcrt = ctypes.CDLL('msvcrt')

In [3]: msvcrt.wcslen('\x00')
---------------------------------------------------------------------------
ArgumentError                             Traceback (most recent call last)
<ipython-input-3-9c3618f0051c> in <module>
----> 1 msvcrt.wcslen('\x00')

ArgumentError: argument 1: <class 'ValueError'>: embedded null character

Appears potentially related to https://bugs.python.org/issue13617 and the fix thereof.
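
For context, wcslen measures a C wide string, which ends at the first NUL, whereas a Python str stores its length separately and may contain NULs. The sketch below emulates that difference in pure Python. c_style_length is a hypothetical stand-in for illustration: it does not call the C function, and it ignores the fact that one Python character may occupy two wchar_t units on Windows.

```python
def c_style_length(text: str) -> int:
    """Count characters before the first NUL, as a NUL-terminated length would."""
    end = text.find("\x00")
    return len(text) if end == -1 else end


# Compare Python's length with the emulated NUL-terminated length.
for sample in ("abc", "ab\x00cd", "\x00"):
    print(repr(sample), len(sample), c_style_length(sample))
```

This mismatch is presumably why ctypes refuses to convert such strings implicitly instead of passing a silently truncated value to the C function.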

@BabakAmini
Author

I'm back with an update. I upgraded to Python 3.10.5. I no longer get that error, but nothing gets copied to the clipboard.

@mroeschke removed this from the Contributions Welcome milestone Oct 13, 2022
5 participants