OverflowError when loading uint64 from csv #11440
Comments
I just tried with 0.17 and the error is now "Kernel died, restarting", which in the plain interpreter translates into a segfault.
hi @teto,
Indeed, sorry: the CSV is generated from the pcap, my mistake. Here is the correct file:
thanks @teto !
All my TSV files use '|' as a separator. I only get this kind of crash with very long numbers (if you remove the 'dtype' parameter from the read_csv call, it loads just fine).
My bad, I forgot to say in my description that by default the dsn column is read as "object", but I want to plot it, which is why I added dtype=np.uint64. Thanks for your help :)
Can confirm that this happens in pandas 0.17. I tried this with your datafile:
>>> pd.read_csv("iperf.csv", sep="|", dtype={'dsn':np.float64}).head()
tcpstream ipsrc dss_length dss_dsn sport packetid ipdst dss_rawack \
0 NaN NaN NaN NaN NaN 1 NaN NaN
1 NaN NaN NaN NaN NaN 2 NaN NaN
2 NaN NaN NaN NaN NaN 3 NaN NaN
3 NaN NaN NaN NaN NaN 4 NaN NaN
4 NaN NaN NaN NaN NaN 5 NaN NaN
dsn time_delta dport tcpflags dss_ssn tcpseq dack subtype mptcpstream \
0 NaN 0.000000 NaN NaN NaN NaN NaN NaN NaN
1 NaN 0.000000 NaN NaN NaN NaN NaN NaN NaN
2 NaN 0.030046 NaN NaN NaN NaN NaN NaN NaN
3 NaN 0.000000 NaN NaN NaN NaN NaN NaN NaN
4 NaN 3.969954 NaN NaN NaN NaN NaN NaN NaN
datafin master reltime
0 NaN NaN 0.000000
1 NaN NaN 0.000000
2 NaN NaN 0.030046
3 NaN NaN 0.030046
4 NaN NaN 4.000000
and it seems to work, which would suggest that the np.uint64 is not liking the
Can you still make your plot when the datatype is float, or do you need np.uint64 for something in particular?
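On the float question: a float64 column loads, but float64 only carries 53 bits of integer precision, so large uint64 values may be silently rounded. A minimal pure-Python sketch of that effect (Python's float is the same IEEE 754 double as np.float64; the values and the helper name survives_float64 are made up for illustration and are not taken from iperf.csv):

```python
# Python's float is an IEEE 754 double, the same representation as np.float64.
# The values below are illustrative, not taken from the CSV in this issue.
UINT64_MAX = 2**64 - 1


def survives_float64(value: int) -> bool:
    """Return True if the integer is unchanged by a round trip through float."""
    return int(float(value)) == value


small = 2**53          # still exactly representable as a double
odd_large = 2**53 + 1  # needs 54 significant bits, so it gets rounded

print(survives_float64(small))       # True
print(survives_float64(odd_large))   # False
print(survives_float64(UINT64_MAX))  # False
```

So whether float is good enough for the plot depends on whether the low-order digits of the sequence numbers matter.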
@teto so the way to do this is to just pass
They are loaded as object by default, but then I can't plot the column, hence I force the conversion to uint64.
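One possible way around the crash, offered as a sketch and not as the approach the maintainers had in mind: read the column as text, then convert it explicitly after loading. The sample data below is made up (only the '|' separator and the dsn column name come from this thread), and the sketch assumes the column has no missing values:

```python
import io

import numpy as np
import pandas as pd

# Made-up sample in the same '|'-separated layout; not the real capture data.
SAMPLE = "packetid|dsn\n1|18446744073709551615\n2|9223372036854775808\n"

# Read the large column as text so the parser never tries to convert it.
df = pd.read_csv(io.StringIO(SAMPLE), sep="|", dtype={"dsn": str})

# Convert through Python ints (arbitrary precision), then build an explicit
# uint64 array. Assumption: no missing values in the column.
df["dsn"] = np.array([int(v) for v in df["dsn"]], dtype=np.uint64)

print(df["dsn"].dtype)
```

The column is then numeric and can be plotted; rows with missing dsn values would have to be dropped or filled before the conversion.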
can you show what the values actually are (in a small copy-pastable example)?
fyi this is a manifestation of #4471
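The thread does not include the traceback, so the following is only an assumption about the kind of failure involved: a value can be a valid uint64 and still overflow a signed 64-bit slot. A pure-Python sketch with a made-up value:

```python
# Illustrative value: larger than the int64 maximum, still a valid uint64.
value = 2**63

# Fits in 8 bytes when treated as unsigned.
unsigned_bytes = value.to_bytes(8, "little", signed=False)

# The same value does not fit in 8 bytes when treated as signed.
try:
    value.to_bytes(8, "little", signed=True)
    fits_signed = True
except OverflowError:
    fits_signed = False

print(len(unsigned_bytes), fits_signed)  # 8 False
```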
I guessed so, but the mentioned branch (https://github.com/jtratner/pandas/tree/GH4471_fix_uint64_maybe_convert_objects) looks out of tree: it was scheduled for 0.14 and is not yet merged upstream. Is that correct?
that's correct. it's still an open bug.
closing as dupe. thanks for the report.
Hi,
I am trying to plot a set of values that are all uint64 with pandas.
The csv file is available here https://transfer.sh/Sy5DS/iperf-client-linux-2rtrs-f30b30-f30b30-w140k-lia-run1.csv and this is the command I used:
My version was the one from Ubuntu 15.04's repo:
pd.version.version
Out[10]: '0.15.0'
I am trying to upgrade it, hoping that would solve this. Would it?
I believe the problem is related to:
#4471
I recently discovered pandas and so far it's the best tool I have found to plot and work with data (I tried R, gnuplot, etc.). Thanks a lot for the work.