BUG: read_csv fails with uint64 #14983
I think the response in #14982 answers this. The key idea is to make sure this is performant, though.
also in this issue can make sure that passing |
gfyoung
added a commit
to forking-repos/pandas
that referenced
this issue
Dec 28, 2016
Add handling for uint64 elements in an array with the following behavior specifications: 1) If uint64 and NaN are both detected, the original input will be returned if coerce_numeric is False. Otherwise, an Exception is raised. 2) If uint64 and negative numbers are both detected, the original input will be returned if coerce_numeric is False. Otherwise, an Exception is raised. Closes pandas-devgh-14982. Partial fix for pandas-devgh-14983.
jreback
pushed a commit
that referenced
this issue
Dec 30, 2016
Add handling for `uint64` elements in an array with the following behavior specifications: 1) If `uint64` and `NaN` are both detected, the original input will be returned if `coerce_numeric` is `False`. Otherwise, an `Exception` is raised. 2) If `uint64` and negative numbers are both detected, the original input will be returned if `coerce_numeric` is `False`. Otherwise, an `Exception` is raised. Closes #14982. Partial fix for #14983. Author: gfyoung. Closes #15005 from gfyoung/maybe-convert-numeric-uint64 and squashes the following commits: c3bd28a [gfyoung] BUG: Convert uint64 in maybe_convert_numeric
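The two rules in that commit message can be modelled in a few lines of plain Python. This is only a sketch of the decision logic as the message states it, not the actual `maybe_convert_numeric` code: the function name `convert_sketch`, the string labels it returns, and the choice of `ValueError` as the raised exception are all assumptions made for illustration.

```python
import math

INT64_MAX = 2**63 - 1


def convert_sketch(values, coerce_numeric=False):
    """Hypothetical model of the uint64 rules; not pandas' implementation."""
    # A value counts as "uint64" here if it is an integer too large for int64.
    seen_uint64 = any(isinstance(v, int) and v > INT64_MAX for v in values)
    seen_nan = any(isinstance(v, float) and math.isnan(v) for v in values)
    seen_negative = any(isinstance(v, int) and v < 0 for v in values)

    # Rules 1 and 2: uint64 mixed with NaN or with negative numbers.
    if seen_uint64 and (seen_nan or seen_negative):
        if not coerce_numeric:
            return "original", values  # input handed back unchanged
        raise ValueError("uint64 mixed with NaN or negative numbers")

    if seen_uint64:
        return "uint64", values
    # Everything else follows whatever path existed before (assumed label).
    return "unchanged", values
```

For example, `convert_sketch([1, 2**63])` is labelled `"uint64"`, while `convert_sketch([-1, 2**63])` hands the input back as `"original"`.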
gfyoung
added a commit
to forking-repos/pandas
that referenced
this issue
Dec 31, 2016
Adds behavior to allow for parsing of uint64 data in read_csv. Also ensures that they are properly handled along with NaN and negative values. Closes pandas-devgh-14983.
jreback
pushed a commit
that referenced
this issue
Jan 2, 2017
Adds behavior to allow for parsing of uint64 data in read_csv. Also ensures that they are properly handled along with NaN and negative values. Closes gh-14983.
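To make the intended parsing order concrete, here is a small standard-library sketch that reads one CSV column and picks `int64`, then `uint64`, then falls back to the raw strings. It is an illustrative model only: `infer_integer_column` is a hypothetical helper, it does not call pandas, and it simplifies NaN handling by treating any non-integer token as a reason to keep the strings.

```python
import csv
import io

INT64_MIN, INT64_MAX = -(2**63), 2**63 - 1
UINT64_MAX = 2**64 - 1


def infer_integer_column(text, column):
    """Return (dtype_name, values) for one column of CSV text (hypothetical)."""
    tokens = [row[column] for row in csv.DictReader(io.StringIO(text))]
    try:
        numbers = [int(tok) for tok in tokens]
    except ValueError:
        # Blank, NaN, or otherwise non-integer tokens: keep the strings.
        return "object", tokens
    if all(INT64_MIN <= n <= INT64_MAX for n in numbers):
        return "int64", numbers
    # Only accept uint64 when no value is negative, so nothing wraps around.
    if all(0 <= n <= UINT64_MAX for n in numbers):
        return "uint64", numbers
    return "object", tokens
```

With this model, a column holding `18446744073709551615` and `1` is reported as `uint64`, while a column holding `-1` and `9223372036854775808` stays as strings because no single 64-bit integer type can represent both.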
`master` at aba7d2: We should be able to handle `uint64`, and tests like this one here should not be enforcing buggy behavior.

The buggy behavior for the C engine traces to here, where we attempt to cast according to this order defined here. Note for starters that `uint64` is not in that list. This `try-except` is due to `OverflowError` with `int64`, after which we immediately convert to an `object` array of strings. At first, I thought inserting `uint64` into the list would be good, but that can cause bad casting in the other direction, i.e. negative numbers get converted to their `uint64` equivalents.

The buggy behavior for the Python engine traces to here, where we attempt to infer the `dtype` here. However, as I pointed out in #14982, this function fails with `uint64` with a similar (and nonsensical) `try-except` for `OverflowError` in `int64`.

The questions that I posed in #14982 are also relevant here, since the behavior should be consistent across both engines and also performant. Patching the Python engine probably requires fixing #14982 first, and patching the C engine probably requires adding new functions to `parser.pyx` and `tokenizer.c` to parse `uint64`. However, in light of the questions that I posed in #14982, I'm not really sure what is best.