ENH: infer_dtype should infer integer-na #27283
Comments
Is this a post 1.0 thing? Right now, my preference is to keep the "numpy by default" data model for 1.0 (I don't fully understand what changing just
This is a precursor to actually changing the default int type; as I said above, this is a necessary requirement.
But as Tom implied, "actually changing the default int type" is for sure something for after 1.0.
Sure, my point is that this is a non-trivial amount of work that can be done w/o user changes and is a precursor for other things.
Do we have an idea of what we'll need to update internally to return
See #27335 (comment); this is a relatively small change.
xref #26272
xref #27267
As the first step of moving towards integer-na dtypes as the primary integer type, we need to teach `infer_dtype` that `integer-na` is a valid inferred type. Right now, [4] could return 'integer-na' to indicate that we might want to infer `Int64` dtype; this is distinct from the inferred type of [5], which must become `float64`.

This will then allow us to change integer columns to Int64 when nulls are added, rather than coercing them to float64; this is pretty common in indexing/setting operations.
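To make the coercion being discussed concrete, here is a minimal sketch (not taken from the issue) that contrasts the NumPy-default result with the nullable `Int64` dtype. The variable names are illustrative only, and the label that `infer_dtype` returns for the null-containing list depends on the pandas version, so it is printed rather than assumed.

```python
import pandas as pd
from pandas.api.types import infer_dtype

ints = [1, 2, 3]
ints_with_null = [1, None, 3]  # illustrative data, not from the issue

# Plain integers are inferred as 'integer'.
label_ints = infer_dtype(ints)

# The label for integers plus a null varies across pandas versions,
# so it is only printed here.
label_with_null = infer_dtype(ints_with_null, skipna=False)
print(label_ints, label_with_null)

# Default (NumPy-backed) construction coerces to float64 ...
default_series = pd.Series(ints_with_null)

# ... whereas asking for the nullable dtype keeps integers and a missing value.
nullable_series = pd.Series(ints_with_null, dtype="Int64")

print(default_series.dtype, nullable_series.dtype)
```

The second series is what the proposal would eventually like to reach without the caller having to spell out `dtype="Int64"`.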
Secondly, we can then enable `.to_numeric` to infer integer-na (or unsigned-na) and the corresponding dtypes (#26272).

Finally, we could support coercing `object` dtypes that hold integers and nulls to Int64 (#27267, for `.explode()` and `.infer_objects()`).
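As a rough illustration of the object-dtype case, the sketch below shows what a caller has to do today: `pd.to_numeric` on an object Series of integers and a null yields `float64`, and reaching `Int64` takes an explicit cast. The Series contents and variable names are made up for the example; the explicit `astype("Int64")` is a stand-in for the inference proposed here, not the proposed implementation.

```python
import pandas as pd

# Object-dtype column holding integers and a null (illustrative data).
obj = pd.Series([1, None, 3], dtype=object)

# Current behaviour: numeric conversion falls back to float64
# because NumPy integers cannot hold a missing value.
as_numeric = pd.to_numeric(obj)

# Explicit opt-in to the nullable integer dtype.
as_int64 = obj.astype("Int64")

print(as_numeric.dtype, as_int64.dtype)
```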
This issue itself is only a very minor user-facing change (`infer_dtype` itself).