NaNs in Float64Index are converted to silly integers using index.astype('int') #13149
This is numpy behaviour:
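A minimal sketch of the NumPy cast in question, using made-up values. The integer that a NaN turns into is not something NumPy guarantees and can vary by platform (on many x86-64 builds it is the smallest int64), so the example deliberately avoids asserting a specific value.

```python
import numpy as np

# Illustrative values, not taken from the original report.
values = np.array([1.0, 2.0, np.nan])

# Newer NumPy versions emit a RuntimeWarning for this cast; silence it here.
with np.errstate(invalid="ignore"):
    as_int = values.astype("int64")

print(as_int.dtype)  # the result is an integer array, no error is raised
print(as_int[:2])    # the finite entries cast cleanly
print(as_int[2])     # whatever the platform produces for NaN
```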
But we should probably check for the occurrence of NaNs, just as we do for Series:
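For comparison, a short sketch of the Series check referred to here, assuming a pandas version in which casting a float Series containing NaN to an integer dtype raises a `ValueError` (or a subclass of it). The data is illustrative.

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, np.nan])  # illustrative data

try:
    s.astype("int64")
    raised = False
except ValueError as exc:
    # pandas refuses the cast instead of inventing an integer for NaN
    raised = True
    print(type(exc).__name__, exc)
```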
I wanted to fix this bug but noticed a similar behaviour of other objects: DatetimeIndex, TimedeltaIndex, Categorical, CategoricalIndex. Namely (all four of them behave identically):
However, unlike with Float64Index, this is invertible:
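A sketch of what "invertible" could look like for the DatetimeIndex case, assuming nanosecond resolution and a pandas version where `astype("int64")` on a DatetimeIndex is permitted. `NaT` maps to a sentinel integer, and that sentinel maps back to `NaT`, so nothing is lost in the round trip. The dates are made up.

```python
import numpy as np
import pandas as pd

# Build from an explicit datetime64[ns] array so the integers are nanoseconds.
raw = np.array(["2016-01-01", "NaT"], dtype="datetime64[ns]")
dti = pd.DatetimeIndex(raw)

as_int = dti.astype("int64")  # NaT becomes a sentinel integer

# Reinterpret the integers as nanosecond timestamps to undo the cast.
restored = pd.DatetimeIndex(np.asarray(as_int).view("datetime64[ns]"))

print(list(as_int))
print(restored)
```

With a float index there is no such way back: once NaN has been cast to an integer, nothing distinguishes it from a genuine value.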
My question: is this behaviour also a bug and should be fixed the same way (raising a ValueError)? And if so, should all the fixes be placed into one commit/pull request? By the way, there might be other objects with the same issue, which call numpy.ndarray.astype(). And numpy is also a bit inconsistent here:
@ch41rmn these are all as expected. converting to The only issue is that
@jreback I actually think we should raise in the datetimeindex case as well (ideally). A
Raising for CategoricalIndex seems less of a problem (not a common thing to do)
This is exactly what should be returned (and is useful). Yes, it's equivalent to internal
1. Float64Index.astype(int) raises ValueError if a NaN is present. Previously, it converted NaN's to the smallest negative integer.
2. TimedeltaIndex.astype(int) and DatetimeIndex.astype(int) return Int64Index, which is consistent with behavior of other Indexes. Previously, they returned a numpy.array of ints.
3. Added:
   - bool parameter 'copy' to Index.astype()
   - shared doc string to .astype()
   - tests on .astype() (consolidated and added new)
   - bool parameter 'copy' to Categorical.astype()
4. Internals:
   - Fixed core.common.is_timedelta64_ns_dtype().
   - Set a default NaT representation to a string type in a parameter of DatetimeIndex._format_native_types(). Previously, it produced a unicode u'NaT' in Python2.