I expected that the number of entries would be returned for each value. Instead it throws an OutOfBoundsDatetime, although the value is not a pandas object in the first place but a datetime object. I am aware that pandas timestamps have a limited time span; however, the objects are not timestamps but datetimes. Thanks!
Code Sample, a copy-pastable example if possible
import pandas as pd
import datetime
s = pd.Series([datetime.datetime(9999,1,1)])
s.value_counts()
Expected Output
---------------------------------------------------------------------------
OutOfBoundsDatetime Traceback (most recent call last)
<ipython-input-7-1b9293e6eb60> in <module>()
----> 1 s.value_counts()
/usr/lib/python2.7/dist-packages/pandas/core/base.pyc in value_counts(self, normalize, sort, ascending, bins, dropna)
466 from pandas.tseries.api import DatetimeIndex, PeriodIndex
467 result = value_counts(self, sort=sort, ascending=ascending,
--> 468 normalize=normalize, bins=bins, dropna=dropna)
469
470 if isinstance(self, PeriodIndex):
/usr/lib/python2.7/dist-packages/pandas/core/algorithms.pyc in value_counts(values, sort, ascending, normalize, bins, dropna)
318
319 if not isinstance(keys, Index):
--> 320 keys = Index(keys)
321 result = Series(counts, index=keys, name=name)
322
/usr/lib/python2.7/dist-packages/pandas/core/index.pyc in __new__(cls, data, dtype, copy, name, fastpath, tupleize_cols, **kwargs)
177 tslib.is_timestamp_array(subarr)):
178 from pandas.tseries.index import DatetimeIndex
--> 179 return DatetimeIndex(subarr, copy=copy, name=name, **kwargs)
180 elif (inferred.startswith('timedelta') or
181 lib.is_timedelta_array(subarr)):
/usr/lib/python2.7/dist-packages/pandas/util/decorators.pyc in wrapper(*args, **kwargs)
87 else:
88 kwargs[new_arg_name] = new_arg_value
---> 89 return func(*args, **kwargs)
90 return wrapper
91 return _deprecate_kwarg
/usr/lib/python2.7/dist-packages/pandas/tseries/index.pyc in __new__(cls, data, freq, start, end, periods, copy, name, tz, verify_integrity, normalize, closed, ambiguous, dtype, **kwargs)
317 except ValueError:
318 # tz aware
--> 319 subarr = tools._to_datetime(data, box=False, utc=True)
320
321 # we may not have been able to convert
/usr/lib/python2.7/dist-packages/pandas/tseries/tools.pyc in _to_datetime(arg, errors, dayfirst, yearfirst, utc, box, format, exact, unit, freq, infer_datetime_format)
393 return _convert_listlike(arg, box, format, name=arg.name)
394 elif com.is_list_like(arg):
--> 395 return _convert_listlike(arg, box, format)
396
397 return _convert_listlike(np.array([ arg ]), box, format)[0]
/usr/lib/python2.7/dist-packages/pandas/tseries/tools.pyc in _convert_listlike(arg, box, format, name)
381 return DatetimeIndex._simple_new(values, name=name, tz=tz)
382 except (ValueError, TypeError):
--> 383 raise e
384
385 if arg is None:
OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 9999-01-01 00:00:00
It looks like the problem comes when we try to construct the index here. Essentially we call Index([datetime.datetime(9999,1,1)]) instead of Index([datetime.datetime(9999,1,1)], dtype=object). Want to see if making that change fixes it without breaking any tests?
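Until that is changed inside pandas, the same idea can be tried from user code. The following is only a sketch of a workaround, not the proposed patch: it counts the values in plain Python and then builds the index with dtype=object, so the keys should stay datetime.datetime objects. The names far_future, values and result are made up for illustration.

```python
import datetime
from collections import Counter

import pandas as pd

far_future = datetime.datetime(9999, 1, 1)
values = [far_future, far_future, datetime.datetime(2016, 7, 24)]

# Count in plain Python so pandas never has to infer a datetime index.
counts = Counter(values)

# dtype=object asks pandas to keep the keys as datetime.datetime objects
# instead of trying to convert them to nanosecond timestamps.
keys = pd.Index(list(counts.keys()), dtype=object)
result = pd.Series(list(counts.values()), index=keys)

print(result)
```

The result has an object-dtype index, so datetime-specific index features are not available on it.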
Any reason you're using datetime.datetimes with object dtype instead of Periods?
In [4]: pd.Series([pd.Period('9999-01-01')]).value_counts()
Out[4]:
9999-01-01    1
Freq: D, dtype: int64
That might be less painful for you since there will surely be other places in pandas that don't support datetimes with object dtype. It might not be worth fixing this one case, if Periods provide a better alternative.
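A rough sketch of how existing datetime.datetime values could be turned into Periods before counting, assuming daily resolution is enough. The helper to_daily_period and the sample list raw are hypothetical names, not part of pandas.

```python
import datetime

import pandas as pd


def to_daily_period(dt):
    """Build a daily Period from the date fields of a datetime (hypothetical helper)."""
    return pd.Period(year=dt.year, month=dt.month, day=dt.day, freq="D")


raw = [
    datetime.datetime(9999, 1, 1),
    datetime.datetime(9999, 1, 1),
    datetime.datetime(2016, 7, 24),
]

# A Series of Period objects; value_counts then counts Periods rather than
# trying to build a nanosecond DatetimeIndex.
periods = pd.Series([to_daily_period(dt) for dt in raw])
period_counts = periods.value_counts()

print(period_counts)
```

Note that the time-of-day part is dropped with freq="D"; a finer frequency would be needed if it matters.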
I read the data from a SQL database via pd.read_sql_query. For columns whose values stay within the bounds, the dtype is datetime64[ns]; however, infinity is marked as the year 9999 in some columns, and the dtype for those is object.
Thanks for the advice on periods.
sinhrks changed the title from "series with datetime raises pandas.tslib.OutOfBoundsDatetime on count_values()" to "series with datetime raises pandas.tslib.OutOfBoundsDatetime on value_counts()" on Jul 24, 2016