REGR? no error anymore when converting out of bounds datetime64[non-ns] data #26206

Closed
jorisvandenbossche opened this issue Apr 24, 2019 · 7 comments
Labels
Datetime Datetime data dtype
Milestone

Comments

@jorisvandenbossche
Member

Didn't directly find a related issue, but on master / 0.24 / 0.23, we see:

In [1]: pd.Series(np.array(['2262-04-12'], dtype='datetime64[D]'))
Out[1]: 
0   1677-09-21 00:25:26.290448384
dtype: datetime64[ns]

while on pandas 0.22.0:

In [1]: pd.Series(np.array(['2262-04-12'], dtype='datetime64[D]'))
---------------------------------------------------------------------------
OutOfBoundsDatetime                       Traceback (most recent call last)
<ipython-input-1-b3f7cbbf1054> in <module>()
----> 1 pd.Series(np.array(['2262-04-12'], dtype='datetime64[D]'))

~/miniconda3/envs/pandas022/lib/python3.6/site-packages/pandas/core/series.py in __init__(self, data, index, dtype, name, copy, fastpath)
    264                                        raise_cast_failure=True)
    265 
--> 266                 data = SingleBlockManager(data, index, fastpath=True)
    267 
    268         generic.NDFrame.__init__(self, data, fastpath=True)

~/miniconda3/envs/pandas022/lib/python3.6/site-packages/pandas/core/internals.py in __init__(self, block, axis, do_integrity_check, fastpath)
   4400         if not isinstance(block, Block):
   4401             block = make_block(block, placement=slice(0, len(axis)), ndim=1,
-> 4402                                fastpath=True)
   4403 
   4404         self.blocks = [block]

~/miniconda3/envs/pandas022/lib/python3.6/site-packages/pandas/core/internals.py in make_block(values, placement, klass, ndim, dtype, fastpath)
   2955                      placement=placement, dtype=dtype)
   2956 
-> 2957     return klass(values, ndim=ndim, fastpath=fastpath, placement=placement)
   2958 
   2959 # TODO: flexible with index=None and/or items=None

~/miniconda3/envs/pandas022/lib/python3.6/site-packages/pandas/core/internals.py in __init__(self, values, placement, fastpath, **kwargs)
   2468     def __init__(self, values, placement, fastpath=False, **kwargs):
   2469         if values.dtype != _NS_DTYPE:
-> 2470             values = tslib.cast_to_nanoseconds(values)
   2471 
   2472         super(DatetimeBlock, self).__init__(values, fastpath=True,

pandas/_libs/tslib.pyx in pandas._libs.tslib.cast_to_nanoseconds()

pandas/_libs/tslib.pyx in pandas._libs.tslib._check_dts_bounds()

OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 2262-04-12 00:00:00

In [2]: pd.__version__
Out[2]: '0.22.0'

cc @jbrockmendel, any idea whether this was changed on purpose, or which refactoring could have caused it?
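
The value shown in Out[1] on master is consistent with a silent int64 overflow when days are converted to nanoseconds. The snippet below is a minimal sketch of that arithmetic using NumPy and plain Python integers. It is not pandas' actual conversion code, and the names `NS_PER_DAY`, `exact_ns` and `wrapped_ns` are illustrative only.

```python
import numpy as np

NS_PER_DAY = 24 * 60 * 60 * 10**9
INT64_MAX = 2**63 - 1

# Days since the Unix epoch for the date used in the report.
days = int(np.datetime64("2262-04-12", "D").astype("int64"))

# Python integers have arbitrary precision, so this product is exact.
exact_ns = days * NS_PER_DAY
overflows = exact_ns > INT64_MAX

# Emulate two's-complement wraparound of a signed 64-bit integer.
wrapped_ns = (exact_ns + 2**63) % 2**64 - 2**63
wrapped = np.datetime64(wrapped_ns, "ns")

print(days, overflows)
print(wrapped)
```

The wrapped timestamp lands in the year 1677, matching the output reported above for master / 0.24 / 0.23.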

@jorisvandenbossche
Member Author

Additional observation: only the Series and DataFrame constructors seem to have this issue. Others, like pd.array, pd.to_datetime and pd.Index, all still raise the OutOfBoundsDatetime error.

It might be that #18231 is the cause (it touches maybe_castable, which led to sanitize_array no longer returning the original datetime64[D] data). If that is the case, it was an unintentional side effect and should now be fixed differently (the change to maybe_castable in that PR seems logical).
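
Until the constructors raise again, one option is to check the data before passing it to Series or DataFrame. The following is a rough sketch of such a guard using only NumPy. `fits_in_ns` is a hypothetical helper, not a pandas or NumPy function, and it assumes a fixed-length unit (D, h, m, s, ms, us); calendar units such as Y and M are not handled.

```python
import numpy as np

INT64_MAX = 2**63 - 1
INT64_MIN_VALID = -(2**63) + 1  # -2**63 itself is reserved for NaT


def fits_in_ns(values):
    """Return True if every non-NaT element is representable as datetime64[ns].

    Hypothetical helper for illustration; `values` is a datetime64 ndarray.
    """
    unit, count = np.datetime_data(values.dtype)
    # Number of nanoseconds per step of the array's own unit.
    step = np.timedelta64(count, unit).astype("timedelta64[ns]")
    ns_per_step = int(step.astype("int64"))

    flat = values.ravel()
    steps = flat.astype("int64").tolist()  # plain Python ints, no overflow
    is_nat = np.isnat(flat).tolist()
    return all(
        nat or INT64_MIN_VALID <= s * ns_per_step <= INT64_MAX
        for s, nat in zip(steps, is_nat)
    )


arr = np.array(["2262-04-12"], dtype="datetime64[D]")
if not fits_in_ns(arr):
    print("out of bounds for datetime64[ns]; do not pass to pd.Series as-is")
```

Doing the multiplication with Python integers avoids the same overflow that the check is meant to detect.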

@jbrockmendel
Member

Off the top of my head I don't know where in the DTA refactor process this would have been changed. maybe_castable seems like a reasonable guess for a place to look.

@jorisvandenbossche
Member Author

Yeah, if my observation above is correct, this has nothing to do with any of the DTA refactoring. In fact, in the Series/DataFrame construction we should probably use more of the "array creation from different kinds of data" functionality that is gathered in arrays/datetimes.py, as that handles this case correctly.

@gfyoung gfyoung added the Datetime Datetime data dtype label Apr 26, 2019
@gfyoung
Member

gfyoung commented Apr 26, 2019

cc @mroeschke

jorisvandenbossche added a commit to jorisvandenbossche/pandas that referenced this issue Jun 17, 2019
@jreback jreback added this to the 0.25.0 milestone Jun 17, 2019
jorisvandenbossche added a commit to jorisvandenbossche/pandas that referenced this issue Jun 18, 2019
jorisvandenbossche added a commit to jorisvandenbossche/pandas that referenced this issue Jun 21, 2019
@jreback jreback modified the milestones: 0.25.0, 1.0 Jun 28, 2019
@Vinci08

Vinci08 commented Jul 22, 2019

I am having an out of bounds error, which forces me to downgrade pandas to 0.24.2. Here's my code:

 df['DAT'] = pd.Series(df['DAT'].values, dtype='datetime64[ns]')
 df['DAT'] = pd.to_datetime(df['DAT'], errors='coerce').dt.strftime('%m-%d-%Y')

It gave me this error:

Traceback (most recent call last):
  File "C:\Users\qqq\AppData\Local\Continuum\miniconda3\lib\site-packages\pandas\core\arrays\datetimes.py", line 1979, in objects_to_datetime64ns
    values, tz_parsed = conversion.datetime_to_datetime64(data)
  File "pandas\_libs\tslibs\conversion.pyx", line 200, in pandas._libs.tslibs.conversion.datetime_to_datetime64
TypeError: Unrecognized value type: <class 'datetime.date'>

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "I:\spreadsheet\run_dat.py", line 196, in <module>
    run_and_pull()
  File "I:\spreadsheet\run_dat.py", line 93, in run_and_pull
    df[col] = pd.Series(df[col].values, dtype='datetime64[ns]')
  File "C:\Users\qqq\AppData\Local\Continuum\miniconda3\lib\site-packages\pandas\core\series.py", line 311, in __init__
    data = sanitize_array(data, index, dtype, copy, raise_cast_failure=True)
  File "C:\Users\qqq\AppData\Local\Continuum\miniconda3\lib\site-packages\pandas\core\internals\construction.py", line 664, in sanitize_array
    subarr = _try_cast(data, dtype, copy, raise_cast_failure)
  File "C:\Users\qqq\AppData\Local\Continuum\miniconda3\lib\site-packages\pandas\core\internals\construction.py", line 784, in _try_cast
    subarr = maybe_cast_to_datetime(arr, dtype)
  File "C:\Users\qqq\AppData\Local\Continuum\miniconda3\lib\site-packages\pandas\core\dtypes\cast.py", line 1052, in maybe_cast_to_datetime
    value = to_datetime(value, errors=errors)
  File "C:\Users\qqq\AppData\Local\Continuum\miniconda3\lib\site-packages\pandas\util\_decorators.py", line 208, in wrapper
    return func(*args, **kwargs)
  File "C:\Users\qqq\AppData\Local\Continuum\miniconda3\lib\site-packages\pandas\core\tools\datetimes.py", line 787, in to_datetime
    cache_array = _maybe_cache(arg, format, cache, convert_listlike)
  File "C:\Users\qqq\AppData\Local\Continuum\miniconda3\lib\site-packages\pandas\core\tools\datetimes.py", line 156, in _maybe_cache
    cache_dates = convert_listlike(unique_dates, True, format)
  File "C:\Users\qqq\AppData\Local\Continuum\miniconda3\lib\site-packages\pandas\core\tools\datetimes.py", line 460, in _convert_listlike_datetimes
    allow_object=True,
  File "C:\Users\qqq\AppData\Local\Continuum\miniconda3\lib\site-packages\pandas\core\arrays\datetimes.py", line 1984, in objects_to_datetime64ns
    raise e
  File "C:\Users\qqq\AppData\Local\Continuum\miniconda3\lib\site-packages\pandas\core\arrays\datetimes.py", line 1975, in objects_to_datetime64ns
    require_iso8601=require_iso8601,
  File "pandas\_libs\tslib.pyx", line 465, in pandas._libs.tslib.array_to_datetime
  File "pandas\_libs\tslib.pyx", line 683, in pandas._libs.tslib.array_to_datetime
  File "pandas\_libs\tslib.pyx", line 679, in pandas._libs.tslib.array_to_datetime
  File "pandas\_libs\tslib.pyx", line 555, in pandas._libs.tslib.array_to_datetime
  File "pandas\_libs\tslibs\np_datetime.pyx", line 118, in pandas._libs.tslibs.np_datetime.check_dts_bounds
pandas._libs.tslibs.np_datetime.OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 1-01-01 00:00:00

I am by no means an expert, so if something on my end caused this issue, please point it out. However, I ran the file with pandas 0.24.2 without any issue.
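
The traceback above reports a `datetime.date` value of 1-01-01, which lies far outside what a nanosecond timestamp can represent. As a hedged sketch, the offending values can be located with the standard library alone before any conversion is attempted. `split_by_ns_bounds` and the two constants are hypothetical names; the bounds are the first and last whole calendar days that fit in a signed 64-bit count of nanoseconds since 1970-01-01.

```python
import datetime as dt

# First and last full days representable as a nanosecond timestamp.
NS_MIN_DATE = dt.date(1677, 9, 22)
NS_MAX_DATE = dt.date(2262, 4, 11)


def split_by_ns_bounds(dates):
    """Partition dates into (in_bounds, out_of_bounds) lists.

    Hypothetical helper for illustration only.
    """
    in_bounds, out_of_bounds = [], []
    for d in dates:
        if NS_MIN_DATE <= d <= NS_MAX_DATE:
            in_bounds.append(d)
        else:
            out_of_bounds.append(d)
    return in_bounds, out_of_bounds


ok, bad = split_by_ns_bounds([dt.date(1, 1, 1), dt.date(2019, 7, 22)])
print("out of bounds:", bad)
```

Values that land in the second list are placeholders or sentinels more often than real dates, so inspecting them first may be more useful than coercing them.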

@TomAugspurger
Contributor

I believe this was closed by #26848. Not sure why it didn't autoclose.

@jorisvandenbossche
Member Author

> I believe this was closed by #26848. Not sure why it didn't autoclose.

I think the problem is that the PR used a full URL in "closes ..." instead of the #number.


6 participants