Adding DateOffset(days=1) produces NonExistentTimeError #12156

Closed
jwbarker opened this issue Jan 27, 2016 · 12 comments
Labels
Bug Frequency DateOffsets Timezones Timezone data dtype

Comments

@jwbarker

pd.date_range(pd.Timestamp('2015-1-1', tz='US/Eastern'), pd.Timestamp('2016-1-1', tz='US/Eastern'), freq='H') + pd.DateOffset(days=1)

Traceback (most recent call last):
  File "/home/jeff/.virtualenvs/omnipotent/local/lib/python2.7/site-packages/IPython/core/interactiveshell.py", line 3066, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-21-9574a4185a77>", line 1, in <module>
    pd.date_range(pd.Timestamp('2015-1-1', tz='US/Eastern'), pd.Timestamp('2016-1-1', tz='US/Eastern'), freq='H') + pd.DateOffset(days=1)
  File "/home/jeff/.virtualenvs/omnipotent/local/lib/python2.7/site-packages/pandas/tseries/base.py", line 412, in __add__
    return self._add_delta(other)
  File "/home/jeff/.virtualenvs/omnipotent/local/lib/python2.7/site-packages/pandas/tseries/index.py", line 731, in _add_delta
    new_values = self._add_offset(delta).asi8
  File "/home/jeff/.virtualenvs/omnipotent/local/lib/python2.7/site-packages/pandas/tseries/index.py", line 750, in _add_offset
    result = result.tz_localize(self.tz)
  File "/home/jeff/.virtualenvs/omnipotent/local/lib/python2.7/site-packages/pandas/util/decorators.py", line 89, in wrapper
    return func(*args, **kwargs)
  File "/home/jeff/.virtualenvs/omnipotent/local/lib/python2.7/site-packages/pandas/tseries/index.py", line 1724, in tz_localize
    ambiguous=ambiguous)
  File "pandas/tslib.pyx", line 3781, in pandas.tslib.tz_localize_to_utc (pandas/tslib.c:64980)
NonExistentTimeError: 2015-03-08 02:00:00

INSTALLED VERSIONS

commit: None
python: 2.7.6.final.0
python-bits: 64
OS: Linux
OS-release: 3.13.0-61-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8

pandas: 0.17.1
nose: None
pip: 7.1.0
setuptools: 18.0.1
Cython: None
numpy: 1.10.4
scipy: 0.16.1
statsmodels: None
IPython: 4.0.3
sphinx: None
patsy: None
dateutil: 2.4.2
pytz: 2015.7
blosc: None
bottleneck: None
tables: None
numexpr: None
matplotlib: 1.5.0
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: 4.4.1
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.0.11
pymysql: None
psycopg2: 2.6.1 (dt dec pq3 ext lo64)
Jinja2: None

@jreback
Contributor

jreback commented Jan 27, 2016

use this

In [10]: pd.date_range(pd.Timestamp('2015-1-1', tz='US/Eastern'), pd.Timestamp('2016-1-1', tz='US/Eastern'), freq='H') + pd.offsets.Day(1)
Out[10]: 
DatetimeIndex(['2015-01-02 00:00:00-05:00', '2015-01-02 01:00:00-05:00', '2015-01-02 02:00:00-05:00', '2015-01-02 03:00:00-05:00', '2015-01-02 04:00:00-05:00', '2015-01-02 05:00:00-05:00',
               '2015-01-02 06:00:00-05:00', '2015-01-02 07:00:00-05:00', '2015-01-02 08:00:00-05:00', '2015-01-02 09:00:00-05:00',
               ...
               '2016-01-01 15:00:00-05:00', '2016-01-01 16:00:00-05:00', '2016-01-01 17:00:00-05:00', '2016-01-01 18:00:00-05:00', '2016-01-01 19:00:00-05:00', '2016-01-01 20:00:00-05:00',
               '2016-01-01 21:00:00-05:00', '2016-01-01 22:00:00-05:00', '2016-01-01 23:00:00-05:00', '2016-01-02 00:00:00-05:00'],
              dtype='datetime64[ns, US/Eastern]', length=8761, freq='H')

though this might be a bug as Day correctly handles the tz adjustments

cc @rockg @chris-b1

jreback added this to the Next Major Release milestone Jan 27, 2016

@stephen-hoover
Contributor

I poked into this a bit. I haven't solved it yet, but the error happens in DatetimeIndex.tz_localize. You can reproduce the error with

In [1]: pd.date_range(pd.Timestamp('2015-1-1'), pd.Timestamp('2016-1-1'), freq='H').tz_localize('US/Eastern')
---------------------------------------------------------------------------
NonExistentTimeError                      Traceback (most recent call last)
<ipython-input-1-35790fda5612> in <module>()
----> 1 pd.date_range(pd.Timestamp('2015-1-1'), pd.Timestamp('2016-1-1'), freq='H').tz_localize('US/Eastern')

/Users/shoover/src/pandas/pandas/util/decorators.py in wrapper(*args, **kwargs)
     89                 else:
     90                     kwargs[new_arg_name] = new_arg_value
---> 91             return func(*args, **kwargs)
     92         return wrapper
     93     return _deprecate_kwarg

/Users/shoover/src/pandas/pandas/tseries/index.py in tz_localize(self, tz, ambiguous)
   1845 
   1846             new_dates = tslib.tz_localize_to_utc(self.asi8, tz,
-> 1847                                                  ambiguous=ambiguous)
   1848         new_dates = new_dates.view(_NS_DTYPE)
   1849         return self._shallow_copy(new_dates, tz=tz)

/Users/shoover/src/pandas/pandas/tslib.pyx in pandas.tslib.tz_localize_to_utc (pandas/tslib.c:67511)()

NonExistentTimeError: 2015-03-08 02:00:00

In [2]: pd.__version__
Out[2]: '0.18.0+26.g28327ce'

In the original example, when you add a DateOffset, the code path goes through DatetimeIndex._add_offset, which removes timezone localization, does the addition, and then restores the localization. The exception in the original example happens when trying to restore the localization.
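The strip, add, restore sequence described above can be reproduced by hand on a one-element index. This is only a sketch of the described behaviour, not the actual body of `DatetimeIndex._add_offset`. The `nonexistent=` argument shown at the end is an assumption about the reader's environment: it exists only in pandas releases later than the 0.17/0.18 versions used in this thread.

```python
import pandas as pd

# One wall-clock time on the day before the 2015 US spring-forward transition.
idx = pd.DatetimeIndex(["2015-03-07 02:00"]).tz_localize("US/Eastern")

# Step 1: drop the timezone, keeping the local wall-clock time.
naive = idx.tz_localize(None)

# Step 2: do the calendar arithmetic on the naive values.
shifted = naive + pd.DateOffset(days=1)  # naive 2015-03-08 02:00

# Step 3: restoring the timezone fails, because 02:00 was skipped that day.
error_name = None
try:
    shifted.tz_localize("US/Eastern")
except Exception as exc:  # the exact exception class depends on the pandas version
    error_name = type(exc).__name__
print("re-localizing raised:", error_name)

# In later pandas versions the caller can choose how to resolve skipped times.
repaired = shifted.tz_localize("US/Eastern", nonexistent="shift_forward")
print(repaired[0])
```

Whether shifting forward is the right resolution depends on the application; the point is only that the failure comes from step 3, not from the addition itself.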

@jreback
Contributor

jreback commented Mar 18, 2016

this is in the .apply method of DateOffset, which doesn't correctly deal with localization

Day does this correctly (so you can use that code / move it to DateOffset)

the trick is to know exactly when to call this

@rockg
Contributor

rockg commented Mar 18, 2016

What is done for most offsets is to remove the tz entirely which makes additions of months/quarters/weeks/etc. work fine across tz boundaries. However, with day the logic does need to be changed a bit. Right now days=n uses dateutil whereas Day uses timedeltas. If it's easy to convert logic then do so, otherwise the right solution would be to convert to UTC instead of removing the tz. Then I think you eliminate all ambiguity at the cost of yet more specialized code. I'm not really happy with how one needs to remove the tz, but that is really a shortcoming of dateutil that we are working around.
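The difference between calendar-style and timedelta-style day arithmetic can be seen on a single timestamp that crosses the 2015 transition. This is a hedged illustration, not the offsets code itself: `pd.Timedelta` stands in here for the timedelta-based path, which keeps the sketch independent of how a given pandas version implements `Day`, and the UTC round trip is just one way to express the "convert to UTC instead of removing the tz" idea.

```python
import pandas as pd

# Noon on the day before the 2015 spring-forward transition.
ts = pd.Timestamp("2015-03-07 12:00", tz="US/Eastern")

# Timedelta arithmetic: exactly 24 elapsed hours.
elapsed = ts + pd.Timedelta(days=1)

# Calendar arithmetic: the same wall-clock time on the next day.
calendar = ts + pd.DateOffset(days=1)

# Doing the addition in UTC and converting back matches the timedelta result.
via_utc = (ts.tz_convert("UTC") + pd.Timedelta(days=1)).tz_convert("US/Eastern")

print(elapsed)
print(calendar)
print(elapsed - calendar)  # the hour that the transition removed
```

The two results differ by the hour that the transition skips, which is why the choice between the two code paths matters for offsets of a day or less.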

@rockg
Contributor

rockg commented Mar 18, 2016

There are some other weird things here, probably separate issues:

This time doesn't exist, yet one can create it.

In [17]: pd.Timestamp('2015-03-08 2:00', tz='US/Eastern')
Out[17]: Timestamp('2015-03-08 02:00:00-0500', tz='US/Eastern')

Again, doesn't exist, but doesn't fail

In [14]: t = pd.Timestamp('2015-03-07 2:00', tz='US/Eastern')

In [15]: t + pd.DateOffset(days=1)
Out[15]: Timestamp('2015-03-08 02:00:00-0500', tz='US/Eastern')

@rockg
Contributor

rockg commented Mar 18, 2016

And I think the right answer rather than going to UTC, is not removing the tz at all for offsets <= day. I think there is already this capability in _kwds_use_relativedelta. Removing days from there might just work as then it will use timedelta instead of dateutil.

@jreback
Contributor

jreback commented Mar 18, 2016

@rockg Day already handles these cases. The issue is that DateOffset has different code, so just needs a refactor to move the .apply from Day to a private method in DateOffset and call when needed.

A wholly different (and maybe better) method of handling all of this is to add a __new__ method to DateOffset which then returns the appropriate .to_offset object, so no more DateOffset directly (only sub-classes). I think this would be back-compat actually, and avoid edge cases like this.

@stephen-hoover
Contributor

This doesn't seem to be related to DateOffset. The error happens in tz_localize. For example, pd.Timestamp('2015-03-08 02:59:00').tz_localize('US/Eastern') generates a NonExistentTimeError. Maybe I'm missing something. Why doesn't 2 am on March 8th 2015 exist? pd.Timestamp('2014-03-08 02:00:00').tz_localize('US/Eastern') and pd.Timestamp('2016-03-08 02:00:00').tz_localize('US/Eastern') work just fine.

@kawochen
Contributor

daylight saving
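A small sketch of what that means in practice: in US/Eastern the clocks jumped from 01:59 EST straight to 03:00 EDT on March 8, 2015, so no 02:xx wall-clock time exists on that date. The other two dates from the question work because the transition fell on March 9 in 2014 and on March 13 in 2016. The variable names below are illustrative only.

```python
import pandas as pd

# The last minute of standard time on the transition day.
last_est = pd.Timestamp("2015-03-08 01:59", tz="US/Eastern")

# One elapsed minute later the wall clock has skipped the whole 02:00 hour.
first_edt = last_est + pd.Timedelta(minutes=1)
print(last_est, "->", first_edt)

# March 8 was an ordinary day in 2014 and 2016, so 02:00 localizes cleanly.
ok_2014 = pd.Timestamp("2014-03-08 02:00").tz_localize("US/Eastern")
ok_2016 = pd.Timestamp("2016-03-08 02:00").tz_localize("US/Eastern")
print(ok_2014, ok_2016)
```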

@stephen-hoover
Contributor

Ah, that's what I was missing. Thanks! No wonder this is so confusing.

@gliptak
Contributor

gliptak commented May 8, 2016

#13057

http://www.timeanddate.com/time/zone/usa/new-york

2015 Sun, Mar 8, 2:00 AM EST → EDT +1 hour (DST start) UTC-4h

This seems to be the expected behaviour.

@mroeschke
Member

Closing as this looks like a duplicate of #28610
