DateOffset + (Timestamp Series with NaT) = wrong results #15688

Closed
yegulalp opened this issue Mar 14, 2017 · 1 comment
Labels
Bug, Datetime (Datetime data dtype), Timezones (Timezone data dtype)

Comments

@yegulalp

import pandas as pd
t = pd.Series([pd.Timestamp("2016-09-01", tz="US/Eastern") + k*pd.Timedelta("00:01:00") for k in xrange(5)])
t.iloc[2] = pd.NaT
print t
print
print t + pd.tseries.offsets.DateOffset(days=1)
# Observed output:
0   2016-09-01 00:00:00-04:00
1   2016-09-01 00:01:00-04:00
2                         NaT
3   2016-09-01 00:03:00-04:00
4   2016-09-01 00:04:00-04:00
dtype: datetime64[ns, US/Eastern]

0   2016-09-02 04:00:00-04:00
1   2016-09-02 04:01:00-04:00
2                         NaT
3   2016-09-02 04:03:00-04:00
4   2016-09-02 04:04:00-04:00
dtype: datetime64[ns, US/Eastern]

Problem description

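Adding a DateOffset object to a time-zone-aware Series that contains NaT values makes the output incorrect for the entire series, not just the NaT entry. In the example above, I added one day to the series, but every non-NaT value jumps by one day and four hours. The same thing happens with other kinds of DateOffset objects (1 minute, 3 months, business day, etc.). The extra four hours matches the size of the US/Eastern UTC offset shown in the output (-04:00), so the incorrect jump seems to be related to the time zone.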
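To make the time zone connection concrete, here is a small sketch that compares the extra shift in the observed output with the UTC offset of the first timestamp. The `observed` and `expected` values are copied by hand from the outputs in this report; the variable names are only illustrative.

```python
import pandas as pd

# First element of the series in the report.
start = pd.Timestamp("2016-09-01 00:00:00", tz="US/Eastern")

# Values copied from the "Observed output" and "Expected Output" listings.
observed = pd.Timestamp("2016-09-02 04:00:00", tz="US/Eastern")
expected = pd.Timestamp("2016-09-02 00:00:00", tz="US/Eastern")

# How far the observed result overshoots the expected one.
extra = observed - expected
print(extra)  # 0 days 04:00:00

# UTC offset of US/Eastern on that date (daylight time, four hours behind UTC).
utc_offset = start.utcoffset()

# The overshoot has the same magnitude as the UTC offset.
print(extra == -utc_offset)  # True
```

This only shows that the error has the same size as the UTC offset; it does not by itself identify where in pandas the offset is being applied twice or dropped.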

Expected Output

0   2016-09-01 00:00:00-04:00
1   2016-09-01 00:01:00-04:00
2                         NaT
3   2016-09-01 00:03:00-04:00
4   2016-09-01 00:04:00-04:00
dtype: datetime64[ns, US/Eastern]

0   2016-09-02 00:00:00-04:00
1   2016-09-02 00:01:00-04:00
2                         NaT
3   2016-09-02 00:03:00-04:00
4   2016-09-02 00:04:00-04:00
dtype: datetime64[ns, US/Eastern]

Output of pd.show_versions()

commit: None
python: 2.7.13.final.0
python-bits: 64
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 60 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None

pandas: 0.19.2
nose: 1.3.7
pip: 9.0.1
setuptools: 27.2.0
Cython: 0.25.2
numpy: 1.11.3
scipy: 0.18.1
statsmodels: 0.6.1
xarray: None
IPython: 5.1.0
sphinx: 1.5.1
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: 1.2.0
tables: 3.2.2
numexpr: 2.6.1
matplotlib: 1.5.1
openpyxl: 2.4.0
xlrd: 1.0.0
xlwt: 1.1.2
xlsxwriter: 0.9.6
lxml: 3.7.2
bs4: 4.5.3
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.1.4
pymysql: None
psycopg2: None
jinja2: 2.8.1
boto: 2.45.0
pandas_datareader: 0.2.1

@jreback
Contributor

jreback commented Mar 14, 2017

this is correct in master; IIRC fixed by #14928

In [8]: import pandas as pd
   ...: t = pd.Series([pd.Timestamp("2016-09-01", tz="US/Eastern") + k*pd.Timedelta("00:01:00") for k in range(5)])
   ...: t.iloc[2] = pd.NaT
   ...: 

In [9]: t
Out[9]: 
0   2016-09-01 00:00:00-04:00
1   2016-09-01 00:01:00-04:00
2                         NaT
3   2016-09-01 00:03:00-04:00
4   2016-09-01 00:04:00-04:00
dtype: datetime64[ns, US/Eastern]

In [10]: t + pd.tseries.offsets.DateOffset(days=1)
Out[10]: 
0   2016-09-02 00:00:00-04:00
1   2016-09-02 00:01:00-04:00
2                         NaT
3   2016-09-02 00:03:00-04:00
4   2016-09-02 00:04:00-04:00
dtype: datetime64[ns, US/Eastern]

In [11]: (t + pd.tseries.offsets.DateOffset(days=1)) - t
Out[11]: 
0   1 days
1   1 days
2      NaT
3   1 days
4   1 days
dtype: timedelta64[ns]
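The report also mentions other offset kinds (1 minute, 3 months, business day). A rough way to check them on a given pandas install, sketched here under the assumption that adding an offset to a single `Timestamp` is the reference behaviour, is to compare the Series result against element-wise addition. The helper names `build_series` and `matches_elementwise` are made up for this example and are not part of pandas.

```python
import pandas as pd
from pandas.tseries.offsets import BusinessDay, DateOffset


def build_series():
    """Rebuild the tz-aware series from the report, with NaT at position 2."""
    base = pd.Timestamp("2016-09-01", tz="US/Eastern")
    s = pd.Series([base + k * pd.Timedelta(minutes=1) for k in range(5)])
    s.iloc[2] = pd.NaT
    return s


def matches_elementwise(series, offset):
    """Return True if series + offset agrees with adding the offset one value at a time."""
    shifted = series + offset
    for before, after in zip(series, shifted):
        if pd.isna(before):
            # NaT must stay NaT.
            if not pd.isna(after):
                return False
        elif after != before + offset:
            return False
    return True


t = build_series()
offsets = [
    DateOffset(days=1),
    DateOffset(minutes=1),
    DateOffset(months=3),
    BusinessDay(1),
]
results = {repr(offset): matches_elementwise(t, offset) for offset in offsets}

for name, ok in results.items():
    print(name, ok)
```

On a version with the fix, every entry should print `True`; on an affected version such as the 0.19.2 install in the report, the expectation is that the entries print `False`, although that has not been verified here.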

jreback closed this as completed Mar 14, 2017
jreback added the Bug, Datetime, and Timezones labels Mar 14, 2017
jreback added this to the No action milestone Mar 14, 2017