resampling from Day to BusinessDay pulls weekend data back to friday. #15837

erbian · 2017-03-29T16:35:22Z

Code Sample

import pandas as pd

result = pd.Series(1., pd.date_range('20170101','20181231',freq='D')).resample('B').last().to_period()

print(result.index[0])
# Period('2016-12-30', 'B')

pd.Period('20170101', freq='B')
# Period('2017-01-02', 'B')

Problem description

When I have daily data that spans weekends (e.g., 1/1/2017, which is a Sunday) and try to get the last available value for a BusinessDay, the data falls back to the Friday period. This is inconsistent with a Sunday timestamp belonging to the Monday BusinessDay period (e.g., pd.Period('20170101', 'B') maps to 1/2/2017).

Expected Output

I would expect result.index[0] above to return Period('2017-01-02', 'B').
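One possible way to get the expected labelling today, without relying on resample('B'), is to roll each timestamp forward to its business day and group on that. This is only a sketch: the names s, bday and labels are illustrative, and it keeps timestamps rather than converting to periods.

```python
import pandas as pd

# Daily series starting on Sunday 2017-01-01
s = pd.Series(1.0, index=pd.date_range("2017-01-01", "2017-01-31", freq="D"))

bday = pd.offsets.BusinessDay()

# rollforward leaves business days unchanged and moves Sat/Sun to the next Monday
labels = pd.DatetimeIndex([bday.rollforward(ts) for ts in s.index])

# Take the last available value per (rolled-forward) business day
result = s.groupby(labels).last()

print(result.index[0])
# 2017-01-02 00:00:00
```

With this grouping, Saturday and Sunday observations count toward the following Monday, which matches the look-ahead behaviour of pd.Period(..., 'B').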

Output of `pd.show_versions()`

INSTALLED VERSIONS
------------------
commit: None
python: 2.7.12.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.0-36-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: None.None

pandas: 0.19.2
nose: 1.3.7
pip: 9.0.1
setuptools: 34.3.0
Cython: 0.24
numpy: 1.12.0
scipy: 0.18.1
statsmodels: 0.6.1
xarray: 0.9.1
IPython: 4.1.1
sphinx: None
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: 1.2.0
tables: None
numexpr: 2.5
matplotlib: 1.5.1
openpyxl: 2.3.3
xlrd: 0.9.4
xlwt: 1.0.0
xlsxwriter: None
lxml: 3.5.0
bs4: 4.5.1
html5lib: None
httplib2: 0.9.2
apiclient: None
sqlalchemy: 1.0.12
pymysql: None
psycopg2: None
jinja2: 2.8
boto: 2.45.0
pandas_datareader: None


erbian · 2017-03-29T16:39:08Z

possibly related to #10575

jreback · 2017-03-29T17:19:58Z

duplicate of this issue: #11123

jreback · 2017-03-29T17:20:48Z

This is basically a convention, though I guess it could be regarded as a bug as well. See the discussion on #11123 (and the xref issue you pointed to). If you have some thoughts, please share.

erbian · 2017-03-29T17:46:19Z

@jreback I understand that including Fri-Sat-Sun in the Friday period is a convention (though, as others have pointed out, probably not ideal for time-series analysis given look-forward issues). I think it is then inconsistent that creating a period from a timestamp returns a period that does not include that timestamp. For example:

 pd.Period(pd.datetime(2017,1,1), 'B').start_time
#  Timestamp('2017-01-02 00:00:00')

Does the issue I am raising make sense?

jreback · 2017-03-29T20:43:59Z

In [3]: pd.datetime(2017,1,1) + pd.offsets.BusinessDay()
Out[3]: Timestamp('2017-01-02 00:00:00')

This is just convention (and documented).

In the other issue I suggested a new frequency, since B is essentially look-ahead; maybe a look-behind one.
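To make the look-ahead versus look-behind distinction concrete, here is a small sketch using the existing rollforward and rollback methods on the BusinessDay offset. The variable names are illustrative only, and this is not the proposed new frequency, just the two directions it would choose between.

```python
import pandas as pd

bday = pd.offsets.BusinessDay()
sunday = pd.Timestamp("2017-01-01")   # a Sunday
tuesday = pd.Timestamp("2017-01-03")  # already a business day

# Look-ahead: a weekend timestamp moves to the next business day
look_ahead = bday.rollforward(sunday)   # Monday 2017-01-02

# Look-behind: a weekend timestamp moves to the previous business day
look_behind = bday.rollback(sunday)     # Friday 2016-12-30

# Timestamps that already fall on a business day are left unchanged
unchanged = bday.rollforward(tuesday)

print(look_ahead, look_behind, unchanged)
```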

The resample output below looks reasonable, actually. I think using .to_period() itself is the issue.

In [27]: r = pd.Series(1., pd.date_range('20170101','20170131',freq='D')).resample('B').asfreq()

In [28]: pd.concat([r, r.index.to_series().dt.weekday_name], axis=1)
Out[28]: 
              0          1
2016-12-30  NaN     Friday
2017-01-02  1.0     Monday
2017-01-03  1.0    Tuesday
2017-01-04  1.0  Wednesday
2017-01-05  1.0   Thursday
2017-01-06  1.0     Friday
2017-01-09  1.0     Monday
2017-01-10  1.0    Tuesday
2017-01-11  1.0  Wednesday
2017-01-12  1.0   Thursday
2017-01-13  1.0     Friday
2017-01-16  1.0     Monday
2017-01-17  1.0    Tuesday
2017-01-18  1.0  Wednesday
2017-01-19  1.0   Thursday
2017-01-20  1.0     Friday
2017-01-23  1.0     Monday
2017-01-24  1.0    Tuesday
2017-01-25  1.0  Wednesday
2017-01-26  1.0   Thursday
2017-01-27  1.0     Friday
2017-01-30  1.0     Monday
2017-01-31  1.0    Tuesday
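For readers on newer pandas versions, where the weekday_name accessor is no longer available, the same table can probably be rebuilt with dt.day_name(). This is a sketch of the session above under that assumption; the names s, r and table are illustrative.

```python
import pandas as pd

# Daily series starting on Sunday 2017-01-01
s = pd.Series(1.0, index=pd.date_range("2017-01-01", "2017-01-31", freq="D"))

# Resample to business days without aggregating
r = s.resample("B").asfreq()

# Put the weekday name next to each value
table = pd.concat([r, r.index.to_series().dt.day_name()], axis=1)

print(table.head(3))
```

The first label is still Friday 2016-12-30 with a NaN value, because no observation falls exactly on that timestamp. Aggregating with .last() instead fills that Friday bin with the Sunday observation, which is the behaviour reported in this issue.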

jreback · 2017-03-29T20:44:04Z

cc @chris-b1

jreback added the Resample resample method label Mar 29, 2017

jreback closed this as completed Mar 29, 2017

jreback added the Duplicate Report Duplicate issue or pull request label Mar 29, 2017

jreback added this to the No action milestone Mar 29, 2017
