DOC: warning on look-ahead bias with resampling #26754

Merged 2 commits into pandas-dev:master on Jun 12, 2019

0x0L (Contributor) commented Jun 9, 2019

0x0L force-pushed the doc_resampling branch from 2ec656b to c7eb7ff on June 9, 2019 14:25

codecov bot commented Jun 9, 2019

Codecov Report

Merging #26754 into master will decrease coverage by 50.49%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master   #26754      +/-   ##
==========================================
- Coverage    91.7%    41.2%   -50.5%     
==========================================
  Files         179      179              
  Lines       50767    50767              
==========================================
- Hits        46555    20918   -25637     
- Misses       4212    29849   +25637
Flag        Coverage Δ
#multiple   ?
#single     41.2% <ø> (-0.09%) ⬇️

Impacted Files                            Coverage Δ
pandas/io/formats/latex.py                0% <0%> (-100%) ⬇️
pandas/plotting/_matplotlib/__init__.py   0% <0%> (-100%) ⬇️
pandas/io/sas/sas_constants.py            0% <0%> (-100%) ⬇️
pandas/core/groupby/categorical.py        0% <0%> (-100%) ⬇️
pandas/tseries/plotting.py                0% <0%> (-100%) ⬇️
pandas/io/formats/html.py                 0% <0%> (-99.37%) ⬇️
pandas/io/sas/sas7bdat.py                 0% <0%> (-91.16%) ⬇️
pandas/io/sas/sas_xport.py                0% <0%> (-90.1%) ⬇️
pandas/core/sparse/scipy_sparse.py        10.14% <0%> (-89.86%) ⬇️
pandas/core/tools/numeric.py              10.14% <0%> (-89.86%) ⬇️
... and 132 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update c7748ca...c7eb7ff.

codecov bot commented Jun 9, 2019

Codecov Report

Merging #26754 into master will increase coverage by 0.01%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master   #26754      +/-   ##
==========================================
+ Coverage    91.7%   91.71%   +0.01%     
==========================================
  Files         179      178       -1     
  Lines       50767    50771       +4     
==========================================
+ Hits        46555    46564       +9     
+ Misses       4212     4207       -5
Flag        Coverage Δ
#multiple   90.3% <ø> (+0.01%) ⬆️
#single     41.21% <ø> (-0.08%) ⬇️

Impacted Files                        Coverage Δ
pandas/io/gbq.py                      78.94% <0%> (-10.53%) ⬇️
pandas/core/frame.py                  96.88% <0%> (-0.12%) ⬇️
pandas/core/indexes/datetimes.py      96.37% <0%> (ø) ⬆️
pandas/core/indexes/timedeltas.py     90.96% <0%> (ø) ⬆️
pandas/io/excel/__init__.py
pandas/core/series.py                 93.62% <0%> (+0.01%) ⬆️
pandas/core/indexes/datetimelike.py   98.15% <0%> (+0.01%) ⬆️
pandas/core/arrays/datetimelike.py    97.93% <0%> (+0.04%) ⬆️
pandas/core/groupby/generic.py        89.16% <0%> (+0.08%) ⬆️
pandas/core/generic.py                93.91% <0%> (+0.27%) ⬆️
... and 1 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update c7748ca...d17cb64.

frequency offsets except for 'M', 'A', 'Q', 'BM', 'BA', 'BQ', and 'W'
which all have a default of 'right'.

This might lead to unintended look-ahead bias as in the following example:

Review comment from a Contributor:

Is "look-ahead bias" a common term?

0x0L (Contributor Author) commented Jun 10, 2019

It is in the context of timeseries forecasting.
EDIT: what about dropping the word "bias"?

TomAugspurger (Contributor) commented Jun 10, 2019

Yes, dropping "bias" and perhaps adding a small definition sounds good. Something like:

This might lead to unintended looking ahead, where the value for a later date (Sunday) is pulled back to
a previous date (Friday).
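
For readers following the thread, here is a minimal sketch of the behaviour being described. It is an illustration only and not necessarily the example added by this PR; the series `s` and the names `looking_ahead` and `pushed_forward` are assumptions made for this sketch. It relies on 'B' (business day) resampling defaulting to `closed='left'` and `label='left'`.

```python
import pandas as pd

# Daily timestamps from Saturday 2000-01-01 to Wednesday 2000-01-05.
# (Illustrative data, not taken from the PR.)
s = pd.date_range("2000-01-01", "2000-01-05").to_series()
s.iloc[2] = pd.NaT  # blank out Monday's value so the weekend data stands out

# Default for 'B': closed='left', label='left'.
# The weekend falls in the bin [Fri 1999-12-31, Mon 2000-01-03),
# which is labelled with its left edge, the preceding Friday.
looking_ahead = s.resample("B").last().dt.day_name()

# closed='right', label='right' uses the bin (Fri, Mon], labelled Monday,
# so the weekend value is pushed forward instead of pulled back.
pushed_forward = (
    s.resample("B", label="right", closed="right").last().dt.day_name()
)

print(looking_ahead)
print(pushed_forward)
```

In the first result, Sunday's value appears under the label 1999-12-31, a Friday that precedes the observation. In the second, the same value appears under Monday 2000-01-03.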

jreback added the Docs and Datetime (Datetime data dtype) labels on Jun 11, 2019
jreback added this to the 0.25.0 milestone on Jun 11, 2019
jreback merged commit 634577e into pandas-dev:master on Jun 12, 2019

jreback (Contributor) commented Jun 12, 2019

thanks @0x0L

Labels: Datetime (Datetime data dtype), Docs
Linked issue that merging this pull request may close: Business day resampling
3 participants