Skip to content

add more complex slide example #9

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Oct 20, 2021
Merged

add more complex slide example #9

merged 6 commits into from
Oct 20, 2021

Conversation

dajmcdon
Copy link
Contributor

Closes #5.

Added two "local forecasters" to the vignette.

  1. The baseline.
  2. An AR.

Both do arbitrary aheads. A few notes:

  • I mentioned in the vignette that these will, by design, not account for data versioning.
  • They use an odd amount of data. There is an arbitrary min_train_window argument in both functions that returns NA forecasts if the slide window is not long enough yet. After that, they do whatever. This seemed easiest to illustrate the behavior without too much data processing that would be needed in a real forecaster. Let me know if you have a better idea. I didn't bother to filter out the NAs this induces at the beginning of the series, but that doesn't seem key to the illustration either.

Two things that occur to me related to #7 :

  1. Is there any "filling" of missing time_values? tsibble::fill_gaps() fixes this, but I couldn't tell if it's part of the class.
  2. Does the class enforce sequential time_values? Not clear to me if tsibble does this. But maybe we should?

@dajmcdon dajmcdon requested a review from ryantibs October 15, 2021 18:23
@ryantibs
Copy link
Member

ryantibs commented Oct 15, 2021

Awesome, thanks Dan! This is fast work, and great.

I'm not worried about either of your bullet points, though thanks for pointing them out. (However, I'm not sure what you meant by "After that, they do whatever"? Worth elaborating? Or not important?)

Re your two questions:

  1. Good idea/question. Perhaps this is another reason to move over to tsibble. Can you please write a comment about this over on Should epi_signal inherit from tsibble? #7?
  2. Sequential meaning what, no gaps? Or ordered? Because the slide_by_geo() function does indeed arrange by time_value, as you can see here. Perhaps you should create an issue about this if it becomes a real topic to discuss so as not to distract from this PR too much.

Now a few small requests before the merge:

  • Can you add some code to plot the forecasts (visualize them) at a few points in time, overlaid on the actual signal? Spaced out like you guys did in the evalcast package. Hopefully a few lines of ggplot code, not too much more. Hence might be more interesting to do more than one ahead and fewer quantile levels (only three: lower, median, upper).
  • Can you rename the quantile argument in the functions to be level or quantile_level or probability or something?
  • Can you rebuild the vignette (rebuild the pkgdown site) so we can easily see the results via GitHub pages? If you don't know how to do this, you can just do it with a call to pkgdown::build_site() inside an R session whose working directory is the GitHub repo. Warning: it can take a little while to build the derivatives vignette. If you can't get this to work, let me know, and I can rebuild it on this branch.)

@dajmcdon
Copy link
Contributor Author

I made the graph and managed to rebuild. Those are pretty ugly predictions!

On the questions: I'll take care of 1. For 2. I meant "ordered" (somehow missed the arrange call). And I don't think it's worth worrying about the other bullet.

@ryantibs
Copy link
Member

@dajmcdon Awesome! I was just looking through the code now and these are nice examples of slide indeed.

Couple more things come to mind. (Sorry to drag this out ...)

  1. Given the AR example, is there any reason to do the baseline example? Doesn't seem to add much and it only makes the vignette longer. How do you feel about commenting out the baseline for now (we could maybe reuse it elsewhere)?
  2. What is going on with AR over the full year? I just extended the plotting range and look what I get (below). Something very weird is going on the point predictions in CA in January; do you have a sense for whether this is real (just AR behaving badly), or some numerical bug (matrix inversion problem?) in the implementation? And what's going on with its quantiles (no blue fan) at various other points in time?

Screen Shot 2021-10-16 at 11 13 05 PM

- Scrap baseline forecaster for now
- Refactor AR implementation to use tidy code
- Add a bit more discussion surrounding the AR model's behavior
- Other small fixes/copyedits to vignettes
- Add Dan as a contributor
@ryantibs ryantibs merged commit b61331e into main Oct 20, 2021
@ryantibs
Copy link
Member

I managed to finish off the AR example just now (last commit message give some details). Thanks @dajmcdon for the great work here.

@ryantibs ryantibs deleted the djm-local-ar branch October 20, 2021 03:27
kenmawer pushed a commit that referenced this pull request Jul 5, 2022
epi_slide `f` parameter documentation
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add advanced slide example
2 participants