[JIT Variant] Split multi-signal queries into many single-signal queries #1026

Closed
wants to merge 29 commits into from

@dshemetov dshemetov commented Nov 9, 2022

A version of #646

Prerequisites:

  • Unless it is a documentation hotfix it should be merged against the dev branch
  • Branch is up-to-date with the branch to be merged with, i.e. dev
  • Build is successful
  • Code is cleaned up and formatted

Summary

Multi-signal queries complicate the JIT work by introducing a lot of iterator overhead. This PR avoids that by splitting each multi-signal query into many single-signal queries. A few notes:

  • this branch makes non-derived signals bypass the JIT code path entirely, so its speed on non-derived signals should be identical to dev
  • derived signals that rely on the same base signal each issue the same SQL query, so that query runs twice; there's room for optimization here, but I'm hoping database caching makes the extra complexity not worth it
  • just like [JIT Variant] Non-streaming Pandas approach #1014, the Docker image is named jit-multi-sql rather than after the branch, because / is not allowed
  • this has some conflicts with Convert covidcast signal OR clauses to UNION #1021 that we'll need to address later
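
To make the notes above concrete, here is a minimal sketch of the splitting idea. It is not the code in this branch: `SignalQuery`, `DERIVED_TO_BASE`, `split_query`, `needs_jit`, and `base_queries` are hypothetical names, and the source and signal names are made up for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class SignalQuery:
    """One (source, signal) pair to fetch. Hypothetical stand-in for the real query object."""
    source: str
    signal: str


# Hypothetical mapping from a derived signal to the base signal it is computed from.
DERIVED_TO_BASE: Dict[Tuple[str, str], Tuple[str, str]] = {
    ("src", "cases_7dav"): ("src", "cases_cumulative"),
    ("src", "cases_incidence"): ("src", "cases_cumulative"),
}


def split_query(source_signals: Dict[str, List[str]]) -> List[SignalQuery]:
    """Split a multi-signal request into one query per (source, signal) pair."""
    return [
        SignalQuery(source, signal)
        for source, signals in source_signals.items()
        for signal in signals
    ]


def needs_jit(query: SignalQuery) -> bool:
    """Only derived signals take the JIT path; non-derived signals skip it."""
    return (query.source, query.signal) in DERIVED_TO_BASE


def base_queries(queries: List[SignalQuery]) -> List[SignalQuery]:
    """Return the base-signal query issued for each derived query (duplicates kept)."""
    return [
        SignalQuery(*DERIVED_TO_BASE[(q.source, q.signal)])
        for q in queries
        if needs_jit(q)
    ]


request = {"src": ["cases_cumulative", "cases_7dav", "cases_incidence"]}
singles = split_query(request)
issued = base_queries(singles)
```

In this sketch the two derived signals share a base signal, so `issued` contains the same base query twice. That is the duplication mentioned in the second note; collapsing it with `set(issued)` would be the optimization that database caching may make unnecessary.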

dshemetov and others added 29 commits October 7, 2022 15:34
* add smooth_diff
* add model updates
* add /trend endpoint
* add /trendseries endpoint
* add /csv endpoint
* params with utility functions
* update date utility functions
@dshemetov dshemetov requested a review from melange396 November 9, 2022 23:46
@dshemetov dshemetov marked this pull request as draft November 11, 2022 23:38
@dshemetov dshemetov force-pushed the jit_computations branch 5 times, most recently from 8f2bdaf to 2391d10 Compare December 8, 2022 22:11
@dshemetov
Contributor Author

Going with the much cleaner approach in #1049.

@dshemetov dshemetov closed this Feb 22, 2023
@dshemetov dshemetov deleted the ds/jit-multi-sql branch March 4, 2023 01:06
2 participants