
Permit caching of COVIDcast signals for a few hours #159

Open
krivard opened this issue Jul 23, 2020 · 11 comments
Labels
Engineering (used to filter issues when syncing with Asana), enhancement, good first issue, help wanted

Comments

@krivard
Contributor

krivard commented Jul 23, 2020

None of the signals update more than once a day, so we could get a substantial performance boost in the map if we allowed caches to stay good for a few hours.

@sgratzl
Member

sgratzl commented Aug 7, 2020

This could be done in .htaccess or by setting the header in the PHP file.

e.g. https://www.askapache.com/hacking/speed-site-caching-cache-control/

@capnrefsmmat
Contributor

Just added that to #171. It should also improve responsiveness when switching back and forth between signals on the map.

@capnrefsmmat
Contributor

I wasn't able to enable caching yet; I think ExpiresByType requires AllowOverride to be set in the Apache config. Someone needs to look into what settings are required and test them out on staging.

krivard added the "good first issue" and "needs server access" (you need to actually log in to something to fix this) labels and removed the "help wanted" label on Sep 17, 2020
@sgratzl
Member

sgratzl commented Sep 30, 2020

What is the status here? If you don't have access to the web server, setting the value via PHP would be an option, as in:

header('Cache-Control: public, max-age=86400');

or a similar value.

@capnrefsmmat
Contributor

I think that's feasible. To do it through Apache would require coordinating some configuration changes with Brian, testing those on staging, and so on, but header() is much easier.

I guess the question is how long we'd like to cache responses. Probably no more than a few hours, since signals update daily and someone who comes just before the update shouldn't have to wait 24 hours to get it?

@sgratzl
Member

sgratzl commented Sep 30, 2020

You roughly know when new data goes in each day, so you could just set the Expires header to that time: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Expires
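
A minimal sketch of that idea, in Python rather than PHP since the API later moved to Flask. The 14:00 UTC update time and the function names are assumptions for illustration; the thread does not state the actual delivery schedule.

```python
from datetime import datetime, time, timedelta, timezone
from email.utils import format_datetime

# Hypothetical daily delivery time (UTC); the real schedule is not given in this thread.
ASSUMED_UPDATE_TIME_UTC = time(hour=14, minute=0)


def next_update(now: datetime, update_time: time = ASSUMED_UPDATE_TIME_UTC) -> datetime:
    """Return the first scheduled update strictly after `now`, in UTC."""
    now = now.astimezone(timezone.utc)
    candidate = datetime.combine(now.date(), update_time, tzinfo=timezone.utc)
    if candidate <= now:
        # Today's update has already happened, so the next one is tomorrow.
        candidate += timedelta(days=1)
    return candidate


def expires_header(now: datetime) -> str:
    """Format the next update time as an HTTP-date for the Expires header."""
    return format_datetime(next_update(now), usegmt=True)


if __name__ == "__main__":
    print("Expires:", expires_header(datetime.now(timezone.utc)))
```

With this approach a response served one minute before the update is cached for one minute, and one served just after it is cached for almost a full day.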

@capnrefsmmat
Contributor

Each signal pipeline delivers data on a different schedule, though, so we'd have to build that into the code -- some kind of configuration file specifying expected delivery times for each pipeline. And then we'd have to think about what happens when a pipeline is late and how the headers should work.

I think a simple first pass would just use a default short expiry, and we can go from there.
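
One way the two ideas could be combined is sketched below: a short default expiry for every signal, shortened further when a pipeline's expected delivery is close. The source names, delivery hours, and the three-hour default are all placeholders, not values from the project.

```python
from datetime import datetime, timedelta, timezone

# Assumed "few hours" default, in seconds.
DEFAULT_MAX_AGE = 3 * 60 * 60

# Hypothetical expected delivery hour (UTC) per pipeline; names and hours are placeholders.
EXPECTED_DELIVERY_HOUR_UTC = {"example-source-a": 14, "example-source-b": 20}


def max_age_seconds(source: str, now: datetime) -> int:
    """Seconds a response for `source` may be cached, never above the default."""
    hour = EXPECTED_DELIVERY_HOUR_UTC.get(source)
    if hour is None:
        # Unknown schedule: fall back to the short default.
        return DEFAULT_MAX_AGE
    now = now.astimezone(timezone.utc)
    due = now.replace(hour=hour, minute=0, second=0, microsecond=0)
    if due <= now:
        due += timedelta(days=1)
    remaining = int((due - now).total_seconds())
    return min(remaining, DEFAULT_MAX_AGE)


def cache_control(source: str, now: datetime) -> str:
    """Build the Cache-Control header value for one response."""
    return f"public, max-age={max_age_seconds(source, now)}"
```

Because the value is capped at the default, a late pipeline costs at most one default window of staleness: once the expected hour has passed, responses go back to the short expiry, so late data shows up within a few hours of landing.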

@krivard
Contributor Author

krivard commented Sep 30, 2020 via email

SumitDELPHI added the "Engineering" (used to filter issues when syncing with Asana) label on Dec 2, 2020
krivard added the "help wanted" label and removed the "needs server access" (you need to actually log in to something to fix this) label on Feb 2, 2023
@krivard
Contributor Author

krivard commented Feb 2, 2023

There's probably a way to do this in Flask, but it looks like we don't yet? Here's what I get:

$ curl -sLI "https://api.covidcast.cmu.edu/epidata/covidcast/?signal=jhu-csse:confirmed-incidence-num&geo_type=nation&geo_value=us&time_type=day&time_value=20230101"
HTTP/2 200 
date: Thu, 02 Feb 2023 18:26:48 GMT
content-type: application/json
content-length: 92
set-cookie: AWSALBTG=qacugPKQMWrsVqjUA8+5ECCJYZAGov2eBXdEFfLrsS9tVe74n2H2gu0UVvp6MX9YTUWz+707UXh6v4txX4efQ5yh/OvgOTaq51vKy0QHVoY+7qoVjk2BhVXpsdJaDody+4ay5bZgxS/L+U98Iha0RtGISny+LDZhpKMObub+2TnVTu+B8H8=; Expires=Thu, 09 Feb 2023 18:26:47 GMT; Path=/
set-cookie: AWSALBTGCORS=qacugPKQMWrsVqjUA8+5ECCJYZAGov2eBXdEFfLrsS9tVe74n2H2gu0UVvp6MX9YTUWz+707UXh6v4txX4efQ5yh/OvgOTaq51vKy0QHVoY+7qoVjk2BhVXpsdJaDody+4ay5bZgxS/L+U98Iha0RtGISny+LDZhpKMObub+2TnVTu+B8H8=; Expires=Thu, 09 Feb 2023 18:26:47 GMT; Path=/; SameSite=None; Secure
server: nginx/1.22.1
vary: Accept-Encoding
access-control-allow-origin: *
access-control-allow-methods: GET, POST, OPTIONS
access-control-allow-headers: DNT,User-Agent,X-Requested-With,If-Modified-Since,Cache-Control,Content-Type,Range
access-control-expose-headers: Content-Length,Content-Range

i.e., the server says it's okay for a request to include a Cache-Control header (it is listed in access-control-allow-headers), but we don't send one in the response.
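
A rough sketch of how a response header could be added at the WSGI layer, using only the standard library so it runs without the API's code. The class name, the three-hour value, and the demo app are assumptions. A Flask application exposes its WSGI callable as `app.wsgi_app`, so a wrapper like this could in principle be attached there, and Flask's `after_request` hook is another place to set the header. Neither has been tried against this server.

```python
from wsgiref.util import setup_testing_defaults

# Assumed default: three hours, matching the "few hours" suggested in this thread.
DEFAULT_CACHE_CONTROL = "public, max-age=10800"


class CacheControlMiddleware:
    """Add Cache-Control to successful GET responses that lack one."""

    def __init__(self, app, value=DEFAULT_CACHE_CONTROL):
        self.app = app
        self.value = value

    def __call__(self, environ, start_response):
        is_get = environ.get("REQUEST_METHOD") == "GET"

        def caching_start_response(status, headers, exc_info=None):
            has_header = any(name.lower() == "cache-control" for name, _ in headers)
            if is_get and status.startswith("200") and not has_header:
                headers = list(headers) + [("Cache-Control", self.value)]
            return start_response(status, headers, exc_info)

        return self.app(environ, caching_start_response)


def demo_app(environ, start_response):
    """Stand-in for the real API: returns a tiny JSON body."""
    start_response("200 OK", [("Content-Type", "application/json")])
    return [b'{"result": 1}']


def response_headers(app):
    """Call a WSGI app with a fake GET request and return its headers as a dict."""
    environ = {}
    setup_testing_defaults(environ)  # fills in a minimal GET request
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["headers"] = dict(headers)

    b"".join(app(environ, start_response))
    return captured["headers"]


if __name__ == "__main__":
    print(response_headers(CacheControlMiddleware(demo_app)))
```

The middleware leaves error responses and non-GET requests alone, and does not override a Cache-Control header that an endpoint has already set.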

@melange396
Collaborator

related: caching headers for metadata

@melange396
Collaborator

5 participants