GHT signal lost its medical-seeking terms #138

Closed
capnrefsmmat opened this issue Jul 9, 2020 · 4 comments
Labels
API change Renames, large changes to calculations, large changes to affected regions

@capnrefsmmat
Contributor

In its original version, the GHT signal contained terms for anosmia and medical-seeking queries. However, in the transition to covidcast-indicators, we lost the medical-seeking queries, so our GHT signal now appears to be only about loss of smell or taste.

This was an undocumented change in the data, and probably an undesirable one.

I think it's easy to fix by extending the TERMS list to contain the previous terms, but we need to determine how to issue the fix. @krivard, should this count as a bugfix retroactively applied to the data, or as a new signal name? The current signal can't be too useful, since it has a discontinuity. Also, I wonder if the medical-seeking queries will increase query volume and reduce the problem of the signal being truncated to zero.
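A minimal sketch of what extending the list could look like, assuming the indicator keeps its queries in a Python list called `TERMS` as mentioned above. The helper name `merge_terms` and every term string below are hypothetical placeholders, not the actual production or original terms.

```python
# Hypothetical sketch: the real TERMS list lives in the GHT indicator code.
# All strings below are illustrative placeholders, not the real query terms.
CURRENT_TERMS = ["placeholder anosmia query 1", "placeholder anosmia query 2"]
PREVIOUS_MEDICAL_SEEKING_TERMS = [
    "placeholder medical-seeking query 1",
    "Placeholder Anosmia Query 2",  # duplicate of a current term, differing only in case
]


def merge_terms(current, restored):
    """Return current terms followed by restored terms, without duplicates.

    Terms are normalized (stripped, lowercased) so the same query is not
    counted twice; first-seen order is preserved.
    """
    seen = set()
    merged = []
    for term in list(current) + list(restored):
        key = term.strip().lower()
        if key and key not in seen:
            seen.add(key)
            merged.append(key)
    return merged


TERMS = merge_terms(CURRENT_TERMS, PREVIOUS_MEDICAL_SEEKING_TERMS)
```

Deduplicating matters here because an accidentally repeated term could change the aggregated query volume, depending on how the terms are combined downstream.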

@krivard krivard added API change Renames, large changes to calculations, large changes to affected regions Triage Nominate for inclusion in the next release labels Jul 9, 2020
@capnrefsmmat
Contributor Author

capnrefsmmat commented Jul 9, 2020

Addison writes:

It seems that when we migrated to the covidcast-indicators pipeline, I missed the fact, during my code review, that the medical seeking terms were inadvertently left out.

We could restore the medical seeking terms, if we believe that they provide additional signal. I think that the indicators team should:

  1. Run some "signal quality" evaluation -- perhaps Ryan's correlations notebook? -- against the current prod signal and against a signal that re-introduced medical seeking terms.
  2. If anosmia + medical seeking is much better, plan a versioned re-introduction. If it is only slightly better or unchanged, wait for the "super GHT signal" that Google has promised us and other institutions...
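One way the comparison in step 1 might be sketched, assuming each signal and the target (for example, case counts) are available as equal-length lists of values over the same locations or days. This is not Ryan's correlations notebook; the function names `pearson` and `compare_signals` are hypothetical, and Pearson correlation is only one possible choice of quality metric.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (NaN if either is constant)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def compare_signals(target, current, candidate):
    """Correlate the current and candidate signals against the same target."""
    r_current = pearson(current, target)
    r_candidate = pearson(candidate, target)
    return {
        "current": r_current,
        "candidate": r_candidate,
        "difference": r_candidate - r_current,
    }
```

Running `compare_signals` separately for several weeks would show whether the candidate signal is consistently better or only better in some weeks.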

@krivard
Contributor

krivard commented Jul 9, 2020

Agree this is a bug fix; assuming the eval Addison recommends comes out in favor of restoring the medical seeking terms, we should dig out from the cache commit logs the first day of bad data and reissue this signal back to that point.

@krivard krivard added this to the v1.6 milestone Jul 10, 2020
capnrefsmmat added a commit to cmu-delphi/delphi-epidata that referenced this issue Jul 10, 2020
Currently anosmia. Not pushing yet, because this is subject to
correction once we investigate. See
cmu-delphi/covidcast-indicators#138
@krivard krivard removed the Triage Nominate for inclusion in the next release label Jul 10, 2020
@capnrefsmmat
Contributor Author

Table 1 in this paper proposes a much bigger list of terms:

anosmia, chest pain, chest tightness, cold, cold symptoms, cold with fever, contagious flu, cough, cough and fever, cough fever, covid, covid nhs, covid symptoms, covid-19, covid-19 who, dry cough, feeling exhausted, feeling tired, fever, fever cough, flu and bronchitis, flu complications, how long are you contagious, how long does covid last, how to get over the flu, how to get rid of flu, how to get rid of the flu, how to reduce fever, influenza, influenza b symptoms, isolation, joints aching, loss of smell, loss smell, loss taste, nose bleed, oseltamivir, painful cough, pneumonia, pneumonia, pregnant and have the flu, quarantine, remedies for the flu, respiratory flu, robitussin, robitussin cf, robitussin cough, rsv, runny nose, sars-cov 2, sars-cov-2, sore throat, stay home, strep, strep throat, symptoms of bronchitis, symptoms of flu, symptoms of influenza, symptoms of influenza b, symptoms of pneumonia, symptoms of rsv, tamiflu dosage, tamiflu dose, tamiflu drug, tamiflu generic, tamiflu side effects, tamiflu suspension, tamiflu while pregnant, tamiflu wiki, tessalon

But Google is also working on their own system; we may want to just leave our signal untouched and then adopt Google's definition when it's ready.

@krivard
Contributor

krivard commented Jul 16, 2020

Eu Jing has completed an initial assessment as follows:

After trying it out on a few different weeks, adding back the medical-seeking terms does not seem to consistently make correlations better; it gets better some weeks, and worse other weeks. [...] These plots are correlations for weeks 06/01/2020 - 06/07/2020 and 07/01/2020 - 07/07/2020, where the "new" plot is with medical-seeking terms added back.

[Plot: ght_06012020_06072020]

[Plot: ght_07012020_07072020]

So indeed, we will leave GHT as it is, and adopt Google's new tool when they make it available.

4 participants