-
Notifications
You must be signed in to change notification settings - Fork 67
Implement API key scrubbing and structured logging #806
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement API key scrubbing and structured logging #806
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good start -- but structlog takes kwargs instead of a dict
Thanks for the feedback, I've made these further changes:
|
src/server/main.py
Outdated
sqlalchemy_logger.setLevel(logging.ERROR) | ||
#sqlalchemy_logger.setLevel(gunicorn_logger.level) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If switching to ERROR is intended to be permanent, drop the commented-out line and add a comment explaining why we can't use the gunicorn setting.
If this was temporary for testing, revert.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SQLAlchemy logs at gunicorn settings show all queries, and changing this to ERROR would prevent it from showing API keys when the server is querying the api_user table (regardless of the no-track option).
The structured logger already stores queries that access data source tables (covidcast, fluview, etc.), so having SQLAlchemy logs to do that seems redundant.
Unless we specifically need all queries to be logged for analytics / Kibana, this setting seems to be more appropriate to hide API key details.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 nice work! do you have merge permissions or should I click the big button?
yup! I do have merge permissions |
Summary
Notes
This PR implements API key scrubbing for logs in the application layer (Flask + gunicorn). If the user is authorized and tracked, the API key will be logged whenever the user makes an API request. If the user is not tracked, the API key parameter will be masked / scrubbed out.
For logs in the web server layer (NGINX), the API key will be shown as a query string during client requests. This will need to be filtered by creating a custom filter or ingestion pipeline on Filebeat before ingesting to Elasticsearch.
TODO in separate PR: