Skip to content

API server code health improvements #998

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
melange396 opened this issue Oct 10, 2022 · 1 comment
Open

API server code health improvements #998

melange396 opened this issue Oct 10, 2022 · 1 comment
Labels
api change affect the API and its responses code health readability, maintainability, best practices, etc documentation refactor Substantial projects to make the code do the same thing, better.

Comments

@melange396
Copy link
Collaborator

The covidcast API server code works well, but it can be improved with an eye toward extensibility and other future changes. A number of points of potential refinement exist, including:

  • Long and/or complex ternaries and comprehensions are pervasive and should be reduced and/or simplified -- code golf is not good for readability/maintainability
  • Many class, method, and variable names are not descriptive or are misleading, and should be clarified
  • Documentation in the code for explaining algorithms and implementation choices is very thin and should be expanded. Undocumented code understandability is further exacerbated by naming choices.
  • Type annotation/hinting is clunky and verbose, and often not necessary.
@melange396 melange396 added refactor Substantial projects to make the code do the same thing, better. api change affect the API and its responses code health readability, maintainability, best practices, etc documentation labels Oct 10, 2022
@melange396
Copy link
Collaborator Author

here are some more concrete examples:

  • flask.request and request argument value handling:

    • flask.request should not be referenced outside of endpoint handlers
      • (or other event receivers, like those methods decorated by before_request or teardown_appcontext, etc)
      • request makes sense in the endpoint context, but not really beyond that...
      • this means things in _params.py like parse_source_signal_pairs() and parse_time_pairs() and etc should take an argument instead of pulling request out of thin air
        • we should them into more pure functions, so that it is easier to know when an argument that is part of a web/api request is being accessed
    • these methods that retrieve such request values are inconsistently named and located
      • methods currently called parse_blah() do more than just parse, they also 'get' and 'validate' or even 'enforce'
      • there are parse_x() methods defined in src/server/endpoints/covidcast.py that should be in with the rest of the parse_* methods in _params.py
      • there are extract_x() methods in _validate.py that have near identical behaviors and should be moved into the same file as the other similar methods (and renamed)
      • im sure there are more that i cannot currently find / remember
  • _handle_lag_issues_as_of() in src/server/endpoints/covidcast.py:

    • called thrice (only in the same file) with all "None" arguments, which is a complete no-op. a call is also commented out once. these can all be removed.
    • called once with all 3 arguments, and called once with just the as_of argument (which is arguably the most complicated thing this does)
    • argument order does not even match the name!
    • args are mutexed! at least evaluation matches arg order (but obviously not name order)
      • if a lag and an issue are specified, the lag is not added to the WHERE (its a strange use case, but args should be respected or an error should be produced)
      • if you specify a lag or issue, as_of is invalidated (again, possibly strange usage, but inconsistent none the less)
  • set_order() in src/server/_query.py:

    • needlessly overcomplicated, and DESCending is never even used in the codebase
  • get_related_signals() in src/server/endpoints/covidcast_utils/model.py is never used

  • src/server/_db.py is pretty much never used

    • its one method has no usage elsewhere in the repository
    • the rest of it is configuration/options stuff that would better go in _common.py or _config.py
  • TimePair (and other similarly named classes/type-aliases/etc) have horrible names...

    • their names should give some affordance to their nature and/or their usage
      • a "TimePair" is not a pair of times!
  • a number of class definitions are related but scattered around (and might even have some overlap), and could potentially be integrated:

    • _params.py: GeoPair
      _params.py: SourceSignalPair
      _params.py: TimePair
    • endpoints/covidcast_meta.py: SourceSignal
    • endpoints/covidcast_utils/model.py: TimeType
      endpoints/covidcast_utils/model.py: DataSignal
      endpoints/covidcast_utils/model.py: DataSource
    • maybe/probably others too???

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api change affect the API and its responses code health readability, maintainability, best practices, etc documentation refactor Substantial projects to make the code do the same thing, better.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant