Skip to content

Design doc: secure access to APIs from builders #10289

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Jun 13, 2023

Conversation

stsewd
Copy link
Member

@stsewd stsewd commented May 3, 2023

Another popular package is https://github.com/jazzband/djangorestframework-simplejwt, but using JWT, we could explore that option if we want, but I'm good with the normal tokens from knox.

Another alternative would be to have a server to server dedicated API, maybe a new implementation of our current API, but dedicated for our builders. This will be a lot of work.


📚 Documentation previews 📚

- Ref readthedocs/meta#21
- Ref #7928 (not
  directly, but it opens the door for the future)
@stsewd stsewd requested a review from a team as a code owner May 3, 2023 19:30
@stsewd stsewd requested a review from benjaoming May 3, 2023 19:30
@stsewd stsewd requested a review from ericholscher May 3, 2023 19:30
@stsewd
Copy link
Member Author

stsewd commented May 10, 2023

Note for myself: checkout this package that David mentioned https://florimondmanca.github.io/djangorestframework-api-key/.

since we can re-use the code from knox package.

Attaching tokens to projects only is possible,
but it will require to manage the authentication manually.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a build is triggered as an "external version", i.e. a PR build, isn't it nontrivial to attach it to a specific user?

I don't think it seems entirely obvious to attach these tokens to users. Attaching them to projects seems more "natural" to sequences where no users are really involved. Incoming webhooks as well.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For actions that don't come from a user, we will attach it to one of the owners of that project.

https://github.com/readthedocs/readthedocs.org/pull/10289/files#diff-dfbcdb5409a7ef92f7523dc7da43b0feafddbf4f64c1226a14c7803772a9725fR86-R87

But I'm taking a look at the other alternative to see if it can be done without users.

Copy link
Member

@ericholscher ericholscher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a great doc, thanks!

It would be useful to comment on why we're planning to use the third-party package. I don't really get the sense for why it's better from this document. It seems like it's to allow us to use a custom model and attach to the user, but I don't understand why that functionality is important.

Overall this seems like a straightforward design, and I'm 👍 on moving forward with either of the proposed implementations to get this working ASAP.

------------------------

We will use the rest-knox_ package,
this package is recommended by the DRF documentation,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Easy enough!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you considered the package that David mentioned in the end?

- Create a token from the webs when a build is triggered.
The triggered project will be attached to the token,
if the build was triggered by a user, that user will be attached to the token,
otherwise the token will be attached to one of the owners of the project.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't love this requirement, but seems fine... Does attaching it to a project user make more sense than just using a dummy user of some kind?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Django rest knox requires you to attach the token to a user, so we need to set a user here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aside from needing some way to attach a token scope to a project rather than a user, do we have other cases where we would need to customize the functionality of rest-knox?

Token expiry is covered, what about scopes? Do we need them only per-user? Or also per-team or per-organization?

Do we need to limit rest-knox tokens beyond what they are originally able to do? Can they do more than what a build needs?

Just asking these questions to understand how well the package fits, and if it fits, that's great 👍 If it needs some level of customization or we have to accept significant gaps, then there can be a motivation to implement our own token authentication.

I'm only knowledgeable of "rolling your own token authentication" from django-ninja, but in this case, it was easy because django-ninja was already designed for it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can attach an organization to the token if needed, but knox requires you to always have a user attached to it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand the package that David mentioned solves this problem of the requirement of attaching the token to a User, right? We could explore that option a little more. I have the same concerns about having to attach the token to a User and I don't like that too much. It would be great if we can avoid it.

Comment on lines +88 to +90
- The token will be created with an expiration date
of 3 hours, this should be enough for the build to finish.
We could also make this dynamic depending of the project.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great 👍 I don't think we need to overengineer this, 3 hours seems like a great default to start.

Comment on lines +103 to +104
Attaching tokens to projects only is possible,
but it will require to manage the authentication manually.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't really explain why it's hard. I had this same question around why we're attaching it to a user. Is it so we can do something like:

for user in project.users.all():
  if user.token

or something?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

knox requires you to attach the token to a user, so it can perform authentication (this validates the token and expose it to us), if we only use it for authorization we will need to manually do this (extract the token).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The other lib DRF API key doesn't have this constraint, but if we want to implement per-user tokens for our normal API, then we will need to manually implement that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This probably doesn't make too much sense, but I'm asking anyways 😄: what about using both packages?

  • rest-knox for tokens exposed to users where we can benefit of this per-user access
  • djangorestframework-api-key for internal tokens required by our builders (maybe tokens exposed to Organizations and Project to be used on CIs that don't want to tie a particular user to this)

Would that be too complex? Are they compatible to work together?

Taking into account, in particular, that there are features from knox that we need but they are not released yet. So, maybe it makes sense to use both packages together since they solve different problems.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haven't tested it, but we should be able to use both packages, since they work per-view. But I'd prefer if we just choose one, otherwise we will be maintaining both integrations, it could get confusing. Also, I think we should take into consideration that we want to expose this similar feature to users (for API v3 at least from what I understand).

To summarize:

  • rest knox requires all tokens to be attached to a user, and provides authentication out of the box.
  • API Key doesn't require tokens to be attached to a user, and provides authorization only.

Both packages allow you to use a custom model to attach any models to it (projects, organizations, etc). API key can be extended to attach users to it, and use it for authentication, but that will need to be implemented manually. rest-knox requires a user, but it doesn't require you to make use of it, so we can just ignore the user when we don't need it.

There are more differences/features of each summarized in:

- Support for several tokens per user.
- Tokens are stored in a hashed format in the database.
We don't have access the tokens after they are created.
- Tokens can have an expiration date.
- Tokens can be created with a prefix (rtd_xxx) (unreleased)
- Support for custom token model (unreleased)

Instead of using knox, we can use `DRF API key`_,
it has the same features as knox, with the exception of:
- It is only used for authorization,
it can't be used for authentication (or it can't be out of the box).
- It doesn't expose views to revoke the tokens (but this should be easy to manually implement)
- Changing the behaviour of some things require sub-classing instead of defining settings.

---------------

Some of the features that we may need are not released yet,
we need the custom token model feature, specially.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we need the set if projects for the token, if we're already hacking it onto a user. Seems like a middle step would just be generating it for the user who has access to the project, as a hack.

Comment on lines +120 to +123
There is a race condition when using the token,
and the user that is attached to that token is removed from the project.
This is, if the user is removed while the build is running,
the builders won't be able to access the API.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, I was wondering about this limitation around the user linkage. I think it could be useful to remove the user entirely if possible..

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could also just use the user for knox to make authentication work, but internatlly just check for the projects attached to the token only, this is ignoring the user, for the build APIs at least.

Copy link
Member

@humitos humitos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really good document 💯

I'd like to think more about the proposal of using both packages. It seemed dummy when I asked about this in another comment, but after I continued reading, I think it could be more useful than I thought. In particular because they solve two different problems for us and I think we can have the best of the two packages reducing the amount of work for the implementation from our side.

I'd like to know what's your opinion on this and if you want me to help on researching a little more this idea to decide if they would be compatible to work together.

-----------------

Instead of using the credential of a super user to access the API,
we will create a temporal token attached to a project, and one of the owners of the project.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why it has to be attached to one of the owners?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like it's to allow us to use a custom model and attach to the user, but I don't understand why that functionality is important.

(from @ericholscher comment in #10289 (review))

I think it's important to attach the token to a Project itself so we don't have to do it to an owner. Attaching the token only to a Project (and not a User) will be useful for audit purposes as well.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For audit, we can just add another field like source that indicates if the token was generated from a manual action or a wehbook, so we know when to take the user into consideration.

------------------------

We will use the rest-knox_ package,
this package is recommended by the DRF documentation,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you considered the package that David mentioned in the end?

- Create a token from the webs when a build is triggered.
The triggered project will be attached to the token,
if the build was triggered by a user, that user will be attached to the token,
otherwise the token will be attached to one of the owners of the project.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand the package that David mentioned solves this problem of the requirement of attaching the token to a User, right? We could explore that option a little more. I have the same concerns about having to attach the token to a User and I don't like that too much. It would be great if we can avoid it.

Comment on lines +103 to +104
Attaching tokens to projects only is possible,
but it will require to manage the authentication manually.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This probably doesn't make too much sense, but I'm asking anyways 😄: what about using both packages?

  • rest-knox for tokens exposed to users where we can benefit of this per-user access
  • djangorestframework-api-key for internal tokens required by our builders (maybe tokens exposed to Organizations and Project to be used on CIs that don't want to tie a particular user to this)

Would that be too complex? Are they compatible to work together?

Taking into account, in particular, that there are features from knox that we need but they are not released yet. So, maybe it makes sense to use both packages together since they solve different problems.

@davidfischer
Copy link
Contributor

davidfischer commented May 24, 2023

I looked deeper into DRF API key and I do have a few bits of feedback and comparison with Django Rest Knox (Knox):

  • The really nice part of DRF API Key is that it lets you easily subclass the API Key model and add things that our app needs like scope or tying it to an Organization. It even documents and gives examples of doing exactly that which is very nice. The keys can be set with an expiry time and you could even subclass it twice: once to have organization keys and another time for either a RTD-wide key or user keys or something.
  • By contrast, Knox doesn't talk much about subclassing although it looks possible. Instead Knox mostly just provides settings to change. Keys are tied to users. We could probably change that but then we're digging deep.
  • DRF API Key uses the password hashers before storing the token in the database. This seems like overkill. The whole point of those hashers is that they take something lower entropy like a password and make it harder to brute force or pre-compute if the database leaks. That isn't a problem here though. These tokens have ~256 bits of randomness already so they won't be brute forced. By contrast, Knox just takes 256 bits from urandom and does a sha512 on it before storing it which seems fine.
  • Because DRF API Key uses password hashers, it's going to use more CPU and add latency to API requests that use it. These hashers are designed to be slow (~100ms+ pegging a full CPU core). For a service like ours that gets TONS of these requests, this might be too much. It is possible to override the hasher it uses and just use something like sha512. I noted something similar on their issue tracker: HasAPIKey adds huge performance (speed) hit florimondmanca/djangorestframework-api-key#173 (comment).
  • Looking through DRF Api Key code, I would also increase the length of the prefix they store on the key to at least 22 characters (~128 bits). This is configurable, but I don't think it's a security issue in the module because there's a unique key on the prefix so in the unlikely event of a collision, the key generation/storage would just fail at the DB level.
  • The author has also indicated that they're a little removed from Django and thereby this project. While they still respond in the issue tracker, I don't think there's a ton of active development there. We might have to do a bit of caretaking if we want to go this route.

@ericholscher
Copy link
Member

ericholscher commented May 24, 2023

@davidfischer @stsewd It seems we've had a lot of discussion on tradeoffs, but can folks share what they think is the best path forward? I don't have a great sense of the full tradeoffs, but I'd like to move forward here, and I think the actual API key auth implementation isn't going to be super complex, so making sure we pick the exact correct approach isn't super important?

It seems Santos thinks we move forward with knox, attaching API key to the project and a project owner. I think that makes sense to me as a first approach?

@stsewd
Copy link
Member Author

stsewd commented May 25, 2023

@davidfischer really nice analysis! From my first reading I didn't get that you can have several custom models D: but after reading it a second time, it makes sense since the model isn't directly attached to the other components.

It seems Santos thinks we move forward with knox, attaching API key to the project and a project owner. I think that makes sense to me as a first approach?

And we can start by ignoring the user, and just rely on the attaching project.

I'm a little biased towards rest-knox, because it requires less code to implement it, using API key does require a little more of code (subclassing to get the components wired), but not that much code, so. API key is more flexible than rest-knox since you can have several custom models, so if we ever need to have tokens that are attached to users we can just create a new model that does that.

I'm fine working with any of those packages, maybe we just do a little table and choose the one that has fewer drawbacks :D

@davidfischer
Copy link
Contributor

I think DRF Api Key would require a decent amount of customization. Knox is probably faster to get going and with saner defaults although it's probably not going to give us exactly what we want longer term (organization keys) without a similar level of customization or workarounds. After reviewing both projects, while I think DRF Api Key is very promising, I think Knox is the more mature and active project.

@davidfischer
Copy link
Contributor

I don't want to add requirements here so feel free to disregard, but one thing I think is great to have on an API token is a last_used timestamp. This means the DRF permissions/authentication class would have to update it on a successful auth. It's a bit of overengineering, but you'll be happy you have it when you go to revoke/migrate/swap anything.

@stsewd
Copy link
Member Author

stsewd commented May 25, 2023

It also came to our attention that the rest-knox project latest release was on January 31st, 2022, and the last commit on August 2022.

https://pypi.org/project/django-rest-knox/#history

API key seems a little more active (latest release january 2023).

Maybe we could help to maintain any of the packages, I think that's better than trying to roll out our own solution :D

@ericholscher
Copy link
Member

I think it sounds like folks are on board with moving forward with knox, so I'm 👍 on that decision, and moving to an implementation. Again, I think the actual integration with the build process is the larger amount of dev effort, so if we have to change how the API key auth is done later, it hopefully shouldn't be a ton of work, especially if we don't do a lot of customization in the initial implementation 👍

@stsewd
Copy link
Member Author

stsewd commented May 25, 2023

Just to be clear, we will be installing rest-knox from the git repo (probably pinned to a commit), since the feature to allow overriding the default token model isn't released yet. We could also ask if they can do a release.

@benjaoming
Copy link
Contributor

@stsewd that's perhaps worth the effort. Is this potentially a deciding factor?

  • Let's say that Knox does a release upon request - 💯 the project is alive and well and we've had contact and the feature we need is released 🎉
  • Let's say they don't react or something else is blocking the feature. We go with DRF Api Key?

@stsewd
Copy link
Member Author

stsewd commented May 25, 2023

@stsewd that's perhaps worth the effort. Is this potentially a deciding factor?

* Let's say that Knox does a release upon request - 100 the project is alive and well and we've had contact and the feature we need is released tada

* Let's say they don't react or something else is blocking the feature. We go with DRF Api Key?

Could be, looking at the issues and PRs, API key is looks more actively maintained, also, the current released version has everything that we need, so...

stsewd added a commit that referenced this pull request Jun 1, 2023
For #10289
we are going to need to pass a new argument (build_api_key).
And since we deploy webs first,
builders will have the old task that doesn't match the new signature,
and the task will fail.

To avoid this, we can just accept any kwargs,
this obviously only works if the change is backwards compatible
with the old code from the builders (in this case it will be).
stsewd added a commit that referenced this pull request Jun 1, 2023
…10374)

For #10289
we are going to need to pass a new argument (build_api_key).
And since we deploy webs first,
builders will have the old task that doesn't match the new signature,
and the task will fail.

To avoid this, we can just accept any kwargs,
this obviously only works if the change is backwards compatible
with the old code from the builders (in this case it will be).
stsewd added a commit that referenced this pull request Jun 6, 2023
For #10289
we won't have the credentials hardcoded in our settings anymore,
but instead we will be provided with a dynamic token for the
build to use, for this reason we now have to pass the api client
object around.
stsewd added a commit that referenced this pull request Jun 7, 2023
For #10289
we won't have the credentials hardcoded in our settings anymore,
but instead we will be provided with a dynamic token for the
build to use, for this reason we now have to pass the api client
object around.
Copy link
Member

@humitos humitos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this is fine to merge. We already started with the implementation, right? In any case, we can decide specific things on the PR review for the implementation. It seems we already decided most of the important things to move forward.

@ericholscher
Copy link
Member

Agreed on merging. Appreciate the "decision" section, very useful firming up the outcome 👍

@stsewd stsewd merged commit 8012b88 into main Jun 13, 2023
@stsewd stsewd deleted the secure-access-to-api-from-builders branch June 13, 2023 15:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants