Skip to content

Design doc for privacy levels #6194

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 14 commits into from
May 4, 2020
206 changes: 206 additions & 0 deletions docs/development/design/privacy-levels.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,206 @@
Privacy Levels
==============

This document describes how to handle and unify privacy levels
on the community and commercial version of Read the Docs.

Current state
-------------

Currently, we have three privacy levels for projects and versions:

#. Public
#. Private
#. Protected (currently hidden)

These levels of privacy aren't clear and bring confusion to our users.
Also, the private level doesn't makes sense on the community site,
since we only support public projects.

Places where we use the privacy levels are:

- On serving docs
- Footer
- Dashboard

Project level privacy
---------------------

Project level privacy was meant to control the dashboard visibility.

This privacy level brings to confusion when users want to make a version public.
We should remove all the project privacy levels.

For the community site the dashboard would be always visible,
and for the commercial site, the dashboard would be always hidden.

The project privacy level is also used to serve the ``404.html`` page,
show ``robots.txt``, and show ``sitemap.xml``.
The privacy level of the default version should be used instead.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm on the fence with this. I think it makes sense, but could be confusing or hard to communicate it clearly. I don't have a better proposal, though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have the same problem with this. If feels like we're overloading the meaning of the default version privacy level and we're hiding side effects into this version privacy level. Project privacy level actually addresses this quite nicely currently.

Project.privacy_level for determining a version's privacy level is definitely on the way out, but perhaps we're talking about the need for a comparable setting that has a more narrow use case? That is, a new setting that doesn't affect version privacy level? Focusing on removing the setting entirely is maybe the wrong point, the main problem around project privacy level is that it made version privacy level confusing. We still might need some version of Project.privacy_level.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

robots and sitemap look related, we can have something like Enable search engine optimization. But for 404 it kind of makes sense to respect the privacy level of the default version, since we are trying to serve the 404 page of the default version :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The case for 404 maybe makes sense, but it is maybe odd to require auth for a 404 page. Also, currently, I believe our 404 page solution only addresses a 404 error, but we'd ideally have a solution for 404, 403, 4xx, and 5xx pages -- all which would follow similar logic. The 403 page is of most concern here, as we'd use it to describe the default version being private (in your proposal about 404 and default version privacy).

However, we'd have to rethink how we're doing 404 pages. Currently, our extension renders the 404 page with the side bar navigation intact, which would leak information if we used the same pattern for a 403 page. So I think we need to consider our eventual use cases as well. In the case we make 4xx and 5xx pages more of a product feature, 4xx and 5xx pages should probably be under no auth. But we'd have to have this configurable because we've already assumed a different direction with 404 pages so far.

For robots.txt and sitemap.xml, we should probably consider why private versions are included in these at all. It seems that these aren't useful, as search engines will never pass authorization, and these endpoints can simply be unauthed if there is no private version data.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should serve robots.txt and sitemap.xml should be shown if the project has any public versions. Since that isn't exposing any additional information on the domain. If all the projects are private, we should never serve anything from that domain other than 404's I think.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with Eric regarding robots and sitemap.

For the 4xx and 5xx, eventually we should do what Anthony mentioned: think about our use cases and adapt our extension to manage those cases as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we instead limit the data going into both robots.txt and sitemap to just public versions? That is, if there are no public versions, both are mostly empty?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds like a good idea, I'll try to update this soon and see if there is anything missing.


Some other ideas about keeping the privacy level is to dictate the default version level of new versions,
but removing all other logic related to this privacy level.
This can be (or is going to be) possible with automation rules,
so we can just remove the field.

Version level privacy
---------------------

Version level privacy is mainly used to restrict access to documentation.
For public level, everyone can access to the documentation.
For private level, only users that are maintainers or that belong to a team with access
(for the commercial site)
can access to the documentation.

The protected privacy level was meant to hid versions from listings and search.
For the community site these versions are treated like public versions,
and on the commercial site they are treated like private.

The protected privacy level is currently hidden.
To keep the behavior of hid the versions from listings and search,
a new field should be added to the Version model: ``hidden``.
The privacy level (public or private) would be respected to determinate access to the documentation.

For the community site, the privacy level would be public and can't be changed.

The default privacy level of new versions for the commercial site would be ``private``.

Footer
------

The footer is used to display not hidden versions that the current user has access to.

For the community site not changes are required on the footer.

For the commercial site we use the project level privacy to decide if show or not
links to the project's dashboard: downloads, project home, and builds.
Given that the project privacy level would be removed (and the dashboard is always under login),
those links would never be shown (except for admin users).

Overview
--------

For the community site:

- The project's dashboard is visible to all users.
- All versions are always public.
- The footer shows links to the project's dashboard (build, downloads, home) to all users.
- Only versions with ``hidden = False`` are listed on the footer and appear on search results.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does somebody set hidden=True for versions? Would that just be in the usual version form? Does that make sense on .org? Is the idea perhaps for some beta version that is fine for people to access but they don't want it in search results?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@davidfischer we talked about new versions' states some time ago at #5321 . Adding the hidden field follow the direction that we talked there.

We didn't talk about the UI yet, though, but I imagine just a new checkbox field similar as the Active one.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it would be just another field on the form. And it makes sense on .org, below are some uses cases that this attribute will replace (protected versions)

- If a project has a `404.html` file on the default version, it's served.
- If a project has a ``robots.txt`` file on the default version, it's served.
- A ``sitemap.xml`` file is always served.

For the commercial site:

- The project's dashboard is visible to only users that have read permission over the project.
- The footer shows links to the project's dashboard (build, downloads, home) to only admin users.
- Only versions with ``hidden = False`` are listed on the footer and appear on search results.
- If a project has a `404.html` file on the default version, it's served if the user has permission over that version.
- If a project has a ``robots.txt`` file on the default version, it's served if the user has permission over that version.
- A ``sitemap.xml`` file is served if the user has permissions over the default version.

Migration
---------

To differentiate between allowing or not privacy levels,
we need to add a setting ``RTD_ALLOW_PRIVACY_LEVELS``.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the plan that RTD_ALLOW_PRIVACY_LEVELS=False would completely remove "privacy level" from all forms everywhere? Would the default be RTD_ALLOW_PRIVACY_LEVELS=False?

For example, "privacy level" would disappear from this form, correct?
Screen Shot 2019-10-03 at 8 57 07 PM

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it will disappear from all forms and ui elements, except for api v2, but there will only show privacy_levet: public and can't be changed.

Would the default be RTD_ALLOW_PRIVACY_LEVELS=False?

Yeah, I think that is a nice default.. or maybe not since we'll be hiding things by default


For the community and commercial site, we need to:

- Remove/change code that depend on the project's privacy level.
Use the default version's privacy level instead.

- Display robots.txt
- Serve 404.html page
- Display sitemap.xml
- Querysets
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We talked about a few of these above


- Remove the project privacy level.
- Migrate all protected versions to have the attribute ``hidden = True``,
and set their privacy level to public for the community site and private for the commercial site.
- Change all querysets used to list versions on the footer and on search to use the ``hidden`` attribute.
- Update docs

For the community site:

- Hide all privacy level related settings from the version form.
- Mark all versions as public.

For the commercial site:

- Always hide the dashboard
- Show links to the dashboard (downloads, builds, project home) on the footer only to admin users.

Upgrade path overview
---------------------

Community site
##############

The default privacy level for the community site is public for project and versions.

Public project
~~~~~~~~~~~~~~

- Public version:
Normal usa case, not changes required.
- Protected version:
Users didn't want to list this version on the footer,
but also not deactivate it. This can be solved by using the new ``hidden`` setting.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, can we safely assume that anyone that is using a protected version will be able to use hidden instead then? That is, can we make this a data migration?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's on the migration section. Also, since we hid the protected level, not sure how users will relate the options... Probably a blog post about the change? And/or a help text like "Are you looking for protected? Go here instead"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we write "data migration" as django migration files, we need to be very carefully to consider if that data migration has to be applied in corporate or not. There are some scenarios where we don't want the same behavior (current private versions, for example)

- Private version:
Users didn't want to show this version to their users yet or they were testing something.
This can be solved with the pull request builder feature and the ``hidden`` setting.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are we planning on doing with versions that are public project private version? What is the migration path when we remove this version level from community?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we are safe to just migrate those to public, and maybe setting the hidden option? Since that's the closer option to private in the community site. Let me know what you think

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using hidden kind of makes sense to me, but it's not 100% accurate. We could be leaking data (not in community since the repository is public anyways, but probably in commercial).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Anyways, in commercial, we can keep them private :)


Protected project
~~~~~~~~~~~~~~~~~

Protected projects are not listed publicly.
Probably users were hosting a WIP project,
or personal public project.
A public project with public versions should work for them,
as we are removing listing all projects publicly (except for search).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the migration path here? Change all protected projects to public projects with hidden versions?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I'm making more explicitly what migration to do here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually no, sorry. The migration path for the versions of protected and private projects follow the same rules as the public project. Since at the end what users were hidden is the docs built (versions).

The privacy level for project only controls if the dashboard should be public or not and if the project was listed or not.


Private project
~~~~~~~~~~~~~~~

Probably these users want to use our enterprise solution instead.
Or they were hosting a personal project.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the migration path here? These projects likely never worked, but we wouldn't want to do anything that would look like a data leak. Making a project go from private and broken to public and working could be really bad -- especially in the case of an old project that was meant to be completely private. At least projects that should be private (ie - commercial projects on community) are broken right now and we aren't leaking anything.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For all versions of a private project follow the same rules as a public project. I think that maybe all private versions we could email users telling them that we are about to make their versions public.

From the project itself we could only be leaking information in the dashboard and on the build page.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not really concern about a Private Project on community since the VCS repository has to be public anyways. So, there is no "data leak" but "serving broken docs" instead.

If we care too much about these cases, we could probably have an intermediate value for the migration only for these cases (dead state :) ) where the user can change to a different status but nobody can select this state.


Commercial site
###############

The default privacy level for the commercial site is private for project and versions.

Private project
~~~~~~~~~~~~~~~

- Private version:
Normal usa case, not changes required.
- Protected version:
Users didn't want to list this version on the footer,
but also not deactivate it. This can be solved by using the new ``hidden`` setting.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same feedback as above, in case the use case for protected is any different here. If users were trying to accomplish anything else with protected versions, it would help guide the decision on what to do when migrating. hidden=True seems mostly safe.

- Public version:
User has private code, but want to make public their docs.

Protected project
~~~~~~~~~~~~~~~~~

I can't think of a use case for protected projects,
since they aren't listed publicly on the commercial site.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there are even any of these projects, what should we do with them on migration?


Public project
~~~~~~~~~~~~~~

Currently we show links back to project dashboard if the project is public,
which probably users shouldn't see.
With the implementation of this design doc,
public versions don't have links to the project dashboard and the dashboard is always under login.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd say that the footer should show these links if you are logged in and you are part of a Team with Read access to the project.

Anonymous users should not see these links.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Links to the dashboard should only be visible for admin users, since they are the only ones that can make changes to the project.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Users with Read access can see the dashboard, so they should be able to see these links as well. Even, if they can not edit anything --but can see the Build outputs, for example.


- Private versions:
Users under the organization can see links to the dashboard.
- Protected versions:
Users under the organization can see links to the dashboard.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a pretty clear case for hidden=True, I'm having trouble considering any other use cases. Do we feel confident about this migration?

- Public versions:
All users can see links to the dashboard.
Probably they have an open source project,
but they still want to manage access using the same teams of the organization.