|
| 1 | +Improving redirects |
| 2 | +=================== |
| 3 | + |
| 4 | +Redirects are a core feature of Read the Docs. |
| 5 | +They allow users to keep old URLs working when they rename or move a page. |
| 6 | + |
| 7 | +The current implementation lacks some features and has some undefined/undocumented behaviors. |
| 8 | + |
| 9 | +Goals |
| 10 | +----- |
| 11 | + |
| 12 | +- Improve the user experience when creating redirects. |
| 13 | +- Improve the current implementation without big breaking changes. |
| 14 | + |
| 15 | +Non-goals |
| 16 | +--------- |
| 17 | + |
| 18 | +- Replicate every feature of other services without |
| 19 | + having a clear use case for them. |
| 20 | +- Improve the performance of redirects. |
| 21 | + This can be discussed in an issue or pull request. |
| 22 | +- Allow importing redirects. |
| 23 | + We should push users to use our API instead. |
| 24 | +- Allow specifying redirects in the RTD config file. |
| 25 | + We have had several discussions around this, |
| 26 | + but we haven't reached a consensus. |
| 27 | + |
| 28 | +Current implementation |
| 29 | +---------------------- |
| 30 | + |
| 31 | +We have five types of redirects: |
| 32 | + |
| 33 | +Prefix redirect: |
| 34 | + Redirects all URLs that start with a prefix to a new URL |
| 35 | + using the default version and language of the project. |
| 36 | + For example: a prefix redirect with the value ``/prefix/`` |
| 37 | + will redirect ``/prefix/foo/bar`` to ``/en/latest/foo/bar``. |
| 38 | + |
| 39 | + They are basically the same as an exact redirect with a wildcard at the end. |
| 40 | + They are a shortcut for a redirect like: |
| 41 | + |
| 42 | + - From: ``/prefix/$rest`` |
| 43 | + To: ``/en/latest/`` |
| 44 | + |
| 45 | + Or maybe we could use a prefix redirect to replace the exact redirect with a wildcard? |
| 46 | + |
| 47 | +Page redirect: |
| 48 | + Redirects a single page to a new URL using the current version and language. |
| 49 | + For example: a page redirect from ``/old/page.html`` to ``/new/page.html`` |
| 50 | + will redirect ``/en/latest/old/page.html`` to ``/en/latest/new/page.html``. |
| 51 | + |
| 52 | + Cross domain redirects are not allowed in page redirects. |
| 53 | + They apply to all versions. |
| 54 | + To apply a redirect to a specific version only, use an exact redirect. |
| 55 | + |
| 56 | + A whole directory can't be redirected with a page redirect; |
| 57 | + an exact redirect with a wildcard at the end needs to be used instead. |
| 58 | + |
| 59 | +Exact redirect: |
| 60 | + Redirects an exact URL to a new URL. |
| 61 | + A wildcard is allowed at the end to redirect all paths under a prefix. |
| 62 | + For example: an exact redirect with the value ``/en/latest/page.html`` |
| 63 | + will redirect ``/en/latest/page.html`` to the new URL. |
| 64 | + |
| 65 | + If an exact redirect with the value ``/en/latest/dir/$rest`` |
| 66 | + is created, it will redirect all paths that start with ``/en/latest/dir/``, |
| 67 | + and the rest of the path will be added to the new URL automatically. |
| 68 | + |
| 69 | + - Cross domain redirects are allowed in exact redirects. |
| 70 | + - They apply to all versions. |
| 71 | + - A wildcard is allowed at the end of the URL. |
| 72 | + - If a wildcard is used, the rest of the path will be added to the new URL automatically. |
| 73 | + |
| 74 | +Sphinx HTMLDir to HTML: |
| 75 | + Redirects clean URLs to HTML URLs. |
| 76 | + Useful in case a project changed the style of their URLs. |
| 77 | + |
| 78 | + They apply to all projects, not just Sphinx projects. |
| 79 | + |
| 80 | +Sphinx HTML to HTMLDir: |
| 81 | + Redirects HTML URLs to clean URLs. |
| 82 | + Useful in case a project changed the style of their URLs. |
| 83 | + |
| 84 | + They apply to all projects, not just Sphinx projects. |
| 85 | + |
| 86 | +How other services implement redirects |
| 87 | +-------------------------------------- |
| 88 | + |
| 89 | +- GitBook's implementation is very basic: |
| 90 | + it only allows page redirects. |
| 91 | + |
| 92 | + https://docs.gitbook.com/integrations/git-sync/content-configuration#redirects |
| 93 | + |
| 94 | +- Cloudflare Pages allows capturing placeholders and one wildcard (in any part of the URL). |
| 95 | + It also allows you to set the status code of the redirect, |
| 96 | + and redirects can be specified in a ``_redirects`` file. |
| 97 | + |
| 98 | + https://developers.cloudflare.com/pages/platform/redirects/ |
| 99 | + |
| 100 | + They have a limit of 2100 redirects. |
| 101 | + In case of multiple matches, the topmost redirect will be used. |
| 102 | + |
| 103 | +- Netlify allows capturing placeholders and a wildcard (only allowed at the end). |
| 104 | + It also allows you to set the status code of the redirect, |
| 105 | + and redirects can be specified in a ``_redirects`` file. |
| 106 | + |
| 107 | + - Forced redirects |
| 108 | + - Match query arguments |
| 109 | + - Match by country/language and cookies |
| 110 | + - Per-domain and protocol redirects |
| 111 | + - In case of multiple matches, the topmost redirect will be used. |
| 112 | + |
| 113 | + https://docs.netlify.com/routing/redirects/ |
| 114 | + |
| 115 | +Improving redirects |
| 116 | +------------------- |
| 117 | + |
| 118 | +General improvements |
| 119 | +~~~~~~~~~~~~~~~~~~~~ |
| 120 | + |
| 121 | +The following improvements will be applied to all types of redirects. |
| 122 | + |
| 123 | +- Allow choosing the status code of the redirect. |
| 124 | + We already have a field for this, but it's not exposed to users. |
| 125 | +- Allow explicitly defining the order of redirects. |
| 126 | + This will be similar to the automation rules feature, |
| 127 | + where users can reorder the rules so the most specific ones are first. |
| 128 | + We currently rely on the implicit order of the redirects (updated_at). |
| 129 | +- Allow disabling redirects. |
| 130 | + It's useful when testing redirects, or when debugging a problem. |
| 131 | + Instead of having to re-create the redirect, |
| 132 | + we can just disable it and re-enable it later. |
| 133 | +- Allow adding a short description. |
| 134 | + It's useful to document why the redirect was created. |
| 135 | + |
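The general improvements above can be illustrated with a minimal sketch. The ``Redirect`` class, its field names, and the default status code are hypothetical and chosen only for illustration; they are not the existing Read the Docs model. The sketch assumes that the first enabled redirect in the user-defined order wins, similar to the "topmost match" behavior of the other services.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Redirect:
    # Hypothetical model: field names and defaults are illustrative only.
    from_url: str
    to_url: str
    position: int = 0        # explicit, user-defined order (lower runs first)
    enabled: bool = True     # disabled redirects are skipped, not deleted
    http_status: int = 302   # status code chosen by the user
    description: str = ""    # short note on why the redirect exists


def first_match(redirects: list[Redirect], path: str) -> Optional[Redirect]:
    """Return the first enabled redirect, in explicit order, whose from_url equals path."""
    for redirect in sorted(redirects, key=lambda r: r.position):
        if redirect.enabled and redirect.from_url == path:
            return redirect
    return None
```

With this shape, disabling a redirect while debugging only flips ``enabled``, so its position, status code, and description are preserved for when it is re-enabled.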
| 136 | +Allow matching query arguments |
| 137 | +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| 138 | + |
| 139 | +We can do this in two ways: |
| 140 | + |
| 141 | +- At the DB level with some restrictions. |
| 142 | + If done at the DB level, |
| 143 | + we would need to have a different field |
| 144 | + with just the path, and another with the query arguments normalized and sorted. |
| 145 | + |
| 146 | + For example, if we have a redirect with the value ``/foo?blue=1&yellow=2&red=3``, |
| 147 | + it would be normalized in the DB as ``/foo`` and ``blue=1&red=3&yellow=2``. |
| 148 | + This implies that the URL to be matched must have the exact same query arguments; |
| 149 | + it can't have more or fewer. |
| 150 | + |
| 151 | + I believe the implementation described here is the same one used by Netlify, |
| 152 | + since they have that same restriction. |
| 153 | + |
| 154 | + "If the URL contains other parameters in addition to or instead of id, the request doesn't match that rule." |
| 155 | + |
| 156 | + https://docs.netlify.com/routing/redirects/redirect-options/#query-parameters |
| 157 | + |
| 158 | +- At the Python level. |
| 159 | + If done at the Python level, |
| 160 | + we would still need to have a different field |
| 161 | + with just the path, and another with the query arguments. |
| 162 | + |
| 163 | + The matching of the path would be done at the DB level, |
| 164 | + and the matching of the query arguments would be done at the Python level. |
| 165 | + Here we can be more flexible, allowing any query arguments in the matched URL. |
| 166 | + |
| 167 | + We had some performance problems in the past, |
| 168 | + but I believe they were mainly due to the use of regex instead of string operations. |
| 169 | + And matching the path is still done at the DB level. |
| 170 | + We could limit the number of redirects that can be created with query arguments, |
| 171 | + or the number of redirects in general. |
| 172 | + |
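The two options for matching query arguments can be compared with a short sketch. The function names are hypothetical. ``matches_exact`` models the DB-level idea, where the normalized query must be identical, and ``matches_subset`` models the more flexible Python-level idea, where the request may carry extra arguments. This is only one possible reading of the proposal, using the standard library.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit


def normalize(url: str) -> tuple[str, str]:
    """Split a URL into its path and a sorted, re-encoded query string."""
    parts = urlsplit(url)
    params = sorted(parse_qsl(parts.query, keep_blank_values=True))
    return parts.path, urlencode(params)


def matches_exact(redirect_url: str, request_url: str) -> bool:
    """DB-level idea: path and normalized query must be identical."""
    return normalize(redirect_url) == normalize(request_url)


def matches_subset(redirect_url: str, request_url: str) -> bool:
    """Python-level idea: the request may carry extra query arguments."""
    redirect_path, redirect_query = normalize(redirect_url)
    request_path, request_query = normalize(request_url)
    if redirect_path != request_path:
        return False
    required = parse_qsl(redirect_query, keep_blank_values=True)
    present = parse_qsl(request_query, keep_blank_values=True)
    return all(pair in present for pair in required)
```

Under these assumptions, ``normalize("/foo?blue=1&yellow=2&red=3")`` yields the ``/foo`` and ``blue=1&red=3&yellow=2`` pair from the example above, which is what would be stored in the two DB fields.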
| 173 | +Don't run redirects on domains from pull request previews |
| 174 | +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| 175 | + |
| 176 | +We currently run redirects on domains from pull request previews. |
| 177 | +This is a problem when moving a whole project to a new domain. |
| 178 | + |
| 179 | +Do we need to run redirects on external domains? |
| 180 | +They are supposed to be temporary domains. |
| 181 | + |
| 182 | +Normalize paths with trailing slashes |
| 183 | +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| 184 | + |
| 185 | +Currently, if users want to redirect a path both with and without a trailing slash, |
| 186 | +they need to create two separate redirects (``/page/`` and ``/page``). |
| 187 | + |
| 188 | +We can simplify this by normalizing the path before matching it. |
| 189 | + |
| 190 | +For example: |
| 191 | + |
| 192 | +- From: ``/page/`` |
| 193 | + To: ``/new/page`` |
| 194 | + |
| 195 | +The from path will be normalized to ``/page``, |
| 196 | +and the filename to match will also be normalized before matching it. |
| 197 | +This is similar to what Netlify does: |
| 198 | +https://docs.netlify.com/routing/redirects/redirect-options/#trailing-slash. |
| 199 | + |
| 200 | +Page and exact redirects without a wildcard at the end will be normalized; |
| 201 | +all other redirects need to be matched as is. |
| 202 | + |
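The normalization described in this section can be sketched as follows. The helper names are hypothetical, and the sketch assumes that normalization means stripping trailing slashes from both the stored path and the requested path, while keeping the root path intact.

```python
def normalize_path(path: str) -> str:
    """Strip trailing slashes, keeping the root path as "/"."""
    stripped = path.rstrip("/")
    return stripped or "/"


def page_matches(from_url: str, requested: str) -> bool:
    """Compare both sides after normalization, so one redirect covers both forms."""
    return normalize_path(from_url) == normalize_path(requested)
```

A single redirect from ``/page/`` would then match requests for both ``/page`` and ``/page/``.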
| 203 | +Improving page redirects |
| 204 | +~~~~~~~~~~~~~~~~~~~~~~~~ |
| 205 | + |
| 206 | +- Allow redirecting to external domains. |
| 207 | + This can be useful to apply a redirect of a well known path |
| 208 | + in all versions to another domain. |
| 209 | + |
| 210 | + For example, redirect ``/security/`` to the project's security policy page on another domain. |
| 211 | + |
| 212 | + This new feature isn't strictly needed, |
| 213 | + but it will be useful to simplify the explanation of the feature |
| 214 | + (one less restriction to explain). |
| 215 | + |
| 216 | +Improving exact redirects |
| 217 | +~~~~~~~~~~~~~~~~~~~~~~~~~ |
| 218 | + |
| 219 | +- Explicitly place the ``$rest`` placeholder in the target URL, |
| 220 | + instead of adding it automatically. |
| 221 | + |
| 222 | + Sometimes users want to redirect to a different path. |
| 223 | + As a workaround, we have been adding a query parameter to the target URL to |
| 224 | + prevent the old path from being appended to the final path. |
| 225 | + For example ``/new/path/?_=``. |
| 226 | + |
| 227 | + Instead of adding the path automatically, |
| 228 | + users have to add the ``$rest`` placeholder in the target URL. |
| 229 | + For example: |
| 230 | + |
| 231 | + - From: ``/old/path/$rest`` |
| 232 | + To: ``/new/path/$rest`` |
| 233 | + |
| 234 | + - From: ``/old/path/$rest`` |
| 235 | + To: ``/new/path/?page=$rest&foo=bar`` |
| 236 | + |
| 237 | +- Per-domain redirects. |
| 238 | + Do users need this? |
| 239 | + The main problem is that we were applying the redirect |
| 240 | + to external domains. If we stop doing that, is this still needed? |
| 241 | + We can also try to improve how our built-in redirects work |
| 242 | + (especially our canonical domain redirect). |
| 243 | + |
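The explicit ``$rest`` placeholder proposed above for exact redirects can be sketched like this. The ``resolve`` function is hypothetical, and the sketch makes two simplifying assumptions: the wildcard is only allowed at the end of the from URL, and the captured rest of the path is substituted as-is, without URL-encoding it when it lands in a query string.

```python
from typing import Optional

WILDCARD = "$rest"


def resolve(from_url: str, to_url: str, path: str) -> Optional[str]:
    """Return the target for path, or None if the redirect doesn't match."""
    if not from_url.endswith(WILDCARD):
        # No wildcard: only an exact match redirects.
        return to_url if path == from_url else None
    prefix = from_url[: -len(WILDCARD)]
    if not path.startswith(prefix):
        return None
    rest = path[len(prefix):]
    # The rest of the path is only used where the target asks for it.
    return to_url.replace(WILDCARD, rest)
```

With this behavior, a target without ``$rest`` simply drops the old path, so a workaround like ``/new/path/?_=`` would no longer be needed.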
| 244 | +Improving Sphinx redirects |
| 245 | +~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| 246 | + |
| 247 | +These redirects are useful, but we should rename them to something more general, |
| 248 | +since they apply to all types of projects, not just Sphinx projects. |
| 249 | + |
| 250 | +Proposed names: |
| 251 | + |
| 252 | +- HTML URL to clean URL redirect (``file.html`` to ``file/``) |
| 253 | +- Clean URL to HTML URL redirect (``file/`` to ``file.html``) |
| 254 | + |
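The two proposed names map to simple path rewrites, which a minimal sketch can make concrete. The function names are hypothetical, and edge cases such as ``index.html`` or the root path are deliberately simplified, so this is not a description of the current implementation.

```python
from typing import Optional


def html_to_clean(path: str) -> Optional[str]:
    """HTML URL to clean URL: ``file.html`` becomes ``file/``."""
    if path.endswith(".html"):
        return path[: -len(".html")] + "/"
    return None


def clean_to_html(path: str) -> Optional[str]:
    """Clean URL to HTML URL: ``file/`` becomes ``file.html``."""
    if path.endswith("/") and path != "/":
        return path.rstrip("/") + ".html"
    return None
```

Neither rewrite depends on Sphinx, which is consistent with renaming these redirects to something more general.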
| 255 | +Other ideas to improve redirects |
| 256 | +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| 257 | + |
| 258 | +- Run forced redirects before built-in redirects. |
| 259 | + We currently run built-in redirects before forced redirects, |
| 260 | + this is a problem when moving a whole project to a new domain. |
| 261 | + For example, a forced redirect like ``/$rest`` |
| 262 | + won't work for the root URL of the project, |
| 263 | + since ``/`` will first redirect to ``/en/latest/``. |
| 264 | + |
| 265 | + But this shouldn't be a real problem, since users will still need to |
| 266 | + handle the ``/en/latest/file/`` paths. |
| 267 | + |
| 268 | +- Run redirects on the edge. |
| 269 | + Cloudflare allows us to create redirects on the edge, |
| 270 | + but they have some limitations around the number of |
| 271 | + redirect rules that can be created. |
| 272 | + |
| 273 | + And they will be useful for forced exact redirects only, |
| 274 | + since we can't match a redirect based on the response of the origin server. |
| 275 | + |
| 276 | +- Merge prefix redirects with exact redirects. |
| 277 | + Prefix redirects are the same as exact redirects with a wildcard at the end. |
| 278 | + |
| 279 | +- Placeholders. |
| 280 | + I haven't seen users requesting this feature. |
| 281 | + We can consider adding it in the future. |
| 282 | + Maybe we can expose the current language and version as placeholders. |
| 283 | + |
| 284 | +- Replace ``$rest`` with ``*`` in the from_url. |
| 285 | + This will be more consistent with other services, |
| 286 | + but it will require users to re-learn the feature. |
| 287 | + |
| 288 | +- Per-protocol redirects. |
| 289 | + We should push users to always use HTTPS. |
| 290 | + |
| 291 | +- Allow a prefix wildcard. |
| 292 | + We currently only allow a suffix wildcard; |
| 293 | + adding support for a prefix wildcard should be easy. |
| 294 | + But do users need this feature? |
| 295 | + |
| 296 | +Migration |
| 297 | +--------- |
| 298 | + |
| 299 | +Most of the proposed improvements are backwards compatible, |
| 300 | +and just need a data migration to normalize existing redirects. |
| 301 | + |
| 302 | +The exception is adding the ``$rest`` placeholder in the target URL explicitly, |
| 303 | +which requires users to re-learn how this feature works, i.e., they may be expecting |
| 304 | +to have the path added automatically in the target URL. |
| 305 | + |
| 306 | +We can create a small blog post explaining the changes. |