| 1 | +New search API |
| 2 | +============== |
| 3 | + |
| 4 | +Goals |
| 5 | +----- |
| 6 | + |
| 7 | +- Allow configuring search at the API level, |
| 8 | + instead of having the options in the database. |
| 9 | +- Allow searching a group of projects/versions at the same time. |
| 10 | +- Bring the same syntax to the dashboard search. |
| 11 | + |
| 12 | +Syntax |
| 13 | +------ |
| 14 | + |
| 15 | +The parameters will be given in the query using the ``key:value`` syntax, |
| 16 | +inspired by `GitHub <https://docs.github.com/en/rest/search>`__ and other services. |
| 17 | + |
| 18 | +Currently, the values of all parameters don't include spaces, |
| 19 | +so surrounding the value with quotes won't be supported (``key:"value"``). |
| 20 | + |
| 21 | +To avoid interpreting a query as a parameter, |
| 22 | +an escape character can be put in place, |
| 23 | +for example ``project\:docs`` won't be interpreted as |
| 24 | +a parameter, but as the search term ``project:docs``. |
| 25 | +This is only necessary if the query includes a valid parameter; |
| 26 | +unknown parameters (``foo:bar``) don't require escaping. |
| 27 | + |
| 28 | +All other tokens that don't match a valid parameter |
| 29 | +will be joined to form the final search term. |
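The tokenizing rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the actual implementation: ``parse_query`` and ``VALID_PARAMETERS`` are hypothetical names, the set of valid parameters is taken from the Parameters section of this document, and a parameter with an empty value (``project:``) is assumed to be treated as plain text.

```python
# Parameter names taken from the "Parameters" section of this document.
VALID_PARAMETERS = {"project", "subprojects", "user"}


def parse_query(query):
    """Split a raw query into ``(parameters, search_term)``.

    Hypothetical helper for illustration only.
    """
    parameters = []
    terms = []
    for token in query.split():
        key, sep, value = token.partition(":")
        if sep and value and key in VALID_PARAMETERS:
            # A valid ``key:value`` pair is consumed as a parameter.
            parameters.append((key, value))
            continue
        # An escaped separator (``project\:docs``) is kept as plain text,
        # with the escape character removed.
        escaped_key, escaped_sep, rest = token.partition("\\:")
        if escaped_sep and escaped_key in VALID_PARAMETERS:
            token = f"{escaped_key}:{rest}"
        # Everything else, including unknown parameters such as
        # ``foo:bar``, becomes part of the final search term.
        terms.append(token)
    return parameters, " ".join(terms)
```

For example, ``parse_query("a project:docs/stable search term")`` yields one ``project`` parameter with the value ``docs/stable`` and the search term ``a search term``.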
| 30 | + |
| 31 | +Parameters |
| 32 | +---------- |
| 33 | + |
| 34 | +project: |
| 35 | + Indicates the project and version |
| 36 | + to include results from (this doesn't include subprojects). |
| 37 | + If the version isn't provided, |
| 38 | + the default version is used. |
| 39 | + |
| 40 | + Examples: |
| 41 | + |
| 42 | + - ``project:docs/latest`` |
| 43 | + - ``project:docs`` |
| 44 | + |
| 45 | + There can be one or more ``project`` parameters. |
| 46 | + At least one is required. |
| 47 | + |
| 48 | + If the user doesn't have permission over a version, or if the version doesn't exist, |
| 49 | + we don't include results from that version. |
| 50 | + We don't fail the search, so users can use one endpoint for all their users |
| 51 | + without worrying about what permissions each user has, or updating it after a version or project |
| 52 | + has been deleted. |
| 53 | + |
| 54 | + The ``/`` is used as separator, |
| 55 | + but it could be any other character that isn't present in the slug of a version or project. |
| 56 | + ``:`` was considered (``project:docs:latest``), but it could be hard to read |
| 57 | + since ``:`` is already used to separate the key from the value. |
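As a small illustration of the ``/`` separator, the value of a ``project`` parameter could be split as follows. ``split_project_value`` is a hypothetical name, and resolving a missing version to the project's default version is assumed to happen elsewhere.

```python
def split_project_value(value):
    """Split ``docs/latest`` into ``("docs", "latest")``.

    Hypothetical helper: the version is ``None`` when it was omitted,
    so the caller can fall back to the project's default version.
    """
    project, _, version = value.partition("/")
    return project, version or None
```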
| 58 | + |
| 59 | +subprojects: |
| 60 | + This allows specifying exactly which project |
| 61 | + we are going to return subprojects from, |
| 62 | + and also include the version we are going to try to match. |
| 63 | + This includes the parent project in the results. |
| 64 | + |
| 65 | + As with the ``project`` parameter, the version is optional, |
| 66 | + and defaults to the default version of the parent project. |
| 67 | + |
| 68 | +user: |
| 69 | + Include results from projects the given user has access to. |
| 70 | + The only supported value is ``@me``, |
| 71 | + which is an alias for the current user. |
| 72 | + |
| 73 | +Including subprojects |
| 74 | +~~~~~~~~~~~~~~~~~~~~~ |
| 75 | + |
| 76 | +Now that we are returning results only |
| 77 | +from the given projects, we need an easy way to |
| 78 | +include results from subprojects. |
| 79 | +Some ideas for implementing this feature are: |
| 80 | + |
| 81 | +``include-subprojects:true`` |
| 82 | + This doesn't make it clear which |
| 83 | + projects we are going to include subprojects from. |
| 84 | + We could make it so it returns subprojects for all projects. |
| 85 | + Users will probably use this with one project only. |
| 86 | + |
| 87 | +``subprojects:project/version`` (inclusive) |
| 88 | + This allows specifying exactly which project |
| 89 | + we are going to return subprojects from, |
| 90 | + and also include the version we are going to try to match. |
| 91 | + This includes the parent project in the results. |
| 92 | + |
| 93 | + As with the ``project`` parameter, the version is optional, |
| 94 | + and defaults to the default version of the parent project. |
| 95 | + |
| 96 | +``subprojects:project/version`` (exclusive) |
| 97 | + This is the same as the above, |
| 98 | + but it doesn't include the parent project in the results. |
| 99 | + If we want to include the results from the project, then |
| 100 | + the query will be ``project:project/latest subprojects:project/latest``. |
| 101 | + Is this useful? |
| 102 | + |
| 103 | +The second option was chosen, since that's the current behavior |
| 104 | +of our search when searching on a project with subprojects, |
| 105 | +and avoids having to repeat the project if the user wants to |
| 106 | +include it in the search too. |
| 107 | + |
| 108 | +Cache |
| 109 | +----- |
| 110 | + |
| 111 | +Since the request could be attached to more than one project, |
| 112 | +we will return the full list of projects for the cache tags, |
| 113 | +that is, ``project1, project1:version, project2, project2:version``. |
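A minimal sketch of how that tag list could be built, assuming the final search is available as ``(project_slug, version_slug)`` pairs. The ``cache_tags`` name and the input shape are assumptions made for illustration.

```python
def cache_tags(projects):
    """Build the cache tags for the projects used in a search.

    ``projects`` is an iterable of ``(project_slug, version_slug)`` pairs
    (a hypothetical input shape). Each project contributes a tag for the
    project itself and one for the project/version pair.
    """
    tags = []
    for project, version in projects:
        for tag in (project, f"{project}:{version}"):
            # Keep insertion order and skip duplicates.
            if tag not in tags:
                tags.append(tag)
    return tags
```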
| 114 | + |
| 115 | +CORS |
| 116 | +---- |
| 117 | + |
| 118 | +Since the request could be attached to more than one project, |
| 119 | +we can't easily decide from the middleware if CORS should be enabled on a given request, |
| 120 | +so we won't allow cross site requests when using the new API for now. |
| 121 | +We would need to refactor our CORS code |
| 122 | +so every view can decide if CORS should be allowed or not. |
| 123 | +For this case, cross site requests would be allowed only if all versions of the final search are public. |
| 124 | +Another alternative could be to always allow cross site requests, |
| 125 | +but when a request is cross site, we only return results from public versions. |
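The two alternatives discussed above could look roughly like this. This is a sketch only: the ``Version`` class with an ``is_public`` flag is a stand-in for whatever object represents a version, and neither function reflects existing CORS code.

```python
from dataclasses import dataclass


@dataclass
class Version:
    """Hypothetical stand-in for a version and its privacy level."""

    slug: str
    is_public: bool


def allow_cors(versions):
    """First alternative: allow CORS only if every version is public.

    An empty search is assumed to not allow CORS.
    """
    versions = list(versions)
    return bool(versions) and all(v.is_public for v in versions)


def filter_for_cross_site(versions, is_cross_site):
    """Second alternative: always allow CORS, but restrict cross site
    requests to results from public versions."""
    if is_cross_site:
        return [v for v in versions if v.is_public]
    return list(versions)
```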
| 126 | + |
| 127 | +Analytics |
| 128 | +--------- |
| 129 | + |
| 130 | +We will record the same query for each project that was used in the final search. |
| 131 | + |
| 132 | +Response |
| 133 | +-------- |
| 134 | + |
| 135 | +The response will be similar to the old one, |
| 136 | +but will include extra information about the search, |
| 137 | +like the projects, versions, and the query that were used in the final search. |
| 138 | + |
| 139 | +And the ``version``, ``project``, and ``project_alias`` attributes will |
| 140 | +now be objects. |
| 141 | + |
| 142 | +We could just re-use the old response too, |
| 143 | +since the only breaking changes would be the attributes now being objects, |
| 144 | +and we aren't adding any new information to those objects (yet). |
| 145 | +But also, re-using the current serializers shouldn't be a problem either. |
| 146 | + |
| 147 | +.. code-block:: json |
| 148 | + |
| 149 | + { |
| 150 | + "count": 1, |
| 151 | + "next": null, |
| 152 | + "previous": null, |
| 153 | + "projects": [ |
| 154 | + { |
| 155 | + "slug": "docs", |
| 156 | + "versions": [ |
| 157 | + { |
| 158 | + "slug": "latest" |
| 159 | + } |
| 160 | + ] |
| 161 | + } |
| 162 | + ], |
| 163 | + "query": "The final query used in the search", |
| 164 | + "results": [ |
| 165 | + { |
| 166 | + "type": "page", |
| 167 | + "project": { |
| 168 | + "slug": "docs", |
| 169 | + "alias": null |
| 170 | + }, |
| 171 | + "version": { |
| 172 | + "slug": "latest" |
| 173 | + }, |
| 174 | + "title": "Main Features", |
| 175 | + "path": "/en/latest/features.html", |
| 176 | + "domain": "https://docs.readthedocs.io", |
| 177 | + "highlights": { |
| 178 | + "title": [] |
| 179 | + }, |
| 180 | + "blocks": [ |
| 181 | + { |
| 182 | + "type": "section", |
| 183 | + "id": "full-text-search", |
| 184 | + "title": "Full-Text Search", |
| 185 | + "content": "We provide search across all the projects that we host. This actually comes in two different search experiences: dashboard search on the Read the Docs dashboard and in-doc search on documentation sites, using your own theme and our search results. We offer a number of search features: Search across subprojects Search results land on the exact content you were looking for Search across projects you have access to (available on Read the Docs for Business) A full range of search operators including exact matching and excluding phrases. Learn more about Server Side Search.", |
| 186 | + "highlights": { |
| 187 | + "title": [ |
| 188 | + "Full-<span>Text</span> Search" |
| 189 | + ], |
| 190 | + "content": [] |
| 191 | + } |
| 192 | + }, |
| 193 | + { |
| 194 | + "type": "domain", |
| 195 | + "role": "http:post", |
| 196 | + "name": "/api/v3/projects/", |
| 197 | + "id": "post--api-v3-projects-", |
| 198 | + "content": "Import a project under authenticated user. Example request: BashPython$ curl \\ -X POST \\ -H \"Authorization: Token <token>\" https://readthedocs.org/api/v3/projects/ \\ -H \"Content-Type: application/json\" \\ -d @body.json import requests import json URL = 'https://readthedocs.org/api/v3/projects/' TOKEN = '<token>' HEADERS = {'Authorization': f'token {TOKEN}'} data = json.load(open('body.json', 'rb')) response = requests.post( URL, json=data, headers=HEADERS, ) print(response.json()) The content of body.json is like, { \"name\": \"Test Project\", \"repository\": { \"url\": \"https://github.com/readthedocs/template\", \"type\": \"git\" }, \"homepage\": \"http://template.readthedocs.io/\", \"programming_language\": \"py\", \"language\": \"es\" } Example response: See Project details Note Read the Docs for Business, also accepts", |
| 199 | + "highlights": { |
| 200 | + "name": [], |
| 201 | + "content": [ |
| 202 | + ", json=data, headers=HEADERS, ) print(response.json()) The content of body.json is like, \"name\": \"<span>Test</span>\"" |
| 203 | + ] |
| 204 | + } |
| 205 | + } |
| 206 | + ] |
| 207 | + } |
| 208 | + ] |
| 209 | + } |
| 210 | + |
| 211 | +Examples |
| 212 | +-------- |
| 213 | + |
| 214 | +- ``project:docs project:dev/latest test``: search for ``test`` in the default |
| 215 | + version of the ``docs`` project, and in the latest version of the ``dev`` project. |
| 216 | +- ``a project:docs/stable search term``: search for ``a search term`` in the |
| 217 | + stable version of the ``docs`` project. |
| 218 | + |
| 219 | +- ``project:docs project\:project/version``: search for ``project:project/version`` in the |
| 220 | + default version of the ``docs`` project. |
| 221 | + |
| 222 | +- ``search``: invalid, at least one project is required. |
| 223 | + |
| 224 | +Dashboard search |
| 225 | +---------------- |
| 226 | + |
| 227 | +This is the search feature that you can access from |
| 228 | +the readthedocs.org/readthedocs.com domains. |
| 229 | + |
| 230 | +We have two types: |
| 231 | + |
| 232 | +Project scoped search: |
| 233 | + Search files and versions of the current project only. |
| 234 | + |
| 235 | +Global search: |
| 236 | + Search files and versions of all projects in .org, |
| 237 | + and only the projects the user has access to in .com. |
| 238 | + |
| 239 | + Global search also allows searching projects by name/description. |
| 240 | + |
| 241 | +This search also allows you to see the number of results |
| 242 | +from other projects/versions/sphinx domains (facets). |
| 243 | + |
| 244 | +Project scoped search |
| 245 | +~~~~~~~~~~~~~~~~~~~~~ |
| 246 | + |
| 247 | +Here the new syntax won't have any effect, |
| 248 | +since we are searching the files of one project only. |
| 249 | + |
| 250 | +Another approach could be linking to the global search |
| 251 | +with ``project:{project.slug}`` filled in the query. |
| 252 | + |
| 253 | +Global search (projects) |
| 254 | +~~~~~~~~~~~~~~~~~~~~~~~~ |
| 255 | + |
| 256 | +We can keep the project search as is, |
| 257 | +without using the new syntax (since it doesn't make sense there). |
| 258 | + |
| 259 | +Global search (files) |
| 260 | +~~~~~~~~~~~~~~~~~~~~~ |
| 261 | + |
| 262 | +Using the same syntax from the API will be allowed. |
| 263 | +By default it will search all projects in .org, |
| 264 | +and all projects the user has access to in .com. |
| 265 | + |
| 266 | +Another approach could be to allow |
| 267 | +filtering by user on .org, that is, ``user:stsewd`` or ``user:@me``, |
| 268 | +so a user can search all their projects easily. |
| 269 | +We could allow just ``@me`` to start. |
| 270 | + |
| 271 | +Facets |
| 272 | +~~~~~~ |
| 273 | + |
| 274 | +We will support only the ``projects`` facet to start. |
| 275 | + |
| 276 | +We can keep the facets, but they would be a little different, |
| 277 | +since with the new syntax we need to specify a project in order to search for |
| 278 | +a version, i.e., we can't search all ``latest`` versions of all projects. |
| 279 | + |
| 280 | +By default we will use/show the ``project`` facet, |
| 281 | +and after the user has filtered by a project, |
| 282 | +we will use/show the ``version`` facet. |
| 283 | + |
| 284 | +If the user searches more than one project, |
| 285 | +things get complicated: should we keep showing the ``version`` facet? |
| 286 | +If clicked, should we change the version on all the projects? |
| 287 | + |
| 288 | +If that is too complicated to explain/implement, |
| 289 | +we should be fine by just supporting the ``project`` |
| 290 | +facet for now. |
| 291 | + |
| 292 | +Backwards compatibility |
| 293 | +~~~~~~~~~~~~~~~~~~~~~~~ |
| 294 | + |
| 295 | +We should be able to keep the old URLs working in the global search, |
| 296 | +but we could also just ignore the old syntax, or transform |
| 297 | +the old syntax to the new one and redirect the user to it, |
| 298 | +for example ``?q=test&project=docs&version=latest`` |
| 299 | +would be transformed to ``?q=test project:docs/latest``. |
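The transformation from the old URL parameters to the new syntax could be sketched like this. ``translate_old_query`` is a hypothetical helper that only handles the ``q``, ``project``, and ``version`` parameters from the example above, and expects the query string without the leading ``?``.

```python
from urllib.parse import parse_qs, urlencode


def translate_old_query(query_string):
    """Rewrite ``q=test&project=docs&version=latest`` into the new syntax.

    Hypothetical helper: returns a URL-encoded query string whose ``q``
    value carries the ``project:`` parameter inline.
    """
    params = parse_qs(query_string)
    q = params.get("q", [""])[0]
    project = params.get("project", [None])[0]
    version = params.get("version", [None])[0]
    if not project:
        # Nothing to translate, keep the search term as is.
        return urlencode({"q": q})
    value = f"{project}/{version}" if version else project
    return urlencode({"q": f"{q} project:{value}".strip()})
```

The returned string is URL-encoded, so decoding its ``q`` value gives back ``test project:docs/latest`` for the example above.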
| 300 | + |
| 301 | +Future features |
| 302 | +--------------- |
| 303 | + |
| 304 | +- Allow searching on several versions of the same project |
| 305 | + (the API response is prepared to support this). |
| 306 | +- Allow searching on all versions of a project easily, |
| 307 | + with a syntax like ``project:docs/*`` or ``project:docs/@all``. |
| 308 | +- Allow specifying the type of search: |
| 309 | + |
| 310 | + - Multi match (query as is) |
| 311 | + - Simple query string (allows using the ES query syntax) |
| 312 | + - Fuzzy search (same as multi match, but with fuzziness) |
| 313 | + |
| 314 | +- Add the ``org`` filter, |
| 315 | + so users can search all projects that belong |
| 316 | + to an organization. |
| 317 | + We would show results of the default versions of each project. |