Search: improve results for simple queries #7194

stsewd · 2020-06-15T22:19:02Z

SimpleQueryString doesn't allow us to set an implicit value for fuzziness,
but it is still useful for advanced queries.
This allows us to support both.

Results look good in local testing and when compared with the results
from production.
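
A minimal sketch of the idea, written as plain Elasticsearch request bodies rather than the code in this PR. The function name `build_query`, the field list, and the `"AUTO"` fuzziness value are assumptions for illustration; only the `simple_query_string` and `multi_match` query types and the `fuzziness` parameter of `multi_match` are standard Elasticsearch.

```python
# Characters that simple_query_string treats as operators.
ADVANCED_TOKENS = set('+|-"*()~')


def build_query(query, fields):
    """Return an Elasticsearch query body for ``query`` (illustrative only)."""
    if ADVANCED_TOKENS & set(query):
        # The user typed an operator: keep simple_query_string so it still works.
        return {"simple_query_string": {"query": query, "fields": fields}}
    # Plain words: multi_match accepts a fuzziness setting for the whole query.
    return {
        "multi_match": {
            "query": query,
            "fields": fields,
            "fuzziness": "AUTO",  # assumed value, not taken from the PR
        }
    }
```

With this split, a query such as `"exact phrase"` keeps its operator semantics, while a plain word gets fuzzy matching without the user having to write `~` by hand.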

ericholscher

I think this will be a huge benefit to our less advanced users. We should be careful rolling this out though, because I do think users will be somewhat confused by this change.

I wonder if we could allow projects to configure this a bit. As a future feature, but it would be neat to let projects decide on the "fuzziness" of their search, for example.

readthedocs/search/tests/test_xss.py

readthedocs/search/tests/test_api.py

ericholscher · 2020-06-15T22:50:16Z

readthedocs/search/faceted_search.py

+        """
+        tokens = {'+', '|', '-', '"', '*', '(', ')', '~'}
+        query_tokens = set(query)
+        return not tokens.isdisjoint(query_tokens)
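
For readers following along, the excerpt above can be tried as a standalone function. This is a sketch: the name `is_advanced_query` is hypothetical, since the diff excerpt does not show the method signature.

```python
def is_advanced_query(query):
    """Return True if ``query`` contains any simple_query_string operator."""
    tokens = {'+', '|', '-', '"', '*', '(', ')', '~'}
    # set(query) is the set of individual characters in the string.
    query_tokens = set(query)
    return not tokens.isdisjoint(query_tokens)


print(is_advanced_query('python docs'))     # False
print(is_advanced_query('"exact phrase"'))  # True
print(is_advanced_query('in-doc search'))   # True: the hyphen counts as an operator
```

Because the check is per character, any query containing a hyphen or parenthesis is treated as advanced, even when the user did not intend an operator.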


readthedocs/search/faceted_search.py

ericholscher · 2020-06-15T22:54:12Z

I wonder if it's worth putting this behind a feature flag for testing -- I do think it might cause some confusion, especially on the .com

stsewd · 2020-06-15T23:12:36Z

> I do think it might cause some confusion, especially on the .com

Not sure what would be confusing. Actually, I'm getting results more quickly in the search as you type, since I don't need to type the full word or remember the exact word in English. But I'm fine with putting this behind a feature flag; let me know if you want to go that way.

ericholscher · 2020-06-15T23:32:33Z

I think it's more that the results will change, so if people are used to a certain result, it might not show up as high in the results.

Do you find it lists all exact matches prior to partial? It seems like the XSS test suggests that isn't the case?

stsewd · 2020-06-15T23:41:31Z

> I think it's more that the results will change, so if people are used to a certain result, it might not show up as high in the results.

OK, got it. A feature flag makes sense then; we could see how the results change in our docs (though in local testing with my frequent queries everything looked good).

> Do you find it lists all exact matches prior to partial? It seems like the XSS test suggests that isn't the case?

The XSS query was listing more results without "", not fewer, if that is what you mean.

ericholscher · 2020-06-16T15:20:26Z

> The XSS query was listing more results without "", not fewer. If that is what you mean.

Right, but presumably the additional results were partial matches? My question is: do all full matches rank higher in the results than any partial matches? If not, that would be an issue.
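
One way to check this in a test is to compare scores directly. The sketch below assumes each hit is a dict with a `_score` key, as Elasticsearch returns, and that the caller supplies a hypothetical `is_exact` predicate for deciding whether a hit is a full match. Neither the helper name nor the sample hits come from the codebase.

```python
def exact_matches_rank_first(hits, is_exact):
    """Return True if every exact match scores at least as high as every partial match."""
    exact_scores = [hit["_score"] for hit in hits if is_exact(hit)]
    partial_scores = [hit["_score"] for hit in hits if not is_exact(hit)]
    if not exact_scores or not partial_scores:
        return True  # nothing to compare
    return min(exact_scores) >= max(partial_scores)


# Made-up hits for illustration only.
hits = [
    {"_score": 4.8, "title": "Support"},
    {"_score": 1.2, "title": "Supported versions"},
]
print(exact_matches_rank_first(hits, lambda hit: hit["title"] == "Support"))  # True
```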

stsewd · 2020-06-16T17:34:32Z

OK, I just checked; the first result is what we expect:

(Pdb) pp hits[0]
{'_id': '5',
 '_index': 'test_page_index',
 '_score': 4.838786,
 '_source': {'build': None,
             'commit': '5',
             'domains': [{'role_name': 'py:function', 'anchor': 'celery.utils.depreca...},
                         {'role_name': 'py:function', 'anchor': 'celery.utils.depreca...}],
             'full_path': 'support.html',
             'path': 'support',
             'project': 'docs',
             'sections': [{'id': 'usage-questions', 'title': 'Usage Questions', 'conte...},
                          {'id': 'community-support', 'title': 'Community Support', 'c...},
                          {'id': 'commercial-support', 'title': 'Commercial Support', ...}],
             'title': 'Support',
             'version': 'latest'},
 '_type': 'doc',
 'inner_hits': {'domains': <Response: {}>,
                'sections': <Response: [{'_index': 'test_page_index', '_type': 'doc', '_id': '5', '_...}]>}}

That's from the page we expect https://github.com/readthedocs/readthedocs.org/blob/master/readthedocs/search/tests/data/docs/support.json#L18

I have added the feature flag to the in-doc search. We can't add the feature flag to the dashboard search because it isn't related to a project but to a user (it defaults to the old behavior, so we are fine).

Search: improve results for simple queries

450d97b

stsewd requested review from a team and ericholscher June 15, 2020 22:19

ericholscher approved these changes Jun 15, 2020

stsewd added 2 commits June 15, 2020 18:01

Use kwargs

5632361

Assert we are actually getting results

9400d06

stsewd added 4 commits June 16, 2020 12:03

Add feature flag

32dd134

Merge branch 'master' into improve-search-simple-queries

b0ea5c8

Fix test

fc5c14c

Fix test

9c2fe89

stsewd merged commit 3fc1fc7 into master Jun 16, 2020

stsewd deleted the improve-search-simple-queries branch June 16, 2020 17:59

stsewd mentioned this pull request Jun 23, 2020

Search related settings in the configuration file #7217

Closed
