Upgrade Elasticsearch to 6.8.3 #6309

Closed
wants to merge 4 commits into from
2 changes: 1 addition & 1 deletion .travis.yml
Original file line number Diff line number Diff line change
@@ -5,7 +5,7 @@ dist: xenial
matrix:
include:
- python: 3.6
-env: TOXENV=py36,codecov ES_VERSION=6.2.4 ES_DOWNLOAD_URL=https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-${ES_VERSION}.tar.gz
+env: TOXENV=py36,codecov ES_VERSION=6.8.3 ES_DOWNLOAD_URL=https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-${ES_VERSION}.tar.gz
- python: 3.6
env: TOXENV=docs
- python: 3.6
10 changes: 5 additions & 5 deletions docs/development/search.rst
@@ -4,21 +4,21 @@ Search
Read The Docs uses Elasticsearch_ instead of the built in Sphinx search for providing better search
results. Documents are indexed in the Elasticsearch index and the search is made through the API.
All the Search Code is open source and lives in the `GitHub Repository`_.
-Currently we are using the `Elasticsearch 6.3`_ version.
+Currently we are using the `Elasticsearch 6.8.3`_ version.

Local Development Configuration
-------------------------------

Installing and running Elasticsearch
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-You need to install and run Elasticsearch_ version 6.3 on your local development machine.
+You need to install and run Elasticsearch_ version 6.8.3 on your local development machine.
You can get the installation instructions
-`here <https://www.elastic.co/guide/en/elasticsearch/reference/6.3/install-elasticsearch.html>`_.
+`here <https://www.elastic.co/guide/en/elasticsearch/reference/6.8/install-elasticsearch.html>`_.
Otherwise, you can also start an Elasticsearch Docker container by running the following command::

docker run -p 9200:9200 -p 9300:9300 \
-e "discovery.type=single-node" \
-docker.elastic.co/elasticsearch/elasticsearch:6.3.2
+docker.elastic.co/elasticsearch/elasticsearch:6.8.3
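
Once the container from the `docker run` command is up, it can help to confirm that the node reports the expected release. The following is a minimal sketch, assuming Elasticsearch listens on `http://localhost:9200` as in the port mapping above. The helper names `parse_es_version`, `is_expected_series`, and `fetch_local_version` are hypothetical and are not part of Read the Docs.

```python
import json
from urllib.request import urlopen


def parse_es_version(body):
    """Extract the version string from the JSON returned by the root endpoint."""
    info = json.loads(body)
    return info['version']['number']


def is_expected_series(version, series='6.8'):
    """Return True if ``version`` belongs to the given major.minor series."""
    return version.startswith(series + '.')


def fetch_local_version(url='http://localhost:9200'):
    """Query a running node (assumed URL) and return its version string."""
    with urlopen(url, timeout=5) as response:
        return parse_es_version(response.read().decode('utf-8'))


# Trimmed sample of the root endpoint response, used here instead of a live call.
sample_body = '{"version": {"number": "6.8.3"}}'
sample_version = parse_es_version(sample_body)
```

With a node running, `is_expected_series(fetch_local_version())` would report whether the local instance is in the 6.8 series.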

Indexing into Elasticsearch
^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -102,7 +102,7 @@ As per requirements of `django-elasticsearch-dsl`_, it is stored in the


.. _Elasticsearch: https://www.elastic.co/products/elasticsearch
-.. _Elasticsearch 6.3: https://www.elastic.co/guide/en/elasticsearch/reference/6.3/index.html
+.. _Elasticsearch 6.8.3: https://www.elastic.co/guide/en/elasticsearch/reference/6.8/index.html
.. _GitHub Repository: https://github.com/readthedocs/readthedocs.org/tree/master/readthedocs/search
.. _the Elasticsearch document: https://www.elastic.co/guide/en/elasticsearch/guide/current/document.html
.. _django-elasticsearch-dsl: https://github.com/sabricot/django-elasticsearch-dsl
16 changes: 8 additions & 8 deletions readthedocs/core/static-src/core/js/doc-embed/search.js
@@ -136,9 +136,9 @@ function attach_elastic_search_query(data) {
if(inner_hits[j].type === "sections") {

section = inner_hits[j];
-section_subtitle = section._source.title;
-section_subtitle_link = link + "#" + section._source.id;
-section_content = [section._source.content.substr(0, MAX_SUBSTRING_LIMIT) + " ..."];
+section_subtitle = section.source.title;
+section_subtitle_link = link + "#" + section.source.id;
+section_content = [section.source.content.substr(0, MAX_SUBSTRING_LIMIT) + " ..."];

if (section.highlight) {
if (section.highlight["sections.title"]) {
@@ -173,15 +173,15 @@ function attach_elastic_search_query(data) {
if (inner_hits[j].type === "domains") {

domain = inner_hits[j];
-domain_role_name = domain._source.role_name;
-domain_subtitle_link = link + "#" + domain._source.anchor;
-domain_name = domain._source.name;
+domain_role_name = domain.source.role_name;
+domain_subtitle_link = link + "#" + domain.source.anchor;
+domain_name = domain.source.name;
domain_subtitle = "";
domain_content = "";
domain_docstrings = "";

-if (domain._source.docstrings !== "") {
-domain_docstrings = domain._source.docstrings.substr(0, MAX_SUBSTRING_LIMIT) + " ...";
+if (domain.source.docstrings !== "") {
+domain_docstrings = domain.source.docstrings.substr(0, MAX_SUBSTRING_LIMIT) + " ...";
}

if (domain.highlight) {
2 changes: 1 addition & 1 deletion readthedocs/core/static/core/js/readthedocs-doc-embed.js

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion readthedocs/projects/static/projects/js/tools.js

Large diffs are not rendered by default.

6 changes: 1 addition & 5 deletions readthedocs/search/api.py
@@ -47,11 +47,7 @@ def get_inner_hits(self, obj):
sections = inner_hits.sections or []
domains = inner_hits.domains or []
all_results = itertools.chain(sections, domains)
-
-sorted_results = utils._get_sorted_results(
-results=all_results,
-source_key='_source',
-)
+sorted_results = utils._get_sorted_results(results=all_results)

log.debug('[API] Sorted Results: %s', sorted_results)
return sorted_results
18 changes: 11 additions & 7 deletions readthedocs/search/documents.py
@@ -1,9 +1,11 @@
import logging

from django.conf import settings
-from django_elasticsearch_dsl import DocType, Index, fields
+from django_elasticsearch_dsl import Document, fields
+from django_elasticsearch_dsl.registries import registry

from elasticsearch import Elasticsearch
+from elasticsearch_dsl import Index

from readthedocs.projects.models import HTMLFile, Project

@@ -30,8 +32,9 @@ def update(self, *args, **kwargs):
super().update(*args, **kwargs)


-@project_index.doc_type
-class ProjectDocument(RTDDocTypeMixin, DocType):
+@registry.register_document
+@project_index.document
+class ProjectDocument(RTDDocTypeMixin, Document):

# Metadata
url = fields.TextField(attr='get_absolute_url')
@@ -45,7 +48,7 @@ class ProjectDocument(RTDDocTypeMixin, DocType):

modified_model_field = 'modified_date'

-class Meta:
+class Django:
model = Project
fields = ('name', 'slug', 'description')
ignore_signals = True
@@ -64,8 +67,9 @@ def faceted_search(cls, query, user, language=None):
return ProjectSearch(**kwargs)


-@page_index.doc_type
-class PageDocument(RTDDocTypeMixin, DocType):
+@registry.register_document
+@page_index.document
+class PageDocument(RTDDocTypeMixin, Document):

# Metadata
project = fields.KeywordField(attr='project.slug')
@@ -102,7 +106,7 @@ class PageDocument(RTDDocTypeMixin, DocType):

modified_model_field = 'modified_date'

-class Meta:
+class Django:
model = HTMLFile
fields = ('commit', 'build')
ignore_signals = True
4 changes: 2 additions & 2 deletions readthedocs/search/faceted_search.py
@@ -84,7 +84,7 @@ def query(self, search, query):
class ProjectSearchBase(RTDFacetedSearch):
facets = {'language': TermsFacet(field='language')}
doc_types = [ProjectDocument]
-index = ProjectDocument._doc_type.index
+index = ProjectDocument._index._name
fields = ('name^10', 'slug^5', 'description')
operators = ['and', 'or']

@@ -99,7 +99,7 @@ class PageSearchBase(RTDFacetedSearch):
),
}
doc_types = [PageDocument]
-index = PageDocument._doc_type.index
+index = PageDocument._index._name

_outer_fields = ['title^4']
_section_fields = ['sections.title^3', 'sections.content']
@@ -64,7 +64,7 @@ def _run_reindex_tasks(self, models, queue):
app_label = queryset.model._meta.app_label
model_name = queryset.model.__name__

-index_name = doc._doc_type.index
+index_name = doc._index._name
new_index_name = "{}_{}".format(index_name, timestamp)
# Set index temporarily for indexing,
# this will only get set during the running of this command
8 changes: 4 additions & 4 deletions readthedocs/search/tasks.py
@@ -44,17 +44,17 @@ def index_objects_to_es(

if index_name:
# Hack the index name temporarily for reindexing tasks
-old_index_name = document._doc_type.index
-document._doc_type.index = index_name
+old_index_name = document._index._name
+document._index._name = index_name
log.info('Replacing index name %s with %s', old_index_name, index_name)

log.info("Indexing model: %s, '%s' objects", model.__name__, queryset.count())
doc_obj.update(queryset.iterator())

if index_name:
log.info('Undoing index replacement, settings %s with %s',
-document._doc_type.index, old_index_name)
-document._doc_type.index = old_index_name
+document._index._name, old_index_name)
+document._index._name = old_index_name


@app.task(queue='web')
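
The temporary index-name swap in `index_objects_to_es` is not undone if `doc_obj.update(...)` raises. One possible way to guarantee the restore is a context manager with `try`/`finally`. This is a minimal sketch rather than the code in this PR: `temporary_index_name` is a hypothetical helper, the document object is a stand-in that only mimics the `_index._name` attribute, and the index names are sample values.

```python
from contextlib import contextmanager
from types import SimpleNamespace


@contextmanager
def temporary_index_name(document, index_name):
    """Point ``document`` at ``index_name`` for the duration of the block."""
    old_index_name = document._index._name
    document._index._name = index_name
    try:
        yield old_index_name
    finally:
        # Runs even if indexing raises, so the original name is always restored.
        document._index._name = old_index_name


# Hypothetical stand-in for a Document class exposing ``_index._name``.
fake_document = SimpleNamespace(_index=SimpleNamespace(_name='page_index'))

with temporary_index_name(fake_document, 'page_index_20190101'):
    name_during = fake_document._index._name

name_after = fake_document._index._name
```

Inside the `with` block the document points at the temporary index; afterwards it points at the original one again, whether or not the block raised.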
28 changes: 18 additions & 10 deletions readthedocs/search/utils.py
@@ -106,7 +106,7 @@ def _get_index(indices, index_name):
:return: DED Index
"""
for index in indices:
-if str(index) == index_name:
+if index._name == index_name:
return index


@@ -156,15 +156,23 @@ def _indexing_helper(html_objs_qs, wipe=False):
delete_objects_in_es.delay(**kwargs)


-def _get_sorted_results(results, source_key='_source'):
+def _get_sorted_results(results):
     """Sort results according to their score and returns results as list."""
-    sorted_results = [
-        {
-            'type': hit._nested.field,
-            source_key: hit._source.to_dict(),
-            'highlight': hit.highlight.to_dict() if hasattr(hit, 'highlight') else {}
-        }
-        for hit in sorted(results, key=attrgetter('_score'), reverse=True)
-    ]
+    sorted_results = []
+
+    for hit in sorted(results, key=attrgetter('meta.score'), reverse=True):
+        try:
+            highlight = hit.meta.highlight.to_dict()
+        except Exception:
+            highlight = {}
+
+        try:
+            sorted_results.append({
+                'type': hit.meta.nested.field,
+                'source': hit.to_dict(),
+                'highlight': highlight,
+            })
+        except Exception:
+            return []

     return sorted_results
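
The new `_get_sorted_results` relies on `operator.attrgetter` accepting a dotted path (`'meta.score'`) and always emits the key `'source'`, which is what the updated `search.js` reads. The sketch below illustrates that shape with stand-in objects. `FakeHit` is hypothetical and only mimics the `meta.score`, `meta.nested.field`, `meta.highlight`, and `to_dict()` attributes used in the diff. It also replaces the broad `except Exception` with a `getattr` default, which assumes a missing highlight surfaces as an `AttributeError`; that is an alternative for illustration, not the behaviour of this PR.

```python
from operator import attrgetter
from types import SimpleNamespace


class FakeHit:
    """Hypothetical stand-in for an inner hit returned by the search."""

    def __init__(self, field, score, source, highlight=None):
        self.meta = SimpleNamespace(
            nested=SimpleNamespace(field=field),
            score=score,
        )
        if highlight is not None:
            self.meta.highlight = highlight
        self._source = source

    def to_dict(self):
        return dict(self._source)


def get_sorted_results(results):
    """Return hits as plain dicts, highest score first."""
    sorted_results = []
    # attrgetter follows the dotted path: hit.meta.score
    for hit in sorted(results, key=attrgetter('meta.score'), reverse=True):
        # Highlights are optional, so fall back to an empty dict.
        highlight = getattr(hit.meta, 'highlight', None) or {}
        sorted_results.append({
            'type': hit.meta.nested.field,
            'source': hit.to_dict(),
            'highlight': dict(highlight),
        })
    return sorted_results


hits = [
    FakeHit('sections', 1.5, {'id': 'intro', 'title': 'Intro'}),
    FakeHit('domains', 3.0, {'anchor': 'foo', 'name': 'foo'},
            highlight={'domains.name': ['<em>foo</em>']}),
]
results = get_sorted_results(hits)
```

Here the `domains` hit comes first because it has the higher score, and the `sections` hit gets an empty `highlight` dict.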
7 changes: 1 addition & 6 deletions readthedocs/search/views.py
@@ -115,12 +115,7 @@ def elastic_search(request, project_slug=None):
sections = inner_hits.sections or []
domains = inner_hits.domains or []
all_results = itertools.chain(sections, domains)
-
-sorted_results = utils._get_sorted_results(
-results=all_results,
-source_key='source',
-)
-
+sorted_results = utils._get_sorted_results(results=all_results,)
result.meta.inner_hits = sorted_results
except Exception:
log.exception('Error while sorting the results (inner_hits).')
21 changes: 3 additions & 18 deletions requirements/pip.txt
@@ -45,24 +45,9 @@ django-allauth==0.40.0
GitPython==3.0.3

# Search
-elasticsearch==6.4.0 # pyup: <7.0.0
-
-
-# elasticsearch-dsl==6.3.1 produces this error
-# File "/home/travis/build/rtfd/readthedocs.org/.tox/py36/lib/python3.6/site-packages/django_elasticsearch_dsl/documents.py", line 8, in <module>
-# from elasticsearch_dsl.document import DocTypeMeta as DSLDocTypeMeta
-# ImportError: cannot import name 'DocTypeMeta'
-#
-# Commit 97e3f75 adds the NestedFacet
-git+https://github.com/elastic/elasticsearch-dsl-py@97e3f756a8cacd1c863d3ced3d17abcafbb0f85e#egg=elasticsearch-dsl==6.1.1
-
-# django-elasticsearch-dsl==6.4.1 produces this error
-# File "/home/travis/build/readthedocs/readthedocs.org/.tox/py36/lib/python3.6/site-packages/django_elasticsearch_dsl/__init__.py", line 3, in <module>
-# from .documents import DocType # noqa
-# File "/home/travis/build/readthedocs/readthedocs.org/.tox/py36/lib/python3.6/site-packages/django_elasticsearch_dsl/documents.py", line 7, in <module>
-# from elasticsearch_dsl import Document as DSLDocument
-# ImportError: cannot import name 'Document'
-django-elasticsearch-dsl==0.5.1 # pyup: ignore
+elasticsearch>=6.0.0,<7.0.0 # pyup: ignore
Member

Better to pin an explicit version; otherwise the build can break when a new version is released.

+elasticsearch-dsl>=6.0.0,<7.0.0 # pyup: ignore
+django-elasticsearch-dsl>=6.0.0,<7.0.0 # pyup: ignore
selectolax==0.2.1
orjson==2.0.7
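
To illustrate the review comment on the version ranges: a range such as `>=6.0.0,<7.0.0` accepts many releases, so two installs at different times may resolve to different versions, while an `==` pin accepts exactly one. The sketch below is a simplified comparison on dotted integers, not pip's actual resolver. `parse` and `in_range` are hypothetical helpers, and the candidate strings are sample inputs only (several are taken from the old requirements comments).

```python
def parse(version):
    """Turn '6.4.0' into (6, 4, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split('.'))


def in_range(version, lower='6.0.0', upper='7.0.0'):
    """Mimic the specifier '>=lower,<upper' for plain dotted versions."""
    return parse(lower) <= parse(version) < parse(upper)


# Sample version strings, used only as inputs for the comparison.
candidates = ['0.5.1', '6.1.1', '6.3.1', '6.4.0', '6.4.1', '7.0.0']

allowed_by_range = [v for v in candidates if in_range(v)]
allowed_by_pin = [v for v in candidates if v == '6.4.0']
```

The range admits four of the sample versions, whereas the pin admits one, which is why a pinned requirement gives reproducible installs.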
