Skip to content

Improved support for refresh during updates #323

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Mar 19, 2021

Conversation

chosak
Copy link
Contributor

@chosak chosak commented Feb 12, 2021

This PR includes two commits that both improve the ability to refresh during updates.

First, it adds the ability to pass refresh='wait_for' to Document.update(). Currently you can either pass nothing (in which case no refresh happens) or refresh=True, in which case the refresh happens right away. Per the docs, it's also possible to specify refresh='wait_for' to

[w]ait for the changes made by the request to be made visible by a refresh before replying. This doesn’t force an immediate refresh, rather, it waits for a refresh to happen.

Second, it adds a new --refresh parameter to the search_index management command. Currently when running that command, there's no way to force a refresh to happen. This can be useful, for example, if you want to programmatically call the command from some Django tests that fill an index, and then immediately run tests against that index. Without setting refresh=True, there's no guarantee that the refreshed index will be available to search.

See commit messages for additional details.

The Elasticsearch bulk APIs support a configurable 'refresh' parameter
that can be set to:

1. empty string / true : perform refresh after operation completes
2. 'wait_for': wait until next scheduled refresh
3. false (default): don't do any refresh

See documentation at:

https://www.elastic.co/guide/en/elasticsearch/reference/master/docs-refresh.html

Currently the `Document.update()` API accepts a `refresh` parameter
that, if `True`, passes `True` to the ES bulk API. There isn't currently
a way to pass 'wait_for'.

This commit modifies how `Document.update()` works so that it is
possible to pass `refresh='wait_for'`. I've implemented this in a
generic way, as opposed to hardcoding logic for 'wait_for', to support
future changes to this ES bulk API.

Note that, in keeping with the current convention, if
`refresh='wait_for'` is passed, this will override the value of
auto_refresh set either via setting or via the Document class. This
gives developers the maximum configurability here.
Currently when updating or rebuilding an index, the `refresh` parameter
is never set, regardless if settings.ELASTICSEARCH_DSL_AUTO_REFRESH is
set or if auto_refresh is set on the Document class.

This commit adds the ability to pass `--refresh` to the `search_index`
management command, to force a refresh when the index is updated.

This can be useful, for example, if you want to programmatically call
that command from some Django tests that fill an index, and then
immediately run tests against that index. Without setting
`refresh=True`, there's no guarantee that the refreshed index will be
available to search.
@chosak chosak force-pushed the feature/refresh-improvements branch from abb42d2 to b22d27a Compare February 12, 2021 19:23
@safwanrahman
Copy link
Collaborator

@chosak Thanks a lot for the PR. I think it was better to create a issue before submitting a patch. So we could have discussed how this could be implemented!

As you have submitted the patch, I will review it and will surely try to merge it to master. Thanks

@safwanrahman
Copy link
Collaborator

@chosak I have couple of simple issue about this PR, other than that it looks really good. Can you fix that so we can get it merged?

@chosak
Copy link
Contributor Author

chosak commented Mar 7, 2021

👍 @safwanrahman thanks for the review!

Copy link
Collaborator

@safwanrahman safwanrahman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. r+ from my end! Thanks a lot! 🚀

@safwanrahman safwanrahman merged commit 540ed35 into django-es:master Mar 19, 2021
@chosak chosak deleted the feature/refresh-improvements branch April 21, 2021 14:57
chosak added a commit to chosak/consumerfinance.gov that referenced this pull request Apr 22, 2021
This commit upgrades our use of the django-elasticsearch-dsl package
from version 7.1.4 to new version 7.2.0:

https://github.com/django-es/django-elasticsearch-dsl/releases/tag/7.2.0

This allows us to remove some patching code in Elasticsearch tests
related to refreshing indexes deterministically after an update, the
result of this PR:

django-es/django-elasticsearch-dsl#323
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants