-
Notifications
You must be signed in to change notification settings - Fork 269
Improved support for refresh during updates #323
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improved support for refresh during updates #323
Conversation
The Elasticsearch bulk APIs support a configurable 'refresh' parameter that can be set to: 1. empty string / true : perform refresh after operation completes 2. 'wait_for': wait until next scheduled refresh 3. false (default): don't do any refresh See documentation at: https://www.elastic.co/guide/en/elasticsearch/reference/master/docs-refresh.html Currently the `Document.update()` API accepts a `refresh` parameter that, if `True`, passes `True` to the ES bulk API. There isn't currently a way to pass 'wait_for'. This commit modifies how `Document.update()` works so that it is possible to pass `refresh='wait_for'`. I've implemented this in a generic way, as opposed to hardcoding logic for 'wait_for', to support future changes to this ES bulk API. Note that, in keeping with the current convention, if `refresh='wait_for'` is passed, this will override the value of auto_refresh set either via setting or via the Document class. This gives developers the maximum configurability here.
Currently when updating or rebuilding an index, the `refresh` parameter is never set, regardless if settings.ELASTICSEARCH_DSL_AUTO_REFRESH is set or if auto_refresh is set on the Document class. This commit adds the ability to pass `--refresh` to the `search_index` management command, to force a refresh when the index is updated. This can be useful, for example, if you want to programmatically call that command from some Django tests that fill an index, and then immediately run tests against that index. Without setting `refresh=True`, there's no guarantee that the refreshed index will be available to search.
abb42d2
to
b22d27a
Compare
@chosak Thanks a lot for the PR. I think it was better to create a issue before submitting a patch. So we could have discussed how this could be implemented! As you have submitted the patch, I will review it and will surely try to merge it to master. Thanks |
@chosak I have couple of simple issue about this PR, other than that it looks really good. Can you fix that so we can get it merged? |
👍 @safwanrahman thanks for the review! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. r+ from my end! Thanks a lot! 🚀
This commit upgrades our use of the django-elasticsearch-dsl package from version 7.1.4 to new version 7.2.0: https://github.com/django-es/django-elasticsearch-dsl/releases/tag/7.2.0 This allows us to remove some patching code in Elasticsearch tests related to refreshing indexes deterministically after an update, the result of this PR: django-es/django-elasticsearch-dsl#323
This PR includes two commits that both improve the ability to refresh during updates.
First, it adds the ability to pass
refresh='wait_for'
toDocument.update()
. Currently you can either pass nothing (in which case no refresh happens) orrefresh=True
, in which case the refresh happens right away. Per the docs, it's also possible to specifyrefresh='wait_for'
toSecond, it adds a new
--refresh
parameter to the search_index management command. Currently when running that command, there's no way to force a refresh to happen. This can be useful, for example, if you want to programmatically call the command from some Django tests that fill an index, and then immediately run tests against that index. Without settingrefresh=True
, there's no guarantee that the refreshed index will be available to search.See commit messages for additional details.