[Showerthought] Use virtual indexes for zero-downtime rebuilds? #75
Comments
We have used a (self-made) feature like this on several projects as well, because migrations happen along the way for both the DB and the search engine. We used raw numbers instead of timestamps for simplicity (at least in our case it was simpler). |
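A minimal sketch of the raw-number naming scheme, for illustration only. The function name `next_index_name`, the `.` separator, and the `cars` alias are assumptions made for this example, not part of the library or of the setup described above.

```python
import re


def next_index_name(alias, existing_indices):
    """Return the next numbered index name for `alias`, e.g. 'cars.3'.

    `existing_indices` is any iterable of index names already present.
    Names that do not look like '<alias>.<number>' are ignored.
    """
    pattern = re.compile(rf"^{re.escape(alias)}\.(\d+)$")
    versions = []
    for name in existing_indices:
        match = pattern.match(name)
        if match:
            versions.append(int(match.group(1)))
    # Start at 1 when no numbered index exists yet.
    return f"{alias}.{max(versions, default=0) + 1}"
```

A counter like this sorts and compares more easily than a timestamp, at the cost of having to look up the existing indices first.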
One note: I believe that if this is introduced, it needs to be done explicitly, with a management command or something similar. |
I need this feature for a current project. Is this feature still desired in the library? If so I can start a PR. From what I understand the command accepts an index name argument to build in the background, and the alias name argument. The command creates and builds the new index. When the new index is finished rebuilding the alias will be updated to point to the new index. Is this the desired behavior? |
I would suggest that the alias name and potentially the new index name suffix are configurable. For example adding a This will then allow people to put whatever meaning to their reindexes that they need to capture, the index alias could default to |
Good thinking. Should the user be able to create an alias for each model so the virtual reindexing could be done for each model in the registry in one command? Building off your suggestion, perhaps the CLI could look like:
How does this sound? |
Sounds perfect |
@ezbc if you take a look at https://github.com/rtfd/readthedocs.org/pull/4368/files#diff-2859d2a6db2d38d6545b0ecadbae2f61R58 it looks like @safwanrahman has already done all of this along with making it celery based in the readthedocs project. We probably would want to heavily borrow this |
Thanks @ezbc for your interest. Yes, this feature is very much desired. |
Thanks for pointing out the PR for RTD. I’ll get started on this feature this week and bring up any issues or questions along the way. |
@josh-stableprice and @safwanrahman I'm wondering if we should always delete the old index or not after a successful population of a new index. One use case I can think of for keeping indexes is if a user wanted to verify the new index before switching over the alias. If we did not delete the old index automatically that would open a can of worms for the user to manage existing indexes, e.g. change aliases and delete old indexes. One option is to automatically delete the old index for now and add the functionality later for a user to not delete the old index and add commands to manage the old indexes. What are your thoughts? |
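To make the two options concrete, here is a rough sketch of the request body for Elasticsearch's `_aliases` endpoint with a flag that controls what happens to the old index. The function name and the `delete_old` parameter are hypothetical. The `add`, `remove`, and `remove_index` actions are the ones the aliases API documents, but check them against the Elasticsearch version in use.

```python
def replace_index_actions(alias, old_index, new_index, delete_old=True):
    """Build an `_aliases` body that moves `alias` to `new_index`.

    With delete_old=True the old index is deleted in the same request
    (which also drops its alias). With delete_old=False the old index
    is kept and only loses the alias, so it can be inspected or
    cleaned up later.
    """
    actions = [{"add": {"index": new_index, "alias": alias}}]
    if old_index is not None:
        if delete_old:
            actions.append({"remove_index": {"index": old_index}})
        else:
            actions.append({"remove": {"index": old_index, "alias": alias}})
    return {"actions": actions}
```

Keeping the old index leaves a rollback path, but as noted above it also means someone has to manage the leftovers.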
Your last thought basically hits the nail on the head. However, don't kill
yourself trying to implement all of that in one go unless you have the spare
time. For now, I'd just implement a virtual index, with automatic replacement
if indexing had no errors.
(Hope this makes sense; I answered this as soon as I woke up.)
|
@josh-stableprice or @safwanrahman, I'm getting back into this now. I'm considering if the aliases should all be updated in the same transaction after each model has a new rebuilt index. This seems like the safest option to me in case an app with multiple models deploys breaking changes for the model indexes. What do you think? |
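One way to read "the same transaction" is a single `_aliases` request that carries the actions for every model. As far as I know, Elasticsearch applies the actions of one such request atomically. The sketch below only builds the request body. The function name and the shape of the `rebuilds` mapping are assumptions made for this example.

```python
def swap_all_aliases_body(rebuilds):
    """Build one `_aliases` body covering every rebuilt model index.

    `rebuilds` maps alias -> (old_index, new_index). A new_index of
    None marks a rebuild that did not finish; in that case nothing is
    switched and a ValueError is raised.
    """
    failed = [alias for alias, (_, new) in rebuilds.items() if new is None]
    if failed:
        raise ValueError(f"rebuild incomplete for: {', '.join(sorted(failed))}")

    actions = []
    for alias, (old_index, new_index) in rebuilds.items():
        if old_index is not None:
            actions.append({"remove": {"index": old_index, "alias": alias}})
        actions.append({"add": {"index": new_index, "alias": alias}})
    return {"actions": actions}
```

Because the body is only built once every model has a finished index, a failed rebuild for one model leaves all aliases pointing at their old indexes.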
I would agree if it makes it that much safer, as that's the overarching goal
of using the index aliases.
|
Right now, when you rebuild an index, the index is nuked first, then rebuilt from scratch. During this reindexing process, any searches against the index might fail.
Instead, you could use "virtual indexes" to perform a rebuild without downtime. By that, I mean that you create a real index with a different name, e.g. index_name.<timestamp>. You can then create an alias for index_name and point it to the real index.
When rebuilding the index, you could create a new index in the background, populate it, then switch the aliases over. That way, the application can still use the old index while the new index is being created.
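A rough sketch of the idea, assuming a UTC timestamp suffix and the `add`/`remove` actions of Elasticsearch's `_aliases` endpoint. Both function names and the timestamp format are made up for this example. Sending the body to the cluster (for instance through the Python client's alias-update call) is left out so the snippet has no dependencies.

```python
from datetime import datetime, timezone


def timestamped_index_name(alias, now=None):
    """Return a concrete index name such as 'index_name.20190611005800'."""
    now = now or datetime.now(timezone.utc)
    return f"{alias}.{now:%Y%m%d%H%M%S}"


def swap_alias_actions(alias, old_index, new_index):
    """Build an `_aliases` body that repoints `alias` to `new_index`.

    `old_index` may be None on the very first build, when the alias
    does not point anywhere yet.
    """
    actions = []
    if old_index is not None:
        actions.append({"remove": {"index": old_index, "alias": alias}})
    actions.append({"add": {"index": new_index, "alias": alias}})
    return {"actions": actions}
```

The application keeps querying index_name throughout. Only the alias target changes once the new index is fully populated.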
Most Elasticsearch applications I know use something like this and I'm willing to contribute something similar to this project. However, before I do that, I would like to know whether this is a desirable feature to have or whether it's unnecessary complexity for a generic library.