Unable to set _id in bulk index with raw source documents #2861
Comments
Sorry, the code I originally posted contained that mistake; please see my edited post. The problem is not a compilation error (that was a copy-and-paste slip on my part) but the "withId" call: whatever value you set there is ignored in the bulk request.
Is this fix working?
I tried to bulk index a batch of raw JSON records into ES and needed to set a custom _id value for each of them. Indexing individually works: calling "IndexQueryBuilder().withId(some_id_value)" and then the individual index method sets the _id. The "bulkIndex" method, however, ignores the _id value set this way.
Here's the code that ignores the ".withId" call:
It would be useful (if not essential) for the user to be able to set the _id for each individual record sent in the bulk request.
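For reference, the Elasticsearch Bulk API itself carries a per-document _id in the action line that precedes each source line of the newline-delimited request body. The sketch below only illustrates that wire format; it is not Spring Data Elasticsearch code and says nothing about how "bulkIndex" builds its request. The class and method names (BulkBodySketch, build, jsonEscape) and the sample index and ids are hypothetical, and each raw source is assumed to be a single-line JSON document.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of any library: builds an NDJSON bulk body
// in which every raw source document is preceded by an action line with its _id.
final class BulkBodySketch {

    // Escapes characters that would break a JSON string literal.
    static String jsonEscape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                default:
                    if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
                    else sb.append(c);
            }
        }
        return sb.toString();
    }

    // Keys are the desired _id values; values are raw JSON sources.
    // Assumption: each source is valid JSON on a single line (no newlines).
    static String build(String index, Map<String, String> rawSourcesById) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> e : rawSourcesById.entrySet()) {
            // Action line: tells Elasticsearch which index and _id to use.
            body.append("{\"index\":{\"_index\":\"").append(jsonEscape(index))
                .append("\",\"_id\":\"").append(jsonEscape(e.getKey()))
                .append("\"}}\n");
            // Source line: the raw document, passed through unchanged.
            body.append(e.getValue()).append('\n');
        }
        return body.toString();
    }

    public static void main(String[] args) {
        Map<String, String> docs = new LinkedHashMap<>();
        docs.put("order-1", "{\"total\":10}");
        docs.put("order-2", "{\"total\":25}");
        System.out.print(build("orders", docs));
    }
}
```

Running main prints two action/source line pairs, one pair per document. A bulk request that omits "_id" from the action line lets Elasticsearch generate an id for that document.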
I was able to loop over the individual IndexQuery objects and send them to ES one by one, which sets the _id value correctly, but that increases processing time considerably: in my scenario of ~2m JSON records, elapsed time rises from 15-20 minutes (in batches of 2000 records) to ~3 hours.
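The batching described above can be sketched as follows. With roughly 2 million records and batches of 2000, this comes to about 1,000 bulk requests instead of about 2 million individual ones. The class and method names (BatchSketch, partition) are hypothetical, and the sketch is generic: in the scenario above the list elements would be the IndexQuery objects, and each resulting batch would be passed to one bulk call.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of any library: splits a list into
// consecutive batches of at most batchSize elements.
final class BatchSketch {

    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            // Copy the slice so each batch is independent of the input list.
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }
}
```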