Hi, I'm opening this issue to report something likely to be a bug.
When a SearchRequest is created for the searchForStream method, the maxResults parameter overwrites size, while it seems like it shouldn't.
Bug
I set both the maxResult and Pageable.size (referred to below as pageSize) fields on the query and pass it to SearchOperations.searchForStream.
When requestConverter.searchRequest inside the method creates a SearchRequest, maxResult overwrites pageSize.
In the code below, builder.size is called first with query.getPageable().getPageSize(), followed by another call with query.getMaxResults(), overwriting the result of the first call.
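To make the overwrite concrete, here is a minimal sketch of the call order described above. It does not use the real Elasticsearch client: SketchBuilder and requestSize are hypothetical stand-ins, and the null check on maxResults is an assumption about how the second call is guarded.

```java
final class SizeOverwriteSketch {

    // Hypothetical stand-in for a request builder; NOT the real client class.
    static final class SketchBuilder {
        private Integer size;

        SketchBuilder size(Integer value) {
            this.size = value; // each call replaces the previously set value
            return this;
        }

        Integer build() {
            return size;
        }
    }

    // Mirrors the order of calls described in this report:
    // the page size is set first, then maxResults is set on the same field.
    static int requestSize(int pageSize, Integer maxResults) {
        SketchBuilder builder = new SketchBuilder();
        builder.size(pageSize);
        if (maxResults != null) {     // assumption: second call only when maxResults is set
            builder.size(maxResults); // overwrites the page size set above
        }
        return builder.build();
    }

    public static void main(String[] args) {
        System.out.println(requestSize(10_000, 150_000)); // prints 150000
        System.out.println(requestSize(10_000, null));    // prints 10000
    }
}
```

With pageSize 10000 and maxResult 150000, the size that ends up in the request is 150000, which matches the behavior reported here.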
I was stuck at the case where:
I want to fetch at most 150,000 documents in total (out of 250k), using SearchOperations.searchForStream (maxResult 150000)
I want to fetch 10,000 documents per page (per scroll search request), which fills almost all of the response memory buffer (Pageable.ofSize(10000))
Result
Expected : 15 search requests in total, each search returning a page of 10k documents
Actual : a single search tries to fetch 150k documents, overflowing the response memory buffer and causing the exception below
Caused by: org.springframework.dao.DataAccessResourceFailureException: entity content is too long [169386197] for the configured buffer limit [104857600]; nested exception is java.lang.RuntimeException: entity content is too long [169386197] for the configured buffer limit [104857600]
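The numbers above can be checked with a few lines of arithmetic. This is only a sketch: requestsNeeded is a hypothetical helper, and the byte counts are copied from the exception message.

```java
final class ScrollRequestMath {

    // Number of page requests needed to stream `total` documents
    // when each request returns at most `pageSize` documents.
    static long requestsNeeded(long total, long pageSize) {
        return (total + pageSize - 1) / pageSize; // ceiling division
    }

    public static void main(String[] args) {
        // Expected behavior: 150,000 documents in pages of 10,000.
        System.out.println(requestsNeeded(150_000, 10_000)); // prints 15

        long responseBytes = 169_386_197L; // entity size from the exception message
        long bufferLimit = 104_857_600L;   // configured limit, equal to 100 * 1024 * 1024
        System.out.println(responseBytes > bufferLimit); // prints true
    }
}
```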
Possible solution
I think the two parameters have totally separate roles, so one shouldn't overwrite the other.
The pageSize parameter should decide how many documents I want ES to fetch per page,
while the maxResult parameter decides how many documents I want to fetch in total during the whole stream operation.
👉 Therefore, the actual pageSize should be Min(maxResult, pageSize) if maxResult is set
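A minimal sketch of that rule, assuming maxResult is nullable when it is not set. effectivePageSize is a hypothetical helper name, not a method of the library.

```java
final class EffectivePageSizeSketch {

    // Size to send with each search request:
    // the page size, capped by maxResults only when maxResults is set.
    static int effectivePageSize(int pageSize, Integer maxResults) {
        if (maxResults == null) {
            return pageSize; // no cap configured
        }
        return Math.min(maxResults, pageSize);
    }

    public static void main(String[] args) {
        System.out.println(effectivePageSize(10_000, 150_000)); // prints 10000
        System.out.println(effectivePageSize(10_000, 500));     // prints 500
        System.out.println(effectivePageSize(10_000, null));    // prints 10000
    }
}
```

Taking the minimum keeps the request small when maxResult is below the page size, and leaves the page size alone otherwise.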
Further explanation with code for your understanding
SearchOperations.searchForStream returns a CloseableIterator that uses the scroll API to search through a large set of data page by page.
Currently, maxResult is used solely to get maxCount for this method.
When currentCount reaches maxCount, the iterator is forced to return false for hasNext, even when there could be more documents available for searching.
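The interaction between page size and maxCount can be sketched with a self-contained iterator. This is an illustration under stated assumptions, not the library's implementation: CappedIterator, fetchPage and drain are hypothetical, documents are plain integers, and treating maxCount of 0 or less as "no cap" is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

final class CappedPagingSketch {

    // Fetches pages one at a time and stops handing out items after maxCount.
    static final class CappedIterator implements Iterator<Integer> {
        private final int totalAvailable; // documents in the hypothetical index
        private final int pageSize;       // documents per simulated request
        private final int maxCount;       // assumption: 0 or less means no cap
        private int fetched = 0;          // documents requested so far
        private int currentCount = 0;     // documents returned to the caller so far
        private int requests = 0;         // simulated search requests issued
        private Iterator<Integer> page = new ArrayList<Integer>().iterator();

        CappedIterator(int totalAvailable, int pageSize, int maxCount) {
            this.totalAvailable = totalAvailable;
            this.pageSize = pageSize;
            this.maxCount = maxCount;
        }

        @Override
        public boolean hasNext() {
            if (maxCount > 0 && currentCount >= maxCount) {
                return false; // cap reached, even if more documents exist
            }
            if (!page.hasNext() && fetched < totalAvailable) {
                page = fetchPage();
            }
            return page.hasNext();
        }

        @Override
        public Integer next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            currentCount++;
            return page.next();
        }

        // Simulates one scroll request returning at most pageSize documents.
        private Iterator<Integer> fetchPage() {
            requests++;
            List<Integer> docs = new ArrayList<>();
            int end = Math.min(fetched + pageSize, totalAvailable);
            for (int i = fetched; i < end; i++) {
                docs.add(i);
            }
            fetched = end;
            return docs.iterator();
        }
    }

    // Consumes the iterator and returns {documents returned, requests issued}.
    static int[] drain(int totalAvailable, int pageSize, int maxCount) {
        CappedIterator it = new CappedIterator(totalAvailable, pageSize, maxCount);
        int count = 0;
        while (it.hasNext()) {
            it.next();
            count++;
        }
        return new int[] {count, it.requests};
    }

    public static void main(String[] args) {
        int[] result = drain(250_000, 10_000, 150_000);
        System.out.println(result[0] + " documents in " + result[1] + " requests");
    }
}
```

In this sketch the cap only affects when iteration stops, while the page size alone decides how much each request returns, which is the separation of roles argued for in this issue.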
Changes made : https://github.com/hy2850/spring-data-elasticsearch/commits/%233089-size-and-maxResults/
Code referenced in this report (all at 0e5af90)
builder.size being called twice, first with the page size and then with maxResults: spring-data-elasticsearch/src/main/java/org/springframework/data/elasticsearch/client/elc/RequestConverter.java, lines 1469 to 1492
maxResult being used solely to get maxCount: spring-data-elasticsearch/src/main/java/org/springframework/data/elasticsearch/core/AbstractElasticsearchTemplate.java, lines 410 to 423
The iterator keeping track of how many documents it has fetched so far, in the currentCount instance variable: spring-data-elasticsearch/src/main/java/org/springframework/data/elasticsearch/core/StreamQueries.java, lines 130 to 137
hasNext being forced to return false once currentCount reaches maxCount: spring-data-elasticsearch/src/main/java/org/springframework/data/elasticsearch/core/StreamQueries.java, lines 107 to 116
So, I reached the conclusion that maxResult should only decide maxCount, not the pageSize, and this is a bug.