Skip to content

Documentation fix? #1744

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
Stexxen opened this issue Mar 26, 2021 · 5 comments
Closed

Documentation fix? #1744

Stexxen opened this issue Mar 26, 2021 · 5 comments
Labels
status: waiting-for-triage An issue we've not yet triaged

Comments

@Stexxen
Copy link
Contributor

Stexxen commented Mar 26, 2021

While performing some testing on Streams in Repositories i.e.

Stream<Thing> findAllBySomething();

I came up against the max scroll Context error refered to in #406 and #1337
Trying to create too many scroll contexts. Must be less than or equal to: [500]

At the bottom of Section 11.2 the following is stated
image

If the Stream is not closed and you are able to create 500 scroll contexts within the 60 seconds it triggers the exception.

I wondered if it was worth explaining that a try-with-resources should be used to make sure the Stream was closed and therefore the scroll context?

@spring-projects-issues spring-projects-issues added the status: waiting-for-triage An issue we've not yet triaged label Mar 26, 2021
@Stexxen
Copy link
Contributor Author

Stexxen commented Mar 26, 2021

Here is a quick test-case, that will trigger the max scroll context exception

@SpringBootApplication
@Import(EConfig.class)
public class ScrollContext implements CommandLineRunner {

  public static void main(String[] args) {
    SpringApplication.run(ScrollContext.class).close();
  }

  @Autowired
  ElasticScrollRepository repo;

  @Override
  public void run(String... args) throws Exception {
    for (int i=0; i< 600; i++) {
      List<TestRecord> list = repo.findAllByVal(i).collect(Collectors.toList());
    }
  }

}

@Configuration
@EnableElasticsearchRepositories(basePackageClasses = ElasticScrollRepository.class)
class EConfig extends AbstractElasticsearchConfiguration {

  @Override
  public RestHighLevelClient elasticsearchClient() {
    final ClientConfiguration clientConfiguration = ClientConfiguration.builder()
                                                            .connectedTo("localhost:9200")
                                                            .build();
    return RestClients.create(clientConfiguration).rest();
  }
}

@Repository
interface ElasticScrollRepository extends ElasticsearchRepository<TestRecord, String> {

  Stream<TestRecord> findAllByVal(Integer val);
}

@Document(indexName = "scroll_index_test")
class TestRecord {

  @Id
  public String id;

  @Field(name = "val", type = FieldType.Integer)
  public Integer val;

}

if the run() method is changed to

  @Override
  public void run(String... args) throws Exception {
    for (int i=0; i< 600; i++) {
      try(Stream<TestRecord> stream = repo.findAllByVal(i)) {
        List<TestRecord> list = stream.collect(Collectors.toList());
      }
    }
  }

then no exceptions are raised.

But you have to know that with ElasticSearch the Stream must be closed so the scroll context is also closed, and so I wondered if it was worth making that point clearer?

FYI : This was running with Spring Boot Parent 2.4.4 against ES 7.12

@sothawo
Copy link
Collaborator

sothawo commented Mar 26, 2021

normally in this line

 List<TestRecord> list = repo.findAllByVal(i).collect(Collectors.toList());

when the list is complete, the close() of the Stream should have been called by the Collector, which in turn would then under the hood close the Elasticsearch resources as well. I will have to investigate this.

That Streams should be closed is nothing special to Spring Data Elasticsearch, you should always close the resources you create somehow, be it file resources, streams or whatever. I do not see the necessity to add this to the Spring Data Elasticsearch docs.

@Stexxen
Copy link
Contributor Author

Stexxen commented Mar 26, 2021

Yes that was started this, I expected the Collector to close the stream for me, but it does not. Which means it must now be split from the one-liner to a try-with-resources.

In regard to your point about Streams being closed. I would disagree, even the JavaDocs suggest closing a Stream is the exception not the rule.

From https://docs.oracle.com/javase/8/docs/api/java/util/stream/Stream.html

Streams have a BaseStream.close() method and implement AutoCloseable, but nearly all stream instances do not actually need to be closed after use. Generally, only streams whose source is an IO channel (such as those returned by Files.lines(Path, Charset)) will require closing. Most streams are backed by collections, arrays, or generating functions, which require no special resource management. (If a stream does require closing, it can be declared as a resource in a try-with-resources statement.)

And this particular SO Post https://stackoverflow.com/questions/38698182/close-java-8-stream is an example where ever answer and comment has a different opinion.

I was just thinking that it would clarify things to say that the Streams created in the ElasticSearch JPA Respositories are IO streams and must be closed after use.

@sothawo
Copy link
Collaborator

sothawo commented Mar 27, 2021

Fair point with the hint that most streams do not require a close operation and it should be documented.

But instead of adding this behaviour to the docs I will better change the implementation that reads the data from Elasticsearch to automatically close the scroll context when all results are returned so that an explicit close is not necessary.

I created #1745 for this, will do it later today.

@sothawo
Copy link
Collaborator

sothawo commented Mar 27, 2021

Implemented with #1746, now you don't have to call close, this is done automatically. Thanks for finding this.

@sothawo sothawo closed this as completed Mar 27, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
status: waiting-for-triage An issue we've not yet triaged
Projects
None yet
Development

No branches or pull requests

3 participants