Skip to content

Latest commit

 

History

History
177 lines (136 loc) · 6.54 KB

elasticsearch-misc.adoc

File metadata and controls

177 lines (136 loc) · 6.54 KB

Miscellaneous Elasticsearch Operation Support

This chapter covers additional support for Elasticsearch operations that cannot be directly accessed via the repository interface. It is recommended to add those operations as custom implementation as described in [repositories.custom-implementations] .

Index settings

When creating Elasticsearch indices with Spring Data Elasticsearch different index settings can be defined by using the @Setting annotation. The following arguments are available:

  • useServerConfiguration does not send any settings parameters, so the Elasticsearch server configuration determines them.

  • settingPath refers to a JSON file defining the settings that must be resolvable in the classpath

  • shards the number of shards to use, defaults to 1

  • replicas the number of replicas, defaults to 1

  • refreshIntervall, defaults to "1s"

  • indexStoreType, defaults to "fs"

It is as well possible to define index sorting (check the linked Elasticsearch documentation for the possible field types and values):

@Document(indexName = "entities")
@Setting(
  sortFields = { "secondField", "firstField" },                                  (1)
  sortModes = { Setting.SortMode.max, Setting.SortMode.min },                    (2)
  sortOrders = { Setting.SortOrder.desc, Setting.SortOrder.asc },
  sortMissingValues = { Setting.SortMissing._last, Setting.SortMissing._first })
class Entity {
    @Nullable
    @Id private String id;

    @Nullable
    @Field(name = "first_field", type = FieldType.Keyword)
    private String firstField;

    @Nullable @Field(name = "second_field", type = FieldType.Keyword)
    private String secondField;

    // getter and setter...
}
  1. when defining sort fields, use the name of the Java property (firstField), not the name that might be defined for Elasticsearch (first_field)

  2. sortModes, sortOrders and sortMissingValues are optional, but if they are set, the number of entries must match the number of sortFields elements

Index Mapping

When Spring Data Elasticsearch creates the index mapping with the IndexOperations.createMapping() methods, it uses the annotations described in [elasticsearch.mapping.meta-model.annotations], especially the @Field annotation. In addition to that it is possible to add the @Mapping annotation to a class. This annotation has the following properties:

  • mappingPath a classpath resource in JSON format; if this is not empty it is used as the mapping, no other mapping processing is done.

  • enabled when set to false, this flag is written to the mapping and no further processing is done.

  • dateDetection and numericDetection set the corresponding properties in the mapping when not set to DEFAULT.

  • dynamicDateFormats when this String array is not empty, it defines the date formats used for automatic date detection.

  • runtimeFieldsPath a classpath resource in JSON format containing the definition of runtime fields which is written to the index mappings, for example:

{
  "day_of_week": {
    "type": "keyword",
    "script": {
      "source": "emit(doc['@timestamp'].value.dayOfWeekEnum.getDisplayName(TextStyle.FULL, Locale.ROOT))"
    }
  }
}

Filter Builder

Filter Builder improves query speed.

private ElasticsearchOperations operations;

IndexCoordinates index = IndexCoordinates.of("sample-index");

SearchQuery searchQuery = new NativeSearchQueryBuilder()
  .withQuery(matchAllQuery())
  .withFilter(boolFilter().must(termFilter("id", documentId)))
  .build();

Page<SampleEntity> sampleEntities = operations.searchForPage(searchQuery, SampleEntity.class, index);

Using Scroll For Big Result Set

Elasticsearch has a scroll API for getting big result set in chunks. This is internally used by Spring Data Elasticsearch to provide the implementations of the <T> SearchHitsIterator<T> SearchOperations.searchForStream(Query query, Class<T> clazz, IndexCoordinates index) method.

IndexCoordinates index = IndexCoordinates.of("sample-index");

SearchQuery searchQuery = new NativeSearchQueryBuilder()
  .withQuery(matchAllQuery())
  .withFields("message")
  .withPageable(PageRequest.of(0, 10))
  .build();

SearchHitsIterator<SampleEntity> stream = elasticsearchTemplate.searchForStream(searchQuery, SampleEntity.class, index);

List<SampleEntity> sampleEntities = new ArrayList<>();
while (stream.hasNext()) {
  sampleEntities.add(stream.next());
}

stream.close();

There are no methods in the SearchOperations API to access the scroll id, if it should be necessary to access this, the following methods of the ElasticsearchRestTemplate can be used:

@Autowired ElasticsearchRestTemplate template;

IndexCoordinates index = IndexCoordinates.of("sample-index");

SearchQuery searchQuery = new NativeSearchQueryBuilder()
  .withQuery(matchAllQuery())
  .withFields("message")
  .withPageable(PageRequest.of(0, 10))
  .build();

SearchScrollHits<SampleEntity> scroll = template.searchScrollStart(1000, searchQuery, SampleEntity.class, index);

String scrollId = scroll.getScrollId();
List<SampleEntity> sampleEntities = new ArrayList<>();
while (scroll.hasSearchHits()) {
  sampleEntities.addAll(scroll.getSearchHits());
  scrollId = scroll.getScrollId();
  scroll = template.searchScrollContinue(scrollId, 1000, SampleEntity.class);
}
template.searchScrollClear(scrollId);

To use the Scroll API with repository methods, the return type must defined as Stream in the Elasticsearch Repository. The implementation of the method will then use the scroll methods from the ElasticsearchTemplate.

interface SampleEntityRepository extends Repository<SampleEntity, String> {

    Stream<SampleEntity> findBy();

}

Sort options

In addition to the default sort options described [repositories.paging-and-sorting] Spring Data Elasticsearch has a GeoDistanceOrder class which can be used to have the result of a search operation ordered by geographical distance.

If the class to be retrieved has a GeoPoint property named location, the following Sort would sort the results by distance to the given point:

Sort.by(new GeoDistanceOrder("location", new GeoPoint(48.137154, 11.5761247)))