A Complete Guide to Using Elasticsearch with Spring Boot

In this post, I will cover the details of how to use Elasticsearch with Spring Boot. I will also cover the fundamentals of Elasticsearch and how it is used in the industry.

What is Elasticsearch?

Elasticsearch is a distributed, free and open search and analytics engine for all types of data, including textual, numerical, geospatial, structured, and unstructured.

It is built upon Apache Lucene. Elasticsearch is often part of the ELK stack (Elasticsearch, Logstash, and Kibana). One can use Elasticsearch to store, search, and manage data for:

  • Logs
  • Metrics
  • A search backend
  • Application Monitoring

Search has become central to many fields as data volumes keep growing. As most applications become data-intensive, it is important to search through a large volume of data with speed and flexibility. Elasticsearch offers both.

In this post, we will look at Spring Data Elasticsearch. It provides a simple interface for storing, searching, and analyzing data. We will show how we can use Spring Data to index and search log data.

Key Concepts of Elasticsearch

Elasticsearch has indexes, documents, and fields. The idea is simple and very similar to databases. Elasticsearch stores data as documents (rows) in indexes (database tables). A user can search through this data using fields (columns).

Usually, text in Elasticsearch passes through an analyzer that splits it into searchable terms. The default (standard) analyzer splits text on word boundaries, dropping most punctuation, and lowercases each term.
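To make the idea of analysis concrete, here is a minimal sketch in plain Java. It is not Lucene's actual algorithm (the real standard analyzer follows Unicode text segmentation rules and handles many more cases); the class name SimpleAnalyzerSketch and the sample log line are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Simplified stand-in for an analyzer. NOT Lucene's real implementation.
final class SimpleAnalyzerSketch {

    // Splits text on any run of characters that are neither letters nor digits,
    // then lowercases each resulting token.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String part : text.split("[^\\p{L}\\p{N}]+")) {
            if (!part.isEmpty()) { // split() can yield an empty leading element
                tokens.add(part.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [connection, refused, retrying, host, a]
        System.out.println(analyze("Connection refused, retrying Host-A"));
    }
}
```

A search for "refused" or "REFUSED" can then match the same document, because both the stored text and the search term are reduced to the same lowercase token.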

We will be using the spring-data-elasticsearch library to build the demo for this post. In Spring Data, a document is nothing but a POJO. We will annotate that class with the Elasticsearch annotations that Spring Data provides.

As said previously, Elasticsearch can store different types of data. Nevertheless, we will only be looking at simple text data in this demo.

Creating Spring Boot Application

Let’s create a simple Spring Boot application. We will be using the spring-boot-starter-data-elasticsearch dependency.


dependencies {
	implementation 'org.springframework.boot:spring-boot-starter-data-elasticsearch'
	implementation 'org.springframework.boot:spring-boot-starter-thymeleaf'
	implementation 'org.springframework.boot:spring-boot-starter-web'
	testImplementation 'org.springframework.boot:spring-boot-starter-test'
}

Next, we need to create an Elasticsearch client bean. There are two ways to create this bean.

The simpler way is to add the connection properties to application.properties.

spring.elasticsearch.rest.uris=localhost:9200
spring.elasticsearch.rest.connection-timeout=1s
spring.elasticsearch.rest.read-timeout=1m
spring.elasticsearch.rest.password=
spring.elasticsearch.rest.username=

But in our application, we will build this bean programmatically, using the Java High-Level REST Client (JHLC). JHLC is the standard Java client for Elasticsearch 7.x.


@Configuration
@EnableElasticsearchRepositories
public class ElasticsearchClientConfiguration extends AbstractElasticsearchConfiguration
{

    @Override
    @Bean
    public RestHighLevelClient elasticsearchClient ()
    {
        final ClientConfiguration clientConfiguration =
                ClientConfiguration.builder().connectedTo("localhost:9200").build();

        return RestClients.create(clientConfiguration).rest();
    }
}

With this, we have a client configuration. Here the host is hard-coded as localhost:9200, but it could equally be read from application.properties. We use RestClients to create the elasticsearchClient bean.

Additionally, we will be using LogData as our model. We will build a document for LogData to store in an index.


@Document(indexName = "logdataindex")
public class LogData
{
    @Id
    private String id;

    @Field(type = FieldType.Text, name = "host")
    private String host;

    @Field(type = FieldType.Date, name = "date")
    private Date date;

    @Field(type = FieldType.Text, name = "message")
    private String message;

    @Field(type = FieldType.Double, name = "size")
    private double size;

    @Field(type = FieldType.Text, name = "status")
    private String status;

    // Getters and Setters

}
  • @Document – specifies our index.
  • @Id – maps to the _id field of our document, which is unique for each message.
  • @Field – declares the name and Elasticsearch type of each field in our data.

There are two ways one can index or search data in Elasticsearch:

  1. Using Spring Data Repository
  2. Using ElasticsearchRestTemplate

Spring Data Repository with Elasticsearch

Spring Data repositories let us declare an interface and get simple CRUD methods for indexing and searching documents in Elasticsearch. But if you want more control over the queries, you might want to use ElasticsearchRestTemplate, which lets you build the queries yourself and write more efficient ones.

public interface LogDataRepository extends ElasticsearchRepository<LogData, String>
{
}

This repository provides basic CRUD methods; Spring supplies the implementation, so we do not have to write it ourselves.

Using ElasticsearchRestTemplate

If we want to use advanced queries such as aggregations or suggestions, we can use ElasticsearchRestTemplate. The Spring Data library provides this template.

  public List<LogData> getLogDatasByHost(String host) {
    // Full-text match on the "host" field
    Query query = new NativeSearchQueryBuilder()
        .withQuery(QueryBuilders.matchQuery("host", host))
        .build();
    SearchHits<LogData> searchHits = elasticsearchRestTemplate.search(query, LogData.class);

    // Unwrap each hit to the LogData document it carries
    return searchHits.get().map(SearchHit::getContent).collect(Collectors.toList());
  }

I will show more of ElasticsearchRestTemplate when we get to more complex queries.

ElasticsearchRestTemplate implements ElasticsearchOperations. It offers a few key operations that make it more flexible than Spring Data repositories.

index() and bulkIndex() index a single document or many documents at once. One can build an IndexQuery object and pass it to the index() method call.


  private ElasticsearchRestTemplate elasticsearchRestTemplate;

  // Return type as of Spring Data Elasticsearch 4.1+ (4.0 returned List<String>)
  public List<IndexedObjectInformation> createLogData(final List<LogData> logDataList) {

      // One IndexQuery per document, keyed by the document's id
      List<IndexQuery> queries = logDataList.stream()
      .map(logData ->
        new IndexQueryBuilder()
        .withId(logData.getId())
        .withObject(logData).build())
      .collect(Collectors.toList());

      return elasticsearchRestTemplate.bulkIndex(queries, IndexCoordinates.of("logdataindex"));
  }
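As background, bulk indexing ultimately goes through Elasticsearch's _bulk REST endpoint, which accepts newline-delimited JSON: an action line followed by the document source, repeated for each document. The plain-Java sketch below builds such a body by hand, only to show the wire format. The class name BulkBodySketch is hypothetical, and the sketch assumes the index name and ids need no JSON escaping.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrates the newline-delimited body of a _bulk request.
final class BulkBodySketch {

    // jsonById maps a document id to its already-serialized JSON source.
    // Assumption: index and ids contain no characters that need escaping.
    static String bulkBody(String index, Map<String, String> jsonById) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> entry : jsonById.entrySet()) {
            // Action line: which index and id the next line belongs to
            body.append("{\"index\":{\"_index\":\"").append(index)
                .append("\",\"_id\":\"").append(entry.getKey()).append("\"}}\n");
            // Source line: the document itself; every line ends with a newline
            body.append(entry.getValue()).append('\n');
        }
        return body.toString();
    }

    public static void main(String[] args) {
        Map<String, String> docs = new LinkedHashMap<>();
        docs.put("1", "{\"host\":\"web-1\",\"message\":\"started\"}");
        docs.put("2", "{\"host\":\"web-2\",\"message\":\"stopped\"}");
        System.out.print(bulkBody("logdataindex", docs));
    }
}
```

In the application itself there is no need to build this by hand; bulkIndex() and the client take care of it.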

The search() method searches documents in an index. One performs a search by building a Query object. There are three types of Query one can build: NativeSearchQuery, CriteriaQuery, and StringQuery.
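Of the three, StringQuery is the one that takes the query as a raw JSON string. As a rough sketch of what such a string looks like, the helper below builds the JSON for a match query on one field. The class and method names are hypothetical, and the escaping only covers backslashes and double quotes, so treat it as an illustration rather than a general JSON encoder.

```java
// Builds the JSON source of a match query, e.g. for use with new StringQuery(json).
final class MatchQueryJsonSketch {

    // Minimal escaping: backslashes and double quotes only (an assumption of this sketch).
    static String escape(String value) {
        return value.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    static String matchQuery(String field, String text) {
        return "{\"match\":{\"" + escape(field) + "\":{\"query\":\"" + escape(text) + "\"}}}";
    }

    public static void main(String[] args) {
        // prints {"match":{"host":{"query":"Google"}}}
        System.out.println(matchQuery("host", "Google"));
    }
}
```

The NativeSearchQueryBuilder example shown earlier expresses the same match query through builder calls instead of a JSON string.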

Rest Controller to query elasticsearch instance

Let’s create a REST controller that we will use to add data in bulk to our Elasticsearch instance as well as to query the same instance.

@RestController
@RequestMapping("/v1/betterjavacode/logdata")
public class LogDataController
{
    @Autowired
    private LogDataService logDataService;

    @GetMapping
    public List<LogData> searchLogDataByHost(@RequestParam("host") String host)
    {
        return logDataService.getAllLogDataForHost(host);
    }

    @GetMapping("/search")
    public List<LogData> searchLogDataByTerm(@RequestParam("term") String term)
    {
        return logDataService.findBySearchTerm(term);
    }

    @PostMapping
    public LogData addLogData(@RequestBody LogData logData)
    {

        return logDataService.createLogDataIndex(logData);
    }

    @PostMapping("/createInBulk")
    public List<LogData> addLogDataInBulk(@RequestBody List<LogData> logDataList)
    {
        return (List<LogData>) logDataService.createLogDataIndices(logDataList);
    }
}
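The controller delegates to a LogDataService that is not shown in this post. The sketch below is only a guess at its shape: the four method names mirror the calls made by the controller, but everything else (LogEntry, LogEntryStore, the in-memory store) is hypothetical and exists so the example runs without an Elasticsearch node. In the real application the store's role would be played by LogDataRepository and ElasticsearchRestTemplate.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical, trimmed-down stand-in for the LogData document.
final class LogEntry {
    final String id;
    final String host;
    final String message;

    LogEntry(String id, String host, String message) {
        this.id = id;
        this.host = host;
        this.message = message;
    }
}

// Hypothetical storage abstraction standing in for the repository/template.
interface LogEntryStore {
    LogEntry save(LogEntry entry);
    List<LogEntry> findAll();
}

// In-memory implementation so the sketch is runnable on its own.
final class InMemoryLogEntryStore implements LogEntryStore {
    private final Map<String, LogEntry> byId = new LinkedHashMap<>();

    public LogEntry save(LogEntry entry) {
        byId.put(entry.id, entry);
        return entry;
    }

    public List<LogEntry> findAll() {
        return new ArrayList<>(byId.values());
    }
}

// Method names mirror the calls made by LogDataController.
final class LogDataServiceSketch {
    private final LogEntryStore store;

    LogDataServiceSketch(LogEntryStore store) {
        this.store = store;
    }

    LogEntry createLogDataIndex(LogEntry entry) {
        return store.save(entry);
    }

    List<LogEntry> createLogDataIndices(List<LogEntry> entries) {
        return entries.stream().map(store::save).collect(Collectors.toList());
    }

    List<LogEntry> getAllLogDataForHost(String host) {
        return store.findAll().stream()
                .filter(entry -> entry.host.equals(host))
                .collect(Collectors.toList());
    }

    // Naive case-insensitive substring match standing in for a full-text query.
    List<LogEntry> findBySearchTerm(String term) {
        String needle = term.toLowerCase(Locale.ROOT);
        return store.findAll().stream()
                .filter(entry -> entry.message.toLowerCase(Locale.ROOT).contains(needle))
                .collect(Collectors.toList());
    }
}
```

Keeping the service behind a small interface like this also makes the controller easy to test without a running cluster.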

Running Elasticsearch Instance

So far, we have shown how to create an index and how to use the Elasticsearch client. But we have not yet shown how to connect this client to an Elasticsearch instance.

We will be using a Docker instance to run Elasticsearch in our local environment. AWS also provides its own service to run Elasticsearch.

To run your own Docker instance of Elasticsearch, use the following command:

docker run -p 9200:9200 -e "discovery.type=single-node" docker.elastic.co/elasticsearch/elasticsearch:7.10.0

This will start a single Elasticsearch node, which you can verify by visiting http://localhost:9200

[Image: Elasticsearch and Spring Data - Docker instance]
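The same check can be done from code instead of a browser. Here is a minimal sketch using the JDK's built-in HttpClient (Java 11+); the class name NodeCheckSketch is made up, and the sketch assumes the unsecured single-node container started above.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

final class NodeCheckSketch {

    // Builds a GET request against the node's root endpoint.
    // Assumes baseUrl has no trailing slash, e.g. "http://localhost:9200".
    static HttpRequest rootRequest(String baseUrl) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/"))
                .timeout(Duration.ofSeconds(5))
                .GET()
                .build();
    }

    // Sends the request and returns the HTTP status code.
    static int status(String baseUrl) throws Exception {
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(rootRequest(baseUrl), HttpResponse.BodyHandlers.ofString());
        return response.statusCode();
    }

    public static void main(String[] args) throws Exception {
        // Requires the Docker container to be running locally
        System.out.println(status("http://localhost:9200"));
    }
}
```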

Creating Index and Searching for Data

Once we start the application, we will use Postman to send our first documents to the REST controller.

The first request creates the index and adds the documents to it. On the Elasticsearch instance, we can see a log entry like the one below:

{
	"type": "server",
	"timestamp": "2021-08-22T18:48:46,579Z",
	"level": "INFO",
	"component": "o.e.c.m.MetadataCreateIndexService",
	"cluster.name": "docker-cluster",
	"node.name": "e5f3b8096ca3",
	"message": "[logdataindex] creating index, cause [api], templates [], shards [1]/[1]",
	"cluster.uuid": "mi1O1od7Rju1dQMXDnCuNQ",
	"node.id": "PErAmAWPRiCS5tv-O7HERw"
}

The message clearly shows that it has created the index logdataindex. If we now add more documents, they will go into the same index.
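If you would rather not use Postman, the request that adds a document can be sketched with the JDK's HttpClient API. The path comes from LogDataController above; the host and port (localhost:8080), the class name, and the sample values are assumptions. The date field is left out because the accepted format depends on how the application is configured.

```java
import java.net.URI;
import java.net.http.HttpRequest;

final class AddLogDataRequestSketch {

    // Path from LogDataController; host and port are assumptions.
    static final String ENDPOINT = "http://localhost:8080/v1/betterjavacode/logdata";

    // Field names follow the LogData class; the values are made up.
    static final String SAMPLE_JSON =
            "{\"host\":\"web-1\",\"message\":\"Google crawler visited\",\"status\":\"INFO\",\"size\":12.5}";

    // Builds a POST request carrying one LogData document as JSON.
    static HttpRequest addLogData(String json) {
        return HttpRequest.newBuilder(URI.create(ENDPOINT))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(json))
                .build();
    }
}
```

Sending it is then a single call: HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()).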

Let’s run a search query now. I will run a simple query to search for the text term “Google”.

This was a simple search query. As previously mentioned, we can write more complex search queries using different types of queries: String, Criteria, or Native.
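As a sketch, the search request against our controller's /search endpoint could be built like this. The host and port are again assumptions, and the class name is hypothetical; the term is URL-encoded so that multi-word terms also work.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

final class SearchRequestSketch {

    // Path from LogDataController; host and port are assumptions.
    static final String SEARCH_ENDPOINT = "http://localhost:8080/v1/betterjavacode/logdata/search";

    // Builds a GET request with the search term as the "term" query parameter.
    static HttpRequest searchByTerm(String term) {
        String encoded = URLEncoder.encode(term, StandardCharsets.UTF_8);
        return HttpRequest.newBuilder(URI.create(SEARCH_ENDPOINT + "?term=" + encoded))
                .GET()
                .build();
    }

    public static void main(String[] args) {
        // prints http://localhost:8080/v1/betterjavacode/logdata/search?term=Google
        System.out.println(searchByTerm("Google").uri());
    }
}
```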

Conclusion

Code for this demo is available on my GitHub repository.

In this post, we covered the following things:

  • Elasticsearch and Key Concepts about Elasticsearch
  • Spring Data repository and ElasticsearchRestTemplate
  • Integration with Spring Boot Application
  • Execution of different queries against Elasticsearch
