Elasticsearch in Action: Geospatial Data Types

Madhusudhan Konda
5 min read · Jan 22, 2023
Excerpts taken from my upcoming book: Elasticsearch in Action

The excerpts are taken from my book Elasticsearch in Action, Second Edition. The code is available in my GitHub repository, which also contains executable Kibana scripts so you can run the commands in Kibana straight away. All code is tested against Elasticsearch 8.4.

Me @ Medium || LinkedIn || Twitter || GitHub

In the last article, we looked at the fundamentals of location search. In this article, we look at the data types provided by Elasticsearch for geo-search.

Similar to how textual data is represented by the text data type, Elasticsearch provides two dedicated data types to work with spatial data: geo_point and geo_shape. The geo_point data type expresses a location as a latitude and longitude pair and is used in location-based queries. The geo_shape type, on the other hand, lets us index shapes such as points, multilines, polygons, and a few others. Let’s look at these spatial data types in the following sections.

The geo_point data type

A location on a map is universally expressed by its longitude and latitude. Elasticsearch represents such location data with a dedicated geo_point data type. The following listing creates the mapping for the bus_stops index with a couple of fields; once the mapping is in place, we can index documents.

PUT bus_stops
{
  "mappings": {
    "properties": {
      "name": {
        "type": "text"
      },
      "location": {
        "type": "geo_point"
      }
    }
  }
}

The bus_stops index is defined with two properties: a name and a location. The location is a geo_point, which means it expects latitude and longitude values when a document is indexed. The request in the next listing indexes the London Bridge Station bus stop.

POST bus_stops/_doc
{
  "name": "London Bridge Station",
  "location": "51.07, 0.08"
}

As the request shows, the location field is given the latitude and longitude as a comma-separated string: "51.07, 0.08". This string format is not the only way to set the location field. We can also supply the coordinates as an object, an array, a well-known text (WKT) point, or a geohash. The requests in the following listing demonstrate each of these inputs.

# As a WKT point (lon lat)
POST bus_stops/_doc
{
  "name": "London Victoria Station",
  "location": "POINT (0.14 51.49)"
}

# As a location object
POST bus_stops/_doc
{
  "name": "Leicester Square Station",
  "location": {
    "lon": -0.12,
    "lat": 51.50
  }
}

# As an array [lon, lat]
POST bus_stops/_doc
{
  "name": "Westminster Station",
  "location": [0.23, 51.54]
}

# As a geohash
POST bus_stops/_doc
{
  "name": "Hyde Park Station",
  "location": "gcpvh2bg7sff"
}

The requests in the listing above index bus stop locations using several formats. Besides the latitude and longitude string from the previous listing, a geo_point field accepts an object, an array, a geohash, or a WKT-formatted POINT.
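To make the differences between these input formats concrete, here is a minimal Python sketch that normalises each of them to a (lat, lon) pair. It is not part of Elasticsearch or any client library: the helper names to_lat_lon and decode_geohash are hypothetical and exist only for illustration. The sketch assumes the coordinate orders that Elasticsearch documents for geo_point: "lat, lon" for strings, [lon, lat] for arrays, and POINT (lon lat) for WKT.

```python
import re

_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet (no a, i, l, o)


def decode_geohash(geohash):
    """Return the (lat, lon) centre of the cell named by a geohash (illustrative helper)."""
    lat_range, lon_range = [-90.0, 90.0], [-180.0, 180.0]
    is_lon = True  # bits alternate between axes, starting with longitude
    for char in geohash.lower():
        value = _BASE32.index(char)  # raises ValueError on an invalid character
        for shift in range(4, -1, -1):  # each character carries 5 bits
            bit = (value >> shift) & 1
            rng = lon_range if is_lon else lat_range
            mid = (rng[0] + rng[1]) / 2
            if bit:
                rng[0] = mid  # keep the upper half of the range
            else:
                rng[1] = mid  # keep the lower half of the range
            is_lon = not is_lon
    return (sum(lat_range) / 2, sum(lon_range) / 2)


def to_lat_lon(value):
    """Normalise a geo_point-style input to a (lat, lon) tuple (hypothetical helper)."""
    if isinstance(value, dict):  # object: explicit keys, so order is irrelevant
        return (float(value["lat"]), float(value["lon"]))
    if isinstance(value, (list, tuple)):  # array: [lon, lat]
        lon, lat = value
        return (float(lat), float(lon))
    text = value.strip()
    wkt = re.fullmatch(r"POINT\s*\(\s*([^\s()]+)\s+([^\s()]+)\s*\)", text, re.IGNORECASE)
    if wkt:  # WKT: POINT (lon lat)
        return (float(wkt.group(2)), float(wkt.group(1)))
    if "," in text:  # string: "lat, lon"
        lat, lon = text.split(",")
        return (float(lat), float(lon))
    return decode_geohash(text)  # anything else is treated as a geohash


print(to_lat_lon("51.07, 0.08"))
print(to_lat_lon([0.23, 51.54]))
print(to_lat_lon("gcpvh2bg7sff"))
```

The object form is the only one that names its values, which is why it is the hardest to get wrong; every other form relies on position.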

Now that we understand the geo_point data type, it’s time to learn about the second type: geo_shape. As the name indicates, the geo_shape type helps index and search data using a particular shape; for example, a polygon. Let’s look at how we can index data for geoshapes.

The geo_shape data type

Similar to the geo_point type, which represents a point on the map, Elasticsearch provides a geo_shape data type to represent shapes such as points, multipoints, lines, and polygons. The shapes are described using an open standard called GeoJSON (http://geojson.org) and, accordingly, are written in JSON. Fields that hold these geometric shapes are mapped to the geo_shape data type.

Let’s first create the mapping for a cafes index with a couple of fields. One of them is the address field, which holds the location of a cafe and is declared as a geo_shape type. The following listing demonstrates this.

PUT cafes
{
  "mappings": {
    "properties": {
      "name": {
        "type": "text"
      },
      "address": {
        "type": "geo_shape"
      }
    }
  }
}

The code creates an index called cafes to hold local cafes. The notable field is address, which is defined as a geo_shape type. This type expects shapes in GeoJSON or WKT. For example, to represent a point on a map, we can supply the field as a Point in GeoJSON or a POINT in WKT, as the code in this listing demonstrates.

# Inputting the address in GeoJSON format
PUT cafes/_doc/1
{
  "name": "Costa Coffee",
  "address": {
    "type": "Point",
    "coordinates": [0.17, 51.57]
  }
}

# Inputting the address in WKT format
PUT cafes/_doc/2
{
  "address": "POINT (0.17 51.57)"
}

This code shows two ways to populate a geo_shape field: GeoJSON or WKT. GeoJSON expects a type attribute naming the shape ("type": "Point") and the corresponding coordinates ("coordinates": [0.17, 51.57]), as in the first example. The second example creates the same point using WKT ("address": "POINT (0.17 51.57)").

Note: The coordinate order differs between the string format and the other formats. The string format expects latitude first and then longitude, separated by a comma; for example, "51.57, 0.17". GeoJSON and WKT (like the array format for geo_point) reverse the order to longitude and then latitude; for example, "POINT (0.17 51.57)".
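The order swap is easy to get wrong when building request bodies by hand, so it can help to keep it in one place. The following Python sketch does that with three small helpers; string_point, geojson_point, and wkt_point are hypothetical names introduced here for illustration and do not belong to any Elasticsearch client. Each takes latitude first and emits the order its format expects.

```python
def string_point(lat, lon):
    """geo_point string format: latitude first, then longitude."""
    return f"{lat}, {lon}"


def geojson_point(lat, lon):
    """GeoJSON Point: coordinates are [longitude, latitude]."""
    return {"type": "Point", "coordinates": [lon, lat]}


def wkt_point(lat, lon):
    """WKT POINT: longitude first, then latitude, separated by a space."""
    return f"POINT ({lon} {lat})"


# The same location rendered in all three formats.
lat, lon = 51.57, 0.17
print(string_point(lat, lon))   # 51.57, 0.17
print(geojson_point(lat, lon))  # {'type': 'Point', 'coordinates': [0.17, 51.57]}
print(wkt_point(lat, lon))      # POINT (0.17 51.57)
```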

We can build various shapes using these formats. The table below provides a brief description of a few of them. I suggest consulting the Elasticsearch documentation on indexing and searching geo_shape documents to understand the concepts and examples in detail.

Table: Various shapes supported by the geo_shape data type
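As one illustration of a shape beyond a point, a GeoJSON Polygon is a list of rings, and each ring is a list of [lon, lat] positions whose first and last entries are identical. The Python sketch below builds such a structure from (lat, lon) corners and closes the ring if needed. The helper name geojson_polygon and the corner values are assumptions made for this example only; the resulting dictionary is the kind of value a geo_shape field such as address could be given.

```python
def geojson_polygon(corners):
    """Build a GeoJSON Polygon from (lat, lon) corners, closing the ring (illustrative helper)."""
    ring = [[lon, lat] for lat, lon in corners]  # GeoJSON positions are [lon, lat]
    if ring and ring[0] != ring[-1]:
        ring.append(list(ring[0]))  # repeat the first position to close the ring
    if len(ring) < 4:
        raise ValueError("a closed ring needs at least four positions")
    return {"type": "Polygon", "coordinates": [ring]}


# Hypothetical triangle of (lat, lon) corners, listed counterclockwise.
area = geojson_polygon([(51.50, -0.20), (51.50, -0.10), (51.55, -0.10)])
print(area)
```

Only the outer ring is handled here; holes would be additional rings in the coordinates list.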

That’s pretty much it for the geo data types. Don’t forget to read the previous article, which describes the basics of location search.


Written by Madhusudhan Konda

Madhusudhan Konda is a full-stack lead engineer, mentor, and conference speaker. He delivers live online training on Elasticsearch, Elastic Stack & Spring Cloud.