August 4, 2017

MariaDB Server 10.2: JSON/GeoJSON & GIS

Starting with MariaDB Server 10.2, new JSON/GeoJSON functionality was introduced. Virtual columns also received improvements. With version 10.2 tons of new functionality was added and many of the limitations previously experienced were lifted.

Here, I’ll show how these new MariaDB Server 10.2 features can empower your business.

Side note: One additional piece of information before we start. You cannot imagine how much I hated mathematics back in my school days. Especially geometry and all the Sin, Cos and Radians stuff. Now I’m older, (and eventually) smarter and want to thank my teachers for not giving up!

For today’s task, we’ll create a MariaDB table with JSON objects. It will contain public data about meteorite landings, to see how safe the area I (or you) live in really is. The dataset is publicly available from NASA’s open data portal.

A little bit of geometry background before we go any further.

Latitudes and longitudes are numerical values and they are expressed in degrees. Latitude defines how far north (positive) or south (negative) of the equator a point is located. All points on the equator have a latitude value of zero. The north pole has a latitude of 90, and the south pole has a latitude of -90. So, that’s why all northern hemisphere locations have positive latitude and southern-hemisphere locations have negative values.

By definition, the Greenwich Observatory near London has a longitude of zero. Negative longitude means that this point is west of Greenwich, positive means that the point is east.

We will provide geographic coordinates (latitude and longitude) and by using GIS functions we will get the information about meteorite impacts in a circle of XY km (or miles) around those coordinates.

To get started, I created a table with the following properties:

CREATE TABLE gis_json (

id INTEGER NOT NULL PRIMARY KEY AUTO_INCREMENT,

jsonfield VARCHAR(1024),

name VARCHAR(255) as (JSON_VALUE(jsonfield,'$.name')),

gis_point POINT as ( ST_GeomFromGeoJSON(JSON_QUERY( jsonfield, '$.geolocation')) ) PERSISTENT,

rlat VARCHAR(20) as (JSON_VALUE(jsonfield,'$.reclat')),

rlong VARCHAR(20) as (JSON_VALUE(jsonfield,'$.reclong')),

KEY jsonkey (name),

CHECK (JSON_VALID(jsonfield))

) ENGINE=InnoDB DEFAULT CHARSET=utf8;

I’m using jsonfield to save the actual JSON formatted data.

The field name is a virtual column, it’s matching a value from the JSON key ‘name’.

The field gis_point is a virtual column too, but here I’m using a persistent method to save its content. It is calculated from the JSON field ‘geolocation’ using GeoJSON function ST_GeomFromGeoJSON. All geolocation data in the JSON file are Point type.

I enabled a Check to be sure that the JSON data I read is valid and properly formatted. I also defined an index on the field name.

Here is just a sample of one JSON data record:

{

"fall":"Fell",

"geolocation":{

"type":"Point","coordinates":[50.775,6.08333]

},

"Id":"1",

"Mass":"21",

"name":"Aachen",

"nametype":"Valid",

"recclass":"L5",

"reclat":"50.775000",

"reclong":"6.083330",

"year":"1880-01-01"

}

As you can see, some of the data is redundant, but I decided to keep this record for the sake of simplicity.

Now, we can create the database and the table, and insert the data:

Welcome to the MariaDB monitor.  Commands end with ; or \g.

Your MariaDB connection id is 31

Server version: 10.2.6-MariaDB Homebrew

Copyright (c) 2000, 2017, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> create database geodata;

Query OK, 1 row affected (0.00 sec)

MariaDB [(none)]> use geodata;

Database changed

MariaDB [geodata]> source json_gis_sample.sql;

Query OK, 0 rows affected, 1 warning (0.00 sec)

Query OK, 0 rows affected (0.03 sec)

Query OK, 38401 rows affected (0.67 sec)

Records: 38401  Duplicates: 0  Warnings: 0

After we have imported the whole dataset (38401 records), we can execute our query:

SELECT name, impactyear, mass_g, Y(gis_point) as latit, X(gis_point) as longit, round(distance,2) as distance_km 
  FROM ( 
         SELECT name, JSON_VALUE(jsonfield,'$.year') as impactyear, JSON_VALUE(jsonfield,'$.mass') as mass_g, gis_point, r , 
                units * DEGREES( ACOS( COS(RADIANS(latpoint)) * 
                                       COS(RADIANS(X(gis_point))) * 
                                       COS(RADIANS(longpoint) - RADIANS(Y(gis_point))) + 
                                       SIN(RADIANS(latpoint)) * 
                                       SIN(RADIANS(X(gis_point))))) AS distance 
           FROM gis_json
           JOIN ( 
                  SELECT 48.139 AS latpoint, 11.566 AS longpoint, 
                         100.0 AS r, 111.045 AS units 
                ) AS p 
           WHERE MbrContains(GeomFromText( 
                        CONCAT('LINESTRING(', latpoint-(r/units),' ',
                                              longpoint-(r /(units* COS(RADIANS(latpoint)))), 
                                              ',', 
                                              latpoint+(r/units) ,' ', 
                                              longpoint+(r /(units * COS(RADIANS(latpoint)))), ')')),
                        gis_point) 
       ) AS d 
 WHERE distance <= r 
 ORDER BY distance;

+----------------+------------+--------+----------+----------+-------------+
| name           | impactyear | mass_g | latit    | longit   | distance_km |
+----------------+------------+--------+----------+----------+-------------+
| Mässing        | 1803-01-01 | 1600   | 12.61667 | 48.13333 |       77.86 |
| Schönenberg    | 1846-01-01 | 8000   | 10.46667 | 48.11667 |       81.52 |
| Eichstädt      | 1785-01-01 | 3000   | 11.21667 |     48.9 |       88.32 |
| Neuschwanstein | 2002-01-01 | 6189   | 10.80833 |   47.525 |       88.54 |
| Mühlau         | 1877-01-01 | 5      | 11.41667 | 47.28333 |       95.67 |
+----------------+------------+--------+----------+----------+-------------+
5 rows in set (0.06 sec)

As you can see with impactyear and mass_g, we can grab the JSON data directly from the jsonfield, using the JSON_VALUE() function. I’m also using MbrContains, which is a Minimum Bounding Rectangle (MBR) geographical function.

The result shows that (statistically) Munich is a safe (and great!) place to live. It’s much easier to drown in good beer than to be hit by a meteor.   

You can adapt the following variables to fit your needs:

latpoint 48.139

longpoint 11.566

As a starting point, I used the center of Munich.

Munich map.png

r 100 -> this is a distance from the starting point you want to search within. Since we are using a circle distance in the query, it’s the circle radius.

units 111.045 -> if you want to search based in Kilometer, one degree has 111.045 km. For Miles, use 69.0. If you are on the sea, and want to know how far away the Roman Abramovich's yacht is, you should use 60.0 as units value. Please note, that you should add or subtract around 1 mile from the value, depending on where Roman was standing (bow or stern!) when he shared his geo position with you.

To search within a broader radius of 250 km, I’ll just change the r value from 100 to 250 and restart the query:

+-----------------+------------+--------+----------+----------+-------------+
| name            | impactyear | mass_g | latit    | longit   | distance_km |
+-----------------+------------+--------+----------+----------+-------------+
| Mässing         | 1803-01-01 | 1600   | 12.61667 | 48.13333 |       77.86 |
| Schönenberg     | 1846-01-01 | 8000   | 10.46667 | 48.11667 |       81.52 |
| Eichstädt       | 1785-01-01 | 3000   | 11.21667 |     48.9 |       88.32 |
| Neuschwanstein  | 2002-01-01 | 6189   | 10.80833 |   47.525 |       88.54 |
| Mühlau          | 1877-01-01 | 5      | 11.41667 | 47.28333 |       95.67 |
| Unter-Mässing   | 1920-01-01 | 80000  | 11.33333 | 49.09028 |      107.01 |
| Mauerkirchen    | 1768-01-01 | 19000  | 13.13333 | 48.18333 |      116.20 |
| Ischgl          | 1976-01-01 | 724    | 10.27333 | 47.02633 |      156.97 |
| Prambachkirchen | 1932-01-01 | 2125   | 13.94083 |  48.3025 |      176.63 |
| Bohumilitz      | 1829-01-01 | 59000  | 13.76667 |    49.05 |      190.66 |
| Langwies        | 1985-01-01 | 16     |  9.71667 | 46.81667 |      202.04 |
| Teplá           | 1909-01-01 | 17000  | 12.86667 | 49.98333 |      225.60 |
| Barcis          | 1950-01-01 | 87     |    12.35 |     46.1 |      234.04 |
| Elbogen         | 1399-12-24 | 107000 | 12.73333 | 50.18333 |      242.31 |
| Pribram         | 1959-01-01 | 5555   | 14.03333 | 49.66667 |      247.39 |
| Ybbsitz         | 1977-01-01 | 15000  |    14.89 |    47.96 |      247.53 |
+-----------------+------------+--------+----------+----------+-------------+
16 rows in set (0.06 sec)


With this set of functions, you can start enjoying a non-relational structure combined with the benefits of a powerful relational database system.

As an example, these functions can be used to build geotargeting API services for your business, showing your customers where your POS is located or checking how far away your friends are who shared their position with you.

A nice Open Source Framework for mapping is OpenLayers. Additionally, you can find a big list of free JSON datasets that don’t require authentication here.

If you want to give this code a try, download the complete SQL file (DDL and data).

Want more? Join our webinar on August 10 to dive deeper on geospatial data with MariaDB. In the webinar, we will cover how to build location-based services with MariaDB Server, and SQL using spatial data types, indexes and functions. Register for the MariaDB geospatial webinar now.

Happy geo positioning!