# Vector Overview


{% hint style="info" %}
[Vectors](https://mariadb.com/docs/server/reference/sql-structure/vectors) are available from [MariaDB 11.7](https://app.gitbook.com/s/aEnK0ZXmUbJzqQrTjFyb/community-server/old-releases/11.7/what-is-mariadb-117).
{% endhint %}

MariaDB Vector is a feature that allows MariaDB Server to perform as a relational vector database. Vectors generated by an AI model can be stored and searched in MariaDB.

The initial implementation uses a modified HNSW[^1] algorithm to search the vector index (solving the so-called Approximate Nearest Neighbor problem) and defaults to Euclidean distance. Concurrent reads and writes and all [transaction isolation levels](https://mariadb.com/docs/server/sql-statements/administrative-sql-statements/set-commands/set-transaction#isolation-level) are supported.

For the vector index, MariaDB uses `int16` values, which gives 15 bits to store the value, rather than the 10 bits that `float16` provides.
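The precision difference can be illustrated outside the database. The following is a minimal sketch, not MariaDB's actual quantization scheme: the per-vector max-abs scale and the helper names `via_float16` and `via_int16` are assumptions made only for this illustration. It round-trips one sample vector through half precision and through scaled 16-bit integers, then compares the worst-case absolute error.

```python
import struct

def via_float16(x: float) -> float:
    """Round-trip x through IEEE 754 half precision (10-bit fraction)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def via_int16(x: float, scale: float) -> float:
    """Round-trip x through a signed 16-bit integer.

    Hypothetical scheme: values are scaled so that `scale` maps to 32767.
    """
    quantized = round(x / scale * 32767)
    return quantized / 32767 * scale

vec = [0.418708, 0.809902, 0.823193, 0.598179, 0.0332549]
scale = max(abs(x) for x in vec)  # assumed per-vector scale factor

err_float16 = max(abs(x - via_float16(x)) for x in vec)
err_int16 = max(abs(x - via_int16(x, scale)) for x in vec)

print(f"max abs error via float16: {err_float16:.2e}")
print(f"max abs error via int16:   {err_int16:.2e}")
```

With this assumed scaling, the integer form keeps a uniform step of `scale / 32767` across the whole range, while half precision loses resolution as values approach 1.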

## Creating

To store vectors, define a column with the [VECTOR data type](https://mariadb.com/docs/server/reference/sql-structure/vectors/vector) and index it with `VECTOR INDEX` in the [CREATE TABLE](https://mariadb.com/docs/server/reference/sql-structure/vectors/create-table-with-vectors) statement.

```sql
CREATE TABLE v (
  id INT PRIMARY KEY,
  v VECTOR(5) NOT NULL,
  VECTOR INDEX (v)
);
```

The distance function used to build the vector index can be `euclidean` (the default) or `cosine`. An additional option, `M`, tunes the vector index: larger values give more accurate results at the cost of slower `SELECT` and `INSERT` statements, a larger index, and higher memory consumption. The valid range is from `3` to `200`.

```sql
CREATE TABLE embeddings (
  doc_id BIGINT UNSIGNED PRIMARY KEY,
  embedding VECTOR(1536) NOT NULL,
  VECTOR INDEX (embedding) M=8 DISTANCE=cosine
);
```

## Inserting

Vector columns store [32-bit IEEE 754 floating point numbers](https://en.wikipedia.org/wiki/Single-precision_floating-point_format).

```sql
INSERT INTO v VALUES
  (1, x'e360d63ebe554f3fcdbc523f4522193f5236083d'),
  (2, x'f511303f72224a3fdd05fe3eb22a133ffae86a3f'),
  (3, x'f09baa3ea172763f123def3e0c7fe53e288bf33e'),
  (4, x'b97a523f2a193e3eb4f62e3f2d23583e9dd60d3f'),
  (5, x'f7c5df3e984b2b3e65e59d3d7376db3eac63773e'),
  (6, x'de01453ffa486d3f10aa4d3fdd66813c71cb163f'),
  (7, x'76edfc3e4b57243f10f8423fb158713f020bda3e'),
  (8, x'56926c3fdf098d3e2c8c5e3d1ad4953daa9d0b3e'),
  (9, x'7b713f3e5258323f80d1113d673b2b3f66e3583f'),
  (10, x'6ca1d43e9df91b3fe580da3e1c247d3f147cf33e');
```

Alternatively, you can use the `VEC_FromText()` function:

```sql
INSERT INTO v VALUES
  (1,Vec_FromText('[0.418708,0.809902,0.823193,0.598179,0.0332549]')),
  (2,Vec_FromText('[0.687774,0.789588,0.496138,0.57487,0.917617]')),
  (3,Vec_FromText('[0.333221,0.962687,0.467263,0.448235,0.475671]')),
  (4,Vec_FromText('[0.822185,0.185643,0.683452,0.211072,0.554056]')),
  (5,Vec_FromText('[0.437057,0.167281,0.0770977,0.428638,0.241591]')),
  (6,Vec_FromText('[0.76956,0.926895,0.803376,0.0157961,0.589042]')),
  (7,Vec_FromText('[0.493999,0.641957,0.761598,0.94276,0.425865]')),
  (8,Vec_FromText('[0.924108,0.275466,0.0543329,0.0731585,0.136344]')),
  (9,Vec_FromText('[0.186956,0.69666,0.0356002,0.668875,0.84722]')),
  (10,Vec_FromText('[0.415294,0.609278,0.426765,0.988832,0.475556]'));
```
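The two `INSERT` forms above carry the same data: each hex literal is the byte sequence of five 32-bit floats. The sketch below shows how an application might convert between the two forms. It assumes little-endian byte order, which is inferred from the sample rows on this page rather than taken from a specification, and the helper names `vector_from_hex` and `vector_to_hex` are hypothetical.

```python
import struct

def vector_from_hex(hex_digits: str) -> list[float]:
    """Decode the hex digits of an x'...' literal into float32 values."""
    raw = bytes.fromhex(hex_digits)
    count = len(raw) // 4  # each element occupies 4 bytes
    return list(struct.unpack(f"<{count}f", raw))

def vector_to_hex(values: list[float]) -> str:
    """Encode floats as little-endian float32 hex digits (assumed byte order)."""
    return struct.pack(f"<{len(values)}f", *values).hex()

# Row 1 from the hex-based INSERT statement
row1_hex = "e360d63ebe554f3fcdbc523f4522193f5236083d"
row1 = vector_from_hex(row1_hex)

# Compare with row 1 from the VEC_FromText() statement
row1_text = [0.418708, 0.809902, 0.823193, 0.598179, 0.0332549]
for decoded, expected in zip(row1, row1_text):
    print(f"{decoded:.7f}  vs  {expected}")
```

The decoded values agree with the text form to about six significant digits, which is the precision shown in the `VEC_FromText()` example.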

## Querying

For vector indexes built with the `euclidean` function, [VEC\_DISTANCE\_EUCLIDEAN](https://mariadb.com/docs/server/reference/sql-functions/vector-functions/vec_distance_euclidean) can be used. It calculates the Euclidean (L2) distance between two points:

```sql
SELECT id FROM v ORDER BY 
  VEC_DISTANCE_EUCLIDEAN(v, x'6ca1d43e9df91b3fe580da3e1c247d3f147cf33e');
+----+
| id |
+----+
| 10 |
|  7 |
|  3 |
|  9 |
|  2 |
|  1 |
|  5 |
|  4 |
|  6 |
|  8 |
+----+
```

Most commonly, this kind of query is run with a `LIMIT` to return only the vectors closest to a given vector, such as one derived from a user search query, an image, or a song fragment:

```sql
SELECT id FROM v 
  ORDER BY VEC_DISTANCE_EUCLIDEAN(v, x'6ca1d43e9df91b3fe580da3e1c247d3f147cf33e') 
  LIMIT 2;
+----+
| id |
+----+
| 10 |
|  7 |
+----+
```
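The ordering returned by the queries above can be sanity-checked outside the database. The following sketch ranks the ten sample rows by exact Euclidean distance to row 10. It is a brute-force scan, not the HNSW search the vector index performs, so on larger data sets an approximate index search may return slightly different neighbors.

```python
import math

# The sample rows from the VEC_FromText() INSERT statement
rows = {
    1: [0.418708, 0.809902, 0.823193, 0.598179, 0.0332549],
    2: [0.687774, 0.789588, 0.496138, 0.57487, 0.917617],
    3: [0.333221, 0.962687, 0.467263, 0.448235, 0.475671],
    4: [0.822185, 0.185643, 0.683452, 0.211072, 0.554056],
    5: [0.437057, 0.167281, 0.0770977, 0.428638, 0.241591],
    6: [0.76956, 0.926895, 0.803376, 0.0157961, 0.589042],
    7: [0.493999, 0.641957, 0.761598, 0.94276, 0.425865],
    8: [0.924108, 0.275466, 0.0543329, 0.0731585, 0.136344],
    9: [0.186956, 0.69666, 0.0356002, 0.668875, 0.84722],
    10: [0.415294, 0.609278, 0.426765, 0.988832, 0.475556],
}

# The query vector is row 10, as in the SQL examples
query = rows[10]

# Sort row ids by exact L2 distance to the query vector
ranked = sorted(rows, key=lambda row_id: math.dist(rows[row_id], query))

print(ranked)      # full ordering, comparable to the query without LIMIT
print(ranked[:2])  # [10, 7], comparable to the query with LIMIT 2
```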

For vector indexes built with the `cosine` function, [VEC\_DISTANCE\_COSINE](https://mariadb.com/docs/server/reference/sql-functions/vector-functions/vec_distance_cosine) can be used. It calculates the [cosine distance](https://en.wikipedia.org/wiki/Cosine_similarity#Cosine_distance) between two vectors:

```sql
SELECT VEC_DISTANCE_COSINE(VEC_FROMTEXT('[1,2,3]'), VEC_FROMTEXT('[3,5,7]'));
```
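For reference, cosine distance is one minus the cosine of the angle between the two vectors. A minimal sketch of that formula, applied to the same two vectors as the query above (the function name `cosine_distance` is chosen only for this illustration):

```python
import math

def cosine_distance(a: list[float], b: list[float]) -> float:
    """Return 1 - (a . b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

distance = cosine_distance([1, 2, 3], [3, 5, 7])
print(round(distance, 6))  # approximately 0.002585
```

The small result reflects that the two vectors point in nearly the same direction, even though their lengths differ.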

The [VEC\_DISTANCE](https://mariadb.com/docs/server/reference/sql-functions/vector-functions/vector-functions-vec_distance) function is a generic function that behaves either as [VEC\_DISTANCE\_EUCLIDEAN](https://mariadb.com/docs/server/reference/sql-functions/vector-functions/vec_distance_euclidean) or [VEC\_DISTANCE\_COSINE](https://mariadb.com/docs/server/reference/sql-functions/vector-functions/vec_distance_cosine), depending on the underlying index type:

```sql
SELECT id FROM v 
  ORDER BY VEC_DISTANCE(v, x'6ca1d43e9df91b3fe580da3e1c247d3f147cf33e');
+----+
| id |
+----+
| 10 |
|  7 |
|  3 |
|  9 |
|  2 |
|  1 |
|  5 |
|  4 |
|  6 |
|  8 |
+----+
```

{% hint style="info" %}
MariaDB has no function for *dot product* (also called *inner product*) distance, which is available in many other vector databases. Dot product is not a proper distance measure (for example, a vector's closest match is not necessarily itself) and is only used for performance reasons, because it is often faster than cosine or euclidean and produces the same results if vectors are normalized. In MariaDB's optimized implementation, the euclidean and cosine measures are the fastest, and dot product, if implemented, would not provide any performance benefit. For normalized vectors, use euclidean or cosine (they are equally fast).
{% endhint %}
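Both claims in the note above can be checked with a few lines of arithmetic. The sketch below is independent of MariaDB and uses hypothetical helper names. It first shows a vector whose best dot product match is not itself, then shows that for unit-length vectors the squared Euclidean distance equals twice the cosine distance, so both measures rank neighbors identically.

```python
import math

def dot(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))

def normalize(v: list[float]) -> list[float]:
    """Scale v to unit length."""
    length = math.sqrt(dot(v, v))
    return [x / length for x in v]

# 1. Dot product is not a proper distance: a longer vector in the same
#    direction scores higher than the vector itself.
a = [1.0, 0.0]
b = [2.0, 0.0]
print(dot(a, a), dot(a, b))  # 1.0 2.0

# 2. For unit vectors, |u - v|^2 = 2 - 2 * (u . v) = 2 * cosine distance.
u = normalize([1.0, 2.0, 3.0])
v = normalize([3.0, 5.0, 7.0])
squared_l2 = sum((x - y) ** 2 for x, y in zip(u, v))
cosine_dist = 1.0 - dot(u, v)
print(math.isclose(squared_l2, 2 * cosine_dist, abs_tol=1e-12))  # True
```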

## System Variables

There are a number of system variables used for vectors. See [Vector System Variables](https://mariadb.com/docs/server/reference/sql-structure/vectors/vector-system-variables).

## Vector Framework Integrations

MariaDB Vector is integrated with several frameworks. See [Vector Framework Integrations](https://mariadb.com/docs/server/reference/sql-structure/vectors/vector-framework-integrations).

## What is a Vector?

{% columns %}
{% column %}
{% embed url="<https://www.youtube.com/shorts/VrG8H53KJZY>" %}
What exactly is a vector in AI and RAG (1 minute • 2026)
{% endembed %}
{% endcolumn %}

{% column %}
{% hint style="info" %}
**Video summary**

* A vector (an *embedding*) is an ordered list of numbers.
* AI models map content (like text) into vectors.
* Similar meanings end up close together in vector space.
* RAG and semantic search retrieve relevant items by finding the nearest vectors.
{% endhint %}
{% endcolumn %}
{% endcolumns %}

## See Also

* [MariaDB Vector: How it works](https://mariadb.org/mariadb-vector-how-it-works/) (blog post • 5 minutes • 2026)
* [Everything you need to know to start building apps with AI and RAG](https://youtu.be/ZlLV7rda9GY) (video • 40 minutes • 2026)
* [What exactly is a vector in AI and RAG?](https://www.youtube.com/shorts/VrG8H53KJZY) (video • 1 minute • 2026)
* [Get to know MariaDB’s Rocket-Fast Native Vector Search - Sergei Golubchik](https://www.youtube.com/watch?v=gNyzcy_6qJM) (video • 37 minutes • 2025)
* [MariaDB Vector, a new Open Source vector database that you are already familiar with - Sergei Golubchik](https://www.youtube.com/watch?v=r9af4bvF7jI) (video • 27 minutes • 2025)
* [AI first applications with MariaDB Vector - Vicentiu Ciorbaru](https://www.youtube.com/watch?v=vp126N1QOws) (video • 22 minutes • 2025)
* [MariaDB Vector: A storage engine for LLMs - Kaj Arnö and Jonah Harris](https://www.youtube.com/watch?v=3y-yWoH-CF8) (video • 12 minutes • 2025)
* [Try RAG with MariaDB Vector on your own MariaDB data!](https://mariadb.org/rag-with-mariadb-vector/) (blog post • 5 minutes • 2024)
  * The post, written by Robert Silén, explains how to build a Retrieval-Augmented Generation (RAG) system using MariaDB's native vector storage. It covers:
    1. Preparation: Setting up a MariaDB 11.7+ environment and creating a table with the `VECTOR` data type.
    2. Indexing: Using OpenAI's embedding model to vectorize documentation and store it in MariaDB.
    3. Search & Generation: Performing a nearest-neighbor search (`VEC_DISTANCE_EUCLIDEAN`) to find relevant context for a user's question and feeding that context into an LLM for an accurate response.

<sub>*This page is licensed: CC BY-SA / Gnu FDL*</sub>


[^1]: HNSW (Hierarchical Navigable Small World): A multi-layered, graph-based algorithm used for lightning-fast vector searches by finding "nearest neighbors."
