# Index Statistics

## How Index Statistics Help the Query Optimizer

Understanding index statistics is crucial for the MariaDB query optimizer to efficiently execute queries. Accurate and current statistics guide the optimizer in choosing the best way to access data, similar to using a personal address book for quicker searches rather than a larger phone book. Up-to-date index statistics ensure optimized query performance.

## Value Groups

The statistics primarily focus on groups of index elements with identical values. In a primary key, each index is unique, resulting in a group size of one. In a non-unique index, multiple keys may share the same value. The worst-case scenario involves large groups with identical values, such as an index on a boolean field.

MariaDB makes heavy use of the average group size statistic. For example, if there are 100 rows, and twenty groups with the same index values, the average group size would be five.

However, averages can be skewed by extremes, and the usual culprit is NULL values. The row of 100 may have 19 groups with an average size of 1, while the other 81 values are all NULL. MariaDB may think five is a good average size and choose to use that index, and then end up having to read through 81 rows with identical keys, taking longer than an alternative.

## Dealing with NULLs

There are three main approaches to the problem of NULLs. NULL index values can be treated as a single group (`nulls_equal`). This is usually fine, but if you have large numbers of NULLs the average group size is slanted higher, and the optimizer may miss using the index for ref accesses when it would be useful. This is the default used by [InnoDB](https://mariadb.com/docs/server/server-usage/storage-engines/innodb). The opposite approach is `nulls_unequal`, with each NULL forming its own group of one. Conversely, the average group size is slanted lower, and the optimizer may use the index for ref accesses when not suitable. This is the default used by the [Aria](https://mariadb.com/docs/server/server-usage/storage-engines/aria) and [MyISAM](https://mariadb.com/docs/server/server-usage/storage-engines/myisam-storage-engine) storage engines. A third option, `nulls_ignored`, sees NULLs ignored altogether from index group calculations.

The default approaches can be changed by setting the [aria\_stats\_method](https://mariadb.com/docs/server/server-usage/storage-engines/aria/aria-system-variables#aria_stats_method), [myisam\_stats\_method](https://mariadb.com/docs/server/server-usage/storage-engines/myisam-storage-engine/myisam-system-variables#myisam_stats_method) and [innodb\_stats\_method](https://mariadb.com/docs/server/server-usage/storage-engines/innodb/innodb-system-variables#innodb_stats_method) server variables.

## Null-Safe and Regular Comparisons

The comparison operator used plays an important role. If two values are compared with <=> (see the [null-safe-equal](https://mariadb.com/docs/server/reference/sql-structure/operators/comparison-operators/null-safe-equal) comparison operator), and both are null, 1 is returned. If the same values are compared with = (see the [equal](https://mariadb.com/docs/server/reference/sql-structure/operators/comparison-operators/equal) comparison operator) null is returned. For example:

```sql
SELECT 1 <=> 1, NULL <=> NULL, 1 <=> NULL;
+---------+---------------+------------+
| 1 <=> 1 | NULL <=> NULL | 1 <=> NULL |
+---------+---------------+------------+
|       1 |             1 |          0 |
+---------+---------------+------------+

SELECT 1 = 1, NULL = NULL, 1 = NULL;
+-------+-------------+----------+
| 1 = 1 | NULL = NULL | 1 = NULL |
+-------+-------------+----------+
|     1 |        NULL |     NULL |
+-------+-------------+----------+
```

## Engine-Independent Statistics

[MariaDB 10.0.1](https://app.gitbook.com/s/aEnK0ZXmUbJzqQrTjFyb/community-server/old-releases/10.0/10.0.1) introduced a way to gather statistics independently of the storage engine. See [Engine-independent table statistics](https://mariadb.com/docs/server/ha-and-performance/optimization-and-tuning/query-optimizations/statistics-for-optimizing-queries/engine-independent-table-statistics).

## Histogram-Based Statistics

[Histogram-Based Statistics](https://mariadb.com/docs/server/ha-and-performance/optimization-and-tuning/query-optimizations/statistics-for-optimizing-queries/histogram-based-statistics) were introduced in [MariaDB 10.0.2](https://app.gitbook.com/s/aEnK0ZXmUbJzqQrTjFyb/community-server/old-releases/10.0/10.0.2), and are collected by default from [MariaDB 10.4.3](https://app.gitbook.com/s/aEnK0ZXmUbJzqQrTjFyb/community-server/old-releases/10.4/10.4.3).

## See Also

* [User Statistics](https://mariadb.com/docs/server/ha-and-performance/optimization-and-tuning/query-optimizations/statistics-for-optimizing-queries/user-statistics). This plugin provides user, client, table and index usage statistics.
* [InnoDB Persistent Statistics](https://mariadb.com/docs/server/ha-and-performance/optimization-and-tuning/query-optimizations/statistics-for-optimizing-queries/innodb-persistent-statistics)
* [Engine-independent Statistics](https://mariadb.com/docs/server/ha-and-performance/optimization-and-tuning/query-optimizations/statistics-for-optimizing-queries/engine-independent-table-statistics)
* [Histogram-based Statistics](https://mariadb.com/docs/server/ha-and-performance/optimization-and-tuning/query-optimizations/statistics-for-optimizing-queries/histogram-based-statistics)
* [Ignored Indexes](https://mariadb.com/docs/server/ha-and-performance/optimization-and-tuning/optimization-and-indexes/ignored-indexes)

<sub>*This page is licensed: CC BY-SA / Gnu FDL*</sub>

{% @marketo/form formId="4316" %}


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://mariadb.com/docs/server/ha-and-performance/optimization-and-tuning/optimization-and-indexes/index-statistics.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
