Optimizer Trace Guide

You are viewing an old version of this article. View the current version here.

Optimizer trace uses the JSON format. It is basically a structured log file showing what actions were taken by the query optimizer.

A Basic Example

Let's take a simple query:

MariaDB> explain select * from t1 where a<10;
+------+-------------+-------+-------+---------------+------+---------+------+------+-----------------------+
| id   | select_type | table | type  | possible_keys | key  | key_len | ref  | rows | Extra                 |
+------+-------------+-------+-------+---------------+------+---------+------+------+-----------------------+
|    1 | SIMPLE      | t1    | range | a             | a    | 5       | NULL | 10   | Using index condition |
+------+-------------+-------+-------+---------------+------+---------+------+------+-----------------------+

One can see the full trace here. Taking only the component names, one gets:

MariaDB> select * from information_schema.optimizer_trace limit 1\G
*************************** 1. row ***************************
                            QUERY: select * from t1 where a<10
                            TRACE: 
{
  "steps": [
    {
      "join_preparation": { ... }
    },
    {
      "join_optimization": {
        "select_id": 1,
        "steps": [
          { "condition_processing": { ... } },
          { "table_dependencies": [ ... ] },
          { "ref_optimizer_key_uses": [ ... ] },
          { "rows_estimation": [
              {
                "range_analysis": {
                   "analyzing_range_alternatives" : { ... },
                  "chosen_range_access_summary": { ... },
                },
                "selectivity_for_indexes" : { ... },
                "selectivity_for_columns" : { ... }
              }
            ]
          },
          { "considered_execution_plans": [ ... ] },
          { "attaching_conditions_to_tables": { ... } }
         ]
      }
    },
    {
      "join_execution": { ... }
    }
  ]
}

Trace Structure

For each SELECT, there are two "Steps":

  • join_preparation
  • join_optimization

Join preparation shows early query rewrites. join_optmization is where most of the query optimizations are done. They are:

  • condition_processing - basic rewrites in WHERE/ON conditions.
  • ref_optimizer_key_uses - Construction of possible ways to do ref and eq_ref accesses.
  • rows_estimation - Consideration of range and index_merge accesses.
  • considered_execution_plans - Join optimization itself, that is, choice of the join order.
  • attaching_conditions_to_tables - Once the join order is fixed, parts of the WHERE clause are "attached" to tables to filter out rows as early as possible.

The above steps are for just one SELECT. If the query has subqueries, each SELECT will have these steps, and there will be extra steps/rewrites to handle the subquery construct itself.

Extracting Trace Components

If you are interested in some particular part of the trace, MariaDB has two functions that come in handy:

For example, the contents of the analyzing_range_alternatives node can be extracted like so:

MariaDB> select JSON_DETAILED(JSON_EXTRACT(trace, '$**.analyzing_range_alternatives')) 
   ->   from INFORMATION_SCHEMA.OPTIMIZER_TRACE\G
*************************** 1. row ***************************
JSON_DETAILED(JSON_EXTRACT(trace, '$**.analyzing_range_alternatives')): [
    {
        "range_scan_alternatives": 
        [
            {
                "index": "a_b_c",
                "ranges": 
                [
                    "(1) <= (a,b) < (4,50)"
                ],
                "rowid_ordered": false,
                "using_mrr": false,
                "index_only": false,
                "rows": 4,
                "cost": 6.2509,
                "chosen": true
            }
        ],
        "analyzing_roworder_intersect": 
        {
            "cause": "too few roworder scans"
        },
        "analyzing_index_merge_union": []
    }
]

Examples of various information in the trace

Basic rewrites

A lot of applications construct database query text on the fly, which sometimes means that the query has constructs that are repetitive or redundant. In most cases, the optimizer will be able to remove them. One can check the trace to be sure:

explain select * from t1 where not (col1 >= 3);

Optimizer trace will show:

  "steps": [
    {
      "join_preparation": {
        "select_id": 1,
        "steps": [
          {
            "expanded_query": "select t1.a AS a,t1.b AS b,t1.col1 AS col1 from t1 where t1.col1 < 3"
          }

Here, one can see that NOT was removed.

Similarly, one can also see that IN(...) with one element is the same as equality:

explain select * from t1 where col1  in (1);

will show

      "join_preparation": {
        "select_id": 1,
        "steps": [
          {
            "expanded_query": "select t1.a AS a,t1.b AS b,t1.col1 AS col1 from t1 where t1.col1 = 1"

On the other hand, converting an UTF-8 column to UTF-8 is not removed:

explain select * from t1 where convert(utf8_col using utf8) = 'hello';

will show

      "join_preparation": {
        "select_id": 1,
        "steps": [
          {
            "expanded_query": "select t1.a AS a,t1.b AS b,t1.col1 AS col1,t1.utf8_col AS utf8_col from t1 where convert(t1.utf8_col using utf8) = 'hello'"
          }

so redundant CONVERT calls should be used with caution.

Range optimizer - what ranges will be scanned

MySQL/MariaDB optimizer has a complex part called the Range Optimizer. This is a module that examines WHERE (and ON) clause and constructs index ranges that need to be scanned to answer the query. The rules for constructing the ranges are quite complex.

An example: Consider a table

create table some_events ( 
  start_date date, 
  end_date date, 
  ...
  key (start_date, end_date)
);

and a query:

mysql> explain select * from some_events where start_date >= '2019-09-10' and end_date <= '2019-09-14';
+------+-------------+-------------+------+---------------+------+---------+------+------+-------------+
| id   | select_type | table       | type | possible_keys | key  | key_len | ref  | rows | Extra       |
+------+-------------+-------------+------+---------------+------+---------+------+------+-------------+
|    1 | SIMPLE      | some_events | ALL  | start_date    | NULL | NULL    | NULL | 1000 | Using where |
+------+-------------+-------------+------+---------------+------+---------+------+------+-------------+

One might think that the optimizer would be able to use the restrictions on both start_date and end_date to construct a narrow range to be scanned. But this is not so, one of the restrictions creates a left-endpoint range and the other one creates a right-endpoint range, hence they cannot be combined.

select 
   JSON_DETAILED(JSON_EXTRACT(trace, '$**.analyzing_range_alternatives')) as trace 
from information_schema.optimizer_trace\G
*************************** 1. row ***************************
trace: [
    {
        "range_scan_alternatives": 
        [
            {
                "index": "start_date",
                "ranges": 
                [
                    "(2019-09-10,NULL) < (start_date,end_date)"
                ],
...

the potential range only uses one of the bounds.

Comments

Comments loading...
Content reproduced on this site is the property of its respective owners, and this content is not reviewed in advance by MariaDB. The views, information and opinions expressed by this content do not necessarily represent those of MariaDB or any other party.