MariaDB 5.3 Optimizer Debugging

MariaDB 5.3 has an optimizer debugging patch. The patch is pushed into

lp:maria-captains/maria/5.3-optimizer-debugging

The patch is wrapped in #ifdef, but there is a #define straight in mysql_priv.h so simply compiling that tree should produce a binary with optimizer debugging enabled.

The patch adds two system variables:

  • @@debug_optimizer_prefer_join_prefix
  • @@debug_optimizer_dupsweedout_penalized

the variables are present as session/global variables, and are also settable via the server command line.

debug_optimizer_prefer_join_prefix

If this variable is non-NULL, it is assumed to specify a join prefix as a comma-separated list of table aliases:

set debug_optimizer_prefer_join_prefix='tbl1,tbl2,tbl3';

The optimizer will try its best to build a join plan which matches the specified join prefix. It does this by comparing join prefixes it is considering with @@debug_optimizer_prefer_join_prefix, and multiplying cost by a million if the plan doesn't match the prefix.

As a result, you can more-or-less control the join order. For example, let's take this query:

MariaDB [test]> explain select * from ten A, ten B, ten C;
+----+-------------+-------+------+---------------+------+---------+------+------+------------------------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra                              |
+----+-------------+-------+------+---------------+------+---------+------+------+------------------------------------+
|  1 | SIMPLE      | A     | ALL  | NULL          | NULL | NULL    | NULL |   10 |                                    |
|  1 | SIMPLE      | B     | ALL  | NULL          | NULL | NULL    | NULL |   10 | Using join buffer (flat, BNL join) |
|  1 | SIMPLE      | C     | ALL  | NULL          | NULL | NULL    | NULL |   10 | Using join buffer (flat, BNL join) |
+----+-------------+-------+------+---------------+------+---------+------+------+------------------------------------+
3 rows in set (0.00 sec)

and request a join order of C,A,B:

MariaDB [test]> set debug_optimizer_prefer_join_prefix='C,A,B';
Query OK, 0 rows affected (0.00 sec)

MariaDB [test]> explain select * from ten A, ten B, ten C;
+----+-------------+-------+------+---------------+------+---------+------+------+------------------------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra                              |
+----+-------------+-------+------+---------------+------+---------+------+------+------------------------------------+
|  1 | SIMPLE      | C     | ALL  | NULL          | NULL | NULL    | NULL |   10 |                                    |
|  1 | SIMPLE      | A     | ALL  | NULL          | NULL | NULL    | NULL |   10 | Using join buffer (flat, BNL join) |
|  1 | SIMPLE      | B     | ALL  | NULL          | NULL | NULL    | NULL |   10 | Using join buffer (flat, BNL join) |
+----+-------------+-------+------+---------------+------+---------+------+------+------------------------------------+
3 rows in set (0.00 sec)

We got it.

Note that this is still a best-effort approach:

  • you won't be successful in forcing join orders which the optimizer considers invalid (e.g. for "t1 LEFT JOIN t2" you won't be able to get a join order of t2,t1).
  • The optimizer does various plan pruning and may discard the requested join order before it has a chance to find out that it is a million-times cheaper than any other.

Semi-joins

It is possible to force the join order of joins plus semi-joins. This may cause a different strategy to be used:

MariaDB [test]> set debug_optimizer_prefer_join_prefix=NULL;
Query OK, 0 rows affected (0.00 sec)

MariaDB [test]> explain select * from ten A where a in (select B.a from ten B, ten C where C.a + A.a < 4);
+----+-------------+-------+------+---------------+------+---------+------+------+----------------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra                      |
+----+-------------+-------+------+---------------+------+---------+------+------+----------------------------+
|  1 | PRIMARY     | A     | ALL  | NULL          | NULL | NULL    | NULL |   10 |                            |
|  1 | PRIMARY     | B     | ALL  | NULL          | NULL | NULL    | NULL |   10 | Using where                |
|  1 | PRIMARY     | C     | ALL  | NULL          | NULL | NULL    | NULL |   10 | Using where; FirstMatch(A) |
+----+-------------+-------+------+---------------+------+---------+------+------+----------------------------+
3 rows in set (0.00 sec)

MariaDB [test]> set debug_optimizer_prefer_join_prefix='C,A,B';
Query OK, 0 rows affected (0.00 sec)

MariaDB [test]> explain select * from ten A where a in (select B.a from ten B, ten C where C.a + A.a < 4);
+----+-------------+-------+------+---------------+------+---------+------+------+-------------------------------------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra                                           |
+----+-------------+-------+------+---------------+------+---------+------+------+-------------------------------------------------+
|  1 | PRIMARY     | C     | ALL  | NULL          | NULL | NULL    | NULL |   10 | Start temporary                                 |
|  1 | PRIMARY     | A     | ALL  | NULL          | NULL | NULL    | NULL |   10 | Using where; Using join buffer (flat, BNL join) |
|  1 | PRIMARY     | B     | ALL  | NULL          | NULL | NULL    | NULL |   10 | Using where; End temporary                      |
+----+-------------+-------+------+---------------+------+---------+------+------+-------------------------------------------------+
3 rows in set (0.00 sec)

Semi-join materialization is a somewhat special case, because "join prefix" is not exactly what you see in the EXPLAIN output. For semi-join materialization:

  • don't put "<subqueryN>" into @@debug_optimizer_prefer_join_prefix
  • instead, put all of the materialization tables into the place where you want the <subqueryN> line.
  • Attempts to control the join order inside the materialization nest will be unsuccessful. Example: we want A-C-B-AA:
MariaDB [test]> set debug_optimizer_prefer_join_prefix='A,C,B,AA';
Query OK, 0 rows affected (0.00 sec)

MariaDB [test]> explain select * from ten A, ten AA where A.a in (select B.a from ten B, ten C);
+----+-------------+-------------+--------+---------------+--------------+---------+------+------+------------------------------------+
| id | select_type | table       | type   | possible_keys | key          | key_len | ref  | rows | Extra                              |
+----+-------------+-------------+--------+---------------+--------------+---------+------+------+------------------------------------+
|  1 | PRIMARY     | A           | ALL    | NULL          | NULL         | NULL    | NULL |   10 |                                    |
|  1 | PRIMARY     | <subquery2> | eq_ref | distinct_key  | distinct_key | 5       | func |    1 |                                    |
|  1 | PRIMARY     | AA          | ALL    | NULL          | NULL         | NULL    | NULL |   10 | Using join buffer (flat, BNL join) |
|  2 | SUBQUERY    | B           | ALL    | NULL          | NULL         | NULL    | NULL |   10 |                                    |
|  2 | SUBQUERY    | C           | ALL    | NULL          | NULL         | NULL    | NULL |   10 |                                    |
+----+-------------+-------------+--------+---------------+--------------+---------+------+------+------------------------------------+
5 rows in set (0.00 sec)

but we get A-B-C-AA.

debug_optimizer_dupsweedout_penalized

There are four semi-join execution strategies:

  1. FirstMatch
  2. Materialization
  3. LooseScan
  4. DuplicateWeedout

The first three strategies have flags in @@optimizer_switch that can be used to disable them. The DuplicateWeedout strategy does not have a flag. This was done for a reason, as that strategy is the catch-all strategy and it can handle all kinds of subqueries, in all kinds of join orders. (We're slowly moving to the point where it will be possible to run with FirstMatch enabled and everything else disabled but we are not there yet.)

Since DuplicateWeedout cannot be disabled, there are cases where it "gets in the way" by being chosen over the strategy you need. This is what debug_optimizer_dupsweedout_penalized is for. if you set:

MariaDB [test]> set debug_optimizer_dupsweedout_penalized=TRUE;

...the costs of query plans that use DuplicateWeedout will be multiplied by a millon. This doesn't mean that you will get rid of DuplicateWeedout due to Bug #898747 it is still possible to have DuplicateWeedout used even if a cheaper plan exits. A partial remedy to this is to run with

MariaDB [test]> set optimizer_prune_level=0;

It is possible to use both debug_optimizer_dupsweedout_penalized and debug_optimizer_prefer_join_prefix at the same time. This should give you the desired strategy and join order.

Further reading

  • See mysql-test/t/debug_optimizer.test (in the MariaDB source code) for examples

Comments

Comments loading...
Content reproduced on this site is the property of its respective owners, and this content is not reviewed in advance by MariaDB. The views, information and opinions expressed by this content do not necessarily represent those of MariaDB or any other party.