MariaDB 5.3 optimizer debugging
Contents
MariaDB 5.3 has an optimizer debugging patch. The patch is pushed into
lp:maria-captains/maria/5.3-optimizer-debugging
The patch is wrapped in #ifdef, but there is a #define straight in mysql_priv.h so simply compiling that tree should produce a binary with optimizer debugging enabled.
The patch adds two variables:
- debug_optimizer_prefer_join_prefix
- debug_optimizer_dupsweedout_penalized
the variables are present as session/global variables, as well as settable from command line.
debug_optimizer_prefer_join_prefix
If this variable is non-NULL, it is assumed to specify a join prefix as a comma-separated list of table aliases:
set debug_optimizer_prefer_join_prefix='tbl1,tbl2,tbl3';
The optimizer will try its best to build join plan that matches the specified join prefix. It does it by comparing join prefixes it is considering with @@debug_optimizer_prefer_join_prefix, and multiplying cost by million if the plan doesn't match the prefix.
As a result, you can more-or-less control the join order. For example, let's take this query:
MariaDB [test]> explain select * from ten A, ten B, ten C; +----+-------------+-------+------+---------------+------+---------+------+------+------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+------+---------------+------+---------+------+------+------------------------------------+ | 1 | SIMPLE | A | ALL | NULL | NULL | NULL | NULL | 10 | | | 1 | SIMPLE | B | ALL | NULL | NULL | NULL | NULL | 10 | Using join buffer (flat, BNL join) | | 1 | SIMPLE | C | ALL | NULL | NULL | NULL | NULL | 10 | Using join buffer (flat, BNL join) | +----+-------------+-------+------+---------------+------+---------+------+------+------------------------------------+ 3 rows in set (0.00 sec)
and request a join order of C,A,B:
MariaDB [test]> set debug_optimizer_prefer_join_prefix='C,A,B'; Query OK, 0 rows affected (0.00 sec) MariaDB [test]> explain select * from ten A, ten B, ten C; +----+-------------+-------+------+---------------+------+---------+------+------+------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+------+---------------+------+---------+------+------+------------------------------------+ | 1 | SIMPLE | C | ALL | NULL | NULL | NULL | NULL | 10 | | | 1 | SIMPLE | A | ALL | NULL | NULL | NULL | NULL | 10 | Using join buffer (flat, BNL join) | | 1 | SIMPLE | B | ALL | NULL | NULL | NULL | NULL | 10 | Using join buffer (flat, BNL join) | +----+-------------+-------+------+---------------+------+---------+------+------+------------------------------------+ 3 rows in set (0.00 sec)
We got it.
Note that this is still best-effort approach:
- you won't be successful in forcing join orders that optimizer considers invalid (e.g. for "t1 LEFT JOIN t2" you won't be able to get a join order of t2,t1.
- The optimizer does various plan pruning and may discard the requested join order before it has a chance to find out that it is million-times cheaper than any other.
Semi-joins
It is possible to force the join order of joins plus semi-joins. This may cause different strategy to be used:
MariaDB [test]> set debug_optimizer_prefer_join_prefix=NULL; Query OK, 0 rows affected (0.00 sec) MariaDB [test]> explain select * from ten A where a in (select B.a from ten B, ten C where C.a + A.a < 4); +----+-------------+-------+------+---------------+------+---------+------+------+----------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+------+---------------+------+---------+------+------+----------------------------+ | 1 | PRIMARY | A | ALL | NULL | NULL | NULL | NULL | 10 | | | 1 | PRIMARY | B | ALL | NULL | NULL | NULL | NULL | 10 | Using where | | 1 | PRIMARY | C | ALL | NULL | NULL | NULL | NULL | 10 | Using where; FirstMatch(A) | +----+-------------+-------+------+---------------+------+---------+------+------+----------------------------+ 3 rows in set (0.00 sec) MariaDB [test]> set debug_optimizer_prefer_join_prefix='C,A,B'; Query OK, 0 rows affected (0.00 sec) MariaDB [test]> explain select * from ten A where a in (select B.a from ten B, ten C where C.a + A.a < 4); +----+-------------+-------+------+---------------+------+---------+------+------+-------------------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+------+---------------+------+---------+------+------+-------------------------------------------------+ | 1 | PRIMARY | C | ALL | NULL | NULL | NULL | NULL | 10 | Start temporary | | 1 | PRIMARY | A | ALL | NULL | NULL | NULL | NULL | 10 | Using where; Using join buffer (flat, BNL join) | | 1 | PRIMARY | B | ALL | NULL | NULL | NULL | NULL | 10 | Using where; End temporary | +----+-------------+-------+------+---------------+------+---------+------+------+-------------------------------------------------+ 3 rows in set (0.00 sec)
Semi-join materialization is a somewhat special case, because "join prefix" is not exactly what you see in EXPLAIN output. For semi-join materialization, one should
- don't put "<subqueryN>" into @@debug_optimizer_prefer_join_prefix.
- instead, put all of the materialization tables into the place where you want the <subqueryN> line.
- Attempts to control the join order inside the materialization nest will be unsuccessful. Example: we want A-C-B-AA:
MariaDB [test]> set debug_optimizer_prefer_join_prefix='A,C,B,AA'; Query OK, 0 rows affected (0.00 sec) MariaDB [test]> explain select * from ten A, ten AA where A.a in (select B.a from ten B, ten C); +----+-------------+-------------+--------+---------------+--------------+---------+------+------+------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------------+--------+---------------+--------------+---------+------+------+------------------------------------+ | 1 | PRIMARY | A | ALL | NULL | NULL | NULL | NULL | 10 | | | 1 | PRIMARY | <subquery2> | eq_ref | distinct_key | distinct_key | 5 | func | 1 | | | 1 | PRIMARY | AA | ALL | NULL | NULL | NULL | NULL | 10 | Using join buffer (flat, BNL join) | | 2 | SUBQUERY | B | ALL | NULL | NULL | NULL | NULL | 10 | | | 2 | SUBQUERY | C | ALL | NULL | NULL | NULL | NULL | 10 | | +----+-------------+-------------+--------+---------------+--------------+---------+------+------+------------------------------------+ 5 rows in set (0.00 sec)
but we A-B-C-AA.
debug_optimizer_dupsweedout_penalized
<<code inline=false>> <</code>>