You are viewing an old version of this article. View the current version here.

The goal of the subquery cache is to optimize the evaluation of correlated subqueries by storing results together with correlation parameters in a cache and avoiding re-execution of the subquery in cases where the result is already in the cache.


Starting in MariaDB 5.3.2-beta the cache is on by default. In previous versions of MariaDB, the cache was off by default. One can switch it on or off using optimizer_switch subquery_cache like so:

set optimizer_switch='subquery_cache=on';

The efficiency of the subquery cache is visible in 2 statistical variables:

  • subquery_cache_hit - Global counter for all subquery cache hits.
  • subquery_cache_miss - Global counter for all subquery cache misses.

The session variables tmp_table_size and max_heap_table_size have influence on the size of in-memory temporary tables in the table used for caching. It cannot grow more than the minimum of the above variables values (see the Implementation section for details).


Your usage of the cache is visible in EXTENDED EXPLAIN output (warnings) as "<expr_cache><//list of parameters//>(//cached expression//)". For example:

MariaDB [test]> explain extended select * from t1 where a in (select b from t2);
| id | select_type        | table | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
|  1 | PRIMARY            | t1    | ALL  | NULL          | NULL | NULL    | NULL |    2 |   100.00 | Using where |
|  2 | DEPENDENT SUBQUERY | t2    | ALL  | NULL          | NULL | NULL    | NULL |    2 |   100.00 | Using where |
2 rows in set, 1 warning (0.00 sec)

MariaDB [test]> show warnings;
| Level | Code | Message                                                                                                                                                                                                    |
| Note  | 1003 | select `test`.`t1`.`a` AS `a` from `test`.`t1` where <expr_cache><`test`.`t1`.`a`>(<in_optimizer>(`test`.`t1`.`a`,<exists>(select 1 from `test`.`t2` where (<cache>(`test`.`t1`.`a`) = `test`.`t2`.`b`)))) |
1 row in set (0.00 sec)

In the example above the presence of "<expr_cache><`test`.`t1`.`a`>(...)" is how you know you are using the subquery cache.


Every subquery cache creates a temporary table where the results and all parameters are stored. It has a unique index over all parameters. First the cache is created in a heap table (if doing this is impossible the cache becomes disabled for that expression). When the table grows up to the minimum of tmp_table_size and max_heap_table_size, the hit rate will be checked:

  • if the hit rate is really small (<0.2) the cache will be disabled.
  • if the hit rate is moderate (<0.7) the table will be cleaned (all records deleted) to keep the table in memory
  • if the hit rate is high the table will be converted to a disk table (for 5.3.0 it can only be converted to a disk table).
hit rate = hit / (hit + miss)

Performance impact

Here are some examples that show the performance impact of the subquery cache (these tests were made on a 2.53 GHz Intel Core 2 Duo MacBook Pro with dbt-3 scale 1 data set).

examplecache oncache offgainhitmisshit rate
11.01sec1 hour 31 min 43.33sec5445x1499752599.98%

Example 1

Dataset from DBT-3 benchmark, a query to find customers with balance near top in their nation:

select count(*) from customer 
   c_acctbal > 0.8 * (select max(c_acctbal) 
                      from customer C 
                      where C.c_nationkey=customer.c_nationkey
                      group by c_nationkey);

Example 2

DBT-3 benchmark, Query #17

select sum(l_extendedprice) / 7.0 as avg_yearly 
from lineitem, part 
  p_partkey = l_partkey and 
  p_brand = 'Brand#42' and p_container = 'JUMBO BAG' and 
  l_quantity < (select 0.2 * avg(l_quantity) from lineitem 
                where l_partkey = p_partkey);

Example 3

DBT-3 benchmark, Query #2

        s_acctbal, s_name, n_name, p_partkey, p_mfgr, s_address, s_phone, s_comment
        part, supplier, partsupp, nation, region
        p_partkey = ps_partkey and s_suppkey = ps_suppkey and p_size = 33
        and p_type like '%STEEL' and s_nationkey = n_nationkey
        and n_regionkey = r_regionkey and r_name = 'MIDDLE EAST'
        and ps_supplycost = (
                        partsupp, supplier, nation, region
                        p_partkey = ps_partkey and s_suppkey = ps_suppkey
                        and s_nationkey = n_nationkey and n_regionkey = r_regionkey
                        and r_name = 'MIDDLE EAST'
order by
        s_acctbal desc, n_name, s_name, p_partkey;

Example 4

DBT-3 benchmark, Query #20

        s_name, s_address
        supplier, nation
        s_suppkey in (
                        distinct (ps_suppkey)
                        partsupp, part
                        and p_name like 'indian%'
                        and ps_availqty > (
                                        0.5 * sum(l_quantity)
                                        l_partkey = ps_partkey
                                        and l_suppkey = ps_suppkey
                                        and l_shipdate >= '1995-01-01'
                                        and l_shipdate < date_ADD('1995-01-01',interval 1 year)
        and s_nationkey = n_nationkey and n_name = 'JAPAN'
order by

See Also:


Comments loading...