Subquery Cache

The goal of the subquery cache is to optimize the evaluation of correlated subqueries by storing results together with correlation parameters in a cache and avoiding re-execution of the subquery in cases where the result is already in the cache.

Administration

The cache is on by default. One can switch it off using the optimizer_switch subquery_cache setting, like so:

SET optimizer_switch='subquery_cache=off';

The efficiency of the subquery cache is visible in 2 statistical variables:

The session variables tmp_table_size and max_heap_table_size influence the size of in-memory temporary tables in the table used for caching. It cannot grow more than the minimum of the above variables values (see the Implementation section for details).

Visibility

Your usage of the cache is visible in EXTENDED EXPLAIN output (warnings) as"<expr_cache><//list of parameters//>(//cached expression//)". For example:

EXPLAIN EXTENDED SELECT * FROM t1 WHERE a IN (SELECT b FROM t2);
+----+--------------------+-------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type        | table | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
+----+--------------------+-------+------+---------------+------+---------+------+------+----------+-------------+
|  1 | PRIMARY            | t1    | ALL  | NULL          | NULL | NULL    | NULL |    2 |   100.00 | Using where |
|  2 | DEPENDENT SUBQUERY | t2    | ALL  | NULL          | NULL | NULL    | NULL |    2 |   100.00 | Using where |
+----+--------------------+-------+------+---------------+------+---------+------+------+----------+-------------+
2 rows in set, 1 warning (0.00 sec)

SHOW WARNINGS;
+-------+------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Level | Code | Message                                                                                                                                                                                                    |
+-------+------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Note  | 1003 | SELECT `test`.`t1`.`a` AS `a` from `test`.`t1` WHERE <expr_cache><`test`.`t1`.`a`>(<in_optimizer>(`test`.`t1`.`a`,<exists>(SELECT 1 FROM `test`.`t2` WHERE (<cache>(`test`.`t1`.`a`) = `test`.`t2`.`b`)))) |
+-------+------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

In the example above the presence of"<expr_cache><test.t1.a>(...)" is how you know you are using the subquery cache.

Implementation

Every subquery cache creates a temporary table where the results and all parameters are stored. It has a unique index over all parameters. First the cache is created in a MEMORY table (if doing this is impossible the cache becomes disabled for that expression). When the table grows up to the minimum oftmp_table_size and max_heap_table_size, the hit rate will be checked:

  • if the hit rate is really small (<0.2) the cache will be disabled.

  • if the hit rate is moderate (<0.7) the table will be cleaned (all records deleted) to keep the table in memory

  • if the hit rate is high the table will be converted to a disk table (for 5.3.0 it can only be converted to a disk table).

Performance Impact

Here are some examples that show the performance impact of the subquery cache (these tests were made on a 2.53 GHz Intel Core 2 Duo MacBook Pro with dbt-3 scale 1 data set).

example
cache on
cache off
gain
hit
miss
hit rate
1
2
3
4

example

cache on

cache off

gain

hit

miss

hit rate

1

1.01sec

1 hour 31 min 43.33sec

5445x

149975

25

99.98%

2

0.21sec

1.41sec

6.71x

6285

220

96.6%

3

2.54sec

2.55sec

1.00044x

151

461

24.67%

4

1.87sec

1.95sec

0.96x

0

23026

0%

Example 1

Dataset from DBT-3 benchmark, a query to find customers with balance near top in their nation:

Example 2

DBT-3 benchmark, Query #17

Example 3

DBT-3 benchmark, Query #2

Example 4

DBT-3 benchmark, Query #20

See Also

This page is licensed: CC BY-SA / Gnu FDL

Last updated

Was this helpful?