Optimizations for derived tables
Derived tables are subqueries in the FROM clause. Prior to MariaDB 5.3/MySQL 5.6, they were too slow to be usable. MariaDB 5.3/MySQL 5.6 introduce two optimizations that provide adequate performance:
- Derived table merge
- Automatic index creation for derived tables
Derived table merge
The idea
Users of "big" database systems are used to structuring their queries with FROM subqueries. For example, if one first thinks of selecting cities with a population greater than 10,000, and then of selecting from those cities the ones located in Germany, one could write this SQL:
select * from (select * from City where Population > 10*1000) as big_city where big_city.Country='DEU'
For MySQL, such syntax was taboo. Running EXPLAIN for a query of this shape (the recorded output below uses a threshold of 1*1000) shows why:
mysql> explain select * from (select * from City where Population > 1*1000) as big_city where big_city.Country='DEU' ;
+----+-------------+------------+------+---------------+------+---------+------+------+-------------+
| id | select_type | table      | type | possible_keys | key  | key_len | ref  | rows | Extra       |
+----+-------------+------------+------+---------------+------+---------+------+------+-------------+
|  1 | PRIMARY     | <derived2> | ALL  | NULL          | NULL | NULL    | NULL | 4068 | Using where |
|  2 | DERIVED     | City       | ALL  | Population    | NULL | NULL    | NULL | 4079 | Using where |
+----+-------------+------------+------+---------------+------+---------+------+------+-------------+
2 rows in set (0.60 sec)
The optimizer plans to perform the following actions:
- Execute the subquery (select * from City where Population > 1*1000) exactly as it was written in the query.
- Put the result of the subquery into a temporary table.
- Read the rows back and apply the WHERE condition from the upper select, big_city.Country='DEU'.
Executing a subquery like this is very inefficient, because the highly selective condition from the parent select (Country='DEU') is not used when scanning the base table City. Too many records are read from the City table, written into a temporary table, and read back again before finally being filtered out.
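The materialization steps described above can be spelled out by hand. The following is only a sketch of what the old plan amounts to, not what the server literally runs; the temporary table name tmp_big_city is a hypothetical name chosen for illustration, and the City table is assumed to be the one used in the examples above.

```sql
-- Step 1: run the subquery exactly as written and store its result.
-- (tmp_big_city is a hypothetical name; the copy carries no indexes,
-- so the next step can only scan it in full.)
CREATE TEMPORARY TABLE tmp_big_city AS
  SELECT * FROM City WHERE Population > 1*1000;

-- Step 2: read the stored rows back and apply the outer WHERE condition.
SELECT * FROM tmp_big_city WHERE Country = 'DEU';

-- Clean up the intermediate result.
DROP TEMPORARY TABLE tmp_big_city;
```

Written this way, it is visible that the Country='DEU' condition never reaches the scan of City: every row passing the Population filter is copied first and filtered afterwards.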
If one runs this query in MariaDB 5.3/MySQL 5.6, one sees:
MariaDB [world]> explain select * from (select * from City where Population > 1*1000) as big_city where big_city.Country='DEU';
+----+-------------+-------+------+--------------------+---------+---------+-------+------+------------------------------------+
| id | select_type | table | type | possible_keys      | key     | key_len | ref   | rows | Extra                              |
+----+-------------+-------+------+--------------------+---------+---------+-------+------+------------------------------------+
|  1 | SIMPLE      | City  | ref  | Population,Country | Country | 3       | const |   90 | Using index condition; Using where |
+----+-------------+-------+------+--------------------+---------+---------+-------+------+------------------------------------+
1 row in set (0.00 sec)
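The single SIMPLE row in this plan suggests the derived table no longer exists as a separate step. As a rough illustration, the merged query should behave like the hand-flattened statement below. The second half of the sketch assumes a MariaDB server that exposes the derived_merge flag in optimizer_switch; other servers or versions may not offer it, so treat it as an assumption.

```sql
-- Hand-flattened form: both conditions apply directly to the base table,
-- so an index on Country (or on Population) can be used for the scan.
SELECT * FROM City WHERE Population > 1*1000 AND Country = 'DEU';

-- Assumption: MariaDB with the derived_merge optimizer_switch flag.
-- Turning the flag off should bring back the two-row DERIVED plan.
SET optimizer_switch='derived_merge=off';
EXPLAIN SELECT * FROM (SELECT * FROM City WHERE Population > 1*1000) AS big_city
WHERE big_city.Country = 'DEU';

-- Turn merging back on for the rest of the session.
SET optimizer_switch='derived_merge=on';
```

Comparing the two EXPLAIN outputs on the same data is a simple way to check whether merging actually happened for a given query.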
Automatic
See also
Optimizing Subqueries in the FROM Clause in MySQL 5.6 manual