Table pullout optimization

You are viewing an old version of this article. View the current version here.

The idea of Table Pullout

Sometimes, a subquery can be re-written as a join. Let's take this query as example:

select *
from City 
where City.Country in (select Country.Code
                       from Country 
                       where Country.Population < 100*1000);

If we know that there can be at most one country with with a given value of Country.Code (we can tell that if we see that table Country has a primary key or unique index over that column), we can re-write this query as:

select City.* 
from 
  City, Country 
where
 City.Country=Country.Code AND Country.Population < 100*1000;

Table pullout in action

If one runs EXPLAIN for the above query in MySQL 5.1-5.6 or MariaDB 5.1-5.2, they'll get this plan:

MySQL [world]> explain select * from City where City.Country in (select Country.Code from Country where Country.Population < 100*1000);
+----+--------------------+---------+-----------------+--------------------+---------+---------+------+------+-------------+
| id | select_type        | table   | type            | possible_keys      | key     | key_len | ref  | rows | Extra       |
+----+--------------------+---------+-----------------+--------------------+---------+---------+------+------+-------------+
|  1 | PRIMARY            | City    | ALL             | NULL               | NULL    | NULL    | NULL | 4079 | Using where |
|  2 | DEPENDENT SUBQUERY | Country | unique_subquery | PRIMARY,Population | PRIMARY | 3       | func |    1 | Using where |
+----+--------------------+---------+-----------------+--------------------+---------+---------+------+------+-------------+
2 rows in set (0.00 sec)

It shows that the optimizer is going to do a full scan on table City, and for each city it will do a lookup in table Country.

Now, if one runs the same query in MariaDB 5.3, they will get this plan:

MariaDB [world]> explain select * from City where City.Country in (select Country.Code from Country where Country.Population < 100*1000);
+----+-------------+---------+-------+--------------------+------------+---------+--------------------+------+-----------------------+
| id | select_type | table   | type  | possible_keys      | key        | key_len | ref                | rows | Extra                 |
+----+-------------+---------+-------+--------------------+------------+---------+--------------------+------+-----------------------+
|  1 | PRIMARY     | Country | range | PRIMARY,Population | Population | 4       | NULL               |   37 | Using index condition |
|  1 | PRIMARY     | City    | ref   | Country            | Country    | 3       | world.Country.Code |   18 |                       |
+----+-------------+---------+-------+--------------------+------------+---------+--------------------+------+-----------------------+
2 rows in set (0.00 sec)

The interesting parts are:

  • Both tables have select_type=PRIMARY, and id=1 as if they were in one join.
  • The `Country` table is first, followed by the `City`.

Indeed, if one runs EXPLAIN EXTENDED; SHOW WARNINGS, they will see that the subquery is gone and it was replaced with a join:

MariaDB [world]> show warnings\G
*************************** 1. row ***************************
  Level: Note
   Code: 1003
Message: select `world`.`City`.`ID` AS `ID`,`world`.`City`.`Name` AS 
`Name`,`world`.`City`.`Country` AS `Country`,`world`.`City`.`Population` AS 
`Population` 

  
   from `world`.`City` join `world`.`Country` where 


((`world`.`City`.`Country` = `world`.`Country`.`Code`) and (`world`.`Country`.
`Population` < (100 * 1000)))
1 row in set (0.00 sec)

Changing subquery into join allows to feed the join to join optimizer, which can make a choice between two possible join orders

  1. City -> Country
  2. Country -> City

Comments

Comments loading...
Content reproduced on this site is the property of its respective owners, and this content is not reviewed in advance by MariaDB. The views, information and opinions expressed by this content do not necessarily represent those of MariaDB or any other party.