LooseScan strategy
LooseScan is an execution strategy for semi-join subqueries.
The idea
We will demonstrate the LooseScan strategy by example. Suppose we are looking for countries that have satellites. We can find them with the following query (for simplicity, we ignore satellites that are owned by consortiums of multiple countries):
select * from Country where Country.code in (select country_code from Satellite)
Suppose there is an index on Satellite.country_code. If we use that index, we will read satellites in the order of their owner country.
When satellites are ordered this way, they are also grouped by country: for example, all satellites belonging to Australia come together. Because the satellites of each country are grouped together, it is easy to select one item from each group (that is, one per country) and thus avoid producing duplicates.
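The "one item per group" idea can be sketched outside the server. The Python below is a minimal illustration, not MariaDB's implementation. It assumes the index on Satellite.country_code can be modelled as a list of (country_code, satellite_name) pairs sorted by country_code; the names satellite_index, countries and loose_scan, as well as the sample rows, are hypothetical.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical stand-in for an index on Satellite.country_code:
# entries are sorted by country_code, so equal codes are adjacent.
satellite_index = sorted([
    ("AUS", "sat-a1"),
    ("USA", "sat-u1"),
    ("AUS", "sat-a2"),
    ("FRA", "sat-f1"),
    ("USA", "sat-u2"),
])

# Hypothetical Country table, keyed by its primary key (code).
countries = {
    "AUS": "Australia",
    "BRA": "Brazil",
    "FRA": "France",
    "USA": "United States",
}

def loose_scan(index_entries, country_by_code):
    """Yield each matching country once, using one index entry per group."""
    for code, _group in groupby(index_entries, key=itemgetter(0)):
        # Only the group's key is needed; the remaining entries with the
        # same code are skipped, so no duplicate lookups are made.
        if code in country_by_code:
            yield code, country_by_code[code]

result = list(loose_scan(satellite_index, countries))
```

Here each country with at least one satellite appears exactly once in result, and Brazil, which has no entry in the hypothetical index, does not appear at all.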
LooseScan in action
The EXPLAIN output for the above query looks as follows:
MariaDB [world]> explain select * from Country where Country.code in (select country_code from Satellite);
+----+-------------+-----------+--------+---------------+--------------+---------+------------------------------+------+-------------------------------------+
| id | select_type | table     | type   | possible_keys | key          | key_len | ref                          | rows | Extra                               |
+----+-------------+-----------+--------+---------------+--------------+---------+------------------------------+------+-------------------------------------+
|  1 | PRIMARY     | Satellite | index  | country_code  | country_code | 9       | NULL                         |  932 | Using where; Using index; LooseScan |
|  1 | PRIMARY     | Country   | eq_ref | PRIMARY       | PRIMARY      | 3       | world.Satellite.country_code |    1 | Using index condition               |
+----+-------------+-----------+--------+---------------+--------------+---------+------------------------------+------+-------------------------------------+
Factsheet
- LooseScan avoids producing duplicate record combinations by putting the subquery table first and using its index to select one record from each group of duplicates
- Hence, for LooseScan to be applicable, the subquery should look like:
expr IN (SELECT tbl.keypart1 FROM tbl ...)
or
expr IN (SELECT tbl.keypart2 FROM tbl WHERE tbl.keypart1=const AND ...)
- LooseScan can handle correlated subqueries
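The second subquery form relies on the fact that, once keypart1 is fixed to a constant, the index entries in that range are ordered (and therefore grouped) by keypart2. The following Python is a rough sketch of that reasoning only, under the assumption that a composite index on (keypart1, keypart2) can be modelled as a sorted list of tuples. The names composite_index and distinct_keypart2 and the sample values are hypothetical and do not describe MariaDB internals.

```python
from bisect import bisect_left
from itertools import groupby, takewhile

# Hypothetical composite index on (keypart1, keypart2), kept sorted.
composite_index = sorted([
    ("comm", "USA"), ("comm", "AUS"), ("nav", "USA"),
    ("comm", "USA"), ("nav", "FRA"), ("comm", "AUS"),
])

def distinct_keypart2(index_entries, const):
    """Return each keypart2 value once for entries where keypart1 == const."""
    # (const,) sorts before every (const, x), so this finds the first
    # entry of the keypart1 == const range, if there is one.
    start = bisect_left(index_entries, (const,))
    in_range = takewhile(lambda entry: entry[0] == const, index_entries[start:])
    # Within the range, entries are grouped by keypart2; keep one per group.
    return [value for value, _group in groupby(entry[1] for entry in in_range)]

comm_values = distinct_keypart2(composite_index, "comm")
nav_values = distinct_keypart2(composite_index, "nav")
```

Without the equality on keypart1, equal keypart2 values would be scattered across the index rather than adjacent, which is why the first key part has to be bound to a constant in this form.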