FirstMatch strategy

You are viewing an old version of this article. View the current version here.

FirstMatch is an execution strategy for Semi-join subqueries.

The idea

It is very similar to how IN/EXISTS subqueries were executed in MySQL 5.x.

Let's take the usual example of search for countries with big cities:

select * from Country 
where Country.code IN (select City.Country 
                       from City 
                       where City.Population > 1*1000*1000)
      and Country.continent='Europe'

Suppose, our execution plan is to find countries in Europe, and then for each found country check if it has any big cities. Regular inner join execution will look as follows:

firstmatch-inner-join

Since Germany has got two big cities (in this diagram), it will be put into query output twice. This is not correct, SELECT ... FROM Country should not produce the same query twice. FirstMatch strategy avoids production of duplicates by short-cutting execution as soon as the first genuine match was found:

firstmatch-firstmatch

Note that short-cutting has to take place after "Using where" has been applied. It would have been wrong to short-cut after we've found Trier.

FirstMatch in action

EXPLAIN for the above query will look as follows:

MariaDB [world]> explain select * from Country  where Country.code IN (select City.Country from City where City.Population > 1*1000*1000) and Country.continent='Europe';
+----+-------------+---------+------+--------------------+-----------+---------+--------------------+------+----------------------------------+
| id | select_type | table   | type | possible_keys      | key       | key_len | ref                | rows | Extra                            |
+----+-------------+---------+------+--------------------+-----------+---------+--------------------+------+----------------------------------+
|  1 | PRIMARY     | Country | ref  | PRIMARY,continent  | continent | 17      | const              |   60 | Using index condition            |
|  1 | PRIMARY     | City    | ref  | Population,Country | Country   | 3       | world.Country.Code |   18 | Using where; FirstMatch(Country) |
+----+-------------+---------+------+--------------------+-----------+---------+--------------------+------+----------------------------------+
2 rows in set (0.00 sec)

FirstMatch(Country) in in the Extra column means that as soon as we have produced one matching record combination, short-cut the execution and jump back to table Country.

FirstMatch's query plan is very similar to one you would get in MySQL:

MySQL [world]> explain select * from Country  where Country.code IN (select City.Country from City where City.Population > 1*1000*1000) and Country.continent='Europe';
+----+--------------------+---------+----------------+--------------------+-----------+---------+-------+------+------------------------------------+
| id | select_type        | table   | type           | possible_keys      | key       | key_len | ref   | rows | Extra                              |
+----+--------------------+---------+----------------+--------------------+-----------+---------+-------+------+------------------------------------+
|  1 | PRIMARY            | Country | ref            | continent          | continent | 17      | const |   60 | Using index condition; Using where |
|  2 | DEPENDENT SUBQUERY | City    | index_subquery | Population,Country | Country   | 3       | func  |   18 | Using where                        |
+----+--------------------+---------+----------------+--------------------+-----------+---------+-------+------+------------------------------------+
2 rows in set (0.01 sec)

and these two particular query plans will execute in the same time.

Difference between FirstMatch and IN->EXISTS

General idea behind FirstMatch strategy is the same as one behind IN->EXISTS transformation, however, FirstMatch has several advantages:

  • Equality propagation works across semi-join bounds, but not subquery bounds. Therefore, converting subquery to semi-join and using FirstMatch can still give a better execution plan (TODO example)
  • There is only one way to apply IN->EXISTS strategy and MySQL will do it unconditionally. with FirstMatch, the optimizer can make a choice whether it should run FirstMatch strategy as soon as all tables used in the subquery are in the join prefix, or at some later point in time (TODO: example)

FirstMatch factsheet

  • FirstMatch strategy works by executing the subquery and short-cutting its execution as soon as the first match was found.
  • This means, subquery tables must be after all of parent select's tables that are referred from the subquery predicate.
  • EXPLAIN shows FirstMatch as "FirstMatch(tableN)"
  • The strategy can handle correlated subqueries
  • But it cannot be applied if the subquery has meaningful GROUP BY and/or aggregate functions.

Comments

Comments loading...
Content reproduced on this site is the property of its respective owners, and this content is not reviewed in advance by MariaDB. The views, information and opinions expressed by this content do not necessarily represent those of MariaDB or any other party.