Learn about table elimination for query optimization in MariaDB Server. This section explains how the optimizer removes unnecessary tables from query plans, improving performance.
This page is licensed: CC BY-SA / Gnu FDL
The first thing the MariaDB optimizer does is to merge the VIEW
definition into the query to obtain:
SELECT ACRAT_rating
FROM
ac_anchor
LEFT JOIN ac_name ON ac_anchor.AC_ID=ac_name.AC_ID
LEFT JOIN ac_dob ON ac_anchor.AC_ID=ac_dob.AC_ID
LEFT JOIN ac_rating ON (ac_anchor.AC_ID=ac_rating.AC_ID AND
ac_rating.ACRAT_fromdate =
(SELECT MAX(sub.ACRAT_fromdate)
FROM ac_rating sub WHERE sub.AC_ID = ac_rating.AC_ID))
WHERE
ACNAM_name='Gary Oldman'It's important to realize that the obtained query has a useless part:
left join ac_dob on ac_dob.AC_ID=... will produce exactly
one matching record:
primary key(ac_dob.AC_ID) guarantees that there will be
at most one match for any value of ac_anchor.AC_ID,
and if there won't be a match, LEFT JOIN will generate a
NULL-complemented “row”
and we don't care what the matching record is, as tableac_dob is not used anywhere else in the query.
This means that the left join ac_dob on ... part can be
removed from the query and this is what Table Elimination module does. The
detection logic is rather smart, for example it would be able to remove theleft join ac_rating on ... part as well, together with the
subquery (in the above example it won't be removed because ac_rating used in
the selection list of the query). The Table Elimination module can also handle
nested outer joins and multi-table outer joins.
This page is based on the following blog post about table elimination:
This page is licensed: CC BY-SA / Gnu FDL
In addition to MariaDB, Table Elimination is found in both Microsoft SQL Server 2005/2008 and Oracle 11g. Of the two, Microsoft SQL Server 2005/2008 seems to have the most advanced implementation. Oracle 11g has been confirmed to use table elimination but not to the same extent.
To compare the two, we will look at the following query:
SELECT
A.colA
FROM
tableA A
LEFT OUTER JOIN
tableB B
ON
B.id = A.id;When using A as the left table we ensure that the query will return at least as many rows as there are in that table. For rows where the join condition (B.id = A.id) is not met the selected column (A.colA) will still contain its original value. The not seen B.* row would contain all NULL:s.
However, the result set could actually contain more rows than what is found in tableA if there are duplicates of the column B.id in tableB. If A contains a row [1, "val1"] and B the rows [1, "other1a"],[1, "other1b"] then two rows will match in the join condition. The only way to know what the result will look like is to actually touch both tables during execution.
Instead, let's say tableB contains rows that make it possible to place a unique constraint on the column B.id, for example, which is often the case with a primary key. In this situation we know that we will get exactly as many rows as there are in tableA, since joining with tableB cannot introduce any duplicates. Furthermore, as in the example query, if we do not select any columns from tableB, touching that table during execution is unnecessary. We can remove the whole join operation from the execution plan.
Both SQL Server 2005/2008 and Oracle 11g deploy table elimination in the case described above. Let us look at a more advanced query, where Oracle fails.
In this example we have added another join condition, which ensures that we only pick the matching row from tableB having the latest fromDate. In this case tableB will contain duplicates of the column B.id, so in order to ensure uniqueness the primary key has to contain the fromDate column as well. In other words the primary key of tableB is (B.id, B.fromDate).
Furthermore, since the subselect ensures that we only pick the latest B.fromDate for a given B.id we know that at most one row will match the join condition. We will again have the situation where joining with tableB cannot affect the number of rows in the result set. Since we do not select any columns from tableB, the whole join operation can be eliminated from the execution plan.
SQL Server 2005/2008 will deploy table elimination in this situation as well. We have not found a way to make Oracle 11g use it for this type of query. Queries like these arise in two situations. Either when you have a denormalized model consisting of a fact table with several related dimension tables, or when you have a highly normalized model where each attribute is stored in its own table. The example with the subselect is common whenever you store historized/versioned data.
This page is based on the following blog post about table elimination:
This page is licensed: CC BY-SA / Gnu FDL
The basic idea behind table elimination is that sometimes it is possible to resolve a query without even accessing some of the tables that the query refers to. One can invent many kinds of such cases, but in Table Elimination we targeted only a certain class of SQL constructs that one ends up writing when they are querying data.
The sample queries were drawn from “Anchor Modeling”, a database modeling technique which takes normalization to the extreme. The at the have an in-depth explanation of Anchor modeling and its merits, but the part that's important for table elimination can be shown with an example.
Suppose the database stores information about actors, together with their names, birthdays, and ratings, where ratings can change over time:
According to anchor modeling, each attribute should go into its own table:
the 'anchor' table which only has a synthetic primary key:
SELECT
A.colA
FROM
tableA A
LEFT OUTER JOIN
tableB B
ON
B.id = A.id
AND
B.fromDate = (
SELECT
MAX(sub.fromDate)
FROM
tableB sub
WHERE
sub.id = A.id
);a table for the 'name' attribute:
a table for the 'birthdate' attribute:
a table for the ‘rating’ attribute, which is historized:
With this approach it becomes easy to add/change/remove attributes, but this comes at a cost of added complexity in querying the data: in order to answer the simplest, select-star question of displaying actors and their current ratings one has to write outer joins:
Display actors, with their names and current ratings:
We don't want to write the joins every time we need to access an actor's properties, so we’ll create a view:
This will allow us to access the data as if it was stored in a regular way:
And this is where table elimination will be needed.
This page is based on the following blog post about table elimination:?p=58
This page is licensed: CC BY-SA / Gnu FDL

CREATE TABLE ac_anchor(AC_ID INT PRIMARY KEY);CREATE TABLE ac_name(AC_ID INT, ACNAM_name CHAR(N),
PRIMARY KEY(AC_ID));CREATE TABLE ac_dob(AC_ID INT,
ACDOB_birthdate DATE,
PRIMARY KEY(AC_ID));CREATE TABLE ac_rating(AC_ID INT,
ACRAT_rating INT,
ACRAT_fromdate DATE,
PRIMARY KEY(AC_ID, ACRAT_fromdate));SELECT
ac_anchor.AC_ID, ACNAM_Name, ACDOB_birthdate, ACRAT_rating
FROM
ac_anchor
LEFT JOIN ac_name ON ac_anchor.AC_ID=ac_name.AC_ID
LEFT JOIN ac_dob ON ac_anchor.AC_ID=ac_dob.AC_ID
LEFT JOIN ac_rating ON (ac_anchor.AC_ID=ac_rating.AC_ID AND
ac_rating.ACRAT_fromdate =
(SELECT MAX(sub.ACRAT_fromdate)
FROM ac_rating sub WHERE sub.AC_ID = ac_rating.AC_ID))CREATE VIEW actors AS
SELECT ac_anchor.AC_ID, ACNAM_Name, ACDOB_birthdate, ACRAT_rating
FROM <see the SELECT above>SELECT ACRAT_rating FROM actors WHERE ACNAM_name='Gary Oldman'One can check that table elimination is working by looking at the output of EXPLAIN [EXTENDED] and not finding there the tables that were eliminated:
EXPLAIN SELECT ACRAT_rating FROM actors WHERE ACNAM_name=’Gary Oldman’;
+----+--------------------+-----------+--------+---------------+---------+---------+----------------------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+-----------+--------+---------------+---------+---------+----------------------+------+-------------+
| 1 | PRIMARY | ac_anchor | index | PRIMARY | PRIMARY | 4 | NULL | 2 | Using index |
| 1 | PRIMARY | ac_name | eq_ref | PRIMARY | PRIMARY | 4 | test.ac_anchor.AC_ID | 1 | Using where |
| 1 | PRIMARY | ac_rating | ref | PRIMARY | PRIMARY | 4 | test.ac_anchor.AC_ID | 1 | |
| 3 | DEPENDENT SUBQUERY | sub | ref | PRIMARY | PRIMARY | 4 | test.ac_rating.AC_ID | 1 | Using index |
+----+--------------------+-----------+--------+---------------+---------+---------+----------------------+------+-------------+Note that ac_dob table is not in the output. Now let's try getting birthdate instead:
The ac_dob table is there while ac_rating and the subquery are gone. Now, if we just want to check the name of the actor:
EXPLAIN SELECT count(*) FROM actors WHERE ACNAM_name=’Gary Oldman’;
+----+-------------+-----------+--------+---------------+---------+---------+----------------------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-----------+--------+---------------+---------+---------+----------------------+------+-------------+
| 1 | PRIMARY | ac_anchor | index | PRIMARY | PRIMARY | 4 | NULL | 2 | Using index |
| 1 | PRIMARY | ac_name | eq_ref | PRIMARY | PRIMARY | 4 | test.ac_anchor.AC_ID | 1 | Using where |
+----+-------------+-----------+--------+---------------+---------+---------+----------------------+------+-------------+
2 rows in set (0.01 sec)In this case it will eliminate both the ac_dob andac_rating tables.
Removing tables from a query does not make the query slower, and it does not cut off any optimization opportunities, so table elimination is unconditional and there are no plans on having any kind of query hints for it.
For debugging purposes there is a table_elimination=on|off switch in debug builds of the server.
This page is based on the following blog post about table elimination:
This page is licensed: CC BY-SA / Gnu FDL
EXPLAIN SELECT ACDOB_birthdate FROM actors WHERE ACNAM_name=’Gary Oldman’;
+----+-------------+-----------+--------+---------------+---------+---------+----------------------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-----------+--------+---------------+---------+---------+----------------------+------+-------------+
| 1 | PRIMARY | ac_anchor | index | PRIMARY | PRIMARY | 4 | NULL | 2 | Using index |
| 1 | PRIMARY | ac_name | eq_ref | PRIMARY | PRIMARY | 4 | test.ac_anchor.AC_ID | 1 | Using where |
| 1 | PRIMARY | ac_dob | eq_ref | PRIMARY | PRIMARY | 4 | test.ac_anchor.AC_ID | 1 | |
+----+-------------+-----------+--------+---------------+---------+---------+----------------------+------+-------------+
3 rows in set (0.01 sec)