How to force query plans
The optimizer is largely cost based and will try to choose the optimal plan for any query. However in some cases it does not have enough information to do a perfect plan and in these cases you may have to provide hints to force the optimizer to use another plan.
You can examine the query plan for a SELECT by writing EXPLAIN before the statement.
For the following queries, we will use the world database for the examples.
Setting up the world example database
Download it from ftp://ftp.askmonty.org/public/world.sql.gz
Install it:
mysqladmin create world zcat world.sql.gz | ../client/mysql world
or
mysqladmin create world gunzip world.sql.gz ../client/mysql world < world.sql
Forcing join order
You can force the join order by using STRAIGHT_JOIN either in the SELECT or JOIN part.
The simplest set to force the join order is to put the tables in the correct
order in the FROM
clause and use SELECT
STRAIGHT_JOIN
:
SELECT STRAIGHT_JOIN SUM(City.Population) FROM Country,City WHERE City.CountryCode=Country.Code AND Country.HeadOfState="Vladimir Putin";
If you only want to force the join order for a few tables, use
STRAIGHT_JOIN
in the FROM
clause: Only tables connected with STRAIGHT_JOIN
will have their order forced.
SELECT SUM(City.Population) FROM Country STRAIGHT_JOIN City WHERE City.CountryCode=Country.Code AND Country.HeadOfState="Vladimir Putin";
In both of the above cases Country
will be scanned
first and for each matching country (one in this case) all rows in
City
will be checked for match. As there is only one
matching country this will be faster than the original query.
The output of EXPLAIN for the above cases are:
id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
---|---|---|---|---|---|---|---|---|---|
1 | SIMPLE | Country | ALL | PRIMARY | NULL | NULL | NULL | 239 | Using where |
1 | SIMPLE | City | ALL | NULL | NULL | NULL | NULL | 4079 | Using where; Using join buffer (flat, BNL join) |
This is one of the few cases where ALL
is ok, as the scan
of the Country
table will only find one matching row.
Forcing usage of a specific index for the WHERE
clause.
In some cases the optimizer may choose a not optimal index or not use an index at all, even if some index could theoretically be used.
In this case you have the options to either tell the optimizer to only use a limited set of indexes, ignore one or more indexes or force the usage of some index.
Use a limited set of indexes
You can limit which index are considered with the USE INDEX option.
USE INDEX [{FOR {JOIN|ORDER BY|GROUP BY}] ([index_list])
The default is FOR JOIN
which means that the hint only
affects how the WHERE
clause is optimized.
USE INDEX
is used after the table name in the
FROM
clause.
Example:
CREATE INDEX Name ON City (Name); CREATE INDEX CountryCode ON City (Countrycode); EXPLAIN SELECT Name FROM City USE INDEX (CountryCode) WHERE name="Helsingborg" AND countrycode="SWE";
This produces:
id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
---|---|---|---|---|---|---|---|---|---|
1 | SIMPLE | City | ref | CountryCode | CountryCode | 3 | const | 14 | Using where |
If we would not have used USE INDEX
also the
Name
index would have been in possible
keys
.
Don't use some index
You can tell the optimizer to not consider some index with the IGNORE INDEX option.
IGNORE INDEX [{FOR {JOIN|ORDER BY|GROUP BY}] ([index_list])
This is used after the table name in the FROM
clause:
CREATE INDEX Name ON City (Name); CREATE INDEX CountryCode ON City (Countrycode); EXPLAIN SELECT Name FROM City IGNORE INDEX (Name) WHERE name="Helsingborg" AND countrycode="SWE";
This produces:
id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
---|---|---|---|---|---|---|---|---|---|
1 | SIMPLE | City | ref | CountryCode | CountryCode | 3 | const | 14 | Using where |
The benefit of using IGNORE_INDEX
instead of
USE_INDEX
is that the first version will not
disable new index that you add later.
Forcing an index
Forcing an index to be used is mostly useful when the optimizer decides to do a table scan even if you know that using an index would be better. (The reason that the optimizer could decide to do a table scan, even if there isan available index, is that if most rows in a table, then a table scan is usually faster than using an index).
CREATE INDEX Name ON City (Name); EXPLAIN SELECT Name,CountryCode FROM City FORCE INDEX (Name) WHERE name>="A" and CountryCode >="A";
This produces:
id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
---|---|---|---|---|---|---|---|---|---|
1 | SIMPLE | City | range | Name | Name | 35 | NULL | 4079 | Using where |
The FORCE_INDEX
works by only considering the given
indexes (like with USE_INDEX
but in addition it
tells the optimizer to regard a table scan as something very
expensive. However if no one of the 'forced' index can be used, then
a table scan will be used anyway.
Forcing an index to be used for ORDER BY
or GROUP BY
.
The optimzier will try to use indexes to resolve ORDER BY
and GROUP BY
.
You can use USE INDEX
, IGNORE INDEX
and FORCE INDEX
as for the WHERE
clause above to ensure that some specific index used:
USE INDEX [{FOR {JOIN|ORDER BY|GROUP BY}] ([index_list])
This is used after the table name in the FROM
clause.
Example:
CREATE INDEX Name ON City (Name); EXPLAIN SELECT Name,Count(*) FROM City FORCE INDEX FOR GROUP BY (Name) WHERE population >= 10000000 GROUP BY Name;
This produces:
id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
---|---|---|---|---|---|---|---|---|---|
1 | SIMPLE | City | index | NULL | Name | 35 | NULL | 4079 | Using where |
Without the FORCE INDEX
option we would have in the
extra column Using where; Using temporary; Using
filesort
which means that the optimizer would created a temporary
table and sort it.
Help the optimizer optimize GROUP BY
and ORDER BY
.
The optimizer is using several strategies to optimize GROUP BY
and ORDER BY
:
- Resolve with an index:
- Scan the table in index order an output data as we go. (This only works if the
ORDER BY
/GROUP BY
can be resolved by an index after constant propagation is done).
- Scan the table in index order an output data as we go. (This only works if the
- Filesort:
- Scan the table to be sorted and collect the sort keys in temporary file.
- Sort the keys + reference to row (with filesort)
- Scan the table in sorted order
- Use temporary table for
ORDER BY
.- Create a temporary (in memory) table for the 'to-be-sorted' data. (If this gets bigger than
max_heap_table_size
or contains blobs then an Aria or MyISAM disk based table will be used) - Sort the keys + reference to row (with filesort)
- Scan the table in sorted order
- Create a temporary (in memory) table for the 'to-be-sorted' data. (If this gets bigger than
A temporary table will always be used if the fields that will be sorted is not from the first table in the JOIN order.
- Use temporary table for
GROUP BY
:- Create a temporary table to hold the
GROUP BY
result with an index that matches theGROUP BY
fields. - Produce a result row
- If a row with the
GROUP BY
key exist in the temporary table, add the new result row to it. If not, create a new row. - Before sending the reusult to the user, sort the rows with filesort to get the result in the
GROUP BY
order.
- Create a temporary table to hold the
Force/disallowing temporary table to be used for GROUP BY
:
Using an in-memory table (as described above) is usually the fastest
option for GROUP BY
if the result set is small but
is not optimal if the result set is very big. You can tell the
optimizer this by using SELECT SQL_SMALL_RESULT
or
SELECT SQL_BIG_RESULT
:
EXPLAIN SELECT SQL_SMALL_RESULT Name,Count(*) AS Cities FROM City GROUP BY Name HAVING Cities > 2;
Produces:
id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
---|---|---|---|---|---|---|---|---|---|
1 | SIMPLE | City | ALL | NULL | NULL | NULL | NULL | 4079 | Using temporary; Using filesort |
while
EXPLAIN SELECT SQL_BIG_RESULT Name,Count(*) AS Cities FROM City GROUP BY Name HAVING Cities > 2;
Produces:
id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
---|---|---|---|---|---|---|---|---|---|
1 | SIMPLE | City | ALL | NULL | NULL | NULL | NULL | 4079 | Using filesort |
The difference is that with SQL_SMALL_RESULT
a
temporary table is used.
Forcing usage of temporary table
In some cases you may want to force the use of a temporary table for the result to free up the table/row locks for the used tables quickly as possible.
You can do this with the SQL_BUFFER_RESULT
option:
CREATE INDEX Name ON City (Name); EXPLAIN SELECT SQL_BUFFER_RESULT Name,Count(*) AS Cities FROM City GROUP BY Name HAVING Cities > 2;
This produces:
id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
---|---|---|---|---|---|---|---|---|---|
1 | SIMPLE | City | index | NULL | Name | 35 | NULL | 4079 | Using index; Using temporary |
Without SQL_BUFFER_RESULT
the above query would not use
a temporary table for the result set.
Optimizer switch
In MariaDB 5.3 we added an optimizer switch that allows you to specify which algorithms will be considered when optimizing an query.
See the optimizer section for more information about the different algorithms that are used.