Rowid Filtering Optimization
Rowid filtering is an optimization in MariaDB 10.4. Its target use case is as follows:
- a table uses ref access on index IDX1
- but it also has a fairly restrictive range predicate on another index IDX2.
In this case, it is advantageous to:
- Do an index-only scan on index IDX2 and collect rowids of index records into a data structure that allows filtering (let's call it $FILTER).
- When doing ref access on IDX1, check $FILTER before reading the full record
Example
Consider a query
<<code:sql>> SELECT ... FROM orders JOIN lineitem ON o_orderkey=l_orderkey WHERE l_shipdate BETWEEN '1997-01-01' AND '1997-01-31' AND o_totalprice between 200000 and 230000; <</code>>
Suppose the condition on l_shipdate
is very restrictive, which means lineitem table should go first in the join order. Then, the optimizer can use o_orderkey=l_orderkey
equality to do an index lookup to get the order the line item is from. On the other hand o_totalprice between ...
can also be rather selective.
With filtering, the query plan would be:
ANALYZE FORMAT=JSON output for table orders will show
"table": { "table_name": "orders", "access_type": "eq_ref", "possible_keys": ["PRIMARY", "i_o_totalprice"], "key": "PRIMARY", "key_length": "4", "used_key_parts": ["o_orderkey"], "ref": ["dbt3_s001.lineitem.l_orderkey"], "rowid_filter": { "range": { "key": "i_o_totalprice", "used_key_parts": ["o_totalprice"] }, "rows": 69, "selectivity_pct": 4.6, "r_rows": 71, "r_selectivity_pct": 10.417, "r_buffer_size": 53, "r_filling_time_ms": 0.0716 }
Note the rowid_filter
element. It has a range
element inside it. selectivity_pct
is the expected selectivity, accompanied by the r_selectivity_pct
showing the actual observed selectivity.
Details
The filter is currently implemented as an ordered array of rowids. A Bloom filter looks like a better candidate.
Control
Rowid filtering can be switched on/off using rowid_filter
flag in the @@optimizer_switch
variable. By default, the optimization is enabled.