MyRocks and Index-Only Scans
This article is about MyRocks and index-only scans on secondary indexes. It applies to MariaDB's MyRocks, Facebook's MyRocks, and other variants.
MyRocks indexes store "mem-comparable keys" (that is, the key values are compared with memcmp). For some datatypes, it is easy to convert between the column value and its mem-comparable form, while for others the conversion is one-way.
For example, in case-insensitive collations uppercase and lowercase letters are considered identical, i.e. 'c' = 'C'. For some datatypes, MyRocks stores extra data which allows it to restore the original value. (For the latin1_general_ci collation and character 'c', for example, it will store one bit which says whether the original value was a lowercase 'c' or an uppercase 'C'.) This doesn't work for all datatypes, though.
In particular, index-only scans are not supported for:
- BIT(n) columns (this could be implemented, but at the moment it is not supported)
- SET(...) columns (same as above)
- ENUM(...) columns (same as above)
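The extra-data idea described above for case-insensitive collations can be illustrated with a small sketch. This is a toy model, not MyRocks's actual on-disk format: the function names encode_ci and decode_ci are hypothetical, the "weight" of a letter is assumed to be its uppercase ASCII byte, and the case information is kept as a separate list of bits.

```python
from typing import List, Tuple


def encode_ci(value: str) -> Tuple[bytes, List[bool]]:
    """Toy case-insensitive encoding (hypothetical, ASCII only).

    Returns a mem-comparable key plus one bit per character that
    records whether the original character was uppercase.
    """
    if not value.isascii():
        raise ValueError("this sketch only handles ASCII input")
    # Assumed weight: the uppercase byte, so 'c' and 'C' get the same key.
    key = value.upper().encode("ascii")
    # Extra data: one bit per character remembering the original case.
    case_bits = [ch.isupper() for ch in value]
    return key, case_bits


def decode_ci(key: bytes, case_bits: List[bool]) -> str:
    """Restore the original value from the key and the stored case bits."""
    chars = key.decode("ascii")
    return "".join(
        ch if was_upper else ch.lower()
        for ch, was_upper in zip(chars, case_bits)
    )
```

In this model 'cat' and 'Cat' produce the same key bytes, so they compare as equal, while the stored bits are enough to return each value in its original case without reading the table row.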
Index-only support for various collations
As far as index-only support is concerned, MyRocks distinguishes three kinds of collations:
1. Binary (reversible) collations
These are binary, latin1_bin, and utf8_bin.
For these collations, it is possible to convert a value back from its mem-comparable form. Hence, the original value can be restored from the index record alone, and index-only scans are supported.
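The difference between a reversible and a one-way conversion can be sketched as follows. This is a simplified illustration under stated assumptions, not MyRocks code: binary_key, restore_from_binary_key, and folded_key are hypothetical names, the binary key is assumed to be just the value's UTF-8 bytes (the real encoding also has to handle details such as variable lengths), and Python's casefold() stands in for a case-insensitive weight function.

```python
def binary_key(value: str) -> bytes:
    """Toy key for a binary collation: the value's own bytes."""
    return value.encode("utf-8")


def restore_from_binary_key(key: bytes) -> str:
    """The conversion is reversible: decoding the key gives the value back."""
    return key.decode("utf-8")


def folded_key(value: str) -> bytes:
    """Toy key for a case-insensitive collation: case is folded away.

    Different originals can map to the same key, so the key alone
    cannot tell which original value was stored.
    """
    return value.casefold().encode("utf-8")
```

With the binary key, every distinct value has a distinct key, so the index record is enough to answer the query. With the folded key, 'cat' and 'CAT' collapse to the same bytes, which is why such collations need either extra stored data or a lookup of the full row.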
2. Restorable collations
These are collations where one can store some extra information which helps to restore the original value.
Criteria (from storage/rocksdb/rdb_datadic.cc, rdb_is_collation_supported()):
- one-byte characters (so, unicode-based collations are not included)
- strxfrm(one byte) = {one 1-byte weight value always}
- no binary sorting
- PAD attribute
Examples are latin1_general_ci, latin1_general_cs, latin1_swedish_ci, etc.
Index-only scans are supported for these collations.
3. All other collations
For these collations, there is no known way to restore the value from its mem-comparable form, and so index-only scans are not supported.
Instead, MyRocks needs to fetch the clustered primary key (PK) record to get the column value.