Collect Static PDs with ANALYZE FULL TABLE for MariaDB Xpand

Overview

MariaDB Xpand's Dynamic Probability Distributions (PDs) on table columns are designed around providing live, up-to-the-minute statistics on commonly queried table columns. Dynamic PDs are automatically collected in real time by observing writes and persisting statistics to system data structures. The dynamic PD system is automatic, but comes with a performance overhead and is not very configurable.

MariaDB Xpand 6 adds support for Static Probability Distributions (PDs). Static PDs are used to collect statistics on-demand and persist them indefinitely, which means that all overhead for statistics collection occurs up front. If statistics collection is scheduled at an off-peak scheduled time, Static PDs can eliminate the overhead for statistics collection during Xpand's peak activity.

SQL Syntax

ANALYZE FULL TABLE `table_name` (`column_name` [, `column_name`] ...) [BASIC | EXTENDED]

BASIC

Collects cardinality, NDV, and hotlist statistics

EXTENDED

Collects GH quantiles, in addition to BASIC statistics

<default>

If no option is specified, BASIC is assumed.

NOTE: a non-full ANALYZE TABLE does not make sense in this context. Specifying a column list with a non-full ANALYZE TABLE will result in an error.

Enabling Static PDs

Once statistics have been collected, you can enable the query optimizer to use static PDs:

SET GLOBAL sierra_enable_static_pds = true;

Exporting Static PDs

In order to provide details to MariaDB Support, you may be asked to export Static PDs.

Exporting is done by simply running a SELECT query on the desired PD tables. This can be achieved using standard dump tools. Be sure to specify the --no-create-db and --no-create-info flags (or -n and -t), to interact with Xpand system tables:

mysqldump -u root -h host -nt system pd_groups pd_group_columns pds > exported_pds.sql

To only export PDs from a single table, you can add a --where constraint to the mysqldump. This will require multiple invocations of mysqldump, since mysqldump doesn't support join clauses in its --where option:

mysqldump -u root -h host -nt --where="relation=<oid>" system pd_groups > exported_pds.sql
mysqldump -u root -h host -nt --where="pd_group in (select pd_group from system.pd_groups where relation=<oid>)" system pd_group_columns >> exported_pds.sql
mysqldump -u root -h host -nt --where="pd_group in (select pd_group from system.pd_groups where relation=<oid>)" system pds >> exported_pds.sql

Caveats for Static PDs

Static PDs are not included in backups.