Storage Engine Choice: Aria

MariaDB Platform includes a variety of storage engines as pluggable components of MariaDB Enterprise Server. This allows you to choose the storage engine that best suits the workload of a particular database or table.

The Aria storage engine was developed as a crash-safe replacement for MyISAM, and has been in active development since 2007. When MariaDB Server restarts after a crash, Aria recovers all tables to the state as of the start of a statement or at the start of the last LOCK TABLES statement.

MariaDB Storage Engines

Current storage engine recommendations based on workload:

  • Read-heavy workloads: Aria
  • General purpose: InnoDB
  • ACID: InnoDB
  • Write-heavy workloads: MyRocks
  • Compression: MyRocks
  • Sharded: Spider
  • Analytical workloads: MariaDB ColumnStore

Why Use Aria for MariaDB

While InnoDB is the default storage engine for databases created with MariaDB, Aria is used for most of the internal system tables from MariaDB Enterprise Server 10.4. Aria has a small footprint and allows for easy copying between systems and is particularly suited for read-heavy workloads. Aria has an advanced page-based storage format that allows for superior caching performance and is optimized for concurrency.

In real-world environments, Aria produces better results for aggregated constructs (such as GROUP BY or ORDER BY) that are prevalent in analytics and transaction processing. Analytics and transactional processing find simple things like totals, maximums, and statistics. Most applications heavily rely on queries that use these aggregate functions.
MariaDB uses the Aria storage engine for all internal on-disk temporary tables. Since these tables are often created internally to execute GROUP BY and DISTINCT queries, these types of queries can benefit from Aria’s performance, even if the queried table uses a different engine.

The Aria storage engine is compiled-in by default in MariaDB. Aria is included with MariaDB Server (Enterprise and Community) with no additional installation or setup.

Aria Considerations

How best to use Aria will depend on the use case, table design, access patterns (SQL queries), performance targets, and recovery goals.

Advantages of AriaDisadvantages of Aria
  • Simultaneous inserting into the same table from multiple sources
  • Inefficient bulk loading
  • Cache by page ability
  • Table level locking
  • Efficiently backed up by MariaDB Enterprise Backup

  • Lacks native foreign key support
  • Enhanced PAGE row format: crash-safe by default; operations; provides notable speed improvement for data caching, especially on Windows
  • Lacks support INSERT DELAYED.
  • Crash-safe tables and indexes
  • Caching limitations: multiple key caches and depends on OS disk cache
  • Crash recovery to the start of a last statement or lock
  • PAGE format overhead: minimum file size 16K; storage of small rows; same page size for index and data.
  • Optimized LOAD INDEX
  • Does not support transactions (i.e. ROLLBACK and COMMIT)
  • Compatible with MyISAM ROW and PAGE formats
  • Low overhead
  • Consider using Aria:

    • If your application does a lot of FULL-TEXT searches.
    • If your application does a lot of GROUP BY queries.
    • If your application does not need ACID compliance or foreign keys.
    • If your application currently uses MyISAM and requires a more advanced engine to allow better speed, backups, and automated crash recovery.

    Performance Comparisons

    It’s one thing to compare isolated queries and quite another to look at real-world examples. We did some tests comparing Aria, InnoDB, and MyISAM.

    Performance Results
    InnoDBAriaMyISAM
    SQL 12.3890.5800.634
    SQL 22.1690.5300.598
    Simple Benchmark: Aria vs MyISAM vs InnoDB

    In this simple benchmark, a table was created and populated with 2 million rows. GROUP BY and ORDER BY queries were done and copied into all three alternative storage engines: InnoDB, Aria, and MyISAM. Each test ran 10 times, with the average time (seconds) recorded for all tests.

    Setup and commands used for testing:

    INSERT INTO fact VALUES (1,1,'Justin',md5(''), .1);
    INSERT INTO fact SELECT FLOOR(1+ rand()*9999), FLOOR(1 + rand()*499),
    (select name from names where id = 1 + rand() * 4),
    MD5(1+rand()*9999), rand()
    FROM fact;

    Keep running the above insert until about 2 million rows are in the table.

    MariaDB [test2]> SELECT count(*) from fact;
    +----------+
    | count(*) |
    +----------+
    | 2097152 |
    +----------+
    MariaDB [test2]> SHOW CREATE TABLE fact;
    +-------+-------------------------------
    | Table | Create Table
    +-------+-------------------------------
    | fact | CREATE TABLE `fact` (
    `dim1` int(11) DEFAULT NULL, `dim2` int(11) DEFAULT NULL,
     `name` varchar(20) DEFAULT NULL, `hash` varchar(32) DEFAULT NULL,
     `measure1` double DEFAULT NULL)
    ENGINE=InnoDB DEFAULT CHARSET=utf8 |

    Confirm that a good ratio of distinct vs total rows are present:

    MariaDB [test2]> SELECT count(distinct (dim1)) from fact;
    +------------------------+
    | count(distinct (dim1)) |
    +------------------------+
    | 9999              	 |
    +------------------------+
    MariaDB [test2]> SELECT count(distinct (dim2)) from fact;
    +------------------------+
    | count(distinct (dim2)) |
    +------------------------+
    | 499                    |
    +------------------------+
    MariaDB [test2]> SELECT count(distinct (name)) from fact;
    +------------------------+
    | count(distinct (name)) |
    +------------------------+
    | 1                      |
    +------------------------+
    

    Aria

    CREATE TABLE `test2`.`fact2` (
     `dim1` INT(11) NULL DEFAULT NULL,
     `dim2` INT(11) NULL DEFAULT NULL,
     `name` VARCHAR(20) NULL DEFAULT NULL,
     `hash` VARCHAR(32) NULL DEFAULT NULL,
     `measure1` DOUBLE NULL DEFAULT NULL )
    ENGINE=ARIA TRANSACTIONAL=1;
    INSERT INTO `test2`.`fact2` (`dim1`, `dim2`, `name`, `hash`, `measure1`) 
    SELECT `dim1`, `dim2`, `name`, `hash`, `measure1` FROM `fact`;
    

    MyISAM

    CREATE TABLE `test2`.`fact3` (
     `dim1` INT(11) NULL DEFAULT NULL,
     `dim2` INT(11) NULL DEFAULT NULL,
     `name` VARCHAR(20) NULL DEFAULT NULL,
     `hash` VARCHAR(32) NULL DEFAULT NULL,
     `measure1` DOUBLE NULL DEFAULT NULL )
    COLLATE='latin1_swedish_ci' ENGINE=MyISAM;
    INSERT INTO `test2`.`fact3` (`dim1`, `dim2`, `name`, `hash`, `measure1`) 
    SELECT `dim1`, `dim2`, `name`, `hash`, `measure1` FROM `fact`;
    

    Test 2 different sqls across InnoDB, Aria, and MyISAM:

    — sql 1:

    SELECT dim1, dim2 from fact group by dim1 order by dim1; 
    -- 9999 rows in set

    — sql 2:

    SELECT dim1, dim2 from fact group by dim2 order by dim2; 
    -- 499 rows in set

    Benchmark Recap

    Testing shows Aria is much faster at the GROUP BY level than either InnoDB or MyISAM. Aria is slightly faster than MyISAM and ensures a crash-safe environment. The performance tests revealed Aria is four times faster than InnoDB. Because of the differences in use cases, each one should undergo testing in the MariaDB Enterprise Server using both Aria and InnoDB (or another, depending on need).

    Summary

    Depending on your workload, Aria could provide performance advantages. Because it is included with MariaDB Server and can be applied per database or per table (ENGINE=ARIA), you can test performance for your specific workload with no additional cost and very little additional effort–just plug it in and give it a try.

    For more on storage engines and storage architecture, visit workload-optimized storage engines.