The Performance Module performs I/O operations in support of query and write processing. It receives its instructions from a User Module. A Performance Module doesn’t see the query itself, only the set of instructions given to it by a User Module. The Performance Module delivers three behaviors that are key to scaling out the database: distributed scans, distributed hash joins, and distributed aggregation. Together, these three behaviors enable true massively parallel processing (MPP) for query-intensive environments.
- PrimProc handles query execution. It receives the instructions sent by the User Module and executes them as block-oriented I/O operations that perform predicate filtering, join processing, and initial aggregation of data. PrimProc then sends the resulting data back to the User Module.
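The division of labor described above (filtering and initial aggregation on each Performance Module, final merging on the User Module) can be illustrated with a minimal sketch. This is not ColumnStore code: the function names `pm_scan_and_aggregate` and `um_merge`, the row layout, and the sample data are all hypothetical and chosen only to show the pattern.

```python
from collections import defaultdict

def pm_scan_and_aggregate(rows, predicate, key, value):
    """Hypothetical work of one Performance Module: filter its local rows,
    then pre-aggregate them into per-key [sum, count] pairs."""
    partial = defaultdict(lambda: [0, 0])
    for row in rows:
        if predicate(row):            # predicate filtering
            acc = partial[row[key]]
            acc[0] += row[value]      # running sum
            acc[1] += 1               # running count
    return dict(partial)

def um_merge(partials):
    """Hypothetical work of the User Module: combine the partial results
    returned by every Performance Module into (sum, count, average)."""
    merged = defaultdict(lambda: [0, 0])
    for partial in partials:
        for k, (s, c) in partial.items():
            merged[k][0] += s
            merged[k][1] += c
    return {k: (s, c, s / c) for k, (s, c) in merged.items()}

# Two illustrative nodes, each holding its own slice of the data.
pm1_rows = [{"region": "east", "amount": 10}, {"region": "west", "amount": 5}]
pm2_rows = [{"region": "east", "amount": 30}, {"region": "west", "amount": 0}]

def is_positive(row):
    return row["amount"] > 0

partials = [
    pm_scan_and_aggregate(rows, is_positive, "region", "amount")
    for rows in (pm1_rows, pm2_rows)
]
result = um_merge(partials)
print(result)
```

Because each node returns only sums and counts rather than raw rows, the amount of data sent back to the coordinating module stays small regardless of how many rows each node scanned.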
Shared Nothing Data Cache
All Performance Modules use a shared-nothing data cache. When data is first accessed, a Performance Module acts on the data that it has been instructed to by a User Module and caches it in an LRU-based cache for subsequent access. On dedicated servers running a Performance Module, the majority of the box’s RAM can be dedicated to the Performance Module’s data cache. Because the Performance Module cache is a shared-nothing design:
- There is no data block pinging between participating Performance Module nodes, as sometimes occurs in other multi-instance/shared-disk database systems.
- As more Performance Module nodes are added to a system, the overall cache size for the database grows with them.
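As a rough illustration of an LRU-based, shared-nothing block cache, the sketch below gives each node its own independent cache object. The class `BlockCache`, the `fake_disk_read` helper, and the capacity of two blocks are assumptions made for the example; they do not reflect ColumnStore's actual cache implementation or sizing.

```python
from collections import OrderedDict

class BlockCache:
    """Hypothetical per-node LRU cache keyed by block id."""

    def __init__(self, capacity, read_block):
        self.capacity = capacity
        self.read_block = read_block   # called on a cache miss
        self.blocks = OrderedDict()    # least recently used entry first
        self.hits = 0
        self.misses = 0

    def get(self, block_id):
        if block_id in self.blocks:
            self.blocks.move_to_end(block_id)   # mark as most recently used
            self.hits += 1
            return self.blocks[block_id]
        self.misses += 1
        data = self.read_block(block_id)        # first access: read from storage
        self.blocks[block_id] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)     # evict the least recently used block
        return data

def fake_disk_read(block_id):
    # Stand-in for a real block read from persistent storage.
    return f"data-{block_id}"

# Each node owns its cache; nothing is shared or synchronized between them.
pm1_cache = BlockCache(capacity=2, read_block=fake_disk_read)
pm2_cache = BlockCache(capacity=2, read_block=fake_disk_read)

for block_id in (1, 2, 1, 3):
    pm1_cache.get(block_id)

print(list(pm1_cache.blocks), pm1_cache.hits, pm1_cache.misses)
print(len(pm2_cache.blocks))
```

In this sketch, activity on `pm1_cache` never touches `pm2_cache`, and adding another node simply adds another cache of the same capacity, which mirrors the two properties listed above.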
Load and Write Processing
A Performance Module node performs loads and writes to the underlying persistent storage. Two processes handle write operations on a Performance Module:
- WriteEngineServer: WriteEngineServer is responsible for coordinating DML, DDL, and imports on each Performance Module. DDL changes are persisted within the MariaDB ColumnStore System Catalog, which keeps track of all ColumnStore metadata.
- cpimport: cpimport performs the database file updates when bulk data is loaded. It is aware of which module it is running on and, when running on a Performance Module, handles the actual updates of the database disk files. In this manner, MariaDB ColumnStore supports fully parallel load capabilities.