Cassandra storage engine
This page documents the Cassandra storage engine, which is a work in progress.
This page describes a feature that is under development. The feature has not been released (even in beta), and its interface and functionality may change.
Cassandra table
MariaDB's table represents a column family in Cassandra. The table must follow this pattern:
create table cassandra_tbl      -- name can be chosen at will
(
  rowkey  char(N),              -- First field must be named 'rowkey'. This
                                -- is what Cassandra's row key is mapped to.
                                -- type must be the same as for
  column1 varchar(N),           -- columns go here
  ...
  PRIMARY KEY(rowkey)           -- Primary key over 'rowkey' is mandatory
) engine=cassandra
  thrift_host='192.168.1.0'
  keyspace='cassandra_key_space'
  column_family='column_family_name';
MariaDB's table is a view of Cassandra's column family.
- The column family must exist before the MariaDB table can be created.
- Dropping MariaDB's table will not drop the column family.
Mapping Cassandra columns to SQL
Cassandra has three kinds of columns:
- The row key. It is not considered to be a column by Cassandra, but MySQL/MariaDB do not have separate 'keys', so we make it a column.
- Columns that were present in static column family declaration. These columns have a defined name/data type.
- "Ad-hoc" columns that can be encountered in individual rows.
In SQL, we have:
- The row key is mapped into a regular column named rowkey.
- Static column family members should be mapped to regular SQL columns (e.g., a column named 'foo' in Cassandra will map to a column named 'foo' in SQL).
- Ad-hoc columns are all put into a regular SQL column which has a Dynamic Columns blob. This allows us to return arbitrary sets of columns within one row.
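As a rough sketch of the mapping above, a table over a hypothetical column family 'users' with one static column 'name' might look like the following. The names cassandra_users, users, name, dyn and email are made up for illustration. The dynamic_column_storage=yes column attribute is an assumption: this page does not define how the ad-hoc column is declared, and the interface may change. COLUMN_LIST and COLUMN_GET are MariaDB's existing Dynamic Columns functions; addressing a dynamic column by name assumes a server version that supports named dynamic columns.

```sql
-- Hypothetical example: column family 'users' has one static column, 'name'.
create table cassandra_users
(
  rowkey varchar(36),                     -- Cassandra's row key
  name   varchar(64),                     -- static column, same name as in Cassandra
  dyn    blob dynamic_column_storage=yes, -- assumed syntax: holds all ad-hoc columns
  PRIMARY KEY(rowkey)                     -- mandatory primary key over 'rowkey'
) engine=cassandra
  thrift_host='192.168.1.0'
  keyspace='cassandra_key_space'
  column_family='users';

-- List the names of the ad-hoc columns present in each row.
select rowkey, name, column_list(dyn) from cassandra_users;

-- Read one ad-hoc column (a hypothetical 'email') as a string.
-- Rows that do not have this column return NULL.
select rowkey, column_get(dyn, 'email' as char) as email from cassandra_users;
```

Because every ad-hoc column lives in the single blob, two rows can carry entirely different sets of ad-hoc columns without any change to the table definition.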
Column datatypes
For Cassandra columns that map to SQL's table columns, there is a question of which datatype should be used.
Cassandra's limits are greater than MySQL's: for example, row keys can be longer than MySQL's limit on maximum key size.
{ Hence, the user is responsible for picking suitable datatypes. }
{ TODO: equality (and ordering?) relationship for SQL's `rowkey` must be the same as for Cassandra's primary key }
{TODO: put here a suggested mapping for Cassandra's datatypes}
TODO
More details here.