Cassandra storage engine

This page documents the Cassandra storage engine, which is a work in progress.

This page describes a feature that is under development. The feature has not been released, even in beta, and its interface and behavior may change.

Cassandra table

MariaDB's table represents a column family in Cassandra. The table must follow this pattern:

create table cassandra_tbl               -- name can be chosen at will
(
  rowkey  char(N),                       -- First field must be named 'rowkey'. This
                                         --   is what Cassandra's row key is mapped to. Type must be the same as for

  column1    varchar(N),                 -- columns go here
  ... 
  PRIMARY KEY(rowkey)                    -- Primary key over 'rowkey' is mandatory
) engine=cassandra 
  thrift_host='192.168.1.0' 
  keyspace='cassandra_key_space'
  column_family='column_family_name';

The MariaDB table is a view of the Cassandra column family:

  • The column family must exist before the MariaDB table can be created.
  • Dropping the MariaDB table does not drop the column family.
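As a concrete instance of the pattern, here is a minimal sketch. The table name `users_cf`, the column `email`, and the chosen lengths are hypothetical; the connection options simply reuse the placeholder values from the pattern. Because the engine is still under development, the exact syntax may differ.

```sql
-- Hypothetical table and column names; the column family must already exist in Cassandra.
create table users_cf
(
  rowkey  char(36),
  email   varchar(64),
  PRIMARY KEY(rowkey)
) engine=cassandra
  thrift_host='192.168.1.0'
  keyspace='cassandra_key_space'
  column_family='column_family_name';

-- Removes only the MariaDB-side definition; the column family stays in Cassandra.
drop table users_cf;
```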

Mapping Cassandra columns to SQL

Cassandra has three kinds of columns:

  1. The row key. It is not considered to be a column by Cassandra, but MySQL/MariaDB can only store data in columns, so we consider it a special kind of column.
  2. Columns that are present in the static column family declaration. These columns have a defined name and data type.
  3. "Ad-hoc" columns that can be encountered in individual rows.

Note: Cassandra's supercolumns are not handled at the moment. If we need to handle them, we will map them to a dynamic-column blob. Cassandra's counter columns will be handled and will be mapped to regular SQL columns.

When these are mapped to SQL:

  1. The row key is mapped to a regular column named rowkey.
  2. Static column family members should be mapped to regular SQL columns (e.g., a column named 'foo1' in Cassandra maps to a column named 'foo1' in SQL).
  3. Ad-hoc columns are all put into one regular SQL column that holds a Dynamic Columns blob. This allows us to return arbitrary sets of columns within one row.
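To show how a single blob can carry a different set of columns in each row, here is a small sketch that uses MariaDB's Dynamic Columns functions on an ordinary table. It deliberately does not use engine=cassandra, since this page does not specify how the blob column is declared for that engine. The table name `dyn_demo` and the column name `dyn` are hypothetical, and the sketch assumes a MariaDB version in which dynamic columns can be addressed by name (10.0.1 or later).

```sql
-- Plain table used only to demonstrate Dynamic Columns; it does not use engine=cassandra.
create table dyn_demo
(
  rowkey  char(16),
  name    varchar(32),      -- stands in for a static column
  dyn     blob,             -- hypothetical name for the Dynamic Columns blob
  PRIMARY KEY(rowkey)
);

-- Each row can carry a different set of ad-hoc columns inside the same blob.
insert into dyn_demo values
  ('k1', 'alice', COLUMN_CREATE('city', 'Paris', 'age', 30)),
  ('k2', 'bob',   COLUMN_CREATE('phone', '555-0100'));

-- List the ad-hoc column names per row, and read one of them by name.
select rowkey,
       name,
       COLUMN_LIST(dyn)                 as adhoc_columns,
       COLUMN_GET(dyn, 'city' as char)  as city
from dyn_demo;
```

COLUMN_GET returns NULL for a row whose blob has no column of the requested name, so rows with different ad-hoc columns can be queried together.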

Column datatypes

For Cassandra columns that map to SQL's table columns, there is a question of which datatype should be used.

Cassandra's limits are greater than MySQL's: for example, row keys can be longer than MySQL's limit on maximum key size.

{ Hence, users must take responsibility for picking suitable datatypes themselves }

{ TODO: equality (and ordering?) relationship for SQL's `rowkey` must be the same as for Cassandra's primary key }

{ TODO: put here a suggested mapping for Cassandra's datatypes }

TODO

More details here.
