Cassandra storage engine
This page documents the Cassandra storage engine, which is a work in progress. The feature described here is under development: it has not been released (even in beta), and its interface and behavior may change.
Contents
If you prefer a hands-on approach, cassandra-storage-engine-use-example shows an example Cassandra storage engine session. This page is more exhaustive and up-to-date, though.
Data mapping
A table in MariaDB represents a column family in Cassandra. The table must follow this pattern:
```sql
create table cassandra_tbl     -- Table name can be chosen at will
(
  rowkey type PRIMARY KEY,     -- Represents the Column Family's row key. The primary
                               -- key must be defined over this column.
  column1 type,                -- Names of columns must match the names of CQL's
  column2 type,                -- columns; datatypes must also "match", see below
  ...                          -- for what this means
  dyn blob DYNAMIC_COLUMN_STORAGE=yes  -- Not mandatory: at most one blob column,
                                       -- with any name, to collect all columns of
                                       -- the record which are not mentioned in the
                                       -- MariaDB table definition
) engine=cassandra
  thrift_host='192.168.1.0'
  keyspace='cassandra_key_space'
  column_family='column_family_name';
```
The MariaDB table represents a view of Cassandra's Column Family. That is,
- Creating a table will not create a Column Family. The Column Family must exist in Cassandra before the MariaDB table can be created.
- Dropping the table will not drop the Column Family.
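Because the Column Family must already exist, a typical session creates it in Cassandra first and then maps a MariaDB table onto it. A minimal sketch (keyspace, table, and column names here are illustrative):

```sql
-- In Cassandra (cqlsh -3), create the column family first:
--   CREATE TABLE cf1 (rowkey varchar PRIMARY KEY,
--                     data1 varchar, data2 bigint);

-- Then, in MariaDB, create the table that maps onto it:
CREATE TABLE t1 (rowkey VARCHAR(36) PRIMARY KEY,
                 data1 VARCHAR(60),
                 data2 BIGINT)
ENGINE=cassandra THRIFT_HOST='localhost'
KEYSPACE='mariadbtest' COLUMN_FAMILY='cf1';
```

Dropping `t1` afterwards would remove only the MariaDB-side view; `cf1` would remain in Cassandra.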
Mapping for columns
- The row key. The Cassandra API does not consider it to be a column, but CQL and MariaDB do.
- If the Column Family defines a name for the rowkey (always true when using CQL), then the name of the primary key column in MariaDB must match it.
- If the Column Family doesn't define a name for the row key, MariaDB's column must be named "rowkey".
- Columns defined as part of Cassandra's "static column family" (this includes columns that can be defined with CQL). These columns should have their counterparts in MariaDB: a static column named 'foo' in Cassandra must have a counterpart named 'foo' in MariaDB. The types must also match; type mapping is covered below.
- "Ad-hoc" columns that can be encountered in individual rows, as well as static columns that are not part of the MariaDB table definition, are mapped into one blob column holding Dynamic Columns content. Such a column must be a blob and be marked with the boolean option DYNAMIC_COLUMN_STORAGE set to a true value (yes/on/1). If its name matches a Cassandra column, that Cassandra column will also be included in the blob. If no such column is present, columns that are not listed in the MariaDB table definition will not be read and cannot be changed.
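Ad-hoc columns collected in the blob can be written and read with MariaDB's Dynamic Columns functions. A minimal sketch (table and column names are illustrative):

```sql
-- Table with a catch-all blob for ad-hoc Cassandra columns:
CREATE TABLE t2 (rowkey VARCHAR(36) PRIMARY KEY,
                 dyn BLOB DYNAMIC_COLUMN_STORAGE=yes)
ENGINE=cassandra THRIFT_HOST='localhost'
KEYSPACE='mariadbtest' COLUMN_FAMILY='cf2';

-- Write an ad-hoc column through the blob:
INSERT INTO t2 VALUES ('row10', COLUMN_CREATE('col1', 'val1'));

-- Read it back by name:
SELECT rowkey, COLUMN_GET(dyn, 'col1' AS CHAR) FROM t2;
```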
- Counter columns are not yet fully supported (see the type mapping table below); full support is planned.
- Cassandra's SuperColumns are not supported, and we have no plans to support them.
Mapping for datatypes
There is no direct 1-to-1 mapping between Cassandra's data types and MySQL datatypes. Also, Cassandra's size limitations are often more relaxed than MySQL/MariaDB's; for example, Cassandra's limit on row key length is about 2GB, while MySQL's is about 1.5KB.
The mapping between Cassandra types and MySQL types is as follows:
| Cassandra | MariaDB | Already supported? |
|---|---|---|
| blob | VARBINARY(n) | Yes, to [VAR]CHAR or [VAR]BINARY* |
| ascii | VARCHAR(n), use charset=latin1 | Yes, to [VAR]CHAR or [VAR]BINARY* |
| text | VARCHAR(n), use charset=utf8 | Yes, to [VAR]CHAR or [VAR]BINARY* |
| varint | VARBINARY(n) | Yes |
| int | INT | Yes |
| bigint | BIGINT | Yes, also to TINY and SHORT |
| uuid | CHAR(36), text representation | Yes |
| timestamp | TIMESTAMP | Yes; you can map to TIMESTAMP with second or microsecond precision (Cassandra stores milliseconds) |
| boolean | BOOL | Yes |
| float | FLOAT | Yes |
| double | DOUBLE | Yes |
| decimal | ?? todo: figure out thrift representation?? | |
| counter | BIGINT | Reading only |
(*) - charsets/collations are not strictly enforced, yet.
SQL's NULL means that the corresponding column is not present in the Cassandra column family. There are two possible approaches to handling this; see cassandra-storage-engine-issues.
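The table above can be read as a recipe for writing a MariaDB definition against an existing CQL column family. A hypothetical example, with all names illustrative:

```sql
-- Matching CQL definition (for reference):
--   CREATE TABLE cf3 (rowkey text PRIMARY KEY, b boolean,
--                     n bigint, t timestamp, u uuid);

CREATE TABLE t3 (rowkey VARCHAR(36) PRIMARY KEY,  -- text -> VARCHAR, charset=utf8
                 b BOOL,                          -- boolean -> BOOL
                 n BIGINT,                        -- bigint -> BIGINT
                 t TIMESTAMP,                     -- or TIMESTAMP(6) for microseconds
                 u CHAR(36))                      -- uuid -> its text representation
ENGINE=cassandra THRIFT_HOST='localhost'
KEYSPACE='mariadbtest' COLUMN_FAMILY='cf3';
```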
(TODO: what about the scenario where we already have an SQL DDL and just want to use Cassandra for storage? Perhaps we could allow arbitrary SQL types, which would be stored as strings (or even as binary) in Cassandra's blob columns?)
Command mapping
INSERT
Cassandra doesn't provide any practical way to make INSERT different from UPDATE. Therefore, INSERT works as INSERT-or-UPDATE, it will overwrite the data, if necessary.
INSERT ... SELECT and multi-row INSERT statements will try to write data in batches. The batch size is controlled by the @@cassandra_insert_batch_size system variable, which specifies the maximum batch size in records.
The status variables Cassandra_row_inserts and Cassandra_row_insert_batches allow one to see whether inserts are actually batched.
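A sketch of how one might observe the batching (table name is illustrative, and the counter values depend on the session):

```sql
-- Raise the batch size, do a multi-row INSERT, then inspect the counters:
SET cassandra_insert_batch_size=100;

INSERT INTO t1 VALUES ('row1', 'a', 1), ('row2', 'b', 2), ('row3', 'c', 3);

-- Compare row inserts vs. batches sent to Cassandra:
SHOW STATUS LIKE 'Cassandra_row_insert%';
```

If batching works, Cassandra_row_insert_batches should grow much more slowly than Cassandra_row_inserts.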
UPDATE
Currently, all UPDATE commands will fail (you can use INSERT instead) (TODO: make UPDATE work)
DELETE
- DELETE FROM cassandra_table maps to a truncate(column_family) call.
- A DELETE with a WHERE clause will do regular row deletions and, just like the del command in cassandra-cli, may leave a row in a weird state where the row is present but has no columns, which shows up in SQL as all columns having NULL values. (TODO: does this live until the nearest compaction?) (TODO: CQL SELECTs simply ignore rows that do not have all needed columns. Maybe we should do so, too?)
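The two DELETE forms side by side (table name is illustrative):

```sql
-- Whole-table delete: maps to truncate(column_family) on the Cassandra side.
DELETE FROM t1;

-- Row delete: a regular per-row deletion; the row may linger with no
-- columns, which SQL shows as all-NULL values.
DELETE FROM t1 WHERE rowkey='row1';
```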
SELECTs
Generally, all SELECT statements work like one expects SQL to work. Conditions in the form primary_key=... allow the server to construct query plans that access Cassandra's rows with key lookups.
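For example (table name is illustrative), an equality condition on the primary key column lets the optimizer use a Cassandra key lookup rather than a full scan, which EXPLAIN can confirm:

```sql
SELECT * FROM t1 WHERE rowkey='row1';

-- Inspect the chosen access method:
EXPLAIN SELECT * FROM t1 WHERE rowkey='row1';
```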
Full table scan
Full table scans are performed in a memory-efficient way. Cassandra SE performs a full table scan as a series of batches, each of which reads no more than @@cassandra_rnd_batch_size records.
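A sketch of tuning the scan batch size (table name is illustrative):

```sql
-- Limit each scan batch to 1000 records:
SET cassandra_rnd_batch_size=1000;

-- The full scan now proceeds in batches of at most 1000 rows:
SELECT COUNT(*) FROM t1;
```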
Batched Key Access support
Cassandra SE supports Batched Key Access in no-association mode. This means it requires the SQL layer to do the hashing, so the following settings are required:
- optimizer_switch='join_cache_hashed=on'
- join_cache_level=7|8
Cassandra SE is currently unable to make use of space in the join buffer (the one whose size is controlled by @@join_buffer_size). Instead, it will limit read batches to reading no more than @@cassandra_multiget_batch_size keys at a time, and memory will be allocated on the heap.
Note that the @@join_buffer_size buffer is still needed by the SQL layer, so its value should still be increased if you want to read in big batches.
It is possible to track the number of read batches, how many keys were looked-up, and how many results were produced with these status variables:
| Variable_name | Value |
|---|---|
| Cassandra_multiget_reads | 0 |
| Cassandra_multiget_keys_scanned | 0 |
| Cassandra_multiget_rows_read | 0 |
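Putting the settings together, a hypothetical session enabling BKA for a join against a Cassandra table (local_tbl and t1 are illustrative names):

```sql
-- Enable hashed join cache and a join cache level that allows BKA:
SET optimizer_switch='join_cache_hashed=on';
SET join_cache_level=7;

-- Limit each multiget batch to 100 keys:
SET cassandra_multiget_batch_size=100;

-- A join that drives batched key lookups into the Cassandra table:
SELECT t.* FROM local_tbl l JOIN t1 t ON t.rowkey = l.key_col;

-- Verify that batched reads actually happened:
SHOW STATUS LIKE 'Cassandra_multiget%';
```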
How to try it out
The code is at lp:maria-captains/maria/5.5-cassandra
The build process is not fully streamlined yet. It is known to work on Fedora 15 and OpenSUSE, known not to work on Ubuntu (some weird linking error).
You'll need to
- Install Cassandra (we tried 1.1.2 and 1.1.4)
- Install the Thrift library (we used 0.8.0), only the C++ backend is needed.
- We have installed it by compiling the source tarball downloaded from thrift.apache.org
- Edit storage/cassandra/CMakeLists.txt and modify the INCLUDE_DIRECTORIES directive to point to Thrift's include directory.
- Set export LIBS="-lthrift" (on another machine it was "-lthrift -ldl") and export LDFLAGS=-L/path/to/thrift/libs.
- Build the server
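The steps above might look roughly like this as a shell session; paths are placeholders, and the exact build invocation may differ on your system:

```bash
# Fetch the branch (a bzr/Launchpad branch):
bzr branch lp:maria-captains/maria/5.5-cassandra
cd 5.5-cassandra

# Edit storage/cassandra/CMakeLists.txt so that INCLUDE_DIRECTORIES
# points at /path/to/thrift/include, then:
export LIBS="-lthrift"               # some systems need "-lthrift -ldl"
export LDFLAGS=-L/path/to/thrift/libs

cmake . && make
```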
Testing can be done as follows:
- Start Cassandra on localhost
- Look through mysql-test/t/cassandra.test to see the required CQL initialization commands, and run them in cqlsh -3 and cassandra-cli.
- Run ./mysql-test-run t/cassandra.test
See also
- MDEV-431 - JIRA task for Cassandra SE work
- cassandra-storage-engine-future-plans
- cassandra-storage-engine-use-example
- cassandra-storage-engine-issues
- hbase-storage-engine