Building Cassandra Storage Engine for Packaging

Legacy Cassandra storage engine description. Cassandra was removed from MariaDB in MariaDB 10.6.

These instructions describe exactly how we build the Cassandra SE packages.

Getting into build environment

See the How_to_access_buildbot_VMs page on the internal wiki. The build VM to use is ezvm precise-amd64-build.

Get into the VM and continue to next section.

Set up Thrift

Get the bzr checkout

  • Create another SSH connection to terrier and run the script suggested by the motd.

  • Press (C-a C-c) to create another window

  • Copy the base bazaar repository into the VM:

Then, get back to the window with VM, and run in VM:

Compile

This should end with:

Free up some disk space:

Patch the tarball to include Thrift

Verify that mysqld was built with Cassandra SE:

This should point to libthrift-0.8.0.so.

Copy the data out of VM

In the second window (the one that's on terrier, but not in VM), run:

References

This page is licensed: CC BY-SA / Gnu FDL

ezvm  precise-amd64-build
2578
1907

Cassandra Status Variables

Legacy Cassandra storage engine description. Cassandra was removed from MariaDB in MariaDB 10.6.

This page documents status variables related to the Cassandra storage engine. See Server Status Variables for a complete list of status variables that can be viewed with SHOW STATUS.

Cassandra_multiget_keys_scanned

  • Description: Number of keys we've made lookups for.

  • Scope: Global, Session

  • Data Type: numeric

Cassandra_multiget_reads

  • Description: Number of read operations.

  • Scope: Global, Session

  • Data Type: numeric

Cassandra_multiget_rows_read

  • Description: Number of rows actually read.

  • Scope: Global, Session

  • Data Type: numeric

Cassandra_network_exceptions

  • Description: Number of network exceptions.

  • Scope: Global, Session

  • Data Type: numeric

  • Introduced:

Cassandra_row_insert_batches

  • Description: Number of insert batches performed.

  • Scope: Global, Session

  • Data Type: numeric

Cassandra_row_inserts

  • Description: Number of rows inserted.

  • Scope: Global, Session

  • Data Type: numeric

Cassandra_timeout_exceptions

  • Description: Number of Timeout exceptions we got from Cassandra.

  • Scope: Global, Session

  • Data Type: numeric

Cassandra_unavailable_exceptions

  • Description: Number of Unavailable exceptions we got from Cassandra.

  • Scope: Global, Session

  • Data Type: numeric

This page is licensed: CC BY-SA / Gnu FDL

Cassandra Storage Engine Issues

Legacy Cassandra storage engine description. Cassandra was removed from MariaDB in MariaDB 10.6.

This page lists difficulties and peculiarities of the Cassandra Storage Engine. I'm not putting them into the bug tracker because it is not clear whether these properties should be considered bugs.

No way to get E(#rows in column family)

There seems to be no way to get even a rough estimate of how many different keys are present in a column family. I'm using an arbitrary value of 1000 for now, which has these consequences:

  • EXPLAIN will always show rows=1000 for full table scans. In the future, this may cause poor query plans.

  • DELETE FROM table always prints "1000 rows affected", regardless of how many records were actually in the table.

We could use the new engine-independent table statistics feature to get some data statistics.

This page is licensed: CC BY-SA / Gnu FDL

Handling Joins With Cassandra

Legacy Cassandra storage engine description. Cassandra was removed from MariaDB in MariaDB 10.6.

Joins with data stored in a Cassandra database are only possible on the MariaDB side. That is, if we want to compute a join between two tables, we will:

  1. Read the relevant data for the first table.

  2. Based on the data read in step 1, read the matching records from the second table.

Either of the tables can be an InnoDB table or a Cassandra table. If the second table is a Cassandra table, the Cassandra Storage Engine can read the matching records efficiently.
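
The two steps above can be illustrated with a minimal Python sketch. It is not the engine's code: the function name `join_with_lookups`, the sample rows, and the use of a plain dict as a stand-in for a Cassandra column family are all assumptions made for illustration.

```python
from typing import Dict, Iterable, List, Tuple


def join_with_lookups(
    outer_rows: Iterable[dict],
    lookup_table: Dict[str, dict],
    join_key: str,
) -> List[Tuple[dict, dict]]:
    """Join outer_rows against lookup_table, one key lookup per outer row."""
    joined = []
    for row in outer_rows:  # step 1: read rows from the first table
        match = lookup_table.get(row[join_key])  # step 2: key lookup in the second table
        if match is not None:
            joined.append((row, match))
    return joined


# Hypothetical data: `accounts` stands in for an InnoDB table,
# `hit_counters` for a Cassandra column family keyed by user_id.
accounts = [
    {"user_id": "u1", "username": "joe"},
    {"user_id": "u2", "username": "ann"},
]
hit_counters = {"u1": {"hits": 42}}

result = join_with_lookups(accounts, hit_counters, "user_id")
```

Only rows whose key exists in the second table appear in `result`, which mirrors an inner join driven by the first table.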

Some general info

All this is targeted at running joins which touch a small fraction of the tables. The expected typical use case looks like this:

  • The primary data is stored in MariaDB (i.e. in InnoDB).

  • There is also some extra data stored in Cassandra (e.g. hit counters).

  • The user accesses data in MariaDB. Think of a website and a query like:

Cassandra SE makes it possible to grab some Cassandra data as well. One can write things like this:

which is much easier than using the Thrift API.

If the user wants to run huge joins that touch a big fraction of a table's data, for example:

"What are top 10 countries that my website had visitors from in the last month"?

or

"Go through last month's orders and give me top 10 selling items"

then the Cassandra Storage Engine is not a good answer. Cassandra users answer queries like this in one of two ways:

  1. They design their schema in Cassandra in a way that allows getting this data in one small select. No kidding. This is what Cassandra is targeted at; they explicitly recommend that Cassandra schema design start with the queries.

  2. If the query doesn't match Cassandra's schema, they need to run Hive (or Pig), which have some kind of distributed join support. Hive and Pig compile queries into MapReduce jobs which are run across the whole cluster, so they will certainly beat the Cassandra Storage Engine, which runs on one mysqld node (you can have multiple mysqld nodes, of course, but they will not cooperate with one another).

It is possible to run Hive/Pig on Cassandra.

This page is licensed: CC BY-SA / Gnu FDL

Building Cassandra Storage Engine

Legacy Cassandra storage engine description. Cassandra was removed from MariaDB in MariaDB 10.6.

This page describes how to build the Cassandra Storage Engine.

Getting the source code

The code is in a Bazaar branch at lp:~maria-captains/maria/5.5-cassandra.

Alternatively, you can download a tarball from

Cassandra Storage Engine Use Example

Legacy Cassandra storage engine description. Cassandra was removed from MariaDB in MariaDB 10.6.

This page is a short demo of what using the Cassandra Storage Engine looks like.

First, a keyspace and column family must be created in Cassandra:

Now, let's try to connect an SQL table to it:

We've used a wrong datatype. Let's try again:

Ok. Let's insert some data:

Let's select it back:

Now, let's check if it can be seen in Cassandra:

Or, in cassandra-cli:

This page is licensed: CC BY-SA / Gnu FDL

Virtual Machine to Test the Cassandra Storage Engine

Legacy Cassandra storage engine description. Cassandra was removed from MariaDB in MariaDB 10.6.

Julien Duponchelle has made a virtual machine available for testing the Cassandra storage engine. Find out more at Cassandra MariaDB VirtualBox.

The virtual machine is based on:

  • Ubuntu 12.04

  • DataStax Cassandra

  • + Cassandra

mkdir build
cd build
wget https://dist.apache.org/repos/dist/release/thrift/0.8.0/thrift-0.8.0.tar.gz

sudo apt-get install bzr
sudo apt-get install flex

tar zxvf thrift-0.8.0.tar.gz 
cd thrift-0.8.0/

./configure --prefix=/home/buildbot/build/thrift-inst --without-qt4 --without-c_glib --without-csharp --without-java --without-erlang --without-python --without-perl --without-php --without-php_extension --without-ruby --without-haskell --without-go --without-d
make
make install

# free some space
make clean
cd ..
scp /home/psergey/5.5-cassandra-base.tgz runvm:
tar zxvf ../5.5-cassandra-base.tgz
rm -rf ../5.5-cassandra-base.tgz
cd 5.5-cassandra/
bzr pull lp:~maria-captains/maria/5.5-cassandra
export LIBS="-lthrift"
export LDFLAGS=-L/home/buildbot/build/thrift-inst/lib

mkdir mkdist
cd mkdist
cmake ..
make dist
basename mariadb-*.tar.gz .tar.gz > ../distdirname.txt

cp mariadb-5.5.25.tar.gz ../
cd ..
tar zxf "mariadb-5.5.25.tar.gz"
mv "mariadb-5.5.25" build
cd build
mkdir mkbin
cd mkbin
cmake -DBUILD_CONFIG=mysql_release ..
make -j4 package
CPack: - package: /home/buildbot/build/5.5-cassandra/build/mkbin/mariadb-5.5.25-linux-x86_64.tar.gz generated.
rm -fr ../../mkdist/
mv mariadb-5.5.25-linux-x86_64.tar.gz ../..
cd ../..
rm -rf build
mkdir fix-package
cd fix-package
tar zxvf ../mariadb-5.5.25-linux-x86_64.tar.gz
ldd mariadb-5.5.25-linux-x86_64/bin/mysqld
cp /home/buildbot/build/thrift-inst/lib/libthrift* mariadb-5.5.25-linux-x86_64/lib/
tar czf mariadb-5.5.25-linux-x86_64.tar.gz mariadb-5.5.25-linux-x86_64/
cp mariadb-5.5.25-linux-x86_64.tar.gz ..
mkdir build-cassandra
cd build-cassandra
scp runvm:/home/buildbot/build/5.5-cassandra/mariadb-5.5.25.tar.gz .
scp runvm:/home/buildbot/build/5.5-cassandra/mariadb-5.5.25-linux-x86_64.tar.gz .
Building

The build process is not fully streamlined yet. It is

  • known to work on Fedora 15 and OpenSUSE

  • known not to work on Ubuntu Oneiric Ocelot (see MDEV-501).

  • known to work on Ubuntu Precise Pangolin

The build process is as follows:

  • Install Cassandra (we tried 1.1.3 ... 1.1.5; 1.2 beta versions should work but haven't been tested).

  • Install the Thrift library (we used 0.8.0 and 0.9.0-trunk). Only the C++ backend is needed.

    • We installed it by compiling the source tarball downloaded from thrift.apache.org.

  • Edit storage/cassandra/CMakeLists.txt and modify the INCLUDE_DIRECTORIES directive to point to Thrift's include directory.

  • export LIBS="-lthrift" (on another machine it was "-lthrift -ldl")

  • export LDFLAGS=-L/path/to/thrift/libs

  • Build the server.

    • We used the BUILD/compile-pentium-max script (the name is for historic reasons; it will actually build an optimized amd64 binary).

Running the server

The Cassandra storage engine is linked into the server (i.e., it is not a plugin). All you need to do is make sure Thrift's libthrift.so can be found by the loader. This may require adjusting the LD_LIBRARY_PATH variable.

Running tests

There is a basic test suite. To run it:

  • Start Cassandra on localhost.

  • Set PATH so that the cqlsh and cassandra-cli binaries can be found.

  • From the build directory, run:

This page is licensed: CC BY-SA / Gnu FDL

Cassandra Storage Engine
lp:~maria-captains/maria/5.5-cassandra

Vagrant is used for setup. Full instructions are at Julien's website.

This page is licensed: CC BY-SA / Gnu FDL

Cassandra MariaDB VirtualBox

MariaDB [j1]> delete from t1;
Query OK, 1000 rows affected (0.14 sec)
engine-independent-table-statistics

cqlsh> CREATE KEYSPACE mariadbtest2
   ...   WITH strategy_class = 'org.apache.cassandra.locator.SimpleStrategy'
   ...   AND strategy_options:replication_factor='1';
cqlsh> USE mariadbtest2;
cqlsh:mariadbtest2> create columnfamily cf1 ( pk varchar primary key, data1 varchar, data2 bigint);
cqlsh:mariadbtest2> select * from cf1;
cqlsh:mariadbtest2>
MariaDB [test]> create table t1 (
    ->   rowkey varchar(36) primary key, 
    ->   data1 varchar(60), data2 varchar(60)
    -> ) engine=cassandra    thrift_host='localhost' keyspace='mariadbtest2' column_family='cf1';
ERROR 1928 (HY000): Internal error: 'Failed to map column data2 to datatype org.apache.cassandra.db.marshal.LongType'
MariaDB [test]> create table t1 (
    ->   rowkey varchar(36) primary key, 
    ->   data1 varchar(60), data2 bigint
    -> ) engine=cassandra  thrift_host='localhost' keyspace='mariadbtest2' column_family='cf1';
Query OK, 0 rows affected (0.04 sec)
Cassandra Storage Engine

Cassandra Storage Engine Future Plans

Legacy Cassandra storage engine description. Cassandra was removed from MariaDB in MariaDB 10.6.

These are possible future directions for Cassandra Storage Engine. This is mostly brainstorming, nobody has committed to implementing this.

Unlike MySQL/MariaDB, Cassandra is not suitable for transaction processing (the only limited scenario it handles well is append-only databases).

Instead, it focuses on the ability to deliver real-time analytics. This is achieved via the following combination:

  1. Insert/update operations are reduced to inserting a newer version. The implementation (SSTree) allows making lots of updates at a low cost.

  2. The data model is targeted at denormalized data. Cassandra docs and user stories all mention the practice of creating/populating a dedicated column family (=table) for each type of query you're going to run.

In other words, Cassandra encourages creation/use of materialized VIEWs. Having lots of materialized VIEWs multiplies the number of updates, but the low cost and non-conflicting nature of each update should offset that.
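
As a rough illustration of the "one column family per query" practice, the following Python sketch writes each event into several per-query structures at once. The names `views_by_item`, `views_by_user`, and `record_view` are hypothetical, and plain dicts stand in for column families; nothing here is Cassandra API.

```python
from collections import defaultdict

# Each dict stands in for one column family built to answer exactly one query.
views_by_item = defaultdict(list)  # answers "who looked at this item?"
views_by_user = defaultdict(list)  # answers "what did this user look at?"


def record_view(user: str, item: str) -> None:
    """Append one page view to every per-query structure (append-only writes)."""
    views_by_item[item].append(user)
    views_by_user[user].append(item)


record_view("joe", "book-1")
record_view("ann", "book-1")
record_view("joe", "book-2")
```

Every write is repeated once per structure, but each write is a cheap append that never conflicts with another, and each read becomes a single key lookup.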

How does one use Cassandra together with an SQL database? I can think of these use cases:

use case 1

  1. The "OLTP as in SQL" is kept in the SQL database.

  2. Data that's too big for SQL database (such as web views, or clicks) is stored in Cassandra, but can also be accessed from SQL.

As an example, one can think of a web shop which provides real-time peeks into analytics data, like Amazon's "people who looked at this item also looked at ..." or "today's best sellers in this category are ...", etc.

Generally, CQL (Cassandra Query Language) allows querying Cassandra's data in an SQL-like fashion.

Access through a storage engine will additionally allow:

  1. to get all of the data from one point, instead of two

  2. joins between SQL and Cassandra's data might be more efficient due to Batched Key Access (this remains to be seen)

  3. ??

use case 2

Suppose, all of the system's data is actually stored in an OLTP SQL database.

Cassandra is only used as an accelerator for analytical queries. Cassandra won't allow arbitrary, ad hoc DSS-type queries; it will require the DBA to create and maintain appropriate column families (however, it is supposed to give nearly instant answers to analytics-type questions).

Tasks that currently have no apparent solutions:

  1. There is no way to replicate data from MySQL/MariaDB into Cassandra. It would be nice if one could update data in MySQL and have that cause the appropriate inserts to be made into all relevant column families in Cassandra.

  2. ...

This page is licensed: CC BY-SA / Gnu FDL

Cassandra System Variables

Legacy Cassandra storage engine description. Cassandra was removed from MariaDB in MariaDB 10.6.

This page documents system variables related to the Cassandra storage engine. See Server System Variables for a complete list of system variables and instructions on setting them.

cassandra_default_thrift_host

  • Description: Host to connect to, if not specified on per-table basis.

cd mysql-test
./mysql-test-run t/cassandra.test
SELECT * FROM user_accounts WHERE username='joe';

SELECT
  user_accounts.*,
  cassandra_table.some_more_fields
FROM
  user_accounts, cassandra_table
WHERE
  user_accounts.username='joe' AND
  user_accounts.user_id= cassandra_table.user_id;
MariaDB [test]> insert into t1 values ('rowkey10', 'data1-value', 123456);
Query OK, 1 row affected (0.01 sec)

MariaDB [test]> insert into t1 values ('rowkey11', 'data1-value2', 34543);
Query OK, 1 row affected (0.00 sec)

MariaDB [test]> insert into t1 values ('rowkey12', 'data1-value3', 454);
Query OK, 1 row affected (0.00 sec)
MariaDB [test]> select * from t1 where rowkey='rowkey11';
+----------+--------------+-------+
| rowkey   | data1        | data2 |
+----------+--------------+-------+
| rowkey11 | data1-value2 | 34543 |
+----------+--------------+-------+
1 row in set (0.00 sec)
cqlsh:mariadbtest2> select * from cf1;
 pk       | data1        | data2
----------+--------------+--------
 rowkey12 | data1-value3 |    454
 rowkey10 |  data1-value | 123456
 rowkey11 | data1-value2 |  34543
[default@mariadbtest2] list cf1;
Using default limit of 100
Using default column limit of 100
-------------------
RowKey: rowkey12
=> (column=data1, value=data1-value3, timestamp=1345452471835)
=> (column=data2, value=454, timestamp=1345452471835)
-------------------
RowKey: rowkey10
=> (column=data1, value=data1-value, timestamp=1345452467728)
=> (column=data2, value=123456, timestamp=1345452467728)
-------------------
RowKey: rowkey11
=> (column=data1, value=data1-value2, timestamp=1345452471831)
=> (column=data2, value=34543, timestamp=1345452471831)

3 Rows Returned.
Elapsed time: 5 msec(s).

  • Scope: Global

  • Dynamic: Yes

  • Data Type: string

cassandra_failure_retries

  • Description: Number of times to retry on timeout/unavailable failures.

  • Scope: Global, Session

  • Dynamic: Yes

  • Data Type: numeric

  • Default Value: 3

  • Valid Values: 1 to 1073741824

cassandra_insert_batch_size

  • Description: INSERT batch size.

  • Scope: Global, Session

  • Dynamic: Yes

  • Data Type: numeric

  • Default Value: 100

  • Valid Values: 1 to 1073741824

cassandra_multiget_batch_size

  • Description: Batched Key Access batch size.

  • Scope: Global, Session

  • Dynamic: Yes

  • Data Type: numeric

  • Default Value: 100

  • Valid Values: 1 to 1073741824

cassandra_read_consistency

  • Description: Consistency to use for reading. See Datastax's documentation for details.

  • Scope: Global, Session

  • Default Value: ONE

  • Valid Values: ONE, TWO, THREE, ANY, ALL, QUORUM, EACH_QUORUM, LOCAL_QUORUM

cassandra_rnd_batch_size

  • Description: Full table scan batch size.

  • Scope: Global, Session

  • Default Value: 10000

  • Valid Values: 1 to 1073741824

cassandra_write_consistency

  • Description: Consistency to use for writing. See Datastax's documentation for details.

  • Scope: Global, Session

  • Default Value: ONE

  • Valid Values: ONE, TWO, THREE, ANY, ALL, QUORUM, EACH_QUORUM, LOCAL_QUORUM
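
The retry behaviour described for cassandra_failure_retries can be pictured with the Python sketch below. It is only an illustration: the exception classes, `call_with_retries`, and the counter names are hypothetical, and it assumes the setting counts total attempts, which the description above does not state.

```python
class TimeoutFailure(Exception):
    """Stand-in for a Cassandra timeout error (hypothetical name)."""


class UnavailableFailure(Exception):
    """Stand-in for a Cassandra unavailable error (hypothetical name)."""


def call_with_retries(operation, failure_retries=3, counters=None):
    """Run operation, trying again after timeout/unavailable failures.

    Assumption: failure_retries is the total number of attempts.
    """
    if counters is None:
        counters = {}
    last_error = None
    for _ in range(failure_retries):
        try:
            return operation()
        except TimeoutFailure as exc:
            counters["timeout_exceptions"] = counters.get("timeout_exceptions", 0) + 1
            last_error = exc
        except UnavailableFailure as exc:
            counters["unavailable_exceptions"] = counters.get("unavailable_exceptions", 0) + 1
            last_error = exc
    raise last_error


# Usage: an operation that times out twice and then succeeds.
attempts = {"n": 0}


def flaky_read():
    attempts["n"] += 1
    if attempts["n"] <= 2:
        raise TimeoutFailure("simulated timeout")
    return "ok"


counters = {}
outcome = call_with_retries(flaky_read, failure_retries=3, counters=counters)
```

The counters play the role of the Cassandra_timeout_exceptions and Cassandra_unavailable_exceptions status variables: they grow on every failed attempt, including attempts that are later retried successfully.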

    This page is licensed: CC BY-SA / Gnu FDL

    Cassandra storage engine
    Server System Variables

    Cassandra Storage Engine Overview

    Legacy Cassandra storage engine description. Cassandra was removed from MariaDB in MariaDB 10.6.

    Installing

    If using the YUM repositories on Fedora, Red Hat, or CentOS, first install the Cassandra storage engine package with:

    If using the Debian or Ubuntu repositories, the Cassandra plugin is in the main MariaDB server package.

    To install/activate the storage engine into MariaDB, issue the following command:

    You can also activate the storage engine by using the --plugin-load option on server startup.

    Introduction

    The Cassandra Storage Engine allows access to data in a Cassandra cluster from MariaDB. The overall architecture is shown in the picture below and is similar to that of the NDB cluster storage engine.

    You can access the same Cassandra cluster from multiple MariaDB instances, provided each of them runs the Cassandra Storage Engine:

    The primary goal of Cassandra SE (Storage Engine) is data integration between the SQL and NoSQL worlds. Have you ever needed to:

    • grab some of Cassandra's data from your web frontend, or SQL query?

    • insert a few records into Cassandra from some part of your app?

    Now, this is easily possible. Cassandra SE makes Cassandra's column family appear as a table in MariaDB that you can insert to, update, and select from. You can write joins against this table; it is possible to join data that's stored in MariaDB with data that's stored in Cassandra.

    Versions in MariaDB

    Cassandra SE Version
    Introduced
    Maturity

    What about CQL?

    The Cassandra Query Language (CQL) is the best way to work with Cassandra. It resembles SQL on first glance; however, the resemblance is very shallow. CQL queries are tightly bound to the way Cassandra accesses its data internally. For example, you can't have even the smallest join. In fact, adding a mere... AND non_indexed_column=1 into a WHERE clause is already invalid CQL.

    Our goal is to let one work in SQL instead of having to move between CQL and SQL all the time.

    Does this make Cassandra an SQL database?

    No. Cassandra SE is not suitable for running analytics-type queries that sift through huge amounts of data in a Cassandra cluster. That task is better handled by Hadoop-based tools like Apache Pig or Apache Hive. Cassandra SE is rather a "window" from an SQL environment into NoSQL.

    Data mapping

    Let's get specific. In order to access Cassandra's data from MariaDB, one needs to create a table with engine=cassandra. The table will represent a view of a Column Family in Cassandra and its definition will look like so:

    The name of the table can be arbitrary. However, primary key, column names, and types must "match" those of Cassandra.

    Cassandra's rowkey

    The table must define a column that corresponds to the Column Family's rowkey.

    • If Cassandra's rowkey has an alias (or name), then MariaDB's column must have the same name.

      • Otherwise, it must be named "rowkey".

    • The type of MariaDB's column must match the validation_class of Cassandra's rowkey (datatype matching is covered in more detail below).

    Note: Multi-column primary keys are currently not supported. Support may be added in a future version, depending on whether there is a demand for it.

    Cassandra's static columns

    Cassandra allows one to define a "static column family", where column metadata is defined in the Column Family header and is obeyed by all records.

    These "static" columns can be mapped to regular columns in MariaDB. A static column named 'foo' in Cassandra should have a counterpart named 'foo' in MariaDB. The types must also match; they are covered below.

    Cassandra's dynamic columns

    Cassandra also allows individual rows to have their own sets of columns. In other words, each row can have its own unique columns.

    These columns can be accessed through MariaDB's Dynamic Columns feature. To do so, one must define a column:

    • with an arbitrary name

    • of type blob

    • with the DYNAMIC_COLUMN_STORAGE=yes attribute

    Here is an example:

    Once defined, one can access individual columns with the new variant of the Dynamic Column functions, which now support string names (they used to support integers only).

    Super columns

    Cassandra's SuperColumns are not supported, and there are currently no plans to support them.

    Datatypes

    There is no direct 1-to-1 mapping between Cassandra's datatypes and MySQL/MariaDB datatypes. Also, Cassandra's size limitations are often more relaxed than MySQL/MariaDB's. For example, Cassandra's limit on rowkey length is about 2G, while MySQL limits unique key length to about 1.5Kb.

    The types must be mapped as follows:

    Cassandra
    MariaDB

    For types like "VARBINARY(n)", n should be chosen sufficiently large to accommodate all the data that is encountered in the table.
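
To make two of the mappings in this section concrete (uuid as 36-character text, and timestamp as 64-bit milliseconds since epoch), here is a small Python sketch. The helper names are hypothetical and the conversion is done client-side purely for illustration; it does not show how the storage engine converts values internally.

```python
import uuid
from datetime import datetime, timezone


def cassandra_ms_to_datetime(ms_since_epoch: int) -> datetime:
    """Convert milliseconds-since-epoch to a timezone-aware UTC datetime."""
    seconds, millis = divmod(ms_since_epoch, 1000)
    whole = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return whole.replace(microsecond=millis * 1000)


def uuid_as_text(value: uuid.UUID) -> str:
    """Return the canonical text form of a UUID, which fits a CHAR(36) column."""
    return str(value)


# A millisecond value stored verbatim in a BIGINT column can be decoded like this.
ts = cassandra_ms_to_datetime(1345452471835)
text_form = uuid_as_text(uuid.UUID(int=0))
```

A TIMESTAMP column with second precision would drop the fractional part, which is why the mapping also offers TIMESTAMP(6) and BIGINT.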

    Command mapping

    INSERT

    Cassandra doesn't provide any practical way to make INSERT different from UPDATE. Therefore, INSERT works as INSERT-or-UPDATE: it will overwrite the data if necessary.
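
The INSERT-or-UPDATE behaviour can be sketched in Python with a dict standing in for a column family. The function `insert_row` is hypothetical, and the sketch assumes that a write replaces the values of the columns it names.

```python
def insert_row(column_family: dict, rowkey: str, columns: dict) -> None:
    """Write columns for rowkey; a second write to the same key overwrites them."""
    column_family.setdefault(rowkey, {}).update(columns)


cf1 = {}
insert_row(cf1, "rowkey10", {"data1": "data1-value", "data2": 123456})
# "Inserting" the same rowkey again raises no duplicate-key error; it overwrites.
insert_row(cf1, "rowkey10", {"data1": "new-value", "data2": 1})
```

After both calls there is still a single row for rowkey10, holding the values from the second call.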

    INSERT ... SELECT and multi-line INSERT will try to write data in batches. Batch size is controlled by the cassandra_insert_batch_size system variable, which specifies the maximum batch size in columns.

    The status variables Cassandra_row_inserts and Cassandra_row_insert_batches allow one to see whether inserts are actually batched.
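
The relationship between cassandra_insert_batch_size and the two insert counters can be sketched as follows. This is an assumption-laden illustration: `insert_in_batches` is a hypothetical function, and the rule used to decide when a batch is full (flush before a row would push the column count over the limit) is a guess, not the engine's documented behaviour.

```python
def insert_in_batches(rows, insert_batch_size=100):
    """Group rows into batches whose total column count stays within the limit."""
    counters = {"row_inserts": 0, "row_insert_batches": 0}
    batches = []
    batch, batch_columns = [], 0
    for row in rows:
        # Flush the current batch if this row would exceed the column limit.
        if batch and batch_columns + len(row) > insert_batch_size:
            batches.append(batch)
            batch, batch_columns = [], 0
        batch.append(row)
        batch_columns += len(row)
        counters["row_inserts"] += 1
    if batch:
        batches.append(batch)
    counters["row_insert_batches"] = len(batches)
    return batches, counters


# Ten rows of three columns each, with a limit of nine columns per batch.
rows = [{"rowkey": f"k{i}", "data1": "x", "data2": i} for i in range(10)]
batches, counters = insert_in_batches(rows, insert_batch_size=9)
```

If the batch counter is much smaller than the row counter, as it is here, the inserts were batched.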

    UPDATE

    UPDATE works as one would expect SQL's UPDATE command to work (i.e. changing a primary key value will result in the old record being deleted and a new record being inserted).

    DELETE

    • DELETE FROM cassandra_table maps to the truncate(column_family) call.

    • The DELETE with WHERE clause will do per-row deletions.

    SELECT

    Generally, all SELECT statements work like one expects SQL to work. Conditions in the form primary_key=... allow the server to construct query plans which access Cassandra's rows with key lookups.

    Full table scan

    Full table scans are performed in a memory-efficient way. Cassandra SE performs a full table scan as a series of batches, each of which reads not more than cassandra_rnd_batch_size records.
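
The idea of scanning in bounded batches can be sketched in Python with a generator. The function `scan_in_batches` is hypothetical and pages through an in-memory list of keys; it only illustrates why memory use is bounded by the cassandra_rnd_batch_size setting, not how the engine talks to Cassandra.

```python
from typing import Iterator, List


def scan_in_batches(keys: List[str], rnd_batch_size: int = 10000) -> Iterator[List[str]]:
    """Yield the keys in consecutive batches of at most rnd_batch_size items."""
    start = 0
    while start < len(keys):
        # Only one batch is materialized at a time.
        yield keys[start:start + rnd_batch_size]
        start += rnd_batch_size


all_keys = [f"rowkey{i}" for i in range(7)]
scan_batches = list(scan_in_batches(all_keys, rnd_batch_size=3))
```

Each batch holds at most `rnd_batch_size` records, so the memory needed does not grow with the size of the column family.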

    Batched Key Access support

    Cassandra supports Batched Key Access in no-association mode. This means that it requires the SQL layer to do hashing, which means the following settings are required:

    • optimizer_switch='join_cache_hashed=on'

    • join_cache_level=7|8

    Cassandra SE is currently unable to make use of space in the join buffer (the one whose size is controlled by join_buffer_size). Instead, it will limit read batches to reading not more than cassandra_multiget_batch_size at a time, and memory is allocated on the heap.

    Note that the join buffer is still needed by the SQL layer, so join_buffer_size should still be increased if you want to read in big batches.

    It is possible to track the number of read batches, how many keys were looked-up, and how many results were produced with these status variables:

    Variable_name
    Value
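
How the three multiget counters relate to batched lookups can be sketched as below. The function `batched_key_lookup` and the dict used as a column family are hypothetical; the sketch only mirrors the counter semantics described on this page (read operations, keys looked up, rows actually read).

```python
def batched_key_lookup(keys, column_family, multiget_batch_size=100):
    """Look up keys in batches and count reads, keys scanned, and rows read."""
    counters = {
        "multiget_reads": 0,
        "multiget_keys_scanned": 0,
        "multiget_rows_read": 0,
    }
    rows = []
    for start in range(0, len(keys), multiget_batch_size):
        batch = keys[start:start + multiget_batch_size]
        counters["multiget_reads"] += 1                  # one read operation per batch
        counters["multiget_keys_scanned"] += len(batch)  # keys we made lookups for
        found = [(k, column_family[k]) for k in batch if k in column_family]
        counters["multiget_rows_read"] += len(found)     # rows that actually exist
        rows.extend(found)
    return rows, counters


stored = {"a": 1, "b": 2, "c": 3}
found_rows, multiget_counters = batched_key_lookup(
    ["a", "x", "b", "c", "y"], stored, multiget_batch_size=2
)
```

Five keys in batches of two give three read operations, and only the three keys that exist produce rows.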

    System and status variables

    The following system variables are available:

    Variable name
    Description

    The following status variables are available:

    Variable name
    Description

    A note about Cassandra 1.2

    Cassandra 1.2 has slightly changed its data model, as described at thrift-to-cql3. This has caused some Thrift-based clients to no longer work (for example, here's a problem experienced by Pig: CASSANDRA-5234).

    Currently, Cassandra SE is only able to access Cassandra 1.2's column families that were defined with the WITH COMPACT STORAGE attribute.

    See also

    • Slides from talk at Percona Live 2013 - MariaDB Cassandra Interoperability

    • MDEV-431 - JIRA task for Cassandra SE work

    This page is licensed: CC BY-SA / Gnu FDL

    yum install MariaDB-cassandra-engine
    install soname 'ha_cassandra.so';

    Cassandra | MariaDB
    ----------+--------
    uuid      | CHAR(36); UUIDs are represented in text form on the MariaDB side
    timestamp | TIMESTAMP (second precision), TIMESTAMP(6) (microsecond precision), BIGINT (gets verbatim Cassandra's 64-bit milliseconds-since-epoch)
    boolean   | BOOL
    float     | FLOAT
    double    | DOUBLE
    decimal   | VARBINARY(n)
    counter   | BIGINT, only reading is supported

    Consistency to use for writing

    Number of Unavailable exceptions we got from Cassandra

    Cassandra Storage Engine - Use Example

  • Cassandra Storage Engine - Issues

  • Cassandra SE 1.8

    Experimental

    Cassandra | MariaDB
    ----------+--------
    blob      | BLOB, VARBINARY(n)
    ascii     | BLOB, VARCHAR(n), use charset=latin1
    text      | BLOB, VARCHAR(n), use charset=utf8
    varint    | VARBINARY(n)
    int       | INT
    bigint    | BIGINT, TINY, SHORT (pick the one that will fit the real data)

    Variable_name                   | Value
    --------------------------------+------
    Cassandra_multiget_reads        | 0
    Cassandra_multiget_keys_scanned | 0
    Cassandra_multiget_rows_read    | 0

    Variable name                 | Description
    ------------------------------+------------
    cassandra_default_thrift_host | Host to connect to, if not specified on per-table basis
    cassandra_failure_retries     | Number of times to retry on timeout/unavailable failures
    cassandra_insert_batch_size   | INSERT batch size
    cassandra_multiget_batch_size | Batched Key Access batch size
    cassandra_rnd_batch_size      | Full table scan batch size
    cassandra_read_consistency    | Consistency to use for reading

    Variable name                   | Description
    --------------------------------+------------
    Cassandra_row_inserts           | Number of rows inserted
    Cassandra_row_insert_batches    | Number of insert batches performed
    Cassandra_multiget_reads        | Number of read operations
    Cassandra_multiget_keys_scanned | Number of keys we've made lookups for
    Cassandra_multiget_rows_read    | Number of rows actually read
    Cassandra_timeout_exceptions    | Number of Timeout exceptions we got from Cassandra

    Dynamic Columns
    cassandra_insert_batch_size
    Cassandra_row_inserts
    Cassandra_row_insert_batches
    cassandra_rnd_batch_size
    #join_buffer_size
    cassandra_multiget_batch_size
    #join_buffer_size
    system variables
    status variables
    thrift-to-cql3
    CASSANDRA-5234
    MariaDB Cassandra Interoperability
    MDEV-431
    Instructions for creating binary tarball in MariaDB 5.5
    Cassandra Storage Engine - Future Plans
    cassandra-se-overview
    mariadb-and-cassandra
    set global cassandra_default_thrift_host='192.168.0.10'; -- Cassandra's address. It can also
                                                             -- be specified as a startup parameter
                                                             -- or on a per-table basis
    
    create table cassandra_tbl      -- table name can be chosen at will
    (
      rowkey  type PRIMARY KEY,     -- represents Column Family's rowkey. Primary key
                                    -- must be defined over this column.
    
      column1 type,                 -- Cassandra's static columns can be mapped to 
      column2 type,                 -- regular SQL columns.
    
      dynamic_cols blob DYNAMIC_COLUMN_STORAGE=yes -- If you need to access Cassandra's
                                                   -- dynamic columns, you can define
                                                   -- a blob which will receive all of 
                                                   -- them, packed as MariaDB's dynamic
                                                   -- columns.
    ) engine=cassandra
      keyspace= 'cassandra_key_space'        -- Cassandra's keyspace.columnFamily we  
      column_family='column_family_name';    -- are accessing.
    dynamic_cols blob DYNAMIC_COLUMN_STORAGE=yes
    cassandra_write_consistency
    Cassandra_unavailable_exceptions

    Cassandra Storage Engine

    Legacy Cassandra storage engine description. Cassandra was removed from MariaDB in MariaDB 10.6.

    A storage engine interface to Cassandra. Read the original announcement information about CassandraSE.

    MariaDB 10.0.3
    MariaDB 5.5.27
    MariaDB 10.0.1