Comments - TokuDB Differences

 
10 years, 5 months ago Jean Weisbuch

Here is a simple one liner to check the compression format of the TokuDB data files using the tokuftdump tool :

mysql -sNe 'SELECT dictionary_name, internal_file_name FROM information_schema.tokudb_file_map WHERE dictionary = "main" OR dictionary LIKE "key-%"' |perl -F'\t' -ane '@out=qx(tokuftdump --nodata --rootnode /var/lib/mysql/$F[1]); foreach $ftdump (@out) { if($ftdump=~/^ compression_method=([0-9]+)$/) { print $F[0]."\t".$1."\n"; last } }'

(don't forget to ajust to your datadir path if its not /var/lib/mysql)

The integer corresponds to the compression format, 0 is uncompressed, 8 is ZLIB and 10 is LZMA, the other format correspondances can be found on make_tdb.cc

 
10 years, 4 months ago Jean Weisbuch

On MariaDB5.5.37, the column "dictionary" has changed for "table_dictionary_name", so the right command is now :

mysql -sNe 'SELECT dictionary_name, internal_file_name FROM information_schema.tokudb_file_map WHERE table_dictionary_name = "main" OR table_dictionary_name LIKE "key-%"' |perl -F'\t' -ane '@out=qx(tokuftdump --nodata --rootnode /var/lib/mysql/$F[1]); foreach $ftdump (@out) { if($ftdump=~/^ compression_method=([0-9]+)$/) { print $F[0]."\t".$1."\n"; last } }'

Also note that the ZLIB compressed tables are having the value of 11 and not 8 as stated previously.

 
8 years, 7 months ago Sergei Khomutov

Hi, it does not work for me. Versions : tokudb_version - 5.6.26-74.0 , version mariaDB - 10.1.10-MariaDB-1jessie-log.

I've added --header to tokuftdump command. Finally it works for me :

 mysql -sNe 'SELECT dictionary_name, internal_file_name FROM information_schema.tokudb_file_map WHERE table_dictionary_name = "main" OR table_dictionary_name LIKE "key-%"' |perl -F'\t' -ane '@out=qx(tokuftdump --nodata --header --rootnode /var/lib/mysql/$F[1]); foreach $ftdump (@out) { if($ftdump=~/^ compression_method=([0-9]+)$/) { print $F[0]."\t".$1."\n"; last } }' 

And i noticed that changing compression method changes size of table on disk. But command

 SELECT 
    table_name AS `Table`, 
    round(((data_length) / 1024 / 1024), 2) `Table  size in MB` ,
    round(((index_length) / 1024 / 1024), 2) `Index  size in MB` 
FROM information_schema.TABLES 
WHERE table_schema = <table>
order by `Table  size in MB` desc; 

shows the same size of tables for different compression methods. Why ?

 
10 years, 11 months ago Pawel Pabian

Documentation CLUSTERED keyword is not correct

> CREATE TABLE foo (id int, key (id) CLUSTERED=YES) ENGINE=TokuDB COMPRESSION=TOKUDB_LZMA;
ERROR 1911 (HY000): Unknown option 'CLUSTERED'

should be

> CREATE TABLE foo (id int, key (id) CLUSTERING=YES) ENGINE=TokuDB COMPRESSION=TOKUDB_LZMA;
Query OK, 0 rows affected (0.01 sec)

> show create table foo\G
*************************** 1. row ***************************
       Table: foo
Create Table: CREATE TABLE `foo` (
  `id` int(11) DEFAULT NULL,
  KEY `id` (`id`) `CLUSTERING`=YES
) ENGINE=TokuDB DEFAULT CHARSET=utf8 `COMPRESSION`=TOKUDB_LZMA
1 row in set (0.00 sec)

It is not possible to create PARTITIONED table with CLUSTERING key in one go

> CREATE TABLE bar (id int, key (id) CLUSTERING=YES) ENGINE=TokuDB COMPRESSION=TOKUDB_LZMA PARTITION BY HASH(id) PARTITIONS 4;
ERROR 1911 (HY000): Unknown option 'CLUSTERING

but adding partitioning later works fine

> CREATE TABLE bar (id int, key (id) CLUSTERING=YES) ENGINE=TokuDB COMPRESSION=TOKUDB_LZMA;
Query OK, 0 rows affected (0.02 sec)

> ALTER TABLE bar PARTITION BY HASH(id) PARTITIONS 4;
Query OK, 0 rows affected (0.08 sec)
Records: 0  Duplicates: 0  Warnings: 0

> SHOW CREATE TABLE bar\G
*************************** 1. row ***************************
       Table: bar
Create Table: CREATE TABLE `bar` (
  `id` int(11) DEFAULT NULL,
  KEY `id` (`id`) `CLUSTERING`=YES
) ENGINE=TokuDB DEFAULT CHARSET=utf8 `COMPRESSION`=TOKUDB_LZMA
/*!50100 PARTITION BY HASH (id)
PARTITIONS 4 */
1 row in set (0.00 sec)
 
8 years, 6 months ago Mariusz Szot

If you want to add to the table and index partitioning, clustering, it is only in the following order:

create table,

1. alter add .... clustering = yes

2. and alter table ... partition to ...

reverse order :

1. alter table ... partition to ...

2. alter add .... clustering = yes

not WORK, rror 1911 (HY000): Unknown option 'clustering'.

EXAMPLE: drop table t1

CREATE TABLE `t1` ( `id_high` bigint(20) unsigned NOT NULL, `id_low` bigint(20) unsigned NOT NULL, `modified` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP -- KEY `clstr_time` (`modified`) `clustering` = yes, -- KEY `clstr_key_id_id` (`id_high`) `clustering` = yes, -- PRIMARY KEY (`modified`,`id_high`) ) ENGINE=TokuDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci

alter table n3.t1 add PRIMARY KEY `t1_pk` (`modified` ASC, `id_high` ASC); alter table t1 add key clstr_key (`id_high`) CLUSTERING=YES;

ALTER TABLE t1 PARTITION BY RANGE (UNIX_TIMESTAMP(modified)) ( PARTITION p201410 VALUES LESS THAN (1414796400) ENGINE = TokuDB, PARTITION p201411 VALUES LESS THAN (1417388400) ENGINE = TokuDB, PARTITION p201412 VALUES LESS THAN (1420066800) ENGINE = TokuDB, PARTITION p201501 VALUES LESS THAN (1422745200) ENGINE = TokuDB, PARTITION p201502 VALUES LESS THAN (1425164400) ENGINE = TokuDB, PARTITION p201503 VALUES LESS THAN (1427839200) ENGINE = TokuDB, PARTITION p201504 VALUES LESS THAN (1430431200) ENGINE = TokuDB, PARTITION p201505 VALUES LESS THAN (1433109600) ENGINE = TokuDB, PARTITION p201506 VALUES LESS THAN (1435701600) ENGINE = TokuDB, PARTITION p201507 VALUES LESS THAN (1438380000) ENGINE = TokuDB, PARTITION p201508 VALUES LESS THAN (1441058400) ENGINE = TokuDB, PARTITION p201509 VALUES LESS THAN (1443650400) ENGINE = TokuDB, PARTITION p2015010 VALUES LESS THAN (1446332400) ENGINE = TokuDB, PARTITION p_max VALUES LESS THAN MAXVALUE ENGINE = TokuDB );

 
10 years, 11 months ago Sergei Golubchik

I've fixed the CLUSTERING syntax, thanks. As far as partitioning is concerned, I've reported a bug about it: https://mariadb.atlassian.net/browse/MDEV-5058

 
11 years ago Tim Callaghan

Since the grammar for creating clustered indexes and picking compression type are different than Tokutek's TokuDB for MariaDB can someone please add the syntax to this page?

 
11 years ago Tim Callaghan

More differences

I'm pretty sure this is a complete list of clarifications and additional changes between the TokuDB from Tokutek and the version from MariaDB.org.

  • No "INSERT NOAR" or "UPDATE NOAR" means that Fast Upserts and Fast Updates are not possible.
  • No Hot Indexing (creating indexes while the table is available for insert/update/delete)
  • Index creation will force a complete table rewrite (table is read-only for the duration of the operation as it is rebuilt, all existing indexes are recreated, and new index is created)
  • Auto-increment behavior is as implemented in MyISAM, not TokuDB
  • DDL grammar changes. I'm not sure of the specifics but I believe the clustering keyword for index creation and how you specify compression type is different.
  • Memory Allocator. We ship MySQL and MariaDB with jemalloc statically linked to mysqld.
 
11 years ago Sergei Golubchik

DDL grammar and auto-inrement behavior, yes, thanks.

We also link jemalloc statically into mysqld, so that should be the same.

 
Content reproduced on this site is the property of its respective owners, and this content is not reviewed in advance by MariaDB. The views, information and opinions expressed by this content do not necessarily represent those of MariaDB or any other party.