Between MariaDB 5.5.21 and 5.5.22 some work was done on how the hash index for a MEMORY table is created. This results in better performance when inserting rows into a memory table.

The following benchmark compares MariaDB-5.5.21 and 5.5.25. Compiled with identical settings on the same machine. The operation was loading 50 million rows into a MEMORY table with LOAD DATA INFILE.

Two different tables were tested: one with an indexed INT column and one with an indexed CHAR(20) column. The data files were pre-generated and located on a SSD. In order to make the effect visible, the cpu speed was set to minimum (core I5 @ 800Mhz)

Result:

Table TypeMariaDB Versionrows per secondPercent
INT5.5.21411022100%
5.5.25510016124%
CHAR(20)5.5.21259399100%
5.5.25411535159%

This is how the benchmark was run:

MariaDB [test]> tee 5.5.21.txt
MariaDB [test]> set @instance="5.5.21";
MariaDB [test]> source bench.sql

The script used to generate the data files:

#!/usr/bin/perl -w                                                              
                                                                                
$ROWS=50*1024*1024;                                                             

open F, ">/tmp/hash1.txt" or die;                                               
for ($i=0; $i<$ROWS; $i++) {                                                    
    printf F "%d\n", int(rand($ROWS));                                          
}                                                                               
close F or die;                                                                 
                                                                                
open F, ">/tmp/hash2.txt" or die;                                               
for ($i=0; $i<$ROWS; $i++) {                                                    
    $s="";                                                                      
    for (1..20) { $s .= chr(ord('a')+int(rand(26))); }                          
    print F $s, "\n";                                                           
}                                                                               
close F or die;

The benchmark SQL script bench.sql:

use test;                                                                       
                                                                                
-- need big heap tables                                                         
set max_heap_table_size=4*1024*1024*1024;                                       
                                                                                
-- table to hold test results                                                   
create table if not exists results (                                            
  id serial,                                                                    
  operation char(32),                                                           
  opsize bigint unsigned,                                                       
  started datetime,                                                             
  ended datetime,                                                               
  instance char(20)                                                             
);                                                                              
                                                                                
-- dummy run with second data file                                              
drop table if exists t1;                                                        
create table t1 (c1 char(20), index (c1)) engine memory;                        
load data infile "/tmp/hash2.txt" into table t1;                                
drop table t1;                                                                  
                                                                                
-- do total of 5 runs for each table                                            
                                                                                
-- run #1                                                                       
create table t1 (c1 int, index (c1)) engine memory;                             
select @t1:=now();                                                              
load data infile "/tmp/hash1.txt" into table t1;                                
select @t2:=now();                                                              
select @rows:=count(*) from t1;                                                 
insert into results (operation, opsize, started, ended, instance)               
values ("load into INT table", @rows, @t1, @t2, @instance);                     
drop table t1;                                                                  
                                                                                
create table t1 (c1 char(20), index (c1)) engine memory;                        
select @t1:=now();                                                              
load data infile "/tmp/hash2.txt" into table t1;                                
select @t2:=now();                                                              
select @rows:=count(*) from t1;                                                 
insert into results (operation, opsize, started, ended, instance)               
values ("load into CHAR(20) table", @rows, @t1, @t2, @instance);                
drop table t1;                                                                  
                                                                                
-- run #2                                                                       
create table t1 (c1 int, index (c1)) engine memory;                             
select @t1:=now();                                                              
load data infile "/tmp/hash1.txt" into table t1;                                
select @t2:=now();                                                              
select @rows:=count(*) from t1;                                                 
insert into results (operation, opsize, started, ended, instance)               
values ("load into INT table", @rows, @t1, @t2, @instance);                     
drop table t1;                                                                  
                                                                                
create table t1 (c1 char(20), index (c1)) engine memory;                        
select @t1:=now();                                                              
load data infile "/tmp/hash2.txt" into table t1;                                
select @t2:=now();                                                              
select @rows:=count(*) from t1;                                                 
insert into results (operation, opsize, started, ended, instance)               
values ("load into CHAR(20) table", @rows, @t1, @t2, @instance);                
drop table t1;                                                                  
                                                                                
-- run #3                                                                       
create table t1 (c1 int, index (c1)) engine memory;                             
select @t1:=now();                                                              
load data infile "/tmp/hash1.txt" into table t1;                                
select @t2:=now();                                                              
select @rows:=count(*) from t1;                                                 
insert into results (operation, opsize, started, ended, instance)               
values ("load into INT table", @rows, @t1, @t2, @instance);                     
drop table t1;                                                                  
                                                                                
create table t1 (c1 char(20), index (c1)) engine memory;                        
select @t1:=now();                                                              
load data infile "/tmp/hash2.txt" into table t1;                                
select @t2:=now();                                                              
select @rows:=count(*) from t1;                                                 
insert into results (operation, opsize, started, ended, instance)               
values ("load into CHAR(20) table", @rows, @t1, @t2, @instance);                
drop table t1;                                                                  
                                                                                
-- run #4                                                                       
create table t1 (c1 int, index (c1)) engine memory;                             
select @t1:=now();                                                              
load data infile "/tmp/hash1.txt" into table t1;                                
select @t2:=now();                                                              
select @rows:=count(*) from t1;                                                 
insert into results (operation, opsize, started, ended, instance)               
values ("load into INT table", @rows, @t1, @t2, @instance);                     
drop table t1;                                                                  
                                                                                
create table t1 (c1 char(20), index (c1)) engine memory;                        
select @t1:=now();                                                              
load data infile "/tmp/hash2.txt" into table t1;                                
select @t2:=now();                                                              
select @rows:=count(*) from t1;                                                 
insert into results (operation, opsize, started, ended, instance)               
values ("load into CHAR(20) table", @rows, @t1, @t2, @instance);                
drop table t1;                                                                  
                                                                                
-- run #5                                                                       
create table t1 (c1 int, index (c1)) engine memory;                             
select @t1:=now();                                                              
load data infile "/tmp/hash1.txt" into table t1;                                
select @t2:=now();                                                              
select @rows:=count(*) from t1;                                                 
insert into results (operation, opsize, started, ended, instance)               
values ("load into INT table", @rows, @t1, @t2, @instance);                     
show table status like 't1';                                                    
drop table t1;                                                                  
                                                                                
create table t1 (c1 char(20), index (c1)) engine memory;                        
select @t1:=now();                                                              
load data infile "/tmp/hash2.txt" into table t1;                                
select @t2:=now();                                                              
select @rows:=count(*) from t1;                                                 
insert into results (operation, opsize, started, ended, instance)               
values ("load into CHAR(20) table", @rows, @t1, @t2, @instance);                
show table status like 't1';                                                    
drop table t1;                                                                  

-- list all results                                                             
select operation, instance, unix_timestamp(ended)-unix_timestamp(started) as duration,                                                                          
 opsize/(unix_timestamp(ended)-unix_timestamp(started)) as ops_per_sec          
from results order by operation, instance, started;                             
                                                                                
-- list average results                                                         
select operation, instance, avg(opsize/(unix_timestamp(ended)-unix_timestamp(started))) as avg_ops_per_sec                                                      
from results group by operation, instance;                                      

Comments

Comments loading...