How to quickly insert data into MariaDB

You are viewing an old version of this article. View the current version here.

This article describes different techniques of how to insert data quickly into MariaDB.

Background

When inserting new data into MariaDB, the things that takes time are: (In importance order):

  • Syncing data to disk (as part of end of transaction)
  • Adding new keys. The more index, the more time it takes to keep these updated
  • Checking against foreign keys (if such exists)
  • Adding rows to the storage engine
  • Sending data to the server

The following describes the different techniques (in importance order) you can use to quickly insert data into a table.

Disabling keys

You can temporary disable the update of non unique indexes. This is mostly useful when there is zero or very few rows in the table you are inserting data.

ALTER TABLE table_name DISABLE KEYS;
BEGIN;
... inserting data with INSERT or LOAD DATA ....
COMMIT;
ALTER TABLE table_name ENABLE KEYS;

In many storage engines (at least MyISAM, Aria and InnoDB/XtraDB) ENABLE KEYS works by scanning trough the row data collecting keys, sort them and create the index blocks. This is a magnitude faster than creating the index one row at the time and uses also much less key buffer memory.

Note that when you insert into an empty table with INSERT or LOAD DATA MariaDB automaticly does an DISABLE KEYS before and ENABLE KEYS afterwards.

Loading text files

The fastest way to insert data into MariaDB is trough the LOAD DATA INFILE command.

The simplest form of the command is:

LOAD DATA INFILE 'file_name' INTO TABLE table_name;

You can also read a file locally on the machine where the client is running by using:

LOAD DATA LOCAL INFILE 'file_name' INTO TABLE table_name;

This is not as fast as reading the file on the server side, but the difference is not that big.

LOAD DATA INFILE is very fast because:

  • There is no parsing of SQL.
  • Data is read in big blocks.
  • If table was empty at start, all non unique indexes are disabled during the operation.
  • The engine is told to cache rows first and them insert them in big blocks (At last MyISAM and Aria supports this).
  • For empty tables, some transactional engines (like Aria) does not log the inserted data in the transaction log as one can rollback the operation by just doing a TRUNCATE on the table.

In many cases when you need it insert many rows at a time it can be faster to create a file locally, add the rows there and then use LOAD DATA INFILE to load them than using INSERT to insert the rows.

In MariaDB 5.3 you will get progress reporting for LOAD DATA INFILE.

mysqlimport

You can import many files in parallel with mysqlimport

mysqlimport --use-threads=10 database text-file-name [text-file-name...]

Internally mysqlimport uses LOAD DATA INFILE to read in the data

Inserting data with INSERT statements

Using big transaction

When doing many inserts in a row, you should wrap them with BEGIN / END to not have to do a full transaction (which means a disk sync) for every row. For example doing a begin/end every 1000 insert will speed up your inserts almost 1000 times.

BEGIN;
INSERT ...
INSERT ...
END;
BEGIN;
INSERT ...
INSERT ...
END;
...

The reason why you may want to have many BEGIN/END instead of just one is that the former will use up less transaction log space.

Multi value insert

You can insert many rows at once with multi value row inserts:

INSERT INTO table_name values(1,"row 1"),(2, "row 2"),...;

The limit of how much data you can have in one statement is limited by the max_allowed_packet server variable.

Inserting into several tables at once

If you need to insert data in several tables at once, the way to do that is to enable multi row statements and send many inserts to the server at once:

INSERT INTO table_name_1 (auto_increment_key, data) VALUES (NULL,"row 1");
INSERT INTO table_name_2 (auto_increment, reference, data) values (NULL, LAST_INSERT_ID(), "row 2");

LAST_INSERT_ID() is a function that returns the last auto_increment value inserted.

Note that by default the command line mysql client will by default send the above as multiple statements.

To test this in the mysql client you have to do:

delimiter ;;
select 1; select 2;;
delimiter ;

Note that for multi query statements to work, your client must specify the CLIENT_MULTI_STATEMENTS flag to mysql_real_connect().

Server variables that can be used to tune the speed of insert

OptionDescription
innodb-buffer-pool-sizeIncrease this if you have many index in InnoDB/XtraDB tables
key_buffer_sizeIncrease this if you have many index in MyISAM tables
max_allowed_packetIncrease this to allow bigger multi-insert statements
read_buff_sizeRead block size when reading a file with LOAD DATA

See mysqld options for full list of server variables.

Comments

Comments loading...
Content reproduced on this site is the property of its respective owners, and this content is not reviewed in advance by MariaDB. The views, information and opinions expressed by this content do not necessarily represent those of MariaDB or any other party.