Instant ADD COLUMN for InnoDB
MariaDB Server 10.3.2 alpha was released this week. For InnoDB, the new features coming in MariaDB Server 10.3 include CREATE SEQUENCE which is a logical continuation of the Persistent AUTO_INCREMENT that appeared in MariaDB Server 10.2.
Perhaps one of the most important InnoDB changes coming in MariaDB Server 10.3 is Instant ADD COLUMN for InnoDB tables. The design was brainstormed in April by engineers from MariaDB Corporation, Alibaba and Tencent. A prototype was developed by Vin Chen (陈福荣) from the Tencent Game DBA Team.
What is so special about this? Normally, adding a column to a table requires the full table to be rebuilt. The complexity of the operation is proportional to the size of the table, or O(n·m) where n is the number of rows in the table and m is the number of indexes. Sure, with my online
ALTER TABLE for InnoDB tables in MySQL 5.6 and MariaDB Server 10.0, you would be able to modify the table while it is being rebuilt, but it would significantly increase the I/O and memory consumption and cause a replication lag. With instant
ADD COLUMN, all that is needed is an O(log n) operation to insert a special hidden record into the table, and an update of the data dictionary. For a large table, instead of taking several hours, the operation would be completed in the blink of an eye. The
ADD COLUMN operation is only slightly more expensive than a regular
INSERT, due to locking constraints.
In the past, some developers may have implemented a kind of
instant add column in the application by encoding multiple columns in a single
BLOB column. MariaDB Dynamic Columns was an early example of that. A more recent example is JSON and related string manipulation functions.
Adding real columns has the following advantages over encoding columns into a single
- Efficient storage in a native binary format
- Data type safety
- Indexes can be built natively
- Constraints are available:
DEFAULTvalues can be specified
- Triggers can be written more easily
ADD COLUMN, you can enjoy all the benefits of structured storage without the drawback of having to rebuild the table.
ADD COLUMN is available for both old and new InnoDB tables. Basically you can just upgrade from MySQL 5.x or MariaDB and start adding columns instantly.
ADD COLUMN only applies when the added columns appear last in the table. The place specifier
LAST is the default. If
AFTER col is specified, then
col must be the last column, or the operation will require the table to be rebuilt.
If the table contains a hidden
FTS_DOC_ID column due to
FULLTEXT INDEX, then instant
ADD COLUMN will not be possible.
If the table becomes empty (either via
DELETE), the table will be converted to the old
Naturally, the operation is crash-safe. If the server is killed while instant
ADD COLUMN was executing, the operation will be rolled back. If it was the very first instant
ADD COLUMN operation, the table will be restored to the old
InnoDB data files after instant
ADD COLUMN cannot be imported to older versions of MariaDB or MySQL. Any table-rebuilding operation such as
ALTER TABLE…FORCE will convert the table to the old format.
For technical reasons, instant
ADD COLUMN is not available for
ROW_FORMAT=COMPRESSED, which is one of the formats that I created before MySQL 5.5.
A simple example of instant
CREATE TABLE t(id INT PRIMARY KEY, u INT UNSIGNED NOT NULL UNIQUE) ENGINE=InnoDB; INSERT INTO t(id,u) VALUES(1,1),(2,2),(3,3); ALTER TABLE t ADD COLUMN (d DATETIME DEFAULT current_timestamp(), p POINT NOT NULL DEFAULT ST_GeomFromText('POINT(0 0)'), t TEXT CHARSET utf8 DEFAULT 'The quick brown fox jumps over the lazy dog'); UPDATE t SET t=NULL WHERE id=3; SELECT id,u,d,ST_AsText(p),t FROM t; SELECT variable_value FROM information_schema.global_status WHERE variable_name = 'innodb_instant_alter_column';
The above example illustrates that when the added columns are declared
NOT NULL, a
DEFAULT value must be available, either implied by the data type or set explicitly by the user. The expression need not be constant, but it must not refer to the columns of the table, such as
DEFAULT u+1 (a MariaDB extension). The
DEFAULT current_timestamp() would be evaluated at the time of the
ALTER TABLE and apply to each row, like it does for non-instant
ALTER TABLE. If a subsequent
ALTER TABLE changes the
DEFAULT value for subsequent
INSERT, the values of the columns in existing records will naturally be unaffected.
You can download MariaDB Server 10.3.2 here. Note that MariaDB Server 10.3.2 is an alpha release. Please do not use it in production, but feel free to test it.
ALTER TABLE…DROP COLUMN will require the table to be rebuilt. We are proud of the exciting contributions from the robust MariaDB community. Stay tuned for new improvements coming to MariaDB!