MariaDB starting with 5.5.31
Starting with MariaDB 5.5.31 and MariaDB 10.0.2, the included versions of XtraDB and InnoDB support atomic writes, which make the doublewrite buffer unnecessary. For those who are unfamiliar with Fusion-io, a Fusion-io introduction is available.
Partial write operations
When InnoDB writes to the filesystem, there is generally no guarantee that a given write operation will complete in full (rather than partially) if power is lost or the operating system crashes at the exact moment the write is in progress.
Without detection or prevention of partial writes, the integrity of the database can be compromised after recovery.
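To make the failure mode concrete, here is a minimal Python sketch of a torn page. It is a simulation only: the 512-byte sector size, the page layout (a 4-byte CRC32 followed by the body) and the helper names make_page, page_is_intact and torn_write are assumptions for illustration and do not reflect InnoDB's real on-disk format.

```python
import zlib

PAGE_SIZE = 16 * 1024   # InnoDB's default page size
SECTOR_SIZE = 512       # assumed unit that the device writes atomically


def make_page(payload: bytes) -> bytes:
    """Build a toy page: 4-byte CRC32 of the body, then the zero-padded body."""
    body = payload.ljust(PAGE_SIZE - 4, b"\0")
    return zlib.crc32(body).to_bytes(4, "big") + body


def page_is_intact(page: bytes) -> bool:
    """Return True if the stored checksum matches the page body."""
    return int.from_bytes(page[:4], "big") == zlib.crc32(page[4:])


def torn_write(old_page: bytes, new_page: bytes, sectors_written: int) -> bytes:
    """Simulate a crash after only the first sectors of the new page hit disk."""
    cut = sectors_written * SECTOR_SIZE
    return new_page[:cut] + old_page[cut:]


old = make_page(b"row version 1" * 1000)
new = make_page(b"row version 2" * 1000)
on_disk = torn_write(old, new, sectors_written=3)

# The torn page mixes new and old sectors, so its checksum no longer matches.
print(page_is_intact(old), page_is_intact(new), page_is_intact(on_disk))
```

The torn page is neither the old version nor the new one, which is why a checksum alone can detect the problem but cannot repair it.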
innodb_doublewrite - an imperfect solution
Since its inception, InnoDB has had a mechanism to detect and ignore partial writes: the InnoDB Doublewrite Buffer (page checksums, controlled by innodb_checksums, can also be used to detect a partial write).
Doublewrites, controlled by the innodb_doublewrite system variable, come with their own set of problems. Especially on SSDs, writing each page twice can have detrimental effects, such as extra wear on the flash cells.
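The cost of the doublewrite approach is easier to see in a small model. The following Python sketch assumes a simplified scheme in which every page is first written to a separate doublewrite area and only then to its final location, so that a torn page can be restored from the intact copy after a crash. ToyStore, seal and checksum_ok are hypothetical names, and the in-memory dictionaries stand in for real files.

```python
import zlib


def seal(body: bytes) -> bytes:
    """Prefix the body with its CRC32 so that torn copies can be detected."""
    return zlib.crc32(body).to_bytes(4, "big") + body


def checksum_ok(page: bytes) -> bool:
    """A toy page is a 4-byte CRC32 followed by the body it covers."""
    return len(page) > 4 and int.from_bytes(page[:4], "big") == zlib.crc32(page[4:])


class ToyStore:
    """Hypothetical in-memory stand-in for a data file plus a doublewrite area."""

    def __init__(self):
        self.doublewrite = {}  # page_no -> copy that is written first
        self.datafile = {}     # page_no -> page at its final location

    def write_page(self, page_no, body, crash_during_final_write=False):
        page = seal(body)
        # Write 1: the full page goes to the doublewrite area.
        self.doublewrite[page_no] = page
        # Write 2: the page goes to its final location; a crash may tear it.
        if crash_during_final_write:
            old = self.datafile.get(page_no, b"\0" * len(page))
            half = len(page) // 2
            self.datafile[page_no] = page[:half] + old[half:]
        else:
            self.datafile[page_no] = page

    def recover(self):
        """Restore every torn page from its intact doublewrite copy."""
        repaired = []
        for page_no, copy in self.doublewrite.items():
            if not checksum_ok(self.datafile.get(page_no, b"")) and checksum_ok(copy):
                self.datafile[page_no] = copy
                repaired.append(page_no)
        return repaired


store = ToyStore()
store.write_page(7, b"A" * 1024)
store.write_page(7, b"B" * 1024, crash_during_final_write=True)
print(store.recover())  # page numbers that had to be repaired
```

Every call to write_page stores the page twice, which is exactly the overhead that an atomic write guarantee from the storage layer would remove.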
Atomic write - a faster alternative to innodb_doublewrite
A better solution is to directly ask the filesystem to provide an atomic (all or nothing) write guarantee. Currently this is only available on the NVMFS (previously called directFS) filesystem on Fusion-io devices that provide atomic write functionality. This functionality is supported by MariaDB's XtraDB and InnoDB storage engines with MDEV-4338.
Enabling Atomic Writes
To use atomic writes instead of the doublewrite buffer, add:
innodb_use_atomic_writes = 1
to the my.cnf config file.
The following happens when innodb_use_atomic_writes is switched ON:
- if innodb_flush_method is neither O_DIRECT, ALL_O_DIRECT, nor O_DIRECT_NO_FSYNC, it is switched to O_DIRECT
- innodb_use_fallocate is switched ON (files are extended using posix_fallocate rather than writing zeros behind the end of file)
- Whenever an InnoDB datafile is opened, a special ioctl() is issued to switch on atomic writes. If the call fails, an error is logged and returned to the caller. This means that if the system tablespace is not located on an atomic write capable device or filesystem, InnoDB/XtraDB will refuse to start.
- if innodb_doublewrite is set to ON, innodb_doublewrite will be switched OFF and a message written to the error log.
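The adjustments in the list above can be summarised as a small decision function. This Python sketch is not server code: apply_atomic_write_settings is a hypothetical helper, and the log messages are made up. It also assumes that the accepted flush methods are O_DIRECT, ALL_O_DIRECT and O_DIRECT_NO_FSYNC and that the fallback is O_DIRECT. The device-specific ioctl() step is left out because it depends on the storage driver.

```python
from typing import Dict, List, Tuple

# Assumption: flush methods that are left unchanged when atomic writes are on.
DIRECT_FLUSH_METHODS = {"O_DIRECT", "ALL_O_DIRECT", "O_DIRECT_NO_FSYNC"}


def apply_atomic_write_settings(settings: Dict[str, object]) -> Tuple[Dict[str, object], List[str]]:
    """Return adjusted settings and illustrative log lines (hypothetical helper)."""
    result = dict(settings)
    log: List[str] = []

    if not result.get("innodb_use_atomic_writes"):
        return result, log  # nothing changes when atomic writes are off

    # Flush method: force direct I/O unless a direct method is already set.
    if result.get("innodb_flush_method") not in DIRECT_FLUSH_METHODS:
        result["innodb_flush_method"] = "O_DIRECT"  # assumed fallback value
        log.append("innodb_flush_method switched to O_DIRECT")

    # File extension: use posix_fallocate instead of writing zeros.
    result["innodb_use_fallocate"] = True

    # Doublewrite buffer: redundant once the device guarantees atomic writes.
    if result.get("innodb_doublewrite"):
        result["innodb_doublewrite"] = False
        log.append("innodb_doublewrite switched OFF")

    return result, log


adjusted, messages = apply_atomic_write_settings({
    "innodb_use_atomic_writes": True,
    "innodb_flush_method": "fsync",
    "innodb_doublewrite": True,
})
print(adjusted)
print(messages)
```

The input dictionary is copied rather than modified, so the original configuration stays available for comparison.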
Here is a flowchart showing how atomic writes work inside InnoDB: