
Optimizing Tables

Optimize tables for enhanced performance. This section covers various techniques, including proper indexing, data types, and storage engine choices, to improve query speed and efficiency.

OPTIMIZE TABLE

Syntax

OPTIMIZE [NO_WRITE_TO_BINLOG | LOCAL] TABLE
    tbl_name [, tbl_name] ...
    [WAIT n | NOWAIT]

Description

OPTIMIZE TABLE has two main functions. It can either be used to defragment tables, or to update the InnoDB fulltext index.

WAIT/NOWAIT

Set the lock wait timeout. See WAIT and NOWAIT.

Defragmenting

OPTIMIZE TABLE works for InnoDB (before MariaDB 10.1.1, only if the innodb_file_per_table server system variable is set), Aria, MyISAM and ARCHIVE tables, and should be used if you have deleted a large part of a table or if you have made many changes to a table with variable-length rows (tables that have VARCHAR, VARBINARY, BLOB, or TEXT columns). Deleted rows are maintained in a linked list and subsequent INSERT operations reuse old row positions.

This statement requires SELECT and INSERT privileges for the table.

By default, OPTIMIZE TABLE statements are written to the binary log and will be replicated. The NO_WRITE_TO_BINLOG keyword (LOCAL is an alias) will ensure the statement is not written to the binary log.

OPTIMIZE TABLE statements are not logged to the binary log if read_only is set. See also Read-Only Replicas.

OPTIMIZE TABLE is also supported for partitioned tables. You can use [ALTER TABLE](../../../../reference/sql-statements-and-structure/sql-statements/data-definition/alter/alter-table.md) ... OPTIMIZE PARTITION to optimize one or more partitions.

You can use OPTIMIZE TABLE to reclaim the unused space and to defragment the data file. With other storage engines, OPTIMIZE TABLE does nothing by default, and returns this message: "The storage engine for the table doesn't support optimize". However, if the server has been started with the --skip-new option, OPTIMIZE TABLE is linked to ALTER TABLE, and recreates the table. This operation frees the unused space and updates index statistics.

The Aria storage engine supports progress reporting for this statement.

If a MyISAM table is fragmented, concurrent inserts will not be performed until an OPTIMIZE TABLE statement is executed on that table, unless the concurrent_insert server system variable is set to ALWAYS.
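
A few typical invocations, for illustration only (the table and partition names here are hypothetical):

-- Defragment a table after deleting many rows:
OPTIMIZE TABLE tbl_orders;

-- Avoid writing the statement to the binary log:
OPTIMIZE NO_WRITE_TO_BINLOG TABLE tbl_orders;

-- Fail immediately instead of waiting for metadata locks:
OPTIMIZE TABLE tbl_orders NOWAIT;

-- Optimize a single partition of a partitioned table:
ALTER TABLE tbl_orders_part OPTIMIZE PARTITION p2024;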

Updating an InnoDB fulltext index

When rows are added or deleted to an InnoDB fulltext index, the index is not immediately re-organized, as this can be an expensive operation. Change statistics are stored in a separate location. The fulltext index is only fully re-organized when an OPTIMIZE TABLE statement is run.

By default, an OPTIMIZE TABLE will defragment a table. In order to use it to update fulltext index statistics, the innodb_optimize_fulltext_only system variable must be set to 1. This is intended to be a temporary setting and should be reset to 0 once the fulltext index has been re-organized.

Since fulltext re-organization can take a long time, the innodb_ft_num_word_optimize variable limits the re-organization to a number of words (2000 by default). You can run multiple OPTIMIZE statements to fully re-organize the index.
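
For example, a fulltext maintenance pass might look like this (the table name is hypothetical):

-- Temporarily make OPTIMIZE TABLE work on the fulltext index instead of defragmenting:
SET GLOBAL innodb_optimize_fulltext_only = 1;

-- Optionally process more words per run (the default is 2000):
SET GLOBAL innodb_ft_num_word_optimize = 10000;

-- Repeat until the index has been fully processed:
OPTIMIZE TABLE articles;

-- Restore the default behaviour:
SET GLOBAL innodb_optimize_fulltext_only = 0;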

Defragmenting InnoDB tablespaces

MariaDB 10.1.1 merged the Facebook/Kakao defragmentation patch, allowing one to use OPTIMIZE TABLE to defragment InnoDB tablespaces. For this functionality to be enabled, the innodb_defragment system variable must be enabled. No new tables are created and there is no need to copy data from old tables to new tables. Instead, this feature loads n pages (determined by innodb_defragment_n_pages) and tries to move records so that pages would be full of records and then frees pages that are fully empty after the operation. Note that tablespace files (including ibdata1) will not shrink as the result of defragmentation, but one will get better memory utilization in the InnoDB buffer pool as there are fewer data pages in use.
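
A minimal sketch, assuming the variable can be changed at runtime in your version (otherwise set innodb-defragment=1 in the server configuration and restart); the table name is hypothetical:

SET GLOBAL innodb_defragment = 1;
OPTIMIZE TABLE some_innodb_table;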

See Defragmenting InnoDB Tablespaces for more details.

See Also

  • Optimize Table in InnoDB with ALGORITHM set to INPLACE

  • Optimize Table in InnoDB with ALGORITHM set to NOCOPY

  • Optimize Table in InnoDB with ALGORITHM set to INSTANT

This page is licensed: GPLv2, originally from fill_help_tables.sql


Entity-Attribute-Value Implementation

The desires

  • Open-ended set of "attributes" (key=value) for each "entity". That is, the list of attributes is not known at development time, and will grow in the future. (This makes one column per attribute impractical.)

  • "ad hoc" queries testing attributes.

  • Attribute values come in different types (numbers, strings, dates, etc.)

  • Scale to lots of entities, yet perform well.

It goes by various names

  • EAV -- Entity - Attribute - Value

  • key-value

  • RDF -- This is a flavor of EAV

  • MongoDB, CouchDB -- and others -- Not SQL-based.

  • MariaDB has dynamic columns that look something like the solution below, with the added advantage of being able to index the columns otherwise hidden in the blob. (There are caveats.)

  • MySQL 5.7 has a JSON datatype, plus functions to access parts.

Bad solution

  • Table with 3 columns: entity_id, key, value

  • The "value" is a string, or maybe multiple columns depending on datatype or other kludges.

  • a JOIN b ON a.entity=b.entity AND b.key='x' JOIN c ON ... WHERE a.value=... AND b.value=...

The problems

  • The SELECTs get messy -- multiple JOINs

  • Datatype issues -- It's clumsy to be putting numbers into strings

  • Numbers stored in VARCHARs do not compare 'correctly', especially for range tests.

  • Dedupping the values is clumsy.

  • Bulky.

A solution

Decide which columns need to be searched/sorted by SQL queries. No, you don't need all the columns to be searchable or sortable. Certain columns are frequently used for selection; identify these. You probably won't use all of them in all queries, but you will use some of them in every query.

The solution uses one table for all the EAV stuff. The columns include the searchable fields, plus one BLOB. Searchable fields are declared appropriately (INT, TIMESTAMP, etc). The BLOB contains a JSON-encoding of all the extra fields.

The table should be InnoDB, hence it should have a PRIMARY KEY. The entity_id is the 'natural' PK. Add a small number of other indexes (often 'composite') on the searchable fields. PARTITIONing is unlikely to be of any use, unless the Entities should be purged after some time. (Example: News Articles)
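
A minimal sketch of such a table; the column names other than entity_id are hypothetical examples of 'searchable fields':

CREATE TABLE entities (
    entity_id BIGINT UNSIGNED NOT NULL,
    category  VARCHAR(50) NOT NULL,                           -- searchable field
    created   TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,   -- searchable field
    attrs     MEDIUMBLOB,                                     -- COMPRESSed JSON with all the extra attributes
    PRIMARY KEY (entity_id),
    KEY ix_cat_created (category, created)                    -- composite index on searchable fields
) ENGINE=InnoDB;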

But what about the ad hoc queries?

You have included the most important fields to search on -- date, category, etc. These should filter the data down significantly. When you also need to filter on something more obscure, that will be handled differently. The application code will look at the BLOB for that; more on this later.
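
For example (using the hypothetical table above), an ad hoc query filters on the indexed fields in SQL and leaves the obscure attributes to the application:

SELECT entity_id, attrs
FROM entities
WHERE category = 'news'
  AND created > NOW() - INTERVAL 7 DAY;
-- The application uncompresses and decodes attrs, then applies any further filtering itself.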

Why it works

  • You are not really going to search on more than a few fields.

  • The one table has one row per entity, and can grow as needed. (EAV needs many rows per entity.)

  • The disk footprint is smaller; Smaller --> More cacheable --> Faster

  • It needs no JOINs

  • The indexes are useful

  • Performance is as good as the indexes you have on the 'searchable fields'.

Details on the BLOB/JSON

  • Build the extra (or all) key-value pairs in a hash (associative array) in your application. Encode it. COMPRESS it. Insert that string into the BLOB.

  • JSON is recommended, but not mandatory; it is simpler than XML. Other serializations (eg, YAML) could be used.

  • COMPRESS the JSON and put it into a BLOB (or MEDIUMBLOB) instead of a TEXT field. Compression gives about 3x shrinkage.

  • Optionally, you can duplicate the indexed fields in the BLOB.

  • Values missing from 'searchable fields' would need to be NULL (or whatever), and the code would need to deal with such.

  • When SELECTing, UNCOMPRESS the blob. Decode the string into a hash. You are now ready to interrogate/display any of the extra fields.

  • If you choose to use the JSON features of MariaDB or 5.7, you will have to forgo the compression feature described.

  • MySQL 5.7.8's native JSON datatype uses a binary format for more efficient access.
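
A sketch of that round trip in SQL, using the hypothetical entities table above (in practice the JSON encoding/decoding is done in the application):

-- Store the compressed JSON in the BLOB:
INSERT INTO entities (entity_id, category, attrs)
VALUES (1, 'news', COMPRESS('{"author":"rjames","words":1234,"lang":"en"}'));

-- Fetch and uncompress; the application then decodes the JSON into a hash:
SELECT entity_id, UNCOMPRESS(attrs) AS attrs_json
FROM entities
WHERE entity_id = 1;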

Conclusions

  • Schema is reasonably compact (compression, real datatypes, less redundancy, etc, than EAV)

  • Queries are fast (since you have picked 'good' indexes)

  • Range tests work (unlike storing INTs in VARCHARs)

  • Expandable (JSON is happy to have new fields)

  • Compatible (No 3rd party products, just supported products)

  • (Drawback) Cannot use the non-indexed attributes in WHERE or ORDER BY clauses; must deal with that in the app. (MySQL 5.7 partially alleviates this.)

Postlog

Posted Jan, 2014; Refreshed Feb, 2016.

  • MariaDB's Dynamic Columns

  • MySQL 5.7's JSON

This looks very promising; I will need to do more research to see how much of this article is obviated by it: Using MySQL as a Document Store in 5.7, more DocStore discussion.

If you insist on EAV, set optimizer_search_depth=1.

See also

Rick James graciously allowed us to use this article in the documentation.

Rick James' site has other useful tips, how-tos, optimizations, and debugging tips.

Original source: eav

This page is licensed: CC BY-SA / Gnu FDL


    IP Range Table Performance

    The situation

    Your data includes a large set of non-overlapping 'ranges'. These could be IP addresses, datetimes (show times for a single station), zipcodes, etc.

    You have pairs of start and end values; one 'item' belongs to each such 'range'. So, instinctively, you create a table with the start and end of the range, plus info about the item. Your queries involve a WHERE clause that checks whether an address falls between the start and end values.
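
    A sketch of that instinctive design, using an IPv4 example (the table and column names are hypothetical):

    CREATE TABLE ip_blocks_naive (
        ip_start INT UNSIGNED NOT NULL,
        ip_end   INT UNSIGNED NOT NULL,
        owner    MEDIUMINT UNSIGNED NOT NULL,
        PRIMARY KEY (ip_start)
    );

    -- "Which item owns 11.22.33.44?"
    SELECT owner
    FROM ip_blocks_naive
    WHERE INET_ATON('11.22.33.44') BETWEEN ip_start AND ip_end;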

    The problem

    Once you get a large set of items, performance degrades. You play with the indexes, but find nothing that works well. The indexes fail to lead to optimal functioning because the database does not understand that the ranges are non-overlapping.

    The solution

    I will present a solution that enforces the fact that items cannot have overlapping ranges. The solution builds a table to take advantage of that, then uses Stored Routines to get around the clumsiness imposed by it.

    Performance

    The instinctive solution often leads to scanning half the table to do just about anything, such as finding the item containing an 'address'. In complexity terms, this is Order(N).

    The solution here can usually get the desired information by fetching a single row, or a small number of rows. It is Order(1).

    In a large table, "counting the disk hits" is the important part of performance. Since InnoDB is used, and the PRIMARY KEY (clustered) is used, most operations hit only 1 block.

    Finding the 'block' where a given IP address lives:

    • For start of block: One single-row fetch using the PRIMARY KEY

    • For end of block: Ditto. The record containing this will be 'adjacent' to the other record.
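
    As a sketch (assuming the Ips table described under Details below, with columns ip and owner), both lookups are single probes on the clustered PRIMARY KEY:

    -- Owner of the block containing a given address:
    SELECT owner
    FROM Ips
    WHERE ip <= INET_ATON('11.22.33.44')
    ORDER BY ip DESC
    LIMIT 1;

    -- Start of the next block, which is also the end of the current one:
    SELECT ip
    FROM Ips
    WHERE ip > INET_ATON('11.22.33.44')
    ORDER BY ip ASC
    LIMIT 1;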

    For allocating or freeing a block:

    • 2-7 SQL statements, hitting the clustered PRIMARY KEY for the rows containing and immediately adjacent to the block.

    • One SQL statement is a DELETE; it hits as many rows as are needed for the block.

    • The other statements hit one row each.

    Design decisions

    This is crucial to the design and its performance:

    • Having just one address in the row. These were alternative designs; they seemed to be no better, and possibly worse:

    • That one address could have been the 'end' address.

    • The routine parameters for a 'block' could have been the start of this block and the start of the next block.

    • The IPv4 parameters could have been dotted quads; I chose to keep the reference implementation simpler instead.

    • The IPv6 parameters are 32-digit hex because that was simpler than BINARY(16) or IPv6 strings for a reference implementation.

    The interesting work is in the Ips table, not the second table, so I focus on it. The inconvenience of JOINing to the second table is small compared to the performance gains.

    Details

    Two, not one, tables will be used. The first table (Ips in the reference implementations) is carefully designed to be optimal for all the basic operations needed. The second table contains other information about the 'owner' of each 'item'. In the reference implementations, owner is an id used to JOIN the two tables. This discussion centers around Ips and how to efficiently map IP(s) to/from owner(s). The second table has "PRIMARY KEY(owner)".

    In addition to the two-table schema, there are a set of Stored Routines to encapsulate the necessary code.

    One row of Ips represents one 'item' by specifying the starting IP address and the 'owner'. The next row gives the starting IP address of the next "address block", thereby indirectly providing the ending address for the current block.

    This lack of explicitly stating the "end address" leads to some clumsiness. The stored routines hide it from the user.

    A special owner (indicated by '0') is reserved for "free" or "not-owned" blocks. Hence, sparse allocation of address blocks is no problem. Also, the 'free' owner is handled no differently than real owners, so there are no extra Stored Routines for such.
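
    For IPv4, the Ips table itself can be as simple as the following sketch (the reference implementation linked below may differ in details):

    CREATE TABLE Ips (
        ip    INT UNSIGNED NOT NULL,       -- start of this block; the next row's ip is its end
        owner MEDIUMINT UNSIGNED NOT NULL, -- 0 = 'free'
        PRIMARY KEY (ip)
    ) ENGINE=InnoDB;

    -- Initialize to one row, meaning "all addresses are free":
    INSERT INTO Ips (ip, owner) VALUES (0, 0);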

    Links below give "reference" implementations for IPv4 and IPv6. You will need to make changes for non-IP situations, and may need to make changes even for IP situations.

    These are the main stored routines provided:

    • IpIncr, IpDecr -- for adding/subtracting 1

    • IpStore -- for allocating/freeing a range

    • IpOwner, IpRangeOwners, IpFindRanges, Owner2IpStarts, Owner2IpRanges -- for lookups

    • IpNext, IpEnd -- IP of start of next block, or end of current block

    None of the provided routines JOIN to the other table; you may wish to develop custom queries based on the given reference Stored Procedures.

    The Ips table's size is proportional to the number of blocks. A million 'owned' blocks may be 20-50MB. This varies due to

    • number of 'free' gaps (between zero and the number of owned blocks)

    • datatypes used for ip and owner

    • overhead

    Even 100M blocks is quite manageable in today's hardware. Once things are cached, most operations would take only a few milliseconds. A trillion blocks would work, but most operations would hit the disk a few times -- only a few times.

    Reference implementation of IPv4

    This is specific to IPv4 (32-bit, a la '196.168.1.255'). It can handle anywhere from 'nothing assigned' (1 row) to 'everything assigned' (4B rows) 'equally' well. That is, to ask the question "who owns '11.22.33.44'" is equally efficient regardless of how many blocks of IP addresses exist in the table. (OK, caching, disk hits, etc may make a slight difference.) The one function that can vary is the one that reassigns a range to a new owner. Its speed is a function of how many existing ranges need to be consumed, since those rows will be DELETEd. (It helps that they are, by schema design, 'clustered'.)

    Notes on the reference implementation for IPv4:

    • Externally, the user may use the dotted quad notation (11.22.33.44), but needs to convert to INT UNSIGNED for calling the Stored Procs.

    • The user is responsible for converting to/from the calling datatype (INT UNSIGNED) when accessing the stored routine; suggest INET_ATON() / INET_NTOA().

    • The internal datatype for addresses is the same as the calling datatype (INT UNSIGNED).

    • Adding and subtracting 1 (simple arithmetic).

    • The datatype of an 'owner' (MEDIUMINT UNSIGNED: 0..16M) -- adjust if needed.

    • The address "Off the end" (255.255.255.255+1) is represented as NULL.

    • The table is initialized to one row: (ip=0, owner=0), meaning "all addresses are free". See the comments in the code for more details.

    (The reference implementation does not handle CIDRs. Such should be easy to add on, by first turning it into an IP range.)
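
    The IpIncr/IpDecr helpers for IPv4 can be as small as this sketch (the function bodies shown here are an assumption, not the reference code):

    DELIMITER //

    -- Increment an IPv4 address; "off the end" (255.255.255.255 + 1) becomes NULL.
    CREATE FUNCTION IpIncr(addr INT UNSIGNED) RETURNS INT UNSIGNED DETERMINISTIC
        RETURN IF(addr = 4294967295, NULL, addr + 1) //

    -- Decrement an IPv4 address.
    CREATE FUNCTION IpDecr(addr INT UNSIGNED) RETURNS INT UNSIGNED DETERMINISTIC
        RETURN addr - 1 //

    DELIMITER ;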

    Reference implementation of IPv6

    The code for handling IPv6 addresses is more complex, but the overall structure is the same as for IPv4. Launch into it only if you need IPv6.

    Notes on the reference implementation for IPv6:

    • Externally, IPv6 has a complex string, VARCHAR(39) CHARACTER SET ASCII. The Stored Procedure IpStr2Hex() is provided.

    • The user is responsible for converting to/from the calling datatype (BINARY(16)) when accessing the stored routine; suggest INET6_ATON() / INET6_NTOA().

    • The internal datatype for addresses is the same as the calling datatype (BINARY(16)).

    • Inside the Procedures, and in the Ips table, an address is stored as BINARY(16) for efficiency. HEX() and UNHEX() are used at the boundaries.

    • Communication with the Stored routines is via 32-char hex strings.

    • Adding/subtracting 1 is rather complex (see the code).

    • The datatype of an 'owner' (MEDIUMINT UNSIGNED: 0..16M); 'free' is represented by 0. You may need a bigger datatype.

    • The address "Off the end" (ffff.ffff.ffff.ffff.ffff.ffff.ffff.ffff+1) is represented by NULL.

    • The table is initialized to one row: (UNHEX('00000000000000000000000000000000'), 0), meaning "all addresses are free".

    • You may need to decide on a canonical representation of IPv4 in IPv6. See the comments in the code for more details.

    The INET6* functions were first available in MySQL 5.6.3 and are also available in MariaDB.
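
    For example, converting between the human-readable form and the 32-char hex strings used by the routines can be done with the INET6 functions:

    -- Readable IPv6 address to a 32-char hex string:
    SET @hex := HEX(INET6_ATON('2001:db8::1'));

    -- And back again:
    SELECT INET6_NTOA(UNHEX(@hex));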

    Adapting to a different non-IP 'address range' data

    • The external datatype for an 'address' should be whatever is convenient for the application.

    • The datatype for the 'address' in the table must be ordered, and should be as compact as possible.

    • You must write the Stored functions (IpIncr, IpDecr) for incrementing/decrementing an 'address'.

    • An 'owner' is an id of your choosing, but smaller is better.

    • A special value (such as 0 or '') must be provided for 'free'.

    • The table must be initialized to one row: (SmallestAddress, Free).

    "Owner" needs a special value to represent "not owned". The reference implementations use "=" and "!=" to compare two 'owners'. Numeric values and strings work nicely with those operators; NULL does not. Hence, please do not use NULL for "not owned".

    Since the datatypes are pervasive in the stored routines, adapting a reference implementation to a different concept of 'address' would require multiple minor changes.

    The code enforces that consecutive blocks never have the same 'owner', so the table is of 'minimal' size. Your application can assume that such is always the case.

    Postlog

    Original writing -- Oct, 2012; Notes on INET6 functions -- May, 2015.

    See also

    • Related blog

    • Another approach

    • Free IP tables

    Rick James graciously allowed us to use this article in the documentation.

    Rick James' site has other useful tips, how-tos, optimizations, and debugging tips.

    Original source: ipranges

    This page is licensed: CC BY-SA / Gnu FDL


    Defragmenting InnoDB Tablespaces

    Overview

    When rows are deleted from an InnoDB table, the rows are simply marked as deleted and not physically deleted. The free space is not returned to the operating system for re-use.

    The purge thread will physically delete index keys and rows, but the free space introduced is still not returned to the operating system. This can lead to gaps in the pages. If you have variable-length rows, new rows may be larger than old rows and cannot make use of the available space.

    You can run OPTIMIZE TABLE or ALTER TABLE ... ENGINE=InnoDB to reconstruct the table. Unfortunately, running OPTIMIZE TABLE against an InnoDB table stored in the shared tablespace file ibdata1 does two things:

    • Makes the table’s data and indexes contiguous inside ibdata1.

    • Increases the size of ibdata1 because the contiguous data and index pages are appended to ibdata1.

    InnoDB Defragmentation

    The feature described below has been deprecated in MariaDB 11.0 and was removed in MariaDB 11.1.0. See MDEV-30544 and MDEV-30545.

    MariaDB 10.1.1 merged Facebook's defragmentation code prepared for MariaDB by Matt, Seong Uck Lee from Kakao. The only major difference between Facebook's code and Matt's patch is that MariaDB does not introduce new literals to SQL and makes no changes to the server code. Instead, OPTIMIZE TABLE is used and all code changes are inside the InnoDB/XtraDB storage engines.

    The behaviour of OPTIMIZE TABLE is unchanged by default, and to enable this new feature, you need to set the innodb_defragment system variable to 1.

    No new tables are created and there is no need to copy data from old tables to new tables. Instead, this feature loads n pages (determined by innodb_defragment_n_pages) and tries to move records so that pages would be full of records and then frees pages that are fully empty after the operation.

    Note that tablespace files (including ibdata1) will not shrink as the result of defragmentation, but one will get better memory utilization in the InnoDB buffer pool as there are fewer data pages in use.

    A number of new system and status variables for controlling and monitoring the feature are introduced.
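
    On a server that still includes the feature, the variables and counters can be listed with the usual SHOW commands:

    SHOW GLOBAL VARIABLES LIKE 'innodb_defragment%';
    SHOW GLOBAL STATUS LIKE 'Innodb_defragment%';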

    System Variables

    • innodb_defragment: Enable InnoDB defragmentation.

    • innodb_defragment_n_pages: Number of pages considered at once when merging multiple pages to defragment.

    • innodb_defragment_stats_accuracy: Number of defragment stats changes there are before the stats are written to persistent storage.

    • innodb_defragment_fill_factor_n_recs: Number of records of space that defragmentation should leave on the page.

    • innodb_defragment_fill_factor: Indicates how full defragmentation should fill a page.

    • innodb_defragment_frequency: Maximum times per second for defragmenting a single index.

    Status Variables

    • Innodb_defragment_compression_failures: Number of defragment re-compression failures.

    • Innodb_defragment_failures: Number of defragment failures.

    • Innodb_defragment_count: Number of defragment operations.

    Example

    Enable the feature in the server configuration, then create a table and insert enough rows to produce some fragmentation:

    [mysqld]
    ...
    innodb-defragment=1
    SET @@global.innodb_file_per_table = 1;
    SET @@global.innodb_defragment_n_pages = 32;
    SET @@global.innodb_defragment_fill_factor = 0.95;
    CREATE TABLE tb_defragment (
    pk1 BIGINT(20) NOT NULL,
    pk2 BIGINT(20) NOT NULL,
    fd4 TEXT,
    fd5 VARCHAR(50) DEFAULT NULL,
    PRIMARY KEY (pk1),
    KEY ix1 (pk2)
    ) ENGINE=InnoDB;
     
    DELIMITER //
    CREATE PROCEDURE innodb_insert_proc (repeat_count INT)
    BEGIN
      DECLARE current_num INT;
      SET current_num = 0;
      WHILE current_num < repeat_count DO
        INSERT INTO tb_defragment VALUES (current_num, 1, REPEAT('Abcdefg', 20), REPEAT('12345',5));
        INSERT INTO tb_defragment VALUES (current_num+1, 2, REPEAT('HIJKLM', 20), REPEAT('67890',5));
        INSERT INTO tb_defragment VALUES (current_num+2, 3, REPEAT('HIJKLM', 20), REPEAT('67890',5));
        INSERT INTO tb_defragment VALUES (current_num+3, 4, REPEAT('HIJKLM', 20), REPEAT('67890',5));
        SET current_num = current_num + 4;
      END WHILE;
    END//
    DELIMITER ;
    COMMIT;
     
    SET autocommit=0;
    CALL innodb_insert_proc(50000);
    COMMIT;
    SET autocommit=1;

    After these CREATE and INSERT operations, the following information can be seen from the INFORMATION_SCHEMA:

    SELECT COUNT(*) AS Value FROM information_schema.innodb_buffer_page 
      WHERE table_name LIKE '%tb_defragment%' AND index_name = 'PRIMARY';
    Value
    313
     
    SELECT COUNT(*) AS Value FROM information_schema.innodb_buffer_page 
      WHERE table_name LIKE '%tb_defragment%' AND index_name = 'ix1';
    Value
    72
     
    SELECT COUNT(stat_value) FROM mysql.innodb_index_stats 
      WHERE table_name LIKE '%tb_defragment%' AND stat_name IN ('n_pages_freed');
    COUNT(stat_value)
    0
     
    SELECT COUNT(stat_value) FROM mysql.innodb_index_stats 
      WHERE table_name LIKE '%tb_defragment%' AND stat_name IN ('n_page_split');
    COUNT(stat_value)
    0
     
    SELECT COUNT(stat_value) FROM mysql.innodb_index_stats 
      WHERE table_name LIKE '%tb_defragment%' AND stat_name IN ('n_leaf_pages_defrag');
    COUNT(stat_value)
    0
     
    SELECT table_name, data_free/1024/1024 AS data_free_MB, table_rows FROM information_schema.tables 
      WHERE engine LIKE 'InnoDB' AND table_name LIKE '%tb_defragment%';
    TABLE_NAME data_free_MB table_rows
    tb_defragment 4.00000000 50051
     
    SELECT table_name, index_name, SUM(number_records), SUM(data_size) FROM information_schema.innodb_buffer_page 
      WHERE table_name LIKE '%tb_defragment%' AND index_name LIKE 'PRIMARY';
    TABLE_NAME index_name SUM(number_records) SUM(data_size)
    `test`.`tb_defragment` PRIMARY 25873 4739939
     
    SELECT table_name, index_name, SUM(number_records), SUM(data_size) FROM information_schema.innodb_buffer_page 
      WHERE table_name LIKE '%tb_defragment%' AND index_name LIKE 'ix1';
    TABLE_NAME index_name SUM(number_records) SUM(data_size)
    `test`.`tb_defragment` ix1 50071 1051775

    Deleting three-quarters of the records, leaving gaps, and then optimizing:

    DELETE FROM tb_defragment WHERE pk2 BETWEEN 2 AND 4;
     
    OPTIMIZE TABLE tb_defragment;
    TABLE	Op	Msg_type	Msg_text
    test.tb_defragment	OPTIMIZE	status	OK
    SHOW status LIKE '%innodb_def%';
    Variable_name	Value
    Innodb_defragment_compression_failures	0
    Innodb_defragment_failures	1
    Innodb_defragment_count	4

    Now some pages have been freed, and some merged:

    SELECT COUNT(*) AS Value FROM information_schema.innodb_buffer_page 
      WHERE table_name LIKE '%tb_defragment%' AND index_name = 'PRIMARY';
    Value
    0
     
    SELECT COUNT(*) AS Value FROM information_schema.innodb_buffer_page 
      WHERE table_name LIKE '%tb_defragment%' AND index_name = 'ix1';
    Value
    0
     
    SELECT COUNT(stat_value) FROM mysql.innodb_index_stats 
      WHERE table_name LIKE '%tb_defragment%' AND stat_name IN ('n_pages_freed');
    COUNT(stat_value)
    2
     
    SELECT COUNT(stat_value) FROM mysql.innodb_index_stats 
      WHERE table_name LIKE '%tb_defragment%' AND stat_name IN ('n_page_split');
    COUNT(stat_value)
    2
     
    SELECT COUNT(stat_value) FROM mysql.innodb_index_stats 
      WHERE table_name LIKE '%tb_defragment%' AND stat_name IN ('n_leaf_pages_defrag');
    COUNT(stat_value)
    2
     
    SELECT table_name, data_free/1024/1024 AS data_free_MB, table_rows FROM information_schema.tables 
      WHERE engine LIKE 'InnoDB';
    TABLE_NAME data_free_MB table_rows
    innodb_index_stats 0.00000000 8
    innodb_table_stats 0.00000000 0
    tb_defragment 4.00000000 12431
     
    SELECT table_name, index_name, SUM(number_records), SUM(data_size) FROM information_schema.innodb_buffer_page 
      WHERE table_name LIKE '%tb_defragment%' AND index_name LIKE 'PRIMARY';
    TABLE_NAME index_name SUM(number_records) SUM(data_size)
    `test`.`tb_defragment` PRIMARY 690 102145
     
    SELECT table_name, index_name, SUM(number_records), SUM(data_size) FROM information_schema.innodb_buffer_page 
      WHERE table_name LIKE '%tb_defragment%' AND index_name LIKE 'ix1';
    TABLE_NAME index_name SUM(number_records) SUM(data_size)
    `test`.`tb_defragment` ix1 5295 111263

    See Defragmenting unused space on InnoDB tablespace on the MariaDB.org blog for more details.

    This page is licensed: CC BY-SA / Gnu FDL