Data Types for MariaDB Enterprise ColumnStore
This page is part of MariaDB's Documentation.
The parent of this page is: Data Types
Topics on this page:
Overview
MariaDB Enterprise ColumnStore supports a subset of the MariaDB Enterprise Server data types.
Compatibility
MariaDB Enterprise ColumnStore 5
MariaDB Enterprise ColumnStore 6
Supported Data Types
Supported Integer Data Types
In MariaDB Enterprise ColumnStore, the supported integer data types are:
Data Type | Size | Signed Range | Unsigned Range |
---|---|---|---|
1 byte | 0 - 1 | 0 - 1 | |
1 byte | -126 - 127 | 0 - 253 | |
2 bytes | -32766 - 32767 | 0 - 65533 | |
3 bytes | -8388608 - 8388607 | 0 - 16777215 | |
4 bytes | -2147483646 - 2147483647 | 0 - 4294967293 | |
8 bytes | -9223372036854775806 - 9223372036854775807 | 0 - 18446744073709551613 |
Enterprise ColumnStore reserves 2 integer values for internal use, so the maximum integer values are 2 less than the maximum integer values for Enterprise Server.
Enterprise ColumnStore uses "saturation semantics" for integer values. If a value is too big for an integer data type, Enterprise ColumnStore will substitute the maximum value for the specific integer data type. If a value is too small for an integer data type, Enterprise ColumnStore will substitute the minimum value for the specific integer data type.
Supported Fixed/Floating Point Numeric Data Types
In MariaDB Enterprise ColumnStore, the supported fixed and floating point numeric data types are:
Data Type | Aliases | Size | Format |
---|---|---|---|
| Packed fixed-point number | ||
4 bytes | IEEE-754 floating point number | ||
8 bytes | IEEE-754 floating point number |
Enterprise ColumnStore stores DECIMAL, NUMERIC, and FIXED columns as a scaled integer without packing.
Enterprise ColumnStore supports specifying precision (
M
) and scale (D
) for DECIMAL, NUMERIC, and FIXED:In Enterprise ColumnStore 6, the maximum precision (
M
) is 38, and the maximum scale (D
) is 38.In Enterprise ColumnStore 5, the maximum precision (
M
) is 18, and the maximum scale (D
) is 18.When not specified, the default precision (
M
) is10
, and the default scale (D
) is determined by the columnstore_decimal_ system variable.scale
Enterprise ColumnStore does not support specifying precision and scale for
FLOAT
,DOUBLE
, andREAL
. ForFLOAT
, the maximum significant digits is about 6, and the range is +/- 1e38. ForDOUBLE
/REAL
, the maximum significant digits is about 15, and the range is about +/- 1e308.When executing
SELECT
statements bulk loads, Enterprise ColumnStore truncates, rather than rounds, numeric values that have too many decimal places.When executing
INSERT
andUPDATE
statements, Enterprise ColumnStore rounds numeric values that have too many decimal places.
Supported String Data Types
In MariaDB Enterprise ColumnStore, the supported string data types are:
Data Type | Size | Maximum Size |
---|---|---|
1-4 bytes per character | 8000 bytes | |
1-4 bytes per character | 255 bytes |
Enterprise ColumnStore does not support the non-standard extensions
VARCHAR(0)
andCHAR(0)
, whereas other engines may support them.Enterprise ColumnStore supports a maximum of 8000 bytes per
VARCHAR
column, which results in a maximum length for different character sets:For 1-byte character sets (such as
latin1
), Enterprise ColumnStore supports up toVARCHAR(8000)
.For 2-byte character sets (such as
ucs2
), Enterprise ColumnStore supports up toVARCHAR(4000)
.For 3-byte character sets (such as
utf8
), Enterprise ColumnStore supports up toVARCHAR(2666)
.For 4-byte character sets (such as
utf8mb4
), Enterprise ColumnStore supports up toVARCHAR(2000)
.
Enterprise ColumnStore supports a maximum of 255 bytes per
CHAR
column, which results in a maximum length for different character sets:For 1-byte character sets (such as
latin1
), Enterprise ColumnStore supports up toCHAR(255)
.For 2-byte character sets (such as
ucs2
), Enterprise ColumnStore supports up toVARCHAR(127)
.For 3-byte character sets (such as
utf8
), Enterprise ColumnStore supports up toVARCHAR(85)
.For 4-byte character sets (such as
utf8mb4
), Enterprise ColumnStore supports up toVARCHAR(63)
.
Enterprise ColumnStore stores large string values in dictionary extents, which do not support extent elimination, so it is recommended to avoid creating long string columns.
Supported Large Object Data Types
In MariaDB Enterprise ColumnStore, the supported large object data types are:
Data Type | Size |
---|---|
255 bytes | |
255 bytes | |
64 KB | |
64 KB | |
16 MB | |
16 MB | |
16 MB | |
16 MB |
Enterprise ColumnStore supports a maximum size of 16 MB for the
LONGBLOB
andLONGTEXT
data types, whereas other engines support a maximum size of 4 GB.Enterprise ColumnStore stores large object values in dictionary extents, which do not support extent elimination.
Data Type Recommendations
When you are designing a schema for Enterprise ColumnStore, you should follow some best practices to get the best performance.
Choose Appropriate Size for Numeric Data Types
Enterprise ColumnStore can use I/O much more efficiently if you choose the smallest possible size for a numeric data type, because the smaller type would allow Enterprise ColumnStore to fit more values into each extent.
For example, when the values for an integer column are small, queries can perform better when TINYINT is used instead of BIGINT, because more values can fit into each extent with TINYINT.
For best performance, choose the smallest data type possible for a given column.
Avoid Creating Long String Columns
It is recommended to avoid long string columns, because Enterprise ColumnStore can only store 8 bytes for each string value in a column's extent. Multi-byte character sets can cause the effective limit to be even lower. If a string value is longer than the effective limit, the value is stored in a separate dictionary extent, and the column's extent contains a pointer to the value in the dictionary extent.
The effective limit depends on the specific data type used by the column:
For the CHAR data type, a maximum of 8 bytes of the value can be the column's extent.
For the VARCHAR data type, a maximum of 7 bytes of the value can be stored in the column's extent, because 1 byte is required to store the length of the value.
The effective limit also depends on the specific character set used by the column:
For 1-byte character sets, such as
latin1
: if the column is declared asCHAR(8)
orVARCHAR(7)
or shorter, the value is stored in the column's extent.For 2-byte character sets, such as
ucs2
: if the column is declared asCHAR(4)
orVARCHAR(3)
or shorter, the value is stored in the column's extent.For 3-byte character sets, such as
utf8
: if the column is declared asCHAR(2)
orVARCHAR(2)
or shorter, the value is stored in the column's extent.
When a string column is stored in a separate dictionary extent, it creates multiple problems:
When Enterprise ColumnStore reads the values, it will have to read the pointer from the column's extent, and then it will have to read the value from the separate dictionary extent. As a consequence, Enterprise ColumnStore will have to read twice as many extents, so the query will require more I/O.
When Enterprise ColumnStore evaluates join conditions and filter conditions (i.e.,
WHERE
clause), it usually uses extent elimination to reduce I/O. However, extent elimination does not apply to dictionary extents. As a consequence, Enterprise ColumnStore will have to read every column extent, so the query may require even more I/O.
These problems can be avoided by following the recommended best practices:
If a column's values can fit into a non-string data type, declare the column using the non-string data type, rather than the string data type. The specific details depend on the type of data:
If the column's values are compatible with a numeric data type (INT, FLOAT, NUMERIC, etc.), declare the column using the smallest numeric data type that is most appropriate for your data. This could apply to columns storing information like zip codes, product IDs, account numbers, etc.
If the column's values are compatible with a date and time data type (DATE, DATETIME, TIME, TIMESTAMP), declare the column using the date and time data type that is most appropriate for your data.
If a column's values can fit into a 1-byte character set, declare the column using the 1-byte character set.
If a column's values can only fit into a long string data type, try to avoid referencing the column in join conditions and filter conditions (i.e.,
WHERE
clause).