Collations in MariaDB Enterprise Server
This page is part of MariaDB's Documentation.
The parent of this page is: Collations
Topics on this page:
Overview
MariaDB Enterprise Server
supports many different collations. A collation is a set of rules for sorting and comparing character sets.Compatibility
The detail provided here applies to:
MariaDB Enterprise Server 10.2
MariaDB Enterprise Server 10.3
MariaDB Enterprise Server 10.4
MariaDB Enterprise Server 10.5
MariaDB Enterprise Server 10.6
Example of Collation
As an example, consider a list of names. The character set defines which characters can appear in each name. The collation defines the order in which the names would be sorted, and whether two names would be equal:
In a case-insensitive collation, "Mueller" and "mueller" would be considered the same name.
In a case-sensitive collation, "Mueller" and "mueller" would be considered different names.
In a German collation, "Mueller" and "Müller" would be considered the same name.
In an English collation, "Mueller" and "Müller" would be considered different names.
In a collation that ignores padding, "Mueller" and "Mueller " would be considered the same name.
In a no-pad collation, "Mueller" and "Mueller " would be considered different names.
Supported Collations
For the full list of supported collations, see MariaDB Enterprise Server 10.6, Collations in Enterprise Server 10.5 and Collations in Enterprise Server 10.4.
The supported collations can also be queried with the SHOW COLLATION statement:
SHOW COLLATION
LIKE 'utf8mb4_%';
+------------------------------+---------+------+---------+----------+---------+
| Collation | Charset | Id | Default | Compiled | Sortlen |
+------------------------------+---------+------+---------+----------+---------+
| utf8mb4_general_ci | utf8mb4 | 45 | Yes | Yes | 1 |
| utf8mb4_bin | utf8mb4 | 46 | | Yes | 1 |
| utf8mb4_unicode_ci | utf8mb4 | 224 | | Yes | 8 |
| utf8mb4_icelandic_ci | utf8mb4 | 225 | | Yes | 8 |
| utf8mb4_latvian_ci | utf8mb4 | 226 | | Yes | 8 |
| utf8mb4_romanian_ci | utf8mb4 | 227 | | Yes | 8 |
| utf8mb4_slovenian_ci | utf8mb4 | 228 | | Yes | 8 |
| utf8mb4_polish_ci | utf8mb4 | 229 | | Yes | 8 |
| utf8mb4_estonian_ci | utf8mb4 | 230 | | Yes | 8 |
| utf8mb4_spanish_ci | utf8mb4 | 231 | | Yes | 8 |
| utf8mb4_swedish_ci | utf8mb4 | 232 | | Yes | 8 |
| utf8mb4_turkish_ci | utf8mb4 | 233 | | Yes | 8 |
| utf8mb4_czech_ci | utf8mb4 | 234 | | Yes | 8 |
| utf8mb4_danish_ci | utf8mb4 | 235 | | Yes | 8 |
| utf8mb4_lithuanian_ci | utf8mb4 | 236 | | Yes | 8 |
| utf8mb4_slovak_ci | utf8mb4 | 237 | | Yes | 8 |
| utf8mb4_spanish2_ci | utf8mb4 | 238 | | Yes | 8 |
| utf8mb4_roman_ci | utf8mb4 | 239 | | Yes | 8 |
| utf8mb4_persian_ci | utf8mb4 | 240 | | Yes | 8 |
| utf8mb4_esperanto_ci | utf8mb4 | 241 | | Yes | 8 |
| utf8mb4_hungarian_ci | utf8mb4 | 242 | | Yes | 8 |
| utf8mb4_sinhala_ci | utf8mb4 | 243 | | Yes | 8 |
| utf8mb4_german2_ci | utf8mb4 | 244 | | Yes | 8 |
| utf8mb4_croatian_mysql561_ci | utf8mb4 | 245 | | Yes | 8 |
| utf8mb4_unicode_520_ci | utf8mb4 | 246 | | Yes | 8 |
| utf8mb4_vietnamese_ci | utf8mb4 | 247 | | Yes | 8 |
| utf8mb4_croatian_ci | utf8mb4 | 608 | | Yes | 8 |
| utf8mb4_myanmar_ci | utf8mb4 | 609 | | Yes | 8 |
| utf8mb4_thai_520_w2 | utf8mb4 | 610 | | Yes | 4 |
| utf8mb4_general_nopad_ci | utf8mb4 | 1069 | | Yes | 1 |
| utf8mb4_nopad_bin | utf8mb4 | 1070 | | Yes | 1 |
| utf8mb4_unicode_nopad_ci | utf8mb4 | 1248 | | Yes | 8 |
| utf8mb4_unicode_520_nopad_ci | utf8mb4 | 1270 | | Yes | 8 |
+------------------------------+---------+------+---------+----------+---------+
Collation Suffixes
There are several different collation suffixes, which refer to collations with different types of rules:
Collation Suffix | Case-Sensitive | Case-Insensitive | Binary | Padded |
---|---|---|---|---|
| No | Yes | No | Yes |
| Yes | No | No | Yes |
| No | No | Yes | Yes |
| No | Yes | No | No |
| Yes | No | No | No |
| No | No | Yes | No |
Server Collation
The server collation is used to determine the default collation of new databases and tables, unless a different collation is specified for a given database or table.
The server collation is determined by the collation_
Checking the Server Collation
You can check the server collation by executing the SHOW SESSION VARIABLES statement:
SHOW SESSION VARIABLES
LIKE 'collation_server';
+------------------+--------------------+
| Variable_name | Value |
+------------------+--------------------+
| collation_server | utf8mb4_unicode_ci |
+------------------+--------------------+
Effect of the Server Collation
The server collation is used to determine the default collation of new databases and tables, unless a different collation is specified for a given database or table. To demonstrate this effect:
Confirm the current server collation with SHOW SESSION VARIABLES:
SHOW SESSION VARIABLES LIKE 'collation_server';
+------------------+--------------------+ | Variable_name | Value | +------------------+--------------------+ | collation_server | utf8mb4_unicode_ci | +------------------+--------------------+
Create a database:
CREATE DATABASE db1;
You can confirm this database's default collation by querying the information_
schema.SCHEMATA table:SELECT DEFAULT_COLLATION_NAME FROM information_schema.SCHEMATA WHERE SCHEMA_NAME='db1';
+------------------------+ | DEFAULT_COLLATION_NAME | +------------------------+ | utf8mb4_unicode_ci | +------------------------+
Also create a table in this database, which has at least one column which contains characters. This column could be a data type such as VARCHAR, CHAR, TEXT, or any of the aliases for those types:
CREATE TABLE db1.tab ( id int PRIMARY KEY, str varchar(50) );
You can confirm that the table's default collation is the same as the database's default collation by querying the information_
schema.TABLES table:SELECT TABLE_COLLATION FROM information_schema.TABLES WHERE TABLE_SCHEMA='db1' AND TABLE_NAME='tab';
+--------------------+ | TABLE_COLLATION | +--------------------+ | utf8mb4_unicode_ci | +--------------------+
You can also confirm the
str
column's collation by querying the information_schema.COLUMNS table:SELECT COLLATION_NAME FROM information_schema.COLUMNS WHERE TABLE_SCHEMA='db1' AND TABLE_NAME='tab' AND COLUMN_NAME='str';
+--------------------+ | COLLATION_NAME | +--------------------+ | utf8mb4_unicode_ci | +--------------------+
Database's Default Collation
A database's default collation is used to determine the collation of new tables within the database, unless a different collation is specified for a given table.
Checking a Database's Default Collation
You can check a database's default collation by querying the information_
SELECT DEFAULT_COLLATION_NAME
FROM information_schema.SCHEMATA
WHERE SCHEMA_NAME='db1';
+------------------------+
| DEFAULT_COLLATION_NAME |
+------------------------+
| utf8mb4_unicode_ci |
+------------------------+
Setting a Database's Default Collation
A database's default collation can be set at creation by executing the CREATE DATABASE statement and specifying the DEFAULT COLLATE
database option:
CREATE DATABASE db1
DEFAULT COLLATE 'utf8mb4_bin';
The new database's default character set will be the character set associated with the specified collation.
Changing a Database's Default Collation
A database's default collation can be set by executing the ALTER DATABASE statement and specifying the DEFAULT COLLATE
database option:
ALTER DATABASE db1
DEFAULT COLLATE 'utf8mb4_bin';
The database's default character set will be the character set associated with the specified collation.
Effect of a Database's Default Collation
A database's default collation is used to determine the default collation of new tables within the database, unless a different collation is specified for a given table. To demonstrate this effect:
Check the database's default collation by querying the information_
schema.SCHEMATA table:SELECT DEFAULT_COLLATION_NAME FROM information_schema.SCHEMATA WHERE SCHEMA_NAME='db1';
+------------------------+ | DEFAULT_COLLATION_NAME | +------------------------+ | utf8mb4_unicode_ci | +------------------------+
Create a table in this database, which has at least one column which contains characters. This column could be a data type such as VARCHAR, CHAR, TEXT, or any of the aliases for those types:
CREATE TABLE db1.tab ( id int PRIMARY KEY, str varchar(50) );
You can confirm that the table's default collation is the same as the database's default collation by querying the information_
schema.TABLES table:SELECT TABLE_COLLATION FROM information_schema.TABLES WHERE TABLE_SCHEMA='db1' AND TABLE_NAME='tab';
+--------------------+ | TABLE_COLLATION | +--------------------+ | utf8mb4_unicode_ci | +--------------------+
You can confirm that the
str
column is using the database's default collation by querying the information_schema.COLUMNS table:SELECT COLLATION_NAME FROM information_schema.COLUMNS WHERE TABLE_SCHEMA='db1' AND TABLE_NAME='tab' AND COLUMN_NAME='str';
+--------------------+ | COLLATION_NAME | +--------------------+ | utf8mb4_unicode_ci | +--------------------+
Table's Default Collation
A table's default collation is used to determine the collation of new columns within the table, unless a different collation is specified for a given column.
Checking a Table's Default Collation
You can check a table's default collation by querying the information_
SELECT TABLE_COLLATION
FROM information_schema.TABLES
WHERE TABLE_SCHEMA='db1'
AND TABLE_NAME='tab';
+--------------------+
| TABLE_COLLATION |
+--------------------+
| utf8mb4_unicode_ci |
+--------------------+
Setting a Table's Default Collation
A table's default collation can be set at creation by executing the CREATE TABLE statement and specifying the DEFAULT COLLATE
table option:
CREATE TABLE db1.tab (
id int PRIMARY KEY,
str varchar(50)
) DEFAULT COLLATE 'utf8mb4_bin';
The new table's default character set will be the character set associated with the specified collation.
Changing a Table's Default Collation
A table's default collation can be set by executing the ALTER TABLE statement and specifying the DEFAULT COLLATE
table option:
ALTER TABLE db1.tab
DEFAULT COLLATE 'utf8mb4_bin';
The table's default character set will be the character set associated with the specified collation.
Effect of a Table's Default Collation
A table's default collation is used to determine the default collation of new columns within the table, unless a different collation is specified for a given column. To demonstrate this effect:
Check the table's default collation by querying the information_
schema.TABLES table:SELECT TABLE_COLLATION FROM information_schema.TABLES WHERE TABLE_SCHEMA='db1' AND TABLE_NAME='tab';
+--------------------+ | TABLE_COLLATION | +--------------------+ | utf8mb4_unicode_ci | +--------------------+
Add a new column in this table by executing the ALTER TABLE ... ADD COLUMN statement. This column could be a data type such as VARCHAR, CHAR, TEXT, or any of the aliases for those types:
ALTER TABLE db1.tab ADD COLUMN new_str VARCHAR(50);
You can confirm this column is using the table's default collation by querying the information_
schema.COLUMNS table:SELECT COLLATION_NAME FROM information_schema.COLUMNS WHERE TABLE_SCHEMA='db1' AND TABLE_NAME='tab' AND COLUMN_NAME='new_str';
+--------------------+ | COLLATION_NAME | +--------------------+ | utf8mb4_unicode_ci | +--------------------+
Column's Collation
A column's collation is used by the column to sort and compare its character data.
Checking a Column's collation
You can check a column's collation by querying the information_
SELECT COLLATION_NAME
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA='db1'
AND TABLE_NAME='tab'
AND COLUMN_NAME='str';
+--------------------+
| COLLATION_NAME |
+--------------------+
| utf8mb4_unicode_ci |
+--------------------+
Setting a Column's Collation
A column's collation can be set at table creation by executing the CREATE TABLE statement and specifying the COLLATE
column option:
CREATE TABLE db1.tab (
id int PRIMARY KEY,
str varchar(50) COLLATE 'utf8mb4_bin'
);
The column's character set will be the character set associated with the specified collation.
Changing a Column's Collation
A column's collation can be changed by executing the ALTER TABLE statement and specifying the MODIFY
clause with the COLLATE
column option:
ALTER TABLE db1.tab
MODIFY str varchar(50)
COLLATE 'utf8mb4_bin';
The column's character set will be the character set associated with the specified collation.
Binary Column Collations
For binary column collations, the collation compares the column's character data by comparing the binary values of each character.
For example, to see how the utf8mb4_bin
collation would handle sorting strings that use the utf8mb4
character set:
Create a table with a column that uses the
utf8mb4_bin
collation by executing the CREATE TABLE statement and specifying theCOLLATE
column option:CREATE TABLE db1.tab ( id int PRIMARY KEY, str varchar(50) COLLATE 'utf8mb4_bin' );
Insert some data into the table:
INSERT INTO db1.tab VALUES (1, 'SkySQL'), (2, 'skysql'), (3, 'SKYSQL'), (4, 'MariaDB'), (5, 'mariadb'), (6, 'MARIADB');
Execute a query that eliminates duplicates and sorts the data:
SELECT DISTINCT str FROM db1.tab ORDER BY str;
+---------+ | str | +---------+ | MARIADB | | MariaDB | | SKYSQL | | SkySQL | | mariadb | | skysql | +---------+
The results show that lower case characters and upper case characters are considered to be different, and upper case characters are sorted before lower case characters, because upper case characters have lower code points (numeric values) than lower case characters.
Case-Insensitive Column Collations
For case-insensitive column collations, the collation compares the column's character data by treating upper case and lower case characters of the same character as equal.
For example, to see how the utf8mb4_unicode_ci
collation would handle sorting strings that use the utf8mb4
character set:
Create a table with a column that uses the
utf8mb4_unicode_ci
collation by executing the CREATE TABLE statement and specifying theCOLLATE
column option:CREATE TABLE db1.tab ( id int PRIMARY KEY, str varchar(50) COLLATE 'utf8mb4_unicode_ci' );
Insert some data into the table:
INSERT INTO db1.tab VALUES (1, 'SkySQL'), (2, 'skysql'), (3, 'SKYSQL'), (4, 'MariaDB'), (5, 'mariadb'), (6, 'MARIADB');
Execute a query that eliminates duplicates and sorts the data:
SELECT DISTINCT str FROM db1.tab ORDER BY str;
+---------+ | str | +---------+ | MariaDB | | SkySQL | +---------+
The results show that lower case characters and upper case characters are considered to be the same, so many of the strings were considered to be duplicates.
Case-Sensitive Column Collations
For case-sensitive column collations, the collation compares the column's character data by treating upper case and lower case characters of the same character as different.
For example, to see how the latin1_general_cs
collation would handle sorting strings that use the latin1
character set:
Create a table with a column that uses the
latin1_general_cs
collation by executing the CREATE TABLE statement and specifying theCOLLATE
column option:CREATE TABLE db1.tab ( id int PRIMARY KEY, str varchar(50) COLLATE 'latin1_general_cs' );
Insert some data into the table:
INSERT INTO db1.tab VALUES (1, 'SkySQL'), (2, 'skysql'), (3, 'SKYSQL'), (4, 'MariaDB'), (5, 'mariadb'), (6, 'MARIADB');
Execute a query that eliminates duplicates and sorts the data:
SELECT DISTINCT str FROM db1.tab ORDER BY str;
+---------+ | str | +---------+ | MARIADB | | MariaDB | | mariadb | | SKYSQL | | SkySQL | | skysql | +---------
The results show that lower case characters and upper case characters are considered to be different, and strings are sorted in lexicographical order.
Connection Collation
The connection collation is used to sort and compare string literals.
The connection collation is configured by the collation_
Setting the Connection Collation
The connection collation for the current connection can be set by changing the collation_
SET SESSION collation_connection='utf8mb4_bin';
Binary Connection Collations
For binary connection collations, the collation compares string literals by comparing the binary values of each character.
For example, to see how the utf8mb4_bin
collation would handle sorting strings that use the utf8mb4
character set:
Change the connection collation to
utf8_bin
with SET SESSION:SET SESSION collation_connection='utf8mb4_bin';
Execute a query that eliminates duplicates from and sorts multiple string literals:
WITH t AS ( SELECT 'SkySQL' AS str UNION SELECT 'skysql' AS str UNION SELECT 'SKYSQL' AS str UNION SELECT 'MariaDB' AS str UNION SELECT 'mariadb' AS str UNION SELECT 'MARIADB' AS str ) SELECT DISTINCT t.str FROM t ORDER BY t.str;
+---------+ | str | +---------+ | MARIADB | | MariaDB | | SKYSQL | | SkySQL | | mariadb | | skysql | +---------+
The results show that lower case characters and upper case characters are considered to be different, and upper case characters are sorted before lower case characters, because upper case characters have lower code points (numeric values) than lower case characters.
Case-Insensitive Connection Collations
For case-insensitive connection collations, the collation compares string literals by treating upper case and lower case characters of the same character as equal.
For example, to see how the utf8mb4_unicode_ci
collation would handle sorting strings that use the utf8mb4
character set:
Change the connection collation to
utf8_unicode_ci
with SET SESSION:SET SESSION collation_connection='utf8mb4_unicode_ci';
Execute a query that eliminates duplicates from and sorts multiple string literals:
WITH t AS ( SELECT 'SkySQL' AS str UNION SELECT 'skysql' AS str UNION SELECT 'SKYSQL' AS str UNION SELECT 'MariaDB' AS str UNION SELECT 'mariadb' AS str UNION SELECT 'MARIADB' AS str ) SELECT DISTINCT t.str FROM t ORDER BY t.str;
+---------+ | str | +---------+ | MariaDB | | SkySQL | +---------+
The results show that lower case characters and upper case characters are considered to be the same, so many of the strings were considered to be duplicates.
Case-Sensitive Connection Collations
For case-sensitive connection collations, the collation compares string literals by treating upper case and lower case characters of the same character as different.
For example, to see how the latin1_general_cs
collation would handle sorting strings that use the latin1
character set:
Change the connection collation to
latin1_general_cs
with SET SESSION:SET SESSION collation_connection='latin1_general_cs';
Execute a query that eliminates duplicates from and sorts multiple string literals:
WITH t AS ( SELECT 'SkySQL' AS str UNION SELECT 'skysql' AS str UNION SELECT 'SKYSQL' AS str UNION SELECT 'MariaDB' AS str UNION SELECT 'mariadb' AS str UNION SELECT 'MARIADB' AS str ) SELECT DISTINCT t.str FROM t ORDER BY t.str;
+---------+ | str | +---------+ | MARIADB | | MariaDB | | mariadb | | SKYSQL | | SkySQL | | skysql | +---------+
The results show that lower case characters and upper case characters are considered to be different, and strings are sorted in lexicographical order.