Unicode
Unicode is a standard for encoding text across multiple writing systems. MariaDB 5.5 supports a number of character sets for storing Unicode data:
Character Set | Description | Added |
---|---|---|
ucs2 | UCS-2, each character is represented by a 2-byte code with the most significant byte first. Fixed-length 16-bit encoding. | |
utf8 | UTF-8 encoding using one to three bytes per character. Basic Latin letters, numbers, and punctuation use one byte. European and Middle Eastern letters mostly fit into two bytes. Korean, Chinese, and Japanese ideographs use three bytes. Supplementary characters (those outside the Basic Multilingual Plane) cannot be stored. | |
utf8mb3 | Currently an alias for utf8. | MariaDB 5.5 |
utf8mb4 | Same as utf8, but stores supplementary characters in four bytes. | MariaDB 5.5 |
utf16 | UTF-16. Same as ucs2 for characters in the Basic Multilingual Plane, but supplementary characters are stored in 32 bits (as surrogate pairs). Each character takes 16 or 32 bits. | MariaDB 5.5 |
utf32 | UTF-32, fixed-length 32-bit encoding. | MariaDB 5.5 |
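
The byte counts in the table can be checked outside the database. The following is a minimal sketch in Python, not MariaDB code: it assumes that Python's `utf-8`, `utf-16-be` and `utf-32-be` codecs produce the same byte lengths as the utf8mb4, utf16 and utf32 character sets. The helper names `encoded_lengths` and `fits_in_bmp` are hypothetical and chosen for illustration only.

```python
# Sketch: bytes needed per character in each Unicode encoding form.
# Assumption: Python codecs stand in for the MariaDB character sets
# (utf-8 ~ utf8mb4, utf-16-be ~ utf16, utf-32-be ~ utf32).

SAMPLES = {
    "A": "Basic Latin letter",
    "\u00e9": "European letter (e with acute accent)",
    "\u4e2d": "CJK ideograph",
    "\U0001F600": "supplementary character (outside the BMP)",
}


def encoded_lengths(ch: str) -> dict:
    """Return the number of bytes needed to encode ch in each encoding."""
    return {
        "utf-8": len(ch.encode("utf-8")),
        "utf-16": len(ch.encode("utf-16-be")),
        "utf-32": len(ch.encode("utf-32-be")),
    }


def fits_in_bmp(ch: str) -> bool:
    """True if ch lies in the Basic Multilingual Plane, which is the
    range that the three-byte utf8 (utf8mb3) and ucs2 sets can store."""
    return ord(ch) <= 0xFFFF


if __name__ == "__main__":
    for ch, label in SAMPLES.items():
        print(f"U+{ord(ch):04X} {label}: {encoded_lengths(ch)} "
              f"BMP={fits_in_bmp(ch)}")
```

A character for which `fits_in_bmp` returns `False` needs utf8mb4, utf16 or utf32; it takes four bytes in each of the three.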