VARCHAR

Complete VARCHAR reference: VARCHAR(M) syntax, length limits (0-65532 per row), CHARACTER SET/COLLATE options, indexing rules, and trailing spaces.

Syntax

[NATIONAL] VARCHAR(M) [CHARACTER SET charset_name] [COLLATE collation_name]

Description

A variable-length string. M represents the maximum column length in characters. The range of M is 0 to 65,532. The effective maximum length of a VARCHAR is subject to the maximum row size and the character set used. For example, utf-8 characters can require up to three bytes per character, so a VARCHAR column that uses the utf-8 character set can be declared to be a maximum of 21,844 characters.

VARCHAR is shorthand for CHARACTER VARYING. NATIONAL VARCHAR is the standard SQL way to define that a VARCHAR column should use some predefined character set. MariaDB uses utf-8 as its predefined character set, as does MySQL. NVARCHAR is shorthand for NATIONAL VARCHAR.

MariaDB stores VARCHAR values as a one-byte or two-byte length prefix plus data. The length prefix indicates the number of bytes in the value. A VARCHAR column uses one length byte if values require no more than 255 bytes, two length bytes if values may require more than 255 bytes.

MariaDB follows the standard SQL specification and does not remove trailing spaces from VARCHAR values.

If a unique index consists of a column where trailing pad characters are stripped or ignored, inserts into that column where values differ only by the number of trailing pad characters will result in a duplicate-key error.

For the ColumnStore engine, M represents the maximum column length in bytes.

For MariaDB, a number of NO PAD collations are available.

VARCHAR(0) columns can contain 2 values: an empty string or NULL. Such columns cannot be part of an index. The CONNECT storage engine does not support VARCHAR(0).

Use Cases for Zero Length

A CHAR(0) or VARCHAR(0) column occupies minimal space and is restricted to two possible values: an empty string ('') or NULL. You can use these columns for the following purposes:

  • Legacy Compatibility: Include these columns to maintain compatibility with older applications that require a specific table schema, even if the data is no longer collected.

  • Two-State Flags: A CHAR(0) NULL column can function as a boolean indicator. It uses only one bit of storage to distinguish between a "set" state (the empty string) and an "unset" state (NULL).

  • Row Marking: You can use a CHAR(0) column to mark a specific row in a table. For example, if you require only one "active" row, set that row to an empty string while keeping all other rows NULL.

The following error occurs if you attempt to insert any character data into a 0-length column: ERROR 1406 (22001): Data too long for column.

SYNONYMS

The following are synonyms for VARCHAR:

  • CHAR VARYING

  • CHARACTER VARYING

  • VARCHAR2

  • VARCHARACTER

Examples

The following are equivalent:

Trailing spaces

VARCHAR

For our example of VARCHAR, we picked a maximum size that avoids overflowing the maximum row size (65535). Keep in mind that a multi-byte character set would need more space in the row than a single-byte character set. We also avoid the auto-conversion of a VARCHAR into a TEXT, MEDIUMTEXT, or LONGTEXT that can happen when STRICT_TRANS_TABLES is not set in the SQL_MODE.

The example:

Data too Long

When SQL_MODE is strict (the default), a value is considered "too long" when its length exceeds the size of the data type, and an error is generated.

Example of data too long behavior for VARCHAR:

Truncation

  • Depending on whether or not strict sql mode is set, you will either get a warning or an error if you try to insert a string that is too long into a VARCHAR column. If the extra characters are spaces, the spaces that can't fit will be removed, and you will always get a warning, regardless of the sql mode setting.

Difference Between VARCHAR and TEXT

  • VARCHAR columns can be fully indexed. TEXT columns can only be indexed over a specified length.

  • Using TEXT or BLOB in a SELECT query that uses temporary tables for storing intermediate results will force the temporary table to be disk-based (using the Aria storage engine instead of the memory storage engine, which is a bit slower. This is not that bad, as the Aria storage engine caches the rows in memory. To get the benefit of this, one should ensure that the aria_pagecache_buffer_size variable is big enough to hold most of the row and index data for temporary tables.

Oracle Mode

In Oracle mode, VARCHAR2 is a synonym.

For Storage Engine Developers

  • Internally, the full length of the VARCHAR column is allocated inside each TABLE objects record[] structure. As there are three such buffers, each open table will allocate 3 times the max-length-to-store-varchar bytes of memory.

  • TEXT and BLOB columns are stored with a pointer (4 or 8 bytes) + a 1-4 byte length. The TEXT data is only stored once. This means that internally TEXT uses less memory for each open table but instead has the additional overhead that each TEXT object needs to be allocated and freed for each row access (with some caching in between).

See Also

This page is licensed: GPLv2, originally from fill_help_tables.sqlarrow-up-right

spinner

Last updated

Was this helpful?