
String Functions

Learn about string functions in MariaDB Server. This section details SQL functions for manipulating, searching, and formatting text strings, essential for data cleansing and presentation.

ASCII

Return the ASCII value of the first character. This function returns the numeric ASCII code for the leftmost character of the input string.

Syntax

Description

Returns the numeric ASCII value of the leftmost character of the string argument. Returns 0 if the string is empty and NULL if it is NULL.

BIN

Return the binary representation of a number. This function converts a number to its binary string equivalent.

Syntax

Description

Returns a string representation of the binary value of the given longlong (that is, BIGINT) number. This is equivalent to CONV(N,10,2). The argument should be positive. If it is a FLOAT, it is truncated. Returns NULL if the argument is NULL.
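
As a small illustration of the conversion described above, the following sketch converts the same value with BIN() and with CONV(). The literal 12 is chosen arbitrarily.

```sql
-- Binary string for the integer 12
SELECT BIN(12);          -- '1100'

-- The same conversion expressed with CONV(): base 10 to base 2
SELECT CONV(12, 10, 2);  -- '1100'
```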

BINARY Operator

Cast a string to a binary string. This operator converts a character string to a binary string, often used for case-sensitive comparisons.

This page describes the BINARY operator. For details about the data type, see Binary Data Type.

Syntax

BINARY

Description

The BINARY operator casts the string following it to a binary string. This is an easy way to force a column comparison to be done byte by byte rather than character by character. This causes the comparison to be case sensitive even if the column isn't defined as BINARY or BLOB.

BINARY also causes trailing spaces to be significant.

Examples

BIT_LENGTH

Return the length of a string in bits. This function calculates the size of the string in bits (length in bytes multiplied by 8).

Syntax

BIT_LENGTH(str)

Description

Returns the length of the given string argument in bits. If the argument is not a string, it will be converted to string. If the argument is NULL, it returns NULL.

Examples
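
A minimal sketch of the behavior described above, assuming a character set in which each letter of 'text' occupies one byte (true for latin1 and the utf8 variants):

```sql
-- Four one-byte characters: 4 bytes * 8 bits
SELECT BIT_LENGTH('text');  -- 32

-- An empty string has no bits
SELECT BIT_LENGTH('');      -- 0

-- NULL in, NULL out
SELECT BIT_LENGTH(NULL);    -- NULL
```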

Compatibility

PostgreSQL and Sybase support BIT_LENGTH().

This page is licensed: GPLv2, originally from fill_help_tables.sql

CHAR Function

Return the character for each integer passed. This function interprets arguments as integer ASCII values and returns a string of those characters.

Syntax

Description

CHAR() interprets each argument as an INT and returns a string consisting of the characters given by the code values of those integers. NULL values are skipped. By default, CHAR() returns a binary string. To produce a string in a given character set, use the optional USING clause, as in CHAR(0x65 USING utf8).

If USING is given and the result string is illegal for the given character set, a warning is issued. Also, if strict SQL mode is enabled, the result from CHAR() becomes NULL.

Examples

CHAR_LENGTH

Return the length of a string in characters. This function counts the number of characters in the string, treating multi-byte characters as single units.

Syntax

Description

Returns the length of the given string argument, measured in characters. A multi-byte character counts as a single character. This means that for a string containing five two-byte characters, LENGTH() returns 10, whereas CHAR_LENGTH() returns 5. If the argument is NULL, it returns NULL.

CHARACTER_LENGTH

Synonym for CHAR_LENGTH(). Returns the number of characters in the string.

Syntax

CHARACTER_LENGTH(str)

Description

CHARACTER_LENGTH() is a synonym for CHAR_LENGTH().


CHR

Return the character for a specific ASCII value. This function is similar to CHAR() but accepts a single integer argument.

Syntax

Description

CHR() interprets its argument N as an integer and returns the character given by that code value.

CONCAT_WS

Concatenate with separator. This function joins strings with a specified separator. It skips NULL values during concatenation.

Syntax

Description

CONCAT_WS() stands for Concatenate With Separator and is a special form of CONCAT(). The first argument is the separator placed between the remaining arguments.

ELT

Return the string at a specific index. This function returns the N-th string from a list of arguments.

Syntax

ELT(N, str1[, str2, str3,...])

Description

Takes a numeric argument and a series of string arguments. Returns the string that corresponds to the given numeric position. For instance, it returns str1 if N is 1, str2 if N is 2, and so on. If the numeric argument is a FLOAT, MariaDB rounds it to the nearest INTEGER. If the numeric argument is less than 1, greater than the number of string arguments, or not a number, ELT() returns NULL. It must have at least two arguments.

It is complementary to the FIELD() function.

Examples

EXPORT_SET

Return a string representation of bits. This function generates a string based on the bits set in a number, using specified 'on' and 'off' strings.

Syntax

EXPORT_SET(bits, on, off[, separator[, number_of_bits]])

Description

Takes a minimum of three arguments. Returns a string that contains, for each bit examined in the bits argument, the on string if the bit is set and the off string if it is not.

Bits are examined from right to left (from low-order to high-order bits). Strings are added to the result from left to right, separated by a separator string (the default is ','). You can optionally limit the number of bits the EXPORT_SET() function examines using the number_of_bits option.

If any argument is NULL, the function returns NULL.

Examples


FIELD

Return the index of a string in a list. This function returns the position of the first argument within the subsequent list of arguments.

Syntax

FIELD(pattern, str1[,str2,...])

Description

Returns the index position of the string or number matching the given pattern. Returns 0 in the event that none of the arguments match the pattern. Raises an Error 1582 if not given at least two arguments.

When all arguments given to the FIELD() function are strings, they are treated as case-insensitive. When all the arguments are numbers, they are treated as numbers. Otherwise, they are treated as doubles.

If the given pattern occurs more than once, the FIELD() function only returns the index of the first instance. If the given pattern is NULL, the function returns 0, as a NULL pattern always fails to match.

This function is complementary to the ELT() function.

Examples

FIND_IN_SET

Return the index of a string in a comma-separated list. This function finds the position of a string within a list of strings separated by commas.

Syntax

Description

Returns the index position where the given pattern occurs in a string list. The first argument is the pattern you want to search for. The second argument is a string containing comma-separated values. If the second argument is of the SET data type, the function is optimized to use bit arithmetic.
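
The lookup described above can be sketched as follows. The list 'a,b,c,d' is an arbitrary illustration.

```sql
-- 'b' is the second element of the comma-separated list
SELECT FIND_IN_SET('b', 'a,b,c,d');  -- 2

-- A pattern that is not in the list yields 0
SELECT FIND_IN_SET('z', 'a,b,c,d');  -- 0
```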

FORMAT

Format a number. This function formats a number to a format like '#,###,###.##', rounded to a specified number of decimal places.

Syntax

FORMAT(num, decimal_position[, locale])

Description

Formats the given number for display as a string, adding separators at the appropriate positions and rounding the result to the given decimal position. For instance, it would format 15233.345 to 15,233.35.

If the given decimal position is 0, it rounds to return no decimal point or fractional part. You can optionally specify a locale value to format numbers to the pattern appropriate for the given region.

Examples


FROM_BASE64

Decode a base-64 encoded string. This function takes a base-64 string and returns the decoded binary result.

Syntax

FROM_BASE64(str)

Description

Decodes the given base-64 encoded string, returning the result as a binary string. Returns NULL if the given string is NULL or if it's invalid.

It is the reverse of the TO_BASE64() function.

There are numerous methods to base-64 encode a string. MariaDB uses the following:

It encodes alphabet value 62 as '+'.
It encodes alphabet value 63 as '/'.
It codes output in groups of four printable characters. Every three bytes of data are encoded using four characters. If the final group is incomplete, it pads the difference with the '=' character.

This page is licensed: CC BY-SA / Gnu FDL

INSERT Function

Insert a substring into a string. This function inserts a string within another string at a specified position and length, replacing existing characters.

Syntax

Description

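Returns the string str, with the substring beginning at position pos and len characters long replaced by the string newstr. Returns the original string if pos is not within the length of the string. Replaces the rest of the string from position pos if len is not within the length of the rest of the string. Returns NULL if any argument is NULL.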

LCASE

Synonym for LOWER(). Converts a string to lowercase characters.

Syntax

Description

LCASE() is a synonym for LOWER().

LEFT

Return the leftmost characters. This function returns the specified number of characters from the beginning (left) of a string.

Syntax

Description

Returns the leftmost len characters from the string str.

LENGTH

Return the length of a string in bytes. This function counts the number of bytes in the string, which may differ from character count for multi-byte strings.

Syntax

Description

Returns the length of the string str.

In the default mode, when Oracle mode is not set, the length is measured in bytes. In this case, a multi-byte character counts as multiple bytes. This means that for a string containing five two-byte characters, LENGTH() returns 10, whereas CHAR_LENGTH() returns 5.

When running in Oracle mode, the length is measured in characters, and LENGTH is a synonym for CHAR_LENGTH().

If str is not a string value, it is converted into a string. If str is NULL, the function returns NULL.

Examples

When Oracle mode is not set:

In Oracle mode:

LENGTHB

Return the length of a string in bytes. This function is a synonym for LENGTH() in default mode, returning the byte count.

Syntax

Description

LENGTHB() returns the length of the given string, in bytes. When Oracle mode is not set, it is a synonym for LENGTH().

LOAD_FILE

Read a file from the server. This function reads the content of a file located on the server host and returns it as a string.

Syntax

Description

Reads the file and returns the file contents as a string. To use this function, the file must be located on the server host, you must specify the full path name to the file, and you must have the FILE privilege. The file must be readable by all and it must be less than the size, in bytes, of the max_allowed_packet system variable.

LOWER

Convert a string to lowercase. This function returns the string with all characters converted to lowercase.

Syntax

Description

Returns the string str with all characters changed to lowercase according to the current character set mapping. The default is latin1 (cp1252 West European).

LPAD

Left-pad a string. This function pads a string on the left side with a specified string until it reaches a certain length.

Syntax

Description

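Returns the string str, left-padded with the string padstr to a length of len characters. If str is longer than len, the return value is shortened to len characters.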
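
A brief sketch of left-padding, using an arbitrary input string and '.' as the pad string:

```sql
-- Pad 'hello' on the left to a total length of 10
SELECT LPAD('hello', 10, '.');  -- '.....hello'

-- A target length shorter than the input shortens the result
SELECT LPAD('hello', 2, '.');   -- 'he'
```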

LTRIM

Remove leading spaces. This function returns the string with any leading whitespace characters removed.

Syntax

Description

Returns the string str with leading space characters removed.

MAKE_SET

Return a set of comma-separated strings. This function returns a string consisting of substrings corresponding to the set bits in a given number.

Syntax

Description

Returns a set value (a string containing substrings separated by "," characters) consisting of the strings that have the corresponding bit in bits set.

MID

Synonym for SUBSTRING(). Returns a substring starting at a specified position for a given length.

Syntax

Description

MID(str,pos,len) is a synonym for SUBSTRING(str,pos,len).

NOT LIKE

Negated pattern matching. This operator tests whether a string does NOT match a specified SQL pattern.

Syntax

expr NOT LIKE pat [ESCAPE 'escape_char']

Description

This is the same as NOT (expr LIKE pat [ESCAPE 'escape_char']).
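
The equivalence above can be checked with a short sketch. The strings and patterns are arbitrary illustrations.

```sql
-- 'MariaDB' matches 'Maria%', so the negated test is false
SELECT 'MariaDB' NOT LIKE 'Maria%';    -- 0

-- The explicit NOT (...) form gives the same answer
SELECT NOT ('MariaDB' LIKE 'Maria%');  -- 0

-- No match, so the negated test is true
SELECT 'MariaDB' NOT LIKE 'My%';       -- 1
```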


NOT REGEXP

Negated regular expression matching. This operator tests whether a string does NOT match a specified regular expression pattern.

Syntax

expr NOT REGEXP pat, expr NOT RLIKE pat

Description

This is the same as NOT (expr REGEXP pat).


ORD

Return the numeric value of the first character. This function returns the code for the leftmost character, supporting multi-byte characters.

Syntax

Description

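If the leftmost character of the string str is a multi-byte character, returns the code for that character, calculated from the numeric values of its constituent bytes. If the leftmost character is not a multi-byte character, ORD() returns the same value as ASCII().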

POSITION

Synonym for LOCATE(). Returns the position of the first occurrence of a substring within a string.

Syntax

Description

POSITION(substr IN str) is a synonym for LOCATE(substr,str).

QUOTE

Quote a string for SQL usage. This function produces a string ready for use as a data value in an SQL statement, escaping special characters.

Syntax

Description

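Quotes a string to produce a result that can be used as a properly escaped data value in an SQL statement. The string is returned enclosed by single quotes and with each instance of single quote ("'"), backslash ("\"), ASCII NUL, and Control-Z preceded by a backslash. If the argument is NULL, the return value is the word "NULL" without enclosing single quotes.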

REPEAT Function

Repeat a string. This function returns a string consisting of the input string repeated a specified number of times.

Syntax

REPEAT(str,count)

Description

Returns a string consisting of the string str repeated count times. If count is less than 1, returns an empty string. Returns NULL if str or count is NULL.

Examples


REVERSE

Reverse a string. This function returns the string with the order of its characters reversed.

Syntax

Description

Returns the string str with the order of the characters reversed.

RTRIM

Remove trailing spaces. This function returns the string with any trailing whitespace characters removed.

Syntax

Description

Returns the string str with trailing space characters removed.

SOUNDEX

Return the Soundex string. This function calculates the Soundex key for a string, allowing comparison of words that sound similar.

Syntax

Description

Returns a soundex string from str.

SOUNDS LIKE

Compare strings by sound. This operator tests if two strings have the same Soundex value, useful for fuzzy matching.

Syntax

Description

This is the same as SOUNDEX(expr1) = SOUNDEX(expr2).

SPACE

Return a string of spaces. This function returns a string consisting of a specified number of space characters.

Syntax

Description

Returns a string consisting of N space characters. If N is NULL, returns NULL.

SUBSTR

Description

SUBSTR() is a synonym for SUBSTRING().


SUBSTRING_INDEX

Return a substring before a delimiter. This function returns the substring from a string before a specified number of occurrences of a delimiter.

Syntax

Description

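Returns the substring from string str before count occurrences of the delimiter delim. If count is positive, everything to the left of the final delimiter (counting from the left) is returned. If count is negative, everything to the right of the final delimiter (counting from the right) is returned.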

TRIM

Remove spaces from both ends. This function removes leading and trailing whitespace (or other specified characters) from a string.

Syntax

Description

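Returns the string str with all remstr prefixes or suffixes removed. If none of the specifiers BOTH, LEADING, or TRAILING is given, BOTH is assumed. remstr is optional and, if not specified, spaces are removed.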
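
The specifiers can be sketched as follows. The input strings and the remstr value 'x' are arbitrary illustrations.

```sql
-- No specifier and no remstr: spaces are removed from both ends
SELECT TRIM('  bar   ');                     -- 'bar'

-- Remove only the leading 'x' characters
SELECT TRIM(LEADING 'x' FROM 'xxxbarxxx');   -- 'barxxx'

-- Remove 'x' from both ends
SELECT TRIM(BOTH 'x' FROM 'xxxbarxxx');      -- 'bar'

-- Remove a multi-character suffix
SELECT TRIM(TRAILING 'xyz' FROM 'barxxyz');  -- 'barx'
```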

TRIM_ORACLE

Oracle-compatible TRIM function. This version of TRIM provides compatibility with Oracle's syntax for removing characters from a string.

TRIM_ORACLE is a synonym for the Oracle mode version of the TRIM function, and is available in all modes.


UCASE

Synonym for UPPER(). Converts a string to uppercase characters.

Syntax

UCASE(str)

Description

UCASE() is a synonym for UPPER().


UPDATEXML

Replace a portion of XML. This function replaces a section of XML markup matching an XPath expression with a new XML fragment.

Syntax

Description

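This function replaces a single portion of a given fragment of XML markup xml_target with a new XML fragment new_xml, and then returns the changed XML. The portion of xml_target that is replaced matches an XPath expression xpath_expr supplied by the user. If no expression matching xpath_expr is found, or if multiple matches are found, the function returns the original xml_target XML fragment.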

UPPER

Convert a string to uppercase. This function returns the string with all characters converted to uppercase.

Syntax

Description

Returns the string str with all characters changed to uppercase according to the current character set mapping. The default is latin1 (cp1252 West European).

Regular Expressions Functions

Learn about regular expression functions in MariaDB Server. This section details SQL functions for powerful pattern matching and manipulation of string data using regular expressions.

SELECT ASCII(9);
+----------+
| ASCII(9) |
+----------+
|       57 |
+----------+

SELECT ASCII('9');
+------------+
| ASCII('9') |
+------------+
|         57 |
+------------+

SELECT ASCII('abc');
+--------------+
| ASCII('abc') |
+--------------+
|           97 |
+--------------+

SELECT 'a' = 'A';
+-----------+
| 'a' = 'A' |
+-----------+
|         1 |
+-----------+

SELECT BINARY 'a' = 'A';
+------------------+
| BINARY 'a' = 'A' |
+------------------+
|                0 |
+------------------+

SELECT 'a' = 'a ';
+------------+
| 'a' = 'a ' |
+------------+
|          1 |
+------------+

SELECT BINARY 'a' = 'a ';
+-------------------+
| BINARY 'a' = 'a ' |
+-------------------+
|                 0 |
+-------------------+

SELECT EXPORT_SET(5,'Y','N',',',4);
+-----------------------------+
| EXPORT_SET(5,'Y','N',',',4) |
+-----------------------------+
| Y,N,Y,N                     |
+-----------------------------+

SELECT EXPORT_SET(6,'1','0',',',10);
+------------------------------+
| EXPORT_SET(6,'1','0',',',10) |
+------------------------------+
| 0,1,1,0,0,0,0,0,0,0          |
+------------------------------+

SELECT QUOTE(REPEAT('MariaDB ',4));
+------------------------------------+
| QUOTE(REPEAT('MariaDB ',4))        |
+------------------------------------+
| 'MariaDB MariaDB MariaDB MariaDB ' |
+------------------------------------+

SELECT CHARSET(CHAR(0x65)), CHARSET(CHAR(0x65 USING utf8));
+---------------------+--------------------------------+
| CHARSET(CHAR(0x65)) | CHARSET(CHAR(0x65 USING utf8)) |
+---------------------+--------------------------------+
| binary              | utf8                           |
+---------------------+--------------------------------+

SELECT CHAR(77,97,114,'105',97,'68',66);
+----------------------------------+
| CHAR(77,97,114,'105',97,'68',66) |
+----------------------------------+
| MariaDB                          |
+----------------------------------+

SELECT CHAR(77,77.3,'77.3');
+----------------------+
| CHAR(77,77.3,'77.3') |
+----------------------+
| MMM                  |
+----------------------+
1 row in set, 1 warning (0.00 sec)

Warning (Code 1292): Truncated incorrect INTEGER value: '77.3'

SELECT ELT(1, 'ej', 'Heja', 'hej', 'foo');
+------------------------------------+
| ELT(1, 'ej', 'Heja', 'hej', 'foo') |
+------------------------------------+
| ej                                 |
+------------------------------------+

SELECT ELT(4, 'ej', 'Heja', 'hej', 'foo');
+------------------------------------+
| ELT(4, 'ej', 'Heja', 'hej', 'foo') |
+------------------------------------+
| foo                                |
+------------------------------------+

SELECT FIELD('ej', 'Hej', 'ej', 'Heja', 'hej', 'foo') 
   AS 'Field Results';
+---------------+
| Field Results | 
+---------------+
|             2 |
+---------------+

SELECT FIELD('fo', 'Hej', 'ej', 'Heja', 'hej', 'foo')
   AS 'Field Results';
+---------------+
| Field Results | 
+---------------+
|             0 |
+---------------+

SELECT FIELD(1, 2, 3, 4, 5, 1) AS 'Field Results';
+---------------+
| Field Results |
+---------------+
|             5 |
+---------------+

SELECT FIELD(NULL, 2, 3) AS 'Field Results';
+---------------+
| Field Results |
+---------------+
|             0 |
+---------------+

SELECT FIELD('fail') AS 'Field Results';
Error 1582 (42000): Incorrect parameter count in call
to native function 'field'

SELECT FORMAT(1234567890.09876543210, 4) AS 'Format';
+--------------------+
| Format             |
+--------------------+
| 1,234,567,890.0988 |
+--------------------+

SELECT FORMAT(1234567.89, 4) AS 'Format';
+----------------+
| Format         |
+----------------+
| 1,234,567.8900 |
+----------------+

SELECT FORMAT(1234567.89, 0) AS 'Format';
+-----------+
| Format    |
+-----------+
| 1,234,568 |
+-----------+

-- Format number to German number formatting
SELECT FORMAT(123456789,2,'de_DE') AS 'Format';
+----------------+
| Format         |
+----------------+
| 123.456.789,00 |
+----------------+

SELECT TO_BASE64('Maria') AS 'Input';
+-----------+
| Input     |
+-----------+
| TWFyaWE=  |
+-----------+

SELECT FROM_BASE64('TWFyaWE=') AS 'Output';
+--------+
| Output |
+--------+
| Maria  |
+--------+

SELECT CONCAT_WS(',','First name','Second name','Last Name');
+-------------------------------------------------------+
| CONCAT_WS(',','First name','Second name','Last Name') |
+-------------------------------------------------------+
| First name,Second name,Last Name                      |
+-------------------------------------------------------+

SELECT CONCAT_WS('-','Floor',NULL,'Room');
+------------------------------------+
| CONCAT_WS('-','Floor',NULL,'Room') |
+------------------------------------+
| Floor-Room                         |
+------------------------------------+

SET @a = 'gnu', @b = 'penguin', @c = 'sea lion';
Query OK, 0 rows affected (0.00 sec)

SELECT CONCAT_WS(', ', @a, @b, @c);
+-----------------------------+
| CONCAT_WS(', ', @a, @b, @c) |
+-----------------------------+
| gnu, penguin, sea lion      |
+-----------------------------+

SET @a = 'a', @b = NULL, @c = 'c';

SELECT CONCAT_WS('', @a, @b, @c);
+---------------------------+
| CONCAT_WS('', @a, @b, @c) |
+---------------------------+
| ac                        |
+---------------------------+

SELECT INSERT('Quadratic', 3, 4, 'What');
+-----------------------------------+
| INSERT('Quadratic', 3, 4, 'What') |
+-----------------------------------+
| QuWhattic                         |
+-----------------------------------+

SELECT INSERT('Quadratic', -1, 4, 'What');
+------------------------------------+
| INSERT('Quadratic', -1, 4, 'What') |
+------------------------------------+
| Quadratic                          |
+------------------------------------+

SELECT INSERT('Quadratic', 3, 100, 'What');
+-------------------------------------+
| INSERT('Quadratic', 3, 100, 'What') |
+-------------------------------------+
| QuWhat                              |
+-------------------------------------+

SELECT SUBSTRING_INDEX('www.mariadb.org', '.', 2);
+--------------------------------------------+
| SUBSTRING_INDEX('www.mariadb.org', '.', 2) |
+--------------------------------------------+
| www.mariadb                                |
+--------------------------------------------+

SELECT SUBSTRING_INDEX('www.mariadb.org', '.', -2);
+---------------------------------------------+
| SUBSTRING_INDEX('www.mariadb.org', '.', -2) |
+---------------------------------------------+
| mariadb.org                                 |
+---------------------------------------------+

SELECT
    UpdateXML('<a><b>ccc</b><d></d></a>', '/a', '<e>fff</e>') AS val1,
    UpdateXML('<a><b>ccc</b><d></d></a>', '/b', '<e>fff</e>') AS val2,
    UpdateXML('<a><b>ccc</b><d></d></a>', '//b', '<e>fff</e>') AS val3,
    UpdateXML('<a><b>ccc</b><d></d></a>', '/a/d', '<e>fff</e>') AS val4,
    UpdateXML('<a><d></d><b>ccc</b><d></d></a>', '/a/d', '<e>fff</e>') AS val5
    \G
*************************** 1. row ***************************
val1: <e>fff</e>
val2: <a><b>ccc</b><d></d></a>
val3: <a><e>fff</e><d></d></a>
val4: <a><b>ccc</b><e>fff</e></a>
val5: <a><d></d><b>ccc</b><d></d></a>
1 row in set (0.00 sec)

SELECT UPPER(surname), givenname FROM users ORDER BY surname;
+----------------+------------+
| UPPER(surname) | givenname  |
+----------------+------------+
| ABEL           | Jacinto    |
| CASTRO         | Robert     |
| COSTA          | Phestos    |
| MOSCHELLA      | Hippolytos |
+----------------+------------+

SET @str = BINARY 'North Carolina';

SELECT LOWER(@str), LOWER(CONVERT(@str USING latin1));
+----------------+-----------------------------------+
| LOWER(@str)    | LOWER(CONVERT(@str USING latin1)) |
+----------------+-----------------------------------+
| North Carolina | north carolina                    |
+----------------+-----------------------------------+

SELECT CHR(67);
+---------+
| CHR(67) |
+---------+
| C       |
+---------+

SELECT CHR('67');
+-----------+
| CHR('67') |
+-----------+
| C         |
+-----------+

SELECT CHR('C');
+----------+
| CHR('C') |
+----------+
|          |
+----------+
1 row in set, 1 warning (0.000 sec)

SHOW WARNINGS;
+---------+------+----------------------------------------+
| Level   | Code | Message                                |
+---------+------+----------------------------------------+
| Warning | 1292 | Truncated incorrect INTEGER value: 'C' |
+---------+------+----------------------------------------+

SELECT MID('abcd',4,1);
+-----------------+
| MID('abcd',4,1) |
+-----------------+
| d               |
+-----------------+

SELECT MID('abcd',2,2);
+-----------------+
| MID('abcd',2,2) |
+-----------------+
| bc              |
+-----------------+

SELECT MAKE_SET(1,'a','b','c');
+-------------------------+
| MAKE_SET(1,'a','b','c') |
+-------------------------+
| a                       |
+-------------------------+

SELECT MAKE_SET(1 | 4,'hello','nice','world');
+----------------------------------------+
| MAKE_SET(1 | 4,'hello','nice','world') |
+----------------------------------------+
| hello,world                            |
+----------------------------------------+

SELECT MAKE_SET(1 | 4,'hello','nice',NULL,'world');
+---------------------------------------------+
| MAKE_SET(1 | 4,'hello','nice',NULL,'world') |
+---------------------------------------------+
| hello                                       |
+---------------------------------------------+

SELECT QUOTE(MAKE_SET(0,'a','b','c'));
+--------------------------------+
| QUOTE(MAKE_SET(0,'a','b','c')) |
+--------------------------------+
| ''                             |
+--------------------------------+

SELECT givenname, surname FROM users WHERE SOUNDEX(givenname) = SOUNDEX("robert");
+-----------+---------+
| givenname | surname |
+-----------+---------+
| Roberto   | Castro  |
+-----------+---------+

SELECT QUOTE("Don't!");
+-----------------+
| QUOTE("Don't!") |
+-----------------+
| 'Don\'t!'       |
+-----------------+

SELECT QUOTE(NULL); 
+-------------+
| QUOTE(NULL) |
+-------------+
| NULL        |
+-------------+

EXTRACTVALUE

Extract a value from XML. This function returns the text content of an XML fragment matching a given XPath expression.

Syntax

Description

The EXTRACTVALUE() function takes two string arguments: a fragment of XML markup and an XPath expression (also known as a locator). It returns the text (that is, CDATA) of the first text node that is a child of the element or elements matching the XPath expression.

In cases where a valid XPath expression does not match any text nodes in a valid XML fragment (including the implicit /text() expression), the EXTRACTVALUE() function returns an empty string.

Invalid Arguments

When either the XML fragment or the XPath expression is NULL, the EXTRACTVALUE() function returns NULL. When the XML fragment is invalid, it raises warning 1525.

When the XPath value is invalid, it generates error 1105.

Explicit text() Expressions

This function is the equivalent of performing a match using the XPath expression after appending /text(). In other words:
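
As a minimal sketch of that equivalence, the two calls below differ only in the explicit /text() suffix. The XML fragment is a made-up illustration.

```sql
SELECT
  EXTRACTVALUE('<cases><case>example</case></cases>', '/cases/case')        AS implicit_text,
  EXTRACTVALUE('<cases><case>example</case></cases>', '/cases/case/text()') AS explicit_text;
-- Both columns contain: example
```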

Count Matches

When EXTRACTVALUE() returns multiple matches, it returns the content of the first child text node of each matching element, in the matched order, as a single, space-delimited string.

By design, the EXTRACTVALUE() function makes no distinction between a match on an empty element and no match at all. If you need to determine whether no matching element was found in the XML fragment or if an element was found that contained no child text nodes, use the XPath count() function.

For instance, when looking for an element that exists but contains no child text nodes, count() returns the number of matching instances.

Alternatively, when looking for a value that doesn't exist, count() returns 0.
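
The two cases can be told apart with a sketch like the following. The fragment and the element names case and person are illustrative only.

```sql
-- <case/> exists but has no child text node: one match
SELECT EXTRACTVALUE('<cases><case/></cases>', 'count(/cases/case)');    -- 1

-- No <person> element exists in the fragment: zero matches
SELECT EXTRACTVALUE('<cases><case/></cases>', 'count(/cases/person)');  -- 0
```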

Matches

Important: The EXTRACTVALUE() function only returns CDATA. It does not return tags that the element might contain or the text that these child elements contain.

For example, when an XPath expression matches a parent <case> element that contains an <email> child element, the function returns the text of <case> itself, but not the contained <email> tag or its content.
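
A short sketch of this behavior, using a made-up fragment in which <case> holds both text and an <email> child:

```sql
SELECT EXTRACTVALUE(
  '<cases><case>Person<email>x@example.com</email></case></cases>',
  '/cases/case'
) AS case_text;
-- Returns only the direct text of <case>: Person
```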

Examples


MATCH AGAINST

Perform a full-text search. This construct searches for a text query against a set of columns indexed with a FULLTEXT index.

Syntax

MATCH (col1,col2,...) AGAINST (expr [search_modifier])

Description

A special construct used to perform a fulltext search on a fulltext index.

See the full-text index documentation for a full description and for more articles on the topic.

Examples
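
The following is only a sketch of how the construct is typically used. The table ft_demo and its column copy are hypothetical names introduced for illustration, and the sketch assumes a storage engine with FULLTEXT index support.

```sql
-- Hypothetical demo table with a FULLTEXT index on one column
CREATE TABLE ft_demo (copy TEXT, FULLTEXT(copy));

INSERT INTO ft_demo (copy) VALUES
  ('Once upon a time'),
  ('There was a wicked witch'),
  ('Who ate everybody up');

-- Search the indexed column for the word 'wicked'
SELECT * FROM ft_demo WHERE MATCH(copy) AGAINST('wicked');
```

The column list inside MATCH() has to correspond to the columns covered by the FULLTEXT index.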


LOCATE

Return the position of the first occurrence of a substring. This function finds the starting position of a substring within a string.

Syntax

LOCATE(substr,str), LOCATE(substr,str,pos)

Description

The first syntax returns the position of the first occurrence of substring substr in string str. The second syntax returns the position of the first occurrence of substring substr in string str, starting at position pos. Returns 0 if substr is not in str.

LOCATE() performs a case-insensitive search.

If any argument is NULL, returns NULL.

INSTR() is the same as the two-argument form of LOCATE(), except that the order of the arguments is reversed.

Examples
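
A minimal sketch of both syntaxes, using arbitrary strings:

```sql
-- First occurrence of 'bar' in 'foobarbar'
SELECT LOCATE('bar', 'foobarbar');     -- 4

-- Substring not present
SELECT LOCATE('xbar', 'foobar');       -- 0

-- Start searching at position 5, which skips the first 'bar'
SELECT LOCATE('bar', 'foobarbar', 5);  -- 7
```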

RPAD

Right-pad a string. This function pads a string on the right side with a specified string until it reaches a certain length.

Syntax

RPAD(str, len [, padstr])

Description

Returns the string str, right-padded with the string padstr to a length of len characters. If str is longer than len, the return value is shortened to len characters. If padstr is omitted, RPAD() pads with spaces.

Returns NULL if given a NULL argument. If the result is empty (a length of zero), returns either an empty string or, in Oracle mode, NULL.

The Oracle mode version of the function can be accessed outside of Oracle mode by using RPAD_ORACLE as the function name.

Examples
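
A brief sketch of right-padding in the default (non-Oracle) mode, with an arbitrary input string and '.' as the pad string:

```sql
-- Pad 'hello' on the right to a total length of 10
SELECT RPAD('hello', 10, '.');  -- 'hello.....'

-- A target length shorter than the input shortens the result
SELECT RPAD('hello', 2, '.');   -- 'he'
```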

With the pad string defaulting to space:

Oracle mode:

INSTR

Return the position of the first occurrence of a substring. This function locates a substring within a string and returns its index.

Syntax

INSTR(str,substr)

Description

Returns the position of the first occurrence of substring substr in string str. This is the same as the two-argument form of LOCATE(), except that the order of the arguments is reversed.

INSTR() performs a case-insensitive search.

If any argument is NULL, returns NULL.

Examples
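
A minimal sketch with arbitrary strings. Note the argument order: the string to search comes first.

```sql
-- First occurrence of 'bar' in 'foobarbar'
SELECT INSTR('foobarbar', 'bar');  -- 4

-- Substring not present
SELECT INSTR('My', 'Maria');       -- 0
```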

TO_BASE64

Encode a string to base-64. This function converts a string argument to its base-64 encoded form.

Syntax

TO_BASE64(str)

Description

Converts the string argument str to its base-64 encoded form, returning the result as a character string in the connection character set and collation.

The argument str will be converted to string first if it is not a string. A NULL argument will return a NULL result.

The reverse function, FROM_BASE64(), decodes an encoded base-64 string.

There are numerous methods to base-64 encode a string. The following are used by MariaDB and MySQL:

Alphabet value 62 is encoded as '+'.
Alphabet value 63 is encoded as '/'.
Encoding output is made up of groups of four printable characters, with each three bytes of data encoded using four characters. If the final group is not complete, it is padded with '=' characters to make up a length of four.
To divide long output, a newline is added after every 76 characters.

Examples


REGEXP_SUBSTR

Return the substring matching a regex. This function extracts the actual part of the string that matches the given pattern.

Syntax

REGEXP_SUBSTR(subject,pattern)

Description

Returns the part of the string subject that matches the regular expression pattern, or an empty string if pattern was not found.

The function follows the case sensitivity rules of the effective collation. Matching is performed case insensitively for case insensitive collations, and case sensitively for case sensitive collations and for binary data.

The collation case sensitivity can be overridden using the (?i) and (?-i) PCRE flags.

MariaDB uses the PCRE library for enhanced regular expression performance, and REGEXP_SUBSTR was introduced as part of this enhancement.

The default_regex_flags variable addresses the remaining incompatibilities between PCRE and the old regex library.

Examples
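A minimal sketch of typical calls, assuming a case-insensitive default collation for the session. The (?-i) flag is used here instead of a COLLATE clause so the example does not depend on the connection character set.

```sql
-- Extract the first run of digits.
SELECT REGEXP_SUBSTR('ab12cd', '[0-9]+');     -- 12

-- Case-insensitive collation: lowercase 'b' matches 'B'.
SELECT REGEXP_SUBSTR('ABC', 'b');             -- B

-- (?-i) forces a case-sensitive match, so nothing is found
-- and an empty string is returned.
SELECT REGEXP_SUBSTR('ABC', '(?-i)b');
```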

_{This page is licensed: CC BY-SA / Gnu FDL}

REPLACE Function

Replace occurrences of a substring. This function replaces all instances of a specified search string within a string with a replacement string.

Syntax

REPLACE(str,from_str,to_str)

Description

Returns the string str with all occurrences of the string from_str replaced by the string to_str. REPLACE() performs a case-sensitive match when searching for from_str.

Examples
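To illustrate the case-sensitive matching described above, here is a small sketch using the same sample string as elsewhere on this page:

```sql
-- Uppercase 'W' does not occur in the string, so nothing is replaced.
SELECT REPLACE('www.mariadb.org', 'W', 'x');   -- www.mariadb.org

-- Lowercase 'w' matches all three occurrences.
SELECT REPLACE('www.mariadb.org', 'w', 'x');   -- xxx.mariadb.org
```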

_{This page is licensed: GPLv2, originally from}

REGEXP_INSTR

Return the index of a regex match. This function finds the starting position of the first substring that matches the given pattern.

Syntax

REGEXP_INSTR(subject, pattern)

Description

Returns the position of the first occurrence of the regular expression pattern in the string subject, or 0 if pattern was not found.

The positions start with 1 and are measured in characters (i.e. not in bytes), which is important for multi-byte character sets. You can cast a multi-byte string to BINARY to get offsets in bytes.

The function follows the case sensitivity rules of the effective collation. Matching is performed case insensitively for case insensitive collations, and case sensitively for case sensitive collations and for binary data.

The collation case sensitivity can be overridden using the (?i) and (?-i) PCRE flags.

MariaDB uses the PCRE library for enhanced regular expression performance, and REGEXP_INSTR was introduced as part of this enhancement.

Examples

Casting a multi-byte character set as BINARY to get offsets in bytes:
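The difference between character and byte offsets can be sketched as follows. This assumes a UTF-8 connection character set (utf8mb3 or utf8mb4), in which 'Ö' occupies two bytes.

```sql
-- Character offset: 'N' is the 5th character.
SELECT REGEXP_INSTR('BJÖRN', 'N');           -- 5

-- Byte offset: 'Ö' takes two bytes in UTF-8, so 'N' starts at byte 6.
SELECT REGEXP_INSTR(BINARY 'BJÖRN', 'N');    -- 6
```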

Case sensitivity:

_{This page is licensed: CC BY-SA / Gnu FDL}

RLIKE

Synonym for REGEXP. This operator performs a regular expression match against a string argument.

Syntax

expr REGEXP pat, expr RLIKE pat

Description

RLIKE is a synonym for REGEXP.

_{This page is licensed: CC BY-SA / Gnu FDL}

SELECT CHAR_LENGTH('π'), LENGTH('π'), LENGTHB('π'), OCTET_LENGTH('π');
+-------------------+--------------+---------------+--------------------+
| CHAR_LENGTH('π')  | LENGTH('π')  | LENGTHB('π')  | OCTET_LENGTH('π')  |
+-------------------+--------------+---------------+--------------------+
|                 1 |            2 |             2 |                  2 |
+-------------------+--------------+---------------+--------------------+

In Oracle mode (SQL_MODE=ORACLE), LENGTH() returns the length in characters rather than bytes:

SELECT CHAR_LENGTH('π'), LENGTH('π'), LENGTHB('π'), OCTET_LENGTH('π');
+-------------------+--------------+---------------+--------------------+
| CHAR_LENGTH('π')  | LENGTH('π')  | LENGTHB('π')  | OCTET_LENGTH('π')  |
+-------------------+--------------+---------------+--------------------+
|                 1 |            1 |             2 |                  2 |
+-------------------+--------------+---------------+--------------------+

SELECT TRIM('  bar   ')\G
*************************** 1. row ***************************
TRIM('  bar   '): bar

SELECT TRIM(LEADING 'x' FROM 'xxxbarxxx')\G
*************************** 1. row ***************************
TRIM(LEADING 'x' FROM 'xxxbarxxx'): barxxx

SELECT TRIM(BOTH 'x' FROM 'xxxbarxxx')\G
*************************** 1. row ***************************
TRIM(BOTH 'x' FROM 'xxxbarxxx'): bar

SELECT TRIM(TRAILING 'xyz' FROM 'barxxyz')\G
*************************** 1. row ***************************
TRIM(TRAILING 'xyz' FROM 'barxxyz'): barx

SELECT QUOTE(LTRIM('   MariaDB   '));
+-------------------------------+
| QUOTE(LTRIM('   MariaDB   ')) |
+-------------------------------+
| 'MariaDB   '                  |
+-------------------------------+

SELECT LTRIM(''),LTRIM_ORACLE('');
+-----------+------------------+
| LTRIM('') | LTRIM_ORACLE('') |
+-----------+------------------+
|           | NULL             |
+-----------+------------------+

SELECT LPAD('hello',10,'.');
+----------------------+
| LPAD('hello',10,'.') |
+----------------------+
| .....hello           |
+----------------------+

SELECT LPAD('hello',2,'.');
+---------------------+
| LPAD('hello',2,'.') |
+---------------------+
| he                  |
+---------------------+

SELECT LPAD('',0),LPAD_ORACLE('',0);
+------------+-------------------+
| LPAD('',0) | LPAD_ORACLE('',0) |
+------------+-------------------+
|            | NULL              |
+------------+-------------------+

SELECT RTRIM(''),RTRIM_ORACLE('');
+-----------+------------------+
| RTRIM('') | RTRIM_ORACLE('') |
+-----------+------------------+
|           | NULL             |
+-----------+------------------+

CREATE TABLE ft_myisam(copy TEXT,FULLTEXT(copy)) ENGINE=MyISAM;

INSERT INTO ft_myisam(copy) VALUES ('Once upon a time'), ('There was a wicked witch'), 
 ('Who ate everybody up');

SELECT * FROM ft_myisam WHERE MATCH(copy) AGAINST('wicked');
+--------------------------+
| copy                     |
+--------------------------+
| There was a wicked witch |
+--------------------------+

SELECT REPLACE('www.mariadb.org', 'w', 'Ww');
+---------------------------------------+
| REPLACE('www.mariadb.org', 'w', 'Ww') |
+---------------------------------------+
| WwWwWw.mariadb.org                    |
+---------------------------------------+

SELECT
   EXTRACTVALUE('<cases><case>example</case></cases>', '/cases/case') 
    AS 'Base Example',
   EXTRACTVALUE('<cases><case>example</case></cases>', '/cases/case/text()') 
    AS 'text() Example';
+--------------+----------------+
| Base Example | text() Example |
+--------------+----------------+
| example      | example        |
+--------------+----------------+

SELECT
   EXTRACTVALUE('<cases><case/></cases>', '/cases/case') 
    AS 'Empty Example',
   EXTRACTVALUE('<cases><case/></cases>', 'count(/cases/case)') 
    AS 'count() Example';
+---------------+-----------------+
| Empty Example | count() Example |
+---------------+-----------------+
|               |               1 |
+---------------+-----------------+

SELECT
   EXTRACTVALUE('<cases><case/></cases>', '/cases/person') 
    AS 'No Match Example',
   EXTRACTVALUE('<cases><case/></cases>', 'count(/cases/person)') 
    AS 'count() Example';
+------------------+-----------------+
| No Match Example | count() Example |
+------------------+-----------------+
|                  |               0 |
+------------------+-----------------+

SELECT
    ExtractValue('<a>ccc<b>ddd</b></a>', '/a')            AS val1,
    ExtractValue('<a>ccc<b>ddd</b></a>', '/a/b')          AS val2,
    ExtractValue('<a>ccc<b>ddd</b></a>', '//b')           AS val3,
    ExtractValue('<a>ccc<b>ddd</b></a>', '/b')            AS val4,
    ExtractValue('<a>ccc<b>ddd</b><b>eee</b></a>', '//b') AS val5;
+------+------+------+------+---------+
| val1 | val2 | val3 | val4 | val5    |
+------+------+------+------+---------+
| ccc  | ddd  | ddd  |      | ddd eee |
+------+------+------+------+---------+

SELECT id, body, MATCH (title,body) AGAINST
    ('Security implications of running MySQL as root'
    IN NATURAL LANGUAGE MODE) AS score
    FROM articles WHERE MATCH (title,body) AGAINST
    ('Security implications of running MySQL as root'
    IN NATURAL LANGUAGE MODE);
+----+-------------------------------------+-----------------+
| id | body                                | score           |
+----+-------------------------------------+-----------------+
|  4 | 1. Never run mysqld as root. 2. ... | 1.5219271183014 |
|  6 | When configured properly, MySQL ... | 1.3114095926285 |
+----+-------------------------------------+-----------------+

SELECT HEX(255);
+----------+
| HEX(255) |
+----------+
| FF       |
+----------+

SELECT 0x4D617269614442;
+------------------+
| 0x4D617269614442 |
+------------------+
| MariaDB          |
+------------------+

SELECT HEX('MariaDB');
+----------------+
| HEX('MariaDB') |
+----------------+
| 4D617269614442 |
+----------------+

SELECT HEX(CAST('2001:db8::ff00:42:8329' AS INET6));
+----------------------------------------------+
| HEX(CAST('2001:db8::ff00:42:8329' AS INET6)) |
+----------------------------------------------+
| 20010DB8000000000000FF0000428329             |
+----------------------------------------------+

SELECT LOCATE('bar', 'foobarbar');
+----------------------------+
| LOCATE('bar', 'foobarbar') |
+----------------------------+
|                          4 |
+----------------------------+

SELECT LOCATE('My', 'Maria');
+-----------------------+
| LOCATE('My', 'Maria') |
+-----------------------+
|                     0 |
+-----------------------+

SELECT LOCATE('bar', 'foobarbar', 5);
+-------------------------------+
| LOCATE('bar', 'foobarbar', 5) |
+-------------------------------+
|                             7 |
+-------------------------------+

SELECT INSTR('foobarbar', 'bar');
+---------------------------+
| INSTR('foobarbar', 'bar') |
+---------------------------+
|                         4 |
+---------------------------+

SELECT INSTR('My', 'Maria');
+----------------------+
| INSTR('My', 'Maria') |
+----------------------+
|                    0 |
+----------------------+

SELECT REGEXP_INSTR('ABC','b');
-> 2

SELECT REGEXP_INSTR('ABC' COLLATE utf8_bin,'b');
-> 0

SELECT REGEXP_INSTR(BINARY'ABC','b');
-> 0

SELECT REGEXP_INSTR('ABC','(?-i)b');
-> 0

SELECT REGEXP_INSTR('ABC' COLLATE utf8_bin,'(?i)b');
-> 2

SELECT SFORMAT("The answer is {}.", 42);
+----------------------------------+
| SFORMAT("The answer is {}.", 42) |
+----------------------------------+
| The answer is 42.                |
+----------------------------------+

CREATE TABLE test_sformat(mdb_release char(6), mdev int, feature char(20));

INSERT INTO test_sformat VALUES('10.7.0', 25015, 'Python style sformat'), 
  ('10.7.0', 4958, 'UUID');

SELECT * FROM test_sformat;
+-------------+-------+----------------------+
| mdb_release | mdev  | feature              |
+-------------+-------+----------------------+
| 10.7.0      | 25015 | Python style sformat |
| 10.7.0      |  4958 | UUID                 |
+-------------+-------+----------------------+

SELECT SFORMAT('MariaDB Server {} has a preview for MDEV-{} which is about {}', 
  mdb_release, mdev, feature) AS 'Preview Release Examples'
  FROM test_sformat;
+----------------------------------------------------------------------------------------+
| Preview Release Examples                                                               |
+----------------------------------------------------------------------------------------+
| MariaDB Server 10.7.0 has a preview for MDEV-25015 which is about Python style sformat |
| MariaDB Server 10.7.0 has a preview for MDEV-4958 which is about UUID                  |
+----------------------------------------------------------------------------------------+

SELECT STRCMP('text', 'text2');
+-------------------------+
| STRCMP('text', 'text2') |
+-------------------------+
|                      -1 |
+-------------------------+

SELECT STRCMP('text2', 'text');
+-------------------------+
| STRCMP('text2', 'text') |
+-------------------------+
|                       1 |
+-------------------------+

SELECT STRCMP('text', 'text');
+------------------------+
| STRCMP('text', 'text') |
+------------------------+
|                      0 |
+------------------------+

SELECT REGEXP_REPLACE('ABC','b','-') AS case_insensitive;
-> A-C

SELECT REGEXP_REPLACE('ABC' COLLATE utf8_bin,'b','-') AS case_sensitive;
-> ABC

SELECT REGEXP_REPLACE(BINARY 'ABC','b','-') AS binary_data;
-> ABC

SELECT HEX('MariaDB');
+----------------+
| HEX('MariaDB') |
+----------------+
| 4D617269614442 |
+----------------+

SELECT UNHEX('4D617269614442');
+-------------------------+
| UNHEX('4D617269614442') |
+-------------------------+
| MariaDB                 |
+-------------------------+

SELECT 0x4D617269614442;
+------------------+
| 0x4D617269614442 |
+------------------+
| MariaDB          |
+------------------+

SELECT UNHEX(HEX('string'));
+----------------------+
| UNHEX(HEX('string')) |
+----------------------+
| string               |
+----------------------+

SELECT HEX(UNHEX('1267'));
+--------------------+
| HEX(UNHEX('1267')) |
+--------------------+
| 1267               |
+--------------------+

WEIGHT_STRING

Return the weight string. This function returns the binary string that represents the sorting and comparison value of the input string.

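Syntax

WEIGHT_STRING(str [AS {CHAR|BINARY}(N)] [LEVEL levels] [flags])
  levels: N [ASC|DESC|REVERSE] [, N [ASC|DESC|REVERSE]] ...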

Description

Returns a binary string representing the string's sorting and comparison value. A string with a lower result means that for sorting purposes the string appears before a string with a higher result.

WEIGHT_STRING() is particularly useful when adding new collations, for testing purposes.

If str is a non-binary string (CHAR, VARCHAR or TEXT), WEIGHT_STRING returns the string's collation weight. If str is a binary string (BINARY, VARBINARY or BLOB), the return value is simply the input value, since the weight for each byte in a binary string is the byte value.

WEIGHT_STRING() returns NULL if given a NULL input.

The optional AS clause permits casting the input string to a binary or non-binary string, as well as to a particular length.

AS BINARY(N) measures the length in bytes rather than characters, and right pads with 0x00 bytes to the desired length.

AS CHAR(N) measures the length in characters, and right pads with spaces to the desired length.

N has a minimum value of 1, and if it is less than the length of the input string, the string is truncated without warning.

The optional LEVEL clause specifies that the return value should contain weights for specific collation levels. The levels specifier can either be a single integer, a comma-separated list of integers, or a range of integers separated by a dash (whitespace is ignored). Integers can range from 1 to a maximum of 6, dependent on the collation, and need to be listed in ascending order.

If the LEVEL clause is not provided, a default of 1 to the maximum for the collation is assumed.

If the LEVEL is specified without using a range, an optional modifier is permitted.

ASC, the default, returns the weights without any modification.

DESC returns bitwise-inverted weights.

REVERSE returns the weights in reverse order.

Examples

The examples below use the HEX() function to represent non-printable results in hexadecimal format.

_{This page is licensed: CC BY-SA / Gnu FDL}

RIGHT

Return the rightmost characters. This function returns the specified number of characters from the end (right) of a string.

Syntax

RIGHT(str,len)

Description

Returns the rightmost len characters from the string str, or NULL if any argument is NULL.

Examples
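A brief sketch of the behaviour described above, including the case where len exceeds the length of the string:

```sql
SELECT RIGHT('MariaDB', 2);      -- DB
SELECT RIGHT('MariaDB', 100);    -- MariaDB (len exceeds the string length)
SELECT RIGHT(NULL, 2);           -- NULL
```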

_{This page is licensed: GPLv2, originally from}

SET @x = 'AardVark';

SET @x = BINARY 'AardVark';

SELECT LOWER(@x), LOWER(CONVERT (@x USING latin1));
+-----------+----------------------------------+
| LOWER(@x) | LOWER(CONVERT (@x USING latin1)) |
+-----------+----------------------------------+
| AardVark  | aardvark                         |
+-----------+----------------------------------+

SELECT CONCAT('Ma', 'ria', 'DB');
+---------------------------+
| CONCAT('Ma', 'ria', 'DB') |
+---------------------------+
| MariaDB                   |
+---------------------------+

SELECT CONCAT('Ma', 'ria', NULL, 'DB');
+---------------------------------+
| CONCAT('Ma', 'ria', NULL, 'DB') |
+---------------------------------+
| NULL                            |
+---------------------------------+

SELECT CONCAT(42.0);
+--------------+
| CONCAT(42.0) |
+--------------+
| 42.0         |
+--------------+

SELECT CONCAT('The value of @v is: ', IFNULL(@v, ''));
+------------------------------------------------+
| CONCAT('The value of @v is: ', IFNULL(@v, '')) |
+------------------------------------------------+
| The value of @v is:                            |
+------------------------------------------------+

In Oracle mode (SQL_MODE=ORACLE), CONCAT() ignores NULL arguments:

SELECT CONCAT('Ma', 'ria', NULL, 'DB');
+---------------------------------+
| CONCAT('Ma', 'ria', NULL, 'DB') |
+---------------------------------+
| MariaDB                         |
+---------------------------------+

SELECT RPAD('hello',10,'.');
+----------------------+
| RPAD('hello',10,'.') |
+----------------------+
| hello.....           |
+----------------------+

SELECT RPAD('hello',2,'.');
+---------------------+
| RPAD('hello',2,'.') |
+---------------------+
| he                  |
+---------------------+

SELECT RPAD('hello',30);
+--------------------------------+
| RPAD('hello',30)               |
+--------------------------------+
| hello                          |
+--------------------------------+

SELECT RPAD('',0),RPAD_ORACLE('',0);
+------------+-------------------+
| RPAD('',0) | RPAD_ORACLE('',0) |
+------------+-------------------+
|            | NULL              |
+------------+-------------------+

SELECT 'Monty!' REGEXP 'm%y%%';
+-------------------------+
| 'Monty!' REGEXP 'm%y%%' |
+-------------------------+
|                       0 |
+-------------------------+

SELECT 'Monty!' REGEXP '.*';
+----------------------+
| 'Monty!' REGEXP '.*' |
+----------------------+
|                    1 |
+----------------------+

SELECT 'new*\n*line' REGEXP 'new\\*.\\*line';
+---------------------------------------+
| 'new*\n*line' REGEXP 'new\\*.\\*line' |
+---------------------------------------+
|                                     1 |
+---------------------------------------+

SELECT 'a' REGEXP 'A', 'a' REGEXP BINARY 'A';
+----------------+-----------------------+
| 'a' REGEXP 'A' | 'a' REGEXP BINARY 'A' |
+----------------+-----------------------+
|              1 |                     0 |
+----------------+-----------------------+

SELECT 'a' REGEXP '^[a-d]';
+---------------------+
| 'a' REGEXP '^[a-d]' |
+---------------------+
|                   1 |
+---------------------+

SET default_regex_flags='MULTILINE';
SELECT 'a\nb\nc' RLIKE '^b$';
+-----------------------+
| 'a\nb\nc' RLIKE '^b$' |
+-----------------------+
|                     1 |
+-----------------------+

SELECT HEX(WEIGHT_STRING('x'));
+-------------------------+
| HEX(WEIGHT_STRING('x')) |
+-------------------------+
| 0058                    |
+-------------------------+

SELECT HEX(WEIGHT_STRING('x' AS BINARY(4)));
+--------------------------------------+
| HEX(WEIGHT_STRING('x' AS BINARY(4))) |
+--------------------------------------+
| 78000000                             |
+--------------------------------------+

SELECT HEX(WEIGHT_STRING('x' AS CHAR(4)));
+------------------------------------+
| HEX(WEIGHT_STRING('x' AS CHAR(4))) |
+------------------------------------+
| 0058002000200020                   |
+------------------------------------+

SELECT HEX(WEIGHT_STRING(0xaa22ee LEVEL 1));
+--------------------------------------+
| HEX(WEIGHT_STRING(0xaa22ee LEVEL 1)) |
+--------------------------------------+
| AA22EE                               |
+--------------------------------------+

SELECT HEX(WEIGHT_STRING(0xaa22ee LEVEL 1 DESC));
+-------------------------------------------+
| HEX(WEIGHT_STRING(0xaa22ee LEVEL 1 DESC)) |
+-------------------------------------------+
| 55DD11                                    |
+-------------------------------------------+

SELECT HEX(WEIGHT_STRING(0xaa22ee LEVEL 1 REVERSE));
+----------------------------------------------+
| HEX(WEIGHT_STRING(0xaa22ee LEVEL 1 REVERSE)) |
+----------------------------------------------+
| EE22AA                                       |
+----------------------------------------------+

LIKE

Pattern matching operator. This operator tests whether a string matches a specified SQL pattern using wildcards like '%' and '_'.

Syntax

expr LIKE pat [ESCAPE 'escape_char']
expr NOT LIKE pat [ESCAPE 'escape_char']

Description

Tests whether expr matches the pattern pat. Returns either 1 (TRUE) or 0 (FALSE). Both expr and pat may be any valid expression and are evaluated to strings. Patterns may use the following wildcard characters:

% matches any number of characters, including zero.
_ matches any single character.

Use NOT LIKE to test if a string does not match a pattern. This is equivalent to using the NOT operator on the entire LIKE expression.

If either the expression or the pattern is NULL, the result is NULL.

LIKE performs case-insensitive pattern matches if the collation for the expression and pattern is case-insensitive. For case-sensitive matches, declare either argument to use a binary collation using COLLATE, or coerce either of them to a BINARY string using CAST. Use SHOW COLLATION to get a list of available collations. Collations ending in _bin are case-sensitive.

Numeric arguments are coerced to binary strings.

The _ wildcard matches a single character, not byte. It will only match a multi-byte character if it is valid in the expression's character set. For example, _ will match _utf8"€", but it will not match _latin1"€" because the Euro sign is not a valid latin1 character. If necessary, use CONVERT to use the expression in a different character set.

If you need to match the characters _ or %, you must escape them. By default, you can prefix the wildcard characters with the backslash character \ to escape them. The backslash is used both to encode special characters like newlines when a string is parsed and to escape wildcards in a pattern after parsing. Thus, to match an actual backslash, you sometimes need to double-escape it as "\\\\".

To avoid difficulties with the backslash character, you can change the wildcard escape character using ESCAPE in a LIKE expression. The argument to ESCAPE must be a single-character string.
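The escaping rules can be sketched as follows. The choice of | as the alternative escape character is arbitrary, and the backslash examples assume the NO_BACKSLASH_ESCAPES SQL mode is not set.

```sql
-- Default escape character: \_ matches a literal underscore only.
SELECT 'a_c' LIKE 'a\_c';                 -- 1
SELECT 'abc' LIKE 'a\_c';                 -- 0

-- Custom escape character: |% matches a literal percent sign.
SELECT '100%' LIKE '100|%' ESCAPE '|';    -- 1
SELECT '1000' LIKE '100|%' ESCAPE '|';    -- 0
```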

Examples

Select the days that begin with "T":

Select the days that contain the substring "es":

Select the six-character day names:

With the default collations, LIKE is case-insensitive:

Use COLLATE to specify a binary collation, forcing case-sensitive matches:

You can include functions and operators in the expression to match. Select dates based on their day name:
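The cases listed above can be sketched against a small sample table. The table name days and its column name are hypothetical, chosen only for illustration. The utf8mb4 character set is an assumption so that the COLLATE clause is valid, and the DAYNAME() result assumes English day names (lc_time_names = en_US).

```sql
-- Hypothetical sample table; the names are for illustration only.
CREATE TABLE days (name VARCHAR(16)) CHARACTER SET utf8mb4;
INSERT INTO days VALUES
  ('Monday'), ('Tuesday'), ('Wednesday'), ('Thursday'),
  ('Friday'), ('Saturday'), ('Sunday');

-- Begins with "T": Tuesday, Thursday.
SELECT name FROM days WHERE name LIKE 'T%';

-- Contains "es": Tuesday, Wednesday.
SELECT name FROM days WHERE name LIKE '%es%';

-- Exactly six characters: Monday, Friday, Sunday.
SELECT name FROM days WHERE name LIKE '______';

-- Case-insensitive under the default collation: Tuesday, Thursday.
SELECT name FROM days WHERE name LIKE 't%';

-- A binary collation forces a case-sensitive match: no rows.
SELECT name FROM days WHERE name COLLATE utf8mb4_bin LIKE 't%';

-- Functions can appear in the expression; 2024-01-02 was a Tuesday.
SELECT DAYNAME('2024-01-02') LIKE 'T%';
```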

Optimizing LIKE

MariaDB can use indexes for LIKE on string columns when the pattern doesn't start with % or _.
You can set the optimizer_use_condition_selectivity variable to 5. If this is done, the optimizer will read rows to calculate the selectivity of the LIKE expression before starting to calculate the query plan. This can help speed up some LIKE queries by providing the optimizer with more information about your data.

Regular Expressions Overview

Get an overview of regex usage. This page introduces the pattern matching capabilities and common metacharacters used in MariaDB regular expressions.

Regular Expressions allow MariaDB to perform complex pattern matching on a string. In many cases, the simple pattern matching provided by LIKE is sufficient. LIKE performs two kinds of matches:

_ - the underscore, matching a single character
% - the percentage sign, matching any number of characters.

In other cases you may need more control over the returned matches, and will need to use regular expressions.

Regular expression matches are performed with the REGEXP operator. RLIKE is a synonym for REGEXP.

Comparisons are performed on the byte value, so characters that are treated as equivalent by a collation, but do not have the same byte-value, such as accented characters, could evaluate as unequal.

Without any special characters, a regular expression match is true if the characters match. The match is case-insensitive, except in the case of BINARY strings.

Note that the pattern only needs to match part of the expression, but the whole pattern must be found:

The first returns true because the pattern "Mari" exists in the expression "Maria". When the order is reversed, the result is false, as the pattern "Maria" does not exist in the expression "Mari".

A match can be performed against more than one word with the | character. For example:

Special Characters

The above examples introduce the syntax, but are not very useful on their own. It's the special characters that give regular expressions their power.

^

^ matches the beginning of a string (inside square brackets it can also mean NOT - see below):

$

$ matches the end of a string:

.

. matches any single character:

*

x* matches zero or more of a character x. In the examples below, it's the r character.

+

x+ matches one or more of a character x. In the examples below, it's the r character.

?

x? matches zero or one of a character x. In the examples below, it's the r character.

()

(xyz) - combine a sequence, for example (xyz)+ or (xyz)*

{}

x{n} and x{m,n} match repeated instances of x. With x{n}, x must occur exactly n times. With x{m,n}, x can occur from m to n times. For example, to match zero or one instance of the string ari (which is identical to (ari)?), the following can be used:
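A minimal sketch of the repetition syntax applied to a grouped sequence. The sample strings are illustrative, and a case-insensitive default collation is assumed.

```sql
-- (ari){0,1} is equivalent to (ari)?
SELECT 'Maria' REGEXP 'M(ari){0,1}a';    -- 1  (M + ari + a)
SELECT 'Ma'    REGEXP 'M(ari){0,1}a';    -- 1  (zero repetitions)

-- Exactly two repetitions would require "Mariaria".
SELECT 'Maria' REGEXP '^M(ari){2}a$';    -- 0
```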

[]

[xy] groups characters for matching purposes. For example, to match either the p or the r character:

The square brackets also permit a range match, for example, to match any character from a-z, [a-z] is used. Numeric ranges are also permitted.

The following does not match, as r falls outside of the range a-p.
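The bracket cases described above can be sketched as follows. The patterns are illustrative and assume a case-insensitive default collation.

```sql
-- Either p or r is accepted in the third position.
SELECT 'Maria' REGEXP 'Ma[pr]ia';     -- 1

-- Any character in the range a-z is accepted.
SELECT 'Maria' REGEXP 'Ma[a-z]ia';    -- 1

-- r falls outside the range a-p, so there is no match.
SELECT 'Maria' REGEXP 'Ma[a-p]ia';    -- 0
```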

The ^ character means does NOT match, for example:

The [ and ] characters can be matched literally inside a [] block, without escaping, as long as they immediately follow the opening bracket:

Incorrect order, so no match:

The - character can also be matched in the same way:

Word boundaries

The [[:<:]] and [[:>:]] patterns match the beginning and the end of a word respectively. For example:

Character Classes

There are a number of shortcuts to match particular preset character classes. These are matched with the [:character_class:] pattern (inside a [] set). The following character classes exist:

Character Class

Description

For example:

Remember that matches are by default case-insensitive, unless a binary string is used, so the following example, which specifically looks for an uppercase character, counter-intuitively matches a lowercase character:

Character Names

There are also a number of shortcuts to match particular preset character names. These are matched with the [.character.] pattern (inside a [] set). The following character names exist:

Name

Character

For example:

Combining

The true power of regular expressions is unleashed when the above elements are combined to form more complex patterns. The reputation of regular expressions for complexity stems from how dense combined expressions look, when in reality it is simply a matter of understanding the characters and how they apply:

The first example fails to match, as while the Ma matches, either i or r only matches once before the ia characters at the end.

This example matches, as either i or r match exactly twice after the Ma, in this case one r and one i.
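The two cases described above can be sketched with the patterns below. These are illustrative patterns chosen to fit the explanation and are not necessarily the ones the original page used.

```sql
-- Fails: after "Ma", [ir]{2} consumes "ri", leaving only "a" where "ia" is required.
SELECT 'Maria' REGEXP 'Ma[ir]{2}ia';     -- 0

-- Matches: after "Ma", exactly two characters from [ir] follow (one r, one i).
SELECT 'Maria' REGEXP 'Ma[ir]{2}';       -- 1
```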

Escaping

With the large number of special characters, care needs to be taken to properly escape characters. Two backslash characters, \\ (one for the MariaDB parser, one for the regex library), are required to properly escape a character. For example:

To match the literal (Ma:

To match r+: The first two examples are incorrect, as they match r one or more times, not r+:
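The escaping cases above can be sketched as follows. The sample strings are illustrative, and the examples assume the NO_BACKSLASH_ESCAPES SQL mode is not set.

```sql
-- Literal "(Ma": the parenthesis must be escaped with two backslashes.
SELECT '(Maria)' REGEXP '\\(Ma';       -- 1

-- Unescaped: r+ means "one or more r", so the literal "+" is not matched.
SELECT 'Mar+ia' REGEXP 'Mar+ia';       -- 0

-- One backslash is consumed by the SQL parser, leaving the same pattern.
SELECT 'Mar+ia' REGEXP 'Mar\+ia';      -- 0

-- Two backslashes: the regex library receives \+ and matches a literal "+".
SELECT 'Mar+ia' REGEXP 'Mar\\+ia';     -- 1
```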

_{This page is licensed: CC BY-SA / Gnu FDL}

SELECT 'Maria' REGEXP 'Maria';
+------------------------+
| 'Maria' REGEXP 'Maria' |
+------------------------+
|                      1 |
+------------------------+

SELECT 'Maria' REGEXP 'maria';
+------------------------+
| 'Maria' REGEXP 'maria' |
+------------------------+
|                      1 |
+------------------------+

SELECT BINARY 'Maria' REGEXP 'maria';
+-------------------------------+
| BINARY 'Maria' REGEXP 'maria' |
+-------------------------------+
|                             0 |
+-------------------------------+

SELECT 'Maria' REGEXP 'Mari';
+-----------------------+
| 'Maria' REGEXP 'Mari' |
+-----------------------+
|                     1 |
+-----------------------+

SELECT 'Mari' REGEXP 'Maria';
+-----------------------+
| 'Mari' REGEXP 'Maria' |
+-----------------------+
|                     0 |
+-----------------------+

SELECT 'Maria' REGEXP 'Monty|Maria';
+------------------------------+
| 'Maria' REGEXP 'Monty|Maria' |
+------------------------------+
|                            1 |
+------------------------------+

SELECT 'Maria' REGEXP 'Ma.ia';
+------------------------+
| 'Maria' REGEXP 'Ma.ia' |
+------------------------+
|                      1 |
+------------------------+

SELECT 'Maria' REGEXP 'Ma..ia';
+-------------------------+
| 'Maria' REGEXP 'Ma..ia' |
+-------------------------+
|                       0 |
+-------------------------+

SELECT 'Maria' REGEXP 'Mar*ia';
+-------------------------+
| 'Maria' REGEXP 'Mar*ia' |
+-------------------------+
|                       1 |
+-------------------------+

SELECT 'Maia' REGEXP 'Mar*ia';
+------------------------+
| 'Maia' REGEXP 'Mar*ia' |
+------------------------+
|                      1 |
+------------------------+

SELECT 'Marrria' REGEXP 'Mar*ia';
+---------------------------+
| 'Marrria' REGEXP 'Mar*ia' |
+---------------------------+
|                         1 |
+---------------------------+

SELECT 'Maria' REGEXP 'Mar+ia';
+-------------------------+
| 'Maria' REGEXP 'Mar+ia' |
+-------------------------+
|                       1 |
+-------------------------+

SELECT 'Maia' REGEXP 'Mar+ia';
+------------------------+
| 'Maia' REGEXP 'Mar+ia' |
+------------------------+
|                      0 |
+------------------------+

SELECT 'Marrria' REGEXP 'Mar+ia';
+---------------------------+
| 'Marrria' REGEXP 'Mar+ia' |
+---------------------------+
|                         1 |
+---------------------------+

SELECT 'Maria' REGEXP 'Mar?ia';
+-------------------------+
| 'Maria' REGEXP 'Mar?ia' |
+-------------------------+
|                       1 |
+-------------------------+

SELECT 'Maia' REGEXP 'Mar?ia';
+------------------------+
| 'Maia' REGEXP 'Mar?ia' |
+------------------------+
|                      1 |
+------------------------+

SELECT 'Marrria' REGEXP 'Mar?ia';
+---------------------------+
| 'Marrria' REGEXP 'Mar?ia' |
+---------------------------+
|                         0 |
+---------------------------+

SELECT 'Maria' REGEXP 'Ma[^p]ia';
+---------------------------+
| 'Maria' REGEXP 'Ma[^p]ia' |
+---------------------------+
|                         1 |
+---------------------------+

SELECT 'Maria' REGEXP 'Ma[^r]ia';
+---------------------------+
| 'Maria' REGEXP 'Ma[^r]ia' |
+---------------------------+
|                         0 |
+---------------------------+

SELECT '[Maria' REGEXP '[[]';
+-----------------------+
| '[Maria' REGEXP '[[]' |
+-----------------------+
|                     1 |
+-----------------------+

SELECT '[Maria' REGEXP '[]]';
+-----------------------+
| '[Maria' REGEXP '[]]' |
+-----------------------+
|                     0 |
+-----------------------+

SELECT ']Maria' REGEXP '[]]';
+-----------------------+
| ']Maria' REGEXP '[]]' |
+-----------------------+
|                     1 |
+-----------------------+

SELECT ']Maria' REGEXP '[]a]';
+------------------------+
| ']Maria' REGEXP '[]a]' |
+------------------------+
|                      1 |
+------------------------+

SELECT '-Maria' REGEXP '[1-10]';
+--------------------------+
| '-Maria' REGEXP '[1-10]' |
+--------------------------+
|                        0 |
+--------------------------+

SELECT '-Maria' REGEXP '[-1-10]';
+---------------------------+
| '-Maria' REGEXP '[-1-10]' |
+---------------------------+
|                         1 |
+---------------------------+

SELECT 'How do I upgrade MariaDB?' REGEXP '[[:<:]]MariaDB[[:>:]]';
+------------------------------------------------------------+
| 'How do I upgrade MariaDB?' REGEXP '[[:<:]]MariaDB[[:>:]]' |
+------------------------------------------------------------+
|                                                          1 |
+------------------------------------------------------------+

SELECT 'How do I upgrade MariaDB?' REGEXP '[[:<:]]Maria[[:>:]]';
+----------------------------------------------------------+
| 'How do I upgrade MariaDB?' REGEXP '[[:<:]]Maria[[:>:]]' |
+----------------------------------------------------------+
|                                                        0 |
+----------------------------------------------------------+

SELECT 'Maria' REGEXP 'Mar[[:alnum:]]*';
+----------------------------------+
| 'Maria' REGEXP 'Mar[[:alnum:]]*' |
+----------------------------------+
|                                1 |
+----------------------------------+

SELECT 'Mari' REGEXP 'Mar[[:upper:]]+';
+---------------------------------+
| 'Mari' REGEXP 'Mar[[:upper:]]+' |
+---------------------------------+
|                               1 |
+---------------------------------+

SELECT BINARY 'Mari' REGEXP 'Mar[[:upper:]]+';
+----------------------------------------+
| BINARY 'Mari' REGEXP 'Mar[[:upper:]]+' |
+----------------------------------------+
|                                      0 |
+----------------------------------------+

SELECT '|' REGEXP '[[.vertical-line.]]';
+----------------------------------+
| '|' REGEXP '[[.vertical-line.]]' |
+----------------------------------+
|                                1 |
+----------------------------------+

SELECT 'Maria' REGEXP 'Ma[ir]{2}ia';
+------------------------------+
| 'Maria' REGEXP 'Ma[ir]{2}ia' |
+------------------------------+
|                            0 |
+------------------------------+

SELECT '(Maria)' REGEXP '(Ma';
ERROR 1139 (42000): Got error 'parentheses not balanced' from regexp

SELECT '(Maria)' REGEXP '\(Ma';
ERROR 1139 (42000): Got error 'parentheses not balanced' from regexp

SELECT '(Maria)' REGEXP '\\(Ma';
+--------------------------+
| '(Maria)' REGEXP '\\(Ma' |
+--------------------------+
|                        1 |
+--------------------------+

SELECT 'Mar+ia' REGEXP 'r+';
+----------------------+
| 'Mar+ia' REGEXP 'r+' |
+----------------------+
|                    1 |
+----------------------+

SELECT 'Maria' REGEXP 'r+';
+---------------------+
| 'Maria' REGEXP 'r+' |
+---------------------+
|                   1 |
+---------------------+

SELECT 'Maria' REGEXP 'r\\+';
+-----------------------+
| 'Maria' REGEXP 'r\\+' |
+-----------------------+
|                     0 |
+-----------------------+

PCRE - Perl Compatible Regular Expressions

Understand MariaDB's regex support. This concept page explains the PCRE library integration, detailing supported syntax, character classes, and special characters.

PCRE Versions

PCRE Version

Introduced

Maturity

10.34

PCRE Enhancements

MariaDB uses the PCRE library, which significantly improves the power of the REGEXP/RLIKE operator.

The switch to PCRE added a number of features, including recursive patterns, named capture, look-ahead and look-behind assertions, non-capturing groups, non-greedy quantifiers, Unicode character properties, extended syntax for characters and character classes, multi-line matching, and many others.

These functions work with regular expressions: REGEXP_REPLACE(), REGEXP_INSTR(), and REGEXP_SUBSTR().

Also, REGEXP/RLIKE and the new functions work correctly with all multi-byte character sets supported by MariaDB, including East-Asian character sets (big5, gb2312, gbk, eucjp, eucjpms, cp932, ujis, euckr), and Unicode character sets (utf8, utf8mb4, ucs2, utf16, utf16le, utf32).

New Regular Expression Functions

- REGEXP_REPLACE() - Replaces all occurrences of a pattern.
- REGEXP_INSTR() - Position of the first appearance of a regex.
- REGEXP_SUBSTR() - Returns the matching part of a string.

See the individual articles for more details and examples.

PCRE Syntax

In most cases PCRE is backward compatible with the old POSIX 1003.2 compliant regexp library (see Regular Expressions Overview), so you won't need to change your applications that use SQL queries with the REGEXP/RLIKE predicate.

This section briefly describes the most important extended PCRE features. For more details, please refer to the documentation on the PCRE site, or to the documentation which is bundled in the /pcre/doc/html/ directory of a MariaDB sources distribution. The pages pcresyntax.html and pcrepattern.html should be a good start. Regular-Expressions.info is another good resource to learn about PCRE and regular expressions generally.

Special Characters

PCRE supports the following escape sequences to match special characters:

Sequence

Description

Note that the backslash characters (here, and in all examples in the sections below) must be escaped with another backslash, unless you're using the NO_BACKSLASH_ESCAPES SQL mode.

This example tests if a character has hex code 0x61:
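A minimal sketch of such a test, assuming the default sql_mode (so the backslash is doubled inside the string literal):

```sql
-- \x{0061} is the hexadecimal code of the letter 'a'
SELECT 'a' RLIKE '\\x{0061}';
-- returns 1
```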

Character Classes

PCRE supports the standard POSIX character classes such as alnum, alpha, blank, cntrl, digit, graph, lower, print, punct, space, upper, xdigit, with the following additional classes:

Class

Description

This example checks if the string consists of ASCII characters only:
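A sketch of such a check, with 'abc' as an illustrative subject string:

```sql
-- [[:ascii:]] matches character codes 0 to 127
SELECT 'abc' RLIKE '^[[:ascii:]]+$';
-- returns 1
```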

Generic Character Types

Generic character types complement the POSIX character classes and serve to simplify writing patterns:

Class

Description

This example checks if the string consists of "word" characters only:
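A sketch of such a check, assuming the default sql_mode and an illustrative subject string:

```sql
-- \w matches a "word" character: a letter, a digit, or underscore
SELECT 'abc' RLIKE '^\\w+$';
-- returns 1
```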

Unicode Character Properties

\p{xx} is a character with the xx property, and \P{xx} is a character without the xx property.

The property names represented by xx above are limited to the Unicode script names, the general category properties, and "Any", which matches any character (including newline). Those that are not part of an identified script are lumped together as "Common".

General Category Properties For \p and \P

Property

Description

This example checks if the string consists only of characters with property N (number):
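A sketch of such a check, assuming a utf8mb4 connection so that the non-ASCII number characters in the literal reach the server intact:

```sql
-- '1' is a decimal digit; the fraction and the circled digit are "other numbers"
SELECT '1¼①' RLIKE '^\\p{N}+$';
-- returns 1
```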

Special Category Properties For \p and \P

Property

Description

The property Xuc matches any character that can be represented by a Universal Character Name (in C++ and other programming languages). These include $, @, ` (grave accent), and all characters with Unicode code points greater than or equal to U+00A0, excluding the surrogates U+D800..U+DFFF.

Script Names For \p and \P

Arabic, Armenian, Avestan, Balinese, Bamum, Batak, Bengali, Bopomofo, Brahmi, Braille, Buginese, Buhid, Canadian_Aboriginal, Carian, Chakma, Cham, Cherokee, Common, Coptic, Cuneiform, Cypriot, Cyrillic, Deseret, Devanagari, Egyptian_Hieroglyphs, Ethiopic, Georgian, Glagolitic, Gothic, Greek, Gujarati, Gurmukhi, Han, Hangul, Hanunoo, Hebrew, Hiragana, Imperial_Aramaic, Inherited, Inscriptional_Pahlavi, Inscriptional_Parthian, Javanese, Kaithi, Kannada, Katakana, Kayah_Li, Kharoshthi, Khmer, Lao, Latin, Lepcha, Limbu, Linear_B, Lisu, Lycian, Lydian, Malayalam, Mandaic, Meetei_Mayek, Meroitic_Cursive, Meroitic_Hieroglyphs, Miao, Mongolian, Myanmar, New_Tai_Lue, Nko, Ogham, Old_Italic, Old_Persian, Old_South_Arabian, Old_Turkic, Ol_Chiki, Oriya, Osmanya, Phags_Pa, Phoenician, Rejang, Runic, Samaritan, Saurashtra, Sharada, Shavian, Sinhala, Sora_Sompeng, Sundanese, Syloti_Nagri, Syriac, Tagalog, Tagbanwa, Tai_Le, Tai_Tham, Tai_Viet, Takri, Tamil, Telugu, Thaana, Thai, Tibetan, Tifinagh, Ugaritic, Vai, Yi.

This example checks if the string consists only of Greek characters:
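A sketch of such a check, again assuming a utf8mb4 connection; the subject string is illustrative:

```sql
-- \p{Greek} matches any character that belongs to the Greek script
SELECT 'ΣΦΩ' RLIKE '^\\p{Greek}+$';
-- returns 1
```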

Extended Unicode Grapheme Sequence

The \X escape sequence matches a character sequence that makes an "extended grapheme cluster", i.e. a composite character that consists of multiple Unicode code points.

One example of a composite character is a letter followed by non-spacing accent marks. This example demonstrates that U+0045 LATIN CAPITAL LETTER E followed by U+0302 COMBINING CIRCUMFLEX ACCENT followed by U+0323 COMBINING DOT BELOW together form an extended grapheme cluster:

See the PCRE documentation for the other types of extended grapheme clusters.
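A sketch of such a test, assuming the subject is supplied as a ucs2 hex literal holding the three code points in order:

```sql
-- 0x0045 0x0302 0x0323: E + combining circumflex + combining dot below
SELECT _ucs2 0x004503020323 RLIKE '^\\X$';
-- returns 1: the whole sequence is matched by a single \X
```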

Simple Assertions

An assertion specifies a certain condition that must match at a particular point, but without consuming characters from the subject string. In addition to the standard POSIX simple assertions ^ (that matches at the beginning of a line) and $ (that matches at the end of a line), PCRE supports a number of other assertions:

Assertion

Description

This example cuts a word that consists only of 3 characters from a string:
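A sketch of such a query, using REGEXP_SUBSTR() and an illustrative subject string:

```sql
-- 'abcd' has four word characters, so only 'xyz' is bounded by \b on both sides
SELECT REGEXP_SUBSTR('---abcd---xyz---', '\\b\\w{3}\\b');
-- returns 'xyz'
```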

Notice that the two \b assertions checked the word boundaries but did not get into the matching pattern.

The \b assertions work well in the beginning and the end of the subject string:
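A sketch where the word boundaries coincide with the start and the end of the subject:

```sql
SELECT REGEXP_SUBSTR('xyz', '\\b\\w{3}\\b');
-- returns 'xyz'
```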

By default, the ^ and $ assertions have the same meaning as \A, \Z, and \z. However, the meanings of ^ and $ can change in multiline mode (see below). By contrast, the meanings of \A, \Z, and \z are always the same; they are independent of the multiline mode.

Option Setting

A number of options that control the default match behavior can be changed within the pattern by a sequence of option letters enclosed between (? and ).

Option

Description

For example, (?im) sets case-insensitive multiline matching.

A hyphen followed by the option letters unsets the options. For example, (?-im) means case-sensitive single-line matching.

A combined setting and unsetting is also possible, e.g. (?im-sx).

If an option is set outside of subpattern parentheses, the option applies to the remainder of the pattern that follows the option. If an option is set inside a subpattern, it applies to the part of this subpattern that follows the option.

In this example the pattern (?i)m((?-i)aria)db matches the words MariaDB, Mariadb, mariadb, but not MARIADB:
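A sketch of that comparison; the column aliases c1 to c4 are illustrative:

```sql
SELECT
  'MariaDB' RLIKE '(?i)m((?-i)aria)db' AS c1,
  'Mariadb' RLIKE '(?i)m((?-i)aria)db' AS c2,
  'mariadb' RLIKE '(?i)m((?-i)aria)db' AS c3,
  'MARIADB' RLIKE '(?i)m((?-i)aria)db' AS c4;
-- c1, c2, c3 return 1; c4 returns 0 because 'ARIA' is compared case-sensitively
```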

This example demonstrates that the (?x) option makes the regexp engine ignore all white spaces in the pattern (other than in a class).
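A minimal sketch with an illustrative two-letter subject:

```sql
-- the space between 'a' and 'b' in the pattern is ignored under (?x)
SELECT 'ab' RLIKE '(?x)a b';
-- returns 1
```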

Note, putting spaces into a pattern in combination with the (?x) option can be useful to split different logical parts of a complex pattern, to make it more readable.

Multiline Matching

Multiline matching changes the meaning of ^ and $ from "the beginning of the subject string" and "the end of the subject string" to "the beginning of any line in the subject string" and "the end of any line in the subject string" respectively.

This example checks if the subject string contains two consecutive lines that fully consist of digits:
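A sketch of such a check, assuming the default sql_mode so that \n inside the subject literal is a linefeed:

```sql
-- ^ matches before '123', \R matches the line break, $ matches after '456'
SELECT 'abc\n123\n456\nxyz\n' RLIKE '(?m)^\\d+\\R\\d+$';
-- returns 1
```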

Notice the (?m) option in the beginning of the pattern, which switches to the multiline matching mode.

Newline Conventions

PCRE supports five line break conventions:

CR (\r) - a single carriage return character
LF (\n) - a single linefeed character
CRLF (\r\n) - a carriage return followed by a linefeed
any of the previous three
any Unicode newline sequence

By default, the newline convention is set to any Unicode newline sequence, which includes:

Sequence

Description

The newline convention can be set by starting a pattern with one of the following sequences:

Sequence

Description

The newline convention affects the ^ and $ assertions, the interpretation of the dot metacharacter, and the behavior of \N.

Note that the newline convention does not affect the meaning of \R.

This example demonstrates that the dot metacharacter matches \n, because it is not a newline sequence anymore:
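A sketch of the effect, assuming the (*CR) prefix is used to make a carriage return the only newline sequence, and that DOTALL is not enabled in default_regex_flags:

```sql
-- with (*CR), a linefeed is an ordinary character, so the dot matches it
SELECT 'a\nb' RLIKE '(*CR)a.b';
-- returns 1

-- without the prefix, the linefeed is a newline and the dot does not match it
SELECT 'a\nb' RLIKE 'a.b';
-- returns 0
```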

Newline Sequences

By default, the \R escape sequence matches any Unicode newline sequence.

The meaning of \R can be set by starting a pattern with one of the following sequences:

Sequence

Description

Comments

It's possible to include comments inside a pattern. Comments do not participate in the pattern matching. Comments start at the (?# sequence and continue up to the next closing parenthesis:
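A minimal sketch of a pattern with an embedded comment; the subject string and the comment text are illustrative:

```sql
-- the (?#...) group is skipped by the matcher
SELECT 'ab12' RLIKE 'ab(?#expect digits)12';
-- returns 1
```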

Quoting

POSIX uses the backslash to remove a special meaning from a character. PCRE introduces a syntax to remove special meaning from a sequence of characters. The characters inside \Q ... \E are treated literally, without their special meaning.

This example checks if the string matches a dollar sign followed by a parenthesized name (a variable reference in some languages):
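A sketch of such a check, assuming the default sql_mode; the subject '$(abc)' is illustrative:

```sql
-- \Q$(\E and \Q)\E are literal text; \w+ is the name; the final $ is the end anchor
SELECT '$(abc)' RLIKE '^\\Q$(\\E\\w+\\Q)\\E$';
-- returns 1
```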

Note that the leftmost dollar sign and the parentheses are used literally, while the rightmost dollar sign is still used to match the end of the string.

Resetting the Match Start

The escape sequence \K causes any previously matched characters to be excluded from the final matched sequence. For example, the pattern: (foo)\Kbar matches foobar, but reports that it has matched bar. This feature is similar to a look-behind assertion. However, in this case, the part of the subject before the real match does not have to be of fixed length:
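A sketch with a variable-length prefix, which a look-behind assertion could not express; the subject string is illustrative:

```sql
-- the letters are matched first, then \K drops them from the reported match
SELECT REGEXP_SUBSTR('aaa123', '[a-z]*\\K[0-9]*');
-- returns '123'
```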

Non-Capturing Groups

The question mark and the colon after the opening parenthesis create a non-capturing group: (?:...).

This example removes an optional article from a word, for example for better sorting of the results.
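A sketch of such a replacement. The (?i) option is added here as an assumption, so that the result does not depend on the collation of the string literals:

```sql
-- (?:the|an|a) groups the articles without capturing, so the word is group 1
SELECT REGEXP_REPLACE('The King', '(?i)(?:the|an|a)[^a-z]([a-z]+)', '\\1');
-- returns 'King'
```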

Note that the articles are listed inside the left parentheses using the alternation operator |, but they do not produce a captured subpattern, so the word following the article is referenced by '\\1' in the third argument to the function. Non-capturing groups are useful to avoid spending capture numbers on subpatterns that won't be used in the third argument of REGEXP_REPLACE(), as well as for performance purposes.

Non-Greedy Quantifiers

By default, the repetition quantifiers ?, *, + and {n,m} are "greedy", that is, they try to match as much as possible. Adding a question mark after a repetition quantifier makes it "non-greedy", so the pattern matches the minimum number of times possible.

This example cuts C comments from a line:
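A sketch of such a replacement, with an illustrative line of C code as the subject:

```sql
-- .*? stops at the first closing */ instead of the last one
SELECT REGEXP_REPLACE('/* Comment1 */ i+= 1; /* Comment2 */', '/[*].*?[*]/', '');
-- returns ' i+= 1; '
```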

The pattern without the non-greedy flag to the quantifier /[*].*[*]/ would match the entire string between the leftmost /* and the rightmost */.

Atomic Groups

A sequence inside (?>...) makes an atomic group. Backtracking inside an atomic group is prevented once it has matched; however, backtracking past it to the previous items works normally.

Consider the pattern \d+foo applied to the subject string 123bar. Once the engine scans 123 and fails on the letter b, it would normally backtrack to 2 and try to match again, then fail and backtrack to 1 and try to match and fail again, and finally fail the entire pattern. In case of an atomic group (?>\d+)foo with the same subject string 123bar, the engine gives up immediately after the first failure to match foo. An atomic group with a quantifier can match all or nothing.

Atomic groups produce faster false results (i.e. in case when a long subject string does not match the pattern), because the regexp engine saves performance on backtracking. However, don't hurry to put everything into atomic groups. This example demonstrates the difference between atomic and non-atomic match:
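A sketch of the comparison; the column aliases are illustrative:

```sql
SELECT
  'abcc' RLIKE 'a(?>bc|b)c' AS atomic1,
  'abc'  RLIKE 'a(?>bc|b)c' AS atomic2,
  'abcc' RLIKE 'a(bc|b)c'   AS standard1,
  'abc'  RLIKE 'a(bc|b)c'   AS standard2;
-- atomic1 = 1, atomic2 = 0, standard1 = 1, standard2 = 1
```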

The non-atomic pattern matches both abcc and abc, while the atomic pattern matches abcc only.

The atomic group (?>bc|b) in the above example can be "translated" as "if there is bc, then don't try to match as b". So b can match only if bc is not found.

Atomic groups are not capturing. To make an atomic group capturing, put it into parentheses:
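A sketch that wraps the atomic group in an extra pair of parentheses and references the capture in REGEXP_REPLACE(); the angle brackets in the replacement are only there to make the captured text visible:

```sql
-- the outer parentheses capture what the atomic group matched
SELECT REGEXP_REPLACE('abcc', 'a((?>bc|b))c', '<\\1>');
-- returns '<bc>'
```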

Possessive quantifiers

An atomic group which ends with a quantifier can be rewritten using the so-called "possessive quantifier" syntax by putting an additional + sign after the quantifier.

The pattern (?>\d+)foo from the previous section's example can be rewritten as \d++foo.
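A sketch showing that both spellings give the same results on illustrative subjects:

```sql
SELECT
  '123foo' RLIKE '(?>\\d+)foo' AS atomic_hit,
  '123foo' RLIKE '\\d++foo'    AS possessive_hit,
  '123bar' RLIKE '(?>\\d+)foo' AS atomic_miss,
  '123bar' RLIKE '\\d++foo'    AS possessive_miss;
-- the two *_hit columns return 1, the two *_miss columns return 0
```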

Absolute and Relative Numeric Backreferences

Backreferences match the same text as previously matched by a capturing group. Backreferences can be written using:

a backslash followed by a digit
the \g escape sequence followed by a positive or negative number
the \g escape sequence followed by a positive or negative number enclosed in braces

The following backreferences are identical and refer to the first capturing group:

\1
\g1
\g{1}

This example demonstrates a pattern that matches "sense and sensibility" and "response and responsibility", but not "sense and responsibility":
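A sketch of such a pattern; the column aliases s1 to s3 are illustrative:

```sql
-- \1 must repeat exactly what group 1 captured ('sens' or 'respons')
SELECT
  'sense and sensibility'       RLIKE '(sens|respons)e and \\1ibility' AS s1,
  'response and responsibility' RLIKE '(sens|respons)e and \\1ibility' AS s2,
  'sense and responsibility'    RLIKE '(sens|respons)e and \\1ibility' AS s3;
-- s1 = 1, s2 = 1, s3 = 0
```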

This example removes doubled words that can unintentionally creep in when you edit a text in a text editor:
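A sketch of such a clean-up with an illustrative subject string:

```sql
-- a word, whitespace, and the same word again are replaced by the single word
SELECT REGEXP_REPLACE('using using the the regexp regexp', '\\b(\\w+)\\s+\\1\\b', '\\1');
-- returns 'using the regexp'
```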

Note that all double words were removed, in the beginning, in the middle and in the end of the subject string.

A negative number in a \g sequence means a relative reference. Relative references can be helpful in long patterns, and also in patterns that are created by joining fragments together that contain references within themselves. The sequence \g{-1} is a reference to the most recently started capturing subpattern before \g.

In this example \g{-1} is equivalent to \2:
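A sketch with two nested capturing groups; the subject string and aliases are illustrative:

```sql
-- group 2 is the most recently started group before the reference
SELECT
  'abc123def123' RLIKE '(abc(123)def)\\g{-1}' AS relative_ref,
  'abc123def123' RLIKE '(abc(123)def)\\2'     AS absolute_ref;
-- both columns return 1
```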

Named Subpatterns and Backreferences

Using numeric backreferences for capturing groups can be hard to track in a complicated regular expression. Also, the numbers can change if an expression is modified. To overcome these difficulties, PCRE supports named subpatterns.

A subpattern can be named in one of three ways: (?<name>...) or (?'name'...) as in Perl, or (?P<name>...) as in Python. References to capturing subpatterns from other parts of the pattern can be made by name as well as by number.

Backreferences to a named subpattern can be written using the .NET syntax \k{name}, the Perl syntax \k<name> or \k'name' or \g{name}, or the Python syntax (?P=name).

This example tests if the string is a correct HTML tag:
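A simplified sketch of such a test. It only checks that the closing tag name repeats the opening one and is not a full HTML validator; the group name tag is illustrative:

```sql
-- (?<tag>...) names the opening tag; (?P=tag) requires the same name in the closing tag
SELECT '<a href="../">Up</a>' RLIKE '<(?<tag>[a-z][a-z0-9]*)[^>]*>[^<]*</(?P=tag)>';
-- returns 1
```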

Positive and Negative Look-Ahead and Look-Behind Assertions

Look-ahead and look-behind assertions serve to specify the context for the searched regular expression pattern. Note that the assertions only check the context, they do not capture anything themselves!

This example finds the letter which is not followed by another letter (negative look-ahead):
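A sketch with the illustrative subject 'ab1':

```sql
-- 'a' is followed by a letter, 'b' is not
SELECT REGEXP_SUBSTR('ab1', '[a-z](?![a-z])');
-- returns 'b'
```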

This example finds the letter which is followed by a digit (positive look-ahead):
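A sketch with the illustrative subject 'ab1':

```sql
-- only 'b' is immediately followed by a digit; the digit is not part of the result
SELECT REGEXP_SUBSTR('ab1', '[a-z](?=[0-9])');
-- returns 'b'
```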

This example finds the letter which does not follow a digit character (negative look-behind):
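A sketch with the illustrative subject '1ab':

```sql
-- 'a' follows the digit '1', so the first letter that qualifies is 'b'
SELECT REGEXP_SUBSTR('1ab', '(?<![0-9])[a-z]');
-- returns 'b'
```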

This example finds the letter which follows another letter character (positive look-behind):
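A sketch with the illustrative subject '1ab':

```sql
-- 'b' is the first letter that is preceded by another letter
SELECT REGEXP_SUBSTR('1ab', '(?<=[a-z])[a-z]');
-- returns 'b'
```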

Note that look-behind assertions can only be of fixed length; you cannot have repetition operators or alternations with different lengths:
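A sketch of a pattern that violates this rule. The exact error text depends on the PCRE version in use, so it is not reproduced here:

```sql
-- a+ has no fixed length inside the look-behind, so the pattern is rejected
-- (expected to fail with ERROR 1139 instead of returning a row)
SELECT 'aab' RLIKE '(?<=a+)b';
```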

Subroutine Reference and Recursive Patterns

PCRE supports a special syntax to recurse into the entire pattern or its individual subpatterns:

Syntax

Description

This example checks for a correct additive arithmetic expression consisting of numbers, unary plus and minus, binary plus and minus, and parentheses:
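A sketch of such a check, assuming the default sql_mode; the subject expression and the alias expression_is_ok are illustrative:

```sql
-- group 1 wraps the whole expression; (?1) re-enters it inside parentheses
-- and after each binary sign
SELECT '-1+2-3+(+(4-1)+(-2)+(+1))'
       RLIKE '^(([+-]?(\\d+|[(](?1)[)]))(([+-](?1))*))$' AS expression_is_ok;
-- returns 1
```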

The recursion is done using (?1) to call for the first parenthesized subpattern, which includes everything except the leading ^ and the trailing $.

The regular expression in the above example implements the following BNF grammar:

<expression> ::= <term> [(<sign> <term>)...]
<term> ::= [ <sign> ] <primary>
<primary> ::= <number> | <left paren> <expression> <right paren>

Defining Subpatterns For Use By Reference

Use the (?(DEFINE)...) syntax to define subpatterns that can be referenced from elsewhere.

This example defines a subpattern with the name letters that matches one or more letters, which is further reused two times:
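A sketch of such a pattern with an illustrative subject string:

```sql
-- (?(DEFINE)...) matches nothing itself; (?&letters) calls the named subpattern
SELECT 'abc123xyz' RLIKE '^(?(DEFINE)(?<letters>[a-z]+))(?&letters)[0-9]+(?&letters)$';
-- returns 1
```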

The above example can also be rewritten to define the digit part as a subpattern as well:
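A sketch of that variant; the name digits is an illustrative choice:

```sql
-- both building blocks are defined first and then called by name
SELECT 'abc123xyz' RLIKE '^(?(DEFINE)(?<letters>[a-z]+)(?<digits>[0-9]+))(?&letters)(?&digits)(?&letters)$';
-- returns 1
```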

Conditional Subpatterns

There are two forms of conditional subpatterns:

(?(condition)yes-pattern)
(?(condition)yes-pattern|no-pattern)

The yes-pattern is used if the condition is satisfied, otherwise the no-pattern (if any) is used.

Conditions With Subpattern References

If a condition consists of a number, it makes a condition with a subpattern reference. Such a condition is true if a capturing subpattern corresponding to the number has previously matched.

This example finds an optionally parenthesized number in a string:
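A sketch of such a search on two illustrative subjects:

```sql
-- (?(1)[)]) requires a closing parenthesis only if group 1 matched an opening one
SELECT
  REGEXP_SUBSTR('a(123)b', '([(])?[0-9]+(?(1)[)])') AS with_parens,
  REGEXP_SUBSTR('a123b',   '([(])?[0-9]+(?(1)[)])') AS without_parens;
-- with_parens = '(123)', without_parens = '123'
```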

The ([(])? part makes a capturing subpattern that matches an optional opening parenthesis; the [0-9]+ part matches a number, and the (?(1)[)]) part matches a closing parenthesis, but only if the opening parenthesis has been previously found.

Other Kinds of Conditions

The other possible condition kinds are recursion references and assertions. See the PCRE documentation for details.

Matching Zero Bytes (0x00)

PCRE correctly works with zero bytes in the subject strings:
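A minimal sketch, assuming the default sql_mode so that \0 in the subject literal produces a zero byte:

```sql
-- the dot matches the zero byte between 'a' and 'b'
SELECT 'a\0b' RLIKE '^a.b$';
-- returns 1
```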

Zero bytes, however, are not supported literally in the pattern strings and should be escaped using the \xhh or \x{hh} syntax:
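A sketch of both escaped forms; the column aliases are illustrative:

```sql
-- the pattern contains the text \x00 or \x{00}, not a raw zero byte
SELECT
  'a\0b' RLIKE '^a\\x00b$'   AS short_form,
  'a\0b' RLIKE '^a\\x{00}b$' AS braced_form;
-- both columns return 1
```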

Other PCRE Features

PCRE provides other extended features that were not covered in this document, such as duplicate subpattern numbers, backtracking control, breaking utf-8 sequences into individual bytes, setting the match limit, setting the recursion limit, optimization control, recursion conditions, assertion conditions and more types of extended grapheme clusters. Please refer to the PCRE documentation for details.

Enhanced regex was implemented as a GSoC 2013 project by Sudheera Palihakkara.

default_regex_flags Examples

The default_regex_flags variable was introduced to address the remaining incompatibilities between PCRE and the old regex library. Here are some examples of its usage:

The default behaviour (multiline match is off)
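A sketch of the default case, assuming default_regex_flags is empty and the default sql_mode is in effect:

```sql
-- ^ and $ only match at the start and the end of the whole subject
SELECT 'a\nb\nc' RLIKE '^b$';
-- returns 0
```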

Enabling the multiline option using the PCRE option syntax:
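A sketch of the same query with the inline option:

```sql
-- (?m) lets ^ and $ match around the middle line
SELECT 'a\nb\nc' RLIKE '(?m)^b$';
-- returns 1
```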

Enabling the multiline option using default_regex_flags:
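A sketch that sets the flag for the current session instead of changing the pattern:

```sql
SET default_regex_flags='MULTILINE';
-- the pattern is unchanged, but ^ and $ now match at line boundaries
SELECT 'a\nb\nc' RLIKE '^b$';
-- returns 1
```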
