Implement full-text indexes in MariaDB Server for efficient text search. This section guides you through creating and utilizing these indexes to optimize queries on large text datasets.
Stopwords are used to provide a list of commonly-used words that can be ignored for the purposes of Full-text-indexes.
Full-text indexes built in MyISAM and InnoDB have different stopword lists by default.
For full-text indexes on MyISAM tables, by default, the list is built from the file storage/myisam/ft_static.c, and searched using the server's character set and collation. The ft_stopword_file system variable allows the default list to be overridden with words from another file, or for stopwords to be ignored altogether.
If the stopword list is changed, any existing full-text indexes need to be rebuilt
The following table shows the default list of stopwords, although you should always treat storage/myisam/ft_static.c as the definitive list. See the for more details, and for related articles.
Stopwords on full-text indexes are only enabled if the system variable is set (by default it is) at the time the index was created.
The stopword list is determined as follows:
If the system variable is set, that table is used as a stopword list.
If innodb_ft_user_stopword_table is not set, the table set by is used.
If neither variable is set, the built-in list is used, which can be viewed by querying the in the .
In the first two cases, the specified table must exist at the time the system variable is set and the full-text index created. It must be an InnoDB table with a single column, a named VALUE.
The default InnoDB stopword list differs from the default MyISAM list, being much shorter, and contains the following words:
This page is licensed: CC BY-SA / Gnu FDL
almost
alone
along
already
also
although
always
am
among
amongst
an
and
another
any
anybody
anyhow
anyone
anything
anyway
anyways
anywhere
apart
appear
appreciate
appropriate
are
aren't
around
as
aside
ask
asking
associated
at
available
away
awfully
be
became
because
become
becomes
becoming
been
before
beforehand
behind
being
believe
below
beside
besides
best
better
between
beyond
both
brief
but
by
c'mon
c's
came
can
can't
cannot
cant
cause
causes
certain
certainly
changes
clearly
co
com
come
comes
concerning
consequently
consider
considering
contain
containing
contains
corresponding
could
couldn't
course
currently
definitely
described
despite
did
didn't
different
do
does
doesn't
doing
don't
done
down
downwards
during
each
edu
eg
eight
either
else
elsewhere
enough
entirely
especially
et
etc
even
ever
every
everybody
everyone
everything
everywhere
ex
exactly
example
except
far
few
fifth
first
five
followed
following
follows
for
former
formerly
forth
four
from
further
furthermore
get
gets
getting
given
gives
go
goes
going
gone
got
gotten
greetings
had
hadn't
happens
hardly
has
hasn't
have
haven't
having
he
he's
hello
help
hence
her
here
here's
hereafter
hereby
herein
hereupon
hers
herself
hi
him
himself
his
hither
hopefully
how
howbeit
however
i'd
i'll
i'm
i've
ie
if
ignored
immediate
in
inasmuch
inc
indeed
indicate
indicated
indicates
inner
insofar
instead
into
inward
is
isn't
it
it'd
it'll
it's
its
itself
just
keep
keeps
kept
know
knows
known
last
lately
later
latter
latterly
least
less
lest
let
let's
like
liked
likely
little
look
looking
looks
ltd
mainly
many
may
maybe
me
mean
meanwhile
merely
might
more
moreover
most
mostly
much
must
my
myself
name
namely
nd
near
nearly
necessary
need
needs
neither
never
nevertheless
new
next
nine
no
nobody
non
none
noone
nor
normally
not
nothing
novel
now
nowhere
obviously
of
off
often
oh
ok
okay
old
on
once
one
ones
only
onto
or
other
others
otherwise
ought
our
ours
ourselves
out
outside
over
overall
own
particular
particularly
per
perhaps
placed
please
plus
possible
presumably
probably
provides
que
quite
qv
rather
rd
re
really
reasonably
regarding
regardless
regards
relatively
respectively
right
said
same
saw
say
saying
says
second
secondly
see
seeing
seem
seemed
seeming
seems
seen
self
selves
sensible
sent
serious
seriously
seven
several
shall
she
should
shouldn't
since
six
so
some
somebody
somehow
someone
something
sometime
sometimes
somewhat
somewhere
soon
sorry
specified
specify
specifying
still
sub
such
sup
sure
t's
take
taken
tell
tends
th
than
thank
thanks
thanx
that
that's
thats
the
their
theirs
them
themselves
then
thence
there
there's
thereafter
thereby
therefore
therein
theres
thereupon
these
they
they'd
they'll
they're
they've
think
third
this
thorough
thoroughly
those
though
three
through
throughout
thru
thus
to
together
too
took
toward
towards
tried
tries
truly
try
trying
twice
two
un
under
unfortunately
unless
unlikely
until
unto
up
upon
us
use
used
useful
uses
using
usually
value
various
very
via
viz
vs
want
wants
was
wasn't
way
we
we'd
we'll
we're
we've
welcome
well
went
were
weren't
what
what's
whatever
when
whence
whenever
where
where's
whereafter
whereas
whereby
wherein
whereupon
wherever
whether
which
while
whither
who
who's
whoever
whole
whom
whose
why
will
willing
wish
with
within
without
won't
wonder
would
wouldn't
yes
yet
you
you'd
you'll
you're
you've
your
yours
yourself
yourselves
zero
is
it
la
of
on
or
that
the
this
to
was
what
when
where
who
will
with
und
the
www
a's
able
about
above
according
accordingly
across
actually
after
afterwards
again
against
ain't
all
allow
allows
a
about
an
are
as
at
be
by
com
de
en
for
from
how
i
in
MariaDB has support for full-text indexing and searching:
A full-text index in MariaDB is an index of type FULLTEXT, and it allows more options when searching for portions of text from a field.
Partitioned tables cannot contain fulltext indexes, even if the storage engine supports them.
A FULLTEXT index definition can be given in the statement when a table is created, or added later using or .
For large data sets, it is much faster to load your data into a table that has no FULLTEXT index and then create the index after that, than to load data into a table that has an existing FULLTEXT index.
Full-text searching is performed using syntax. MATCH() takes a comma-separated list that names the columns to be searched. AGAINST takes a string to search for, and an optional modifier that indicates what type of search to perform. The search string must be a literal string, not a variable or a column name.
Partial words are excluded.
Words less than 4 (MyISAM) or 3 (InnoDB) characters in length will not be stored in the fulltext index. This value can be adjusted by changing the system variable (or, for , ).
Words longer than 84 characters in length will also not be stored in the fulltext index. This values can be adjusted by changing the system variable (or, for , ).
MariaDB calculates a relevance for each result, based on a number of factors, including the number of words in the index, the number of unique words in a row, the total number of words in both the index and the result, and the weight of the word. In English, 'cool' will be weighted less than 'dandy', at least at present! The relevance can be returned as part of a query simply by using the MATCH function in the field list.
IN NATURAL LANGUAGE MODE is the default type of full-text search, and the keywords can be omitted. There are no special operators, and searches consist of one or more comma-separated keywords.
Searches are returned in descending order of relevance.
Boolean search permits the use of a number of special operators:
Searches are not returned in order of relevance, and nor does the 50% limit apply. Stopwords and word minimum and maximum lengths still apply as usual.
A query expansion search is a modification of a natural language search. The search string is used to perform a regular natural language search. Then, words from the most relevant rows returned by the search are added to the search string and the search is done again. The query returns the rows from the second search. The IN NATURAL LANGUAGE MODE WITH QUERY EXPANSION or WITH QUERY EXPANSION modifier specifies a query expansion search. It can be useful when relying on implied knowledge within the data, for example that MariaDB is a database.
Creating a table, and performing a basic search:
Multiple words:
Since 'Once' is a , no result is returned:
Inserting the word 'wicked' into more than half the rows excludes it from the results:
Using IN BOOLEAN MODE to overcome the 50% limitation:
Returning the relevance:
WITH QUERY EXPANSION. In the following example, 'MariaDB' is always associated with the word 'database', so it is returned when query expansion is used, even though not explicitly requested.
Partial word matching with IN BOOLEAN MODE:
Using boolean operators
For simpler searches of a substring in text columns, see the operator.
This page is licensed: CC BY-SA / Gnu FDL
For MyISAM/Aria fulltext indexes only, if a word appears in more than half the rows, it is also excluded from the results of a fulltext search.
For InnoDB indexes, only committed rows appear - modifications from the current transaction do not apply.
*
The wildcard, indicating zero or more characters. It can only appear at the end of a word.
"
Anything enclosed in the double quotes is taken as a whole (so you can match phrases, for example).
+
The word is mandatory in all rows returned.
-
The word cannot appear in any row returned.
<
The word that follows has a lower relevance than other words, although rows containing it will still match
>
The word that follows has a higher relevance than other words.
()
Used to group words into subexpressions.
~
The word following contributes negatively to the relevance of the row (which is different to the '-' operator, which specifically excludes the word, or the '<' operator, which still causes the word to contribute positively to the relevance of the row.
MATCH (col1,col2,...) AGAINST (expr [search_modifier])CREATE TABLE ft_myisam(copy TEXT,FULLTEXT(copy)) ENGINE=MyISAM;
INSERT INTO ft_myisam(copy) VALUES ('Once upon a time'),
('There was a wicked witch'), ('Who ate everybody up');
SELECT * FROM ft_myisam WHERE MATCH(copy) AGAINST('wicked');
+--------------------------+
| copy |
+--------------------------+
| There was a wicked witch |
+--------------------------+SELECT * FROM ft_myisam WHERE MATCH(copy) AGAINST('wicked,witch');
+---------------------------------+
| copy |
+---------------------------------+
| There was a wicked witch |
+---------------------------------+SELECT * FROM ft_myisam WHERE MATCH(copy) AGAINST('Once');
Empty set (0.00 sec)INSERT INTO ft_myisam(copy) VALUES ('Once upon a wicked time'),
('There was a wicked wicked witch'), ('Who ate everybody wicked up');
SELECT * FROM ft_myisam WHERE MATCH(copy) AGAINST('wicked');
Empty set (0.00 sec)SELECT * FROM ft_myisam WHERE MATCH(copy) AGAINST('wicked' IN BOOLEAN MODE);
+---------------------------------+
| copy |
+---------------------------------+
| There was a wicked witch |
| Once upon a wicked time |
| There was a wicked wicked witch |
| Who ate everybody wicked up |
+---------------------------------+SELECT copy,MATCH(copy) AGAINST('witch') AS relevance
FROM ft_myisam WHERE MATCH(copy) AGAINST('witch');
+---------------------------------+--------------------+
| copy | relevance |
+---------------------------------+--------------------+
| There was a wicked witch | 0.6775632500648499 |
| There was a wicked wicked witch | 0.5031757950782776 |
+---------------------------------+--------------------+CREATE TABLE ft2(copy TEXT,FULLTEXT(copy)) ENGINE=MyISAM;
INSERT INTO ft2(copy) VALUES
('MySQL vs MariaDB database'),
('Oracle vs MariaDB database'),
('PostgreSQL vs MariaDB database'),
('MariaDB overview'),
('Foreign keys'),
('Primary keys'),
('Indexes'),
('Transactions'),
('Triggers');
SELECT * FROM ft2 WHERE MATCH(copy) AGAINST('database');
+--------------------------------+
| copy |
+--------------------------------+
| MySQL vs MariaDB database |
| Oracle vs MariaDB database |
| PostgreSQL vs MariaDB database |
+--------------------------------+
3 rows in set (0.00 sec)
SELECT * FROM ft2 WHERE MATCH(copy) AGAINST('database' WITH QUERY EXPANSION);
+--------------------------------+
| copy |
+--------------------------------+
| MySQL vs MariaDB database |
| Oracle vs MariaDB database |
| PostgreSQL vs MariaDB database |
| MariaDB overview |
+--------------------------------+
4 rows in set (0.00 sec)SELECT * FROM ft2 WHERE MATCH(copy) AGAINST('Maria*' IN BOOLEAN MODE);
+--------------------------------+
| copy |
+--------------------------------+
| MySQL vs MariaDB database |
| Oracle vs MariaDB database |
| PostgreSQL vs MariaDB database |
| MariaDB overview |
+--------------------------------+SELECT * FROM ft2 WHERE MATCH(copy) AGAINST('+MariaDB -database'
IN BOOLEAN MODE);
+------------------+
| copy |
+------------------+
| MariaDB overview |
+------------------+