MariaDB full-text searching skipped a stopword even in Boolean Mode


We are testing full-text search on MyISAM and InnoDB, following [this post](https://stackoverflow.com/a/45674350).

The word `about` belongs to the [list of InnoDB stopwords](https://mariadb.com/kb/en/full-text-index-stopwords/#innodb-stopwords), so a query matching against it returns an empty result in Natural Language Mode.
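To rule out a configuration difference, the stopword settings in effect can be inspected on the server. The statements below are a minimal sketch of such a check, assuming a default MariaDB setup with the `INFORMATION_SCHEMA.INNODB_FT_DEFAULT_STOPWORD` table available; they only report settings and do not change anything.

```sql
-- InnoDB: check whether 'about' is in the built-in default stopword list.
SELECT value
FROM INFORMATION_SCHEMA.INNODB_FT_DEFAULT_STOPWORD
WHERE value = 'about';

-- InnoDB: show whether stopword filtering is enabled and whether a
-- custom server-level or user-level stopword table is configured.
SHOW VARIABLES LIKE 'innodb_ft_%stopword%';

-- MyISAM: show which stopword file is in use.
-- The value '(built-in)' means the compiled-in list.
SHOW VARIABLES LIKE 'ft_stopword_file';
```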

In Boolean Mode, we expected to get matches on the relevant rows, but the following queries still returned empty results, with InnoDB and MyISAM behaving identically:

  • `SELECT * FROM ft_1 WHERE MATCH(copy) AGAINST('about' IN BOOLEAN MODE);` for InnoDB, and
  • `SELECT * FROM ft_myisam WHERE MATCH(copy) AGAINST('about' IN BOOLEAN MODE);` for MyISAM.

We would like to understand why this happens and would appreciate any hints or suggestions.

Technical Details:

  • SQL:
```sql
-- Definition of the InnoDB table:
CREATE TABLE test.ft_1 (
  copy TEXT NULL
)
ENGINE=InnoDB
DEFAULT CHARSET=utf8mb4
COLLATE=utf8mb4_unicode_ci;
CREATE FULLTEXT INDEX ft_1_copy_IDX ON test.ft_1 (copy);

-- Definition of the MyISAM table:
CREATE TABLE `ft_myisam` (
  `copy` text DEFAULT NULL,
  FULLTEXT KEY `copy` (`copy`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;

-- Queries:
SELECT * FROM ft_1;
SELECT * FROM ft_myisam;

SELECT * FROM ft_1 WHERE MATCH(copy) AGAINST('about');
SELECT * FROM ft_myisam WHERE MATCH(copy) AGAINST('about');
-- The default Natural Language Mode returns an empty set, as expected.

SELECT * FROM ft_1 WHERE MATCH(copy) AGAINST('about' IN BOOLEAN MODE);
SELECT * FROM ft_myisam WHERE MATCH(copy) AGAINST('about' IN BOOLEAN MODE);
-- Boolean Mode still returns an empty set, and we wonder why.

SELECT * FROM ft_1 WHERE MATCH(copy) AGAINST('clock');
SELECT * FROM ft_myisam WHERE MATCH(copy) AGAINST('clock');
-- Returns the row `It is about two o'clock` because 'clock' is not a stopword.
```

  • The data:
copy
Once upon a time
There was a wicked witch
Who ate everybody up
Once upon a wicked time
There was a wicked wicked witch
Who ate everybody wicked up
It is about two o'clock
About two
is two
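
For anyone who wants to reproduce this, the rows above can be loaded with the statements below. This is a sketch that assumes both tables live in the `test` schema (the definition of `ft_myisam` does not name one) and that the rows were inserted in the order listed.

```sql
-- Assumption: both tables are in the `test` schema.
USE test;

-- Load the sample rows into the InnoDB table.
-- The apostrophe in "o'clock" is escaped by doubling it.
INSERT INTO ft_1 (copy) VALUES
  ('Once upon a time'),
  ('There was a wicked witch'),
  ('Who ate everybody up'),
  ('Once upon a wicked time'),
  ('There was a wicked wicked witch'),
  ('Who ate everybody wicked up'),
  ('It is about two o''clock'),
  ('About two'),
  ('is two');

-- Copy the same rows into the MyISAM table so both engines hold identical data.
INSERT INTO ft_myisam (copy)
SELECT copy FROM ft_1;
```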
