Implementing Wordle’s game logic in SQL

spacer

Wordle is a word-guessing game that became popular in the beginning of 2022. The best way to learn the game rules is by playing it, but in short, you have to guess a five-letter word in a series of tries as you get hints on which letters are included in the word and whether they are in the right position or not. A yellow square means that the letter is in the final word but it’s in the wrong position. A green square means that the letter is in the right position. A gray square means that the letter is not in the final word. The game selects one word per day, so you can play only once per day.

Since everybody was talking about Wordle on Twitter, I had to try it. And I’m going to be honest, it wasn’t my thing. But hey! I’m a developer and I work for MariaDB, so my brain immediately went: “Could I implement the game logic using SQL queries only?

Wordle Tweet

The next weekend, I rolled up my sleeves and started working on it. Here’s what I came up with.

The data model

We need to store words and dates. For example:

CREATE OR REPLACE TABLE word
(
    id        INT PRIMARY KEY AUTO_INCREMENT,
    text      VARCHAR(64) UNIQUE,
    published DATE UNIQUE
) ENGINE = Aria;

Notice that I picked the Aria storage engine. MariaDB databases support a variety of storage engines for different kinds of workloads. You can combine tables that use different storage engines in a single SQL query. Check them out.

The idea of the word table is to not only store words but to mark the date in which the word is to be guessed (or published). I inserted some words and picked MARIA as today’s guessing word:

Word table

 

The requirements

The game logic requires two operations:

  • Pick today’s word, which updates the word.published column, and
  • Calculate a hint, for example, with numbers indicating the color of each box:
    • 0: not present
    • 1: wrong position
    • 2: right position

So, for example if today’s final word is “MARIA” and we try the word “FERRY”, we’d get the following result:

Final word: M A R I A

Test word:  F E R R Y

Result:        0 0 2 1 0

This is not exactly how it works in the original Wordle, but it’s a good approximation. I call it a “variant” of the game to make me feel better. In the original game, the second ‘R’ in “FERRY’ would get a 0 instead of a 1. If after reading this blog post, you find a way to include this edge case in the solution I found, please let me know and let’s include it here, with due credits of course.

Picking today’s word

Let’s say we want to pick each day’s word randomly. If so, this requirement boils down to how to select a random row from an SQL table. A quick online search showed me that the RAND() function is the way to go:

UPDATE word
SET published = CURRENT_DATE()
WHERE published IS NULL
ORDER BY RAND()
LIMIT 1;

You can almost read the logic by reading the SQL sentence. That’s one of the cool things about SQL! The key here is the LIMIT clause that allows us to update only one of the words that haven’t been picked (published) before.

An application would only need to run this update to make sure that a word is selected today. It should also either check that no word has been picked today before executing the update  or catch (and safely ignore) a constraint violation error in case it tries to pick a word twice or more using the previous update.

Computing hints for exactly 5-letter words

Computing the hints for a given test word was a bit harder. I spent quite a bit of time trying to figure it out, but I came up with the following:

SELECT CONCAT(
   CASE
       WHEN LOCATE(SUBSTR(test.text, 1, 1), todays.text) = 1 THEN 2
       WHEN LOCATE(SUBSTR(test.text, 1, 1), todays.text) = 0 THEN 0
       ELSE 1
       END,
   CASE
       WHEN LOCATE(SUBSTR(test.text, 2, 1), todays.text) = 2 THEN 2
       WHEN LOCATE(SUBSTR(test.text, 2, 1), todays.text) = 0 THEN 0
       ELSE 1
       END,
   CASE
       WHEN LOCATE(SUBSTR(test.text, 3, 1), todays.text) = 3 THEN 2
       WHEN LOCATE(SUBSTR(test.text, 3, 1), todays.text) = 0 THEN 0
       ELSE 1
       END,
   CASE
       WHEN LOCATE(SUBSTR(test.text, 4, 1), todays.text) = 4 THEN 2
       WHEN LOCATE(SUBSTR(test.text, 4, 1), todays.text) = 0 THEN 0
       ELSE 1
       END,
   CASE
       WHEN LOCATE(SUBSTR(test.text, 5, 1), todays.text) = 5 THEN 2
       WHEN LOCATE(SUBSTR(test.text, 5, 1), todays.text) = 0 THEN 0
       ELSE 1
       END
) AS result
FROM word todays
         JOIN word test
WHERE todays.published = CURRENT_DATE()
  AND LOWER(test.text) = LOWER('great');

Let’s break it down. We are joining word with word to get today’s published word as todays and the word to test as test. The test word has to be in the database in order to be a valid test. In the example, we are using the test word “great”.

The SELECT statement concatenates a group of five characters. Each character takes the value 0, 1, or 2 according to the requirements:

To get the value 0, 1, or 2 we use WHEN statements using a couple of string manipulation functions:

  • LOCATE(substr, str): Returns the position of the first occurrence of the substring substr in the string str.
  • SUBSTR(str, pos, len): Returns a substring of len characters from the string str starting at position pos.

For example, we can take the fourth character in the test word (‘A’ in “GREAT”):

SUBSTR(test.text, 4, 1)

And locate that character in today’s final word (“MARIA”):

LOCATE(SUBSTR(test.text, 4, 1), todays.text)

If the location is 4, then we got a match and we can return the value 2 (character in the right position):

WHEN LOCATE(SUBSTR(test.text, 4, 1), todays.text) = 4 THEN 2

Since the LOCATE function returns zero if the character is not in the string, we can return the value 0 (character not in the final word):

WHEN LOCATE(SUBSTR(test.text, 4, 1), todays.text) = 0 THEN 0

Otherwise, we return 1 (character in the wrong position):

ELSE 1

Computing hints for n-letter words with stored procedures

The previous SQL query works well (if we accept the “variant”) for 5-letter words. What if we wanted an arbitrary number of letters? The first idea that comes to mind is to use dynamic queries using a programming language to append the WHEN clauses depending on the length of the words we want to handle. And in fact, as I was investigating this, I contacted my friend Faisal Saeed, an expert on everything MariaDB (check out some of his blog posts), to get some guidance.

He not only pointed me to the right direction, but sent me an alternative solution that uses a stored procedure:

DELIMITER //

CREATE OR REPLACE FUNCTION wordle(word VARCHAR(30), input VARCHAR(30)) RETURNS VARCHAR(30) DETERMINISTIC
BEGIN
    DECLARE x TINYINT;
    DECLARE wordLen TINYINT;
    DECLARE i TINYINT;
    DECLARE result VARCHAR(30);

    SET wordLen = LENGTH(input);
    SET result = '';
    SET i = 1;
    WHILE i <= wordLen
        DO
            IF LOCATE(SUBSTR(input, i, 1), word) = 0 THEN
                SET x = 0;
            ELSEIF SUBSTR(input, i, 1) = SUBSTR(word, i, 1) THEN
                SET x = 2;
            ELSE
                SET x = 1;
            END IF;
            SET result = CONCAT(result, x);
            SET i = i + 1;
        END WHILE;
    RETURN result;
END //

DELIMITER ;

Then we can simply use the following query to get the hints on a given word:

SELECT wordle(test.text, todays.text)
FROM word todays
         JOIN word test
WHERE todays.published = CURRENT_DATE()
  AND test.text = 'GREAT';

Computing hints for n-letter words with recursive SQL

Faisal’s help was great. But I still wanted a single-query solution. He mentioned that Recursive SQL could be the answer. And it indeed was! Recursive SQL is achieved through a feature called Common Table Expressions (CTE). A CTE is simply a temporary result set that has a custom name. You can define a CTE using the WITH keyword:

WITH five_letter_word
    AS (
        SELECT *
        FROM word
        WHERE LENGTH(text) = 5
    )
SELECT text
FROM five_letter_word
WHERE text LIKE 'M%';

CTE’s can be recursive. But how does this help? To answer that, I should show my notes:

Wordle notes

I’m using “QUERY” as the final word, and “GREAT” as a test word. I figured that it’d be easier to return the hints in one column with N rows (where N is the length of the word) instead of N columns. The important thing is that if I could find a way to a column with rows 1, 2, 3,… N (r3 in my notes), a column with the letters of the test word each in one row (r1), and a column with the final word on each row (r2), then I’d be able to calculate the hints (ignore those 0 + 0, 1 + 0, etc., they didn’t help in the end).

It turns out that you can use a CTE to create a column with the numbers 1, 2, 3, 4,… N, using recursion! Here’s how:

WITH RECURSIVE number AS (
    SELECT 1 AS i
    UNION ALL
    SELECT i + 1
    FROM number
    WHERE i < 5
)
SELECT i
FROM number;

Of course, 5 (the length of the word) would become a parameter of the query in the future. The important thing is that now I have the value i that I can use in string functions alongside the ideas developed in the previous section. A bunch of try-and-error iterations led me to this:

WITH RECURSIVE number AS (
    SELECT 1 AS i
    UNION ALL
    SELECT i + 1
    FROM number
    WHERE i < 5
)
SELECT CASE
    WHEN SUBSTR(test.text, i, 1) = SUBSTR(todays.text, i, 1) THEN 2
    WHEN LOCATE(SUBSTR(test.text, i, 1), todays.text) = 0 THEN 0
    ELSE 1
    END
FROM number
         JOIN word todays
         JOIN word test
WHERE todays.published = CURRENT_DATE()
  AND LENGTH(todays.text) = 5
  AND LOWER(test.text) = LOWER('GREAT');

There it is! A query that calculates the hints for a Wordle clone with words of any length. SQL is truly powerful!

The source code

I implemented a Wordle clone using Java, Spring Boot, R2DBC, and of course, MariaDB. You can find it on GitHub with instructions on how to configure the database, import data, and compile and run the application. Check it out!

Wordle example

 

Developer Resources