Implementing Wordle’s game logic in SQL
Wordle is a word-guessing game that became popular in the beginning of 2022. The best way to learn the game rules is by playing it, but in short, you have to guess a five-letter word in a series of tries as you get hints on which letters are included in the word and whether they are in the right position or not. A yellow square means that the letter is in the final word but it’s in the wrong position. A green square means that the letter is in the right position. A gray square means that the letter is not in the final word. The game selects one word per day, so you can play only once per day.
Since everybody was talking about Wordle on Twitter, I had to try it. And I’m going to be honest, it wasn’t my thing. But hey! I’m a developer and I work for MariaDB, so my brain immediately went: “Could I implement the game logic using SQL queries only?”
The next weekend, I rolled up my sleeves and started working on it. Here’s what I came up with.
The data model
We need to store words and dates. For example:
CREATE OR REPLACE TABLE word ( id INT PRIMARY KEY AUTO_INCREMENT, text VARCHAR(64) UNIQUE, published DATE UNIQUE ) ENGINE = Aria;
Notice that I picked the Aria storage engine. MariaDB databases support a variety of storage engines for different kinds of workloads. You can combine tables that use different storage engines in a single SQL query. Check them out.
The idea of the word
table is to not only store words but to mark the date in which the word is to be guessed (or published). I inserted some words and picked MARIA
as today’s guessing word:
The requirements
The game logic requires two operations:
- Pick today’s word, which updates the
word.published
column, and - Calculate a hint, for example, with numbers indicating the color of each box:
- 0: not present
- 1: wrong position
- 2: right position
So, for example if today’s final word is “MARIA” and we try the word “FERRY”, we’d get the following result:
Final word: M A R I A
Test word: F E R R Y
Result: 0 0 2 1 0
This is not exactly how it works in the original Wordle, but it’s a good approximation. I call it a “variant” of the game to make me feel better. In the original game, the second ‘R’ in “FERRY’ would get a 0 instead of a 1. If after reading this blog post, you find a way to include this edge case in the solution I found, please let me know and let’s include it here, with due credits of course.
Picking today’s word
Let’s say we want to pick each day’s word randomly. If so, this requirement boils down to how to select a random row from an SQL table. A quick online search showed me that the RAND()
function is the way to go:
UPDATE word SET published = CURRENT_DATE() WHERE published IS NULL ORDER BY RAND() LIMIT 1;
You can almost read the logic by reading the SQL sentence. That’s one of the cool things about SQL! The key here is the LIMIT
clause that allows us to update only one of the words that haven’t been picked (published) before.
An application would only need to run this update to make sure that a word is selected today. It should also either check that no word has been picked today before executing the update or catch (and safely ignore) a constraint violation error in case it tries to pick a word twice or more using the previous update.
Computing hints for exactly 5-letter words
Computing the hints for a given test word was a bit harder. I spent quite a bit of time trying to figure it out, but I came up with the following:
SELECT CONCAT( CASE WHEN LOCATE(SUBSTR(test.text, 1, 1), todays.text) = 1 THEN 2 WHEN LOCATE(SUBSTR(test.text, 1, 1), todays.text) = 0 THEN 0 ELSE 1 END, CASE WHEN LOCATE(SUBSTR(test.text, 2, 1), todays.text) = 2 THEN 2 WHEN LOCATE(SUBSTR(test.text, 2, 1), todays.text) = 0 THEN 0 ELSE 1 END, CASE WHEN LOCATE(SUBSTR(test.text, 3, 1), todays.text) = 3 THEN 2 WHEN LOCATE(SUBSTR(test.text, 3, 1), todays.text) = 0 THEN 0 ELSE 1 END, CASE WHEN LOCATE(SUBSTR(test.text, 4, 1), todays.text) = 4 THEN 2 WHEN LOCATE(SUBSTR(test.text, 4, 1), todays.text) = 0 THEN 0 ELSE 1 END, CASE WHEN LOCATE(SUBSTR(test.text, 5, 1), todays.text) = 5 THEN 2 WHEN LOCATE(SUBSTR(test.text, 5, 1), todays.text) = 0 THEN 0 ELSE 1 END ) AS result FROM word todays JOIN word test WHERE todays.published = CURRENT_DATE() AND LOWER(test.text) = LOWER('great');
Let’s break it down. We are joining word
with word
to get today’s published word as todays
and the word to test as test
. The test word has to be in the database in order to be a valid test. In the example, we are using the test word “great”.
The SELECT
statement concatenates a group of five characters. Each character takes the value 0, 1, or 2 according to the requirements:
To get the value 0, 1, or 2 we use WHEN
statements using a couple of string manipulation functions:
- LOCATE(substr, str): Returns the position of the first occurrence of the substring
substr
in the stringstr
. - SUBSTR(str, pos, len): Returns a substring of
len
characters from the stringstr
starting at positionpos
.
For example, we can take the fourth character in the test word (‘A’ in “GREAT”):
SUBSTR(test.text, 4, 1)
And locate that character in today’s final word (“MARIA”):
LOCATE(SUBSTR(test.text, 4, 1), todays.text)
If the location is 4, then we got a match and we can return the value 2 (character in the right position):
WHEN LOCATE(SUBSTR(test.text, 4, 1), todays.text) = 4 THEN 2
Since the LOCATE
function returns zero if the character is not in the string, we can return the value 0 (character not in the final word):
WHEN LOCATE(SUBSTR(test.text, 4, 1), todays.text) = 0 THEN 0
Otherwise, we return 1 (character in the wrong position):
ELSE 1
Computing hints for n-letter words with stored procedures
The previous SQL query works well (if we accept the “variant”) for 5-letter words. What if we wanted an arbitrary number of letters? The first idea that comes to mind is to use dynamic queries using a programming language to append the WHEN
clauses depending on the length of the words we want to handle. And in fact, as I was investigating this, I contacted my friend Faisal Saeed, an expert on everything MariaDB (check out some of his blog posts), to get some guidance.
He not only pointed me to the right direction, but sent me an alternative solution that uses a stored procedure:
DELIMITER // CREATE OR REPLACE FUNCTION wordle(word VARCHAR(30), input VARCHAR(30)) RETURNS VARCHAR(30) DETERMINISTIC BEGIN DECLARE x TINYINT; DECLARE wordLen TINYINT; DECLARE i TINYINT; DECLARE result VARCHAR(30); SET wordLen = LENGTH(input); SET result = ''; SET i = 1; WHILE i <= wordLen DO IF LOCATE(SUBSTR(input, i, 1), word) = 0 THEN SET x = 0; ELSEIF SUBSTR(input, i, 1) = SUBSTR(word, i, 1) THEN SET x = 2; ELSE SET x = 1; END IF; SET result = CONCAT(result, x); SET i = i + 1; END WHILE; RETURN result; END // DELIMITER ;
Then we can simply use the following query to get the hints on a given word:
SELECT wordle(test.text, todays.text) FROM word todays JOIN word test WHERE todays.published = CURRENT_DATE() AND test.text = 'GREAT';
Computing hints for n-letter words with recursive SQL
Faisal’s help was great. But I still wanted a single-query solution. He mentioned that Recursive SQL could be the answer. And it indeed was! Recursive SQL is achieved through a feature called Common Table Expressions (CTE). A CTE is simply a temporary result set that has a custom name. You can define a CTE using the WITH
keyword:
WITH five_letter_word AS ( SELECT * FROM word WHERE LENGTH(text) = 5 ) SELECT text FROM five_letter_word WHERE text LIKE 'M%';
CTE’s can be recursive. But how does this help? To answer that, I should show my notes:
I’m using “QUERY” as the final word, and “GREAT” as a test word. I figured that it’d be easier to return the hints in one column with N rows (where N is the length of the word) instead of N columns. The important thing is that if I could find a way to a column with rows 1, 2, 3,… N (r3 in my notes), a column with the letters of the test word each in one row (r1), and a column with the final word on each row (r2), then I’d be able to calculate the hints (ignore those 0 + 0, 1 + 0, etc., they didn’t help in the end).
It turns out that you can use a CTE to create a column with the numbers 1, 2, 3, 4,… N, using recursion! Here’s how:
WITH RECURSIVE number AS ( SELECT 1 AS i UNION ALL SELECT i + 1 FROM number WHERE i < 5 ) SELECT i FROM number;
Of course, 5 (the length of the word) would become a parameter of the query in the future. The important thing is that now I have the value i
that I can use in string functions alongside the ideas developed in the previous section. A bunch of try-and-error iterations led me to this:
WITH RECURSIVE number AS ( SELECT 1 AS i UNION ALL SELECT i + 1 FROM number WHERE i < 5 ) SELECT CASE WHEN SUBSTR(test.text, i, 1) = SUBSTR(todays.text, i, 1) THEN 2 WHEN LOCATE(SUBSTR(test.text, i, 1), todays.text) = 0 THEN 0 ELSE 1 END FROM number JOIN word todays JOIN word test WHERE todays.published = CURRENT_DATE() AND LENGTH(todays.text) = 5 AND LOWER(test.text) = LOWER('GREAT');
There it is! A query that calculates the hints for a Wordle clone with words of any length. SQL is truly powerful!
The source code
I implemented a Wordle clone using Java, Spring Boot, R2DBC, and of course, MariaDB. You can find it on GitHub with instructions on how to configure the database, import data, and compile and run the application. Check it out!