Adding a latin name to a latin1 column works on 10.4, doesn't work on 10.5

I'm trying to insert this name: Gdynia (Orłowo)

into a varchar(255) column whose character set is latin1.

On 10.4, I get this in the column: Gdynia (Or?owo)

On 10.5, I get this: Incorrect string value '\xC5\x82owo)' for column

How can I make it work correctly?

Answer

Answered by Marko Mäkelä.

The MariaDB encoding name "latin1" refers to Microsoft Code Page 1252, which extends ISO 8859-1 (a.k.a. ISO Latin 1) with a few code points.

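The \xC5\x82 in your output looks like UTF-8 for U+0142 (ł) to me. That character has no representation in the "latin1" encoding. Read as two separate "latin1" bytes, 0xC5 would be interpreted as U+00C5 LATIN CAPITAL LETTER A WITH RING ABOVE and 0x82 as U+201A SINGLE LOW-9 QUOTATION MARK (in ISO 8859-1 proper, 0x82 is a C1 control code rather than a printable character).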
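
The byte interpretation above can be checked outside the database. The following is a minimal sketch in Python, assuming that Python's cp1252 codec is a close enough stand-in for MariaDB's "latin1" for these two bytes; it does not call MariaDB at all.

```python
# Bytes quoted in the 10.5 error message.
raw = b"\xc5\x82"

# Read as UTF-8, the two bytes form a single character.
as_utf8 = raw.decode("utf-8")
print(f"U+{ord(as_utf8):04X}")

# Read byte by byte with cp1252 (assumed stand-in for MariaDB's latin1),
# the same two bytes become two unrelated characters.
as_cp1252 = raw.decode("cp1252")
print([f"U+{ord(ch):04X}" for ch in as_cp1252])
```

The same bytes mean different things depending on which encoding the reader assumes, which is why the column definition and the data being sent need to agree.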

So, it looks like in 10.5 some additional validation was implemented. Previously, the incorrect data was wrongly and silently accepted. You could get surprising results for ORDER BY when using the wrong encoding and collation.
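
The two outcomes reported in the question resemble the difference between a lenient and a strict character conversion. As an analogy only (this is Python's codec machinery, not MariaDB's implementation, and cp1252 is assumed as a stand-in for "latin1"), a sketch:

```python
name = "Gdynia (Orłowo)"

# Lenient conversion: a character with no cp1252 equivalent becomes '?'.
lenient = name.encode("cp1252", errors="replace")
print(lenient)

# Strict conversion: the same character raises an error instead.
try:
    name.encode("cp1252")
    strict_failed = False
except UnicodeEncodeError as exc:
    strict_failed = True
    bad_char = exc.object[exc.start:exc.end]  # the offending character
    print("cannot encode:", bad_char)
```

The lenient path reproduces the Or?owo value seen on 10.4, and the strict path rejects the value much as the 10.5 message does.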

I think that you should define the table with the utf8mb3 or utf8mb4 encoding if you are going to insert UTF-8 encoded data into it.
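
To decide between the two, it can help to look at how many bytes each character needs in UTF-8: utf8mb3 stores characters of up to three bytes, while utf8mb4 also covers four-byte characters such as emoji. A small sketch, using only the name from the question as sample data:

```python
name = "Gdynia (Orłowo)"

# UTF-8 byte width of every distinct character in the name.
widths = {ch: len(ch.encode("utf-8")) for ch in name}
max_width = max(widths.values())

# Four-byte characters would require utf8mb4 rather than utf8mb3.
needs_utf8mb4 = max_width > 3
print(max_width, needs_utf8mb4)
```

For this particular name either encoding would be sufficient; utf8mb4 is the safer choice if other data might contain characters outside the Basic Multilingual Plane.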
