Comments - on duplicate key work around
Content reproduced on this site is the property of its respective owners,
and this content is not reviewed in advance by MariaDB. The views, information and opinions
expressed by this content do not necessarily represent those of MariaDB or any other party.
Doing the insert select should be faster even just purely from the I/O involved i think. The case where it might be faster would be if you were to truncate the table or use partition drop but i assume that you are planning to just do some subset.
My data comes in somewhat randomly for some id's. As an innodb table we used insert on dup key update... What I have found in testing various workarounds is that loading the new data into a column store staging table. select the staging table and join to the table to be updated. Then left join to the table to be updated for the inserts. Also it was faster to just do the update without a predicate to select only the rows with differences in the numeric_val column.
Deleting the rows via the join and re-inserting took about the same time as the update and insert.
One note... when I was doing these tests I observed that when I delete a row (via a join to another column store table) a row the server reports 0 rows affected even when rows are deleted.
Here's a sample...