Comments - on duplicate key workaround
Since ColumnStore doesn't have unique indexes, then yes, more or less. Another option is to load your inbound data into a staging table and then do an insert-select where you join and filter out the duplicates in what gets selected.
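A minimal sketch of the staging-table insert-select pattern, using SQLite via Python purely for illustration (the same `INSERT ... SELECT` with a left-join anti-filter applies to MariaDB ColumnStore); the table and column names (`target`, `staging`, `id`, `val`) are assumptions, not from the original thread:

```python
import sqlite3

# In-memory database standing in for the real server.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE target (id INTEGER, val TEXT)")
cur.execute("CREATE TABLE staging (id INTEGER, val TEXT)")

cur.execute("INSERT INTO target VALUES (1, 'a'), (2, 'b')")
# Inbound batch: id 2 is a duplicate of an existing row, id 3 is new.
cur.execute("INSERT INTO staging VALUES (2, 'b2'), (3, 'c')")

# Insert only the staged rows whose id is not already in target:
# the LEFT JOIN + IS NULL filter drops the duplicates.
cur.execute("""
    INSERT INTO target (id, val)
    SELECT s.id, s.val
    FROM staging s
    LEFT JOIN target t ON t.id = s.id
    WHERE t.id IS NULL
""")
conn.commit()
print(sorted(cur.execute("SELECT id, val FROM target").fetchall()))
# [(1, 'a'), (2, 'b'), (3, 'c')]
```

The duplicate id 2 is silently skipped rather than raising a key error, which is the behavior a unique index would otherwise enforce.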
Thanks David. That is what I thought. I was going to handle the inserts as you suggested and the updates with a left join. Any thoughts on whether that will provide better performance than deleting the range and re-inserting the whole set?
Doing the insert-select should be faster even just purely from the I/O involved, I think. The one case where deleting and re-inserting might be faster would be if you were to truncate the table or use partition drop, but I assume that you are planning to replace just some subset.
My data comes in somewhat randomly for some ids. As an InnoDB table we used INSERT ... ON DUPLICATE KEY UPDATE. What I have found in testing various workarounds is that loading the new data into a ColumnStore staging table works well: update the target table by joining it to the staging table, then left join the staging table to the target table for the inserts. Also, it was faster to just do the update without a predicate selecting only the rows with differences in the numeric_val column.
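The update-then-insert flow described above can be sketched as follows, again in SQLite via Python for illustration only; the names (`target`, `staging`, `id`, `numeric_val`) and the correlated-subquery form of the update are assumptions standing in for the actual ColumnStore statements:

```python
import sqlite3

# In-memory stand-in for the real tables.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE target (id INTEGER, numeric_val REAL)")
cur.execute("CREATE TABLE staging (id INTEGER, numeric_val REAL)")
cur.execute("INSERT INTO target VALUES (1, 10.0), (2, 20.0)")
# Inbound batch: id 2 updates an existing row, id 3 is new.
cur.execute("INSERT INTO staging VALUES (2, 25.0), (3, 30.0)")

# Step 1: update existing rows from the staging table. Per the comment
# above, the update is unconditional for matched ids -- no predicate
# filtering for rows whose numeric_val actually changed.
cur.execute("""
    UPDATE target
    SET numeric_val = (SELECT s.numeric_val
                       FROM staging s
                       WHERE s.id = target.id)
    WHERE id IN (SELECT id FROM staging)
""")

# Step 2: insert the staged rows not yet present in target.
cur.execute("""
    INSERT INTO target (id, numeric_val)
    SELECT s.id, s.numeric_val
    FROM staging s
    LEFT JOIN target t ON t.id = s.id
    WHERE t.id IS NULL
""")
conn.commit()
print(sorted(cur.execute("SELECT id, numeric_val FROM target").fetchall()))
# [(1, 10.0), (2, 25.0), (3, 30.0)]
```

Together the two statements emulate INSERT ... ON DUPLICATE KEY UPDATE without relying on a unique index.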
Deleting the rows via the join and re-inserting took about the same time as the update and insert.
One note... when I was doing these tests I observed that when I delete rows (via a join to another ColumnStore table), the server reports 0 rows affected even when rows are deleted.
Here's a sample...