Best surrogate key for clustered index for receiving mass inserts

You are viewing an old version of this question. View the current version here.

I need to choose the best clustered index for a table that will hold real time messages received into an application from Solace. Each message received will be inserted as one row in the table. Volumes aren't particularly high, probably around tens of thousands of messages per day. Average row size will be around 500 bytes. Once inserted, the row will never be updated. A periodic purge process will remove rows that are older than a given number of days (say, 28 days). Insert performance is more important than select performance.

I intend to use a single surrogate key column as the primary key. Is the most efficient option to use an auto incrementing column? Would this avoid page splits and fragmentation? Would this cause hot spots? I'm not keen on using a timestamp and I assume that using a UUID would be a very bad option (fragmentation etc.).

I have worked with other database technologies but I don't want to assume that my knowledge of those databases translates well to MariaDB. Any thoughts or recommendations would be appreciated, thanks.

We are using MariaDB 10.1.35.

Answer

Auto_increment is quite efficient for insert speed and reducing fragmentation. To aid easy purge would require either partitioning or a secondary key.

e.g. partitioning for easy monthly removal:

create table x (
      id int unsigned not null auto_increment, 
     created datetime, 
     primary key(id,created)
) PARTITION BY LIST (MONTH(created)) (
     PARTITION partition_name VALUES IN (1,2,3,4,5,6,7,8,9,10,11,12)
)

UUIDs can be ok if they are time based however if you don't need them AI is easier.

Comments

Comments loading...
Content reproduced on this site is the property of its respective owners, and this content is not reviewed in advance by MariaDB. The views, information and opinions expressed by this content do not necessarily represent those of MariaDB or any other party.