Comments - SET Data Type
Content reproduced on this site is the property of its respective owners,
and this content is not reviewed in advance by MariaDB. The views, information and opinions
expressed by this content do not necessarily represent those of MariaDB or any other party.
Just a quick comment about this, as im trying to optimize a task scheduler and I was trying to find a way to index an enum column or to redesign the table whole yesterday, as it's getting bigger and it appears that mysql can't use indexes so we're not able to pick pending tasks fast enough. I wasn't able to find anything related to the subject, beside that's not possible / bug reports that index can't be used for SET...
It appears to the contrary to what's written it can be done using VIRTUAL COLUMNS.
Table definition
`tasks_running` set('INIT_LIST','INIT_PREBUILD','SAVE_DATA'...) NOT NULL DEFAULT ''
`tasks_completed` set('INIT_LIST','INIT_PREBUILD','SAVE_DATA'...) NOT NULL DEFAULT ''
[...]
So we'll have about 10 different task types which can be running or completed and we need to select specific task type(s) to be run, or to be checked (in running state). This will result in full table scan. In this case we're getting next pending task of type INIT_LIST...
Now maria / mysql can only use index if we're doing static comparison (column = "a,b,c"), we can't use index for finding if some value is just present in set. But we can use virtual columns to create indexes which will help in filtering out rows...
Multi-set case
If we have sets with same values which are co-related as like in this example (task can be only in 3 states - paused, running, completed), we can create 1 char column and put index on it. Notice that IF output will be trimmed, so that's why it can be done much simpler, without nested IFs... so in this case c means that task is completed and r running.
Including new condition, to help mariadb filter out unneded rows. Q time drop from couple seconds to <0.1s.
We need to create single index and single column for each value for which we want to speed up lookup, in SET.
Single-set case
If we want to index just single SET (eg. task_completed in following example) it's much simpler than multi-set. In this case we can just use y/n to indicate if that value is present in set.
And code can be easily written to both: