Cluster Overall Memory Requirements (compared to multiple stand-alone nodes)
Maybe I'm just missing it somewhere on the site, but I'm trying to find a way to compare the overall memory requirements of a Galera cluster with those of stand-alone MySQL instances.
We typically build servers with 32 GB of total memory running a single MySQL instance, to which we give 16 GB for innodb_buffer_pool_size (the actual size of the MySQL database files is upwards of four times that). The server does other things too (a 6 GB JVM is also loaded). Our buffer pool hit rate is typically nice and high (over 99%) with that 16 GB allocated.
So let's take a simple scenario: 6 of my stand-alone servers are going to move to MariaDB + Galera Cluster. I realize we need 3 nodes minimum on the Galera side, and that we are now accommodating about 6 x 64 GB of former MySQL database files (presuming the database files are about the same size in MariaDB as in MySQL). So I've got 384 GB of database files, and let's suppose I want to replace this with a 3-node MariaDB + Galera cluster. I know that 384 GB has to be replicated across those three nodes, storage sizing is not so complex, and I can probably get enough CPU under these nodes with 4 vCPUs.
My question is more about innodb_buffer_pool_size for each node. Do I need to run 6 x 16 GB = 96 GB for innodb_buffer_pool_size on each node of that Galera cluster (essentially the same 1:4 ratio I used for the MySQL instances on my stand-alone servers)? Any help appreciated.
Answered by Daniel Black.
Galera doesn't use that much more memory. It has an apply queue which in theory can hold up to gcs.fc_limit (set in wsrep_provider_options) entries, each of up to wsrep_max_ws_size bytes. Unless you're doing large, multi-row updates (replication is row based), hitting this limit will be hard.
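To put numbers on that bound, here is a rough Python sketch of the arithmetic: queue entries multiplied by writeset size. The function name and the example sizes (a 1 MiB average writeset, a 2 GiB per-writeset cap) are assumptions chosen for illustration, not values taken from any particular server; only the 1K queue limit comes from the answer.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3


def apply_queue_bytes(fc_limit: int, writeset_bytes: int) -> int:
    """Upper bound on apply-queue memory: queued entries times bytes per entry."""
    if fc_limit < 0 or writeset_bytes < 0:
        raise ValueError("fc_limit and writeset_bytes must be non-negative")
    return fc_limit * writeset_bytes


# Assumed scenario: gcs.fc_limit raised to 1000, writesets averaging 1 MiB.
realistic = apply_queue_bytes(1000, 1 * MIB)

# Theoretical extreme: every queued writeset as large as an assumed 2 GiB cap.
extreme = apply_queue_bytes(1000, 2 * GIB)

print(f"1 MiB writesets: {realistic / GIB:.2f} GiB")
print(f"2 GiB writesets: {extreme / GIB:.0f} GiB")
```

The gap between the two figures is the point: the theoretical ceiling is enormous, but it is only reached if the queue is full of maximum-size writesets at the same moment.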
The wsrep threads (wsrep_thread_count) are full connection threads and in theory can consume about as much memory as real connection threads; however, because they process row-based events, some of the usual per-connection buffers will never be used.
There are some other memory allocations but they are quite minor.
Unless you're doing some massive updates where the hardware on the receiving node is significantly weaker than on the updating node, I wouldn't consider Galera's memory usage to be any more than 1-2 GB (a very conservative estimate, with gcs.fc_limit increased to 1K).
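One way to read this against the numbers in the question: size the buffer pool as you would for a stand-alone server holding the same data, then add the Galera allowance on top. The sketch below does only that addition. The helper name and the other_gb parameter are hypothetical, and whether the combined working set really needs the full 96 GB depends on the workload, which the answer does not address.

```python
def node_memory_gb(buffer_pool_gb: float,
                   galera_overhead_gb: float = 2.0,
                   other_gb: float = 0.0) -> float:
    """Rough per-node budget: buffer pool + Galera allowance + anything else.

    galera_overhead_gb defaults to the upper end of the 1-2 GB estimate.
    other_gb is a placeholder for co-located processes (an assumption).
    """
    return buffer_pool_gb + galera_overhead_gb + other_gb


# Figures from the question: 6 stand-alone servers with 16 GB buffer pools.
servers = 6
standalone_pool_gb = 16
combined_pool_gb = servers * standalone_pool_gb

print(f"combined buffer pool: {combined_pool_gb} GB")
print(f"per-node budget: {node_memory_gb(combined_pool_gb):.0f} GB")
```

Because every Galera node holds a full copy of the data, this budget applies to each of the three nodes rather than being split among them.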