# High Availability


This section provides guidance on how to configure high availability in `MariaDB` and `MaxScale` instances. If you are looking for an HA setup for the operator, please refer to the [Helm documentation](/docs/tools/mariadb-enterprise-operator/installation/helm.md#operator-high-availability).

Our recommended setup for production is:

* Use a [**highly available topology**](#highly-available-topologies) for MariaDB:
  * [**Asynchronous replication**](/docs/tools/mariadb-enterprise-operator/topologies/high-availability/replication.md) with a primary node and at least 2 replicas.
  * Synchronous multi-master [**Galera**](/docs/tools/mariadb-enterprise-operator/topologies/high-availability/galera.md) with at least 3 nodes. Always use an odd number of nodes, as Galera is quorum-based.
* Leverage [**MaxScale**](/docs/tools/mariadb-enterprise-operator/topologies/maxscale.md) as a database proxy to load balance requests and perform failover/switchover operations. Configure 2 replicas to enable MaxScale upgrades without downtime.
* Use [dedicated nodes](#dedicated-nodes) to avoid noisy neighbours.
* Define [pod disruption budgets](#pod-disruption-budgets).

## Highly Available Topologies

* [**Asynchronous replication**](/docs/tools/mariadb-enterprise-operator/topologies/high-availability/replication.md): The primary node allows both reads and writes, while secondary nodes only serve reads. The primary has a binary log and the replicas asynchronously replicate the binary log events.
* [**Synchronous multi-master Galera**](/docs/tools/mariadb-enterprise-operator/topologies/high-availability/galera.md): All nodes support reads and writes, but writes are only sent to one node to avoid contention. Because replication is synchronous and all nodes are equally configured, the primary failover/switchover operation is seamless and usually instantaneous.
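
As a minimal sketch of how the two topologies might be declared, assuming that `spec.replicas`, `spec.replication.enabled` and `spec.galera.enabled` are the fields that select the topology (check the linked topology pages for the authoritative schema). The name `mariadb-repl` is hypothetical:

```yaml
# Asynchronous replication: 1 primary + 2 replicas (field names assumed).
apiVersion: enterprise.mariadb.com/v1alpha1
kind: MariaDB
metadata:
  name: mariadb-repl # hypothetical name
spec:
  # [...]
  replicas: 3
  replication:
    enabled: true
---
# Galera: odd number of nodes, at least 3 (field names assumed).
apiVersion: enterprise.mariadb.com/v1alpha1
kind: MariaDB
metadata:
  name: mariadb-galera
spec:
  # [...]
  replicas: 3
  galera:
    enabled: true
```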

## Kubernetes Services

In order to address nodes, MariaDB Enterprise Kubernetes Operator provides you with the following Kubernetes `Services`:

* `<mariadb-name>`: This is the default `Service`, only intended for the [standalone topology](/docs/tools/mariadb-enterprise-operator/topologies/standalone.md).
* `<mariadb-name>-primary`: To be used for write requests. It will point to the primary node.
* `<mariadb-name>-secondary`: To be used for read requests. It will load balance requests to all nodes except the primary.

Whenever the primary changes, either by the user or by the operator, both the `<mariadb-name>-primary` and `<mariadb-name>-secondary` `Services` will be automatically updated by the operator to address the right nodes.
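
As an illustration, an application could split its reads and writes by pointing at the two `Services`. The sketch below assumes a `MariaDB` named `mariadb-galera` in the `default` namespace, the default MariaDB port `3306`, and a hypothetical `ConfigMap` name and keys:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-db-endpoints # hypothetical name
data:
  # Writes go to the primary Service.
  DB_WRITE_HOST: mariadb-galera-primary.default.svc.cluster.local
  # Reads are load balanced across all nodes except the primary.
  DB_READ_HOST: mariadb-galera-secondary.default.svc.cluster.local
  DB_PORT: "3306" # assumed port
```

Because the operator keeps both `Services` up to date, the application does not need to be reconfigured when the primary changes.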

The primary may be manually changed by the user at any point by updating the `spec.[replication|galera].primary.podIndex` field. Alternatively, automatic primary failover can be enabled by setting `spec.[replication|galera].primary.autoFailover`, which makes the operator switch the primary whenever the primary `Pod` goes down.
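
For a Galera cluster, these two fields could be set as in the following sketch. The field paths come from the paragraph above; the value types (an integer `Pod` index and a boolean flag) are assumptions:

```yaml
apiVersion: enterprise.mariadb.com/v1alpha1
kind: MariaDB
metadata:
  name: mariadb-galera
spec:
  # [...]
  galera:
    primary:
      podIndex: 0 # index of the Pod acting as primary; update it to switch manually
      autoFailover: true # assumed boolean: the operator switches primary if this Pod goes down
  # [...]
```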

## MaxScale

While Kubernetes `Services` can be used for addressing primary and secondary instances, we recommend using [MaxScale](/docs/tools/mariadb-enterprise-operator/topologies/maxscale.md) as a database proxy instead, as it comes with additional advantages:

* Enhanced failover/switchover operations for both replication and Galera
* Single entrypoint for both reads and writes
* Multiple router modules available to define how to route requests
* Replay of pending transactions when the primary goes down
* Ability to choose whether the old primary rejoins as a replica
* Connection pooling

The full lifecycle of the MaxScale proxy is covered by this operator. Please refer to [MaxScale docs](/docs/tools/mariadb-enterprise-operator/topologies/maxscale.md) for further detail.
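
A minimal sketch of a `MaxScale` resource with the 2 replicas recommended above, assuming the replica count is set through a `spec.replicas` field (see the MaxScale docs for the authoritative schema):

```yaml
apiVersion: enterprise.mariadb.com/v1alpha1
kind: MaxScale
metadata:
  name: maxscale-galera
spec:
  # [...]
  replicas: 2 # assumed field; 2 instances allow upgrades without downtime
  mariaDbRef:
    name: mariadb-galera
```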

## Pod Anti-Affinity

{% hint style="warning" %}
Bear in mind that, when enabling this, you need to have at least as many `Nodes` available as the replicas specified. Otherwise, some of your `Pods` will remain unscheduled and the cluster won't bootstrap.
{% endhint %}

To achieve real high availability, each `MariaDB` `Pod` needs to run on a different Kubernetes `Node`. This practice, known as anti-affinity, helps reduce the blast radius of a `Node` becoming unavailable.

By default, anti-affinity is disabled, which means that multiple `Pods` may be scheduled in the same `Node`, something not desired in HA scenarios.

You can selectively enable anti-affinity in all the different `Pods` managed by the `MariaDB` resource:

```yaml
apiVersion: enterprise.mariadb.com/v1alpha1
kind: MariaDB
metadata:
  name: mariadb-galera
spec:
  # [...]
  bootstrapFrom:
    restoreJob:
      affinity:
        antiAffinityEnabled: true
  metrics:
    exporter:
      affinity:
        antiAffinityEnabled: true
  affinity:
    antiAffinityEnabled: true
  # [...]
```

Anti-affinity may also be enabled in the resources that have a reference to `MariaDB`, resulting in their `Pods` being scheduled in `Nodes` where `MariaDB` is not running. For instance, the `Backup` and `Restore` processes can run in different `Nodes`:

```yaml
apiVersion: enterprise.mariadb.com/v1alpha1
kind: Backup
metadata:
  name: backup
spec:
  # [...]
  mariaDbRef:
    name: mariadb-galera
  affinity:
    antiAffinityEnabled: true
  # [...]
```

```yaml
apiVersion: enterprise.mariadb.com/v1alpha1
kind: Restore
metadata:
  name: restore
spec:
  # [...]
  mariaDbRef:
    name: mariadb-galera

  affinity:
    antiAffinityEnabled: true
  # [...]
```

In the case of `MaxScale`, the `Pods` will also be placed in `Nodes` isolated in terms of compute, ensuring isolation not only among themselves but also from the `MariaDB` `Pods`. For example, if you run a `MariaDB` and `MaxScale` with 3 replicas each, you will need 6 `Nodes` in total:

```yaml
apiVersion: enterprise.mariadb.com/v1alpha1
kind: MaxScale
metadata:
  name: maxscale-galera
spec:
  storage:
    size: 1Gi
  mariaDbRef:
    name: mariadb-galera

  metrics:
    exporter:
      affinity:
        antiAffinityEnabled: true

  affinity:
    antiAffinityEnabled: true
```

Default anti-affinity rules generated by the operator might not satisfy your needs, but you can always define your own rules. For example, if you want the `MaxScale` `Pods` to be in different `Nodes`, but you want them to share `Nodes` with `MariaDB`:

```yaml
apiVersion: enterprise.mariadb.com/v1alpha1
kind: MaxScale
metadata:
  name: maxscale-galera
spec:
  storage:
    size: 1Gi
  mariaDbRef:
    name: mariadb-galera
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: app.kubernetes.io/instance
            operator: In
            values:
            - maxscale-galera
            # 'mariadb-galera' instance omitted (default anti-affinity rule)
        topologyKey: kubernetes.io/hostname
```

## Dedicated Nodes

If you want to avoid noisy neighbours running in the same Kubernetes `Nodes` as your `MariaDB`, you may consider using dedicated `Nodes`. To achieve this, you will need to:

* Taint your `Nodes` and add the counterpart toleration in your `Pods`.

{% hint style="info" %}
Tainting your `Nodes` is not covered by this operator; it is something you need to do yourself beforehand. Refer to the [Kubernetes documentation](https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/) to understand how to achieve this.
{% endhint %}

* Select the `Nodes` where `Pods` will be scheduled in via a `nodeSelector`.

{% hint style="info" %}
Although you can use the default `Node` labels, you may consider adding more meaningful labels to your `Nodes`, as you will have to reference them in your `Pod` `nodeSelector`. Refer to the [Kubernetes documentation](https://kubernetes.io/docs/tasks/configure-pod-container/assign-pods-nodes/#add-a-label-to-a-node).
{% endhint %}

* Add `podAntiAffinity` to your `Pods` as described in the [Pod Anti-Affinity](#pod-anti-affinity) section.

The previous steps can be achieved by setting these fields in the `MariaDB` resource:

```yaml
apiVersion: enterprise.mariadb.com/v1alpha1
kind: MariaDB
metadata:
  name: mariadb-galera
spec:
  storage:
    size: 1Gi
  tolerations:
    - key: "enterprise.mariadb.com/ha"
      operator: "Exists"
      effect: "NoSchedule"
  nodeSelector:
    "enterprise.mariadb.com/node": "ha"
  affinity:
    antiAffinityEnabled: true
```
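
For the toleration and `nodeSelector` above to take effect, the target `Nodes` need a matching taint and label. As a sketch, the relevant fields of such a `Node` would look roughly like this. The `Node` name is hypothetical, and the taint carries no value because the toleration uses `operator: "Exists"`:

```yaml
apiVersion: v1
kind: Node
metadata:
  name: worker-ha-1 # hypothetical Node name
  labels:
    # Matched by the nodeSelector in the MariaDB resource.
    enterprise.mariadb.com/node: ha
spec:
  taints:
    # Tolerated by the MariaDB Pods; repels Pods without the toleration.
    - key: enterprise.mariadb.com/ha
      effect: NoSchedule
```

In practice, taints and labels are usually applied with `kubectl taint` and `kubectl label` rather than by editing the `Node` manifest directly.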

## Pod Disruption Budgets

{% hint style="info" %}
Take a look at the [Kubernetes documentation](https://kubernetes.io/docs/tasks/run-application/configure-pdb/) if you are unfamiliar with `PodDisruptionBudgets`.
{% endhint %}

By defining a `PodDisruptionBudget`, you are telling Kubernetes how many `Pods` your database can tolerate being down. This is especially important for planned maintenance operations such as `Node` upgrades.

MariaDB Enterprise Kubernetes Operator creates a default `PodDisruptionBudget` if you are running in HA, but you are able to define your own by setting:

```yaml
apiVersion: enterprise.mariadb.com/v1alpha1
kind: MariaDB
metadata:
  name: mariadb-galera
spec:
  # [...]
  podDisruptionBudget:
    maxUnavailable: 33%
  # [...]
```
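
To make the percentage concrete: Kubernetes rounds a percentage-based `maxUnavailable` up to a whole number of `Pods`, so with 3 replicas, `33%` allows 1 `Pod` to be voluntarily disrupted at a time. As a rough sketch, the setting above corresponds to a standard `PodDisruptionBudget` similar to the following. The object name and the label selector are assumptions for illustration, and the object generated by the operator may differ:

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: mariadb-galera # assumed name
spec:
  maxUnavailable: 33% # 33% of 3 Pods rounds up to 1
  selector:
    matchLabels:
      app.kubernetes.io/instance: mariadb-galera # assumed label
```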

<sub>*This page is: Copyright © 2025 MariaDB. All rights reserved.*</sub>
