Get Started with MariaDB in Kubernetes and mariadb-operator
MariaDB users and customers alike have been exploring ways to run MariaDB in Kubernetes (K8s). At MariaDB, we’re investing in the leading open source Kubernetes operator for MariaDB Community Server. Here we’ll cover everything you need to know: why it’s important, what it is, and how to get going with the operator and run MariaDB successfully in K8s.
Why Kubernetes and Database?
Databases are notoriously difficult to manage and operate. They hold valuable data, and their central position in nearly every IT effort means they must provide high levels of reliability, availability, scalability and performance. K8s, PaaS and DBaaS are attractive ways to lower operational complexity and cost by centrally managing, scaling and securing databases. While this is, of course, true of applications as well, platforms like these offer unique value to stateful processes.
The industry's push to simplify database operations has produced three key shifts:
- Growing demand for running database workloads on lightweight virtualization (containers)
- Increasing reliance on the security and operational features of the underlying platform
- Software developers and SREs taking the helm of the operational tasks that remain, reducing the need for dedicated DBA roles
The modern cloud-native landscape has predominantly adopted Kubernetes as the de facto standard for orchestrating containerized processes. Databases are the #2 K8s workload and may be the fastest growing.
Source: Dynatrace Survey
Focusing on non-application workloads, enterprises used an increasing variety of technologies. This reflects the need to enhance Kubernetes with better observability, security, and service-to-service communications. Other technologies enable specific use cases like CI/CD tools or databases. Across all categories, open source projects rank among the most frequently used solutions.
In such an environment, managing databases efficiently is crucial. Operators are the de facto method of enabling a generic platform like K8s to provision, deploy, operate and scale a specific workload (like a database), along with its associated clustering solutions (Galera, Vitess, etc.). Sophisticated K8s operators that are deeply aware of server internals will play a pivotal role in achieving the full potential of Kubernetes for database workloads.
Source: CNCF Survey
What is the mariadb-operator?
At its essence, the mariadb-operator is an open source, K8s-native application that simplifies the deployment and management of MariaDB instances in a K8s environment. It automates many of the traditional database management tasks. Its key features are engineered to leverage the full potential of K8s, providing a robust, scalable, and efficient solution for managing MariaDB instances.
Let’s delve into each of these features in detail:
Custom Resource Definitions (CRDs)
CRDs in K8s allow the extension of K8s APIs, and the MariaDB Operator follows K8s API conventions to design CRDs. This essentially is the foundational mechanism to customize a generic platform like Kubernetes to work with MariaDB-specific structures and concepts.
Scalability
One of the most significant features of the MariaDB K8s Operator is its ability to scale MariaDB instances seamlessly. This scalability is crucial in cloud-native environments where application demands can fluctuate rapidly.
- Horizontal Scaling: Using the K8s scale subresource, the operator enables easy horizontal scaling, allowing the number of MariaDB instances to be increased or decreased automatically based on the workload. This ensures that the database can handle varying levels of traffic without manual intervention.
- Vertical Scaling: Besides horizontal scaling, the operator also supports vertical scaling, enabling the adjustment of resources (like CPU and memory) allocated to each database instance. This flexibility ensures optimal resource utilization.
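To make the scale subresource more concrete, here is a minimal Python sketch of the request a client could send to change the replica count. The URL layout and the `spec.replicas` patch body follow the general Kubernetes API conventions for scale subresources. The group, version, plural and resource name used below are hypothetical placeholders, not values taken from the operator.

```python
import json


def build_scale_request(group, version, namespace, plural, name, replicas):
    """Return (path, body) for a JSON merge patch against a scale subresource."""
    if replicas < 0:
        raise ValueError("replicas must be non-negative")
    # Namespaced custom resources expose the subresource under ".../<name>/scale".
    path = f"/apis/{group}/{version}/namespaces/{namespace}/{plural}/{name}/scale"
    # Only spec.replicas needs to be sent when patching the Scale object.
    body = json.dumps({"spec": {"replicas": replicas}})
    return path, body


# Hypothetical identifiers, for illustration only.
path, body = build_scale_request(
    "example.com", "v1alpha1", "default", "mariadbs", "mariadb", 3
)
print(path)
print(body)
```

Tools such as `kubectl scale` and the Horizontal Pod Autoscaler work through this same subresource, which is why exposing it lets a custom resource plug into standard scaling workflows.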
Ease of Deployment
The operator simplifies the deployment process of MariaDB in K8s via YAML manifests, reducing the need for database-specific expertise.
- Automated Deployments: It automates several aspects of deployment, including the setup of primary-replica configurations, network settings, and storage provisioning.
- Configuration Templates: The operator offers pre-configured templates (manifests), which help in quickly setting up a database instance with users, grants, restore-from-backups and more.
High Availability
The MariaDB operator today supports multiple High Availability (HA) modes:
- Single-Primary using SemiSync replication: The operator automatically configures the primary and replica nodes. In the event of a primary failure, the operator is able to promote one replica to be primary.
- Multimaster with Galera: The operator handles Galera setup across all nodes and continuously monitors the cluster, automatically initiating recovery procedures when necessary.
While the MariaDB operator provides the above two basic mechanisms for addressing the primary and replica nodes and performing automatic primary failover today, the recommended approach for true HA in the future will be to use MaxScale. We are also looking at adding native MaxScale support in the operator to ensure optimal high availability by offering:
- Query-based routing: Transparently route write queries to the primary nodes and read queries to the replica nodes.
- Connection-based routing: Load balance connections between multiple servers.
- Enhanced automatic primary failover based on MariaDB internals.
- Replay pending transactions when a server goes down.
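The idea behind query-based routing can be illustrated with a toy Python sketch. This is not how MaxScale is implemented: a real proxy parses SQL properly and tracks transactions and session state. The class name, the prefix-based classification and the server names are all assumptions made for illustration.

```python
import itertools

# Simplifying assumption: statements starting with these keywords are reads.
READ_PREFIXES = ("SELECT", "SHOW")


class ReadWriteRouter:
    """Toy read/write splitter: reads go to replicas, everything else to the primary."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Round-robin over the replicas; fall back to the primary if there are none.
        self._replicas = itertools.cycle(replicas) if replicas else None

    def route(self, sql):
        statement = sql.lstrip().upper()
        is_read = statement.startswith(READ_PREFIXES)
        # SELECT ... FOR UPDATE takes locks, so treat it as a write.
        if is_read and "FOR UPDATE" in statement:
            is_read = False
        if is_read and self._replicas is not None:
            return next(self._replicas)
        return self.primary


demo = ReadWriteRouter("primary-0", ["replica-0", "replica-1"])
for query in ("SELECT 1", "INSERT INTO t VALUES (1)", "SELECT 2"):
    print(query, "->", demo.route(query))
```

The application keeps a single connection endpoint while the router decides where each statement runs, which is what makes the routing transparent.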
Backup and Restore Capabilities
Effective backup and restore capabilities are essential for data integrity and recovery.
- Automated Backups: The operator can be configured to perform automated backups at scheduled intervals, ensuring that the data is regularly backed up.
- Target Recovery Time: Specify a date and time, and the operator will match the closest backup to restore, minimizing data loss in the event of an issue.
- Storage Flexibility: Backups can make use of any Kubernetes-compatible storage, including but not limited to local storage (mounted directly on the node), network storage (e.g. NFS), S3, or any storage class installed in the cluster (e.g. EBS in AWS).
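The target recovery time matching described above can be sketched in a few lines of Python. This assumes that "closest" means the smallest absolute time difference between a backup and the target. The operator's actual selection rule may differ, and the function name and sample timestamps are hypothetical.

```python
from datetime import datetime, timezone


def closest_backup(backup_times, target):
    """Return the backup timestamp nearest to the target recovery time."""
    if not backup_times:
        raise ValueError("no backups available")
    # Compare by absolute distance; on a tie, the earliest listed backup wins.
    return min(backup_times, key=lambda t: abs((t - target).total_seconds()))


# Hypothetical daily backups taken at midnight UTC.
backups = [
    datetime(2024, 1, 1, 0, 0, tzinfo=timezone.utc),
    datetime(2024, 1, 2, 0, 0, tzinfo=timezone.utc),
    datetime(2024, 1, 3, 0, 0, tzinfo=timezone.utc),
]
target = datetime(2024, 1, 2, 9, 30, tzinfo=timezone.utc)
print(closest_backup(backups, target).isoformat())
```

With daily backups, a target of 09:30 on January 2 selects the January 2 backup, so the changes made between midnight and 09:30 are the data at risk. More frequent backups narrow that window.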
Security Enhancements
Security is a top priority, and the MariaDB K8s Operator includes features to safeguard data.
- Access Controls: The operator uses K8s’ role-based access control (RBAC), enabling fine-grained control over who can access the database.
- Secure Operator Image: The operator image is based on distroless, a minimal Linux distribution from Google that significantly reduces the attack surface.
Monitoring and Logging
For efficient database management, monitoring and logging are crucial.
- Integration with Monitoring Tools: The operator can be integrated with popular K8s monitoring tools like Prometheus, providing insights into database performance and health.
- Log Management: It facilitates log management with K8s-friendly tools such as Grafana Loki (via the Grafana stack) or Fluent Bit, enabling easy tracking of database activities and aiding in troubleshooting.
Installing Kubernetes and MariaDB
The operator is available on OperatorHub.io and can be installed via OLM in Red Hat OpenShift. Most often, people use Helm, and you can easily install the mariadb-operator in your Kubernetes cluster this way. The recommended installation will install the required custom resources, the operator itself, Prometheus Operator, and cert-manager for TLS connections.
helm repo add mariadb-operator https://mariadb-operator.github.io/mariadb-operator
helm install mariadb-operator mariadb-operator/mariadb-operator \
  --set metrics.enabled=true --set webhook.cert.certManager.enabled=true
Quickstart Deployment Guide
While the QuickStart Guide will help you get up and running fast, deploying a MariaDB instance using the Operator involves:
- Selecting the appropriate configuration options.
- Understanding best practices for deployment.
- Documentation that we’re actively working on! See how you can help.
Example YAML Manifests
- Provisioning a MariaDB server
- Configuring databases, users and grants
- Configure connections for applications
- Orchestrate and schedule SQL scripts
- Backup and restore
- Bootstrap a new MariaDB from a backup
With more on the way!
Use Cases and Advantages
Self-Managed Infrastructure
The MariaDB K8s Operator is particularly beneficial in self-managed scenarios requiring rapid scaling, high availability, and agile development and/or GitOps environments. This could be on your own hardware, third-party Kubernetes-as-a-service offerings, or PaaS platforms built on K8s.
PaaS
When coupled with PaaS platforms built on top of Kubernetes, like Red Hat OpenShift, it offers additional advantages such as reduced operational complexity, improved reliability, and better resource utilization.
Next Steps
Join the MariaDB community Slack for discussion and help from the community.
If you are using, or plan to use, MariaDB and/or MaxScale on Kubernetes, you’ll get the most out of it by getting involved and contributing. We’re looking for contributors, and pull requests are most welcome!
The MariaDB K8s Operator is a significant step forward in the deployment and management of MariaDB in K8s environments. It offers scalability, high availability, and ease of management, making it an excellent choice for modern cloud-native applications. We hope you’ll give it a try!