Monitoring

Overview

MariaDB SkySQL provides health check and service state information in the SkySQL Portal, and detailed status and metrics through SkySQL Monitoring.

SkySQL Monitoring

SkySQL Monitoring shows detailed service and server status information and metrics.

A dedicated whitelist is used for SkySQL Monitoring access, separate from the IP whitelist used for services. An IP must be whitelisted to access SkySQL Monitoring.

To access SkySQL Monitoring:

  1. Log in to the MariaDB SkySQL Portal

  2. Click on "Monitoring" in the SkySQL main menu (left navigation)

Monitoring Dashboard

The initial display for SkySQL Monitoring is the dashboard, which provides at-a-glance status, uptime, and average queries metrics. Service listings can be expanded for a server-level status view.

Click on a service name to access detailed service status and service metrics.

Server-Focused Monitoring

Servers - Caches

The Caches tab of the Servers view displays the following metrics:

  • MariaDB Table Definition Cache

  • MariaDB Table Open Cache Status

  • MariaDB Thread Cache

Servers - Database

The Database tab of the Servers view displays the following metrics:

  • MariaDB Aborted Connections

  • MariaDB Client Thread Activity

  • MariaDB Connections

  • MariaDB Opened Files / sec

  • MariaDB Open Files

  • MariaDB Open Tables

  • MariaDB Table Locks

  • Temporary Objects Created

Servers - Galera

The Galera tab of the Servers view displays the following metrics:

  • Ready: Galera only

  • Connected: Galera only

  • Status: Galera only

  • Cluster Component: Galera only

  • Replication Latency: Galera only

  • Local Send Queue: Galera only

  • Local Received Queue: Galera only

  • Flow Control Commits: Galera only

  • Flow Control Pauses: Galera only

  • Replicated Writesets: Galera only

  • Replicated Writeset Bytes: Galera only

  • Sequentially in Parallel: Galera only

Servers - Queries

The Queries tab of the Servers view displays the following metrics:

  • MariaDB Handlers / sec

  • MariaDB QPS and Questions

  • MariaDB Select Types

  • MariaDB Slow Queries

  • MariaDB Sorts

  • MariaDB Transaction Handlers / sec

  • Top Command Counters

  • Top Command Counters Hourly

Servers - Status

The Status tab of the Servers view displays the following metrics:

  • Aborted Connections

  • Buffer Pool Size of Total RAM

  • CPU (Gauge)

  • CPU (Graph)

  • Current QPS

  • Current SQL Commands

  • Disk size

  • InnoDB Data / sec

  • I/O Activity

  • Network Traffic

  • RAM (Gauge)

  • RAM (Graph)

  • Rows / sec

  • Used Connections

Servers - System

The System tab of the Servers view displays the following metrics:

  • CPU Usage / Load

  • Disk Size of data

  • Disk Size of logs

  • I/O Activity

  • IOPS

  • InnoDB Data / sec

  • MariaDB Memory Overview

  • MariaDB Network Traffic

  • MariaDB Network Usage Hourly

  • Memory Distribution

  • Network Errors

  • Network Packets Dropped

  • Network Traffic

Servers - Tables

The Tables tab of the Servers view displays the following metrics:

  • Table Sizes: Includes a Table Sizes monitor for each database on the server.

Service-Focused Monitoring

Service - Database

The Database tab of the Service view displays the following metrics:

  • MariaDB Aborted Connections: Except Galera

  • MariaDB Open Tables: Except Galera

  • MariaDB Service Connections: Except Galera

  • MariaDB Table Locks: Except Galera

  • MariaDB Table Opened: Except Galera

  • MaxScale Server Connections: Except Galera

Service - Lags

The Lags tab of the Service view is shown only for HA (Primary/Replica), Galera, and HTAP.

The Lags tab of the Service view displays the following metrics:

  • Exec Primary Log Position: HA (Primary/Replica) and HTAP only

  • GTID Replication Position

  • Queue Received: Galera only

  • Queue Send: Galera only

  • Read Primary Log Position: HA (Primary/Replica) and HTAP only

  • Seconds Behind Primary: HA (Primary/Replica) only
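The per-topology notes for the Lags tab can be read as a small lookup table. The following Python sketch is illustrative only: the names `LAGS_TAB_TOPOLOGIES`, `LAGS_CHARTS`, and `lags_charts_for` are hypothetical and not part of any SkySQL API, and it assumes that a chart listed without a note (GTID Replication Position) is shown for every topology that has a Lags tab.

```python
# Topologies for which the Lags tab is shown, as stated in the text.
LAGS_TAB_TOPOLOGIES = ("HA (Primary/Replica)", "Galera", "HTAP")

# Chart name -> topologies it is shown for, taken from the notes above.
# Assumption: a chart with no note appears for all Lags-tab topologies.
LAGS_CHARTS = {
    "Exec Primary Log Position": ("HA (Primary/Replica)", "HTAP"),
    "GTID Replication Position": LAGS_TAB_TOPOLOGIES,
    "Queue Received": ("Galera",),
    "Queue Send": ("Galera",),
    "Read Primary Log Position": ("HA (Primary/Replica)", "HTAP"),
    "Seconds Behind Primary": ("HA (Primary/Replica)",),
}


def lags_charts_for(topology):
    """Return the Lags-tab charts expected for a topology (empty if no Lags tab)."""
    if topology not in LAGS_TAB_TOPOLOGIES:
        return []
    return [chart for chart, shown_for in LAGS_CHARTS.items() if topology in shown_for]


print(lags_charts_for("Galera"))
```

For a topology without a Lags tab, such as ColumnStore, the function returns an empty list.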

Service - Queries

The Queries tab of the Service view displays the following metrics:

  • MariaDB Client Thread Activity

  • MariaDB QPS

  • MariaDB Questions / sec

  • MariaDB Slow Queries

  • Top Command Counters

  • Top Command Counters Hourly

Service - Status

The Status tab of the Service view displays the following metrics:

  • Connections

  • CPU Load

  • Current SQL Commands

  • Disk Size of data

  • Disk Size of logs

  • Exec Primary Log Position: HTAP only

  • MariaDB Slow Queries

  • QPS

  • Read Primary Log Position: HTAP only

  • Replica Lags: HA (Primary/Replica) only

  • Replica Lag: HTAP only

  • Replicas Status: HA (Primary/Replica) only

  • Replicas Status: HTAP only

Service - System

The System tab of the Service view displays the following metrics:

  • CPU Load

  • Database Disk Size

  • I/O Activity - Page In

  • I/O Activity - Page Out

  • IOPS - Page In

  • IOPS - Page Out

  • Logs Disk Size: Except ColumnStore, HTAP

  • MariaDB Network Traffic

  • MariaDB Network Usage Hourly

  • Memory Usage

  • Network Errors

  • Network Packets Dropped

  • Network Traffic - Inbound

  • Network Traffic - Outbound

MaxScale-Focused Monitoring

The MaxScale server view is shown only for HA (Primary/Replica) and HTAP.

MaxScale Status

The Status tab of the MaxScale view displays the following metrics:

  • Utilization: Reports the current CPU and RAM usage.

  • Server Routed Packets: Reports the number of packets routed to Primary and Replica servers.

  • Thread Count: Reports the current number of threads for the instance.

  • RO Service Connections: Reports the current number of read-only connections.

  • RW/sec: Charts the number of reads per second and the number of writes per second.

  • Resident: Reports the amount of resident memory in use.

  • RW Service Connections: Reports the number of read-write connections.

  • Stack Size: Reports the size of the memory stack.

  • Service Connections: Charts the number of read-only and read-write connections.

  • Server Connections: Charts connections to Primary and Replica servers.

  • Max Time in Queue: Charts the maximum amount of time it took for an I/O thread to become ready for processing.

MaxScale Performance

The Performance tab of the MaxScale view displays the following metrics:

  • Memory: Charts the amount of memory used by the instance.

  • Errors: Charts error events for the instance.

  • MaxScale Hangups: Charts hangup events for the instance.

  • Event Queue Length: Charts the number of events in the Event Queue.

  • MaxScale Descriptors: Charts current descriptors for the instance in relation to the total number of descriptors.

  • Server Routed Packets: Charts the number of packets routed from each Server through the MaxScale Instance.

MaxScale Modules

The Modules tab of the MaxScale view displays a list of each module configured in the MaxScale instance.

Distributed Transactions Topology Monitoring

Enhanced monitoring is provided for the Distributed Transactions topology.

The following charts are provided, listed by location:

  • Service > Status:

    • Nodes in Cluster

    • Nodes in quorum

    • Uptime

    • Rebalancer - Underprotected Slices

    • Rebalancer Jobs

    • Current Rebalancer Actions

    • Rebalancer Activity in the last 24 hours

    • WAL sync time in the last 24 hours (5 minute intervals)

  • Service > Queries:

    • Current Execution Times

    • Current Avg Cluster Latency - Read

    • Current Avg Cluster Latency - Write

    • TPS

    • QPS

    • TPS (at 5 minute intervals)

    • QPS (at 5 minute intervals)

    • TPS Hourly

    • QPS Hourly

    • Top Command Counters in the last 24 hours (5 minute intervals)

    • Query Latency over the last 24 hours (5 minute intervals)

  • Service > Connections:

    • Sessions in the last 24 hours (5 minute intervals)

    • Session transaction age in the last 24 hours (5 minute intervals)

    • Session time in state in the last 24 hours (5 minute intervals)

    • Connections/minute in the last 24 hours

  • Service > System:

    • CPU

    • RAM

    • Disk

    • CPU Usage over the last 24 hours (5 minute intervals)

    • Memory Utilization in the last 24 hours (5 minute intervals)

    • Container Memory Usage in the last 24 hours (5 minute intervals)

    • Container Memory Hourly

    • Xpand Total Memory in the last 24 hours (5 minute intervals)

    • Storage allocation in the last 24 hours (5 minute intervals)

    • Disk space usage in the last 24 hours (5 minute intervals)

    • Average Network Latency in the last 24 hours (5 minute intervals)

    • I/O Latency in the last 24 hours (5 minute intervals)

  • Service > Historical:

    • Transactions per Second (TPS)

    • Queries Per Second (QPS)

    • Queries Per Second (QPS) by Query Type

    • Top Command Counters

    • CPU Usage

    • Rebalancer Activity

    • Memory Utilization

    • Container Memory Usage

    • Xpand Total Memory

    • Storage allocation

    • Disk space usage

    • Query Latency

    • Average Network Latency

    • I/O Latency

    • WAL sync time

    • Sessions

    • Session transaction age

    • Session time in state

    • Connections/minute

  • Servers > MaxScale > Status:

    • CPU/RAM Utilization

    • Threads

    • Connections

    • RW/sec

    • Stack Size

    • Resident Memory

    • MaxScale Connections

    • Database Server Connections

    • Max Time in Queue

  • Servers > MaxScale > Performance:

    • Memory

    • Errors

    • MaxScale Hangups

    • Event Queue Length

    • MaxScale Descriptors

  • Servers > MaxScale > Modules:

    • MaxScale Modules (name, type, version, enabled)

  • Servers > Xpand > Status:

    • RAM (Gauge)

    • RAM (Graph)

    • CPU (Gauge)

    • CPU (Graph)

    • Disk size and usage

  • Servers > Xpand > System:

    • Current CPU Usage

    • Current Memory Utilization

    • Current Storage

    • CPU Usage in the last 24 hours (5 minute intervals)

    • Memory Utilization in the last 24 hours (5 minute intervals)

    • Container Memory Distribution in the last 24 hours (5 minute intervals)

    • Xpand Memory Distribution in the last 24 hours (5 minute intervals)

    • Storage in the last 24 hours (5 minute intervals)

    • Average Network Latency over the last 24 hours (5 minute intervals)

    • I/O Latency over the last 24 hours (5 minute intervals)

  • Servers > Xpand > Historical:

    • Container Row Operations

    • CPU

    • TIL Core Utilization

    • Core 0 Utilization

    • Memory Utilization

    • Container Memory Distribution

    • Xpand Memory Distribution

    • Storage

    • Average Network Latency

    • I/O Latency

    • Network Data

Health Check and Service State

  • Health Check and Service State are shown on the Services view and Service Details view in the SkySQL Portal.

  • When a database service is Healthy and Running, it is prepared to receive whitelisted client connections.

  • Other states:

    • Unhealthy health check is shown when a database service encounters a service fault, or is being restarted by Configuration Manager.

    • Pending health check and Pending service state are shown when a service is being created.

    • Stopping service state, leading to Stopped service state, is shown when an instance is Stopped through the Portal.

    • Starting service state, leading to Running service state, is shown when an instance is Started through the Portal.

    • Internal Error service state is shown when a fault occurs during the launch or termination of a service.
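The health check and service state combinations listed above can be modeled as two small enumerations. This is a minimal sketch for illustration, not a SkySQL API: the class names, `accepts_client_connections`, and `NEXT_STATE` are hypothetical, and only the states and transitions named in the list are included.

```python
from enum import Enum


class HealthCheck(Enum):
    HEALTHY = "Healthy"
    UNHEALTHY = "Unhealthy"
    PENDING = "Pending"


class ServiceState(Enum):
    PENDING = "Pending"
    STARTING = "Starting"
    RUNNING = "Running"
    STOPPING = "Stopping"
    STOPPED = "Stopped"
    INTERNAL_ERROR = "Internal Error"


def accepts_client_connections(health, state):
    """True only for the Healthy + Running combination described in the list."""
    return health is HealthCheck.HEALTHY and state is ServiceState.RUNNING


# Transitions named in the list: Stopping leads to Stopped, Starting leads to Running.
NEXT_STATE = {
    ServiceState.STOPPING: ServiceState.STOPPED,
    ServiceState.STARTING: ServiceState.RUNNING,
}

print(accepts_client_connections(HealthCheck.HEALTHY, ServiceState.RUNNING))
print(accepts_client_connections(HealthCheck.PENDING, ServiceState.PENDING))
```

A client-side readiness check could use the same rule: treat any combination other than Healthy and Running as not yet ready for whitelisted connections.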

Alerts and Notifications

SkySQL Monitoring performs checks against servers and services, and provides access to status and metrics via a web interface. SkySQL Monitoring also includes alerting features that send rule-based problem notifications to customer-defined contacts.

While SkySQL Monitoring is Generally Available (GA), the alerting features are available as a Technical Preview.

To access the alerting features, click on the "Alerts" link in the SkySQL Monitoring main menu (left navigation).

The default view of "Alerts and Notifications" provides access to "Active" alerts and a "History" of past alerts.

Click on the gear icon (upper right) to modify alerting rules, define notification criteria, and define the channels that will receive notifications. Use the "Add" button (bottom of screen) to add new entries.

Each rule specifies a metric, a value-based threshold, and a time interval. Rules are tagged for grouping.

Notification criteria bind alerting rules and notification channels together, defining who will receive which notifications. Notification criteria are defined by tag.

Notification channels currently support email contacts.
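The relationship between rules, notification criteria, and channels can be sketched as a small data model. This is an illustration under stated assumptions, not the SkySQL implementation: every class, field, and function name here is hypothetical, the email addresses are placeholders, and the breach check assumes a rule fires only when every sample inside the time interval exceeds the threshold (the source does not define how the interval is evaluated).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str
    metric: str
    threshold: float
    interval_seconds: int
    tags: frozenset


@dataclass(frozen=True)
class Channel:
    name: str
    emails: tuple


@dataclass(frozen=True)
class Criteria:
    tag: str  # criteria are defined by tag
    channel: Channel


def recipients_for(rule, criteria_list):
    """Collect email contacts from every criteria entry whose tag is on the rule."""
    found = []
    for criteria in criteria_list:
        if criteria.tag in rule.tags:
            for email in criteria.channel.emails:
                if email not in found:
                    found.append(email)
    return found


def is_breached(rule, samples):
    """Assumed semantics: all (age_seconds, value) samples in the interval exceed the threshold."""
    recent = [value for age, value in samples if age <= rule.interval_seconds]
    return bool(recent) and all(value > rule.threshold for value in recent)


dba = Channel("dba-team", ("dba@example.com",))
dev = Channel("dev-team", ("dev@example.com",))
rule = Rule("high-cpu", "cpu_load", 90.0, 300, frozenset({"production"}))
criteria = [Criteria("production", dba), Criteria("staging", dev)]

print(recipients_for(rule, criteria))
print(is_breached(rule, [(60, 95.0), (240, 97.5), (600, 10.0)]))
```

Because criteria match on tags rather than on individual rules, adding a new rule with an existing tag routes its notifications to the same contacts without further configuration.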

Transactional Storage

Services that use transactional storage, such as HA, Xpand, and Galera, have fixed disk sizes. Services launched before the 2021-08-09 update may show up to 30% additional disk space. Customers are billed only for the requested disk size.
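As a worked example of the disk-size note above, the following sketch computes the largest disk size such a service might display. The function name and the 100 GB request are hypothetical, chosen only to illustrate the arithmetic; the 30% figure is the upper bound stated in the text.

```python
def max_displayed_disk_gb(requested_gb, extra_percent=30):
    """Upper bound on displayed disk size for a service launched before the update."""
    return requested_gb * (100 + extra_percent) / 100


requested_gb = 100  # hypothetical requested disk size
print(max_displayed_disk_gb(requested_gb))  # displayed size may be up to this value
print(requested_gb)  # billing is based on the requested size only
```

In this example the service may show up to 130 GB while billing remains based on the requested 100 GB.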

For additional information, see "Transactional Storage".