A filter resource represents an instance of a filter inside MaxScale. Multiple services can use the same filter and a single service can use multiple filters.
The :name in all of the URIs must be the name of a filter in MaxScale.
Get a single filter.
GET /v1/filters/:name
Response
Status: 200 OK
{
"data": {
"attributes": {
"filter_diagnostics": null,
"module": "qlafilter",
"parameters": {
"append": false,
"duration_unit": "ms",
"exclude": null,
"filebase": "/tmp/qla.log",
"flush": true,
"log_data": "date,user,query",
"log_type": "unified",
"match": null,
"module": "qlafilter",
"newline_replacement": " ",
"options": "",
"separator": ",",
"source": null,
"source_exclude": null,
"source_match": null,
"use_canonical_form": false,
"user": null,
"user_exclude": null,
"user_match": null
},
"source": {
"file": "/etc/maxscale.cnf",
"type": "static"
}
},
"id": "QLA",
"links": {
"self": "http://localhost:8989/v1/filters/QLA/"
},
"relationships": {
"services": {
"data": [
{
"id": "Read-Connection-Router",
"type": "services"
}
],
"links": {
"related": "http://localhost:8989/v1/services/",
"self": "http://localhost:8989/v1/filters/QLA/relationships/services/"
}
}
},
"type": "filters"
},
"links": {
"self": "http://localhost:8989/v1/filters/QLA/"
}
}
Get all filters.
GET /v1/filters
Response
Status: 200 OK
{
"data": [
{
"attributes": {
"module": "hintfilter",
"parameters": {
"module": "hintfilter"
},
"source": {
"file": "/etc/maxscale.cnf",
"type": "static"
}
},
"id": "Hint",
"links": {
"self": "http://localhost:8989/v1/filters/Hint/"
},
"relationships": {
"services": {
"data": [
{
"id": "Read-Connection-Router",
"type": "services"
}
],
"links": {
"related": "http://localhost:8989/v1/services/",
"self": "http://localhost:8989/v1/filters/Hint/relationships/services/"
}
}
},
"type": "filters"
},
{
"attributes": {
"filter_diagnostics": null,
"module": "qlafilter",
"parameters": {
"append": false,
"duration_unit": "ms",
"exclude": null,
"filebase": "/tmp/qla.log",
"flush": true,
"log_data": "date,user,query",
"log_type": "unified",
"match": null,
"module": "qlafilter",
"newline_replacement": " ",
"options": "",
"separator": ",",
"source": null,
"source_exclude": null,
"source_match": null,
"use_canonical_form": false,
"user": null,
"user_exclude": null,
"user_match": null
},
"source": {
"file": "/etc/maxscale.cnf",
"type": "static"
}
},
"id": "QLA",
"links": {
"self": "http://localhost:8989/v1/filters/QLA/"
},
"relationships": {
"services": {
"data": [
{
"id": "Read-Connection-Router",
"type": "services"
}
],
"links": {
"related": "http://localhost:8989/v1/services/",
"self": "http://localhost:8989/v1/filters/QLA/relationships/services/"
}
}
},
"type": "filters"
}
],
"links": {
"self": "http://localhost:8989/v1/filters/"
}
}
POST /v1/filters
Create a new filter. The posted object must define at least the following fields.
data.id
Name of the filter
data.type
Type of the object, must be filters
data.attributes.module
The filter module to use
All of the filter parameters should be defined at creation time in the data.attributes.parameters object.
As the service to filter relationship is ordered (filters are applied in the order they are listed), filter to service relationships cannot be defined at creation time.
The following example defines a request body which creates a new filter.
{
"data": {
"id": "test-filter", // Name of the filter
"type": "filters",
"attributes": {
"module": "qlafilter", // The filter uses the qlafilter module
"parameters": { // Filter parameters
"filebase": "/tmp/qla.log"
}
}
}
}
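As a hypothetical example, the request could be sent with curl, assuming the body above is saved in new-filter.json and the REST API listens on localhost:8989 with the default admin:mariadb credentials:

curl -u admin:mariadb -H 'Content-Type: application/json' -X POST -d @new-filter.json http://localhost:8989/v1/filters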
Response
Filter is created:
Status: 204 No Content
PATCH /v1/filters/:name
Filter parameters can be updated at runtime if the module supports it. Refer to the individual module documentation for more details on whether it supports runtime configuration and which parameters can be updated.
The following example modifies a filter by changing the match parameter to .*users.*.
{
"data": {
"attributes": {
"parameters": {
"match": ".*users.*"
}
}
}
}
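As a sketch, assuming the payload above is stored in patch.json and the default admin:mariadb credentials are used, the request could be sent as:

curl -u admin:mariadb -H 'Content-Type: application/json' -X PATCH -d @patch.json http://localhost:8989/v1/filters/QLA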
Response
Filter is modified:
Status: 204 No Content
DELETE /v1/filters/:filter
The :filter in the URI must map to the name of the filter to be destroyed.
A filter can only be destroyed if no service uses it. This means that the data.relationships object for the filter must be empty. Note that the service → filter relationship cannot be modified from the filters resource and must be done via the services resource.
This endpoint also supports the force=yes parameter that will unconditionally delete the filter by first removing it from all services that use it.
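For example, a forced deletion of the QLA filter could be done as follows (assuming the default admin:mariadb credentials):

curl -u admin:mariadb -X DELETE 'http://localhost:8989/v1/filters/QLA?force=yes'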
Response
Filter is destroyed:
Status: 204 No Content
This document provides a short overview of the readwritesplit router module and its intended use case scenarios. It also describes all router configuration parameters, lists the current limitations of the module, and provides usage examples.
The readwritesplit router is designed to increase the read-only processing capability of a cluster while maintaining consistency. This is achieved by splitting the query load into read and write queries. Read queries, which do not modify data, are spread across multiple nodes while all write queries will be sent to a single node. For more details on how the load balancing works, refer to slave_selection_criteria and master_accept_reads.
The router is designed to be used with a traditional Primary-Replica replication cluster. It automatically detects changes in the primary server and will use the current primary server of the cluster. With a Galera cluster, one can achieve a resilient setup and easy primary failover by using one of the Galera nodes as a Write-Primary node, where all write queries are routed, and spreading the read load over all the nodes.
Maintenance and Draining state
When a server that readwritesplit uses is put into maintenance mode, any ongoing requests are allowed to finish before the connection is closed. If the server that is put into maintenance mode is a primary, open transactions are allowed to complete before the connection is closed. Note that this means neither idle sessions nor long-running transactions will be closed by readwritesplit. To forcefully close the connections, use the following command:
maxctrl set server <server> maintenance --force
If a server is put into the Draining state while a connection is open, the connection will be used normally. Whenever a new connection needs to be created, whether due to a network error or a new session being opened, only servers that are neither Draining nor Drained will be used.
Readwritesplit router-specific settings are specified in the configuration file of MariaDB MaxScale in its specific section. The section can be freely named but the name is used later as a reference in a listener section.
For more details about the standard service parameters, refer to the Configuration Guide.
Starting with 2.3, all router parameters can be configured at runtime. Use maxctrl alter service to modify them. The changed configuration will only be taken into use by new sessions.
max_slave_connections
Type: integer
Mandatory: No
Dynamic: Yes
Default: 255
max_slave_connections sets the maximum number of replicas a router session uses at any moment. The default is to use at most 255 replica connections per client connection. In older versions the default was to use all available replicas with no limit.
For MaxScale 2.5.12 and newer, the minimum value is 0.
For MaxScale versions 2.5.11 and older, the minimum value is 1. These versions suffer from a bug (MXS-3536) that causes the parameter to accept any value but to only function correctly when a value greater than one was given.
Starting with MaxScale 2.5.0, the use of percentage values in max_slave_connections is deprecated. The support for percentages will be removed in a future release.
For example, if you have configured MaxScale with one primary and three replicas and set max_slave_connections=2, a connection to the primary and two replica connections would be opened for each client connection. The read query load balancing is then done between these two replicas and writes are sent to the primary.
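As a minimal sketch, such a setup could be configured as follows (the service, server, and credential names are placeholders):

[Splitter-Service]
type=service
router=readwritesplit
servers=primary1,replica1,replica2,replica3
user=maxuser
password=maxpwd
max_slave_connections=2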
By tuning this parameter, you can control how dynamic the load balancing is at the cost of extra created connections. With a lower value of max_slave_connections, fewer connections per session are created and the set of possible replica servers is smaller. With a higher value of max_slave_connections, more connections are created, which requires more resources, but load balancing will almost always give the best single query response time and performance. Longer sessions are less affected by a high max_slave_connections as the relative cost of opening a connection is lower.
Behavior of max_slave_connections=0
When readwritesplit is configured with max_slave_connections=0, it behaves slightly differently in that it routes all reads to the current master server. This is a convenient way to force all of the traffic to go to a single node while still being able to leverage the replay and reconnection features of readwritesplit.
In this mode, the behavior of master_failure_mode=fail_on_write also changes slightly. If the current Master server fails and a read is done when there's no other Master server available, the connection will be closed. This is done to prevent an extra slave connection from being opened that would not be closed if a new Master server were to appear.
slave_connections
Type: integer
Mandatory: No
Dynamic: Yes
Default: 255
This parameter controls how many replica connections each new session starts with. The default value is 255, which is the same as the default value of max_slave_connections.
In contrast to max_slave_connections, slave_connections serves as a soft limit on how many replica connections are created. The number of replica connections can exceed slave_connections if the load balancing algorithm finds an unconnected replica server that is better than all other replicas.
Setting this parameter to 1 allows faster connection creation and improved resource usage due to the smaller number of initial backend connections. It is recommended to use slave_connections=1 when the lifetime of the client connections is short.
max_replication_lag
Type: duration
Mandatory: No
Dynamic: Yes
Default: 0s
NOTE Up until 23.02, this parameter was called max_slave_replication_lag, which has been deprecated but still works as an alias for max_replication_lag.
Specify how many seconds a replica is allowed to be behind the primary. The lag of a replica must be less than the configured value in order for it to be used for routing. If set to 0s (the default value), the feature is disabled.
The replica lag must be strictly less than max_replication_lag. This means that it is possible to require, with max_replication_lag=1s, that all replicas be fully up to date in order for them to be used for routing.
Note that this feature does not guarantee that writes done on the primary are visible for reads done on the replica. This is mainly due to the method of replication lag measurement. For a feature that guarantees this, refer to causal_reads.
The lag is specified as documented here. Note that since the granularity of the lag is seconds, a lag specified in milliseconds will be rejected, even if the duration is longer than a second.
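For example, to only route reads to replicas that are at most 30 seconds behind the primary, the following could be added to the service section (the value is only an example):

max_replication_lag=30s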
The readwritesplit router does not detect the replication lag itself. A monitor such as the MariaDB Monitor for a Primary-Replica cluster is required. This option only affects Primary-Replica clusters. Galera clusters do not have a concept of replica lag even if the application of write sets might lag.
When a server is disqualified from routing because of replication lag, a warning is logged. Similarly, when the server has caught up enough to be a valid routing target, another warning is logged. These messages are only logged when a query is being routed and the replication state changes.
Starting with MaxScale versions 23.08.7, 24.02.3 and 24.08.1, readwritesplit will discard connections to any servers that have excessive replication lag. The connection is discarded if a server is lagging behind by more than twice the amount of max_replication_lag and by more than 300 seconds (replication lag > MAX(300, 2 * max_replication_lag)).
use_sql_variables_in
Type: enum
Mandatory: No
Dynamic: Yes
Values: master, all
Default: all
This parameter controls how SELECT statements that use SQL user variables are handled. Here is an example of such a query, which uses a user variable to return an increasing row number for a resultset:
SET @rownum := 0;
SELECT @rownum := @rownum + 1 AS rownum, user, host FROM mysql.user;
By default MaxScale will route both the SET and SELECT statements to all nodes. Any future reads of the user variables can also be performed on any node.
The possible values for this parameter are:
all (default)
Modifications to user variables inside SELECT statements, as well as reads of user variables, are routed to all servers.
Versions before MaxScale 22.08 returned an error if a user variable was modified inside of a SELECT statement when use_sql_variables_in=all was used. MaxScale 22.08 will instead route the query to all servers and discard the extra results.
master
Modifications to user variables inside SELECT statements, as well as reads of user variables, are routed to the primary server. This forces more of the traffic onto the primary server, but it reduces the amount of data that is discarded for any SELECT statement that also modifies a user variable. With this mode, the state of user variables is not deterministic if they are modified inside of a SELECT statement. SET statements that modify user variables are still routed to all servers.
DML statements, such as INSERT, UPDATE or DELETE, that modify SQL user variables are still treated as writes and are only routed to the primary server. For example, after the following query the value of @myid is no longer the same on all servers and the SELECT statement can return different values depending on where it ends up being executed:
SET @myid := 0;
INSERT INTO test.t1 VALUES (@myid := @myid + 1);
SELECT @myid; -- Might return 1 or 0
connection_keepalive
Note: This parameter has been moved into the MaxScale core. For the current documentation, read the connection_keepalive section in the configuration guide.
Send keepalive pings to backend servers. This feature was introduced in MaxScale 2.2.0. The default value is 300 seconds starting with 2.3.2 and for older versions the feature was disabled by default. This parameter was converted into a service parameter in MaxScale 2.5.0.
master_reconnection
Type: boolean
Mandatory: No
Dynamic: Yes
Default: false
Allow the primary server to change mid-session. This feature was introduced in MaxScale 2.3.0 and is disabled by default. This feature requires that disable_sescmd_history is not used.
When a readwritesplit session starts, it will pick a primary server as the current primary server of that session. By default, when this primary server is lost or changes to another server, the connection will be closed.
When master_reconnection is enabled, readwritesplit can sometimes recover a lost connection to the primary server. This largely depends on the value of master_failure_mode.
With master_failure_mode=fail_instantly, the primary server is only allowed to change to another server. This change must happen without a loss of the primary server.
With master_failure_mode=fail_on_write, the loss of the primary server is no longer a fatal error: if a replacement primary server appears before any write queries are received, readwritesplit will transparently reconnect to the new primary server.
In both cases the change in the primary server can only take place if prune_sescmd_history is enabled or max_sescmd_history has not yet been exceeded, and the session does not have an open transaction.
The recommended configuration is to use master_reconnection=true and master_failure_mode=fail_on_write. This provides improved fault tolerance without any risk to the consistency of the database.
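A sketch of this recommended combination as it would appear in the service section:

master_reconnection=true
master_failure_mode=fail_on_write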
slave_selection_criteria
Type: enum
Mandatory: No
Dynamic: Yes
Values: least_current_operations, adaptive_routing, least_behind_master, least_router_connections, least_global_connections
Default: least_current_operations
This option controls how the readwritesplit router chooses the replicas it connects to and how the load balancing is done. The default behavior is to route read queries to the replica server with the lowest number of ongoing queries, i.e. least_current_operations.
All of the load balancing methods use MaxScale's own accounting. Connections and queries done directly on the database and not through MaxScale are not taken into account by readwritesplit. For example, if server A has 100 queries running all of which are routed through MaxScale and server B has 115 queries but only 95 of those were routed through MaxScale, server B is considered a better candidate even if the absolute number of active queries on it is higher. This is because MaxScale only tracks the connections and queries routed through the same process.
The option syntax:
slave_selection_criteria=<criteria>
Where <criteria> is one of the following values:
least_current_operations (default), the replica with the least active operations
adaptive_routing, based on server average response times
least_behind_master, the replica with the smallest replication lag
least_global_connections, the replica with the least connections from MariaDB MaxScale
least_router_connections, the replica with the least connections from this service
least_current_operations uses the current number of active operations (i.e. SQL queries) as the load balancing metric and optimizes for maximal query throughput. Each query gets routed to the server with the least active operations, which results in faster servers processing more traffic. If two servers have an equal number of active operations, the one that was least recently used is chosen.
adaptive_routing uses the server response time and the current estimated server load as the load balancing metric. The server that is estimated to finish an additional query first is chosen. A modified average response time for each server is continuously updated to allow slow servers at least some traffic and to quickly react to changes in server load conditions. If a server has not received any traffic, the network lag to the server as measured by the monitor is used as a proxy for the true response time. This selection criterion is designed for heterogeneous clusters: servers of differing hardware, differing network distances, or servers running other loads (including a backup). If the servers are queried by clients other than MaxScale, the load caused by them is indirectly taken into account.
least_behind_master uses the measured replication lag as the load balancing metric. This means that servers that are more up-to-date are favored, which increases the likelihood of the data being read being up-to-date. However, this is not as effective as causal_reads would be, as there's no guarantee that writes done by the same connection will be routed to a server that has replicated those changes. The recommended approach is to use LEAST_CURRENT_OPERATIONS or ADAPTIVE_ROUTING in combination with causal_reads.
NOTE: least_global_connections and least_router_connections should not be used; they are legacy options that exist only for backwards compatibility. Using them will result in skewed load balancing as the algorithm uses a metric that's too coarse (number of connections) to load balance something that's finer (individual SQL queries).
The least_global_connections and least_router_connections options use the number of connections from MariaDB MaxScale to the server, not the number of connections reported by the server itself.
Starting with MaxScale versions 2.5.29, 6.4.11, 22.08.9, 23.02.5 and 23.08.1, lowercase versions of the values are also accepted. For example, slave_selection_criteria=LEAST_CURRENT_OPERATIONS and slave_selection_criteria=least_current_operations are both accepted as valid values.
Starting with MaxScale 23.08.1, the legacy uppercase values have been deprecated. All runtime modifications of the parameter will now be persisted in lowercase. The uppercase values are still accepted but will be removed in a future MaxScale release.
max_sescmd_history
This parameter has been moved to the MaxScale core in MaxScale 6.0.
disable_sescmd_history
This parameter has been moved to the MaxScale core in MaxScale 6.0.
prune_sescmd_history
This parameter has been moved to the MaxScale core in MaxScale 6.0.
master_accept_reads
Type: boolean
Mandatory: No
Dynamic: Yes
Default: false
master_accept_reads allows the primary server to be used for reads. This is a useful option to enable if you are using a small number of servers and wish to use the primary for reads as well.
By default, no reads are sent to the primary as long as there is a valid replica server available. If no replicas are available, reads are sent to the primary regardless of the value of master_accept_reads.
# Use the primary for reads
master_accept_reads=true
strict_multi_stmt
Type: boolean
Mandatory: No
Dynamic: Yes
Default: false
This option is disabled by default since MaxScale 2.2.1. In older versions, this option was enabled by default.
When a client executes a multi-statement query, it will be treated as if it were a DML statement and routed to the primary. If the option is enabled, all queries after a multi-statement query will be routed to the primary to guarantee a consistent session state.
If the feature is disabled, queries are routed normally after a multi-statement query.
Warning: Enable the strict mode only if you know that the clients will send statements that cause inconsistencies in the session state.
# Enable strict multi-statement mode
strict_multi_stmt=true
strict_sp_calls
Type: boolean
Mandatory: No
Dynamic: Yes
Default: false
Similar to strict_multi_stmt, this option causes all queries after a CALL operation on a stored procedure to be routed to the primary. This option is disabled by default and was added in MaxScale 2.1.9.
All warnings and restrictions that apply to strict_multi_stmt also apply to strict_sp_calls.
strict_tmp_tables
Type: boolean
Mandatory: No
Dynamic: Yes
Default: false
By default, all temporary tables are lost when a reconnection to the primary node occurs. This means that when master_reconnection is enabled, temporary tables might appear to disappear when a reconnection happens.
If strict_tmp_tables is enabled, reconnections are prevented as long as temporary tables exist. In this case, if the primary node is lost while temporary tables exist, the session is closed. If a session creates temporary tables but does not drop them, this behavior will effectively disable reconnections until the session is closed.
master_failure_mode
Type: enum
Mandatory: No
Dynamic: Yes
Values: fail_instantly, fail_on_write, error_on_write
Default: fail_instantly
This option controls how the failure of a primary server is handled. By default, the router will close the client connection as soon as the primary is lost.
The following table describes the values for this option and how they treat the loss of a primary server.
fail_instantly: When the failure of the primary server is detected, the connection will be closed immediately.
fail_on_write: The client connection is closed if a write query is received when no primary is available.
error_on_write: If no primary is available and a write query is received, an error is returned stating that the connection is in read-only mode.
These rules also apply to new sessions created after the primary has failed. This means that in fail_on_write or error_on_write mode, connections are accepted as long as replica servers are available.
When configured with fail_on_write or error_on_write, sessions that are idle will not be closed even if all backend connections for that session have failed. This is done in the hope that a reconnection to one of the replicas is made before the next query from the idle session arrives. However, this can leave idle connections around unless the client application actively closes them. To prevent this, use the connection_timeout parameter.
Note: If master_failure_mode is set to error_on_write and the connection to the primary is lost, by default, clients will not be able to execute write queries without reconnecting to MariaDB MaxScale once a new primary is available. If master_reconnection is enabled, the session can recover if one of the replicas is promoted to primary.
retry_failed_reads
Type: boolean
Mandatory: No
Dynamic: Yes
Default: true
This option controls whether autocommit selects are retried in case of failure. This option is enabled by default.
When a simple autocommit select is being executed outside of a transaction and the replica server where the query is being executed fails, readwritesplit can retry the read on a replacement server. This makes the failure of a replica transparent to the client.
If a part of the result was already delivered to the client, the query will not be retried. The retrying of queries with partially delivered results is only possible when transaction_replay is enabled.
delayed_retry
Type: boolean
Mandatory: No
Dynamic: Yes
Default: false
Retry queries over a period of time. This parameter takes a boolean value, was added in MaxScale 2.3.0 and is disabled by default.
When this feature is enabled, a failure to route a query due to a connection problem will not immediately result in an error. The routing of the query is delayed until either a valid candidate server is available or the retry timeout is reached. If a candidate server becomes available before the timeout is reached, the query is routed normally and no connection error is returned. If no candidates are found and the timeout is exceeded, the router returns to normal behavior and returns an error.
When combined with the master_reconnection parameter, failures of writes done outside of transactions can be hidden from the client connection. This allows a primary to be replaced while writes are being sent.
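As a sketch, a service that hides transient write failures during a primary switchover could combine the two parameters like this in its service section:

delayed_retry=true
master_reconnection=true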
Starting with MaxScale 21.06.18, 22.08.15, 23.02.12, 23.08.8, 24.02.4 and 24.08.1, delayed_retry will no longer attempt to retry a query if it was already sent to the database. If a query is received while a valid target server is not available, the execution of the query is delayed until a valid target is found or the delayed retry timeout is hit. If a query was already sent, it will not be replayed, to prevent duplicate execution of statements.
In older versions of MaxScale, duplicate execution of a statement can occur if the connection to the server is lost or the server crashes but comes back up before the timeout for the retrying is exceeded. At this point, if the server managed to read the client's statement, it will be executed. For this reason, it is recommended to only enable delayed_retry on older versions of MaxScale when the possibility of duplicate statement execution is an acceptable risk.
delayed_retry_timeout
Type: duration
Mandatory: No
Dynamic: Yes
Default: 10s
The duration to wait until an error is returned to the client when delayed_retry is enabled. The default value is 10 seconds.
The timeout is specified as documented here. If no explicit unit is provided, the value is interpreted as seconds in MaxScale 2.4. In subsequent versions a value without a unit may be rejected. Note that since the granularity of the timeout is seconds, a timeout specified in milliseconds will be rejected, even if the duration is longer than a second.
transaction_replay
Type: boolean
Mandatory: No
Dynamic: Yes
Default: false
Replay interrupted transactions. This parameter was added in MaxScale 2.3.0 and is disabled by default. Enabling this parameter enables both delayed_retry and master_reconnection and sets master_failure_mode to fail_on_write, thereby overriding any configured values for these parameters.
When the server where the transaction is in progress fails, readwritesplit can migrate the transaction to a replacement server. This can completely hide the failure of a primary node without any visible effects to the client.
If no replacement node becomes available, the client connection is closed.
To control how long a transaction replay can take, use transaction_replay_timeout.
Please refer to the Transaction Replay Limitations section for a more detailed explanation of what should and should not be done with transaction replay.
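A minimal sketch of enabling transaction replay with an explicit time limit (the timeout value here is only an example):

transaction_replay=true
transaction_replay_timeout=30s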
transaction_replay_max_size
Type: size
Mandatory: No
Dynamic: Yes
Default: 1 MiB
The limit on transaction size for transaction replay in bytes. Any transaction that exceeds this limit will not be replayed. The default value is 1 MiB. This limit applies at the session level, which means that the total peak memory consumption can be transaction_replay_max_size times the number of client connections.
The amount of memory needed to store a particular transaction will be slightly larger than the length in bytes of the SQL used in the transaction. If the limit is ever exceeded, a message will be logged at the info level.
Starting with MaxScale 6.4.10, the number of times that this limit has been exceeded is shown in maxctrl show service as trx_max_size_exceeded.
Read the configuration guide for more details on size type parameters in MaxScale.
transaction_replay_attempts
Type: integer
Mandatory: No
Dynamic: Yes
Default: 5
The upper limit on how many times a transaction replay is attempted before giving up. The default value is 5.
A transaction replay failure can happen if the server where the transaction is being replayed fails while the replay is in progress. In practice this parameter controls how many server and network failures a single transaction replay tolerates. If a transaction is replayed successfully, the counter for failed attempts is reset.
transaction_replay_timeout
Type: duration
Mandatory: No
Dynamic: Yes
Default: 0s
How long a transaction replay is attempted. This feature is disabled by default and was added in MaxScale 6.2.1. To explicitly disable this feature, set the value to 0 seconds.
The timeout is a duration type and the value must include a unit for the duration.
When transaction_replay_timeout is enabled, the time a transaction replay can take is controlled solely by this parameter. This is a more convenient and predictable method of controlling how long a transaction replay can be attempted before the connection is closed.
If delayed_retry_timeout is less than transaction_replay_timeout, it is set to the same value.
By default, the time for which a transaction can be retried is controlled by delayed_retry_timeout and transaction_replay_attempts. This can result in a maximum replay time limit of delayed_retry_timeout multiplied by transaction_replay_attempts, by default 50 seconds. The minimum replay time limit can be as low as transaction_replay_attempts seconds (5 seconds by default) in cases where the connection fails right after it was created. Usually this happens due to problems like the max_connections limit being hit on the database server.
With the introduction of transaction_replay_timeout, these problems are avoided. Starting with MaxScale 6.2.1, this is the recommended method of controlling the timeouts for transaction replay.
transaction_replay_retry_on_deadlock
Type: boolean
Mandatory: No
Dynamic: Yes
Default: false
Enable automatic retrying of transactions that end up in a deadlock. This parameter was added in MaxScale 2.4.6 and the feature is disabled by default. MaxScale versions from 2.4.0 to 2.4.5 always tried to replay deadlocked transactions.
If this feature is enabled and a transaction returns a deadlock error (e.g. SQLSTATE 40001: Deadlock found when trying to get lock; try restarting transaction), the transaction is automatically retried. If the retrying of the transaction results in another deadlock error, it is retried until it either succeeds or a transaction checksum error is encountered.
transaction_replay_safe_commit
Type: boolean
Mandatory: No
Dynamic: Yes
Default: true
If a transaction is ending and the COMMIT statement at the end of it is interrupted, there is a risk of duplicating the transaction if it is replayed. This parameter prevents the retrying of transactions that are about to commit.
This parameter was added in MaxScale 23.08.0 and is enabled by default. Older versions of MaxScale always attempted to replay the transaction even if there was a risk of duplicating it.
If the data that is about to be modified is read before it is modified and it is locked in an appropriate manner (e.g. with SELECT ... FOR UPDATE or with the SERIALIZABLE isolation level), it is safe to replay a transaction that was about to commit. This is because the checksum of the transaction will mismatch if the original transaction ended up committing on the server. Disabling this feature can enable more robust delivery of transactions, but it requires that the SQL is correctly formed and compatible with this behavior.
transaction_replay_retry_on_mismatch
Type: boolean
Mandatory: No
Dynamic: Yes
Default: false
Retry transactions that end in a checksum mismatch. This parameter was added in MaxScale 6.2.1 and is disabled by default.
When enabled, any replayed transactions that end with a checksum mismatch are retried until they either succeed or one of the transaction replay limits is reached (delayed_retry_timeout, transaction_replay_timeout or transaction_replay_attempts).
transaction_replay_checksum
Type: enum
Mandatory: No
Dynamic: Yes
Values: full, result_only, no_insert_id
Default: full
Selects which transaction checksum method is used to verify the result of the replayed transaction.
Note that only transaction_replay_checksum=full is guaranteed to retain the consistency of the replayed transaction.
Possible values are:
full (default)
All responses from the server are included in the checksum. This retains the full consistency guarantee of the replayed transaction, as it must match exactly the one that was already returned to the client.
result_only
Only resultsets and errors are included in the checksum. OK packets (i.e. successful queries that do not return results) are ignored. This mode is intended to be used in cases where the extra information (auto-generated ID, warnings etc.) returned in the OK packet is not used by the application.
This mode is safe to use only if the auto-generated ID is not actually used by any following queries. An example of such behavior would be a transaction that ends with an INSERT into a table with an AUTO_INCREMENT field.
no_insert_id
The same as result_only, but results from queries that use LAST_INSERT_ID() are also ignored. This mode is safe to use only if the result of the query is not used by any subsequent statement in the transaction.
optimistic_trx
Type: boolean
Mandatory: No
Dynamic: Yes
Default: false
Enable optimistic transaction execution. This parameter controls whether normal transactions (i.e. START TRANSACTION or BEGIN) are load balanced across replicas. This feature is disabled by default, and enabling it implicitly enables the transaction_replay, delayed_retry and master_reconnection parameters.
When this mode is enabled, all transactions are first attempted on replica servers. If the transaction contains no statements that modify data, it is completed on the replica. If the transaction contains statements that modify data, it is rolled back on the replica server and restarted on the primary. The rollback is initiated the moment a data modifying statement is intercepted by readwritesplit so only read-only statements are executed on replica servers.
As with transaction_replay and transactions that are replayed, if the results returned by the primary server are not identical to the ones returned by the replica up to the point where the first data modifying statement was executed, the connection is closed. If the execution of the ROLLBACK statement on the replica fails, the connection to that replica is closed.
All limitations that apply to transaction_replay also apply to optimistic_trx.
causal_reads
Type: enum
Mandatory: No
Dynamic: Yes
Values: none, local, global, fast, fast_global, universal, fast_universal
Default: none
Enable causal reads. This parameter is disabled by default and was introduced in MaxScale 2.3.0.
If a client connection modifies the database and causal_reads is enabled, any subsequent reads performed on replica servers will be done in a manner that prevents replication lag from affecting the results.
The following table contains a comparison of the modes. Read the implementation of causal_reads section for more information on what a sync consists of and why minimizing the number of syncs is important.
local: Session scope. Cost: low, one sync per write.
fast: Session scope. Cost: none, no sync at all.
global: Service scope. Cost: medium, one sync per read.
fast_global: Service scope. Cost: none, no sync at all.
universal: Cluster scope. Cost: high, one sync per read plus a roundtrip to the primary.
fast_universal: Cluster scope. Cost: low, one roundtrip to the primary.
The fast, fast_global and fast_universal modes should only be used when low latency is more important than proper distribution of reads. These modes should only be used when the workload is mostly read-only with only occasional writes. If used with a mixed or a write-heavy workload, the traffic will end up being routed almost exclusively to the primary server.
Note: This feature requires MariaDB 10.2.16 or newer to function. In addition to this, the session_track_system_variables parameter must include last_gtid in its list of tracked system variables.
Note: This feature also enables multi-statement execution of SQL in the protocol. This is equivalent to using allowMultiQueries=true in Connector/J or using CLIENT_MULTI_STATEMENTS and CLIENT_MULTI_RESULTS in Connector/C. The Implementation of causal_reads section explains why this is necessary.
The possible values for this parameter are:
none (default)
Read causality is disabled.
local
Writes are locally visible. Writes are guaranteed to be visible only to the connection that does it. Unrelated modifications done by other connections are not visible. This mode improves read scalability at the cost of latency and reduces the overall load placed on the primary server without breaking causality guarantees.
global
Writes are globally visible. If one connection writes a value, all connections to the same service will see it. In general this mode is slower than the local mode due to the extra synchronization it has to do. This guarantees global happens-before ordering of reads when all transactions are inside a single GTID domain. This mode gives similar benefits as the local mode in that it improves read scalability at the cost of latency.
With MaxScale versions 2.5.14 and older, multi-domain use of causal_reads could cause non-causal reads to occur. Starting with MaxScale 2.5.15, this was fixed and all the GTID coordinates are passed alongside all requests, which makes multi-domain GTIDs safe to use. However, this does mean that the GTID coordinates will never be reset: if replication is reset and the GTID coordinates go "backwards", readwritesplit will not consider these as being newer than the ones already stored. To reset the stored GTID coordinates in readwritesplit, MaxScale must be restarted.
MaxScale 6.4.11 added the new reset-gtid module command to readwritesplit. This allows the global GTID state used by causal_reads=global to be reset without having to restart MaxScale.
fast
This mode is similar to the local mode in that it only affects the connection that does the write, but where the local mode waits for a replica server to catch up, the fast mode will only use servers that are known to have replicated the write. This means that if no replica has replicated the write, the primary where the write was done will be used. The value of causal_reads_timeout is ignored in this mode. Currently, the replication state is only updated by the mariadbmon monitor whenever the servers are monitored. This means that a smaller monitor_interval provides faster replication state updates and possibly better overall usage of servers.
This mode is the inverse of the local mode in the sense that it improves read latency at the cost of read scalability, while still retaining the causality guarantees for reads. This functionality can also be considered an improved version of the functionality that the CCRFilter module provides.
fast_global
This mode is identical to the fast mode except that it uses the global GTID instead of the session-local one. This is similar to how the local and global modes differ from each other. The value of causal_reads_timeout is ignored in this mode. Currently, the replication state is only updated by the mariadbmon monitor whenever the servers are monitored. This means that a smaller monitor_interval provides faster replication state updates and possibly better overall usage of servers.
universal
The universal mode guarantees that all SELECT statements always see the latest observable transaction state on a database cluster. The basis of this is the @@gtid_binlog_pos variable, which is read from the current primary server before each read. This guarantees that if a transaction was visible at the time the read was received by readwritesplit, the transaction is guaranteed to be complete on the replica server where the read is done. Versions 22.08.16, 23.02.13, 23.08.9, 24.02.5 and older used @@gtid_current_pos as the GTID value (MXS-5588), but this caused problems with Galera clusters.
This mode is the most consistent of all the modes. It provides consistency regardless of where a write originated, but it comes at the cost of increased latency. For every read, a round trip to the current primary server is done. This means that the latency of any given SELECT statement increases by roughly twice the network latency between MaxScale and the database cluster. In addition, an extra SELECT statement is always executed on the primary, which places some load on the server.
fast_universal
A mix of fast and universal. This mode guarantees that all SELECT statements always see the latest observable transaction state, but unlike the universal mode that waits for the server to catch up, this mode behaves like fast and routes the query to the current primary if no replicas that have caught up are available.
This mode provides the same consistency guarantees as universal with a constant latency overhead of one extra roundtrip. However, this also puts the most load on the primary node, as even a moderate write load can cause the GTIDs of replicas to lag too far behind.
Before MaxScale 2.5.0, the causal_reads parameter was a boolean parameter. False values translated to none and true values translated to local. The use of boolean values is deprecated but still accepted in MaxScale 2.5.0.
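As a minimal sketch, enabling session-level causal reads with a five-second synchronization timeout would look like this in the service section (the timeout value is only an example):

causal_reads=local
causal_reads_timeout=5s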
Implementation of causal_reads
This feature is based on the MASTER_GTID_WAIT function and the tracking of server-side status variables. By tracking the latest GTID that each statement generates, readwritesplit can then perform a synchronization operation with the help of the MASTER_GTID_WAIT function.
If the replica has not caught up to the primary within the configured time, as specified by causal_reads_timeout, it will be retried on the primary. In MaxScale 2.3.0 an error was returned to the client when the replica timed out.
The exception to this rule is the fast mode, which does not do any synchronization at all. This can be done because any reads that would go to out-of-date servers will be re-routed to the current primary.
Normal SQL
A practical example can be given by the following set of SQL commands executed with autocommit=1.
INSERT INTO test.t1 (id) VALUES (1);
SELECT * FROM test.t1 WHERE id = 1;
As the statements are not executed inside a transaction, from the load balancer's point of view, the latter statement can be routed to a replica server. The problem with this is that if the value that was inserted on the primary has not yet replicated to the server where the SELECT statement is being performed, it can appear as if the value we just inserted is not there.
By prefixing these types of SELECT statements with a command that guarantees consistent results for the reads, read scalability can be improved without sacrificing consistency.
The set of example SQL above will be translated by MaxScale into the following statements.
INSERT INTO test.t1 (id) VALUES (1);
-- These are executed as one multi-query
SET @maxscale_secret_variable=(
SELECT CASE
WHEN MASTER_GTID_WAIT('0-3000-8', 10) = 0 THEN 1
ELSE (SELECT 1 FROM INFORMATION_SCHEMA.ENGINES)
END); SELECT * FROM test.t1 WHERE id = 1;
The SET command will synchronize the replica to a certain logical point in the replication stream (see MASTER_GTID_WAIT for more details). If the synchronization fails, the query will not run and it will be retried on the server where the transaction was originally done.
Prepared Statements
Binary protocol prepared statements are handled in a different manner. Instead of adding the synchronization SQL into the original SQL query, it is sent as a separate packet before the prepared statement is executed.
We'll use the same example SQL but use a binary protocol prepared statement for the SELECT:
COM_QUERY: INSERT INTO test.t1 (id) VALUES (1);
COM_STMT_PREPARE: SELECT * FROM test.t1 WHERE id = ?;
COM_STMT_EXECUTE: ? = 123
The SQL that MaxScale executes will be the following:
COM_QUERY: INSERT INTO test.t1 (id) VALUES (1);
COM_STMT_PREPARE: SELECT * FROM test.t1 WHERE id = ?;
COM_QUERY: IF (MASTER_GTID_WAIT('0-3000-8', 10) <> 0) THEN KILL (SELECT CONNECTION_ID()); END IF
COM_STMT_EXECUTE: ? = 123
Both the synchronization query and the execution of the prepared statement are sent at the same time. This is done to remove the need to wait for the result of the synchronization query before routing the execution of the prepared statement. This keeps the performance of causal_reads for prepared statements the same as it is for normal SQL queries.
As a result of this, each time the synchronization query times out, the connection will be killed by the KILL statement and readwritesplit will retry the query on the primary. This is done to prevent the execution of the prepared statement that follows the synchronization query from being processed by the MariaDB server.
It is recommended that the session command history is enabled whenever prepared statements are used with causal_reads. This allows new connections to be created whenever a causal read times out.
Starting with MaxScale 2.5.17, a failed causal read inside of a read-only transaction started with START TRANSACTION READ ONLY will return the following error:
Error: 1792
SQLSTATE: 25006
Message: Causal read timed out while in a read-only transaction, cannot retry command.
Older versions of MaxScale attempted to retry the command on the current primary server which would cause the connection to be closed and a warning to be logged.
Limitations of Causal Reads
This feature does not work with Galera or any other non-standard replication mechanisms. As Galera does not update the gtid_slave_pos variable when events are replicated via the Galera library, the MASTER_GTID_WAIT function used by MaxScale to synchronize reads will wait until the timeout. With Galera this is not a serious issue as it, by nature, is a mostly-synchronous replication mechanism.
If the combination of the original SQL statement and the modifications added to it by readwritesplit exceed the maximum packet size (16777213 bytes), the causal read will not be attempted and a non-causal read is done instead. This applies only to text protocol queries as the binary protocol queries use a different synchronization mechanism.
SQL like INSERT ... RETURNING that commits a transaction and returns a resultset will only work with causal reads if the connector supports the DEPRECATE_EOF protocol feature. The following table lists MariaDB connectors and whether they support the protocol feature (with the connector version).
Connector/J: Yes (3.5.2)
Connector/Node.js: Yes (3.4.0)
Connector/R2DBC: Yes (1.3.0)
Connector/C: No (3.4.4)
Connector/C++: No (1.1.5)
Connector/ODBC: No (3.2.5)
causal_reads_timeout
Type: duration
Mandatory: No
Dynamic: Yes
Default: 10s
The timeout for the replica synchronization done by causal_reads. The default value is 10 seconds.
The timeout is specified as documented here. If no explicit unit is provided, the value is interpreted as seconds in MaxScale 2.4. In subsequent versions a value without a unit may be rejected. Note that since the granularity of the timeout is seconds, a timeout specified in milliseconds will be rejected, even if the duration is longer than a second.
lazy_connect
Type: boolean
Mandatory: No
Dynamic: Yes
Default: false
Lazy connection creation causes connections to backend servers to be opened only when they are needed. This reduces the load that is placed on the backend servers when the client connections are short. This parameter is a boolean type and is disabled by default.
By default readwritesplit opens as many connections as it can when the session is first opened. This makes the execution of the first query faster when all available connections are already created. When lazy_connect is enabled, this initial connection creation is skipped. If the client executes only read queries, no connection to the primary is made. If only write queries are made, only the primary connection is used.
In MaxScale 23.08.2, if a session command is received as the first command, the default behavior is to execute it on a replica. If master_accept_reads is enabled, the query is executed on the primary server, if one is available. In practice this means that workloads which are mostly reads with infrequent writes should disable master_accept_reads if they also use lazy_connect.
Older versions of MaxScale always tried to execute all session commands on the primary node if one was available.
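A sketch of a service configuration for short, mostly-read client connections, following the advice above:

lazy_connect=true
master_accept_reads=false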
reuse_prepared_statements
Type: boolean
Mandatory: No
Dynamic: Yes
Default: false
Reuse identical prepared statements inside the same client connection. This is a boolean parameter and is disabled by default. This feature only applies to binary protocol prepared statements.
When this parameter is enabled and the connection prepares an identical prepared statement multiple times, instead of preparing it on the server the existing prepared statement handle is reused. This also means that whenever prepared statements are closed by the client, they will be left open by readwritesplit.
Enabling this feature will increase memory usage of a session. The amount of memory stored per prepared statement is proportional to the length of the prepared SQL statement and the number of parameters the statement has.
The router_diagnostics output for a readwritesplit service contains the following fields.
queries: Number of queries executed through this service.
route_master: Number of writes routed to the primary.
route_slave: Number of reads routed to replicas.
route_all: Number of session commands routed to all servers.
rw_transactions: Number of explicit read-write transactions.
ro_transactions: Number of explicit read-only transactions.
replayed_transactions: Number of replayed transactions.
server_query_statistics: Statistics for each configured and used server, consisting of the following fields.
id: Name of the server.
total: Total number of queries.
read: Total number of reads.
write: Total number of writes.
avg_sess_duration: Average duration of a client session to this server.
avg_sess_active_pct: Average percentage of time client sessions were active. 0% means connections were opened but never used.
avg_selects_per_session: Average number of selects per session.
The general rule with server ranks is that servers with the primary rank will be used before servers with the secondary rank. Readwritesplit is an exception to this rule. The following rules govern how readwritesplit behaves with servers that have different ranks.
Sessions will use the current primary server as long as possible. This means that a session using a primary with the secondary rank will not switch to a primary with the primary rank as long as the secondary-rank primary is available.
All replica connections will use the same rank as the primary connection. Any stale connections with a different rank than the primary will be discarded.
If no primary connection is available and master_reconnection is enabled, a connection to the best primary is created. If the new primary has a different rank than the existing connections, the connections with a different rank will be discarded.
If open connections exist, these will be used for routing. This means that if the primary is lost but the session still has replica servers with the same rank, they will remain in use.
If no open connections exist, the servers with the best rank will be used.
The readwritesplit router supports routing hints. For a detailed guide on hint syntax and functionality, please read this document.
Note: Routing hints will always have the highest priority when a routing decision is made. This means that it is possible to cause inconsistencies in the session state and the actual data in the database by adding routing hints to DDL/DML statements which are then directed to replica servers. Only use routing hints when you are sure that they can cause no harm.
An exception to this rule is transaction_replay: when it is enabled, all routing hints inside a transaction are ignored. This is done to prevent changes done inside a re-playable transaction from affecting servers outside of the transaction. This behavior was added in MaxScale 6.1.4. Older versions allowed routing hints to override the transaction logic.
If a SELECT statement with a maxscale route to slave hint is received while autocommit is disabled, the query will be routed to a replica server. This causes some metadata locks to be acquired on the database in question, which will block DDL statements on the server until either the connection is closed or autocommit is enabled again.
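For illustration, a hint is embedded in an SQL comment at the end of the statement; the following hypothetical queries force the routing target regardless of the statement type:

SELECT user FROM mysql.user; -- maxscale route to master
SELECT @@server_id; -- maxscale route to slave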
The readwritesplit router implements the following module commands.
reset-gtid
This command resets the global GTID state in the router. It can be used with causal_reads=global to reset the state. This can be useful when the cluster is reverted to an earlier state and the GTIDs recorded in MaxScale are no longer valid.
The first and only argument to the command is the router name. For example, to reset the GTID state of a readwritesplit service named My-RW-Router, the following MaxCtrl command should be used:
maxctrl call command readwritesplit reset-gtid My-RW-Router
Examples of the readwritesplit router in use can be found in the Tutorials folder.
Here is a small explanation which shows what kinds of queries are routed to which type of server.
Routing to the primary is important for data consistency and because the majority of writes are written to the binlog and are thus replicated to the replicas.
The following operations are routed to the primary:
DML statements (INSERT, UPDATE, DELETE etc.)
DDL statements (DROP, CREATE, ALTER etc.)
All statements within an open read-write transaction
Stored procedure calls
User-defined function calls
Queries that use sequences (NEXT VALUE FOR seq, NEXTVAL(seq) or seq.nextval)
Statements that use any of the following functions:
LAST_INSERT_ID()
GET_LOCK()
RELEASE_LOCK()
IS_USED_LOCK()
IS_FREE_LOCK()
Statements that use any of the following variables:
@@last_insert_id
@@identity
In addition to these, if the readwritesplit service is configured with the max_replication_lag parameter and all replicas suffer from too much replication lag, then statements will be routed to the primary. (There might be other similar configuration parameters in the future which limit the number of statements that will be routed to replicas.)
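For instance, the lag limit could be set in the service section like this (the value is illustrative):
max_replication_lag=10s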
Transaction Isolation Level Tracking
If either session_track_transaction_info=CHARACTERISTICS or session_track_system_variables=tx_isolation is configured for the MariaDB server, readwritesplit will track the transaction isolation level and lock the session to the primary when the isolation level is set to serializable. This retains the correctness of the isolation level, which can otherwise cause problems.
Starting with MaxScale 23.08, once the transaction isolation level is set to something other than SERIALIZABLE, the session is no longer locked to the primary and returns to its normal state. Older versions of MaxScale remain locked to the primary even after the session leaves the SERIALIZABLE isolation level.
The ability to route some statements to replicas is important because it decreases the load on the primary. Moreover, multiple replicas can share the load, in contrast to the single primary.
Queries which can be routed to replicas must be auto-committed and belong to one of the following groups:
Read-only statements (i.e. SELECT) that only use read-only built-in functions
All statements within an explicit read-only transaction (START TRANSACTION READ ONLY); see the example after this list
SHOW statements except SHOW MASTER STATUS
The list of supported built-in functions can be found here.
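For example, an explicit read-only transaction such as the following sketch (the table name is invented) is eligible for replica routing:
START TRANSACTION READ ONLY;
SELECT name FROM customers WHERE id = 1;
COMMIT;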
A third class of statements includes those which modify session data, such as session system variables, user-defined variables, the default database, etc. We call them session commands, and they must be replicated as they affect the future results of read and write operations. They must be executed on all servers that could execute statements on behalf of this client.
Session commands include for example:
Commands that modify the session state (SET, USE, CHANGE USER)
Text protocol PREPARE statements
Binary protocol prepared statements
Other miscellaneous commands (COM_QUIT, COM_PING etc.)
NOTE: if a variable assignment is embedded in a write statement, it is routed to the primary only. For example, INSERT INTO t1 values(@myvar:=5, 7) would be routed to the primary only.
The router stores all of the executed session commands so that in case of a replica failure, a replacement replica can be chosen and the session command history can be repeated on that new replica. This means that the router stores each executed session command for the duration of the session. Applications that use long-running sessions might cause MariaDB MaxScale to consume a growing amount of memory unless the sessions are closed. This can be solved by adjusting the value of max_sescmd_history.
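For example, the history size could be capped in the service section like this (the value is illustrative):
max_sescmd_history=100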
In the following cases, a query is routed to the same server where the previous query was executed. If no previous target is found, the query is routed to the current primary.
If a query uses the FOUND_ROWS() function, it will be routed to the server where the last query was executed. This is done with the assumption that a query with SQL_CALC_FOUND_ROWS was previously executed.
COM_STMT_FETCH_ROWS will always be routed to the same server where the COM_STMT_EXECUTE was routed.
Read queries are routed to the primary server in the following situations:
Query is executed inside an open read-write transaction
Statement includes a stored procedure or a UDF call
If there are multiple statements inside one query, e.g. INSERT INTO ... ; SELECT LAST_INSERT_ID();
If a prepared statement targets a temporary table on the primary, the replica servers will fail to execute it. This will cause all replica connections to be closed (MXS-1816).
When transaction replay is enabled, readwritesplit calculates a checksum of the server responses for each transaction. This is used to determine whether a replayed transaction was identical to the original transaction. Starting with MaxScale 23.08, a 128-bit xxHash checksum is stored for each statement that is in the transaction. Older versions of MaxScale used a single 160-bit SHA1 checksum for the whole transaction.
If the results from the replacement server are not identical when the transaction is replayed, the client connection is closed. This means that any transaction with a server-specific result (e.g. NOW(), @@server_id) cannot be replayed successfully, but it will still be attempted.
If a transaction reads data before updating it, the rows should be locked by using SELECT ... FOR UPDATE. This prevents overlapping transactions when multiple replayed transactions modify the same set of rows.
If the connection to the server where the transaction is being executed is lost when the final COMMIT is being executed, it is impossible to know whether the transaction was successfully committed. This means that there is a possibility of duplicate transaction execution, which can result in data duplication in certain cases.
In MaxScale 23.08, the transaction_replay_safe_commit variable controls whether a replay is attempted whenever a COMMIT is interrupted. By default the transaction will not be replayed. Older versions of MaxScale always replayed the transaction.
Data duplication can happen if the transaction consists of the following statement types:
INSERT of rows into a table that does not have an auto-increment primary key
A "blind update" of one or more rows e.g. UPDATE t SET c = c + 1 WHERE id = 123
A "blind delete" e.g. DELETE FROM t LIMIT 100
This is not an exhaustive list and any operations that do not check the row contents before performing the operation on them might face this problem.
In all cases, the problem of duplicate transaction execution can be avoided by including a SELECT ... FOR UPDATE in the transaction. This guarantees that if the transaction fails while it is being committed, the rows are only modified if they match the expected contents.
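A sketch of the pattern, reusing the hypothetical table from the examples above:
BEGIN;
-- Lock the row and verify its contents before modifying it
SELECT c FROM t WHERE id = 123 FOR UPDATE;
UPDATE t SET c = c + 1 WHERE id = 123;
COMMIT;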
Similarly, a connection loss during COMMIT can also result in transaction replay failure. This happens for the same reason as duplicate transaction execution, but the retried transaction will not be committed. This can be considered a success case, as the transaction replay detected that the results of the two transactions are different. In these cases readwritesplit will abort the transaction and close the client connection.
Statements that result in an implicit commit do not reset the transaction when transaction_replay is enabled. This means that if the transaction is replayed, the transaction will be committed twice due to the implicit commit being present. The exception to this are transaction management statements such as BEGIN and START TRANSACTION: they are detected and will cause the transaction to be correctly reset.
In older versions of MaxScale, if a connection to a server was lost while a statement was being executed and the result was partially delivered to the client, readwritesplit would immediately close the session without attempting to replay the failing statement. Starting with MaxScale 23.08, this limitation no longer applies if the statement was executed inside of a transaction and transaction_replay is enabled (MXS-4549).
If the connection to the server where a transaction is being executed is lost while a ROLLBACK is being executed, readwritesplit will still attempt to replay the transaction in the hope that the real response can be delivered to the client. However, this means that a rolled back transaction which gets replayed may end up with a conflict and be reported as a replay failure, when in reality a rolled back transaction could be safely ignored.
Limitations in Session State Modifications
Any changes to the session state (e.g. autocommit state, SQL mode) done inside a transaction will remain in effect even if the connection to the server where the transaction is being executed fails. When readwritesplit creates a new connection to a server to replay the transaction, it will first restore the session state by executing all session commands that were executed. This means that if the session state is changed mid-transaction in a way that affects the results, transaction replay will fail.
The following partial transaction demonstrates the problem by using SET SQL_MODE inside a transaction.
SET SQL_MODE=''; -- A session command
BEGIN;
SELECT "hello world"; -- Returns the string "hello world"
SET SQL_MODE='ANSI_QUOTES'; -- A session command
SELECT 'hello world'; -- Returns the string "hello world"
If this transaction has to be replayed the actual SQL that gets executed is the following.
SET SQL_MODE=''; -- Replayed session command
SET SQL_MODE='ANSI_QUOTES'; -- Replayed session command
BEGIN;
SELECT "hello world"; -- Returns an error
SELECT 'hello world'; -- Returns the string "hello world"
First the session state is restored by executing all commands that changed the state, after which the actual transaction is replayed. Because the SQL_MODE was changed mid-transaction, one of the queries now returns an error instead of the expected result, leading to a transaction replay failure.
Limitations in Service-to-Service Routing
In a service-to-service configuration (i.e. a service using another service in its targets list), if the topmost service starts a transaction, all lower-level readwritesplit services will also behave as if a transaction is open. If a connection to a backend database fails during this, it can result in unnecessary transaction replays, which in turn can end up in checksum conflicts. The recommended approach is to not use any commands inside a transaction that would be routed to more than one node.
Limitations in multi-statement handling
When a multi-statement query is executed through the readwritesplit router, it will always be routed to the primary. See strict_multi_stmt for more details.
If the multi-statement query creates a temporary table, it will not be detected and reads to this table can be routed to replica servers. To prevent this, always execute the temporary table creation as an individual statement.
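As an illustration (the table name is invented), execute the creation separately instead of bundling it into a multi-statement query:
-- Problematic: the temporary table is created inside a multi-statement query
CREATE TEMPORARY TABLE tmp (id INT); SELECT id FROM tmp;
-- Safer: the creation is an individual statement
CREATE TEMPORARY TABLE tmp (id INT);
SELECT id FROM tmp;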
Limitations in client session handling
Some of the queries that a client sends are routed to all backends instead of just one. These queries include USE <db name> and SET autocommit=0, among many others. Readwritesplit sends a copy of these queries to each backend server and forwards the primary's reply to the client. Below is a list of MySQL commands which are classified as session commands.
COM_INIT_DB (USE <db name> creates this)
COM_CHANGE_USER
COM_STMT_CLOSE
COM_STMT_SEND_LONG_DATA
COM_STMT_RESET
COM_STMT_PREPARE
COM_QUIT (no response, session is closed)
COM_REFRESH
COM_DEBUG
COM_PING
SQLCOM_CHANGE_DB (USE ... statements)
SQLCOM_DEALLOCATE_PREPARE
SQLCOM_PREPARE
SQLCOM_SET_OPTION
SELECT ... INTO variable|OUTFILE|DUMPFILE
SET autocommit=1|0
Prior to MaxScale 2.3.0, session commands that were 2²⁴ - 1 bytes or longer were not supported and caused the session to be closed.
There is a possibility for misbehavior. If USE mytable is executed in one of the replicas and fails, it may be due to replication lag rather than the database not existing. Thus, the same command may produce a different result on different backend servers. The replicas which fail to execute a session command will be dropped from the active list of replicas for this session, to guarantee a consistent session state across all the servers used by the session. In addition, such a server will not be used again for routing for the duration of the session.
The above-mentioned behavior for user variables can be partially controlled with the configuration parameter use_sql_variables_in:
use_sql_variables_in=[master|all] (default: all)
WARNING
If a SELECT query modifies a user variable when the use_sql_variables_in parameter is set to all, it will not be routed and the client will receive an error. A message further explaining the reason for the error is written into the log. Here is an example of a SELECT query which modifies a user variable and how MariaDB MaxScale responds to it.
MySQL [(none)]> set @id=1;
Query OK, 0 rows affected (0.00 sec)
MySQL [(none)]> SELECT @id := @id + 1 FROM test.t1;
ERROR 1064 (42000): Routing query to backend failed. See the error log for further details.
Allow user variable modification in SELECT queries by setting use_sql_variables_in=master. This will route all queries that use user variables to the primary.
CC BY-SA / Gnu FDL
This document describes how to configure a MariaDB primary-replica cluster monitor to be used with MaxScale.
Define the monitor that monitors the servers.
[Replication-Monitor]
type=monitor
module=mariadbmon
servers=dbserv1, dbserv2, dbserv3
user=monitor_user
password=my_password
monitor_interval=2000ms
The mandatory parameters are the object type, the monitor module to use, the list of servers to monitor and the username and password to use when connecting to the servers. The monitor_interval parameter controls how long the monitor waits between monitoring loops.
The monitor user requires the REPLICATION CLIENT privileges to do basic monitoring. To create a user with the proper grants, execute the following SQL.
CREATE USER 'monitor_user'@'%' IDENTIFIED BY 'my_password';
GRANT REPLICATION CLIENT ON *.* TO 'monitor_user'@'%';
Note: If the automatic failover of the MariaDB Monitor will be used, the user requires additional grants. Execute the following SQL to grant them.
GRANT SUPER, RELOAD on *.* to 'monitor_user'@'%';
CC BY-SA / Gnu FDL
The mariadbprotocol module implements the MariaDB client-server protocol. The legacy protocol names mysqlclient, mariadb and mariadbclient are all aliases to mariadbprotocol.
Protocol level parameters are defined in the listeners. They must be defined using the scoped parameter syntax where the protocol name is used as the prefix.
[MyListener]
type=listener
service=MyService
protocol=mariadbprotocol
mariadbprotocol.allow_replication=false
port=3306
For the MariaDB protocol module, the prefix is always mariadbprotocol.
allow_replication
Type: boolean
Mandatory: No
Dynamic: Yes
Default: true
Whether the use of the replication protocol is allowed through this listener. If disabled with mariadbprotocol.allow_replication=false, all attempts to start replication will be rejected with an ER_FEATURE_DISABLED error (error number 1289).
CC BY-SA / Gnu FDL
Refer to the Hintfilter documentation for the MaxScale hint syntax.
CC BY-SA / Gnu FDL
This document describes possible issues when upgrading MariaDB MaxScale from version 23.02 to 23.08.
For more information about MaxScale 23.08, refer to the ChangeLog.
Before starting the upgrade, any existing configuration files should be backed up.
CC BY-SA / Gnu FDL
This document describes possible issues when upgrading MariaDB MaxScale from version 6 to 22.08.
For more information about MaxScale 22.08, refer to the ChangeLog.
Before starting the upgrade, any existing configuration files should be backed up.
The support for legacy encryption keys generated with maxkeys from pre-2.5 versions has been removed. This feature was deprecated in MaxScale 2.5 when the new key storage format was introduced. To migrate to the new key storage format, create a new key file with maxkeys and re-encrypt the passwords with maxpasswd.
The deprecated Database Firewall filter has been removed.
CC BY-SA / Gnu FDL
The first step is to define the servers that make up the cluster. These servers will be used by the services and are monitored by the monitor.
[dbserv1]
type=server
address=192.168.2.1
port=3306
[dbserv2]
type=server
address=192.168.2.2
port=3306
[dbserv3]
type=server
address=192.168.2.3
port=3306
The address and port parameters tell where the server is located.
To enable encryption for the MaxScale-to-MariaDB communication, add ssl=true to the server section. To enable server certificate verification, add ssl_verify_peer_certificate=true.
The ssl and ssl_verify_peer_certificate parameters are similar to the --ssl and --ssl-verify-server-cert options of the mysql command line client.
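For example, dbserv1 with encryption and certificate verification enabled might look like this (a minimal sketch; depending on the setup, certificate parameters such as a CA certificate may also be required):
[dbserv1]
type=server
address=192.168.2.1
port=3306
ssl=true
ssl_verify_peer_certificate=true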
For more information about TLS, refer to the Configuration Guide.
CC BY-SA / Gnu FDL
This document describes possible issues when upgrading MariaDB MaxScale from version 22.08 to 23.02.
For more information about MaxScale 23.02, refer to the ChangeLog.
Before starting the upgrade, any existing configuration files should be backed up.
The csmon and auroramon monitors have been removed.
The obsolete maxctrl drain command has been removed.
The maxctrl cluster commands have been removed.
CC BY-SA / Gnu FDL
HintRouter was introduced in 2.2 and is still beta.
The HintRouter module is a simple router intended to operate in conjunction with the NamedServerFilter. The router looks at the hints embedded in a packet buffer and attempts to route the packet according to the hint. The user can also set a default action to be taken when a query has no hints or when the hints could not be applied.
If a packet has multiple hints attached, the router will read them in order and attempt routing. Any successful routing ends the process and any further hints are ignored for the packet.
The HintRouter is a rather simple router and only accepts a few configuration settings.
This setting defines what happens when a query has no routing hint or when applying the routing hint(s) fails. If the default action also fails, routing ends in an error and the session closes. The different values are:
Note that setting the default action to anything other than all means that session variable write commands are by default not routed to all backends.
Defines the default backend name if default_action=named. <server-name> must be a valid backend name.
<limit> should be an integer, -1 by default. Defines how many backend replica servers a session should attempt to connect to. Having fewer replicas defined in the service and/or fewer successful connections during session creation is not an error. The router will attempt to distribute replicas evenly between sessions by assigning them in round-robin fashion. The session will always try to connect to a primary regardless of this setting, although not finding one is not an error.
Negative values activate default mode, in which case this value is set to the number of backends in the service - 1, so that the sessions are connected to all replicas.
If the hints or the default_action point to a named server, this setting is probably best left at its default to ensure that the specific server is connected to at session creation. The router will not attempt to connect to additional servers after session creation.
A minimal configuration doesn't require any parameters as all settings have reasonable defaults.
If packets should be routed to the primary server by default and only a few connections are required, the configuration might be as follows.
CC BY-SA / Gnu FDL
This document describes how to configure the Xpand monitor for use with a Xpand cluster.
Contrary to the other monitors of MaxScale, the Xpand monitor will autonomously figure out the cluster configuration and for each Xpand node create the corresponding MaxScale server object.
In order to do that, a sufficient number of "bootstrap" server instances must be specified in the MaxScale configuration file for the Xpand monitor to start with. One server instance is in principle sufficient, but if the corresponding node happens to be down when MaxScale starts, the monitor will not be able to function.
The server configuration is identical to that of any other server, but since these servers are only used for bootstrapping the Xpand monitor, it is advisable to use names that clearly identify them as such.
The actual Xpand monitor configuration looks as follows:
The mandatory parameters are the object type, the monitor module to use, the list of servers to use for bootstrapping and the username and password to use when connecting to the servers.
The monitor_interval parameter specifies how frequently the monitor should ping the health check port of each server, and cluster_monitor_interval specifies how frequently the monitor should do a complete cluster check, that is, access the system tables of the cluster to check the cluster configuration. The default values are 2000 and 60000, that is, 2 seconds and 1 minute, respectively.
For each detected Xpand node a corresponding MaxScale server object will be created, whose name is @@<Monitor-Name>:node-<id>, where <Monitor-Name> is the name of the monitor (in this example, Xpand) and <id> is the node id of the Xpand node. So, with a cluster of three nodes, the created servers might be named as follows:
Note that as these are created at runtime and may disappear at any moment, depending on changes happening in and made to the Xpand cluster, they should never be referred to directly from service configurations. Instead, services should refer to the monitor, as shown in the following:
Instead of listing the servers of the service explicitly using the servers parameter, as is usually the case, the service refers to the Xpand monitor using the cluster parameter. This causes the service to use the Xpand nodes that the Xpand monitor discovers at runtime.
For additional details, please consult the Xpand monitor documentation.
CC BY-SA / Gnu FDL
Note: The password encryption format changed in MaxScale 2.5. All encrypted passwords created with MaxScale 2.4 or older need to be re-encrypted.
There are two options for representing the password: either plain text or encrypted passwords may be used. In order to use encrypted passwords, a set of keys must be generated that will be used by the encryption and decryption process. To generate the keys, use the maxkeys command.
By default the key file will be generated in /var/lib/maxscale. If a different directory is required, it can be given as the first argument to the program. For more information, see maxkeys --help.
Once the keys have been created, the maxpasswd command can be used to generate the encrypted password.
The username and password, either encrypted or plain text, are stored in the service section using the user and password parameters.
If a custom location was used for the key file, give it as the first argument to maxpasswd and pass the password to be encrypted as the second argument. For more information, see maxpasswd --help.
Here is an example configuration that uses an encrypted password.
If the key file is not in the default location, the parameter must be set to the directory that contains it.
CC BY-SA / Gnu FDL
The goal of this tutorial is to configure a system that appears to the client as a single database. MariaDB MaxScale will split the statements such that write statements are sent to the primary server and read statements are balanced across the replica servers.
This tutorial is a part of the MaxScale Tutorial. Please read it and follow the instructions. Return here once the basic setup is complete.
After configuring the servers and the monitor, we create a read-write-splitter service configuration. Create the following section in your configuration file. The section name is also the name of the service and should be meaningful. For this tutorial, we use the name Splitter-Service.
router defines the routing module used. Here we use readwritesplit for query-level read-write-splitting.
A service needs a list of servers where queries will be routed to. The server names must match the names of server sections in the configuration file and not the hostnames or addresses of the servers.
The user and password parameters define the credentials the service uses to populate user authentication data. These users were created at the start of the MaxScale Tutorial.
For increased security, see .
To allow network connections to a service, a network port must be associated with it. This is done by creating a separate listener section in the configuration file. A service may have multiple listeners, but for this tutorial one is enough.
The service parameter tells which service the listener connects to. For the Splitter-Listener we set it to Splitter-Service.
A listener must define the network port to listen on.
The optional address parameter defines the local address the listener should bind to. This may be required when the host machine has multiple network interfaces. The default behavior is to listen on all network interfaces (the IPv6 address ::).
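For instance, the Splitter-Listener shown later in this tutorial could be bound to a single interface like this (the address is illustrative):
[Splitter-Listener]
type=listener
service=Splitter-Service
port=3306
address=192.168.0.100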
For the last steps, please return to .
CC BY-SA / Gnu FDL
This document describes how to configure a Galera cluster monitor.
Define the monitor that monitors the servers.
The mandatory parameters are the object type, the monitor module to use, the list of servers to monitor and the username and password to use when connecting to the servers. The monitor_interval parameter controls how long the monitor waits between monitoring loops.
This monitor module will assign one node within the Galera cluster as the current primary and the other nodes as replicas. Only those nodes that are active members of the cluster are considered when making the choice of primary node. The primary node will be the node with the lowest value of wsrep_local_index.
The monitor user does not require any special grants to monitor a Galera cluster. To create a user for the monitor, execute the following SQL.
CC BY-SA / Gnu FDL
This document describes possible issues when upgrading MariaDB MaxScale from version 2.4 to 2.5.
For more information about MaxScale 2.5, refer to the ChangeLog.
Before starting the upgrade, any existing configuration files should be backed up.
The deprecated MaxAdmin interface has been removed in 2.5.0 in favor of the REST API and the MaxCtrl command line client. The cli and maxscaled modules can no longer be used.
The credentials used by services now require additional grants. For a full list of required grants, refer to the.
The settings detect_stale_master, detect_standalone_master and detect_stale_slave are replaced by master_conditions and slave_conditions. The old settings may still be used, but will be removed in a later version.
The encrypted passwords feature has been updated to be more secure. Users are recommended to generate a new encryption key and re-encrypt their passwords using the maxkeys and maxpasswd utilities. Old passwords still work.
The default state of servers in 2.4 was Running and in 2.5 it is now Down. This was done to prevent newly added servers from being accidentally used before they were monitored.
It is now mandatory to specify in the configuration what version the monitored Columnstore cluster is.
Please see the for details.
The binlog router delivered with MaxScale 2.5 is completely new and not 100% backward compatible with the binlog router delivered with earlier MaxScale versions. If you use the binlog router, carefully assess whether the functionality provided by the new one fulfills your requirements, before upgrading MaxScale.
The tee filter parameter service has been deprecated in favor of the target parameter. All usages of service can be replaced with target.
CC BY-SA / Gnu FDL
The MariaDBAuth module implements the client and backend authentication for the server plugin mysql_native_password, which is the default authentication plugin used by both MariaDB and MySQL.
The following settings may be given in the authenticator_options of the listener.
clear_pw_passthrough
Boolean, default value is "false". Activates passthrough mode. In this mode, MaxScale does not check client credentials at all and defers authentication to the backend server. This feature is primarily meant to be used with Xpand LDAP authentication, although it may be useful in any situation where MaxScale can neither check the existence of the client user account nor authenticate the client.
When a client connects to a listener with this setting enabled, MaxScale will change authentication method to "mysql_clear_password", causing the client to send their cleartext password to MaxScale. MaxScale will then attempt to use the password to authenticate to backends. The authentication result of the first backend to respond will be sent to the client. The backend may ask MaxScale for either cleartext password or standard ("mysql_native_password") authentication token. MaxScale can work with both backend plugins since it has the original password.
This feature is incompatible with the service setting lazy_connect. Either leave it unspecified or set lazy_connect=false in the linked service. Also, multiple client authenticators are not allowed on the listener when passthrough mode is on.
Because passwords are sent in cleartext, the listener should be configured for SSL.
log_password_mismatch
Type: boolean
Mandatory: No
Dynamic: No
Default: false
The service setting log_auth_warnings must also be enabled for this setting to have effect. When both settings are enabled, password hashes are logged if a client gives a wrong password. This feature may be useful when diagnosing authentication issues. It should only be enabled on a secure system as the logging of password hashes may be a security risk.
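A sketch of enabling the setting on a listener (the section and service names are illustrative; log_auth_warnings must also be enabled on the linked service):
[MyListener]
type=listener
service=MyService
authenticator=mariadbauth
authenticator_options=log_password_mismatch=true
<other options>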
cache_dir
Deprecated and ignored.
inject_service_user
Deprecated and ignored.
CC BY-SA / Gnu FDL
default_action=<master|slave|named|all>
master
Route to the primary server.
slave
Route to any single replica server.
named
Route to a named server. The name is given in the default_server-setting.
all
Default value. Route to all connected servers.
default_server=<server-name>
max_slaves=<limit>
[Routing-Service]
type=service
router=hintrouter
servers=replica1,replica2,replica3
[Routing-Service]
type=service
router=hintrouter
servers=MyPrimary, replica1,replica2,replica3,replica4,replica5,replica6,replica7
default_action=master
max_slaves=2
[Galera-Monitor]
type=monitor
module=galeramon
servers=dbserv1, dbserv2, dbserv3
user=monitor_user
password=my_password
monitor_interval=2000ms
CREATE USER 'monitor_user'@'%' IDENTIFIED BY 'my_password';
[Bootstrap1]
type=server
address=10.2.224.101
port=3306
protocol=mariadbbackend
[Bootstrap2]
type=server
address=10.2.224.102
port=3306
protocol=mariadbbackend
[Xpand]
type=monitor
module=xpandmon
servers=Bootstrap1, Bootstrap2
user=monitor_user
password=monitor_password
monitor_interval=2s
cluster_monitor_interval=60s
@@Xpand:node-2
@@Xpand:node-3
@@Xpand:node-7
[MyService]
type=service
router=readconnroute
user=service_user
password=service_password
cluster=Xpand
maxkeys
maxpasswd plainpassword
96F99AA1315BDC3604B006F427DD9484
[My-Service]
type=service
router=readconnroute
router_options=master
servers=dbserv1, dbserv2, dbserv3
user=maxscale
password=96F99AA1315BDC3604B006F427DD9484
[Splitter-Service]
type=service
router=readwritesplit
servers=dbserv1, dbserv2, dbserv3
user=maxscale
password=maxscale_pw
[Splitter-Listener]
type=listener
service=Splitter-Service
port=3306
[CSMonitor]
type=monitor
module=csmon
version=1.5
...
[MyListener]
type=listener
authenticator=mariadbauth
authenticator_options=clear_pw_passthrough=true
ssl=true
<other options>
The cat router is a special router that concatenates result sets.
Note: This module is experimental and must be built from source. The module is deprecated in MaxScale 23.08 and might be removed in a future release.
The router has no special parameters. To use it, define a service with router=cat and add the servers you want to use.
The order in which the servers are defined is the order in which they are queried. This means that the results are ordered based on the servers parameter of the service. The result is only complete once all servers have executed the query.
All commands executed via this router will be executed on all servers. This means that an INSERT through the cat router is sent to all servers. For commands that do not return result sets, the response of the last server is sent to the client. This means that if one of the earlier servers returns a different result, the client will not see it.
As the intended use-case of the router is mainly to reduce multiple result sets into one, it has no mechanisms to prevent writes from being executed on replica servers (which would cause data corruption or replication failure). Take great care when performing administrative operations through this router.
If a connection to one of the servers is lost, the client connection will also be closed.
Here is a simple example service definition that uses the servers from the Configuring Servers tutorial and the credentials from the MaxScale Tutorial.
[concat-service]
type=service
router=cat
servers=dbserv1,dbserv2,dbserv3
user=maxscale
password=maxscale_pw
CC BY-SA / Gnu FDL
CDC is a new protocol that allows compatible clients to authenticate and register for Change Data Capture events. The protocol must be used in conjunction with the Avro router, which converts MariaDB binlog events into Avro records. The Change Data Capture protocol is used by clients to interact with the stored Avro files and also allows registered clients to be notified of new events coming from a MariaDB 10.0/10.1 database.
The users and their hashed passwords are stored in /var/cache/maxscale/<service name>/cdcusers, where <service name> is the name of the service. For example, the following service entry will look in /var/cache/maxscale/CDC-Service/ for a file called cdcusers. If that file is found, the users in it will be used for authentication.
[CDC-Service]
type=service
router=avrorouter
user=maxuser
password=maxpwd
If the cdcusers file cannot be found, the service user (maxuser:maxpwd in the example) can be used to connect through the CDC protocol.
For more details, refer to the CDC users documentation.
Client connects to the MaxScale CDC protocol listener.
Client sends the authentication message, which includes the user and the SHA1 of the password. In the future, optional flags could be implemented.
Client registers by sending its UUID and specifying the output format (AVRO or JSON) for data retrieval.
Client sends CDC commands to retrieve router statistics or to query for data events.
The authentication starts when the client sends the hexadecimal representation of the username concatenated with a colon (:) and the SHA1 of the password.
bin2hex(username + ':' + SHA1(password))
For example, the user foobar with a password of foopasswd should send the following hexadecimal string:
foobar:SHA1(foopasswd) -> 666f6f6261723a3137336363643535253331
The server returns OK on success and ERR on failure.
REGISTER
REGISTER UUID=UUID, TYPE={JSON | AVRO}
Register as a client to the service.
Example:
REGISTER UUID=11ec2300-2e23-11e6-8308-0002a5d5c51b, TYPE=AVRO
The server returns OK on success and ERR on failure.
REQUEST-DATA
REQUEST-DATA DATABASE.TABLE[.VERSION] [GTID]
This command fetches data from the specified table in a database and returns the output in the requested format (AVRO or JSON). Data records are sent to the clients, and if new Avro versions are found (e.g. mydb.mytable.0000002.avro), the new schema and data will be sent as well.
The data will be streamed until the client closes the connection.
Clients should continue reading from the network in order to automatically get new events.
Example:
REQUEST-DATA db1.table1
REQUEST-DATA db1.table1.000003
REQUEST-DATA db2.table4 0-11-345
MaxScale includes an example CDC client application written in Python 3. You can find the source code for it in the MaxScale repository.
CC BY-SA / Gnu FDL
Introduced in MaxScale 2.1, the module commands are special, module-specific commands. They allow the modules to expand beyond the capabilities of the module API. Currently, only MaxCtrl implements an interface to the module commands.
All registered module commands can be shown with maxctrl list commands and they can be executed with maxctrl call command <module> <name> ARGS..., where <module> is the name of the module, <name> is the name of the command and ARGS is a command-specific list of arguments.
The module command API is defined in the modulecmd.hh header. It consists of various functions to register and call module commands. Read the function documentation in the header for more details.
The following example registers the module command my_command for the module my_module.
#include <maxscale/modulecmd.hh>
bool my_simple_cmd(const MODULECMD_ARG *argv)
{
printf("%d arguments given\n", argv->argc);
return true; // A module command returns true on success
}
int main(int argc, char **argv)
{
modulecmd_arg_type_t my_args[] =
{
{MODULECMD_ARG_BOOLEAN, "This is a boolean parameter"},
{MODULECMD_ARG_STRING | MODULECMD_ARG_OPTIONAL, "This is an optional string parameter"}
};
// Register the command
modulecmd_register_command("my_module", "my_command", my_simple_cmd, 2, my_args);
// Find the registered command
const MODULECMD *cmd = modulecmd_find_command("my_module", "my_command");
// Parse the arguments for the command
const void *arglist[] = {"true", "optional string"};
MODULECMD_ARG *arg = modulecmd_arg_parse(cmd, arglist, 2);
// Call the module command
modulecmd_call_command(cmd, arg);
// Free the parsed arguments
modulecmd_arg_free(arg);
return 0;
}
The array my_args of type modulecmd_arg_type_t is used to tell what kinds of arguments the command expects. The first argument is a boolean and the second argument is an optional string.
Arguments are passed to the parsing function as an array of void pointers. They are interpreted as the types the command expects.
When the module command is executed, the argv parameter for my_simple_cmd contains the parsed arguments received from the caller of the command.
CC BY-SA / Gnu FDL
MariaDB MaxScale is a database proxy that forwards database statements to one or more database servers.
The forwarding is performed using rules based on the semantic understanding of the database statements and on the roles of the servers within the backend cluster of databases.
MariaDB MaxScale is designed to provide, transparently to applications, load balancing and high availability functionality. MariaDB MaxScale has a scalable and flexible architecture, with plugin components to support different protocols and routing approaches.
MariaDB MaxScale makes extensive use of the asynchronous I/O capabilities of the Linux operating system, combined with a fixed number of worker threads. epoll is used to provide the event driven framework for the input and output via sockets.
Many of the services provided by MariaDB MaxScale are implemented as external shared object modules loaded at runtime. These modules support a fixed interface, communicating the entry points via a structure consisting of a set of function pointers. This structure is called the "module object". Additional modules can be created to work with MariaDB MaxScale.
Commonly used module types are protocol, router and filter. Protocol modules implement the communication between clients and MariaDB MaxScale, and between MariaDB MaxScale and backend servers. Routers inspect the queries from clients and decide the target backend. The decisions are usually based on routing rules and backend server status. Filters work on data as it passes through MariaDB MaxScale. Filters are often used for logging queries or modifying server responses.
A Google Group exists for MariaDB MaxScale. The Group is used to discuss ideas, issues and communicate with the MariaDB MaxScale community. Send email to maxscale@googlegroups.com or use the forum interface.
Bugs can be reported in the MariaDB Jira: jira.mariadb.org
Information about installing MariaDB MaxScale, either from a repository or by building from source code, is included in the MariaDB MaxScale Installation Guide.
The same guide also provides basic information on running MariaDB MaxScale. More detailed information about configuring MariaDB MaxScale can be found in the Configuration Guide.
CC BY-SA / Gnu FDL
Change Data Capture (CDC) is a new MaxScale protocol that allows compatible clients to authenticate and register for Change Data Capture events. The protocol must be used in conjunction with the Avro router, which converts MariaDB binlog events into Avro records. Clients connect to the CDC listener and authenticate using credentials provided in the format described in the CDC Protocol documentation.
Note: If no users are found in that file or if it doesn't exist, the only available user will be the service user:
[avro-service]
type=service
router=avrorouter
source=replication-service
user=cdc_user
password=cdc_password
Starting with MaxScale 2.1, users can also be created through maxctrl:
maxctrl call command cdc add_user <service> <name> <password>
The <service> should be the name of the service where the user is created. Older versions of MaxScale should use the cdc_users.py script.
bash$ cdc_users.py [-h] USER PASSWORD
The output of this command should be appended to the cdcusers file at /var/lib/maxscale/<service name>/.
bash$ cdc_users.py user1 pass1 >> /var/lib/maxscale/avro-service/cdcusers
Users can be deleted by removing the related rows in 'cdcusers' file. For more details on the format of the cdcusers file, read the CDC Protocol documentation.
CC BY-SA / Gnu FDL
This document describes possible issues when upgrading MariaDB MaxScale from version 2.5 to 6.
Note that the versioning scheme has changed and that version 6 immediately follows version 2.5. Effectively, the non-changing 2.-prefix has been dropped and henceforth, at a major release, the major version number, instead of the minor one, will be bumped. This change also affects how maintenance releases are versioned. For instance, 2.5.1, the first GA version of MaxScale 2.5, was followed by the maintenance release 2.5.2. 6.1, the first GA version of MaxScale 6, will be followed by the maintenance release 6.2.
For more information about MaxScale 6, refer to the ChangeLog.
Before starting the upgrade, any existing configuration files should be backed up.
Using duration type parameters without an explicit suffix has been deprecated in MaxScale 2.4. In MaxScale 6 they are no longer allowed when used with the REST API or MaxCtrl. This means that any create or alter commands in MaxCtrl that use a duration type parameter must explicitly specify the suffix of the unit. For example, the following command:
maxctrl alter service My-Service connection_keepalive 30000
should be replaced with:
maxctrl alter service My-Service connection_keepalive 30000ms
Duration type parameters can still be defined in the configuration file without an explicit suffix but this behavior is deprecated. The recommended approach is to add explicit suffixes to all duration type parameters when upgrading to MaxScale 6.
threads
The default value of threads was changed to auto.
The following deprecated core parameters have been removed:
thread_stack_size
The deprecated aliases for the schemarouter parameters ignore_databases and ignore_databases_regex have been removed. They can be replaced with ignore_tables and ignore_tables_regex.
In addition, the preferred_server parameter that was deprecated in 2.5 has also been removed.
The prune_sescmd_history, max_sescmd_history and disable_sescmd_history parameters have been made into generic service parameters that are shared between all routers that support them.
The default value of prune_sescmd_history was changed from false to true. This was done as most MaxScale installations either benefit from it being enabled or are not affected by it.
CC BY-SA / Gnu FDL
With the comment filter it is possible to define comments that are injected before the actual statements. These comments appear as SQL comments when they are received by the server.
The Comment filter requires one mandatory parameter to be defined.
inject
Type: string
Mandatory: Yes
Dynamic: Yes
The comment that is injected before the statements. The comment may contain the variable $IP, which is replaced with the IP address of the client. Variables must be written in all caps.
The following configuration adds the IP address of the client to the comment.
In this example, when MaxScale receives the statement SELECT user FROM people;, the server receives it as /* IP=::ffff:127.0.0.1 */SELECT user FROM people; (see the example statements later in this section).
CC BY-SA / Gnu FDL
MariaDB MaxScale is also made available as a tarball, which is named like maxscale-x.y.z.OS.tar.gz, where x.y.z is the version and OS identifies the operating system, e.g. maxscale-2.5.6.centos.7.tar.gz.
In order to use the tarball, the following libraries are required:
libcurl
libaio
OpenSSL
gnutls
libatomic
unixODBC
The tarball has been built with the assumption that it will be installed in /usr/local. However, it is possible to install it in any directory, but in that case MariaDB MaxScale must be invoked with a flag.
/usr/local
If you have root access to the system, you probably want to install MariaDB MaxScale under the user and group maxscale.
The required steps are as follows:
Creating the symbolic link is necessary, since MariaDB MaxScale has been built with the assumption that the plugin directory is /usr/local/maxscale/lib/maxscale. The symbolic link also makes it easy to switch between different versions of MariaDB MaxScale that have been installed side by side in /usr/local; just make the symbolic link point to another installation.
In addition, the first time you install MariaDB MaxScale from a tarball you need to create the following directories:
and make maxscale the owner of them:
The following step is to create the MariaDB MaxScale configuration file /etc/maxscale.cnf. The file etc/maxscale.cnf.template can be used as a base.
Please refer to for details.
When the configuration file has been created, MariaDB MaxScale can be started.
The -d flag causes maxscale not to turn itself into a daemon, which is advisable the first time MariaDB MaxScale is started, as it makes it easier to spot problems.
If you want to place the configuration file somewhere other than /etc, you can invoke MariaDB MaxScale with the --config flag, for instance, --config=/usr/local/maxscale/etc/maxscale.cnf.
Note also that if you want to keep everything under /usr/local/maxscale, you can invoke MariaDB MaxScale using the flag --basedir. That will cause MariaDB MaxScale to look for its configuration file in /usr/local/maxscale/etc and to store all runtime files under /usr/local/maxscale/var.
Enter a directory where you have the right to create a subdirectory. Then do as follows.
The next step is to create the MaxScale configuration file maxscale-x.y.z/etc/maxscale.cnf. The file maxscale-x.y.z/etc/maxscale.cnf.template can be used as a base.
Please refer to for details.
When the configuration file has been created, MariaDB MaxScale can be started.
With the flag --basedir, MariaDB MaxScale is told where the lib, etc and var directories are found. Unless it is specified, MariaDB MaxScale assumes the lib directory is found in /usr/local/maxscale, and the var and etc directories in /.
It is also possible to specify the directories and the location of the configuration file individually. Invoke MaxScale with the --help flag to find out the appropriate flags.
CC BY-SA / Gnu FDL
GSSAPI is an authentication protocol that is commonly implemented with Kerberos on Unix or Active Directory on Windows. This document describes GSSAPI authentication in MaxScale. The authentication module name in MaxScale is GSSAPIAuth.
For Unix systems, the usual GSSAPI implementation is Kerberos. This is a short guide on how to set up Kerberos for MaxScale.
The first step is to configure MariaDB to use GSSAPI authentication. The MariaDB documentation for the GSSAPI plugin is a good example of how to set it up.
The next step is to copy the keytab file from the server where MariaDB is installed to the server where MaxScale is located. The keytab file must be placed in the configured default location, which almost always is /etc/krb5.keytab. Alternatively, the keytab file path can be given as an authenticator option.
The location of the keytab file can be changed with the KRB5_KTNAME environment variable:
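For example (the path is illustrative):
KRB5_KTNAME=/path/to/krb5.keytab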
To take GSSAPI authentication into use, add the following to the listener.
The principal name should be the same as on the MariaDB servers.
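A sketch of such a listener section (the service name and port are illustrative; the authenticator module and the principal_name option are the ones described in this document):
[MyListener]
type=listener
service=MyService
port=3306
authenticator=GSSAPIAuth
authenticator_options=principal_name=mariadb/localhost.localdomain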
principal_name
Type: string
Mandatory: No
Dynamic: No
Default: mariadb/localhost.localdomain
The service principal name to send to the client. This parameter is a string parameter which is used by the client to request the token.
This parameter must be the same as the principal name that the backend MariaDB server uses.
gssapi_keytab_path
Type: path
Mandatory: No
Dynamic: No
Default: Kerberos Default
Keytab file location. This should be an absolute path to the file containing the keytab. If not defined, Kerberos will search a default location, usually /etc/krb5.keytab. This path is set via an environment variable, which means that multiple listeners with GSSAPIAuth will override each other. If using multiple GSSAPI authenticators, either do not set this option or use the same value for all listeners.
Read the document for more details on how authentication modules work in MaxScale.
The GSSAPI plugin authentication starts when the database server sends the service principal name in the AuthSwitchRequest packet. The principal name will usually be in the form service@REALM.COM.
The client searches its local cache for a token for the service or may request it from the GSSAPI server. If found, the client sends the token to the database server. The database server verifies the authenticity of the token using its keytab file and sends the final OK packet to the client.
The GSSAPI authenticator modules require the GSSAPI development libraries (krb5-devel on CentOS 7).
CC BY-SA / Gnu FDL
This document describes possible issues when upgrading MariaDB MaxScale from version 2.3 to 2.4.
For more information about MariaDB MaxScale 2.4, please refer to the ChangeLog.
Before starting the upgrade, we recommend you back up your current configuration file.
Section and object names starting with @@ are now reserved for internal use by MaxScale. In case such names have been used, they must be changed manually in all MaxScale configuration files before MaxScale 2.4 is started.
Those files are:
The main configuration file; typically /etc/maxscale.cnf.
All nested configuration files; typically /etc/maxscale.cnf.d/*.
All dynamic configuration files; typically /var/lib/maxscale/maxscale.cnf.d/*.
Whitespace in section names that was deprecated in MaxScale 2.2 will now be rejected, which will cause the startup of MaxScale to fail.
To prevent that, section names like
must be changed, for instance, to
Durations can now be specified using one of the suffixes h, m, s and ms for specifying durations in hours, minutes, seconds and milliseconds, respectively.
Not providing an explicit unit has been deprecated in MaxScale 2.4, so it is advisable to add suffixes to durations. For instance,
MaxScale 2.4 will use a SHA2-512 hash for new admin user passwords. To upgrade a user to use the better hashing algorithm, either recreate the user or use the maxctrl alter user command.
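For example, to update a hypothetical admin user named admin (the name and password are illustrative):
maxctrl alter user admin new_password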
The following settings have been removed and cause a startup error if defined:
mysql51_replication
multimaster
allow_cluster_recovery
.
If multiple masters are available for a readwritesplit service, the one with the lowest connection count is selected.
If a master server is placed into maintenance mode, all open transactions are allowed to gracefully finish before the session is closed. To forcefully close the connections, use the --force option for maxctrl set server.
The lazy_connect feature can be used as a workaround. It also reduces the overall load on the system when connections are rapidly opened and closed.
Transaction replays now have a limit on how many times a replay is attempted. The default value is five attempts and is controlled by the transaction_replay_attempts parameter.
If transaction replay is enabled and a deadlock occurs (SQLSTATE 40XXX), the transaction is automatically retried.
CC BY-SA / Gnu FDL
The goal of this tutorial is to configure a system that has two ports available, one for write connections and another for read connections. The read connections are load-balanced across replica servers.
This tutorial is a part of the MaxScale Tutorial. Please read it and follow the instructions. Return here once the basic setup is complete.
We want two services and ports to which the client application can connect. One service routes client connections to the primary server, the other load balances between replica servers. To achieve this, we need to define two services in the configuration file.
Create the following two sections in your configuration file. The section names are the names of the services and should be meaningful. For this tutorial, we use the names Write-Service and Read-Service.
router defines the routing module used. Here we use readconnroute for connection-level routing.
A service needs a list of servers to route queries to. The server names must match the names of server sections in the configuration file and not the hostnames or addresses of the servers.
The router_options parameter tells the readconnroute module which servers it should route a client connection to. For the write service we use the master type and for the read service the slave type.
The user and password parameters define the credentials the service uses to populate user authentication data. These users were created at the start of the MaxScale Tutorial.
For increased security, see .
To allow network connections to a service, a network port must be associated with it. This is done by creating a separate listener section in the configuration file. A service may have multiple listeners, but for this tutorial one per service is enough.
The service parameter tells which service the listener connects to. For the Write-Listener we set it to Write-Service and for the Read-Listener we set it to Read-Service.
A listener must define the network port to listen on.
The optional address parameter defines the local address the listener should bind to. This may be required when the host machine has multiple network interfaces. The default behavior is to listen on all network interfaces (the IPv6 address ::).
For the last steps, please return to .
CC BY-SA / Gnu FDL
[MyComment]
type=filter
module=comment
inject="Comment to be injected"
[MyService]
type=service
router=readwritesplit
servers=server1
user=myuser
password=mypasswd
filters=MyComment
[IPComment]
type=filter
module=comment
inject="IP=$IP"
[MyService]
type=service
router=readwritesplit
servers=server1
user=myuser
password=mypasswd
filters=IPComment
SELECT user FROM people;
/* IP=::ffff:127.0.0.1 */SELECT user FROM people;
$ sudo groupadd maxscale
$ sudo useradd -g maxscale maxscale
$ cd /usr/local
$ sudo tar -xzvf maxscale-x.y.z.OS.tar.gz
$ sudo ln -s maxscale-x.y.z.OS maxscale
$ cd maxscale
$ sudo chown -R maxscale var
$ sudo mkdir /var/log/maxscale
$ sudo mkdir /var/lib/maxscale
$ sudo mkdir /var/run/maxscale
$ sudo mkdir /var/cache/maxscale
$ sudo chown maxscale /var/log/maxscale
$ sudo chown maxscale /var/lib/maxscale
$ sudo chown maxscale /var/run/maxscale
$ sudo chown maxscale /var/cache/maxscale
$ sudo bin/maxscale --user=maxscale -d
$ sudo bin/maxscale --user=maxscale --basedir=/usr/local/maxscale -d
$ tar -xzvf maxscale-x.y.z.OS.tar.gz
$ cd maxscale-x.y.z.OS
$ bin/maxscale -d --basedir=.
$ bin/maxscale --help
[My Server]
...
[My Service]
...
servers=My Server
[MyServer]
...
[MyService]
...
servers=MyServer
some_param=60s
some_param=60000ms
[Write-Service]
type=service
router=readconnroute
router_options=master
servers=dbserv1, dbserv2, dbserv3
user=maxscale
password=maxscale_pw
[Read-Service]
type=service
router=readconnroute
router_options=slave
servers=dbserv1, dbserv2, dbserv3
user=maxscale
password=maxscale_pw
[Write-Listener]
type=listener
service=Write-Service
port=3306
[Read-Listener]
type=listener
service=Read-Service
port=3307
This filter was introduced in MariaDB MaxScale 2.3.0.
The binlogfilter can be combined with a binlogrouter service to selectively replicate the binary log events to replica servers.
The filter uses two settings, match and exclude, to determine which events are replicated. If a binlog event does not match or is excluded, the event is replaced with an empty data event. The empty event is always 35 bytes which translates to a space reduction in most cases.
When statement-based replication is used, any query events that are filtered out are replaced with a SQL comment. This causes the query event to do nothing and thus the event will not modify the contents of the database. The GTID position of the replicating database will still advance which means that downstream servers replicating from it keep functioning correctly.
The filter works with both row based and statement based replication but we recommend using row based replication with the binlogfilter. This guarantees that there are no ambiguities in the event filtering.
match
Type: regex
Mandatory: No
Dynamic: Yes
Default: None
Include queries that match the regex. See the next entry, exclude, for more information.
exclude
Type: regex
Mandatory: No
Dynamic: Yes
Default: None
Exclude queries that match the regex.
If neither match nor exclude is defined, the filter does nothing and all events are replicated. This filter does not accept regular expression options as a separate setting; such settings must be defined in the patterns themselves. See the PCRE2 API documentation for more information.
The two settings are matched against the database and table name concatenated with a period. For example, for the database test and table t1, the patterns are matched against the string test.t1.
For statement based replication, the pattern is matched against all the tables in the statements. If any of the tables matches the match pattern, the event is replicated. If any of the tables matches the exclude pattern, the event is not replicated.
rewrite_src
Type: regex
Mandatory: No
Dynamic: Yes
Default: None
See the next entry, rewrite_dest, for more information.
rewrite_dest
Type: regex
Mandatory: No
Dynamic: Yes
Default: None
rewrite_src and rewrite_dest control the statement rewriting of the binlogfilter.
The rewrite_src setting is a PCRE2 regular expression that is matched against the default database and the SQL of statement based replication events (query events). rewrite_dest is the replacement string, which supports the normal PCRE2 backreferences (e.g. the first capture group is $1, the second is $2, etc.).
Both rewrite_src and rewrite_dest must be defined to enable statement rewriting.
When statement rewriting is enabled, GTID-based replication must be used. The filter will disallow replication for all replicas that attempt to replicate with traditional file-and-position based replication.
The replacement is done both on the default database as well as on the SQL statement in the query event. This means that great care must be taken when defining the rewriting rules. To prevent accidental modification of the SQL into a form that is no longer valid, use database and table names that never occur in the inserted data and are never used as constant values.
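As an illustrative sketch only (the database names are hypothetical), the following filter section would rewrite references to the database appdb into appdb_replica, both in the default database and in the SQL of query events:

[RewriteFilter]
type=filter
module=binlogfilter
rewrite_src=appdb
rewrite_dest=appdb_replica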
With the following configuration, only events belonging to the database customers are replicated. In addition to this, events for the table orders are excluded and thus are not replicated.
[BinlogFilter]
type=filter
module=binlogfilter
match=/customers[.]/
exclude=/[.]orders/
[BinlogServer]
type=service
router=binlogrouter
server_id=33
filters=BinlogFilter
[BinlogListener]
type=listener
service=BinlogServer
port=4000
For more information about the binlogrouter and how to use it, refer to the binlogrouter documentation.
CC BY-SA / Gnu FDL
MariaDB MaxScale can be built on any system that meets the requirements. The main requirements are as follows:
CMake version 3.16 or later (Packaging requires CMake 3.25.1 or later)
GCC version 4.9 or later
OpenSSL version 1.0.1 or later
GNUTLS
Node.js 14 or newer for building MaxCtrl and the GUI (webpack), Node.js 10 or newer for running MaxCtrl
PAM
SASL2 (cyrus-sasl)
SQLite3 version 3.3 or later
Tcl
git
jansson
libatomic
libcurl
libmicrohttpd
libuuid
libxml2
libssh
pcre2
This is the minimum set of requirements that must be met to build the MaxScale core package. Some modules in MaxScale require optional extra dependencies.
libuuid (binlogrouter)
boost (binlogrouter)
Bison 2.7 or later (dbfwfilter)
Flex 2.5.35 or later (dbfwfilter)
librdkafka (kafkacdc, kafkaimporter and mirror)
memcached (storage_memcached for the cache filter)
hiredis (storage_redis for the cache filter)
Some of these dependencies are not available on all operating systems and are downloaded automatically during the build step. To skip the building of modules that need automatic downloading of the dependencies, use -DBUNDLE=N when configuring CMake.
This installs MaxScale as if it was installed from a package. Install git
before running the following commands.
git clone https://github.com/mariadb-corporation/MaxScale
mkdir build
cd build
../MaxScale/BUILD/install_build_deps.sh
cmake ../MaxScale -DCMAKE_INSTALL_PREFIX=/usr
make
sudo make install
sudo ./postinst
For a definitive list of packages, consult the install_build_deps.sh script.
The tests and other parts of the build can be controlled via CMake arguments. Here is a small table with the names of the most common parameters and what they control. These should all be given as parameters to the -D switch in NAME=VALUE format (e.g. -DBUILD_TESTS=Y).
CMAKE_INSTALL_PREFIX
Location where MariaDB MaxScale will be installed to. Set this to /usr if you want MariaDB MaxScale installed into the same place the packages are installed.
BUILD_TESTS
Build unit tests
WITH_SCRIPTS
Install systemd and init.d scripts
PACKAGE
Enable building of packages
TARGET_COMPONENT
Which component to install; the default is the 'core' package. Other targets are 'experimental', which installs experimental packages, and 'all', which installs all components.
TARBALL
Build tar.gz packages, requires PACKAGE=Y
Note: You can look into defaults.cmake for a list of the CMake variables.
To run the MaxScale unit test suite, configure the build with -DBUILD_TESTS=Y, compile and then run the make test command.
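Assuming the build directory layout used above, the full sequence could look like this:

cmake ../MaxScale -DCMAKE_INSTALL_PREFIX=/usr -DBUILD_TESTS=Y
make
make test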
If you wish to build packages, just add -DPACKAGE=Y to the CMake invocation and build the package with make package instead of installing MaxScale with make install. This process will create an RPM/DEB package depending on your system.
To build a tarball, add -DTARBALL=Y to the CMake invocation. This will create a maxscale-x.y.z.tar.gz file where x.y.z is the version number.
Some Debian and Ubuntu systems suffer from a bug where make package fails with errors from dpkg-shlibdeps. This can be fixed by running make before make package and adding the path to the libmaxscale-common.so library to the LD_LIBRARY_PATH environment variable.
make
LD_LIBRARY_PATH=$PWD/server/core/ make package
The MaxScale build system is split into multiple components. The main component is the core MaxScale package, which contains MaxScale and all the modules. This is the default component that is built, installed and packaged. There is also the experimental component that contains all experimental modules which are not considered part of the core MaxScale package and are either alpha or beta quality modules.
To build the experimental modules along with the MaxScale core components, invoke CMake with -DTARGET_COMPONENT=core,experimental.
CC BY-SA / Gnu FDL
authenticator=GSSAPIAuth
authenticator_options=principal_name=mariadb/localhost.localdomain@EXAMPLE.COM
authenticator_options=principal_name=mymariadb@EXAMPLE.COM,gssapi_keytab_path=/home/user/mymariadb.keytab
Since version 2.4, MaxScale has built-in support for Xpand. This tutorial explains how to set up MaxScale in front of an Xpand cluster.
There is no Xpand specific router, but both the readconnroute and the readwritesplit routers can be used.
With readconnroute you get simple connection based routing, where each new connection is created (by default) to the Xpand node with the least amount of existing connections. That is, with readconnroute the behaviour will be very similar to the behaviour when is used as the Xpand load balancer.
The Xpand monitor is capable of autonomously figuring out the cluster configuration, but in order to get going there must be at least one server-section referring to a node in the Xpand cluster.
That server definition will be used by the monitor in order to connect to the Xpand cluster. There can be more than one such "bootstrap" definition to cater for the case that the node used as a bootstrap server is down when MaxScale starts.
NOTE These bootstrap servers should only be referred to from the Xpand monitor configuration, but never from a service.
In the Xpand monitor section, the bootstrap servers are referred to in the same way as "ordinary" servers are referred to in other monitors.
The user defined by the user parameter needs the following grants:
In case the same user is used both for the monitor and the service (see below), then the user must be given the grants required by the service as well.
The bootstrap servers are only used for connecting to the Xpand cluster; thereafter the Xpand monitor will dynamically find out the cluster configuration.
The discovered cluster configuration (the IPs and ports of the Xpand nodes) will be stored, and upon subsequent restarts the Xpand monitor will use that information if the bootstrap servers happen to be unavailable.
With the configuration above, maxctrl list servers might output the following:
All servers whose names start with @@ have been detected dynamically. Note that the address 10.2.224.101 appears twice; once for Bootstrap-1 and another time for @@Xpand:node-6. The Xpand monitor will create a dynamic server instance for all nodes in the Xpand cluster, including the ones used in bootstrap server sections.
The service is specified as follows:
The user defined by the user parameter needs the following grants:
In case the same user is used both for the monitor (see above) and the service, then the user must be given the grants required by the monitor as well.
Note that the service does not list any specific servers, but instead refers, using the argument cluster, to the Xpand monitor. In practice this means that the service will use the servers of the monitor named Xpand, and in the case of an Xpand monitor those servers will be the ones that the monitor has detected dynamically. That is, when set up like this, the service will automatically adjust to any changes taking place in the Xpand cluster.
NOTE There is no need to specify any router_options, as the default router_options=running provides the desired behaviour. In particular, do not specify router_options=master as that will cause only a single node to be used.
To complete the configuration, a listener must be specified.
The primary purpose of the readwritesplit router is to split statements between one primary and multiple replicas. In the case of Xpand, all servers will be primaries, but readwritesplit may still be the right choice.
Namely, as readwritesplit is transaction aware and capable of replaying transactions, it can be used for hiding certain events taking place in Xpand from the clients that use it.
For instance, whenever a node is removed from or added to an Xpand cluster there will be a group change, which is visible to a client as a transaction rollback. However, if readwritesplit is used and transaction replay is enabled, then MaxScale may be able to hide the group change so that the client only detects a slight delay.
Apart from the service section, the configuration when using readwritesplit is identical to the readconnroute configuration described above.
The service is specified as follows:
With this configuration, subject to the boundary conditions of transaction replaying, a client will neither notice group change events nor the disappearance of the very node the client is connected to. In the latter case, MaxScale will simply connect to another node and replay the current transaction (if one is active). For detailed information about the transaction replay functionality, please refer to the readwritesplit documentation.
NOTE It is vital to have slave_selection_criteria=LEAST_GLOBAL_CONNECTIONS, as otherwise connections will not be distributed evenly across all Xpand nodes.
As a rule of thumb, use readwritesplit if it is important that changes taking place in the cluster configuration are hidden from the applications, otherwise use readconnroute.
CC BY-SA / Gnu FDL
We recommend installing MaxScale on a separate server, to ensure that MaxScale and a MariaDB Server that it manages do not compete for resources.
The recommended approach is to use the MariaDB package repositories to install MaxScale. After enabling the repository by following the instructions, MaxScale can be installed with the following commands.
For RHEL/Rocky Linux/Alma Linux, use dnf install maxscale.
For Debian and Ubuntu, run apt update followed by apt install maxscale.
For SLES, use zypper install maxscale.
Download the correct MaxScale package for your CPU architecture and operating system from . MaxScale can be installed with the following commands.
For RHEL/Rocky Linux/Alma Linux, use dnf install /path/to/maxscale-*.rpm.
For Debian and Ubuntu, use apt install /path/to/maxscale-*.deb.
For SLES, use zypper install /path/to/maxscale-*.rpm.
MaxScale can also be installed using a tarball. That may be required if you are using a Linux distribution for which no installation package exists or if you want to install many different MaxScale versions side by side. For instructions on how to do that, please refer to .
Alternatively you may download the MariaDB MaxScale source and build your own binaries. To do this, refer to the separate document.
MaxScale assumes that memory allocations always succeed and in general does not check for memory allocation failures. This assumption is compatible with the Linux kernel parameter vm.overcommit_memory having the value 0, which is also the default on most systems.
With vm.overcommit_memory being 0, memory allocations made by an application never fail, but instead the application may be killed by the so-called OOM (out-of-memory) killer if, by the time the application actually attempts to use the allocated memory, there is not enough free memory available on the system.
If the value is 2, then a memory allocation made by an application may fail, and unless the application is prepared for that possibility, it will likely crash with a SIGSEGV. As MaxScale is not prepared to handle memory allocation failures, it will crash in this situation.
The current value of vm.overcommit_memory can be checked with sysctl vm.overcommit_memory or cat /proc/sys/vm/overcommit_memory.
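If the value needs to be changed, a standard sysctl invocation can be used (shown here as a sketch; making the change permanent requires an entry in /etc/sysctl.conf or an equivalent mechanism):

sudo sysctl -w vm.overcommit_memory=0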
covers the first steps in configuring your MariaDB MaxScale installation. Follow this tutorial to learn how to configure and start using MaxScale.
For a detailed list of all configuration parameters, refer to the and the module specific documents listed in the .
Read the section of the configuration guide to set up password encryption for the configuration file.
There are various administration tasks that may be done with MariaDB MaxScale. A command line tool, maxctrl, is available that interacts with a running MariaDB MaxScale, allows the status of MariaDB MaxScale to be monitored and gives some control over the MariaDB MaxScale functionality.
covers the common administration tasks that need to be done with MariaDB MaxScale.
The main configuration file for MaxScale is /etc/maxscale.cnf and additional user-created configuration files are in /etc/maxscale.cnf.d/. Objects created or modified at runtime are stored in /var/lib/maxscale/maxscale.cnf.d/. Some modules also store internal data in /var/lib/maxscale/, named after the module or the configuration object.
The simplest way to back up the configuration and runtime data of a MaxScale installation is to create an archive from the following files and directories:
/etc/maxscale.cnf
/etc/maxscale.cnf.d/
/var/lib/maxscale/
This can be done with the following command:
If MaxScale is configured to store data in custom locations, these should be included in the backup as well.
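A backup created this way can later be restored by stopping MaxScale and extracting the archive relative to the root directory, for example (assuming the archive name used in the backup command):

sudo tar -xaf maxscale-backup.tar.gz -C /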
CC BY-SA / Gnu FDL
[Bootstrap-1]
type=server
address=IP-OF-NODE
port=3306
protocol=MySQLBackend
[Xpand]
type=monitor
module=xpandmon
servers=Bootstrap-1
user=USER
password=PASSWORD
CREATE USER 'maxscale-monitor'@'maxscalehost' IDENTIFIED BY 'maxscale-monitor-password';
GRANT SELECT ON system.membership TO 'maxscale-monitor'@'maxscalehost';
GRANT SELECT ON system.nodeinfo TO 'maxscale-monitor'@'maxscalehost';
GRANT SELECT ON system.softfailed_nodes TO 'maxscale-monitor'@'maxscalehost';
┌───────────────────┬──────────────┬──────┬─────────────┬─────────────────┬──────┐
│ Server │ Address │ Port │ Connections │ State │ GTID │
├───────────────────┼──────────────┼──────┼─────────────┼─────────────────┼──────┤
│ @@Xpand:node-7 │ 10.2.224.102 │ 3306 │ 0 │ Master, Running │ │
├───────────────────┼──────────────┼──────┼─────────────┼─────────────────┼──────┤
│ @@Xpand:node-8 │ 10.2.224.103 │ 3306 │ 0 │ Master, Running │ │
├───────────────────┼──────────────┼──────┼─────────────┼─────────────────┼──────┤
│ @@Xpand:node-6 │ 10.2.224.101 │ 3306 │ 0 │ Master, Running │ │
├───────────────────┼──────────────┼──────┼─────────────┼─────────────────┼──────┤
│ Bootstrap-1 │ 10.2.224.101 │ 3306 │ 0 │ Master, Running │ │
└───────────────────┴──────────────┴──────┴─────────────┴─────────────────┴──────┘
[Xpand-Service]
type=service
router=readconnroute
user=USER
password=PASSWORD
cluster=Xpand
CREATE USER 'maxscale-service'@'maxscalehost' IDENTIFIED BY 'maxscale-service-password';
GRANT SELECT ON system.users TO 'maxscale-service'@'maxscalehost';
GRANT SELECT ON system.user_acl TO 'maxscale-service'@'maxscalehost';
[Xpand-Service-Listener]
type=listener
service=Xpand-Service
protocol=MariaDBClient
port=4008
[Xpand-Service]
type=service
router=readwritesplit
user=maxscale-service
password=maxscale-service-password
cluster=Xpand
transaction_replay=true
slave_selection_criteria=LEAST_GLOBAL_CONNECTIONS
sysctl vm.overcommit_memory
cat /proc/sys/vm/overcommit_memory
tar -caf maxscale-backup.tar.gz /etc/maxscale.cnf /etc/maxscale.cnf.d/ /var/lib/maxscale/
This filter was added in MariaDB MaxScale 2.3.
The throttle filter is used to limit the maximum query frequency (QPS, queries per second) of a database session to a configurable value. The main use cases are preventing a rogue session (a client-side error) or a DoS attack from overloading the system.
The throttling is dynamic: the query frequency is not limited to an absolute value. Depending on the configuration, the throttle will allow some amount of high-frequency querying, in particular short bursts with no frequency limitation.
[Throttle]
type = filter
module = throttlefilter
max_qps = 500
throttling_duration = 60000
...
[Routing-Service]
type = service
filters = Throttle
This configuration states that the query frequency will be throttled to around 500 qps, and that a session is allowed to stay throttled at the maximum frequency for at most 60 seconds. All values involving time are configured in milliseconds. With the basic configuration the throttling is nearly immediate, i.e. a session will only be allowed very short bursts of high-frequency querying.
When a session has been continuously throttled for throttling_duration milliseconds, or 60 seconds in this example, MaxScale will disconnect the session.
The two parameters max_qps and sampling_duration together define how a session is throttled.
Suppose max_qps is 400 and sampling_duration is 10 seconds. Since QPS is not an instantaneous measure but, so to speak, has a granularity of 10 seconds, over those 10 seconds 10 * 400 = 4000 queries are allowed before throttling kicks in.
With these values, a fresh session can start off with a speed of 2000 qps, and maintain that speed for 2 seconds before throttling starts.
If the client continues to query at high speed and throttling_duration is set to 10 seconds, MaxScale will disconnect the session 12 seconds after it started.
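A filter section matching the values used in this example could look as follows (all durations given in milliseconds):

[Throttle]
type = filter
module = throttlefilter
max_qps = 400
sampling_duration = 10000
throttling_duration = 10000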
max_qps
Type: number
Mandatory: Yes
Dynamic: Yes
Maximum queries per second.
This is the frequency to which a session will be limited over a given time period. QPS is not measured as an instantaneous value but over a configurable sampling duration (see sampling_duration).
throttling_duration
Type: duration
Mandatory: Yes
Dynamic: Yes
This defines how long a session is allowed to be throttled before MaxScale disconnects the session.
sampling_duration
Type: duration
Mandatory: No
Dynamic: Yes
Default: 250ms
Sampling duration defines the window of time over which QPS is measured. This parameter directly affects the amount of time that high frequency queries are allowed before throttling kicks in.
The lower this value is, the stricter the throttling becomes. Conversely, the longer this time is, the longer the bursts of high-frequency querying that are allowed.
continuous_duration
Type: duration
Mandatory: No
Dynamic: Yes
Default: 2s
This value defines what continuous throttling means. Continuous throttling starts as soon as the filter throttles the frequency. Continuous throttling ends when no throttling has been performed in the past continuous_duration time.
CC BY-SA / Gnu FDL
This filter was introduced in MariaDB MaxScale 2.1.
The Consistent Critical Read (CCR) filter allows consistent critical reads to be done through MaxScale while still allowing scaleout of non-critical reads.
When the filter detects a statement that would modify the database, it attaches a routing hint to all following statements done by that connection. This routing hint guides the routing module to route the statement to the primary server where data is guaranteed to be in an up-to-date state. Writes from one session do not, by default, propagate to other sessions.
Note: This filter does not work with prepared statements. Only text protocol queries are handled by this filter.
The triggering of the filter can be limited further by adding MaxScale supported comments to queries and/or by using regular expressions. The query comments take precedence: if a comment is found it is obeyed, even if a regular expression parameter might give a different result. A comment can, however, not cause a SELECT query to trigger the filter; such a comment is considered an error and is ignored.
The comments must follow the MaxScale hint syntax and the HintFilter needs to be in the filter chain before the CCR-filter. If a query has a MaxScale supported comment line which defines the parameter ccr, that comment is caught by the CCR-filter. The parameter values match and ignore are supported, causing the filter to trigger (match) or not trigger (ignore) on receiving the write query. For example, the query
INSERT INTO departments VALUES ('d1234', 'NewDepartment'); -- maxscale ccr=ignore
would normally cause the filter to trigger, but does not because of the comment. The match-comment typically has no effect, since write queries by default trigger the filter anyway. It can be used to override an ignore-type regular expression that would otherwise prevent triggering.
The CCR filter has no mandatory parameters.
time
Type: duration
Mandatory: No
Dynamic: Yes
Default: 60s
The time window during which queries are routed to the primary. The duration can be specified as documented here, but the value will always be rounded to the nearest second. If no explicit unit has been specified, the value is interpreted as seconds in MaxScale 2.4. In subsequent versions a value without a unit may be rejected. The default value for this parameter is 60 seconds.
When a data modifying SQL statement is processed, a timer is set to the value of time. Once the timer has elapsed, all statements are routed normally. If a new data modifying SQL statement is processed within the time window, the timer is reset to the value of time.
Enabling this parameter in combination with the count parameter causes both the time window and number of queries to be inspected. If either of the two conditions are met, the query is re-routed to the primary.
count
Type: count
Mandatory: No
Dynamic: Yes
Default: 0
The number of SQL statements to route to primary after detecting a data modifying SQL statement. This feature is disabled by default.
After processing a data modifying SQL statement, a counter is set to the value of count and all statements are routed to the primary. Each statement executed after a data modifying SQL statement causes the counter to be decremented. Once the counter reaches zero, the statements are routed normally. If a new data modifying SQL statement is processed, the counter is reset to the value of count.
match, ignore
Type: regex
Mandatory: No
Dynamic: No
Default: ""
These regular expression settings control which statements trigger statement re-routing. Only non-SELECT statements are inspected. In CCRFilter, the exclude parameter is instead named ignore, but it works similarly.
match=.*INSERT.*
ignore=.*UPDATE.*
options=case,extended
options
Type: enum
Mandatory: No
Dynamic: No
Values: ignorecase
, case
, extended
Default: ignorecase
Regular expression options for match and ignore.
global
Type: boolean
Mandatory: No
Dynamic: Yes
Default: false
global is a boolean parameter that when enabled causes writes from one connection to propagate to all other connections. This can be used to work around cases where one connection writes data and another reads it, expecting the write done by the other connection to be visible.
This parameter only works with the time parameter. The use of global and count at the same time is not allowed and will be treated as an error.
Here is a minimal filter configuration for the CCRFilter which should solve most problems with critical reads after writes.
[CCRFilter]
type=filter
module=ccrfilter
time=5
With this configuration, whenever a connection does a write, all subsequent reads done by that connection will be forced to the primary for 5 seconds.
This prevents read scaling until the modifications have been replicated to the replicas. For best performance, the value of time should be slightly greater than the actual replication lag between the primary and its replicas. If the number of critical read statements is known, the count parameter could be used to control the number of reads that are sent to the primary.
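As a sketch, if it is known that at most ten critical reads follow each write, a configuration combining both conditions could look like this (the values are illustrative):

[CCRFilter]
type=filter
module=ccrfilter
time=5s
count=10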
CC BY-SA / Gnu FDL
This tutorial is a short introduction to the Avrorouter, how to set it up and how it interacts with the binlogrouter.
The first part configures the services and sets them up for the binary log to Avro file conversion. The second part of this tutorial uses the client listener interface for the avrorouter and shows how to communicate with the service over the network.
The primary server where we will be replicating from needs to have binary logging enabled, binlog_format set to row and binlog_row_image set to full. These can be enabled by adding the two following lines to the my.cnf file of the primary.
binlog_format=row
binlog_row_image=full
You can find out more about replication formats from the
We start by adding two new services into the configuration file. The first service is the binlogrouter service which will read the binary logs from the primary server. The second service will read the binlogs as they are streamed from the primary and convert them into Avro format files.
# The Replication Proxy service
[replication-service]
type=service
router=binlogrouter
server_id=4000
master_id=3000
filestem=binlog
user=maxuser
password=maxpwd
# The Avro conversion service
[avro-service]
type=service
router=avrorouter
source=replication-service
filestem=binlog
start_index=15
# The listener for the replication-service
[replication-listener]
type=listener
service=replication-service
port=3306
# The client listener for the avro-service
[avro-listener]
type=listener
service=avro-service
protocol=CDC
port=4001
The source parameter in the avro-service points to the replication-service we defined before. This service will be the data source for the avrorouter. The filestem is the prefix in the binlog files and start_index is the binlog number to start from. With these parameters, the avrorouter will start reading events from binlog binlog.000015.
Note that the filestem and start_index must point to the file that is the first binlog that the binlogrouter will replicate. For example, if the first file you are replicating is my-binlog-file.001234, set the parameters to filestem=my-binlog-file and start_index=1234.
For more information on the avrorouter options, read the Avrorouter Documentation.
Before starting the MaxScale process, we need to make sure that the binary logs of the primary server contain the DDL statements that define the table layouts. What this means is that the CREATE TABLE statements need to be in the binary logs before the conversion process is started.
If the binary logs contain data modification events for tables that aren't created in the binary logs, the Avro schema of the table needs to be manually created. There are multiple ways to do this:
Dump the database to a replica, configure it to replicate from the primary and point MaxScale to this replica (this is the recommended method as it requires no extra steps)
Use the cdc_schema Go utility and copy the generated .avsc files to the avrodir
Use the Python version of the schema generator and copy the generated .avsc files to the avrodir
If you used the schema generator scripts, all Avro schema files for tables that are not created in the binary logs need to be in the location pointed to by the avrodir parameter. The files use the following naming: <database>.<table>.<schema_version>.avsc. For example, the schema file name of the test.t1 table would be test.t1.0000001.avsc.
The next step is to start MariaDB MaxScale and set up the binlogrouter. We do that by connecting to the MySQL listener of the replication-service and executing a few commands.
CHANGE MASTER TO MASTER_HOST='172.18.0.1',
MASTER_PORT=3000,
MASTER_LOG_FILE='binlog.000015',
MASTER_LOG_POS=4,
MASTER_USER='maxuser',
MASTER_PASSWORD='maxpwd';
START SLAVE;
NOTE: GTID replication is not currently supported and file-and-position replication must be used.
This will start the replication of binary logs from the primary server at 172.18.0.1 listening on port 3000. The first file that the binlogrouter replicates is binlog.000015. This is the same file that was configured as the starting file in the avrorouter.
For more details about the SQL commands, refer to the Binlogrouter documentation.
After the binary log streaming has started, the avrorouter will automatically start processing the binlogs.
Next, create a simple test table and populate it with some data by executing the following statements.
CREATE TABLE test.t1 (id INT);
INSERT INTO test.t1 VALUES (1), (2), (3), (4), (5), (6), (7), (8), (9), (10);
To use the cdc.py command line client to connect to the CDC service, we must first create a user. This can be done via maxctrl by executing the following command.
maxctrl call command cdc add_user avro-service maxuser maxpwd
This will create the maxuser:maxpwd credentials, which can then be used to request a JSON data stream of the test.t1 table that was created earlier.
cdc.py -u maxuser -p maxpwd -h 127.0.0.1 -P 4001 test.t1
The output is a stream of JSON events describing the changes done to the database.
{"namespace": "MaxScaleChangeDataSchema.avro", "type": "record", "name": "ChangeRecord", "fields": [{"name": "domain", "type": "int"}, {"name": "server_id", "type": "int"}, {"name": "sequence", "type": "int"}, {"name": "event_number", "type": "int"}, {"name": "timestamp", "type": "int"}, {"name": "event_type", "type": {"type": "enum", "name": "EVENT_TYPES", "symbols": ["insert", "update_before", "update_after", "delete"]}}, {"name": "id", "type": "int", "real_type": "int", "length": -1}]}
{"domain": 0, "server_id": 3000, "sequence": 11, "event_number": 1, "timestamp": 1537429419, "event_type": "insert", "id": 1}
{"domain": 0, "server_id": 3000, "sequence": 11, "event_number": 2, "timestamp": 1537429419, "event_type": "insert", "id": 2}
{"domain": 0, "server_id": 3000, "sequence": 11, "event_number": 3, "timestamp": 1537429419, "event_type": "insert", "id": 3}
{"domain": 0, "server_id": 3000, "sequence": 11, "event_number": 4, "timestamp": 1537429419, "event_type": "insert", "id": 4}
{"domain": 0, "server_id": 3000, "sequence": 11, "event_number": 5, "timestamp": 1537429419, "event_type": "insert", "id": 5}
{"domain": 0, "server_id": 3000, "sequence": 11, "event_number": 6, "timestamp": 1537429419, "event_type": "insert", "id": 6}
{"domain": 0, "server_id": 3000, "sequence": 11, "event_number": 7, "timestamp": 1537429419, "event_type": "insert", "id": 7}
{"domain": 0, "server_id": 3000, "sequence": 11, "event_number": 8, "timestamp": 1537429419, "event_type": "insert", "id": 8}
{"domain": 0, "server_id": 3000, "sequence": 11, "event_number": 9, "timestamp": 1537429419, "event_type": "insert", "id": 9}
{"domain": 0, "server_id": 3000, "sequence": 11, "event_number": 10, "timestamp": 1537429419, "event_type": "insert", "id": 10}
The first record is always the JSON format schema for the table describing the types and names of the fields. All records that follow it represent the changes that have happened on the database.
CC BY-SA / Gnu FDL
Ed25519 is a highly secure authentication method based on public key cryptography. It is used with the auth_ed25519-plugin of MariaDB Server.
When a client authenticates via ed25519, MaxScale first sends them a random message. The client signs the message using their password as private key and sends the signature back. MaxScale then checks the signature using the public key fetched from the mysql.user-table. The client password or an equivalent token is never exposed. For more information, see.
The security of this authentication scheme presents a problem for a proxy such as MaxScale since MaxScale needs to log in to backend servers on behalf of the client. Since each server will generate their own random messages, MaxScale cannot simply forward the original signature. Either the real password is required, or a different authentication scheme must be used between MaxScale and backends. The MaxScale ed25519auth-plugin supports both alternatives.
To begin, add "ed25519auth" to the list of authenticators for a listener.
MaxScale will now authenticate incoming clients with ed25519 if their user account has plugin set to "ed25519" in the mysql.user-table. However, routing queries will fail since MaxScale cannot authenticate to backends. To continue, either use a mapping file or enable sha256-mode. Sha256-mode is enabled with the following settings.
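A listener section along the following lines enables sha256-mode (the service name and port are illustrative):

[Ed25519-Listener]
type=listener
service=My-Service
port=3306
authenticator=ed25519auth
authenticator_options=ed_mode=sha256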
ed_mode
This setting defines the authentication mode used. Two values are supported:
ed25519 (default): digital signature based authentication. Requires mapping for backend support.
sha256: authenticate the client with the caching_sha2_password-plugin instead. Requires either SSL or configured RSA keys.
ed_rsa_privkey_path and ed_rsa_pubkey_path
Defines the RSA keys used for encrypting the client password if SSL is not in use. Should point to files with the private and public keys.
To enable MaxScale to authenticate to backends, user mapping can be used. The mapping and backend passwords are given in a JSON file. The client can map to an identical username or to another user, and the backend authentication scheme can be something else than ed25519.
The following example maps user "alpha" to "beta", and MaxScale then uses standard authentication to log into backends as "beta". User "alpha" authenticates to MaxScale using whatever method is configured in the server. User "gamma" does not map to another user; just the password is given.
MaxScale configuration:
/home/joe/mapping.json:
The mapping-based solution requires the DBA to maintain a file with user passwords, which has security and upkeep implications. To avoid this, MaxScale can instead use the caching_sha2_password-plugin to authenticate the client. This authentication scheme transmits the client password to MaxScale in full, allowing MaxScale to log into backends using ed25519. MaxScale effectively lies to the client about its authentication plugin and then uses the correct plugin with the backends. Enable sha256-authentication by setting authentication option ed_mode to "sha256".
sha256-authentication is best used with encrypted connections. The example below shows a listener configured for sha256-mode and SSL.
If SSL is not in use, caching_sha2_password transmits the password using RSA-encryption. In this case, MaxScale needs the public and private RSA-keys. MaxScale sends the public key to the client if they don't already have it and the client uses it to encrypt the password. MaxScale then uses the private key to decrypt the password. The example below shows a listener configured for sha256-mode without SSL.
The keyfiles can be generated with OpenSSL using the following commands.
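As a sketch, a standard OpenSSL RSA key pair generation looks like the following; the file names are illustrative and should match the values of ed_rsa_privkey_path and ed_rsa_pubkey_path:

openssl genrsa -out ed_rsa_private_key.pem 2048
openssl rsa -in ed_rsa_private_key.pem -pubout -out ed_rsa_public_key.pem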
CC BY-SA / Gnu FDL
This document describes general MySQL protocol authentication in MaxScale. For REST-api authentication, see the and the.
Similar to the MariaDB Server, MaxScale uses authentication plugins to implement different authentication schemes for incoming clients. The same plugins also handle authenticating the clients to backend servers. The authentication plugins available in MaxScale are, and.
Most of the authentication processing is performed on the protocol level, before handing it over to one of the plugins. This shared part is described in this document. For information on an individual plugin, see its documentation.
Every MaxScale service with a MariaDB protocol listener requires knowledge of the user accounts defined on the backend databases. The service maintains this information in an internal component called the user account manager (UAM). The UAM queries relevant data from the mysql-database of the backends and stores it. Typically, only the current primary server is queried, as all servers are assumed to have the same users. The service settings user and password define the credentials used when fetching user accounts.
The service uses the stored data when authenticating clients, checking their passwords and database access rights. This results in an authentication process very similar to the MariaDB Server itself. Unauthorized users are generally detected already at the MaxScale level instead of the backend servers. This may not apply in some cases, for example if MaxScale is using old user account data.
If authentication fails, the UAM updates its data from a backend. MaxScale may attempt authenticating the client again with the refreshed data without communicating the first failure to the client. This transparent user data update does not always work, in which case the client should try to log in again.
As the UAM is shared between all listeners of a service, its settings are defined in the service configuration. For more information, search the for users_refresh_time, users_refresh_interval and auth_all_servers. Other settings which affect how the UAM connects to backends are the global settings auth_connect_timeout and local_address, and the various server-level ssl-settings.
To properly fetch user account information, the MaxScale service user must be able to read from various tables in the mysql-database: user, db, tables_priv, columns_priv, procs_priv, proxies_priv and roles_mapping. The user should also have the SHOW DATABASES-grant.
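As a sketch, for a service user named 'maxscale'@'maxscale-host' (the user name and host are illustrative), the grants would look like this:

GRANT SELECT ON mysql.user TO 'maxscale'@'maxscale-host';
GRANT SELECT ON mysql.db TO 'maxscale'@'maxscale-host';
GRANT SELECT ON mysql.tables_priv TO 'maxscale'@'maxscale-host';
GRANT SELECT ON mysql.columns_priv TO 'maxscale'@'maxscale-host';
GRANT SELECT ON mysql.procs_priv TO 'maxscale'@'maxscale-host';
GRANT SELECT ON mysql.proxies_priv TO 'maxscale'@'maxscale-host';
GRANT SELECT ON mysql.roles_mapping TO 'maxscale'@'maxscale-host';
GRANT SHOW DATABASES ON *.* TO 'maxscale'@'maxscale-host';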
If using MariaDB ColumnStore, the following grant is required:
When a client logs in to MaxScale, MaxScale sees the client's IP address. When MaxScale then connects the client to backends (using the client's username and password), the backends see the connection coming from the IP address of MaxScale. If the client user account uses a wildcard host ('alice'@'%'), this is not an issue. If the host is restricted ('alice'@'123.123.123.123'), authentication to backends will fail.
There are two primary ways to deal with this:
Duplicate user accounts. For every user account with a restricted hostname an equivalent user account for MaxScale is added ('alice'@'maxscale-ip').
Use .
Option 1 limits the passwords for user accounts with shared usernames. Such accounts must use the same password since they will effectively share the MaxScale-to-backend user account. Option 2 requires server support.
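For option 1, the duplicated account could be created like this (names, addresses and grants are illustrative; note that both accounts must share the same password):

CREATE USER 'alice'@'maxscale-ip' IDENTIFIED BY 'alice_password';
GRANT SELECT ON mydb.* TO 'alice'@'maxscale-ip';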
See for additional information on how to solve authentication issues.
MaxScale supports the wildcards _ and % for database-level grants. As with the MariaDB Server, grant select on test_.* to 'alice'@'%'; gives access to test_ as well as test1, test2 and so on. If the GRANT command escapes the wildcard (grant select on test\_.* to 'alice'@'%';), both MaxScale and the MariaDB Server interpret it as only allowing access to test_. _ and % are only interpreted as wildcards when the grant is to a database: grant select on test_.t1 to 'alice'@'%'; only grants access to the test_.t1 table, not to test1.t1.
The listener configuration defines authentication options which only affect the listener. authenticator defines the authentication plugins to use. authenticator_options sets various options. These options may affect an individual authentication plugin or the authentication as a whole. The latter are explained below. Multiple options can be given as a comma-separated list.
skip_authentication
Type: boolean
Mandatory: No
Dynamic: No
Default: false
If enabled, MaxScale will not check the passwords of incoming clients and just assumes that they are correct. Wrong passwords are instead detected when MaxScale tries to authenticate to the backend servers.
This setting is mainly meant for failure tolerance in situations where the password check is performed outside of MaxScale. If, for example, MaxScale cannot use an LDAP-server but the backend databases can, enabling this setting allows clients to log in. Even with this setting enabled, a user account matching the incoming client username and IP must exist on the backends for MaxScale to accept the client.
This setting is incompatible with standard MariaDB/MySQL authentication plugin (MariaDBAuth in MaxScale). If enabled, MaxScale cannot authenticate clients to backend servers using standard authentication.
match_host
Type: boolean
Mandatory: No
Dynamic: No
Default: true
If disabled, MaxScale does not require that a valid user account entry for incoming clients exists on the backends. Specifically, only the client username needs to match a user account; the hostname/IP is ignored.
This setting may be used to force clients to connect through MaxScale. Normally, creating the user jdoe@% will allow the user jdoe to connect from any IP-address. By disabling match_host and replacing the user with jdoe@maxscale-IP, the user can still connect from any client IP but will be forced to go through MaxScale.
lower_case_table_names
Type: number
Mandatory: No
Dynamic: No
Default: 0
Controls database name matching for authentication when an incoming client logs in to a non-empty database. The setting functions similarly to the MariaDB Server setting and should be set to the value used by the backends.
The setting accepts the values 0, 1 or 2:
0: case-sensitive matching (default)
1: convert the requested database name to lower case before using case-insensitive matching. Assumes that database names on the server are stored in lower case.
2: use case-insensitive matching.
true and false are also accepted for backwards compatibility. These map to 1 and 0, respectively.
The identifier names are converted using an ASCII-only function. This means that non-ASCII characters will retain their case-sensitivity.
Starting with MaxScale versions 2.5.25, 6.4.6, 22.08.5 and 23.02.2, the behavior of lower_case_table_names=1 is identical to how the MariaDB server behaves. In older releases the comparisons were done in a case-sensitive manner after the requested database name was converted into lowercase. Using lower_case_table_names=2 will behave identically in all versions, which makes it a safe alternative to use when a mix of older and newer MaxScale versions is being used.
CC BY-SA / Gnu FDL
This document is designed as a quick introduction to setting up MariaDB MaxScale.
The installation and configuration of the MariaDB Server is not covered in this document. See the following MariaDB documentation articles for more information on setting up a primary-replica-cluster or a Galera-cluster: and .
This tutorial assumes that one of the standard MaxScale binary distributions is used and that MaxScale is installed using default options.
Building from source code in GitHub is covered in.
The precise installation process varies from one distribution to another. Details on package installation can be found in the.
MaxScale checks that incoming clients are valid. To do this, MaxScale needs to retrieve user authentication information from the backend databases. Create a special user account for this purpose by executing the following SQL commands on the primary server of your database cluster. The following tutorials will use these credentials.
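A sketch of what these commands could look like, using the credentials assumed by these tutorials and the grants listed in the authentication documentation:

CREATE USER 'maxscale'@'%' IDENTIFIED BY 'maxscale_pw';
GRANT SELECT ON mysql.user TO 'maxscale'@'%';
GRANT SELECT ON mysql.db TO 'maxscale'@'%';
GRANT SELECT ON mysql.tables_priv TO 'maxscale'@'%';
GRANT SELECT ON mysql.columns_priv TO 'maxscale'@'%';
GRANT SELECT ON mysql.procs_priv TO 'maxscale'@'%';
GRANT SELECT ON mysql.proxies_priv TO 'maxscale'@'%';
GRANT SELECT ON mysql.roles_mapping TO 'maxscale'@'%';
GRANT SHOW DATABASES ON *.* TO 'maxscale'@'%';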
MariaDB versions 10.2.2 to 10.2.10 also require GRANT SELECT ON mysql.* TO 'maxscale'@'%';
Because MariaDB MaxScale sits between the clients and the backend databases, the backend databases will see all clients as if they were connecting from MaxScale's address. This usually means that two sets of grants for each user are required.
For example, assume that the user 'jdoe'@'client-host' exists and MaxScale is located at maxscale-host. If 'jdoe'@'client-host' needs to be able to connect through MaxScale, another user, 'jdoe'@'maxscale-host', must be created. The second user must have the same password and similar grants as 'jdoe'@'client-host'.
The quickest way to do this is to first create the new user:
Then do a SHOW GRANTS query:
Then copy the same grants to the 'jdoe'@'maxscale-host' user.
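As a sketch, assuming 'jdoe'@'client-host' only has SELECT on db1 (the grant shown is illustrative):

CREATE USER 'jdoe'@'maxscale-host' IDENTIFIED BY 'jdoes_password';
SHOW GRANTS FOR 'jdoe'@'client-host';
GRANT SELECT ON db1.* TO 'jdoe'@'maxscale-host';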
An alternative to generating two separate accounts is to use one account with a wildcard host ('jdoe'@'%') which covers both hosts. This is more convenient but less secure than having specific user accounts as it allows access from all hosts.
MaxScale reads its configuration from /etc/maxscale.cnf. A template configuration is provided with the MaxScale installation.
A global maxscale section is included in every MaxScale configuration file. This section sets the values of various global parameters, such as the number of threads MaxScale uses to handle client requests. To set the thread count to the number of available CPU cores, set the following.
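The setting in question is threads; setting it to auto makes MaxScale use as many threads as there are CPU cores:

[maxscale]
threads=auto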
Read the mini-tutorial for server configuration instructions.
The type of monitor used depends on the type of cluster used. For a primary-replica cluster read. For a Galera cluster read.
This part is covered in two different tutorials. For a fully automated read-write-splitting setup, read the. For a simple connection based setup, read the.
After configuration is complete, MariaDB MaxScale is ready to start. For systems that use systemd, use the systemctl command.
For older SysV systems, use the service command.
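For example (standard service management commands):

# systemd
sudo systemctl start maxscale
# SysV
sudo service maxscale start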
If MaxScale fails to start, check the error log in /var/log/maxscale/maxscale.log to see if any errors are detected in the configuration file.
The maxctrl-command can be used to confirm that MaxScale is running and the services, listeners and servers have been correctly configured. The following shows expected output when using a read-write-splitting configuration.
MariaDB MaxScale is now ready to start accepting client connections and route queries to the backend cluster.
More options can be found in the, and.
For more information about MaxCtrl and how to secure it, see the.
CC BY-SA / Gnu FDL
SmartRouter is the query router of the SmartQuery framework. Based on the type of the query, each query is routed to the server or cluster that can best handle it.
For workloads where both transactional and analytical queries are needed, SmartRouter unites the Transactional (OLTP) and Analytical (OLAP) workloads into a single entry point in MaxScale. This allows a MaxScale client to freely mix transactional and analytical queries using the same connection. This is known as Hybrid Transactional and Analytical Processing, HTAP.
SmartRouter is configured as a service that either routes to other MaxScale routers or plain servers. Although one can configure SmartRouter to use a plain server directly, we refer to the configured "servers" as clusters.
For details about the standard service parameters, refer to the.
master
Type: target
Mandatory: Yes
Dynamic: No
One of the clusters must be designated as the master. All writes go to the primary cluster, which for all practical purposes should be a primary-replica ReadWriteSplit. This document does not go into details about setting up primary-replica clusters, but suffice it to say that when setting up the ColumnStore servers they should be configured to be replicas of a MariaDB server running an InnoDB engine.
The ReadWriteSplit documentation has more on primary-replica setup.
Example
Suppose we have a Transactional service like
for which we have defined the listener
That is, that service can be accessed using the socket /tmp/rws-row.sock.
The Analytical service could look like this
Then we can define the SmartQuery service as follows
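As a sketch (the service names, port and credentials are illustrative assumptions), the SmartQuery service and its listener could look like this:

[SmartQuery]
type=service
router=smartrouter
targets=RWS-Row, RWS-Column
master=RWS-Row
user=maxuser
password=maxpwd

[SmartQuery-Listener]
type=listener
service=SmartQuery
port=4009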
Note that the SmartQuery listener listens on a port, while the Row and Column service listeners listen on Unix domain sockets. The reason is that there is a significant performance benefit when SmartRouter accesses the services over a Unix domain socket compared to accessing them over a TCP/IP socket.
A complete configuration example can be found at the end of this document.
SmartRouter keeps track of the performance, or the execution time, of queries to the clusters. Measurements are stored with the canonical form of a query as the key. The canonical form of a query is the SQL with all user-defined constants replaced with question marks. When SmartRouter sees a read-query whose canonical form has not been seen before, it will send the query to all clusters. The first response from a cluster will designate that cluster as the best one for that canonical form. Also, when the first response is received, the other queries are cancelled. The response is sent to the client once all clusters have responded to the query or the cancel.
There is obviously overhead when a new canonical is seen. This means that queries after a MaxScale start will be slightly slower than normal. The execution time of a query depends on the database engine, and on the contents of the tables being queried. As a result, MaxScale will periodically re-measure queries.
The performance behavior of queries under dynamic conditions, and their effect on different storage engines is being studied at MariaDB. As we learn more, we will be able to better categorize queries and move that knowledge into SmartRouter.
LOAD DATA LOCAL INFILE is not supported.
The performance data is not persisted. The measurements will be performed anew after each startup.
CC BY-SA / Gnu FDL
The filter mechanism in MariaDB MaxScale is a means by which processing can be inserted into the flow of requests and responses between the client connection to MariaDB MaxScale and the MariaDB MaxScale connection to the backend database servers. The path from the client side of MariaDB MaxScale out to the actual database servers can be considered a pipeline; filters can then be placed in that pipeline to monitor, modify, copy or block the content that flows through it.
Filters can be divided into a number of categories:
Logging filters do not in any way alter the statement or results of the statements that are passed through MariaDB MaxScale. They merely log some information about some or all of the statements and/or result sets.
Two examples of logging filters are included with MariaDB MaxScale: a filter that will log all statements, and another that will log only a number of statements, based on the duration of the execution of the query.
Statement rewriting filters modify the statements that are passed through the filter. This allows a filter to be used as a mechanism to alter the statements that are seen by the database, an example of the use of this might be to allow an application to remain unchanged when the underlying database changes or to compensate for the migration from one database schema to another.
A result set manipulation filter is very similar to a statement rewriting filter but applies to the result set returned rather than the statement executed. An example of this may be obfuscating the values in a column.
Routing hint filters are filters that embed hints in the request that can be used by the router onto which the query is passed. These hints include suggested destinations as well as metrics that may be used by the routing process.
A firewall filter is a mechanism that allows queries to be blocked within MariaDB MaxScale before they are sent on to the database server for execution. They allow constructs or individual queries to be intercepted and give a level of access control that is more flexible than the traditional database grant mechanism.
A pipeline filter is one that has an effect on how the requests are routed within the internal MariaDB MaxScale components. The most obvious version of this is the ability to add a "tee" connector in the pipeline, duplicating the request and sending it to a second MariaDB MaxScale service for processing.
Filters are defined in the configuration file, typically maxscale.cnf, using a section for each filter instance. The content of the filter sections in the configuration file varies from filter to filter; however, there are always two entries present for every filter: the type and the module.
The type is used by the configuration manager within MariaDB MaxScale to determine what this section is defining and the module is the name of the plugin that implements the filter.
When a filter is used within a service in MariaDB MaxScale the entry filters= is added to the service definition in the ini file section for the service. Multiple filters can be defined using a syntax akin to the Linux shell pipe syntax.
The names used in the filters= parameter are the names of the filter definition sections in the ini file. The same filter definition can be used in multiple services and the same filter module can have multiple instances, each with its own section in the ini file.
The filters that are bundled with MariaDB MaxScale are documented separately; in this section a short overview of how they might be used for some simple tasks will be given. These are just examples of how these filters might be used; other filters may also easily be added to enhance the MariaDB MaxScale functionality still further.
The top filter can be used to measure the execution time of every statement within a connection and log the details of the longest running statements.
The first thing to do is to define a filter entry in the ini file for the top filter. In this case we will call it "top30". The type is filter and the module that implements the filter is called topfilter.
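A definition along these lines (the log file path is an illustrative assumption):

[top30]
type=filter
module=topfilter
count=30
filebase=/var/log/maxscale/top30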
In the definition above we have defined two filter specific parameters: the count of the number of statements to be logged, and a filebase that is used to define where to log the information. This filename is a stem to which a session id is added for each new connection that uses the filter.
The filter keeps track of every statement that is executed, monitors the time it takes for a response to come back and uses this as the measure of execution time for the statement. If the time is longer than that of the other statements that have been recorded, the statement is added to the ordered list within the filter. Once 30 statements have been recorded, the statements that have been recorded with the shortest times are discarded from the list. The result is that at any time the filter has a list of the 30 longest running statements in each session.
When the session ends a report will be written for the session into the logfile defined. That report will include the top 30 longest running statements, plus summary data for the session:
The time the connection was opened.
The host the connection was from.
The username used in the connection.
The duration of the connection.
The total number of statements executed in the connection.
The average execution time for a statement in this connection.
The scenario we are using in this example is one in which you have an online gaming application that is designed to work with a MariaDB database. The database schema includes a high score table which you would like to have access to in a Cassandra cluster. The application is already using MariaDB MaxScale to connect to a MariaDB Galera cluster, using a service named BubbleGame. The definition of that service is as follows
The table you wish to store in Cassandra is called HighScore and will contain the same columns in both the MariaDB table and the Cassandra table. The first step is to install a MariaDB instance with the Cassandra storage engine to act as a bridge server between the relational database and Cassandra. In this bridge server, add a table definition for the HighScore table with the engine type set to Cassandra. See for details. Add this server into the MariaDB MaxScale configuration and create a service that will connect to this server.
Next add a filter definition for the tee filter that will duplicate insert statements destined for the HighScore table and send them to this new service.
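Such a definition could look like the following sketch (the section name is illustrative; the match pattern is the one discussed below):

[HighScores]
type=filter
module=tee
service=Cassandra
match=insert.*HighScore.*values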
The above filter definition will cause all statements that match the regular expression insert.*HighScore.*values to be duplicated and sent not just to the original destination, via the router, but also to the service named Cassandra.
The final step is to add the filter to the BubbleGame service to enable the use of the filter.
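Assuming the filter definition above, this amounts to adding a filters entry to the service section (other parameters omitted):

[BubbleGame]
type=service
...
filters=HighScores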
CC BY-SA / Gnu FDL
This document provides an overview of the readconnroute router module and its intended use case scenarios. It also displays all router configuration parameters with their descriptions.
The readconnroute router provides simple and lightweight load balancing across a set of servers.
Note that readconnroute balances connections and not statements. When a client connects, the router selects the server that matches the value of router_options and has the least number of connections. Once the connection is opened, it will not be changed for the duration of the session. If the connection between MaxScale and the server breaks, the connection cannot be re-established and the client session will be closed. The fact that the server is fixed when the client connects also means that routing hints are ignored.
Connections from other MaxScale instances or connections made directly to a database are not taken into account; only connections made through the same MaxScale instance are counted.
Warning: readconnroute will not prevent writes from being done even if you define router_options=slave. The client application is responsible for making sure that it only performs read-only queries in such cases. readconnroute is simple by design: it selects a server for each client connection and routes all queries there. If something more complex is required, the readwritesplit router is usually the right choice.
For more details about the standard service parameters, refer to the Configuration Guide.
router_options
Type: enum
Mandatory: No
Dynamic: Yes
Values: master, slave, synced, running
Default: running
router_options can contain a comma-separated list of valid server roles. These roles are used as the valid types of servers the router will form connections to when new sessions are created.
Examples:
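router_options=slave
router_options=master,slave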
A list of all possible values for router_options is given below.
If no router_options parameter is configured in the service definition, the router will use the default value of running. This means that it will load balance connections across all running servers defined in the servers parameter of the service.
When a connection is being created and the candidate server is being chosen, the list of servers is processed from the first entry to the last. This means that if two servers have equal rank and an equal number of connections, the one listed first in the servers parameter of the service is chosen.
When using router_options=slave, only servers with the Slave status are used. If there are no servers with the Slave status but there is a server with the Master status, it will be used as a fallback server. Note that the use of router_options=slave does not prevent writes from being done; the client application is responsible for making sure that no writes are done on a Slave server.
master_accept_reads
Type: boolean
Mandatory: No
Dynamic: Yes
Default: true
This option can be used to prevent queries from being sent to the current primary.
If router_options does not contain master, the readconnroute instance is usually meant for reading. Setting master_accept_reads=false excludes the primary from server selection (and thus from receiving reads). If router_options contains master, the setting of master_accept_reads has no effect. By default, master_accept_reads=true.
max_replication_lag
Type: duration
Mandatory: No
Dynamic: Yes
Default: 0s
The maximum acceptable replication lag. The value is specified as a duration, with a resolution of seconds. The default value is 0s, which means that the lag is ignored. The replication lag of a server must be less than the configured value in order for it to be used for routing. To configure the router to not allow any lag, use the smallest duration larger than 0, that is, max_replication_lag=1s.
The most common use of readconnroute is to provide either a read or a write port for an application. This is a more lightweight routing solution than the more complex readwritesplit router, but it requires the application to be able to use distinct write and read ports.
To configure a read-only service that tolerates primary failures, we first need to add a new section to the configuration file:
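[Read-Service]
type=service
router=readconnroute
servers=replica1,replica2,replica3
router_options=slave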
Here the router_options parameter designates replicas as the only valid server type. With this configuration, queries are load balanced across the replica servers.
For more complex examples of the readconnroute router, take a look at the examples in the Tutorials folder.
The router_diagnostics output for readconnroute has the following fields.
queries: Number of queries executed through this service.
Sending of binary data with LOAD DATA LOCAL INFILE is not supported.
The router will never reconnect to the server it initially connected to.
CC BY-SA / Gnu FDL
MaxGUI is a browser-based interface for the MaxScale REST API and query execution.
To enable MaxGUI in a testing mode, add admin_host=0.0.0.0 and admin_secure_gui=false under the [maxscale] section of the MaxScale configuration file. Once enabled, MaxGUI will be available on port 8989: http://127.0.0.1:8989/
To make MaxGUI secure, set admin_secure_gui=true and configure both the admin_ssl_key and admin_ssl_cert parameters.
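A minimal sketch of the relevant settings (the key and certificate paths are placeholders):
[maxscale]
# Testing mode: GUI served over plain HTTP on all interfaces
admin_host=0.0.0.0
admin_secure_gui=false
# Secure mode instead: serve the GUI over HTTPS
# admin_secure_gui=true
# admin_ssl_key=/certs/server-key.pem
# admin_ssl_cert=/certs/server-cert.pem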
See the relevant security documentation for instructions on how to harden your MaxScale installation for production use.
MaxGUI uses the same credentials as maxctrl. The default username is admin with mariadb as the password.
Internally, MaxGUI uses JSON Web Tokens as the authentication method for persisting the user's session. If the Remember me checkbox is ticked, the session will persist for 24 hours. Otherwise, the session expires as soon as MaxGUI is closed.
To log out, simply click the username section in the top right corner of the page header to access the logout menu.
This page provides an overview of MaxScale configuration which includes Monitors, Servers, Services, Sessions, Listeners, and Filters.
By default, the refresh interval is 10 seconds.
This page shows information on each MaxScale object and allows editing of its parameters and relationships, as well as performing other manipulation operations.
Access this page by clicking on the MaxScale object name on the dashboard.
This page visualizes MaxScale configuration and clusters.
Configuration: Visualizes the MaxScale configuration.
Cluster: Visualizes a replication cluster as a tree graph and provides manual cluster manipulation operations such as switchover, reset-replication, release-locks, failover and rejoin. At the moment, it supports only servers monitored by a monitor using the mariadbmon module.
Access this page by clicking the graph icon on the sidebar navigation.
This page shows and allows editing of MaxScale parameters.
Access this page by clicking the gear icon on the sidebar navigation.
Realtime MaxScale logs can be accessed by clicking the logs icon on the sidebar navigation.
The "Workspace" page offers a versatile set of tools for effectively managing data and database interactions. It includes the following key tasks:
Execute queries on various servers, services, or listeners to retrieve data and perform database operations. Visualize query results using different graph types such as line, bar, or scatter graphs. Export query results in formats like CSV or JSON for further analysis and sharing.
The "Data Migration" feature facilitates seamless transitions from PostgreSQL to MariaDB. Transfer data and database structures between the two systems while ensuring data integrity and consistency throughout the process.
Generate Entity-Relationship Diagrams (ERDs) to gain insights into the data structure, optimizing database design for both efficiency and clarity.
CC BY-SA / Gnu FDL
CREATE USER 'maxscale'@'maxscalehost' IDENTIFIED BY 'maxscale-password';
GRANT SELECT ON mysql.user TO 'maxscale'@'maxscalehost';
GRANT SELECT ON mysql.db TO 'maxscale'@'maxscalehost';
GRANT SELECT ON mysql.tables_priv TO 'maxscale'@'maxscalehost';
GRANT SELECT ON mysql.columns_priv TO 'maxscale'@'maxscalehost';
GRANT SELECT ON mysql.procs_priv TO 'maxscale'@'maxscalehost';
GRANT SELECT ON mysql.proxies_priv TO 'maxscale'@'maxscalehost';
GRANT SELECT ON mysql.roles_mapping TO 'maxscale'@'maxscalehost';
GRANT SHOW DATABASES ON *.* TO 'maxscale'@'maxscalehost';
GRANT ALL ON infinidb_vtable.* TO 'maxscale'@'maxscalehost';
authenticator_options=skip_authentication=true,lower_case_table_names=1
authenticator_options=skip_authentication=true
authenticator_options=match_host=false
authenticator_options=lower_case_table_names=0
[RWS-Row]
type=service
router=readwritesplit
servers = row_server_1, row_server_2, ...
[RWS-Row-Listener]
type=listener
service=RWS-Row
socket=/tmp/rws-row.sock
[RWS-Column]
type = service
router = readwritesplit
servers = column_server_1, column_server_2, ...
[RWS-Column-Listener]
type = listener
service = RWS-Column
socket = /tmp/rws-col.sock
[SmartQuery]
type = service
router = smartrouter
targets = RWS-Row, RWS-Column
master = RWS-Row
[SmartQuery-Listener]
type = listener
service = SmartQuery
port = <port>
[maxscale]
[row_server_1]
type = server
address = <ip>
port = <port>
[row_server_2]
type = server
address = <ip>
port = <port>
[Row-Monitor]
type = monitor
module = mariadbmon
servers = row_server_1, row_server_2
user = <user>
password = <password>
monitor_interval = 2000ms
[column_server_1]
type = server
address = <ip>
port = <port>
[Column-Monitor]
type = monitor
module = csmon
servers = column_server_1
user = <user>
password = <password>
monitor_interval = 2000ms
# Row Read write split
[RWS-Row]
type = service
router = readwritesplit
servers = row_server_1, row_server_2
user = <user>
password = <password>
[RWS-Row-Listener]
type = listener
service = RWS-Row
socket = /tmp/rws-row.sock
# Columnstore Read write split
[RWS-Column]
type = service
router = readwritesplit
servers = column_server_1
user = <user>
password = <password>
[RWS-Column-Listener]
type = listener
service = RWS-Column
socket = /tmp/rws-col.sock
# Smart Query router
[SmartQuery]
type = service
router = smartrouter
targets = RWS-Row, RWS-Column
master = RWS-Row
user = <user>
password = <password>
[SmartQuery-Listener]
type = listener
service = SmartQuery
port = <port>
[MyFilter]
type=filter
module=xxxfilter
[Split-Service]
type=service
router=readwritesplit
servers=dbserver1,dbserver2,dbserver3,dbserver4
user=massi
password=6628C50E07CCE1F0392EDEEB9D1203F3
filters=hints | top10
[top30]
type=filter
module=topfilter
count=30
filebase=/var/log/DBSessions/top30
[BubbleGame]
type=service
router=readwritesplit
servers=dbbubble1,dbbubble2,dbbubble3,dbbubble4,dbbubble5
user=maxscale
password=6628C50E07CCE1F0392EDEEB9D1203F3
[CassandraDB]
type=server
address=192.168.4.28
port=3306
[Cassandra]
type=service
router=readconnroute
router_options=running
servers=CassandraDB
user=maxscale
password=6628C50E07CCE1F0392EDEEB9D1203F3
[HighScores]
type=filter
module=teefilter
match=insert.*HighScore.*values
service=Cassandra
[BubbleGame]
type=service
router=readwritesplit
servers=dbbubble1,dbbubble2,dbbubble3,dbbubble4,dbbubble5
user=maxscale
password=6628C50E07CCE1F0392EDEEB9D1203F3
filters=HighScores
router_options=slave
router_options=master,slave
master
A server assigned as a primary by one of MariaDB MaxScale monitors. Depending on the monitor implementation, this could be a primary server of a Primary-Replica replication cluster or a Write-Primary of a Galera cluster.
slave
A server assigned as a replica of a primary. If all replicas are down, but the primary is still available, then the router will use the primary.
synced
A Galera cluster node which is in a synced state with the cluster.
running
A server that is up and running. All servers that MariaDB MaxScale can connect to are labeled as running.
[Read-Service]
type=service
router=readconnroute
servers=replica1,replica2,replica3
router_options=slave
The Regex filter is a filter module for MariaDB MaxScale that is able to rewrite query content using regular expression matches and text substitution. The regular expressions use the PCRE2 syntax.
The PCRE2 library uses a different syntax than POSIX to refer to capture groups in the replacement string. The main difference is the use of the dollar character instead of the backslash character for references, e.g. $1 instead of \1. For more details about the replacement string differences, please read the Creating a new string with substitutions chapter in the PCRE2 manual.
The following demonstrates a minimal configuration.
[MyRegexFilter]
type=filter
module=regexfilter
match=some string
replace=replacement string
[MyService]
type=service
router=readconnroute
servers=server1
user=myuser
password=mypasswd
filters=MyRegexFilter
The Regex filter has two mandatory parameters: match and replace.
match
Type: regex
Mandatory: Yes
Dynamic: Yes
Defines the text in the SQL statements that is replaced.
match=TYPE[ ]*=
options=case
options
Type: enum
Mandatory: No
Dynamic: Yes
Values: ignorecase, case, extended
Default: ignorecase
The options parameter affects how the match pattern is compiled, in the same way as for other regular expression parameters.
replace
Type: string
Mandatory: Yes
Dynamic: Yes
This is the text that should replace the part of the SQL query matching the pattern defined in match.
replace=ENGINE =
source
Type: string
Mandatory: No
Dynamic: Yes
Default: None
The optional source parameter defines an address that is used to match against the address from which the client connection to MariaDB MaxScale originates. Only sessions that originate from this address will have the match and replacement applied to them.
source=127.0.0.1
user
Type: string
Mandatory: No
Dynamic: Yes
Default: None
The optional user parameter defines a username that is used to match against the user from which the client connection to MariaDB MaxScale originates. Only sessions that are connected using this username will have the match and replacement applied to them.
user=john
log_file
Type: string
Mandatory: No
Dynamic: Yes
Default: None
The optional log_file parameter defines a log file in which the filter writes all queries that are not matched and matching queries with their replacement queries. All sessions will log to this file so this should only be used for diagnostic purposes.
log_file=/tmp/regexfilter.log
log_trace
Type: string
Mandatory: No
Dynamic: Yes
Default: None
The optional log_trace parameter toggles the logging of non-matching and matching queries with their replacements into the log file on the info level. This is the preferred method of diagnosing the matching of queries since the log level can be changed at runtime. For more details about logging levels and session specific logging, please read the Configuration Guide.
log_trace=true
MySQL 5.1 used the parameter TYPE = to set the storage engine that should be used for a table. In later versions this changed to ENGINE =. Imagine you have an application that you cannot change for some reason, but you wish to migrate to a newer version of MySQL. The regexfilter can be used to transform the CREATE TABLE statements into the form that MySQL 5.5 expects:
[CreateTableFilter]
type=filter
module=regexfilter
options=ignorecase
match=TYPE\s*=
replace=ENGINE=
[MyService]
type=service
router=readconnroute
servers=server1
user=myuser
password=mypasswd
filters=CreateTableFilter
CC BY-SA / Gnu FDL
Sharding is the method of splitting a single logical database server into separate physical databases. This tutorial describes a very simple way of sharding. Each schema is located on a different database server and MariaDB MaxScale's schemarouter module is used to combine them into a single logical database server.
This tutorial was written for Ubuntu 22.04, MaxScale 23.08 and MariaDB 10.11. In addition to the MaxScale server, you'll need two MariaDB servers which will be used for the sharding. The installation of MariaDB is not covered by this tutorial.
The easiest way to install MaxScale is to use the MariaDB repositories.
# Install MaxScale
apt update
apt -y install sudo curl
curl -LsS https://r.mariadb.com/downloads/mariadb_repo_setup | sudo bash
apt -y install maxscale
For the sake of brevity and backwards compatibility, this tutorial uses a broader set of grants than is required. For the minimal set of grants, refer to the MaxScale Configuration Guide.
All MaxScale configurations require at least two accounts: one for reading authentication data and another for monitoring the state of the database. Services will use the first one and monitors will use the second one. In addition to this, we want to have a separate account that our application will use.
-- Create the user for the service
-- https://mariadb.com/kb/en/mariadb-maxscale-2308-authentication-modules/#required-grants
CREATE USER 'service_user'@'%' IDENTIFIED BY 'secret';
GRANT SELECT ON mysql.* TO 'service_user'@'%';
GRANT SHOW DATABASES ON *.* TO 'service_user'@'%';
-- Create the user for the monitor
-- https://mariadb.com/kb/en/mariadb-maxscale-2308-galera-monitor/#required-grants
CREATE USER 'monitor_user'@'%' IDENTIFIED BY 'secret';
GRANT REPLICATION CLIENT ON *.* TO 'monitor_user'@'%';
-- Create the application user
-- https://mariadb.com/kb/en/mariadb-maxscale-2308-authentication-modules/#limitations-and-troubleshooting
CREATE USER app_user@'%' IDENTIFIED BY 'secret';
GRANT SELECT, INSERT, UPDATE, DELETE ON *.* TO app_user@'%';
All of the users must be created on both of the MariaDB servers.
Each server will hold one unique schema which contains the data of one specific customer. We'll also create a shared schema, present on all shards, that the shard-local tables can be joined against.
Create the tables on the first server:
CREATE DATABASE IF NOT EXISTS customer_01;
CREATE TABLE IF NOT EXISTS customer_01.accounts(id INT, account_type INT, account_name VARCHAR(255));
INSERT INTO customer_01.accounts VALUES (1, 1, 'foo');
-- The shared schema that's on all shards
CREATE DATABASE IF NOT EXISTS shared_info;
CREATE TABLE IF NOT EXISTS shared_info.account_types(account_type INT, type_name VARCHAR(255));
INSERT INTO shared_info.account_types VALUES (1, 'admin'), (2, 'user');
Create the tables on the second server:
CREATE DATABASE IF NOT EXISTS customer_02;
CREATE TABLE IF NOT EXISTS customer_02.accounts(id INT, account_type INT, account_name VARCHAR(255));
INSERT INTO customer_02.accounts VALUES (2, 2, 'bar');
-- The shared schema that's on all shards
CREATE DATABASE IF NOT EXISTS shared_info;
CREATE TABLE IF NOT EXISTS shared_info.account_types(account_type INT, type_name VARCHAR(255));
INSERT INTO shared_info.account_types VALUES (1, 'admin'), (2, 'user');
The MaxScale configuration is stored in /etc/maxscale.cnf.
First, we configure the two servers we will use to shard our database. The db-01 server has the customer_01 schema and the db-02 server has the customer_02 schema.
[db-01]
type=server
address=192.168.0.102
port=3306
[db-02]
type=server
address=192.168.0.103
port=3306
The next step is to configure the service which the users connect to. This section defines which router to use, which servers to connect to and the credentials to use. For sharding, we use the schemarouter router and the service_user credentials we defined earlier. By default the schemarouter warns if two or more nodes have duplicate schemas, so we need to ignore them with ignore_tables_regex=.*.
[Sharded-Service]
type=service
router=schemarouter
targets=db-02,db-01
user=service_user
password=secret
ignore_tables_regex=.*
After this we configure a listener for the service. The listener is the actual port that the user connects to. We will use the port 4000.
[Sharded-Service-Listener]
type=listener
service=Sharded-Service
port=4000
The final step is to configure a monitor which will monitor the state of the servers. The monitor will notify MariaDB MaxScale if the servers are down. We add the two servers to the monitor and use the monitor_user credentials. For the sharding use-case, the galeramon module is suitable even though we're not using a Galera cluster: the schemarouter is only interested in whether the server is in the Running state or the Down state.
[Shard-Monitor]
type=monitor
module=galeramon
servers=db-02,db-01
user=monitor_user
password=secret
After this we have a fully working configuration and the contents of /etc/maxscale.cnf should look like this.
[db-01]
type=server
address=192.168.0.102
port=3306
[db-02]
type=server
address=192.168.0.103
port=3306
[Sharded-Service]
type=service
router=schemarouter
targets=db-02,db-01
user=service_user
password=secret
ignore_tables_regex=.*
[Sharded-Service-Listener]
type=listener
service=Sharded-Service
protocol=MariaDBClient
port=4000
[Shard-Monitor]
type=monitor
module=galeramon
servers=db-02,db-01
user=monitor_user
password=secret
Then you're ready to start MaxScale.
systemctl start maxscale.service
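To verify that MaxScale started correctly and sees both shard servers, you can list them with maxctrl (the exact output will vary):
sudo maxctrl list servers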
MariaDB MaxScale is now ready to start accepting client connections and routing them. Queries are routed to the right servers based on the database they target and switching between the shards is seamless since MariaDB MaxScale keeps the session state intact between servers.
To test, we query the schema that's located on the local shard and join it to the shared table.
$ mariadb -A -u app_user -psecret -h 127.0.0.1 -P 4000
Welcome to the MariaDB monitor. Commands end with ; or \g.
Your MariaDB connection id is 3
Server version: 10.11.7-MariaDB-1:10.11.7+maria~ubu2004-log mariadb.org binary distribution
Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MariaDB [(none)]> USE customer_01;
Database changed
MariaDB [customer_01]> SELECT c.account_name, c.account_type, s.type_name FROM accounts c
-> JOIN shared_info.account_types s ON (c.account_type = s.account_type);
+--------------+--------------+-----------+
| account_name | account_type | type_name |
+--------------+--------------+-----------+
| foo | 1 | admin |
+--------------+--------------+-----------+
1 row in set (0.001 sec)
MariaDB [customer_01]> USE customer_02;
Database changed
MariaDB [customer_02]> SELECT c.account_name, c.account_type, s.type_name FROM accounts c
-> JOIN shared_info.account_types s ON (c.account_type = s.account_type);
+--------------+--------------+-----------+
| account_name | account_type | type_name |
+--------------+--------------+-----------+
| bar | 2 | user |
+--------------+--------------+-----------+
1 row in set (0.000 sec)
The sharding also works even if no default database is selected.
MariaDB [(none)]> SELECT c.account_name, c.account_type, s.type_name FROM customer_01.accounts c
-> JOIN shared_info.account_types s ON (c.account_type = s.account_type);
+--------------+--------------+-----------+
| account_name | account_type | type_name |
+--------------+--------------+-----------+
| foo | 1 | admin |
+--------------+--------------+-----------+
1 row in set (0.001 sec)
MariaDB [(none)]> SELECT c.account_name, c.account_type, s.type_name FROM customer_02.accounts c
-> JOIN shared_info.account_types s ON (c.account_type = s.account_type);
+--------------+--------------+-----------+
| account_name | account_type | type_name |
+--------------+--------------+-----------+
| bar | 2 | user |
+--------------+--------------+-----------+
1 row in set (0.001 sec)
One limitation of this sort of simple sharding is that cross-shard joins are not possible.
MariaDB [(none)]> SELECT * FROM customer_01.accounts UNION SELECT * FROM customer_02.accounts;
ERROR 1146 (42S02): Table 'customer_01.accounts' doesn't exist
MariaDB [(none)]> USE customer_01;
Database changed
MariaDB [customer_01]> SELECT * FROM customer_01.accounts UNION SELECT * FROM customer_02.accounts;
ERROR 1146 (42S02): Table 'customer_02.accounts' doesn't exist
MariaDB [customer_01]> USE customer_02;
Database changed
MariaDB [customer_02]> SELECT * FROM customer_01.accounts UNION SELECT * FROM customer_02.accounts;
ERROR 1146 (42S02): Table 'customer_01.accounts' doesn't exist
In most multi-tenant situations, this is an acceptable limitation. If you do need cross-shard joins, the Spider storage engine provides this functionality.
CC BY-SA / Gnu FDL
The luafilter is a filter that calls a set of functions in a Lua script.
Read the Lua language documentation for information on how to write Lua scripts.
Note: This module is experimental and must be built from source. The module is deprecated in MaxScale 23.08 and might be removed in a future release.
The luafilter has two parameters. They control which scripts will be called by the filter. Both parameters are optional but at least one should be defined. If both global_script and session_script are defined, the entry points in both scripts will be called.
global_script
The global Lua script. The parameter value is a path to a readable Lua script which will be executed.
This script will always be called with the same global Lua state and it can be used to build a global view of the whole service.
session_script
The session level Lua script. The parameter value is a path to a readable Lua script which will be executed once for each session.
Each session will have its own Lua state meaning that each session can have a unique Lua environment. Use this script to do session specific tasks.
The entry points for the Lua script expect the following signatures:
nil createInstance(name) - global script only, called when the script is first loaded
When the global script is loaded, it first executes on a global level before the luafilter calls the createInstance function in the Lua script with the filter's name as its argument.
nil newSession(string, string) - new session is created
After the session script is loaded, the newSession function in the Lua scripts is called. The first parameter is the username of the client and the second parameter is the client's network address.
nil closeSession() - session is closed
The closeSession function in the Lua scripts will be called.
(nil | bool | string) routeQuery() - query is being routed
The luafilter calls the routeQuery functions of both the session and the global script. The query is passed as a string parameter to the routeQuery Lua function, and the return values of the session-specific function, if any were returned, are interpreted. If the first value is a bool, it is interpreted as a decision whether to route the query or to send an error packet to the client. If it is a string, the current query is replaced with the return value and the query will be routed. If nil is returned, the query is routed normally.
nil clientReply() - reply to a query is being routed
This function is called with the name of the server that returned the response.
string diagnostic() - global script only, print diagnostic information
If the Lua function returns a string that is valid JSON, it will be decoded as JSON and displayed as such in the REST API. If the object does not decode into JSON, it will be stored as a JSON string.
These functions, if found in the script, will be called whenever a call to the matching entry point is made.
Script Template
Here is a script template that can be used to try out the luafilter. Copy it into a file and add global_script=<path to script> into the filter configuration. Make sure the file is readable by the maxscale user.
function createInstance(name)
end
function newSession(user, host)
end
function closeSession()
end
function routeQuery()
end
function clientReply()
end
function diagnostic()
end
The luafilter exposes the following functions that can be called inside the Lua script entry points. The callback functions in which each may be called are documented after the function signature. If the functions are called outside of the correct callback function, they raise a Lua error.
string mxs_get_sql() (use: routeQuery)
Returns the SQL of the query being executed. This returns an empty string for any query that is not a text protocol query (COM_QUERY). Support for prepared statements is not yet implemented.
string mxs_get_type_mask() (use: routeQuery)
Returns the type of the current query being executed as a string. The values are the string versions of the query types defined in query_classifier.h, separated by vertical bars (|). This function can only be called from the routeQuery entry point.
string mxs_get_operation() (use: routeQuery)
Returns the current operation type as a string. The values are defined in query_classifier.h. This function can only be called from the routeQuery entry point.
string mxs_get_canonical() (use: routeQuery)
Returns the canonical version of a query by replacing all user-defined constant values with question marks. This function can only be called from the routeQuery entry point.
number mxs_get_session_id() (use: newSession, routeQuery, clientReply, closeSession)
This function returns the session ID of the current session. Inside the createInstance and diagnostic endpoints this function will always return the value 0.
string mxs_get_db() (use: newSession, routeQuery, clientReply, closeSession)
Returns the current default database used by the connection.
string mxs_get_user() (use: newSession, routeQuery, clientReply, closeSession)
Returns the username of the client connection.
string mxs_get_host() (use: newSession, routeQuery, clientReply, closeSession)
Returns the address of the client connection.
string mxs_get_replier() (use: clientReply)
Returns the target that returned the result to the latest query.
Here is a minimal configuration entry for a luafilter definition.
[MyLuaFilter]
type=filter
module=luafilter
global_script=/path/to/script.lua
And here is a script that opens a file in /tmp/ and logs output to it.
f = io.open("/tmp/test.log", "a+")
function createInstance(name)
f:write("createInstance for " .. name .. "\n")
end
function newSession(user, host)
f:write("newSession for: " .. user .. "@" .. host .. "\n")
end
function closeSession()
f:write("closeSession\n")
end
function routeQuery()
f:write("routeQuery: " .. mxs_get_sql() .. " -- type: " .. mxs_qc_get_type_mask() .. " operation: " .. mxs_qc_get_operation() .. "\n")
end
function clientReply()
f:write("clientReply: " .. mxs_get_replier() .. "\n")
end
function diagnostic()
f:write("diagnostics\n")
return "Hello from Lua!"
end
mxs_get_sql() and mxs_get_canonical() do not work with queries done with the binary protocol.
The Lua code is not restricted in any way, which means that excessively slow execution of it can cause the MaxScale process to slow down or to be aborted due to a SystemD watchdog timeout.
CC BY-SA / Gnu FDL
[Read-Write-Listener]
type=listener
address=::
service=Read-Write-Service
authenticator=ed25519auth
authenticator_options=ed_mode=sha256
authenticator_options=ed_mode=sha256,
ed_rsa_privkey_path=/tmp/sha_private_key.pem,
ed_rsa_pubkey_path=/tmp/sha_public_key.pem
[Read-Write-Listener]
type=listener
address=::
service=Read-Write-Service
authenticator=ed25519auth,mariadbauth
user_mapping_file=/home/joe/mapping.json
{
"user_map": [
{
"original_user": "alpha",
"mapped_user": "beta"
},
{
"original_user": "gamma",
"mapped_user": "gamma"
}
],
"server_credentials": [
{
"mapped_user": "beta",
"password": "hunter2",
"plugin": "mysql_native_password"
},
{
"mapped_user": "gamma",
"password": "letmein",
"plugin": "ed25519"
}
]
}
[Read-Write-Listener]
type=listener
address=::
service=Read-Write-Service
authenticator=ed25519auth
authenticator_options=ed_mode=sha256
ssl=true
ssl_key=/tmp/my-key.pem
ssl_cert=/tmp/my-cert.pem
ssl_ca=/tmp/myCA.pem
[Read-Write-Listener]
type=listener
address=::
service=Read-Write-Service
authenticator=ed25519auth
authenticator_options=ed_mode=sha256,
ed_rsa_privkey_path=/tmp/sha_private_key.pem,
ed_rsa_pubkey_path=/tmp/sha_public_key.pem
openssl genrsa -out sha_private_key.pem 2048
openssl rsa -in sha_private_key.pem -pubout -out sha_public_key.pem
CREATE USER 'maxscale'@'%' IDENTIFIED BY 'maxscale_pw';
GRANT SELECT ON mysql.user TO 'maxscale'@'%';
GRANT SELECT ON mysql.db TO 'maxscale'@'%';
GRANT SELECT ON mysql.tables_priv TO 'maxscale'@'%';
GRANT SELECT ON mysql.columns_priv TO 'maxscale'@'%';
GRANT SELECT ON mysql.procs_priv TO 'maxscale'@'%';
GRANT SELECT ON mysql.proxies_priv TO 'maxscale'@'%';
GRANT SELECT ON mysql.roles_mapping TO 'maxscale'@'%';
GRANT SHOW DATABASES ON *.* TO 'maxscale'@'%';
CREATE USER 'jdoe'@'maxscale-host' IDENTIFIED BY 'my_secret_password';
MariaDB [(none)]> SHOW GRANTS FOR 'jdoe'@'client-host';
+-----------------------------------------------------------------------+
| Grants for jdoe@client-host |
+-----------------------------------------------------------------------+
| GRANT SELECT, INSERT, UPDATE, DELETE ON *.* TO 'jdoe'@'client-host' |
+-----------------------------------------------------------------------+
1 row in set (0.01 sec)
GRANT SELECT, INSERT, UPDATE, DELETE ON *.* TO 'jdoe'@'maxscale-host';
[maxscale]
threads=auto
sudo systemctl start maxscale
sudo service maxscale start
% sudo maxctrl list services
┌──────────────────┬────────────────┬─────────────┬───────────────────┬───────────────────────────┐
│ Service │ Router │ Connections │ Total Connections │ Servers │
├──────────────────┼────────────────┼─────────────┼───────────────────┼───────────────────────────┤
│ Splitter-Service │ readwritesplit │ 1 │ 1 │ dbserv1, dbserv2, dbserv3 │
└──────────────────┴────────────────┴─────────────┴───────────────────┴───────────────────────────┘
% sudo maxctrl list servers
┌─────────┬─────────────┬──────┬─────────────┬─────────────────┬───────────┐
│ Server │ Address │ Port │ Connections │ State │ GTID │
├─────────┼─────────────┼──────┼─────────────┼─────────────────┼───────────┤
│ dbserv1 │ 192.168.2.1 │ 3306 │ 0 │ Master, Running │ 0-3000-62 │
├─────────┼─────────────┼──────┼─────────────┼─────────────────┼───────────┤
│ dbserv2 │ 192.168.2.2 │ 3306 │ 0 │ Slave, Running │ 0-3000-62 │
├─────────┼─────────────┼──────┼─────────────┼─────────────────┼───────────┤
│ dbserv3 │ 192.168.2.3 │ 3306 │ 0 │ Slave, Running │ 0-3000-62 │
└─────────┴─────────────┴──────┴─────────────┴─────────────────┴───────────┘
% sudo maxctrl list listeners Splitter-Service
┌───────────────────┬──────┬──────┬─────────┐
│ Name │ Port │ Host │ State │
├───────────────────┼──────┼──────┼─────────┤
│ Splitter-Listener │ 3306 │ │ Running │
└───────────────────┴──────┴──────┴─────────┘
The KafkaImporter module reads messages from Kafka and streams them into a MariaDB server. The messages are inserted into a table designated by either the topic name or the message key (see table_name_in below for details). By default the table will be automatically created with SQL of the following shape:
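(The exact DDL varies between MaxScale versions; the following is a sketch reconstructed from the behavior described below, with an illustrative table name and field size: the data field holds the JSON payload, the id field is generated from its _id value, and a check constraint requires valid JSON.)
CREATE TABLE IF NOT EXISTS my_table (
  data JSON NOT NULL,
  id VARCHAR(255) GENERATED ALWAYS AS (JSON_VALUE(data, '$._id')) UNIQUE,
  CONSTRAINT data_is_json CHECK (JSON_VALID(data))
) ENGINE=InnoDB;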
The payload of the message is inserted into the data field, from which the id field is calculated. The payload must be a valid JSON object, and it must either contain a unique _id field, or the _id field must be absent or have a JSON null value. This is similar to the MongoDB document format, where the _id field is the primary key of the document collection.
If a message is read from Kafka and the insertion into the table fails due to a violation of one of the constraints, the message is ignored. Similarly, messages with a duplicate _id value are also ignored: this is done to avoid inserting the same document multiple times whenever the connection to either Kafka or MariaDB is lost.
These limitations on the data can be removed either by creating the table before the KafkaImporter is started, in which case the CREATE TABLE IF NOT EXISTS does nothing, or by altering the structure of the existing table. The minimum requirement is that the table contains a data field into which string values can be inserted.
The database server where the data is inserted is chosen from the set of servers available to the service. The first server labeled as the Master with the best rank will be chosen. This means that a monitor must be configured for the MariaDB server where the data is to be inserted.
In MaxScale versions 21.06.18, 22.08.15, 23.02.12, 23.08.8, 24.02.4 and 24.08.2 the _id field is not required to be present. Older versions of MaxScale created the table with SQL in which the _id field was mandatory.
The user defined by the user parameter of the service must have INSERT and CREATE privileges on all tables that are created.
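A minimal service sketch for the KafkaImporter (server names, addresses and credentials are placeholders; the referenced server and its monitor are assumed to be defined elsewhere in the configuration):
[Kafka-Importer]
type=service
router=kafkaimporter
servers=server1
user=maxuser
password=maxpwd
bootstrap_servers=127.0.0.1:9092
topics=my_topic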
bootstrap_servers
Type: string
Mandatory: Yes
Dynamic: Yes
The list of Kafka brokers as a CSV list in host:port format.
topics
Type: stringlist
Mandatory: Yes
Dynamic: Yes
The comma separated list of topics to subscribe to.
batch_size
Type: count
Mandatory: No
Dynamic: Yes
Default: 100
Maximum number of uncommitted records. The KafkaImporter will buffer records into batches and commit them once either enough records are gathered (controlled by this parameter) or when the KafkaImporter goes idle. Any uncommitted records will be read again if a reconnection to either Kafka or MariaDB occurs.
kafka_sasl_mechanism
Type: enum
Mandatory: No
Dynamic: Yes
Values: PLAIN, SCRAM-SHA-256, SCRAM-SHA-512
Default: PLAIN
SASL mechanism to use. The Kafka broker must be configured with the same authentication scheme.
kafka_sasl_user
Type: string
Mandatory: No
Dynamic: Yes
Default: ""
SASL username used for authentication. If this parameter is defined, kafka_sasl_password must also be provided.
kafka_sasl_password
Type: string
Mandatory: No
Dynamic: Yes
Default: ""
SASL password for the user. If this parameter is defined, kafka_sasl_user must also be provided.
kafka_ssl
Type: boolean
Mandatory: No
Dynamic: Yes
Default: false
Enable SSL for Kafka connections.
kafka_ssl_ca
Type: path
Mandatory: No
Dynamic: Yes
Default: ""
SSL Certificate Authority file in PEM format. If this parameter is not defined, the system default CA certificate is used.
kafka_ssl_cert
Type: path
Mandatory: No
Dynamic: Yes
Default: ""
SSL public certificate file in PEM format. If this parameter is defined, kafka_ssl_key must also be provided.
kafka_ssl_key
Type: path
Mandatory: No
Dynamic: Yes
Default: ""
SSL private key file in PEM format. If this parameter is defined, kafka_ssl_cert must also be provided.
table_name_in
Type: enum
Mandatory: No
Dynamic: Yes
Values: topic, key
Default: topic
The Kafka message part that is used to locate the table to insert the data into.
Enumeration Values:
topic: The topic name is used as the fully qualified table name.
key: The message key is used as the fully qualified table name. If the Kafka message does not have a key, the message is ignored.
For example, all messages with a fully qualified table name of my_db.my_table will be inserted into the table my_table located in the my_db database. If the table or database names contain special characters, the name must also include the escape characters needed to make them valid identifiers. For example, to insert into a table named my table in the database my database, the name would be:
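`my database`.`my table`
(using backticks, MariaDB's identifier quoting)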
timeout
Type: duration
Mandatory: No
Dynamic: Yes
Default: 5000ms
Timeout for both Kafka and MariaDB network communication.
engine
Type: string
Mandatory: No
Dynamic: Yes
Default: InnoDB
The storage engine used for tables that are created by the KafkaImporter. This defines the ENGINE table option and must be the name of a valid storage engine in MariaDB. When the storage engine is something other than InnoDB, the table is created without the generated column and the check constraints.
This is done to avoid conflicts where the custom engine does not support all the features that InnoDB supports.
The backend servers used by this service must be MariaDB version 10.2 or newer.
CC BY-SA / Gnu FDL
Note: This module is experimental and must be built from source. The module is deprecated in MaxScale 23.08 and might be removed in a future release.
The Transaction Performance Monitoring (TPM) filter is a filter module for MaxScale that monitors every SQL statement that passes through the filter. The filter groups a series of SQL statements into a transaction by detecting 'commit' or 'rollback' statements. It logs all committed transactions with necessary information, such as timestamp, client, SQL statements, latency, etc., which can be used later for transaction performance analysis.
The configuration block for the TPM filter requires the minimal filter options in its section within the maxscale.cnf file, stored in /etc/maxscale.cnf.
The TPM filter does not support any filter options currently.
The TPM filter accepts a number of optional parameters.
The filename parameter sets the name of the output file created for performance logging. The default filename is tpm.log.
The optional source
parameter defines an address that is used
to match against the address from which the client connection
to MaxScale originates. Only sessions that originate from this
address will be logged.
The optional user
parameter defines a user name that is used
to match against the user from which the client connection to
MaxScale originates. Only sessions that are connected using
this username are logged.
The optional delimiter parameter defines a delimiter that is used to distinguish columns in the log. The default delimiter is :::.
The optional query_delimiter defines a delimiter that is used to distinguish different SQL statements in a transaction. The default query delimiter is @@@.
named_pipe is the path to a named pipe, which the TPM filter uses to communicate with third-party applications. Logging is enabled when the filter receives the character '1' from this named pipe and disabled when it receives the character '0'. The default named pipe is /tmp/tpmfilter and logging is disabled by default. For example, logging can be toggled as shown below.
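A sketch using the default pipe path:
# Enable logging
echo '1' > /tmp/tpmfilter
# Disable logging
echo '0' > /tmp/tpmfilter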
For each transaction, the TPM filter prints its log in the following format:
<timestamp> | <server_name> | <user_name> | <latency of the transaction> | <latencies of individual statements in the transaction> (delimited by 'query_delimiter') | <actual SQL statements>
You want to log every transaction with its SQL statements and latency for future transaction performance analysis.
Add a filter with the following definition:
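A sketch of such a definition using the parameters described above (the filter name and file paths are illustrative, and the output-file parameter is assumed to be named filename):
[PerformanceLogger]
type=filter
module=tpmfilter
filename=/var/log/tpm.log
named_pipe=/tmp/tpmfilter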
After the filter reads the character '1' from its named pipe, the following is an example log that is generated from the above TPM filter with the above configuration:
Note that 3 and 6 are latencies of each transaction in milliseconds, while 0.165 and 0.123 are latencies of the first statement of each transaction in milliseconds.
CC BY-SA / Gnu FDL
The rewrite filter allows modification of SQL queries on the fly. Reasons for modifying queries include rewriting a query for performance, or changing a specific query when the client query is incorrect and cannot be changed in a timely manner.
The examples use the Rewrite Filter (rf) file format; see below.
Native syntax
Rewriter native syntax uses placeholders to grab and replace parts of text.
Placeholders
The syntax for a plain placeholder is @{N}, where N is a positive integer.
The syntax for a placeholder regex is @{N:regex}. It allows more control when needed.
The below is a valid entry in rf format. For demonstration, all options are set. This entry is a do-nothing entry, but illustrates placeholders.
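A sketch of what such an entry might look like (reconstructed to be consistent with the grabs discussed below; the option names are the per-template options documented later in this document):
%%
regex_grammar: Native
case_sensitive: true
ignore_whitespace: true
continue_if_matched: false
what_if: false
%
select @{2} from @{1} where id = @{3}
%
select @{2} from @{1} where id = @{3}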
If the input sql is select id, name from my_table where id = 42, then @{2} = "id, name" and @{3} = "42". Since the replace template is identical to the match template, the end result is that the output sql will be the same as the input sql.
Placeholders can be used as forward references: @{1:^}select @{2}, count(*) from @{3} group by @{2}. For a match, the two @{2} text grabs must be equal.
Match template
The match template is used to match against the sql to be rewritten.
The match template can be partial: from mytable. But the actual underlying regex match is always for the whole sql. If the match template does not start or end with a placeholder, placeholders are automatically added, so that the above becomes @{1}from mytable@{2}. The automatically added placeholders cannot be used in the replace template.
Matching the whole input also means that Native syntax does not support (and is not intended to support) scan and replace. Only the first occurrence of the above from mytable can be modified in the replace template. However, one can selectively choose to modify e.g. the first through third occurrences of from mytable by writing from mytable @{1} from mytable @{2} from mytable @{3}. For scan and replace, use a different regex_grammar (see below).
Replace template
The replace template uses the placeholders from the match template to rewrite sql.
An important option for smooth matching is ignore_whitespace, which is on (true) by default. It creates the match regex in such a way that the amount and kind of whitespace does not affect matching. However, to make ignore_whitespace always work, it is important to add whitespace where allowed. If "id=42" is in the match template then only the exact "id=42" can match. But if "id = 42" is used, and ignore_whitespace is on, both "id=42" and "id = 42" will match.
Another example, and what not to do:
That works, but because the match lacks specific detail about the expected sql, things are likely to break. In this case show indexes from my_table would no longer work.
The minimum detail in this case could be:
but if more detail is known, like something specific in the where clause, that too should be added.
Placeholder Regex
Syntax: @{N:regex}
In a placeholder regex the character } must be escaped to \} (for literal matching). Plain parentheses "()" indicate capturing groups, which are internally used by the Native grammar. Thus plain parentheses in a placeholder regex will break matching. However, non-capturing groups can be used: e.g. @{1:(?:Jane|Joe)}. To match a literal parenthesis use an escape, e.g. \(.
Suppose an application is misbehaving after an upgrade and a quick fix is needed. The query select zip from address_book where str_id = "AZ-124" is correct, but if the id is an integer the where clause should be id = 1234.
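A sketch of an rf entry that could implement such a fix, assuming the numeric part of the string id maps to the integer id (the exact mapping is illustrative):
%%
regex_grammar: Native
%
select zip from address_book where str_id = "AZ-@{1:[0-9]+}"
%
select zip from address_book where id = @{1}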
Using plain regular expressions
For scan and replace, the regex_grammar must be set to something other than Native. An example will illustrate the usage: replace all occurrences of "wrong_table_name" with "correct_table_name"; further, if the replacement was made, replace all occurrences of wrong_column_name with correct_column_name. A sketch of this is shown below.
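A sketch using two entries (illustrative; note that with this layout the second template is also consulted on its own if the first does not match):
%%
regex_grammar: ECMAScript
continue_if_matched: true
%
wrong_table_name
%
correct_table_name
%%
regex_grammar: ECMAScript
%
wrong_column_name
%
correct_column_name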
Adding a rewrite filter.
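A configuration sketch (the filter name and template file path are illustrative):
[Rewrite]
type=filter
module=rewritefilter
template_file=/etc/maxscale.rewrite.rf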
template_file
Type: string
Mandatory: Yes
Dynamic: Yes
Default: No default value
Path to the template file.
regex_grammar
Type: string
Mandatory: No
Dynamic: Yes
Default: Native
Values: Native
, ECMAScript
, Posix
, EPosix
, Awk
, Grep
, EGrep
Default regex_grammar for templates
case_sensitive
Type: boolean
Mandatory: No
Dynamic: Yes
Default: true
Default case sensitivity for templates
log_replacement
Type: boolean
Mandatory: No
Dynamic: Yes
Default: false
Log replacements at NOTICE level.
regex_grammar
Type: string
Values: Native
, ECMAScript
, Posix
, EPosix
, Awk
, Grep
, EGrep
Default: From maxscale.cnf
Overrides the global regex_grammar of a template.
case_sensitive
Type: boolean
Default: From maxscale.cnf
Overrides the global case sensitivity of a template.
ignore_whitespace
Type: boolean
Default: true
Ignore whitespace differences in the match template and input sql.
continue_if_matched
Type: boolean
Default: false
If a template matches and the replacement is done, continue to the next template and apply it to the result of the previous rewrite.
what_if
Type: boolean
Default: false
Do not make the replacement, only log what would have been replaced (NOTICE level).
The rf format for an entry is:
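An entry consists of an options section and the match and replace templates, separated by % lines and introduced by a %% line (reconstructed from the separator rules below):
%%
options
%
match template
%
replace template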
The character # starts a single-line comment when it is the first character on a line.
Empty lines are ignored.
The rf format does not need any additional escaping to what the basic format requires (see Placeholder Regex).
Options are specified as follows:
The colon must stick to the option name.
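For example, with one of the per-template options listed above:
ignore_whitespace: true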
The separators % and %% must be the exact content of their respective separator lines.
The templates can span multiple lines. Whitespace does not matter as long as ignore_whitespace = true. Always use space where space is allowed, to maximize the utility of ignore_whitespace.
Example
The json file format is harder to read and edit manually. It will be needed if support for editing of rewrite templates is added to the GUI.
All double quotes and escape characters have to be escaped in json, i.e. '"' and '\'.
The same example as above is:
The configuration is re-read if any dynamic value is updated even if the value does not change.
ECMAScript
Posix
EPosix
Awk
Grep: Same as Posix with the addition of newline '\n' as an alternation separator.
EGrep: Same as EPosix with the addition of newline '\n' as an alternation separator, in addition to '|'.
CC BY-SA / Gnu FDL
The Xpand Monitor is a monitor that monitors an Xpand cluster. It is capable of detecting the cluster setup and creating corresponding server instances within MaxScale.
The monitor user must have the following grants:
Further, if you want to be able to softfail and unsoftfail a node via MaxScale, the monitor user must also have SUPER privileges.
A minimal configuration for a monitor requires one server in the Xpand cluster, and a username and a password to connect to the server. Note that by default the Xpand monitor will only use that server in order to dynamically find out the configuration of the cluster; after startup it will completely rely upon information obtained at runtime. To change the default behaviour, please see the dynamic_node_detection parameter.
To ensure that the Xpand monitor will be able to start, it is advisable to provide more than one server, to cater for the case that not all nodes are up when MaxScale starts.
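A sketch of such a minimal monitor section (server names and credentials are placeholders; server1 and server2 are assumed to be server sections defined elsewhere in the configuration):
[TheXpandMonitor]
type=monitor
module=xpandmon
servers=server1,server2
user=monitor_user
password=monitor_password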
Note: All services that use servers monitored by xpandmon should use the cluster parameter to define the set of servers they use. This will guarantee that the services use servers that are valid members of the Xpand cluster.
The server objects that the Xpand monitor creates for each detected Xpand node are named @@<name-of-xpand-monitor>:node-<id>, where <name-of-xpand-monitor> is the name of the Xpand monitor instance, as defined in the MaxScale configuration file, and <id> is the id of the Xpand node.
For instance, with the Xpand monitor defined as above and an Xpand cluster consisting of 3 nodes whose ids are 1, 2 and 3 respectively, the names of the created server objects will be:
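Assuming the monitor instance is named TheXpandMonitor, as in the sketch above:
@@TheXpandMonitor:node-1
@@TheXpandMonitor:node-2
@@TheXpandMonitor:node-3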
When dynamic servers are created, the values for the configuration settings max_routing_connections, persistmaxtime, persistpoolmax and proxy_protocol are copied from the settings of the bootstrap servers. Note that the values of these settings must be identical on every bootstrap server.
For a list of optional parameters that all monitors support, see the common monitor parameters documentation. The following optional parameters are specific to the Xpand Monitor.
cluster_monitor_interval
Type: duration
Mandatory: No
Dynamic: Yes
Default: 60s
Defines how often the monitor checks the state of the entire cluster. The default value is 60 seconds, which should not be lowered as that may have an adverse effect on the Cluster itself.
health_check_threshold
Type: count
Mandatory: No
Dynamic: Yes
Default: 2
Defines how many times the health check may fail before the monitor considers a particular node to be down.
dynamic_node_detection
Type: boolean
Mandatory: No
Dynamic: Yes
Default: true
By default, the Xpand monitor will only use the bootstrap nodes in order to connect to the Xpand cluster, and then find out the cluster configuration dynamically at runtime. That behaviour can be turned off with this optional parameter, in which case all Xpand nodes must be manually defined as shown below. The default value of dynamic_node_detection is true. See also health_check_port.
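With dynamic_node_detection=false the nodes must be listed statically; a sketch (names and addresses are placeholders):
[node-1]
type=server
address=192.168.0.101
port=3306
[node-2]
type=server
address=192.168.0.102
port=3306
[TheXpandMonitor]
type=monitor
module=xpandmon
servers=node-1,node-2
user=monitor_user
password=monitor_password
dynamic_node_detection=false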
health_check_port
Type: integer
Mandatory: No
Dynamic: Yes
Default: 3581
This optional parameter specifies which health check port to use when dynamic_node_detection has been disabled. Note that this parameter is ignored unless dynamic_node_detection is false. Note also that the port must be the same for all nodes.
region_name
Type: string
Mandatory: No
Dynamic: Yes
Default: ''
Mutually exclusive with region_oid.
region_name specifies the name of the region the MaxScale instance is running in. It should be specified only if an Xpand Multi-Region HA cluster, which is available from Xpand 23.08 onwards, is used. With region_name specified, MaxScale will only consider nodes from that region.
region_oid
Type: integer
Mandatory: No
Dynamic: Yes
Default: ''
Mutually exclusive with region_name.
region_oid specifies the oid of the region the MaxScale instance is running in. It should be specified only if an Xpand Multi-Region HA cluster, which is available from Xpand 23.08 onwards, is used. With region_oid specified, MaxScale will only consider nodes from that region.
The Xpand monitor supports the following module commands.
softfail
With the softfail module command, a node can be softfailed via MaxScale. The command requires as arguments the name of the Xpand monitor instance (as defined in the configuration file) and the name of the node to be softfailed.
For instance, with a configuration file like
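(a sketch consistent with the minimal configuration shown earlier)
[TheXpandMonitor]
type=monitor
module=xpandmon
servers=server1,server2
user=monitor_user
password=monitor_password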
then the node whose server name is @@TheXpandMonitor:node-1 can be softfailed like:
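Assuming maxctrl's generic syntax for calling module commands (the exact invocation is an illustration):
maxctrl call command xpandmon softfail TheXpandMonitor @@TheXpandMonitor:node-1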
If the softfailing of a node is successfully initiated, the status of the corresponding MaxScale server object will be set to Draining, which will prevent new connections from being created to the node. When the number of connections through MaxScale to the node has dropped to 0, its state will change to Drained. Note that the state Drained only tells that there are no connections to the node, not what the state of the softfailing operation is.
unsoftfail
With the unsoftfail module command, a node can be unsoftfailed via MaxScale. The command requires as arguments the name of the Xpand monitor instance (as defined in the configuration file) and the name of the node to be unsoftfailed.
With a setup similar to the softfail case, a node can be unsoftfailed like:
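Again assuming maxctrl's module-command syntax:
maxctrl call command xpandmon unsoftfail TheXpandMonitor @@TheXpandMonitor:node-1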
If a node is successfully unsoftfailed, the Draining status of the corresponding MaxScale server object will be cleared.
During the cluster check, which is performed once per cluster_monitor_interval, the monitor will also check whether any nodes are being softfailed. The status of the corresponding server object of a node being softfailed will be set to Draining, which will prevent new connections from being created to that node. When the number of connections through MaxScale to the node has dropped to 0, its state will change to Drained. Note that the state Drained only tells that there are no connections to the node, not what the state of the softfailing operation is.
If a node that was softfailed is unsoftfailed, the Draining status will be cleared.
If the softfailing or unsoftfailing is initiated using the softfail and unsoftfail commands of the Xpand monitor, there will be no delay between the operation being initiated and the Draining status being turned on or off.
CC BY-SA / Gnu FDL