MaxScale Workload Capture and Replay
The WCAR (Workload Capture and Replay) module is a sophisticated feature designed to capture and archive client traffic passing through a MaxScale instance. This allows system administrators and developers to process and store vast volumes of data related to client-server interactions in a reliable manner. By harnessing this captured data, users gain the flexibility to replay and simulate the varied client activity typically seen in a production environment.
One of the module's primary advantages is that it removes the necessity of creating explicit traffic generators, which can be resource-intensive and complex to maintain. Instead, the WCAR module provides a seamless method for mirroring realistic client interactions and behavior patterns, which can be critical for testing, debugging, and optimizing system performance.
Additionally, by facilitating traffic replay, the WCAR module aids in identifying potential system vulnerabilities and performance bottlenecks, allowing for preemptive optimization. This proactive approach ensures that systems are well-prepared for live production scenarios, enhancing overall efficiency and uptime.
In essence, the WCAR module not only preserves detailed and valuable traffic data but also empowers users with the tools to analyze and refine their systems through accurate simulation, paving the way for robust and resilient system architectures.
Overview
The WCAR filter (module wcar) captures client traffic and stores it in a replayable format.
WCAR is designed to capture traffic on a production MaxScale instance. The captured data can then be used as a reproducible way of generating client traffic without having to write application-specific traffic generators.
The captured workloads can be used to:
Verify that upgrades of MariaDB behave as expected.
Repeatedly measure effects of configuration changes, which is useful for database tuning.
Investigate why certain scenarios take longer than expected, as a kind of SQL debugging tool.
Prerequisites
Both the capture MaxScale and replay MaxScale servers should use the same Linux distribution and CPU architecture. For example, if the capture was taken on an x86_64 RHEL 8 instance, the replay should also happen on an x86_64 RHEL 8 instance. Captured workloads are, however, usually compatible across different Linux distributions that use the same CPU architecture.
The capture MariaDB instance must have binary logging enabled (log-bin=1).
Capture
Quick start
Workload capture can be used without definitions in static configuration files and without a MaxScale restart.
If you have an existing routing service in your configuration named, e.g., RWS-Router, you can attach a capture filter to it dynamically:
You can then start a capture with
If limiting options were given, the capture will stop automatically when one of the limits is triggered. You can also stop the capture at any time with:
See Replay to see how the captured files are used.
When capture is no longer needed you can remove it with:
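The quick-start steps above can be sketched as a few shell helpers. This is a minimal sketch rather than the authoritative procedure. It assumes that maxctrl is on the PATH, that wcar module commands are invoked as maxctrl call command wcar <command> <filter>, and that the installed MaxScale version accepts a filter name as a target of maxctrl link service and maxctrl unlink service. The function names are hypothetical; RWS-Router and CAPTURE_FLTR are the names used in this document.

```shell
# Hypothetical helpers wrapping the quick-start steps.
# Assumption: this MaxScale version lets `maxctrl link service` take a filter.

# Create a wcar filter and attach it to a running service.
attach_capture() {
    service=$1
    filter=$2
    maxctrl create filter "$filter" wcar &&
        maxctrl link service "$service" "$filter"
}

# Start a capture; extra arguments are passed on as key=value options.
start_capture() {
    filter=$1
    shift
    maxctrl call command wcar start "$filter" "$@"
}

# Stop the capture that is currently in progress.
stop_capture() {
    maxctrl call command wcar stop "$1"
}

# Detach the filter from the service and destroy it.
detach_capture() {
    service=$1
    filter=$2
    maxctrl unlink service "$service" "$filter" &&
        maxctrl destroy filter "$filter"
}
```

A session would then call attach_capture RWS-Router CAPTURE_FLTR, start_capture CAPTURE_FLTR duration=10min, stop_capture CAPTURE_FLTR and finally detach_capture RWS-Router CAPTURE_FLTR.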
File based configuration
Define a capture filter by adding the following configuration object and add it to each service whose traffic is to be captured. The traffic from all services that use the filter will be combined so only use the filter in services that point to the same database cluster.
Example configuration file
Here is an example configuration for capturing from a single MariaDB server, where capture starts when MaxScale starts and stops when MaxScale is stopped (start_capture=true). MaxScale listens on port 4006 and connects to MariaDB on port 3306.
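A configuration matching this description might look like the following sketch, written here as a shell function that generates the file. Only the filter name CAPTURE_FLTR, the module wcar, start_capture=true and the two ports come from the text. The section names server1, MariaDB-Monitor, RCR-Router and RCR-Listener, the choice of the readconnroute router, and the maxuser/maxpwd credentials are placeholder assumptions.

```shell
# Write an illustrative MaxScale configuration to the path given as $1.
# Section names and credentials are placeholders, not required values.
write_example_config() {
    cat > "$1" <<'EOF'
[server1]
type=server
address=127.0.0.1
port=3306

[MariaDB-Monitor]
type=monitor
module=mariadbmon
servers=server1
user=maxuser
password=maxpwd

[CAPTURE_FLTR]
type=filter
module=wcar
start_capture=true

[RCR-Router]
type=service
router=readconnroute
router_options=master
servers=server1
user=maxuser
password=maxpwd
filters=CAPTURE_FLTR

[RCR-Listener]
type=listener
service=RCR-Router
port=4006
EOF
}
```

Calling write_example_config ./maxscale.cnf writes the file to a scratch location where it can be reviewed before it replaces a real configuration.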
Capturing Traffic
This section explains how capture is done with configuration value start_capture=true.
Two things are needed to replay a workload: the client traffic that's captured by MaxScale and a backup of the database that is used to initialize the replay server. The backup should be taken from the point in time where the capture starts and the simplest way to achieve this is to take a logical backup by doing the following.
Stop MaxScale
Take a backup of the database with mariadb-dump --all-databases --system=all
Start MaxScale
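The three steps above can be scripted roughly as follows. This is a sketch under assumptions: MaxScale is taken to run as a systemd unit named maxscale, mariadb-dump is taken to connect with its default credentials, and the function name is hypothetical.

```shell
# Take a logical backup while MaxScale is stopped, so that the backup
# reflects the database state at the moment the capture begins
# (this only holds if all clients connect through MaxScale).
# $1 = file that receives the SQL dump
# Assumption: MaxScale is managed by systemd under the unit name "maxscale".
backup_before_capture() {
    backup_file=$1
    systemctl stop maxscale &&
        mariadb-dump --all-databases --system=all > "$backup_file" &&
        systemctl start maxscale
}
```

For example, backup_before_capture /path/to/backup.sql leaves a dump that can later be loaded into the replay server.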
Once MaxScale has been started, the captured traffic will be written to files in /var/lib/maxscale/wcar/<name> where <name> is the name of the filter (CAPTURE_FLTR in the examples).
Each capture will generate a number of files named NAME_YYYY-MM-DD_HHMMSS.SUFFIX where NAME is the capture name (defaults to capture), YYYY-MM-DD is the date, HHMMSS is the time and SUFFIX is one of .cx, .ex or .tx. For example, a capture started on the 18th of April 2024 at 10:26:11 would generate a file named capture_2024-04-18_102611.cx.
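The naming scheme can be illustrated with two small shell functions, one that builds a file name from its parts and one that splits a name back into them. Both function names are hypothetical and exist only for this illustration; they are not part of MaxScale.

```shell
# Build a capture file name.
# $1 = capture name, $2 = YYYY-MM-DD, $3 = HHMMSS, $4 = cx, ex or tx
capture_file_name() {
    printf '%s_%s_%s.%s\n' "$1" "$2" "$3" "$4"
}

# Split a capture file name into "name date time suffix".
# Prints nothing if the name does not follow the documented pattern.
parse_capture_file_name() {
    printf '%s\n' "$1" |
        sed -nE 's/^(.+)_([0-9]{4}-[0-9]{2}-[0-9]{2})_([0-9]{6})\.(cx|ex|tx)$/\1 \2 \3 \4/p'
}

capture_file_name capture 2024-04-18 102611 cx
parse_capture_file_name capture_2024-04-18_102611.cx
```

The first call prints capture_2024-04-18_102611.cx and the second prints capture 2024-04-18 102611 cx.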
Stopping the Capture
To stop the capture, simply stop MaxScale, or issue the stop command for the filter (see stop <filter> under Commands). In the example configuration above, the filter is named "CAPTURE_FLTR".
To disable capturing altogether, remove the capture filter from the configuration and remove it from all services that it was added to. Restart MaxScale.
If the replay is to take place on another server, the results can be collected easily from /var/lib/maxscale/wcar/ with the following command.
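One way to collect the results is to pack the whole wcar directory with tar, as in the sketch below. The function name is hypothetical, the tarball name captures.tar.gz is the one referred to later in this document, and the data directory is assumed to be the default /var/lib/maxscale (that is, capture_dir has not been changed).

```shell
# Pack the wcar directory found under a MaxScale data directory.
# $1 = data directory (normally /var/lib/maxscale), $2 = output tarball
collect_captures() {
    tar -czf "$2" -C "$1" wcar
}
```

Running collect_captures /var/lib/maxscale captures.tar.gz produces a tarball whose paths all start with wcar/, so it can be extracted anywhere on the replay server.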
Once the capture tarball has been generated, copy it to the replay server. You might then want to delete the directories on the capture server from /var/lib/maxscale/wcar/* to save space (and not copy them again later).
Commands
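Each of the commands is a wcar module command and can be called through maxctrl with the following syntax: maxctrl call command wcar <command> <filter> [options]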
The <filter> is the name of the filter instance. In the example configuration, the value is CAPTURE_FLTR. The [options] is a list of optional arguments that the command might expect.
start <filter> [options]
Starts a new capture. Issuing a start command will stop any ongoing capture.
The start command supports optional key-value pairs. If the values are also defined in the configuration file the command line options have priority. The supported keys are:
prefix: The prefix added to capture files. The default value is capture.
duration: Limit capture to this duration. See also the configuration file value capture_duration.
size: Limit capture to approximately this many bytes in the file system. See also the configuration file value capture_size.
The start command options are not persistent, and only apply to the capture that was thus started.
For example, starting a capture with the below command would create a capture file named Scenario1_2024-04-18_102605.cx and limit the file system usage to approximately 10GiB. If capture_duration was defined in the configuration file it would also be used.
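A command of the kind described here could look like the following sketch. It assumes the module command is invoked through maxctrl call command and uses the option keys listed above; the wrapper function and its name are hypothetical.

```shell
# Start a capture with a custom file prefix and a size limit.
# The options apply only to the capture started by this call.
# $1 = filter name, $2 = file prefix, $3 = size limit
start_limited_capture() {
    maxctrl call command wcar start "$1" prefix="$2" size="$3"
}
```

Calling start_limited_capture CAPTURE_FLTR Scenario1 10G runs maxctrl call command wcar start CAPTURE_FLTR prefix=Scenario1 size=10G.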
If both duration and size are specified, the capture stops when the first of the two limits is reached.
Running the same command again, but without size=10G, the capture_size used would be that defined in the configuration file or no limit if there was no such definition.
stop <filter>
Stops the currently active capture if one is in progress.
Replay
Installation
Install the required packages on the MaxScale server where the replay is to be done. An additional dependency that must be manually installed is Python, version 3.9 or newer. On most Linux distributions a new enough version is available as the default Python interpreter. You may also need to install the development packages for Python (python3-devel on RHEL based systems or python3-dev on Debian based systems) as well as a C++ compiler.
For RHEL 8, Rocky Linux 8 and Alma Linux 8, a newer version of Python must be installed along with the development headers using dnf install python39 python39-devel gcc-c++. It must then be set as the default Python implementation with alternatives --set python3 /usr/bin/python3.9.
The replay consists of restoring the database to the point in time where the capture was started. Start by restoring the replay database to this state. Once the database has been restored from the backup, copy the capture files over to the replay MaxScale server.
Preparing the Replay MariaDB Database
Full Restore
Start by restoring the database from the backup to put it at the point in time where the capture was started. The GTID position of the first commit within the capture can be seen in the output of the summary command:
If the captured data has not been transformed to replay format yet, the command will perform the transformation before displaying the summary.
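The restore and the summary lookup might be scripted as in the sketch below. It assumes the backup is a logical dump produced by mariadb-dump, that the mariadb client can connect to the replay server with its default credentials, and that the summary command takes the path of the capture's .cx file. The function names are hypothetical.

```shell
# Load a logical backup into the replay server.
# $1 = SQL dump created with mariadb-dump
restore_backup() {
    mariadb < "$1"
}

# Print the capture summary, which includes the GTID position of the
# first commit within the capture.
# $1 = path to the .cx file of the capture
show_capture_summary() {
    maxplayer summary "$1"
}
```

A typical sequence would be restore_backup /path/to/backup.sql followed by show_capture_summary /path/to/capture.cx.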
Run maxplayer --help to see the command line options. The help output is also shown at the end of this file.
The replay also requires a user account under which the captured traffic is replayed. This user must have access to all the tables in question. In practice, the simplest way to do this for testing is to create the user as follows:
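A test-only replay user with access to everything could be created along these lines. This is a sketch: the user name and password are placeholders, the mariadb client is assumed to log in as an administrative user with its default credentials, and the values are inserted into the SQL without escaping, so they must not contain quote characters.

```shell
# Create a replay user that has access to every table.
# $1 = user name, $2 = password (both placeholders, not escaped)
create_replay_user() {
    user=$1
    password=$2
    mariadb -e "CREATE USER '$user'@'%' IDENTIFIED BY '$password'; GRANT ALL ON *.* TO '$user'@'%';"
}
```

For example, create_replay_user maxreplay replay-pw creates the account maxreplay. Such a broad grant is only appropriate on a dedicated test server.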
Restore for read-only Replay
For captures that are intended for read-only Replay, it may not be as important that the servers to be tested against are at the exact GTID position the capture server was at when the capture started. In fact, it may be advantageous that the servers are in the state they were in after the capture finished.
On the other hand, Replay also supports write-only. Following the Full Restore procedure above and then running a write-only Replay prepares the replay server(s) for easily running read-only multiple times. This way of running read-only may, for example, be used when fine tuning server settings.
Replaying the Capture
When replay is first done, the capture files will be transformed in-place. Transform can be run separately as well. Depending on the size and structure of the capture file, Transform can use up to twice the space of the capture.ex file. The files with extension .ex contain most of the captured data (events).
Start by copying the capture tarball created earlier (captures.tar.gz) to a directory of your choice on the replay MaxScale server (here called /path/to/capture-dir). Then extract the files.
After this, replay the workload against the baseline MariaDB setup:
Once the baseline replay results have been generated, run the replay again, but this time against the new MariaDB setup that is to be compared with the baseline:
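The two replay runs described above might be wrapped as follows. This is a sketch under assumptions: the --user, --password, --host and --output options are written as recalled from the maxplayer help output and should be checked against maxplayer --help for the installed version. The function name, the REPLAY_USER and REPLAY_PASSWORD variables and their default values are hypothetical.

```shell
# Replay a capture against one server and store the results as CSV.
# $1 = host:port of the replay target, $2 = output CSV, $3 = capture .cx file
# Assumption: option names follow `maxplayer --help`; verify before use.
replay_capture() {
    maxplayer replay \
        --user "${REPLAY_USER:-maxreplay}" \
        --password "${REPLAY_PASSWORD:-replay-pw}" \
        --host "$1" --output "$2" "$3"
}
```

The baseline run would then be replay_capture <host:port> baseline-result.csv /path/to/capture.cx, and the comparison run the same call with the address of the new setup and comparison-result.csv as the output file.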
After both replays have been completed, the results can be post-processed and visualized.
Visualizing
The replay results must first be post-processed into summaries that the visualization will then use. First, generate the canonicals.csv file that the post-processing needs:
After that, the baseline and comparison replay results can be post-processed into summaries using the maxpostprocess command:
The visualization itself is done with the maxvisualize program. The visualization will open up a browser window to show the visualization. If no browser opens up, the visualization URL is also printed into the command line which by default should be http://localhost:8866/.
To listen on all network interfaces, use --Voila.ip='0.0.0.0' as the last argument.
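The whole post-processing and visualization sequence might be chained as in the sketch below. It rests on several assumptions that should be verified against the installed tools: that maxplayer has a canonicals command which writes CSV to standard output, that maxpostprocess takes the canonicals file and a result file and writes its summary to the file given with -o, and that maxvisualize takes the two summary files as arguments. The function name and the summary file names are hypothetical.

```shell
# Post-process two replay results and open the visualization.
# $1 = capture .cx file, $2 = baseline CSV, $3 = comparison CSV
# Any further arguments are passed to maxvisualize unchanged.
summarize_and_visualize() {
    capture=$1
    baseline=$2
    comparison=$3
    shift 3
    maxplayer canonicals "$capture" > canonicals.csv &&
        maxpostprocess canonicals.csv "$baseline" -o baseline-summary.json &&
        maxpostprocess canonicals.csv "$comparison" -o comparison-summary.json &&
        maxvisualize baseline-summary.json comparison-summary.json "$@"
}
```

To listen on all interfaces, the extra option is appended as the last argument: summarize_and_visualize capture.cx baseline-result.csv comparison-result.csv --Voila.ip='0.0.0.0'.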
Settings
capture_dir
Type: path
Default: /var/lib/maxscale/wcar/
Mandatory: No
Dynamic: No
Directory under which capture directories are stored. Each capture directory has the name of the filter. In the examples above the name "CAPTURE_FLTR" was used.
start_capture
Type: boolean
Default: false
Mandatory: No
Dynamic: No
Start capture when MaxScale starts.
capture_duration
Type: duration
Default: 0s
Maximum: Unlimited in MaxScale, 5min in MaxScale Lite.
Mandatory: No
Dynamic: No
Limit capture to this duration. If set to zero there is no limit.
capture_size
Type: size
Default: 0
Maximum: Unlimited in MaxScale, 10MB in MaxScale Lite.
Mandatory: No
Dynamic: No
Limit capture to approximately this many bytes in the file system. If set to zero there is no limit.
maxplayer command line options
Limitations
KILL commands do not work correctly during replay and may kill the wrong session (MXS-5056)
COM_STMT_BULK_EXECUTE is not captured (MXS-5057)
COM_STMT_EXECUTE that uses a cursor is replayed without a cursor (MXS-5059)
For MyISAM and Aria tables, this will cause the table level lock to be held for a shorter time.
Execution of a COM_STMT_SEND_LONG_DATA will not work (MXS-5060)
The capture files are not necessarily compatible with Linux distributions and CPU architectures that differ from those of the original capture server. Different combinations will require further testing, and once done, this document will be updated.
This page is licensed: CC BY-SA / Gnu FDL