# MaxScale 25.01 Cache

## Cache

This filter was introduced in MariaDB MaxScale 2.1.

* [Cache](#cache)
  * [Overview](#overview)
  * [Limitations](#limitations)
    * [Prepared Statements](#prepared-statements)
    * [Multi-statements](#multi-statements)
    * [Security](#security)
    * [information\_schema](#information_schema)
  * [Invalidation](#invalidation)
    * [Best Efforts](#best-efforts)
  * [Configuration](#configuration)
  * [Settings](#settings)
    * [storage](#storage)
    * [storage\_options](#storage_options)
    * [hard\_ttl](#hard_ttl)
    * [soft\_ttl](#soft_ttl)
    * [max\_resultset\_rows](#max_resultset_rows)
    * [max\_resultset\_size](#max_resultset_size)
    * [max\_count](#max_count)
    * [max\_size](#max_size)
    * [rules](#rules)
    * [cached\_data](#cached_data)
    * [selects](#selects)
    * [cache\_in\_transactions](#cache_in_transactions)
    * [debug](#debug)
    * [enabled](#enabled)
    * [invalidate](#invalidate)
    * [clear\_cache\_on\_parse\_errors](#clear_cache_on_parse_errors)
    * [users](#users)
    * [timeout](#timeout)
  * [Runtime Configuration](#runtime-configuration)
    * [@maxscale.cache.populate](#maxscalecachepopulate)
    * [@maxscale.cache.use](#maxscalecacheuse)
    * [@maxscale.cache.soft\_ttl](#maxscalecachesoft_ttl)
    * [@maxscale.cache.hard\_ttl](#maxscalecachehard_ttl)
  * [Client Driven Caching](#client-driven-caching)
  * [Threads, Users and Invalidation](#threads-users-and-invalidation)
    * [Invalidation](#invalidation_1)
      * [cached\_data=thread\_specific](#cached_datathread_specific)
      * [cache\_data=shared](#cache_datashared)
  * [Rules](#rules_1)
    * [When to Store](#when-to-store)
      * [Qualified Names](#qualified-names)
      * [Implication of the default database](#implication-of-the-default-database)
      * [Regexp Matching](#regexp-matching)
      * [Examples](#examples)
    * [When to Use](#when-to-use)
      * [Examples](#examples_1)
  * [Security](#security_1)
  * [Storage](#storage_1)
    * [storage\_inmemory](#storage_inmemory)
    * [storage\_memcached](#storage_memcached)
      * [server](#server)
      * [max\_value\_size](#max_value_size)
      * [Example](#example)
      * [Limitations](#limitations_1)
      * [Security](#security_2)
    * [storage\_redis](#storage_redis)
      * [server](#server_1)
      * [username](#username)
      * [password](#password)
      * [ssl](#ssl)
      * [ssl\_cert](#ssl_cert)
      * [ssl\_key](#ssl_key)
      * [ssl\_ca](#ssl_ca)
      * [Authentication](#authentication)
      * [SSL](#ssl_1)
      * [Example](#example_1)
      * [Limitations](#limitations_2)
      * [Invalidation](#invalidation_2)
      * [Security](#security_3)
  * [Example](#example_2)
    * [Configuration](#configuration_1)
    * [cache\_rules.json](#cache_rulesjson)
  * [Performance](#performance)
    * [Summary](#summary)

### Overview

From MaxScale version 2.2.11 onwards, the cache filter is no longer\
considered experimental. The following changes to the default behaviour\
have also been made:

* The default value of `cached_data` is now `thread_specific` (used to be`shared`).
* The default value of `selects` is now `assume_cacheable` (used to be`verify_cacheable`).

The cache filter is a simple cache that is capable of caching the result of\
SELECTs, so that subsequent identical SELECTs are served directly by MaxScale,\
without the queries being routed to any server.

By *default* the cache will be used and populated in the following circumstances:

* There is no explicit transaction active, that is, autocommit is used,
* there is an explicitly read-only transaction (that is,`START TRANSACTION READ ONLY`) active, or
* there is a transaction active and no statement that modifies the database\
  has been performed.

In practice, the last bullet point basically means that if a transaction has\
been started with `BEGIN`, `START TRANSACTION` or `START TRANSACTION READ WRITE`, then the cache will be used and populated until the first `UPDATE`,`INSERT` or `DELETE` statement is encountered.

That is, in default mode the cache effectively causes the system to behave\
as if the *isolation level* would be `READ COMMITTED`, irrespective of what\
the isolation level of the backends actually is.

The default behaviour can be altered using the configuration parameter [cache\_in\_transactions](#cache_in_transactions).

By default it is assumed that all `SELECT` statements are cacheable, which\
means that also statements like `SELECT LOCALTIME` are cached. Please check [selects](#selects) for how to change the default behaviour.

### Limitations

All of these limitations may be addressed in forthcoming releases.

#### Prepared Statements

Resultsets of prepared statements are **not** cached.

#### Multi-statements

Multi-statements are always sent to the backend and their result is**not** cached.

#### Security

The cache is **not** aware of grants.

The implication is that unless the cache has been explicitly configured\
who the caching should apply to, the presence of the cache may provide\
a user with access to data he should not have access to.

Please read the section [Security](#security-1) for more detailed information.

However, from 2.5 onwards it is possible to configure the cache to cache\
the data of each user separately, which effectively means that there can\
be no unintended sharing. Please see [users](#users) for how to change\
the default behaviour.

#### `information_schema`

When [invalidation](#invalidation) is enabled, SELECTs targeting tables\
in `information_schema` are not cached. The reason is that as the content\
of the tables changes as the side-effect of something else, the cache would\
not know when to invalidate the cache-entries.

### Invalidation

Since MaxScale 2.5, the cache is capable of invalidating entries in the\
cache when a modification (UPDATE, INSERT or DELETE) that may affect those\
entries is made.

The cache invalidation works on the table-level, that is, a modification\
made to a particular table will cause all cache entries that refer to that\
table to be invalidated, irrespective of whether the modification actually\
has an impact on the cache entries or not. For instance, suppose the result\
of the following SELECT has been cached

```
SELECT * FROM t WHERE a=1;
```

An insert like

```
INSERT INTO t SET a=42;
```

will cause the cache entry containing the result of that SELECT to be\
invalidated even if the INSERT actually does not affect it. Please see [invalidate](#invalidate) for how to enable the invalidation.

When invalidation has been enabled MaxScale must be able to completely\
parse a SELECT statement for its results to be stored in the cache. The\
reason is that in order to be able to invalidate cache entries, MaxScale\
must know what tables a SELECT statement depends upon. Consequently, if\
(and only if) invalidation has been enabled and MaxScale fails to parse a\
statement, the result of that particular statement will not be cached.

When invalidation has been enabled, MaxScale will also parse all UPDATE,\
INSERT and DELETE statements, in order to find out what tables are\
modified. If that parsing fails, MaxScale will *by default* clear the\
entire cache. The reason is that unless MaxScale can completely parse\
the statement it cannot know what tables are modified and hence not what\
cache entries should be invalidated. Consequently, to prevent stale data\
from being returned, the entire cache is cleared. The default behaviour\
can be changed using the configuration parameter [clear\_cache\_on\_parse\_errors](#clear_cache_on_parse_errors).

Note that what threading approach is used has a big impact on the\
invalidation. Please see [Threads, Users and Invalidation](#threads-users-and-invalidation)\
for how the threading approach affects the invalidation.

Note also that since the invalidation may not, depending on how the\
cache has been configured, be visible to all sessions of all users, it\
is still important to configure a reasonable [soft](#soft_ttl) and [hard](#hard_ttl) TTL.

#### Best Efforts

The invalidation offered by the MaxScale cache can be said to be of\_best efforts\_ quality. The reason is that in order to ensure that the\
cache in all circumstances reflects the state in the actual database,\
would require that the operations involving the cache and the MariaDB\
server are synchronized, which would cause an unacceptable overhead.

What *best efforts* means in this context is best illustrated using an example.

Suppose a client executes the statement `SELECT * FROM tbl` and that the result\
is cached. Next time that or any other client executes the same statement, the\
result is returned from the cache and the MariaDB server will not be accessed\
at all.

If a client now executes the statement `INSERT INTO tbl VALUES (...)`, the\
cached value for the `SELECT` statement above and all other statements that are\
dependent upon `tbl` will be invalidated. That is, the next time someone executes\
the statement `SELECT * FROM tbl` the result will again be fetched from the\
MariaDB server and stored to the cache.

However, suppose some client executes the statement `SELECT COUNT(*) FROM tbl`\
at the same time someone else executes the `INSERT ...` statement. A possible\
chain of events is as follows:

```
Timeline 1                 Timeline 2

Clients execute       INSERT ...                 SELECT COUNT(*) FROM tbl
MaxScale -> DB                                   SELECT COUNT(*) FROM tbl
MaxScale -> DB        INSERT ...
```

That is, the `SELECT` is performed in the database server *before* the`INSERT`. However, since the timelines are proceeding independently of\
each other, the events may be re-ordered as far as the cache is concerned.

```
MaxScale -> Cache     Delete invalidated values
MaxScale -> Cache                                Store result and invalidation key
```

That is, the cached value for `SELECT COUNT(*) FROM tbl` will reflect the\
situation *before* the insert and will thus not be correct.

The stale result will be returned until the value has reached its *time-to-live*\
or its invalidation is caused by some update operation.

### Configuration

The cache is simple to add to any existing service. However, some experimentation\
may be required in order to find the configuration settings that provide\
the maximum benefit.

```
[Cache]
type=filter
module=cache
hard_ttl=30
soft_ttl=20
rules=...
...

[Cached-Routing-Service]
type=service
...
filters=Cache
```

Each configured cache filter uses a storage of its own. That is, if there\
are two services, each configured with a specific cache filter, then,\
even if queries target the very same servers the cached data will not\
be shared.

Two services can use the same cache filter, but then either the services\
should use the very same servers *or* a completely different set of servers,\
where the used table names are different. Otherwise there can be unintended\
sharing.

### Settings

The cache filter has no mandatory parameters but a range of optional ones.\
Note that it is advisable to specify `max_size` to prevent the cache from\
using up all memory there is, in case there is very little overlap among the\
queries.

#### `storage`

* Type: string
* Mandatory: No
* Dynamic: No
* Default: `storage_inmemory`

The name of the module that provides the storage for the cache. That\
module will be loaded and provided with the value of `storage_options` as\
argument. For instance:

```
storage=storage_redis
```

See [Storage](#storage-1) for what storage modules are available.

#### `storage_options`

* Type: string
* Mandatory: No
* Dynamic: No
* Default:

**NOTE** Deprecated in 23.02.

A string that is provided verbatim to the storage module specified in `storage`,\
when the module is loaded. Note that the needed arguments and their format depend\
upon the specific module.

From 23.02 onwards, the storage module configuration should be provided using\
nested parameters.

#### `hard_ttl`

* Type: [duration](https://mariadb.com/docs/maxscale/maxscale-archive/archive/mariadb-maxscale-25-01/mariadb-maxscale-25-01-getting-started/mariadb-maxscale-2501-maxscale-2501-mariadb-maxscale-configuration-guide)
* Mandatory: No
* Dynamic: No
* Default: `0s` (no limit)

*Hard time to live*; the maximum amount of time the cached\
result is used before it is discarded and the result is fetched from the\
backend (and cached). See also [soft\_ttl](#soft_ttl).

```
hard_ttl=60s
```

#### `soft_ttl`

* Type: [duration](https://mariadb.com/docs/maxscale/maxscale-archive/archive/mariadb-maxscale-25-01/mariadb-maxscale-25-01-getting-started/mariadb-maxscale-2501-maxscale-2501-mariadb-maxscale-configuration-guide)
* Mandatory: No
* Dynamic: No
* Default: `0s` (no limit)

*Soft time to live*; the amount of time - in seconds - the cached result is\
used before it is refreshed from the server. When `soft_ttl` has passed, the\
result will be refreshed when the *first* client requests the value.

However, as long as [hard\_ttl](#hard_ttl) has not passed, *all* other clients\
requesting the same value will use the result from the cache while it is being\
fetched from the backend. That is, as long as `soft_ttl` but not `hard_ttl`\
has passed, even if several clients request the same value at the same time,\
there will be just one request to the backend.

```
soft_ttl=60s
```

If the value of `soft_ttl` is larger than `hard_ttl` it will be adjusted\
down to the same value.

#### `max_resultset_rows`

* Type: count
* Mandatory: No
* Dynamic: No
* Default: `0` (no limit)

Specifies the maximum number of rows a resultset can have in order to be\
stored in the cache. A resultset larger than this, will not be stored.

```
max_resultset_rows=1000
```

#### `max_resultset_size`

* Type: [size](https://mariadb.com/docs/maxscale/maxscale-archive/archive/mariadb-maxscale-25-01/mariadb-maxscale-25-01-getting-started/mariadb-maxscale-2501-maxscale-2501-mariadb-maxscale-configuration-guide)
* Mandatory: No
* Dynamic: No
* Default: `0` (no limit)

Specifies the maximum size of a resultset, for it to be stored in the cache.\
A resultset larger than this, will not be stored. The size can be specified\
as described [here](https://mariadb.com/docs/maxscale/maxscale-archive/archive/mariadb-maxscale-25-01/mariadb-maxscale-25-01-getting-started/mariadb-maxscale-2501-maxscale-2501-mariadb-maxscale-configuration-guide).

```
max_resultset_size=128Ki
```

Note that the value of `max_resultset_size` should not be larger than the\
value of `max_size`.

#### `max_count`

* Type: count
* Mandatory: No
* Dynamic: No
* Default: `0` (no limit)

The maximum number of items the cache may contain. If the limit has been\
reached and a new item should be stored, then an older item will be evicted.

Note that if `cached_data` is `thread_specific` then this limit will be\
applied to each cache *separately*. That is, if a thread specific cache\
is used, then the total number of cached items is #threads \* the value\
of `max_count`.

```
max_count=1000
```

#### `max_size`

* Type: [size](https://mariadb.com/docs/maxscale/maxscale-archive/archive/mariadb-maxscale-25-01/mariadb-maxscale-25-01-getting-started/mariadb-maxscale-2501-maxscale-2501-mariadb-maxscale-configuration-guide)
* Mandatory: No
* Dynamic: No
* Default: `0` (no limit)

The maximum size the cache may occupy. If the limit has been reached and a new\
item should be stored, then some older item(s) will be evicted to make space.

Note that if `cached_data` is `thread_specific` then this limit will be\
applied to each cache *separately*. That is, if a thread specific cache\
is used, then the total size is #threads \* the value of `max_size`.

```
max_size=100Mi
```

#### `rules`

* Type: path
* Mandatory: No
* Dynamic: Yes
* Default: `""` (no rules)

Specifies the path of the file where the caching rules are stored. A relative\
path is interpreted relative to the *data directory* of MariaDB MaxScale.

```
rules=/path/to/rules-file
```

Note that the rules will be reloaded, and applied if different, every time\
a dynamic configuration change is made. Thus, to cause a reloading of the\
rules, alter the rules parameter to the same value it has.

```
maxctrl alter filter MyCache rules='/path/to/rules-file'
```

#### `cached_data`

* Type: [enum](https://mariadb.com/docs/maxscale/maxscale-archive/archive/mariadb-maxscale-25-01/mariadb-maxscale-25-01-getting-started/mariadb-maxscale-2501-maxscale-2501-mariadb-maxscale-configuration-guide)
* Mandatory: No
* Dynamic: No
* Values: `shared`, `thread_specific`
* Default: `thread_specific`

An enumeration option specifying how data is shared between threads. The\
allowed values are:

* `shared`: The cached data is shared between threads. On the one hand\
  it implies that there will be synchronization between threads, on\
  the other hand that all threads will use data fetched by any thread.
* `thread_specific`: The cached data is specific to a thread. On the\
  one hand it implies that no synchronization is needed between threads,\
  on the other hand that the very same data may be fetched and stored\
  multiple times.

```
cached_data=shared
```

Default is `thread_specific`. See `max_count` and `max_size` what implication\
changing this setting to `shared` has.

#### `selects`

* Type: [enum](https://mariadb.com/docs/maxscale/maxscale-archive/archive/mariadb-maxscale-25-01/mariadb-maxscale-25-01-getting-started/mariadb-maxscale-2501-maxscale-2501-mariadb-maxscale-configuration-guide)
* Mandatory: No
* Dynamic: Yes
* Values: `assume_cacheable`, `verify_cacheable`
* Default: `assume_cacheable`

An enumeration option specifying what approach the cache should take with\
respect to `SELECT` statements. The allowed values are:

* `assume_cacheable`: The cache can assume that all `SELECT` statements,\
  without exceptions, are cacheable.
* `verify_cacheable`: The cache can not assume that all `SELECT`\
  statements are cacheable, but must verify that.

```
selects=verify_cacheable
```

Default is `assume_cacheable`. In this case, all `SELECT` statements are\
assumed to be cacheable and will be parsed *only* if some specific rule\
requires that.

If `verify_cacheable` is specified, then all `SELECT` statements will be\
parsed and only those that are safe for caching - e.g. do *not* call any\
non-cacheable functions or access any non-cacheable variables - will be\
subject to caching.

If `verify_cacheable` has been specified, the cache will not be used in\
the following circumstances:

* The `SELECT` uses any of the following functions: `BENCHMARK`,`CONNECTION_ID`, `CONVERT_TZ`, `CURDATE`, `CURRENT_DATE`, `CURRENT_TIMESTAMP`,`CURTIME`, `DATABASE`, `ENCRYPT`, `FOUND_ROWS`, `GET_LOCK`, `IS_FREE_LOCK`,`IS_USED_LOCK`, `LAST_INSERT_ID`, `LOAD_FILE`, `LOCALTIME`, `LOCALTIMESTAMP`,`MASTER_POS_WAIT`, `NOW`, `RAND`, `RELEASE_LOCK`, `SESSION_USER`, `SLEEP`,`SYSDATE`, `SYSTEM_USER`, `UNIX_TIMESTAMP`, `USER`, `UUID`, `UUID_SHORT`.
* The `SELECT` accesses any of the following fields: `CURRENT_DATE`,`CURRENT_TIMESTAMP`, `LOCALTIME`, `LOCALTIMESTAMP`
* The `SELECT` uses system or user variables.

Note that parsing all `SELECT` statements carries a performance\
cost. Please read [performance](#performance) for more details.

#### `cache_in_transactions`

* Type: [enum](https://mariadb.com/docs/maxscale/maxscale-archive/archive/mariadb-maxscale-25-01/mariadb-maxscale-25-01-getting-started/mariadb-maxscale-2501-maxscale-2501-mariadb-maxscale-configuration-guide)
* Mandatory: No
* Dynamic: No
* Values: `never`, `read_only_transactions`, `all_transactions`
* Default: `all_transactions`

An enumeration option specifying how the cache should behave when there\
are active transactions:

* `never`: When there is an active transaction, no data will be returned\
  from the cache, but all requests will always be sent to the backend.\
  The cache will be populated inside explicitly read-only transactions.\
  Inside transactions that are not explicitly read-only, the cache will\
  be populated until the first non-SELECT statement.
* `read_only_transactions`: The cache will be used and populated inside\
  explicitly read-only transactions. Inside transactions that are not\
  explicitly read-only, the cache will be populated, but not used\
  until the first non-SELECT statement.
* `all_transactions`: The cache will be used and populated inside\
  explicitly read-only transactions. Inside transactions that are not\
  explicitly read-only, the cache will be used and populated until the\
  first non-SELECT statement.

```
cache_in_transactions=never
```

Default is `all_transactions`.

The values `read_only_transactions` and `all_transactions` have roughly the\
same effect as changing the isolation level of the backend to `read_committed`.

#### `debug`

* Type: number
* Mandatory: No
* Dynamic: Yes
* Default: `0`

An integer value, using which the level of debug logging made by the cache\
can be controlled. The value is actually a bitfield with different bits\
denoting different logging.

* `0` (`0b00000`) No logging is made.
* `1` (`0b00001`) A matching rule is logged.
* `2` (`0b00010`) A non-matching rule is logged.
* `4` (`0b00100`) A decision to use data from the cache is logged.
* `8` (`0b01000`) A decision not to use data from the cache is logged.
* `16` (`0b10000`) Higher level decisions are logged.

Default is `0`. To log everything, give `debug` a value of `31`.

```
debug=31
```

#### `enabled`

* Type: [boolean](https://mariadb.com/docs/maxscale/maxscale-archive/archive/mariadb-maxscale-25-01/mariadb-maxscale-25-01-getting-started/mariadb-maxscale-2501-maxscale-2501-mariadb-maxscale-configuration-guide)
* Mandatory: No
* Dynamic: No
* Default: `true`

Specifies whether the cache is initially enabled or disabled.

```
enabled=false
```

The value affects the initial state of the MaxScale user\
variables using which the behaviour of the cache can be modified\
at runtime. Please see [Runtime Configuration](#runtime-configuration)\
for details.

#### `invalidate`

* Type: [enum](https://mariadb.com/docs/maxscale/maxscale-archive/archive/mariadb-maxscale-25-01/mariadb-maxscale-25-01-getting-started/mariadb-maxscale-2501-maxscale-2501-mariadb-maxscale-configuration-guide)
* Mandatory: No
* Dynamic: No
* Values: `never`, `current`
* Default: `never`

An enumeration option specifying how the cache should invalidate\
cache entries.

```
* `never`: No invalidation is performed. This is the default.
* `current`: When a modification is made, entries in the cache used by
  the current session are invalidated. Other sessions that use the same
  cache will also be affected, but sessions that use another cache will
  not.
```

The effect of `current` depends upon the value of `cached_data`. If the value\
is `shared`, that is, all threads share the same cache, then the effect of an\
invalidation is immediately visible to all sessions, as there is just one cache.\
However, if the value is `thread_specific`, then an invalidation will affect only\
the cache that the session happens to be using.

If it is important and *sufficient* that an application immediately sees a change\
that it itself has caused, then a combination of `invalidate=current`\
and `cached_data=thread_specific` can be used.

If it is important that an application *immediately* sees all changes, irrespective\
of who has caused them, then a combination of `invalidate=current`\
and `cached_data=shared` *must* be used.

#### `clear_cache_on_parse_errors`

* Type: [boolean](https://mariadb.com/docs/maxscale/maxscale-archive/archive/mariadb-maxscale-25-01/mariadb-maxscale-25-01-getting-started/mariadb-maxscale-2501-maxscale-2501-mariadb-maxscale-configuration-guide)
* Mandatory: No
* Dynamic: No
* Default: `true`

This boolean option specifies how the cache should behave in case of\
parsing errors when invalidation has been enabled.

* `true`: If the cache fails to parse an UPDATE/INSERT/DELETE\
  statement then all cached data will be cleared.
* `false`: A failure to parse an UPDATE/INSERT/DELETE statement\
  is ignored and no invalidation will take place due that statement.

The default value is `true`.

Changing the value to `false` may mean that stale data is returned from\
the cache, if an UPDATE/INSERT/DELETE cannot be parsed and the statement\
affects entries in the cache.

#### `users`

* Type: [enum](https://mariadb.com/docs/maxscale/maxscale-archive/archive/mariadb-maxscale-25-01/mariadb-maxscale-25-01-getting-started/mariadb-maxscale-2501-maxscale-2501-mariadb-maxscale-configuration-guide)
* Mandatory: No
* Dynamic: No
* Values: `mixed`, `isolated`
* Default: `mixed`

An enumeration option specifying how the cache should cache data for\
different users.

```
* `mixed`: The data of different users is stored in the same
  cache. This is the default and may cause that a user can
  access data he should not have access to.
* `isolated`: Each user has a unique cache and there can be
  no unintended sharing.
```

Note that if `isolated` has been specified, then each user will\
conceptually have a cache of his own, which is populated\
independently from each other. That is, if two users make the\
same query, then the data will be fetched twice and also stored\
twice. So, a `isolated` cache will in general use more memory and\
cause more traffic to the backend compared to a `mixed` cache.

#### `timeout`

* Type: [duration](https://mariadb.com/docs/maxscale/maxscale-archive/archive/mariadb-maxscale-25-01/mariadb-maxscale-25-01-getting-started/mariadb-maxscale-2501-maxscale-2501-mariadb-maxscale-configuration-guide)
* Mandatory: No
* Dynamic: No
* Default: `5s`

The *timeout* used when performing operations to distributed storages\
such as *redis* or *memcached*.

```
timeout=7000ms
```

### Runtime Configuration

The cache filter can be configured at runtime by executing SQL commands. If\
there is more than one cache filter in a service, only the first cache filter\
will be able to process the variables. The remaining filters will not see them\
and thus configuring them at runtime is not possible.

#### `@maxscale.cache.populate`

Using the variable `@maxscale.cache.populate` it is possible to specify at\
runtime whether the cache should be populated or not. Its initial value is\
the value of the configuration parameter `enabled`. That is, by default the\
value is `true`.

The purpose of this variable is make it possible for an application to decide\
statement by statement whether the cache should be populated.

```
SET @maxscale.cache.populate=TRUE;
SELECT a, b FROM tbl;
SET @maxscale.cache.populate=FALSE;
SELECT a, b FROM tbl;
```

In the example above, the first `SELECT` will always be sent to the\
server and the result will be cached, provided the actual cache rules\
specifies that it should be. The second `SELECT` may be served from the\
cache, depending on the value of `@maxscale.cache.use` (and the cache\
rules).

The value of `@maxscale.cache.populate` can be queried

```
SELECT @maxscale.cache.populate;
```

but only *after* it has been explicitly set once.

#### `@maxscale.cache.use`

Using the variable `@maxscale.cache.use` it is possible to specify at\
runtime whether the cache should be used or not. Its initial value is\
the value of the configuration parameter `enabled`. That is, by default the\
value is `true`.

The purpose of this variable is make it possible for an application to decide\
statement by statement whether the cache should be used.

```
SET @maxscale.cache.use=TRUE;
SELECT a, b FROM tbl;
SET @maxscale.cache.use=FALSE;
SELECT a, b FROM tbl;
```

The first `SELECT` will be served from the cache, providing the rules\
specify that the statement should be cached, the cache indeed contains\
the result and the date is not stale (as specified by the *TTL*).

If the data is stale, the `SELECT` will be sent to the server **and**\
the cache entry will be updated, irrespective of the value of`@maxscale.cache.populate`.

If `@maxscale.cache.use` is `true` but the result is not found in the\
cache, and the result is subsequently fetched from the server, the\
result will **not** be added to the cache, unless`@maxscale.cache.populate` is also `true`.

The value of `@maxscale.cache.use` can be queried

```
SELECT @maxscale.cache.use;
```

but only after it has explicitly been set once.

#### `@maxscale.cache.soft_ttl`

Using the variable `@maxscale.cache.soft_ttl` it is possible at runtime\
to specify *in seconds* what *soft ttl* should be applied. Its initial\
value is the value of the configuration parameter `soft_ttl`. That is,\
by default the value is 0.

The purpose of this variable is make it possible for an application to decide\
statement by statement what *soft ttl* should be applied.

```
SET @maxscale.cache.soft_ttl=600;
SELECT a, b FROM unimportant;
SET @maxscale.cache.soft_ttl=60;
SELECT c, d FROM important;
```

When data is `SELECT`ed from the unimportant table `unimportant`, the data\
will be returned from the cache provided it is no older than 10 minutes,\
but when data is `SELECT`ed from the important table `important`, the\
data will be returned from the cache provided it is no older than 1 minute.

Note that `@maxscale.cache.hard_ttl` overrules `@maxscale.cache.soft_ttl`\
in the sense that if the former is less that the latter, then *soft ttl*\
will, when used, be adjusted down to the value of *hard ttl*.

The value of `@maxscale.cache.soft_ttl` can be queried

```
SELECT @maxscale.cache.soft_ttl;
```

but only after it has explicitly been set once.

#### `@maxscale.cache.hard_ttl`

Using the variable `@maxscale.cache.hard_ttl` it is possible at runtime\
to specify *in seconds* what *hard ttl* should be applied. Its initial\
value is the value of the configuration parameter `hard_ttl`. That is,\
by default the value is 0.

The purpose of this variable is make it possible for an application to decide\
statement by statement what *hard ttl* should be applied.

Note that as `@maxscale.cache.hard_ttl` overrules `@maxscale.cache.soft_ttl`,\
is is important to ensure that the former is at least as large as the latter\
and for best overall performance that it is larger.

```
SET @maxscale.cache.soft_ttl=600, @maxscale.cache.hard_ttl=610;
SELECT a, b FROM unimportant;
SET @maxscale.cache.soft_ttl=60, @maxscale.cache.hard_ttl=65;
SELECT c, d FROM important;
```

The value of `@maxscale.cache.hard_ttl` can be queried

```
SELECT @maxscale.cache.hard_ttl;
```

but only after it has explicitly been set once.

### Client Driven Caching

With `@maxscale.cache.populate` and `@maxscale.cache.use` is it possible\
to make the caching completely client driven.

Provide no `rules` file, which means that *all* `SELECT` statements are\
subject to caching and that all users receive data from the cache. Set\
the startup mode of the cache to *disabled*.

```
[TheCache]
type=filter
module=cache
enabled=false
```

Now, in order to *mark* statements that should be cached, set`@maxscale.cache.populate` to `true`, and perform those `SELECT`s.

```
SET @maxscale.cache.populate=TRUE;
SELECT a, b FROM tbl1;
SELECT c, d FROM tbl2;
SELECT e, f FROM tbl3;
SET @maxscale.cache.populate=FALSE;
```

Note that those `SELECT`s must return something in order for the\
statement to be *marked* for caching.

After this, the value of `@maxscale.cache.use` will decide whether\
or not the cache is considered.

```
SET @maxscale.cache.use=TRUE;
SELECT a, b FROM tbl1;
SET @maxscale.cache.use=FALSE;
```

With `@maxscale.cache.use` being `true`, the cache is considered\
and the result returned from there, if not stale. If it is stale,\
the result is fetched from the server and the cached entry is updated.

By setting a very long *TTL* it is possible to prevent the cache\
from ever considering an entry to be stale and instead manually\
cause the cache to be updated when needed.

```
UPDATE tbl1 SET a = ...;
SET @maxscale.cache.populate=TRUE;
SELECT a, b FROM tbl1;
SET @maxscale.cache.populate=FALSE;
```

### Threads, Users and Invalidation

What caching approach is used and how different users are treated\
has a significant impact on the behaviour of the cache. In the\
following the implication of different combinations is explained.

| cached\_data/users | mixed                                                                                                                          | isolated                                                                                                                        |
| ------------------ | ------------------------------------------------------------------------------------------------------------------------------ | ------------------------------------------------------------------------------------------------------------------------------- |
| thread\_specific   | No thread contention. Data/work duplicated across threads. May cause unintended sharing.                                       | No thread contention. Data/work duplicated across threads and users. No unintended sharing. Requires the most amount of memory. |
| shared             | Thread contention under high load. No duplicated data/work. May cause unintended sharing. Requires the least amount of memory. | Thread contention under high load. Data/work duplicated across users. No unintended sharing.                                    |

#### Invalidation

Invalidation takes place only in the current cache, so how *visible*\
the invalidation is, depends upon the configuration value of`cached_data`.

**`cached_data=thread_specific`**

The invalidation is visible only to the sessions that are handled by\
the same worker thread where the invalidation occurred. Sessions of the\
same or other users that are handled by different worker threads will\
not see the new value before the TTL causes the value to be refreshed.

**`cache_data=shared`**

The invalidation is immediately visible to all sessions of all users.

### Rules

The caching rules are expressed as a JSON object or as an array of JSON objects.

There are two decisions to be made regarding the caching; in what circumstances\
should data be stored to the cache and in what circumstances should the data in\
the cache be used.

Expressed in JSON this looks as follows

```
{
    store: [ ... ],
    use: [ ... ]
}
```

or, in case an array is used, as

```
[
    {
        store: [ ... ],
        use: [ ... ]
    },
    { ... }
]
```

The `store` field specifies in what circumstances data should be stored to\
the cache and the `use` field specifies in what circumstances the data in\
the cache should be used. In both cases, the value is a JSON array containing\
objects.

If an array of rule objects is specified, then, when looking for a rule that\
matches, the `store` field of each object are evaluated in sequential order\
until a match is found. Then, the `use` field of that object is used when\
deciding whether data in the cache should be used.

#### When to Store

By default, if no rules file have been provided or if the `store` field is\
missing from the object, the results of all queries will be stored to the\
cache, subject to `max_resultset_rows` and `max_resultset_size` cache filter\
parameters.

By providing a `store` field in the JSON object, the decision whether to\
store the result of a particular query to the cache can be controlled in\
a more detailed manner. The decision to cache the results of a query can\
depend upon

* the database,
* the table,
* the column, or
* the query itself.

Each entry in the `store` array is an object containing three fields,

```
{
    "attribute": <string>,
    "op": <string>
    "value": <string>
}
```

where,

* the attribute can be `database`, `table`, `column` or `query`,
* the op can be `=`, `!=`, `like` or `unlike`, and
* the value a string.

If *op* is `=` or `!=` then *value* is used as a string; if it is `like`\
or `unlike`, then *value* is interpreted as a *pcre2* regular expression.\
Note though that if *attribute* is `database`, `table` or `column`, then\
the string is interpreted as a name, where a dot `.` denotes qualification\
or scoping.

The objects in the `store` array are processed in order. If the result\
of a comparison is *true*, no further processing will be made and the\
result of the query in question will be stored to the cache.

If the result of the comparison is *false*, then the next object is\
processed. The process continues until the array is exhausted. If there\
is no match, then the result of the query is not stored to the cache.

Note that as the query itself is used as the key, although the following\
queries

```
SELECT * FROM db1.tbl
```

and

```
USE db1;
SELECT * FROM tbl
```

target the same table and produce the same results, they will be cached\
separately. The same holds for queries like

```
SELECT * FROM tbl WHERE a = 2 AND b = 3;
```

and

```
SELECT * FROM tbl WHERE b = 3 AND a = 2;
```

as well. Although they conceptually are identical, there will be two\
cache entries.

Note that if a column has been specified in a rule, then a statement\
will match *irrespective* of where that particular column appears.\
For instance, if a rule specifies that the result of statements referring\
to the column *a* should be cached, then the following statement will\
match

```
SELECT a FROM tbl;
```

and so will

```
SELECT b FROM tbl WHERE a > 5;
```

**Qualified Names**

When using `=` or `!=` in the rule object in conjunction with `database`,`table` and `column`, the provided string is interpreted as a name, that is,\
dot (`.`) denotes qualification or scope.

In practice that means that if *attribute* is `database` then *value* may\
not contain a dot, if *attribute* is `table` then *value* may contain one\
dot, used for separating the database and table names respectively, and\
if *attribute* is `column` then *value* may contain one or two dots, used\
for separating table and column names, or database, table and column names.

Note that if a qualified name is used as a *value*, then all parts of the\
name must be available for a match. Currently Maria DB MaxScale may not\
always be capable of deducing in what table a particular column is. If\
that is the case, then a value like `tbl.field` may not necessarily\
be a match even if the field is `field` and the table actually is `tbl`.

**Implication of the default database**

If the rules concerns the `database`, then only if the statement refers\
to *no* specific database, will the default database be considered.

**Regexp Matching**

The string used for matching the regular expression contains as much\
information as there is available. For instance, in a situation like

```
USE somedb;
SELECT fld FROM tbl;
```

the string matched against the regular expression will be `somedb.tbl.fld`.

**Examples**

Cache all queries targeting a particular database.

```
{
    "store": [
        {
            "attribute": "database",
            "op": "=",
            "value": "db1"
        }
    ]
}
```

Cache all queries *not* targeting a particular table

```
{
    "store": [
        {
            "attribute": "table",
            "op": "!=",
            "value": "tbl1"
        }
    ]
}
```

That will exclude queries targeting table *tbl1* irrespective of which\
database it is in. To exclude a table in a particular database, specify\
the table name using a qualified name.

```
{
    "store": [
        {
            "attribute": "table",
            "op": "!=",
            "value": "db1.tbl1"
        }
    ]
}
```

Cache all queries containing a WHERE clause

```
{
    "store": [
        {
            "attribute": "query",
            "op": "like",
            "value": ".*WHERE.*"
        }
    ]
}
```

Note that this will actually cause all queries that contain WHERE anywhere,\
to be cached.

#### When to Use

By default, if no rules file have been provided or if the `use` field is\
missing from the object, all users may be returned data from the cache.

By providing a `use` field in the JSON object, the decision whether to use\
data from the cache can be controlled in a more detailed manner. The decision\
to use data from the cache can depend upon

* the user.

Each entry in the `use` array is an object containing three fields,

```
{
    "attribute": <string>,
    "op": <string>
    "value": <string>
}
```

where,

* the attribute can be `user`,
* the op can be `=`, `!=`, `like` or `unlike`, and
* the value a string.

If *op* is `=` or `!=` then *value* is interpreted as a MariaDB account\
string, that is, `%` means indicates wildcard, but if *op* is `like` or`unlike` it is simply assumed *value* is a pcre2 regular expression.

For instance, the following are equivalent:

```
{
    "attribute": "user",
    "op": "=",
    "value": "'bob'@'%'"
}

{
    "attribute": "user",
    "op": "like",
    "value": "bob@.*"
}
```

Note that if *op* is `=` or `!=` then the usual assumptions apply,\
that is, a value of `bob` is equivalent with `'bob'@'%'`. If *like*\
or *unlike* is used, then no assumptions apply, but the string is\
used verbatim as a regular expression.

The objects in the `use` array are processed in order. If the result\
of a comparison is *true*, no further processing will be made and the\
data in the cache will be used, subject to the value of `ttl`.

If the result of the comparison is *false*, then the next object is\
processed. The process continues until the array is exhausted. If there\
is no match, then data in the cache will not be used.

Note that `use` is relevant only if the query is subject to caching,\
that is, if all queries are cached or if a query matches a particular\
rule in the `store` array.

**Examples**

Use data from the cache for all users except `admin` (actually `'admin'@'%'`),\
regardless of what host the `admin` user comes from.

```
{
    "use": [
        {
            "attribute": "user",
            "op": "!=",
            "value": "admin"
        }
    ]
}
```

### Security

As the cache is not aware of grants, unless the cache has been explicitly\
configured who the caching should apply to, the presence of the cache\
may provide a user with access to data he should not have access to.\
Note that the following applies *only* if `users=mixed` has been configured.\
If `users=isolated` has been configured, then there can never be any\
unintended sharing between users.

Suppose there is a table `access` that the user *alice* has access to,\
but the user *bob* does not. If *bob* tries to access the table, he will\
get an error as reply:

```
MySQL [testdb]> select * from access;
ERROR 1142 (42000): SELECT command denied to user 'bob'@'localhost' for table 'access'
```

If we now setup caching for the table, using the simplest possible rules\
file, *bob* will get access to data from the table, provided he executes\
a select identical with one *alice* has executed.

For instance, suppose the rules look as follows:

```
{
    "store": [
        {
            "attribute": "table",
            "op": "=",
            "value": "access"
        }
    ]
}
```

If *alice* now queries the table, she will get the result, which also will\
be cached:

```
MySQL [testdb]> select * from access;
+------+------+
| a    | b    |
+------+------+
|   47 |   11 |
+------+------+
```

If *bob* now executes the very same query, and the result is still in the\
cache, it will be returned to him.

```
MySQL [testdb]> select current_user();
+----------------+
| current_user() |
+----------------+
| bob@127.0.0.1  |
+----------------+
1 row in set (0.00 sec)

MySQL [testdb]> select * from access;
+------+------+
| a    | b    |
+------+------+
|   47 |   11 |
+------+------+
```

That can be prevented, by explicitly declaring in the rules that the caching\
should be applied to *alice* only.

```
{
    "store": [
        {
            "attribute": "table",
            "op": "=",
            "value": "access"
        }
    ],
    "use": [
        {
            "attribute": "user",
            "op": "=",
            "value": "'alice'@'%'"
        }
    ]
}
```

With these rules in place, *bob* is again denied access, since queries\
targeting the table `access` will in his case not be served from the cache.

### Storage

There are two types of storages that can be used; *local* and *shared*.

The only *local* storage implementation is `storage_inmemory` that simply\
stores the cache values in memory. The storage is not persistent and is\
destroyed when MaxScale terminates. Since the storage exists in the MaxScale\
process, it is very fast and provides almost always a performance benefit.

Currently there are two *shared* storages; `storage_memcached` and`storage_redis` that are implemented using [memcached](https://memcached.org/)\
and [redis](https://redis.io/) respectively.

The shared storages are accessed across the network and consequently it is\_not\_ self-evident that their use will provide any performance benefit.\
Namely, irrespective of whether the data is fetched from the cache or from\
the server there will be a network hop and often that network hop is, as far\
as the performance goes, what costs the most.

The presence of a shared cache *may* provide a performance benefit\_if the network between MaxScale and the storage server (memcached or\_\
\&#xNAN;*Redis) is faster than the network between MaxScale and the database*\
\&#xNAN;*server,* if the used SELECT statements are heavy (that is, take a significant\
amount of time) to process for the database server, or

* if the presence of the cache reduces the overall load of an\
  otherwise overloaded database server.

As a general rule a *shared* storage should not be used without first\
assessing its value using a realistic workload.

#### `storage_inmemory`

This simple storage module uses the standard memory allocator for storing\
the cached data.

```
storage=storage_inmemory
```

This storage module takes no arguments.

#### `storage_memcached`

This storage module uses [memcached](https://memcached.org/) for storing the\
cached data.

Multiple MaxScale instances can share the same memcached server and items\
cached by one MaxScale instance will be used by the other. Note that all\
MaxScale instances should have exactly the same configuration, as otherwise\
there can be unintended sharing.

```
storage=storage_memcached
```

`storage_memcache` has the following parameters:

**`server`**

* Type: The Memcached server address specified as `host[:port]`
* Mandatory: Yes
* Dynamic: No

If no port is provided, then the default port `11211` will be used.

**`max_value_size`**

* Type: [size](https://mariadb.com/docs/maxscale/maxscale-archive/archive/mariadb-maxscale-25-01/mariadb-maxscale-25-01-getting-started/mariadb-maxscale-2501-maxscale-2501-mariadb-maxscale-configuration-guide)
* Mandatory: No
* Dynamic: No
* Default: 1Mi

By default, the maximum size of a value stored to memcached is 1MiB, but that\
can be configured to something else, in which case this parameter should be\
set accordingly.

The value of `max_value_size` will be used for capping `max_resultset_size`,\
that is, if memcached has been configured to allow larger values than 1MiB\
but `max_value_size` has not been set accordingly, only resultsets up to 1MiB\
in size will be cached.

**Example**

From MaxScale 23.02 onwards, the storage configuration should be provided\
as nested parameters.

```
[Cache-Filter]
type=filter
module=cache
storage=storage_memcached
storage_memcached.server=192.168.1.31
storage_memcached.max_value_size=10M
```

Although *deprecated* in 23.02, the configuration can also be provided\
using `storage_options`:

```
storage_options="server=192.168.1.31,max_value_size=10M"
```

**Limitations**

* Invalidation is not supported.
* Configuration values given to `max_size` and `max_count` are ignored.

**Security**

*Neither* the data in the memcached server *nor* the traffic between MaxScale and\
the memcached server is encrypted. Consequently, *anybody* with access to the\
memcached server or to the network have access to the cached data.

#### `storage_redis`

This storage module uses [redis](https://redis.io/) for storing the\
cached data.

Note that Redis should be configured with no idle timeout or with a timeout that\
is very large. Otherwise MaxScale may have to repeatedly connect to Redis, which\
will hurt both the functionality and the performance.

Multiple MaxScale instances can share the same redis server and items\
cached by one MaxScale instance will be used by the other. Note that all\
MaxScale instances should have exactly the same configuration, as otherwise\
there can be unintended sharing.

```
storage=storage_redis
```

If `storage_redis` cannot connect to the Redis server, caching will silently\
be disabled and a connection attempt will be made after a [timeout](#timeout)\
interval.

If a timeout error occurs during an operation, reconnecting will be attempted\
after a delay, which will be an increasing multiple of `timeout`. For example,\
if `timeout` is the default 5 seconds, then reconnection attempts will first\
be made after 10 seconds, then after 15 seconds, then 20 and so on. However,\
once 60 seconds have been reached, the delay will no longer be increased but\
the delay will stay at one minute. Note that each time a reconnection attempt\
is made, unless the reason for the timeout has disappeared, the client will be\
stalled for `timeout` seconds.

`storage_redis` has the following parameters:

**`server`**

* Type: The Redis server address specified as `host[:port]`
* Mandatory: Yes
* Dynamic: No

If no port is provided, then the default port `6379` will be used.

**`username`**

* Type: string
* Mandatory: No
* Dynamic: No
* Default: `""`

Please see [authentication](#authentication) for more information.

**`password`**

* Type: string
* Mandatory: No
* Dynamic: No
* Default: `""`

Please see [authentication](#authentication) for more information.

**`ssl`**

* Type: [boolean](https://mariadb.com/docs/maxscale/maxscale-archive/archive/mariadb-maxscale-25-01/mariadb-maxscale-25-01-getting-started/mariadb-maxscale-2501-maxscale-2501-mariadb-maxscale-configuration-guide)
* Mandatory: No
* Dynamic: No
* Default: `false`

Please see [ssl](#ssl-1) for more information.

**`ssl_cert`**

* Type: Path to existing readable file.
* Mandatory: No
* Dynamic: No
* Default: `""`

The SSL client certificate that MaxScale should use with the Redis\
server. The certificate must match the key defined in `ssl_key`.

Please see [ssl](#ssl-1) for more information.

**`ssl_key`**

* Type: Path to existing readable file.
* Mandatory: No
* Dynamic: No
* Default: `""`

The SSL client private key MaxScale should use with the Redis server.

Please see [ssl](#ssl-1) for more information.

**`ssl_ca`**

* Type: Path to existing readable file.
* Mandatory: No
* Dynamic: No
* Default: `""`

The Certificate Authority (CA) certificate for the CA that signed the\
certificate specified with `ssl_cert`.

Please see [ssl](#ssl-1) for more information.

**Authentication**

If `password` is provided, MaxScale will authenticate against Redis when a connection\
has been created. The authentication is performed using the [auth](https://redis.io/commands/auth/) command, with only the `password` as argument,\
if no `username` was provided in the configuration, or `username` and `password` as\
arguments, if both were.

Note that if the authentication is in the Redis configuration file\
specified using `requirepass`, then only the *password* should be provided.\
If the Redis server version is 6 or higher and the *Redis ACL system* is used,\
then both *username* and *password* must be provided.

**SSL**

If `ssl_key`, `ssl_cert` and `ssl_ca` are provided, then SSL/TLS will be used\
in the communication with the Redis server, if `ssl` is set to `true`.

Note that the SSL/TLS support is only available in Redis from version 6\
onwards and that the support is not by default built into Redis, but has\
to be specifically enabled at compile time as explained [here](https://redis.io/docs/management/security/encryption/).

**Example**

From MaxScale 23.02 onwards, the storage configuration should be provided\
as nested parameters.

```
[Cache-Filter]
type=filter
module=cache
storage=storage_redis
storage_redis.server=192.168.1.31
storage_redis.username=hello
storage_redis.password=world
```

Although *deprecated* in 23.02, the configuration can also be provided\
using `storage_options`:

```
storage_options="server=192.168.1.31,username=hello,password=world"
```

**Limitations**

* There is no distinction between soft and hard ttl, but only hard ttl is used.
* Configuration values given to `max_size` and `max_count` are ignored.

**Invalidation**

`storage_redis` supports invalidation, but the caveats documented [here](#best-efforts)\
are of greater significance since also the communication between the cache and the\
cache storage is asynchronous and takes place over the network.

*NOTE* If invalidation is turned on after caching has been used (in non-invalidation\
mode), redis must be flushed as otherwise there will be entries in the cache that\
will not be affected by the invalidation.

```
$ redis-cli flushall
```

**Security**

The data in the redis server is *not* encrypted. Consequently, *anybody* with\
access to the redis server has access to the cached data.

Unless [SSL](#ssl) has been enabled, *anybody* with access to the network has\
access to the cached data.

### Example

In the following we define a cache *MyCache* that uses the cache storage module`storage_inmemory` and whose *soft ttl* is `30` seconds and whose *hard ttl* is`45` seconds. The cached data is shared between all threads and the maximum size\
of the cached data is `50` mebibytes. The rules for the cache are in the file`cache_rules.json`.

#### Configuration

```
[MyCache]
type=filter
module=cache
storage=storage_inmemory
soft_ttl=30
hard_ttl=45
cached_data=shared
max_size=50Mi
rules=cache_rules.json

[MyService]
type=service
...
filters=MyCache
```

#### `cache_rules.json`

The rules specify that the data of the table `sbtest` should be cached.

```
{
    "store": [
        {
            "attribute": "table",
            "op": "=",
            "value": "sbtest"
        }
    ]
}
```

### Performance

When the cache filter was introduced, the most significant factor affecting\
the performance of the cache was whether the statements needed to be parsed.\
Initially, all statements were parsed in order to exclude `SELECT` statements\
that use non-cacheable functions, access non-cacheable variables or refer\
to system or user variables. Later, the default value of the `selects` parameter\
was changed to `assume_cacheable`, to maximize the default performance.

With the default configuration, the cache itself will not cause the statements\
to be parsed. However, even with `assume_cacheable` configured, a rule referring\
specifically to a *database*, *table* or *column* will still cause the\
statement to be parsed.

For instance, a simple rule like

```
{
    "store": [
        {
            "attribute": "database",
            "op": "=",
            "value": "db1"
        }
    ]
}
```

cannot be fulfilled without parsing the statement.

If the rule is instead expressed using a regular expression

```
{
    "store": [
        {
            "attribute": "query",
            "op": "like",
            "value": "FROM db1\\..*"
        }
    ]
}
```

then the statement will not be parsed.

However, when the [query classifier cache](https://mariadb.com/docs/maxscale/maxscale-archive/archive/mariadb-maxscale-25-01/mariadb-maxscale-25-01-getting-started/mariadb-maxscale-2501-maxscale-2501-mariadb-maxscale-configuration-guide)\
was introduced, the parsing cost was significantly reduced and\
currently the cost for parsing and regular expression matching\
is roughly the same.

In the following is a table with numbers giving a rough picture\
of the relative cost of different approaches.

In the table, *regexp match* means that the cacheable statements\
were picked out using a rule like

```
{
    "attribute": "query",
    "op": "unlike",
    "value": "FROM nomatch"
}
```

while *exact match* means that the cacheable statements were picked out using a\
rule like

```
{
    "attribute": "database",
    "op": "!=",
    "value": "nomatch"
}
```

The exact match rule requires all statements to be parsed.

As the purpose of the test is to illustrate the overhead of\
different approaches, the rules were formulated so that\
all SELECT statements would match.

Note that these figures were obtained by running sysbench,\
MaxScale and the server in the same computer, so they are\
only indicative.

| selects           | Rule         | qps |
| ----------------- | ------------ | --- |
| assume\_cacheable | none         | 100 |
| assume\_cacheable | regexp match | 83  |
| assume\_cacheable | exact match  | 83  |
| verify\_cacheable | none         | 80  |
| verify\_cacheable | regexp match | 80  |
| verify\_cacheable | exact match  | 80  |

For comparison, without caching, the qps is `33`.

As can be seen, due to the query classifier cache there is\
no difference between exact and regex based matching.

#### Summary

For maximum performance:

* Arrange the situation so that the default `selects=assume_cacheable`\
  can be used, and use no rules.

Otherwise it is mostly a personal preference whether exact or regex\
based rules are used. However, one should always test with real data\
and real queries before choosing one over the other.

<sub>*This page is licensed: CC BY-SA / Gnu FDL*</sub>

{% @marketo/form formId="4316" %}
