A transaction duration tracking audit plugin for MariaDB and MySQL

Stopwatch iconThis plugin logs long running transactions to the server error log when transaction took more than a configurable number of seconds.

https://github.com/hholzgra/transaction_time_audit

Motivation

A customer was looking for a solution to track down long running transactions. After discussing some alternatives we decided that an Audit Plugin was the best approach towards this.

An Audit Plugin has the advantage that it:

  • Is executed within the mysqld server. Its access to server internals via the official API is limited, but it turned out that the existing API provided all the building blocks we needed for a basic implementation:
    • the full stream of executed queries is visible
    • current transaction ID is visible via the thd_get_xid() API function
  • doesn’t require special system privileges
  • doesn’t require an extra process
  • doesn’t add extra network protocol latencies
  • no need to parse SQL queries as transaction transitions are visible via thd_get_xid() changes
  • all explicit and implicit commits and rollbacks are exposed this way

Alternatives

We discussed the following alternatives, but turned down all of them due to their disadvantages compared to an audit plugin based solution:

MySlowTranCapture

MySlowTranCapture (by Yoshinori Matsunobu) [1] captures TCP traffic and parses the MySQL network protocol stream to identify transactions.

The problems with this approach are:

  • the necessary privileges to capture the protocol stream in the first place (running as root or e.g. having the CAP_NET_RAW privilege on Linux)
  • the extra parser overhead required to analyze the protocol stream
  • by parsing the protocol stream transaction transitions are only identified by explicit BEGIN, COMMIT and ROLLBACK statements, implicit commits and rollbacks can go unnoticed

Implicit rollbacks may happen due to errors and implicit commits are caused by quite a few different statements in MySQL [2]

While it would be possible to extend the protocol parser to catch all these extra cases it would be a lot of work to do so, and would require working against a moving target.

PERFORMANCE_SCHEMA

The events_transactions_% tables [3] that were added in MySQL 5.7.3 provide insight in transactions and amongst other things their start and end time.

Disadvantages:

  • not available in any GA release yet
  • active polling is needed to create a log of long running transactions
  • logging actual queries performed by these transactions might be tricky …

Use a proxy

A proxy solution like e.g. “MySQL Proxy” [4] or “MaxScale for MariaDB & MySQL” [5] could be used to analyze protocol traffic between applications and the mysqld server.

Disadvantages:

  • same problem with identifying transaction transitions as with MySlowTranCapture above
  • adding latency due to an extra network hop and processing time within the proxy

Current status

The plugin in its current form only provides basic transaction duration monitoring. The start time of the current transaction, the relative time since it started and the number of queries executed so far are visible in SHOW SESSION STATUS as transaction_time_start, transaction_time_span and transaction_time_queries.

The threshold after which a long running transaction is reported in the errror log can be set via the configuration variable transaction_time_limit. It defaults to a value of 10 seconds and can be set on the mysqld command line, in the servers option file (my.cnf / my.ini) or with SET GLOBAL or SET SESSION at runtime.

The output format in the error log currently looks like this:

140909 16:24:07 [note] long transaction: user: 'root[root] @ localhost [127.0.0.1]', time taken: 59, queries: 8
  insert into t1 values(1)
  insert into t1 values(2)
[... 3 more queries ...]
  insert into t1 values(6)
  insert into t1 values(7)
  commit

The number of queries listed is currently hardcoded to five, and newlines in the query text may scew up formating.

Future plans

  • logging into a separate log file, similar to the slow query log
  • retrieve certain values from SESSION STATUS, like number of rows read/modified
  • retrieve connection annotation information for log purposes
  • …

Implementation details

If you are not interested in actual C code etc. you can probably stop reading here.

System variables

This plugin defines two system variables:

transaction_time_limitThis is the transaction duration limit (in seconds) after which a transaction is logged to the server error log. The variable can be set via command line option, global options file, or via SET GLOBAL or SET SESSION.

transaction_limit_session_dataThis is an internal variable declared with PLUGIN_VAR_READONLY, PLUGIN_VAR_NOCMDOPT and PLUGIN_VAR_NOSYSVAR flags so that it is not settable, not shown as a command line option and not part of the SHOW VARIABLES output either. Its only purpose is to store a pointer to a structure with actual internal per-session information like transaction start time and current transactions MYSQL_XID.

/* log long running transactions that took more than this many seconds */
static MYSQL_THDVAR_ULONGLONG(limit,
                              PLUGIN_VAR_RQCMDARG,
                              "log transactions that took longer than this many seconds",
                              /* check */ NULL,
                              /* update */ NULL,
                              /* default*/ 10,
                              /* min */ 0,
                              /* max */ ULLONG_MAX,
                              /* blocksize */ 1
                              );

/* this is an internal per-connection structure only,
 so NOCMDOPT and NOSYSVAR hide it in --help and
 SHOW VARIABLES output */

static MYSQL_THDVAR_ULONGLONG(session_data,
                              PLUGIN_VAR_READONLY | PLUGIN_VAR_NOCMDOPT | PLUGIN_VAR_NOSYSVAR,
                              "private structure pointer",
                              /* check */ NULL,
                              /* update */ NULL,
                              /* default*/ 0,
                              /* min */ 0,
                              /* max */ ULLONG_MAX,
                              /* blocksize */ 1
                              );

/* register session variables declared above */
static struct st_mysql_sys_var* audit_plugin_sysvars[] = {
  MYSQL_SYSVAR(limit),
  MYSQL_SYSVAR(session_data),
  NULL
};

The private my_vars structure is allocated when receiving a MYSQL_AUDIT_CONNECTION_CONNECT event and freed on MYSQL_AUDIT_CONNECTION_DISCONNECT. It is defined as:

/* private session data structure */
struct my_vars {
  MYSQL_XID xid;                    /* previous events XID */
  char *user;                       /* session user */
  time_t start_time;                /* start time of current transaction */
  unsigned long long query_counter; /* simple query counter counting MYSQL_AUDIT_GENERAL_LOG events */
  char **queries;                   /* store log of actual queries here */
};

and a pointer to it is stored in transaction_time_session_data.

To ease access to its elements an additional convenience macro SESSION_VAR() is defined as:

#define SESSION_VAR(THD, VAR) (((struct my_vars *)THDVAR(THD, session_data))->VAR)

so e.g. the start_time element can easily be accessed and assigned to using SESSION_VAR(thd, start_time)

Status variables

transaction_time_start Start time of the current active transaction, or zero if not currently in a transaction.

transaction_time_span Time since the start of the current transaction in seconds.

transaction_time_queries Number of queries executed in the current transaction so far.

As these are per session values we need to implement them using SHOW_FUNC callbacks:

/* SHOW_FUNC callback function that returns time the transaction started */
static int show_transaction_start(MYSQL_THD thd,
                                      struct st_mysql_show_var* var,
                                      char *buff)
{
  time_t t = (time_t)SESSION_VAR(thd, start_time);
  struct tm lt = *localtime(&t);

  sprintf(buff, "%4d-%02d-%02d %02d:%02d:%02d",
          lt.tm_year + 1900, lt.tm_mon + 1, lt.tm_mday,
          lt.tm_hour, lt.tm_min, lt.tm_sec);

  var->type  = SHOW_CHAR;
  var->value = buff;

  return 0;
}

/* SHOW_FUNC callback function that returns time elapsed since session start */
static int show_transaction_time(MYSQL_THD thd,
                                      struct st_mysql_show_var* var,
                                      char *buff)
{
  unsigned long long *result = (unsigned long long *)buff;

  var->type  = SHOW_LONGLONG;
  var->value = buff;

  if (SESSION_VAR(thd, start_time)) {
    *result = (unsigned long long)(time(NULL) - SESSION_VAR(thd, start_time));
  } else {
    *result = 0;
  }

  return 0;
}

/* SHOW_FUNC callback function that returns queries counted */
static int show_transaction_query_count(MYSQL_THD thd,
                                      struct st_mysql_show_var* var,
                                      char *buff)
{
  unsigned long long *result = (unsigned long long *)buff;

  var->type  = SHOW_LONGLONG;
  var->value = buff;

        *result = SESSION_VAR(thd, query_counter);

  return 0;
}

/* register SHOW STATUS variables and callbacks */
static struct st_mysql_show_var audit_plugin_statvars[]=
  {
    {"transaction_time_start", (char *)&show_transaction_start, SHOW_FUNC},
    {"transaction_time_span", (char *)&show_transaction_time, SHOW_FUNC},
    {NULL, NULL, SHOW_UNDEF}
  };

Identifying transaction transitions

The Plugin API provides a function thd_get_xid() to retrieve the current transactions MYSQL_XID identifier. The MYSQL_XID type is defined in plugin.h as:

  struct st_mysql_xid {
    long formatID;
    long gtrid_length;
    long bqual_length;
    char data[MYSQL_XIDDATASIZE];  /* Not -terminated */
  };
  typedef struct st_mysql_xid MYSQL_XID;

A formatID of -1 indicates that we’re currently not in an active transaction, otherwise data contains the transaction ID as a binary string of length gtrid_length + bqual_length.

<

So whenever the xid changes from invalid to valid or its data string changes a new transaction has been started. This does not happen right away on BEGIN / START TRANSACTION by the way, but only on the first actual statement after that.

A change of the data string or of formatID to -1 also marks the end of the previously active transaction. This is also the point where we can log transaction duration.

I’m simply checkig for xid changes whenever receiving a MYSQL_AUDIT_GENERAL_CLASS event here, without distinguishing between the different event subclasses as the transaction state can change on any of them.

The full transaction change detection logic looks like this:

MYSQL_XID xid, *old_xid;

  [...]

  thd_get_xid(thd, &xid);
  old_xid = &SESSION_VAR(thd, xid);

  /* a transaction is valid if its formatID is >= 0 */

  if (xid.formatID == -1) {
    /* currently not in a valid transaction */
    if (old_xid->formatID >= 0) {
      /* transition from valid to invalid transaction ID -> transaction ended */
      end_transaction(thd);
    }
  } else {
    /* currently in a valid transaction */
    if (old_xid->formatID == -1) {
      /* previously not in a transaction -> transaction started */
      begin_transaction(thd, event_general);
    } else if (memcmp(old_xid, &xid, sizeof(MYSQL_XID))) {
      /* previously in a valid transaction, too, and transaction ID changed */
      /* old transaction ended, new one started */
      end_transaction(thd);
      begin_transaction(thd, event_general);
    }
    /* else: still in same transaction */
  }

Transaction start

On transaction start we’re for now just storing the current user name (actually user@host) as seen in the current event data and the transaction start_time:

static void begin_transaction(MYSQL_THD thd, const struct mysql_event_general *event)
{
    [...]
    SESSION_VAR(thd, user) = (char *)calloc(1, event->general_user_length + 1);
    memcpy(SESSION_VAR(thd, user), event->general_user, event->general_user_length);

    SESSION_VAR(thd, start_time) = event->general_time;
    [...]
}

Transaction end

On transaction end we’re resetting start_time and some simple loging is done if the transaction took longer than transaction_time_limit seconds:

static void end_transaction(MYSQL_THD thd)
{
  if (THDVAR(thd, session_data) && SESSION_VAR(thd, user)) {
    unsigned long time_taken = (unsigned long)(time(NULL) - SESSION_VAR(thd, start_time));

    if ((THDVAR(thd, limit)) && (time_taken >= THDVAR(thd, limit))) {
      time_t t = time(NULL);
      struct tm lt = *localtime(&t);
      [...]
      fprintf(stderr, "%2.2d%2.2d%2.2d %2.2d:%2.2d:%2.2d [note] long transaction: user: '%s', time taken: %lu, queries: %lun",
              lt.tm_year % 100, lt.tm_mon, lt.tm_mday, lt.tm_hour, lt.tm_min, lt.tm_sec,
              SESSION_VAR(thd, user), time_taken, (unsigned long)SESSION_VAR(thd, query_counter));

    for (i = 0; i < MAX_QUERIES; i++) {
      [...]
    }

  [...]
}

Wishlist

A few things I would have liked to have while implementing this:

  • an easier way for plugins to handle private per-thread data, e.g. as proposed in the patch that Jonas Oreland provided in https://mariadb.atlassian.net/browse/MDEV-6712
  • an easier way to define per-thread STATUS variables (maybe in combination with per-thread data above)
  • additional audit events that get triggered on transaction start and end, including the reason why a transaction ended (COMMIT/ROLLBACK, explicit or implicit) so that transition logic doesn’t need to be reimplemented in plugins and to expose whether a transaction was committed or rolled back (bonus points for telling whether it was explicit or implicit) 

Bibliography