What is Flexviews?
Flexviews is a materialized views implementation for MariaDB, MySQL and its forks. It includes a simple SQL API that is used to create materialized views and to refresh them. The advantage of using Flexviews is that the materialized views are incrementally refreshed, that is, the views are updated efficiently by using special logs which record the changes to database tables. Flexviews includes tools which create and maintain these logs. The views created by Flexviews include support for
JOINs and for all major aggregate functions.
Flexviews works well with Shard-Query, that can be used to shard the source data, making the refresh process faster.
What is a materialized view?
A "regular" view is a virtual table representing the result of a database query. Each time the view is accessed, the RDBMS must run the query to produce the result set for the view.
A materialized view is similar to regular view, except that the results are stored into an actual database table, not a virtual one. The result set is effectively cached for a period of time. When the underlying data changes the view becomes stale. Because of this, materialized views must be frequently "refreshed" to bring them up-to-date.
By caching result set data into a real table, the data can be accessed much more quickly, but at the cost of some data being potentially out-of-date. It is most useful in data warehousing scenarios, where frequent queries of the actual base tables can be extremely expensive. Since the materialized view is stored as a real table, you can build indexes on any combination of columns of the view, enabling drastic speedups in response time for queries that access the view.
MariaDB and MySQL do not support materialized views natively. Flexviews creates them as regular tables and keeps them up to date.
The swanhart-tools repository, on GitHub, contains tools for MariaDB and MySQL from Justin Swanhart. Flexviews is located in the
flexviews directory. The whole project can be downloaded via the GitHub's Download button. It is possible to report bugs using the Issues tab.
A project on Google Code also exists, but it is obsolete. All bugs from Google Code and all downloads from Google Code are deprecated and will be removed.
A Facebook page is also available.
The following articles are also useful:
- Flexviews @ MySQL Performance Blog
- OurSQL Episode 179 contains an interview with Justin Swanhart on Flexviews
- MariaDB or MySQL, version 5.1+
- PHP 5.2+ required, 5.3+ is recommended (for FlexCDC)
- pcntl extension
MariaDB required settings:
- Row Level Binary logging (binlog_format = ROW in my.cnf)
- server_id set to unique value (server_id = 999 in my.cnf)
- SUPER privileges
MariaDB suggested settings:
Materialized views support two refresh methods:
INCREMENTAL refresh is preferable when it can be used. Incrementally refreshing the view requires examining only the rows which changed since the view was last refreshed. Flexviews must be instructed about the calculations that it must perform on the data, and this can be done via the SQL_API or via
convert.php (see below). However, not all SQL expressions are supported.
INCREMENTAL refresh requires the FlexCDC utility, that examines the binary log and records the changes into tables called change logs (one for each materialized view). An SQL event called apply_views periodically applies these changes to the materialized views. This can also be applied manually by calling the refresh_all() stored procedure, or refresh().
COMPLETE refresh is slower, because it rebuild the data in the materialized view each time. The main reason for using this method is that the materialized views can be based on any SQL statement. Also, FlexCDC is not needed for
COMPLETE refresh. An SQL event periodically rebuilds the materialized views.
The materialized views that use the
COMPLETE refresh method can be based on any
SELECT statement, as long as it is deterministic. However, for the
INCREMENTAL method, materialized view's definitions only support the following clauses:
- All non-deterministic expressions.
Unsupported clauses are:
- Subqueries/Derived tables
- All deterministic expressions
HAVING, you can use a
WHERE when querying the materialize view. And since
ORDER BY cannot be used in a materialized view's definition, it should be used while querying the view.
Interfaces and tools
An SQL API provides the stored routines can be used to create, define and enable materialized views.
The API has a documentation automatically-generated with ROBODoc, located in the
Many routines produce a warning for non-fatal errors. If the
@fv_force user-variable is set exactly to
TRUE, such warnings are turned into errors.
MariaDB starting with 5.5
MariaDB until 5.5
On older versions, the only significative property is the error message.
convert.php script can be used via the command line to create and enable
INCREMENTAL materialized views without directly calling the SQL API. The syntax is:
php convert.php [schema] < file_name
schema, if specified, is the name of the default database. The specified schema is used for:
- The location into which to place the materialized view(s).
- The location to use for tables that are not explicitly prefixed with a schema name in the
FROMclause of the SELECT statement.
file_name is a file which contains
CREATE TABLE ... SELECT and
INSERT ... SELECT statements, which will be translated to SQL API calls. Each statement must be terminated with a semicolon (
Flexviews includes FlexCDC - a Change Data Capture utility which makes captures table changes to table change logs, making incremental refreshing of views possible. It is also called the consumer.
Data changes are written by the server to the binary log. FlexCDC detects such changes and writes them to log tables. Log tables are stored in the
flexviews database and contain the changes that need to be applied to bring the materialized views up to date.
This image shows how FlexCDC works:
- FlexCDC invokes
mysqlbinlogin an external process with the commandline options
--base64-output=decode-rows -v. This instructs the utility to present RBR base64 entries as an easily readable SBR notation.
mysqlbinlogconnects to MySQL and asks for binary logs. The output from
mysqlbinlogis captured by FlexCDC and processed further.
- Instead of applying the actual changes, FlexCDC records the changes into log tables. FlexCDC assigns a unique monotonically increasing transaction id to each set of changes. FlexCDC inserts each set changes into one or more table changelogs (one per changed table) in a single transaction.
FlexCDC is written in PHP and is located in the
consumer directory. By default, it reads a configuration file (
consumer.ini) located in the working directory. There are three required .ini sections:
[dest]. See the
consumer.ini.example file for an example and settings descriptions.
After changing the configuration file,
setup_flexcdc.php can be used to create the metadata tables. If the script detects a problems, it exists with an error.
FlexCDC can be executed via
run_consumer.php. A shell script called
consumer_safe.sh is also avaible. It restarts FlexCDC in case it crashes. It is still possible to stop FlexCDC by sending a HUP signal.
consumer_safe.sh creates a .pid file, which tells the number of the thread that can receive the HUP.
FlexCDC only works on Unix-like systems.