ColumnStore Commit

The COMMIT statement makes changes to a table permanent. You should only commit changes after you have verified the integrity of the changed data. Once data is committed, it cannot be undone with the ROLLBACK statement. To return the database to its former state, you must restore the data from backups.
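A minimal sketch of an explicit transaction (assuming autocommit is disabled for the session and using the customer table from the INSERT examples later on this page):

SET autocommit = 0;
UPDATE customer SET comments = 'Verified' WHERE custno = 12;
-- Check the changed row before making the change permanent.
SELECT custno, comments FROM customer WHERE custno = 12;
COMMIT;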


Reference

The MariaDB ColumnStore Reference is a key guide to its architecture, setup, SQL, and analytics use.

ColumnStore Data Definition Statements

You can use most normal statements from the MariaDB data definition language (DDL) with ColumnStore tables. This section lists DDL that differs for ColumnStore compared to normal MariaDB usage.

ColumnStore Data Manipulation Statements

Learn data manipulation statements for MariaDB ColumnStore. This section covers INSERT, UPDATE, DELETE, and LOAD DATA operations, optimized for efficient handling of large analytical datasets.

ColumnStore Rename Table

The RENAME TABLE statement renames one or more ColumnStore tables.

Notes:

  • You cannot currently use RENAME TABLE to move a table from one database to another.

  • See the ALTER TABLE syntax for an alternative to RENAME TABLE.

The following statement renames the orders table:

RENAME TABLE orders TO customer_order;

The following statement renames both the orders table and customer table:

You may also use RENAME TABLE to swap tables. This example swaps the customer and vendor tables (assuming the temp_table does not already exist):

ColumnStore Operating Mode

ColumnStore has the ability to support full MariaDB query syntax through an operating mode. This operating mode may be set as a default for the instance or set at the session level. To set the operating mode at the session level, the following command is used. Once the session has ended, any subsequent session will return to the default for the instance.

SET infinidb_vtable_mode = n

where n is:

  1. 0 = a generic, highly compatible row-by-row processing mode. Some WHERE clause components can be processed by ColumnStore, but joins are processed entirely by mysqld using a nested-loop join mechanism.

  2. 1 = (the default) query syntax is evaluated by ColumnStore for compatibility with distributed execution and incompatible queries are rejected. Queries executed in this mode take advantage of distributed execution and typically result in higher performance.

  3. 2 = auto-switch mode: ColumnStore will attempt to process the query internally; if it cannot, it will automatically switch the query to run in row-by-row mode.
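For example, a session that needs row-by-row processing for an otherwise incompatible query could switch to it as follows (a sketch using the variable shown above):

SET infinidb_vtable_mode = 0;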

ColumnStore Drop Procedure

The DROP PROCEDURE statement deletes a stored procedure from ColumnStore.

The following statement drops the sp_complex_variable procedure:

DROP PROCEDURE sp_complex_variable;

ColumnStore Non-Distributed Post-Processed Functions

Overview

ColumnStore supports all MariaDB functions that can be used in a post-processing manner where data is returned by ColumnStore first and then MariaDB executes the function on the data returned. The functions are currently supported only in the projection (SELECT) and ORDER BY portions of the SQL statement.

See Also

ColumnStore ALTER VIEW

Alters the definition of a view. CREATE OR REPLACE VIEW may also be used to alter the definition of a view.

Syntax

CREATE
    [OR REPLACE]
    VIEW view_name [(column_list)]
    AS select_statement

ColumnStore Compression Mode

MariaDB ColumnStore has the ability to compress data and this is controlled through a compression mode. This compression mode may be set as a default for the instance or set at the session level.

To set the compression mode at the session level, the following command is used. Once the session has ended, any subsequent session will return to the default for the instance.

set infinidb_compression_type = n

where n is:

  1. 0 = Compression is turned off. Any subsequent table create statements run will have compression turned off for that table unless any statement overrides have been performed. Any ALTER statements run to add a column will have compression turned off for that column unless any statement override has been performed.

  2. 1 = Compression is turned on. Any subsequent table create statements run will have compression turned on for that table unless any statement overrides have been performed. Any ALTER statements run to add a column will have compression turned on for that column unless any statement override has been performed. ColumnStore uses snappy compression in this mode.
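As a sketch, the session default can be combined with the table-level compression comment shown in the CREATE TABLE syntax later on this page (the table and column names here are hypothetical):

SET infinidb_compression_type = 1;
CREATE TABLE compression_example (
  id INTEGER,
  payload VARCHAR(100)
) ENGINE=columnstore COMMENT 'compression=0';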

ColumnStore Rollback

The ROLLBACK statement undoes transactions that have not been permanently saved to the database with the COMMIT statement. You cannot rollback changes to table properties including ALTER, CREATE, or DROP TABLE statements.
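A minimal sketch (assuming autocommit is disabled and using the customer table from the DELETE examples on this page):

START TRANSACTION;
DELETE FROM customer WHERE custkey > 1000;
-- The delete has not been committed yet, so it can still be undone.
ROLLBACK;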

ColumnStore Decimal Math and Scale

ColumnStore Utility Functions

MariaDB ColumnStore utility functions are a set of simple functions that return useful information about the system, such as whether it is ready for queries.

Function
Description

ColumnStore CREATE TABLE

A database consists of tables that store user data. You can create multiple columns with the CREATE TABLE statement. The data type follows the column name when adding columns.

Syntax


ColumnStore CREATE PROCEDURE

Creates a stored routine in ColumnStore.

Syntax

ColumnStore currently accepts definition of stored procedures with only input arguments and a single SELECT query while in Operating Mode = 1 (VTABLE mode). However, while in the Operating Mode = 0 (TABLE mode), ColumnStore will allow additional complex definition of stored procedures (i.e., OUT parameter, declare, cursors, etc.)

See Operating Mode for more information.

The following statements create and call the sp_complex_variable stored procedure:

ColumnStore DELETE

The DELETE statement is used to remove rows from tables.

Syntax

No disk space is recovered after a DELETE. To recover space, use TRUNCATE or DROP PARTITION, or use CREATE TABLE to create a new table loaded with only the remaining rows, drop the original table with DROP TABLE, and then give the new table the original name with RENAME TABLE.

ColumnStore DROP TABLE

The DROP TABLE statement deletes a table from ColumnStore.

Syntax

The RESTRICT clause limits the table to being dropped in the front end only. This could be useful when the table has been dropped on one user module and needs to be synced to others.

The following statement drops the orders table on the front end only:

MariaDB ColumnStore has the ability to change intermediate decimal mathematical results from decimal type to double. The decimal type has approximately 17-18 digits of precision but a smaller maximum range, whereas the double type has approximately 15-16 digits of precision but a much larger maximum range.

In typical mathematical and scientific applications, the ability to avoid overflow in intermediate results with double math is likely more beneficial than the additional two digits of precision. In banking applications, however, it may be more appropriate to leave in the default decimal setting to ensure accuracy to the least significant digit.

Enable/Disable decimal to double math

The columnstore_double_for_decimal_math variable is used to control the data type for intermediate decimal results. This decimal for double math may be set as a default for the instance, set at the session level, or at the statement level by toggling this variable on and off.

To enable or disable decimal-to-double math at the session level, set the columnstore_double_for_decimal_math variable. Once the session has ended, any subsequent session will return to the default for the instance.

where n is:

  • off (disabled, default)

  • on (enabled)
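For example, to enable it for the current session (a sketch using the SET statement shown later on this page):

SET columnstore_double_for_decimal_math = on;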

ColumnStore Decimal Scale

ColumnStore has the ability to support varied internal precision on decimal calculations. columnstore_decimal_scale is used internally by the ColumnStore engine to control how many significant digits to the right of the decimal point are carried through in suboperations on calculated columns. If, while running a query, you receive the message ‘aggregate overflow,’ try reducing columnstore_decimal_scale and running the query again. Note that, as you decrease columnstore_decimal_scale, you may see reduced accuracy in the least significant digit(s) of a returned calculated column.

columnstore_use_decimal_scale is used internally by the ColumnStore engine to turn the use of this internal precision on and off. These two system variables may be set as a default for the instance or set at the session level.

Enable/Disable Decimal Scale

To enable or disable the use of the decimal scale at the session level, set the columnstore_use_decimal_scale variable. Once the session has ended, any subsequent session will return to the default for the instance.

where n is off (disabled) or on (enabled).
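A sketch of enabling decimal scale for the current session (the variable appears in the SET statements later on this page):

SET columnstore_use_decimal_scale = on;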

Set Decimal Scale Level

To set the decimal scale at the session level, set the columnstore_decimal_scale variable. Once the session has ended, any subsequent session will return to the default for the instance.

where n is the amount of precision desired for calculations.
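A sketch, using an illustrative scale of 8 digits:

SET columnstore_decimal_scale = 8;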

Notes
  • ColumnStore tables should not be created in the mysql, information_schema, calpontsys, or test databases.

  • ColumnStore stores all object names in lowercase.

  • CREATE TABLE AS SELECT is not supported and will instead create the table in the default storage engine.

  • Compression level (0 for no compression, 1 for compression) is set at the system level. If a session default exists, it will override the system default. In turn, it can be overridden by the table-level compression comment and, finally, a compression comment at the column level.

  • A table is created in the front end only by using a ‘schema sync only’ comment.

  • The column DEFAULT value can be a maximum of 64 characters.

  • For maximum compatibility with external tools, MariaDB ColumnStore will accept the following table attributes; however, these are not implemented within MariaDB ColumnStore:

    • MIN_ROWS

    • MAX_ROWS

    • AUTO_INCREMENT

All of these are ignored by ColumnStore. The following statement creates a table called "orders" with two columns: orderkey with datatype INTEGER, and customer with datatype VARCHAR:


LIMIT will limit the number of rows deleted, which will perform the DELETE more quickly. The DELETE ... LIMIT statement can then be performed multiple times to achieve the same effect as DELETE with no LIMIT.
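As a sketch, a large delete can be performed in batches by repeating a bounded DELETE until no rows are affected (the predicate and batch size here are illustrative):

DELETE FROM customer WHERE custkey < 5000 LIMIT 10000;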

The following statement deletes customer records with a customer key identification between 1001 and 1999:

DELETE 
 [FROM] tbl_name 
    [WHERE where_condition]
    [ORDER BY ...]
    [LIMIT row_count]
DELETE FROM customer 
  WHERE custkey > 1000 AND custkey <2000
See also
DROP  TABLE [IF EXISTS] 
    tbl_name 
    [RESTRICT ]
DROP TABLE orders RESTRICT;
RENAME TABLE orders TO customer_orders,customer TO customers;
RENAME TABLE customer TO temp_table, vendor TO customer,temp_table to vendor;

mcsSystemReady()

Returns 1 if the system can accept queries, 0 if it's not ready yet.

mcsSystemReadOnly()

Returns 1 if ColumnStore is in a write-suspended mode, that is, a user executed the SuspendDatabaseWrites command. Returns 2 if in a read-only state; ColumnStore puts itself into a read-only state if it detects a logic error that may have corrupted data (generally, this means a ROLLBACK operation failed). Returns 0 if the system is writable.
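For example, both utility functions can be called from a simple SELECT (a sketch):

SELECT mcsSystemReady(), mcsSystemReadOnly();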

CREATE
    [DEFINER = { user | CURRENT_USER }]
    PROCEDURE sp_name ([proc_parameter[,...]])
    [characteristic ...] routine_body

proc_parameter:
    [ IN | OUT | INOUT ] param_name type

type:
    Any valid MariaDB ColumnStore data type

routine_body:
    Valid SQL procedure statement
CREATE PROCEDURE sp_complex_variable(IN arg_key INT, IN arg_date DATE)
  BEGIN
    SELECT *
    FROM lineitem, orders
    WHERE o_custkey < arg_key
    AND l_partkey < 10000
    AND l_shipdate>arg_date
    AND l_orderkey = o_orderkey
    ORDER BY l_orderkey, l_linenumber;
  END

CALL sp_complex_variable(1000, '1998-10-10');
Operating Modes
ColumnStore Distributed Functions

ColumnStore Conditions

A condition is a combination of expressions and operators that return TRUE, FALSE or NULL. The following syntax shows the conditions that can be used to return a TRUE, FALSE, or NULL condition.

Filter

Note: A 'literal' may be a constant (e.g. 3) or an expression that evaluates to a constant [e.g. 100 - (27 * 3)]. For date columns, you may use the SQL 'interval' syntax to perform date arithmetic, as long as all the components of the expression are constants (e.g. '1998-12-01' - interval '1' year).

String Comparisons

ColumnStore, unlike the MyISAM engine, is case sensitive for string comparisons used in filters. For the most accurate results, and to avoid confusing results, make sure string filter constants are no longer than the column width itself.

Pattern Matching

Pattern matching as described with the LIKE condition allows you to use "_" to match any single character and "%" to match an arbitrary number of characters (including zero characters). To test for literal instances of a wildcard character ("%" or "_"), precede it by the "\" character.

OR Processing

OR Processing has the following restrictions:

  • Only column comparisons against a literal are allowed in conjunction with an OR. The following query would be allowed since all comparisons are against literals: SELECT count(*) FROM lineitem WHERE l_partkey < 100 OR l_linestatus = 'F';

  • ColumnStore binds ANDs more tightly than ORs, just like any other SQL parser. Therefore, you must enclose OR relations in parentheses, just as in any other SQL parser.

Table Filter

The following syntax shows the conditions you can use when executing a condition against two columns. Note that the columns must be from the same table.

Join

The following syntax shows the conditions you can use when executing a join on two tables.

Notes:

  • ColumnStore tables can only be joined with non-ColumnStore tables in table mode only. See Operating Mode for information.

  • ColumnStore will require a join in the WHERE clause for each set of tables in the FROM clause. No cartesian product queries will be allowed.

  • ColumnStore requires that joins be on the same datatype. In addition, numeric datatypes (INT variations, NUMERIC, DECIMAL) may be mixed in the join if they have the same scale.

ColumnStore User Defined Aggregate and Window Functions

Introduction

Starting with MariaDB ColumnStore 1.1, user-defined aggregate and window functions are supported in addition to scalar functions. With ColumnStore 1.2, multiple parameters are supported. A C++ SDK is provided, as well as three reference examples that provide additional functions that may be of general use:

  • median—mathematical median, equivalent to percentile_cont(0.5).

  • avg_mode—mathematical mode, i.e., the most frequent value in the set.

  • SSQ—sum of squares, i.e., the sum of each number squared in the set.

Similar to built-in functions, the SDK supports distributed aggregate execution, where much of the calculation is scaled out across PrimProc nodes.

Using User-Defined Aggregate Functions

The reference examples above are included in the standard build of MariaDB ColumnStore. They can be used by registering them as user-defined aggregate functions. The same can be done for new functions, assuming the instance has the updated libraries included.

From a mcsmysql prompt:

After this, they may be used in the same way as any other aggregate or window function, like SUM:

Developing a New Function

This requires a MariaDB ColumnStore source tree and the necessary tools to compile C/C++ code. The SDK and reference examples are available in the utils/udfsdk directory of the source tree, which also contains the SDK documentation.

Limitations

  • The implementation of the median and avg_mode functions will scale in memory consumption to the size of the set of unique values in the aggregation.

ColumnStore ALTER TABLE

The ALTER TABLE statement modifies existing tables. It includes adding, deleting, and renaming columns as well as renaming tables.

Syntax


ADD

The ADD clause allows you to add columns to a table. You must specify the data type after the column name. The following statement adds a priority column with an integer datatype to the orders table:

  • Compression level (0 for no compression, 1 for compression) is set at the system level. If a session default exists, it will override the system default. In turn, it can be overridden by the table-level compression comment and finally, a compression comment at the column level.

Online ALTER TABLE ADD COLUMN

The ColumnStore engine fully supports online DDL (one session can be adding columns to a table while another session is querying that table). MariaDB ColumnStore provides its own syntax for this as a workaround, intended for adding columns to a table, one at a time only. Do not attempt to use it for any other purpose. Follow the example below as closely as possible.

Scenario: Add an INT column named col7 to the existing table foo:

The SELECT statement may take several tens of seconds to run, depending on how many rows are currently in the table. Regardless, other sessions can select against the table during this time (but they won’t be able to see the new column yet). The ALTER TABLE statement will take less than 1 second (depending on how busy MariaDB is), and during this brief time interval, other table reads will be held off.

CHANGE

The CHANGE clause allows you to rename a column in a table.

Notes to CHANGE COLUMN:

  • You cannot currently use CHANGE COLUMN to change the definition of a column.

  • You can only change a single column at a time. The following example renames the order_qty field to quantity in the orders table:

DROP

The DROP clause allows you to drop columns. All associated data is removed when the column is dropped. You can DROP COLUMN (column_name). The following example alters the orders table to drop the priority column:

RENAME

The RENAME clause allows you to rename a table. The following example renames the orders table:

ColumnStore Information Functions

Functions

MariaDB ColumnStore information functions are selectable pseudo functions that return MariaDB ColumnStore specific “meta” information, to ensure queries can be locally directed to a specific node. These functions can be specified in the projection (SELECT), WHERE, GROUP BY, HAVING and ORDER BY portions of the SQL statement and are processed in a distributed manner.

Function
Description

ColumnStore Disk-Based Joins

Overview

Joins are performed in memory unless disk-based joins are enabled via AllowDiskBasedJoin in the columnstore.xml. When a join operation exceeds the memory allocated for query joins, the query is aborted with an error code IDB-2001.

Disk-based joins enable such queries to use disk for intermediate join data when the memory needed for the join exceeds the memory limit. Although slower than a fully in-memory join and bound by the temporary space on disk, this does allow such queries to complete.

Disk-based joins do not include aggregation and DML joins.

The following variables in the HashJoin element of the Columnstore.xml configuration file relate to disk-based joins. Columnstore.xml resides in the etc directory for your installation (/usr/local/mariadb/columnstore/etc).

  • AllowDiskBasedJoin: Option to use disk-based joins. Valid values are Y (enabled) or N (disabled). The default is disabled.

  • TempFileCompression: Option to use compression for disk join files. Valid values are Y (use compressed files) or N (use non-compressed files).

  • TempFilePath: The directory path used for the disk joins. By default, this path is the tmp directory for your installation (i.e., /tmp/columnstore_tmp_files/joins/). Files in this directory will be created and cleaned on an as-needed basis. The entire directory is removed and recreated by ExeMgr at startup.

When using disk-based joins, it is strongly recommended that the TempFilePath reside on its own partition, as the partition may fill up as queries are executed.

Per-User Join Memory Limit

In addition to the system-wide flags at the SQL global and session levels, the following system variables exist for managing per-user memory limits for joins.

  • columnstore_um_mem_limit - A value for memory limit in MB per user. When this limit is exceeded by a join, it will switch to a disk-based join. By default, the limit is not set (value of 0).

For modification at the global level: In my.cnf file (example: /etc/my.cnf.d/server.cnf):

where value is the in-memory limit per user, in MB.

For modification at the session level, before issuing your join query from the SQL client, set the session variable as follows.

ColumnStore CREATE VIEW

Creates a stored query in MariaDB ColumnStore.

Syntax

CREATE
    [OR REPLACE]
    VIEW view_name [(column_list)]
    AS select_statement

Notes

  • If you describe a view in MariaDB ColumnStore, the column types reported may not match the actual column types in the underlying tables. This is normal and can be ignored.

The following statement creates a customer view of orders with status:

start

Starts ColumnStore on all nodes.

See CMAPI for detail on REST API endpoint, required headers, and other available actions.

Description

Upon successful start call CMAPI connects to each MariaDB Enterprise ColumnStore node and performs start-up.

Call made via HTTPS PUT, with authentication via shared secret using the x-api-key header.

JSON data required for this call:

Key
Value

Bash alias mcsStart is available starting with Enterprise ColumnStore 5.5.2.

Examples

Executing cURL Manually

CMAPI calls can be made from the command-line using cURL.

Replace the CMAPI_API_KEY and sample data in the following example:

The command returns a JSON payload. Piping it to jq makes the output more human-readable.

Executing the Bash Alias

Starting with Enterprise ColumnStore 5.5.2, if your bash shell is configured to source the columnstoreAlias shell script, this command can be executed using the mcsStart alias. The alias executes curl and jq, so both programs must be installed on the system.

The alias automatically retrieves the IP address for the primary node using the mcsGetConfig command. The alias automatically retrieves the API key by reading /etc/columnstore/cmapi_server.conf.

The alias uses jq to produce human-readable output from the returned JSON response.

node DELETE

Removes a ColumnStore node.

See CMAPI for detail on REST API endpoint, required headers, and other available actions.

Description

Upon successful node DELETE call CMAPI configures MariaDB Enterprise ColumnStore to remove the specified node.

Call made via HTTPS DELETE, with authentication via shared secret using the x-api-key header.

JSON data required for this call:

Key
Value

Examples

Executing cURL Manually

CMAPI calls can be made from the command-line using cURL.

Replace the CMAPI_API_KEY and sample data in the following example:

In this example, jq produces human-readable output from the returned JSON response:

ColumnStore INSERT

The INSERT statement allows you to add data to tables.

Syntax

The following statement inserts a row with all column values into the customer table:

The following statement inserts two rows with all column values into the customer table:

mode-set

Sets all ColumnStore nodes to read-only or read-write.

See CMAPI for detail on REST API endpoint, required headers, and other available actions.

Description

Upon successful mode-set call CMAPI re-configures MariaDB Enterprise ColumnStore to the designated mode, either read-only or read/write operation.

Call made via HTTPS PUT, with authentication via shared secret using the x-api-key header.

ColumnStore UPDATE

The UPDATE statement changes data stored in rows.

Syntax

Single-Table Syntax

DDL Statements Different in ColumnStore

In most cases, a ColumnStore table works just as any other MariaDB table. There are, however, a few differences.

The following table lists the data definition statements (DDL) that differ from normal MariaDB when used on ColumnStore tables.

DDL
Difference

shutdown

Shuts down ColumnStore on all nodes

See CMAPI for detail on REST API endpoint, required headers, and other available actions.

Description

Upon successful shutdown call CMAPI connects to each MariaDB Enterprise ColumnStore node and performs shut down.

Call made via HTTPS PUT, with authentication via shared secret using the x-api-key header.

node PUT

Adds a ColumnStore node.

See CMAPI for detail on REST API endpoint, required headers, and other available actions.

Description

Upon successful node PUT call CMAPI configures MariaDB Enterprise ColumnStore to add the new node and a dbroot for that node.

Call made via HTTPS PUT, with authentication via shared secret using the x-api-key header.

SET columnstore_double_for_decimal_math
SET columnstore_decimal_scale
SET columnstore_use_decimal_scale
CREATE TABLE [IF NOT EXISTS] tbl_name
    (create_definition,...)  
ENGINE=columnstore  [ DEFAULT CHARSET=character-set] 
[COMMENT '[compression=0|1][;]
CREATE TABLE [IF NOT EXISTS] tbl_name
   { LIKE old_table_name | (LIKE old_table_name) }
create_definition:
    { col_name column_definition } 
column_definition:
    data_type
      [NOT NULL | NULL]
      [DEFAULT default_value]
      [COMMENT '[compression=0|1]
      [COMMENT='schema sync only']
      [COMMENT 'autoincrement column_name'];
CREATE TABLE orders (
  orderkey INTEGER, 
  customer VARCHAR(45)
) ENGINE=ColumnStore
filter:
column| literal| function [=|!=|<>|<|<=|>=|>] column| literal| function | select_statement
column| function [NOT] IN (select_statement | literal, literal,...)
column| function [NOT] BETWEEN (select_statement | literal, literal,...)
column| function  IS [NOT] NULL
string_column|string_function [NOT] LIKE pattern
EXISTS (select_statement)

NOT (filter)
(filter|function) [AND|OR] (filter|function)
ALTER TABLE tbl_name
    alter_specification [, alter_specification] ...

alter_specification:
    table_option ...
  | ADD [COLUMN] col_name column_definition
  | ADD [COLUMN] (col_name column_definition,...)
  | ALTER [COLUMN] col_name {SET DEFAULT literal | DROP DEFAULT}
  | CHANGE [COLUMN] old_col_name new_col_name column_definition
  | DROP [COLUMN] col_name
  | RENAME [TO] new_tbl_name
 

column_definition:
    data_type
      [NOT NULL | NULL]
      [DEFAULT default_value]
      [COMMENT '[compression=0|1];']

table_options:
    table_option [[,] table_option] ...  (see CREATE TABLE OPTIONS)
  • Circular joins are not supported in ColumnStore. See the Troubleshooting section

  • When the join memory limit is exceeded, a disk-based join will be used for processing if this option has been enabled.

  • idbPartition(column)

    The three part partition id (Directory.Segment.DBRoot)

    idbPm(column)

    The PrimProc where the physical row resides

    idbSegmentDir(column)

    The lowest level directory id for the column file containing the physical row

    idbSegment(column)

    The number of the segment file containing the physical row

    idbBlockId(column)

    The Logical Block Identifier (LBID) for the block containing the physical row

    idbDBRoot(column)

    The DBRoot where the physical row resides

    idbExtentId(column)

    The Logical Block Identifier (LBID) for the first block in the extent containing the physical row

    idbExtentMax(column)

    The max value from the extent map entry for the extent containing the physical row

    idbExtentMin(column)

    The min value from the extent map entry for the extent containing the physical row

    idbExtentRelativeRid(column)

    The row id (1 to 8,388,608) within the column's extent

    idbLocalPm()

    The PrimProc from which the query was launched.
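    As a sketch, these pseudo functions can be selected alongside ordinary columns; the lineitem table here is the one used in other examples on this page:

    SELECT l_orderkey, idbPm(l_orderkey), idbDBRoot(l_orderkey), idbPartition(l_orderkey)
    FROM lineitem
    LIMIT 5;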

    INSERT SELECT

    With INSERT ... SELECT, you can quickly insert many rows into a table from one or more other tables.

    • ColumnStore ignores the ON DUPLICATE KEY clause.

    • Non-transactional INSERT ... SELECT is directed to ColumnStore's cpimport tool by default, which significantly increases performance.

    • Transactional INSERT ... SELECT statements (that is with AUTOCOMMIT off or after a START TRANSACTION) are processed through normal DML processes.

    AUTO_INCREMENT

    Example for using AUTO_INCREMENT in ColumnStore:

    INSERT 
     INTO tbl_name [(col,...)]
     {VALUES | VALUE} ({expr | DEFAULT},...),(...),...

    JSON data required for this call:

    Key
    Value

    timeout

    Maximum time in seconds to wait for completion of mode-set operation

    mode

    Accepted values: readonly for read-only, readwrite for read/write.

    Bash aliases mcsReadOnly and mcsReadWrite are available starting with Enterprise ColumnStore 5.5.2.

    Examples

    Executing cURL Manually

    CMAPI calls can be made from the command-line using cURL.

    Replace the CMAPI_API_KEY and sample data in the following example:

    In this example, jq produces human-readable output from the returned JSON response.

    Executing the Bash Aliases

    Starting with Enterprise ColumnStore 5.5.2, if your bash shell is configured to source the columnstoreAlias shell script, this command can be executed using the mcsReadOnly and mcsReadWrite aliases. The alias executes curl and jq, so both programs must be installed on the system.

    The aliases automatically retrieve the IP address for the primary node using the mcsGetConfig command. The aliases automatically retrieve the API key by reading /etc/columnstore/cmapi_server.conf.

    To set the deployment's mode to read-only:

    To set the deployment's mode to read-write:

    These aliases use jq to produce human-readable output from the returned JSON response.

    CMAPI
    Multiple-Table Syntax

    Only one table can be updated from the table list in table_reference. However, multiple columns can be updated.

    UPDATE  table_reference 
      SET col1={expr1|DEFAULT} [,col2={expr2|DEFAULT}] ...
      [WHERE where_condition]
      [ORDER BY ...]
      [LIMIT row_count]
    UPDATE table_references
        SET col1={expr1|DEFAULT} [, col2={expr2|DEFAULT}] ...
        [WHERE where_condition]

    ColumnStore doesn't need indexes. Hence, an index may not be created on a table that is defined with engine=columnstore.

    ColumnStore supports DROP TABLE ... RESTRICT, which only drops the table in the front end.

    ColumnStore doesn't allow one to rename a table between databases.

    ColumnStore doesn't need indexes, partitions, and many other table and column options. See here for ColumnStore-specific syntax.

    JSON data required for this call:

    Key
    Value

    timeout

    Maximum time in seconds to wait for completion of add-node operation

    Bash alias mcsShutdown is available starting with Enterprise ColumnStore 5.5.2.

    Examples

    Executing cURL Manually

    CMAPI calls can be made from the command-line using cURL.

    Replace the CMAPI_API_KEY and sample data in the following example:

    In this example, jq produces human-readable output from the returned JSON response.

    Executing the Bash Alias

    Starting with Enterprise ColumnStore 5.5.2, if your bash shell is configured to source the columnstoreAlias shell script, this command can be executed using the mcsShutdown alias. The alias executes curl and jq, so both programs must be installed on the system.

    The alias automatically retrieves the IP address for the primary node using the mcsGetConfig command. The alias automatically retrieves the API key by reading /etc/columnstore/cmapi_server.conf.

    The alias uses jq to produce human-readable output from the returned JSON response.

    CMAPI

    JSON data required for this call:

    Key
    Value

    timeout

    Maximum time in seconds to wait for completion of add-node operation

    node

    IP address of the node to add

    Examples

    Executing cURL Manually

    CMAPI calls can be made from the command-line using cURL.

    Replace the CMAPI_API_KEY and sample data in the following example:

    In this example, jq produces human-readable output from the returned JSON response:

    CMAPI
    1.2.x UDAF SDK Guide
    CREATE VIEW v_cust_orders (cust_name, order_number, order_status) AS
    SELECT c.cust_name, o.ordernum, o.status FROM customer c, orders o
    WHERE c.custnum = o.custnum;

    timeout

    Maximum time in seconds to wait for completion of add-node operation

    mcsGetConfig

    timeout

    Maximum time in seconds to wait for completion of add-node operation

    node

    IP address of the node to remove

    ColumnStore Distributed Aggregate Functions

    Functions

    MariaDB ColumnStore supports the following aggregate functions. These can be used in the SELECT, HAVING, and ORDER BY clauses of the SQL statement.

    Function
    Description

    Regression functions (REGR_AVGX to REGR_SYY), CORR, COVAR_POP, and COVAR_SAMP are supported for version 1.2.0 and higher.

    Example

    An example group by query using aggregate functions is:

    ColumnStore LOAD DATA INFILE

    Overview

    The LOAD DATA INFILE statement reads rows from a text file into a table at a very high speed. The file name must be given as a literal string.

    LOAD DATA [LOCAL] INFILE 'file_name' 
      INTO TABLE tbl_name
      [CHARACTER SET charset_name]
      [{FIELDS | COLUMNS}
        [TERMINATED BY 'string']
        [[OPTIONALLY] ENCLOSED BY 'char']
        [ESCAPED BY 'char']
      ]
      [LINES
        [STARTING BY 'string']
        [TERMINATED BY 'string']
    ]
    • ColumnStore ignores the ON DUPLICATE KEY clause.

    • Non-transactional LOAD DATA INFILE is directed to ColumnStore's cpimport tool by default, which significantly increases performance.

    • Transactional LOAD DATA INFILE statements (that is, with AUTOCOMMIT off or after a START TRANSACTION) are processed through normal DML processes.

    • Use cpimport for importing UTF-8 data that contains multi-byte values

    The following example loads data into a simple 5-column table. A file named /simpletable.tbl has the following data in it:

    The data can then be loaded into the simpletable table with the following syntax:

    If the default mode is set to use cpimport internally, any output error files will be written to the /var/log/mariadb/columnstore/cpimport/ directory, which can be consulted for troubleshooting any errors reported.
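    A hypothetical definition of the 5-column simpletable used in this example (the column names and types are assumptions inferred from the sample data shown later on this page):

    CREATE TABLE simpletable (
      c1 INTEGER,
      c2 INTEGER,
      c3 INTEGER,
      c4 INTEGER,
      c5 VARCHAR(20)
    ) ENGINE=columnstore;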

    See Also

    ColumnStore Information Schema Tables

    MariaDB ColumnStore has four information schema tables that expose information about the table and column storage. The tables were added in version 1.0.5 and were heavily modified for 1.0.6.

    COLUMNSTORE_TABLES

    The first table is the INFORMATION_SCHEMA.COLUMNSTORE_TABLES. It contains information about the tables inside ColumnStore.

    ColumnStore SELECT

    The SELECT statement is used to query the database and display table data. You can add many clauses to filter the data.

    Syntax

    Projection List (SELECT)

    status

    Checks the status of ColumnStore.

    See CMAPI for detail on REST API endpoint, required headers, and other available actions.

    Description

    Upon a successful status call, CMAPI returns a JSON payload containing detailed information on MariaDB Enterprise ColumnStore cluster status.

    Call made via HTTPS GET, with authentication via shared secret using the x-api-key header.

    SELECT COUNT(*) FROM orders, lineitem 
      WHERE (lineitem.l_orderkey < 100 OR lineitem.l_linenumber > 10) 
        AND lineitem.l_orderkey =orders.o_orderkey;
    col_name_1 [=|!=|<>|<|<=|>=|>] col_name_2
    join_condition [AND join_condition]
    join_condition:
               [col_name_1|function_name_1] = [col_name_2|function_name_2]
    INSERT INTO customer (custno, custname, custaddress, phoneno, cardnumber, comments) 
      VALUES (12, 'JohnSmith', '100 First Street, Dallas', '(214) 555-1212', 100, 'On Time');
    INSERT INTO customer (custno, custname, custaddress, phoneno, cardnumber, comments) VALUES 
      (12, 'JohnSmith', '100 First Street, Dallas', '(214) 555-1212', 100, 'On Time'),
      (13, 'John Q Public', '200 Second Street, Dallas', '(972) 555-1234', 200, 'LatePayment');
    CREATE TABLE autoinc_test(
    id INT,
    name VARCHAR(10))
    ENGINE=columnstore COMMENT 'autoincrement=id';
    
    INSERT INTO autoinc_test (name) VALUES ('John');
    INSERT INTO autoinc_test (name) VALUES ('Doe');
    curl -k -s -X PUT https://mcs1:8640/cmapi/0.4.0/cluster/mode-set \
       --header 'Content-Type:application/json' \
       --header 'x-api-key:<CMAPI_API_KEY>' \
       --data '{"timeout": 20, "mode": "readwrite"}' \
       | jq .
    mcsReadOnly
    mcsReadWrite
    curl -k -s -X PUT https://mcs1:8640/cmapi/0.4.0/cluster/shutdown \
       --header 'Content-Type:application/json' \
       --header 'x-api-key:<CMAPI_API_KEY>' \
       --data '{"timeout":20}' \
       | jq .
    mcsShutdown
    curl -k -s -X PUT https://mcs1:8640/cmapi/0.4.0/cluster/node \
       --header 'Content-Type:application/json' \
       --header 'x-api-key:<CMAPI_API_KEY>' \
       --data '{"timeout": 20, "node": "192.0.2.2"}' \
       | jq .
    {
      "timestamp": "2020-10-28 00:42:42.796050",
      "node_id": "192.0.2.2"
    }
    CREATE AGGREGATE FUNCTION median RETURNS REAL soname 'libudf_mysql.so';
    CREATE AGGREGATE FUNCTION avg_mode RETURNS REAL soname 'libudf_mysql.so';
    CREATE AGGREGATE FUNCTION ssq RETURNS REAL soname 'libudf_mysql.so';
    SELECT grade, 
    AVG(loan_amnt) AVG, 
    MEDIAN(loan_amnt) median 
    FROM loanstats 
    GROUP BY grade 
    ORDER BY grade;
    ALTER TABLE orders ADD COLUMN priority INTEGER;
    SELECT calonlinealter('alter table foo add column col7 int;');
    ALTER TABLE foo ADD COLUMN col7 INT COMMENT 'schema sync only';
    ALTER TABLE orders CHANGE COLUMN order_qty quantity
    INTEGER;
    ALTER TABLE orders DROP COLUMN priority;
    ALTER TABLE orders RENAME TO customer_orders;
    [mysqld]
    ...
    columnstore_um_mem_limit = value
    SET columnstore_um_mem_limit = value
    curl -k -s -X PUT https://mcs1:8640/cmapi/0.4.0/cluster/start \
       --header 'Content-Type:application/json' \
       --header 'x-api-key:<CMAPI_API_KEY>' \
       --data '{"timeout":20}' \
       | jq .
    mcsStart
    curl -k -s -X DELETE https://mcs1:8640/cmapi/0.4.0/cluster/node \
       --header 'Content-Type:application/json' \
       --header 'x-api-key:<CMAPI_API_KEY>' \
       --data '{"timeout": 20, "node": "192.0.2.2"}' \
       | jq .
    {
      "timestamp": "2020-10-28 00:42:42.796050",
      "node_id": "192.0.2.2"
    }

    REGR_AVGY (ColumnY, ColumnX)

    Average of the dependent variable (sum(ColumnY)/N), where N is number of rows processed by the query

    REGR_COUNT (ColumnY, ColumnX)

    The total number of input rows in which both column Y and column X are nonnull

    REGR_INTERCEPT (ColumnY, ColumnX)

    The y-intercept of the least-squares-fit linear equation determined by the (ColumnX, ColumnY) pairs

    REGR_R2(ColumnY, ColumnX)

    The square of the correlation coefficient for the (ColumnX, ColumnY) pairs

    REGR_SLOPE(ColumnY, ColumnX)

    The slope of the least-squares-fit linear equation determined by the (ColumnX, ColumnY) pairs

    REGR_SXX(ColumnY, ColumnX)

    REGR_COUNT(y, x) * VAR_POP(x) for non-null pairs.

    REGR_SXY(ColumnY, ColumnX)

    REGR_COUNT(y, x) * COVAR_POP(y, x) for non-null pairs.

    REGR_SYY(ColumnY, ColumnX)

    REGR_COUNT(y, x) * VAR_POP(y) for non-null pairs.

    STD(), STDDEV(), STDDEV_POP()

    The population standard deviation of a numeric (INT variations, NUMERIC, DECIMAL) column

    STDDEV_SAMP()

    The sample standard deviation of a numeric (INT variations, NUMERIC, DECIMAL) column

    SUM([DISTINCT] column)

    The sum of a numeric (INT variations, NUMERIC, DECIMAL) column

    VARIANCE(), VAR_POP()

    The population variance of a numeric (INT variations, NUMERIC, DECIMAL) column

    VAR_SAMP()

    The sample variance of a numeric (INT variations, NUMERIC, DECIMAL) column

    AVG([DISTINCT] column)

    Average value of a numeric (INT variations, NUMERIC, DECIMAL) column

    CORR (ColumnY, ColumnX)

    The correlation coefficient for non-null pairs in a group.

    COUNT (*, [DISTINCT] column)

    The number of rows returned by a query or grouping. All datatypes are supported

    COVAR_POP (ColumnY, ColumnX)

    The population covariance for non-null pairs in a group.

    COVAR_SAMP (ColumnY, ColumnX)

    The sample covariance for non-null pairs in a group.

    MAX ([DISTINCT] column)

    The maximum value of a column. All datatypes are supported.

    MIN ([DISTINCT] column)

    The minimum value of a column. All datatypes are supported.

    REGR_AVGX (ColumnY, ColumnX)

    Average of the independent variable (sum(ColumnX)/N), where N is number of rows processed by the query

    COLUMNSTORE_COLUMNS

    The INFORMATION_SCHEMA.COLUMNSTORE_COLUMNS table contains information about every single column inside ColumnStore.

    COLUMNSTORE_EXTENTS

    This table displays the extent map in a user-consumable form. An extent is a collection of details about a section of data related to a columnstore column. A majority of columns in ColumnStore will have multiple extents, and the columns table above can be joined to this one to filter results by table or column. The table layout is as follows:

    Column
    Description

    OBJECT_ID

    The object ID for the extent

    OBJECT_TYPE

    Whether this is a "Column" or "Dictionary" extent

    LOGICAL_BLOCK_START

    ColumnStore's internal start LBID for this extent

    LOGICAL_BLOCK_END

    ColumnStore's internal end LBID for this extent

    MIN_VALUE

    The minimum value stored in this extent

    MAX_VALUE

    The maximum value stored in this extent

    Notes:

    1. The state is "Valid" for a normal state, "Invalid" if a cpimport has completed but the table has not yet been accessed (min/max values will be invalid) or "Updating" if there is a DML statement writing to the column.

    2. In ColumnStore the block size is 8192 bytes.

    3. By default, ColumnStore will write and create an extent file of 256 * 1024 * WIDTH bytes for the first partition; if this is too small, then for uncompressed data, it will create a file of the maximum size for the extent (MAX_BLOCKS * BLOCK_SIZE). Snappy compression always adds a header block.

    4. Object IDs of less than 3000 are for internal tables and will not appear in any of the information schema tables.

    5. HWM is set to zero for the lower segments when there are multiple segments in an extent file; these can be observed when BLOCK_OFFSET > 0 .

    6. When HWM is 0, the DATA_SIZE will show 0 instead of 8192 to avoid confusion when there are multiple segments in an extent file.

    COLUMNSTORE_FILES

    The columnstore_files table provides information about each file associated with extents. Each extent can reuse a file at different block offsets, so this is not a 1:1 relationship to the columnstore_extents table.

    Column
    Description

    OBJECT_ID

    The object ID for the extent

    SEGMENT_ID

    The segment ID for the extent

    PARTITION_ID

    The partition ID for the extent

    FILENAME

    The full path and filename for the extent file, multiple extents for the same column can point to this file with different BLOCK_OFFSETs

    FILE_SIZE

    The disk file size for the extent

    COMPRESSED_DATA_SIZE

    The amount of the compressed file used, NULL if this is an uncompressed file

    Stored Procedures

    The total_usage() procedure gives a total disk usage summary for all the columns in ColumnStore except the columns used for internal maintenance. It is executed using the following query:

    table_usage Procedure

    The table_usage() procedure gives the total data disk usage, dictionary disk usage, and grand total disk usage per table. It can be called in several ways; the first gives a total for each table:

    Or for a specific table, my_table in my_schema in this example:

    You can also request all tables for a specified schema:

    Note: The quotes around the table name are required; an error will occur without them.

    compression_ratio Procedure

    The compression_ratio() procedure calculates the average compression ratio across all the compressed extents in ColumnStore. It is called using:

    If the same column needs to be referenced more than once in the projection list, a unique name is required for each column using a column alias. The total length of the name of a column, inclusive of the length of functions, in the projection list must be 64 characters or less.

    WHERE

    The WHERE clause filters data retrieval based on criteria. Note that column_alias cannot be used in the WHERE clause. The following statement returns rows in the region table where the region = ‘ASIA’:

    GROUP BY

    GROUP BY groups data based on values in one or more specific columns. The following statement returns rows from the lineitem table where orderkey is less than 1,000,000 and groups them by quantity:

    HAVING

    HAVING is used in combination with the GROUP BY clause. It can be used in a SELECT statement to filter the records that a GROUP BY returns. The following statement returns shipping dates, and the respective quantity where the quantity is 2500 or more.

    ORDER BY

    The ORDER BY clause presents results in a specific order. Note that the ORDER BY clause represents a statement that is post-processed by MariaDB. The following statement returns an ordered quantity column from the lineitem table.

    The following statement returns an ordered shipmode column from the lineitem table.

    NOTE: When ORDER BY is used in an inner query and LIMIT on an outer query, LIMIT is applied first and then ORDER BY is applied when returning results.

    UNION

    Used to combine the result from multiple SELECT statements into a single result set. The UNION or UNION DISTINCT clause returns query results from multiple queries into one display and discards duplicate results. The UNION ALL clause displays query results from multiple queries and does not discard the duplicates. The following statement returns the p_name rows in the part table and the partno table and discards the duplicate results:

    The following statement returns all the p_name rows in the part table and the partno table:

    LIMIT

    A limit is used to constrain the number of rows returned by the SELECT statement. LIMIT can have up to two arguments. LIMIT must contain a row count and may optionally contain an offset of the first row to return (the initial row is 0).

    The following statement returns 5 customer keys from the customer table:

    The following statement returns 5 customer keys from the customer table beginning at offset 1000:

    When LIMIT is used in a nested query, and the inner query contains an ORDER BY clause, LIMIT is applied before ORDER BY is applied.

    Bash alias mcsStatus is available starting with Enterprise ColumnStore 5.5.2.

    Examples

    Executing cURL Manually

    CMAPI calls can be made from the command-line using cURL.

    Replace the CMAPI_API_KEY and sample data in the following example:

    In this example, jq produces human-readable output from the returned JSON response:

    Executing the Bash Alias

    Starting with Enterprise ColumnStore 5.5.2, if your bash shell is configured to source the columnstoreAlias shell script, this command can be executed using the mcsStatus alias. The alias executes curl and jq, so both programs must be installed on the system.

    The alias automatically retrieves the IP address for the primary node using the mcsGetConfig command. The alias automatically retrieves the API key by reading /etc/columnstore/cmapi_server.conf.

    In this example, jq produces human-readable output from the returned JSON response.

    CMAPI
    1|100|1000|10000|Test Number 1|
    2|200|2000|20000|Test Number 2|
    3|300|3000|30000|Test Number 3|

    ColumnStore Naming Conventions

    Overview

    This lists the different naming conventions enforced by ColumnStore, compared to normal MariaDB naming conventions.

    • Usernames: 64 characters (MariaDB has 80)

    • Table and column names are restricted to alphanumeric and underscore only, i.e. "A-Z a-z 0-9 _".

    • The first character of all table and column names should be an ASCII letter (a-z A-Z).

    • ColumnStore reserves certain words that MariaDB does not, such as SELECT, CHAR and TABLE, so even wrapped in backticks these cannot be used.

    Reserved Words

    In addition to the MariaDB Server reserved words, ColumnStore has additional reserved words that cannot be used as table names, column names, or as names of user-defined variables, functions, or stored procedures.

    Keyword

    ColumnStore User Defined Functions

    Introduction

    MariaDB provides extensibility support through user-defined functions (UDFs). For more details, see the MariaDB server UDF framework documentation. This documentation applies to MariaDB ColumnStore version 1.0.10 and above.

    MariaDB ColumnStore provides scale-out query processing. As such, it requires a separate distributed implementation of each SQL function. This allows for the function application to happen on each PrimProc server node, providing distributed scale-out performance.

    Thus, fully implementing a user-defined function for MariaDB ColumnStore requires implementing two different APIs:

    • The MariaDB server UDF API allows utilization on all storage engines, and is the implementation used if applied in the select list.

    • The ColumnStore distributed UDF API enables distributed execution of WHERE clause and GROUP BY usage of the function, and is pushed down to PrimProc nodes for execution where possible.

    MariaDB ColumnStore supports user-defined function implementations in C/C++. User-defined aggregate and window functions are not supported in ColumnStore 1.0.

    Developing a User-Defined Function

    The development kit can be found under the utils/udfsdk directory of the mariadb-columnstore-engine source tree. Developing a UDF requires setting up a development environment and familiarity with C++ development. To set up a ColumnStore development environment, follow the instructions on dependencies in the repository.

    Three main files need to be modified in order to add a new UDF:

    • udfmysql.cpp: MariaDB server UDF implementation.

    • udfsdk.h: Class headers.

    • udfsdk.cpp: Distributed ColumnStore UDF implementation.

    Two reference implementations are provided for guidance on creating your own functions:

    • MCS_IsNull: Illustrates a simple single-argument function, providing the ability to return a Boolean if the expression parameter is NULL.

    • MCS_Add: Illustrates a simple two-argument function that adds two values and returns the sum.

    It is simplest to copy these and adapt to your needs. There are no system dependencies on the included reference implementations so these can be removed to simplify the class files if preferred.

    MariaDB Server UDF Implementation

    Three functions are required to be implemented:

    • x_init: Performs any parameter validation or setup such as memory allocation.

    • x: Performs the actual function implementation.

    • x_deinit: Performs any cleanup tasks such as deallocating memory where 'x' is the function name.

    ColumnStore Distributed UDF Implementation

    The function name and class must be registered in order to be recognized and used by the ColumnStore primitives processor (PrimProc). This is done by adding a line to perform the registration in the UDFSDK::UDFMap() function in the file udfsdk.cpp:

    For a new UDF, add a new entry to the FuncMap object, mapping the name to the udf class.

    The UDF class should be defined in the file udfsdk.h and implemented in the file udfsdk.cpp. It is easiest to adapt the example classes for a new instance, but each class must implement the funcexp::Func C++ class definition:

    • constructor: any initialization necessary.

    • destructor: any de-initialization.

    • getOperationType performs parameter validation and returns the result data type.

    • getVal computes and returns the value of the UDF for each given return datatype.

    The code changes can be built using make within the directory utils/udfsdk; this creates the following libraries in the same directory:

    • libudf_mysql.so.1.0.0

    • libudfsdk.so.1.0.0

    Those libraries contain the compiled code.

    Deploying and Using a UDF

    The two libraries created before must be deployed to the /usr/local/mariadb/columnstore/lib directory (or equivalent lib directory in a non-root installation), replacing the existing files. Symbolic links in the MariaDB server directory point to these, but should be validated. Run the following as root from the utils/udfsdk directory in the build tree (specifying a password to restartSystem if needed for a multi-server cluster):

    Repeat this for each ColumnStore PrimProc node in the cluster, then restart ColumnStore to make the libraries available.

    After restarting the system, the UDF must be registered with the MariaDB server so it can be used:
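    A sketch of the registration step for the mcs_add reference example (the RETURNS type here is an assumption; adjust it to the function being registered):

    CREATE FUNCTION mcs_add RETURNS INTEGER SONAME 'libudf_mysql.so';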

    The function mcs_add can now be used. Verify that it can be used both in the SELECT list and WHERE clause for correct installation:
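    A sketch of such a check against a hypothetical table t1 with integer columns c1 and c2:

    SELECT c1, c2, mcs_add(c1, c2) FROM t1;
    SELECT c1, c2 FROM t1 WHERE mcs_add(c1, c2) > 100;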

    SELECT year(o_orderdate) order_year, 
    AVG(o_totalprice) avg_totalprice, 
    MAX(o_totalprice) max_totalprice, 
    COUNT(*) order_count 
    FROM orders 
    GROUP BY order_year 
    ORDER BY order_year;
    > call columnstore_info.total_usage();
    > call columnstore_info.table_usage(NULL, NULL);
    > call columnstore_info.table_usage('my_schema', 'my_table');
    > call columnstore_info.table_usage('my_schema', NULL);
    > call columnstore_info.compression_ratio();
    SELECT
    [ALL | DISTINCT ]
        select_expr [, select_expr ...]
        [ FROM table_references
          [WHERE where_condition]
          [GROUP BY {col_name | expr | POSITION} [ASC | DESC], ... [WITH ROLLUP]]
          [HAVING where_condition]
          [ORDER BY {col_name | expr | POSITION} [ASC | DESC], ...]
          [LIMIT {[offset,] ROW_COUNT | ROW_COUNT OFFSET OFFSET}]
          [PROCEDURE procedure_name(argument_list)]
          [INTO OUTFILE 'file_name' [CHARACTER SET charset_name] [export_options]
             | INTO DUMPFILE 'file_name' | INTO var_name [, var_name] ]
    export_options:
        [{FIELDS | COLUMNS}
            [TERMINATED BY 'string']
            [[OPTIONALLY] ENCLOSED BY 'char']
            [ESCAPED BY 'char']
        ]
        [LINES
            [STARTING BY 'string']
            [TERMINATED BY 'string']
        ]
    SELECT * FROM region WHERE name = 'ASIA';
    SELECT quantity, COUNT(*) FROM lineitem WHERE orderkey < 1000000 GROUP BY quantity;
    SELECT shipdate, COUNT(*) FROM lineitem GROUP BY shipdate HAVING COUNT(*) >= 2500;
    SELECT quantity FROM lineitem WHERE orderkey < 1000000 ORDER BY quantity;
    SELECT shipmode FROM lineitem WHERE orderkey < 1000000 ORDER BY 1;
    SELECT p_name FROM part UNION SELECT p_name FROM  partno;
    SELECT p_name FROM part UNION ALL SELECT p_name FROM  partno;
    SELECT custkey FROM customer LIMIT 5;
    SELECT custkey FROM customer LIMIT 1000,5;
    curl -k -s https://mcs1:8640/cmapi/0.4.0/cluster/status \
       --header 'Content-Type:application/json' \
       --header 'x-api-key:<CMAPI_API_KEY>' \
       | jq .
    {
      "timestamp": "2020-12-15 00:40:34.353574",
      "192.0.2.1": {
        "timestamp": "2020-12-15 00:40:34.362374",
        "uptime": 11467,
        "dbrm_mode": "master",
        "cluster_mode": "readwrite",
        "dbroots": [
          "1"
        ],
        "module_id": 1,
        "services": [
          {
            "name": "workernode",
            "pid": 19202
          },
          {
            "name": "controllernode",
            "pid": 19232
          },
          {
            "name": "PrimProc",
            "pid": 19254
          },
          {
            "name": "ExeMgr",
            "pid": 19292
          },
          {
            "name": "WriteEngine",
            "pid": 19316
          },
          {
            "name": "DMLProc",
            "pid": 19332
          },
          {
            "name": "DDLProc",
            "pid": 19366
          }
        ]
      },
      "192.0.2.2": {
        "timestamp": "2020-12-15 00:40:34.428554",
        "uptime": 11437,
        "dbrm_mode": "slave",
        "cluster_mode": "readonly",
        "dbroots": [
          "2"
        ],
        "module_id": 2,
        "services": [
          {
            "name": "workernode",
            "pid": 17789
          },
          {
            "name": "PrimProc",
            "pid": 17813
          },
          {
            "name": "ExeMgr",
            "pid": 17854
          },
          {
            "name": "WriteEngine",
            "pid": 17877
          }
        ]
      },
      "192.0.2.3": {
        "timestamp": "2020-12-15 00:40:34.428554",
        "uptime": 11437,
        "dbrm_mode": "slave",
        "cluster_mode": "readonly",
        "dbroots": [
          "2"
        ],
        "module_id": 2,
        "services": [
          {
            "name": "workernode",
            "pid": 17789
          },
          {
            "name": "PrimProc",
            "pid": 17813
          },
          {
            "name": "ExeMgr",
            "pid": 17854
          },
          {
            "name": "WriteEngine",
            "pid": 17877
          }
        ]
      },
      "num_nodes": 3
    }
    mcsStatus
    LOAD DATA INFILE 'simpletable.tbl' INTO TABLE simpletable FIELDS TERMINATED BY '|'

    COLUMN

    COLUMNS

    COMMENT

    CONSTRAINT

    CONSTRAINTS

    CREATE

    CURRENT_USER

    DATETIME

    DEC

    DECIMAL

    DEFERRED

    DEFAULT

    DEFERRABLE

    DOUBLE

    DROP

    ENGINE

    EXISTS

    FOREIGN

    FULL

    IDB_BLOB

    IDB_CHAR

    IDB_DELETE

    IDB_FLOAT

    IDB_INT

    IF

    IMMEDIATE

    INDEX

    INITIALLY

    INTEGER

    KEY

    MATCH

    MAX_ROWS

    MIN_ROWS

    MODIFY

    NO

    NOT

    NULL_TOK

    NUMBER

    NUMERIC

    ON

    PARTIAL

    PRECISION

    PRIMARY

    REAL

    REFERENCES

    RENAME

    RESTRICT

    SESSION_USER

    SET

    SMALLINT

    SYSTEM_USER

    TABLE

    TIME

    TINYINT

    TO

    TRUNCATE

    UNIQUE

    UNSIGNED

    UPDATE

    USER

    VARBINARY

    VARCHAR

    VARYING

    WITH

    ZONE

    ACTION

    ADD

    ALTER

    AUTO_INCREMENT

    BIGINT

    BIT

    CASCADE

    CHANGE

    CHARACTER

    CHARSET

    CHECK

    CLOB

    WIDTH

    The data width for the extent

    DBROOT

    The DBRoot number for the extent

    PARTITION_ID

The partition ID for the extent

    SEGMENT_ID

    The segment ID for the extent

    BLOCK_OFFSET

The block offset within the data file; each data file can contain multiple extents for a column

    MAX_BLOCKS

    The maximum number of blocks for the extent

    HIGH_WATER_MARK

    The last block committed to the extent (starting at 0)

    STATE

    The state of the extent (see below)

    STATUS

    The availability status for the column which is either "Available", "Unavailable" or "Out of service"

    DATA_SIZE

The uncompressed data size for the extent, calculated as (HWM + 1) * BLOCK_SIZE
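For example, assuming ColumnStore's 8 KB block size (an assumption for illustration; the block size is not stated in the table above), an extent with HIGH_WATER_MARK = 3 would report DATA_SIZE = (3 + 1) * 8192 = 32,768 bytes.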


    ColumnStore Data Types

    ColumnStore supports the following data types:

    Numeric Data Types

    FuncMap UDFSDK::UDFMap() const
    {
    	FuncMap fm;
    	
    	// first: function name
    	// second: Function pointer
	// Use lower case for the function name: function names may be treated
	// case-insensitively in MariaDB depending on the configuration, and the
	// names passed to this interface are always in lower case.
    	fm["mcs_add"] = new MCS_add();
    	fm["mcs_isnull"] = new MCS_isnull();
    
    	return fm;
    }
    $ cp libudf_mysql.so.1.0.0 libudfsdk.so.1.0.0 /usr/local/mariadb/columnstore/lib/
    $ ls -l /usr/local/mariadb/columnstore/mysql/lib/plugin/libudf_mysql.so
    lrwxrwxrwx. 1 root root 56 Jul 19 09:47 /usr/local/mariadb/columnstore/mysql/lib/plugin/libudf_mysql.so -> /usr/local/mariadb/columnstore/lib/libudf_mysql.so.1.0.0
    $ mcsmysql
        > create function mcs_add returns integer soname 'libudf_mysql.so';
    MariaDB [test]> create function mcs_add returns integer soname 'libudf_mysql.so';
    Query OK, 0 rows affected (0.01 sec)
    
    MariaDB [test]> create table t1(i1 int, i2 int) engine=columnstore;
    Query OK, 0 rows affected (0.58 sec)
    
    MariaDB [test]> insert into t1 values (1,1), (2,2);
    Query OK, 2 rows affected (0.24 sec)
    Records: 2  Duplicates: 0  Warnings: 0
    
    MariaDB [test]> select i1, i2, mcs_add(i1,i2) sum from t1;
    +------+------+------+
    | i1   | i2   | sum  |
    +------+------+------+
    |    1 |    1 |    2 |
    |    2 |    2 |    4 |
    +------+------+------+
    2 rows in set (0.05 sec)
    
    MariaDB [test]> select i1, i2 from t1 where mcs_add(i1,i2) = 4;
    +------+------+
    | i1   | i2   |
    +------+------+
    |    2 |    2 |
    +------+------+
    1 row in set (0.02 sec)

| Datatype | Column Size | Description |
|---|---|---|
| BOOLEAN | 1-byte | A synonym for TINYINT(1). Supported from version 1.2.0 onwards. |
| TINYINT | 1-byte | A very small integer. Numeric value with scale 0. Signed: -126 to +127. Unsigned: 0 to 253. |
| SMALLINT | 2-bytes | A small integer. Signed: -32,766 to 32,767. Unsigned: 0 to 65,533. |
| MEDIUMINT | 3-bytes | A medium integer. Signed: -8388608 to 8388607. Unsigned: 0 to 16777215. Supported starting with MariaDB ColumnStore 1.4.2. |
| INTEGER/INT | 4-bytes | A normal-size integer. Numeric value with scale 0. Signed: -2,147,483,646 to 2,147,483,647. Unsigned: 0 to 4,294,967,293. |
| BIGINT | 8-bytes | A large integer. Numeric value with scale 0. Signed: -9,223,372,036,854,775,806 to +9,223,372,036,854,775,807. Unsigned: 0 to +18,446,744,073,709,551,613. |
| DECIMAL/NUMERIC | 2, 4, or 8 bytes | A packed fixed-point number that can have a specific total number of digits and a set number of digits after the decimal. The maximum precision (total number of digits) that can be specified is 18. |
| FLOAT | 4 bytes | Stored in 32-bit IEEE-754 floating point format. As such, the number of significant digits is about 6 and the range of values is approximately +/- 1e38. The MySQL extension to specify precision and scale is not supported. |
| DOUBLE/REAL | 8 bytes | Stored in 64-bit IEEE-754 floating point format. As such, the number of significant digits is about 15 and the range of values is approximately +/- 1e308. The MySQL extension to specify precision and scale is not supported. "REAL" is a synonym for "DOUBLE". |

    String Data Types

| Datatype | Column Size | Description |
|---|---|---|
| CHAR | 1, 2, 4, or 8 bytes | Holds letters and special characters of fixed length. Max length is 255. Default and minimum size is 1 byte. |
| VARCHAR | 1, 2, 4, or 8 bytes or 8-byte token | Holds letters, numbers, and special characters of variable length. Max length = 8000 bytes or characters and minimum length = 1 byte or character. |
| TINYTEXT | 255 bytes | Holds a small amount of letters, numbers, and special characters of variable length. Supported from version 1.1.0 onwards. |
| TINYBLOB | 255 bytes | Holds a small amount of binary data of variable length. Supported from version 1.1.0 onwards. |
| TEXT | 64 KB | Holds letters, numbers, and special characters of variable length. Supported from version 1.1.0 onwards. |
| BLOB | 64 KB | Holds binary data of variable length. Supported from version 1.1.0 onwards. |
| MEDIUMTEXT | 16 MB | Holds a medium amount of letters, numbers, and special characters of variable length. Supported from version 1.1.0 onwards. |
| MEDIUMBLOB | 16 MB | Holds a medium amount of binary data of variable length. Supported from version 1.1.0 onwards. |
| LONGTEXT | 1.96 GB | Holds a large amount of letters, numbers, and special characters of variable length. Supported from version 1.1.0 onwards. |
| LONGBLOB | 1.96 GB | Holds a large amount of binary data of variable length. Supported from version 1.1.0 onwards. |

    Date and Time Data Types

| Datatype | Column Size | Description |
|---|---|---|
| DATE | 4-bytes | Date has year, month, and day. The internal representation of a date is a string of 4 bytes. The first 2 bytes represent the year, .5 bytes the month, and .75 bytes the day in the following format: YYYY-MM-DD. Supported range is 1000-01-01 to 9999-12-31. |
| DATETIME | 8-bytes | A date and time combination. Supported range is 1000-01-01 00:00:00 to 9999-12-31 23:59:59. From version 1.2.0 microseconds are also supported. |
| TIME | 8-bytes | Holds hour, minute, second and optionally microseconds for time. Supported range is '-838:59:59.999999' to '838:59:59.999999'. Supported from version 1.2.0 onwards. |
| TIMESTAMP | 4-bytes | Values are stored as the number of seconds since 1970-01-01 00:00:00 UTC, and optionally microseconds. The max value is currently 2038-01-19 03:14:07 UTC. Supported starting with MariaDB ColumnStore 1.4.2. |

    Notes

    • ColumnStore treats a zero-length string as a NULL value.

• As with core MariaDB, ColumnStore employs “saturation semantics” on integer values. This means that if a value is inserted into an integer field that is too big/small for it to hold (i.e., it is more negative or more positive than the values indicated above), ColumnStore will “saturate” that value to the min/max value indicated above as appropriate. For example, for a SMALLINT column, if 32800 is attempted, the actual value inserted will be 32767 (see the sketch after these notes).

    • ColumnStore's largest negative and positive numbers appear to be 2 less than what MariaDB supports. ColumnStore reserves these for its internal use, and they cannot be used. For example, if there is a need to store -128 in a column, the TINYINT datatype cannot be used; the SMALLINT datatype must be used instead. If the value -128 is inserted into a TINYINT column, ColumnStore will saturate it to -126 (and issue a warning).

    • ColumnStore truncates rather than rounds decimal constants that have too many digits after the decimal point during bulk load and when running SELECT statements. For INSERT and UPDATE, however, the MariaDB parser will round such constants. You should verify that the ETL tools used and any INSERT/UPDATE statements only specify the correct number of decimal digits to avoid potential confusion.

    • An optional display width may be added to the BIGINT, INTEGER/INT, SMALLINT & TINYINT columns. As with core MariaDB tables, this value does not affect the internal storage requirements of the column, nor does it affect the valid value ranges.

• All columns in ColumnStore are nullable, and the default value for any column is NULL. You may optionally declare a column NOT NULL and/or give it a DEFAULT value.

• Unlike other MariaDB storage engines, the actual storage limit for LONGBLOB/LONGTEXT is 2,100,000,000 bytes instead of 4 GB per entry. MariaDB's client API is limited to a row length of 1 GB.

• Timestamp and current_timestamp are still not supported (see MCOL-3694 and MCOL-1039).
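As a minimal sketch of the saturation and empty-string behavior described above (the table and column names are hypothetical, not from this article):

CREATE TABLE saturation_demo (s SMALLINT, v VARCHAR(10)) ENGINE=ColumnStore;

-- 32800 exceeds the SMALLINT maximum listed above, so ColumnStore saturates it
-- to 32767 and issues a warning; the zero-length string is stored as NULL.
INSERT INTO saturation_demo VALUES (32800, '');

SELECT s, v IS NULL AS v_is_null FROM saturation_demo;
-- Expected result: s = 32767, v_is_null = 1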



    CMAPI

    CMAPI is a REST API for administering MariaDB Enterprise ColumnStore in multi-node topologies.

    Reference material is available for MariaDB Enterprise ColumnStore.

    MariaDB Enterprise ColumnStore is included with MariaDB Enterprise Server.

    Clients

    CMAPI is a REST API, so it should be compatible with most standard REST clients.

    CMAPI examples throughout the documentation use curl as the REST client. The examples also pipe the JSON output to jq for enhanced readability.

    Endpoint

The endpoint for CMAPI contains the hostname and port for the primary node running Enterprise ColumnStore, /cmapi/, the CMAPI API version (0.4.0), and an action-specific endpoint path.

    Example: https://mcs1:8640/cmapi/0.4.0/cluster/node

    Endpoint Paths

| Endpoint Path | Method | Action |
|---|---|---|
| status | GET | Checks the status of ColumnStore |
| start | PUT | Starts ColumnStore on all nodes |
| shutdown | PUT | Shuts down ColumnStore on all nodes |
| mode-set | PUT | Sets all ColumnStore nodes to read-only or read-write |
| node | PUT | Adds a ColumnStore node |
| node | DELETE | Removes a ColumnStore node |

    Method and required data vary by CMAPI endpoint path.

    Required Headers

| Header | Description |
|---|---|
| Content-Type | Set to application/json |
| x-api-key | Set to the API key configured for CMAPI. Calls using incorrect keys are rejected. |

    Authentication

Authentication is performed via an API key, which serves as a shared secret. The API key is passed to the API using the x-api-key header.

    The API key is stored in /etc/columnstore/cmapi_server.conf.

    Generate an API Key

    The API key is a shared secret that can be used to add nodes to multi-node Enterprise ColumnStore. The API key can be any string, but it is recommended to use a long, random string. The API key should be stored securely and kept confidential.

    For example, to create a random 256-bit API key using openssl rand:

    Set the API Key

    To set the API key for the first time, provide the desired API key when you add the first node using the PUT command. Since Enterprise ColumnStore does not yet have an API key, CMAPI will write the first API key it receives to /etc/columnstore/cmapi_server.conf.

    For example, if the primary server's host name is mcs1 and its IP address is 192.0.2.1, the following command will add the primary server to Enterprise ColumnStore and write the provided API key to the node's CMAPI configuration file:

    Change the API Key

    To change the API key in multi-node Enterprise ColumnStore, change it in the CMAPI configuration file on each node located at /etc/columnstore/cmapi_server.conf. The CMAPI server must be restarted on each node for the changes to be applied.

    Restart the CMAPI server by running the following command on each node:

    Bash Aliases

    Bash aliases are available starting with Enterprise ColumnStore 5.5.2.

    These aliases are available if your bash shell is configured to source the columnstoreAlias shell script.

    These aliases execute curl and jq, so both programs must be installed on the system.

These aliases automatically retrieve the IP address for the primary node using the mcsGetConfig command. The aliases automatically retrieve the API key by reading /etc/columnstore/cmapi_server.conf.

    Available aliases:

| Alias | Endpoint | Action |
|---|---|---|
| mcsStatus | status | Checks the status of ColumnStore |
| mcsStart | start | Starts ColumnStore on all nodes |
| mcsShutdown | shutdown | Shuts down ColumnStore on all nodes |
| mcsReadOnly | mode-set | Sets all ColumnStore nodes to read-only |
| mcsReadWrite | mode-set | Sets all ColumnStore nodes to read/write |

    CMAPI Service Management

    The systemctl command is used to start and stop the CMAPI service.

| Operation | Command |
|---|---|
| Status | systemctl status mariadb-columnstore-cmapi |
| Start | systemctl start mariadb-columnstore-cmapi |
| Stop | systemctl stop mariadb-columnstore-cmapi |
| Restart | systemctl restart mariadb-columnstore-cmapi |
| Enable startup | systemctl enable mariadb-columnstore-cmapi |
| Disable startup | systemctl disable mariadb-columnstore-cmapi |
| View systemd journal | journalctl -u mariadb-columnstore-cmapi |

    Configuration

    The CMAPI configuration file is located at /etc/columnstore/cmapi_server.conf.

    To change the configuration:

    • Modify the configuration file on each node

    • Restart the CMAPI server on each node:

    Configure Failover

    Starting with CMAPI 6.4.1, the auto_failover option can be set to True or False in the [application] section:

    • The default value of the auto_failover option is True.

• The auto_failover option should be set to False when non-shared local storage is used.

    Logging

Starting with Enterprise ColumnStore 5.5.2, the CMAPI logs can be found at /var/log/mariadb/columnstore/cmapi_server.log.

    In previous versions, CMAPI's log messages can be viewed in the systemd journal:

    CMAPI Responses

    CMAPI responds to client requests with standard HTTP response messages.

    Status Line

The first part of the standard HTTP response message is the status line. To determine if your request was successful, check the status code and the reason phrase from the status line.

Consult the HTTP standard to see the full list of status codes and their descriptions.

| Status Code | Reason Phrase | Outcome |
|---|---|---|
| 200 | OK | Successful |
| 200 < x < 300 | Varies | Possibly successful |
| 300 <= x < 400 | Varies | Request redirected |
| 400 <= x < 500 | Varies | Client-side error. Check endpoint, API key, and JSON data |
| 500 <= x < 600 | Varies | Server-side error. Contact support |

    Headers

The second part of the standard HTTP response message is the HTTP headers. To determine what kind of message body is in the response message, check the Content-Type header field.

| Outcome | Content-Type |
|---|---|
| Success | application/json |
| Failure | Undefined; depends on the specific failure |

    Body

The final part of the standard HTTP response message is the body.

| Outcome | Body |
|---|---|
| Success | JSON data |
| Failure | Undefined; depends on the specific failure |

    ColumnStore Distributed Functions

    Functions

    ColumnStore supports the following functions. These functions can be specified in the projection (SELECT), WHERE and ORDER BY portions of the SQL statement and will be processed in a distributed manner.
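For illustration only (the orders table follows the TPC-H style used elsewhere on this page; the o_custkey and o_orderstatus columns are assumed), a query of this shape uses UPPER(), YEAR(), and ABS() from the list below in the SELECT, WHERE, and ORDER BY clauses, so all three are processed in a distributed manner:

-- UPPER(), YEAR(), and ABS() all appear in the distributed-function list below,
-- so the projection, filter, and ordering are pushed down to ColumnStore.
SELECT o_custkey,
       UPPER(o_orderstatus) AS status,
       ABS(o_totalprice) AS total_price
FROM orders
WHERE YEAR(o_orderdate) = 1997
ORDER BY UPPER(o_orderstatus), total_price;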




  • See Also

    • ColumnStore Non-Distributed Post-Processed Functions

    openssl rand -hex 32
    93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd
    curl -k -s -X PUT https://mcs1:8640/cmapi/0.4.0/cluster/node \
       --header 'Content-Type:application/json' \
       --header 'x-api-key:93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd' \
       --data '{"timeout":120, "node": "192.0.2.1"}' \
       | jq .
    sudo systemctl restart mariadb-columnstore-cmapi
    sudo systemctl restart mariadb-columnstore-cmapi
    [application]
    auto_failover = False
    sudo journalctl -u mariadb-columnstore-cmapi

    ColumnStore Window Functions

    Introduction

    MariaDB ColumnStore provides support for window functions broadly following the SQL 2003 specification. A window function allows for calculations relating to a window of data surrounding the current row in a result set. This capability provides for simplified queries in support of common business questions such as cumulative totals, rolling averages, and top 10 lists.

Aggregate functions can be used as window functions; however, they differ in behavior from a GROUP BY query because the rows remain ungrouped. This provides support for cumulative sums and rolling averages, for example.

    Two key concepts for window functions are Partition and Frame:

• A Partition is a group of rows, or window, that have the same value for a specific column. For example, a Partition can be created over a time period such as a quarter, or over lookup values.

    • The Frame for each row is a subset of the row's Partition. The frame typically is dynamic allowing for a sliding frame of rows within the Partition. The Frame determines the range of rows for the windowing function. A Frame could be defined as the last X rows and next Y rows all the way up to the entire Partition.

    Window functions are applied after joins, group by, and having clauses are calculated.

    Syntax

    A window function is applied in the select clause using the following syntax:

    where window_definition is defined as:

    PARTITION BY

    • Divides the window result set into groups based on one or more expressions.

• An expression may be a constant, a column, or a non-window-function expression.

    • A query is not limited to a single partition by clause. Different partition clauses can be used across different window function applications.

    • The partition by columns do not need to be in the select list but do need to be available from the query result set.

    ORDER BY

    • Defines the ordering of values within the partition.

    • Can be ordered by multiple keys which may be a constant, column or non window function expression.

    • The order by columns do not need to be in the select list but need to be available from the query result set.

    • Use of a select column alias from the query is not supported.

    and the optional frame_clause is defined as:

    and the optional frame_start and frame_end are defined as (value being a numeric expression):

    RANGE/ROWS

• Defines the windowing clause used to calculate the set of rows that the function applies to when computing a given row's window function result.

    • Requires an ORDER BY clause to define the row order for the window.

• ROWS specifies the window in physical units, i.e. result set rows, and must be a constant or expression evaluating to a positive numeric value.

    Supported Window Functions

    Function
    Description

    Examples

    Example Schema

    The examples are all based on the following simplified sales opportunity table:

Some example values are shown below (thanks to www.mockaroo.com for sample data generation):

    id
    accountName
    name
    owner
    amount
    closeDate
    stageName

    The schema, sample data, and queries are available as an attachment to this article.

    Cumulative Sum and Running Max Example

    Window functions can be used to achieve cumulative / running calculations on a detail report. In this case a won opportunity report for a 7 day period adds columns to show the accumulated won amount as well as the current highest opportunity amount in preceding rows.

    with example results:

    owner
    accountName
    CloseDate
    amount
    cumeWon
    runningMax

    Partitioned Cumulative Sum and Running Max Example

The above example can be partitioned so that the window functions operate over a particular field grouping, such as owner, and accumulate within that grouping. This is achieved by adding "PARTITION BY owner" to the window function clause.

    with example results:

    owner
    accountName
    CloseDate
    amount
    cumeWon
    runningMax

    Ranking / Top Results

The rank window function allows for ranking or assigning a numeric order value based on the window function definition. The Rank() function returns the same value for ties / equal values, and the next rank value is skipped. The Dense_Rank() function behaves similarly, except that the next consecutive number is used after a tie rather than being skipped. The Row_Number() function provides a unique ordering value. The example query shows the Rank() function being applied to rank sales reps by the number of opportunities for Q4 2016.

    with example results (note the query is technically incorrect by using closeDate < '2016-12-31' however this creates a tie scenario for illustrative purposes):

    owner
    wonCount
    rank

    If the dense_rank function is used the rank values would be 1,2,3,3,4 and for the row_number function the values would be 1,2,3,4,5.
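To make the difference concrete, the ranking query shown later on this page can be extended (a sketch reusing the same derived table t) to return all three functions side by side:

SELECT owner,
       wonCount,
       RANK() OVER (ORDER BY wonCount DESC) rnk,
       DENSE_RANK() OVER (ORDER BY wonCount DESC) dense_rnk,
       ROW_NUMBER() OVER (ORDER BY wonCount DESC) row_num
FROM (
  SELECT owner,
  COUNT(*) wonCount
  FROM opportunities
  WHERE stageName='ClosedWon'
  AND closeDate >= '2016-10-01' AND closeDate < '2016-12-31'
  GROUP BY owner
) t
ORDER BY rnk;
-- RANK() skips a value after a tie, DENSE_RANK() does not, and ROW_NUMBER()
-- always produces a unique sequence.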

    First and Last Values

    The first_value and last_value functions allow determining the first and last values of a given range. Combined with a group by this allows summarizing opening and closing values. The example shows a more complex case where detailed information is presented for first and last opportunity by quarter.

    with example results:

    year
    quarter
    firstAccountName
    firstOwner
    firstAmount
    lastAccountName
    lastOwner
    lastAmount

    Prior and Next Example

Sometimes it is useful to understand the previous and next values in the context of a given row. The lag and lead window functions provide this capability. By default the offset is one, giving the prior or next value, but a larger offset can also be specified (see the sketch at the end of this example). The example query is a report of opportunities by account name showing the opportunity amount, and the prior and next opportunity amount for that account by close date.

    with example results:

    accountName
    closeDate
    currentOppAmount
    priorAmount
    nextAmount
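As a sketch of the larger-offset form mentioned above (same opportunities table; the column aliases are illustrative):

SELECT accountName,
       closeDate,
       amount currentOppAmount,
       LAG(amount, 2) OVER (PARTITION BY accountName ORDER BY closeDate) priorAmount2,
       LEAD(amount, 2) OVER (PARTITION BY accountName ORDER BY closeDate) nextAmount2
FROM opportunities
ORDER BY accountName, closeDate
LIMIT 9;
-- An offset of 2 looks two opportunities back/ahead per account; rows without
-- such a neighbour return NULL (the default third argument).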

    Quartiles Example

The NTile window function allows for breaking up a data set into portions, assigning a numeric value to each portion of the range. NTile(4) breaks the data up into quartiles (4 sets). The example query produces a report of all opportunities summarizing the quartile boundaries of amount values.

    With example results:

    quartile
    min
    max

    Percentile Example

The percentile functions have a slightly different syntax from other window functions, as can be seen in the example below. These functions can only be applied to numeric values. The argument to the function is the percentile to evaluate. Following 'within group' is the sort expression, which indicates the sort column and optionally the order. Finally, after 'over' is an optional partition by clause; for no partition clause use 'over ()'. The example below uses the value 0.5 to calculate the median opportunity amount in the rows. The values sometimes differ because percentile_cont returns the average of the 2 middle rows for an even data set, while percentile_disc returns the first value encountered in the sort.

    With example results:

    owner
    accountName
    CloseDate
    amount
    pct_cont
    pct_disc

    If there is no PARTITION BY clause, all rows of the result set define the group.

    ASC (default) and DESC options allow for ordering ascending or descending.

• NULLS FIRST and NULLS LAST options specify whether null values come first or last in the ordering sequence. NULLS FIRST is the default for ASC order, and NULLS LAST is the default for DESC order.

  • RANGE specifies the window as a logical offset. If the expression evaluates to a numeric value, then the ORDER BY expression must be a numeric or DATE type. If the expression evaluates to an interval value, then the ORDER BY expression must be a DATE data type.
  • UNBOUNDED PRECEDING indicates the window starts at the first row of the partition.

  • UNBOUNDED FOLLOWING indicates the window ends at the last row of the partition.

• CURRENT ROW specifies that the window starts or ends at the current row or value.

  • If omitted, the default is ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW.

  • LAST_VALUE()

    The value evaluated at the row that is the last row of the window frame (counting from 1); null if no such row.

    LEAD()

    Provides access to a row at a given physical offset beyond that position. Returns value evaluated at the row that is offset rows after the current row within the partition; if there is no such row, instead return default. Both offset and default are evaluated with respect to the current row. If omitted, offset defaults to 1 and default to null.

    MAX()

    Maximum value of expression across all input values.

    MEDIAN()

    An inverse distribution function that assumes a continuous distribution model. It takes a numeric or datetime value and returns the middle value or an interpolated value that would be the middle value once the values are sorted. Nulls are ignored in the calculation. Not available in MariaDB Columnstore 1.1

    MIN()

    Minimum value of expression across all input values.

    NTH_VALUE()

    The value evaluated at the row that is the nth row of the window frame (counting from 1); null if no such row.

    NTILE()

    Divides an ordered data set into a number of buckets indicated by expr and assigns the appropriate bucket number to each row. The buckets are numbered 1 through expr. The expr value must resolve to a positive constant for each partition. Integer ranging from 1 to the argument value, dividing the partition as equally as possible.

    PERCENT_RANK()

    Relative rank of the current row: (rank - 1) / (total rows - 1).

    PERCENTILE_CONT()

    An inverse distribution function that assumes a continuous distribution model. It takes a percentile value and a sort specification, and returns an interpolated value that would fall into that percentile value with respect to the sort specification. Nulls are ignored in the calculation. Not available in MariaDB Columnstore 1.1

    PERCENTILE_DISC()

    An inverse distribution function that assumes a discrete distribution model. It takes a percentile value and a sort specification and returns an element from the set. Nulls are ignored in the calculation. Not available in MariaDB Columnstore 1.1

    RANK()

    Rank of the current row with gaps; same as row_number of its first peer.

    REGR_COUNT(ColumnY, ColumnX)

    The total number of input rows in which both column Y and column X are nonnull

    REGR_SLOPE(ColumnY, ColumnX)

    The slope of the least-squares-fit linear equation determined by the (ColumnX, ColumnY) pairs

    REGR_INTERCEPT(ColumnY, ColumnX)

    The y-intercept of the least-squares-fit linear equation determined by the (ColumnX, ColumnY) pairs

    REGR_R2(ColumnY, ColumnX)

    Square of the correlation coefficient. correlation coefficient is the regr_intercept(ColumnY, ColumnX) for linear model

    REGR_SXX(ColumnY, ColumnX)

    REGR_COUNT(y, x) * VAR_POP(x) for non-null pairs.

    REGR_SXY(ColumnY, ColumnX)

    REGR_COUNT(y, x) * COVAR_POP(y, x) for non-null pairs.

    REGR_SYY(ColumnY, ColumnX)

REGR_COUNT(y, x) * VAR_POP(y) for non-null pairs. (A usage sketch for the regression functions appears after this list.)

    ROW_NUMBER()

    Number of the current row within its partition, counting from 1

    STDDEV() STDDEV_POP()

    Computes the population standard deviation and returns the square root of the population variance.

    STDDEV_SAMP()

    Computes the cumulative sample standard deviation and returns the square root of the sample variance.

    SUM()

    Sum of expression across all input values.

    VARIANCE() VAR_POP()

    Population variance of the input values (square of the population standard deviation).

    VAR_SAMP()

    Sample variance of the input values (square of the sample standard deviation).
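The regression functions above take two numeric columns. As a hedged sketch (the measurements table and its sensor_id, x, and y columns are hypothetical and not part of this article's schema), they can be applied over a partition like any other window aggregate:

SELECT sensor_id,
       x,
       y,
       REGR_SLOPE(y, x) OVER (PARTITION BY sensor_id) slope,
       REGR_INTERCEPT(y, x) OVER (PARTITION BY sensor_id) intercept,
       REGR_R2(y, x) OVER (PARTITION BY sensor_id) r_squared
FROM measurements
ORDER BY sensor_id, x;
-- Each row of a sensor's partition carries the least-squares slope, intercept,
-- and R-squared computed from that sensor's (x, y) pairs.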

    Organic demand-driven benchmark

    Maria

    477878.41

    2016-11-28

    ClosedWon

    3

    Miboo

    De-engineered hybrid groupware

    Olivier

    80181.78

    2017-01-05

    ClosedWon

    4

    Youbridge

    Enterprise-wide bottom-line Graphic Interface

    Chris

    946245.29

    2016-07-02

    ClosedWon

    5

    Skyba

    Reverse-engineered fresh-thinking standardization

    Maria

    696241.82

    2017-02-17

    Negotiating

    6

    Eayo

    Fundamental well-modulated artificial intelligence

    Bob

    765605.52

    2016-08-27

    Prospecting

    7

    Yotz

    Extended secondary infrastructure

    Chris

    319624.20

    2017-01-06

    ClosedLost

    8

    Oloo

    Configurable web-enabled data-warehouse

    Chris

    321016.26

    2017-03-08

    ClosedLost

    9

    Kaymbo

    Multi-lateral web-enabled definition

    Bob

    690881.01

    2017-01-02

    Developing

    10

    Rhyloo

    Public-key coherent infrastructure

    Chris

    965477.74

    2016-11-07

    Prospecting

    583722.98

    437636.47

    Olivier

    Devpulse

    2016-10-05

    834235.93

    1417958.91

    834235.93

    Chris

    Linkbridge

    2016-10-07

    539977.45

    2458738.65

    834235.93

    Olivier

    Trupe

    2016-10-07

    500802.29

    1918761.20

    834235.93

    Bill

    Latz

    2016-10-08

    857254.87

    3315993.52

    857254.87

    Chris

    Avamm

    2016-10-09

    699566.86

    4015560.38

    857254.87

    583722.98

    437636.47

    Bill

    Latz

    2016-10-08

    857254.87

    1440977.85

    857254.87

    Chris

    Linkbridge

    2016-10-07

    539977.45

    539977.45

    539977.45

    Chris

    Avamm

    2016-10-09

    699566.86

    1239544.31

    699566.86

    Olivier

    Devpulse

    2016-10-05

    834235.93

    834235.93

    834235.93

    Olivier

    Trupe

    2016-10-07

    500802.29

    1335038.22

    834235.93

    Olivier

    10

    5

    2016

    4

    Skimia

    Chris

    961513.59

    Avamm

    Maria

    112493.65

    2017

    1

    Yombu

    Bob

    536875.51

    Skaboo

    Chris

    270273.08

    2016-12-18

    350235.75

    161086.82

    878595.89

    Abata

    2016-12-31

    878595.89

    350235.75

    922322.39

    Abata

    2017-01-21

    922322.39

    878595.89

    NULL

    Abatz

    2016-10-19

    795424.15

    NULL

    NULL

    Agimba

    2016-07-09

    288974.84

    NULL

    914461.49

    Agimba

    2016-09-07

    914461.49

    288974.84

    176645.52

    Agimba

    2016-09-20

    176645.52

    914461.49

    NULL

    437636.4700000000

    437636.47

    Bill

    Latz

    2016-10-08

    857254.87

    437636.4700000000

    437636.47

    Chris

    Linkbridge

    2016-10-07

    539977.45

    619772.1550000000

    539977.45

    Chris

    Avamm

    2016-10-09

    699566.86

    619772.1550000000

    539977.45

    Olivier

    Devpulse

    2016-10-05

    834235.93

    667519.1100000000

    500802.29

    Olivier

    Trupe

    2016-10-07

    500802.29

    667519.1100000000

    500802.29

    AVG()

    The average of all input values.

    COUNT()

    Number of input rows.

    CUME_DIST()

    Calculates the cumulative distribution, or relative rank, of the current row to other rows in the same partition. Number of peer or preceding rows / number of rows in partition.

    DENSE_RANK()

    Ranks items in a group leaving no gaps in ranking sequence when there are ties.

    FIRST_VALUE()

    The value evaluated at the row that is the first row of the window frame (counting from 1); null if no such row.

    LAG()

    The value evaluated at the row that is offset rows before the current row within the partition; if there is no such row, instead return default. Both offset and default are evaluated with respect to the current row. If omitted, offset defaults to 1 and default to null. LAG provides access to more than one row of a table at the same time without a self-join. Given a series of rows returned from a query and a position of the cursor, LAG provides access to a row at a given physical offset prior to that position.

    1

    Browseblab

    Multi-lateral executive function

    Bob

    26444.86

    2016-10-20

    Negotiating

    2

    Bill

    Babbleopia

    2016-10-02

    437636.47

    437636.47

    437636.47

    Bill

    Thoughtworks

    2016-10-04

    Bill

    Babbleopia

    2016-10-02

    437636.47

    437636.47

    437636.47

    Bill

    Thoughtworks

    2016-10-04

    Bill

    19

    1

    Chris

    15

    2

    Maria

    14

    3

    Bob

    14

    3

    2016

    3

    Skidoo

    Bill

    523295.07

    Skipstorm

    Bill

    Abata

    2016-09-10

    645098.45

    NULL

    161086.82

    Abata

    2016-10-14

    161086.82

    645098.45

    350235.75

    1

    6337.15

    287634.01

    2

    288796.14

    539977.45

    3

    540070.04

    748727.51

    4

    753670.77

    998864.47

    Bill

    Babbleopia

    2016-10-02

    437636.47

    437636.4700000000

    437636.47

    Bill

    Thoughtworks

    2016-10-04

    www.mockaroo.com

    Mita

    146086.51

    146086.51

    151420.86

    Abata

    146086.51

    function_name ([expression [, expression ... ]]) OVER ( window_definition )
    [ PARTITION BY expression [, ...] ]
    [ ORDER BY expression [ ASC | DESC ] [ NULLS { FIRST | LAST } ] [, ...] ]
    [ frame_clause ]
    { RANGE | ROWS } frame_start
    { RANGE | ROWS } BETWEEN frame_start AND frame_end
    UNBOUNDED PRECEDING
    value PRECEDING
    CURRENT ROW
    value FOLLOWING
    UNBOUNDED FOLLOWING
    CREATE TABLE opportunities (
    id INT,
    accountName VARCHAR(20),
    name VARCHAR(128),
    owner VARCHAR(7),
    amount DECIMAL(10,2),
    closeDate DATE,
    stageName VARCHAR(11)
    ) ENGINE=columnstore;
    SELECT owner, 
    accountName, 
    CloseDate, 
    amount, 
    SUM(amount) OVER (ORDER BY CloseDate ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) cumeWon, 
    MAX(amount) OVER (ORDER BY CloseDate ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) runningMax
    FROM opportunities 
    WHERE stageName='ClosedWon' 
    AND closeDate >= '2016-10-02' AND closeDate <= '2016-10-09' 
    ORDER BY CloseDate;
    SELECT owner,  
    accountName,  
    CloseDate,  
    amount,  
    SUM(amount) OVER (PARTITION BY owner ORDER BY CloseDate ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) cumeWon,  
    MAX(amount) OVER (PARTITION BY owner ORDER BY CloseDate ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) runningMax 
    FROM opportunities  
    WHERE stageName='ClosedWon' 
    AND closeDate >= '2016-10-02' AND closeDate <= '2016-10-09'  
    ORDER BY owner, CloseDate;
    SELECT owner, 
    wonCount, 
    rank() OVER (ORDER BY wonCount DESC) rank 
    FROM (
      SELECT owner, 
      COUNT(*) wonCount 
      FROM opportunities 
      WHERE stageName='ClosedWon' 
      AND closeDate >= '2016-10-01' AND closeDate < '2016-12-31'  
      GROUP BY owner
    ) t
    ORDER BY rank;
    SELECT a.YEAR, 
    a.quarter, 
    f.accountName firstAccountName, 
    f.owner firstOwner, 
    f.amount firstAmount, 
    l.accountName lastAccountName, 
    l.owner lastOwner, 
    l.amount lastAmount 
    FROM (
      SELECT YEAR, 
      QUARTER, 
      MIN(firstId) firstId, 
      MIN(lastId) lastId 
      FROM (
        SELECT YEAR(closeDate) YEAR, 
        quarter(closeDate) QUARTER, 
        first_value(id) OVER (PARTITION BY YEAR(closeDate), quarter(closeDate) ORDER BY closeDate ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) firstId, 
        last_value(id) OVER (PARTITION BY YEAR(closeDate), quarter(closeDate) ORDER BY closeDate ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) lastId 
        FROM opportunities  WHERE stageName='ClosedWon'
      ) t 
      GROUP BY YEAR, QUARTER ORDER BY YEAR,QUARTER
    ) a 
    JOIN opportunities f ON a.firstId = f.id 
    JOIN opportunities l ON a.lastId = l.id 
    ORDER BY YEAR, QUARTER;
    SELECT accountName, 
    closeDate,  
    amount currentOppAmount, 
    lag(amount) OVER (PARTITION BY accountName ORDER BY closeDate) priorAmount, lead(amount) OVER (PARTITION BY accountName ORDER BY closeDate) nextAmount 
    FROM opportunities 
    ORDER BY accountName, closeDate 
    LIMIT 9;
    SELECT t.quartile, 
    MIN(t.amount) MIN, 
    MAX(t.amount) MAX 
    FROM (
      SELECT amount, 
      ntile(4) OVER (ORDER BY amount ASC) quartile 
      FROM opportunities 
      WHERE closeDate >= '2016-10-01' AND closeDate <= '2016-12-31'
      ) t 
    GROUP BY quartile 
    ORDER BY quartile;
    SELECT owner,  
    accountName,  
    CloseDate,  
    amount,
    percentile_cont(0.5) within GROUP (ORDER BY amount) OVER (PARTITION BY owner) pct_cont,
    percentile_disc(0.5) within GROUP (ORDER BY amount) OVER (PARTITION BY owner) pct_disc
    FROM opportunities  
    WHERE stageName='ClosedWon' 
    AND closeDate >= '2016-10-02' AND closeDate <= '2016-10-09'  
    ORDER BY owner, CloseDate;
    &
    ABS()
    ACOS()
    ADDDATE()
    ADDTIME()
    ASCII()
    ASIN()
    ATAN()
    BETWEEN...AND...
    BIT_AND()
    BIT_OR()
    BIT_XOR()
    CASE()
    CAST()
    CEIL()
    CEILING()
    CHAR_LENGTH(), CHARACTER_LENGTH()
    COALESCE()
    CONCAT()
    CONCAT_WS()
    CONV()
    CONVERT()
    COS()
    COT()
    CRC32()
    DATE()
    DATE_ADD()
    DATE_FORMAT()
    DATE_SUB()
    DATEDIFF()
    DAY(), DAYOFMONTH()
    DAYNAME()
    DAYOFWEEK()
    DAYOFYEAR()
    DEGREES()
    DIV
    ELT()
    EXP()
    EXTRACT()
    FIND_IN_SET()
    FLOOR()
    FORMAT()
    FROM_DAYS()
    FROM_UNIXTIME()
    GET_FORMAT()
    GREATEST()
    GROUP_CONCAT()
    HEX()
    HOUR()
    IF()
    IFNULL()
    IN
    INET_ATON()
    INET_NTOA()
    INSERT()
    INSTR()
    ISNULL()
    LAST_DAY()
    LCASE()
    LEAST()
    LEFT()
    LENGTH()
    LIKE
    LN()
    LOCATE()
    LOG()
    LOG2()
    LOG10()
    LOWER()
    LPAD()
    LTRIM()
    MAKEDATE()
    MAKETIME()
    MD5()
    MICROSECOND()
    MINUTE()
    MOD()
    MONTH()
    MONTHNAME()
    NOW()
    NULLIF()
    PERIOD_ADD()
    PERIOD_DIFF()
    POSITION()
    POW()
    POWER()
    QUARTER()
    RADIANS()
    RAND()
    REGEXP()
    REPEAT()
    REPLACE()
    REVERSE()
    RIGHT()
    RLIKE()
    ROUND()
    RPAD()
    RTRIM()
    SEC_TO_TIME()
    SECOND()
    SHA(), SHA1()
    SIGN()
    SIN()
    SPACE()
    SQRT()
    STR_TO_DATE()
    STRCMP()
    SUBDATE()
    SUBSTR()
    SUBSTRING()
    SUBSTRING_INDEX()
    SUBTIME()
    SYSDATE()
    TAN()
    TIME()
    TIME_FORMAT()
    TIME_TO_SEC()
    TIMEDIFF()
    TIMESTAMPADD()
    TIMESTAMPDIFF()
    TO_DAYS()
    TRIM()
    TRUNCATE()
    UCASE()
    UNIX_TIMESTAMP()
    UNIX_TIME()
    UPPER()
    WEEK()
    WEEKDAY()
    WEEKOFYEAR()
    XOR()
    YEAR()
    YEARWEEK()

    This page is: Copyright © 2025 MariaDB. All rights reserved.
