AVG()

USAGE

AVG([ALL | DISTINCT] expr) [OVER (window_clause)]

DETAILS

  • AVG() is an aggregate function which returns the average of a set of values, as a decimal value.

  • "Average" is the arithmetic mean of a set of values.

  • NULL values are not included in computing the average.

  • As detailed in ISO 9075:2016, AVG() has implementation-defined behavior:

    • When acting upon an exact numeric data type (such as INTEGER, NUMERIC, DECIMAL), the resulting data type, precision, and scale behavior is implementation-defined. Precision and scale will be at least that of the processed values.

    • When acting upon an approximate numeric data type (FLOAT, DOUBLE), the resulting data type and precision are implementation-defined. Precision will be at least that of the processed values.

    • This behavior is implementation-defined and may differ on other database platforms. Testing may be appropriate when migrating from other database platforms.

    • For additional information on the standard, see ISO/IEC 9075-2:2016, 10.9 <aggregate function>.

SYNONYMS

SCHEMA

PARAMETERS

ALL or DISTINCT

If the parameters start with either the ALL or DISTINCT clause, it specifies if either all values are to be used in computing the average or if duplicate values are first removed from the list of values before computing the average. The default behavior is ALL when this parameter is not specified.

expr

The expression specifies what values are to be averaged. It can be a column in a field list, an expression comprised of multiple columns, a variable, or a numeric literal.

OVER (window_clause)

If the function is followed by OVER and a window clause, the function aggregates the values in the window of values into an average instead of aggregating any of the row values together.

SKYSQL

PRIVILEGES

EXAMPLES

Example Schema and Data

Some of the examples are based on the temperatures table:

CREATE TABLE temperatures (
  location VARCHAR(255),
  year INT,
  temp INT
);
INSERT INTO temperatures VALUES
  -- 2020
  ('Berlin',2020,-1), ('Berlin',2020,15),
  ('China',2020,15), ('China',2020,0),
  ('Denmark',2020,0), ('Denmark',2020,-10), ('Denmark',2020,10),
  ('Paris',2020,10), ('Paris',2020,5), ('Paris',2020,-1),
  -- 1920
  ('Berlin',1920,-10), ('China',1920,NULL), ('Denmark',1920,-1),
  ('Paris',1920,20),
  -- 1820
  ('Berlin',1820,NULL), ('China',1820,NULL), ('Denmark',1820,NULL),
  ('Paris',1820,NULL);

ALL Clause (default behavior)

The ALL clause is the default behavior for AVG(), applied when DISTINCT is not specified. In this example, all numeric temperature values are used to compute the average.

The values to be averaged:

SELECT * FROM temperatures WHERE year = 2020
 ORDER BY temp, location;
+----------+------+------+
| location | year | temp |
+----------+------+------+
| Denmark  | 2020 |  -10 |
| Berlin   | 2020 |   -1 |
| Paris    | 2020 |   -1 |
| China    | 2020 |    0 |
| Denmark  | 2020 |    0 |
| Paris    | 2020 |    5 |
| Denmark  | 2020 |   10 |
| Paris    | 2020 |   10 |
| Berlin   | 2020 |   15 |
| China    | 2020 |   15 |
+----------+------+------+

The result of averaging these values can be requested with or without the "ALL" clause:

SELECT AVG(ALL temp), AVG(temp)
 FROM temperatures WHERE year = 2020;
+---------------+-----------+
| AVG(ALL temp) | AVG(temp) |
+---------------+-----------+
|        4.3000 |    4.3000 |
+---------------+-----------+

DISTINCT Clause

Using AVG() with the DISTINCT clause ignores duplicate values from the list, so each value is counted and summed only once when computing the average. In this example, all distinct numeric temperature values are used to compute the average:

SELECT AVG(DISTINCT temp)
 FROM temperatures WHERE year = 2020;
+--------------------+
| AVG(DISTINCT temp) |
+--------------------+
|             3.1667 |
+--------------------+

With GROUP BY

Using AVG() with GROUP BY creates a separate average for each group. In this example, GROUP BY location creates a separate average for each unique location:

SELECT location, AVG(temp)
 FROM temperatures
 GROUP BY location ORDER BY location;
+----------+-----------+
| location | AVG(temp) |
+----------+-----------+
| Berlin   |    1.3333 |
| China    |    7.5000 |
| Denmark  |   -0.2500 |
| Paris    |    8.5000 |
+----------+-----------+

With NULL Values

When AVG() acts on NULL values, those values are not included when computing the average. In this example, some rows contain NULL values:

SELECT * FROM temperatures
 WHERE year = 1920 ORDER BY temp;
+----------+------+------+
| location | year | temp |
+----------+------+------+
| China    | 1920 | NULL |
| Berlin   | 1920 |  -10 |
| Denmark  | 1920 |   -1 |
| Paris    | 1920 |   20 |
+----------+------+------+
SELECT AVG(temp) FROM temperatures WHERE year = 1920;
+-----------+
| AVG(temp) |
+-----------+
|    3.0000 |
+-----------+

With Only NULL Values

When AVG() is acting only on NULL values, AVG() returns NULL. In this example, all rows contain NULL values:

SELECT AVG(temp) FROM temperatures WHERE year=1820;
+-----------+
| AVG(temp) |
+-----------+
|      NULL |
+-----------+

Precision and Scale

As detailed in ISO 9075:2016, the data type produced as the result of AVG() is implementation-defined. The result of AVG() will always have at least as much precision (INTEGER, DECIMAL, FLOAT, DOUBLE) and scale (INTEGER, DECIMAL) as the source values.

This example demonstrates precision and scale of AVG() acting on a DECIMAL(8,6):

CREATE TABLE avg_decimal (
  item VARCHAR(20),
  count DECIMAL(8,6)
);
INSERT INTO avg_decimal VALUES
  ('Foo', 1.123456),
  ('Bar', 3.141592),
  ('Baz', 9.999999);
SELECT AVG(count) FROM avg_decimal;
+--------------+
| AVG(count)   |
+--------------+
| 4.7550156667 |
+--------------+

This example demonstrates precision and scale of AVG() acting on a DECIMAL(11,9), an increase in precision from our prior input which results in increased output precision:

ALTER TABLE avg_decimal MODIFY count DECIMAL(11,9);
SELECT AVG(count) FROM avg_decimal;
+-----------------+
| AVG(count)      |
+-----------------+
| 4.7550156666667 |
+-----------------+

As Window Function

As detailed in ISO 9075:2016, aggregate functions can be used as window functions. When an aggregate function is used as a window function, the result is derived from the window frame of the row. The window frame is defined with the OVER clause.

AVG() can be used as a window function by including the OVER clause. This example demonstrates the use of AVG() as a window function:

CREATE TABLE inventory (
  item VARCHAR(20),
  category VARCHAR(20),
  count INT
);
INSERT INTO inventory VALUES
  ('Foo','in stock', 42), ('Foo','back order', 50),
  ('Bar','in stock', 10), ('Bar','back order', 60),
  ('Baz','in stock', 99), ('Baz','back order', 10);
SELECT item, category, count,
       AVG(count) OVER (PARTITION BY category) AS average_by_category
  FROM inventory ORDER BY item, category;
+------+------------+-------+---------------------+
| item | category   | count | average_by_category |
+------+------------+-------+---------------------+
| Bar  | back order |    60 |             40.0000 |
| Bar  | in stock   |    10 |             50.3333 |
| Baz  | back order |    10 |             40.0000 |
| Baz  | in stock   |    99 |             50.3333 |
| Foo  | back order |    50 |             40.0000 |
| Foo  | in stock   |    42 |             50.3333 |
+------+------------+-------+---------------------+

ERROR HANDLING

FEATURE INTERACTION

sql_mode=ONLY_FULL_GROUP_BY

MariaDB Server does not implement ISO 9075:2016 Feature T301, Functional Dependencies.

By default, MariaDB Server will accept the following query which does not conform (without Feature T301) to ISO 9075:2016 because this query contains a non-aggregated column (location) in the SELECT list which is not mentioned in a GROUP BY list:

SELECT location, AVG(temp)
FROM temperatures;
+----------+-----------+
| location | AVG(temp) |
+----------+-----------+
| Berlin   |    4.0000 |
+----------+-----------+

With the default behavior, MariaDB Server does not perform any sorting or grouping with the non-aggregated columns that are not in the GROUP BY list. Instead, it simply chooses the first value returned for that column in each group. If the column has the same value for each row in the group, then this behavior can help performance by avoiding unnecessary sorting. However, if the column has multiple values in the group, the behavior is nondeterministic.

The nondeterministic behavior does not apply when an aggregate function is used as a window function, because MariaDB Server derives the result for window functions from the window frame, rather than from aggregated rows.

MariaDB Server's nondeterministic behavior can be demonstrated by manually controlling the order that the rows are returned using CTEs.

The following CTE defines 'Berlin' as the value for the first row, so MariaDB Server returns that value:

WITH asc_sorted_temp AS (
   SELECT 'Berlin' AS location, 15 AS temp
   UNION
   SELECT 'Paris' AS location, 15 AS temp
)
SELECT location, AVG(temp)
FROM asc_sorted_temp;
+----------+-----------+
| location | AVG(temp) |
+----------+-----------+
| Berlin   |   15.0000 |
+----------+-----------+

In contrast, the following CTE defines 'Paris' as the value for the first row, so MariaDB Server returns that value:

WITH desc_sorted_temp AS (
   SELECT 'Paris' AS location, 15 AS temp
   UNION
   SELECT 'Berlin' AS location, 15 AS temp
)
SELECT location, AVG(temp)
FROM desc_sorted_temp;
+----------+-----------+
| location | AVG(temp) |
+----------+-----------+
| Paris    |   15.0000 |
+----------+-----------+

Some DBAs or application developers may find the nondeterministic behavior to be undesirable.

MariaDB Server allows this non-standard behavior to be disabled by adding the ONLY_FULL_GROUP_BY mode to the sql_mode system variable. When the non-standard behavior is disabled, an error with the ER_MIX_OF_GROUP_FUNC_AND_FIELDS error code will be raised if a query contains any non-aggregated columns in the SELECT list that are not in the GROUP BY list:

SET sql_mode=CONCAT(@@sql_mode, ',ONLY_FULL_GROUP_BY');

SELECT location, AVG(temp)
FROM temperatures;
ERROR 1140 (42000): Mixing of GROUP columns (MIN(),MAX(),COUNT(),...) with no GROUP columns is illegal if there is no GROUP BY clause

sql_mode=ORACLE

sql_mode=ORACLE enables SQL/PL, a compatible subset of Oracle PL/SQL.

With sql_mode=ORACLE, the behavior of the AVG() function does not change:

SET sql_mode=CONCAT(@@sql_mode, ',ORACLE');

SELECT location, AVG(temp)
FROM temperatures
GROUP BY location
ORDER BY location;
+----------+-----------+
| location | AVG(temp) |
+----------+-----------+
| Berlin   |    1.3333 |
| China    |    7.5000 |
| Denmark  |   -0.2500 |
| Paris    |    8.5000 |
+----------+-----------+

RESPONSES

DIAGNOSIS

ISO 9075:2016

CHANGE HISTORY

Release Series

History

10.6 Enterprise

  • Present starting in MariaDB Enterprise Server 10.6.4-1.

10.6 Community

  • Present starting in MariaDB Community Server 10.6.0.

10.5 Enterprise

  • Present starting in MariaDB Enterprise Server 10.5.3-1.

10.5 Community

  • Present starting in MariaDB Community Server 10.5.0.

10.4 Enterprise

  • Present starting in MariaDB Enterprise Server 10.4.6-1.

10.4 Community

  • Present starting in MariaDB Community Server 10.4.0.

10.3 Enterprise

  • Present starting in MariaDB Enterprise Server 10.3.16-1.

10.3 Community

  • Present starting in MariaDB Community Server 10.3.0.

10.2 Enterprise

  • Present starting in MariaDB Enterprise Server 10.2.25-1.

10.2 Community

  • Present starting in MariaDB Community Server 10.2.0.

Release Series

History

10.6 Enterprise

  • Present starting in MariaDB Enterprise Server 10.6.4-1.

10.5 Enterprise

  • Present starting in MariaDB Enterprise Server 10.5.3-1.

10.4 Enterprise

  • Present starting in MariaDB Enterprise Server 10.4.6-1.

EXTERNAL REFERENCES