AVG()
This page is part of MariaDB's Documentation.
The parent of this page is: Functions for MariaDB Enterprise Server
Topics on this page:
Overview
Returns the average of a set of values, as a decimal value.
USAGE
AVG([ALL | DISTINCT] expr) [OVER (window_clause)]
DETAILS
AVG()
is an aggregate function which returns the average of a set of values, as a decimal value."Average" is the arithmetic mean of a set of values.
NULL
values are not included in computing the average.As detailed in ISO 9075:2016,
AVG()
has implementation-defined behavior:When acting upon an exact numeric data type (such as
INTEGER
,NUMERIC
,DECIMAL
), the resulting data type, precision, and scale behavior is implementation-defined. Precision and scale will be at least that of the processed values.When acting upon an approximate numeric data type (
FLOAT
,DOUBLE
), the resulting data type and precision are implementation-defined. Precision will be at least that of the processed values.This behavior is implementation-defined and may differ on other database platforms. Testing may be appropriate when migrating from other database platforms.
For additional information on the standard, see ISO/IEC 9075-2:2016, 10.9 <aggregate function>.
PARAMETERS
| If the parameters start with either the |
| The expression specifies what values are to be averaged. It can be a column in a field list, an expression comprised of multiple columns, a variable, or a numeric literal. |
| If the function is followed by |
EXAMPLES
Example Schema and Data
Some of the examples are based on the temperatures
table:
CREATE TABLE temperatures (
location VARCHAR(255),
year INT,
temp INT
);
INSERT INTO temperatures VALUES
-- 2020
('Berlin',2020,-1), ('Berlin',2020,15),
('China',2020,15), ('China',2020,0),
('Denmark',2020,0), ('Denmark',2020,-10), ('Denmark',2020,10),
('Paris',2020,10), ('Paris',2020,5), ('Paris',2020,-1),
-- 1920
('Berlin',1920,-10), ('China',1920,NULL), ('Denmark',1920,-1),
('Paris',1920,20),
-- 1820
('Berlin',1820,NULL), ('China',1820,NULL), ('Denmark',1820,NULL),
('Paris',1820,NULL);
ALL Clause (default behavior)
The ALL
clause is the default behavior for AVG()
, applied when DISTINCT
is not specified. In this example, all numeric temperature values are used to compute the average.
The values to be averaged:
SELECT * FROM temperatures WHERE year = 2020
ORDER BY temp, location;
+----------+------+------+
| location | year | temp |
+----------+------+------+
| Denmark | 2020 | -10 |
| Berlin | 2020 | -1 |
| Paris | 2020 | -1 |
| China | 2020 | 0 |
| Denmark | 2020 | 0 |
| Paris | 2020 | 5 |
| Denmark | 2020 | 10 |
| Paris | 2020 | 10 |
| Berlin | 2020 | 15 |
| China | 2020 | 15 |
+----------+------+------+
The result of averaging these values can be requested with or without the "ALL" clause:
SELECT AVG(ALL temp), AVG(temp)
FROM temperatures WHERE year = 2020;
+---------------+-----------+
| AVG(ALL temp) | AVG(temp) |
+---------------+-----------+
| 4.3000 | 4.3000 |
+---------------+-----------+
DISTINCT Clause
Using AVG()
with the DISTINCT
clause ignores duplicate values from the list, so each value is counted and summed only once when computing the average. In this example, all distinct numeric temperature values are used to compute the average:
SELECT AVG(DISTINCT temp)
FROM temperatures WHERE year = 2020;
+--------------------+
| AVG(DISTINCT temp) |
+--------------------+
| 3.1667 |
+--------------------+
With GROUP BY
Using AVG()
with GROUP BY
creates a separate average for each group. In this example, GROUP BY location
creates a separate average for each unique location:
SELECT location, AVG(temp)
FROM temperatures
GROUP BY location ORDER BY location;
+----------+-----------+
| location | AVG(temp) |
+----------+-----------+
| Berlin | 1.3333 |
| China | 7.5000 |
| Denmark | -0.2500 |
| Paris | 8.5000 |
+----------+-----------+
With NULL Values
When AVG()
acts on NULL
values, those values are not included when computing the average. In this example, some rows contain NULL
values:
SELECT * FROM temperatures
WHERE year = 1920 ORDER BY temp;
+----------+------+------+
| location | year | temp |
+----------+------+------+
| China | 1920 | NULL |
| Berlin | 1920 | -10 |
| Denmark | 1920 | -1 |
| Paris | 1920 | 20 |
+----------+------+------+
SELECT AVG(temp) FROM temperatures WHERE year = 1920;
+-----------+
| AVG(temp) |
+-----------+
| 3.0000 |
+-----------+
With Only NULL Values
When AVG()
is acting only on NULL
values, AVG()
returns NULL
. In this example, all rows contain NULL
values:
SELECT AVG(temp) FROM temperatures WHERE year=1820;
+-----------+
| AVG(temp) |
+-----------+
| NULL |
+-----------+
Precision and Scale
As detailed in ISO 9075:2016, the data type produced as the result of AVG()
is implementation-defined. The result of AVG()
will always have at least as much precision (INTEGER
, DECIMAL
, FLOAT
, DOUBLE
) and scale (INTEGER
, DECIMAL
) as the source values.
This example demonstrates precision and scale of AVG()
acting on a DECIMAL(8,6)
:
CREATE TABLE avg_decimal (
item VARCHAR(20),
count DECIMAL(8,6)
);
INSERT INTO avg_decimal VALUES
('Foo', 1.123456),
('Bar', 3.141592),
('Baz', 9.999999);
SELECT AVG(count) FROM avg_decimal;
+--------------+
| AVG(count) |
+--------------+
| 4.7550156667 |
+--------------+
This example demonstrates precision and scale of AVG()
acting on a DECIMAL(11,9)
, an increase in precision from our prior input which results in increased output precision:
ALTER TABLE avg_decimal MODIFY count DECIMAL(11,9);
SELECT AVG(count) FROM avg_decimal;
+-----------------+
| AVG(count) |
+-----------------+
| 4.7550156666667 |
+-----------------+
As Window Function
As detailed in ISO 9075:2016, aggregate functions can be used as window functions. When an aggregate function is used as a window function, the result is derived from the window frame of the row. The window frame is defined with the OVER
clause.
AVG()
can be used as a window function by including the OVER
clause. This example demonstrates the use of AVG()
as a window function:
CREATE TABLE inventory (
item VARCHAR(20),
category VARCHAR(20),
count INT
);
INSERT INTO inventory VALUES
('Foo','in stock', 42), ('Foo','back order', 50),
('Bar','in stock', 10), ('Bar','back order', 60),
('Baz','in stock', 99), ('Baz','back order', 10);
SELECT item, category, count,
AVG(count) OVER (PARTITION BY category) AS average_by_category
FROM inventory ORDER BY item, category;
+------+------------+-------+---------------------+
| item | category | count | average_by_category |
+------+------------+-------+---------------------+
| Bar | back order | 60 | 40.0000 |
| Bar | in stock | 10 | 50.3333 |
| Baz | back order | 10 | 40.0000 |
| Baz | in stock | 99 | 50.3333 |
| Foo | back order | 50 | 40.0000 |
| Foo | in stock | 42 | 50.3333 |
+------+------------+-------+---------------------+
FEATURE INTERACTION
sql_mode=ONLY_FULL_GROUP_BY
MariaDB Server does not implement ISO 9075:2016 Feature T301, Functional Dependencies.
By default, MariaDB Server will accept the following query which does not conform (without Feature T301) to ISO 9075:2016 because this query contains a non-aggregated column (location
) in the SELECT
list which is not mentioned in a GROUP BY
list:
SELECT location, AVG(temp)
FROM temperatures;
+----------+-----------+
| location | AVG(temp) |
+----------+-----------+
| Berlin | 4.0000 |
+----------+-----------+
With the default behavior, MariaDB Server does not perform any sorting or grouping with the non-aggregated columns that are not in the GROUP BY
list. Instead, it simply chooses the first value returned for that column in each group. If the column has the same value for each row in the group, then this behavior can help performance by avoiding unnecessary sorting. However, if the column has multiple values in the group, the behavior is nondeterministic.
The nondeterministic behavior does not apply when an aggregate function is used as a window function, because MariaDB Server derives the result for window functions from the window frame, rather than from aggregated rows.
MariaDB Server's nondeterministic behavior can be demonstrated by manually controlling the order that the rows are returned using CTEs.
The following CTE defines 'Berlin'
as the value for the first row, so MariaDB Server returns that value:
WITH asc_sorted_temp AS (
SELECT 'Berlin' AS location, 15 AS temp
UNION
SELECT 'Paris' AS location, 15 AS temp
)
SELECT location, AVG(temp)
FROM asc_sorted_temp;
+----------+-----------+
| location | AVG(temp) |
+----------+-----------+
| Berlin | 15.0000 |
+----------+-----------+
In contrast, the following CTE defines 'Paris'
as the value for the first row, so MariaDB Server returns that value:
WITH desc_sorted_temp AS (
SELECT 'Paris' AS location, 15 AS temp
UNION
SELECT 'Berlin' AS location, 15 AS temp
)
SELECT location, AVG(temp)
FROM desc_sorted_temp;
+----------+-----------+
| location | AVG(temp) |
+----------+-----------+
| Paris | 15.0000 |
+----------+-----------+
Some DBAs or application developers may find the nondeterministic behavior to be undesirable.
MariaDB Server allows this non-standard behavior to be disabled by adding the ONLY_FULL_GROUP_BY
mode to the sql_SELECT
list that are not in the GROUP BY
list:
SET sql_mode=CONCAT(@@sql_mode, ',ONLY_FULL_GROUP_BY');
SELECT location, AVG(temp)
FROM temperatures;
ERROR 1140 (42000): Mixing of GROUP columns (MIN(),MAX(),COUNT(),...) with no GROUP columns is illegal if there is no GROUP BY clause
sql_mode=ORACLE
sql_mode=ORACLE
enables SQL/PL, a compatible subset of Oracle PL/SQL.
With sql_mode=ORACLE
, the behavior of the AVG()
function does not change:
SET sql_mode=CONCAT(@@sql_mode, ',ORACLE');
SELECT location, AVG(temp)
FROM temperatures
GROUP BY location
ORDER BY location;
+----------+-----------+
| location | AVG(temp) |
+----------+-----------+
| Berlin | 1.3333 |
| China | 7.5000 |
| Denmark | -0.2500 |
| Paris | 8.5000 |
+----------+-----------+