Comments - Should GROUP BY require ORDER BY in columnstore?

6 years, 2 months ago Andrew Hutchings

Hi,

That behaviour does not sound correct. Can you please file a bug, preferably with a test case, at:

https://jira.mariadb.org/projects/MCOL/

 
6 years, 2 months ago Mike Thibodeau

Thanks Andrew,

I have created MCOL-1180.

Also, seems like this may be related to MCOL-643. If you wrap the SUM or AVG function with a FORMAT function in the test case you get the error message:

 Error Code: 1815
Internal error: IDB-2015: Sorting length exceeded. Session variable max_length_for_sort_data needs to be set higher.

I found this to be the case even if I put the main query in a derived table. The query will execute if max_length_for_sort_data is set to 3M but will fail with the default of 1024.

SELECT `Day`,FORMAT(Total,4), FORMAT(Average,4) 
    FROM (
            SELECT DAY(timestamp_) AS 'Day',SUM(numeric_val) AS Total, AVG(numeric_val) AS Average 
                FROM cs1
                WHERE timestamp_ >= '2017-09-01 00:00:00' AND timestamp_ < '2017-10-01 00:00:00'
                  AND id IN (228091,228231,228233,227867,228216,228241)
                  GROUP BY DAY(timestamp_) ASC                
           ) dt 
      ORDER BY `Day`
;
 
6 years, 2 months ago Andrew Hutchings

In this particular case the GROUP BY wouldn't order because it is within a subquery. But the MCOL ticket has it without the subquery so we will investigate ASAP.

MCOL-643 is semi-related. You need to set it max_length_for_sort_data high in ColumnStore for now. The reason is too complex to explain here. But we are working with the MariaDB Server team on a solution for this right now.

 
6 years, 2 months ago Mike Thibodeau

Thanks Andrew, I had originally tested this as one query. I should have pasted in this query,

SELECT DAY(timestamp_),FORMAT(SUM(numeric_val),4), FORMAT(AVG(numeric_val),4) 
    FROM cs1
    WHERE timestamp_ >= '2017-09-01 00:00:00' AND timestamp_ < '2017-10-01 00:00:00'
       AND id IN (228091,228231,228233,227867,228216,228241)
      GROUP BY DAY(timestamp_) ASC;

I separated it hoping to get past the max_length_for_sort_data problem. This query executes if the columns 'Total' and 'Average' are not formatted.

 
Content reproduced on this site is the property of its respective owners, and this content is not reviewed in advance by MariaDB. The views, information and opinions expressed by this content do not necessarily represent those of MariaDB or any other party.