What are CTEs?
CTE is also known as the WITH clause or sub-query factoring. It is a named temporary result set that is used to manipulate the complex sub-query data.
In simple terms, the code can be made reusable if the already calculated values are stored in a temporary space, eliminating the need to calculate them again.
It can be used with SELECT, INSERT, UPDATE, DELETE, etc. Also, a CTE can reference other CTEs that are part of the same WITH clause, but those CTEs should be defined earlier.
MySQL CTE syntax
The structure of CTE includes:
- name: name of the CTE to store the temporary result
- column list: column to store the values in the CTE
- query to define the CTE; the result of this query is stored in the CTE.
CTE is always defined with the WITH
keyword, followed by the CTE name and column name. Once the CTE has been declared, it can be referenced to make use of the results. The execution scope of CTE exists within the statement in which it is used.
WITH cte_name (column_list) AS (
query
)
SELECT * FROM cte_name;
The number of columns in the CTE arguments must be equal to the number of columns in the query. The query columns that define the CTE will be used if we haven't declared the columns in the CTE parameters.
MySQL CTE example
Let us assume ABC Company sells goods to its customers online. Here is the orders
table representing their weekly sales.
We have the order and the orderDetails table, with the order having basic information about the order and the orderDetails having a price breakdown of the order. We want to get the order's total order value. To write a query for this, we need to calculate the total order value and then join the two tables.
Steps :
- Create a CTE -
orderValue
fromorderDetails
table - Store average in the column
orderNumber, orderTotalValue
- Join CTE with the
orders
table and show the orders along withorderTotalValue
WITH
`orderValue` (`orderNumber`, `orderTotalValue`) AS (
SELECT
`orderNumber`,
SUM(
`orderdetails`.`priceEach`*`orderdetails`.`quantityOrdered`
) AS `orderTotalValue`
FROM
`orderdetails`
GROUP BY
`orderNumber`
)
SELECT
`orders`.`orderNumber`,
`orderTotalValue`,
`orderDate`,
`status`
FROM
orders
INNER JOIN `orderValue` ON `orderValue`.`orderNumber`=`orders`.`orderNumber`;
Output:
Entries of those customers who ordered more than the average order price.
MySQL CTE use case
XYZ company has multiple electronics stores selling laptops, phones, desktops, etc. online. Here is the sales
table representing the weekly sales of all of the stores under XYZ.
We are supposed to show the entries of those stores having sales higher than the average sales for all the stores.
Steps :
- Calculate the total sales of each store.
- Calculate the average sales for all the stores.
- Show entries for all the stores whose total sales are greater than the average sales.
Without using MySQL CTE
-
Find the total sales of each store.
SELECT s.store_id, sum(s.cost) AS total_sales_per_store FROM sales s GROUP BY s.store_id;
Total sales of each store:
-
Find average sales with respect to all stores.
SELECT (avg(total_sales_per_store)) avg_sale_for_all_store FROM (SELECT s.store_id, sum(s.cost) AS total_sales_per_store FROM sales s GROUP BY s.store_id) x
Average sales of each store
-
Display data for all stores whose sales were better than the average sales across all stores.
SELECT * -- Total sales FROM (SELECT s.store_id, sum(s.cost) as total_sales_per_store FROM sales s GROUP BY s.store_id ) total_sales -- avg sales JOIN (SELECT avg(total_sales_per_store) avg_sale_for_all_store FROM (SELECT s.store_id, sum(s.cost) AS total_sales_per_store FROM sales s GROUP BY s.store_id) x ) avg_sales -- condition ON total_sales.total_sales_per_store > avg_sales.avg_sale_for_all_store;
Output:
Using MySQL CTE
As it gets difficult to read this code, we can use Common Table Expressions to store the results of already calculated parameters. The total sales have been calculated twice, which can be avoided.
We can use the CTEs to temporarily store the total sales in the total_sales
CTE and create total_sales_per_store
column.
Similarly, average sales in the avg_sales
CTE and avg_sale_for_all_store
columns.
After creating CTE, we can join avg_sales
with total_sales
and display all the entries that satisfy the condition of total_sales_per_store
> avg_sale_for_all_store
WITH total_sales AS
(SELECT s.store_id, sum(s.cost) AS total_sales_per_store
FROM sales s
GROUP BY s.store_id), -- Total sales
avg_sales as
(SELECT (avg(total_sales_per_store)) avg_sale_for_all_store
FROM total_sales) -- average sales
SELECT *
FROM total_sales
JOIN avg_sales
ON total_sales.total_sales_per_store > avg_sales.avg_sale_for_all_store;
Output:
MySQL Recursive CTE Syntax
When a subquery in a common table expression (CTE) refers to the CTE name itself, the CTE is said to be recursive.
The recursive CTEs are used for series generation and traversal of hierarchical or tree-structured data.
The recursive CTE syntax consists of three elements:
- Initial query (anchor): Base result set of the CTE structure. It is a non-recursive statement.
-
Recursive query: Query that references the CTE name; hence, it is called a recursive member. The recursive member is joined with the anchor member by a
UNION
orUNION DISTINCT
operator. - Termination condition: ensures the recursion stops when the recursive member returns no row.
WITH RECURSIVE ctename AS (
initial_query -- anchor
UNION ALL
recursive_query -- recursive member that references the CTE name
)
SELECT * FROM ctename;
Steps to execute a recursive CTE
- The anchor/initial query is executed first to store the base case’s result. This base result is used for the next iterations.
- The recursive query is then executed until the termination condition.
- The anchor member and the result from the recursive query are combined using the
UNION ALL
operator.
MySQL recursive CTE example
Consider a case to generate 10 order IDs starting from 1. Look at it as a while loop that keeps on executing until the termination condition.
WITH RECURSIVE natural_sequence AS
( SELECT 1 AS order_id -- anchor member: our sequence starts from 1
UNION ALL
SELECT order_id + 1 FROM natural_sequence -- recursive member: reference to itself
WHERE order_id < 10 -- stop condition
)
SELECT * FROM natural_sequence; -- executes CTE
Explanation
-
The non-recursive part gets executed.
SELECT 1 AS order_id
It produces initial rows, which consist of one column named
order_id
and a single row with entry 1. -
The recursive part gets executed. It will add rows until
order_id
< 10. -
When the
order_id
becomes 10, the condition becomes false, the recursion is terminated, and a series of 10 order ids are generated.
MySQL recursive CTE use cases
Case 1: Series Generation
Consider ABC shop selling stationery items online. Here is the
sales
table representing the weekly sales of ABC. We are supposed to show the total sales for each day according to the dates. Lets first show total sales per day usingSUM
and theGROUP BY
operatorSELECT order_date,SUM(price) AS sales FROM sales GROUP BY order_date;
Output: The manager of ABC wants to have a look at the sales of the first week of February i.e. from 1st - 7th of February. Also whichever day has 0 sales, 0.00 should be allocated to that day. This kind of weekly sales data can be used to create charts/graphs to understand the sales of the shop. We can use recursive CTE here to generate dates from 1st Aug to 7th Aug and later merge them with the
sales
table. We are using theCOALESCE
operator to allocateSUM
of prices to days having sales and 0 to the ones having no sales. Read more aboutCOALESCE
hereWITH RECURSIVE dates(date) AS ( SELECT '2020-02-01' -- anchor UNION ALL SELECT date + INTERVAL 1 DAY -- recursive query FROM dates WHERE date < '2020-02-07' ) --termination condition SELECT dates.date, COALESCE(SUM(price), 0) sales FROM dates LEFT JOIN sales ON dates.date = sales.order_date GROUP BY dates.date;
Output: Representing the weekly sales of the ABC shop
Case 2 - Hierarchical Data Traversal
Recursive CTEs can be used to play around with hierarchal data like the understanding hierarchy of employees in a company or a family tree, etc. Such hierarchical structures can also inform strategies for employee rewards and recognition, as they help identify key contributors and streamline acknowledgment processes in complex organizations.
Consider Amazon LLC which has multiple products like Amazon Prime, Amazon Market Place, Amazon Alexa, Amazon Echo Dot, etc. These products have sub-products like Amazon Echo Dot comes under Alexa or Amazon Fresh and comes under the Amazon Marketplace. Below is the organizational structure of amazon Theamazon
table represents the same where level indicating under which category the category falls. We are supposed to show which category falls under which organization. As Amazon fresh comes under the marketplace, we are supposed to show ‘Amazon > Amazon Market place > Amazon Fresh to show the hierarchy. Let's create a recursive CTEhierarchy
with columnsid
,name
,path
, andlevel
WITH RECURSIVE hierarchy(id, name, path, level) AS ( SELECT id, company_name, CAST(company_name AS CHAR(100)), 1 FROM amazon WHERE level IS NULL UNION ALL SELECT am.id, am.company_name, CONCAT(h.path,' -> ',am.company_name), h.level+1 FROM hierarchy h JOIN amazon am ON h.id=am.level ) SELECT * FROM hierarchy ORDER BY level;
Output: Representing the hierarchy of organization in Amazon LLC.
Limitations of Recursive CTE
Aggregate methods like
SUM()
,GROUP BY
,ORDER BY
,DISTINCT
, and Window functions shouldn't be used by the recursive CTE. For CTE that is not recursive, this is not true. The CTE must only be mentioned once in the recursiveSELECT
part'sFROM
clause and never in a subquery.
Conclusion
- By utilizing temporary tables and the WITH clause, MySQL CTE can be used to prevent the need for complex queries.
- It can also be used to chain CTEs in order to make the query simpler.
- The CTE allows us to use it in place of the VIEW idea. It can also be used to quickly build recursive queries.
- Recursive CTEs can be used to create series and navigate through data that is organized in a tree or hierarchy.