New Query Optimizer features in MariaDB 10.3
- 2. 2
Plan
● MariaDB 10.2: Condition pushdown
● MariaDB 10.3: Condition pushdown through window functions
● MariaDB 10.3: GROUP BY splitting
- 4. 4
Background – derived table merge
● “VIP customers and their big orders from October”
select *
from
vip_customer,
(select *
from orders
where order_date BETWEEN '2017-10-01' and '2017-10-31'
) as OCT_ORDERS
where
OCT_ORDERS.amount > 1000000 and
OCT_ORDERS.customer_id = vip_customer.customer_id
- 5. 5
Naive execution
[Diagram: step 1 computes OCT_ORDERS from orders; step 2 joins OCT_ORDERS with vip_customer and applies amount > 1M.]
- 6. 6
Derived table merge
Before the merge:
select *
from
vip_customer,
(select *
from orders
where
order_date BETWEEN '2017-10-01' and '2017-10-31'
) as OCT_ORDERS
where
OCT_ORDERS.amount > 1000000 and
OCT_ORDERS.customer_id = vip_customer.customer_id
After the merge:
select *
from
vip_customer,
orders
where
order_date BETWEEN '2017-10-01' and '2017-10-31'
and
orders.amount > 1000000 and
orders.customer_id = vip_customer.customer_id
- 7. 7
Execution after merge
[Diagram: vip_customer is joined directly with orders; the conditions "made in October" and amount > 1M are applied to orders.]
● Allows the optimizer to choose either join order: vip_customer → orders or orders → vip_customer
● Having more join orders to choose from is good for optimization
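One way to observe the merge is to compare the query plan with merging switched off and on. This is a minimal sketch for a MariaDB server. The column types and the index are assumptions, since the slides only name the tables and columns. Plans on empty or tiny tables may not be representative, so load realistic data before drawing conclusions.

```sql
-- Assumed schema: the slides give only table and column names.
create or replace table vip_customer (
  customer_id int primary key
);
create or replace table orders (
  order_id    int primary key,
  customer_id int not null,
  order_date  date not null,
  amount      decimal(12,2) not null,
  key customer_id (customer_id)
);

-- Merging disabled: OCT_ORDERS has to be materialized before the join.
set optimizer_switch='derived_merge=off';
explain
select *
from vip_customer,
     (select *
      from orders
      where order_date between '2017-10-01' and '2017-10-31'
     ) as OCT_ORDERS
where OCT_ORDERS.amount > 1000000
  and OCT_ORDERS.customer_id = vip_customer.customer_id;

-- Merging enabled (the default): the same statement is planned as a
-- plain two-table join, so either join order is available.
set optimizer_switch='derived_merge=on';
explain
select *
from vip_customer,
     (select *
      from orders
      where order_date between '2017-10-01' and '2017-10-31'
     ) as OCT_ORDERS
where OCT_ORDERS.amount > 1000000
  and OCT_ORDERS.customer_id = vip_customer.customer_id;
```

Comparing the two plans side by side shows whether a separate derived-table step is present.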
- 8. 8
Another use case - grouping
● Can’t merge due to GROUP BY in the child.
create view OCT_TOTALS as
select
customer_id,
SUM(amount) as TOTAL_AMT
from orders
where
order_date BETWEEN '2017-10-01' and '2017-10-31'
group by
customer_id
select * from OCT_TOTALS where customer_id=1
- 9. 9
Execution is inefficient
[Diagram: step 1 scans orders and computes the Sum for all customers into OCT_TOTALS; step 2 picks the row with customer_id=1.]
- 10. 10
Condition pushdown
select *
from OCT_TOTALS
where customer_id=1
● Can push down conditions on GROUP BY columns
● … to filter out rows that go into groups we don't care about
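The effect of the pushdown can be written out by hand. The sketch below builds a tiny orders table (column types and sample rows are assumptions made for illustration) and compares the query against the view with the query the pushdown is logically equivalent to.

```sql
-- Assumed schema and sample data, not taken from the slides.
create or replace table orders (
  order_id    int primary key,
  customer_id int not null,
  order_date  date not null,
  amount      decimal(12,2) not null,
  key customer_id (customer_id)
);
insert into orders values
  (1, 1, '2017-10-05', 100.00),
  (2, 1, '2017-10-20', 250.00),
  (3, 2, '2017-10-07', 400.00),
  (4, 1, '2017-11-02', 999.00);

create or replace view OCT_TOTALS as
select customer_id, sum(amount) as TOTAL_AMT
from orders
where order_date between '2017-10-01' and '2017-10-31'
group by customer_id;

-- The query as the user writes it.
select * from OCT_TOTALS where customer_id = 1;

-- What it amounts to after pushdown: customer_id = 1 is applied
-- before grouping, so rows of other customers never reach SUM().
select customer_id, sum(amount) as TOTAL_AMT
from orders
where order_date between '2017-10-01' and '2017-10-31'
  and customer_id = 1
group by customer_id;
```

With this sample data both statements return a single row for customer 1 with TOTAL_AMT 350.00. Moving the condition is safe because customer_id is the GROUP BY column, so the filter removes whole groups and never changes the contents of a group.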
- 11. 11
Condition pushdown
select *
from OCT_TOTALS
where customer_id=1
[Diagram: step 1 finds the orders rows with customer_id=1; only those rows are summed into OCT_TOTALS.]
● Looking only at rows you’re interested in is much more efficient
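To check whether the pushdown is actually used, the plan can be compared with the optimization switched off and on. This is a sketch assuming MariaDB 10.2 or later, where the optimizer_switch flag is condition_pushdown_for_derived. The schema is an assumption, as before.

```sql
-- Assumed schema: column types and the index are not given on the slides.
create or replace table orders (
  order_id    int primary key,
  customer_id int not null,
  order_date  date not null,
  amount      decimal(12,2) not null,
  key customer_id (customer_id)
);
create or replace view OCT_TOTALS as
select customer_id, sum(amount) as TOTAL_AMT
from orders
where order_date between '2017-10-01' and '2017-10-31'
group by customer_id;

-- Pushdown disabled: totals are computed for every customer,
-- then filtered.
set optimizer_switch='condition_pushdown_for_derived=off';
explain select * from OCT_TOTALS where customer_id = 1;

-- Pushdown enabled (the default): the condition is available inside
-- the view, so the index on orders.customer_id can be considered.
set optimizer_switch='condition_pushdown_for_derived=on';
explain select * from OCT_TOTALS where customer_id = 1;
```

The interesting difference is in the line for the orders table: its access type, key, and estimated rows.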
- 12. 12
MariaDB 10.3: Pushdown through Window Functions
● “Customer’s biggest orders”
create view top_three_orders as
select *
from
(
select
customer_id,
amount,
rank() over (partition by customer_id
order by amount desc
) as order_rank
from orders
) as ordered_orders
where order_rank<=3
+-------------+--------+------------+
| customer_id | amount | order_rank |
+-------------+--------+------------+
|           1 |  10000 |          1 |
|           1 |   9500 |          2 |
|           1 |    400 |          3 |
|           2 |   3200 |          1 |
|           2 |   1000 |          2 |
|           2 |    400 |          3 |
...
select * from top_three_orders where customer_id=1
- 13. 13
MariaDB 10.3: Pushdown through Window Functions
select * from top_three_orders where customer_id=1
MariaDB 10.2, MySQL 8.0
● Compute top_three_orders for all customers
● Select rows with customer_id=1
MariaDB 10.3 (and e.g. PostgreSQL)
● Only compute top_three_orders for customer_id=1
– This can be much faster!
– Can make use of index(customer_id)
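The rewrite behind this can be sketched by hand. The schema and sample rows below are assumptions, and "top three" is taken to mean order_rank <= 3.

```sql
-- Assumed schema and sample data, for illustration only.
create or replace table orders (
  order_id    int primary key,
  customer_id int not null,
  amount      decimal(12,2) not null,
  key customer_id (customer_id)
);
insert into orders values
  (1, 1, 10000), (2, 1, 9500), (3, 1, 400), (4, 1, 300),
  (5, 2, 3200),  (6, 2, 1000), (7, 2, 400);

create or replace view top_three_orders as
select *
from (
  select customer_id,
         amount,
         rank() over (partition by customer_id
                      order by amount desc) as order_rank
  from orders
) as ordered_orders
where order_rank <= 3;

-- The query as written against the view.
select * from top_three_orders where customer_id = 1;

-- Hand-written equivalent of the pushdown: the condition on the
-- PARTITION BY column is applied before rank() is computed.
select *
from (
  select customer_id,
         amount,
         rank() over (partition by customer_id
                      order by amount desc) as order_rank
  from orders
  where customer_id = 1
) as ordered_orders
where order_rank <= 3;
```

Both statements return the three largest orders of customer 1 (10000, 9500, 400). The move is valid only because customer_id is the partitioning column: removing other partitions does not change ranks inside the remaining one. A condition on amount could not be moved below rank() this way, because it would change which rows get ranked.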
- 14. 14
“Split grouping for derived”
select *
from
customer, OCT_TOTALS
where
customer.customer_id=OCT_TOTALS.customer_id and
customer.customer_name IN ('Customer 1', 'Customer 2')
create view OCT_TOTALS as
select
customer_id,
SUM(amount) as TOTAL_AMT
from orders
where
order_date BETWEEN '2017-10-01' and '2017-10-31'
group by
customer_id
- 15. 15
Execution, the old way
[Diagram: orders is scanned and a Sum is computed for every customer (Customer 1 through Customer 100) to build OCT_TOTALS; the join with customer then keeps only Customer 1 and Customer 2.]
● Inefficient: OCT_TOTALS is computed for *all* customers.
- 16. 16
Split grouping execution
[Diagram: for each customer row selected by the join (Customer 1, Customer 2), a separate Sum is computed over only that customer's rows in orders.]
● Can be used when doing a join from customer to orders
● Must have equalities for the GROUP BY columns:
OCT_TOTALS.customer_id=customer.customer_id
– This allows selecting a single group
● The underlying table (orders) must have an index on the GROUP BY column (customer_id)
– This allows using ref access
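The per-group execution described above can be approximated by hand with a correlated subquery. This is only an illustration of the idea, not what the server rewrites the query to. The schema, index name, and sample rows are assumptions.

```sql
-- Assumed schema and sample data, for illustration only.
create or replace table customer (
  customer_id   int primary key,
  customer_name varchar(64) not null
);
create or replace table orders (
  order_id    int primary key,
  customer_id int not null,
  order_date  date not null,
  amount      decimal(12,2) not null,
  key customer_id (customer_id)   -- index on the GROUP BY column
);
insert into customer values
  (1, 'Customer 1'), (2, 'Customer 2'), (3, 'Customer 3');
insert into orders values
  (1, 1, '2017-10-05', 100.00),
  (2, 1, '2017-10-20', 250.00),
  (3, 2, '2017-10-07', 400.00),
  (4, 3, '2017-10-09',  50.00),
  (5, 1, '2017-11-02', 999.00);

-- One group at a time: for every selected customer, sum only that
-- customer's October orders, found through the customer_id index.
select *
from (
  select c.customer_id,
         c.customer_name,
         (select sum(o.amount)
          from orders o
          where o.customer_id = c.customer_id
            and o.order_date between '2017-10-01' and '2017-10-31'
         ) as TOTAL_AMT
  from customer c
  where c.customer_name in ('Customer 1', 'Customer 2')
) as per_customer
where TOTAL_AMT is not null;  -- customers with no October orders drop out, as in the join
```

With this sample data the result is 350.00 for Customer 1 and 400.00 for Customer 2. The total for Customer 3 is never computed.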
- 17. 17
Split grouping execution
● EXPLAIN shows “LATERAL DERIVED”
● @@optimizer_switch flag: split_grouping_derived (ON by default)
● Not a fully cost-based choice at the moment: it is used if possible and certainly advantageous, so check the query plan
+------+-----------------+------------+------+---------------+-------------+---------+----------------------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-----------------+------------+------+---------------+-------------+---------+----------------------+------+-------------+
| 1 | PRIMARY | customer | ALL | PRIMARY | NULL | NULL | NULL | 1000 | |
| 1 | PRIMARY | <derived2> | ref | key0 | key0 | 4 | customer.customer_id | 36 | |
| 2 | LATERAL DERIVED | orders | ref | customer_id | customer_id | 4 | customer.customer_id | 365 | Using where |
+------+-----------------+------------+------+---------------+-------------+---------+----------------------+------+-------------+
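To see the effect of the optimization, the same EXPLAIN can be run with the switch off and on. This sketch rests on an assumption about the flag name: the slide calls it split_grouping_derived, while released MariaDB 10.3 servers are documented as listing it as split_materialized. Check the output of the first statement and use whichever name your server reports. The customer, orders, and OCT_TOTALS objects from the slides are assumed to exist.

```sql
-- List the optimizer flags this server knows about.
select @@optimizer_switch;

-- Assumption: the flag is named split_materialized on this server
-- (the slide names it split_grouping_derived).
set optimizer_switch='split_materialized=off';
explain
select *
from customer, OCT_TOTALS
where customer.customer_id = OCT_TOTALS.customer_id
  and customer.customer_name in ('Customer 1', 'Customer 2');

set optimizer_switch='split_materialized=on';
explain
select *
from customer, OCT_TOTALS
where customer.customer_id = OCT_TOTALS.customer_id
  and customer.customer_name in ('Customer 1', 'Customer 2');
```

With the optimization in use, the select_type of the orders line is expected to read LATERAL DERIVED, as in the plan above. Whether it is chosen depends on the data and on the index on orders.customer_id.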
- 18. 18
Summary
● MariaDB 10.2: Condition pushdown for derived tables optimization
– Push a condition into derived table
– Used when derived table cannot be merged
– Biggest effect is for subqueries with GROUP BY
● MariaDB 10.3: Condition Pushdown through Window functions
● MariaDB 10.3: Lateral derived optimization
– When doing a join, can’t do condition pushdown
– So, lateral derived is used. It allows examining only the GROUP BY groups that match the other tables. It needs an index on the grouped columns
– Work in progress (the optimization process is very basic at the moment)