Sergei Petrunia <sergey@mariadb.com>
MariaDB Shenzhen Meetup
November 2017
[Some of] New Query Optimizer features in MariaDB 10.3
2
Plan
● MariaDB 10.2: Condition pushdown
● MariaDB 10.3: Condition pushdown through window functions
● MariaDB 10.3: GROUP BY splitting
3
Condition pushdown
● Just condition pushdown in 10.2
● Pushdown through window functions in 10.3
4
Background – derived table merge
● “VIP customers and their big orders from October”
select *
from
vip_customer,
(select *
from orders
where order_date BETWEEN '2017-10-01' and '2017-10-31'
) as OCT_ORDERS
where
OCT_ORDERS.amount > 1M and
OCT_ORDERS.customer_id = vip_customer.customer_id
5
Naive execution
select *
from
vip_customer,
(select *
from orders
where
order_date BETWEEN '2017-10-01' and
'2017-10-31'
) as OCT_ORDERS
where
OCT_ORDERS.amount > 1M and
OCT_ORDERS.customer_id =
vip_customer.customer_id
[Diagram: step 1: compute OCT_ORDERS from orders; step 2: join OCT_ORDERS with vip_customer, applying amount > 1M]
6
Derived table merge
select *
from
vip_customer,
(select *
from orders
where
order_date BETWEEN '2017-10-01' and
'2017-10-31'
) as OCT_ORDERS
where
OCT_ORDERS.amount > 1M and
OCT_ORDERS.customer_id =
vip_customer.customer_id
select *
from
vip_customer,
orders
where
order_date BETWEEN '2017-10-01' and
'2017-10-31'
and
orders.amount > 1M and
orders.customer_id =
vip_customer.customer_id
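The merge can be checked on a toy data set. The following is a minimal sketch, assuming hypothetical table contents and using Python's built-in sqlite3 module as a stand-in engine (not MariaDB); it only illustrates that the derived-table form and the merged form return the same rows. The slide shorthand `1M` is written out as 1000000.

```python
import sqlite3

# Hypothetical sample data; SQLite is used only as a stand-in SQL engine.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table vip_customer (customer_id integer primary key, name text);
    create table orders (order_id integer primary key, customer_id integer,
                         order_date text, amount integer);
    insert into vip_customer values (1, 'A'), (2, 'B');
    insert into orders values
        (1, 1, '2017-10-05', 2000000),
        (2, 1, '2017-09-30', 5000000),
        (3, 2, '2017-10-20',  500000),
        (4, 3, '2017-10-11', 3000000);
""")

# Query as written: the date filter lives inside the derived table.
derived_form = """
    select vip_customer.customer_id, OCT_ORDERS.order_id
    from vip_customer,
         (select * from orders
          where order_date between '2017-10-01' and '2017-10-31') as OCT_ORDERS
    where OCT_ORDERS.amount > 1000000
      and OCT_ORDERS.customer_id = vip_customer.customer_id
    order by 1, 2
"""

# Query after the merge: one flat join with all conditions in one WHERE.
merged_form = """
    select vip_customer.customer_id, orders.order_id
    from vip_customer, orders
    where order_date between '2017-10-01' and '2017-10-31'
      and orders.amount > 1000000
      and orders.customer_id = vip_customer.customer_id
    order by 1, 2
"""

derived_rows = conn.execute(derived_form).fetchall()
merged_rows = conn.execute(merged_form).fetchall()
print(derived_rows == merged_rows)
```

Only the result sets are compared here; which join order the optimizer picks for the merged form is not something this sketch shows.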
7
Execution after merge
[Diagram: vip_customer joined with orders]
select *
from
vip_customer,
orders
where
order_date BETWEEN '2017-10-01' and
'2017-10-31'
and
orders.amount > 1M and
orders.customer_id =
vip_customer.customer_id
[Diagram labels on orders: made in October, amount > 1M]
● Allows the optimizer to choose either join order: vip_customer→orders or orders→vip_customer
● Good for optimization
8
Another use case - grouping
● Can’t merge due to GROUP BY in the child.
create view OCT_TOTALS as
select
customer_id,
SUM(amount) as TOTAL_AMT
from orders
where
order_date BETWEEN '2017-10-01' and '2017-10-31'
group by
customer_id
select * from OCT_TOTALS where customer_id=1
9
Execution is inefficient
create view OCT_TOTALS as
select
customer_id,
SUM(amount) as TOTAL_AMT
from orders
where
order_date BETWEEN '2017-10-01' and '2017-10-31'
group by
customer_id
select * from OCT_TOTALS where customer_id=1
[Diagram: step 1: compute all totals (Sum over orders) into OCT_TOTALS; step 2: keep only customer_id=1]
10
Condition pushdown
select *
from OCT_TOTALS
where customer_id=1
create view OCT_TOTALS as
select
customer_id,
SUM(amount) as TOTAL_AMT
from orders
where
order_date BETWEEN '2017-10-01' and '2017-10-31'
group by
customer_id
● Can push down conditions on GROUP BY columns
● … to filter out rows that go into groups we don't care about
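As a rough illustration of why this is safe and why it helps, here is a minimal sketch in plain Python (not MariaDB internals). The sample rows and the helper names `in_october` and `oct_totals` are hypothetical. A condition on the GROUP BY column can be applied before grouping without changing the surviving group, and fewer rows reach the aggregation.

```python
from collections import defaultdict

# Hypothetical sample of the orders table: (customer_id, order_date, amount).
ORDERS = [
    (1, "2017-10-03", 100),
    (2, "2017-10-04", 250),
    (1, "2017-10-15", 400),
    (3, "2017-09-28", 900),
    (2, "2017-10-30", 50),
]

def in_october(order_date):
    # ISO dates compare correctly as strings.
    return "2017-10-01" <= order_date <= "2017-10-31"

def oct_totals(rows):
    """Aggregate like the OCT_TOTALS view; also count the rows that were summed."""
    totals = defaultdict(int)
    rows_summed = 0
    for customer_id, order_date, amount in rows:
        if in_october(order_date):
            totals[customer_id] += amount
            rows_summed += 1
    return dict(totals), rows_summed

# Without pushdown: build every group, then keep customer_id = 1.
all_totals, summed_without = oct_totals(ORDERS)
without_pushdown = {c: t for c, t in all_totals.items() if c == 1}

# With pushdown: apply customer_id = 1 before grouping.
with_pushdown, summed_with = oct_totals([r for r in ORDERS if r[0] == 1])

print(without_pushdown == with_pushdown, summed_without, summed_with)
```

In a real server the pushed-down condition can additionally be served by an index on customer_id, which a list comprehension does not model.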
11
Condition pushdown
select *
from OCT_TOTALS
where customer_id=1
[Diagram: step 1: find the customer_id=1 rows in orders; only those rows are summed into OCT_TOTALS]
● Looking only at rows you’re interested in is much more efficient
create view OCT_TOTALS as
select
customer_id,
SUM(amount) as TOTAL_AMT
from orders
where
order_date BETWEEN '2017-10-01' and '2017-10-31'
group by
customer_id
12
MariaDB 10.3: Pushdown through Window Functions
● “Customer’s biggest orders”
create view top_three_orders as
select *
from
(
select
customer_id,
amount,
rank() over (partition by customer_id
order by amount desc
) as order_rank
from orders
) as ordered_orders
where order_rank<=3
select * from top_three_orders where customer_id=1
+-------------+--------+------------+
| customer_id | amount | order_rank |
+-------------+--------+------------+
| 1 | 10000 | 1 |
| 1 | 9500 | 2 |
| 1 | 400 | 3 |
| 2 | 3200 | 1 |
| 2 | 1000 | 2 |
| 2 | 400 | 3 |
...
13
MariaDB 10.3: Pushdown through Window Functions
MariaDB 10.2, MySQL 8.0
● Compute top_three_orders for all customers
● Select rows with customer_id=1
select * from top_three_orders where customer_id=1
MariaDB 10.3 (and e.g. PostgreSQL)
● Only compute top_three_orders for customer_id=1
– This can be much faster!
– Can make use of index(customer_id)
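The reason the condition can pass through the window function is that it is on the PARTITION BY column: dropping other partitions does not change ranks inside the remaining one. A minimal sketch in plain Python, assuming hypothetical sample rows and helper names (`ranked`, `top_orders`), with the top three ranks kept to match the view's name:

```python
from itertools import groupby

# Hypothetical sample of orders: (customer_id, amount).
ORDERS = [(1, 10000), (2, 3200), (1, 400), (2, 1000), (1, 9500), (2, 400), (1, 300)]

def ranked(rows):
    """Mimic rank() over (partition by customer_id order by amount desc)."""
    out = []
    ordered = sorted(rows, key=lambda r: (r[0], -r[1]))
    for customer_id, group in groupby(ordered, key=lambda r: r[0]):
        rank, prev_amount = 0, None
        for position, (_, amount) in enumerate(group, start=1):
            if amount != prev_amount:  # ties share a rank, as RANK() does
                rank, prev_amount = position, amount
            out.append((customer_id, amount, rank))
    return out

def top_orders(rows, max_rank=3):
    return [r for r in ranked(rows) if r[2] <= max_rank]

# Without pushdown: rank every customer's orders, then keep customer 1.
without_pushdown = [r for r in top_orders(ORDERS) if r[0] == 1]

# With pushdown: keep customer 1's orders first, then rank only those.
with_pushdown = top_orders([r for r in ORDERS if r[0] == 1])

print(without_pushdown == with_pushdown)
```

A condition on a non-partitioning column such as amount could not be moved the same way, because removing rows inside a partition would change the ranks of the rows that remain.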
14
“Split grouping for derived”
select *
from
customer, OCT_TOTALS
where
customer.customer_id=OCT_TOTALS.customer_id and
customer.customer_name IN ('Customer 1', 'Customer 2')
create view OCT_TOTALS as
select
customer_id,
SUM(amount) as TOTAL_AMT
from orders
where
order_date BETWEEN '2017-10-01' and '2017-10-31'
group by
customer_id
15
Execution, the old way
select *
from
customer, OCT_TOTALS
where
customer.customer_id=
OCT_TOTALS.customer_id and
customer.customer_name IN ('Customer 1',
'Customer 2')
create view OCT_TOTALS as
select
customer_id,
SUM(amount) as TOTAL_AMT
from orders
where
order_date BETWEEN '2017-10-01' and '2017-10-31'
group by
customer_id
[Diagram: OCT_TOTALS is computed as a Sum over orders for Customer 1, Customer 2, Customer 3, … Customer 100; the join with customer then keeps only Customer 1 and Customer 2]
● Inefficient: OCT_TOTALS is computed for *all* customers.
16
Split grouping execution
[Diagram: each matching customer row (Customer 1, Customer 2) drives a lookup into orders; a separate Sum is computed per matching customer, and rows of other customers such as Customer 100 are not summed]
● Can be used when doing a join from customer to orders
● Must have equalities for the GROUP BY columns: OCT_TOTALS.customer_id=customer.customer_id
– This allows selecting one group
● The underlying table (orders) must have an index on the GROUP BY column (customer_id)
– This allows using ref access
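The execution strategy described above can be sketched in plain Python. This is an illustration under stated assumptions, not the server's implementation: the sample rows are hypothetical, a dictionary stands in for the index on orders(customer_id), and `split_grouping_join` is a made-up helper name.

```python
from collections import defaultdict

# Hypothetical sample data.
CUSTOMERS = [(1, "Customer 1"), (2, "Customer 2"), (3, "Customer 3"), (100, "Customer 100")]
ORDERS = [
    (1, "2017-10-02", 10), (2, "2017-10-05", 20), (1, "2017-10-09", 30),
    (3, "2017-10-12", 40), (100, "2017-10-20", 50), (2, "2017-09-30", 60),
]

# Stand-in for an index on orders(customer_id), giving ref-style lookups.
orders_by_customer = defaultdict(list)
for customer_id, order_date, amount in ORDERS:
    orders_by_customer[customer_id].append((order_date, amount))

def split_grouping_join(wanted_names):
    """For each matching customer, aggregate only that customer's group."""
    result = []
    rows_read = 0
    for customer_id, name in CUSTOMERS:
        if name not in wanted_names:
            continue
        october_amounts = []
        for order_date, amount in orders_by_customer.get(customer_id, []):
            rows_read += 1
            if "2017-10-01" <= order_date <= "2017-10-31":
                october_amounts.append(amount)
        if october_amounts:  # a group exists only if some row qualifies
            result.append((name, sum(october_amounts)))
    return result, rows_read

rows, rows_read = split_grouping_join({"Customer 1", "Customer 2"})
print(rows, rows_read, len(ORDERS))
```

The equality on the GROUP BY column is what makes each lookup select exactly one group, so the per-customer sums equal what the full OCT_TOTALS computation would have produced for those customers.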
17
Split grouping execution
● EXPLAIN shows “LATERAL DERIVED”
● @@optimizer_switch flag: split_grouping_derived (ON by default)
● Not yet a fully cost-based choice (check the query plan; it is used if possible and certainly advantageous)
select *
from
customer, OCT_TOTALS
where
customer.customer_id=
OCT_TOTALS.customer_id and
customer.customer_name IN ('Customer 1',
'Customer 2')
create view OCT_TOTALS as
select
customer_id,
SUM(amount) as TOTAL_AMT
from orders
where
order_date BETWEEN '2017-10-01' and '2017-10-31'
group by
customer_id
+------+-----------------+------------+------+---------------+-------------+---------+----------------------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-----------------+------------+------+---------------+-------------+---------+----------------------+------+-------------+
| 1 | PRIMARY | customer | ALL | PRIMARY | NULL | NULL | NULL | 1000 | |
| 1 | PRIMARY | <derived2> | ref | key0 | key0 | 4 | customer.customer_id | 36 | |
| 2 | LATERAL DERIVED | orders | ref | customer_id | customer_id | 4 | customer.customer_id | 365 | Using where |
+------+-----------------+------------+------+---------------+-------------+---------+----------------------+------+-------------+
18
Summary
● MariaDB 10.2: Condition pushdown for derived tables optimization
– Push a condition into derived table
– Used when derived table cannot be merged
– Biggest effect is for subqueries with GROUP BY
● MariaDB 10.3: Condition Pushdown through Window functions
● MariaDB 10.3: Lateral derived optimization
– When doing a join, condition pushdown can’t be used
– So lateral derived is used instead. It allows examining only the GROUP BY groups that match rows from the other tables. It needs an index on the grouped columns
– Work in progress (the optimization process is very basic at the moment)
19
Thanks!
Discussion
