
The book "Inside Microsoft® SQL Server® 2008: T-SQL Programming" explains how a SQL query behaves. The following picture is taken from the book, and I have some questions about the explanation.

  1. How is a SQL query processed when there are indexes? That is, would this behavior change?

I ask because I am a teacher and would like to teach this to my students in more detail.

[Figure: query processing diagram from the book]

  2. This explains it in SQL Server, but I'm pretty sure it applies the same in MySQL, is that correct?

  3. Do we have 2 logical query processings, one with indexes and one without indexes?

I want to make sure my explanation to my students is still correct when there are indexes involved.

  • Indexes only affect the so-called physical processing - the algorithms that determine how data is retrieved from storage and combined. There can often be many different physical processes that produce the same logical result, but with different efficiency.
    – Steve
    Commented Nov 13, 2023 at 21:54
  • Hi @Steve, so logical processing (the one in the image in my question) 1) is always applied, and 2) physical processing then follows?
    – jwa
    Commented Nov 13, 2023 at 22:43
  • Please clarify your specific problem or provide additional details to highlight exactly what you need. As it's currently written, it's hard to tell exactly what you're asking.
    – Community Bot
    Commented Nov 13, 2023 at 22:45
  • Without context, it's unclear what this is trying to describe. It looks like it describes an abstract logical model of a query. So rather than thinking of physical and logical as separate, consider them as different ways of thinking about the problem. The logical model is a simplified model that is good for reasoning about the outcome of queries when performance is unimportant. The physical model, built around query plans and indexes, is important if you care about performance; it is much more complex and involves a series of transforms and concurrent steps.
    Commented Nov 13, 2023 at 23:42
  • Wow, that diagram is actively bad for understanding the process.
    – pjc50
    Commented Nov 14, 2023 at 9:30

2 Answers

10

tl;dr: Query planners are a big deal.

In general, before an RDBMS backend executes a query, it first plans the query. There's more than one way to access the data, and the planner evaluates some of those ways to estimate which will be fastest.


  1. How is a sql query processed when there are indexes, that is, would this behavior change?

Given a "SELECT * FROM foo WHERE ..." query, by default the plan would be to simply tablescan foo. This is no different from $ grep ... foo.csv, requiring us to drag all the blocks off the disk and into RAM so we can examine them.

Adding an index may offer a second access path, if the WHERE condition can be evaluated using just that index. Retrieving < 1% of rows via a b-tree index is typically quicker than a table scan. (Some indexes cover more than one column, which is handy if the condition mentions multiple columns.) Adding more indexes can lead to a combinatorial explosion of access paths that the planner might consider.

So, "would the behavior change?" Yes, it would. We might find good selectivity and choose to filter a conjunct by probing an index, rather than by resorting to tablescan.
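
To see the plan change for yourself, here is a minimal sketch in Python using the built-in sqlite3 module. SQLite is only a stand-in here (my assumption, chosen because it needs no server). The table foo matches the query above, but the columns city and amount, the index name idx_foo_city, and the data are all made up for illustration. SQL Server and MySQL expose plans through their own tools and word the output differently.

```python
import sqlite3


def plan(conn, sql):
    """Return the human-readable 'detail' text of each EXPLAIN QUERY PLAN row."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (id INTEGER PRIMARY KEY, city TEXT, amount INTEGER)")

# Hypothetical data: 10,000 rows spread over 500 distinct city values.
conn.executemany(
    "INSERT INTO foo (city, amount) VALUES (?, ?)",
    [("city%d" % (i % 500), i) for i in range(10000)],
)

query = "SELECT * FROM foo WHERE city = 'city42'"

before = plan(conn, query)  # no index yet: only a full scan is possible
conn.execute("CREATE INDEX idx_foo_city ON foo (city)")
after = plan(conn, query)   # the planner may now probe the index instead

print("without index:", before)
print("with index:   ", after)
```

The first plan should report a scan of foo and the second a search that uses idx_foo_city; the exact wording depends on the SQLite version.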


  2. It also applies the same in MySQL, is that correct?

Yes, if you look under the hood of any mature RDBMS, you will find a query planner that examines several access paths, estimates their costs, and uses the winning plan to execute the query.


  3. Do we have two logical query processings, one with indexes and one without indexes?

Many more than that.

Each time we add an index, each time we add a JOIN, each time we add another condition to an ON or WHERE clause, we multiply the number of potential access paths that the planner might wish to consider. There is an old and deep literature on query planning.

To better understand how your database operates on your data, start habitually asking for the query plan and its statistics (EXPLAIN in MySQL and PostgreSQL, the execution plan in SQL Server). Create a table and then explain a query. Add an index and explain again. Then another index. Then change the conjunct in your WHERE clause.

It's worth noting that plans are only interesting for tables that have significantly more than a hundred rows. And the distribution of values in your data will make a difference to the plan, since the backend will try to estimate the selectivity of each conjunct. For teaching purposes we often manipulate just half a dozen rows, but when examining plan choices we need far more data than that, and it should be somewhat realistic data.
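
As a rough sketch of that kind of experiment, the Python snippet below builds a larger, deliberately skewed table and prints the plan chosen for a few queries. Again SQLite (via the standard sqlite3 module) is only a stand-in, and the events table, its columns, the index names, and the 95/5 split are hypothetical choices of mine. I'm not showing expected output, since the plans depend on the engine, its version, and the statistics it gathers.

```python
import random
import sqlite3

random.seed(0)  # fixed seed so the made-up data is reproducible

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, status TEXT, user_id INTEGER)")

# Hypothetical skewed data: roughly 95% 'ok' and 5% 'error', spread over 1000 users.
rows = [
    ("ok" if random.random() < 0.95 else "error", random.randrange(1000))
    for _ in range(20000)
]
conn.executemany("INSERT INTO events (status, user_id) VALUES (?, ?)", rows)
conn.execute("CREATE INDEX idx_events_status ON events (status)")
conn.execute("CREATE INDEX idx_events_user ON events (user_id)")
conn.execute("ANALYZE")  # gather the statistics the planner uses for its estimates


def plan(sql):
    """Return the 'detail' text of each EXPLAIN QUERY PLAN row for sql."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


queries = [
    "SELECT * FROM events WHERE status = 'error' AND user_id = 7",
    "SELECT * FROM events WHERE status = 'ok'",
    "SELECT * FROM events WHERE user_id BETWEEN 10 AND 20",
]
plans = {sql: plan(sql) for sql in queries}
for sql, steps in plans.items():
    print(sql)
    for step in steps:
        print("    " + step)
```

Try changing the row count, the skew, or the WHERE conditions and compare which index, if any, each plan mentions.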

  • I follow your answer no problem, but I would warn that this is very jargon heavy, and I'd be surprised if someone who didn't already understand all the crucial things could possibly follow it. For example, what is an "access path"? I wouldn't want to attempt the explanation myself, as there are icebergs in these waters.
    – Steve
    Commented Nov 14, 2023 at 9:27
2
  1. How is a sql query processed when there are indexes, that is, would this behavior change?

That diagram describes a process for determining what data a given query will return. It has nothing to do with how SQL Server will evaluate that query. The presence or absence of indexes doesn't change what data you get.
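
A quick way to convince students of this is to run the same query before and after creating indexes and compare the rows. The sketch below does that in Python with the built-in sqlite3 module; SQLite is just a convenient stand-in, and the orders table, its columns, the index names, and the data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total INTEGER)")

# Hypothetical data: 2000 orders spread over 50 customers.
conn.executemany(
    "INSERT INTO orders (customer, total) VALUES (?, ?)",
    [("cust%d" % (i % 50), i * 3 % 101) for i in range(2000)],
)

# A query that exercises several logical steps: WHERE, GROUP BY, HAVING, ORDER BY.
query = """
    SELECT customer, COUNT(*) AS n, SUM(total) AS revenue
    FROM orders
    WHERE total > 40
    GROUP BY customer
    HAVING COUNT(*) > 5
    ORDER BY customer
"""

rows_without_index = conn.execute(query).fetchall()

# Adding indexes gives the engine new ways to fetch the data...
conn.execute("CREATE INDEX idx_orders_total ON orders (total)")
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer, total)")

rows_with_index = conn.execute(query).fetchall()

# ...but the logical result of the query is unchanged.
same = rows_without_index == rows_with_index
print("same result:", same)
```

The ORDER BY is there so that the two result lists can be compared directly; without it, the row order is not guaranteed.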

  2. This explains it in SQL Server, but I'm pretty sure it applies the same in MySQL, is that correct?

It would be broadly similar. As far as I can tell, MySQL doesn't have PIVOT, UNPIVOT or APPLY clauses, so those steps don't exist. It has other operators of its own that SQL Server doesn't have.

  3. Do we have 2 logical query processings, one with indexes and one without indexes?

No. Indexes impact the overall performance of a query, by giving the query engine additional ways of accessing the underlying data. The query engine will pick a specific access pattern that it thinks is fastest.
