Skip to main content

A core SQL statement, SELECT retrieves data from one or more tables or other sources of row set data such as views or table-valued functions.

SELECT is a core SQL statement that retrieves data from various sources, typically tables, views or functions that return record set data.

Select statements can join data from multiple sources including the output of other embedded select statements (known as sub-queries). Joining is one of the key defining characteristics of a relational database management system.

a basic SELECT statement is of the form:

SELECT [column list]
  FROM [sources]
 WHERE [conditions]

A more complex query can be composed of a variety of primitive operations, including:

Joins take two data sets (tables, subquery results or other sources) and link them together based on some logical function such as equality of the columns. Some join predicates can use indexes and allow optimisations in the query that may not be possible with others. In some cases it may be necessary to re-structure the query so it can (for example) make use of an index, even though the original query was semantically equivalent to the modified version. A basic join query has the form:

SELECT [column list]
  FROM [table 1]
  JOIN [table 2]
    ON [some predicate such as equality of two columns]

A join may be optional on one or both sides, called an outer join. If data on the optional side of the join is not present, then NULLs will be returned in the place of columnar data from the data set on that side of the join.

Nested Subqueries are SQL statements embedded within a query that can be used by outer queries by selecting from that data set or joining against another. A query that joins something against the output of a nested subquery looks something like:

SELECT a.[columns]
      ,b.[columns]
  FROM [table1] a
  JOIN (SELECT [columns]
          FROM [ . . . ]) b
    ON [join condition]

Correlated Subqueries are sub-queries that use some data from the parent query. A correlated subquery might look something like:

SELECT a.[columns]
  FROM [table1] a
 WHERE EXISTS
       (SELECT 1
          FROM [table2] b
         WHERE [some condition involving data from both a and b])

In this case, note that the condition involves data from both a and b. The defining feature of a correlated subquery is that it needs data from the parent to resolve the predicate.

WHERE Clause This allows arbitrary logical conditions to be applied. In some cases these can be applied in joins as well, and it may be preferable to explicitly apply them in the join condition so the query optimiser can use them properly.

Common Table Expressions (CTEs) are an abstraction mechanism that can be used to re-use common logic in a query, or for certain facilities such as recursion. A CTE may also take parameters, and has a form similar to:

WITH [name of cte] (parameters) AS
     (SELECT [columns]
        FROM [sources]
       WHERE [predicates])  -- This could be arbitrarily complex
SELECT [. . .]

Aggreates and filters A query can aggregate data within groupings using the GROUP BY clause, and can filter after the aggregate using HAVING.

Various other features may be present in certain dialects, for example windowing functinons, Oracle's CONNECT BY, applying table valued functions (CROSS APPLY) or using table valued functions as data sources.