Trisha Shetty (Editor)

Select (SQL)


The SQL SELECT statement returns a result set of records from one or more tables.

A SELECT statement retrieves zero or more rows from one or more database tables or database views. In most applications, SELECT is the most commonly used data query language (DQL) command. As SQL is a declarative programming language, SELECT queries specify a result set, but do not specify how to calculate it. The database translates the query into a "query plan" which may vary between executions, database versions and database software. This functionality is called the "query optimizer" as it is responsible for finding the best possible execution plan for the query, within applicable constraints.
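To make the idea of a query plan concrete, the following minimal sketch asks a database how it intends to run a query. It assumes SQLite through Python's standard sqlite3 module; EXPLAIN QUERY PLAN is SQLite-specific, the table T with columns C1 and C2 and the index idx_c1 are hypothetical, and the plan text differs between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (C1 INTEGER, C2 TEXT)")  # hypothetical table
conn.execute("CREATE INDEX idx_c1 ON T (C1)")         # hypothetical index

# The query only states WHAT is wanted; the optimizer decides HOW to get it.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT C2 FROM T WHERE C1 = 1"
).fetchall()

for step in plan:
    print(step[-1])  # the last column holds the human-readable plan step
```

Other database systems expose their plans through different commands, so this is only one way to inspect what the optimizer chose.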

The SELECT statement has many optional clauses:

  • WHERE specifies which rows to retrieve.
  • GROUP BY groups rows sharing a property so that an aggregate function can be applied to each group.
  • HAVING selects among the groups defined by the GROUP BY clause.
  • ORDER BY specifies an order in which to return the rows.
  • AS provides an alias which can be used to temporarily rename tables or columns.
Examples
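The optional clauses listed above can be combined in a single statement. The sketch below is one possible illustration, assuming SQLite through Python's sqlite3 module; the orders table, its columns and its data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount INTEGER)")  # hypothetical
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("ann", 10), ("ann", 30), ("bob", 5), ("bob", 7), ("cy", 50), ("cy", -1)],
)

clause_rows = conn.execute(
    """
    SELECT o.customer AS name, SUM(o.amount) AS total  -- AS: column aliases
    FROM orders AS o                                   -- AS: table alias
    WHERE o.amount > 0                                 -- WHERE: keep matching rows
    GROUP BY o.customer                                -- GROUP BY: one group per customer
    HAVING SUM(o.amount) >= 20                         -- HAVING: keep matching groups
    ORDER BY total DESC                                -- ORDER BY: sort the result
    """
).fetchall()

print(clause_rows)  # [('cy', 50), ('ann', 40)]
```

Here WHERE discards the negative amount before grouping, and HAVING then discards the group for bob, whose total is 12.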

    Given a table T, the query SELECT * FROM T will result in all the elements of all the rows of the table being shown.

    With the same table, the query SELECT C1 FROM T will result in the elements from the column C1 of all the rows of the table being shown. This is similar to a projection in relational algebra, except that in the general case, the result may contain duplicate rows. This is also known as a vertical partition in some database terms, restricting query output to view only specified fields or columns.

    With the same table, the query SELECT * FROM T WHERE C1 = 1 will result in all the elements of all the rows where the value of column C1 is 1 being shown. In relational algebra terms, a selection is performed because of the WHERE clause. This is also known as a horizontal partition, restricting the rows output by a query according to specified conditions.
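The three queries on T described above can be tried out as follows. This is a minimal sketch assuming SQLite through Python's sqlite3 module; the second column C2 and the sample rows are assumptions added for illustration, and DISTINCT is shown only to contrast with the duplicate-preserving projection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (C1 INTEGER, C2 TEXT)")  # C2 is a hypothetical column
conn.executemany("INSERT INTO T VALUES (?, ?)", [(1, "a"), (1, "b"), (2, "c")])

# Every column of every row.
all_rows = conn.execute("SELECT * FROM T").fetchall()

# Projection onto C1: duplicates are kept, unlike in relational algebra.
projected = conn.execute("SELECT C1 FROM T").fetchall()

# DISTINCT removes the duplicate rows.
distinct = conn.execute("SELECT DISTINCT C1 FROM T").fetchall()

# Selection: only rows satisfying the WHERE condition.
selected = conn.execute("SELECT * FROM T WHERE C1 = 1").fetchall()

print(len(all_rows), sorted(projected), sorted(distinct), sorted(selected))
```

The results are sorted before printing because, without ORDER BY, the order of returned rows is not guaranteed.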

    With more than one table, the result set will be every combination of rows. So if two tables are T1 and T2, SELECT * FROM T1, T2 will result in every combination of a T1 row with a T2 row. E.g., if T1 has 3 rows and T2 has 5 rows, then 15 rows will result.
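The row count of such a combination can be checked directly. The sketch below assumes SQLite through Python's sqlite3 module; the single columns a and b are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T1 (a INTEGER)")  # hypothetical column
conn.execute("CREATE TABLE T2 (b INTEGER)")  # hypothetical column
conn.executemany("INSERT INTO T1 VALUES (?)", [(i,) for i in range(3)])
conn.executemany("INSERT INTO T2 VALUES (?)", [(i,) for i in range(5)])

# Listing two tables without a join condition pairs every T1 row with every T2 row.
pairs = conn.execute("SELECT * FROM T1, T2").fetchall()

print(len(pairs))  # 15, i.e. 3 * 5
```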

    In the JCR (Content Repository for Java) SQL dialect, the SELECT clause specifies a list of properties (columns) by name, or the wildcard character (“*”) to mean “all properties”. The special case of joinpropname provides for joins, but only on the jcr:path column, as described in section 8.5.2 (Database View) of that specification. See also section 6.6.3.1 (Column Specifier).

    Limiting result rows

    Often it is convenient to indicate a maximum number of rows that are returned. This can be used for testing or to prevent consuming excessive resources if the query returns more information than expected. The approach to do this often varies per vendor.

    In ISO SQL:2003, result sets may be limited by using

  • cursors, or
  • a window function in the SELECT statement.

    ISO SQL:2008 introduced the FETCH FIRST clause.

    According to the PostgreSQL 9 documentation, an SQL window function performs a calculation across a set of table rows that are somehow related to the current row, in a way similar to aggregate functions. The name recalls signal processing window functions. A window function call always contains an OVER clause.

    ROW_NUMBER() window function

    ROW_NUMBER() OVER may be used to number the returned rows of a simple table, e.g. to return no more than ten rows by numbering the rows in order of a sort_key column and keeping those numbered 10 or less.

    ROW_NUMBER can be non-deterministic: if sort_key is not unique, each time you run the query it is possible to get different row numbers assigned to any rows where sort_key is the same. When sort_key is unique, each row will always get a unique row number.
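A minimal sketch of the ROW_NUMBER() approach follows, assuming SQLite 3.25 or later (the first version with window functions) through Python's sqlite3 module. The table tablename, the payload column and the aliases rn and numbered are hypothetical; sort_key is unique here, so the numbering is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (sort_key INTEGER, payload TEXT)")  # hypothetical
conn.executemany(
    "INSERT INTO tablename VALUES (?, ?)",
    [(k, f"row{k}") for k in range(1, 26)],  # 25 rows with unique sort keys
)

first_ten = conn.execute(
    """
    SELECT sort_key, payload
    FROM (
        SELECT ROW_NUMBER() OVER (ORDER BY sort_key ASC) AS rn,
               sort_key, payload
        FROM tablename
    ) AS numbered
    WHERE rn <= 10      -- keep only the first ten numbered rows
    ORDER BY rn
    """
).fetchall()

print(len(first_ten))  # 10
```

The window function is evaluated in a subquery because its result cannot be referenced in the WHERE clause of the same query level.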

    RANK() window function

    The RANK() OVER window function acts like ROW_NUMBER, but a query for the top n rows may return more than n rows when there are ties, e.g. when returning the ten youngest persons.

    Such a query could return more than ten rows: if two people share the tenth-youngest age, it returns eleven rows.
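The tie behaviour can be reproduced with a small data set. This sketch assumes SQLite 3.25 or later through Python's sqlite3 module; the person table, its columns and the sample ages are hypothetical, chosen so that two people share the tenth-youngest age.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (name TEXT, age INTEGER)")  # hypothetical table

people = [(f"p{i}", 20 + i) for i in range(9)]            # ages 20..28, ranks 1..9
people += [("tie_a", 29), ("tie_b", 29), ("older", 40)]   # two people tie at rank 10
conn.executemany("INSERT INTO person VALUES (?, ?)", people)

youngest = conn.execute(
    """
    SELECT name, age
    FROM (
        SELECT RANK() OVER (ORDER BY age ASC) AS rnk, name, age
        FROM person
    ) AS ranked
    WHERE rnk <= 10
    ORDER BY rnk, name
    """
).fetchall()

print(len(youngest))  # 11, because both 29-year-olds receive rank 10
```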

    FETCH FIRST clause

    Since ISO SQL:2008, result limits can be specified using the FETCH FIRST clause, as in the following example.

    SELECT * FROM T FETCH FIRST 10 ROWS ONLY

    This clause currently is supported by CA DATACOM/DB 11, IBM DB2, Sybase SQL Anywhere, PostgreSQL, EffiProz, H2, HSQLDB version 2.0, Microsoft SQL Server 2012, Oracle 12c and Mimer SQL.

    Microsoft SQL Server 2014 requires a fuller form, with ORDER BY and OFFSET clauses:

    SELECT * FROM atable ORDER BY acolumn DESC OFFSET 0 ROWS FETCH FIRST 10 ROWS ONLY

    Result limits

    Not all DBMSs support the window functions mentioned above, so non-standard syntax sometimes has to be used. The syntax of the simple limit query varies between DBMSs.
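As one instance of such vendor syntax, SQLite uses a LIMIT clause, optionally with OFFSET. The sketch below assumes SQLite through Python's sqlite3 module; the id column and the data are hypothetical, and other systems spell the same limit differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (id INTEGER)")  # hypothetical column
conn.executemany("INSERT INTO T VALUES (?)", [(i,) for i in range(1, 26)])

# Non-standard LIMIT: at most ten rows, in a defined order.
limited = conn.execute("SELECT * FROM T ORDER BY id LIMIT 10").fetchall()

# LIMIT with OFFSET: skip the first ten rows, then return the next ten.
paged = conn.execute("SELECT * FROM T ORDER BY id LIMIT 10 OFFSET 10").fetchall()

print(limited[0], limited[-1], paged[0], paged[-1])
```

An ORDER BY is included because limiting an unordered result returns an arbitrary subset of rows.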

    Hierarchical query

    Some databases provide specialised syntax for hierarchical data.
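SQLite has no vendor-specific hierarchical clause, so the following sketch swaps in a different technique: a recursive common table expression (WITH RECURSIVE), which can walk the same kind of parent-child data. It assumes SQLite through Python's sqlite3 module; the employee table, its columns and the CTE name chain are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, manager_id INTEGER)"
)  # hypothetical table: manager_id points at the parent row
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "root", None), (2, "a", 1), (3, "b", 1), (4, "c", 2)],
)

tree = conn.execute(
    """
    WITH RECURSIVE chain(id, name, depth) AS (
        SELECT id, name, 0 FROM employee WHERE manager_id IS NULL  -- start at the root
        UNION ALL
        SELECT e.id, e.name, chain.depth + 1                       -- descend one level
        FROM employee AS e JOIN chain ON e.manager_id = chain.id
    )
    SELECT name, depth FROM chain ORDER BY depth, name
    """
).fetchall()

print(tree)  # [('root', 0), ('a', 1), ('b', 1), ('c', 2)]
```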

    Window function

    A window function in SQL:2003 is an aggregate function applied to a partition of the result set.

    For example,

    sum(population) OVER( PARTITION BY city )

    calculates the sum of the populations of all rows having the same city value as the current row.

    Partitions are specified using the OVER clause which modifies the aggregate. Syntax:

    <OVER_CLAUSE> ::= OVER ( [ PARTITION BY <expr>, ... ] [ ORDER BY <expression> ] )

    The OVER clause can partition and order the result set. Ordering is used for order-relative functions such as row_number.
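The partitioned sum described above can be sketched as follows, assuming SQLite 3.25 or later through Python's sqlite3 module. The district table, its name column and the sample figures are hypothetical; only the city and population columns come from the example expression.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE district (city TEXT, name TEXT, population INTEGER)")
conn.executemany(
    "INSERT INTO district VALUES (?, ?, ?)",
    [("X", "x1", 10), ("X", "x2", 20), ("Y", "y1", 5)],  # hypothetical data
)

totals = conn.execute(
    """
    SELECT city, name, population,
           SUM(population) OVER (PARTITION BY city) AS city_total
    FROM district
    ORDER BY city, name
    """
).fetchall()

for row in totals:
    print(row)
# ('X', 'x1', 10, 30)
# ('X', 'x2', 20, 30)
# ('Y', 'y1', 5, 5)
```

Unlike GROUP BY, the window form keeps every input row and attaches the per-partition total to each of them.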

    Query evaluation ANSI

    According to ANSI SQL, a SELECT statement is logically processed in the following order: the FROM clause, then WHERE, GROUP BY, HAVING, the SELECT list, and finally ORDER BY.

    Generating Data in T-SQL

    One method generates data with UNION ALL.

    SQL Server 2008 supports the "row constructor" specified in the SQL3 ("SQL:1999") standard.
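Both ways of generating rows can be compared side by side. The sketch below runs on SQLite through Python's sqlite3 module rather than on SQL Server, so it shows the general shape only; T-SQL wraps the row constructor differently (for example as a derived table with its own column names), and the column names a and b are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Generating rows by chaining single-row SELECTs with UNION ALL.
union_rows = conn.execute(
    "SELECT 1 AS a, 1 AS b UNION ALL SELECT 1, 2 UNION ALL SELECT 1, 3"
).fetchall()

# Generating the same rows with a row constructor (VALUES list).
values_rows = conn.execute("VALUES (1, 1), (1, 2), (1, 3)").fetchall()

print(sorted(union_rows) == sorted(values_rows))  # True
```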
