Go to the first, previous, next, last section, table of contents.


13 SQL Statement Syntax

This chapter describes the syntax for the SQL statements supported in MySQL.

13.1 Data Manipulation Statements

13.1.1 DELETE Syntax

Single-table syntax:

DELETE [LOW_PRIORITY] [QUICK] [IGNORE] FROM tbl_name
       [WHERE where_definition]
       [ORDER BY ...]
       [LIMIT row_count]

Multiple-table syntax:

DELETE [LOW_PRIORITY] [QUICK] [IGNORE]
       tbl_name[.*] [, tbl_name[.*] ...]
       FROM table_references
       [WHERE where_definition]

Or:

DELETE [LOW_PRIORITY] [QUICK] [IGNORE]
       FROM tbl_name[.*] [, tbl_name[.*] ...]
       USING table_references
       [WHERE where_definition]

DELETE deletes rows from tbl_name that satisfy the condition given by where_definition, and returns the number of records deleted.

If you issue a DELETE statement with no WHERE clause, all rows are deleted. A faster way to do this, when you don't want to know the number of deleted rows, is to use TRUNCATE TABLE. See section 13.1.9 TRUNCATE Syntax.

In MySQL 3.23, DELETE without a WHERE clause returns zero as the number of affected records.

In MySQL 3.23, if you really want to know how many records are deleted when you are deleting all rows, and are willing to suffer a speed penalty, you can use a DELETE statement that includes a WHERE clause with an expression that is true for every row. For example:

mysql> DELETE FROM tbl_name WHERE 1>0;

This is much slower than TRUNCATE tbl_name, because it deletes rows one at a time.

If you delete the row containing the maximum value for an AUTO_INCREMENT column, the value will be reused for an ISAM or BDB table, but not for a MyISAM or InnoDB table. If you delete all rows in the table with DELETE FROM tbl_name (without a WHERE) in AUTOCOMMIT mode, the sequence starts over for all table types except for InnoDB and (as of MySQL 4.0) MyISAM. There are some exceptions to this behavior for InnoDB tables, discussed in section 15.7.3 How an AUTO_INCREMENT Column Works in InnoDB.

For MyISAM and BDB tables, you can specify an AUTO_INCREMENT secondary column in a multiple-column key. In this case, reuse of values deleted from the top of the sequence occurs even for MyISAM tables. See section 3.6.9 Using AUTO_INCREMENT.

The DELETE statement supports the following modifiers:

The speed of delete operations may also be affected by factors discussed in section 7.2.16 Speed of DELETE Statements.

In MyISAM tables, deleted records are maintained in a linked list and subsequent INSERT operations reuse old record positions. To reclaim unused space and reduce file sizes, use the OPTIMIZE TABLE statement or the myisamchk utility to reorganize tables. OPTIMIZE TABLE is easier, but myisamchk is faster. See section 13.5.2.5 OPTIMIZE TABLE Syntax and section 5.7.2.10 Table Optimization.

The QUICK modifier affects whether index leaves are merged for delete operations. DELETE QUICK is most useful for applications where index values for deleted rows will be replaced by similar index values from rows inserted later. In this case, the holes left by deleted values will be reused.

DELETE QUICK is not useful when deleted values lead to underfilled index blocks spanning a range of index values for which new inserts will occur again. In this case, use of QUICK can lead to wasted space in the index that remains unreclaimed. Here is an example of such a scenario:

  1. Create a table that contains an indexed AUTO_INCREMENT column.
  2. Insert many records into the table. Each insert results in an index values that is added to the high end of the index.
  3. Delete a block of records at the low end of the column range using DELETE QUICK.

In this scenario, the index blocks associated with the deleted index values become underfilled but are not merged with other index blocks due to the use of QUICK. They will remain underfilled when new inserts occur, because new records will not have index values in the deleted range. Furthermore, they will remain underfilled even if you later use DELETE without QUICK, unless some of the deleted index values happen to lie in index blocks within or adjacent to the underfilled blocks. To reclaim unused index space under these circumstances, you can use OPTIMIZE TABLE.

If you are going to delete many rows from a table, it might be faster to use DELETE QUICK followed by OPTIMIZE TABLE. This rebuilds the index rather than performing many index block merge operations.

The MySQL-specific LIMIT row_count option to DELETE tells the server the maximum number of rows to be deleted before control is returned to the client. This can be used to ensure that a specific DELETE statement doesn't take too much time. You can simply repeat the DELETE statement until the number of affected rows is less than the LIMIT value.

If the DELETE statement includes an ORDER BY clause, the rows are deleted in the order specified by the clause. This is really useful only in conjunction with LIMIT. For example, the following statement finds rows matching the WHERE clause, sorts them in timestamp order, and deletes the first (oldest) one:

DELETE FROM somelog
WHERE user = 'jcole'
ORDER BY timestamp
LIMIT 1

ORDER BY can be used with DELETE beginning with MySQL 4.0.0.

From MySQL 4.0, you can specify multiple tables in the DELETE statement to delete rows from one or more tables depending on a particular condition in multiple tables. However, you cannot use ORDER BY or LIMIT in a multiple-table DELETE.

The first multiple-table DELETE syntax is supported starting from MySQL 4.0.0. The second is supported starting from MySQL 4.0.2. The table_references part lists the tables involved in the join. Its syntax is described in section 13.1.7.1 JOIN Syntax.

For the first syntax, only matching rows from the tables listed before the FROM clause are deleted. For the second syntax, only matching rows from the tables listed in the FROM clause (before the USING clause) are deleted. The effect is that you can delete rows from many tables at the same time and also have additional tables that are used for searching:

DELETE t1, t2 FROM t1, t2, t3 WHERE t1.id=t2.id AND t2.id=t3.id;

Or:

DELETE FROM t1, t2 USING t1, t2, t3 WHERE t1.id=t2.id AND t2.id=t3.id;

These statements use all three files when searching for rows to delete, but delete matching rows only from tables t1 and t2.

The examples show inner joins using the comma operator, but multiple-table DELETE statements can use any type of join allowed in SELECT statements, such as LEFT JOIN.

The syntax allows .* after the table names for compatibility with Access.

If you use a multiple-table DELETE statement involving InnoDB tables for which there are foreign key constraints, the MySQL optimizer might process tables in an order that differs from that of their parent/child relationship. In this case, the statement fails and rolls back. Instead, delete from a single table and rely on the ON DELETE capabilities that InnoDB provides to cause the other tables to be modified accordingly.

Note: In MySQL 4.0, you should refer to the table names to be deleted with the true table name. In MySQL 4.1, you must use the alias (if one was given) when referring to a table name:

In MySQL 4.0:

DELETE test FROM test AS t1, test2 WHERE ...

In MySQL 4.1:

DELETE t1 FROM test AS t1, test2 WHERE ...

The reason we didn't make this change in 4.0 is that we didn't want to break any old 4.0 applications that were using the old syntax.

Currently, you cannot delete from a table and select from the same table in a subquery.

13.1.2 DO Syntax

DO expr [, expr] ...

DO executes the expressions but doesn't return any results. This is shorthand for SELECT expr, ..., but has the advantage that it's slightly faster when you don't care about the result.

DO is useful mainly with functions that have side effects, such as RELEASE_LOCK().

DO was added in MySQL 3.23.47.

13.1.3 HANDLER Syntax

HANDLER tbl_name OPEN [ AS alias ]
HANDLER tbl_name READ index_name { = | >= | <= | < } (value1,value2,...)
    [ WHERE where_condition ] [LIMIT ... ]
HANDLER tbl_name READ index_name { FIRST | NEXT | PREV | LAST }
    [ WHERE where_condition ] [LIMIT ... ]
HANDLER tbl_name READ { FIRST | NEXT }
    [ WHERE where_condition ] [LIMIT ... ]
HANDLER tbl_name CLOSE

The HANDLER statement provides direct access to table storage engine interfaces. It is available for MyISAM tables as MySQL 4.0.0 and InnoDB tables as of MySQL 4.0.3.

The HANDLER ... OPEN statement opens a table, making it accessible via subsequent HANDLER ... READ statements. This table object is not shared by other threads and is not closed until the thread calls HANDLER ... CLOSE or the thread terminates. If you open the table using an alias, further references to the table with other HANDLER statements must use the alias rather than the table name.

The first HANDLER ... READ syntax fetches a row where the index specified satisfies the given values and the WHERE condition is met. If you have a multiple-column index, specify the index column values as a comma-separated list. Either specify values for all the columns in the index, or specify values for a leftmost prefix of the index columns. Suppose that an index includes three columns named col_a, col_b, and col_c, in that order. The HANDLER statement can specify values for all three columns in the index, or for the columns in a leftmost prefix. For example:

HANDLER ... index_name = (col_a_val,col_b_val,col_c_val) ...
HANDLER ... index_name = (col_a_val,col_b_val) ...
HANDLER ... index_name = (col_a_val) ...

The second HANDLER ... READ syntax fetches a row from the table in index order that matches WHERE condition.

The third HANDLER ... READ syntax fetches a row from the table in natural row order that matches the WHERE condition. It is faster than HANDLER tbl_name READ index_name when a full table scan is desired. Natural row order is the order in which rows are stored in a MyISAM table data file. This statement works for InnoDB tables as well, but there is no such concept because there is no separate data file.

Without a LIMIT clause, all forms of HANDLER ... READ fetch a single row if one is available. To return a specific number of rows, include a LIMIT clause. It has the same syntax as for the SELECT statement. See section 13.1.7 SELECT Syntax.

HANDLER ... CLOSE closes a table that was opened with HANDLER ... OPEN.

Note: To use the HANDLER interface to refer to a table's PRIMARY KEY, use the quoted identifier `PRIMARY`:

HANDLER tbl_name READ `PRIMARY` > (...);

HANDLER is a somewhat low-level statement. For example, it does not provide consistency. That is, HANDLER ... OPEN does not take a snapshot of the table, and does not lock the table. This means that after a HANDLER ... OPEN statement is issued, table data can be modified (by this or any other thread) and these modifications might appear only partially in HANDLER ... NEXT or HANDLER ... PREV scans.

There are several reasons to use the HANDLER interface instead of normal SELECT statements:

13.1.4 INSERT Syntax

INSERT [LOW_PRIORITY | DELAYED | HIGH_PRIORITY] [IGNORE]
    [INTO] tbl_name [(col_name,...)]
    VALUES ({expr | DEFAULT},...),(...),...
    [ ON DUPLICATE KEY UPDATE col_name=expr, ... ]

Or:

INSERT [LOW_PRIORITY | DELAYED | HIGH_PRIORITY] [IGNORE]
    [INTO] tbl_name
    SET col_name={expr | DEFAULT}, ...
    [ ON DUPLICATE KEY UPDATE col_name=expr, ... ]

Or:

INSERT [LOW_PRIORITY | DELAYED | HIGH_PRIORITY] [IGNORE]
    [INTO] tbl_name [(col_name,...)]
    SELECT ...

INSERT inserts new rows into an existing table. The INSERT ... VALUES and INSERT ... SET forms of the statement insert rows based on explicitly specified values. The INSERT ... SELECT form inserts rows selected from another table or tables. The INSERT ... VALUES form with multiple value lists is supported in MySQL 3.22.5 or later. The INSERT ... SET syntax is supported in MySQL 3.22.10 or later. INSERT ... SELECT is discussed further in See section 13.1.4.1 INSERT ... SELECT Syntax.

tbl_name is the table into which rows should be inserted. The columns for which the statement provides values can be specified as follows:

Column values can be given in several ways:

The INSERT statement supports the following modifiers:

If you specify the ON DUPLICATE KEY UPDATE clause (new in MySQL 4.1.0), and a row is inserted that would cause a duplicate value in a UNIQUE index or PRIMARY KEY, an UPDATE of the old row is performed. For example, if column a is declared as UNIQUE and already contains the value 1, the following two statements have identical effect:

mysql> INSERT INTO table (a,b,c) VALUES (1,2,3)
    -> ON DUPLICATE KEY UPDATE c=c+1;

mysql> UPDATE table SET c=c+1 WHERE a=1;

The rows-affected value is 1 if the row is inserted as a new record and 2 if an existing record is updated.

Note: If column b is unique too, the INSERT would be equivalent to this UPDATE statement instead:

mysql> UPDATE table SET c=c+1 WHERE a=1 OR b=2 LIMIT 1;

If a=1 OR b=2 matches several rows, only one row is updated! In general, you should try to avoid using the ON DUPLICATE KEY clause on tables with multiple UNIQUE keys.

As of MySQL 4.1.1, you can use the VALUES(col_name) function in the UPDATE clause to refer to column values from the INSERT part of the INSERT ... UPDATE statement. In other words, VALUES(col_name) in the UPDATE clause refers to the value of col_name that would be inserted if no duplicate-key conflict occurred. This function is especially useful in multiple-row inserts. The VALUES() function is meaningful only in INSERT ... UPDATE statements and returns NULL otherwise.

Example:

mysql> INSERT INTO table (a,b,c) VALUES (1,2,3),(4,5,6)
    -> ON DUPLICATE KEY UPDATE c=VALUES(a)+VALUES(b);

That statement is identical to the following two statements:

mysql> INSERT INTO table (a,b,c) VALUES (1,2,3)
    -> ON DUPLICATE KEY UPDATE c=3;
mysql> INSERT INTO table (a,b,c) VALUES (4,5,6)
    -> ON DUPLICATE KEY UPDATE c=9;

When you use ON DUPLICATE KEY UPDATE, the DELAYED option is ignored.

You can find the value used for an AUTO_INCREMENT column by using the LAST_INSERT_ID() function. From within the C API, use the mysql_insert_id() function. However, note that the two functions do not behave quite identically under all circumstances. The behavior of INSERT statements with respect to AUTO_INCREMENT columns is discussed further in section 12.8.3 Information Functions and section 21.2.3.33 mysql_insert_id().

If you use an INSERT ... VALUES statement with multiple value lists or INSERT ... SELECT, the statement returns an information string in this format:

Records: 100 Duplicates: 0 Warnings: 0

Records indicates the number of rows processed by the statement. (This is not necessarily the number of rows actually inserted. Duplicates can be non-zero.) Duplicates indicates the number of rows that couldn't be inserted because they would duplicate some existing unique index value. Warnings indicates the number of attempts to insert column values that were problematic in some way. Warnings can occur under any of the following conditions:

If you are using the C API, the information string can be obtained by invoking the mysql_info() function. See section 21.2.3.31 mysql_info().

13.1.4.1 INSERT ... SELECT Syntax

INSERT [LOW_PRIORITY] [IGNORE] [INTO] tbl_name [(column_list)]
    SELECT ...

With INSERT ... SELECT, you can quickly insert many rows into a table from one or many tables.

For example:

INSERT INTO tbl_temp2 (fld_id)
    SELECT tbl_temp1.fld_order_id
    FROM tbl_temp1 WHERE tbl_temp1.fld_order_id > 100;

The following conditions hold for an INSERT ... SELECT statement:

You can use REPLACE instead of INSERT to overwrite old rows. REPLACE is the counterpart to INSERT IGNORE in the treatment of new rows that contain unique key values that duplicate old rows: The new rows are used to replace the old rows rather than being discarded.

13.1.4.2 INSERT DELAYED Syntax

INSERT DELAYED ...

The DELAYED option for the INSERT statement is a MySQL extension to standard SQL that is very useful if you have clients that can't wait for the INSERT to complete. This is a common problem when you use MySQL for logging and you also periodically run SELECT and UPDATE statements that take a long time to complete. DELAYED was introduced in MySQL 3.22.15.

When a client uses INSERT DELAYED, it gets an okay from the server at once, and the row is queued to be inserted when the table is not in use by any other thread.

Another major benefit of using INSERT DELAYED is that inserts from many clients are bundled together and written in one block. This is much faster than doing many separate inserts.

There are some constraints on the use of DELAYED:

Note that currently the queued rows are held only in memory until they are inserted into the table. This means that if you terminate mysqld forcibly (for example, with kill -9) or if mysqld dies unexpectedly, any queued rows that have not been written to disk are lost!

The following describes in detail what happens when you use the DELAYED option to INSERT or REPLACE. In this description, the ``thread'' is the thread that received an INSERT DELAYED statement and ``handler'' is the thread that handles all INSERT DELAYED statements for a particular table.

Note that INSERT DELAYED is slower than a normal INSERT if the table is not in use. There is also the additional overhead for the server to handle a separate thread for each table for which there are delayed rows. This means that you should use INSERT DELAYED only when you are really sure that you need it!

13.1.5 LOAD DATA INFILE Syntax

LOAD DATA [LOW_PRIORITY | CONCURRENT] [LOCAL] INFILE 'file_name.txt'
    [REPLACE | IGNORE]
    INTO TABLE tbl_name
    [FIELDS
        [TERMINATED BY '\t']
        [[OPTIONALLY] ENCLOSED BY '']
        [ESCAPED BY '\\' ]
    ]
    [LINES 
        [STARTING BY '']    
        [TERMINATED BY '\n']
    ]
    [IGNORE number LINES]
    [(col_name,...)]

The LOAD DATA INFILE statement reads rows from a text file into a table at a very high speed. For more information about the efficiency of INSERT versus LOAD DATA INFILE and speeding up LOAD DATA INFILE, section 7.2.14 Speed of INSERT Statements.

You can also load data files by using the mysqlimport utility; it operates by sending a LOAD DATA INFILE statement to the server. The --local option causes mysqlimport to read data files from the client host. You can specify the --compress option to get better performance over slow networks if the client and server support the compressed protocol. See section 8.10 The mysqlimport Data Import Program.

If you specify the LOW_PRIORITY keyword, execution of the LOAD DATA statement is delayed until no other clients are reading from the table.

If you specify the CONCURRENT keyword with a MyISAM table that satisfies the condition for concurrent inserts (that is, it contains no free blocks in the middle), then other threads can retrieve data from the table while LOAD DATA is executing. Using this option affects the performance of LOAD DATA a bit, even if no other thread is using the table at the same time.

If the LOCAL keyword is specified, it is interpreted with respect to the client end of the connection:

LOCAL is available in MySQL 3.22.6 or later.

For security reasons, when reading text files located on the server, the files must either reside in the database directory or be readable by all. Also, to use LOAD DATA INFILE on server files, you must have the FILE privilege. See section 5.5.3 Privileges Provided by MySQL.

Using LOCAL is a bit slower than letting the server access the files directly, because the contents of the file must be sent over the connection by the client to the server. On the other hand, you do not need the FILE privilege to load local files.

As of MySQL 3.23.49 and MySQL 4.0.2 (4.0.13 on Windows), LOCAL works only if your server and your client both have been enabled to allow it. For example, if mysqld was started with --local-infile=0, LOCAL will not work. See section 5.4.4 Security Issues with LOAD DATA LOCAL.

If you need LOAD DATA to read from a pipe, you can use the following technique (here we load the listing of the '/' directory into a table):

mkfifo /mysql/db/x/x
chmod 666 /mysql/db/x/x
find / -ls > /mysql/db/x/x
mysql -e "LOAD DATA INFILE 'x' INTO TABLE x" x

If you are using a version of MySQL older than 3.23.25, you can use this technique only with LOAD DATA LOCAL INFILE.

If you are using MySQL before Version 3.23.24, you can't read from a FIFO with LOAD DATA INFILE. If you need to read from a FIFO (for example, the output from gunzip), use LOAD DATA LOCAL INFILE instead.

When locating files on the server host, the server uses the following rules:

Note that these rules mean that a file named as `./myfile.txt' is read from the server's data directory, whereas the same file named as `myfile.txt' is read from the database directory of the default database. For example, the following LOAD DATA statement reads the file `data.txt' from the database directory for db1 because db1 is the current database, even though the statement explicitly loads the file into a table in the db2 database:

mysql> USE db1;
mysql> LOAD DATA INFILE 'data.txt' INTO TABLE db2.my_table;

The REPLACE and IGNORE keywords control handling of input records that duplicate existing records on unique key values.

If you specify REPLACE, input rows replace existing rows (in other words, rows that have the same value for a primary or unique index as an existing row). See section 13.1.6 REPLACE Syntax.

If you specify IGNORE, input rows that duplicate an existing row on a unique key value are skipped. If you don't specify either option, the behavior depends on whether or not the LOCAL keyword is specified. Without LOCAL, an error occurs when a duplicate key value is found, and the rest of the text file is ignored. With LOCAL, the default behavior is the same as if IGNORE is specified; this is because the server has no way to stop transmission of the file in the middle of the operation.

If you want to ignore foreign key constraints during the load operation, you can issue a SET FOREIGN_KEY_CHECKS=0 statement before executing LOAD DATA.

If you use LOAD DATA INFILE on an empty MyISAM table, all non-unique indexes are created in a separate batch (as for REPAIR TABLE). This normally makes LOAD DATA INFILE much faster when you have many indexes. Normally this is very fast, but in some extreme cases, you can create the indexes even faster by turning them off with ALTER TABLE .. DISABLE KEYS before loading the file into the table and using ALTER TABLE .. ENABLE KEYS to re-create the indexes after loading the file. See section 7.2.14 Speed of INSERT Statements.

LOAD DATA INFILE is the complement of SELECT ... INTO OUTFILE. See section 13.1.7 SELECT Syntax. To write data from a table to a file, use SELECT ... INTO OUTFILE. To read the file back into a table, use LOAD DATA INFILE. The syntax of the FIELDS and LINES clauses is the same for both statements. Both clauses are optional, but FIELDS must precede LINES if both are specified.

If you specify a FIELDS clause, each of its subclauses (TERMINATED BY, [OPTIONALLY] ENCLOSED BY, and ESCAPED BY) is also optional, except that you must specify at least one of them.

If you don't specify a FIELDS clause, the defaults are the same as if you had written this:

FIELDS TERMINATED BY '\t' ENCLOSED BY '' ESCAPED BY '\\'

If you don't specify a LINES clause, the default is the same as if you had written this:

LINES TERMINATED BY '\n' STARTING BY ''

In other words, the defaults cause LOAD DATA INFILE to act as follows when reading input:

Conversely, the defaults cause SELECT ... INTO OUTFILE to act as follows when writing output:

Note that to write FIELDS ESCAPED BY '\\', you must specify two backslashes for the value to be read as a single backslash.

Note: If you have generated the text file on a Windows system, you might have to use LINES TERMINATED BY '\r\n' to read the file properly, because Windows programs typically use two characters as a line terminator. Some programs, such as WordPad, might use \r as a line terminator when writing files. To read such files, use LINES TERMINATED BY '\r'.

If all the lines you want to read in have a common prefix that you want to ignore, you can use LINES STARTING BY 'prefix_string' to skip over the prefix (and anything before it). If a line doesn't include the prefix, the entire line is skipped. Note that prefix_string may be in the middle of the line!

Example:

mysql> LOAD DATA INFILE '/tmp/test.txt'
    -> INTO TABLE test LINES STARTING BY "xxx";

With this you can read in a file that contains something like:

xxx"Row",1
something xxx"Row",2

And just get the data ("row",1) and ("row",2).

The IGNORE number LINES option can be used to ignore lines at the start of the file. For example, you can use IGNORE 1 LINES to skip over an initial header line containing column names:

mysql> LOAD DATA INFILE '/tmp/test.txt'
    -> INTO TABLE test IGNORE 1 LINES;

When you use SELECT ... INTO OUTFILE in tandem with LOAD DATA INFILE to write data from a database into a file and then read the file back into the database later, the field- and line-handling options for both statements must match. Otherwise, LOAD DATA INFILE will not interpret the contents of the file properly. Suppose that you use SELECT ... INTO OUTFILE to write a file with fields delimited by commas:

mysql> SELECT * INTO OUTFILE 'data.txt'
    ->          FIELDS TERMINATED BY ','
    ->          FROM table2;

To read the comma-delimited file back in, the correct statement would be:

mysql> LOAD DATA INFILE 'data.txt' INTO TABLE table2
    ->           FIELDS TERMINATED BY ',';

If instead you tried to read in the file with the statement shown here, it wouldn't work because it instructs LOAD DATA INFILE to look for tabs between fields:

mysql> LOAD DATA INFILE 'data.txt' INTO TABLE table2
    ->           FIELDS TERMINATED BY '\t';

The likely result is that each input line would be interpreted as a single field.

LOAD DATA INFILE can be used to read files obtained from external sources, too. For example, a file in dBASE format will have fields separated by commas and enclosed within double quotes. If lines in the file are terminated by newlines, the statement shown here illustrates the field- and line-handling options you would use to load the file:

mysql> LOAD DATA INFILE 'data.txt' INTO TABLE tbl_name
    ->           FIELDS TERMINATED BY ',' ENCLOSED BY '"'
    ->           LINES TERMINATED BY '\n';

Any of the field- or line-handling options can specify an empty string (''). If not empty, the FIELDS [OPTIONALLY] ENCLOSED BY and FIELDS ESCAPED BY values must be a single character. The FIELDS TERMINATED BY, LINES STARTING BY, and LINES TERMINATED BY values can be more than one character. For example, to write lines that are terminated by carriage return/linefeed pairs, or to read a file containing such lines, specify a LINES TERMINATED BY '\r\n' clause.

To read a file containing jokes that are separated by lines consisting of %%, you can do this

mysql> CREATE TABLE jokes
    ->     (a INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    ->     joke TEXT NOT NULL);
mysql> LOAD DATA INFILE '/tmp/jokes.txt' INTO TABLE jokes
    ->     FIELDS TERMINATED BY ''
    ->     LINES TERMINATED BY '\n%%\n' (joke);

FIELDS [OPTIONALLY] ENCLOSED BY controls quoting of fields. For output (SELECT ... INTO OUTFILE), if you omit the word OPTIONALLY, all fields are enclosed by the ENCLOSED BY character. An example of such output (using a comma as the field delimiter) is shown here:

"1","a string","100.20"
"2","a string containing a , comma","102.20"
"3","a string containing a \" quote","102.20"
"4","a string containing a \", quote and comma","102.20"

If you specify OPTIONALLY, the ENCLOSED BY character is used only to enclose CHAR and VARCHAR fields:

1,"a string",100.20
2,"a string containing a , comma",102.20
3,"a string containing a \" quote",102.20
4,"a string containing a \", quote and comma",102.20

Note that occurrences of the ENCLOSED BY character within a field value are escaped by prefixing them with the ESCAPED BY character. Also note that if you specify an empty ESCAPED BY value, it is possible to generate output that cannot be read properly by LOAD DATA INFILE. For example, the preceding output just shown would appear as follows if the escape character is empty. Observe that the second field in the fourth line contains a comma following the quote, which (erroneously) appears to terminate the field:

1,"a string",100.20
2,"a string containing a , comma",102.20
3,"a string containing a " quote",102.20
4,"a string containing a ", quote and comma",102.20

For input, the ENCLOSED BY character, if present, is stripped from the ends of field values. (This is true whether or not OPTIONALLY is specified; OPTIONALLY has no effect on input interpretation.) Occurrences of the ENCLOSED BY character preceded by the ESCAPED BY character are interpreted as part of the current field value.

If the field begins with the ENCLOSED BY character, instances of that character are recognized as terminating a field value only if followed by the field or line TERMINATED BY sequence. To avoid ambiguity, occurrences of the ENCLOSED BY character within a field value can be doubled and will be interpreted as a single instance of the character. For example, if ENCLOSED BY '"' is specified, quotes are handled as shown here:

"The ""BIG"" boss"  -> The "BIG" boss
The "BIG" boss      -> The "BIG" boss
The ""BIG"" boss    -> The ""BIG"" boss

FIELDS ESCAPED BY controls how to write or read special characters. If the FIELDS ESCAPED BY character is not empty, it is used to prefix the following characters on output:

If the FIELDS ESCAPED BY character is empty, no characters are escaped and NULL is output as NULL, not \N. It is probably not a good idea to specify an empty escape character, particularly if field values in your data contain any of the characters in the list just given.

For input, if the FIELDS ESCAPED BY character is not empty, occurrences of that character are stripped and the following character is taken literally as part of a field value. The exceptions are an escaped `0' or `N' (for example, \0 or \N if the escape character is `\'). These sequences are interpreted as ASCII NUL (a zero-valued byte) and NULL. The rules for NULL handling are described later in this section.

For more information about `\'-escape syntax, see section 9.1 Literal Values.

In certain cases, field- and line-handling options interact:

Handling of NULL values varies according to the FIELDS and LINES options in use:

Some cases are not supported by LOAD DATA INFILE:

The following example loads all columns of the persondata table:

mysql> LOAD DATA INFILE 'persondata.txt' INTO TABLE persondata;

By default, when no column list is provided at the end of the LOAD DATA INFILE statement, input lines are expected to contain a field for each table column. If you want to load only some of a table's columns, specify a column list:

mysql> LOAD DATA INFILE 'persondata.txt'
    ->           INTO TABLE persondata (col1,col2,...);

You must also specify a column list if the order of the fields in the input file differs from the order of the columns in the table. Otherwise, MySQL cannot tell how to match up input fields with table columns.

If an input line has too many fields, the extra fields are ignored and the number of warnings is incremented.

If an input line has too few fields, the table columns for which input fields are missing are set to their default values. Default value assignment is described in section 13.2.6 CREATE TABLE Syntax.

An empty field value is interpreted differently than if the field value is missing:

These are the same values that result if you assign an empty string explicitly to a string, numeric, or date or time type explicitly in an INSERT or UPDATE statement.

TIMESTAMP columns are set to the current date and time only if there is a NULL value for the column (that is, \N), or (for the first TIMESTAMP column only) if the TIMESTAMP column is omitted from the field list when a field list is specified.

LOAD DATA INFILE regards all input as strings, so you can't use numeric values for ENUM or SET columns the way you can with INSERT statements. All ENUM and SET values must be specified as strings!

When the LOAD DATA INFILE statement finishes, it returns an information string in the following format:

Records: 1  Deleted: 0  Skipped: 0  Warnings: 0

If you are using the C API, you can get information about the statement by calling the mysql_info() function. See section 21.2.3.31 mysql_info().

Warnings occur under the same circumstances as when values are inserted via the INSERT statement (see section 13.1.4 INSERT Syntax), except that LOAD DATA INFILE also generates warnings when there are too few or too many fields in the input row. The warnings are not stored anywhere; the number of warnings can be used only as an indication of whether everything went well.

From MySQL 4.1.1 on, you can use SHOW WARNINGS to get a list of the first max_error_count warnings as information about what went wrong. See section 13.5.4.20 SHOW WARNINGS Syntax.

Before MySQL 4.1.1, only a warning count is available to indicate that something went wrong. If you get warnings and want to know exactly why you got them, one way to do this is to dump the table into another file using SELECT ... INTO OUTFILE and compare the file to your original input file.

13.1.6 REPLACE Syntax

REPLACE [LOW_PRIORITY | DELAYED]
    [INTO] tbl_name [(col_name,...)]
    VALUES ({expr | DEFAULT},...),(...),...

Or:

REPLACE [LOW_PRIORITY | DELAYED]
    [INTO] tbl_name
    SET col_name={expr | DEFAULT}, ...

Or:

REPLACE [LOW_PRIORITY | DELAYED]
    [INTO] tbl_name [(col_name,...)]
    SELECT ...

REPLACE works exactly like INSERT, except that if an old record in the table has the same value as a new record for a PRIMARY KEY or a UNIQUE index, the old record is deleted before the new record is inserted. See section 13.1.4 INSERT Syntax.

Note that unless the table has a PRIMARY KEY or UNIQUE index, using a REPLACE statement makes no sense. It becomes equivalent to INSERT, because there is no index to be used to determine whether a new row duplicates another.

Values for all columns are taken from the values specified in the REPLACE statement. Any missing columns are set to their default values, just as happens for INSERT. You can't refer to values from the old row and use them in the new row. It appeared that you could do this in some old MySQL versions, but that was a bug that has been corrected.

To be able to use REPLACE, you must have INSERT and DELETE privileges for the table.

The REPLACE statement returns a count to indicate the number of rows affected. This is the sum of the rows deleted and inserted. If the count is 1 for a single-row REPLACE, a row was inserted and no rows were deleted. If the count is greater than 1, one or more old rows were deleted before the new row was inserted. It is possible for a single row to replace more than one old row if the table contains multiple unique indexes and the new row duplicates values for different old rows in different unique indexes.

The affected-rows count makes it easy to determine whether REPLACE only added a row or whether it also replaced any rows: Check whether the count is 1 (added) or greater (replaced).

If you are using the C API, the affected-rows count can be obtained using the mysql_affected_rows() function.

Currently, you cannot replace into a table and select from the same table in a subquery.

Here follows in more detail the algorithm that is used (it is also used with LOAD DATA ... REPLACE):

  1. Try to insert the new row into the table
  2. While the insertion fails because a duplicate-key error occurs for a primary or unique key:
    1. Delete from the table the conflicting row that has the duplicate key value
    2. Try again to insert the new row into the table

13.1.7 SELECT Syntax

SELECT
    [ALL | DISTINCT | DISTINCTROW ]
      [HIGH_PRIORITY]
      [STRAIGHT_JOIN]
      [SQL_SMALL_RESULT] [SQL_BIG_RESULT] [SQL_BUFFER_RESULT]
      [SQL_CACHE | SQL_NO_CACHE] [SQL_CALC_FOUND_ROWS]
    select_expr, ...
    [INTO OUTFILE 'file_name' export_options
      | INTO DUMPFILE 'file_name']
    [FROM table_references
      [WHERE where_definition]
      [GROUP BY {col_name | expr | position}
        [ASC | DESC], ... [WITH ROLLUP]]
      [HAVING where_definition]
      [ORDER BY {col_name | expr | position}
        [ASC | DESC] , ...]
      [LIMIT {[offset,] row_count | row_count OFFSET offset}]
      [PROCEDURE procedure_name(argument_list)]
      [FOR UPDATE | LOCK IN SHARE MODE]]

SELECT is used to retrieve rows selected from one or more tables. Support for UNION statements and subqueries is available as of MySQL 4.0 and 4.1, respectively. See section 13.1.7.2 UNION Syntax and section 13.1.8 Subquery Syntax.

SELECT can also be used to retrieve rows computed without reference to any table. For example:

mysql> SELECT 1 + 1;
        -> 2

All clauses used must be given in exactly the order shown in the syntax description. For example, a HAVING clause must come after any GROUP BY clause and before any ORDER BY clause.

Following the SELECT keyword, you can give a number of options that affect the operation of the statement.

The ALL, DISTINCT, and DISTINCTROW options specify whether duplicate rows should be returned. If none of these options are given, the default is ALL (all matching rows are returned). DISTINCT and DISTINCTROW are synonyms and specify that duplicate rows in the result set should be removed.

HIGH_PRIORITY, STRAIGHT_JOIN, and options beginning with SQL_ are MySQL extensions to standard SQL.

13.1.7.1 JOIN Syntax

MySQL supports the following JOIN syntaxes for the table_references part of SELECT statements and multiple-table DELETE and UPDATE statements:

table_reference, table_reference
table_reference [INNER | CROSS] JOIN table_reference [join_condition]
table_reference STRAIGHT_JOIN table_reference
table_reference LEFT [OUTER] JOIN table_reference [join_condition]
table_reference NATURAL [LEFT [OUTER]] JOIN table_reference
{ OJ table_reference LEFT OUTER JOIN table_reference
    ON conditional_expr }
table_reference RIGHT [OUTER] JOIN table_reference [join_condition]
table_reference NATURAL [RIGHT [OUTER]] JOIN table_reference

table_reference is defined as:

tbl_name [[AS] alias]
    [[USE INDEX (key_list)]
      | [IGNORE INDEX (key_list)]
      | [FORCE INDEX (key_list)]]

join_condition is defined as:

ON conditional_expr | USING (column_list)

You should generally not have any conditions in the ON part that are used to restrict which rows you want in the result set, but rather specify these conditions in the WHERE clause. There are exceptions to this rule.

Note that INNER JOIN syntax allows a join_condition only from MySQL 3.23.17 on. The same is true for JOIN and CROSS JOIN only as of MySQL 4.0.11.

The { OJ ... LEFT OUTER JOIN ...} syntax shown in the preceding list exists only for compatibility with ODBC.

As of MySQL 3.23.12, you can give hints about which index MySQL should use when retrieving information from a table. By specifying USE INDEX (key_list), you can tell MySQL to use only one of the possible indexes to find rows in the table. The alternative syntax IGNORE INDEX (key_list) can be used to tell MySQL to not use some particular index. These hints are useful if EXPLAIN shows that MySQL is using the wrong index from the list of possible indexes.

From MySQL 4.0.9 on, you can also use FORCE INDEX. This acts likes USE INDEX (key_list) but with the addition that a table scan is assumed to be very expensive. In other words, a table scan will only be used if there is no way to use one of the given indexes to find rows in the table.

USE KEY, IGNORE KEY, and FORCE KEY are synonyms for USE INDEX, IGNORE INDEX, and FORCE INDEX.

Note: USE INDEX, IGNORE INDEX, and FORCE INDEX only affect which indexes are used when MySQL decides how to find rows in the table and how to do the join. They do not affect whether an index will be used when resolving an ORDER BY or GROUP BY.

Some join examples:

mysql> SELECT * FROM table1,table2 WHERE table1.id=table2.id;
mysql> SELECT * FROM table1 LEFT JOIN table2 ON table1.id=table2.id;
mysql> SELECT * FROM table1 LEFT JOIN table2 USING (id);
mysql> SELECT * FROM table1 LEFT JOIN table2 ON table1.id=table2.id
    ->          LEFT JOIN table3 ON table2.id=table3.id;
mysql> SELECT * FROM table1 USE INDEX (key1,key2)
    ->          WHERE key1=1 AND key2=2 AND key3=3;
mysql> SELECT * FROM table1 IGNORE INDEX (key3)
    ->          WHERE key1=1 AND key2=2 AND key3=3;

See section 7.2.9 How MySQL Optimizes LEFT JOIN and RIGHT JOIN.

13.1.7.2 UNION Syntax

SELECT ...
UNION [ALL | DISTINCT]
SELECT ...
  [UNION [ALL | DISTINCT]
   SELECT ...]

UNION is used to combine the result from many SELECT statements into one result set. UNION is available from MySQL 4.0.0 on.

Selected columns listed in corresponding positions of each SELECT statement should have the same type. (For example, the first column selected by the first statement should have the same type as the first column selected by the other statements.) The column names used in the first SELECT statement are used as the column names for the results returned.

The SELECT statements are normal select statements, but with the following restrictions:

If you don't use the keyword ALL for the UNION, all returned rows will be unique, as if you had done a DISTINCT for the total result set. If you specify ALL, you will get all matching rows from all the used SELECT statements.

The DISTINCT keyword is an optional word (introduced in MySQL 4.0.17). It does nothing, but is allowed in the syntax as required by the SQL standard.

Before MySQL 4.1.2, you cannot mix UNION ALL and UNION DISTINCT in the same query. If you use ALL for one UNION, it is used for all of them. As of MySQL 4.1.2, mixed UNION types are treated such that a DISTINCT union overrides any ALL union to its left. A DISTINCT union can be produced explicitly by using UNION DISTINCT or implicitly by using UNION with no following DISTINCT or ALL keyword.

If you want to use an ORDER BY or LIMIT clause to sort or limit the entire UNION result, parenthesize the individual SELECT statements and place the ORDER BY or LIMIT after the last one. The following example uses both clauses:

(SELECT a FROM tbl_name WHERE a=10 AND B=1)
UNION
(SELECT a FROM tbl_name WHERE a=11 AND B=2)
ORDER BY a LIMIT 10;

This kind of ORDER BY cannot use column references that include a table name (that is, names in tbl_name.col_name format). Instead, provide a column alias in the first SELECT statement and refer to the alias in the ORDER BY, or else refer to the column in the ORDER BY using its column position. (An alias is preferable because use of column positions is deprecated.)

To apply ORDER BY or LIMIT to an individual SELECT, place the clause inside the parentheses that enclose the SELECT:

(SELECT a FROM tbl_name WHERE a=10 AND B=1 ORDER BY a LIMIT 10)
UNION
(SELECT a FROM tbl_name WHERE a=11 AND B=2 ORDER BY a LIMIT 10);

The types and lengths of the columns in the result set of a UNION take into account the values retrieved by all the SELECT statements. Before MySQL 4.1.1, a limitation of UNION is that only the values from the first SELECT are used to determine result column types and lengths. This could result in value truncation if, for example, the first SELECT retrieves shorter values than the second SELECT:

mysql> SELECT REPEAT('a',1) UNION SELECT REPEAT('b',10);
+---------------+
| REPEAT('a',1) |
+---------------+
| a             |
| b             |
+---------------+

That limitation has been removed as of MySQL 4.1.1:

mysql> SELECT REPEAT('a',1) UNION SELECT REPEAT('b',10);
+---------------+
| REPEAT('a',1) |
+---------------+
| a             |
| bbbbbbbbbb    |
+---------------+

13.1.8 Subquery Syntax

A subquery is a SELECT statement inside another statement.

Starting with MySQL 4.1, all subquery forms and operations that the SQL standard requires are supported, as well as a few features that are MySQL-specific.

With MySQL versions prior to 4.1, it was necessary to work around or avoid the use of subqueries. In many cases, subqueries can successfully be rewritten using joins and other methods. See section 13.1.8.11 Rewriting Subqueries as Joins for Earlier MySQL Versions.

Here is an example of a subquery:

SELECT * FROM t1 WHERE column1 = (SELECT column1 FROM t2);

In this example, SELECT * FROM t1 ... is the outer query (or outer statement), and (SELECT column1 FROM t2) is the subquery. We say that the subquery is nested in the outer query, and in fact it's possible to nest subqueries within other subqueries, to a great depth. A subquery must always appear within parentheses.

The main advantages of subqueries are:

Here is an example statement that shows the major points about subquery syntax as specified by the SQL standard and supported in MySQL:

DELETE FROM t1
WHERE s11 > ANY
 (SELECT COUNT(*) /* no hint */ FROM t2
 WHERE NOT EXISTS
  (SELECT * FROM t3
   WHERE ROW(5*t2.s1,77)=
    (SELECT 50,11*s1 FROM t4 UNION SELECT 50,77 FROM
     (SELECT * FROM t5) AS t5)));

A subquery can return a scalar (a single value), a single row, a single column, or a table (one or more rows of one or more columns). These are called scalar, column, row, and table subqueries. Subqueries that return a particular kind of result often can be used only in certain contexts, as described in the following sections.

There are few restrictions on the type of statements in which subqueries can be used:

One restriction is that currently you cannot modify a table and select from the same table in a subquery. This applies to statements such as DELETE, INSERT, REPLACE, and UPDATE.

13.1.8.1 The Subquery as Scalar Operand

In its simplest form, a subquery is a scalar subquery that returns a single value. A scalar subquery is a simple operand, and you can use it wherever a single column value or literal is legal, and you can expect it to have those characteristics that all operands have: a data type, a length, an indication whether it can be NULL, and so on. For example:

CREATE TABLE t1 (s1 INT, s2 CHAR(5) NOT NULL);
INSERT INTO t1 VALUES(100, 'abcde');
SELECT (SELECT s2 FROM t1);

The subquery in this SELECT returns a single value ('abcde') that has a data type of CHAR, a length of 5, a character set and collation equal to the defaults in effect at CREATE TABLE time, and an indication that the value in the column can be NULL. In fact, almost all subqueries can be NULL. If the table used in the example were empty, the value of the subquery would be NULL.

When you see examples in the following sections that contain the rather spartan construct (SELECT column1 FROM t1), imagine that your own code will contain much more diverse and complex constructions.

For example, suppose that we make two tables:

CREATE TABLE t1 (s1 INT);
INSERT INTO t1 VALUES (1);
CREATE TABLE t2 (s1 INT);
INSERT INTO t2 VALUES (2);

Then perform a SELECT:

SELECT (SELECT s1 FROM t2) FROM t1;

The result will be 2 because there is a row in t2 containing a column s1 that has a value of 2.

A scalar subquery can be part of an expression. Don't forget the parentheses, even if the subquery is an operand that provides an argument for a function. For example:

SELECT UPPER((SELECT s1 FROM t1)) FROM t2;

13.1.8.2 Comparisons Using Subqueries

The most common use of a subquery is in the form:

non_subquery_operand comparison_operator (subquery)

Where comparison_operator is one of these operators:

=  >  <  >=  <=  <>

For example:

... 'a' = (SELECT column1 FROM t1)

At one time the only legal place for a subquery was on the right side of a comparison, and you might still find some old DBMSs that insist on this.

Here is an example of a common-form subquery comparison that you cannot do with a join. It finds all the values in table t1 that are equal to a maximum value in table t2:

SELECT column1 FROM t1
       WHERE column1 = (SELECT MAX(column2) FROM t2);

Here is another example, which again is impossible with a join because it involves aggregating for one of the tables. It finds all rows in table t1 containing a value that occurs twice in a given column:

SELECT * FROM t1 AS t
    WHERE 2 = (SELECT COUNT(*) FROM t1 WHERE t1.id = t.id);

For a comparison performed with one of these comparison operators, the subquery must return a scalar, with the exception that = can be used with row subqueries. See section 13.1.8.5 Row Subqueries.

13.1.8.3 Subqueries with ANY, IN, and SOME

Syntax:

operand comparison_operator ANY (subquery)
operand IN (subquery)
operand comparison_operator SOME (subquery)

The ANY keyword, which must follow a comparison operator, means ``return TRUE if the comparison is TRUE for ANY of the values in the column that the subquery returns.'' For example:

SELECT s1 FROM t1 WHERE s1 > ANY (SELECT s1 FROM t2);

Suppose that there is a row in table t1 containing (10). The expression is TRUE if table t2 contains (21,14,7) because there is a value 7 in t2 that is less than 10. The expression is FALSE if table t2 contains (20,10), or if table t2 is empty. The expression is UNKNOWN if table t2 contains (NULL,NULL,NULL).

The word IN is an alias for = ANY. Thus these two statements are the same:

SELECT s1 FROM t1 WHERE s1 = ANY (SELECT s1 FROM t2);
SELECT s1 FROM t1 WHERE s1 IN    (SELECT s1 FROM t2);

However, NOT IN is not an alias for <> ANY, but for <> ALL. See section 13.1.8.4 Subqueries with ALL.

The word SOME is an alias for ANY. Thus these two statements are the same:

SELECT s1 FROM t1 WHERE s1 <> ANY  (SELECT s1 FROM t2);
SELECT s1 FROM t1 WHERE s1 <> SOME (SELECT s1 FROM t2);

Use of the word SOME is rare, but this example shows why it might be useful. To most people's ears, the English phrase ``a is not equal to any b'' means ``there is no b which is equal to a,'' but that isn't what is meant by the SQL syntax. Using <> SOME instead helps ensure that everyone understands the true meaning of the query.

13.1.8.4 Subqueries with ALL

Syntax:

operand comparison_operator ALL (subquery)

The word ALL, which must follow a comparison operator, means ``return TRUE if the comparison is TRUE for ALL of the values in the column that the subquery returns.'' For example:

SELECT s1 FROM t1 WHERE s1 > ALL (SELECT s1 FROM t2);

Suppose that there is a row in table t1 containing (10). The expression is TRUE if table t2 contains (-5,0,+5) because 10 is greater than all three values in t2. The expression is FALSE if table t2 contains (12,6,NULL,-100) because there is a single value 12 in table t2 that is greater than 10. The expression is UNKNOWN if table t2 contains (0,NULL,1).

Finally, if table t2 is empty, the result is TRUE. You might think the result should be UNKNOWN, but sorry, it's TRUE. So, rather oddly, the following statement is TRUE when table t2 is empty:

SELECT * FROM t1 WHERE 1 > ALL (SELECT s1 FROM t2);

But this statement is UNKNOWN when table t2 is empty:

SELECT * FROM t1 WHERE 1 > (SELECT s1 FROM t2);

In addition, the following statement is UNKNOWN when table t2 is empty:

SELECT * FROM t1 WHERE 1 > ALL (SELECT MAX(s1) FROM t2);

In general, tables with NULL values and empty tables are edge cases. When writing subquery code, always consider whether you have taken those two possibilities into account.

NOT IN is an alias for <> ALL. Thus these two statements are the same:

SELECT s1 FROM t1 WHERE s1 <> ALL (SELECT s1 FROM t2);
SELECT s1 FROM t1 WHERE s1 NOT IN (SELECT s1 FROM t2);

13.1.8.5 Row Subqueries

The discussion to this point has been of scalar or column subqueries, that is, subqueries that return a single value or a column of values. A row subquery is a subquery variant that returns a single row and can thus return more than one column value. Here are two examples:

SELECT * FROM t1 WHERE (1,2) = (SELECT column1, column2 FROM t2);
SELECT * FROM t1 WHERE ROW(1,2) = (SELECT column1, column2 FROM t2);

The queries here are both TRUE if table t2 has a row where column1 = 1 and column2 = 2.

The expressions (1,2) and ROW(1,2) are sometimes called row constructors. The two are equivalent. They are legal in other contexts, too. For example, the following two statements are semantically equivalent (although currently only the second one can be optimized):

SELECT * FROM t1 WHERE (column1,column2) = (1,1);
SELECT * FROM t1 WHERE column1 = 1 AND column2 = 1;

The normal use of row constructors, though, is for comparisons with subqueries that return two or more columns. For example, the following query answers the request, ``find all rows in table t1 that also exist in table t2'':

SELECT column1,column2,column3
       FROM t1
       WHERE (column1,column2,column3) IN
             (SELECT column1,column2,column3 FROM t2);

13.1.8.6 EXISTS and NOT EXISTS

If a subquery returns any rows at all, then EXISTS subquery is TRUE, and NOT EXISTS subquery is FALSE. For example:

SELECT column1 FROM t1 WHERE EXISTS (SELECT * FROM t2);

Traditionally, an EXISTS subquery starts with SELECT *, but it could begin with SELECT 5 or SELECT column1 or anything at all. MySQL ignores the SELECT list in such a subquery, so it doesn't matter.

For the preceding example, if t2 contains any rows, even rows with nothing but NULL values, then the EXISTS condition is TRUE. This is actually an unlikely example, since almost always a [NOT] EXISTS subquery will contain correlations. Here are some more realistic examples:

The last example is a double-nested NOT EXISTS query. That is, it has a NOT EXISTS clause within a NOT EXISTS clause. Formally, it answers the question ``does a city exist with a store that is not in Stores?'' But it's easier to say that a nested NOT EXISTS answers the question ``is x TRUE for all y?''

13.1.8.7 Correlated Subqueries

A correlated subquery is a subquery that contains a reference to a table that also appears in the outer query. For example:

SELECT * FROM t1 WHERE column1 = ANY
       (SELECT column1 FROM t2 WHERE t2.column2 = t1.column2);

Notice that the subquery contains a reference to a column of t1, even though the subquery's FROM clause doesn't mention a table t1. So, MySQL looks outside the subquery, and finds t1 in the outer query.

Suppose that table t1 contains a row where column1 = 5 and column2 = 6; meanwhile, table t2 contains a row where column1 = 5 and column2 = 7. The simple expression ... WHERE column1 = ANY (SELECT column1 FROM t2) would be TRUE, but in this example, the WHERE clause within the subquery is FALSE (because (5,6) is not equal to (5,7)), so the subquery as a whole is FALSE.

Scoping rule: MySQL evaluates from inside to outside. For example:

SELECT column1 FROM t1 AS x
  WHERE x.column1 = (SELECT column1 FROM t2 AS x
    WHERE x.column1 = (SELECT column1 FROM t3
      WHERE x.column2 = t3.column1));

In this statement, x.column2 must be a column in table t2 because SELECT column1 FROM t2 AS x ... renames t2. It is not a column in table t1 because SELECT column1 FROM t1 ... is an outer query that is farther out.

For subqueries in HAVING or ORDER BY clauses, MySQL also looks for column names in the outer select list.

For certain cases, a correlated subquery is optimized. For example:

val IN (SELECT key_val FROM tbl_name WHERE correlated_condition)

Otherwise, they are inefficient and likely to be slow. Rewriting the query as a join might improve performance.

13.1.8.8 Subqueries in the FROM clause

Subqueries are legal in a SELECT statement's FROM clause. The syntax that you'll actually see is:

SELECT ... FROM (subquery) AS name ...

The AS name clause is mandatory, because every table in a FROM clause must have a name. Any columns in the subquery select list must have unique names. You can find this syntax described elsewhere in this manual, where the term used is ``derived tables.''

For illustration, assume that you have this table:

CREATE TABLE t1 (s1 INT, s2 CHAR(5), s3 FLOAT);

Here's how to use a subquery in the FROM clause, using the example table:

INSERT INTO t1 VALUES (1,'1',1.0);
INSERT INTO t1 VALUES (2,'2',2.0);
SELECT sb1,sb2,sb3
       FROM (SELECT s1 AS sb1, s2 AS sb2, s3*2 AS sb3 FROM t1) AS sb
       WHERE sb1 > 1;

Result: 2, '2', 4.0.

Here's another example: Suppose that you want to know the average of a set of sums for a grouped table. This won't work:

SELECT AVG(SUM(column1)) FROM t1 GROUP BY column1;

But this query will provide the desired information:

SELECT AVG(sum_column1)
       FROM (SELECT SUM(column1) AS sum_column1
             FROM t1 GROUP BY column1) AS t1;

Notice that the column name used within the subquery (sum_column1) is recognized in the outer query.

Subqueries in the FROM clause can return a scalar, column, row, or table. At the moment, subqueries in the FROM clause cannot be correlated subqueries.

Subqueries in the FROM clause will be executed even for the EXPLAIN statement (that is, derived temporary tables will be built). This occurs because upper level queries need information about all tables during optimization phase.

13.1.8.9 Subquery Errors

There are some new error returns that apply only to subqueries. This section groups them together because reviewing them will help remind you of some points.

For transactional storage engines, a failure of a subquery causes the entire statement to fail. For non-transactional storage engines, data modifications made before the error was encountered are preserved.

13.1.8.10 Optimizing Subqueries

Development is ongoing, so no optimization tip is reliable for the long term. Some interesting tricks that you might want to play with are:

These tricks might cause programs to go faster or slower. Using MySQL facilities like the BENCHMARK() function, you can get an idea about what helps in your own situation. Don't worry too much about transforming to joins except for compatibility with older versions of MySQL before 4.1 that do not support subqueries.

Some optimizations that MySQL itself makes are:

There is a chapter titled ``How MySQL Transforms Subqueries'' in the MySQL Internals Manual. You can obtain this document by downloading the MySQL source package and looking for a file named `internals.texi' in the `Docs' directory.

13.1.8.11 Rewriting Subqueries as Joins for Earlier MySQL Versions

Before MySQL 4.1, only nested queries of the form INSERT ... SELECT ... and REPLACE ... SELECT ... are supported. The IN() construct can be used in other contexts to test membership in a set of values.

It is often possible to rewrite a query without a subquery:

SELECT * FROM t1 WHERE id IN (SELECT id FROM t2);

This can be rewritten as:

SELECT DISTINCT t1.* FROM t1, t2 WHERE t1.id=t2.id;

The queries:

SELECT * FROM t1 WHERE id NOT IN (SELECT id FROM t2);
SELECT * FROM t1 WHERE NOT EXISTS (SELECT id FROM t2 WHERE t1.id=t2.id);

Can be rewritten as:

SELECT table1.* FROM table1 LEFT JOIN table2 ON table1.id=table2.id
                                       WHERE table2.id IS NULL;

A LEFT [OUTER] JOIN can be faster than an equivalent subquery because the server might be able to optimize it better--a fact that is not specific to MySQL Server alone. Prior to SQL-92, outer joins did not exist, so subqueries were the only way to do certain things in those bygone days. Today, MySQL Server and many other modern database systems offer a whole range of outer join types.

For more complicated subqueries, you can often create temporary tables to hold the subquery. In some cases, however, this option will not work. The most frequently encountered of these cases arises with DELETE statements, for which standard SQL does not support joins (except in subqueries). For this situation, there are three options available:

MySQL Server 4.0 supports multiple-table DELETE statements that can be used to efficiently delete rows based on information from one table or even from many tables at the same time. Multiple-table UPDATE statements are also supported as of MySQL 4.0.

13.1.9 TRUNCATE Syntax

TRUNCATE TABLE tbl_name

TRUNCATE TABLE empties a table completely. Logically, this is equivalent to a DELETE statement that deletes all rows, but there are practical differences under some circumstances.

For InnoDB, TRUNCATE TABLE is mapped to DELETE, so there is no difference. For other storage engines, TRUNCATE TABLE differs from DELETE FROM ... in the following ways from MySQL 4.0 and up:

In MySQL 3.23, TRUNCATE TABLE is mapped to COMMIT; DELETE FROM tbl_name, so it behaves like DELETE. See section 13.1.1 DELETE Syntax.

TRUNCATE TABLE is an Oracle SQL extension. This statement was added in MySQL 3.23.28, although from 3.23.28 to 3.23.32, the keyword TABLE must be omitted.

13.1.10 UPDATE Syntax<