---
title: "INSERT (Transact-SQL)"
description: "INSERT (Transact-SQL)"
author: WilliamDAssafMSFT
ms.author: wiassaf
ms.date: 06/10/2021
ms.service: sql
ms.subservice: t-sql
ms.topic: reference
f1_keywords:
helpviewer_keywords:
dev_langs:
  - "TSQL"
monikerRange: ">=aps-pdw-2016||=azuresqldb-current||=azure-sqldw-latest||>=sql-server-2016||>=sql-server-linux-2017||=azuresqldb-mi-current"
---
# INSERT (Transact-SQL)
[!INCLUDE sql-asdb-asdbmi-asa-pdw]
Adds one or more rows to a table or a view in [!INCLUDEssNoVersion]. For examples, see Examples.
:::image type="icon" source="../../includes/media/topic-link-icon.svg" border="false"::: Transact-SQL syntax conventions
## Syntax

```syntaxsql
-- Syntax for SQL Server and Azure SQL Database

[ WITH <common_table_expression> [ ,...n ] ]
INSERT
{
        [ TOP ( expression ) [ PERCENT ] ]
        [ INTO ]
        { <object> | rowset_function_limited
          [ WITH ( <Table_Hint_Limited> [ ...n ] ) ]
        }
    {
        [ ( column_list ) ]
        [ <OUTPUT Clause> ]
        { VALUES ( { DEFAULT | NULL | expression } [ ,...n ] ) [ ,...n ]
        | derived_table
        | execute_statement
        | <dml_table_source>
        | DEFAULT VALUES
        }
    }
}
[;]

<object> ::=
{
    [ server_name . database_name . schema_name .
      | database_name .[ schema_name ] .
      | schema_name .
    ]
    table_or_view_name
}

<dml_table_source> ::=
    SELECT <select_list>
    FROM ( <dml_statement_with_output_clause> )
    [AS] table_alias [ ( column_alias [ ,...n ] ) ]
    [ WHERE <search_condition> ]
    [ OPTION ( <query_hint> [ ,...n ] ) ]
```
```syntaxsql
-- External tool only syntax

INSERT
{
    [ BULK ]
    { database_name.schema_name.table_or_view_name | schema_name.table_or_view_name | table_or_view_name }
        ( <column_definition> )
        [ WITH (
            [ [ , ] CHECK_CONSTRAINTS ]
            [ [ , ] FIRE_TRIGGERS ]
            [ [ , ] KEEP_NULLS ]
            [ [ , ] KILOBYTES_PER_BATCH = kilobytes_per_batch ]
            [ [ , ] ROWS_PER_BATCH = rows_per_batch ]
            [ [ , ] ORDER ( { column [ ASC | DESC ] } [ ,...n ] ) ]
            [ [ , ] TABLOCK ]
        ) ]
}
[; ]

<column_definition> ::=
    column_name <data_type>
    [ COLLATE collation_name ]
    [ NULL | NOT NULL ]

<data type> ::=
    [ type_schema_name . ] type_name
    [ ( precision [ , scale ] | max ) ]
```
```syntaxsql
-- Syntax for Azure Synapse Analytics and Parallel Data Warehouse

INSERT [ INTO ] { database_name.schema_name.table_name | schema_name.table_name | table_name }
    [ ( column_name [ ,...n ] ) ]
{
      VALUES ( { NULL | expression } )
    | SELECT <select_criteria>
}
[ OPTION ( <query_option> [ ,...n ] ) ]
[;]
```
[!INCLUDEsql-server-tsql-previous-offline-documentation]
## Arguments
WITH <common_table_expression>
Specifies the temporary named result set, also known as common table expression, defined within the scope of the INSERT statement. The result set is derived from a SELECT statement. For more information, see WITH common_table_expression (Transact-SQL).
TOP (expression) [ PERCENT ]
Specifies the number or percent of random rows that will be inserted. expression can be either a number or a percent of the rows. For more information, see TOP (Transact-SQL).
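As a sketch (the table and column names here are hypothetical), TOP caps how many of the source rows are inserted:

```sql
-- Inserts at most 10 rows from dbo.StagingOrders (hypothetical tables).
-- Without an ORDER BY in a subselect, which 10 rows are chosen is not guaranteed.
INSERT TOP (10) INTO dbo.Orders (OrderID, Amount)
SELECT OrderID, Amount
FROM dbo.StagingOrders;
```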
INTO
Is an optional keyword that can be used between INSERT and the target table.
server_name
Applies to: [!INCLUDEsql2008-md] and later.
Is the name of the linked server on which the table or view is located. server_name can be specified as a linked server name, or by using the OPENDATASOURCE function.
When server_name is specified as a linked server, database_name and schema_name are required. When server_name is specified with OPENDATASOURCE, database_name and schema_name may not apply to all data sources and are subject to the capabilities of the OLE DB provider that accesses the remote object.
database_name
Applies to: [!INCLUDEsql2008-md] and later.
Is the name of the database.
schema_name
Is the name of the schema to which the table or view belongs.
table_or_view_name
Is the name of the table or view that is to receive the data.
A table variable, within its scope, can be used as a table source in an INSERT statement.
The view referenced by table_or_view_name must be updatable and reference exactly one base table in the FROM clause of the view. For example, an INSERT into a multi-table view must use a column_list that references only columns from one base table. For more information about updatable views, see CREATE VIEW (Transact-SQL).
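A minimal sketch of a table variable used as an INSERT target within its scope (the names are hypothetical):

```sql
DECLARE @Totals table (CustomerID int, Total money);

-- The table variable is a valid INSERT target inside this batch.
INSERT INTO @Totals (CustomerID, Total)
VALUES (1, 19.99), (2, 42.00);
```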
rowset_function_limited
Applies to: [!INCLUDEsql2008-md] and later.
Is either the OPENQUERY or OPENROWSET function. Use of these functions is subject to the capabilities of the OLE DB provider that accesses the remote object.
WITH ( <table_hint_limited> [ ...n ] )
Specifies one or more table hints that are allowed for a target table. The WITH keyword and the parentheses are required.
READPAST, NOLOCK, and READUNCOMMITTED are not allowed. For more information about table hints, see Table Hints (Transact-SQL).
> [!IMPORTANT]
> The ability to specify the HOLDLOCK, SERIALIZABLE, READCOMMITTED, REPEATABLEREAD, or UPDLOCK hints on tables that are targets of INSERT statements will be removed in a future version of [!INCLUDEssNoVersion]. These hints do not affect the performance of INSERT statements. Avoid using them in new development work, and plan to modify applications that currently use them.
Specifying the TABLOCK hint on a table that is the target of an INSERT statement has the same effect as specifying the TABLOCKX hint. An exclusive lock is taken on the table.
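For illustration (the table is hypothetical), a hinted INSERT target looks like this; TABLOCK here takes an exclusive table lock, as described above:

```sql
-- TABLOCK on an INSERT target behaves like TABLOCKX: an exclusive (X) table lock.
INSERT INTO dbo.AuditLog WITH (TABLOCK) (EventText)
VALUES (N'nightly load started');
```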
(column_list)
Is a list of one or more columns in which to insert data. column_list must be enclosed in parentheses and delimited by commas.
If a column is not in column_list, the [!INCLUDEssDE] must be able to provide a value based on the definition of the column; otherwise, the row cannot be loaded. The [!INCLUDEssDE] automatically provides a value for the column if the column:
- Has an IDENTITY property. The next incremental identity value is used.
- Has a default. The default value for the column is used.
- Has a timestamp data type. The current timestamp value is used.
- Is nullable. A null value is used.
- Is a computed column. The calculated value is used.
column_list must be used when explicit values are inserted into an identity column, and the SET IDENTITY_INSERT option must be ON for the table.
OUTPUT Clause
Returns inserted rows as part of the insert operation. The results can be returned to the processing application or inserted into a table or table variable for further processing.
The OUTPUT clause is not supported in DML statements that reference local partitioned views, distributed partitioned views, or remote tables, or INSERT statements that contain an execute_statement. The OUTPUT INTO clause is not supported in INSERT statements that contain a <dml_table_source> clause. For more information about the arguments and behavior of this clause, see OUTPUT Clause (Transact-SQL).
VALUES
Introduces the list or lists of data values to be inserted. There must be one data value for each column in column_list, if specified, or in the table. The value list must be enclosed in parentheses.
If the values in the VALUES list are not in the same order as the columns in the table, or do not have a value for each column in the table, column_list must be used to explicitly specify the column that stores each incoming value.
You can use the [!INCLUDEtsql] row constructor (also called a table value constructor) to specify multiple rows in a single INSERT statement. The row constructor consists of a single VALUES clause with multiple value lists enclosed in parentheses and separated by a comma. For more information, see Table Value Constructor (Transact-SQL).
> [!NOTE]
> The table value constructor is not supported in Azure Synapse Analytics. Instead, execute subsequent INSERT statements to insert multiple rows. In Azure Synapse Analytics, insert values can only be constant literal values or variable references. To insert a non-literal value, set a variable to the non-constant value and insert the variable.
DEFAULT
Forces the [!INCLUDEssDE] to load the default value defined for a column. If a default does not exist for the column and the column allows null values, NULL is inserted. For a column defined with the timestamp data type, the next timestamp value is inserted. DEFAULT is not valid for an identity column.
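For example (hypothetical table: `Color` has a default constraint and `Notes` is nullable), DEFAULT and NULL can be mixed with explicit values in a VALUES list:

```sql
-- Color receives its default value; Notes receives NULL explicitly.
INSERT INTO dbo.Widgets (Color, Notes)
VALUES (DEFAULT, NULL);
```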
expression
Is a constant, a variable, or an expression. The expression cannot contain an EXECUTE statement.
When referencing the Unicode character data types nchar, nvarchar, and ntext, expression should be prefixed with the capital letter 'N'. If 'N' is not specified, [!INCLUDEssNoVersion] converts the string to the code page that corresponds to the default collation of the database or column. Any characters not found in this code page are lost.
derived_table
Is any valid SELECT statement that returns rows of data to be loaded into the table. The SELECT statement cannot contain a common table expression (CTE).
execute_statement
Is any valid EXECUTE statement that returns data with SELECT or READTEXT statements. For more information, see EXECUTE (Transact-SQL).
The RESULT SETS options of the EXECUTE statement cannot be specified in an INSERT…EXEC statement.
If execute_statement is used with INSERT, each result set must be compatible with the columns in the table or in column_list.
execute_statement can be used to execute stored procedures on the same server or a remote server. The procedure in the remote server is executed, and the result sets are returned to the local server and loaded into the table in the local server. In a distributed transaction, execute_statement cannot be issued against a loopback linked server when the connection has multiple active result sets (MARS) enabled.
If execute_statement returns data with the READTEXT statement, each READTEXT statement can return a maximum of 1 MB (1024 KB) of data. execute_statement can also be used with extended procedures. execute_statement inserts the data returned by the main thread of the extended procedure; however, output from threads other than the main thread are not inserted.
You cannot specify a table-valued parameter as the target of an INSERT EXEC statement; however, it can be specified as a source in the INSERT EXEC string or stored-procedure. For more information, see Use Table-Valued Parameters (Database Engine).
<dml_table_source>
Specifies that the rows inserted into the target table are those returned by the OUTPUT clause of an INSERT, UPDATE, DELETE, or MERGE statement, optionally filtered by a WHERE clause. If <dml_table_source> is specified, the target of the outer INSERT statement must meet the following restrictions:
- It must be a base table, not a view.
- It cannot be a remote table.
- It cannot have any triggers defined on it.
- It cannot participate in any primary key-foreign key relationships.
- It cannot participate in merge replication or updatable subscriptions for transactional replication.
The compatibility level of the database must be set to 100 or higher. For more information, see OUTPUT Clause (Transact-SQL).
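As a sketch (the tables are hypothetical), the outer INSERT can capture the rows returned by an inner DML statement's OUTPUT clause:

```sql
-- Archive rows deleted from dbo.Orders in one statement.
-- dbo.OrdersArchive must be a trigger-free base table per the restrictions above.
INSERT INTO dbo.OrdersArchive (OrderID, Amount)
SELECT OrderID, Amount
FROM (
    DELETE FROM dbo.Orders
    OUTPUT DELETED.OrderID, DELETED.Amount
    WHERE OrderDate < '20200101'
) AS d (OrderID, Amount);
```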
<select_list>
Is a comma-separated list specifying which columns returned by the OUTPUT clause to insert. The columns in <select_list> must be compatible with the columns into which values are being inserted. <select_list> cannot reference aggregate functions or TEXTPTR.
> [!NOTE]
> Any variables listed in the SELECT list refer to their original values, regardless of any changes made to them in <dml_statement_with_output_clause>.
<dml_statement_with_output_clause>
Is a valid INSERT, UPDATE, DELETE, or MERGE statement that returns affected rows in an OUTPUT clause. The statement cannot contain a WITH clause, and cannot target remote tables or partitioned views. If UPDATE or DELETE is specified, it cannot be a cursor-based UPDATE or DELETE. Source rows cannot be referenced as nested DML statements.
WHERE <search_condition>
Is any WHERE clause containing a valid <search_condition> that filters the rows returned by <dml_statement_with_output_clause>. For more information, see Search Condition (Transact-SQL). When used in this context, <search_condition> cannot contain subqueries, scalar user-defined functions that perform data access, aggregate functions, TEXTPTR, or full-text search predicates.
DEFAULT VALUES
Applies to: [!INCLUDEsql2008-md] and later.
Forces the new row to contain the default values defined for each column.
BULK
Applies to: [!INCLUDEsql2008-md] and later.
Used by external tools to upload a binary data stream. This option is not intended for use with tools such as [!INCLUDEssManStudioFull], SQLCMD, OSQL, or data access application programming interfaces such as [!INCLUDEssNoVersion] Native Client.
FIRE_TRIGGERS
Applies to: [!INCLUDEsql2008-md] and later.
Specifies that any insert triggers defined on the destination table execute during the binary data stream upload operation. For more information, see BULK INSERT (Transact-SQL).
CHECK_CONSTRAINTS
Applies to: [!INCLUDEsql2008-md] and later.
Specifies that all constraints on the target table or view must be checked during the binary data stream upload operation. For more information, see BULK INSERT (Transact-SQL).
KEEP_NULLS
Applies to: [!INCLUDEsql2008-md] and later.
Specifies that empty columns should retain a null value during the binary data stream upload operation. For more information, see Keep Nulls or Use Default Values During Bulk Import (SQL Server).
KILOBYTES_PER_BATCH = kilobytes_per_batch
Specifies the approximate number of kilobytes (KB) of data per batch as kilobytes_per_batch. For more information, see BULK INSERT (Transact-SQL).
ROWS_PER_BATCH = rows_per_batch
Applies to: [!INCLUDEsql2008-md] and later.
Indicates the approximate number of rows of data in the binary data stream. For more information, see BULK INSERT (Transact-SQL).
> [!NOTE]
> A syntax error is raised if a column list is not provided.
## Remarks
For information specific to inserting data into SQL graph tables, see INSERT (SQL Graph).
### Best Practices
Use the @@ROWCOUNT function to return the number of inserted rows to the client application. For more information, see @@ROWCOUNT (Transact-SQL).
### Best Practices for Bulk Importing Data
#### Using INSERT INTO…SELECT to Bulk Import data with minimal logging and parallelism

You can use `INSERT INTO <target_table> SELECT <columns> FROM <source_table>` to efficiently transfer a large number of rows from one table, such as a staging table, to another table with minimal logging. Minimal logging can improve the performance of the statement and reduce the possibility of the operation filling the available transaction log space during the transaction.
Minimal logging for this statement has the following requirements:
- The recovery model of the database is set to simple or bulk-logged.
- The target table is an empty or non-empty heap.
- The target table is not used in replication.
- The TABLOCK hint is specified for the target table.
Rows that are inserted into a heap as the result of an insert action in a MERGE statement may also be minimally logged.
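Under the requirements above, a minimally logged load might look like this sketch (staging and target table names are hypothetical):

```sql
-- Heap target + TABLOCK + simple or bulk-logged recovery model
-- makes this INSERT...SELECT eligible for minimal logging.
INSERT INTO dbo.Sales_Target WITH (TABLOCK)
SELECT *
FROM dbo.Sales_Staging;
```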
Unlike the BULK INSERT statement, which holds a less restrictive Bulk Update (BU) lock, `INSERT INTO ... SELECT` with the TABLOCK hint holds an exclusive (X) lock on the table. This means that you cannot insert rows using multiple insert operations executing simultaneously.

However, starting with [!INCLUDEsssql16-md] and database compatibility level 130, a single `INSERT INTO ... SELECT` statement can be executed in parallel when inserting into heaps or clustered columnstore indexes (CCI). Parallel inserts are possible when using the TABLOCK hint.
Parallelism for the statement above has the following requirements, which are similar to the requirements for minimal logging:
- The target table is an empty or non-empty heap.
- The target table has a clustered columnstore index (CCI) but no non-clustered indexes.
- The target table does not have an identity column with IDENTITY_INSERT set to OFF.
- The TABLOCK hint is specified for the target table.
For scenarios where requirements for minimal logging and parallel insert are met, both improvements will work together to ensure maximum throughput of your data load operations.
> [!NOTE]
> Inserts into local temporary tables (identified by the # prefix) and global temporary tables (identified by the ## prefix) are also enabled for parallelism using the TABLOCK hint.
#### Using OPENROWSET and BULK to Bulk Import data
The OPENROWSET function can accept the following table hints, which provide bulk-load optimizations with the INSERT statement:
- The TABLOCK hint can minimize the number of log records for the insert operation. The recovery model of the database must be set to simple or bulk-logged and the target table cannot be used in replication. For more information, see Prerequisites for Minimal Logging in Bulk Import.
- The TABLOCK hint can enable parallel insert operations. The target table is a heap or clustered columnstore index (CCI) with no non-clustered indexes, and the target table cannot have an identity column specified.
- The IGNORE_CONSTRAINTS hint can temporarily disable FOREIGN KEY and CHECK constraint checking.
- The IGNORE_TRIGGERS hint can temporarily disable trigger execution.
- The KEEPDEFAULTS hint allows the insertion of a table column's default value, if any, instead of NULL when the data record lacks a value for the column.
- The KEEPIDENTITY hint allows the identity values in the imported data file to be used for the identity column in the target table.
These optimizations are similar to those available with the BULK INSERT command. For more information, see Table Hints (Transact-SQL).
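A hedged illustration of these hints together (the file path, format file, and table names are hypothetical placeholders):

```sql
-- Bulk-load a data file through OPENROWSET(BULK...), keeping the source
-- identity values and skipping trigger execution for the duration of the load.
INSERT INTO dbo.Product WITH (KEEPIDENTITY, IGNORE_TRIGGERS)
SELECT *
FROM OPENROWSET(
    BULK 'C:\data\products.dat',
    FORMATFILE = 'C:\data\products.fmt') AS src;
```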
### Data Types
When you insert rows, consider the following data type behavior:
- If a value is being loaded into columns with a char, varchar, or varbinary data type, the padding or truncation of trailing blanks (spaces for char and varchar, zeros for varbinary) is determined by the SET ANSI_PADDING setting defined for the column when the table was created. For more information, see SET ANSI_PADDING (Transact-SQL).

  The following table shows the default operation for SET ANSI_PADDING OFF.

  Data type | Default operation
  ---|---
  char | Pad value with spaces to the defined width of column.
  varchar | Remove trailing spaces to the last non-space character or to a single-space character for strings made up of only spaces.
  varbinary | Remove trailing zeros.

- If an empty string (' ') is loaded into a column with a varchar or text data type, the default operation is to load a zero-length string.

- Inserting a null value into a text or image column does not create a valid text pointer, nor does it preallocate an 8-KB text page.

- Columns created with the uniqueidentifier data type store specially formatted 16-byte binary values. Unlike with identity columns, the [!INCLUDEssDE] does not automatically generate values for columns with the uniqueidentifier data type. During an insert operation, variables with a data type of uniqueidentifier and string constants in the form xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx (36 characters including hyphens, where x is a hexadecimal digit in the range 0-9 or a-f) can be used for uniqueidentifier columns. For example, 6F9619FF-8B86-D011-B42D-00C04FC964FF is a valid value for a uniqueidentifier variable or column. Use the NEWID() function to obtain a globally unique ID (GUID).
### Inserting Values into User-Defined Type Columns
You can insert values in user-defined type columns by:
- Supplying a value of the user-defined type.

- Supplying a value in a [!INCLUDEssNoVersion] system data type, as long as the user-defined type supports implicit or explicit conversion from that type. The following example shows how to insert a value in a column of user-defined type `Point`, by explicitly converting from a string.

  ```sql
  INSERT INTO Cities (Location)
  VALUES ( CONVERT(Point, '12.3:46.2') );
  ```

  A binary value can also be supplied without performing explicit conversion, because all user-defined types are implicitly convertible from binary.

- Calling a user-defined function that returns a value of the user-defined type. The following example uses a user-defined function `CreateNewPoint()` to create a new value of user-defined type `Point` and insert the value into the `Cities` table.

  ```sql
  INSERT INTO Cities (Location)
  VALUES ( dbo.CreateNewPoint(x, y) );
  ```
### Error Handling
You can implement error handling for the INSERT statement by specifying the statement in a TRY…CATCH construct.
If an INSERT statement violates a constraint or rule, or if it has a value incompatible with the data type of the column, the statement fails and an error message is returned.
If INSERT is loading multiple rows with SELECT or EXECUTE, any violation of a rule or constraint that occurs from the values being loaded causes the statement to be stopped, and no rows are loaded.
When an INSERT statement encounters an arithmetic error (overflow, divide by zero, or a domain error) occurring during expression evaluation, the [!INCLUDEssDE] handles these errors as if SET ARITHABORT is set to ON. The batch is stopped, and an error message is returned. During expression evaluation when SET ARITHABORT and SET ANSI_WARNINGS are OFF, if an INSERT, DELETE or UPDATE statement encounters an arithmetic error, overflow, divide-by-zero, or a domain error, [!INCLUDEssNoVersion] inserts or updates a NULL value. If the target column is not nullable, the insert or update action fails and the user receives an error.
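A minimal sketch of wrapping INSERT in TRY…CATCH (the table and value are hypothetical):

```sql
BEGIN TRY
    -- Fails and transfers control to CATCH if the value
    -- violates a constraint on the hypothetical table.
    INSERT INTO dbo.T1 (column_2) VALUES ('some value');
END TRY
BEGIN CATCH
    SELECT ERROR_NUMBER() AS ErrorNumber,
           ERROR_MESSAGE() AS ErrorMessage;
END CATCH;
```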
### Interoperability

When an INSTEAD OF trigger is defined on INSERT actions against a table or view, the trigger executes instead of the INSERT statement. For more information about INSTEAD OF triggers, see CREATE TRIGGER (Transact-SQL).
### Limitations and Restrictions
When you insert values into remote tables and not all values for all columns are specified, you must identify the columns to which the specified values are to be inserted.
When TOP is used with INSERT, the referenced rows are not arranged in any order, and the ORDER BY clause cannot be directly specified in these statements. If you need to use TOP to insert rows in a meaningful chronological order, you must use TOP together with an ORDER BY clause that is specified in a subselect statement. See the Examples section that follows in this topic.
INSERT queries that use SELECT with ORDER BY to populate rows guarantee how identity values are computed, but not the order in which the rows are inserted.
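For instance (hypothetical tables), the ordering must live in the subselect rather than on the INSERT itself:

```sql
-- TOP with ORDER BY in the subselect picks the 10 most recent staging rows;
-- the order in which those rows are physically inserted is still not guaranteed.
INSERT INTO dbo.RecentOrders (OrderID, OrderDate)
SELECT TOP (10) OrderID, OrderDate
FROM dbo.StagingOrders
ORDER BY OrderDate DESC;
```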
In Parallel Data Warehouse, the ORDER BY clause is invalid in VIEWS, CREATE TABLE AS SELECT, INSERT SELECT, inline functions, derived tables, subqueries and common table expressions, unless TOP is also specified.
### Logging Behavior

The INSERT statement is always fully logged except when using the OPENROWSET function with the BULK keyword or when using `INSERT INTO <target_table> SELECT <columns> FROM <source_table>`. These operations can be minimally logged. For more information, see the section "Best Practices for Bulk Importing Data" earlier in this topic.
### Security
During a linked server connection, the sending server provides a login name and password to connect to the receiving server on its behalf. For this connection to work, you must create a login mapping between the linked servers by using sp_addlinkedsrvlogin.
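For illustration, such a mapping might be created like this sketch (the server, logins, and password are hypothetical placeholders):

```sql
-- Map the local login LocalUser to RemoteUser on linked server RemoteSvr.
EXEC sp_addlinkedsrvlogin
    @rmtsrvname = N'RemoteSvr',
    @useself = 'FALSE',
    @locallogin = N'LocalUser',
    @rmtuser = N'RemoteUser',
    @rmtpassword = N'********';
```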
When you use OPENROWSET(BULK…), it is important to understand how [!INCLUDEssNoVersion] handles impersonation. For more information, see "Security Considerations" in Import Bulk Data by Using BULK INSERT or OPENROWSET(BULK…) (SQL Server).
### Permissions
INSERT permission is required on the target table.
INSERT permissions default to members of the `sysadmin` fixed server role, the `db_owner` and `db_datawriter` fixed database roles, and the table owner. Members of the `sysadmin`, `db_owner`, and `db_securityadmin` roles, and the table owner can transfer permissions to other users.

To execute INSERT with the OPENROWSET function BULK option, you must be a member of the `sysadmin` fixed server role or of the `bulkadmin` fixed server role.
## Examples
Category | Featured syntax elements |
---|---|
Basic syntax | INSERT • table value constructor |
Handling column values | IDENTITY • NEWID • default values • user-defined types |
Inserting data from other tables | INSERT…SELECT • INSERT…EXECUTE • WITH common table expression • TOP • OFFSET FETCH |
Specifying target objects other than standard tables | Views • table variables |
Inserting rows into a remote table | Linked server • OPENQUERY rowset function • OPENDATASOURCE rowset function |
Bulk loading data from tables or data files | INSERT…SELECT • OPENROWSET function |
Overriding the default behavior of the query optimizer by using hints | Table hints |
Capturing the results of the INSERT statement | OUTPUT clause |
### Basic Syntax
Examples in this section demonstrate the basic functionality of the INSERT statement using the minimum required syntax.
#### A. Inserting a single row of data

The following example inserts one row into the `Production.UnitMeasure` table in the [!INCLUDEssSampleDBnormal] database. The columns in this table are `UnitMeasureCode`, `Name`, and `ModifiedDate`. Because values for all columns are supplied and are listed in the same order as the columns in the table, the column names do not have to be specified in the column list.
```sql
INSERT INTO Production.UnitMeasure
VALUES (N'FT', N'Feet', '20080414');
```
#### B. Inserting multiple rows of data

The following example uses the table value constructor to insert three rows into the `Production.UnitMeasure` table in the [!INCLUDEssSampleDBnormal] database in a single INSERT statement. Because values for all columns are supplied and are listed in the same order as the columns in the table, the column names do not have to be specified in the column list.

> [!NOTE]
> The table value constructor is not supported in Azure Synapse Analytics.
```sql
INSERT INTO Production.UnitMeasure
VALUES (N'FT2', N'Square Feet ', '20080923')
    , (N'Y', N'Yards', '20080923')
    , (N'Y3', N'Cubic Yards', '20080923');
```
#### C. Inserting data that is not in the same order as the table columns

The following example uses a column list to explicitly specify the values that are inserted into each column. The column order in the `Production.UnitMeasure` table in the [!INCLUDEssSampleDBnormal] database is `UnitMeasureCode`, `Name`, `ModifiedDate`; however, the columns are not listed in that order in column_list.
```sql
INSERT INTO Production.UnitMeasure (Name, UnitMeasureCode, ModifiedDate)
VALUES (N'Square Yards', N'Y2', GETDATE());
```
### Handling Column Values

Examples in this section demonstrate methods of inserting values into columns that are defined with an IDENTITY property, DEFAULT value, or are defined with data types such as uniqueidentifier or user-defined type columns.
#### D. Inserting data into a table with columns that have default values

The following example shows inserting rows into a table with columns that automatically generate a value or have a default value. `Column_1` is a computed column that automatically generates a value by concatenating a string with the value inserted into `column_2`. `Column_2` is defined with a default constraint. If a value is not specified for this column, the default value is used. `Column_3` is defined with the rowversion data type, which automatically generates a unique, incrementing binary number. `Column_4` does not automatically generate a value. When a value for this column is not specified, NULL is inserted. The INSERT statements insert rows that contain values for some of the columns but not all. In the last INSERT statement, no columns are specified and only the default values are inserted by using the DEFAULT VALUES clause.
```sql
CREATE TABLE dbo.T1
(
    column_1 AS 'Computed column ' + column_2,
    column_2 varchar(30) CONSTRAINT default_name DEFAULT ('my column default'),
    column_3 rowversion,
    column_4 varchar(40) NULL
);
GO
INSERT INTO dbo.T1 (column_4) VALUES ('Explicit value');
INSERT INTO dbo.T1 (column_2, column_4) VALUES ('Explicit value', 'Explicit value');
INSERT INTO dbo.T1 (column_2) VALUES ('Explicit value');
INSERT INTO T1 DEFAULT VALUES;
GO
SELECT column_1, column_2, column_3, column_4
FROM dbo.T1;
GO
```
#### E. Inserting data into a table with an identity column

The following example shows different methods of inserting data into an identity column. The first two INSERT statements allow identity values to be generated for the new rows. The third INSERT statement overrides the IDENTITY property for the column with the SET IDENTITY_INSERT statement and inserts an explicit value into the identity column.
```sql
CREATE TABLE dbo.T1 ( column_1 int IDENTITY, column_2 VARCHAR(30));
GO
INSERT T1 VALUES ('Row #1');
INSERT T1 (column_2) VALUES ('Row #2');
GO
SET IDENTITY_INSERT T1 ON;
GO
INSERT INTO T1 (column_1, column_2) VALUES (-99, 'Explicit identity value');
GO
SELECT column_1, column_2 FROM T1;
GO
```
#### F. Inserting data into a uniqueidentifier column by using NEWID()

The following example uses the NEWID() function to obtain a GUID for `column_2`. Unlike for identity columns, the [!INCLUDEssDE] does not automatically generate values for columns with the uniqueidentifier data type, as shown by the second `INSERT` statement.
```sql
CREATE TABLE dbo.T1
(
    column_1 int IDENTITY,
    column_2 uniqueidentifier
);
GO
INSERT INTO dbo.T1 (column_2) VALUES (NEWID());
INSERT INTO T1 DEFAULT VALUES;
GO
SELECT column_1, column_2 FROM dbo.T1;
```
#### G. Inserting data into user-defined type columns

The following [!INCLUDEtsql] statements insert three rows into the `PointValue` column of the `Points` table. This column uses a CLR user-defined type (UDT). The `Point` data type consists of X and Y integer values that are exposed as properties of the UDT. You must use either the CAST or CONVERT function to cast the comma-delimited X and Y values to the `Point` type. The first two statements use the CONVERT function to convert a string value to the `Point` type, and the third statement uses the CAST function. For more information, see Manipulating UDT Data.
```sql
INSERT INTO dbo.Points (PointValue) VALUES (CONVERT(Point, '3,4'));
INSERT INTO dbo.Points (PointValue) VALUES (CONVERT(Point, '1,5'));
INSERT INTO dbo.Points (PointValue) VALUES (CAST ('1,99' AS Point));
```
### Inserting Data from Other Tables
Examples in this section demonstrate methods of inserting rows from one table into another table.
#### H. Using the SELECT and EXECUTE options to insert data from other tables

The following example shows how to insert data from one table into another table by using INSERT…SELECT or INSERT…EXECUTE. Each is based on a multi-table SELECT statement that includes an expression and a literal value in the column list.

The first INSERT statement uses a SELECT statement to derive the data from the source tables (`Employee`, `SalesPerson`, and `Person`) in the [!INCLUDEssSampleDBnormal] database and store the result set in the `EmployeeSales` table. The second INSERT statement uses the EXECUTE clause to call a stored procedure that contains the SELECT statement, and the third INSERT uses the EXECUTE clause to reference the SELECT statement as a literal string.
```sql
CREATE TABLE dbo.EmployeeSales
(
    DataSource varchar(20) NOT NULL,
    BusinessEntityID varchar(11) NOT NULL,
    LastName varchar(40) NOT NULL,
    SalesDollars money NOT NULL
);
GO
CREATE PROCEDURE dbo.uspGetEmployeeSales
AS
    SET NOCOUNT ON;
    SELECT 'PROCEDURE', sp.BusinessEntityID, c.LastName, sp.SalesYTD
    FROM Sales.SalesPerson AS sp
    INNER JOIN Person.Person AS c
        ON sp.BusinessEntityID = c.BusinessEntityID
    WHERE sp.BusinessEntityID LIKE '2%'
    ORDER BY sp.BusinessEntityID, c.LastName;
GO
--INSERT...SELECT example
INSERT INTO dbo.EmployeeSales
    SELECT 'SELECT', sp.BusinessEntityID, c.LastName, sp.SalesYTD
    FROM Sales.SalesPerson AS sp
    INNER JOIN Person.Person AS c
        ON sp.BusinessEntityID = c.BusinessEntityID
    WHERE sp.BusinessEntityID LIKE '2%'
    ORDER BY sp.BusinessEntityID, c.LastName;
GO
--INSERT...EXECUTE procedure example
INSERT INTO dbo.EmployeeSales
EXECUTE dbo.uspGetEmployeeSales;
GO
--INSERT...EXECUTE('string') example
INSERT INTO dbo.EmployeeSales
EXECUTE
('
SELECT ''EXEC STRING'', sp.BusinessEntityID, c.LastName, sp.SalesYTD
FROM Sales.SalesPerson AS sp
INNER JOIN Person.Person AS c
    ON sp.BusinessEntityID = c.BusinessEntityID
WHERE sp.BusinessEntityID LIKE ''2%''
ORDER BY sp.BusinessEntityID, c.LastName
');
GO
--Show results.
SELECT DataSource, BusinessEntityID, LastName, SalesDollars
FROM dbo.EmployeeSales;
```
I. Using WITH common table expression to define the data inserted
The following example creates the `NewEmployee` table in the [!INCLUDEssSampleDBnormal] database. A common table expression (`EmployeeTemp`) defines the rows from one or more tables to be inserted into the `NewEmployee` table. The INSERT statement references the columns in the common table expression.
```sql
CREATE TABLE HumanResources.NewEmployee
(
    EmployeeID int NOT NULL,
    LastName nvarchar(50) NOT NULL,
    FirstName nvarchar(50) NOT NULL,
    PhoneNumber Phone NULL,
    AddressLine1 nvarchar(60) NOT NULL,
    City nvarchar(30) NOT NULL,
    State nchar(3) NOT NULL,
    PostalCode nvarchar(15) NOT NULL,
    CurrentFlag Flag
);
GO
WITH EmployeeTemp (EmpID, LastName, FirstName, Phone,
                   Address, City, StateProvince,
                   PostalCode, CurrentFlag)
AS (SELECT
       e.BusinessEntityID, c.LastName, c.FirstName, pp.PhoneNumber,
       a.AddressLine1, a.City, sp.StateProvinceCode,
       a.PostalCode, e.CurrentFlag
    FROM HumanResources.Employee e
        INNER JOIN Person.BusinessEntityAddress AS bea
            ON e.BusinessEntityID = bea.BusinessEntityID
        INNER JOIN Person.Address AS a
            ON bea.AddressID = a.AddressID
        INNER JOIN Person.PersonPhone AS pp
            ON e.BusinessEntityID = pp.BusinessEntityID
        INNER JOIN Person.StateProvince AS sp
            ON a.StateProvinceID = sp.StateProvinceID
        INNER JOIN Person.Person as c
            ON e.BusinessEntityID = c.BusinessEntityID
    )
INSERT INTO HumanResources.NewEmployee
    SELECT EmpID, LastName, FirstName, Phone,
           Address, City, StateProvince, PostalCode, CurrentFlag
    FROM EmployeeTemp;
GO
```
J. Using TOP to limit the data inserted from the source table
The following example creates the table `EmployeeSales` and inserts the name and year-to-date sales data for the top 5 random employees from the table `HumanResources.Employee` in the [!INCLUDEssSampleDBnormal] database. The INSERT statement chooses any 5 rows returned by the `SELECT` statement. The OUTPUT clause displays the rows that are inserted into the `EmployeeSales` table. Notice that the ORDER BY clause in the SELECT statement is not used to determine the top 5 employees.
```sql
CREATE TABLE dbo.EmployeeSales
( EmployeeID  nvarchar(11) NOT NULL,
  LastName    nvarchar(20) NOT NULL,
  FirstName   nvarchar(20) NOT NULL,
  YearlySales money NOT NULL
);
GO
INSERT TOP (5) INTO dbo.EmployeeSales
    OUTPUT inserted.EmployeeID, inserted.FirstName,
           inserted.LastName, inserted.YearlySales
    SELECT sp.BusinessEntityID, c.LastName, c.FirstName, sp.SalesYTD
    FROM Sales.SalesPerson AS sp
    INNER JOIN Person.Person AS c
        ON sp.BusinessEntityID = c.BusinessEntityID
    WHERE sp.SalesYTD > 250000.00
    ORDER BY sp.SalesYTD DESC;
```
If you have to use TOP to insert rows in a meaningful chronological order, you must use TOP together with ORDER BY in a subselect statement as shown in the following example. The OUTPUT clause displays the rows that are inserted into the `EmployeeSales` table. Notice that the top 5 employees are now inserted based on the results of the ORDER BY clause instead of random rows.
```sql
INSERT INTO dbo.EmployeeSales
    OUTPUT inserted.EmployeeID, inserted.FirstName,
           inserted.LastName, inserted.YearlySales
    SELECT TOP (5) sp.BusinessEntityID, c.LastName, c.FirstName, sp.SalesYTD
    FROM Sales.SalesPerson AS sp
    INNER JOIN Person.Person AS c
        ON sp.BusinessEntityID = c.BusinessEntityID
    WHERE sp.SalesYTD > 250000.00
    ORDER BY sp.SalesYTD DESC;
```
Specifying Target Objects Other Than Standard Tables
Examples in this section demonstrate how to insert rows by specifying a view or table variable.
K. Inserting data by specifying a view
The following example specifies a view name as the target object; however, the new row is inserted in the underlying base table. The order of the values in the `INSERT` statement must match the column order of the view. For more information, see Modify Data Through a View.
```sql
CREATE TABLE T1 ( column_1 int, column_2 varchar(30));
GO
CREATE VIEW V1 AS
SELECT column_2, column_1
FROM T1;
GO
INSERT INTO V1
    VALUES ('Row 1', 1);
GO
SELECT column_1, column_2
FROM T1;
GO
SELECT column_1, column_2
FROM V1;
GO
```
L. Inserting data into a table variable
The following example specifies a table variable as the target object in the [!INCLUDEssSampleDBnormal] database.
```sql
-- Create the table variable.
DECLARE @MyTableVar table(
    LocationID int NOT NULL,
    CostRate smallmoney NOT NULL,
    NewCostRate AS CostRate * 1.5,
    ModifiedDate datetime);

-- Insert values into the table variable.
INSERT INTO @MyTableVar (LocationID, CostRate, ModifiedDate)
    SELECT LocationID, CostRate, GETDATE()
    FROM Production.Location
    WHERE CostRate > 0;

-- View the table variable result set.
SELECT * FROM @MyTableVar;
GO
```
Inserting Rows into a Remote Table
Examples in this section demonstrate how to insert rows into a remote target table by using a linked server or a rowset function to reference the remote table.
M. Inserting data into a remote table by using a linked server
The following example inserts rows into a remote table. The example begins by creating a link to the remote data source by using sp_addlinkedserver. The linked server name, `MyLinkServer`, is then specified as part of the four-part object name in the form *server.catalog.schema.object*.
Applies to: [!INCLUDEsql2008-md] and later.
```sql
USE master;
GO
-- Create a link to the remote data source.
-- Specify a valid server name for @datasrc as 'server_name'
-- or 'server_name\instance_name'.
EXEC sp_addlinkedserver @server = N'MyLinkServer',
    @srvproduct = N' ',
    @provider = N'SQLNCLI',
    @datasrc = N'server_name',
    @catalog = N'AdventureWorks2012';
GO
```
```sql
-- Specify the remote data source in the FROM clause using a four-part name
-- in the form linked_server.catalog.schema.object.
INSERT INTO MyLinkServer.AdventureWorks2012.HumanResources.Department (Name, GroupName)
VALUES (N'Public Relations', N'Executive General and Administration');
GO
```
N. Inserting data into a remote table by using the OPENQUERY function
The following example inserts a row into a remote table by specifying the OPENQUERY rowset function. The linked server name created in the previous example is used in this example.
Applies to: [!INCLUDEsql2008-md] and later.
```sql
INSERT OPENQUERY (MyLinkServer,
    'SELECT Name, GroupName
     FROM AdventureWorks2012.HumanResources.Department')
VALUES ('Environmental Impact', 'Engineering');
GO
```
O. Inserting data into a remote table by using the OPENDATASOURCE function
The following example inserts a row into a remote table by specifying the OPENDATASOURCE rowset function. Specify a valid server name for the data source by using the format *server_name* or *server_name\instance_name*.
Applies to: [!INCLUDEsql2008-md] and later.
```sql
-- Use the OPENDATASOURCE function to specify the remote data source.
-- Specify a valid server name for Data Source using the format
-- server_name or server_name\instance_name.
INSERT INTO OPENDATASOURCE('SQLNCLI',
    'Data Source=<server_name>;Integrated Security=SSPI')
    .AdventureWorks2012.HumanResources.Department (Name, GroupName)
VALUES (N'Standards and Methods', 'Quality Assurance');
GO
```
P. Inserting into an external table created using PolyBase
Export data from SQL Server to Hadoop or Azure Storage. First, create an external table that points to the destination file or directory. Then, use INSERT INTO to export data from a local SQL Server table to an external data source. The INSERT INTO statement creates the destination file or directory if it does not exist and the results of the SELECT statement are exported to the specified location in the specified file format. For more information, see Get started with PolyBase.
Applies to: [!INCLUDEssnoversion].
```sql
-- Create an external table.
CREATE EXTERNAL TABLE [dbo].[FastCustomers2009]
(
    [FirstName] char(25) NOT NULL,
    [LastName] char(25) NOT NULL,
    [YearlyIncome] float NULL,
    [MaritalStatus] char(1) NOT NULL
)
WITH
(
    LOCATION='/old_data/2009/customerdata.tbl',
    DATA_SOURCE = HadoopHDP2,
    FILE_FORMAT = TextFileFormat,
    REJECT_TYPE = VALUE,
    REJECT_VALUE = 0
);

-- Export data: Move old data to Hadoop while keeping
-- it query-able via external table.
INSERT INTO dbo.FastCustomers2009
    SELECT T1.* FROM Insured_Customers T1
    JOIN CarSensor_Data T2
        ON (T1.CustomerKey = T2.CustomerKey)
    WHERE T2.YearMeasured = 2009 AND T2.Speed > 40;
```
Bulk Loading Data from Tables or Data Files
Examples in this section demonstrate two methods to bulk load data into a table by using the INSERT statement.
Q. Inserting data into a heap with minimal logging
The following example creates a new table (a heap) and inserts data from another table into it using minimal logging. The example assumes that the recovery model of the `AdventureWorks2012` database is set to FULL. To ensure minimal logging is used, the recovery model of the `AdventureWorks2012` database is set to BULK_LOGGED before rows are inserted and reset to FULL after the INSERT INTO…SELECT statement. In addition, the TABLOCK hint is specified for the target table `Sales.SalesHistory`. This ensures that the statement uses minimal space in the transaction log and performs efficiently.
```sql
-- Create the target heap.
CREATE TABLE Sales.SalesHistory(
    SalesOrderID int NOT NULL,
    SalesOrderDetailID int NOT NULL,
    CarrierTrackingNumber nvarchar(25) NULL,
    OrderQty smallint NOT NULL,
    ProductID int NOT NULL,
    SpecialOfferID int NOT NULL,
    UnitPrice money NOT NULL,
    UnitPriceDiscount money NOT NULL,
    LineTotal money NOT NULL,
    rowguid uniqueidentifier ROWGUIDCOL NOT NULL,
    ModifiedDate datetime NOT NULL );
GO
-- Temporarily set the recovery model to BULK_LOGGED.
ALTER DATABASE AdventureWorks2012
SET RECOVERY BULK_LOGGED;
GO
-- Transfer data from Sales.SalesOrderDetail to Sales.SalesHistory.
INSERT INTO Sales.SalesHistory WITH (TABLOCK)
    (SalesOrderID,
     SalesOrderDetailID,
     CarrierTrackingNumber,
     OrderQty,
     ProductID,
     SpecialOfferID,
     UnitPrice,
     UnitPriceDiscount,
     LineTotal,
     rowguid,
     ModifiedDate)
SELECT * FROM Sales.SalesOrderDetail;
GO
-- Reset the recovery model.
ALTER DATABASE AdventureWorks2012
SET RECOVERY FULL;
GO
```
R. Using the OPENROWSET function with BULK to bulk load data into a table
The following example inserts rows from a data file into a table by specifying the OPENROWSET function. The IGNORE_TRIGGERS table hint is specified for performance optimization. For more examples, see Import Bulk Data by Using BULK INSERT or OPENROWSET(BULK…) (SQL Server).
Applies to: [!INCLUDEsql2008-md] and later.
```sql
INSERT INTO HumanResources.Department WITH (IGNORE_TRIGGERS) (Name, GroupName)
SELECT b.Name, b.GroupName
FROM OPENROWSET (
    BULK 'C:\SQLFiles\DepartmentData.txt',
    FORMATFILE = 'C:\SQLFiles\BulkloadFormatFile.xml',
    ROWS_PER_BATCH = 15000) AS b;
```
Overriding the Default Behavior of the Query Optimizer by Using Hints
Examples in this section demonstrate how to use table hints to temporarily override the default behavior of the query optimizer when processing the INSERT statement.
[!CAUTION]
Because the [!INCLUDEssNoVersion] query optimizer typically selects the best execution plan for a query, we recommend that hints be used only as a last resort by experienced developers and database administrators.
S. Using the TABLOCK hint to specify a locking method
The following example specifies that an exclusive (X) lock is taken on the Production.Location table and is held until the end of the INSERT statement.
Applies to: [!INCLUDEssNoVersion_md], [!INCLUDEssSDS_md].
```sql
INSERT INTO Production.Location WITH (XLOCK)
(Name, CostRate, Availability)
VALUES ( N'Final Inventory', 15.00, 80.00);
```
Capturing the Results of the INSERT Statement
Examples in this section demonstrate how to use the OUTPUT Clause to return information from, or expressions based on, each row affected by an INSERT statement. These results can be returned to the processing application for use in such things as confirmation messages, archiving, and other such application requirements.
T. Using OUTPUT with an INSERT statement
The following example inserts a row into the `ScrapReason` table and uses the `OUTPUT` clause to return the results of the statement to the `@MyTableVar` table variable. Because the `ScrapReasonID` column is defined with an `IDENTITY` property, a value is not specified in the `INSERT` statement for that column. However, note that the value generated by the [!INCLUDEssDE] for that column is returned in the `OUTPUT` clause in the `INSERTED.ScrapReasonID` column.
```sql
DECLARE @MyTableVar table(
    NewScrapReasonID smallint,
    Name varchar(50),
    ModifiedDate datetime);
INSERT Production.ScrapReason
    OUTPUT INSERTED.ScrapReasonID, INSERTED.Name, INSERTED.ModifiedDate
        INTO @MyTableVar
VALUES (N'Operator error', GETDATE());
--Display the result set of the table variable.
SELECT NewScrapReasonID, Name, ModifiedDate FROM @MyTableVar;
--Display the result set of the table.
SELECT ScrapReasonID, Name, ModifiedDate
FROM Production.ScrapReason;
```
U. Using OUTPUT with identity and computed columns
The following example creates the `EmployeeSales` table and then inserts several rows into it using an INSERT statement with a SELECT statement to retrieve data from source tables. The `EmployeeSales` table contains an identity column (`EmployeeID`) and a computed column (`ProjectedSales`). Because these values are generated by the [!INCLUDEssDE] during the insert operation, neither of these columns can be defined in `@MyTableVar`.
```sql
CREATE TABLE dbo.EmployeeSales
( EmployeeID   int IDENTITY (1,5) NOT NULL,
  LastName     nvarchar(20) NOT NULL,
  FirstName    nvarchar(20) NOT NULL,
  CurrentSales money NOT NULL,
  ProjectedSales AS CurrentSales * 1.10
);
GO
DECLARE @MyTableVar table(
  LastName     nvarchar(20) NOT NULL,
  FirstName    nvarchar(20) NOT NULL,
  CurrentSales money NOT NULL
);

INSERT INTO dbo.EmployeeSales (LastName, FirstName, CurrentSales)
  OUTPUT INSERTED.LastName,
         INSERTED.FirstName,
         INSERTED.CurrentSales
  INTO @MyTableVar
    SELECT c.LastName, c.FirstName, sp.SalesYTD
    FROM Sales.SalesPerson AS sp
    INNER JOIN Person.Person AS c
        ON sp.BusinessEntityID = c.BusinessEntityID
    WHERE sp.BusinessEntityID LIKE '2%'
    ORDER BY c.LastName, c.FirstName;

SELECT LastName, FirstName, CurrentSales
FROM @MyTableVar;
GO
SELECT EmployeeID, LastName, FirstName, CurrentSales, ProjectedSales
FROM dbo.EmployeeSales;
```
V. Inserting data returned from an OUTPUT clause
The following example captures data returned from the OUTPUT clause of a MERGE statement, and inserts that data into another table. The MERGE statement updates the `Quantity` column of the `ProductInventory` table daily, based on orders that are processed in the `SalesOrderDetail` table in the [!INCLUDEssSampleDBnormal] database. It also deletes rows for products whose inventories drop to 0. The example captures the rows that are deleted and inserts them into another table, `ZeroInventory`, which tracks products with no inventory.
```sql
--Create ZeroInventory table.
CREATE TABLE Production.ZeroInventory (DeletedProductID int, RemovedOnDate DateTime);
GO

INSERT INTO Production.ZeroInventory (DeletedProductID, RemovedOnDate)
SELECT ProductID, GETDATE()
FROM
(   MERGE Production.ProductInventory AS pi
    USING (SELECT ProductID, SUM(OrderQty)
           FROM Sales.SalesOrderDetail AS sod
           JOIN Sales.SalesOrderHeader AS soh
             ON sod.SalesOrderID = soh.SalesOrderID
            AND soh.OrderDate = '20070401'
           GROUP BY ProductID) AS src (ProductID, OrderQty)
    ON (pi.ProductID = src.ProductID)
    WHEN MATCHED AND pi.Quantity - src.OrderQty <= 0
        THEN DELETE
    WHEN MATCHED
        THEN UPDATE SET pi.Quantity = pi.Quantity - src.OrderQty
    OUTPUT $action, deleted.ProductID) AS Changes (Action, ProductID)
WHERE Action = 'DELETE';
IF @@ROWCOUNT = 0
PRINT 'Warning: No rows were inserted';
GO
SELECT DeletedProductID, RemovedOnDate FROM Production.ZeroInventory;
```
W. Inserting data using the SELECT option
The following example shows how to insert multiple rows of data using an INSERT statement with a SELECT option. The first `INSERT` statement uses a `SELECT` statement directly to retrieve data from the source table, and then to store the result set in the `EmployeeTitles` table.
```sql
CREATE TABLE EmployeeTitles
( EmployeeKey INT NOT NULL,
  LastName    varchar(40) NOT NULL,
  Title       varchar(50) NOT NULL
);
INSERT INTO EmployeeTitles
    SELECT EmployeeKey, LastName, Title
    FROM ssawPDW.dbo.DimEmployee
    WHERE EndDate IS NULL;
```
X. Specifying a label with the INSERT statement
The following example shows the use of a label with an INSERT statement.
```sql
-- Uses AdventureWorks

INSERT INTO DimCurrency
VALUES (500, N'C1', N'Currency1')
OPTION ( LABEL = N'label1' );
```
Y. Using a label and a query hint with the INSERT statement
This query shows the basic syntax for using a label and a query join hint with the INSERT statement. After the query is submitted to the Control node, [!INCLUDEssNoVersion], running on the Compute nodes, will apply the hash join strategy when it generates the [!INCLUDEssNoVersion] query plan. For more information on join hints and how to use the OPTION clause, see OPTION (SQL Server PDW).
```sql
-- Uses AdventureWorks

INSERT INTO DimCustomer (CustomerKey, CustomerAlternateKey,
    FirstName, MiddleName, LastName )
SELECT ProspectiveBuyerKey, ProspectAlternateKey,
    FirstName, MiddleName, LastName
FROM ProspectiveBuyer p JOIN DimGeography g ON p.PostalCode = g.PostalCode
WHERE g.CountryRegionCode = 'FR'
OPTION ( LABEL = 'Add French Prospects', HASH JOIN);
```
See Also
BULK INSERT (Transact-SQL)
DELETE (Transact-SQL)
EXECUTE (Transact-SQL)
FROM (Transact-SQL)
IDENTITY (Property) (Transact-SQL)
NEWID (Transact-SQL)
SELECT (Transact-SQL)
UPDATE (Transact-SQL)
MERGE (Transact-SQL)
OUTPUT Clause (Transact-SQL)
Use the inserted and deleted Tables
The database developer can, of course, throw all errors back to the application developer to deal with, but this is neither kind nor necessary. How errors are dealt with is very dependent on the application, but the process itself isn’t entirely obvious. Phil became gripped with a mission to explain…
In this article, we’re going to take a problem and use it to explore transactions, and constraint violations, before suggesting a solution to the problem.
The problem is this: we have a database which uses constraints; lots of them. It does a very solid job of checking the complex rules and relationships governing the data. We wish to import a batch of potentially incorrect data into the database, checking for constraint violations without throwing errors back at any client application, reporting what data caused the errors, and either rolling back the import or just the offending rows. This would then allow the administrator to manually correct the records and re-apply them.
Just to illustrate various points, we’ll take the smallest possible unit of this problem, and provide simple code that you can use to experiment with. We’ll be exploring transactions and constraint violations
Transactions
Transactions enable you to keep a database consistent, even after an error. They underlie every SQL data manipulation in order to enforce atomicity and consistency. They also enforce isolation: they provide the means of temporarily isolating a connection from others that are accessing the database at the same time, while a single unit of work is done as one or more SQL statements. Any temporary inconsistency of the data is visible only to the connection. A transaction is both a unit of work and a unit of recovery. Together with constraints, transactions are the best way of ensuring that the data stored within the database is consistent and error-free.

Each insert, update, and delete statement is considered a single transaction (autocommit, in SQL Server jargon). However, only you can define what you consider a 'unit of work', which is why we have explicit transactions. Using explicit transactions in SQL Server isn't like sprinkling magic dust, because of the way that error-handling and constraint-checking is done. You need to be aware of how this rather complex system works in order to avoid some of the pitfalls when you are planning how to recover from errors.
Any good SQL Server database will use constraints and other DRI in order to maintain integrity and increase performance. The violation of any constraints leads to an error, and it is rare to see this handled well.
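Before getting into the T-SQL details, the atomicity principle at stake here can be sketched outside SQL Server. The following Python snippet uses the standard-library `sqlite3` module purely as a stand-in database: it is an illustration of the general "all or nothing" rule, not of SQL Server behavior, and the table and constraint are invented for the demo.

```python
import sqlite3

# Illustrative sketch only (SQLite, not SQL Server): a transaction either
# commits completely or rolls back completely.
conn = sqlite3.connect(":memory:")
conn.isolation_level = None  # manage transactions explicitly
conn.execute(
    "CREATE TABLE PostCode (Code TEXT PRIMARY KEY CHECK (length(Code) <= 10))"
)

try:
    conn.execute("BEGIN")
    conn.execute("INSERT INTO PostCode VALUES ('W6 8JB')")          # valid
    conn.execute("INSERT INTO PostCode VALUES ('THIS IS FAR TOO LONG')")  # violates CHECK
    conn.execute("COMMIT")
except sqlite3.IntegrityError:
    # Undo the whole unit of work, including the valid first row.
    conn.execute("ROLLBACK")

rows = conn.execute("SELECT * FROM PostCode").fetchall()
print(rows)  # [] -- the valid insert was rolled back along with the bad one
```

The point to carry forward is that the rollback undoes *everything* since the transaction began, not just the statement that failed.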
Autocommit transaction mode
Let's create a table that allows us to make a couple of different constraint violations. You'll have to imagine that this is a part of a contact database that is full of constraints and triggers that will defend against bad data ever reaching the database. Naturally, there will be more in this table. It might contain the actual address that relates to the `PostCode` (in reality, it isn't a one-to-one correspondence).
```sql
CREATE TABLE PostCode
(
  Code VARCHAR(10) PRIMARY KEY
       CHECK (
         Code LIKE '[A-Z][A-Z0-9] [0-9][ABD-HJLNP-UW-Z][ABD-HJLNP-UW-Z]'
         OR Code LIKE '[A-Z][A-Z0-9]_ [0-9][ABD-HJLNP-UW-Z][ABD-HJLNP-UW-Z]'
         OR Code LIKE '[A-Z][A-Z0-9]__ [0-9][ABD-HJLNP-UW-Z][ABD-HJLNP-UW-Z]'
       )
);
```
Listing 1: Creating the `PostCode` table
This means that PostCodes in this table must be unique and they must conform to a specific pattern. Since SQL Databases are intrinsically transactional, those DML (Data Manipulation Language) statements that trigger an error will be rolled back. Assuming our table is empty, try this…
```sql
DELETE FROM PostCode;

INSERT INTO PostCode (code)
  SELECT 'W6 8JB' AS PostCode
  UNION ALL SELECT 'CM8 3BY'
  UNION ALL SELECT 'CR AZY' --this is an invalid PostCode
  UNION ALL SELECT 'G2 9AG'
  UNION ALL SELECT 'G2 9AG'; --a duplicate

SELECT * FROM PostCode; --none there
```

```
Msg 547, Level 16, State 0, Line 3
The INSERT statement conflicted with the CHECK constraint "CK__PostCode__Code__4AB81AF0".
The conflict occurred in database "contacts", table "dbo.PostCode", column 'Code'.
The statement has been terminated.

Code
----------

(0 row(s) affected)
```
Listing 2: Inserting rows in a single statement (XACT_ABORT OFF)
Nothing there, is there? It found the bad `PostCode` but never got to find the duplicate, did it? So, this single statement was rolled back, because the `CHECK` constraint found the invalid PostCode. Would this roll back the entire batch? Let's try doing some insertions as separate statements to check this.
```sql
SET XACT_ABORT OFF; -- confirm that XACT_ABORT is OFF (the default)

DELETE FROM PostCode;

INSERT INTO PostCode (code) SELECT 'W6 8JB' AS PostCode;
INSERT INTO PostCode (code) SELECT 'CM8 3BY';
INSERT INTO PostCode (code) SELECT 'CR AZY'; --this is an invalid PostCode
INSERT INTO PostCode (code) SELECT 'G2 9AG';
INSERT INTO PostCode (code) SELECT 'G2 9AG'; --a duplicate. Not allowed

SELECT * FROM PostCode;
```

```
Msg 547, Level 16, State 0, Line 5
The INSERT statement conflicted with the CHECK constraint "CK__PostCode__Code__4AB81AF0".
The conflict occurred in database "contacts", table "dbo.PostCode", column 'Code'.
The statement has been terminated.

Msg 2627, Level 14, State 1, Line 7
Violation of PRIMARY KEY constraint 'PK__PostCode__A25C5AA648CFD27E'.
Cannot insert duplicate key in object 'dbo.PostCode'.
The statement has been terminated.

Code
----------
CM8 3BY
G2 9AG
W6 8JB
```
Listing 3: Single batch using separate INSERT statements (XACT_ABORT OFF)
When it hits the constraint violation, SQL Server doesn't roll back the batch; it rolls back just the statement. It then powers on and finds the `UNIQUE` constraint violation. As neither was judged a severe 'batch-aborting' error, SQL Server only rolled back the two offending inserts. If, however, we substitute `SET XACT_ABORT ON`, then the entire batch is aborted at the first error, leaving the two first insertions in place. The rest of the batch isn't even executed. Try it.
By setting `XACT_ABORT ON`, we are telling SQL Server to react to any error by rolling back the entire transaction and aborting the batch. By default, the session setting is `OFF`. In this case, SQL Server merely rolls back the Transact-SQL statement that raised the error and the batch continues. Even with `SET XACT_ABORT` set to `OFF`, SQL Server will choose to roll back a whole batch if it hits more severe errors.
If we want to clean up specific things after an error, or if we want processing to continue in the face of moderate errors, then we need to use `SET XACT_ABORT OFF`, but there is a down-side: it is now our responsibility to make sure we can return the database to a consistent state on error, and to use appropriate error handling to deal with even the trickier errors, such as those caused by a cancel/timeout of the session in the middle of a transaction.
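The two styles of error handling that this setting toggles between can be sketched in any language. The following Python snippet (standard-library `sqlite3` as a stand-in database, with an invented `load` helper; it is an analogy, not SQL Server behavior) contrasts "abort the whole unit of work at the first error" with "skip the offending row, report it, and carry on":

```python
import sqlite3

# Illustrative analogy (SQLite, not SQL Server) for the two error-handling
# styles: abort-everything on first error vs. continue and report.
def load(codes, abort_on_error):
    conn = sqlite3.connect(":memory:")
    conn.isolation_level = None  # manage the transaction explicitly
    conn.execute("CREATE TABLE PostCode (Code TEXT PRIMARY KEY)")
    conn.execute("BEGIN")
    errors = []
    for code in codes:
        try:
            conn.execute("INSERT INTO PostCode VALUES (?)", (code,))
        except sqlite3.IntegrityError as e:
            errors.append((code, str(e)))
            if abort_on_error:
                # Like XACT_ABORT ON: first error rolls back the whole unit of work.
                conn.execute("ROLLBACK")
                return [], errors
    # Like XACT_ABORT OFF: bad rows were skipped, the rest are committed.
    conn.execute("COMMIT")
    kept = [r[0] for r in conn.execute("SELECT Code FROM PostCode ORDER BY Code")]
    return kept, errors

data = ["W6 8JB", "CM8 3BY", "CM8 3BY", "G2 9AG"]  # one duplicate
print(load(data, abort_on_error=True))
print(load(data, abort_on_error=False))
```

The second call keeps the three good rows and reports the one duplicate; the first call keeps nothing. That is the trade-off the rest of this section explores.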
Just by changing the setting of `XACT_ABORT`, we can rerun the example and end up with different data in the database. This is because, with `XACT_ABORT ON`, the behavior is consistent regardless of the type of error. It simply assumes the transaction just can't be committed, stops processing, and aborts the batch.

With `XACT_ABORT OFF`, the behavior depends on the type of error. If it's a constraint violation, permission-denial, or a divide-by-zero, it will plough on. If the error dooms the transaction, such as when there is a conversion error or deadlock, it won't. Let's illustrate this draconian batch-abortion.
```sql
DELETE FROM PostCode;
GO
SET XACT_ABORT ON; --or off. Try it both ways

INSERT INTO PostCode (code) SELECT 'CM8 3BY';

INSERT INTO PostCode (code)
  SELECT 'W6 8JB' AS PostCode
  UNION ALL SELECT 'CM8 3BY'
  UNION ALL SELECT 'CR AZY' --this is an invalid PostCode
  UNION ALL SELECT 'G2 9AG'
  UNION ALL SELECT 'G2 9AG'; --a duplicate

INSERT INTO PostCode (code) SELECT 'G2 9AG';
GO
SELECT * FROM PostCode;
```
Listing 4: Inserting rows in a batch using separate INSERT statements (XACT_ABORT ON)
If you've got `XACT_ABORT ON`, then you'll get…
```
Msg 2627, Level 14, State 1, Line 4
Violation of PRIMARY KEY constraint 'PK__PostCode__A25C5AA648CFD27E'.
Cannot insert duplicate key in object 'dbo.PostCode'.

Code
----------
CM8 3BY
```
You’ll see that, in the second batch, the PostCode ‘G2 9AG’ never gets inserted because the batch is aborted after the first constraint violation.
If you set `XACT_ABORT OFF`, then you'll get…
```
Msg 2627, Level 14, State 1, Line 4
Violation of PRIMARY KEY constraint 'PK__PostCode__A25C5AA648CFD27E'.
Cannot insert duplicate key in object 'dbo.PostCode'.
The statement has been terminated.

(1 row(s) affected)

Code
----------
CM8 3BY
G2 9AG
```
And to our surprise, we can see that we get a different result depending on the setting of XACT_ABORT.
(Remember that `GO` is a client-side batch separator!) You'll see that, if we insert a `GO` after the multi-row insert, we get the same two PostCodes either way, because the abortion only affects the batch in which the error occurs. Yes: with `XACT_ABORT ON`, the behavior is consistent regardless of the type of error. With `XACT_ABORT OFF`, behavior depends on the type of error.
There is a great difference between the 'abortion' of a batch and a 'rollback'. With an 'abortion', any further execution of the batch is always abandoned. This will happen whatever you specified for `XACT_ABORT`. If a type of error occurs that SQL Server considers too severe to allow you ever to commit the transaction, the transaction is 'doomed'. This happens whether you like it or not. The offending statement is rolled back and the batch is aborted.
Let’s ‘doom’ the batch by putting in a conversion error.
```sql
SET XACT_ABORT OFF; -- confirm that XACT_ABORT is OFF (the default)

DELETE FROM PostCode;

INSERT INTO PostCode (code) SELECT 'W6 8JB' AS PostCode;
INSERT INTO PostCode (code) SELECT 'CM8 3BY';
INSERT INTO PostCode (code) SELECT 'G2 9AG';
INSERT INTO PostCode (code) SELECT 'CR AZY' + 1; --this is an invalid PostCode
INSERT INTO PostCode (code) SELECT 'G2 9AG'; --a duplicate. Not allowed
PRINT 'that went well!';
GO
SELECT * FROM PostCode;
```

```
Msg 245, Level 16, State 1, Line 7
Conversion failed when converting the varchar value 'CR AZY' to data type int.

Code
----------
CM8 3BY
G2 9AG
W6 8JB
```
Listing 5: Single batch using separate INSERT statements with a type conversion error (XACT_ABORT OFF)
You'll probably notice that execution of the first batch stopped when the conversion error was detected, and just that statement was rolled back. It never found the unique constraint error. Then the following batch, `SELECT * FROM PostCode`, was executed.
You can combine several statements into a unit of work either by using explicit transactions or by setting implicit transactions on. The latter requires fewer statements but is less versatile and doesn't provide anything new, so we'll just stick to explicit transactions.

So let's introduce an explicit transaction that encompasses several statements. We can then see what difference this makes to the behavior we've seen with autocommit.
Explicit Transactions
When we explicitly declare the start of a transaction in SQL by using the `BEGIN TRANSACTION` statement, we are defining a point at which the data referenced by a particular connection is logically and physically consistent. If errors are encountered, all data modifications made after the `BEGIN TRANSACTION` can be rolled back to return the data to this known state of consistency. While it's possible to get SQL Server to roll back in this fashion, it doesn't do it without additional logic. We either have to specify this behavior by setting `XACT_ABORT` to `ON`, so that the explicit transaction is rolled back automatically, or we have to use a `ROLLBACK`.
Many developers believe that the mere fact of having declared the start of a transaction is enough to trigger an automatic rollback of the entire transaction if we hit an error during that transaction. Let’s try it.
```sql
SET XACT_ABORT OFF;

DELETE FROM PostCode;

BEGIN TRANSACTION
INSERT INTO PostCode (code) SELECT 'W6 8JB';
INSERT INTO PostCode (code) SELECT 'CM8 3BY';
INSERT INTO PostCode (code) SELECT 'CR AZY'; --invalid PostCode
INSERT INTO PostCode (code) SELECT 'G2 9AG';
INSERT INTO PostCode (code) SELECT 'G2 9AG'; --a duplicate. Not allowed
COMMIT TRANSACTION
GO
SELECT * FROM PostCode;
```
Listing 6: Multi-statement INSERT (single batch) using an explicit transaction
No dice. The result is exactly the same as when we tried it without the explicit transaction (see Listing 3). If we again use `SET XACT_ABORT ON`, then the batch is again aborted at the first error, but this time, the whole unit of work is rolled back.
By using `SET XACT_ABORT ON`, you make SQL Server do what most programmers think happens anyway. Since it is unusual not to want to roll back a transaction following an error, it is normally safer to explicitly set it `ON`. However, there are times when you'd want it `OFF`. You might, for example, wish to know about every constraint violation in the rows being imported into a table, and then do a complete rollback if any errors happened.

Most SQL Server clients set it to `OFF` by default, though OLEDB sets it to `ON`.
```sql
SET XACT_ABORT OFF;
DELETE FROM PostCode;
DECLARE @Error INT;
SELECT @Error = 0;
BEGIN TRANSACTION
INSERT INTO PostCode (code) SELECT 'W6 8JB';
SELECT @Error = @Error + @@ERROR;
INSERT INTO PostCode (code) SELECT 'CM8 3BY';
SELECT @Error = @Error + @@ERROR;
INSERT INTO PostCode (code) SELECT 'CR AZY'; --invalid PostCode
SELECT @Error = @Error + @@ERROR;
INSERT INTO PostCode (code) SELECT 'G2 9AG';
SELECT @Error = @Error + @@ERROR;
INSERT INTO PostCode (code) SELECT 'G2 9AG'; --a duplicate. Not allowed
SELECT @Error = @Error + @@ERROR;
IF @Error > 0 ROLLBACK TRANSACTION
ELSE COMMIT TRANSACTION
GO
SELECT * FROM PostCode;
SELECT @@TRANCOUNT; --to check that the transaction is done
```

```
Msg 547, Level 16, State 0, Line 11
The INSERT statement conflicted with the CHECK constraint "CK__PostCode__Code__4AB81AF0".
The conflict occurred in database "contacts", table "dbo.PostCode", column 'Code'.
The statement has been terminated.

(1 row(s) affected)

Msg 2627, Level 14, State 1, Line 15
Violation of PRIMARY KEY constraint 'PK__PostCode__A25C5AA648CFD27E'.
Cannot insert duplicate key in object 'dbo.PostCode'.
The statement has been terminated.

Code
----------
```
Listing 7: Multi-statement INSERT (single batch) using an explicit transaction
In this batch, we execute all the insertions in separate statements, checking the volatile `@@ERROR` value after each one. Then, we check to see whether the batch hit errors or was successful. If it completes without any errors, we issue a `COMMIT TRANSACTION` to make the modification a permanent part of the database. If one or more errors are encountered, then all modifications are undone with a `ROLLBACK TRANSACTION` statement that rolls back to the start of the transaction.
The use of `@@ERROR` isn't entirely pain-free, since it only records the last error. If a trigger has fired after the statement you're checking, the `@@ERROR` value will correspond to the last statement executed in the trigger, rather than to your statement.
If the transaction becomes doomed, all that happens is that the transaction is rolled back without the rest of the batch being executed, just as would happen anyway if `XACT_ABORT` is set to `ON`.
```sql
SET XACT_ABORT OFF
DELETE FROM PostCode
DECLARE @Error INT
SELECT @Error = 0
BEGIN TRANSACTION
INSERT INTO PostCode (code) SELECT 'W6 8JB';
SELECT @Error = @Error + @@ERROR;
INSERT INTO PostCode (code) SELECT 'CM8 3BY';
SELECT @Error = @Error + @@ERROR;
INSERT INTO PostCode (code) SELECT 'CR AZY'; --invalid PostCode
SELECT @Error = @Error + @@ERROR;
INSERT INTO PostCode (code) SELECT 'G2 9AG';
SELECT @Error = @Error + @@ERROR;
INSERT INTO PostCode (code) SELECT 'G2 9AG'+1; --a duplicate. Not allowed
SELECT @Error = @Error + @@ERROR;
IF @Error > 0 ROLLBACK TRANSACTION
ELSE COMMIT TRANSACTION
GO
SELECT * FROM PostCode;
SELECT @@TRANCOUNT; --to check that the transaction is complete
```

```text
Msg 245, Level 16, State 1, Line 6
Conversion failed when converting the varchar value 'W6 8JB' to data type int.

Code
----------
```
Listing 8: Multi-statement INSERT (single batch) with a doomed explicit transaction
There is a problem with this code, because I've issued the rollback without any qualification. If this code is called from within another transaction, it will roll back to the start of the outer transaction. Often this is not what you want. I should really have declared a savepoint to specify where to roll back to. Let me explain.
Nested transactions and Savepoints
Transactions can be misleading, because programmers equate them to program blocks and assume that they can somehow be 'nested'. All manner of routines can be called during a transaction, and some of them could, in turn, specify a transaction, but a rollback will always go to the base transaction.
Support for nested transactions in SQL Server (or other RDBMSs) simply means that it will tolerate us embedding a transaction within one or more other transactions. Most developers will assume that such 'nesting' ensures that SQL Server handles each sub-transaction atomically, as a logical unit of work that can commit independently of other child transactions. However, such behavior is not possible with nested transactions in SQL Server, or other RDBMSs; if the outer transaction were to allow such a thing, it would be subverting the all-or-nothing rule of atomicity. SQL Server allows transactions within transactions purely so that a process can call transactions within a routine, such as a stored procedure, regardless of whether that process is already within a transaction.
The use of a savepoint can, however, allow you to roll back a series of statements within a transaction.
Without a savepoint, a `ROLLBACK` of a nested transaction can affect more than just the unit of work we've defined. If we roll back a transaction that is 'nested' within one or more other transactions, it doesn't just roll back to the last, or innermost, `BEGIN TRANSACTION`, but rolls all the way back to the start of the base transaction. This may not be what we want or expect, and could turn a minor inconvenience into a major muddle.
```sql
SET XACT_ABORT OFF
DELETE FROM PostCode
DECLARE @Error INT
SELECT @Error = 0
BEGIN TRANSACTION
INSERT INTO PostCode (code) SELECT 'W6 8JB';
INSERT INTO PostCode (code) SELECT 'CM8 3BY';
BEGIN TRANSACTION --'nested' transaction
INSERT INTO PostCode (code) SELECT 'BY 5JR';
INSERT INTO PostCode (code) SELECT 'PH2 0QA';
ROLLBACK --end of 'nesting'
INSERT INTO PostCode (code) SELECT 'CR 4ZY';
INSERT INTO PostCode (code) SELECT 'G2 9AG';
COMMIT TRANSACTION
GO
SELECT * FROM PostCode;
SELECT @@TRANCOUNT; --to check that the transaction is complete
```

```text
Msg 3902, Level 16, State 1, Line 15
The COMMIT TRANSACTION request has no corresponding BEGIN TRANSACTION.

Code
----------
CR 4ZY
G2 9AG
```
Listing 9: Rolling back a nested transaction without a Savepoint
As you can see, SQL Server hasn't just rolled back the inner transaction, but all the work done since the outer `BEGIN TRANSACTION`. You get a warning as well, 'The `COMMIT TRANSACTION` request has no corresponding `BEGIN TRANSACTION`', because the transaction count became zero after the rollback; the batch then successfully inserted two rows outside any transaction before reaching the `COMMIT TRANSACTION` statement.
Similarly, SQL Server simply ignores all commands to `COMMIT` the transaction within 'nested' transactions until the batch issues the `COMMIT` that matches the outermost `BEGIN TRANSACTION`.
```sql
SET XACT_ABORT OFF
DELETE FROM PostCode
DECLARE @Error INT
SELECT @Error = 0
BEGIN TRANSACTION
INSERT INTO PostCode (code) SELECT 'W6 8JB';
INSERT INTO PostCode (code) SELECT 'CM8 3BY';
BEGIN TRANSACTION --'nested' transaction
INSERT INTO PostCode (code) SELECT 'BY 5JR';
INSERT INTO PostCode (code) SELECT 'PH2 0QA';
COMMIT TRANSACTION --end of 'nesting'
INSERT INTO PostCode (code) SELECT 'CR 4ZY';
INSERT INTO PostCode (code) SELECT 'G2 9AG';
ROLLBACK
GO
SELECT * FROM PostCode;
SELECT @@TRANCOUNT; --to check that the transaction is complete
```

```text
Code
----------
```
Listing 10: Attempting to COMMIT a nested transaction without a Savepoint
The evident desire was to commit the nested transaction, because we explicitly requested that the changes in the transaction be made permanent, so we might expect at least something to happen. But what does? Nothing; if we execute a `COMMIT TRANSACTION` in a nested transaction that is contained in a parent transaction that is then rolled back, the nested transaction will also be rolled back. SQL Server ignores the nested `COMMIT` command and, whatever we do, nothing is committed until the base transaction is committed. In other words, the `COMMIT` of the nested transaction is actually conditional on the `COMMIT` of the parent.
One might think that it is possible to use the `NAME` parameter of the `ROLLBACK TRANSACTION` statement to refer to the inner transactions of a set of named 'nested' transactions. Nice try, but the only name allowed, other than a savepoint, is the transaction name of the outermost transaction, and a rollback to that name still goes all the way back to the start of the base transaction. The `NAME` parameter is mainly useful in that we'll get an error if someone inadvertently wraps what was the base transaction in a new base transaction; giving the base transaction a name also makes it easier to identify when we want to monitor the progress of long-running queries.
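A minimal sketch of the naming rules, using the `PostCode` table from the earlier listings (behavior as documented for `ROLLBACK TRANSACTION`):

```sql
BEGIN TRANSACTION BaseTran      --the base transaction, named
INSERT INTO PostCode (code) SELECT 'W6 8JB';
BEGIN TRANSACTION InnerTran     --'nested'; @@TRANCOUNT is now 2
INSERT INTO PostCode (code) SELECT 'CM8 3BY';
--ROLLBACK TRANSACTION InnerTran would raise an error:
--only the outermost transaction's name (or a savepoint) is a valid target
ROLLBACK TRANSACTION BaseTran   --undoes both inserts; @@TRANCOUNT returns to 0
```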
We can sort this problem out by using a SavePoint. This will allow us to do quite a bit of what we might have thought was happening anyway by nesting transactions! Savepoints are handy for marking a point in your transaction. We then have the option, later, of rolling back work performed before the current point in the transaction but after a declared savepoint within the same transaction.
```sql
SET XACT_ABORT OFF
DELETE FROM PostCode
DECLARE @Error INT
SELECT @Error = 0
BEGIN TRANSACTION
INSERT INTO PostCode (code) SELECT 'W6 8JB';
INSERT INTO PostCode (code) SELECT 'CM8 3BY';
SAVE TRANSACTION here --create a savepoint called 'here'
INSERT INTO PostCode (code) SELECT 'BY 5JR';
INSERT INTO PostCode (code) SELECT 'PH2 0QA';
ROLLBACK TRANSACTION here --rollback to the savepoint
INSERT INTO PostCode (code) SELECT 'CR 4ZY';
INSERT INTO PostCode (code) SELECT 'G2 9AG';
COMMIT TRANSACTION
GO
SELECT * FROM PostCode;
SELECT @@TRANCOUNT; --to check that the transaction is complete
```

```text
Code
----------
CM8 3BY
CR 4ZY
G2 9AG
W6 8JB
```
Listing 11: Using Savepoints to roll back to a ‘known’ point
When we roll back to a savepoint, only those statements that ran after the savepoint are rolled back. All savepoints that were established later are, of course, lost.
So, if we actually want a rollback within a nested transaction, we can create a savepoint at the start. Then, if a statement within the transaction fails, it is easy to return the data to its state before the transaction began and re-run it. Even better, we can create a transaction and call a series of stored procedures that perform DML. Before each stored procedure, we can create a savepoint. Then, if the procedure fails, it is easy to return the data to its state before it began and re-run the procedure with revised parameters, or set it to perform a recovery action. The downside would be holding a transaction open for too long.
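A sketch of that savepoint-per-procedure pattern, assuming a hypothetical `dbo.ImportStep1` procedure that performs DML:

```sql
BEGIN TRANSACTION
SAVE TRANSACTION BeforeStep1  --mark the state before this step
BEGIN TRY
  EXEC dbo.ImportStep1        --hypothetical DML procedure
END TRY
BEGIN CATCH
  IF XACT_STATE() = 1
    ROLLBACK TRANSACTION BeforeStep1  --undo just this step; the transaction stays open
  ELSE
    ROLLBACK TRANSACTION              --doomed: only a full rollback will do
  --log the error, revise parameters, or run a recovery action here
END CATCH
IF XACT_STATE() = 1 COMMIT TRANSACTION
```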
The Consequences of Errors
In our example, we're dealing mainly with constraint violations, which lead to statement termination. We've contrasted them with errors that lead to batch abortion, and demonstrated that by setting `XACT_ABORT ON`, statement-terminating errors start to behave more like batch-aborting errors. ('Scope-abortion' happens when there is a compile error, 'connection-termination' only happens when something horrible happens, and 'batch-cancellation' only when the client of a session cancels it, or there is a time-out.) All this can be determined from `@@ERROR`, but there is nothing one can do to prevent errors from being passed back to the application. Nothing, that is, unless you use `TRY...CATCH`.
TRY...CATCH Behavior
It is easy to think that all one's troubles are over with `TRY...CATCH`, but in fact one still needs to be aware of other errors.
```sql
SET XACT_ABORT ON
DELETE FROM PostCode
BEGIN TRY
  INSERT INTO PostCode (code) SELECT 'W6 8JB' AS PostCode
  INSERT INTO PostCode (code) SELECT 'CM8 3BY'
  INSERT INTO PostCode (code) SELECT 'CR AZY' --'CR1 4ZY' for a valid one
  INSERT INTO PostCode (code) SELECT 'G2 9AG'
  INSERT INTO PostCode (code) SELECT 'G2 9AG';
END TRY
BEGIN CATCH
  PRINT 'ERROR ' + CONVERT(VARCHAR(8), @@ERROR) + ', ' + ERROR_MESSAGE()
END CATCH;
SELECT * FROM PostCode
```

```text
ERROR 547, The INSERT statement conflicted with the CHECK constraint "CK__PostCode__Code__44FF419A".
The conflict occurred in database "contacts", table "dbo.PostCode", column 'Code'.

(1 row(s) affected)

Code
----------
CM8 3BY
W6 8JB

(2 row(s) affected)
```
Listing 12: TRY…CATCH without a transaction
This behaves the same way whether `XACT_ABORT` is on or off. It catches the first execution error with a severity higher than 10 that does not close the database connection. This means that execution ends after the first error, but there is no automatic rollback of the unit of work defined by the `TRY` block: we must still define a transaction. This works fine for most purposes, though one must beware of the fact that certain errors, such as killed connections or timeouts, don't get caught.
`TRY...CATCH` behavior deals with statement termination but needs extra logic to deal well with batch abortion. In other words, we need to deal with uncommittable, 'doomed', transactions. Here is what happens if we don't do it properly.
```sql
SET XACT_ABORT ON
DELETE FROM PostCode
BEGIN TRANSACTION
SAVE TRANSACTION here --only if SET XACT_ABORT OFF
BEGIN TRY
  INSERT INTO PostCode (code) SELECT 'W6 8JB' AS PostCode
  INSERT INTO PostCode (code) SELECT 'CM8 3BY'
  INSERT INTO PostCode (code) SELECT 'CR AZY' --'CR1 4ZY' for a valid one
  INSERT INTO PostCode (code) SELECT 'G2 9AG'
  INSERT INTO PostCode (code) SELECT 'G2 9AG';
END TRY
BEGIN CATCH
  ROLLBACK TRANSACTION here
  PRINT 'ERROR ' + CONVERT(VARCHAR(8), @@ERROR) + ', ' + ERROR_MESSAGE()
END CATCH;
SELECT * FROM PostCode
```

```text
Msg 3931, Level 16, State 1, Line 16
The current transaction cannot be committed and cannot be rolled back to a savepoint.
Roll back the entire transaction.
```
Listing 13: Mishandled Batch-abort
This error will immediately abort and roll back the batch whatever you do, but `TRY...CATCH` seems to handle the problem awkwardly if you set `XACT_ABORT ON`, and it passes back a warning instead of reporting the error. Any error causes the transaction to be classified as an uncommittable or 'doomed' transaction. The request cannot be committed, or rolled back to a savepoint. Only a full rollback to the start of the base transaction will do. No write operations can happen until it rolls back the transaction, only reads.
If you set `XACT_ABORT` off, then it behaves gracefully, but terminates after the first error it comes across, executing the code in the `CATCH` block.
To get around this, we can use the `XACT_STATE()` function, which will tell you whether SQL Server has determined that the transaction is doomed. Whilst we can use `@@TRANCOUNT` to detect whether the current request has an active user transaction, we cannot use it to determine whether that transaction has been classified as uncommittable. Only `XACT_STATE()` will tell us if the transaction is doomed, and only `@@TRANCOUNT` can be used to determine whether there are nested transactions.
```sql
SET XACT_ABORT OFF
DECLARE @xact_state INT
DELETE FROM PostCode
BEGIN TRANSACTION
SAVE TRANSACTION here --only if SET XACT_ABORT OFF
BEGIN TRY
  INSERT INTO PostCode (code) SELECT 'W6 8JB' AS PostCode
  INSERT INTO PostCode (code) SELECT 'CM8 3BY'
  INSERT INTO PostCode (code) SELECT 'CR 4ZY' --'CR1 4ZY' for a valid one
  INSERT INTO PostCode (code) SELECT 'G2 9AG'
  INSERT INTO PostCode (code) SELECT 'G2 9AG';
END TRY
BEGIN CATCH
  SELECT @xact_state = XACT_STATE()
  IF (@xact_state) = 1 --the transaction is committable
    ROLLBACK TRANSACTION here --just rollback to the savepoint
  ELSE
    ROLLBACK TRANSACTION --back to base, because it's probably doomed
  PRINT CASE WHEN @xact_state = -1 THEN 'Doomed ' ELSE '' END
        + 'Error ' + CONVERT(VARCHAR(8), ERROR_NUMBER())
        + ' on line ' + CONVERT(VARCHAR(8), ERROR_LINE())
        + ', ' + ERROR_MESSAGE()
END CATCH;
IF XACT_STATE() = 1
  COMMIT TRANSACTION --only if this is the base transaction
                     --and only if it hasn't been rolled back
SELECT * FROM PostCode
```
Listing 14: Both Statement-termination and Batch abort handled
Reaching the Goal
So now, we can have reasonable confidence that we have a mechanism that will allow us to import a large number of records and tell us, without triggering errors, which records contain bad data, as defined by our constraints.
Sadly, we are going to do this insertion row-by-row, but you'll see that 10,000 rows takes only around three seconds, so it is worth the wait. We have a temporary table full of 10,000 valid PostCodes, and we'll add in a couple of rogues just to test out what happens.
```sql
SET XACT_ABORT OFF
DELETE FROM PostCode
SET NOCOUNT ON
DECLARE @ii INT, @iiMax INT, @Code VARCHAR(10)
DECLARE @TemporaryStagingTable TABLE
  (Code_ID INT IDENTITY(1,1) PRIMARY KEY,
   Code CHAR(10))
DECLARE @Error TABLE
  (Error_ID INT IDENTITY(1,1) PRIMARY KEY,
   ErrorCode INT,
   PostCode VARCHAR(10),
   TransactionState INT,
   ErrorMessage VARCHAR(255))
INSERT INTO @TemporaryStagingTable (code)
  SELECT code FROM PostCodeData
  UNION ALL SELECT 'W6 8JB'
  UNION ALL SELECT 'CM8 3BY'
  UNION ALL SELECT 'CR AZY'
  UNION ALL SELECT 'G2 9AG'
  UNION ALL SELECT 'G2 9AG'
SELECT @ii = MIN(Code_ID), @iiMax = MAX(Code_ID) FROM @TemporaryStagingTable
WHILE @ii <= @iiMax
  BEGIN
    BEGIN TRY
      SELECT @Code = code FROM @TemporaryStagingTable WHERE Code_ID = @ii
      INSERT INTO PostCode (code) SELECT @Code
    END TRY
    BEGIN CATCH
      INSERT INTO @Error (ErrorCode, PostCode, TransactionState, ErrorMessage)
        SELECT ERROR_NUMBER(), @Code, XACT_STATE(), ERROR_MESSAGE()
    END CATCH;
    SELECT @ii = @ii + 1
  END
SELECT * FROM @Error
```
Listing 15: Insert from a staging table with error reporting, but without rollback on error
...and if you want to roll back the whole import process when you hit an error, you could try this.
```sql
SET XACT_ABORT OFF --to get statement-level rollbacks
DELETE FROM PostCode --teardown last test
SET NOCOUNT ON
DECLARE @ii INT, @iiMax INT, @Code VARCHAR(10)
DECLARE @TemporaryStagingTable TABLE --to help us iterate through
  (Code_ID INT IDENTITY(1,1) PRIMARY KEY,
   Code CHAR(10))
DECLARE @Error TABLE --to collect up all the errors
  (Error_ID INT IDENTITY(1,1) PRIMARY KEY,
   ErrorCode INT,
   PostCode VARCHAR(10),
   TransactionState INT,
   ErrorMessage VARCHAR(255))
INSERT INTO @TemporaryStagingTable (code)
  SELECT code FROM PostCodeData --the good stuff
  UNION ALL SELECT 'W6 8JB'
  UNION ALL SELECT 'CM8 3BY'
  UNION ALL SELECT 'CR AZY' --bad stuff
  UNION ALL SELECT 'G2 9AG'
  UNION ALL SELECT 'G2 9AG' --bad stuff
--get the size of the table
SELECT @ii = MIN(Code_ID), @iiMax = MAX(Code_ID) FROM @TemporaryStagingTable
BEGIN TRANSACTION --start a transaction
SAVE TRANSACTION here --pop in a savepoint since we may already be in a transaction
                      --and we don't want to mess it up
WHILE @ii <= @iiMax AND XACT_STATE() <> -1 --if the whole transaction is doomed
                                           --then you've no option
  BEGIN
    BEGIN TRY
      --get the code first for our error record
      SELECT @Code = code FROM @TemporaryStagingTable WHERE Code_ID = @ii
      INSERT INTO PostCode (code) SELECT @Code --pop it in
    END TRY
    BEGIN CATCH --record the error
      INSERT INTO @Error (ErrorCode, PostCode, TransactionState, ErrorMessage)
        SELECT ERROR_NUMBER(), @Code, XACT_STATE(), ERROR_MESSAGE()
    END CATCH;
    SELECT @ii = @ii + 1
  END
IF EXISTS (SELECT * FROM @Error)
  BEGIN
    IF (XACT_STATE()) = 1 --the transaction is committable
      ROLLBACK TRANSACTION here --just rollback to the savepoint
    ELSE
      ROLLBACK TRANSACTION --we're doomed! Doomed!
    SELECT * FROM @Error;
  END
ELSE COMMIT
```
Listing 16: Insert from a staging table with error reporting and rollback on error
You can comment out the rogue PostCodes or change the XACT_ABORT
settings just to check if it handles batch aborts properly.
Conclusion
To manage transactions properly, and react appropriately to errors fired by constraints, you need to plan carefully. You need to distinguish the various types of errors, and make sure that you react to all of these types appropriately in your code, where it is possible to do so. You need to specify the transaction abort mode you want, and the transaction mode, and you should monitor the transaction level and transaction state.
You should be clear that transactions are never nested, in the sense that the term usually conveys.
Transactions must be short, and used only when necessary. A session must always be cleaned up, even when it times out or is aborted, and one must do as much error reporting as possible when transactions have to be rolled back. DDL changes should be avoided within transactions, so as to avoid locks being placed on system tables.
The application developer should not be forced to become too familiar with SQL Server errors, though some will inevitably require handling within application code. As much as possible, errors, especially moderate ones such as constraint violations or deadlocks, should be handled within the application/database interface.
Once the handling of constraint errors within transactions has been tamed and understood, constraints will prove to be one of the best ways of guaranteeing the integrity of the data within a database.
When you execute a MySQL statement, you may sometimes encounter ERROR 1054 as shown below:
mysql> SELECT user_name FROM users;
ERROR 1054 (42S22): Unknown column 'user_name' in 'field list'
The ERROR 1054 in MySQL occurs because MySQL can’t find the column or field you specified in your statement.
This error can happen when you execute any valid MySQL statements like a SELECT
, INSERT
, UPDATE
, or ALTER TABLE
statement.
This tutorial will help you fix the error by adjusting your SQL statements.
Let’s start with the SELECT
statement.
Fix ERROR 1054 on a SELECT statement
To fix the error in your SELECT
statement, you need to make sure that the column(s) you specified in your SQL statement actually exists in your database table.
Because the error above says that the `user_name` column is unknown, let's check the `users` table and see if the column exists or not.
To help you check the table in question, you can use the DESCRIBE
or EXPLAIN
statement to show your table information.
The example below shows the output of EXPLAIN
statement for the users
table:
mysql> EXPLAIN users;
+--------------+-------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+--------------+-------------+------+-----+---------+-------+
| username | varchar(25) | NO | | | |
| display_name | varchar(50) | NO | | | |
| age | int | YES | | NULL | |
| comments | text | YES | | NULL | |
+--------------+-------------+------+-----+---------+-------+
From the result above, you can see that the users
table has no user_name
field (column).
Instead, it has the username
column without the underscore.
Knowing this, I can adjust my previous SQL query to fix the error:
SELECT username FROM users;
That should fix the error and your SQL query should show the result set.
Fix ERROR 1054 on an INSERT statement
When you specify column names in an INSERT
statement, then the error can be triggered on an INSERT
statement because of a wrong column name, just like in the SELECT
statement.
First, you need to check that you have the right column names in your statement.
Once you are sure, the next step is to look at the VALUES()
you specified in the statement.
For example, when I ran the following statement, I triggered the 1054 error:
mysql> INSERT INTO users(username, display_name)
    -> VALUES ("jackolantern", Jack);
ERROR 1054 (42S22): Unknown column 'Jack' in 'field list'
The column names above are correct, and the error itself comes from the last entry in the VALUES()
function.
The display_name
column is of VARCHAR
type, so MySQL expects you to insert a VARCHAR
value into the column.
But Jack
is not a VARCHAR
value because it’s not enclosed in a quotation mark. MySQL considers the value to be a column name.
To fix the error above, simply add quotation marks around the value. You can use either single or double quotes, as shown below:
INSERT INTO users(username, display_name)
VALUES ("jackolantern", 'Jack');
Now the INSERT
statement should run without any error.
Fix ERROR 1054 on an UPDATE statement
To fix the 1054 error caused by an UPDATE
statement, you need to look into the SET
and WHERE
clauses of your statement and make sure that the column names are all correct.
You can look at the error message that MySQL gave you to identify where the error is happening.
For example, the following SQL statement:
UPDATE users
SET username = "jackfrost", display_name = "Jack Frost"
WHERE user_name = "jackolantern";
Produces the following error:
ERROR 1054 (42S22): Unknown column 'user_name' in 'where clause'
The error clearly points toward the user_name
column in the WHERE
clause, so you only need to change that.
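Assuming the same `users` table as above (whose actual column is `username`), the corrected statement would be:

```sql
UPDATE users
SET username = "jackfrost", display_name = "Jack Frost"
WHERE username = "jackolantern";
```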
If the error points toward the field_list
as shown below:
ERROR 1054 (42S22): Unknown column 'displayname' in 'field list'
Then you need to check on the SET
statement and make sure that:
- You have the right column names
- Any
string
type values are enclosed in a quotation mark
You can also check on the table name that you specified in the UPDATE
statement and make sure that you’re operating on the right table.
Next, let’s look at how to fix the error on an ALTER TABLE
statement.
Fix ERROR 1054 on an ALTER TABLE statement
The error 1054 can also happen on an ALTER TABLE
statement.
For example, the following statement tries to rename the displayname
column to realname
:
ALTER TABLE users
RENAME COLUMN displayname TO realname;
Because there’s no displayname
column name in the table, MySQL will respond with the ERROR 1054 message.
Conclusion
In short, ERROR 1054 means that MySQL can’t find the column name that you specified in your SQL statements.
It doesn’t matter if you’re writing an INSERT
, SELECT
, or UPDATE
statement.
There are only two things you need to check to fix the error:
- Make sure you’ve specified the right column name in your statement
- Make sure that any value of
string
type in your statement is surrounded by a quotation mark
You can check on your table structure using the DESCRIBE
or EXPLAIN
statement to help you match the column name and type with your statement.
And that’s how you fix the MySQL ERROR 1054 caused by your SQL statements.
I hope this tutorial has been useful for you 🙏