Содержание
- «Statement Suspended, Wait Error To Be Cleared» Wait Event (Doc ID 761848.1)
- Applies to:
- Symptoms
- Changes
- Cause
- To view full details, sign in with your My Oracle Support account.
- Don’t have a My Oracle Support account? Click to get started!
- Managing Resumable Space Allocation
- Resumable Space Allocation Overview
- How Resumable Space Allocation Works
- What Operations are Resumable?
- What Errors are Correctable?
- Resumable Space Allocation and Distributed Operations
- Parallel Execution and Resumable Space Allocation
- Enabling and Disabling Resumable Space Allocation
- Setting the RESUMABLE_TIMEOUT Initialization Parameter
- Using ALTER SESSION to Enable and Disable Resumable Space Allocation
- Specifying a Timeout Interval
- Naming Resumable Statements
- Using a LOGON Trigger to Set Default Resumable Mode
- Detecting Suspended Statements
- Notifying Users: The AFTER SUSPEND System Event and Trigger
- Using Views to Obtain Information About Suspended Statements
- Using the DBMS_RESUMABLE Package
- Operation-Suspended Alert
- Resumable Space Allocation Example: Registering an AFTER SUSPEND Trigger
«Statement Suspended, Wait Error To Be Cleared» Wait Event (Doc ID 761848.1)
Last updated on FEBRUARY 02, 2022
Applies to:
Oracle Database Exadata Cloud Machine — Version N/A and later
Oracle Cloud Infrastructure — Database Service — Version N/A and later
Oracle Database Cloud Exadata Service — Version N/A and later
Oracle Database Exadata Express Cloud Service — Version N/A and later
Oracle Database Cloud Schema Service — Version N/A and later
Information in this document applies to any platform.
This problem can occur on any platform.
Oracle Server — Enterprise Edition — Version: 10.1.0.2 to 11.1.0.7
Symptoms
The following wait event is found in AWR reports:
statement suspended, wait error to be cleared.
Messages like the following will be found in alert log:
Example:
statement in resumable session ‘. ‘ was suspended due to
ORA-30036: unable to extend segment by 2 in undo tablespace ‘UNDOTBS’
Changes
Resumable Space Allocation has been enabled.
Cause
To view full details, sign in with your My Oracle Support account.
Don’t have a My Oracle Support account? Click to get started!
In this Document
My Oracle Support provides customers with access to over a million knowledge articles and a vibrant support community of peers and Oracle experts.
Oracle offers a comprehensive and fully integrated stack of cloud applications and platform services. For more information about Oracle (NYSE:ORCL), visit oracle.com. пїЅ Oracle | Contact and Chat | Support | Communities | Connect with us | |
|
| Legal Notices | Terms of Use
Источник
Managing Resumable Space Allocation
Oracle Database provides a means for suspending, and later resuming, the execution of large database operations in the event of space allocation failures. This enables you to take corrective action instead of the Oracle Database server returning an error to the user. After the error condition is corrected, the suspended operation automatically resumes. This feature is called resumable space allocation . The statements that are affected are called resumable statements.
This section contains the following topics:
Resumable Space Allocation Overview
This section provides an overview of resumable space allocation. It describes how resumable space allocation works, and specifically defines qualifying statements and error conditions.
How Resumable Space Allocation Works
The following is an overview of how resumable space allocation works. Details are contained in later sections.
A statement executes in a resumable mode only if its session has been enabled for resumable space allocation by one of the following actions:
The RESUMABLE_TIMEOUT initialization parameter is set to a nonzero value.
The ALTER SESSION ENABLE RESUMABLE statement is issued.
A resumable statement is suspended when one of the following conditions occur (these conditions result in corresponding errors being signalled for non-resumable statements):
Out of space condition
Maximum extents reached condition
Space quota exceeded condition.
When the execution of a resumable statement is suspended, there are mechanisms to perform user supplied operations, log errors, and to query the status of the statement execution. When a resumable statement is suspended the following actions are taken:
The error is reported in the alert log.
The system issues the Resumable Session Suspended alert.
If the user registered a trigger on the AFTER SUSPEND system event, the user trigger is executed. A user supplied PL/SQL procedure can access the error message data using the DBMS_RESUMABLE package and the DBA_ or USER_RESUMABLE view.
Suspending a statement automatically results in suspending the transaction. Thus all transactional resources are held through a statement suspend and resume.
When the error condition is resolved (for example, as a result of user intervention or perhaps sort space released by other queries), the suspended statement automatically resumes execution and the Resumable Session Suspended alert is cleared.
A suspended statement can be forced to throw the exception using the DBMS_RESUMABLE.ABORT() procedure. This procedure can be called by a DBA, or by the user who issued the statement.
A suspension time out interval is associated with resumable statements. A resumable statement that is suspended for the timeout interval (the default is two hours) wakes up and returns the exception to the user.
A resumable statement can be suspended and resumed multiple times during execution.
What Operations are Resumable?
The following operations are resumable:
SELECT statements that run out of temporary space (for sort areas) are candidates for resumable execution. When using OCI, the calls OCIStmtExecute() and OCIStmtFetch() are candidates.
INSERT , UPDATE , and DELETE statements are candidates. The interface used to execute them does not matter; it can be OCI, SQLJ, PL/SQL, or another interface. Also, INSERT INTO. SELECT from external tables can be resumable.
As for SQL*Loader, a command line parameter controls whether statements are resumable after recoverable errors.
The following statements are candidates for resumable execution:
CREATE TABLE . AS SELECT
ALTER INDEX . REBUILD
ALTER TABLE . MOVE PARTITION
ALTER TABLE . SPLIT PARTITION
ALTER INDEX . REBUILD PARTITION
ALTER INDEX . SPLIT PARTITION
CREATE MATERIALIZED VIEW
CREATE MATERIALIZED VIEW LOG
What Errors are Correctable?
There are three classes of correctable errors:
Out of space condition
The operation cannot acquire any more extents for a table/index/temporary segment/undo segment/cluster/LOB/table partition/index partition in a tablespace. For example, the following errors fall in this category:
Maximum extents reached condition
The number of extents in a table/index/temporary segment/undo segment/cluster/LOB/table partition/index partition equals the maximum extents defined on the object. For example, the following errors fall in this category:
Space quota exceeded condition
The user has exceeded his assigned space quota in the tablespace. Specifically, this is noted by the following error:
Resumable Space Allocation and Distributed Operations
In a distributed environment, if a user enables or disables resumable space allocation, or if you, as a DBA, alter the RESUMABLE_TIMEOUT initialization parameter, only the local instance is affected. In a distributed transaction, sessions or remote instances are suspended only if RESUMABLE has been enabled in the remote instance.
Parallel Execution and Resumable Space Allocation
In parallel execution, if one of the parallel execution server processes encounters a correctable error, that server process suspends its execution. Other parallel execution server processes will continue executing their respective tasks, until either they encounter an error or are blocked (directly or indirectly) by the suspended server process. When the correctable error is resolved, the suspended process resumes execution and the parallel operation continues execution. If the suspended operation is terminated, the parallel operation aborts, throwing the error to the user.
Different parallel execution server processes may encounter one or more correctable errors. This may result in firing an AFTER SUSPEND trigger multiple times, in parallel. Also, if a parallel execution server process encounters a non-correctable error while another parallel execution server process is suspended, the suspended statement is immediately aborted.
For parallel execution, every parallel execution coordinator and server process has its own entry in the DBA _ or USER_RESUMABLE view.
Enabling and Disabling Resumable Space Allocation
Resumable space allocation is only possible when statements are executed within a session that has resumable mode enabled. There are two means of enabling and disabling resumable space allocation. You can control it at the system level with the RESUMABLE_TIMEOUT initialization parameter, or users can enable it at the session level using clauses of the ALTER SESSION statement.
Because suspended statements can hold up some system resources, users must be granted the RESUMABLE system privilege before they are allowed to enable resumable space allocation and execute resumable statements.
Setting the RESUMABLE_TIMEOUT Initialization Parameter
You can enable resumable space allocation system wide and specify a timeout interval by setting the RESUMABLE_TIMEOUT initialization parameter. For example, the following setting of the RESUMABLE_TIMEOUT parameter in the initialization parameter file causes all sessions to initially be enabled for resumable space allocation and sets the timeout period to 1 hour:
If this parameter is set to 0, then resumable space allocation is disabled initially for all sessions. This is the default.
You can use the ALTER SYSTEM SET statement to change the value of this parameter at the system level. For example, the following statement will disable resumable space allocation for all sessions:
Within a session, a user can issue the ALTER SESSION SET statement to set the RESUMABLE_TIMEOUT initialization parameter and enable resumable space allocation, change a timeout value, or to disable resumable mode.
Using ALTER SESSION to Enable and Disable Resumable Space Allocation
A user can enable resumable mode for a session, using the following SQL statement:
To disable resumable mode, a user issues the following statement:
The default for a new session is resumable mode disabled, unless the RESUMABLE_TIMEOUT initialization parameter is set to a nonzero value.
The user can also specify a timeout interval, and can provide a name used to identify a resumable statement. These are discussed separately in following sections.
Specifying a Timeout Interval
A timeout period, after which a suspended statement will error if no intervention has taken place, can be specified when resumable mode is enabled. The following statement specifies that resumable transactions will time out and error after 3600 seconds:
The value of TIMEOUT remains in effect until it is changed by another ALTER SESSION ENABLE RESUMABLE statement, it is changed by another means, or the session ends. The default timeout interval when using the ENABLE RESUMABLE TIMEOUT clause to enable resumable mode is 7200 seconds.
«Setting the RESUMABLE_TIMEOUT Initialization Parameter» for other methods of changing the timeout interval for resumable space allocation
Naming Resumable Statements
Resumable statements can be identified by name. The following statement assigns a name to resumable statements:
The NAME value remains in effect until it is changed by another ALTER SESSION ENABLE RESUMABLE statement, or the session ends. The default value for NAME is ‘ User username ( userid ), Session sessionid , Instance instanceid ‘.
The name of the statement is used to identify the resumable statement in the DBA_RESUMABLE and USER_RESUMABLE views.
Using a LOGON Trigger to Set Default Resumable Mode
Another method of setting default resumable mode, other than setting the RESUMABLE_TIMEOUT initialization parameter, is that you can register a database level LOGON trigger to alter a user’s session to enable resumable and set a timeout interval.
If there are multiple triggers registered that change default mode and timeout for resumable statements, the result will be unspecified because Oracle Database does not guarantee the order of trigger invocation.
Detecting Suspended Statements
When a resumable statement is suspended, the error is not raised to the client. In order for corrective action to be taken, Oracle Database provides alternative methods for notifying users of the error and for providing information about the circumstances.
Notifying Users: The AFTER SUSPEND System Event and Trigger
When a resumable statement encounter a correctable error, the system internally generates the AFTER SUSPEND system event. Users can register triggers for this event at both the database and schema level. If a user registers a trigger to handle this system event, the trigger is executed after a SQL statement has been suspended.
SQL statements executed within a AFTER SUSPEND trigger are always non-resumable and are always autonomous. Transactions started within the trigger use the SYSTEM rollback segment. These conditions are imposed to overcome deadlocks and reduce the chance of the trigger experiencing the same error condition as the statement.
Users can use the USER_RESUMABLE or DBA_RESUMABLE views, or the DBMS_RESUMABLE. SPACE_ERROR_INFO function, within triggers to get information about the resumable statements.
Triggers can also call the DBMS_RESUMABLE package to terminate suspended statements and modify resumable timeout values. In the following example, the default system timeout is changed by creating a system wide AFTER SUSPEND trigger that calls DBMS_RESUMABLE to set the timeout to 3 hours:
Oracle Database PL/SQL Language Reference for information about triggers and system events
Using Views to Obtain Information About Suspended Statements
The following views can be queried to obtain information about the status of resumable statements:
View | Description |
---|---|
DBA_RESUMABLE |
These views contain rows for all currently executing or suspended resumable statements. They can be used by a DBA, AFTER SUSPEND trigger, or another session to monitor the progress of, or obtain specific information about, resumable statements. V$SESSION_WAIT When a statement is suspended the session invoking the statement is put into a wait state. A row is inserted into this view for the session with the EVENT column containing «statement suspended, wait error to be cleared».
Oracle Database Reference for specific information about the columns contained in these views
Using the DBMS_RESUMABLE Package
The DBMS_RESUMABLE package helps control resumable space allocation. The following procedures can be invoked:
Procedure | Description |
---|---|
ABORT(sessionID) | This procedure aborts a suspended resumable statement. The parameter sessionID is the session ID in which the statement is executing. For parallel DML/DDL, sessionID is any session ID which participates in the parallel DML/DDL. |
Oracle Database guarantees that the ABORT operation always succeeds. It may be called either inside or outside of the AFTER SUSPEND trigger.
The caller of ABORT must be the owner of the session with sessionID , have ALTER SYSTEM privilege, or have DBA privileges.
GET_SESSION_TIMEOUT(sessionID) This function returns the current timeout value of resumable space allocation for the session with sessionID . This returned timeout is in seconds. If the session does not exist, this function returns -1. SET_SESSION_TIMEOUT(sessionID, timeout) This procedure sets the timeout interval of resumable space allocation for the session with sessionID . The parameter timeout is in seconds. The new timeout setting will applies to the session immediately. If the session does not exist, no action is taken. GET_TIMEOUT() This function returns the current timeout value of resumable space allocation for the current session. The returned value is in seconds. SET_TIMEOUT(timeout) This procedure sets a timeout value for resumable space allocation for the current session. The parameter timeout is in seconds. The new timeout setting applies to the session immediately.
Operation-Suspended Alert
When a resumable session is suspended, an operation-suspended alert is issued on the object that needs allocation of resource for the operation to complete. Once the resource is allocated and the operation completes, the operation-suspended alert is cleared. Please refer to «Managing Tablespace Alerts» for more information on system-generated alerts.
Resumable Space Allocation Example: Registering an AFTER SUSPEND Trigger
In the following example, a system wide AFTER SUSPEND trigger is created and registered as user SYS at the database level. Whenever a resumable statement is suspended in any session, this trigger can have either of two effects:
If an undo segment has reached its space limit, then a message is sent to the DBA and the statement is aborted.
If any other recoverable error has occurred, the timeout interval is reset to 8 hours.
Источник
I was trying to create a very big table through a create table as select from a join between two huge tables forcing both parallel DDL and parallel QUERY and was mainly concerned about the amount of temp space the hash join between those two big tables will require to get my table smoothly created. I said hash join because my first attempt to create this table used a nested loop join which was driving a dramatic 100 million starts of table access by index rowid since more than 10 hours when I decided to kill the underlying session. The create table being a one-shot process I decided to hint the optimizer so that it will opt for a hash join operation instead of the initial nested loop. But my concern has been transferred from a long running non finishing SQL statement to how much my create table will need of TEMP space in order to complete successfully.
I launched the create table and started monitoring it with Tanel Poder snapper script, v$active_session_history and the Real Time SQL Monitoring feature (RTSM). This is what the monitoring was showing
SQL> select event ,count(1) from v$active_session_history where sql_id = '0xhm7700rq6tt' group by event order by 2 desc; EVENT COUNT(1) ---------------------------------------------------------------- ---------- direct path read 3202 direct path write temp 1629 979 db file scattered read 8 CSS initialization 7 CSS operation: query 1
However few minutes later Tanel Poder snapper started showing the following dominant wait event
SQL> @snapper ash 5 1 all ---------------------------------------------------------------------------------------------------- Active% | INST | SQL_ID | SQL_CHILD | EVENT | WAIT_CLASS ---------------------------------------------------------------------------------------------------- 800% | 1 | 0xhm7700rq6tt | 0 | statement suspended, wait error to | Configuration -- End of ASH snap 1, end=2015-02-12 09:12:57, seconds=5, samples_taken=45
Event which keeps incrementing in v$active_session_history as shown below:
EVENT COUNT(1) ---------------------------------------------------------------- ---------- statement suspended, wait error to be cleared 7400 direct path read 3202 direct path write temp 1629 979 db file scattered read 8 CSS initialization 7 CSS operation: query 1 7 rows selected. SQL> / EVENT COUNT(1) ---------------------------------------------------------------- ---------- statement suspended, wait error to be cleared 7440 direct path read 3202 direct path write temp 1629 979 db file scattered read 8 CSS initialization 7 CSS operation: query 1 7 rows selected. SQL> / EVENT COUNT(1) ---------------------------------------------------------------- ---------- statement suspended, wait error to be cleared 7480 direct path read 3202 direct path write temp 1629 979 db file scattered read 8 CSS initialization 7 CSS operation: query 1 EVENT COUNT(1) ---------------------------------------------------------------- ---------- statement suspended, wait error to be cleared 11943 direct path read 3202 direct path write temp 1629 980 db file scattered read 8 CSS initialization 7 CSS operation: query 1
The unique wait event that was incrementing is that bizarre statement suspended, wait error to be cleared. The session from which I launched the create table statment was hanging without reporting any error.
As far as I was initially concerned by the TEMP space I checked the available space there
SQL> select tablespace_name, total_blocks, used_blocks, free_blocks from v$sort_segment; TABLESPACE_NAME TOTAL_BLOCKS USED_BLOCKS FREE_BLOCKS ------------------------------- ------------ ----------- ----------- TEMP 8191744 8191744 0
No available free space left in TEMP tablespace
The RTSM of the corresponding sql_id was showing the following
===================================================================================== | Id | Operation | Name | Rows | Temp | | | | |(Actual) | | ===================================================================================== | 0 | CREATE TABLE STATEMENT | | 0 | | | 1 | PX COORDINATOR | | | | | 2 | PX SEND QC (RANDOM) | :TQ10003 | | | | 3 | LOAD AS SELECT | | | | | 4 | HASH GROUP BY | | | | | 5 | PX RECEIVE | | | | | 6 | PX SEND HASH | :TQ10002 | | | | -> 7 | HASH JOIN | | 0 | 66G | --> spot this | | | | | | | | | | | | | | | | | | | 8 | PX RECEIVE | | 626M | | | 9 | PX SEND BROADCAST | :TQ10001 | 626M | | | 10 | HASH JOIN | | 78M | | | 11 | PX RECEIVE | | 372K | | | 12 | PX SEND BROADCAST | :TQ10000 | 372K | | | 13 | PX BLOCK ITERATOR | | 46481 | | | 14 | INDEX FAST FULL SCAN | INDX_12453656_TABL | 46481 | | | 15 | PX BLOCK ITERATOR | | 88M | | | 16 | TABLE ACCESS FULL | TABL1 | 88M | | | | | | | | | -> 17 | PX BLOCK ITERATOR | | 682M | | | -> 18 | TABLE ACCESS FULL | TABLE123456 | 682M | | | | | | | | =====================================================================================
The HASH JOIN operation at line 7 has already gone above the available TEMP tablespace size which is 64G. I was wondering then, why my session has not been stopped with an ORA-01652 error instead of hanging there and letting the wait event statement suspended, wait error to be cleared popping up in my different monitoring tools.
After few minutes of googling tuning steps I ended up by executing the following query
SQL>select coord_session_id ,substr(sql_text, 1,10) ,error_msg from dba_resumable; SESSION_ID SQL ERROR_MSG ----------- --------- ------------------------------------------------------------------- 146 create /*+ ORA-01652: unable to extend temp segment by 128 in tablespace TEMP
This 146 SID is my create table session! It is in error but hanging and waiting on that statement suspend wait event.
After few minutes of googling tuning again I ended up by checking the following parameter:
SQL> sho parameter resumable NAME TYPE VALUE ------------------------------ -------------------------------- ------ resumable_timeout integer 9000
Notice now how I finally have made a link between the title and the purpose of this article: resumable_timeout. The Oracle documentation states that when this parameter is set to a nonzero value the resumable space allocation might kicks in.
Put is simply what happens in my case is that since this parameter has been set to 9000 minutes and since my create table as select has encountered an ORA-01652 temp tablespace error, Oracle decided to put my session in a resumable state until someone will point out that space shortage,correct the situation and allow the session to continue its normal processing
The next step I did is to abort my session from its resumable state
SQL> exec dbms_resumable.abort(146); PL/SQL procedure successfully completed.
Unfortunately I waited a couple of minutes in front of my sqlplus session waiting for it to resume without success so that I decided to use the brute force
ERROR: ORA-03113: end-of-file on communication channel Process ID: 690975 Session ID: 146 Serial number: 3837
Summary
I would suggest to do not set the resumable_timeout parameter to a nonzero value. Otherwise you will be hanging on a statement suspended, wait error to be cleared event during an amount of time indicated by this resumable_timeout parameter looking around for what is making your SQL statement hanging.
Footnote
Here below few extra words from Oracle documentation about Resumable space allocation
“Resumable space allocation avoids headaches and saves time by suspending, instead
of terminating, a large database operation requiring more disk space than is currently
available. While the operation is suspended, you can allocate more disk space on
the destination tablespace or increase the quota for the user. Once the low space
condition is addressed, the large database operation automatically picks up where it
left off.
As you might expect, the statements that resume are known as resumable statements.
The suspended statement, if it is part of a transaction, also suspends the transaction.
When disk space becomes available and the suspended statement resumes, the
transaction can be committed or rolled back whether or not any statements in
the transactions were suspended. The following conditions can trigger resumable
space allocation:
■ Out of disk space in a permanent or temporary tablespace
■ Maximum extents reached on a tablespace
■ User space quota exceeded
You can also control how long a statement can be suspended. The default time
interval is two hours, at which point the statement fails and returns an error message
to the user or application as if the statement was not suspended at all.
There are four general categories of commands that can be resumable:
(1) SELECT statements, (2) DML commands, (3) SQL*Loader operations, and
(4) DDL statements that allocate disk space.”
Managing Resumable Space Allocation — Oracle Administration Workshop 2
A resumable statement:
- Enables you to suspend large operations instead of receiving an error
- Gives you a chance to fix the problem while the operation is suspended, rather than starting over
- Is suspended for the following conditions:
- Out of space
- Maximum extents reached
- Space quota exceeded
The Oracle database provides a means for suspending, and later resuming, the execution of large
database operations in the event of space allocation failures. This enables you to take corrective
action instead of the Oracle database server returning an error to the user. After the error
condition is corrected, the suspended operation automatically resumes. This feature is called
“resumable space allocation.” The statements that are affected are called “resumable
statements.”
A statement executes in resumable mode only when the resumable statement feature has been
enabled for the system or session.
Suspending a statement automatically results in suspending the transaction. Thus all
transactional resources are held through the suspension and resuming of a SQL statement. When
the error condition disappears (for example, as a result of user intervention or perhaps sort space
released by other queries), the suspended statement automatically resumes execution.
A suspension time-out interval is associated with resumable statements. A resumable statement
that is suspended for the time-out interval (the default is 7,200 seconds (2 hours)) reactivates
itself and returns the exception to the user. A resumable statement can be suspended and
resumed multiple times during execution.
Using Resumable Space Allocation
- Queries, DML operations, and certain DDL operations can be resumed if they encounter an
out-of-space error. - A resumable statement can be issued through SQL, PL/SQL, SQL*Loader, or the Oracle Call
Interface (OCI). - To run a resumable statement, you must first enable resumable statements for your session.
Using After Suspend Trigger To Automatically Resolve User Error’s
- First Create A Procedure as shown Below.
- Then create the after suspend trigger as shown in the example below.
Example:
- An INSERT statement encounters an error saying the table is full.
- The INSERT statement is suspended, and no error is passed to client.
- Optionally, an AFTER SUSPEND trigger is executed.
- Optionally, the SQLERRROR exception is activated to abort the statement.
- If the statement is not aborted and free space is successfully added to the table, the INSERT statement resumes execution.
Detecting a Suspended Statement
When a resumable statement is suspended, the error is not raised to the client. In order for
corrective action to be taken, the Oracle database provides alternative methods for notifying
users of the error and for providing information about the circumstances.
Possible Actions During Suspension
When a resumable statement encounters a correctable error, the system internally generates the
AFTER SUSPEND system event. Users can register triggers for this event at both the database
and schema level. If a user registers a trigger to handle this system event, the trigger is executed
after a SQL statement has been suspended. SQL statements executed within an AFTER
SUSPEND trigger are always nonresumable and are always autonomous. Transactions started
within the trigger use the SYSTEM rollback segment. These conditions are imposed to overcome
deadlocks and reduce the chance of the trigger experiencing the same error condition as the
statement.
Within the trigger code, you can use the USER_RESUMABLE or DBA_RESUMABLE views, or
the DBMS_RESUMABLE.SPACE_ERROR_INFO function to get information about the
resumable statements.
When a resumable statement is suspended:
- The session invoking the statement is put into a wait state. A row is inserted into
V$SESSION_WAIT for the session with the EVENT column containing “statement
suspended, wait error to be cleared”. - An operation-suspended alert is issued on the object that needs addition resources for the
suspended statement to complete.
Ending a Suspended Statement
When the error condition is resolved (for example, as a result of DBA intervention or perhaps
sort space released by other queries), the suspended statement automatically resumes execution
and the “resumable session suspended” alert is cleared.
A suspended statement can be forced to activate the SERVERERROR exception by using the
DBMS_RESUMABLE.ABORT() procedure. This procedure can be called by a DBA, or by the
user who issued the statement. If the suspension time-out interval associated with the resumable
statement is reached, the statement aborts automatically and an error is returned to the user.
Popular posts from this blog
Oracle Architecture — Explained In Detailed — Administration I
Figures shown in these notes are from Oracle® Database Concepts 11 g Release 2 Objectives These notes introduce the Oracle server architecture. The architecture includes physical components, memory components, processes, and logical structures. Primary Architecture Components The figure shown above details the Oracle architecture. Oracle server : An Oracle server includes an Oracle Instance and an Oracle database . · An Oracle database includes several different types of files: datafiles, control files, redo log files and archive redo log files. The Oracle server also accesses parameter files and password files. · This set of files has several purposes. o One is to enable system users to process SQL statements. o Another is to improve system performance. o Still another is to ensure the database can be recovered if there is a software/hardware failure. · The database server must manage large amo
Oracle Database SQL — Practise — Question — Scott Schema Examples
Oracle SQL Practice Questions for Students All Question are Based on Scott Schema Exhibits EMP TABLE Exhibits DEPT TABLE Exhibits SALGRADE TABLE Question 1) Display all the records in emp table? 2) Display all the records in emp table where employee belongs to deptno 10? 3) Display all the records in emp table where employee does not belong to deptno 30? 4) Display total number of records in Emp table? 5) Display emp table with salary descending order? 6) Display first five records in employee table? 7) Display all the records in emp table order by ascending deptno, descending salary? Display all employees those who were joined in year 1981? 9) Display COMM in emp table. Display zero in place of null. 10) Display the records in emp table where MGR in 7698,7566 and sal should be greater then 1500 11) Display all employees where employees hired before 01-JAN-1981 12) Display all employees with
Step by Step — How to resolve redo log file corruption using ALTER DATABASE CLEAR UNARCHIVED LOGFILE command
Connect to sys using sqlplus in admin command prompt Check whether your database is in archive log mode or not. Run command ARCHIVE LOG LIST. It will show you the following details: It is always preferable to have your database in archive log mode. The reason being if your database is in archive log mode you can recover from all committed changes in the event of an OS or disk failure. This query will display the path of redo log files, their group and their status. It is always recommended to have a minimum of two members in one group. As you can see I have only 1 member in group 6 whose current status is INACTIVE. I intentionally have 1 member to generate a scenario for the sake of this practical. Now go to the specified path where the redo log member of group 6 is located and open it in a notepad. It will show some symbolic data. Delete couple of lines and save it the original location which is the one we got as a result of query. Why did we do the above step i.
There are two questions about temporary space that appear fairly regularly on the various Oracle forums. One is of the form:
From time to time my temporary tablespace grows enormously (and has to be shrunk), how do I find what’s making this happen?
The other follows the more basic pattern:
My process sometimes crashes with Oracle error: “ORA-01652: unable to extend temp segment by %n in tablespace %s” how do I stop this happening?
Before moving on to the topic of the blog, it’s worth pointing out two things about the second question:
- First, it’s too easy to get stuck at the word temp and leap to the conclusion that the problem is about the temporary tablespace without noticing that the error message includes the specific tablespace that’s raised the problem. If, for example, you rebuild an index in a nominated tablespace Oracle first creates the index as a temporary segment (with a name like {starting_file_number}.{starting_block_number}) in that tablespace then renames it to match the original index name once the rebuild is complete and drops the old index.
- Secondly a process that raises ORA-01652 isn’t necessarily the guilty party – it may be the victim of some other process hogging all the available space when it shouldn’t. Moreover that other process may have completed and released its space by the time you start looking for the problem – causing extra confusion because your process seems to have crashed without a cause. Taking my example of an index rebuild – your index rebuild may fail because someone else was rebuilding a different index at the same time in the same tablespace; but when you check the tablespace all the space from their original index is now free as their rebuild completed in the interim.
So, before you start chasing something that you think is a problem with your code, pause a moment to double-check the error message and think about whether you could have been the victim of some concurrent, but now complete, activity.
I’ve listed the two questions as variants on the same theme because the workaround to one of them introduces the risk of the other – if you want to avoid ORA-01652 you could make all your data files and temp files “autoextensible”, but then there may be occasions when they extend far too much and you need to shrink them down again (and that’s not necessarily easy if it’s not the temporary tablespace). Conversely, if you think your data or temp files randomly explode to ludicrous sizes you could decide on a maximum size for your files and disable autoextension – then handle the complaints when a user reports an ORA-01652.
There are various ways you could monitor your system in near real time to spot the threat as it builds, of course; and there are various ways to identify potentially guilty SQL after the event. You could keep an eye on various v$ dynamic performance views or dba_ administrative views to try and intercept a problem; you could set event 1652 to dump an errorstack (or even systemstate) for post-crash analysis to see what that reported. Neither is an ideal solution – one requires you to pay excessive attention to the system, the other is designed to let the problem happen then leave you to clean up afterwards. There is, however, a strategy that may stop the problem from appearing without requiring constant monitoring. The strategy is to enable (selectively) resumable operations.
If a resumable operation needs to allocate space but is unable to do so – i.e. it would normally be about to raise ORA-01652 – it will suspend itself for a while going into the wait state “statement suspended, wait error to be cleared” which will show up as the event in v$session_wait, timing out every 2 seconds The session will also be reporting its current action in the view v$resumable or, for slightly more information, dba_resumable. As it suspends the session will also write a message to the alert log but you can also create an “after suspend” database trigger to alert you that a problem has occurred.
If you set the resumable timeout to a suitable value then you may find:
- the problem goes away of its own accord and the session resumes before the timeout is reached
or
- you receive a warning and have some time to identify the source of the problem and take the minimum action needed to allow the session to resume
Implementation
The parameter resumable_timeout is a general control for resumable sessions if you don’t handle the feature at a more granular level than the system.
By default this parameter is set to zero which translates into a default value of 7,200 seconds but that default doesn’t come into effect unless a session declares itself resumable. If you set the parameter to a non-zero value all session will automatically be operating as resumable sessions – and you’ll soon hear why you don’t want to do that.
The second enabling feature for resumable sessions is the resumable privilege – a session can’t control it’s own resumability unless the schema has been granted the resumable privilege – which may be granted through a role. If a session has the privilege it may set its own resumable_timeout, even if the system value is zero.
Assume we have set resumable_timeout to 10 (seconds) through the instance parameter file and restarted the instance. If we now issue (for example) the following ‘create table’ statement:
create table t1 (n1, v1 ) pctfree 90 pctused 10 tablespace tiny as select rownum, cast(lpad('x',800) as varchar2(1000)) from all_objects where rownum <= 20000 /
This will attempt to allocate 1 row per block for 20,000 blocks (plus about 1.5% for bitmap space management blocks) – and tablespace tiny lives up (or down) to its name, consisting of a single file of only 10,000 Oracle blocks. Shortly after starting, the session will hit Oracle error “ORA-01652: unable to extend temp segment by 128 in tablespace TINY”, but it won’t report it; instead it will suspend itself for 10 seconds before failing and reporting the error. This will happen whether or not the session has the resumable privilege – in this case the behaviour is dictated by our setting the system parameter. If you look in the alert log after the session finally errors out you will find text like the following:
2019-10-04T14:01:11.847943+01:00 ORCL(3):ORA-1652: unable to extend temp segment by 128 in tablespace TINY [ORCL] ORCL(3):statement in resumable session 'User TEST_USER(138), Session 373, Instance 1' was suspended due to ORCL(3): ORA-01652: unable to extend temp segment by 128 in tablespace TINY 2019-10-04T14:01:23.957586+01:00 ORCL(3):statement in resumable session 'User TEST_USER(138), Session 373, Instance 1' was timed out
Note that there’s a 10 (plus a couple) second gap between the point where the session reports that it is suspending itself and the point where it fails with a timeout. The two-extra seconds appear because the session polls every 2 seconds to see whether the problem is still present or whether it has spontaneously disappeared so allowing the session to resume.
Let’s change the game slightly; let’s try to create the table again, but this time execute the following statement first:
alter session enable resumable timeout 60 name 'Help I''m stuck';
The initial response to this will be Oracle error “ORA-01031: insufficient privileges” because the session doesn’t have the resumable privilege, but after granting resumable to the user (or a relevant role) we try again and find we will be allowed a little extra time before the CTAS times out. Our session now overrides the system timeout and will wait 60 seconds (plus a bit) before failing.The “timeout” clause is optional and if we omit it the session will use the system value, similarly the “name” clause is optional though there’s no default for it, it’s just a message that will get into various views and reports.
There are several things you might check in this 60 second grace period. The session wait history will confirm that your session has been timing out every two seconds (as will the active session history if you’re licensed to use it):
select seq#, event, wait_time from v$session_wait_history where sid = 373 SEQ# EVENT WAIT_TIME ---------- ---------------------------------------------------------------- ---------- 1 statement suspended, wait error to be cleared 204 2 statement suspended, wait error to be cleared 201 3 statement suspended, wait error to be cleared 201 4 statement suspended, wait error to be cleared 201 5 statement suspended, wait error to be cleared 200 6 statement suspended, wait error to be cleared 200 7 statement suspended, wait error to be cleared 202 8 statement suspended, wait error to be cleared 200 9 statement suspended, wait error to be cleared 200 10 statement suspended, wait error to be cleared 200
Then there’s a special dynamic performance view, v$resumable which I’ve reported below using a print_table() procedure that Tom Kyte wrote many, many years ago to report rows in a column format:
SQL> set serveroutput on SQL> execute print_table('select * from v$resumable where sid = 373') ADDR : 0000000074515B10 SID : 373 ENABLED : YES STATUS : SUSPENDED TIMEOUT : 60 SUSPEND_TIME : 10/04/19 14:26:20 RESUME_TIME : NAME : Help I'm stuck ERROR_NUMBER : 1652 ERROR_PARAMETER1 : 128 ERROR_PARAMETER2 : TINY ERROR_PARAMETER3 : ERROR_PARAMETER4 : ERROR_PARAMETER5 : ERROR_MSG : ORA-01652: unable to extend temp segment by 128 in tablespace TINY CON_ID : 0 ----------------- 1 rows selected
Notice how the name column reports the name I supplied when I enabled the resumable session. The view also tells us when the critical statement was suspended and how long it is prepared to wait (in total) – leaving us to work out from the current time how much time we have left to work around the problem.
There’s also a dba_resumable variant of the view which is slightly more informative (though the sample below is not consistent with the one above because I ran the CTAS several times, editing the blog as I did so):
SQL> execute print_table('select * from dba_resumable where session_id = 373') USER_ID : 138 SESSION_ID : 373 INSTANCE_ID : 1 COORD_INSTANCE_ID : COORD_SESSION_ID : STATUS : SUSPENDED TIMEOUT : 60 START_TIME : 10/04/19 14:21:14 SUSPEND_TIME : 10/04/19 14:21:16 RESUME_TIME : NAME : Help I'm stuck SQL_TEXT : create table t1 (n1, v1 ) pctfree 90 pctused 10 tablespace tiny as select rownum, cast(lpad('x',800) as varchar2(1000)) from all_objects where rownum <= 20000 ERROR_NUMBER : 1652 ERROR_PARAMETER1 : 128 ERROR_PARAMETER2 : TINY ERROR_PARAMETER3 : ERROR_PARAMETER4 : ERROR_PARAMETER5 : ERROR_MSG : ORA-01652: unable to extend temp segment by 128 in tablespace TINY ----------------- 1 rows selected
This view includes the text of the statement that has been suspended and shows us when it started running (so that we can decide whether we really want to rescue it, or might be happy to kill it to allow some other suspended session to resume).
If you look at the alert log in this case you’ll see that the name has been reported there instead of the user, session and instance – which means you might want to think carefully about how you use the name option:
2019-10-04T14:21:16.151839+01:00 ORCL(3):statement in resumable session 'Help I'm stuck' was suspended due to ORCL(3): ORA-01652: unable to extend temp segment by 128 in tablespace TINY 2019-10-04T14:22:18.655808+01:00 ORCL(3):statement in resumable session 'Help I'm stuck' was timed out
Once your resumable task has completed (or timed out and failed) you can stop the session from being resumable with the command:
alter session disable resumable;
And it’s important that every time you enable resumability you should disable it as soon as the capability is no longer needed. Also, be careful about when you enable it, don’t be tempted to make every session resumable. Use it only for really important cases. Once a session is resumable virtually everything that goes on in that session is deemed to be resumable, and this has side effects.
The first side effect that may spring to mind is the impact of the view v$resumable – it’s a memory structure in the SGA so that everyone can see it and all the resumable sessions can populate and update it. That means there’s got to be some latch (or mutex) protection going on – and if you look at v$latch you’ll discover that there;s just a single (child) latch doing the job, so resumability can introduce a point of contention. Here’s a simple script (using my “start_XXX” strategy to “select 1 from dual;” one thousand times, with calls to check the latch activity:
set termout off set serveroutput off execute snap_latch.start_snap @start_1000 set termout on set serveroutput on execute snap_latch.end_snap(750)
And here are the results of running the script – reporting only the latches with more than 750 gets in the interval – first without and then with a resumable session:
--------------------------------- Latch waits:- 04-Oct 15:04:31 Lower limit:- 750 --------------------------------- Latch Gets Misses Sp_Get Sleeps Im_Gets Im_Miss Holding Woken Time ms ----- ---- ------ ------ ------ ------- ------- ------- ----- ------- session idle bit 6,011 0 0 0 0 0 0 0 .0 enqueue hash chains 2,453 0 0 0 0 0 0 0 .0 enqueue freelist latch 1 0 0 0 2,420 0 0 0 .0 JS queue state obj latch 1,176 0 0 0 0 0 0 0 .0 SQL> alter session enable resumable; SQL> @test --------------------------------- Latch waits:- 04-Oct 15:04:46 Lower limit:- 750 --------------------------------- Latch Gets Misses Sp_Get Sleeps Im_Gets Im_Miss Holding Woken Time ms ----- ---- ------ ------ ------ ------- ------- ------- ----- ------- session idle bit 6,011 0 0 0 0 0 0 0 .0 enqueue hash chains 2,623 0 0 0 0 0 0 0 .0 enqueue freelist latch 1 0 0 0 2,588 0 0 0 .0 resumable state object 3,005 0 0 0 0 0 0 0 .0 JS queue state obj latch 1,260 0 0 0 0 0 0 0 .0 PL/SQL procedure successfully completed. SQL> alter session disable resumable;
That’s 1,000 selects from dual – 3,000 latch gets on a single child latch. It looks like every call to the database results in a latch get and an update to the memory structure. (Note: You wouldn’t see the same effect if you ran a loop inside an anonymous PL/SQL block since the block would be the single database call).
For other side effects with resumability think about what else is going on around your session. If you allow a session to suspend for (say) 3600 seconds and it manages to resume just in time to avoid a timeout it now has 3,600 seconds of database changes to unwind if it’s trying to produce a read-consistent result; so not only do you have to allow for increasing the size of the undo tablespace and increasing the undo retention time, you have to allow for the fact that when the process resumes it may run much more slowly than usual because it spends more of its time trying to see the data as it was before it suspended, which may require far more single block reads of the undo tablespace – and the session may then crash anyway with an Oracle error ORA-01555 (which is so well-known that I won’t quote the text).
In the same vein – if a process acquires a huge amount of space in the temporary tablespace (in particular) and fails instantly because it can’t get any more space it normally crashes and releases the space. If you allow that process to suspend for an hour it’s going to hold onto that space – which means other processes that used to run safely may now crash because they find there’s no free space left for them in the temporary tablespace.
Be very cautious when you introduce resumable sessions – you need to understand the global impact, not just the potential benefit to your session.
Getting Alerts
Apart from the (passive) views telling you that a session has suspended it’s also possible to get some form of (active) alert when the event happens. There’s an “after suspend” event that you can use to create a database trigger to take some defensive action, e.g.:
create or replace trigger call_for_help after suspend on test_user.schema begin if sysdate between trunc(sysdate) and trunc(sysdate) + 3/24 then null; -- use utl_mail, utl_smtp et. al. to page the DBA end if; end; /
This trigger is restricted to the test_user schema, and (code not included) sends a message to the DBA’s pager only between the hours of midnight and 3:00 a.m. Apart from the usual functions in dbms_standard that returnn error codes, names of objects and so on you might want to take a look at the dbms_resumable package for the “helper” functions and procedures it supplies.
For further information on resumable sessions here’s a link to the 12.2 manual to get you started.
Oracle Database provides a means for suspending, and later resuming, the execution of large database operations in the event of space allocation failures. Therefore, you can take corrective action instead of the Oracle Database server returning an error to the user. After the error condition is corrected, the suspended operation automatically resumes. This feature is called resumable space allocation. The statements that are affected are called resumable statements.
This section contains the following topics:
-
Resumable Space Allocation Overview
-
Enabling and Disabling Resumable Space Allocation
-
Detecting Suspended Statements
-
Operation-Suspended Alert
-
Resumable Space Allocation Example: Registering an AFTER SUSPEND Trigger
Resumable Space Allocation Overview
This section provides an overview of resumable space allocation. It describes how resumable space allocation works, and specifically defines qualifying statements and error conditions.
How Resumable Space Allocation Works
The following is an overview of how resumable space allocation works. Details are contained in later sections.
-
A statement executes in a resumable mode only if its session has been enabled for resumable space allocation by one of the following actions:
-
The
RESUMABLE_TIMEOUT
initialization parameter is set to a nonzero value. -
The
ALTER SESSION ENABLE RESUMABLE
statement is issued.
-
-
A resumable statement is suspended when one of the following conditions occur (these conditions result in corresponding errors being signalled for non-resumable statements):
-
Out of space condition
-
Maximum extents reached condition
-
Space quota exceeded condition.
-
-
When the execution of a resumable statement is suspended, there are mechanisms to perform user supplied operations, log errors, and to query the status of the statement execution. When a resumable statement is suspended the following actions are taken:
-
The error is reported in the alert log.
-
The system issues the Resumable Session Suspended alert.
-
If the user registered a trigger on the
AFTER SUSPEND
system event, the user trigger is executed. A user supplied PL/SQL procedure can access the error message data using theDBMS_RESUMABLE
package and theDBA_
orUSER_RESUMABLE
view.
-
-
Suspending a statement automatically results in suspending the transaction. Thus all transactional resources are held through a statement suspend and resume.
-
When the error condition is resolved (for example, as a result of user intervention or perhaps sort space released by other queries), the suspended statement automatically resumes execution and the Resumable Session Suspended alert is cleared.
-
A suspended statement can be forced to throw the exception using the
DBMS_RESUMABLE.ABORT()
procedure. This procedure can be called by a DBA, or by the user who issued the statement. -
A suspension time out interval is associated with resumable statements. A resumable statement that is suspended for the timeout interval (the default is two hours) wakes up and returns the exception to the user.
-
A resumable statement can be suspended and resumed multiple times during execution.
What Operations are Resumable?
The following operations are resumable:
-
Queries
SELECT
statements that run out of temporary space (for sort areas) are candidates for resumable execution. When using OCI, the callsOCIStmtExecute()
andOCIStmtFetch()
are candidates. -
DML
INSERT
,UPDATE
, andDELETE
statements are candidates. The interface used to execute them does not matter; it can be OCI, SQLJ, PL/SQL, or another interface. Also,INSERT INTO...SELECT
from external tables can be resumable. -
Import/Export
As for SQL*Loader, a command line parameter controls whether statements are resumable after recoverable errors.
-
DDL
The following statements are candidates for resumable execution:
-
CREATE
TABLE
…AS
SELECT
-
CREATE
INDEX
-
ALTER
INDEX
…REBUILD
-
ALTER
TABLE
…MOVE
PARTITION
-
ALTER
TABLE
…SPLIT
PARTITION
-
ALTER
INDEX
…REBUILD
PARTITION
-
ALTER
INDEX
…SPLIT
PARTITION
-
CREATE
MATERIALIZED
VIEW
-
CREATE
MATERIALIZED
VIEW
LOG
-
What Errors are Correctable?
There are three classes of correctable errors:
-
Out of space condition
The operation cannot acquire any more extents for a table/index/temporary segment/undo segment/cluster/LOB/table partition/index partition in a tablespace. For example, the following errors fall in this category:
ORA-01653 unable to extend table ... in tablespace ... ORA-01654 unable to extend index ... in tablespace ...
-
Maximum extents reached condition
The number of extents in a table/index/temporary segment/undo segment/cluster/LOB/table partition/index partition equals the maximum extents defined on the object. For example, the following errors fall in this category:
ORA-01631 max # extents ... reached in table ... ORA-01632 max # extents ... reached in index ...
-
Space quota exceeded condition
The user has exceeded his assigned space quota in the tablespace. Specifically, this is noted by the following error:
ORA-01536 space quote exceeded for tablespace string
Resumable Space Allocation and Distributed Operations
In a distributed environment, if a user enables or disables resumable space allocation, or if you, as a DBA, alter the RESUMABLE_TIMEOUT
initialization parameter, only the local instance is affected. In a distributed transaction, sessions or remote instances are suspended only if RESUMABLE
has been enabled in the remote instance.
Parallel Execution and Resumable Space Allocation
In parallel execution, if one of the parallel execution server processes encounters a correctable error, that server process suspends its execution. Other parallel execution server processes will continue executing their respective tasks, until either they encounter an error or are blocked (directly or indirectly) by the suspended server process. When the correctable error is resolved, the suspended process resumes execution and the parallel operation continues execution. If the suspended operation is terminated, the parallel operation aborts, throwing the error to the user.
Different parallel execution server processes may encounter one or more correctable errors. This may result in firing an AFTER SUSPEND
trigger multiple times, in parallel. Also, if a parallel execution server process encounters a non-correctable error while another parallel execution server process is suspended, the suspended statement is immediately aborted.
For parallel execution, every parallel execution coordinator and server process has its own entry in the DBA
_ or USER_RESUMABLE
view.
Enabling and Disabling Resumable Space Allocation
Resumable space allocation is only possible when statements are executed within a session that has resumable mode enabled. There are two means of enabling and disabling resumable space allocation. You can control it at the system level with the RESUMABLE_TIMEOUT
initialization parameter, or users can enable it at the session level using clauses of the ALTER SESSION
statement.
Note:
Because suspended statements can hold up some system resources, users must be granted the RESUMABLE
system privilege before they are allowed to enable resumable space allocation and execute resumable statements.
Setting the RESUMABLE_TIMEOUT Initialization Parameter
You can enable resumable space allocation system wide and specify a timeout interval by setting the RESUMABLE_TIMEOUT
initialization parameter. For example, the following setting of the RESUMABLE_TIMEOUT
parameter in the initialization parameter file causes all sessions to initially be enabled for resumable space allocation and sets the timeout period to 1 hour:
RESUMABLE_TIMEOUT = 3600
If this parameter is set to 0, then resumable space allocation is disabled initially for all sessions. This is the default.
You can use the ALTER SYSTEM SET
statement to change the value of this parameter at the system level. For example, the following statement will disable resumable space allocation for all sessions:
ALTER SYSTEM SET RESUMABLE_TIMEOUT=0;
Within a session, a user can issue the ALTER SESSION SET
statement to set the RESUMABLE_TIMEOUT
initialization parameter and enable resumable space allocation, change a timeout value, or to disable resumable mode.
Using ALTER SESSION to Enable and Disable Resumable Space Allocation
A user can enable resumable mode for a session, using the following SQL statement:
ALTER SESSION ENABLE RESUMABLE;
To disable resumable mode, a user issues the following statement:
ALTER SESSION DISABLE RESUMABLE;
The default for a new session is resumable mode disabled, unless the RESUMABLE_TIMEOUT
initialization parameter is set to a nonzero value.
The user can also specify a timeout interval, and can provide a name used to identify a resumable statement. These are discussed separately in following sections.
Specifying a Timeout Interval
A timeout period, after which a suspended statement will error if no intervention has taken place, can be specified when resumable mode is enabled. The following statement specifies that resumable transactions will time out and error after 3600 seconds:
ALTER SESSION ENABLE RESUMABLE TIMEOUT 3600;
The value of TIMEOUT
remains in effect until it is changed by another ALTER SESSION ENABLE RESUMABLE
statement, it is changed by another means, or the session ends. The default timeout interval when using the ENABLE RESUMABLE TIMEOUT
clause to enable resumable mode is 7200 seconds.
See Also:
«Setting the RESUMABLE_TIMEOUT Initialization Parameter» for other methods of changing the timeout interval for resumable space allocation
Naming Resumable Statements
Resumable statements can be identified by name. The following statement assigns a name to resumable statements:
ALTER SESSION ENABLE RESUMABLE TIMEOUT 3600 NAME 'insert into table';
The NAME
value remains in effect until it is changed by another ALTER SESSION ENABLE RESUMABLE
statement, or the session ends. The default value for NAME
is ‘User
username
(userid
), Session
sessionid
, Instance
instanceid
‘.
The name of the statement is used to identify the resumable statement in the DBA_RESUMABLE
and USER_RESUMABLE
views.
Using a LOGON Trigger to Set Default Resumable Mode
Another method of setting default resumable mode, other than setting the RESUMABLE_TIMEOUT
initialization parameter, is that you can register a database level LOGON
trigger to alter a user’s session to enable resumable and set a timeout interval.
Note:
If there are multiple triggers registered that change default mode and timeout for resumable statements, the result will be unspecified because Oracle Database does not guarantee the order of trigger invocation.
Detecting Suspended Statements
When a resumable statement is suspended, the error is not raised to the client. In order for corrective action to be taken, Oracle Database provides alternative methods for notifying users of the error and for providing information about the circumstances.
Notifying Users: The AFTER SUSPEND System Event and Trigger
When a resumable statement encounter a correctable error, the system internally generates the AFTER SUSPEND
system event. Users can register triggers for this event at both the database and schema level. If a user registers a trigger to handle this system event, the trigger is executed after a SQL statement has been suspended.
SQL statements executed within a AFTER SUSPEND
trigger are always non-resumable and are always autonomous. Transactions started within the trigger use the SYSTEM
rollback segment. These conditions are imposed to overcome deadlocks and reduce the chance of the trigger experiencing the same error condition as the statement.
Users can use the USER_RESUMABLE
or DBA_RESUMABLE
views, or the DBMS_RESUMABLE.SPACE_ERROR_INFO
function, within triggers to get information about the resumable statements.
Triggers can also call the DBMS_RESUMABLE
package to terminate suspended statements and modify resumable timeout values. In the following example, the default system timeout is changed by creating a system wide AFTER SUSPEND
trigger that calls DBMS_RESUMABLE
to set the timeout to 3 hours:
CREATE OR REPLACE TRIGGER resumable_default_timeout AFTER SUSPEND ON DATABASE BEGIN DBMS_RESUMABLE.SET_TIMEOUT(10800); END; /
Using Views to Obtain Information About Suspended Statements
The following views can be queried to obtain information about the status of resumable statements:
View | Description |
---|---|
DBA_RESUMABLE
|
These views contain rows for all currently executing or suspended resumable statements. They can be used by a DBA, AFTER SUSPEND trigger, or another session to monitor the progress of, or obtain specific information about, resumable statements. |
V$SESSION_WAIT |
When a statement is suspended the session invoking the statement is put into a wait state. A row is inserted into this view for the session with the EVENT column containing «statement suspended, wait error to be cleared». |
Using the DBMS_RESUMABLE Package
The DBMS_RESUMABLE
package helps control resumable space allocation. The following procedures can be invoked:
Procedure | Description |
---|---|
ABORT(sessionID) |
This procedure aborts a suspended resumable statement. The parameter sessionID is the session ID in which the statement is executing. For parallel DML/DDL, sessionID is any session ID which participates in the parallel DML/DDL.
Oracle Database guarantees that the The caller of |
GET_SESSION_TIMEOUT(sessionID) |
This function returns the current timeout value of resumable space allocation for the session with sessionID . This returned timeout is in seconds. If the session does not exist, this function returns -1. |
SET_SESSION_TIMEOUT(sessionID, timeout) |
This procedure sets the timeout interval of resumable space allocation for the session with sessionID . The parameter timeout is in seconds. The new timeout setting will applies to the session immediately. If the session does not exist, no action is taken. |
GET_TIMEOUT() |
This function returns the current timeout value of resumable space allocation for the current session. The returned value is in seconds. |
SET_TIMEOUT(timeout) |
This procedure sets a timeout value for resumable space allocation for the current session. The parameter timeout is in seconds. The new timeout setting applies to the session immediately. |
Operation-Suspended Alert
When a resumable session is suspended, an operation-suspended alert is issued on the object that needs allocation of resource for the operation to complete. Once the resource is allocated and the operation completes, the operation-suspended alert is cleared. See «Managing Tablespace Alerts» for more information on system-generated alerts.
Resumable Space Allocation Example: Registering an AFTER SUSPEND Trigger
In the following example, a system wide AFTER SUSPEND
trigger is created and registered as user SYS
at the database level. Whenever a resumable statement is suspended in any session, this trigger can have either of two effects:
-
If an undo segment has reached its space limit, then a message is sent to the DBA and the statement is aborted.
-
If any other recoverable error has occurred, the timeout interval is reset to 8 hours.
Here are the statements for this example:
CREATE OR REPLACE TRIGGER resumable_default AFTER SUSPEND ON DATABASE DECLARE /* declare transaction in this trigger is autonomous */ /* this is not required because transactions within a trigger are always autonomous */ PRAGMA AUTONOMOUS_TRANSACTION; cur_sid NUMBER; cur_inst NUMBER; errno NUMBER; err_type VARCHAR2; object_owner VARCHAR2; object_type VARCHAR2; table_space_name VARCHAR2; object_name VARCHAR2; sub_object_name VARCHAR2; error_txt VARCHAR2; msg_body VARCHAR2; ret_value BOOLEAN; mail_conn UTL_SMTP.CONNECTION; BEGIN -- Get session ID SELECT DISTINCT(SID) INTO cur_SID FROM V$MYSTAT; -- Get instance number cur_inst := userenv('instance'); -- Get space error information ret_value := DBMS_RESUMABLE.SPACE_ERROR_INFO(err_type,object_type,object_owner, table_space_name,object_name, sub_object_name); /* -- If the error is related to undo segments, log error, send email -- to DBA, and abort the statement. Otherwise, set timeout to 8 hours. -- -- sys.rbs_error is a table which is to be -- created by a DBA manually and defined as -- (sql_text VARCHAR2(1000), error_msg VARCHAR2(4000), -- suspend_time DATE) */ IF OBJECT_TYPE = 'UNDO SEGMENT' THEN /* LOG ERROR */ INSERT INTO sys.rbs_error ( SELECT SQL_TEXT, ERROR_MSG, SUSPEND_TIME FROM DBMS_RESUMABLE WHERE SESSION_ID = cur_sid AND INSTANCE_ID = cur_inst ); SELECT ERROR_MSG INTO error_txt FROM DBMS_RESUMABLE WHERE SESSION_ID = cur_sid and INSTANCE_ID = cur_inst; -- Send email to receipient through UTL_SMTP package msg_body:='Subject: Space Error Occurred Space limit reached for undo segment ' || object_name || on ' || TO_CHAR(SYSDATE, 'Month dd, YYYY, HH:MIam') || '. Error message was ' || error_txt; mail_conn := UTL_SMTP.OPEN_CONNECTION('localhost', 25); UTL_SMTP.HELO(mail_conn, 'localhost'); UTL_SMTP.MAIL(mail_conn, 'sender@localhost'); UTL_SMTP.RCPT(mail_conn, 'recipient@localhost'); UTL_SMTP.DATA(mail_conn, msg_body); UTL_SMTP.QUIT(mail_conn); -- Abort the statement DBMS_RESUMABLE.ABORT(cur_sid); ELSE -- Set timeout to 8 hours DBMS_RESUMABLE.SET_TIMEOUT(28800); END IF; /* commit autonomous transaction */ COMMIT; END; /
We use expdp/impdp often as part of backup strategy, data movement, refresh requests, etc. Here is one situation I encountered when trying to import data into a database using impdp.
Environment: Oracle 11.2.0.2 on Suse Linux 10 SP2
The import was hung and it was WAITING !!
Identifiy the impdp process at the session level and see what it is waiting on..
SQL> col username for a15
SQL> col program for a30
SQL> set lines 200
SQL> col machine for a25
SQL> select inst_id,username,machine,program,sid,serial#,status,sql_id,SQL_HASH_VALUE,last_call_et from gv$session where username is not null and username <>’SYS’ order by 10;
INST_ID USERNAME MACHINE PROGRAM SID SERIAL# STATUS SQL_ID SQL_HASH_VALUE LAST_CALL_ET
———- ————— ————————- —————————— ———- ———- ——— ————- ————— ————
1 DBSNMP nelsdb20 emagent@nelsdb20 (TNS V1-V3) 99 17 INACTIVE 0 56
2 DBSNMP nelsdb21 emagent@nelsdb21 (TNS V1-V3) 226 10053 INACTIVE 0 75
1 SSCADM nelsdb20 oracle@nelsdb20 (DW00) 40 841 ACTIVE 9035v672ffr0x 3303496733 1085
2 DBSNMP nelsdb21 emagent@nelsdb21 (TNS V1-V3) 163 13 INACTIVE 0 322465
SQL> select event from v$session_wait where sid=40;
EVENT
—————————————————————-
statement suspended, wait error to be cleared
SQL> SELECT NAME,STATUS, TIMEOUT, ERROR_NUMBER, ERROR_MSG FROM DBA_RESUMABLE;
NAME STATUS TIMEOUT ERROR_NUMBER ERROR_MSG
————————— ——— ——— ———— ——————————————————————
SYS.SYS_IMPORT_SCHEMA_01.1 SUSPENDED 7200 1652 ORA-01652: unable to extend temp segment by 128 in tablespace SSC
SYS.SYS_IMPORT_SCHEMA_01 NORMAL 7200
The error points to unavailability of free space in the tablespace. The impdp goes on a resumable state in this case and simply waits.
Once space is added to the tablespace, the process automatically picks up and the impdp completes.
In the introduction to this series of posts, I spoke briefly about data warehouse fault tolerance and the unique challenges resulting from high data volumes combined the batch load window required to create them. I then defined the goal: a layered approach allowing simple errors to be caught early before they turn in to serious conditions.
Resuming is the ability to continue effortlessly after an error. The important thing is that there should be no aftermath from the error: our process should pause gracefully until the error is corrected. The Oracle Database has offered out of the box functionality for resuming since version 9i in the form of Resumable Space Allocation. Resumable operations are supported for SELECT queries, DML and DDL, and can be enabled at either the system or the session level. To enable at the system level, the RESUMABLE_TIMEOUT database parameter should have a non-zero value.
SQL> alter system set resumable_timeout=3600; System altered. SQL>
To enable resumable operations at the session level, the statement follows this basic syntax, with the TIMEOUT and NAME clauses being optional:
ALTER SESSION ENABLE RESUMABLE <TIMEOUT n> <NAME string>;
The TIMEOUT value is specified in seconds, and if omitted, the default value of 7200 is used, or 2 hours. The NAME clause gives the resumable session a user-friendly name for when we are monitoring for resumable sessions (as we will see later) to see which of our processes is suspended. Enabling resumable operations for the session level requires that the RESUMABLE permission has been granted:
SQL> grant resumable to stewart; Grant succeeded. SQL>
Resumable operations can also be enabled with the Oracle utilities… such as SQL-Loader, Export/Import and Datapump. The command-line parameters RESUMABLE, RESUMABLE_NAME and RESUMABLE_TIMEOUT exist to mimic the functionality mentioned above.
Now for a demonstration. I’ll create a situation that is ripe for a space allocation error: I’ll put an empty copy of the SALES fact table from the SH schema in a tablespace with only 250K of space:
SQL> create tablespace target datafile '/oracle/oradata/bidw1/target01.dbf' size 250K; Tablespace created. SQL> create table target.sales tablespace target as select * from sh.sales where 1=0; Table created. SQL>
Now I’ll load some records into the table, which should cause it to suspend. To prepare my session, I need to enable resumable operations. Since I always instrument my code, I’ll register my process with the database. After that, I have an easy way to guarantee consistency when referring to processes. Now, I can use the registered name for my resumable session as well:
SQL> exec dbms_application_info.set_module('SALES fact load','insert some rows'); PL/SQL procedure successfully completed. SQL> SQL> DECLARE 2 l_module VARCHAR2(48) := sys_context('USERENV','MODULE'); 3 BEGIN 4 EXECUTE IMMEDIATE 5 'alter session enable resumable timeout 18000 name '''||l_module||''''; 6 END; 7 / PL/SQL procedure successfully completed. SQL>
I start loading the records in hopes of a suspended session:
SQL> insert into target.sales select * from sh.sales;
So now, I open up another session, and I start another transaction against the TARGET.SALES table, just to pile on the TARGET tablespace:
SQL> exec dbms_application_info.set_module('SALES fact load2','insert more rows'); PL/SQL procedure successfully completed. SQL> SQL> DECLARE 2 l_module VARCHAR2(48) := sys_context('USERENV','MODULE'); 3 BEGIN 4 EXECUTE IMMEDIATE 5 'alter session enable resumable timeout 18000 name '''||l_module||''''; 6 END; 7 / PL/SQL procedure successfully completed. SQL> insert into target.sales select * from sh.sales;
I’ll have a look in the DBA_RESUMABLE view (there is also a USER_RESUMABLE version) for my suspended sessions. Even though I could get all the following information with a single SQL statement, I broke it up for better visibility on the blog:
SQL> select name, start_time, suspend_time, status from dba_resumable; NAME | START_TIME | SUSPEND_TIME | STATUS ----------------- | -------------------- | -------------------- | ------------ SALES fact load2 | 02/06/10 10:33:33 | 02/06/10 10:33:33 | SUSPENDED SALES fact load | 02/06/10 10:29:03 | 02/06/10 10:29:03 | SUSPENDED 2 rows selected. Elapsed: 00:00:00.07 SQL> select name, sql_text from dba_resumable; NAME | SQL_TEXT ----------------- | ----------------------------------------------- SALES fact load2 | insert into target.sales select * from sh.sales SALES fact load | insert into target.sales select * from sh.sales 2 rows selected. SQL> select name, error_msg from dba_resumable; NAME | ERROR_MSG ----------------- | ------------------------------------------------------------------------ SALES fact load2 | ORA-01653: unable to extend table TARGET.SALES by 8 in tablespace TARGET SALES fact load | ORA-01653: unable to extend table TARGET.SALES by 8 in tablespace TARGET 2 rows selected. SQL>
The Oracle Database also publishes server alerts concerning suspended transactions using the Server-Generated Alerts infrastructure. This infrastructure uses the AWR toolset, the server package DBMS_SERVER_ALERT for getting and setting metric threshholds, and the queue table ALERT_QUE to hold alerts that have been published from AWR. Custom processes could be written to mine ALERT_QUE for these alerts, but the easiest way to configure and view server alerts is using Oracle Enterprise Manager (OEM). On the Alerts section of the main OEM page, we can see three different alerts generated by the Oracle Database:
If we click on the «Session Suspended» link, we can see the multiple alerts generated in this category:
Another alert generated indirectly by the suspended transaction is the «Configuration» class event caused by our session «waiting» to proceed. The Oracle wait event interface can show us information about the suspend waits on the system:
SQL> SELECT event, 2 SUM(time_waited) time_waited, 3 SUM(total_waits) total_waits, 4 AVG(average_wait) average_wait 5 FROM gv$session_event 6 WHERE lower(event) LIKE '%suspend%' 7 GROUP BY event 8 ORDER BY time_waited ASC 9 /
EVENT | TIME_WAITED | TOTAL_WAITS | AVERAGE_WAIT |
---|---|---|---|
statement suspended, wait error to be cleared | 305373 | 1377 | 221.78 |
1 row selected.
SQL>
To free up the space issue, I’ll enable autoextend on the TARGET tablespace. Then, I’ll take a look and see if anything has changed:
SQL> alter database datafile '/oracle/oradata/bidw1/target01.dbf' 2 autoextend on next 10M maxsize 1000M; Database altered. SQL> select status, resume_time, name from dba_resumable; STATUS | RESUME_TIME | NAME ------------ | -------------------- | ----------------- NORMAL | 02/06/10 10:56:49 | SALES fact load2 NORMAL | 02/06/10 10:56:49 | SALES fact load 2 rows selected. SQL>
The Resumable Space Allocation features includes the AFTER SUSPEND trigger, which allows the specification of a system-wide trigger that will fire whenever a transaction is suspended. The typical use for this functionality is alerting as suspended operations don’t write anything to the alert log.
UPDATE: I made a mistake here… suspended transactions do in fact cause entries in the alert log, and so does the RESUME process detailed below.
There are some features in the DBMS_RESUMABLE package that may make sense when writing an AFTER SUSPEND trigger:
SQL> desc dbms_resumable PROCEDURE ABORT Argument Name Type In/Out Default? ------------------------------ ----------------------- ------ -------- SESSIONID NUMBER IN FUNCTION GET_SESSION_TIMEOUT RETURNS NUMBER Argument Name Type In/Out Default? ------------------------------ ----------------------- ------ -------- SESSIONID NUMBER IN FUNCTION GET_TIMEOUT RETURNS NUMBER PROCEDURE SET_SESSION_TIMEOUT Argument Name Type In/Out Default? ------------------------------ ----------------------- ------ -------- SESSIONID NUMBER IN TIMEOUT NUMBER IN PROCEDURE SET_TIMEOUT Argument Name Type In/Out Default? ------------------------------ ----------------------- ------ -------- TIMEOUT NUMBER IN FUNCTION SPACE_ERROR_INFO RETURNS BOOLEAN Argument Name Type In/Out Default? ------------------------------ ----------------------- ------ -------- ERROR_TYPE VARCHAR2 OUT OBJECT_TYPE VARCHAR2 OUT OBJECT_OWNER VARCHAR2 OUT TABLE_SPACE_NAME VARCHAR2 OUT OBJECT_NAME VARCHAR2 OUT SUB_OBJECT_NAME VARCHAR2 OUT SQL>
This package adds functionality for writing custom processes in the AFTER SUSPEND trigger. The SPACE_ERROR_INFO function returns specifics about the table and tablespace affected by the space error. A series of checks could be coded enabling specific actions depending on which objects were affected. A suspended process can be ended prematurely with the ABORT procedure, or more time can be added using the SET_TIMEOUT procedure. I actually had one client explain how she had written an AFTER SUSPEND trigger that compiled information about the tablespace affected so that an «ALTER DATABASE… RESIZE…» command could be issued to add more space to the affected datafile. I didn’t have the heart to tell her that she had basically written a feature that already existed in the database: AUTOEXTEND.
So what are the best practices to take away from this? Quite simply… all ETL mappings and flows, as well as database maintenance processes, should use Resumable Space Allocation, preferably using the NAME clause in conjunction with DBMS_APPLICATION_INFO. Setting a RESUMABLE_TIMEOUT value at the system level can be scary, because a single suspended transaction could cause locks that reverberate all the way through the system. But is this really a concern in a BI/DW environment? Are there any processes in our batch load window or with any of our operational maintenance processes that we wouldn’t want to enable for resumable operations, no matter how many processes back up waiting for them to complete? It could spell bad news if we used any kind of synchronous replication technology to move data to the DW instance, but short of that, I can’t think of any. Please let me know if you have alternative viewpoints.
I’ve never found much reason to use the AFTER SUSPEND trigger though. Data warehouses should have production-type monitoring running already, just like other production systems. OEM is more than satisfactory for basic monitoring and alerting, and with the Server-Generated Alerts introduced in 10g, forms a complete product for Oracle environments. But regardless of which monitoring solution is used, it should be able to issue simple queries against the database and alert based on the results of those queries. A select against the DBA_RESUMABLE table provides all the information required to send out an alert, and with features such as AUTOEXTEND, I just can’t see a requirement for the ability to issue procedural code because a transaction is suspended.
UPDATE: as pointed out above, since suspended transactions do in fact show up in the alert log, this is good news for integrating Resumable Space Allocation into an existing environment. Assuming that there’s proper alert log monitoring with paging functionality already in place, implementing resumable operations can simply use that infrastructure already in place.
Keep your eyes open for the next of the «Three R’s» in BI/DW fault tolerance: Restarting.