MRP0: Background Media Recovery terminated with error 448



Gouranga’s Tech Blog


Apr 1, 2016

Fix: ORA-10562: Error occurred while applying redo to data block

While recovering a standby using a manual process (no Data Guard configured), I found the error below in the alert log.

Error in Alert Log:

Thu Mar 31 10:42:10 2016
Dumping diagnostic data in directory=[cdmp_20160331104210], requested by (instance=1, osid=23658610 (PR01)), summary=[incident=4226].
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Slave exiting with ORA-10562 exception
Errors in file /u01/app/oracle/diag/rdbms/prod/PROD/trace/PROD_pr01_23658610.trc:
ORA-10562: Error occurred while applying redo to data block (file# 80, block# 1288553)
ORA-10564: tablespace sample
ORA-01110: data file 80: '/u02/flash_recovery_area/PROD/ORADATA/sample01.dbf'
ORA-10561: block type 'TRANSACTION MANAGED DATA BLOCK', data object# 89532
ORA-00600: internal error code, arguments: [kdBlkCheckError], [80], [1288553], [6264], [], [], [], [], [], [], [], []
Recovery Slave PR01 previously exited with exception 10562
Thu Mar 31 10:42:10 2016
Errors with log /u03/restore_archive/2016_03_30/thread_2_seq_625.1575.907832317
MRP0: Background Media Recovery terminated with error 448
Errors in file /u01/app/oracle/diag/rdbms/prod/PROD/trace/PROD_pr00_9109508.trc:
ORA-00448: normal completion of background process
Recovery interrupted!

Solution:

(1) Take a backup of the related datafile on the primary, e.g.:

RMAN> backup datafile 80 format '/u02/df_80_pr.bk';

Now transfer this backup piece to the standby server.
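For example, with scp (the host name here is a placeholder; the target directory matches the catalog step below):

$ scp /u02/df_80_pr.bk oracle@standby-host:/u03/backup_files/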

(2) On the standby, catalog the backup piece:

RMAN> catalog backuppiece '/u03/backup_files/df_80_pr.bk';

Then list it for confirmation:

RMAN> list backuppiece '/u03/backup_files/df_80_pr.bk';
RMAN> list backup of datafile 80;    # check the backup piece

(3) Cancel MRP if it is running.

ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
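To make sure no MRP process is still attached before restoring, a quick check of v$managed_standby can help (a sanity check, not part of the original post):

SQL> select process, status from v$managed_standby where process like 'MRP%';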

(4) Restore the datafile from the backup piece:

RMAN> restore datafile 80;

Starting restore at 31-MAR-16
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=199 device type=DISK

channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00080 to /u02/flash_recovery_area/PROD/ORADATA/sample01.dbf
channel ORA_DISK_1: reading from backup piece /u03/backup_files/df_80_pr.bk
channel ORA_DISK_1: piece handle=/u03/backup_files/df_80_pr.bk tag=TAG20160331T121840
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:45
Finished restore at 31-MAR-16

(5) Start MRP on the standby:

SQL> alter database recover managed standby database disconnect;

MRP will then apply the archived logs again.
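Once MRP is running, apply progress can be monitored to confirm that recovery gets past the problem archive log, for example with:

SQL> select thread#, max(sequence#) from v$archived_log where applied='YES' group by thread#;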

Source



Database shutdown fails: ORA-00449: background process ‘MMON’ unexpectedly terminated with error 448 (Doc ID 2111305.1)

Last updated on APRIL 19, 2022

Applies to:

Symptoms

12.1.0.2 on AIX 7.1 TL4 SP1 (or AIX 7.2 TL0), database shutdown fails with ORA-00449:

The issue can also cause DBCA failure while creating a database.

When installing or upgrading to 12.1.0.2 Grid Infrastructure, the installer creates the GIMR database and can fail during the step "Creating Container Database for Oracle Grid Infrastructure Management Repository" with error INS-20802:

Registering database with Oracle Grid Infrastructure
DBCA_PROGRESS : 5%
Copying database files
DBCA_PROGRESS : 7%
DBCA_PROGRESS : 9%
DBCA_PROGRESS : 16%
DBCA_PROGRESS : 23%
DBCA_PROGRESS : 30%
DBCA_PROGRESS : 41%
Creating and starting Oracle instance
DBCA_PROGRESS : 43%
DBCA_PROGRESS : 48%
DBCA_PROGRESS : 49%
DBCA_PROGRESS : 50%
DBCA_PROGRESS : 55%
DBCA_PROGRESS : 60%
DBCA_PROGRESS : 61%
DBCA_PROGRESS : 64%
Completing Database Creation
DBCA_PROGRESS : 68%
DBCA_PROGRESS : 79%
ORA-00449: background process ‘MMON’ unexpectedly terminated with error 448

ORA-01034: ORACLE not available

DBCA_PROGRESS : 80%
ORA-01034: ORACLE not available

ORA-01034: ORACLE not available

DBCA_PROGRESS : DBCA Operation failed.

ORA-01012: not logged on

Error while executing » /rdbms/admin/dbmssml.sql». Refer to » /cfgtoollogs/dbca/_mgmtdb/dbmssml0.log» for more details. Error in Process: /perl/bin/perl
DBCA_PROGRESS : DBCA Operation failed.

Changes

The issue happens on AIX 7.1 TL4 SP1 (7100-04-01-1543) onward; there is no confirmed case for an earlier TL.
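To check which technology level and service pack an AIX host is running, the standard AIX command can be used (the example output corresponds to the level mentioned above):

$ oslevel -s
7100-04-01-1543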

Cause

To view full details, sign in with your My Oracle Support account.



Source

data guard forum

Yesterday I noticed that my standby server was behind the production one.
In the error log I noticed:
ORA-00600: internal error code, arguments: [kcbr_apply_change_11], [], [], [], [], [], [], [], [], [], [], []

then
MRP0: Background Media Recovery terminated with error 448
Recovered data files to a consistent state at change 286116915

The error happens at this point in the apply sequence: Recovery of Online Redo Log: Thread 1 Group 4 Seq 8830 Reading mem 0

I found this archive log on the standby and it is the same size as on the production server.

Can anybody help me, please?

Answers

930648 wrote:
Hello everybody !

Yesterday I noticed that my standby server was behind the prduction one.
In the error log I noticed:
ORA-00600: internal error code, arguments: [kcbr_apply_change_11], [], [], [], [], [], [], [], [], [], [], []

then
MRP0: Background Media Recovery terminated with error 448
Recovered data files to a consistent state at change 286116915

This error happens in sequence: Recovery of Online Redo Log: Thread 1 Group 4 Seq 8830 Reading mem 0

I found this archive log in standby and it is same size as in the production one.

Anybody can help me please?

What is the version?
You have to use the error lookup tool to investigate internal errors: ORA-600/ORA-7445/ORA-700 Error Look-up Tool [ID 153788.1].
The database version is very important; you can also take a look at this note: Managed Recovery fails with ORA-00600:[kcbr_apply_change_11] [ID 1318733.1].
Can you perform a clean shutdown and then restart recovery on your standby database? If there are any errors in the alert log during startup, please post them here.
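For reference, a clean restart of managed recovery would look something like this (standard commands, not quoted from the thread):

SQL> alter database recover managed standby database cancel;
SQL> shutdown immediate
SQL> startup mount
SQL> alter database recover managed standby database disconnect from session;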

Handle: 930648
Status Level: Newbie
Registered: Apr 27, 2012
Total Posts: 39
Total Questions: 14 (11 unresolved)

Close your old threads and keep the forum clean.

Source

Hi all,

I am trying to do Oracle backup & restore using Data Guard.

I installed 4 instances on the same server and did the configuration correctly for every instance.

But after I did the role switchover for one instance, the MRP was terminated by the system; after a while the system attempts to start background MRP again, and it gets terminated again.

The log information is shown below; the attachment contains the configuration and inoc_pr00_43079.trc.

Hope you guys can help me, thanks very much.

Fri Mar 04 01:15:09 2016
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT FROM SESSION
ORA-1153 signalled during: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT FROM SESSION ...
Fri Mar 04 01:15:20 2016
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT FROM SESSION
Fri Mar 04 01:15:21 2016
Recovery coordinator died, shutting down parallel recovery
Fri Mar 04 01:15:21 2016
MRP0: Background Media Recovery terminated with error 448
Errors in file /opt/oracle/app/oracle/diag/rdbms/inocstandby/inoc/trace/inoc_pr00_43079.trc:
ORA-00448: normal completion of background process
Managed Standby Recovery not using Real Time Apply
Recovery interrupted!
Recovered data files to a consistent state at change 2687663
Attempt to start background Managed Standby Recovery process (inoc)
Fri Mar 04 01:15:22 2016
MRP0 started with pid=28, OS id=49748
MRP0: Background Managed Standby Recovery process started (inoc)
started logmerger process
Fri Mar 04 01:15:27 2016
Managed Standby Recovery starting Real Time Apply
Parallel Media Recovery started with 5 slaves
Waiting for all non-current ORLs to be archived...
All non-current ORLs have been archived.
Media Recovery Waiting for thread 1 sequence 55 (in transit)
Recovery of Online Redo Log: Thread 1 Group 17 Seq 55 Reading mem 0
  Mem# 0: /oradata/inoc/standby_redo01.log
Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT FROM SESSION
Fri Mar 04 01:15:36 2016
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT FROM SESSION
ORA-1153 signalled during: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT FROM SESSION ...
Fri Mar 04 01:15:48 2016
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT FROM SESSION
ORA-1153 signalled during: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT FROM SESSION ...
Fri Mar 04 01:15:58 2016
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT FROM SESSION
ORA-1153 signalled during: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT FROM SESSION ...
Fri Mar 04 01:16:25 2016
Fri Mar 04 01:16:25 2016
Recovery coordinator died, shutting down parallel recovery
MRP0: Background Media Recovery terminated with error 448
Errors in file /opt/oracle/app/oracle/diag/rdbms/inocstandby/inoc/trace/inoc_pr00_49767.trc:
ORA-00448: normal completion of background process
Managed Standby Recovery not using Real Time Apply
Recovery interrupted!
Recovered data files to a consistent state at change 2687825
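A note on the log above (not part of the original thread): ORA-01153 means "an incompatible media recovery is active", i.e. a managed recovery session is already running (or still shutting down) when ALTER DATABASE RECOVER MANAGED STANDBY DATABASE is issued again. Cancelling the existing session before restarting usually avoids the error:

SQL> alter database recover managed standby database cancel;
SQL> alter database recover managed standby database using current logfile disconnect from session;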

Русские Блоги

Oracle 11.2.0.1 ADG environment: the MRP process hit ORA-600 [kcbr_apply_change_11]

Environment: Linux + Oracle 11.2.0.1 ADG
Symptom: no logs are being applied on the standby database

1. Check the current state of the standby database
No logs are being applied on the standby. The apply lag shows that nothing has been applied for more than 3 days and 21 hours.

SQL> set linesize 1200
SQL> SELECT OPEN_MODE, DATABASE_ROLE, SWITCHOVER_STATUS, FORCE_LOGGING, DATAGUARD_BROKER, GUARD_STATUS FROM V$DATABASE; 

OPEN_MODE            DATABASE_ROLE    SWITCHOVER_STATUS    FOR DATAGUAR GUARD_S
-------------------- ---------------- -------------------- --- -------- -------
READ ONLY            PHYSICAL STANDBY NOT ALLOWED          YES DISABLED NONE

SQL> select * from v$dataguard_stats;

NAME                             VALUE                                                            UNIT                           TIME_COMPUTED                  DATUM_TIME
-------------------------------- ---------------------------------------------------------------- ------------------------------ ------------------------------ ------------------------------
transport lag                    +00 00:00:00                                                     day(2) to second(0) interval   01/17/2017 16:07:12            01/17/2017 16:07:12
apply lag                        +03 21:34:49                                                     day(2) to second(0) interval   01/17/2017 16:07:12            01/17/2017 16:07:12
apply finish time                +00 03:10:34.000                                                 day(2) to second(3) interval   01/17/2017 16:07:12
estimated startup time           15                                                               second                         01/17/2017 16:07:12

2. Check the alert log
Going back in the alert log to the time the apply lag started, there are ORA-600 errors that caused the MRP process to terminate. The detailed log follows:

Fri Jan 13 18:32:25 2017
Errors in file /home/oracle/opt/oracle/diag/rdbms/orcl/orcl/trace/orcl_pr03_22555.trc  (incident=67480):
ORA-00600: internal error code, arguments: [kcbr_apply_change_11], [], [], [], [], [], [], [], [], [], [], []
Incident details in: /home/oracle/opt/oracle/diag/rdbms/orcl/orcl/incident/incdir_67480/orcl_pr03_22555_i67480.trc
Slave exiting with ORA-600 exception
Errors in file /home/oracle/opt/oracle/diag/rdbms/orcl/orcl/trace/orcl_pr03_22555.trc:
ORA-00600: internal error code, arguments: [kcbr_apply_change_11], [], [], [], [], [], [], [], [], [], [], []
Fri Jan 13 18:32:26 2017
Errors in file /home/oracle/opt/oracle/diag/rdbms/orcl/orcl/trace/orcl_mrp0_22547.trc  (incident=67448):
ORA-00600: internal error code, arguments: [kcbr_apply_change_11], [], [], [], [], [], [], [], [], [], [], []
Incident details in: /home/oracle/opt/oracle/diag/rdbms/orcl/orcl/incident/incdir_67448/orcl_mrp0_22547_i67448.trc
Fri Jan 13 18:32:26 2017
Trace dumping is performing id=[cdmp_20170113183226]
Recovery Slave PR03 previously exited with exception 600
Fri Jan 13 18:32:27 2017
MRP0: Background Media Recovery terminated with error 448
Errors in file /home/oracle/opt/oracle/diag/rdbms/orcl/orcl/trace/orcl_pr00_22549.trc:
ORA-00448: normal completion of background process
Managed Standby Recovery not using Real Time Apply
Recovery interrupted!
Fri Jan 13 18:32:27 2017
Sweep [inc][67480]: completed
Sweep [inc][67480]: completed
Recovered data files to a consistent state at change 2010287982
Errors in file /home/oracle/opt/oracle/diag/rdbms/orcl/orcl/trace/orcl_pr00_22549.trc:
ORA-00448: normal completion of background process
Errors in file /home/oracle/opt/oracle/diag/rdbms/orcl/orcl/trace/orcl_mrp0_22547.trc:
ORA-00600: internal error code, arguments: [kcbr_apply_change_11], [], [], [], [], [], [], [], [], [], [], []
MRP0: Background Media Recovery process shutdown (orcl)
Sweep [inc][67448]: completed
Sweep [inc2][67480]: completed
Sweep [inc2][67448]: completed
Trace dumping is performing id=[cdmp_20170113183227]
Fri Jan 13 18:33:04 2017
Using STANDBY_ARCHIVE_DEST parameter default value as USE_DB_RECOVERY_FILE_DEST

3. Try to restart the MRP recovery process on the standby manually
After manually restarting the MRP recovery process on the standby, the alert log still reports the same ORA-600 [kcbr_apply_change_11] error.

4. Try starting the MRP recovery process in MOUNT state
In MOUNT state the MRP recovery process runs normally. After recovery completes, real-time apply in ADG (open read only) is re-enabled and everything is back to normal.

shutdown immediate
startup mount
alter database recover managed standby database disconnect from session;
-- wait here for recovery to catch up ...
alter database recover managed standby database cancel;
alter database open;
alter database recover managed standby database using current logfile disconnect from session;

Query the standby database status to confirm that everything is back to normal:

SQL> SELECT OPEN_MODE, DATABASE_ROLE, SWITCHOVER_STATUS, FORCE_LOGGING, DATAGUARD_BROKER, GUARD_STATUS FROM V$DATABASE; 

OPEN_MODE            DATABASE_ROLE    SWITCHOVER_STATUS    FOR DATAGUAR GUARD_S
-------------------- ---------------- -------------------- --- -------- -------
READ ONLY WITH APPLY PHYSICAL STANDBY NOT ALLOWED          YES DISABLED NONE

SQL> select * from v$dataguard_stats;

NAME                             VALUE                                                            UNIT                           TIME_COMPUTED                  DATUM_TIME
-------------------------------- ---------------------------------------------------------------- ------------------------------ ------------------------------ ------------------------------
transport lag                    +00 00:00:00                                                     day(2) to second(0) interval   01/17/2017 17:42:26            01/17/2017 17:42:26
apply lag                        +00 00:00:00                                                     day(2) to second(0) interval   01/17/2017 17:42:26            01/17/2017 17:42:26
apply finish time                +00 00:00:00.000                                                 day(2) to second(3) interval   01/17/2017 17:42:26
estimated startup time           18                                                               second                         01/17/2017 17:42:26

5. Search MOS to locate the root cause
An MOS search shows that this behaviour matches Bug 10419984:
Bug 10419984 : ACTIVE DATA GUARD STANDBY GIVES ORA-600 [KCBR_APPLY_CHANGE_11]
Applying the patch for this bug is recommended to prevent the problem from recurring.
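To check whether the corresponding patch is already installed, the Oracle home inventory can be searched (this assumes the patch number matches the bug number, which is common but not guaranteed):

$ opatch lsinventory | grep 10419984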

We were receiving an error like "MRP0 terminated with error 1274, unnamed file added in standby database" whenever a new datafile was added on the primary.

Below is the actual error.

Media Recovery Log /uv1019/u341/app/prod/arch/1_68815_799031040.arch
File #382 added to control file as 'UNNAMED00382' because
the parameter STANDBY_FILE_MANAGEMENT is set to MANUAL
The file should be manually created to continue.
Errors with log /uv1019/u341/app/prod/arch/1_68815_799031040.arch
MRP0: Background Media Recovery terminated with error 1274
Errors in file /apps/oracle/admin/prod/diag/rdbms/prodstby/prod/trace/prod_pr00_19983.trc:
 

The issue was that the standby_file_management parameter was set to MANUAL on the standby database. That means whatever datafiles we add on the primary are not created automatically on the standby; we have to add them manually. So when we added the file on the primary, it was not recognized by the standby and was created as an UNNAMED file.

SQL> show parameter standby

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
standby_archive_dest                 string      ?/dbs/arch
standby_file_management              string      MANUAL

Follow the steps below:

Cancel the recovery:

SQL>ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;

Database altered.

[STANDBY] Find the unnamed file on the standby.

As per the alert log, the problem is with FILE# 382.

SQL> select file#,name from v$datafile where name like '%UNNAMED%';

     FILE#
----------
NAME
--------------------------------------------------------------------------------
       382
/apps/oracle/product/11.2.0.2.2013Q4/dbs/UNNAMED00382

[PRIMARY] Check the actual file name for FILE# 382 on the primary.

SQL> select name from v$datafile where file#=382;
NAME
--------------------------------------------------------------------------------
/uv1019/u340/app/prod/oradata/prodmsc20.tbl

[STANDBY] Recreate the unnamed datafile on the standby, giving it the same name as on the primary.

SQL> alter database create datafile '/apps/oracle/product/11.2.0.2.2013Q4/dbs/UNNAMED00382'
 as '/uv1019/u340/app/prod/oradata/prodmsc20.tbl';

Database altered.

[STANDBY] Now change standby_file_management to AUTO on the standby:

SQL> alter system set standby_file_management='AUTO' scope=spfile;

SQL> shutdown immediate;

SQL> startup mount

[STANDBY] Start the recovery:

SQL>  alter database recover managed standby database disconnect from session;

Database altered.

SQL>  select PROCESS,CLIENT_PROCESS,THREAD#,SEQUENCE#,BLOCK# from v$managed_standby where process = 'MRP0' or client_process='LGWR';

PROCESS   CLIENT_P    THREAD#  SEQUENCE#     BLOCK#
--------- -------- ---------- ---------- ----------
RFS       LGWR              1      69136     610754
MRP0      N/A               1      68815     693166
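Once MRP has caught up, the apply lag should drop back toward zero; one way to watch it (illustrative query):

SQL> select name, value from v$dataguard_stats where name in ('transport lag', 'apply lag');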


I have a Data Guard configuration: a 2-node RAC (11.2.0.2) on AIX 6.1 as primary and a standalone standby database.

I noticed that the primary can transport archive logs to the standby, but the standby cannot apply them, and the standby alert.log shows errors such as:
MRP0: Background Media Recovery terminated with error 1111 and MRP0: Background Media Recovery process shutdown (PROD00DG)

On primary alert.log
————————
RC8: Archive log rejected (thread 1 sequence 75698) at host ‘PROD00DG_private_odm_izm’
FAL[server, ARC8]: FAL archive failed, see trace file.
ARCH: FAL archive failed. Archiver continuing
ORACLE Instance PROD001 – Archival Error. Archiver continuing.

I noticed that MRP has stopped on the standby.

On standby:
—————
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION;

at alert:
————–
Slave exiting with ORA-1111 exception
Errors in file /oracle11g/app/oracle/diag/rdbms/PROD00dg/PROD00DG/trace/PROD00DG_pr00_17891406.trc:
ORA-01111: name for data file 1285 is unknown – rename to correct file
ORA-01110: data file 1285: ‘/oracle11g/app/oracle/11.2.0/dbs/UNNAMED01285’
ORA-01157: cannot identify/lock data file 1285 – see DBWR trace file
ORA-01111: name for data file 1285 is unknown – rename to correct file
ORA-01110: data file 1285: ‘/oracle11g/app/oracle/11.2.0/dbs/UNNAMED01285’
Recovery Slave PR00 previously exited with exception 1111
MRP0: Background Media Recovery process shutdown (PROD00DG)

at standby trace:
———————-
MRP0: Background Media Recovery terminated with error 1111
ORA-01111: name for data file 1285 is unknown – rename to correct file
ORA-01110: data file 1285: ‘/oracle11g/app/oracle/11.2.0/dbs/UNNAMED01285’
ORA-01157: cannot identify/lock data file 1285 – see DBWR trace file
ORA-01111: name for data file 1285 is unknown – rename to correct file
ORA-01110: data file 1285: ‘/oracle11g/app/oracle/11.2.0/dbs/UNNAMED01285’

*** 2016-07-21 19:41:03.428
Completed Media Recovery
Managed Recovery: Not Active posted.

*** 2016-07-21 19:41:04.133
Slave exiting with ORA-1111 exception
ORA-01111: name for data file 1285 is unknown – rename to correct file
ORA-01110: data file 1285: ‘/oracle11g/app/oracle/11.2.0/dbs/UNNAMED01285’
ORA-01157: cannot identify/lock data file 1285 – see DBWR trace file
ORA-01111: name for data file 1285 is unknown – rename to correct file
ORA-01110: data file 1285: ‘/oracle11g/app/oracle/11.2.0/dbs/UNNAMED01285’

I have checked the output of the queries below:

a.) select file#, error from v$recover_file;
b.) select file#, name, status from v$datafile;
Outputs are:
——————

SQL> select file#, error from v$recover_file;

FILE# ERROR
———- —————————–
1268
1269
1270
1275
1276
1277
1278
1279
1280
1281
1282

FILE# ERROR
———- ——————————
1283
1284
1285 FILE MISSING

SQL> select file#, name, status from v$datafile;

file# name status
—— ———– ———
1285 /oracle11g/app/oracle/11.2.0/dbs/UNNAMED01285 RECOVER

After some searching, I found the MOS note Recovering the primary database's datafile using the physical standby, and vice versa [ID 453153.1].

A backup of the datafile can be taken from the primary and then used to restore it on the standby database, as indicated in that note.

The document walks you through the process starting about half-way down, in the section titled:
"Recovering the Standby's Datafile"

I followed the steps below:

1. Back up the related file on the primary

On primary:
——————–
$ rman target /

RMAN> backup datafile 1285 format '/tmp/1285_pr.bk' tag 'PRIMARY_1285';

2. Transfer the file to the standby site using an operating system utility such as scp, NFS or FTP.

3. At the standby site, catalog the backuppiece and confirm it’s available for use:

On standby:
——————–
$ rman target /

RMAN> catalog backuppiece '/tmp/1285_pr.bk';
RMAN> list backuppiece '/tmp/1285_pr.bk';
RMAN> list backup of datafile 1285;

4. Stop redo apply on the physical standby database

On standby:
——————–
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;

In my case redo apply had already stopped.

5. On the standby site restore the datafile:

On standby:
——————–
$ rman target /
RMAN> restore datafile 1285;

At step 5 I got an error:

RMAN> restore datafile 1285;

Starting restore at 01-DEC-12
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=438 device type=DISK

RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of restore command at 2016-07-21 21:46:50
RMAN-06085: must use SET NEWNAME command to restore datafile /oracle11g/app/oracle/11.2.0/dbs/UNNAMED01285

So I need to run the commands below to restore datafile 1285:

RUN {
  SET NEWNAME FOR DATAFILE 1285 TO '+ORADATA';
  RESTORE DATAFILE 1285;
  SWITCH DATAFILE 1285;
}

6. Check the status of the files, then restart redo apply on the physical standby database.

On standby:
——————–
a.) select file#, error from v$recover_file;
b.) select file#, name, status from v$datafile;

On standby:
——————–
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT;
From the log:

Successfully added datafile 1285 to media recovery
Datafile #1285: ‘+ORADATA/PROD00dg/datafile/tPRODspace_2016_ernst.1574.800921533’

Successfully added datafile 1285 to media recovery
Datafile #1286: ‘+ORADATA/PROD00dg/datafile/tPRODspace_2016_ernst.1574.800921533′
SQL> select thread#, max(sequence#) from v$archived_log where APPLIED='YES' group by thread#;

   THREAD# MAX(SEQUENCE#)
---------- --------------
         1          75677
         2          72871

Reference:
—————
Recovering the primary database’s datafile using the physical standby, and vice versa [ID 453153.1]
How to Recover from a Lost or Deleted Datafile with Different Scenarios [ID 198640.1]
MRP0: Background Media Recovery terminated with error 1274 [ID 739618.1]

Oracle 12c Data Guard: the standby database cannot apply the logs. Checking the alert log shows:

2019-10-21T14:55:40.087819+08:00
MRP0: Background Media Recovery process shutdown (DATA)

Checking the MRP trace file shows:

Trace file /oracle/diag/rdbms/pdDATA/DATA/trace/DATA_mrp0_37206.trc
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production
Build label: RDBMS_12.2.0.1.0_LINUX.X64_170125
ORACLE_HOME: /oracle/product/12c/db
System name: Linux
Node name: pdfmdm002
Release: 2.6.32-431.el6.x86_64
Version: #1 SMP Sun Nov 10 22:19:54 EST 2013
Machine: x86_64
Instance name: DATA
Redo thread mounted by this instance: 1
Oracle process number: 47
Unix process pid: 37206, image: [email protected] (MRP0)


*** 2019-10-21T14:39:21.697436+08:00
*** SESSION ID:(1343.12850) 2019-10-21T14:39:21.697470+08:00
*** CLIENT ID:() 2019-10-21T14:39:21.697477+08:00
*** SERVICE NAME:() 2019-10-21T14:39:21.697482+08:00
*** MODULE NAME:() 2019-10-21T14:39:21.697488+08:00
*** ACTION NAME:() 2019-10-21T14:39:21.697493+08:00
*** CLIENT DRIVER:() 2019-10-21T14:39:21.697498+08:00

*** 2019-10-21 14:39:21.696084 5634 krsh.c
MRP0: Background Managed Standby Recovery process started

*** 2019-10-21T14:39:26.702935+08:00
Managed Recovery: Initialization posted.

*** 2019-10-21T14:39:27.596076+08:00
Successfully allocated 8 recovery slaves
Parallel Media Recovery started with 8 slaves
Managed Recovery: Active posted.
LogMerger process exited abnormally.

*** 2019-10-21T14:55:18.881022+08:00
Slave# 8: PR02 exited 
Slave# 7: PR04 exited 
Slave# 6: PR01 exited 
Slave# 5: PR06 exited 
Slave# 4: PR07 exited 
Slave# 3: PR05 exited 
Slave# 2: PR03 exited 
Slave# 1: PR00 exited 
ksvp2penabled: ep->flg = 0, rpr->slv_flg = 0
ksvp2penabled: ep = 0x7f178ffce408, rpr = 0x463f7f8b8
Managed Recovery: Initialization posted.

*** 2019-10-21T14:55:20.082337+08:00
ksvp2penabled: ep->flg = 0, rpr->slv_flg = 0
ksvp2penabled: ep = 0x7f178ffce408, rpr = 0x463f7f8b8
ORA-19909: datafile 1 belongs to an orphan incarnation
ORA-01110: data file 1: '/oradata/DATA/datafile/system.257.987760395'

*** 2019-10-21T14:55:40.087576+08:00
Managed standby recovery cannot handle orphaned datafiles
*** 2019-10-21 14:55:40.087837 5634 krsh.c
MRP0: Background Media Recovery process shutdown
Managed Recovery: Not Active posted.

Check the incarnation of the standby database:

$ rman target /

Recovery Manager: Release 12.2.0.1.0 - Production on Mon Oct 21 15:22:46 2019

Copyright (c) 1982, 2017, Oracle and/or its affiliates. All rights reserved.

connected to target database: DATA (DBID=1936743762, not open)

RMAN> list incarnation of database;

using target database control file instead of recovery catalog

List of Database Incarnations
DB Key  Inc Key DB Name  DB ID            STATUS  Reset SCN  Reset Time
------- ------- -------- ---------------- ------- ---------- ----------
1       1       DATA     1936743762       PARENT  1          26-JAN-17
2       2       DATA     1936743762       PARENT  1408558    25-SEP-18
3       3       DATA     1936743762       ORPHAN  4346574446 23-JUL-19
4       4       DATA     1936743762       CURRENT 5041364391 10-SEP-19

Compare with the primary database:

RMAN> list incarnation of database;


List of Database Incarnations
DB Key  Inc Key DB Name  DB ID            STATUS  Reset SCN Reset Time
------- ------- -------- ---------------- ------- --------- -------------------
1       1       MUDATA   1936743762       PARENT  1         2017-01-26 13:52:29
2       2       MUDATA   1936743762       CURRENT 1408558   2018-09-25 09:55:33

The primary database uses incarnation 2 (Reset SCN 1408558), so reset the standby database to that incarnation.

On the standby database:

RMAN> reset database to incarnation 2;

database reset to incarnation 2

RMAN> list incarnation of database;


List of Database Incarnations
DB Key  Inc Key DB Name  DB ID            STATUS  Reset SCN  Reset Time
------- ------- -------- ---------------- ------- ---------- ----------
1       1       DATA     1936743762       PARENT  1          26-JAN-17
2       2       DATA     1936743762       CURRENT 1408558    25-SEP-18
3       3       DATA     1936743762       ORPHAN  4346574446 23-JUL-19
4       4       DATA     1936743762       ORPHAN  5041364391 10-SEP-19

RMAN>
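With the standby back on the same incarnation as the primary, managed recovery can be restarted (the usual next step, not shown in the original post):

SQL> alter database recover managed standby database disconnect from session;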

Marco’s DBA Blog

Preface

Today I will blog again about a Data Guard topic. There are a couple of best practices out there which one should follow. One of these best practices is enabling block checking and lost write protection. About the latter there is not much information out there, so I want to outline the concept and importance of this feature. Actually this post is inspired by a talk I had during DOAG Conference 2016: I gave a presentation about best practices in Data Guard, and someone in the audience asked how that lost write protection actually works.
Basically it is there to detect lost writes, as the parameter name clearly states. That means a write request to disk was confirmed and the database is happy with that, but the write did not actually happen for whatever reason. So when the block is read the next time, it is still in its old state; any changed, deleted or added values are not included. The block itself is consistent, it is not corrupted. The DBA will not notice it since there is no error. An error will occur only when you restore the tablespace containing the block and then try to apply the redo stream: the recovery will detect a newer SCN in the redo stream which does not match the block's SCN. That is the point where it gets tricky.

The test environment

My simple test cases run on a VirtualBox VM with OEL 6.7, Oracle Restart 12.1.0.2 and Oracle Database 12.1.0.2. Primary and Standby run on the same host.
DB_NAME: db12c
DB_UNIQUE_NAME: db12ca
DB_UNIQUE_NAME: db12cb
You will see the names in my SQL prompt to make things clear.

This is the current state of the system:

SYS@db12ca> show parameter lost

NAME                                 TYPE                              VALUE
------------------------------------ --------------------------------- ------------------------------
db_lost_write_protect                string                            NONE

SYS@db12ca> select database_role from v$database;

DATABASE_ROLE
------------------------------------------------
PHYSICAL STANDBY
SYS@db12cb> show parameter lost


NAME                                 TYPE                              VALUE
------------------------------------ --------------------------------- ------------------------------
db_lost_write_protect                string                            NONE


SYS@db12cb> select database_role from v$database;

DATABASE_ROLE
------------------------------------------------
PRIMARY

So “db12cb” is my primary and “db12ca” is my standby instance. By the way, that's why I gave them the suffixes “a” and “b”: they may change roles over and over again.

For testing I create a separate tablespace with manual segment space management. This allows me to specify FREELISTS=1; otherwise the changes to my data may end up in different blocks, which is not what I want for my testing. Besides that, I create a user which I will use for testing and give it the necessary grants.

SYS@db12cb> create tablespace marco datafile size 100m segment space management manual;

Tablespace created.

SYS@db12cb> create user marco identified by marco default tablespace marco quota unlimited on marco;

User created.

SYS@db12cb> grant create session to marco;

Grant succeeded.

SYS@db12cb> grant create table to marco;

Grant succeeded.

Scenario #1: No Lost Write Detection

The new user can now create a table and insert some data, so let’s do that.

SYS@db12cb> conn marco/marco
Connected.
MARCO@db12cb> create table testtable (id number, txt varchar2(100)) storage (freelists 1);

Table created.

MARCO@db12cb> insert into testtable values (1, 'Test Lost Write Detection - 1');

1 row created.

MARCO@db12cb> commit;

Commit complete.

Now we can identify the block and check if the data is really in there.

SYS@db12cb> select file_name from dba_data_files where tablespace_name='MARCO';

FILE_NAME
------------------------------------------------------------------------------------------------------------------------
/u01/app/oracle/oradata/DB12CB/datafile/o1_mf_marco_d3llm6dd_.dbf

SYS@db12cb> select block_id, blocks from dba_extents where segment_name='TESTTABLE' and owner='MARCO';

  BLOCK_ID     BLOCKS
---------- ----------
       128          8

SYS@db12cb> alter system checkpoint;

System altered.
[oracle@oel6u4 ~]$ dd if=/u01/app/oracle/oradata/DB12CB/datafile/o1_mf_marco_d3llm6dd_.dbf of=myblock.v1 skip=129 count=1 bs=8192
1+0 records in
1+0 records out
8192 bytes (8.2 kB) copied, 0.000162476 s, 50.4 MB/s
[oracle@oel6u4 ~]$ grep Detection myblock.v1
Binary file myblock.v1 matches

Ok, the data is in that block. In the same way I can now check if the DML was successfully applied on the standby.

SYS@db12ca> alter system flush buffer_cache;

System altered.

SYS@db12ca> select name from v$datafile where name like '%marco%';

NAME
--------------------------------------------------------------------------------
/u01/app/oracle/oradata/DB12CA/datafile/o1_mf_marco_d3llm8nt_.dbf
[oracle@oel6u4 ~]$ dd if=/u01/app/oracle/oradata/DB12CA/datafile/o1_mf_marco_d3llm8nt_.dbf of=sbblock.v1 skip=129 count=1 bs=8192
1+0 records in
1+0 records out
8192 bytes (8.2 kB) copied, 0.000662024 s, 12.4 MB/s
[oracle@oel6u4 ~]$ grep Detection sbblock.v1
Binary file sbblock.v1 matches

So everything is fine until now as it should be.
I will now insert another row into the test table, force that change to be written to disk and then clear the buffer cache.

MARCO@db12cb> insert into testtable values (2, 'Oh my god!');

1 row created.

MARCO@db12cb> commit;

Commit complete.

MARCO@db12cb>

MARCO@db12cb> conn / as sysdba
Connected.
SYS@db12cb> alter system checkpoint;

System altered.

SYS@db12cb> alter system flush buffer_cache;

System altered.

Again, check if it was written to disk.

[oracle@oel6u4 ~]$ dd if=/u01/app/oracle/oradata/DB12CB/datafile/o1_mf_marco_d3llm6dd_.dbf of=myblock.v2 skip=129 count=1 bs=8192
1+0 records in
1+0 records out
8192 bytes (8.2 kB) copied, 0.000318304 s, 25.7 MB/s
[oracle@oel6u4 ~]$ grep Detection myblock.v2
Binary file myblock.v2 matches
[oracle@oel6u4 ~]$ grep god myblock.v2
Binary file myblock.v2 matches

Both values that I inserted are on disk now. Just to make sure everything is ok, I check the block on the standby.

[oracle@oel6u4 ~]$ dd if=/u01/app/oracle/oradata/DB12CA/datafile/o1_mf_marco_d3llm8nt_.dbf of=sbblock.v2 skip=129 count=1 bs=8192
1+0 records in
1+0 records out
8192 bytes (8.2 kB) copied, 0.000162124 s, 50.5 MB/s
[oracle@oel6u4 ~]$ grep Detection sbblock.v2
Binary file sbblock.v2 matches
[oracle@oel6u4 ~]$ grep god sbblock.v2
Binary file sbblock.v2 matches

So far, so good. Now comes the funny part. I will simulate a lost write by just putting my first extracted block back in the datafile.

[oracle@oel6u4 ~]$ dd if=myblock.v1 of=/u01/app/oracle/oradata/DB12CB/datafile/o1_mf_marco_d3llm6dd_.dbf seek=129 count=1 bs=8192 conv=notrunc
1+0 records in
1+0 records out
8192 bytes (8.2 kB) copied, 0.000154517 s, 53.0 MB/s

Now let us query the test table and see what’s happening.

MARCO@db12cb> select * from testtable;

        ID
----------
TXT
--------------------------------------------------------------------------------
         1
Test Lost Write Detection - 1

No error, no warning, just the result. But the result set obviously lacks the row from the second insert. And as the block is completely intact and not corrupted, there is no need to raise any error.
So now it is time to do another INSERT.

MARCO@db12cb> insert into testtable values (3, 'Inconsistency!');

1 row created.

That is the point where it comes to light. The redo apply of the standby database detects a redo record which does not match the data block it has. It has no choice but to stop recovery and raise an error in the alert.log.

2016-11-26 09:52:02.752000 +01:00
ERROR: ORA-00600: internal error code, arguments: [3020] recovery detected a data block with invalid SCN raised at location:kcbr_media_ap_1
Checker run found 1 new persistent data failures
Errors in file /u01/app/oracle/diag/rdbms/db12ca/db12ca/trace/db12ca_pr02_2466.trc  (incident=2705):
ORA-00600: internal error code, arguments: [3020], [2], [129], [8388737], [], [], [], [], [], [], [], []
ORA-10567: Redo is inconsistent with data block (file# 2, block# 129, file offset is 1056768 bytes)
ORA-10564: tablespace MARCO
ORA-01110: data file 2: '/u01/app/oracle/oradata/DB12CA/datafile/o1_mf_marco_d3llm8nt_.dbf'
ORA-10561: block type 'TRANSACTION MANAGED DATA BLOCK', data object# 93368
2016-11-26 09:52:03.882000 +01:00
Incident details in: /u01/app/oracle/diag/rdbms/db12ca/db12ca/incident/incdir_2705/db12ca_pr02_2466_i2705.trc

Besides that, the primary is still running fine, accepts changes, commits and is just doing what a database is supposed to do. This is very unpleasant, since the only way to recover from such a situation is to fail over to the standby and lose all changes that happened after the change to the damaged block. And that can be a lot.

Scenario #2: Lost Write Detection enabled

I enable it by simply setting the parameter to typical on both instances.

SYS@db12ca> alter system set db_lost_write_protect=typical;

System altered.
SYS@db12cb> alter system set db_lost_write_protect=typical;

System altered.

This parameter forces the database to record the SCN of all blocks that it reads from disk to the redo stream. The standby database can use this information to compare the recorded SCN from the redo stream to the actual SCN of the block at the standby site. If there is a difference, it can report a lost write.

Now I walk through the same steps as above. But this time, after simulating the lost write, I simply query the table.

MARCO@db12cb> select * from testtable;

        ID
----------
TXT
--------------------------------------------------------------------------------
         1
Test Lost Write Detection - 1

The SELECT succeeds, but the alert.log of the primary reports the following error.

2016-11-26 10:40:47.143000 +01:00
DMON: A primary database lost write was reported by standby database db12ca. Please look at the alert and DRC logs of the standby database db12ca to see more information about the lost write.

The standby’s alert.log now reports an ORA-752 instead of an ORA-600.

No redo at or after SCN 3448159 can be used for recovery.
PR02: Primary database lost write detected by standby database db12ca
BLOCK THAT LOST WRITE 129, FILE 2, TABLESPACE# 7
The block read during the normal successful database operation had SCN 3346737 (0x0000.00331131) seq 1 (0x01)
ERROR: ORA-00752 detected lost write on primary
Slave exiting with ORA-752 exception
Errors in file /u01/app/oracle/diag/rdbms/db12ca/db12ca/trace/db12ca_pr02_2924.trc:
ORA-00752: recovery detected a lost write of a data block
ORA-10567: Redo is inconsistent with data block (file# 2, block# 129, file offset is 1056768 bytes)
ORA-10564: tablespace MARCO
ORA-01110: data file 2: '/u01/app/oracle/oradata/DB12CA/datafile/o1_mf_marco_d3lnpn8n_.dbf'
ORA-10561: block type 'TRANSACTION MANAGED DATA BLOCK', data object# 93369
Recovery Slave PR02 previously exited with exception 752
MRP0: Background Media Recovery terminated with error 448
Errors in file /u01/app/oracle/diag/rdbms/db12ca/db12ca/trace/db12ca_pr00_2919.trc:
ORA-00448: normal completion of background process

Recovering from a lost write

As in scenario #1, the only way to work around this error is to failover to the standby database.

[oracle@oel6u4 ~]$ dgmgrl
DGMGRL for Linux: Version 12.1.0.2.0 - 64bit Production

Copyright (c) 2000, 2013, Oracle. All rights reserved.

Welcome to DGMGRL, type "help" for information.
DGMGRL> connect sys@db12ca
Password:
Connected as SYSDBA.
DGMGRL> failover to db12ca immediate
Performing failover NOW, please wait...
Failover succeeded, new primary is "db12ca"

Now I can query my test table at the new primary.

SYS@db12ca> select * from marco.testtable;

        ID TXT
---------- ------------------------------
         1 Test Lost Write Detection - 1
         2 Oh my god!

I now need to re-create the old primary. Reinstating it using Flashback Database will not work. The steps will basically be these (see the sketch after the list):

  • remove database from configuration
  • recreate the database using duplicate
  • add database back to the configuration

A lot of effort for such a “small” failure….
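For completeness, a minimal sketch of that rebuild for a broker-managed configuration like this one; the commands are illustrative and assume the password file, TNS entries and a static listener registration for db12cb are already in place, and that the db12cb instance has been started NOMOUNT before the duplicate:

DGMGRL> remove database db12cb;

RMAN> connect target sys@db12ca
RMAN> connect auxiliary sys@db12cb
RMAN> duplicate target database for standby from active database nofilenamecheck;

DGMGRL> add database db12cb as connect identifier is db12cb maintained as physical;
DGMGRL> enable database db12cb;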

Conclusion

Enabling lost write detection is crucial in a Data Guard setup. Lost writes are detected at read time, which lets you perform recovery steps much earlier than without it. Nevertheless, lost writes should not occur at all; if one does occur, something really bad is going on in your environment and you need to investigate its root cause.
That's it, basically. I hope this makes things a little clearer.

Further reading

Resolving ORA-752 or ORA-600 [3020] During Standby Recovery (Doc ID 1265884.1)
Oracle Docs – Database Reference: DB_LOST_WRITE_PROTECT
