3.13.2013

Possible data corruption, ORA-600 on RAC instance shutdown

I am afraid this is not good news for those who manage 11.2.0.1 or 11.2.0.2 RAC databases. On a very unlucky day, shutting down a RAC database or instance in normal, transactional, or immediate mode could raise ORA-600 errors or cause logical corruption of the redo stream.

According to MOS ID [1318986.1], when one or more RAC instances (11.2.0.1 or 11.2.0.2) are shut down in normal, transactional, or immediate mode, it is possible under certain circumstances to hit bug 1020523, which can produce a range of ORA-600 errors or corrupt data. The note suggests the following workaround:

For complete RAC database shutdown:

    SQL> alter system checkpoint;
    srvctl stop database -d <db_unique_name> -o abort -f

For one or more RAC instances, run the below for each instance:

    SQL> alter system checkpoint local;
    SQL> shutdown abort;

or

    # srvctl stop instance -d <db_unique_name> -i <instance_name> -o abort -f

This works because a shutdown abort (with the -f option under srvctl) completely bypasses the code path that is vulnerable to the bug.
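The workaround steps above could be wrapped in a small shell helper so that the checkpoint is never skipped before the abort. This is only a sketch: the function names and the `DRY_RUN` switch are hypothetical, and it assumes `sqlplus` and `srvctl` are on the PATH, that `ORACLE_SID` points at a local instance, and that OS authentication as SYSDBA is allowed. Dry-run mode is the default, so nothing is executed unless you turn it off.

```shell
#!/bin/sh
# Sketch of the workaround: checkpoint first, then shutdown abort via srvctl.
# DRY_RUN, checkpoint, run, stop_rac_database and stop_rac_instance are
# hypothetical names for illustration; they are not part of any Oracle tool.
DRY_RUN="${DRY_RUN:-1}"

checkpoint() {
    # $1 is the SQL statement to run as SYSDBA on the local instance.
    if [ "$DRY_RUN" = "1" ]; then
        echo "sqlplus: $1"
    else
        echo "$1" | sqlplus -S "/ as sysdba"
    fi
}

run() {
    # Print the command in dry-run mode; execute it otherwise.
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

stop_rac_database() {
    # $1 = db_unique_name. Global checkpoint, then abort all instances.
    checkpoint "alter system checkpoint;"
    run srvctl stop database -d "$1" -o abort -f
}

stop_rac_instance() {
    # $1 = db_unique_name, $2 = instance_name.
    # The local checkpoint only covers the instance sqlplus connects to,
    # so this is meant to be run on the node hosting that instance.
    checkpoint "alter system checkpoint local;"
    run srvctl stop instance -d "$1" -i "$2" -o abort -f
}
```

After sourcing the file, `stop_rac_database orcl` prints the two steps it would perform; setting `DRY_RUN=0` makes it actually run them. The names `orcl` and `orcl1` used when trying it out are placeholders.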

The known downstream effects include the following:

* Data corruption around the time one or more of the RAC instances are shut down

* One of the following ORA-600 asserts:
- ORA-600 [kclchkblk_3]
- ORA-600 [kclwcrs_6]
- ORA-600 [ktubko_1]
- ORA-600 [kcratr_scan_lostwrt]
- ORA-600 [3020] on the standby database
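To check whether an instance has already logged one of these asserts, the alert log can be searched for the signatures listed above. The sketch below is an assumption-laden example: `scan_alert_log` is a hypothetical helper name, the alert log path depends on your installation, and the pattern assumes the assert argument appears in square brackets on the same line as the ORA-600 (or ORA-00600) code.

```shell
#!/bin/sh
# Hypothetical helper: list alert log lines that carry one of the
# ORA-600 arguments named in this post, with their line numbers.
scan_alert_log() {
    # $1 = path to an alert log file (location varies by installation).
    grep -n -E 'ORA-0*600.*\[(kclchkblk_3|kclwcrs_6|ktubko_1|kcratr_scan_lostwrt|3020)\]' "$1"
}
```

Call it as `scan_alert_log /path/to/alert_<instance_name>.log`. No output (and a non-zero exit status from `grep`) means none of the listed signatures were found in that file.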

You may also refer to my earlier post on choosing a quick database shutdown mode.


Reference: