ASM instance in INTERMEDIATE state + CHECK TIMED OUT state_details

Yet another clash with an ASM 11gR2 bug, this time during a new database creation in a 6-node development RAC environment. I got a requirement to create a new database, so I created a new ASM diskgroup manually and began creating the database with the DBCA tool. However, I ran into the following error when the newly created ASM diskgroup was selected to host the data files of the database:

After getting the error, I thought I had better confirm once again that the diskgroup was properly mounted on the local ASM instance. It was mounted, yet the issue persisted. I therefore checked the status of the resources (asm and dg) in the cluster by issuing the 'crsctl stat res -t' command. The output showed the diskgroup in question in OFFLINE state, and scrolling further down brought a strange and slightly scary moment: the ora.asm resource was in INTERMEDIATE state with CHECK TIMED OUT state_details.
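For anyone who wants to narrow the check to just the two resources involved, the following is a minimal sketch rather than the exact commands I ran. The diskgroup name NEWDG and the helper dg_resource_name are hypothetical; the sketch assumes the usual 11gR2 convention that a diskgroup is registered in the cluster as ora.<NAME>.dg, and that crsctl is in the PATH of the Grid Infrastructure owner.

```shell
#!/bin/sh
# Hypothetical diskgroup name (assumption); replace with your own.
DG_NAME="${DG_NAME:-NEWDG}"

# Build the Clusterware resource name for a diskgroup: ora.<NAME>.dg
# (helper name is made up for this sketch).
dg_resource_name() {
    printf 'ora.%s.dg\n' "$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')"
}

if command -v crsctl >/dev/null 2>&1; then
    # Status of the ASM resource on every node.
    crsctl stat res ora.asm -t
    # Status of the diskgroup resource on every node.
    crsctl stat res "$(dg_resource_name "$DG_NAME")" -t
else
    echo "crsctl not found in PATH; run this as the Grid Infrastructure owner" >&2
fi
```

Restricting the output to these two resources makes it easier to spot an INTERMEDIATE or OFFLINE state on a cluster with many nodes.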

Curiosity led me to do some research on MOS, and I quickly came across the following document:

oraagent.bin Exits After Check Timed Out [ID 1323679.1]

The issue was nearly identical to the one described in the document; the behavior is due to bug 11807012, which is apparently fixed in 12.1 and above. The document also mentioned that interim patch 11807012 exists for certain platforms/versions and advised engaging Oracle Support if it does not exist for your platform/version. Unfortunately, we couldn't find the interim patch for our OS (HP-UX). We then opened an SR, and the engineer confirmed that the problem was indeed due to this bug. As a workaround, the engineer recommended the following change, followed by an ASM instance restart:

crsctl modify resource "ora.asm" -attr "CHECK_TIMEOUT=132"

The above change was applied and the ASM instance was bounced for it to take effect. (We needed downtime to deploy the change; since it was a dev environment, we got it easily.) The ora.asm resource then came back to its normal condition, and the database was subsequently created successfully.
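If you want to confirm that the new value is actually stored in the resource profile, a small sketch like the one below may help. It assumes that 'crsctl stat res ora.asm -p' prints the profile as NAME=value lines; the helper get_attr is made up for this example and is not part of any Oracle tool.

```shell
#!/bin/sh
# Print the value of one attribute from NAME=value lines read on stdin
# (hypothetical helper for this sketch).
get_attr() {
    awk -v key="$1" 'index($0, key "=") == 1 { print substr($0, length(key) + 2) }'
}

if command -v crsctl >/dev/null 2>&1; then
    # Show the CHECK_TIMEOUT currently stored in the ora.asm profile.
    crsctl stat res ora.asm -p | get_attr CHECK_TIMEOUT
    # Re-check the resource state on all nodes after the restart.
    crsctl stat res ora.asm -t
else
    echo "crsctl not found in PATH; run this as the Grid Infrastructure owner" >&2
fi
```

On our cluster the first command would be expected to return the value set by the workaround, and the second should no longer show the INTERMEDIATE state once the instance has been bounced.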

Out of six ASM instances, we found that three were in the same state. On the positive side, this does not cause any trouble for ongoing operations unless you need to do something like what we were doing. The following image outlines the status of the ASM resources:

Happy reading,


1 comment:

Wissem said...

thank you Jaffar for sharing it!