Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-71-1616910.1
Update Date:2017-08-24

Solution Type  Technical Instruction Sure

Solution  1616910.1 :   ODA Nodes Lacking Space Due to Large Cluster Health Monitor File Crfclust.Bdb  


Related Items
  • Oracle Database - Enterprise Edition
  • Oracle Database Appliance
Related Categories
  • PLA-Support>Eng Systems>Exadata/ODA/SSC>Oracle Database Appliance>DB: ODA_EST




Created from <SR 3-8358198991>

Applies to:

Oracle Database Appliance - Version All Versions and later
Oracle Database - Enterprise Edition - Version 11.2.0.4 to 11.2.0.4 [Release 11.2]
Information in this document applies to any platform.

Goal

When checking disk space on the ODA nodes, you see that the /u01 partition is 66-67% full:

[root@oda1 oda1]# df -h /u01
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/VolGroupSys-LogVolU01
  97G 61G 32G 67% /u01

[root@oda1 oda1]# df -h /u01
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/VolGroupSys-LogVolU01
  97G 61G 32G 66% /u01


Checking the size of the files under GRID_HOME, you see that the file crfclust.bdb occupies 31G:

[root@oda1 oda1]# ls -lrth
total 33G
-rw-r----- 1 root root 8.0K Oct 18 23:01 repdhosts.bdb
-rw-r----- 1 root root 24K Dec 10 18:48 __db.001
-rw-r--r-- 1 root root 115M Dec 10 18:48 oda1.ldb
-rw-r----- 1 root root 8.0K Dec 10 18:49 crfconn.bdb
-rw-r----- 1 root root 16M Jan 13 08:21 log.0000019932
-rw-r----- 1 root root 306M Jan 13 08:28 crfts.bdb
-rw-r----- 1 root root 472M Jan 13 08:28 crfloclts.bdb
-rw-r----- 1 root root 375M Jan 13 08:28 crfcpu.bdb
-rw-r----- 1 root root 31G Jan 13 08:28 crfclust.bdb   <<<<<<<<<<<<<<<<<<<<<<<<<<<
-rw-r----- 1 root root 16M Jan 13 08:29 log.0000019933
-rw-r----- 1 root root 56K Jan 13 08:29 __db.006
-rw-r----- 1 root root 386M Jan 13 08:29 crfhosts.bdb
-rw-r----- 1 root root 375M Jan 13 08:29 crfalert.bdb
-rw-r----- 1 root root 1.2M Jan 13 08:29 __db.005
-rw-r----- 1 root root 392K Jan 13 08:29 __db.002
-rw-r----- 1 root root 2.1M Jan 13 08:29 __db.004
-rw-r----- 1 root root 2.6M Jan 13 08:29 __db.003


This is true for both ODA nodes.
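
To spot an oversized repository file on each node without reading the whole listing, a minimal sketch along the following lines may help. The function name list_large_files and the default 1 GiB threshold are illustrative assumptions, not part of any Oracle tooling; the directory argument is the CHM repository directory on your system. The files are owned by root, so it would normally be run as root.

```shell
#!/bin/bash
# list_large_files DIR [MIN_KB]   (hypothetical helper, not an Oracle utility)
# Prints "usage_in_KB<TAB>path" for regular files directly under DIR that are
# larger than MIN_KB kilobytes (default 1048576 KB = 1 GiB), largest first.
list_large_files() {
    local dir="$1"
    local min_kb="${2:-1048576}"
    # -size +Nk selects files larger than N KiB; du -k reports disk usage in KB.
    find "$dir" -maxdepth 1 -type f -size +"${min_kb}"k -exec du -k {} + | sort -rn
}
```

For example, list_large_files /u01/app/11.2.0.3/grid/crf/db/oda1 would list only the files above 1 GiB in the directory used later in this note.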


 

Solution

This is caused by a known issue in which the Cluster Health Monitor (CHM) repository database grows well beyond its expected size.

This is outlined in the following MOS Notes:

     Oracle Cluster Health Monitor (CHM) using large amount of space (more than default) (Doc ID 1343105.1)

     db_delete: BDB grown beyond user desired limits disabling loggerd (Doc ID 1574492.1)

To resize this database:

1.  As the grid user on oda1, run the following command:

[grid@oda1 ~]$ oclumon manage -repos resize 259200
oda1 --> retention check successful
oda2 --> retention check successful
New retention is 259200 and will use 4516300800 bytes of disk space

CRS-9115-Cluster Health Monitor repository size change completed on all nodes.
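
In this release the value passed to the resize option is understood to be a retention time in seconds, and the output reports the resulting disk usage in bytes. The sketch below only converts the two figures from the output above into more readable units; the function names are illustrative.

```shell
#!/bin/bash
# Hypothetical helpers for reading the resize output; plain integer arithmetic.
seconds_to_hours() { echo $(( $1 / 3600 )); }
seconds_to_days()  { echo $(( $1 / 86400 )); }
bytes_to_mib()     { echo $(( $1 / 1024 / 1024 )); }

seconds_to_hours 259200      # prints 72
seconds_to_days 259200       # prints 3
bytes_to_mib 4516300800      # prints 4307 (roughly 4.2 GiB)
```

So the new setting keeps about three days of CHM data and caps the repository at roughly 4.2 GiB, compared with the 31G file seen earlier.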

 


2.   After the resize, checking the repository size fails with the following error:

[grid@oda1 ~]$ oclumon manage -get repsize
CRS-9011-Error manage: Failed to initialize connection to the Cluster Logger Service

 This occurs because the existing CHM repository BDB database is larger than the limit set by the new retention period, which disables loggerd (see Doc ID 1574492.1).

 

3.  Restart the ora.crf resource on both nodes to resolve the issue:

[grid@oda1 ~]$ crsctl stop res ora.crf -init
CRS-2673: Attempting to stop 'ora.crf' on 'oda1'
CRS-2677: Stop of 'ora.crf' on 'oda1' succeeded
[grid@oda1 ~]$ crsctl start res ora.crf -init

[grid@oda2 oda2]$ crsctl stop res ora.crf -init
CRS-2673: Attempting to stop 'ora.crf' on 'oda2'
CRS-2677: Stop of 'ora.crf' on 'oda2' succeeded
[grid@oda2 ~]$ crsctl start res ora.crf -init

 

4.  Check that the file is now smaller:

[grid@oda1 ~]$ cd /u01/app/11.2.0.3/grid/crf/db/oda1
[grid@oda1 oda1]$ ls -lrth
total 572K
-rw-r----- 1 root root 56K Jan 17 08:04 __db.006
-rw-r----- 1 root root 1.2M Jan 17 08:04 __db.005
-rw-r----- 1 root root 2.1M Jan 17 08:04 __db.004
-rw-r----- 1 root root 392K Jan 17 08:04 __db.002
-rw-r----- 1 root root 24K Jan 17 08:04 __db.001
-rw-r----- 1 root root 8.0K Jan 17 08:04 crfhosts.bdb
-rw-r----- 1 root root 8.0K Jan 17 08:04 crfconn.bdb
-rw-r----- 1 root root 128K Jan 17 08:04 crfclust.bdb   <<<<<<<<<<<
-rw-r----- 1 root root 16M Jan 17 08:04 log.0000000001
-rw-r----- 1 root root 2.6M Jan 17 08:04 __db.003
-rw-r----- 1 root root 8.0K Jan 17 08:04 crfts.bdb
-rw-r----- 1 root root 8.0K Jan 17 08:04 crfloclts.bdb
-rw-r----- 1 root root 8.0K Jan 17 08:04 crfcpu.bdb
-rw-r----- 1 root root 8.0K Jan 17 08:04 crfalert.bdb
-rw-r--r-- 1 root root 115M Jan 17 08:05 oda1.ldb

 

 The disk space has also been reclaimed:

[grid@oda1 ~]$ df -h /u01
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/VolGroupSys-LogVolU01
  97G 29G 64G 32% /u01
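
To repeat this check on both nodes, or on a schedule, a small sketch such as the following could be used. The function names and the threshold value are assumptions chosen for illustration. It relies on df -P, which prints each filesystem on a single line and so avoids the wrapped device name seen in the output above.

```shell
#!/bin/bash
# fs_use_percent MOUNTPOINT   (hypothetical helper)
# Prints the filesystem usage as a bare integer, e.g. 32 for "32%".
fs_use_percent() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# fs_below_threshold MOUNTPOINT LIMIT   (hypothetical helper)
# Succeeds (exit status 0) only if usage could be read and is below LIMIT percent.
fs_below_threshold() {
    local used
    used=$(fs_use_percent "$1")
    [ -n "$used" ] && [ "$used" -lt "$2" ]
}
```

For example, fs_below_threshold /u01 60 || echo "/u01 is at or above 60% used" prints a warning when the partition fills up again.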

 
 

References

<NOTE:1343105.1> - Oracle Cluster Health Monitor (CHM) using large amount of space (more than default)
<NOTE:1574492.1> - db_delete: BDB grown beyond user desired limits disabling loggerd

Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.