Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition
Solution Type: Problem Resolution Sure Solution

Document 2135119.1: SAS HBA does not maintain logs over a reboot on Exadata X5-2L High Capacity Storage Servers on SW Image versions below 12.1.2.3.0
Applies to:
Oracle SuperCluster M6-32 Hardware - Version All Versions and later
Exadata X5-2 Hardware - Version All Versions and later
Exadata X4-8 Hardware - Version All Versions and later
Exadata X5-8 Hardware - Version All Versions and later
Oracle SuperCluster T5-8 Hardware - Version All Versions and later
Information in this document applies to any platform.

Symptoms
SAS HBA does not maintain logs over a reboot on Exadata X5-2L High Capacity Storage Servers on SW Image versions below 12.1.2.3.0.
Note: Exadata X5-2L Extreme Flash Storage Servers do not have a SAS HBA and are not affected.

Cause
A firmware bug and a change in the default log setting.

Solution
1. Check the image version of the system and the firmware version of the SAS HBA:

# imageinfo -ver
12.1.2.2.0.150917

# MegaCli64 -adpallinfo -a0 | grep -i package
FW Package Build: 24.3.0-0081

If the system is running SW Image 12.1.2.3.0 or later, the problem does not apply: these images contain the firmware fix, and the persistent log setting is already enabled by default, so no further action is required.

If the system is running SW Image 12.1.2.2.x, these images contain the firmware fix (24.3.0-0081 in the example above), but the HBA configuration settings still need to be updated. Proceed to step 2.

# imageinfo -ver
12.1.2.1.1.150316.2

# MegaCli64 -adpallinfo -a0 | grep -i package
FW Package Build: 24.3.0-0073

If the system is running a SW Image release earlier than 12.1.2.2.0 with firmware 24.3.0-0073 (as in the example above), it must be updated to SW Image 12.1.2.2.0 or later, which contains SAS HBA firmware package 24.3.0-0081, because of a firmware bug affecting terminal logging. For how to update the image, refer to MOS Note 888828.1. Updating the image is the preferred way to resolve this issue.

If the server cannot be updated to a later image at this time, then on systems that have had a failure, only the SAS HBA firmware may be updated temporarily to address this issue, using the firmware package "MR_6.3.8_24.3.0-0081.rom" attached to this Note, as follows.

NOTE: If updating firmware on multiple storage cells in a rolling manner, do not reboot and apply the firmware update to multiple storage cells at the same time. Do them one at a time, and ensure all disks are re-synchronized with ASM before proceeding to the next storage cell.
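The image-version decision above can be scripted as a quick sanity check. The sketch below is illustrative only (the `action_for_image` helper is not an Oracle-supplied tool); it compares the string reported by `imageinfo -ver` against the thresholds named in this note.

```shell
# Hypothetical helper (not an Oracle tool): map an "imageinfo -ver"
# string to the action this note prescribes.
action_for_image() {
  ver="$1"
  # sort -V performs a field-by-field dotted-version comparison.
  if [ "$(printf '%s\n%s\n' "$ver" 12.1.2.3.0 | sort -V | tail -n1)" = "$ver" ]; then
    echo "none"                      # 12.1.2.3.0 or later: fix and setting already in place
  elif [ "$(printf '%s\n%s\n' "$ver" 12.1.2.2.0 | sort -V | tail -n1)" = "$ver" ]; then
    echo "enable-logging"            # 12.1.2.2.x: firmware fixed, enable the log setting
  else
    echo "update-image-or-firmware"  # older: update image (preferred) or HBA firmware
  fi
}

# Example: action_for_image "$(imageinfo -ver)"
```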
i. ASM drops a disk shortly after it/they are taken offline for longer than the diskgroup allows. Check the default DISK_REPAIR_TIME:

SQL> select dg.name,a.value from v$asm_attribute a, v$asm_diskgroup dg
     where a.name = 'disk_repair_time' and a.group_number = dg.group_number;

As long as the value is large enough to comfortably replace the hardware in a timely manner, no change is required.

ii. Check whether ASM will be OK if the grid disks go OFFLINE:

# cellcli -e list griddisk attributes name,asmmodestatus,asmdeactivationoutcome
...snippet...
DATA_CD_09_cel01 ONLINE Yes
DATA_CD_10_cel01 ONLINE Yes
DATA_CD_11_cel01 ONLINE Yes
RECO_CD_00_cel01 ONLINE Yes
...etc...

If one or more disks return asmdeactivationoutcome='No', wait and re-run the command until all grid disks return asmdeactivationoutcome='Yes'.

NOTE: Taking the storage server offline while one or more disks return a status of asmdeactivationoutcome='No' will cause Oracle ASM to dismount the affected disk group, causing the databases to shut down abruptly.
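The step-ii check can be wrapped in a small poll loop. The sketch below is illustrative (the helper name is not part of cellcli); it parses the captured command output and reports whether any grid disk still answers asmdeactivationoutcome='No'.

```shell
# Illustrative helper: given the output of
#   cellcli -e list griddisk attributes name,asmmodestatus,asmdeactivationoutcome
# print "yes" when no grid disk reports asmdeactivationoutcome='No'.
safe_to_deactivate() {
  if printf '%s\n' "$1" | awk 'NF {print $NF}' | grep -qx No; then
    echo no
  else
    echo yes
  fi
}

# Possible use on a cell: poll once a minute until it is safe to proceed.
# until [ "$(safe_to_deactivate "$(cellcli -e list griddisk attributes \
#     name,asmmodestatus,asmdeactivationoutcome)")" = yes ]; do sleep 60; done
```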
iii. Run the cellcli command to inactivate all grid disks on the cell you wish to power down/reboot:

# cellcli -e alter griddisk all inactive
GridDisk DATA_CD_00_dmorlx8cel01 successfully altered
GridDisk DATA_CD_01_dmorlx8cel01 successfully altered
GridDisk DATA_CD_02_dmorlx8cel01 successfully altered
GridDisk RECO_CD_00_dmorlx8cel01 successfully altered
...etc...

iv. Execute the command below; the output should show asmmodestatus='UNUSED' or 'OFFLINE' and status='inactive' for all grid disks:

# cellcli -e list griddisk attributes name,status,asmmodestatus,asmdeactivationoutcome
DATA_CD_00_dmorlx8cel01 inactive OFFLINE Yes
DATA_CD_01_dmorlx8cel01 inactive OFFLINE Yes
DATA_CD_02_dmorlx8cel01 inactive OFFLINE Yes
RECO_CD_00_dmorlx8cel01 inactive OFFLINE Yes
...etc...
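Before shutting down cell services in step c), it can help to confirm mechanically that every grid disk is inactive and OFFLINE (or UNUSED). A sketch, with a hypothetical helper name, that inspects the captured output of the step-iv command:

```shell
# Illustrative helper: given the output of
#   cellcli -e list griddisk attributes name,status,asmmodestatus,asmdeactivationoutcome
# print "yes" only if every grid disk is inactive and OFFLINE or UNUSED.
all_grid_disks_offline() {
  printf '%s\n' "$1" | awk '
    NF && ($2 != "inactive" || ($3 != "OFFLINE" && $3 != "UNUSED")) { bad = 1 }
    END { print bad ? "no" : "yes" }'
}
```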
c) Disable Exadata Storage Server services with the following command as the 'root' user:

# cellcli -e alter cell shutdown services all
d) Upgrade the HBA firmware with the following command as the 'root' user:

# /opt/oracle.cellos/CheckHWnFWProfile -action updatefw -mode diagnostic -component DiskController -attribute DiskControllerFirmwareRevision -diagnostic_version 24.3.0-0081 -fwpath /tmp/MR_6.3.8_24.3.0-0081.rom
Upon completion of the firmware upgrade, the system reboots automatically. The entire process takes about 10 minutes after the cell reboots, excluding disk re-synchronization time.

e) Verify the SAS HBA firmware is updated:

# MegaCli64 -adpallinfo -a0 | grep -i package
FW Package Build: 24.3.0-0081

The firmware package with the logging bug fix is 24.3.0-0081.

f) Verify the disks and bring them online as follows:

i. Verify the 12 disks are visible. The following command should show 12 disks:

# lsscsi | grep -i LSI
[0:2:0:0]  disk LSI MR9361-8i 4.23 /dev/sda
[0:2:1:0]  disk LSI MR9361-8i 4.23 /dev/sdb
[0:2:2:0]  disk LSI MR9361-8i 4.23 /dev/sdc
[0:2:3:0]  disk LSI MR9361-8i 4.23 /dev/sdd
[0:2:4:0]  disk LSI MR9361-8i 4.23 /dev/sde
[0:2:5:0]  disk LSI MR9361-8i 4.23 /dev/sdf
[0:2:6:0]  disk LSI MR9361-8i 4.23 /dev/sdg
[0:2:7:0]  disk LSI MR9361-8i 4.23 /dev/sdh
[0:2:8:0]  disk LSI MR9361-8i 4.23 /dev/sdi
[0:2:9:0]  disk LSI MR9361-8i 4.23 /dev/sdj
[0:2:10:0] disk LSI MR9361-8i 4.23 /dev/sdk
[0:2:11:0] disk LSI MR9361-8i 4.23 /dev/sdl

ii. Activate the grid disks:

# cellcli
…
CellCLI> alter griddisk all active
GridDisk DATA_CD_00_dmorlx8cel01 successfully altered
GridDisk DATA_CD_01_dmorlx8cel01 successfully altered
GridDisk RECO_CD_00_dmorlx8cel01 successfully altered
GridDisk RECO_CD_01_dmorlx8cel01 successfully altered
...etc...

iii. Verify all grid disks show 'active':

CellCLI> list griddisk
DATA_CD_00_dmorlx8cel01 active
DATA_CD_01_dmorlx8cel01 active
RECO_CD_00_dmorlx8cel01 active
RECO_CD_01_dmorlx8cel01 active
...etc...

iv. Verify all grid disks have been successfully put online using the following command. Wait until asmmodestatus is ONLINE for all grid disks. The following is an example of output early in the activation process:

CellCLI> list griddisk attributes name,status,asmmodestatus,asmdeactivationoutcome
DATA_CD_00_dmorlx8cel01 active ONLINE Yes
DATA_CD_01_dmorlx8cel01 active ONLINE Yes
DATA_CD_02_dmorlx8cel01 active ONLINE Yes
RECO_CD_00_dmorlx8cel01 active SYNCING Yes
...etc...

Notice in the example above that RECO_CD_00_dmorlx8cel01 is still in the 'SYNCING' process. Oracle ASM synchronization is complete only when ALL grid disks show asmmodestatus=ONLINE. This process can take some time, depending on how busy the machine is, and has been while this individual server was down for repair.

g) Repeat the above steps to update the firmware on each storage cell, as needed.

NOTE: If updating firmware on multiple storage cells in a rolling manner, do not reboot and apply the firmware update to multiple storage cells at the same time. Do them one at a time, and ensure all disks are completely re-synchronized with ASM before proceeding to the next storage cell.
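The step-iv wait can likewise be automated. A sketch (the helper name is illustrative, not part of CellCLI) that inspects the asmmodestatus column of the captured output:

```shell
# Illustrative helper: given the output of
#   list griddisk attributes name,status,asmmodestatus,asmdeactivationoutcome
# print "yes" once every grid disk shows asmmodestatus=ONLINE.
resync_complete() {
  printf '%s\n' "$1" | awk '
    NF && $3 != "ONLINE" { pending = 1 }
    END { print pending ? "no" : "yes" }'
}

# Possible use: re-check every few minutes before moving to the next cell.
# until [ "$(resync_complete "$(cellcli -e list griddisk attributes \
#     name,status,asmmodestatus,asmdeactivationoutcome)")" = yes ]; do sleep 300; done
```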
2. Check whether the SAS HBA terminal log is set to persist across reboots:

# MegaCli64 -fwtermlog -bbuget -a0
Battery is OFF for TTY history on Adapter 0

Exit Code: 0x00

The output above shows that battery mode is OFF for the fwtermlog. Turn it on:

# MegaCli64 -fwtermlog -bbuon -a0
Battery is set to ON for TTY history on Adapter 0

Running the above command on the cells has no operational impact. The change is persistent across cell reboots and power cycles, and is unset only by an explicit command.
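When auditing many cells, the bbuget output can be reduced to a single state word. The sketch below is illustrative: the helper name is an assumption, and the pattern matching is keyed only to the message texts shown above.

```shell
# Illustrative helper: classify "MegaCli64 -fwtermlog -bbuget -a0" output
# as on/off/unknown based on the message texts shown in this note.
tty_log_battery_state() {
  case "$1" in
    *"set to ON"*|*"Battery is ON"*) echo on ;;
    *"OFF"*)                         echo off ;;
    *)                               echo unknown ;;
  esac
}

# Example: tty_log_battery_state "$(MegaCli64 -fwtermlog -bbuget -a0)"
```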
References
NOTE 888828.1 - Exadata Database Machine and Exadata Storage Server Supported Versions
BUG 22023718 - SET FWTERMLOG BBUON FOR STORAGE CELLS
BUG 21534072 - ASPEN TTY LOG DUMP IS NOT PERSISTENT ACROSS MULTIPLE BOOTS

Attachments
This solution has no attachment