Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition
Solution Type: Technical Instruction Sure
Solution 2230270.1 : How to Replace a Big Data Appliance (Original V1) Faulty RAID HBA BBU
Oracle Confidential PARTNER - Available to partners (SUN).

Applies to:
Big Data Appliance Hardware - Version All Versions and later
Information in this document applies to any platform.

Goal
How to Replace a Big Data Appliance Faulty RAID HBA BBU

Solution

DISPATCH INSTRUCTIONS

WHAT SKILLS DOES THE FIELD ENGINEER/ADMINISTRATOR NEED?: BDA trained

The instructions below assume the Customer's system administrator is available and working with the field engineer (FE) onsite to manage the host OS and BDA services. They are provided here so that the FE has all the necessary steps available when onsite. The FE can perform them if the Customer's system administrator wants, allows, or needs help with these steps.

1. Set the cache policy of all logical drives to WriteThrough cache mode:
# /opt/MegaRAID/MegaCli/MegaCli64 -ldsetprop wt -lall -a0

Verify the current cache policy for all logical volumes is now WriteThrough:
# /opt/MegaRAID/MegaCli/MegaCli64 -ldpdinfo -a0 | grep BBU
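The WriteThrough verification above can be scripted rather than read by eye. The following is a minimal sketch and not part of the original procedure. The function name `cache_policy_is` is hypothetical, and the sketch assumes the MegaCli output contains lines beginning with `Current Cache Policy:` followed by the active mode as the first comma-separated value. That layout may differ between MegaCli versions, so check it against real output first.

```shell
#!/bin/sh
# Hypothetical helper (not part of the Oracle procedure).
# Reads MegaCli logical-drive output on stdin and succeeds only if at least
# one "Current Cache Policy" line is present and every such line reports the
# expected mode as its first value.
# Assumption: those lines look like
#   Current Cache Policy: WriteThrough, ReadAheadNone, Direct, ...
cache_policy_is() {
    expected="$1"
    awk -v want="$expected" '
        /Current Cache Policy/ {
            seen++
            # Drop the label up to the first colon, then take the first value.
            sub(/^[^:]*:[ \t]*/, "")
            split($0, parts, /[ ,]+/)
            if (parts[1] != want) bad++
        }
        END {
            if (seen > 0 && bad == 0) exit 0
            exit 1
        }
    '
}

# Example use on the server node (run as root):
#   /opt/MegaRAID/MegaCli/MegaCli64 -ldpdinfo -a0 | grep BBU \
#       | cache_policy_is WriteThrough && echo "all logical drives WriteThrough"
```

The same helper can be reused after the battery replacement by passing `WriteBack` as the argument.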
2. The Customer's system administrator should shut down the server node and BDA services following the shutdown instructions for Big Data Appliance detailed in MOS Note 2099858.1.

WHAT ACTION DOES THE FIELD ENGINEER/ADMINISTRATOR NEED TO TAKE?:

Physical RAID HBA BBU replacement:

NOTE: Do not remove any cables prior to sliding the server forward, or the loose cable ends will jam in the cable management arms. Take care to ensure the cables and Cable Management Arm (CMA) are moving properly. Refer to Note 1444683.1 for CMA handling training.

2. Disconnect the AC power cords.

NOTE: Do NOT attempt to remove any screws from the top side of the HBA and battery pack. Those screws hold the standoffs that provide the bottom screw holes and should remain with the battery pack.
b) Detach the battery pack including circuit board from the HBA by gently lifting it from its circuit board connector on the top side of the HBA.
1. Once the ILOM has booted you will see a slow blink on the green LED for the server. Press the power button on the front of the server to power on the unit.

-> start /SP/console
c. Use the local KVM and Keyboard/Monitor tray: open the tray, select the appropriate BDA Server Node hostname from the “Target Devices” list, and then select the “Console” button.

Watch the LSI controller BIOS in particular while it is loading. If it gives a warning message regarding drives with preserved cache, choose “D” to discard the cache and continue. This is not an issue, as the disk will be re-synced by HDFS after boot. If it gives a warning message that drives are in write-through mode due to a low battery, choose to continue. The boot should then continue normally up to the login prompt.

Note: If using the ILOM serial console to monitor the boot, there may be a long pause during subsequent boot steps before the login prompt displays. The default console is the graphics console, so portions of the boot messages go only to the graphics screen and do not display on the serial console.

3. Once the full boot has completed, you should be able to log in as the ‘root’ user and verify that the new battery is seen and is charging:
# /opt/MegaRAID/MegaCli/MegaCli64 -adpbbucmd -a0
4. Set all logical drives cache policy to WriteBack cache mode:
# /opt/MegaRAID/MegaCli/MegaCli64 -ldsetprop wb -lall -a0

5. Verify the current cache policy for all logical drives is now using WriteBack cache mode:
# /opt/MegaRAID/MegaCli/MegaCli64 -ldpdinfo -a0 | grep BBU

6. Verify the InfiniBand links are up at 40Gbps as the cables were disconnected:
# /usr/sbin/ibstatus
Infiniband device 'mlx4_0' port 1 status:
        default gid:     fe80:0000:0000:0000:0021:2800:013e:70bb
        base lid:        0x50
        sm lid:          0x1
        state:           4: ACTIVE
        phys state:      5: LinkUp
        rate:            40 Gb/sec (4X QDR)

Infiniband device 'mlx4_0' port 2 status:
        default gid:     fe80:0000:0000:0000:0021:2800:013e:70bc
        base lid:        0x51
        sm lid:          0x1
        state:           4: ACTIVE
        phys state:      5: LinkUp
        rate:            40 Gb/sec (4X QDR)

7. Once the hardware is verified as up and running, the Customer's system administrator will need to verify the BDA services are up following the startup procedures for Big Data Appliance detailed in MOS Note 2099858.1

PARTS NOTE:

References
Oracle ILOM 3.0 documentation library - http://docs.oracle.com/cd/E19860-01/index.html
<NOTE:2099858.1> - Steps to Gracefully Shutdown and Power on a Single Node on Oracle Big Data Appliance Prior to Maintenance
https://www.broadcom.com/support/oem/oracle/6gb/sg_x_sas6-r-int-z

Attachments
This solution has no attachment
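As a supplement to step 6, the InfiniBand link check can also be scripted. The following is a minimal sketch and not part of the original procedure. The function name `ib_links_ok` is hypothetical, and the sketch assumes `ibstatus` prints one `state:`, `phys state:`, and `rate:` line per port, as in the sample output shown for step 6.

```shell
#!/bin/sh
# Hypothetical helper (not part of the Oracle procedure).
# Reads ibstatus output on stdin and succeeds only if at least one port is
# listed and every listed port reports ACTIVE, LinkUp, and a 40 Gb/sec rate.
ib_links_ok() {
    awk '
        /^Infiniband device/ { ports++ }
        # "phys state:" lines do not match this pattern because the text
        # before "state:" must be whitespace only.
        /^[ \t]*state:/      { if ($0 ~ /ACTIVE/)     active++ }
        /^[ \t]*phys state:/ { if ($0 ~ /LinkUp/)     linkup++ }
        /^[ \t]*rate:/       { if ($0 ~ /40 Gb\/sec/) rate++ }
        END {
            if (ports > 0 && active == ports && linkup == ports && rate == ports) exit 0
            exit 1
        }
    '
}

# Example use on the server node (run as root):
#   /usr/sbin/ibstatus | ib_links_ok && echo "InfiniBand links up at 40 Gb/sec"
```

A non-zero exit status only indicates that some port did not match. The full `ibstatus` output should still be reviewed to see which port and which field failed.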