Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-79-1542070.1
Update Date:2017-10-16
Keywords:

Solution Type  Predictive Self-Healing Sure

Solution  1542070.1 :   Reference information on LSI HBA battery backup unit (BBU) used on SAS2 RAID HBA  


Related Items
  • Netra Blade X3-2B
  • Netra Server X3-2
  • Sun Netra X4270 Server
  • Sun Server X3-2
  • Sun Netra X6270 M2 Server Module
  • Sun Fire X4470 Server
  • Sun Server X2-8
  • Sun Server X3-2L
  • Sun Fire X4270 M2 Server
  • Sun Blade X3-2B
  • Sun Blade X6270 M2 Server Module
  • Sun Server X2-4
  • Sun Fire X4800 Server
  • Sun Fire X4170 M2 Server
Related Categories
  • PLA-Support>Sun Systems>x86>Server>SN-x86: Sun Server X3


General reference information on the LSI HBA battery backup unit (BBU) used on SAS2 RAID HBAs - SGX-SAS6-R-INT-Z and SGX-SAS6-R-REM-Z

In this Document
Purpose
Scope
Details
 About LSI SAS2 HBA Batteries
 Write-Back vs. Write-Through mode
 Battery Monitoring via Learn Cycles
 Battery Operating Temperature Guidelines
 Battery Swelling Guidelines
 Battery Related MegaCLI Commands


Applies to:

Sun Fire X4800 Server - Version Not Applicable and later
Sun Fire X4170 M2 Server - Version Not Applicable and later
Sun Server X3-2 - Version Not Applicable and later
Sun Server X3-2L - Version Not Applicable and later
Sun Netra X6270 M2 Server Module - Version Not Applicable and later
Information in this document applies to any platform.

Purpose

 Provide general reference information on the LSI HBA battery backup unit (BBU) used on the SAS2 RAID HBAs (SGX-SAS6-R-INT-Z and SGX-SAS6-R-REM-Z) provided with Oracle systems.

Scope

 This document covers LSI HBA battery backup units.

Note: LSI was the original manufacturer. It was later acquired by Avago Technologies, which now operates as Broadcom. All three company names may appear in different places in documentation and firmware output.

Details

 

About LSI SAS2 HBA Batteries


An LSI SAS2 6Gbps RAID Host Bus Adapter (HBA, codename Niwot) is used in many Sun-Oracle x86, Blade and SPARC based systems to control and interface with the disk drives. This HBA contains 512MB of Low Voltage DDR2 memory, which it uses to cache data writes in order to improve the performance of disk write operations. The HBA also contains a Battery Backup Unit (BBU), which is designed to supply regulated battery power to the cache memory during a main system power outage, long enough for main system power to be brought back online. For Sun-Oracle systems, the specified hold-up time is 48 hours, which means the battery will maintain the memory cache for 48 hours after power is lost. If power remains lost for more than 48 hours, the data in the cache may be lost, requiring recovery from backup.

The BBU is a single-cell Li-ion battery pack. As with all Li-ion rechargeable batteries, charge is supplied via a chemical reaction, and the battery pack's ability to hold charge will degrade over time. The BBU (also referred to as iBBU or Intelligent BBU) contains a small integrated circuit board with a "smart" gas gauge, accessible through an I2C bus. This permits the RAID on Chip (RoC) controller to monitor the actual battery capacity and to ensure that caching is not permitted if the capacity falls below the minimum necessary threshold.


The BBU board also contains the charge circuitry. It is designed to be removable and replaceable as a Customer Replaceable Unit (CRU) with a single mating connector that interfaces the BBU board to the HBA, and 3 screws mounted under the HBA that physically retain it to the HBA.

 

Write-Back vs. Write-Through mode

When the BBU is present and operating normally, the virtual disks are placed in Write-Back (WB) mode. In this mode, writes are stored in the cache memory and later written back to the physical disk in blocks, which optimizes data placement and write speed on disk. Once a write is in the cache memory, it is acknowledged to the O/S as a completed I/O, which improves O/S-level performance. If power is lost, the BBU ensures the cache is maintained, and its contents are written back to disk once power is restored.

When the BBU is not present, or cannot currently be used, the HBA reverts the virtual disks to Write-Through (WT) mode. This mode bypasses the cache memory, writes every I/O through to the physical disk, and waits for the disk to acknowledge that the write has physically completed. This results in slower performance but guarantees that all writes reach non-volatile disk storage, so there is no risk of losing cached data if power is lost.
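
The write mode currently in effect can usually be read from the virtual drive information. The sketch below is an illustration only: the command "MegaCli64 -LDInfo -Lall -aALL" and its "Current Cache Policy" line are not shown in this document and are assumed from general MegaCLI usage, so verify both against the MegaRAID User's Guide. The function name current_write_mode is hypothetical.

```shell
#!/bin/sh
# current_write_mode: read "MegaCli64 -LDInfo -Lall -aALL" output on stdin and
# print the write mode from each "Current Cache Policy" line (one line per
# virtual drive). The line format is an assumption, not taken from this document.
current_write_mode() {
    awk -F': *' '/^ *Current Cache Policy/ {
        split($2, policy, ", *")   # first comma-separated item is the write mode
        print policy[1]
    }'
}

# Sample lines in the assumed format, used here instead of a live controller.
printf '%s\n' \
    'Default Cache Policy: WriteBack, ReadAdaptive, Direct, No Write Cache if Bad BBU' \
    'Current Cache Policy: WriteThrough, ReadAdaptive, Direct, No Write Cache if Bad BBU' |
    current_write_mode
```

On a live system the function would be fed from the command itself, for example: ./MegaCli64 -LDInfo -Lall -aALL | current_write_mode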

 

Battery Monitoring via Learn Cycles

Learn cycles are a battery calibration method, run periodically, that fully discharges the battery and then recharges it. When the cycle completes, the BBU “learns” the new charge capacity the battery can hold. This capacity is reported in the "Full Charge Capacity" field of the MegaCLI BBU output, which is updated after each learn cycle. Failure to run learn cycles at their recommended intervals may reduce the usable life of the battery, because the full charge capacity declines more rapidly, leading to a premature end of service life.

When a learn cycle is initiated, the charging circuit automatically places any virtual drives that are in WB mode into WT mode for the duration of the cycle which will temporarily reduce write performance. Once the learn cycle completes, the virtual drives are automatically transitioned back to WB mode if the battery is still capable of holding the required charge amount.

The battery model used on Sun-Oracle systems is iBBU08. For this model, the complete learn cycle, during which the cache is in WT mode, is expected to take 2 to 3 hours but may take up to 24 hours.

An initial learn cycle is done every time a system is powered on, to learn the current state of the BBU. The learn cycle is by default configured to occur automatically, with timing every 30 days from the time the system was powered on. It is recommended to leave this default setting.

If the automatic learn cycle occurs at a time of day when reduced disk performance is inconvenient, its timing can be changed by any of the following:

  • Powering on the system at a time that is going to have less impact on the system disk performance.

  • Disabling auto-learn cycles and then re-enabling auto-learn cycles at a time that is going to have less impact on the system disk performance.

  • Disabling auto-learn cycles and manually scheduling them using cron, for a time that is going to have less impact on the system disk performance.


Disabling auto-learn cycles completely without also running them manually is not recommended, as the usable life of the battery will be reduced.
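
As one possible sketch of the cron option, a root crontab entry could start a manual learn cycle at 02:00 on the first day of each month, using the -BbuLearn command described later in this document. The install path /opt/MegaRAID/MegaCli and the log file name are assumptions and should be adjusted to the local system. The entry also assumes auto-learn mode has already been disabled.

```
# m h dom mon dow  command
0 2 1 * * /opt/MegaRAID/MegaCli/MegaCli64 -AdpBbuCmd -BbuLearn -a0 >> /var/log/bbu-learn.log 2>&1
```

A monthly schedule only approximates the default 30-day interval; choose the day and hour to suit the system's quietest period.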

Learn cycles may occur more frequently than every 30 days if the full charge capacity gets close to the replacement threshold and the remaining capacity goes low, which initiates a new learn cycle to relearn the full charge capacity. This has been seen to occur as frequently as daily on a failing BBU.

When a new BBU is installed into a system, it will have a depleted charge state. Any virtual drives attached will be forced into WT cache mode while a full learn cycle is performed. Usually a sufficient charge to maintain the cache is reached after this cycle is complete. This may take 24 hours or longer.

 

Battery Operating Temperature Guidelines

Batteries are chemical in nature and as such are sensitive to temperature. The system fans cool the BBU to keep it within its operating range (given below), which should be achievable if the system is kept within its specified ambient temperature range of 5C to 37C. Operating the system outside this range will reduce the life of the battery.

Batteries under charge will heat up and may swell. Both effects are normal, and the battery returns to its normal state after charging. Learn cycle charging ignores temporary over-temperature conditions caused by charge heat, and normal monitoring resumes after the learn cycle once the battery has cooled back down.


BBU08 has a specified operating range of 10C to 55C, as reported by the BBU. Between 55C and 60C, a High Temperature warning is set. If the temperature has ever gone over 60C, the Over Temperature flag is set. There is no warning if the temperature goes below 10C.

Operating for an extended period in the High warning range will result in a reduced life expectancy.


A BBU that has reported Over-Temperature (>60C) will stop charging the battery until the temperature condition is resolved and the BBU has cooled to within the specified operating temperature (<55C). In this condition the BBU cannot be trusted, so the virtual disks are forced into WT mode until the BBU starts charging again. Operating in this range results in a reduced battery life expectancy.

The system specification range of 5C to 37C requires an altitude de-rating of 1C per 300m above 900m for optimal operation. The battery's Li-ion cell also needs de-rating for altitude: the working temperature range specified above should be de-rated by 1C per 330m, starting at 500m altitude. Because the battery's de-rating starts at a lower altitude, the starting altitude for the overall system de-rating should also be reduced to 500m. For example, for a system at 2000m, the battery de-rating is (2000m - 500m) / 330m = ~4.5C, so the de-rated temperature range for BBU08 at this altitude is 10C to 50.5C. The system de-rating is (2000m - 500m) / 300m = 5C, so the system specification range is reduced to 5C to 32C. Remaining within the de-rated system specification range should be sufficient for meeting the de-rated battery temperature range. Operating at altitude outside this de-rated range will result in a reduced life expectancy.
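
The de-rating arithmetic above can be reproduced for other altitudes with a short script. This is a minimal sketch: the function name derate_c is hypothetical, and the rates (1C per 330m for the battery, 1C per 300m for the system, both starting at 500m) and the upper limits (55C and 37C) are the ones stated in this section.

```shell
#!/bin/sh
# derate_c ALTITUDE_M START_M METRES_PER_DEGREE
# Prints the temperature de-rating in degrees C, to one decimal place.
derate_c() {
    awk -v alt="$1" -v start="$2" -v step="$3" 'BEGIN {
        d = (alt - start) / step
        if (d < 0) d = 0          # no de-rating below the starting altitude
        printf "%.1f\n", d
    }'
}

altitude=2000
bbu=$(derate_c "$altitude" 500 330)   # battery: 1C per 330m above 500m
sys=$(derate_c "$altitude" 500 300)   # system: 1C per 300m, start lowered to 500m

# Subtract each de-rating from the corresponding upper temperature limit.
awk -v b="$bbu" -v s="$sys" 'BEGIN {
    printf "BBU08 upper limit:  %.1fC\n", 55 - b
    printf "System upper limit: %.1fC\n", 37 - s
}'
```

For 2000m this gives upper limits of 50.5C for BBU08 and 32.0C for the system, matching the example in the text.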

 

Battery Swelling Guidelines

Battery swelling has been known to occur in Lithium Polymer cells when batteries reach a state of discharge where the observed battery terminal voltage is less than 1 volt. Under these conditions, copper dissolution can occur and produce internal shorts in the battery which can lead to swelling.

High temperature can also lead to swelling when Lithium Polymer cells are actively being charged or are held at a near-full state of charge, if the batteries are kept at temperatures above 55C for extended periods of time. This combination will also accelerate battery capacity loss and shorten service life.


iBBU08 Pack Thickness Specification Limits:

Thickness     mm      inches
Minimum       9.6     0.378
Nominal       9.9     0.390
Maximum       10.2    0.402

 

Physical Battery Inspection is neither recommended nor required as a stand-alone maintenance action. Batteries should be replaced when the maximum thickness is greater than 12mm.

 

Battery Related MegaCLI Commands

The battery status can be monitored using LSI’s MegaCLI command line interface, available on Oracle systems hardware assistant downloads, or by downloading from:

https://www.broadcom.com/support/oem/oracle/6gb/sg_x_sas6-r-int-z  (Internal HBA)


https://www.broadcom.com/support/oem/oracle/6gb/sg_x_sas6-r-rem-z (Raid Expansion Module REM)

Refer also to the MegaRAID User’s Guide at the same location.

The following are some of the MegaCLI commands that provide useful information about the battery. For the complete command set, and for viewing the same information in MegaRAID Storage Manager, refer to the MegaRAID User’s Guide available at the above web page.

  • For complete information on the BBU, use the following:

# ./MegaCli64 -AdpBbuCmd -a0
BBU status for Adapter: 0
BatteryType: iBBU08
Voltage: 4027 mV
Current: 0 mA
Temperature: 37 C

Battery State : Operational
BBU Firmware Status:
Charging Status : None
Voltage : OK
Temperature : OK
Learn Cycle Requested : No
Learn Cycle Active : No
Learn Cycle Status : OK
Learn Cycle Timeout : No
I2c Errors Detected : No
Battery Pack Missing : No
Battery Replacement required : No
Remaining Capacity Low : No
Periodic Learn Required : No
Transparent Learn : No
No space to cache offload : No
Pack is about to fail & should be replaced : No
Cache Offload premium feature required : No
Module microcode update required : No

GasGuageStatus:

Fully Discharged : No
Fully Charged : No
Discharging : No
Initialized : Yes
Remaining Time Alarm : No
Discharge Terminated : No
Over Temperature : No
Charging Terminated : No
Over Charged : No

Relative State of Charge: 99 %
Charger System State: 1
Charger System Ctrl: 0
Charging current: 0 mA
Absolute state of charge: 85 %
Max Error: 0 %

Battery backup charge time : 48 hours +

BBU Capacity Info for Adapter: 0

Relative State of Charge: 99 %
Absolute State of charge: 85 %
Remaining Capacity: 1302 mAh
Full Charge Capacity: 1328 mAh
Run time to empty: Battery is not being discharged
Average time to empty: 156 min
Average Time to full: Battery is not being charged
Cycle Count: 12

BBU Design Info for Adapter: 0

Date of Manufacture: 06/02, 2011
Design Capacity: 1530 mAh
Design Voltage: 4100 mV
Specification Info: 0
Serial Number: 2613
Pack Stat Configuration: 0x0000
Manufacture Name: LS36681
Device Name: bq27541
Device Chemistry: LPMR
Battery FRU: N/A
Transparent Learn = 0
App Data = 0

BBU Properties for Adapter: 0

Auto Learn Period: 2592000 Sec
Next Learn time: 419541893 Sec
Learn Delay Interval:0 Hours
Auto-Learn Mode: Enabled
BBU Mode = 5

Exit Code: 0x00
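
As a rough indicator of how far the battery has aged, the learned "Full Charge Capacity" can be compared with the "Design Capacity" from the same output. The sketch below only prints that ratio; this document does not state a replacement threshold, so none is assumed. The function name bbu_capacity_pct is hypothetical.

```shell
#!/bin/sh
# bbu_capacity_pct: read "MegaCli64 -AdpBbuCmd" output on stdin and print
# Full Charge Capacity as a whole-number percentage of Design Capacity.
bbu_capacity_pct() {
    awk -F': *' '
        /^ *Full Charge Capacity/ { full = $2 + 0 }     # e.g. "1328 mAh" -> 1328
        /^ *Design Capacity/      { design = $2 + 0 }   # e.g. "1530 mAh" -> 1530
        END {
            if (design > 0) printf "%.0f\n", 100 * full / design
            else exit 1                                 # fields not found
        }'
}

# The two values from the sample output above.
printf '%s\n' 'Full Charge Capacity: 1328 mAh' 'Design Capacity: 1530 mAh' |
    bbu_capacity_pct
```

With the sample values (1328 mAh of 1530 mAh) the result is 87. On a live system the function would be fed from the command itself: ./MegaCli64 -AdpBbuCmd -a0 | bbu_capacity_pct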

 

  • To check the BBU learn-cycle details, do the following:
# ./MegaCli64 -AdpBbuCmd -a0 | grep -i learn
 Learn Cycle Requested    : No
 Learn Cycle Active       : No
 Learn Cycle Status       : OK
 Learn Cycle Timeout      : No
 Periodic Learn Required  : No
 Transparent Learn        : No
Transparent Learn = 0
Auto Learn Period: 2592000 Sec
Next Learn time: 419541893 Sec
Learn Delay Interval:0 Hours
Auto-Learn Mode: Enabled

The Learn Cycle values report Yes or No according to the current status. In this example, Auto-Learn Mode is Enabled, which is the default setting. Note that only two of these parameters can be changed: Learn Delay Interval and Auto-Learn Mode. The period (2592000s = 30 days) and next learn time cannot be changed. Next Learn time is the time in seconds from system power-on. This may not be easy to convert into a date, but the firmware terminal log (described below) reports it as a specific date and time in its summary of the most recent learn cycle results.
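
The period conversion quoted above (2592000 seconds = 30 days) can be checked directly from the command output. This is a small sketch, and the function name learn_period_days is hypothetical.

```shell
#!/bin/sh
# learn_period_days: read MegaCli BBU output on stdin and print the
# Auto Learn Period converted from seconds to whole days (86400 s per day).
learn_period_days() {
    awk -F': *' '/^ *Auto Learn Period/ { printf "%d\n", ($2 + 0) / 86400 }'
}

# The value from the sample output above.
printf 'Auto Learn Period: 2592000 Sec\n' | learn_period_days   # prints 30
```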

 

  • To change the Learn Cycle values, create a file e.g. battery.ini with the lines you wish to change.

    To disable auto-learn mode, place the following line in the file:
         autoLearnMode=1


    To enable auto-learn mode (if previously disabled), place the following line in the file:
         autoLearnMode=0


    To delay the automatic learn cycle the next time it runs, place the following line in the file:
         learnDelayInterval=36


    Use a value in hours to delay the next learn cycle by up to 168 hours (7 days).
    Note that this does not permanently move the learn cycle to a convenient time; it delays only the next scheduled automatic learn cycle. It has no effect if auto-learn mode is disabled.


    After creating the file, run the following command:

    # ./MegaCli64 -AdpBbuCmd -SetBbuProperties -f battery.ini -a0


    Adapter 0: Set BBU Properties Succeeded.


    Exit Code: 0x00

    where battery.ini is the name of the file created above with the parameter changes. Although the file may contain multiple changes, the command may change only one parameter per run. If you are changing multiple parameters (for example, switching auto-learn from disabled to enabled and setting a learn delay), run the earlier command to check the learn cycle details after each run; if not all values have changed, run the set command again, and repeat until every change in the file has been applied. Note that changing the auto-learn mode between enabled and disabled will initiate a learn cycle immediately.
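
The properties file can also be generated by a small helper that applies the value mapping and the 168-hour limit described above. This is a minimal sketch: the function name write_bbu_ini and the temporary file location are hypothetical, while the parameter names and values (autoLearnMode=0 to enable, autoLearnMode=1 to disable, learnDelayInterval in hours) are the ones given in this section.

```shell
#!/bin/sh
# write_bbu_ini FILE MODE [DELAY_HOURS]
# MODE is "enable" or "disable"; DELAY_HOURS is optional and must be 0-168.
write_bbu_ini() {
    file=$1 mode=$2 delay=${3:-}
    case $mode in
        enable)  value=0 ;;     # autoLearnMode=0 enables auto-learn
        disable) value=1 ;;     # autoLearnMode=1 disables auto-learn
        *) echo "mode must be enable or disable" >&2; return 1 ;;
    esac
    if [ -n "$delay" ]; then
        case $delay in
            *[!0-9]*) echo "delay must be a whole number of hours" >&2; return 1 ;;
        esac
        [ "$delay" -le 168 ] || { echo "delay must be 168 hours or less" >&2; return 1; }
    fi
    {
        echo "autoLearnMode=$value"
        if [ -n "$delay" ]; then
            echo "learnDelayInterval=$delay"
        fi
    } > "$file"
}

# Example: enable auto-learn and delay the next cycle by 36 hours.
ini="${TMPDIR:-/tmp}/battery.ini"
write_bbu_ini "$ini" enable 36 && cat "$ini"
```

The generated file is then passed to the set command shown above with -f. Because only one parameter may change per run, the set command may need to be repeated until the learn-cycle details show every change.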
     
  • To run a manual learn cycle, use the following command:

    # ./MegaCli64 -AdpBbuCmd -BbuLearn -a0

    Adapter 0: BBU Learn Succeeded.

    Exit Code: 0x00

 

  • To review the logs of BBU learn cycles occurring, status, and date/time of the next one, use the following, and then review the output file:

    # ./MegaCli64 -FwTermLog -dsply -a0 > FwTermLog.out
    # less FwTermLog.out

 

  • To determine the current temperature and whether it is over specification or not, run the following:

    The below is normal:

    # ./MegaCli64 -AdpBbuCmd -a0 | grep "Temperature"
    Temperature: 34 C
    Temperature : OK
    Over Temperature : No

    The virtual drives on the systems in the following two examples will currently be in write-through mode, and will remain so until the battery temperature drops below 55C and the BBU resumes charging.

    The below is a BBU whose temperature is high, but which has not yet been over 60C:

    # ./MegaCli64 -AdpBbuCmd -a0 | grep "Temperature"
    Temperature: 57 C
    Temperature : High
    Over Temperature : No
     

    The below is a BBU whose temperature is High and over 60C, which will reduce the lifetime of the battery. The "Over Temperature" flag is also set, marking that the BBU has been over 60C:

    # ./MegaCli64 -AdpBbuCmd -a0 | grep "Temperature"
    Temperature: 61 C
    Temperature : High
    Over Temperature : Yes
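
The three cases above can be told apart automatically, for example from a monitoring script. The sketch below is an illustration under stated assumptions: the function name bbu_temp_state and its state labels are hypothetical, and the 55C and 10C limits are taken from the operating range given earlier in this document. It also reports a reading below 10C, since the BBU itself gives no warning for that.

```shell
#!/bin/sh
# bbu_temp_state: read "MegaCli64 -AdpBbuCmd" output on stdin and print the
# reported temperature followed by a state label.
bbu_temp_state() {
    awk -F': *' '
        /^ *Temperature *: *[0-9]/ { temp = $2 + 0; found = 1 }   # "Temperature: 57 C"
        /^ *Over Temperature/      { over = $2 }                  # "Over Temperature : Yes"
        END {
            if (!found) exit 1                        # no temperature line seen
            if (over ~ /^Yes/)   state = "over-temperature"   # flag set above 60C
            else if (temp > 55)  state = "high"               # 55C to 60C warning range
            else if (temp < 10)  state = "below-range"        # BBU itself gives no warning
            else                 state = "ok"
            print temp, state
        }'
}

# Values from the third example above.
printf '%s\n' 'Temperature: 61 C' 'Temperature : High' 'Over Temperature : Yes' |
    bbu_temp_state
```

On a live system the function would be fed from the command itself: ./MegaCli64 -AdpBbuCmd -a0 | bbu_temp_state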
     

 

 

PARTS REFERENCE:
SGX-SAS6-R-INT-Z Oracle StorageTek 6Gb/s SAS PCIe RAID HBA, Internal
   375-3701 / 7047503 8-Port 6Gbps SAS-2 RAID PCI Express HBA, B4 ASIC [CRU]

SGX-SAS6-R-REM-Z Oracle StorageTek 6Gb/s SAS REM RAID HBA
   375-3647 / 7047851 6Gbps SAS-2 RAID Expansion Module (REM), B4 ASIC [CRU]

371-4982 / 7050794 6Gbps SAS-2 RAID PCI Battery Module, BBU08 [CRU]

 

REFERENCE:

LSI Oracle Support Site
  - Internal HBA
  - Raid Expansion Module (REM)


  Copyright © 2018 Oracle, Inc.  All rights reserved.