Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-72-1011232.1
Update Date: 2013-08-12

Solution Type: Problem Resolution Sure Solution

Solution 1011232.1: Recovering From a Failed SAF-TE Firmware Update on Sun Storage 33x0 RAID Arrays


Related Items
  • Sun Storage 3310 Array
  • Sun Storage 3320 SCSI Array
Related Categories
  • PLA-Support>Sun Systems>DISK>Arrays>SN-DK: SE31xx_33xx_35xx
  • _Old GCS Categories>Sun Microsystems>Storage - Disk>Modular Disk - 3xxx Arrays

Previously Published As
215427


Applies to:

Sun Storage 3310 Array - Version Not Applicable to Not Applicable [Release N/A]
Sun Storage 3320 SCSI Array - Version Not Applicable to Not Applicable [Release N/A]
All Platforms

Symptoms

If a SAF-TE firmware update is interrupted, or encounters some other error, the process can fail on the RAID array, leaving the EMU modules in a down-rev condition or in a failed state.

The update process may return messages such as:

SAF-TE Firmware download: one or more modules failed (CH 0 ID 14)
sccli: download enclosure firmware: error: firmware download failure on some targets

Repeated attempts to update the firmware, or to cross-load firmware from a good EMU, will not work, and may cause the good EMU to go into a failed state as well.

Cause

The exact cause of this issue is not known, but the steps explained in the Solution section can be attempted prior to a hardware replacement.

Solution

In this circumstance, a possible workaround that can be performed on-site before replacing hardware is to power off the array and bring it up connected as a JBOD.

All that is necessary is to remove both RAID controllers and cable the host directly to the I/O module on the tray.

If this is an expansion unit, disconnect the SCSI connection to the RAID head and connect a host cable to one of the expansion ports temporarily.

Once the array is powered on, it will present all drives to the host as regular SCSI disks, NOT logical drives (LDs). At this point, you may run devfsadm (or wait for devfsadmd to pick up the new devices), and verify that the drives appear in format.
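On a Solaris host, this discovery step might look like the following sketch. The commands are standard Solaris utilities; actual device names will vary per host.

```shell
# Rescan for the newly presented JBOD disks and create device nodes.
devfsadm

# List the disks the host now sees; the array's drives should appear
# here as individual SCSI disks (e.g. c1t0d0, c1t1d0, ...).
# Quit format at the disk-selection prompt once the drives are visible.
format
```

No disk should be selected, labeled, or modified in format; it is used here only to confirm the drives are visible to the host.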

The previous device entries for the logical drives will no longer be valid, but this is a temporary condition resulting from the RAID controllers being bypassed.

 

Once the devices are available in format, sccli in-band can be used to run the SAF-TE update again.

If multiple arrays are attached to the host, an explicit device path (to any of the drives in the enclosure) can be used to direct sccli to the enclosure in question.

The SAF-TE update can now be run against the problem component. In many instances this will succeed where the update in a RAID configuration fails.
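As a hedged sketch, the in-band update might be invoked as below. The device path and firmware filename are examples only; substitute the actual disk path for the enclosure in question and the SAF-TE firmware image supplied with the patch. Consult the sccli documentation for the exact download syntax on your firmware revision.

```shell
# Direct sccli at the enclosure via one of its JBOD disks (example path),
# then download the SAF-TE firmware image (example filename) in-band.
sccli /dev/rdsk/c1t0d0s2 download safte-firmware safte-fw.bin
```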

 

After the procedure has been completed and the EMUs are online, the array can be powered down and returned to its original configuration.

As long as the drives were not directly manipulated, all existing data should be unaffected. After the array is powered back on, the administrator can run 'devfsadm -C' to clean up the device paths and remove the stale JBOD entries.

This can also be done manually if desired.
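The cleanup step described above amounts to the following on the Solaris host, once the array is back in its original RAID configuration:

```shell
# Remove dangling device links left over from the temporary JBOD
# configuration; the RAID logical drives are rediscovered normally.
devfsadm -C
```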

 

Caution must be used in this temporary JBOD configuration. Outside of the firmware update executed in-band with sccli, you MUST NOT manipulate the drives in the array in any way (format, newfs, mount, etc.), or data loss may occur.

Attempts to insert one good EMU alongside one bad EMU to try a cross-load have not been successful, and result in an additional unusable EMU, as the bad EMU seems to bring down the good one in some cases.

 


Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.