Pillar Axiom: Multiple attempts to Replace a Slammer component fails using Guided Maintenance in R5

Asset ID:	1-72-1529154.1
Update Date:	2018-01-08
Keywords:

Solution Type:	Problem Resolution Sure

Solution 1529154.1 : Pillar Axiom: Multiple attempts to Replace a Slammer component fails using Guided Maintenance in R5

Applies to:

Pillar Axiom 600 Storage System - Version All Versions to All Versions [Release All Releases]
Pillar Axiom 500 Storage System - Version All Versions to All Versions [Release All Releases]
Information in this document applies to any platform.

Symptoms

LUNs change to Conservative status after an attempted Slammer component replacement, even though all components are in a healthy state.

Multiple attempts to replace a Slammer component using Guided Maintenance fail with the error message FRU_GM_ALREADY_IN_PROGRESS.

Cause

If any Slammer component is prepared for replacement, the LUNs on both CUs of that Slammer change to Conservative. This is expected, because the battery-backed memory (BBM) for CU1 resides on CU0 and vice versa.

Issuing more than one replace request with axiomcli can generate multiple Admin Alerts, each indicating that a FRU should be replaced.

For example:

axiomcli system -replace -unit /SLAMMER-01/CU0/PS0

axiomcli system -replace -unit /SLAMMER-01/CU1/PS1

Listing these with

axiomcli system_alert -list -details

or viewing them in the GUI confirms that there are two separate alerts. However, the Axiom actually has only one FRU prepared for replacement at any given time.

Solution

A- Admin Alerts:

There may be existing Admin Alerts instructing you to remove a Slammer component (Power Supply, Battery, PIM, etc.).
If there are, use those alerts to cancel the replacement.

If cancelling produces a new alert reporting a task failure while aborting the FRU replacement, delete that new alert and continue with the remaining alerts that tell you to replace a component.

The Axiom allows only one outstanding FRU replacement prepare at any given time, so if there are multiple replacement alerts, cancelling them returns the abort task failure until you reach the alert for the one open GM FRU replacement.

This may be enough to resolve the issue.
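
The behaviour described above can be illustrated with a small toy model. This is only a sketch of the logic, not an Axiom interface: the function name and the way alerts are represented (as plain FRU path strings) are hypothetical, and the model assumes that any cancel issued after the open replacement has been aborted also reports a failure.

```python
# Toy model only: nothing here talks to an Axiom. Alerts are represented
# by the FRU path they refer to (hypothetical representation).

def clear_replacement_alerts(alert_frus, prepared_fru):
    """Try to cancel every replacement alert in turn.

    alert_frus   -- FRU paths named by the outstanding Admin Alerts
    prepared_fru -- the single FRU the system really has prepared
    Returns (failed_cancels, prepared_fru_after).
    """
    failed_cancels = []
    for fru in alert_frus:
        if fru == prepared_fru:
            # This alert matches the one open GM replacement: abort works.
            prepared_fru = None
        else:
            # Any other alert yields an abort task failure alert,
            # which the article says to delete before moving on.
            failed_cancels.append(fru)
    return failed_cancels, prepared_fru


alerts = ["/SLAMMER-01/CU0/PS0", "/SLAMMER-01/CU1/PS1"]
failed, still_prepared = clear_replacement_alerts(alerts, "/SLAMMER-01/CU1/PS1")
print("abort failures to delete:", failed)
print("FRU still prepared:", still_prepared)
```

In this model the first alert produces an abort failure and the second one clears the prepared FRU, which matches the advice to keep working through the alerts rather than stopping at the first failure.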

B- CLI:

If the Admin Alerts have been deleted, then you will need to use the pcli to find the outstanding task and cancel it.

The pcli must be used as there is no obvious axiomcli request to show outstanding GM requests.

1- List all active alerts with

axiomcli login -u administrator -p <Password> <axiom_IP>

axiomcli system_alert -list -details

2- Use pcli to check which FRU is currently prepared for replacement.

pcli sub -u administrator -H <axiom_IP> GetFruPreparedForReplacement

3- Use pcli to cancel that replacement

pcli sub -u administrator -H <axiom_IP> AbortSlammerFruReplacement SlammerNodeIdentity.Fqn=<Related FQN> FruType=<related_FRU> FruNumber=<related_FRU_Number>

4- Use pcli to make sure there are no other GM requests

pcli sub -u administrator -H <axiom_IP> GetFruPreparedForReplacement

5- Check the Axiom GUI -> Configure -> LUN screen to see if the Conservative LUNs have returned to Online.
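
If steps 2 to 4 need to be repeated on several systems, the pcli command lines can be assembled programmatically. The following is a minimal sketch, assuming pcli is on the PATH and accepts exactly the argument form shown in the steps above. The helper names are hypothetical, and how pcli obtains the password is not covered in this document, so the run() helper may need adapting.

```python
import subprocess

PCLI_USER = "administrator"  # account used in the steps above


def pcli_command(host, request, *params):
    """Build a 'pcli sub' argument list in the form used in this document."""
    return ["pcli", "sub", "-u", PCLI_USER, "-H", host, request, *params]


def get_prepared_command(host):
    """Steps 2 and 4: ask which FRU is prepared for replacement."""
    return pcli_command(host, "GetFruPreparedForReplacement")


def abort_command(host, fqn, fru_type, fru_number):
    """Step 3: cancel the outstanding Slammer FRU replacement."""
    return pcli_command(
        host,
        "AbortSlammerFruReplacement",
        f"SlammerNodeIdentity.Fqn={fqn}",
        f"FruType={fru_type}",
        f"FruNumber={fru_number}",
    )


def run(command):
    """Run a command and return its stdout (assumes pcli is installed)."""
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout


# Print the command lines instead of running them, so they can be reviewed.
print(" ".join(get_prepared_command("<axiom_IP>")))
print(" ".join(abort_command("<axiom_IP>", "<Related FQN>", "<related_FRU>", "<related_FRU_Number>")))
```

Printing the commands first, rather than calling run() directly, keeps a manual review step before anything is sent to the Axiom.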

Example:

Listing active GM FRU Replacement task:

pcli sub -u administrator -H 10.217.5.42 GetFruPreparedForReplacement

The output resembles the following, with different values:

Message
  Response
    CorrelationID: 1360834303
    BeginStreamResponse
      TaskGuid: 4130303237353742A13F25F70390F99E
      TaskFqn: /GetFruPreparedForReplacement/1196636/pillar
Message
  Response
    CorrelationID: 1360834303
    GetFruPreparedForReplacementResponse
      PreparedFruInformation
        Identity
          Id: 2009000B08049A5A
          Fqn: /Slammer2/1             <-- Use this value as the "SlammerNodeIdentity.Fqn"
        BrickFruType: NONE
        SlammerFruType: POWERSUPPLY    <-- Use this value as the "FruType"
        FruNumber: 0                   <-- Use this value as the "FruNumber"
Message
  Response
    CorrelationID: 1360834303
    EndStreamResponse
      TaskGuid: 4130303237353742A13F25F70390F99E
      TaskFqn: /GetFruPreparedForReplacement/1196636/pillar

 

Then use pcli to cancel that replacement, substituting the values from your GetFruPreparedForReplacement request:

pcli sub -u administrator -H 10.217.5.42 AbortSlammerFruReplacement SlammerNodeIdentity.Fqn=/Slammer2/1 FruType=POWERSUPPLY FruNumber=0
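
The mapping from the GetFruPreparedForReplacement output to the AbortSlammerFruReplacement parameters can also be sketched in code. The parser below is an illustration only: it assumes the output keeps the indented "Key: Value" layout shown in the example, the function names are hypothetical, and the output produced when no FRU is prepared is not shown in this document, so the function simply returns None if the expected fields are missing.

```python
# Sample text copied from the example output above (stream header omitted).
SAMPLE = """\
Message
  Response
    CorrelationID: 1360834303
    GetFruPreparedForReplacementResponse
      PreparedFruInformation
        Identity
          Id: 2009000B08049A5A
          Fqn: /Slammer2/1
        BrickFruType: NONE
        SlammerFruType: POWERSUPPLY
        FruNumber: 0
Message
  Response
    CorrelationID: 1360834303
    EndStreamResponse
"""


def parse_prepared_fru(text):
    """Extract Fqn, SlammerFruType and FruNumber from the response text."""
    fields = {}
    in_block = False
    for raw in text.splitlines():
        line = raw.split("<--")[0].strip()  # drop any trailing annotation
        if line == "PreparedFruInformation":
            in_block = True
        elif line == "Message":
            in_block = False  # a new message ends the block
        elif in_block and ": " in line:
            key, value = line.split(": ", 1)
            fields[key] = value.strip()
    wanted = ("Fqn", "SlammerFruType", "FruNumber")
    if not all(key in fields for key in wanted):
        return None
    return {key: fields[key] for key in wanted}


def abort_parameters(fru):
    """Turn the parsed fields into AbortSlammerFruReplacement parameters."""
    return [
        "SlammerNodeIdentity.Fqn=" + fru["Fqn"],
        "FruType=" + fru["SlammerFruType"],
        "FruNumber=" + fru["FruNumber"],
    ]


fru = parse_prepared_fru(SAMPLE)
print(" ".join(abort_parameters(fru)))
```

Always compare the parsed values with the raw pcli output before issuing the abort request.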

NOTE: If this issue occurs while replacing a Brick component using Guided Maintenance, see KM Doc ID <Document 1606390.1>.

References

<NOTE:1606390.1> - Pillar Axiom: Multiple attempts to replace a Brick component fails using Guided Maintenance in R5
<BUG:16322846> - CONMAN ALLOWS TWO FRU PREPARE IF ON SEPARATE SLAMMER CU
