Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-72-1562146.1
Update Date:2018-05-09

Solution Type  Problem Resolution Sure

Solution  1562146.1 :   'replacefru' for XSCFU_B ends with [Warning:311] due to wrong procedure  


Related Items
  • Sun SPARC Enterprise M9000-64 Server
Related Categories
  • PLA-Support>Sun Systems>SPARC>Enterprise>SN-SPARC: Mx000




In this Document
Symptoms
Changes
Cause
Solution
References


Oracle Confidential PARTNER - Available to partners (SUN).
Reason: additional info for maintaining FRU XSCFU_B

Applies to:

Sun SPARC Enterprise M9000-64 Server - Version All Versions to All Versions [Release All Releases]
Information in this document applies to any platform.

Symptoms

Purpose
This document outlines an example where a Hot Replacement of an XSCFU in the Base
Cabinet of a SPARC Enterprise M9000-64 Server ends with [Warning:311] because the
procedure was not followed correctly.

Hot Replacement of an XSCFU means that the 'replacefru' command is used.

Description
To replace an XSCFU in the Base Cabinet, i.e. an XSCFU_B, the 'replacefru' procedure
asks you to remove two components: the XSCFU_B in the Base Cabinet and its counterpart
XSCFU_C in the Expansion Cabinet, for instance XSCFU_B#0/XSCFU_C#0.

After the prompt "After removal has been completed", the 'replacefru' procedure asks
you to select the f-key. This f-key finishes the removal only. Do not misread the
prompt: it does not ask for a replacement, and the new XSCFU and its counterpart must
not be inserted before the removal has been finished. If they are inserted first, the
removal is never finished, and the f-key selected after the insertion ends with
[Warning:311].

The example in this document does not apply to SPARC Enterprise M9000-32 and M8000
Servers. These Servers have no Expansion Cabinet, so their XSCFUs have no counterpart.
On these Servers the removal and the insertion are not finished separately: there is
just a single finish (f-key), which is selected after the XSCFU has been replaced,
i.e. after the new XSCFU has been inserted.

On an M9000-64 system, the 'replacefru' routine for XSCFU_B#0 ends with
[Warning:311] as shown here:

   XSCF> replacefru
   ----------------------------------------------------------------------
   Maintenance/Replacement Menu
   Please select a type of FRU to be replaced.
   
   1. CMU/IOU    (CPU Memory Board Unit/IO Unit)
   2. FAN        (Fan Unit)
   3. PSU        (Power Supply Unit)
   4. XSCFU      (Extended System Control Facility Unit)
   ----------------------------------------------------------------------
   Select [1-4|c:cancel] :4
   
   ----------------------------------------------------------------------
   Maintenance/Replacement Menu
   Please select an XSCF to be replaced.
   
   No. FRU             Status
   --- --------------- ------------------
    1. XSCFU_B#0       Faulted
       XSCFU_C#0       Normal
    2. XSCFU_B#1       Active
       XSCFU_C#1       Normal
   ----------------------------------------------------------------------
   Select [1,2|b:back] :1
   
   You are about to replace XSCFU_B#0/XSCFU_C#0.
   Do you want to continue?[r:replace|c:cancel] :r
   
   Please confirm the Ready LED is not lit and that the Check LED is
   blinking.
   If this is the case, please remove XSCFU_B#0/XSCFU_C#0.
   Note) Please remove both XSCFU_B#0 and XSCFU_C#0.
   After removal has been completed, please select[f:finish] :f
   
   [Warning:311]
   Failed to remove XSCFU_B#0/XSCFU_C#0.
   Do you want to try to remove XSCFU_B#0/XSCFU_C#0 again?
   [r:remove|c:cancel] :

 
Note that the routine never reached the point where it asks to install an XSCFU.
The logs nevertheless show that more than a removal took place: there was an
installation (add) as well...

   XSCF> showlogs monitor
   [...]
   May 24 13:43:07 <hostname> monitor_msg: SCF:maintenance event (FRU is chosen to be replaced: /XSCFU_B#0)
   May 24 13:43:07 <hostname> monitor_msg: SCF:maintenance event (FRU is chosen to be replaced: /XSCFU_C#0)
   May 24 13:46:35 <hostname> monitor_msg: SCF:Unit configuration change (remove) /XSCFU_C#0
   May 24 13:48:11 <hostname> monitor_msg: SCF:Unit configuration change (remove) /XSCFU_B#0
   May 24 13:52:21 <hostname> monitor_msg: SCF:Unit configuration change (add) /XSCFU_C#0
   May 24 13:52:51 <hostname> monitor_msg: SCF:Unit configuration change (add) /XSCFU_B#0
   [...]
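
The order of the events in the log excerpt above can also be checked mechanically.
The following is a minimal sketch in Python and not part of any XSCF tooling: the
function name config_changes and the regular expression are illustrative assumptions,
the year is passed in by the caller because showlogs does not print it (2013 according
to the trace timestamps in this document), and English month abbreviations are assumed.

```python
import re
from datetime import datetime

# Lines copied from the showlogs excerpt above; <hostname> is the placeholder used there.
LOG = """\
May 24 13:43:07 <hostname> monitor_msg: SCF:maintenance event (FRU is chosen to be replaced: /XSCFU_B#0)
May 24 13:46:35 <hostname> monitor_msg: SCF:Unit configuration change (remove) /XSCFU_C#0
May 24 13:48:11 <hostname> monitor_msg: SCF:Unit configuration change (remove) /XSCFU_B#0
May 24 13:52:21 <hostname> monitor_msg: SCF:Unit configuration change (add) /XSCFU_C#0
May 24 13:52:51 <hostname> monitor_msg: SCF:Unit configuration change (add) /XSCFU_B#0
"""

# Hypothetical pattern, derived only from the sample lines above.
CHANGE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) .*"
    r"Unit configuration change \((?P<action>add|remove)\) /(?P<fru>\S+)"
)

def config_changes(text, year):
    """Return (time, action, fru) for every 'Unit configuration change' line."""
    events = []
    for line in text.splitlines():
        match = CHANGE.search(line.strip())
        if match:
            # The log carries no year, so the caller has to supply it.
            when = datetime.strptime(f"{year} {match['stamp']}", "%Y %b %d %H:%M:%S")
            events.append((when, match["action"], match["fru"]))
    return events

events = config_changes(LOG, 2013)
for when, action, fru in events:
    print(when.time(), action, fru)
```

Any "add" entry that shows up for the selected FRUs before the routine has prompted
for an installation points to the wrong order of operator actions.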

The components XSCFU_B#0/XSCFU_C#0 on the M9000-64 Server were installed although
the 'replacefru' routine had not asked for it. It had asked for a removal only.
Be aware that, once the removal has been finished, the 'replacefru' routine for an
XSCFU_B explicitly asks for the installation...

   [...]
   After removal has been completed, please select[f:finish] :f
   
   Please install XSCFU_C#0.
   After installation has been completed, please select[f:finish] :f
   
   Please install XSCFU_B#0.
   After installation has been completed, please select[f:finish] :f

   Waiting for XSCFU_B#0/XSCFU_C#0 to enter ready state.
   [This operation may take up to 35 minute(s)]
   (progress scale reported in seconds)
      0.....  30.....  60.....  90..... 120..... 150..... 180..... 210.....
   [...]
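
The prompt sequence above can be written down as an ordered checklist and compared
with what was actually done. This is only an illustrative sketch in Python: the action
strings and the function first_deviation are hypothetical and do not correspond to any
XSCF command or API. The observed sequence is reconstructed from the logs shown in
this document.

```python
# Operator actions in the order the 'replacefru' prompts ask for them.
EXPECTED = [
    "remove XSCFU_B#0/XSCFU_C#0",
    "finish",
    "install XSCFU_C#0",
    "finish",
    "install XSCFU_B#0",
    "finish",
]

def first_deviation(actions):
    """Return (step_index, expected, actual) for the first wrong action, or None.

    Only the steps present in both lists are compared, so an incomplete
    but otherwise correct sequence is not reported.
    """
    for index, (actual, expected) in enumerate(zip(actions, EXPECTED)):
        if actual != expected:
            return index, expected, actual
    return None

# What happened in the failing case: both units were inserted before the first f-key.
observed = [
    "remove XSCFU_B#0/XSCFU_C#0",
    "install XSCFU_C#0",
    "install XSCFU_B#0",
    "finish",
]
print(first_deviation(observed))
```

For the failing case the first deviation is at the second step, where a finish was
expected but XSCFU_C#0 was installed instead.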

 

Changes

 

Cause

The 'replacefru' routine was not followed exactly. The removal is a separate
required step that must be finished (f-key) before any XSCFU is installed
again.

Solution

Perform the correct 'replacefru' procedure. See also...

   SPARC Enterprise M8000/M9000 Servers Service Manual
   Chapter 11 - XSCF Unit Replacement

 

Further evidence of the wrong procedure can be found in the tracefile. The
relevant tracefile is the one that covers region 0014 (PROC_CLI) for the
period in which 'replacefru' was performed. With the snapshot data...

   $ show_scf_trace -l -s 14
   Analyzing snapshot at SNAPSHOT_HOME_DIR=./.
   
   Possible SCF Trace data files are:
        scf/cli/dbg/bin/scf_trace_dump_-dump_all_-.out
           0014 PROC_CLI       2013/05/12 18:33:08 904383  2013/05/24 14:37:14 770032
        [...]

Check this tracefile for further evidence of the wrong procedure. The first
f-key (ret=307), which is meant to finish the removal of XSCFU_B#0/XSCFU_C#0,
was selected at 13:54:07:

   $ show_scf_trace scf/cli/dbg/bin/scf_trace_dump_-dump_all_-.out 14

   [0014 0x000f 0x040300fe | 2013/05/24 13:42:31 623636] [0000542b]:Mainte,replacefru,main():230:replacefru started.
   [...]
   [0014 0x000f 0x040386a2 | 2013/05/24 13:54:07 493620] [0000542b]:Mainte,replacefru,replace_xscfu():conf_delete():791:lmainte_ask_select was called(ret=307)

The "Unit configuration change (add)" events seen in the showlogs output above
happened at 13:52:21 and 13:52:51, which is clearly before the f-key that is meant
to finish the removal. The removal is a separate required step that must be finished
before any XSCFU is installed again.
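
The timestamp comparison can be reproduced with a short script. This is a minimal
sketch in Python under stated assumptions: the helper names trace_time and
installed_before_finish are illustrative, the regular expression is derived only from
the trace line shown above, and the showlogs times are assumed to be on the same clock
and in the same year (2013) as the trace.

```python
import re
from datetime import datetime

# Trace line copied from the show_scf_trace output above.
TRACE_LINE = (
    "[0014 0x000f 0x040386a2 | 2013/05/24 13:54:07 493620] [0000542b]:"
    "Mainte,replacefru,replace_xscfu():conf_delete():791:"
    "lmainte_ask_select was called(ret=307)"
)

# "add" times from the showlogs excerpt; the year is an assumption taken from the trace.
ADD_EVENTS = {
    "XSCFU_C#0": datetime(2013, 5, 24, 13, 52, 21),
    "XSCFU_B#0": datetime(2013, 5, 24, 13, 52, 51),
}

# Hypothetical pattern for the "| date time microseconds]" part of a trace header.
TRACE_STAMP = re.compile(r"\| (\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) \d+\]")

def trace_time(line):
    """Extract the timestamp (to the second) from a trace line."""
    match = TRACE_STAMP.search(line)
    if match is None:
        raise ValueError("no trace timestamp found")
    return datetime.strptime(match.group(1), "%Y/%m/%d %H:%M:%S")

def installed_before_finish(finish, adds):
    """Return the FRUs whose 'add' event precedes the f-key for the removal."""
    return sorted(fru for fru, when in adds.items() if when < finish)

finish = trace_time(TRACE_LINE)
early = installed_before_finish(finish, ADD_EVENTS)
print(finish.time(), early)
```

With the values from this document both XSCFU_B#0 and XSCFU_C#0 are reported, i.e.
both units were inserted before the removal had been finished.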

References

<NOTE:1303329.1> - M8000 M9000 How to Replace a XSCF_B Active Replacement:ATR:1877:2

Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.