Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-71-1964065.1
Update Date:2018-03-12
Keywords:

Solution Type  Technical Instruction Sure

Solution  1964065.1 :   FS System: How to Remove and Replace a Motherboard in an FS1-2 Pilot  


Related Items
  • Oracle FS1-2 Flash Storage System
  • Oracle FS1-2 Cloud System
Related Categories
  • PLA-Support>Sun Systems>Sun_Other>Sun Collections>SN-OTH: DISK-CAP VCAP


Instructions on how to replace the motherboard in an FS1-2 Pilot.

In this Document
Goal
Solution
References


Oracle Confidential PARTNER - Available to partners (SUN).
Reason: FRU

Applies to:

Oracle FS1-2 Flash Storage System - Version All Versions to All Versions [Release All Releases]
Oracle FS1-2 Cloud System - Version All Versions to All Versions [Release All Releases]
Information in this document applies to any platform.

Goal

Outline the steps required to replace an FS1-2 Pilot motherboard using Guided Maintenance.

 

Solution

DISPATCH INSTRUCTIONS

- WHAT SKILLS DOES THE FIELD ENGINEER/ADMINISTRATOR NEED:

Product knowledge, FS1-2 Flash Storage System

TIME ESTIMATE: 120 minutes

TASK COMPLEXITY: 2

FIELD ENGINEER/ADMINISTRATOR INSTRUCTIONS:

NOTE: This Action Plan requires the following 3 items.  They come as part of the motherboard FRU or can be ordered separately by the FE:

  • CPU Installation/Removal Tool - part # 7026168
  • Thermal Grease Kit - part # 350-1271 and contains:
    • Thermal Grease Syringe - 310-0065
    • Alcohol wipes - 250-1802

If you are not very familiar with servicing the Sun Server X4-2, on which the Pilot is based, it is highly recommended that you watch the animation videos that detail the replacement procedures covered in this CAP.  They are available at Oracle Server Animations.


PROBLEM OVERVIEW: 

FS1-2 Pilot motherboard.

What: A Pilot motherboard in an FS1-2 has failed and needs to be replaced. 

Where: A failed motherboard will have a System Alert for the affected Pilot. 

WHAT STATE SHOULD THE SYSTEM BE IN TO BE READY TO PERFORM THE RESOLUTION ACTIVITY?

The Pilot with the motherboard failure will likely have a warning status but depending on how severe the damage is, the entire Pilot itself may be in a missing state.  The other Pilot must have a normal status as this procedure may require a Pilot failover so that the problem Pilot can be powered off in order to replace the failed motherboard.

 

NOTE: for software version R6.1.11 or below, please follow KM Document 2039278.1 FS System: Additional Steps Required to Replace a Pilot Motherboard in a System Running R6.1.11 or Below.

  

NOTE: Please review the KM Document 1942676.1 FS System: How to Disable Call Home to Prevent Automatic Service Request ASR Generation before proceeding with the procedure below. The steps contained therein are provided to allow an administrator to de-activate a particular ASR enabled array while performing maintenance or troubleshooting. This will prevent any additional Service Requests from being created unnecessarily.

 

NOTE: The FS1-2 Pilot uses a quorum mechanism for Key Identity Properties (KIP).  The quorum is comprised of the motherboard, disk backplane and power supply 0 which are all encoded with the Product Serial Number (PSN) of the Pilot (not the FS1-2).  At least two of these must agree on the correct PSN or the Pilot will NOT boot.  So as to avoid this problem, this process has the user confirm the PSNs are in sync before attempting the replacement.  NEVER replace one of these quorum devices if the PSNs are not in sync and NEVER replace two of these items at the same time.
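
As a rough illustration of the two-of-three rule described in the note above, the following Python sketch models the quorum as a simple majority vote. It is not the ILOM implementation. The function name quorum_psn and the serial numbers NEWBOARD01 and NEWDBP0001 are hypothetical; only 1307FML0VY is taken from the showpsnc example later in this document.

```python
from collections import Counter


def quorum_psn(motherboard, disk_backplane, power_supply_0):
    """Return the PSN held by at least two of the three quorum devices.

    Returns None when no two devices agree, which models the case
    where the Pilot will not boot. Empty values are ignored.
    """
    values = [v for v in (motherboard, disk_backplane, power_supply_0) if v]
    if not values:
        return None
    psn, votes = Counter(values).most_common(1)[0]
    return psn if votes >= 2 else None


# One device replaced (hypothetical new motherboard): quorum survives.
one_swapped = quorum_psn("NEWBOARD01", "1307FML0VY", "1307FML0VY")

# Two devices replaced at the same time: no two PSNs agree.
two_swapped = quorum_psn("NEWBOARD01", "NEWDBP0001", "1307FML0VY")

print(one_swapped, two_swapped)
```

In this model, replacing a single device leaves two holders of the original PSN, while replacing two devices at once leaves none in agreement. That is the reasoning behind the two NEVER rules in the note.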

 

QRC for this procedure:

  Pilot Motherboard QRC

WHAT ACTION DOES THE FIELD ENGINEER/ADMINISTRATOR NEED TO TAKE:

  1. Confirm Product Serial Number Containers (PSNCs) are currently synchronized.
    1. Use ssh to access the good Pilot, not the one having the motherboard replaced (root/a1s2d3f$ login/password).
      1. Software versions prior to R6.1.12 had ssh enabled from the factory. For versions R6.1.12 and newer, it can be enabled using fscli (30 minutes in this example):

        # fscli system -modify -enableSsh 30
          

    2. Use ssh to access the bad Pilot's ILOM.

      [root@pilot2 ~]# ssh 169.254.2.9
      Password:

      Oracle(R) Integrated Lights Out Manager

      Version 3.1.2.10.b r77700

      Copyright (c) 2012, Oracle and/or its affiliates. All rights reserved.

      Warning: password is set to factory default.

      ->
       

      NOTE: using the IP address of 169.254.2.9 will ALWAYS connect you to the other Pilot's ILOM.  In the example above, starting from Pilot 2, the connection is being made to Pilot 1's ILOM.


    3. Enter restricted session mode and run the showpsnc command.

      -> set SESSION mode=restricted

      WARNING: The "Restricted Shell" account is provided solely
      to allow Services to perform diagnostic tasks.

      [(restricted_shell) ORACLESP-1307FML0VY:~]# showpsnc
      Primary: fruid:///SYS/DBP0
      Backup 1: fruid:///SYS/MB
      Backup 2: fruid:///SYS/PS0

      Element           | Primary           | Backup1           | Backup2
      ------------------+-------------------+-------------------+-------------------
      PPN                 7056044             7056044             7056044
      PSN                 1307FML0VY          1307FML0VY          1307FML0VY <==== Product Serial Numbers must match.
      Product Name        SUN FIRE X4170 M3   SUN FIRE X4170 M3   SUN FIRE X4170 M3
      [(restricted_shell) ORACLESP-1307FML0VY:~]#
       


    4. If all 3 PSNs match, exit all the way out of the FS1-2 and proceed to step 2.
    5. If the Disk BackPlane 0 (DBP0) and Power Supply 0 (PS0) are the same but MotherBoard (MB) is different, it is safe to proceed to step 2 since the motherboard will be replaced.
    6. If any other condition exists, STOP!! and re-engage the TSC for steps to correct before proceeding to replace the failed motherboard.
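
The decision in sub-steps 4 through 6 of step 1 can be sketched in Python for use on a captured copy of the showpsnc output, for example on a service laptop. This is a minimal sketch, not an Oracle tool: the function names are hypothetical, and it assumes the output keeps the layout shown in the example above (three fruid lines in the same order as the table columns, and a PSN row whose values contain no spaces).

```python
import re

# Captured output, copied from the example in step 1.
SAMPLE = """\
Primary: fruid:///SYS/DBP0
Backup 1: fruid:///SYS/MB
Backup 2: fruid:///SYS/PS0

Element           | Primary           | Backup1           | Backup2
------------------+-------------------+-------------------+-------------------
PPN                 7056044             7056044             7056044
PSN                 1307FML0VY          1307FML0VY          1307FML0VY
Product Name        SUN FIRE X4170 M3   SUN FIRE X4170 M3   SUN FIRE X4170 M3
"""


def parse_showpsnc(text):
    """Map each FRU name (DBP0, MB, PS0) to the PSN stored in it."""
    frus = []  # FRU names in column order: Primary, Backup 1, Backup 2
    psns = []
    for line in text.splitlines():
        line = line.strip()
        match = re.match(r"(Primary|Backup ?\d):\s*fruid:///SYS/(\S+)", line)
        if match:
            frus.append(match.group(2))
        elif line.startswith("PSN"):
            # Keep only the three values; ignore any trailing annotation.
            psns = line.split()[1:4]
    if len(frus) != 3 or len(psns) != 3:
        raise ValueError("unrecognized showpsnc output")
    return dict(zip(frus, psns))


def replacement_decision(psn_by_fru):
    """Return 'proceed' or 'stop' for a motherboard replacement."""
    if psn_by_fru["DBP0"] == psn_by_fru["PS0"]:
        # Covers both cases: all three match, or only the MB differs.
        return "proceed"
    return "stop"  # Any other condition: re-engage the TSC.


print(replacement_decision(parse_showpsnc(SAMPLE)))
```

Both "proceed" cases reduce to one test because the motherboard is the part being replaced: what matters is that the two devices staying in the chassis, DBP0 and PS0, agree with each other.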

  2. Prepare FS1-2 for service procedure.
    1. Disable Call-Home to prevent spurious alerts (see KM Document 1942676.1 FS System: How to disable Call Home to prevent Automatic Service Request ASR Generation).
    2. Use ESD precautions.
    3. Log into Oracle FS System Manager to access Guided Maintenance:
      1. Select System tab
      2. In the navigation tree, expand Hardware and select Pilots
      3. In the main window, right click on the Pilot with the failed motherboard and select Repair Pilot.
      4. In the pop-up Repair window, select Motherboard assembly followed by the Next button.
      5. Follow the steps in Guided Maintenance to identify and place the Pilot offline.

  3. Access the Pilot motherboard.
    1. Deploy the anti-tip legs in the front of the rack.
    2. Slide the Pilot into the service position.
    3. Unplug both power cords.
    4. Open fan door and remove all 4 fan modules.
    5. Push down on the green plastic button on the front left side of the top cover, then slide the cover back and remove it.
    6. Disengage both power supplies from motherboard.

      Note: Do NOT completely remove the power supplies from the Pilot chassis as they are part of a quorum config.  If their locations are swapped (PS0 for PS1 and PS1 for PS0) in conjunction with a motherboard replacement, the Pilot will NOT boot.
       
    7. Remove riser 3 (leftmost).
      1. Slide PCIe retainer towards front to unlock.
      2. Lift the green tabbed latch at the rear to the up position.
      3. Lift the riser release lever to the open position
      4. Lift the riser from motherboard and place on drive compartment.

        Note: it is not necessary to remove the SAS cables since the HBAs will not be removed from their risers.
         
    8. Remove risers 1 & 2 (center and rightmost).
      1. Lift the green tabbed latch that secures the PCIe filler plate at the rear to the up position
      2. Lift the riser release lever to the open position
      3. Lift the riser from the motherboard and place on drive compartment.

    9. Loosen 4 captive screws and remove the server mid-wall.
    10. Remove remaining cables connected to motherboard.
      1. Front indicator module (far left side)
      2. Disk backplane LED cable (left side, near fan connector)
      3. Disk backplane power cable (far right side)

  4. Remove failed motherboard.
    1. Grasp the middle of the air duct in the front and back and slide the motherboard toward the front of the server.
    2. Lift it slightly to disengage the six mushroom standoffs that are located on the server's chassis under the motherboard.
    3. Lift the motherboard out of the server's chassis and place it on an anti-static mat.

      Note: When removing the motherboard from the chassis, make certain that the plastic lightpipe that is connected to the rear locate LED/button stays connected to the motherboard. The replacement motherboard will come with its own.
        

  5. Install replacement motherboard.
    1. Grasp the middle of the air duct in the front and back and tilt the front of the motherboard up slightly.
    2. Guide it into the opening in the rear of the server's chassis.
    3. Lower the motherboard into the server's chassis and slide it to the rear until it engages the six mushroom standoffs located on the server's chassis under the motherboard.
    4. Make certain that the indicators, controls and connectors on the rear of the motherboard fit correctly into the rear of the server's chassis.

      NOTE: If the Pilot is an X5-2 version, remove the Oracle System Assistant (OSA) USB drive from the replacement motherboard if one exists.   While having the OSA USB drive installed does not affect day to day operations of the FS1-2, it will create problems if the Pilot needs to be re-imaged in the future.


  6. Transfer DIMMs and CPUs with their heatsinks from old motherboard to new.
    1. DIMMs.
    2. CPUs and heatsinks.
      1. In order to access 2 of the 4 screws for CPU 1, you will need to remove a piece of the plastic air duct.
      2. While gently pushing down on the Heatsink, loosen 4 screws 1.5 turns each using a crossing pattern until all the screws are free.
      3. Gently rotate the heatsink back and forth slightly while pulling up to free it from the CPU.
      4. Using a supplied alcohol pad, clean the heatsink bottom and CPU top.  Be VERY careful not to damage the CPU pins or socket by applying too much pressure.  Avoid spreading the thermal grease to other surfaces.
      5. You must use the CPU Installation/Removal tool (part # 7026168) to extract the CPU.
      6. Viewing from the front, disengage the right CPU cover retaining lever by pushing down on it and then pushing it away from the CPU.  Repeat for the left lever.  Be sure to move the levers all the way back to allow better access to the CPU cover and the CPU itself.
      7. Open the CPU cover towards the right side to expose the CPU.
      8. Press the round button at the top center of the tool to unlock it.  Then place it over the CPU using the green arrow to align it properly.
      9. Lock the tool to the CPU by pushing the tab next to the center button on the top of the tool.  Once CPU is locked to the tool lift the tool straight up.
      10. Position the CPU in the replacement motherboard in the same location it was removed from and carefully align it with the CPU socket on the motherboard.
      11. Press the center round button on top of the tool to unlock the CPU.  Do NOT press on the CPU itself.
      12. Lift the CPU tool free and clear of the Pilot.
      13. Lower the CPU Cover back into place.
      14. Re-engage the left lever back into place and then the right lever.
      15. Using the syringe it comes in, apply ~0.1 ml of thermal grease in the center of the top of the CPU.  Do NOT spread it around.
      16. Verify that the underside of the heatsink is clean; if it is not, clean it with an alcohol pad.
      17. Carefully position the heatsink over the CPU by aligning the captive mounting screws to their holes in the motherboard.
      18. Once the heatsink has made contact with the thermal grease, keep any sideways movement to a minimum.
      19. Using a crossing pattern, tighten each screw 0.5 turns until all four are securely fastened.
      20. Repeat for second CPU.
      21. Reinstall the plastic duct piece for CPU 1 that was removed in step i.
    3. DIMMs
      1. One at a time, remove a DIMM from the failed motherboard and install it in the same slot on the replacement motherboard.
      2. To ensure proper cooling, also transfer the plastic fillers in the unused DIMM slots from the failed motherboard to its replacement.

  7. Reassemble Pilot components.
    1. Reinstall the server mid-wall using the 4 captive screws to secure it to the chassis.
    2. Reconnect the 3 cables that were removed in step 3-j.

      NOTE: Be particularly careful with the flex cable on the left as it can be easily damaged.  Bad connections may cause the boot drive not to be seen and thus the Pilot won't boot.
       
    3. Reinstall the 3 risers removed in steps 3-g & 3-h.  Be sure to route the SAS cable through its trough in the plastic air duct.
    4. Re-engage both power supplies back into the motherboard.
    5. Reinstall the 4 fan modules and close the fan door.
    6. Close the top cover.

  8. Return Pilot to FS1-2 System.

    NOTE: If the FS1-2 is running R6.1.11 or below, please follow the additional steps covered in KM Document 2039278.1 FS System: Additional Steps Required to Replace a Pilot Motherboard in a System Running R6.1.11 or Below
     
    1. Plug in both power cords.
    2. Return the Pilot to the rack position.
    3. Return the anti-tip legs to their normal position.
    4. Once the Pilot has completed booting, confirm the motherboard BIOS:
      1. ssh to the active Pilot and run the ver command with the -v option:
        [root@pilot2 ~]# ver -v

        pilot1(Active):
        Pilot Apps Build:        060215-052400
        Pilot OS version:        060215-052200
        OracleFS mfg version:    060215-052200
        ebod-7043630 version:    060100-007000
        ebod-7044319 version:    060100-007000

        Kernel Version: 4.1.12-61.1.22.el6uek.x86_64
        BIOS Version: American Megatrends Inc. 30060100 09/10/2015
        SP firmware 3.2.4.60
        SP firmware build number: 104651
        SP firmware date: Thu Nov  5 15:51:02 CST 2015
        ...
        pilot1:
        Pilot Apps Build:        060215-052400
        Pilot OS version:        060215-052200
        OracleFS mfg version:    060215-052200
        ebod-7044319 version:    060100-007000
        ebod-7043630 version:    060100-007000

        Kernel Version: 4.1.12-61.1.22.el6uek.x86_64
        BIOS Version: American Megatrends Inc. 30060100 09/10/2015
        SP firmware 3.2.4.60
        SP firmware build number: 104651
        SP firmware date: Thu Nov  5 15:51:02 CST 2015
         
      2. If the BIOS versions reported for the two Pilots are different, please refer to KM Document 1939732.1 FS System: How to Access the Internal Service Guide, to upgrade the BIOS.
        Note: the ILOM BIOS cannot synchronize if the files are missing from the surviving Pilot.  See KM Document 2316638.1 FS System: Recovering Pilot ILOM files.
         
    5. Once the Pilot has completed its reboot, repeat step 1 to verify that the PSN of the replacement motherboard is synchronized to the other two quorum devices.
    6. When finished, re-enable Call-Home.
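
The BIOS comparison in sub-step 4 of step 8 can be sketched in Python against a saved copy of the ver -v output. This is a minimal sketch under stated assumptions, not an Oracle utility: the function names are hypothetical, and it assumes each Pilot section starts with a header line such as pilot1(Active): and contains one "BIOS Version:" line, as in the listing above. The sample is an excerpt of that listing, copied as printed.

```python
import re

# Excerpt of the ver -v listing in step 8, copied as printed there.
SAMPLE = """\
pilot1(Active):
Pilot Apps Build:        060215-052400
Kernel Version: 4.1.12-61.1.22.el6uek.x86_64
BIOS Version: American Megatrends Inc. 30060100 09/10/2015
SP firmware 3.2.4.60
...
pilot1:
Pilot Apps Build:        060215-052400
Kernel Version: 4.1.12-61.1.22.el6uek.x86_64
BIOS Version: American Megatrends Inc. 30060100 09/10/2015
SP firmware 3.2.4.60
"""


def bios_versions(ver_output):
    """Return (section header, BIOS version string) pairs in order."""
    results = []
    label = None
    for line in ver_output.splitlines():
        line = line.strip()
        if re.fullmatch(r"pilot\d+(\(\w+\))?:", line):
            label = line.rstrip(":")  # e.g. "pilot1(Active)"
        elif line.startswith("BIOS Version:"):
            results.append((label, line.split(":", 1)[1].strip()))
    return results


def bios_in_sync(ver_output):
    """True when exactly two Pilots report the same BIOS string."""
    found = bios_versions(ver_output)
    if len(found) != 2:
        raise ValueError("expected BIOS versions for two Pilots")
    return found[0][1] == found[1][1]


print(bios_in_sync(SAMPLE))
```

The pairs are kept in a list rather than a dictionary so that two sections carrying the same header text are still counted separately. A False result corresponds to the case where the BIOS must be upgraded using the Internal Service Guide.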


OBTAIN CUSTOMER ACCEPTANCE


WHAT ACTION DOES THE FIELD ENGINEER/ADMINISTRATOR NEED TO TAKE TO RETURN THE SYSTEM TO AN OPERATIONAL STATE:

 Confirm that the System Alert previously associated with the Pilot is gone and the FS1-2 status is normal/green.

Note: Because the Pilot must cold start, it may take as long as 15 minutes for the boot process to complete and the Pilot to return to a normal status.


REFERENCE INFORMATION:

 From the Oracle Help Center: http://docs.oracle.com/en/storage/#fla select the Oracle Flash System Documentation Library for more information.

References

<NOTE:1939732.1> - FS System: How to access Internal Field Service Guides

Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.