Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-71-2032465.1
Update Date:2017-10-04

Solution Type  Technical Instruction Sure

Solution  2032465.1 :   M8-8 / M7-8 / M7-16 - How to Replace a Faulty PCIe Card  


Related Items
  • Oracle SuperCluster M7 Hardware
  • SPARC M7-16
  • SPARC M8-8
  • SPARC M7-8
Related Categories
  • PLA-Support>Sun Systems>Sun_Other>Sun Collections>SN-OTH: SPARC-CAP VCAP


ATR CAP: M7-8 / M7-16 / M8-8 - How to Replace a Faulty PCIe Card

In this Document
Goal
Solution
References


Applies to:

SPARC M7-16 - Version All Versions and later
Oracle SuperCluster M7 Hardware - Version All Versions and later
SPARC M7-8 - Version All Versions and later
SPARC M8-8 - Version All Versions and later
Information in this document applies to any platform.

Goal

CAP PROBLEM OVERVIEW: M8-8 / M7-8 / M7-16  - How to Replace a Faulty PCIe card

To report errors or request improvements on this procedure, please go to My Oracle Support, and put a comment on Doc ID: 2032465.1

 

ESD Caution:
  • Circuit boards and drives contain electronic components that are extremely sensitive to static electricity. Ordinary amounts of static electricity from clothing or the work environment can destroy the components located on these boards. Do not touch the components along their connector edges.
  • Use an antistatic wrist strap. Attach one end of the strap to your wrist and the other end to the chassis, using either the adhesive end or the metal plug, depending on the type of strap.
  • Use an antistatic mat. Place ESD-sensitive components such as motherboards, memory, and other PCBs on an antistatic mat.

 

Contamination Caution:
  • Dust particles from packaging material are the number one cause of data center contamination. Remove all packaging material, down to the ESD-safe packaging, before you enter the data center.

 

HOT Replacement Caution:
  • Oracle SuperCluster M7 Hardware does not support PCIe card hot replacement (also known as hotplug). The Physical Domain (PDomain) that owns the PCIe slot must be shut down before the PCIe card is replaced.

 

Oracle SuperCluster M7 Infiniband HCA Responsibility:
  • When replacing an Infiniband HCA in Oracle SuperCluster M7, additional Field Engineering tasks must be completed. Perform the steps in the following document: How to Replace Infiniband Card in Oracle SuperCluster Compute / DB nodes (Doc ID 2044499.1)

Solution

 

DISPATCH INSTRUCTIONS

WHAT SKILLS DOES THE ENGINEER NEED: M8-8 / M7-8 / M7-16 product training recommended but not required as this is a CRU procedure

TASK COMPLEXITY: 0

TIME ESTIMATE: 30 minutes

HOT replacement

Prepare labeling materials sufficient for all IO cables attached to the IOU.

WHAT STATE SHOULD THE SYSTEM BE IN TO BE READY TO PERFORM THE RESOLUTION ACTIVITY? : n.a.

WHAT ACTION DOES THE ENGINEER NEED TO TAKE:

Determine which PCIe card requires service.

Caution - To remove a PCIe card that is assigned to an I/O domain, first remove the device from the I/O domain. For more information about making hardware changes to an I/O domain, refer to the Oracle VM for SPARC documentation at http://www.oracle.com/goto/VM-SPARC/docs.
The Oracle® VM Server for SPARC 3.2 Administration Guide explains how to minimize guest domain outages when removing a PCIe card.
See https://docs.oracle.com/cd/E48724_01/html/E48732/minimizedomainoutageswhenremovecard.html
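
The caution above can be illustrated with a minimal Python sketch that only assembles the Oracle VM Server for SPARC command lines an administrator might review before detaching the card. It assumes the `ldm list-io` and `ldm remove-io` subcommands; the device name and domain name are hypothetical placeholders, not values from this document. The sketch does not run anything, and the Administration Guide remains the authority on whether the I/O domain must be stopped first.

```python
import shlex
from typing import List


def build_remove_io_commands(device: str, domain: str) -> List[List[str]]:
    """Return ldm command lines (as argv lists) for detaching a PCIe device."""
    if not device or not domain:
        raise ValueError("device and domain must be non-empty")
    return [
        ["ldm", "list-io"],                    # review current I/O assignments
        ["ldm", "remove-io", device, domain],  # detach the device from the I/O domain
    ]


# Hypothetical example values; substitute the names reported on your system.
commands = build_remove_io_commands("/SYS/CMIOU1/PCIE3", "iodom1")
for argv in commands:
    print(shlex.join(argv))
```

Building the commands as argv lists rather than one shell string keeps device paths that contain special characters from being misinterpreted if the lines are later passed to a runner.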

Remove a PCIe Carrier and Card From the Server
1. Press the ATTN button on the carrier that contains the PCIe card that you wish to remove.
    The LEDs on the carrier flash for approximately 10 seconds as the PDomain disables the I/O card. When the LEDs on both the carrier and the card turn off, the carrier and card are ready to remove.
2. Label and remove any I/O cables from the PCIe card.
3. Remove the carrier from the slot:
   a. Pull the carrier’s extraction lever. (The lever is held in place by friction.)
   b. Swing the extraction lever out 90 degrees until the far end of the lever begins to push the carrier out of the slot.
   c. Remove the carrier from the slot.
   d. Place the carrier on a static-safe workspace.

4. Remove the PCIe card from the carrier:
   a. Press the green tab to unlock the carrier latch and open the top of the PCIe carrier.
   b. Slide the card from the slot. Place the PCIe card on an antistatic mat or into its antistatic packaging.

Avoid twisting, tilting, or pulling unevenly on the PCIe card, which could damage the carrier slot or components on the PCIe carrier circuit board.

Install the new PCIe card in the carrier.

1. Unlatch and swing open the arm of the PCIe card carrier, and insert the new PCIe card until the bottom connector is firmly seated in the carrier's connector.

Do not twist or turn the PCIe card as you insert it into the carrier. The card's connector must be fully seated in the carrier's slot before you attempt to close the top cover.

If the PCIe card includes a mounting screw, do not use the mounting screw. The carrier does not accept mounting screws.

2. Close the top of the carrier.
   The green latch should click into place.  If the top is difficult to close, verify that the notch of the card bracket fits around the guide post.

Install the carrier into the server.

Ensure that the primary domain is at the Oracle Solaris prompt. Installing a PCIe card carrier while the primary domain is at the OpenBoot prompt is not supported.


1. Insert the PCIe hot-plug carrier with the I/O card into the CMIOU slot:

   a. Push evenly on both sides of the carrier so that the carrier slides straight into the slot.
      If the carrier slides correctly into the slot, you should feel a slight resistance as the carrier starts to seat in the connector.
      Do not push the extraction lever while you insert the carrier into the slot. The carrier can enter at an angle and damage the connectors.
   b. Lock the carrier's extraction lever.
      The LEDs on the carrier and the card should remain off at this point.

2. Attach I/O cables to the card.

3. Press the ATTN button on the carrier to reconfigure the I/O card into the PDomain.
The carrier’s LEDs should flash for a few seconds until the PDomain enables the I/O card. The card’s LEDs show activity when the card is enabled.

4. Verify the PCIe Card
Verify that the Fault LED is not lit and that the green Power LED is lit on the card that you installed.
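
The LED checkpoints in the removal and installation steps above can be summarized in a short Python sketch. It is only an aid for recording what the engineer observes by eye; the class and function names are illustrative assumptions, and nothing here reads LED state from the server.

```python
from dataclasses import dataclass
from enum import Enum


class Stage(Enum):
    READY_TO_REMOVE = "after ATTN press, before pulling the carrier"
    SEATED = "carrier inserted and lever locked, before ATTN press"
    ENABLED = "after ATTN press on the installed carrier"


@dataclass
class Leds:
    """LED states as observed by the engineer (True means lit or flashing)."""
    carrier_on: bool = False
    card_on: bool = False
    power_on: bool = False
    fault_on: bool = False


def leds_match(stage: Stage, leds: Leds) -> bool:
    """Return True when the observed LEDs match what the procedure expects."""
    if stage in (Stage.READY_TO_REMOVE, Stage.SEATED):
        # Carrier and card LEDs must both be off at these two checkpoints.
        return not leds.carrier_on and not leds.card_on
    if stage is Stage.ENABLED:
        # Green Power LED lit and Fault LED not lit on the installed card.
        return leds.power_on and not leds.fault_on
    raise ValueError(f"unknown stage: {stage!r}")


print(leds_match(Stage.READY_TO_REMOVE, Leds(carrier_on=True)))  # still flashing, so wait
print(leds_match(Stage.ENABLED, Leds(power_on=True)))
```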

OBTAIN CUSTOMER ACCEPTANCE

WHAT ACTION DOES THE CUSTOMER NEED TO TAKE TO RETURN THE SYSTEM TO AN OPERATIONAL STATE:

Verify that the fault has been cleared and that the replaced component is operational:

  1. Verify that the Fault LED and the front and rear Service Required LEDs are not lit.
  2. Verify that there are no faulty components:
    1. -> show faulty
    2. -> show /System/Open_Problems
    3. faultmgmtsp> fmadm faulty
  3. Perform one of the following tasks based on your verification results:
    1. If the previous steps did not clear the fault, refer to Doc ID 1309092.1 for information about the tools and methods you can use to diagnose and clear component faults.
    2. If the previous steps indicate that no faults have been detected, the component has been replaced successfully. No further action is required.
  4. The firmware on the cards might need to be upgraded. Check SPARC M8 and SPARC M7 Series Servers : Firmware for the Various Hardware Components used on SPARC M8 and M7 Series Servers (Doc ID 2076387.1)
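
The decision in step 3 can be written out as a small Python sketch. It assumes the engineer has already checked the LEDs and transcribed any components reported by the three commands in step 2 into a list; the function name and return strings are illustrative, and the sketch does not query Oracle ILOM itself.

```python
from typing import List

FAULT_NOT_CLEARED = "Fault not cleared: refer to Doc ID 1309092.1"
REPLACEMENT_OK = "Component replaced successfully: no further action required"


def next_action(fault_led_lit: bool,
                service_required_leds_lit: bool,
                faulty_components: List[str]) -> str:
    """Map the verification results from steps 1 and 2 to the step 3 outcome."""
    if fault_led_lit or service_required_leds_lit or faulty_components:
        return FAULT_NOT_CLEARED
    return REPLACEMENT_OK


# Hypothetical observations: all LEDs off and no components listed as faulty.
print(next_action(False, False, []))
```

The firmware check in step 4 is independent of this outcome and still applies after a successful replacement.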

 

Return the faulted component to Oracle.

  1. If the replacement included safety covers on the connectors, install the covers on the component that you are returning.
  2. In the shipping container that contained the replacement component:
    1. Using the same material used to pack the replacement component, position the component so that it is not free to move.
    2. Add any required paperwork or other documentation in the container.
    3. Except when packing CMIOUs, include any tools that were loaned to you by Oracle. Do not place tools inside a container that is being used to return a CMIOU.
  3. Close the shipping container and seal it with the packaging tape supplied by Oracle.
  4. Apply the shipping label to the shipping container.
  5. Notify Oracle or an authorized shipper that the shipping container is ready for pickup.

 

======================== Other info =====================

REFERENCE INFORMATION:  Service Manual: http://docs.oracle.com/cd/E55211_01/html/E55215/index.html

The Oracle® VM Server for SPARC 3.2 Administration Guide explains how to minimize guest domain outages when removing a PCIe card.
See https://docs.oracle.com/cd/E48724_01/html/E48732/minimizedomainoutageswhenremovecard.html

NOTE:1985159.1 - Updating IB partitions after replacing an Infiniband HCA in any nodes within IB network - steps to do after replacing HCA

References

<NOTE:2076387.1> - SPARC M8 and SPARC M7 Series Servers: Firmware for the Various Hardware Components used on SPARC M8 and M7 Series Servers

Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.