Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-71-1356473.1
Update Date: 2017-10-27
Keywords:

Solution Type  Technical Instruction Sure

Solution  1356473.1 :   How to Perform Exadata or SuperCluster Preventive Maintenance Service  


Related Items
  • Exadata X3-2 Hardware
  • SPARC SuperCluster T4-4 Full Rack
  • Oracle SuperCluster T5-8 Full Rack
  • Exadata X4-2 Hardware
  • Exadata Database Machine X2-2 Qtr Rack
  • Exadata X3-2 Half Rack
  • Oracle SuperCluster T5-8 Half Rack
  • Exadata X4-2 Quarter Rack
  • Exadata Database Machine X2-8
  • Exadata X4-8 Hardware
  • Exadata Database Machine X2-2 Full Rack
  • Exadata X3-2 Full Rack
  • Exadata X4-2 Half Rack
  • Zero Data Loss Recovery Appliance X4 Hardware
  • Exadata Database Machine X2-2 Half Rack
  • Exadata X3-8 Hardware
  • SPARC SuperCluster T4-4 Half Rack
  • Exadata X4-2 Full Rack
  • Exadata X3-2 Eighth Rack
  • Exadata Database Machine X2-2 Hardware
  • Exadata X3-2 Quarter Rack
  • SPARC SuperCluster T4-4
  • Exadata Database Machine V2
  • Exadata X3-8b Hardware
  • Oracle SuperCluster M6-32 Hardware
  • Oracle SuperCluster T5-8 Hardware
  • Exadata X4-2 Eighth Rack
Related Categories
  • PLA-Support>Sun Systems>Sun_Other>Sun Collections>SN-OTH: x64-CAP VCAP
  • _Old GCS Categories>Sun Microsystems>Specialized Systems>Database Systems
  • _Old GCS Categories>ST>Server>Engineered Systems>Exadata>Hardware
  • _Old GCS Categories>Sun Microsystems>Servers>x64 Servers


Canned Action Plan to Perform Exadata or SuperCluster Preventive Maintenance Service

In this Document
Goal
Solution
 PM Tasks for FSE:
 Prior to going on-site
 Pre-PM Activity [Action Plan for Initial PM Onsite Visit - for Visual Inspection and Determining Parts Required]
 PM Activity [Action Plan for PM Parts replacements]
 REFERENCE INFORMATION:
References


Oracle Confidential PARTNER - Available to partners (SUN).
Reason: Exadata FRU-only policy, to be performed by Oracle engineers only.

Applies to:

Exadata X3-2 Half Rack - Version All Versions and later
Exadata Database Machine X2-8 - Version All Versions and later
Exadata X3-2 Quarter Rack - Version All Versions and later
Exadata Database Machine X2-2 Qtr Rack - Version All Versions and later
Exadata X3-2 Hardware - Version All Versions and later
Information in this document applies to any platform.

Goal

Canned Action Plan to perform Exadata or SuperCluster Preventive Maintenance Service

Solution

DISPATCH INSTRUCTIONS:

Follow the normal Dispatch process. However, because this task relates to a Preventive Maintenance activity, do not close it until the SR owner or a Field Engineer confirms that the activity has been completed successfully. Under no circumstances should Dispatch close or cancel this task because the customer is unresponsive. Thank you.

WHAT SKILLS DOES THE ENGINEER NEED:

- Exadata or SuperCluster trained (whichever is applicable)

- TIME ESTIMATE: 720 Minutes

- TASK COMPLEXITY: 3

Time Estimate above is total including both initial on-site visit and any subsequent visits for parts replacements.


FIELD ENGINEER INSTRUCTIONS:

- PROBLEM OVERVIEW:  Preventative Maintenance

The Engineered Systems Preventative Maintenance (PM) process consists of a mandatory initial Pre-PM activity (for visual inspection and for determining parts needed, etc) as well as at least one onsite task for the actual PM activity (parts replacements at a later date). This CAP details the PM task created by an Oracle Support engineer for standard PM tasks/activities.  This CAP is used for both the initial Pre-PM activity, as well as the actual PM activity.

Please note that the PM Process was recently updated (July 2016). If you need clarification of the overall PM process, especially as it relates to the responsibilities of Field Engineers, refer to the PM Desk Manual: Doc ID 1803892.1 (ES-PM Handling Engineered Systems Preventative Maintenance Service Requests : GCSEXA, GCSGCH, GCSSRM).

If you are assigned this task, you may assume Oracle has already communicated the need for Preventative Maintenance to take place on this Engineered System, to the primary customer contact specified on the Service Request. You may also assume that the customer has given their agreement for us to carry out the Preventative Maintenance service.

Depending on the task, the FSE needs to refer to either the first Action Plan section titled "Pre-PM Activity" (visual inspection, etc) or to the second Action Plan section titled "PM Activity" (for PM parts replacement visits).

Also for clarification (from PM Desk Manual), the Field Engineer is responsible for:

  • Undertaking a Pre-PM visit (first mandatory task)
  • Performing and clearly documenting the outcome of the Pre-PM activity (via the Pre-PM template)
  • Establishing all part numbers and all part quantities, required to fully implement the PM activity on the designated system.
  • Where possible, obtaining the customer’s initial preferred date/time for the PM activity.
  • Providing Dispatch with details of all required parts, and (where known) customer preferred date/time for PM activity.
  • Rectifying any additional faults that Domain Engineer analysis of a healthcheck bundle finds must be fixed prior to completion of the PM activity (if required, an additional, third task would be created and assigned)
  • Performing the actual PM activity (second mandatory task)
  • Providing confirmation that the PM activity has been completed successfully, to the Customer and Domain Engineer.

- WHAT STATE SHOULD THE SYSTEM BE IN TO BE READY TO PERFORM THE RESOLUTION ACTIVITY?

If the rolling method is used, then each node will need to be shut down one at a time. 

If the complete shutdown method is used, then the entire rack is shut down, typically during a single maintenance window arranged by the customer.

 

PM Tasks for FSE:

Prior to going on-site

1. If unfamiliar with the PM process, refer to all related documentation and review the PM Desk Manual.

2. Review the overview presentations and technical instructions attached to Oracle knowledge Doc ID 1356432.1 (Exadata Preventive Maintenance Service - Reference Documentation). Ensure you are following the most recently updated version of procedures.

3. Contact the customer to confirm the preferred date/time for the on-site visit.  Follow Field "Customer Call Process".
Refer to the Field Service Engineer Desk Manual: https://stbeehive.oracle.com/content/dav/st/GDMI/Public/FIELD_SERVICE_ENGINEER.htm

Pre-PM Activity [Action Plan for Initial PM Onsite Visit - for Visual Inspection and Determining Parts Required]

1. Visually inspect all components in the Engineered System rack that is scheduled for PM.

2. Check specifically for any broken Cable Management Arms (CMAs) or cables. Make a note of quantity and type of CMA if any need to be replaced.

3. If required, explain to the customer that the replacement batteries require a minimum Exadata image version of 11.2.2.1.1, and that the customer must therefore update the image before the PM activity is scheduled.

Refer the customer to Oracle knowledge Doc ID 888828.1 (Exadata Database Machine and Exadata Storage Server Supported Versions) for image patch information.
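
The minimum-image check in step 3 amounts to a component-wise comparison of dotted version strings. The following is a minimal sketch in Python, assuming the customer's image version is already known and is a purely numeric dotted string. The names `parse_version` and `meets_minimum` are illustrative only and are not part of any Oracle tool.

```python
MIN_IMAGE_VERSION = "11.2.2.1.1"  # minimum image version stated in step 3


def parse_version(text):
    """Split a dotted version string such as '11.2.3.2.1' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def meets_minimum(image_version, minimum=MIN_IMAGE_VERSION):
    """Return True if image_version is at or above the minimum.

    The shorter version is padded with zeros, so '11.2.3' is compared
    as if it were '11.2.3.0.0'.
    """
    current = parse_version(image_version)
    required = parse_version(minimum)
    width = max(len(current), len(required))
    current += (0,) * (width - len(current))
    required += (0,) * (width - len(required))
    return current >= required
```

If `meets_minimum` returns False for the customer's image, the image update needs to happen before the PM activity is scheduled.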

4. Establish the customer’s preferred maintenance method (rolling or with downtime). For a rolling PM Service, review Oracle knowledge Doc ID 1356432.1 to understand the implications.

Note, the next steps are taken directly from the PM Desk Manual:

5. Obtain and upload healthcheck bundle for entire ES rack, if possible.

6. Establish all part number(s) and all part quantity/quantities, required to complete PM on the specified ES (physical rack).

7. Discuss, and if possible obtain, customer’s initial preferred date/time for the PM activity.

8. Do not yet commit to the customer’s preferred date/time – explain parts availability and ETA must be confirmed by Dispatch.

9. Be aware that in some (exceptional) situations, delivery of standalone Lithium Ion batteries may take longer than expected.
    In particular, if (despite material steps by Logistics Planning to pre-position stock) insufficient batteries are available
    in-country to support PM activity for this customer, parts may need to be moved from the regional warehouse, via surface
    transportation.

10. Maximum battery lead-times, by location, are shown in supporting document:
    Lithium Ion Battery Surface Transportation Lookup Table
    This information may help the customer decide on a realistic preferred date/time for the PM activity (maintenance window
    should not be sooner than parts can be delivered). Do not share the document with the Customer.
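
The constraint in steps 9 and 10 (the maintenance window should not be sooner than parts can be delivered) is simple date arithmetic. The sketch below assumes the maximum lead time for the location has been looked up in the supporting document and expressed in days. The function names and the figures in the usage note are hypothetical and are not taken from the lookup table.

```python
from datetime import date, timedelta


def earliest_pm_date(order_date, lead_time_days):
    """Earliest date on which parts could be on site, given a lead time in days."""
    return order_date + timedelta(days=lead_time_days)


def window_is_realistic(preferred_date, order_date, lead_time_days):
    """Return True if the preferred PM date is not sooner than parts can be delivered."""
    return preferred_date >= earliest_pm_date(order_date, lead_time_days)
```

For example, with an assumed order date of 1 May 2018 and an assumed 21-day lead time, `window_is_realistic(date(2018, 5, 15), date(2018, 5, 1), 21)` is False, so a later date would need to be discussed with the customer.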

11. Complete all sections of the Pre-PM visit Template contained within this action plan.

Here is the Template:


----- TEMPLATE BEGIN -----
Account Name:
Rack SN:
Product (e.g., Exadata, Exalogic):
SR #:
FE that performed the Pre-PM activity:
FE email address:
Should this same FE be assigned to the actual PM Task? [Yes, or TBD]
Date when customer has agreed to begin the PM Service:

If HBA Batteries are due to be replaced during this scheduled PM:
• Quantity of Batteries needed:
• Part number of Batteries:
If ESMs are due to be replaced during this scheduled PM:
• Quantity of ESMs needed:
• Part number of ESMs:
Other HW issues observed? [Yes, No]
If yes, list part descriptions, quantities, and part numbers.
----- TEMPLATE END -----
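
Because step 11 asks for all sections of the template to be completed, a quick check for unanswered prompts before the note is added can be useful. The following is a minimal sketch, assuming the completed template is available as plain text with each answer on the same line as its prompt. The function `missing_fields` is illustrative only. The battery and ESM lines are not checked because they apply only when those parts are due for replacement.

```python
# Prompts from the Pre-PM template that always need an answer.
REQUIRED_PROMPTS = [
    "Account Name:",
    "Rack SN:",
    "Product (e.g., Exadata, Exalogic):",
    "SR #:",
    "FE that performed the Pre-PM activity:",
    "FE email address:",
    "Should this same FE be assigned to the actual PM Task? [Yes, or TBD]",
    "Date when customer has agreed to begin the PM Service:",
    "Other HW issues observed? [Yes, No]",
]


def missing_fields(completed_text):
    """Return the required prompts that have no answer following them."""
    lines = [line.strip() for line in completed_text.splitlines()]
    missing = []
    for prompt in REQUIRED_PROMPTS:
        answered = any(
            line.startswith(prompt) and line[len(prompt):].strip()
            for line in lines
        )
        if not answered:
            missing.append(prompt)
    return missing
```

An empty list from `missing_fields` means every always-required prompt has some text after it. It does not validate the content of the answers.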

12. Add completed template to the task as an internal “Action Required” note.

Note, performing “Action Required” will close out your Task, so make sure you update it as necessary before using Action Required.

Also note that this will change the sub-status of the SR to “Review Task”. This sub-status change will alert the SR owner of your update, which is important.

13A.  For reference, Dispatch is now ordering parts; FSEs are no longer doing this.

13B.  Contact Dispatch and

  • State that you are calling to provide important details from an onsite Pre-PM inspection visit, to allow planning of Engineered Systems Preventative Maintenance (ES-PM) activity.
  • State all part number(s) and all part quantity/quantities required for the PM activity.
  • State clearly if standalone Lithium Ion batteries are required.
  • State customer’s preferred date/time for PM activity (explain that this date/time has not been committed to the customer).
  • State expected duration of the onsite PM activity (will be used by Dispatch to set duration of PM task).
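
Before calling Dispatch, the per-node observations from the inspection need to be rolled up into one quantity per part number. The sketch below shows one way to do that, assuming the observations were recorded as (part number, quantity) pairs. The function name and the part numbers in the usage note are hypothetical placeholders, not real Oracle part numbers.

```python
from collections import Counter


def tally_parts(observations):
    """Sum quantities per part number from (part_number, quantity) pairs."""
    totals = Counter()
    for part_number, quantity in observations:
        totals[part_number] += quantity
    return dict(totals)
```

For example, `tally_parts([("PN-EXAMPLE-A", 2), ("PN-EXAMPLE-B", 1), ("PN-EXAMPLE-A", 3)])` gives a total of 5 for `PN-EXAMPLE-A` and 1 for `PN-EXAMPLE-B`.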

PM Activity [Action Plan for PM Parts replacements]

Note: the specific procedures and commands from the PM Process document that the customer is required to complete should be shared with the customer for the sole purpose of completing the PM process. The PM Process document as a whole, however, should not be left with the customer.

1. If your customer’s ES is ASR-enabled, ensure that the customer’s system is properly configured for ASR.  While on-site, check for any ASR misconfiguration issues and work with the customer and attempt to correct them. Refer to Oracle knowledge Doc ID 2103715.1 for details. Carefully document any aspects of ASR misconfiguration that you find in your debrief notes, including whether you were able to correct these or not.

2. If systems have ASR enabled, make sure ASR is disabled during PM activities, so that ASR SRs are not generated unintentionally.

3. Initial Exadata systems were not configured to support ASR on the InfiniBand Switches. To support ASR on the IB switches with proper entitlement, the external serial number on the physical label needs to be programmed into the switch firmware. Regardless of whether the customer is currently using ASR, check whether the serial number is programmed correctly and, if it is not, program it so that alerts can be raised from the IB switch. Refer to and follow Doc ID 1902710.1 for how to configure it; the procedure is also available in section 4.8 of the PM Process document.

4. With the customer’s assistance, prepare the system(s) for the PM activity.

5. Replace all faulty components.

6. Ensure that all nodes and components come back on-line and are healthy.

7. If this PM was a Rolling Method, please ensure that this Task remains open or that a Copy Task is created for additional work.

8. When all PM activities are complete, alert the SR owner that the PM activity was successful and that it is 100% completed - this is important for tracking purposes.

9. When using Debrief notes, please clearly state that the PM was completed successfully (or not).

10. Create a task note, and select “Action Required”.  Insert a note like this:

“PM Task/Visit was completed successfully and SR can be closed”.

Note, performing “Action Required” will close out your Task, so make sure you update it as necessary before using Action Required. 

Also note that this will change the sub-status of the SR to “Review Task”.  This sub-status change will alert the SR owner of your update, which is important.

REFERENCE INFORMATION:

Oracle knowledge Doc ID 1947296.1 (Gathering HW and ASR data on nodes within an ES Rack during Preventive Maintenance activity).
Oracle knowledge Doc ID 1356432.1 (Exadata Preventive Maintenance Service - Reference Documentation).
Oracle knowledge Doc ID 1803892.1 (ES-PM Handling Engineered Systems Preventative Maintenance Service Requests : GCSEXA, GCSGCH, GCSSRM)




  Copyright © 2018 Oracle, Inc.  All rights reserved.