Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition
Solution Type: Technical Instruction Sure
Solution 1356473.1: How to Perform Exadata or SuperCluster Preventive Maintenance Service
Canned Action Plan to Perform Exadata or SuperCluster Preventive Maintenance Service
Oracle Confidential PARTNER - Available to partners (SUN).

Applies to:
Exadata X3-2 Half Rack - Version All Versions and later
Exadata Database Machine X2-8 - Version All Versions and later
Exadata X3-2 Quarter Rack - Version All Versions and later
Exadata Database Machine X2-2 Qtr Rack - Version All Versions and later
Exadata X3-2 Hardware - Version All Versions and later
Information in this document applies to any platform.

Goal
Canned Action Plan to perform Exadata or SuperCluster Preventive Maintenance Service

Solution
DISPATCH INSTRUCTIONS: Follow normal Dispatch process. However, since this relates to a Preventive Maintenance activity, do not close this task until you get confirmation from the SR owner.
- TASK COMPLEXITY: 3
Time Estimate above is total including both initial on-site visit and any subsequent visits for parts replacements.
- PROBLEM OVERVIEW: Preventative Maintenance
The Engineered Systems Preventative Maintenance (PM) process consists of a mandatory initial Pre-PM activity (for visual inspection, determining parts needed, etc.) as well as at least one onsite task for the actual PM activity (parts replacements at a later date). This CAP details the PM task created by an Oracle Support engineer for standard PM tasks/activities. This CAP is used for both the initial Pre-PM activity and the actual PM activity.

Please note that the PM process was recently updated (July 2016). If you need clarification of the overall PM process, especially as it relates to the responsibilities of Field Engineers, refer to the PM Desk Manual: Doc ID 1803892.1 (ES-PM Handling Engineered Systems Preventative Maintenance Service Requests: GCSEXA, GCSGCH, GCSSRM).

If you are assigned this task, you may assume that Oracle has already communicated the need for Preventative Maintenance on this Engineered System to the primary customer contact specified on the Service Request. You may also assume that the customer has agreed to the Preventative Maintenance service.

Depending on the task, the FSE needs to refer to either the first Action Plan section titled "Pre-PM Activity" (visual inspection, etc.) or the second Action Plan section titled "PM Activity" (for PM parts replacement visits).

Also, for clarification (from the PM Desk Manual), the Field Engineer is responsible for:
- WHAT STATE SHOULD THE SYSTEM BE IN TO BE READY TO PERFORM THE RESOLUTION ACTIVITY?
If the rolling method is used, each node must be shut down one at a time. If the complete shutdown method is used, the entire rack is shut down, typically during a single maintenance window arranged by the customer.
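The difference between the two methods can be sketched as a grouping of nodes into shutdown batches. This is only an illustration of the scheduling idea, not part of the PM procedure: the function name `maintenance_batches`, the method labels, and the node names are all hypothetical.

```python
from typing import List


def maintenance_batches(nodes: List[str], method: str) -> List[List[str]]:
    """Group nodes into the batches that are shut down together.

    Hypothetical helper for illustration only.
    - "rolling":  one node per batch, so the rest of the rack stays up.
    - "complete": a single batch holding every node in the rack.
    """
    if method == "rolling":
        return [[node] for node in nodes]
    if method == "complete":
        return [list(nodes)]
    raise ValueError(f"unknown maintenance method: {method!r}")


if __name__ == "__main__":
    rack = ["node1", "node2", "node3"]  # hypothetical node names
    for label in ("rolling", "complete"):
        print(label, maintenance_batches(rack, label))
```

With the rolling method the number of batches equals the number of nodes, which is why a rolling PM may span several visits or maintenance windows.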
PM Tasks for FSE:

Prior to going on-site
1. If unfamiliar with the PM process, refer to all related documentation and review the PM Desk Manual.
2. Review the overview presentations and technical instructions attached to Oracle knowledge Doc ID 1356432.1 (Exadata Preventive Maintenance Service - Reference Documentation). Ensure you are following the most recently updated version of procedures.
3. Contact the customer to confirm the preferred date/time for the on-site visit. Follow Field "Customer Call Process".

Pre-PM Activity [Action Plan for Initial PM Onsite Visit - for Visual Inspection and Determining Parts Required]
1. Visually inspect all components in the Engineered System rack that is scheduled for PM.
2. Check specifically for any broken Cable Management Arms (CMAs) or cables. Note the quantity and type of any CMAs that need to be replaced.
3. If required, explain that a minimum Exadata image version of 11.2.2.1.1 is required to work with the replacement batteries, so the customer must update the image before the PM activity is scheduled. Refer the customer to Oracle knowledge Doc ID 888828.1 (Exadata Database Machine and Exadata Storage Server Supported Versions) for image patch information.
4. Establish the customer’s preferred maintenance method (rolling or with downtime). For a rolling PM Service, review Oracle knowledge Doc ID 1356432.1 to understand the implications.
Note: the next steps are taken directly from the PM Desk Manual:
5. Obtain and upload a healthcheck bundle for the entire ES rack, if possible.
6. Establish all part number(s) and all part quantity/quantities required to complete PM on the specified ES (physical rack).
7. Discuss, and if possible obtain, the customer’s initial preferred date/time for the PM activity.
8. Do not yet commit to the customer’s preferred date/time; explain that parts availability and ETA must be confirmed by Dispatch.
9. Be aware that in some (exceptional) situations, delivery of standalone Lithium Ion batteries may take longer than expected.
10. Maximum battery lead-times, by location, are shown in supporting document:
11. Complete all sections of the Pre-PM visit Template contained within this action plan. Here is the Template:
If HBA Batteries are due to be replaced during this scheduled PM:
12. Add the completed template to the task as an internal “Action Required” note. Note that performing “Action Required” will close out your Task, so make sure you update it as necessary before using Action Required. Also note that this will change the sub-status of the SR to “Review Task”. This sub-status change will alert the SR owner of your update, which is important.
13A. For reference, Dispatch is now ordering parts; FSEs are no longer doing this.
13B. Contact Dispatch and
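Step 3 of the Pre-PM activity depends on comparing the installed Exadata image version against the minimum of 11.2.2.1.1. A minimal sketch of that comparison follows, assuming the installed version has already been obtained as a dotted numeric string. The helper names `parse_version` and `meets_minimum` and the sample versions are hypothetical. Versions are compared field by field as integers, because plain string comparison would rank "11.2.10.0.0" below "11.2.2.1.1".

```python
from typing import Tuple

# Minimum image version stated in step 3 of the Pre-PM activity.
MIN_IMAGE_VERSION = "11.2.2.1.1"


def parse_version(version: str) -> Tuple[int, ...]:
    """Split a dotted numeric version string into a tuple of integers."""
    return tuple(int(part) for part in version.strip().split("."))


def meets_minimum(installed: str, minimum: str = MIN_IMAGE_VERSION) -> bool:
    """Return True if the installed version is at least the minimum.

    Hypothetical helper: tuples compare field by field, so 11.2.10.x
    correctly ranks above 11.2.2.x.
    """
    return parse_version(installed) >= parse_version(minimum)


if __name__ == "__main__":
    # Sample version strings are illustrative, not taken from a real system.
    for sample in ("11.2.1.3.1", "11.2.2.1.1", "11.2.10.0.0"):
        status = "OK" if meets_minimum(sample) else "image update needed before PM"
        print(f"{sample}: {status}")
```

Doc ID 888828.1 remains the authoritative source for supported versions; the sketch only shows the comparison itself.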
PM Activity [Action Plan for PM Parts replacements]
Note: the specific procedures and commands from the PM Process document that the customer must complete should be shared with the customer for the sole purpose of completing the PM process. The PM Process document as a whole, however, should not be left with the customer.
1. If your customer’s ES is ASR-enabled, ensure that the customer’s system is properly configured for ASR. While on-site, check for any ASR misconfiguration issues and work with the customer to correct them. Refer to Oracle knowledge Doc ID 2103715.1 for details. Carefully document in your debrief notes any ASR misconfiguration that you find, including whether or not you were able to correct it.
2. If systems have ASR enabled, make sure ASR is disabled during PM activities, so that ASR SRs are not generated unintentionally.
3. Initial Exadata systems were not configured to support ASR on the InfiniBand switches. To support ASR on the IB switches with proper entitlement, the external serial number on the physical label needs to be programmed into the switch firmware. Regardless of whether the customer is using ASR now, check that this is configured properly; if it is not, configure the switches so that alerts from the IB switch are possible. Follow Doc ID 1902710.1 to configure it; the procedure is also available in section 4.8 of the PM Process document.
4. With the customer’s assistance, prepare the system(s) for the PM activity.
5. Replace all faulty components.
6. Ensure that all nodes and components come back on-line and are healthy.
7. If this PM used the rolling method, ensure that this Task remains open or that a Copy Task is created for additional work.
8. When all PM activities are complete, alert the SR owner that the PM activity was successful and that it is 100% completed. This is important for tracking purposes.
9. When using Debrief notes, clearly state whether or not the PM was completed successfully.
10. Create a task note, and select “Action Required”. Insert a note like this: “PM Task/Visit was completed successfully and SR can be closed”. Note that performing “Action Required” will close out your Task, so make sure you update it as necessary before using Action Required. Also note that this will change the sub-status of the SR to “Review Task”. This sub-status change will alert the SR owner of your update, which is important.

REFERENCE INFORMATION:
Oracle knowledge Doc ID 1947296.1 (Gathering HW and ASR data on nodes within an ES Rack during Preventive Maintenance activity).
Attachments: This solution has no attachment.