Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-71-1634541.1
Update Date:2017-06-22
Keywords:

Solution Type  Technical Instruction Sure

Solution  1634541.1 :   How to Perform Big Data Appliance Preventive Maintenance Service  


Related Items
  • Big Data Appliance X3-2 Hardware
  • Big Data Appliance X3-2 Full Rack
  • Big Data Appliance X3-2 In-Rack Expansion
  • Big Data Appliance X4-2 Hardware
  • Big Data Appliance X4-2 Full Rack
  • Big Data Appliance X4-2 Starter Rack
  • Big Data Appliance Hardware
  • Big Data Appliance X4-2 In-Rack Expansion
  • Big Data Appliance X3-2 Starter Rack
Related Categories
  • PLA-Support>Sun Systems>Sun_Other>Sun Collections>SN-OTH: x64-CAP VCAP


This note provides the reference documentation required by the Big Data Appliance Preventive Maintenance Service.

In this Document
Goal
Solution
 
 PM Tasks for FSE:
 Prior to going on-site
 Pre-PM Activity [Action Plan for Initial PM Onsite Visit - for Visual Inspection and Determining Parts Required]
 PM Activity [Action Plan for PM Parts replacements]
 
 REFERENCE INFORMATION:
References


Oracle Confidential PARTNER - Available to partners (SUN).
Reason: PM Service CAP

Applies to:

Big Data Appliance X4-2 In-Rack Expansion - Version All Versions and later
Big Data Appliance X4-2 Full Rack - Version All Versions and later
Big Data Appliance X4-2 Starter Rack - Version All Versions and later
Big Data Appliance X3-2 Hardware - Version All Versions and later
Big Data Appliance Hardware - Version All Versions and later
Information in this document applies to any platform.

Goal

 Canned Action Plan to perform Big Data Appliance (BDA) Preventive Maintenance Service

Solution

DISPATCH INSTRUCTIONS:

Follow the normal Dispatch process. However, because this task relates to a Preventive Maintenance activity, do not close it until you get confirmation from the SR owner or from a Field Engineer that the activity has been completed successfully. Under no circumstances should Dispatch close or cancel this task because the customer is unresponsive.

WHAT SKILLS DOES THE ENGINEER NEED:

- BDA (Big Data Appliance) trained

- TIME ESTIMATE: 720 Minutes

- TASK COMPLEXITY: 3

The time estimate above is the total, including both the initial on-site visit and any subsequent visits for parts replacements.


FIELD ENGINEER INSTRUCTIONS:

- PROBLEM OVERVIEW:  Preventative Maintenance

The Engineered Systems Preventative Maintenance (PM) process consists of a mandatory initial Pre-PM activity (for visual inspection and for determining parts needed, etc) as well as at least one onsite task for the actual PM activity (parts replacements at a later date). This CAP details the PM task created by an Oracle Support engineer for standard PM tasks/activities.  This CAP is used for both the initial Pre-PM activity, as well as the actual PM activity.

Please note that the PM Process was recently updated (July 2016). If you need clarification of the overall PM process especially as it relates to responsibilities of Field Engineers, refer to the PM Desk Manual http://deskmanual.oraclecorp.com/html/PRSRM050.htm?version=1#tasks

If you are assigned this task, you may assume Oracle has already communicated the need for Preventative Maintenance to take place on this Engineered System, to the primary customer contact specified on the Service Request. You may also assume that the customer has given their agreement for us to carry out the Preventative Maintenance service.

Depending on the task, the FSE needs to refer to either the first Action Plan section titled "Pre-PM Activity" (visual inspection, etc) or to the second Action Plan section titled "PM Activity" (for PM parts replacement visits).

Also for clarification (from PM Desk Manual), the Field Engineer is responsible for:

  • Undertaking a Pre-PM visit (first mandatory task)
  • Performing and clearly documenting the outcome of the Pre-PM activity (via the Pre-PM template)
  • Establishing all part numbers and all part quantities, required to fully implement the PM activity on the designated system.
  • Where possible, obtaining the customer’s initial preferred date/time for the PM activity.
  • Providing Dispatch with details of all required parts, and (where known) customer preferred date/time for PM activity.
  • Rectifying any additional faults that Domain Engineer analysis of a healthcheck bundle has found and that must be fixed prior to completion of the PM activity (if required, an additional, third task is created and assigned)
  • Performing the actual PM activity (second mandatory task)
  • Providing confirmation that the PM activity has been completed successfully, to the Customer and Domain Engineer.

The following attached articles should be reviewed and followed in order to complete the Big Data Appliance Preventive Maintenance service:

- BDA PM Service Overview (Customer-Ready) - Slide Deck [V2 Last Updated: 23-Apr-2014]
- BDA PM Service Overview (Internal Only) - Slide Deck [V2 Last Updated: 23-Apr-2014]
- BDA PM Field Service Procedures - Document [V6 Last Updated: 27-Mar-2017]

 

- WHAT STATE SHOULD THE SYSTEM BE IN TO BE READY TO PERFORM THE RESOLUTION ACTIVITY?

If the rolling method is used, then each node will need to be shut down one at a time. 

If the complete shutdown method is used, then the entire rack is shut down, typically during a single maintenance window arranged by the customer.
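
As a minimal sketch of the difference between the two methods, the hypothetical helper below groups node names into the batches that would be powered down together. The function name, the method labels ("rolling", "complete") and the node names are illustrative assumptions and are not part of any Oracle tool; the actual shutdown procedures are in the MOS Notes referenced in this document.

```python
from typing import List


def shutdown_batches(nodes: List[str], method: str) -> List[List[str]]:
    """Group nodes into the batches that are shut down together.

    "rolling"  -> one node per batch, handled one at a time.
    "complete" -> a single batch containing the entire rack.
    The method labels are assumptions made for this sketch.
    """
    if method == "rolling":
        return [[node] for node in nodes]
    if method == "complete":
        return [list(nodes)] if nodes else []
    raise ValueError(f"unknown maintenance method: {method!r}")


if __name__ == "__main__":
    # Hypothetical node names, for illustration only.
    rack = ["node01", "node02", "node03"]
    for batch in shutdown_batches(rack, "rolling"):
        print("shut down together:", batch)
```

With the rolling method each batch holds a single node, so the rest of the rack stays in service while that node is worked on; with the complete method the only batch is the whole rack.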

PM Tasks for FSE:

Prior to going on-site

1. If unfamiliar with the PM process, refer to all related documentation and review the PM Desk Manual.

2. Review the overview presentations and technical instructions attached to this document, as listed above and below. Ensure you are following the most recently updated version of procedures.

NOTE: BDA does not support remote battery replacement while the node is online.
Either shut the nodes down on a rolling basis and replace the batteries (detailed in MOS Note 2099858.1), or perform a full shutdown (detailed in MOS Note 1607802.1).

3. Contact the customer to confirm the preferred date/time for the on-site visit. Follow the Field "Customer Call Process".
Refer to the Field Service Engineer Desk Manual: https://stbeehive.oracle.com/content/dav/st/GDMI/Public/FIELD_SERVICE_ENGINEER.htm

 

Pre-PM Activity [Action Plan for Initial PM Onsite Visit - for Visual Inspection and Determining Parts Required]

1. Visually inspect all components in the Engineered System rack that is scheduled for PM.

2. Check specifically for any broken Cable Management Arms (CMAs) or cables. Make a note of quantity and type of CMA if any need to be replaced.

3. Establish the customer’s preferred maintenance method (rolling or with downtime).

Note, the next steps are taken directly from the PM Desk Manual:

4. Obtain and upload healthcheck bundle for entire ES rack, if possible.

5. Establish all part number(s) and all part quantity/quantities, required to complete PM on the specified ES (physical rack).

6. Discuss, and if possible obtain, customer’s initial preferred date/time for the PM activity.

7. Do not yet commit to the customer’s preferred date/time – explain parts availability and ETA must be confirmed by Dispatch.

8. Be aware that in some (exceptional) situations, delivery of standalone Lithium Ion batteries may take longer than expected.
    In particular, if (despite material steps by Logistics Planning to pre-position stock) insufficient batteries are available
    in-country to support PM activity for this customer, parts may need to be moved from the regional warehouse, via surface
    transportation.

9. Maximum battery lead-times, by location, are shown in supporting document:
    Lithium Ion Battery Surface Transportation Lookup Table
    This information may help the customer decide on a realistic preferred date/time for the PM activity (maintenance window
    should not be sooner than parts can be delivered). Do not share the document with the Customer.

10. Complete all sections of the Pre-PM visit Template contained within this action plan.

Here is the Template:


----- TEMPLATE BEGIN -----
Account Name:
Rack SN:
Product (e.g., BDA):
SR #:
FE that performed the Pre-PM activity:
FE email address:
Should this same FE be assigned to the actual PM Task? [Yes, or TBD]
Date when customer has agreed to begin the PM Service:

If HBA Batteries are due to be replaced during this scheduled PM:
• Quantity of Batteries needed:
• Part number of Batteries:


Other HW issues observed? [Yes, No]
If yes, list part descriptions, quantities, and part numbers.
----- TEMPLATE END -----
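
For engineers who prefer to fill in the template programmatically, the following is a minimal sketch that renders the same fields from a small data structure. The class name, field names and all sample values are hypothetical placeholders; only the template labels come from this action plan.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PrePmReport:
    """Holds the Pre-PM template fields (names are illustrative)."""
    account_name: str
    rack_sn: str
    product: str
    sr_number: str
    fe_name: str
    fe_email: str
    same_fe_for_pm: str          # "Yes" or "TBD"
    agreed_start_date: str
    battery_quantity: int = 0
    battery_part_number: str = ""
    other_hw_issues: List[str] = field(default_factory=list)

    def render(self) -> str:
        """Return the completed template as plain text."""
        issues_flag = "Yes" if self.other_hw_issues else "No"
        lines = [
            "----- TEMPLATE BEGIN -----",
            f"Account Name: {self.account_name}",
            f"Rack SN: {self.rack_sn}",
            f"Product (e.g., BDA): {self.product}",
            f"SR #: {self.sr_number}",
            f"FE that performed the Pre-PM activity: {self.fe_name}",
            f"FE email address: {self.fe_email}",
            "Should this same FE be assigned to the actual PM Task? "
            f"[Yes, or TBD]: {self.same_fe_for_pm}",
            "Date when customer has agreed to begin the PM Service: "
            f"{self.agreed_start_date}",
            "",
            "If HBA Batteries are due to be replaced during this scheduled PM:",
            f"• Quantity of Batteries needed: {self.battery_quantity}",
            f"• Part number of Batteries: {self.battery_part_number}",
            "",
            f"Other HW issues observed? [Yes, No]: {issues_flag}",
        ]
        # Each entry should carry description, quantity and part number.
        lines.extend(self.other_hw_issues)
        lines.append("----- TEMPLATE END -----")
        return "\n".join(lines)


if __name__ == "__main__":
    # All values below are placeholders, not real customer data.
    report = PrePmReport(
        account_name="Example Corp", rack_sn="RACK-SN-PLACEHOLDER",
        product="BDA", sr_number="SR-PLACEHOLDER",
        fe_name="A. Engineer", fe_email="a.engineer@example.com",
        same_fe_for_pm="TBD", agreed_start_date="TBD",
        battery_quantity=18, battery_part_number="PART-NUMBER-PLACEHOLDER",
    )
    print(report.render())
```

The rendered text can then be pasted into the task note as described in this action plan.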

12. Add completed template to the task as an internal “Action Required” note.  Note, performing “Action Required” will close out your Task, so make sure you update it as necessary before using Action Required.  Also note that this will change the sub-status of the SR to “Review Task”. This sub-status change will alert the SR owner of your update, which is important.

13A.  For reference, Dispatch is now ordering parts; FSEs are no longer doing this.

13B.  Contact Dispatch and

  • State that you are calling to provide important details from an onsite Pre-PM inspection visit, to allow planning of Engineered Systems Preventative Maintenance (ES-PM) activity.
  • State all part number(s) and all part quantity/quantities required for the PM activity.
  • State clearly if standalone Lithium Ion batteries are required.
  • State customer's preferred date/time for PM activity (explain that this date/time has not been committed to the customer).
  • State expected duration of the onsite PM activity (will be used by Dispatch to set duration of PM task).
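
The date check implied by steps 7 to 9 (the maintenance window should not be sooner than parts can be delivered) can be sketched as simple date arithmetic. This is an illustration only: the function names, the dates and the 21-day lead time are hypothetical, and the real maximum lead time must be taken from the Lithium Ion Battery Surface Transportation Lookup Table and confirmed by Dispatch.

```python
from datetime import date, timedelta


def earliest_pm_date(order_date: date, lead_time_days: int) -> date:
    """Earliest date on which the parts can be assumed to be on site."""
    return order_date + timedelta(days=lead_time_days)


def window_is_realistic(preferred: date, order_date: date,
                        lead_time_days: int) -> bool:
    """True if the customer's preferred date is not sooner than delivery."""
    return preferred >= earliest_pm_date(order_date, lead_time_days)


if __name__ == "__main__":
    # Hypothetical values; the lead time is NOT taken from the lookup table.
    order = date(2017, 6, 1)
    lead_time = 21
    preferred = date(2017, 6, 20)
    print("earliest PM date:", earliest_pm_date(order, lead_time))
    print("preferred date realistic:",
          window_is_realistic(preferred, order, lead_time))
```

A preferred date that fails this check is a prompt to discuss a later maintenance window with the customer, without committing to any date before Dispatch has confirmed parts availability and ETA.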

PM Activity [Action Plan for PM Parts replacements]

Note: the specific procedures and commands from the PM Process document that the customer is required to complete should be shared with the customer for the sole purpose of completing the PM process. The PM Process document as a whole, however, should not be left with the customer.

1. If your customer’s ES is ASR-enabled, ensure that the customer’s system is properly configured for ASR.  While on-site, check for any ASR misconfiguration issues and work with the customer and attempt to correct them. Refer to Oracle knowledge Doc ID 2103715.1 for details. Carefully document any aspects of ASR misconfiguration that you find in your debrief notes, including whether you were able to correct these or not.

2. If systems have ASR enabled, make sure ASR is disabled during PM activities, so that ASR SRs are not generated unintentionally.

NOTE: BDA does not support remote battery replacement while the node is online.

Either shut the nodes down on a rolling basis and replace the batteries (detailed in MOS Note 2099858.1), or perform a full shutdown (detailed in MOS Note 1607802.1).

3. Initial BDA systems were not configured to support ASR on the InfiniBand (IB) switches. To support ASR on the IB switches with proper entitlement, the external serial number on the physical label needs to be programmed into the switch firmware. Regardless of whether the customer currently uses ASR, check that each switch is configured properly and, if it is not, configure it so that the IB switch can send alerts. Refer to Doc ID 1902710.1 for the configuration procedure; it is also available in section 4.4 of the attached PM Process document.

4. With the customer’s assistance, prepare the system(s) for the PM activity.

5. Replace all faulty components.

6. Ensure that all nodes and components come back on-line and are healthy.

7. If this PM used the rolling method, please ensure that this Task remains open or that a Copy Task is created for additional work.

8. When all PM activities are complete, alert the SR owner that the PM activity was successful and that it is 100% completed - this is important for tracking purposes.

9. When using Debrief notes, please clearly state that the PM was completed successfully (or not).

10. Create a task note, and select “Action Required”.  Insert a note like this:

“PM Task/Visit was completed successfully and SR can be closed”.

Note, performing “Action Required” will close out your Task, so make sure you update it as necessary before using Action Required. 

Also note that this will change the sub-status of the SR to “Review Task”.  This sub-status change will alert the SR owner of your update, which is important.
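
The serial number check in step 3 amounts to comparing, for each IB switch, the serial number on the physical label with the value programmed into the firmware. A minimal sketch of that comparison follows. It assumes both values have already been collected by hand; how the firmware value is read and programmed is covered by Doc ID 1902710.1 and is not shown here. The function names, switch names and serial numbers are hypothetical, and ignoring case and surrounding whitespace is an assumption of this sketch.

```python
from typing import Dict, List


def normalize(serial: str) -> str:
    """Ignore surrounding whitespace and letter case (an assumption)."""
    return serial.strip().upper()


def switches_needing_update(label_serials: Dict[str, str],
                            firmware_serials: Dict[str, str]) -> List[str]:
    """Return switches whose firmware serial is missing or differs
    from the serial number on the physical label."""
    mismatched = []
    for switch, label in label_serials.items():
        programmed = firmware_serials.get(switch, "")
        if normalize(programmed) != normalize(label):
            mismatched.append(switch)
    return sorted(mismatched)


if __name__ == "__main__":
    # Hypothetical switch names and serial numbers.
    labels = {"ib-sw1": "SN-AAA111", "ib-sw2": "SN-BBB222"}
    firmware = {"ib-sw1": "sn-aaa111", "ib-sw2": ""}
    print("switches to configure:", switches_needing_update(labels, firmware))
```

Any switch returned by the comparison is a candidate for the configuration procedure in Doc ID 1902710.1.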

REFERENCE INFORMATION:

Note 1947296.1 - Gathering HW and ASR data on nodes within an ES Rack during Preventive Maintenance activity.
Note 1643715.1 - Oracle Big Data Appliance Exachk Health-Check Tool.
Note 1643290.1 - Big Data Appliance (BDA) Battery Check and Replacement Guidelines.
Note 1902710.1 - How to configure Datacenter InfiniBand Switch 36 & QDR InfiniBand Gateway Switches for ASR
Note 2103715.1 - Engineered Systems ASR Configuration Check tool (asrexacheck version 4.x)



 

References

<NOTE:1947296.1> - Gathering HW and ASR data on nodes within an ES Rack during Preventive Maintenance activity
<NOTE:1643715.1> - Oracle Big Data Appliance Exachk Health-Check Tool
<NOTE:1643290.1> - Big Data Appliance (BDA) Battery Check and Replacement Guidelines
<NOTE:1902710.1> - How to configure Datacenter InfiniBand Switch 36 & QDR InfiniBand Gateway Switches for ASR
<NOTE:2103715.1> - Engineered Systems ASR Configuration Check tool (asrexacheck version 4.x)

Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.