Solution Type: Predictive Self-Healing Sure Solution

2093674.1: Exalogic Patch Set Update (PSU) Release 2.0.6.2.4 (Linux - Virtual) for January 2016
Applies to:

Oracle Exalogic Elastic Cloud X2-2 Hardware
Exalogic Elastic Cloud X3-2 Hardware
Exalogic Elastic Cloud X4-2 Hardware
Exalogic Elastic Cloud X5-2 Hardware
Oracle Exalogic Elastic Cloud Software - Version 2.0.6.0.0 to 2.0.6.2.4
Linux x86-64
Oracle Virtual Server (x86-64)

Purpose

Oracle Exalogic is an integrated hardware and software system designed to provide a complete platform for a wide range of application types and widely varied workloads. It combines optimized Oracle Fusion Middleware software, such as WebLogic Server, JRockit, and Coherence, with industry-standard Sun server and storage hardware and InfiniBand networking. The purpose of this document is to provide specific information about the January 2016 Patch Set Update (PSU) for that system.

Scope

The target audience of this document is engineers and system administrators who plan to apply the Exalogic PSU. This document provides the following:

- Patch download information and readme documentation
- Prerequisites for applying the PSU
- Known issues, errata, and FAQs
This document will be kept up to date with updates to errata and known issues.

Details

Note:
Patch Download
Released: January 2016
Product Version: 2.0.6.2.4 (on X2-2/X3-2/X4-2/X5-2) for Oracle Exalogic Elastic Cloud infrastructure

PATCH 22136619 - EXALOGIC VIRTUAL 2.0.6.2.4 (on X2-2/X3-2/X4-2/X5-2) PATCH SET UPDATE (PSU) FOR JANUARY 2016

Patch Readme Documentation

Refer to the readme documentation in 22136619-Virtual.zip, attached to this document, for instructions on how to upgrade the Exalogic infrastructure:
The content of 22136619-Virtual.zip is laid out as follows:

22136619-Virtual
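As a minimal sketch only (the staging path below is an arbitrary example, and the exact layout is defined by the archive itself), the readme documentation can be extracted and inspected on a convenient host before patching:

# Hypothetical staging directory; any local path with enough free space will do
mkdir -p /u01/patches/psu-jan2016
cd /u01/patches/psu-jan2016

# Extract the readme documentation archive attached to this note
unzip /path/to/22136619-Virtual.zip

# List the extracted layout and locate the readme files to follow
find 22136619-Virtual -type f | sort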
Prerequisite

Important: Before patching, verify that the cluster heartbeat timeout is set correctly, as documented in Document ID 1995593.1 - Increase O2CB Cluster Heartbeat Timeout on Exalogic Virtual.
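Note 1995593.1 is the authoritative reference for the required value. As an illustrative check only (the parameter name and file location below reflect standard o2cb behavior and are not taken from this note), the current heartbeat threshold on an Oracle VM compute node can typically be inspected as follows:

# Run on each Oracle VM compute node (dom0); compare the value against Note 1995593.1
grep O2CB_HEARTBEAT_THRESHOLD /etc/sysconfig/o2cb

# Show the live o2cb cluster status, including the active heartbeat settings
service o2cb status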
INTERNAL: PCIe Renumbering check on ZS3-ES storage heads in Exalogic X4-2 and X5-2 racks

Refer to the following note, which has information on this important prerequisite check:

<Note 2087741.1>: Patching ZS3-ES Renumbering Check On Exalogic x4-2 And x5-2 Racks
Appendices

Appendix A: Fixed Bugs List

Please review Doc ID 2093677.1: Exalogic Infrastructure January 2016 PSU - Fixed Bugs List.

Appendix B: Patching Known Issues

Compute node patching fails with error "unable to determine total RAM failed"

Symptoms

Compute node patching fails with the error "unable to determine total RAM failed":

ERROR: Wed Jan 6 04:02:08 HST 2016: unable to determine total RAM failed
[2016-01-06T04:02:17.366-10:00] [exapatch] [NOTIFICATION:1] [] [utils] [pid: 569] [tid: Thread-1] [ecid: ] [lineno: 340] ERROR: 10.245.39.46: patching did

Cause

1. The ILOM of the corresponding node has not been updated.
2. Insufficient time was allowed between completion of ILOM patching and the start of compute node patching; an interval of 15-20 minutes is usually required.

Solution/Workaround

Make sure the corresponding ILOM has been patched to version 3.1.x.xx or higher, then verify that the ILOM reports the installed memory correctly:

[root@compute-node]# ipmitool sunoem cli "show /System/Memory installed_memory" | grep GB
installed_memory = 96 GB
If the above command works and returns output similar to the above, retry patching.
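As an illustrative check only (the 3.1.x.xx minimum comes from the solution above, and the exact ILOM CLI output format varies by server model), the ILOM firmware level and the memory reading can both be confirmed from the compute node before retrying exapatch:

# Query the service processor firmware version through the host IPMI interface
[root@compute-node]# ipmitool sunoem cli "version"

# Re-check that the ILOM reports the installed memory (a GB value is expected)
[root@compute-node]# ipmitool sunoem cli "show /System/Memory installed_memory" | grep GB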
Guest vServer patching fails with a timeout error

Symptoms

Guest vServer patching fails with a timeout error, and the following errors are seen in /var/log/boot.log:
init: rc main process (670) killed by TERM signal
Telling INIT to go to single user mode

This results in the vServer hanging, or in its network interfaces not coming up.

Cause

The upstart package (upstart-0.6.5-12.el6_4.1.x86_64) is suspected to have caused a boot issue.
Solution/Workaround
Restart the guest vServer; this should fix the issue.
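If the guest is hung and cannot be restarted from inside the vServer itself, one possible approach, shown here only as a sketch (the vServer name is a placeholder, and the use of the xm toolchain on the hosting Oracle VM Server is an assumption, not part of this note), is to reboot the domain from the compute node that hosts it:

# On the Oracle VM Server (dom0) hosting the affected guest:
# find the domain corresponding to the hung vServer
[root@compute-node]# xm list | grep my_vserver

# reboot the hung domain; fall back to stopping and restarting it from
# Exalogic Control only if a clean reboot does not work
[root@compute-node]# xm reboot my_vserver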
ZFS patching fails due to losing I/O access to the pool

Symptoms

INFO: ZFS-Storage-Head xx.xx.xx.xx successfully completed all pre-patch checks
ERROR: ZFS-Storage-Head xx.xx.xx.xx STOP!! This head in pool exalogic is not online.
Unable to proceed as pre-patch checks failed for active head xx.xx.xx.xx
Additional information may be found in the console output and the log file: /var/log/exapatch_20151119112342.log

Errors similar to the following are observed in the exapatch log:

Head in Active state, continuing...

Cause

During patching, the system was found in a state where the storage pool was owned by the passive (STRIPPED) head:
el01sn01:configuration storage> get
pool = exalogic
status = exported
owner = el01sn01

el01sn01:> configuration cluster show
Properties:
state = AKCS_OWNER          << Active head does not own the pool
description = Active (takeover completed)
peer_asn = 2058f7ae-0247-c86e-c2f1-94f8cd68faa7
peer_hostname = el01sn02
peer_state = AKCS_STRIPPED
peer_description = Ready (waiting for failback)
Children:
resources => Configure resources

Node el01sn02 was READY (PASSIVE), but the storage was online on it:

el01sn02:configuration storage> get          << this is the passive head
pool = exalogic
status = online
errors = 0
owner = el01sn01
profile = mirror
log_profile = log_stripe
cache_profile = cache_stripe
scrub = resilver completed after 0h0m with 0 errors

As a result, all clients lose I/O access to the pool.

Solution/Workaround

Reboot the active head and retry the patching. Refer to <Note 1910623.1>.

____________________________________________________________________

psuSetup.sh failing due to lack of free space

Symptoms

psuSetup.sh fails due to lack of free space. It fails while reserving a share quota of 200M for exalogic-lctools.

[root@compute-node]# ./psuSetup.sh <zfs_ip>
INFO: xx.xx.xx.xxx is a ZFS appliance
INFO: using existing "common" on the ZFS Appliance with IP address

Cause

When the script attempts to set the quota, the requested limit already exceeds the currently available space.

Solution/Workaround

Free up space on the ZFS Storage Appliance (ZFSA). Refer to <Note 2039102.1>.
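Before retrying psuSetup.sh, it can help to confirm how much space is actually available to the share. The following is only a sketch: it assumes exalogic-lctools is a share under the "common" project referenced in the messages above, and the property names should be verified against your appliance software release.

# From the ZFS Storage Appliance CLI, inspect the share's quota and current usage
el01sn01:> shares
el01sn01:shares> select common
el01sn01:shares common> select exalogic-lctools
el01sn01:shares common/exalogic-lctools> get quota
el01sn01:shares common/exalogic-lctools> get space_data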
____________________________________________________________________

PCIe fault HW alert in ZFS post IB firmware update

Symptoms

A PCIe-related hardware fault is reported on the ZFS storage appliance after applying the InfiniBand card firmware. The error message is similar to:

SUNW-MSG-ID: SPX86-8003-QH, TYPE: Fault, VER: 1, SEVERITY: Major
EVENT-TIME: Mon Aug 17 06:34:36 2015
PLATFORM: i86pc, CSN: 1441NML0FE, HOSTNAME: denp04sn02
SOURCE: appliance/kit/akd:default, REV: 1.0
EVENT-ID: d63362ca-89ea-e1e9-90cb-ee2bc152e335
DESC: An Integrated I/O (IIO) non-fatal error in downstream PCIE device has occurred.
AUTO-RESPONSE: The service-required LED on the chassis will be illuminated.
IMPACT: System continues to run with degraded resources.
REC-ACTION: Contact your service provider for proper repair procedures.

Cause

A Hermon IB card firmware upgrade was active on the card.

Solution/Workaround

Implement the workaround described in <Note 2040757.1>.

____________________________________________________________________

Exalogic X2-2 Compute Nodes ILOM Network Settings Lost After PSU Upgrade

Refer to <Note 2114516.1> for details on this known issue.

____________________________________________________________________

Exalogic: ZFS-7320 ILOM Network Settings Lost on X2-2 and X3-2 Racks After PSU Upgrade

Refer to <Note 2115679.1> for details on this known issue.

Exalogic: vServers Stuck In Single User Mode When Applying PSU In Parallel On Multiple vServers

Refer to <Note 2213470.1> for details on this known issue.

Appendix C: Errata

1. Migration of assets managed by Proxy Controllers

The following text in the troubleshooting MOS note 1590392.1, section "Problem: components are listed under both the ProxyControllers(PC)", has been updated.

Original Text

In the Exalogic Control BUI, if it is observed that any of the component asset is listed under both the Proxy Controllers (PC), they need to be migrated to a single PC.

Updated Text

For a given switch (NM2-GW switches, NM2-36p switch), only one proxy controller needs to be managing it. If a particular switch appears to be managed by both PC1 and PC2, it must be migrated to one of the proxy controllers. It does not matter whether it is migrated to PC1 or PC2.

One or more compute nodes may appear in the Managed Assets list of both proxy controllers. For instance, el01cn01.example.com may appear as being managed by both PC1 and PC2. This is expected behavior; no migration is required for the compute nodes.

Other non-switch assets, such as ZFS storage heads and PDUs, may also appear in the Managed Assets list of both proxy controllers. Migration is not required for these assets prior to applying the July 2015 PSU; migration is required only for switches.

To migrate a switch:

1. Log in to the EMOC BUI, navigate to the "Administration" item in the left panel, and find the entries for the 'PC1' and 'PC2' vServers.
2. Select a Proxy Controller, say "PC1", in the left panel.
3. In the center panel, click the "Managed Assets" tab and set the Asset Type Filter to "Network Switches" to get the list of switches managed by the 'PC1' Proxy Controller.
4. Select the switch that you wish to migrate to the other Proxy Controller, "PC2".
5. Click the icon that provides the "Migrate Assets" option.
6. A confirmation dialog appears; click the 'Migrate' button to proceed.
7. Once the migration finishes, a notification pop-up appears at the bottom right corner of the EMOC BUI, confirming the successful migration.
Appendix D: FAQs

Information About Usage Of The Exapatch Force (-f or --force) Option During Exalogic Patching

Refer to <Note 1997466.1> for details.

Guest vServer Patching FAQs

Refer to <Note 2031749.1> for details.

References

<NOTE:2092033.1> - Exalogic Compute Node or Guest vServer PSU Upgrade Using Exapatch Fails With Error "IndexError: list index out of range"
<NOTE:1571367.1> - Exalogic Infrastructure PSU Upgrade - Known Issues
<NOTE:1449226.1> - Exachk Health-Check Tool for Exalogic
<NOTE:2087741.1> - Patching ZS3-ES Renumbering Check On Exalogic x4-2 And x5-2 Racks
<NOTE:1329262.1> - How to Perform a Healthcheck on Exalogic
<NOTE:1995236.1> - Exalogic Patch Set Update (PSU) Release 2.0.6.2.1 (Linux - Virtual) for April 2015
<NOTE:1314535.1> - Exalogic Patch Set Updates (PSU) Master Note
<NOTE:2114516.1> - Exalogic X2-2 Compute Nodes ILOM Network Settings Lost After PSU Upgrade
<NOTE:1530781.1> - Exalogic Infrastructure Physical and Virtual Releases/PSUs - Software and Firmware Version Information
<NOTE:2016700.1> - Exalogic: vServer Creation Takes a Long Time or Fails After a Few Days of Uptime
<NOTE:2115679.1> - Exalogic: ZFS-7320 ILOM Network Settings Lost on X2-2 and X3-2 Racks After PSU Upgrade
<NOTE:2053733.1> - Many oclib (/var/mnt/virtlibs/*) Mounts Seen On EMOC Control VServer After Upgrading To Exalogic April 2015 PSU 2.0.6.2.1 Or Later Versions
<NOTE:2040757.1> - Oracle ZFS Storage Appliance: How To Resolve - An Integrated I/O non-fatal error in downstream PCIE device has occurred
<NOTE:2213470.1> - Exalogic: vServers Stuck In Single User Mode When Applying PSU In Parallel On Multiple vServers