Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-72-2302714.1
Update Date:2018-01-10
Keywords:

Solution Type  Problem Resolution Sure

Solution  2302714.1 :   Preventing an Infiniband Switch from becoming un-bootable (during upgrade for Sun Datacenter Infiniband 36 or during reboot or upgrade for Sun Network QDR InfiniBand Gateway Switch) due to Real Time Clock corruption  


Related Items
  • Exalogic Elastic Cloud X5-2 Eighth Rack
  • Zero Data Loss Recovery Appliance X6 Hardware
  • Sun Datacenter InfiniBand Switch 36
  • Sun Network QDR InfiniBand Gateway Switch
Related Categories
  • PLA-Support>Sun Systems>SAND>Network>SN-SND: Sun Network Infiniband


Preventing an InfiniBand switch from being rendered un-bootable when upgrading from firmware version 2.1.x because of corruption of the Real Time Clock, which prevents the timer in GRUB from counting down and so prevents the firmware from booting.

In this Document
Symptoms
Changes
Cause
Solution


Applies to:

Sun Datacenter InfiniBand Switch 36 - Version All Versions and later
Sun Network QDR InfiniBand Gateway Switch - Version All Versions and later
Zero Data Loss Recovery Appliance X6 Hardware - Version All Versions to All Versions [Release All Releases]
Exalogic Elastic Cloud X5-2 Eighth Rack - Version X5 to X6 [Release X5 to X6]
Information in this document applies to any platform.

Symptoms

If the Real Time Clock is corrupted, the following symptoms will be observed:

  • The "hwclock" system command takes a long time to execute (typically more than 30 seconds).
  • The "hwclock" system command returns an error status with no output, or reports an incorrect time.
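
The symptoms above can be checked with a small wrapper, sketched here under some assumptions: the helper name check_rtc is hypothetical, the 30-second limit comes from the symptom description, and the shell on the switch is assumed to provide "date +%s". The sketch cannot tell whether the reported time is correct, so the output still needs to be compared with a known-good clock.

```shell
# Hypothetical helper (not part of the switch firmware): run a clock
# command, measure how long it takes, and flag the documented symptoms.
# Arguments: $1 = command to run (default: hwclock)
#            $2 = time limit in seconds (default: 30, from the symptom list)
check_rtc() {
    cmd=${1:-hwclock}
    limit=${2:-30}
    start=$(date +%s)
    out=$("$cmd" 2>&1)
    status=$?
    end=$(date +%s)
    elapsed=$((end - start))
    # Suspect the RTC on an error status, empty output, or a slow response.
    if [ "$status" -ne 0 ] || [ -z "$out" ] || [ "$elapsed" -gt "$limit" ]; then
        echo "RTC suspect: status=$status elapsed=${elapsed}s"
        return 1
    fi
    echo "RTC responded in ${elapsed}s: $out"
    return 0
}
```

Called with no arguments as root on the switch, check_rtc runs hwclock with the 30-second limit.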

Changes

 

Cause

The root cause is corruption of the Real Time Clock, which prevents the timer in GRUB from counting down and so prevents the firmware from booting.

Solution

Attached is a patch tool binary file named patch_bug_26678971. Download it to the switch (for example, to /tmp) and execute it without any options. You must be logged in as the root user to download and use the tool.

The correct md5sum of the patch tool is:

  
# md5sum patch_bug_26678971
0d44e269a17716682c17ef3ee6bb0850 patch_bug_26678971
#
The tool recovers a switch that has hit this bug and modifies a config file, /etc/crontab, so that the switch does not hit the bug again on firmware version 2.1.x or on 2.2.x versions up to and including 2.2.7. After downloading the tool, ensure that the binary file has execute permission set.
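
The checksum and permission steps can be combined, as in the following sketch. The function name verify_and_prepare is hypothetical; the expected md5sum is the one listed above, and md5sum, awk and chmod are assumed to be available in the root shell of the switch.

```shell
# md5sum of patch_bug_26678971 as published in this document.
EXPECTED_MD5=0d44e269a17716682c17ef3ee6bb0850

# Hypothetical helper: compare the md5sum of a downloaded file with the
# expected value and set the execute bit only if they match.
# Arguments: $1 = path to the file
#            $2 = expected md5sum (default: EXPECTED_MD5)
verify_and_prepare() {
    file=$1
    expected=${2:-$EXPECTED_MD5}
    actual=$(md5sum "$file" | awk '{print $1}')
    if [ "$actual" != "$expected" ]; then
        echo "md5sum mismatch for $file: got '$actual'" >&2
        return 1
    fi
    chmod +x "$file"
}
```

For example, "verify_and_prepare /tmp/patch_bug_26678971" would leave the file executable only when the checksum matches, after which the tool can be run as shown.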
When you run the tool on a switch that has hit this bug, you will see the following output:
# ./patch_bug_26678971
Fixing RTC
This switch is now successfully patched for bug 26678971.
#
After this, the switch has recovered from the RTC issue and will not hit this bug again.
When you run the tool on a switch that has not hit this bug, you will see the following output:
# ./patch_bug_26678971
This switch is now successfully patched for bug 26678971.
#
  
 
NOTE:
Because this patch tool modifies a config file, running fwverify afterwards will hit bug 26781756 (fwverify fails after running Patch tool for bug 26678971). This failure can be ignored.

Example:

--------
Verifying installed files:
..............................................................................
............................................................. FAILED

* Package nm2-ilom-2.1.7-1.i386:
S.5....T /etc/crontab
--------
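
To confirm that a fwverify failure is only the expected one, the saved output can be filtered for flagged files. This is a sketch only: the function name only_crontab_flagged is hypothetical, and the line format (a flag field followed by an absolute path) is assumed from the example above rather than from a fwverify specification.

```shell
# Hypothetical helper: read saved fwverify output on stdin and succeed
# only if /etc/crontab is the one and only file reported as changed.
# Assumes flagged lines look like "S.5....T /etc/crontab", as in the
# example output; other fwverify line formats are not handled.
only_crontab_flagged() {
    flagged=$(awk 'NF >= 2 && $1 ~ /^[A-Za-z0-9.?]+$/ && $NF ~ /^\// {print $NF}' | sort -u)
    [ "$flagged" = "/etc/crontab" ]
}
```

The function fails when no file is flagged or when any file other than /etc/crontab appears, in which case the fwverify result should not be dismissed as bug 26781756.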

 

IMPORTANT:
Because the config file is rewritten during an upgrade, you will need to re-run the patch tool after a forced reload of, or an upgrade to, any firmware version where this bug is present,
that is, all 2.1.x versions and all 2.2.x versions up to and including 2.2.7.
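
The version rule above can be expressed as a small check. This is a sketch, not an Oracle-supplied tool: the function name is_affected_version is hypothetical, and the version string is assumed to have the form major.minor.patch with an optional build suffix such as "-1" (as in nm2-ilom-2.1.7-1).

```shell
# Hypothetical helper: succeed if the given firmware version is one where
# the patch tool must be re-run (2.1.x, or 2.2.x up to and including 2.2.7).
is_affected_version() {
    # Drop any build suffix such as "-1", then split the rest on dots.
    ver=${1%%-*}
    old_ifs=$IFS
    IFS=.
    set -- $ver
    IFS=$old_ifs
    major=${1:-0}
    minor=${2:-0}
    patch=${3:-0}
    [ "$major" -eq 2 ] || return 1
    if [ "$minor" -eq 1 ]; then
        return 0
    fi
    [ "$minor" -eq 2 ] && [ "$patch" -le 7 ]
}
```

For example, "is_affected_version 2.2.7 && echo re-run the patch tool" prints the reminder, while version 2.2.8 does not.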

 

To recover a switch that has become un-bootable:

  • Connect a USB keyboard to the IB switch, press Enter once, wait 10 seconds, and then press Enter again. The switch will then boot (or upgrade, if the failure occurred during an upgrade), and in 5-7 minutes it will be back online.

Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.