Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-72-2202721.1
Update Date:2018-03-06
Keywords:

Solution Type  Problem Resolution Sure

Solution  2202721.1 :   Infiniband Switch Stays In Pre-boot Environment During Upgrade/Reboot  


Related Items
  • Oracle Exalogic Elastic Cloud Software
  • Exadata Database Machine X2-2 Hardware
Related Categories
  • PLA-Support>Eng Systems>Exalogic/OVCA>Oracle Exalogic>MW: Exalogic Core




In this Document
Symptoms
Changes
Cause
Solution
References


Created from <SR 3-13529722291>

Applies to:

Oracle Exalogic Elastic Cloud Software - Version 2.0.6.2.161018 and later
Exadata Database Machine X2-2 Hardware - Version All Versions and later
Oracle Solaris on x86-64 (64-bit)
Linux x86-64
Oracle Virtual Server x86-64

Symptoms

When upgrading a gateway switch, the switch boots into the pre-boot environment every time.

Power-cycling the switch several times gives the same result, and running the 'boot' command from the pre-boot environment does not start the application image either.

The following session shows the IB switch stuck in the pre-boot phase:

$ ssh root@172.25.xxx.xx

Sun Data center switch pre-boot environment.

root@172.25.xxx.xx's password:

Sun Data center switch pre-boot environment.

======================================================================
= =
= WARNING: This is pre-boot environment used for system maintenance. =
= Application image is not active!!! =
= =
======================================================================

Do you wish to remain in pre-boot environment?
If you do, please answer 'y' (timeout 10 seconds) [N/y]:N

Trying to start application image ...
init> Connection to 172.25.xxx.xx closed by remote host.
Connection to 172.25.xxx.xx closed.

[wtorres@sopsun ~]$ ssh root@172.25.xxx.xx

Sun Data center switch pre-boot environment.

root@172.25.xxx.xx's password:

Sun Data center switch pre-boot environment.

======================================================================
= =
= WARNING: This is pre-boot environment used for system maintenance. =
= Application image is not active!!! =
= =
======================================================================

Do you wish to remain in pre-boot environment?
If you do, please answer 'y' (timeout 10 seconds) [N/y]:y

To start application image enter command 'boot' at any time.

init> help

boot - start application image
check_app_partition - check if application image is bootable
fwupgrade_ssh - do tasks related to upgrade of application image over SSH
fwupgrade_ftp - do tasks related to upgrade of application image over FTP
fwupgrade_tftp - do tasks related to upgrade of application image over TFTP
fsck - do filesystem check of application image

init> check_app_partition
Doing filesystem check ...
e2fsck 1.39 (29-May-2006)
Superblock last mount time is in the future. Fix? no

/dev/sda5 contains a file system with errors, check forced.
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
/dev/sda5: 15915/110592 files (2.8% non-contiguous), 404423/441116 blocks
Everything looks OK.
init>
init> boot
init>

Sun Data center switch pre-boot environment.

(none) login: root
Password:

Sun Data center switch pre-boot environment.

======================================================================
= =
= WARNING: This is pre-boot environment used for system maintenance. =
= Application image is not active!!! =
= =
======================================================================

Do you wish to remain in pre-boot environment?
If you do, please answer 'y' (timeout 10 seconds) [N/y]:N

Trying to start application image ...
init>
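
When collecting session logs from several switches, it can help to flag the ones that show this symptom. The following is a minimal Python sketch, not an Oracle tool: it assumes the login output has already been captured as text on an admin workstation, and the function name in_preboot and the marker list are illustrative choices based only on the banner shown above.

```python
# Marker strings taken from the pre-boot banner in the session above.
PREBOOT_MARKERS = (
    "pre-boot environment",
    "Application image is not active",
)


def in_preboot(session_text: str) -> bool:
    """Return True if captured login output contains the pre-boot banner."""
    text = session_text.lower()
    return any(marker.lower() in text for marker in PREBOOT_MARKERS)


# Sample lines copied from the banner in this note.
sample = """Sun Data center switch pre-boot environment.
= WARNING: This is pre-boot environment used for system maintenance. =
= Application image is not active!!! =
"""

print(in_preboot(sample))
```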

Changes

 This is a fresh installation of an Exalogic rack. The issue can also occur on an existing switch during a reboot.

Cause

In general, accessing the switch (ssh to its IP address) while an upgrade is in progress does no harm. At worst, you land in the pre-boot environment and have to start the application image manually with the boot command. Accessing the switch or logging in during an upgrade cannot interrupt the upgrade or damage the image. Power-cycling the switch during an upgrade (for example, unplugging the power cords) can interrupt the image install (for example, an interrupted write to disk), and doing so is bad practice. Do not power-cycle the switch after a firmware upgrade until you can log in to the ILOM CLI and confirm that the upgrade has completed.

In some cases the GW switch is stuck in pre-boot because the hardware clock is wrong. The hardware clock can also drift out of sync if the battery is bad.
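
One way to judge whether the clock is "wrong" is to compare the date printed by the switch with a trusted reference time. The sketch below is a hedged illustration in Python, meant to run on an admin workstation: it assumes the date output has the same layout as the example in this note ("Wed Oct 26 13:11:00 UTC 2016") and is in UTC. The function names and the five-minute tolerance are assumptions, not values from Oracle.

```python
from datetime import datetime, timedelta, timezone

# Layout of the date line as it appears in this note; assumed to be UTC.
SWITCH_DATE_FORMAT = "%a %b %d %H:%M:%S UTC %Y"


def parse_switch_date(line: str) -> datetime:
    """Parse a date line copied from the switch prompt into an aware datetime."""
    parsed = datetime.strptime(line.strip(), SWITCH_DATE_FORMAT)
    return parsed.replace(tzinfo=timezone.utc)


def clock_is_behind(switch_line: str, reference: datetime,
                    tolerance: timedelta = timedelta(minutes=5)) -> bool:
    """Return True if the switch clock lags the reference by more than tolerance."""
    return reference - parse_switch_date(switch_line) > tolerance


# Reference time taken from a trusted source (hypothetical value for the demo).
reference = datetime(2016, 10, 26, 13, 11, tzinfo=timezone.utc)
print(clock_is_behind("Thu Jan  1 00:03:12 UTC 1970", reference))
```

A clock that reads far in the past is consistent with the "Superblock last mount time is in the future" message that e2fsck printed in the session above.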

Solution

  • Check the battery status on the switch from an ILOM snapshot. If the battery is bad, address that first, before adjusting the hardware clock as described in the steps below.

Here are the instructions to recover (for firmware 2.2.2). Follow these steps. If you see output other than what is described below, do not proceed; contact support and provide the error message (e.g. if the clock is correct in step 2, do NOT proceed to step 3):

1) Log in to the pre-boot menu, answering y to the question:

root@ password:

Sun Data center switch pre-boot environment.

======================================================================
= =
= WARNING: This is pre-boot environment used for system maintenance. =
= Application image is not active!!! =
= =
======================================================================

Do you wish to remain in pre-boot environment?
If you do, please answer 'y' (timeout 10 seconds) [N/y]

2) When you get the prompt, check the current date:

init> /bin/busybox date

If the clock shows a time in the past, proceed to set it in the next step.

3) Set the clock with the date command, using the correct current time. For example:

init> /bin/busybox date "2016-10-26 13:11"
Wed Oct 26 13:11:00 UTC 2016

4) Set the hardware clock from the date you just set:

init> /bin/busybox hwclock -w

5) Run the filesystem check (only on /dev/sda1 and /dev/sda5), repeating it, typically twice, until the output shows the "clean" messages seen below:

init> fsck -yv /dev/sda1

e2fsck 1.39 (29-May-2006)
e2fsck 1.39 (29-May-2006)
fsck: Device or resource busy while trying to open /dev/sda
/dev/sda1 has been mounted 37 times without being checked, check forced.
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information

38 inodes used (0.63%)
3 non-contiguous inodes (7.9%)
# of inodes with ind/dind/tind blocks: 6/3/0
7946 blocks used (33.02%)
0 bad blocks
0 large files

23 regular files
5 directories
0 character device files
0 block device files
0 fifos
0 links
1 symbolic link (1 fast symbolic link)
0 sockets
--------
29 files
init> fsck -yv /dev/sda1
e2fsck 1.39 (29-May-2006)
/dev/sda1: clean, 38/6024 files, 7946/24064 blocks

init> fsck -yv /dev/sda5
e2fsck 1.39 (29-May-2006)
/dev/sda5: clean, 15877/110592 files, 401808/441116 blocks

6) Run check_app_partition:

init> check_app_partition
Doing filesystem check ...
e2fsck 1.39 (29-May-2006)
/dev/sda5: clean, 15877/110592 files, 401808/441116 blocks
Everything looks OK.

7) Boot the OS with the boot command:

init> boot
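
Two parts of the procedure above are easy to get wrong by hand: typing the date string for step 3, and deciding whether an fsck run in step 5 ended clean. The following Python sketch, intended for an admin workstation and not for the switch itself, illustrates both. The helper names are hypothetical, the date layout is copied from the example in step 3, and the "clean" check simply looks for the "<device>: clean," line shown in the transcripts.

```python
from datetime import datetime, timezone


def date_argument(now: datetime) -> str:
    """Format an aware datetime as the "YYYY-MM-DD HH:MM" string used in step 3 (UTC)."""
    return now.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M")


def fsck_reports_clean(output: str, device: str) -> bool:
    """Return True if the fsck output contains the '<device>: clean,' summary line."""
    prefix = f"{device}: clean,"
    return any(line.strip().startswith(prefix) for line in output.splitlines())


# Output fragments copied from the transcripts in step 5.
first_run = "/dev/sda1 has been mounted 37 times without being checked, check forced."
second_run = "e2fsck 1.39 (29-May-2006)\n/dev/sda1: clean, 38/6024 files, 7946/24064 blocks"

print(date_argument(datetime(2016, 10, 26, 13, 11, 30, tzinfo=timezone.utc)))
print(fsck_reports_clean(first_run, "/dev/sda1"))
print(fsck_reports_clean(second_run, "/dev/sda1"))
```

The string returned by date_argument is meant to be pasted, in double quotes, after /bin/busybox date at the init> prompt. If fsck_reports_clean stays false after repeated runs, the guidance above still applies: stop and contact support.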

 

References

<NOTE:1552811.1> - Exalogic NM2 (IB SWITCH) Upgrade Fails
<NOTE:1268557.1> - Exalogic Elastic Cloud Software Known Issues

Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.