Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-75-2299834.1
Update Date:2018-03-09
Keywords:

Solution Type  Troubleshooting Sure

Solution  2299834.1 :   PCA Serviceability Issue: Non-default Baud Rate Can Cause Unreadable Text on the SP Console  


Related Items
  • Oracle Virtual Compute Appliance X4-2 Hardware
  • Oracle Virtual Compute Appliance X3-2 Hardware
  • Private Cloud Appliance X5-2 Hardware
Related Categories
  • PLA-Support>Sun Systems>x86>Engineered Systems HW>SN-x86: OVCA




In this Document
Purpose
Troubleshooting Steps
References


Applies to:

Private Cloud Appliance X5-2 Hardware - Version All Versions and later
Oracle Virtual Compute Appliance X3-2 Hardware - Version All Versions and later
Oracle Virtual Compute Appliance X4-2 Hardware - Version All Versions and later
Information in this document applies to any platform.

Purpose

Oracle PCA is delivered as an appliance: a complete and controlled system composed of selected hardware and software components.

The PCA software ships with the default baud rate set to 9600 on both the compute and management nodes.  If the default baud rate is changed incorrectly, a serviceability issue can result: the SP console may show unreadable text or may stop responding, and you, or an Oracle field engineer, may be unable to troubleshoot or interact with the node when needed.

Unreadable text on the SP console is commonly a result of baud rate mismatch and is not normally caused by a fault on the motherboard or service processor. 
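
To illustrate why a rate mismatch produces unreadable characters rather than no output at all, here is a minimal Python sketch that models an 8N1 serial line being sampled by a receiver clocked at a different rate. It is an illustrative model only: the function names encode_8n1 and receive are made up for this example, and a real UART also oversamples the line and reports framing errors. It is not part of the PCA or ILOM software.

```python
def encode_8n1(data):
    """Return line levels, one per transmitted bit period (8 data bits, no parity, 1 stop bit)."""
    line = [1]  # an idle serial line is high
    for byte in data:
        line.append(0)                                  # start bit
        line.extend((byte >> i) & 1 for i in range(8))  # data bits, LSB first
        line.append(1)                                  # stop bit
    line.append(1)  # trailing idle
    return line


def receive(line, tx_baud, rx_baud):
    """Decode the line the way a receiver clocked at rx_baud would see it."""
    # Integer time base: one tick = 1 / (2 * tx_baud * rx_baud) seconds.
    tx_bit = 2 * rx_baud  # ticks per transmitted bit
    rx_bit = 2 * tx_baud  # ticks per bit as assumed by the receiver
    end = len(line) * tx_bit

    def level(t):
        index = t // tx_bit
        return line[index] if index < len(line) else 1

    out = bytearray()
    t = 0
    while t < end:
        if level(t) == 1:
            t = (t // tx_bit + 1) * tx_bit  # idle: jump to the next transmitted bit edge
            continue
        # Line is low: treat it as a start bit and sample the middle of each data bit.
        byte = 0
        for i in range(8):
            byte |= level(t + rx_bit + rx_bit // 2 + i * rx_bit) << i
        out.append(byte)
        t += 9 * rx_bit + rx_bit // 2  # move to the middle of the expected stop bit
    return bytes(out)


text = b"login: "
print(receive(encode_8n1(text), 9600, 9600))    # matching rates
print(receive(encode_8n1(text), 115200, 9600))  # sender faster than receiver
```

With matching rates the bytes come back unchanged. With mismatched rates the receiver samples the wrong points on the line and decodes different bytes, which is what appears as unreadable text on the console.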

The default baud rate for the server node components is set at four locations:

#1 - ILOM

-> show /SP/serial/host speed
   speed = 9600

#2 - ILOM

-> show /SP/serial/external speed
   speed = 9600

Note:
Because the serial port owner (the owner property at /SP/serial/portsharing) is set to SP, the external baud rate setting here is irrelevant.
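
If you capture the output of the two ILOM "show" commands above (for example by pasting it into a file), a short script can pull out the speed value for comparison. The following is a sketch only: parse_ilom_speed is a hypothetical helper name, and it assumes the output has the "speed = <number>" form shown above.

```python
import re


def parse_ilom_speed(show_output):
    """Extract the numeric value from a 'speed = N' line in captured ILOM output."""
    match = re.search(r"^\s*speed\s*=\s*(\d+)\s*$", show_output, re.MULTILINE)
    if match is None:
        raise ValueError("no 'speed = <number>' line found in the captured output")
    return int(match.group(1))


captured = "-> show /SP/serial/host speed\n   speed = 9600\n"
print(parse_ilom_speed(captured))
```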
  

#3 - BIOS

Depending on the version of the node hardware, at one of these locations:

Advanced

  Remote Access Configuration
    Serial Port Mode
      9600,8,n,1

or

Advanced
  Serial Port Console Redirection
    Serial Port Mode
      9600,8,n,1

  
Tip: If you need to check the BIOS baud rate setting while the node is up, you can do so from the ILOM:

[root@ovcamn05r1 ~]# ssh 192.168.4.xxx
Password:

Hostname: ILOM-ovcacn13r1

-> cd /System/BIOS/Config
/System/BIOS/Config

-> set dump_uri=sftp://root@192.168.4.3/tmp/ovcacn13r1.xml
Enter remote user password: ********
Dump successful.

-> exit
Connection to 192.168.4.xxx closed.

[root@ovcamn05r1 ~]# grep Bits_per_second /tmp/ovcacn13r1.xml
<Bits_per_second>9600</Bits_per_second>

[root@ovcamn05r1 ~]#
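
As an alternative to grep, the dumped BIOS configuration can be read with an XML parser. The sketch below is hedged: only the Bits_per_second element name comes from the dump shown above. The helper name bios_baud_rates and the wrapper elements in the sample string are assumptions made for illustration, since the full layout of the dump file is not shown in this document.

```python
import xml.etree.ElementTree as ET


def bios_baud_rates(xml_text):
    """Return every Bits_per_second value found anywhere in the dumped XML."""
    root = ET.fromstring(xml_text)
    rates = []
    for element in root.iter("Bits_per_second"):
        value = (element.text or "").strip()
        if value.isdigit():
            rates.append(int(value))
    return rates


# Hypothetical minimal sample; a real dump contains many more elements.
sample = "<BIOS><Serial><Bits_per_second>9600</Bits_per_second></Serial></BIOS>"
print(bios_baud_rates(sample))
```

To use it against a real dump, read the file contents (for example /tmp/ovcacn13r1.xml from the transcript above) into a string and pass that string to the function.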

  
  

 
#4 - The baud rate in PCA controller software is set in the grub.conf (or grub2.conf) file.

Management node

[root@ovcamn05r1 grub]# pwd
/boot/grub
[root@ovcamn05r1 grub]# grep 9600 grub.conf
serial --unit=0 --speed=9600 --word=8 --parity=no --stop=1
serial --unit=0 --speed=9600
serial --unit=0 --speed=9600 --word=8 --parity=no --stop=1
kernel /vmlinuz-4.1.12-61.45.1.el6uek.x86_64 ro root=/dev/mapper/vg00-lvol00 rd_NO_LUKS LANG=en_US.UTF-8 KEYTABLE=us console=tty1 console=ttyS0,9600n8 rd_NO_MD rd_LVM_LV=vg00/lvol00 console=ttyS0 crashkernel=auto SYSFONT=latarcyrheb-sun16 rd_NO_DM xsvnic.xsvnic_havnic=0
kernel /vmlinuz-2.6.32-642.11.1.el6.x86_64 ro root=/dev/mapper/vg00-lvol00 rd_NO_LUKS LANG=en_US.UTF-8 KEYTABLE=us console=tty1 console=ttyS0,9600n8 rd_NO_MD rd_LVM_LV=vg00/lvol00 console=ttyS0 crashkernel=auto SYSFONT=latarcyrheb-sun16 rd_NO_DM xsvnic.xsvnic_havnic=0

Compute node (if grub)

ovcacn07r1 login: root
Password:
Last login: Tue Aug 22 09:09:18 from ovcamn06r1
Warning: making manual modifications in the management domain
might cause inconsistencies between Oracle VM Manager and the server.
[root@ovcacn07r1 ~]# cd /boot/grub
[root@ovcacn07r1 grub]# grep 9600 grub.conf
kernel /xen.gz console=com1,vga com1=9600,8n1 dom0_mem=5888M dom0_vcpus_pin dom0_max_vcpus=16 allowsuperpage dom0_vcpus_pin dom0_max_vcpus=20 crashkernel=256M@64M
kernel /xen.gz console=com1,vga com1=9600,8n1 dom0_mem=5888M dom0_vcpus_pin dom0_max_vcpus=16 allowsuperpage dom0_vcpus_pin dom0_max_vcpus=20 crashkernel=256M@64M

Compute node (if grub2)

Oracle VM server release 3.4.4
Kernel 4.1.12-103.9.6.el6uek.x86_64 on an x86_64

ovcacn13r1 login: root
Password:
Last login: Fri Mar 9 14:05:18 on hvc0
Warning: making manual modifications in the management domain
might cause inconsistencies between Oracle VM Manager and the server.

[root@ovcacn13r1 grub2]# head -6 /boot/grub2/menu.lst
#
# DO NOT EDIT THIS FILE
#
# It is automatically generated by grub2-mkconfig using templates
# from /etc/grub.d and settings from /etc/default/grub
#

[root@ovcacn13r1 ~]# grep 9600 /boot/grub2/menu.lst
### BEGIN /etc/grub.d/00_header ###
serial --speed=115200 --word=8 --parity=no --stop=1
### END /etc/grub.d/00_header ###
multiboot2 /xen.gz placeholder dom0_mem=max:6144M allowsuperpage dom0_vcpus_pin dom0_max_vcpus=20 console=com1 com1=9600,8n1 crashkernel=512M@64M ${xen_rm_opts}
multiboot2 /xen.gz placeholder dom0_mem=max:6144M allowsuperpage dom0_vcpus_pin dom0_max_vcpus=20 console=com1 com1=9600,8n1 crashkernel=512M@64M ${xen_rm_opts}

[root@ovcacn13r1 ~]#
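
A grep for "9600" only finds lines that already use 9600, so a line set to another rate can be missed. The following Python sketch lists every baud rate it finds in the text of a grub configuration, based on the three forms that appear in the examples above ("--speed=", "console=ttyS0,<rate>" and "com1=<rate>"). The helper name grub_baud_rates is hypothetical, and other forms that may exist in a given file are not covered.

```python
import re

BAUD_PATTERNS = [
    re.compile(r"--speed=(\d+)"),          # grub "serial" directive
    re.compile(r"console=ttyS\d+,(\d+)"),  # Linux kernel serial console
    re.compile(r"\bcom\d+=(\d+)"),         # Xen hypervisor serial port
]


def grub_baud_rates(config_text):
    """Return (line_number, baud_rate) pairs for every rate found in the text."""
    found = []
    for line_number, line in enumerate(config_text.splitlines(), start=1):
        for pattern in BAUD_PATTERNS:
            for match in pattern.finditer(line):
                found.append((line_number, int(match.group(1))))
    return found


sample = (
    "serial --speed=115200 --word=8 --parity=no --stop=1\n"
    "multiboot2 /xen.gz placeholder console=com1 com1=9600,8n1 crashkernel=512M@64M\n"
)
for line_number, rate in grub_baud_rates(sample):
    print(line_number, rate)
```

If the function reports more than one distinct rate for a node, the lines disagree and are worth reviewing with Oracle Support.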

From the /SP/console, you should see text during the entire boot sequence, from power on up to and including the login prompt.
Readable text is especially important at these key stages:
   Server BIOS
   LSI BIOS
   grub menu
   Login prompt


Examples

   Server BIOS
   - If directed by Oracle Support, you should be able to enter BIOS Setup via F2 or Ctrl-E

   Mgmt and Compute server nodes look similar to this
   [Screenshot: server BIOS]

   LSI BIOS
   - If directed by Oracle Support, you should be able to enter LSI BIOS via Ctrl-H or Ctrl-Y

   Mgmt Node
   [Screenshot: management node LSI BIOS]

   Compute Node
   [Screenshot: compute node LSI BIOS]

   grub menu
   - If directed by Oracle Support, you should be able to stop and/or edit grub

   Mgmt Node
   [Screenshot: management node grub menu]

   Compute Node
   [Screenshot: compute node grub menu]

   OS launch
   (Shown here to verify that the OS starts and that you can see any errors that are logged)

   Mgmt Node
   [Screenshot: management node OS launch]

   Compute Node
   [Screenshot: compute node OS launch]

   Login prompt
   - If directed by Oracle Support, you should be able to log in

   Mgmt and Compute nodes will both have login prompts similar to this
   [Screenshot: login prompt]

Examples of serviceability issues
You may not see POST errors, node faults, or any other error conditions.
You may not be able to interact with the node via the keyboard when needed.

Completely unreadable text or unknown characters shown
You never see any readable text.  System and LSI BIOS are never shown.
The grub screen is never shown.  The SP console does not respond to any keystrokes.

   [Screenshot: completely unreadable console text]

Partial unreadable text
Some text is readable whereas other text is not.  You may or may not see the system BIOS, LSI BIOS, or grub.
The SP console will start and may respond to the "ESC (" escape sequence.
You may be able to run other SP CLI commands, but when the SP console starts it will show only unreadable text.
The SP console may respond to keystrokes, but unreadable text will be shown.

   [Screenshot: partially unreadable console text]

One line
Only one line of readable text is seen, followed by a blank and/or non-responsive console screen.
The SP console does not respond to any keystrokes.

   [Screenshot: one readable line followed by a blank console]

Troubleshooting Steps

To troubleshoot hardware problems and to see any error messages or fault conditions, the baud rate may need to be reset to the 9600 default.  Service notes follow this section.

It is recommended to keep the appliance at the 9600 default.

If you are directed by Oracle Support to change the default rate to debug or troubleshoot the software, it is recommended to reboot the node at the time of the change and watch it boot via the /SP/console. This confirms that you can see text from node boot, up to and including the login prompt.  It is much easier to verify that the console works at that point than when you are attempting to troubleshoot a hardware fault.

If the baud rate has been changed from the 9600 default, likely scenarios that can cause a baud rate mismatch and serviceability issues include, but are not limited to:

- Selecting "Restore Defaults" from the server node's BIOS screen.  This reverts the serial port baud rate to the 9600 default.

- Servicing the server node system board, such as a FRU replacement.  The BIOS and SP settings may be reset to the 9600 defaults.

- Reflashing or updating the ILOM firmware without retaining the configuration settings.

- Updating the PCA image.  The grub configuration may be overwritten with the 9600 defaults.
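
Each scenario above resets some of the locations listed earlier (ILOM host, ILOM external, BIOS, grub) while others may keep a non-default value. As a rough illustration, the Python sketch below compares the values you have collected against the rate you expect and reports the locations that differ. The function name find_mismatches and the dictionary keys are made up for this example; the values would come from the checks described in the Purpose section.

```python
def find_mismatches(settings, expected=9600):
    """Return the names of the locations whose baud rate differs from the expected rate."""
    return sorted(name for name, speed in settings.items() if speed != expected)


# Example: BIOS defaults were restored, but ILOM and grub were left at 115200.
collected = {
    "ILOM /SP/serial/host": 115200,
    "ILOM /SP/serial/external": 115200,
    "BIOS serial port": 9600,
    "grub": 115200,
}
for location in find_mismatches(collected, expected=115200):
    print("mismatch:", location)
```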


Service notes to change the baud rate to 9600

1. ssh to the node and edit /boot/grub/grub.conf to use 9600.  Use the examples above for reference.
    Note:  There may be several lines in the grub configuration with baud rate settings, for example "serial" and "kernel" lines. All of them need to be set to 9600.

2. stop the node

-> stop /SYS
Are you sure you want to stop /SYS (y/n)? y
Stopping /SYS

3. Wait for the node to power off (usually 30-60 seconds but may be longer)

-> show /SYS power_state
        power_state = Off

4. Power on the node and start SP console again

-> start /SYS
Are you sure you want to start /SYS (y/n)? y
Starting /SYS

-> start /SP/console
Are you sure you want to start /SP/console (y/n)? y

Serial console started. To stop, type ESC (

 
5.  During POST, press F2 or Ctrl-E to enter BIOS Setup.
     [Setup Selected] will display, and BIOS Setup will launch after POST and several Network HBA initialization screens.

   [Screenshot: Setup Selected message]

6. Enter BIOS Setup
    On the "Advanced" page set the Serial Port Speed to 9600

   [Screenshot: serial port speed set to 9600]

7. Select "Save & Exit"
    This will reset the node

   [Screenshot: Save & Exit]

7a. You may see a REBOOT message similar to:

   [Screenshot: reboot warning message]

8. Exit the /SP/console by pressing ESC (
   You will be at the ILOM -> prompt

   [Screenshot: ILOM prompt after exiting the console]

Note:  For the next two steps, the SP CLI may be sluggish when typing or may temporarily hang for several seconds.  This is expected and will eventually return to normal responsiveness.

 9. Power off the node

-> stop /SYS
Are you sure you want to stop /SYS (y/n)? y
Stopping /SYS

10. Check until the node is off (usually 15 to 30 seconds but may be longer)

-> show /SYS power_state
power_state = Off

11. When the node is off change the two ILOM settings

-> set /SP/serial/external pendingspeed=9600
Set 'pendingspeed' to '9600'

-> set /SP/serial/external commitpending=true
Set 'commitpending' to 'true'

-> set /SP/serial/host pendingspeed=9600
Set 'pendingspeed' to '9600'

-> set /SP/serial/host commitpending=true
Set 'commitpending' to 'true'

Note:  If you get a 'console in use' error after the 'set commitpending' command, stop the SP console and then continue with the process.

-> set commitpending=true
Can not change serial settings - the host serial console is in use.

-> stop /SP/console
Are you sure you want to stop /SP/console (y/n)? y

-> set commitpending=true
Set 'commitpending' to 'true'

12. Power on the node and start /SP/console.  You should see readable text from power on through the login prompt.

-> start /SYS
Are you sure you want to start /SYS (y/n)? y
Starting /SYS

-> start /SP/console
Are you sure you want to start /SP/console (y/n)? y

Serial console started. To stop, type ESC (
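
Step 1 above requires every baud rate setting in the grub configuration to be changed together. The Python sketch below shows one way to express that as a text transformation, covering only the three forms that appear in the examples in this document ("--speed=", "console=ttyS0,<rate>" and "com1=<rate>"). The function name set_grub_baud is hypothetical. The function only transforms a string and does not write any file: on grub2 nodes the generated menu file is marked "DO NOT EDIT", so treat this as an illustration of what has to change, not as a supported editing procedure.

```python
import re

# Each pattern captures the text that precedes the baud rate digits.
BAUD_FIELDS = [
    re.compile(r"(--speed=)\d+"),          # grub "serial" directive
    re.compile(r"(console=ttyS\d+,)\d+"),  # Linux kernel serial console
    re.compile(r"(\bcom\d+=)\d+"),         # Xen hypervisor serial port
]


def set_grub_baud(config_text, baud):
    """Return config_text with every recognised baud rate replaced by baud."""
    for pattern in BAUD_FIELDS:
        config_text = pattern.sub(lambda m: m.group(1) + str(baud), config_text)
    return config_text


before = (
    "serial --unit=0 --speed=115200 --word=8 --parity=no --stop=1\n"
    "kernel /xen.gz console=com1,vga com1=115200,8n1 dom0_mem=5888M\n"
)
print(set_grub_baud(before, 9600))
```

Only the digits of the rate are replaced, so the trailing ",8n1" and "n8" parts of each setting are left as they were.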

 


Service notes to change the baud rate from the default to 115200

Note:
Oracle PCA is delivered as an appliance: a complete and controlled system composed of selected hardware and software components.
The default baud rate is 9600.  As an appliance, it should not be changed.
However, for troubleshooting or debugging of PCA software, Oracle Support may recommend that the baud rate be changed from the default 9600 to 115200.

If so, perform the 12 steps above, substituting 115200 for 9600.
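
The substitution applies to the ILOM commands in step 11 as well as to the grub and BIOS settings. As a small sketch, the Python function below prints the four ILOM commands from step 11 for a chosen rate. The function name ilom_baud_commands is hypothetical, and it deliberately accepts only the two rates discussed in this document (9600 and 115200); it generates text to type at the ILOM prompt and does not connect to the SP.

```python
ALLOWED_RATES = (9600, 115200)  # the only rates discussed in this document


def ilom_baud_commands(baud):
    """Return the ILOM CLI commands from step 11 for the given baud rate."""
    if baud not in ALLOWED_RATES:
        raise ValueError("baud rate must be one of %s" % (ALLOWED_RATES,))
    commands = []
    for target in ("/SP/serial/external", "/SP/serial/host"):
        commands.append("set %s pendingspeed=%d" % (target, baud))
        commands.append("set %s commitpending=true" % target)
    return commands


for command in ilom_baud_commands(115200):
    print("->", command)
```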


This document was created from:
SR 3-15543465654: (GCS Collaboration) PCA3: Management node not booting (ovcamn06r1)
SR 3-15543118361: PCA3: Management node not booting (ovcamn06r1)

References

<NOTE:2221745.1> - [ PCA ] How to change serial console baud rate on a Compute Node and ILOM SP

Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.