Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-79-2107700.1
Update Date: 2017-10-16
Keywords:

Solution Type  Predictive Self-Healing Sure

Solution  2107700.1 :   SPARC M8 and SPARC M7 Series Servers: iSCSI over IPoIB - CMIOU/eUSB replacement considerations  


Related Items
  • SPARC M7-16
  • SPARC M8-8
  • SPARC M7-8
Related Categories
  • PLA-Support>Sun Systems>SPARC>Enterprise>SN-SPARC: M7




In this Document
Purpose
Scope
Details
References


Applies to:

SPARC M7-8 - Version All Versions and later
SPARC M7-16 - Version All Versions and later
SPARC M8-8 - Version All Versions and later
Information in this document applies to any platform.

Purpose

When using iSCSI over IPoIB, the boot pool is composed of a ZFS mirror of the eUSB disks available in this PDomain.
Other devices, for instance virtual disks from another logical domain, can be added to the boot pool when necessary.

Best practice is to always have two or more devices in the boot pool.

Replacing a CMIOU or an eUSB disk in the domain may affect the boot of the domain, because the replacement CMIOU comes with an empty eUSB, and a replacement eUSB disk is likewise empty.

When accessible, the fallback miniroot image on the SP (also known as trampoline or tboot) can be used if no device/eUSB in the boot pool is valid.

When replacing a CMIOU/eUSB disk, review the configuration to make sure that a valid boot device can still be found in the boot pool.

https://docs.oracle.com/cd/E53394_01/html/E54742/gppfw.html#scrolltoc
SPARC T7 / M7 / M8 Servers : Information about VersaBoot - iSCSI over IPoIB (Doc ID 2094741.1)

Scope

 

Details

The two main scenarios are:

  1. The domain is configured with more than one device in the boot pool. After replacing the CMIOU/eUSB disk:
    1. From OBP, the replaced eUSB disk cannot be used as the boot device.
    2. From OBP, choose one of the other eUSB disks/devices or, as a last resort, use the fallback image.
    3. The boot-device variable should be properly configured anyway, so that a valid device is used.
    4. The empty eUSB joins the existing boot pool once the domain has booted and is synchronized automatically.
  2. The domain is configured with only one device in the boot pool (or all of the devices are replaced). After replacing the CMIOU/eUSB disk, there is no valid device in the boot pool.
    1. In that case, if the domain has access to the SP, the fallback miniroot image can be used to boot (trampoline boot). The boot pool is then rebuilt using the available eUSB disk.
    2. When replacing a CMIOU that is the only CMIOU in a logical domain guest that uses iSCSI over IPoIB (VersaBoot) for booting, and the eUSB disk in that CMIOU is the only disk in the boot pool, you can install the eUSB disk you removed in the new CMIOU.
    3. If the domain does not have access to the SP and the eUSB cannot be transferred from the old to the new CMIOU, no device is available to boot from.
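
The two scenarios above can be read as a small decision procedure. The following is a minimal sketch only: the function name, its parameters and the returned labels are hypothetical, and the order in which the fallback image and the eUSB transfer are tried in scenario 2 is an assumption, since the note lists them as alternatives without ranking them.

```python
def replacement_plan(valid_devices_left: int,
                     has_sp_access: bool,
                     can_transfer_eusb: bool) -> str:
    """Suggest how the domain can boot after a CMIOU/eUSB replacement.

    valid_devices_left: boot-pool devices that still hold a valid boot
    image once the replaced eUSB is excluded (hypothetical input that
    the operator counts before the replacement).
    """
    if valid_devices_left < 0:
        raise ValueError("valid_devices_left must be >= 0")
    # Scenario 1: another device of the boot pool is still bootable.
    if valid_devices_left >= 1:
        return "boot-from-remaining-device"
    # Scenario 2: no valid device is left in the boot pool.
    if has_sp_access:
        # The fallback miniroot image (trampoline boot) is reachable.
        return "boot-fallback-miniroot"
    if can_transfer_eusb:
        # Move the old eUSB disk into the new CMIOU.
        return "transfer-eusb-to-new-cmiou"
    return "no-boot-device"
```

For instance, `replacement_plan(0, False, False)` returns `"no-boot-device"`, which is the situation the planning step is meant to avoid.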

To summarize the possible impacts when replacing a CMIOU/eUSB disk on the respective M7 platforms :

  • M7-8 with one Pdomain :

Reminder  - SPARC M7-8 Server : Product Information Page (Doc ID 1967511.1) :

CMIOU/eUSB to be replaced | Number of eUSB in bpool | LDOM type | Action required | Comment
CMIOU0-CMIOU1 | 2 (eUSB from the 2 CMIOUs) | primary | None | Access to tboot. Other eUSB from CMIOU0-1 available to boot. The eUSB replaced will automatically join the bpool.
CMIOU0-CMIOU1 | 1 | primary | Transfer eUSB disk or use tboot | M7 Supercluster only.
CMIOU[2-7] | 1 or more | primary | None | Access to tboot. Other eUSB from CMIOU0-1 available to boot. The eUSB replaced will automatically join the bpool.
CMIOU[2-7] | 1 | non-primary | Need another device from primary (vdisk) or eUSB from another CMIOU[2-7] in the same ldom (PCIe RC), or transfer the eUSB from the old to the new CMIOU | No access to tboot.
CMIOU[2-7] | 2 or more | non-primary | None | No access to tboot. Other eUSB from CMIOU2-7 available to boot. The eUSB replaced will automatically join the bpool.

 

  • M7-8 with two Pdomains - M7-16 :

Reminder - SPARC M7-16 Server : Product Information Page (Doc ID 1967858.1)

CMIOU/eUSB to be replaced | Number of eUSB in bpool | LDOM type | Action required | Comment
CMIOU0-CMIOU1 | 2 (eUSB from the 2 CMIOUs) | primary | None | Access to tboot. Other eUSB from CMIOU0-1 available to boot. The eUSB replaced will automatically join the bpool.
CMIOU0-CMIOU1 | 1 | primary | Transfer eUSB disk or use tboot | M7 Supercluster only.
CMIOU2-CMIOU3 | 1 or more | primary | None | Access to tboot. Other eUSB from CMIOU0-1 available to boot. The eUSB replaced will automatically join the bpool.
CMIOU2-CMIOU3 | 1 | non-primary | Need another device from primary (vdisk) or eUSB from another CMIOU[2-3] in the same ldom (PCIe RC), or transfer the eUSB from the old to the new CMIOU | No access to tboot.
CMIOU2-CMIOU3 | 2 | non-primary | None | No access to tboot. Other eUSB from CMIOU2-3 available to boot. The eUSB replaced will automatically join the bpool.

The above applies to each DCU : CMIOU[4-7], CMIOU[8-11], CMIOU[12-15]
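
Because the same rules repeat per DCU, a CMIOU number on an M7-16 can be mapped back to the CMIOU0-CMIOU3 rows of the table. The sketch below assumes four CMIOUs per DCU, as implied by the CMIOU[4-7], CMIOU[8-11] and CMIOU[12-15] groups; the function names are hypothetical.

```python
CMIOUS_PER_DCU = 4  # CMIOU[0-3], CMIOU[4-7], CMIOU[8-11], CMIOU[12-15]


def dcu_index(cmiou: int) -> int:
    """Return the zero-based index of the DCU group holding this CMIOU."""
    if not 0 <= cmiou <= 15:
        raise ValueError("M7-16 CMIOU numbers range from 0 to 15")
    return cmiou // CMIOUS_PER_DCU


def equivalent_row(cmiou: int) -> str:
    """Return the table row label that applies to this CMIOU."""
    dcu_index(cmiou)  # range check only
    offset = cmiou % CMIOUS_PER_DCU
    # The first two CMIOUs of a DCU follow the CMIOU0-CMIOU1 rows,
    # the last two follow the CMIOU2-CMIOU3 rows.
    return "CMIOU0-CMIOU1" if offset < 2 else "CMIOU2-CMIOU3"
```

With this mapping, CMIOU9 belongs to the third group and is handled like the CMIOU0-CMIOU1 rows, while CMIOU14 is handled like the CMIOU2-CMIOU3 rows.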

 

Note about M7 SPARC Supercluster  :

  • The primary domain boots from eUSB.
  • The last root domain boots from eUSB (set up 2 eUSB pools to allow these two domains to boot independently).
  • In the event of total eUSB failure, the primary falls back to the SP trampoline boot image.
  • In the event of total eUSB failure, the last root domain boots from a bpool vdisk provided by the primary domain.
  • All other domains boot from a bpool vdisk via an mpgroup provided by the primary and last root domain.

 

Example where one of the 2 CMIOUs composing a host has been replaced

  • Booting from the empty (replaced eUSB) fails

{c00} ok printenv boot-device

boot-device = /pci@340/pci@2/usb@0/storage@1/disk@0,0:a /pci@345/pci@2/usb@0/storage@1/disk@0,0:a fallback-miniroot disk net
{c00} ok boot /pci@345/pci@2/usb@0/storage@1/disk@0,0:a
Boot device: /pci@345/pci@2/usb@0/storage@1/disk@0,0:a File and args:
Can't open disk label package
Can't open boot device
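
When planning the replacement, it can help to list the entries of boot-device that remain usable once the replaced eUSB is excluded. The sketch below parses a line in the format printed by printenv above; the function names are hypothetical, and it only manipulates text copied from the console, it does not talk to OBP.

```python
from typing import List


def parse_boot_device(line: str) -> List[str]:
    """Split a 'boot-device = ...' line from printenv into its entries."""
    name, sep, value = line.partition("=")
    if not sep or name.strip() != "boot-device":
        raise ValueError("not a boot-device line: %r" % line)
    # Entries are separated by whitespace, in boot order.
    return value.split()


def remaining_candidates(entries: List[str], replaced: str) -> List[str]:
    """Return the entries in boot order, without the replaced device."""
    return [entry for entry in entries if entry != replaced]
```

Applied to the output above with /pci@345/pci@2/usb@0/storage@1/disk@0,0:a as the replaced device, the first remaining candidate is the eUSB under /pci@340, followed by fallback-miniroot.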

  • In this example, the other device (/pci@340/pci@2/usb@0/storage@1/disk@0,0:a) could be used, but the fallback image is booted as a test

{c00} ok boot fallback-miniroot
Boot device: /pci@345/pci@1/pci@0/pci@8/usb@0/storage@2/disk@0 File and args:
SunOS Release 5.11 Version 11.3 64-bit
Copyright (c) 1983, 2015, Oracle and/or its affiliates. All rights reserved.
NOTICE: Configuring iSCSI to access the root filesystem...
NOTICE: successfully copied and retained the boot_archive into memory, rebooting ...

rebooting...
Resetting...
NOTICE: Entering OpenBoot.
...
Boot device: /reboot-memory File and args:
SunOS Release 5.11 Version 11.3 64-bit
Copyright (c) 1983, 2015, Oracle and/or its affiliates. All rights reserved.
NOTICE: Configuring iSCSI to access the root filesystem...
Hostname: pdom3

pdom3 console login:

 

Due to some bugs (fixed in SysFW 9.5.4.a), some aliases are not correct.

To boot from the fallback image without the fix for these bugs (SysFW 9.5.4.a), the "rcdrom" alias must be used instead of fallback-miniroot.
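
The alias choice can be derived from the installed system firmware version. The following is only a sketch: it assumes versions have the form N.N.N with an optional trailing letter (as in 9.5.4.a), and that every version ordered at or after 9.5.4.a contains the fix, which this note does not state explicitly. The function names are hypothetical.

```python
import re

FIXED_IN = "9.5.4.a"  # version quoted in this note as containing the fix


def parse_sysfw(version: str):
    """Turn '9.5.4.a' into a comparable tuple such as (9, 5, 4, 'a')."""
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)(?:\.([a-z]))?", version.strip())
    if match is None:
        raise ValueError("unexpected firmware version: %r" % version)
    major, minor, patch, letter = match.groups()
    # A missing letter sorts before any letter of the same N.N.N level.
    return (int(major), int(minor), int(patch), letter or "")


def fallback_alias(version: str) -> str:
    """Return the OBP alias to use for the fallback image."""
    if parse_sysfw(version) >= parse_sysfw(FIXED_IN):
        return "fallback-miniroot"
    return "rcdrom"
```

When the firmware level is unknown, checking the aliases at the OBP prompt remains the safer approach.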

 

  • The new empty eUSB has joined the pool mirror and becomes bootable.

root@pdom3:~# zpool status bpool
pool: bpool
state: ONLINE
scan: resilvered 85.6M in 8s with 0 errors on Wed Feb 17 09:57:42 2016

config:

NAME STATE READ WRITE CKSUM
bpool ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
c2t0d0 ONLINE 0 0 0
c4t0d0 ONLINE 0 0 0

errors: No known data errors
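
The check performed by eye on the output above (every member of bpool is ONLINE) can be scripted. The sketch below parses text in the format shown by zpool status; it is an illustration with hypothetical function names and works on captured output only.

```python
from typing import Dict, List


def parse_zpool_config(text: str) -> Dict[str, str]:
    """Map each name in the config section of zpool status to its state."""
    states: Dict[str, str] = {}
    in_config = False
    for raw in text.splitlines():
        fields = raw.split()
        if fields[:2] == ["NAME", "STATE"]:
            # Header row of the config table; devices follow.
            in_config = True
            continue
        if in_config:
            if not fields:
                break  # a blank line ends the config table
            states[fields[0]] = fields[1]
    return states


def not_online(states: Dict[str, str]) -> List[str]:
    """Return the names whose state is anything other than ONLINE."""
    return sorted(name for name, state in states.items() if state != "ONLINE")
```

An empty list from `not_online` corresponds to the healthy mirror shown above, where both c2t0d0 and c4t0d0 are ONLINE.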

  • It is now possible to boot from the new eUSB, the one that was failing before

root@pdom3:~# init 0
...
{c00} ok boot /pci@345/pci@2/usb@0/storage@1/disk@0,0:a
Boot device: /pci@345/pci@2/usb@0/storage@1/disk@0,0:a File and args:
SunOS Release 5.11 Version 11.3 64-bit
Copyright (c) 1983, 2015, Oracle and/or its affiliates. All rights reserved.
NOTICE: Configuring iSCSI to access the root filesystem...
Hostname: pdom3

pdom3 console login: jack
Password:
Last login: Wed Feb 17 09:58:26 2016 on console
Oracle Corporation SunOS 5.11 11.3 December 2015

PLEASE DO NOT CHANGE THE CONFIGURATION
jack@pdom3:~$ zpool status
pool: bpool
state: ONLINE
scan: resilvered 85.6M in 8s with 0 errors on Wed Feb 17 09:57:42 2016

config:

NAME STATE READ WRITE CKSUM
bpool ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
c2t0d0 ONLINE 0 0 0
c4t0d0 ONLINE 0 0 0

errors: No known data errors

pool: rpool
state: ONLINE
scan: none requested
config:

NAME STATE READ WRITE CKSUM
rpool ONLINE 0 0 0
c0t600144F093355C6E0000566BD7470006d0 ONLINE 0 0 0

errors: No known data errors

root@pdom3:~# bootadm boot-pool list
Boot pool name: bpool
Parameters: eviction_algorithm=lru
Current: /dev/dsk/c2t0d0, /dev/dsk/c4t0d0
Pending: /dev/dsk/c2t0d0, /dev/dsk/c4t0d0
Platform-specified devices excluded:
Platform-specified (auto-added, unless excluded): /dev/dsk/c2t0d0, /dev/dsk/c4t0d0
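
The best practice of keeping two or more devices in the boot pool can be verified from this output. The sketch below parses text in the format printed by bootadm boot-pool list above; the function names and the threshold constant are hypothetical.

```python
from typing import Dict, List

MIN_BOOT_POOL_DEVICES = 2  # best practice stated in this note


def parse_boot_pool_list(text: str) -> Dict[str, str]:
    """Map each 'Key: value' line of the listing to its value."""
    info: Dict[str, str] = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def current_devices(info: Dict[str, str]) -> List[str]:
    """Return the devices listed on the 'Current' line."""
    return [dev.strip() for dev in info.get("Current", "").split(",") if dev.strip()]


def meets_best_practice(info: Dict[str, str]) -> bool:
    """True when the boot pool currently holds at least two devices."""
    return len(current_devices(info)) >= MIN_BOOT_POOL_DEVICES
```

For the listing above, `current_devices` yields /dev/dsk/c2t0d0 and /dev/dsk/c4t0d0, so the best practice is met.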

 

Example - All of the devices originally in the boot pool have been replaced

  • Booting from the empty (replaced eUSB) fails

{c00} ok printenv boot-device
boot-device = /pci@340/pci@2/usb@0/storage@1/disk@0,0:a /pci@345/pci@2/usb@0/storage@1/disk@0,0:a fallback-miniroot disk net

{c00} ok boot /pci@340/pci@2/usb@0/storage@1/disk@0,0:a
Boot device: /pci@340/pci@2/usb@0/storage@1/disk@0,0:a File and args:
Can't open disk label package
Can't open boot device

  • So the only option is to boot from the fallback image

{c00} ok boot fallback-miniroot
NOTICE: Entering OpenBoot.
NOTICE: Fetching Guest MD from HV.
NOTICE: Starting additional cpus.
NOTICE: Initializing LDC services.
NOTICE: Probing PCI devices.
NOTICE: Finished PCI probing.

SPARC M7-16, No Keyboard
Copyright (c) 1998, 2015, Oracle and/or its affiliates. All rights reserved.
OpenBoot 4.38.2, 477.0000 GB memory available, Serial #106805695.
Ethernet address 0:10:e0:5d:b9:cf, Host ID: 865db9bf.

Boot device: /pci@345/pci@1/pci@0/pci@8/usb@0/storage@2/disk@0 File and args:
SunOS Release 5.11 Version 11.3 64-bit
Copyright (c) 1983, 2015, Oracle and/or its affiliates. All rights reserved.
NOTICE: Configuring iSCSI to access the root filesystem...
NOTICE: successfully copied and retained the boot_archive into memory, rebooting ...

rebooting...
Resetting...
NOTICE: Entering OpenBoot.
...
SPARC M7-16, No Keyboard
Copyright (c) 1998, 2015, Oracle and/or its affiliates. All rights reserved.
OpenBoot 4.38.2, 477.0000 GB memory available, Serial #106805695.
Ethernet address 0:10:e0:5d:b9:cf, Host ID: 865db9bf.

Boot device: /reboot-memory File and args:
SunOS Release 5.11 Version 11.3 64-bit
Copyright (c) 1983, 2015, Oracle and/or its affiliates. All rights reserved.
NOTICE: Configuring iSCSI to access the root filesystem...
Hostname: pdom3

root@pdom3:~# bootadm boot-pool list
Boot pool name: bpool
Parameters: eviction_algorithm=lru
Current: /dev/dsk/c2t0d0, /dev/dsk/c4t0d0
Pending: /dev/dsk/c2t0d0, /dev/dsk/c4t0d0
Platform-specified devices excluded:
Platform-specified (auto-added, unless excluded): /dev/dsk/c2t0d0, /dev/dsk/c4t0d0
root@pdom3:~# zpool status bpool
pool: bpool
state: ONLINE
scan: none requested
config:

NAME STATE READ WRITE CKSUM
bpool ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
c2t0d0 ONLINE 0 0 0
c4t0d0 ONLINE 0 0 0

errors: No known data errors

  • It is now possible to boot from the eUSB disks, including the one that was failing before

{c00} ok printenv boot-device
boot-device = /pci@340/pci@2/usb@0/storage@1/disk@0,0:a /pci@345/pci@2/usb@0/storage@1/disk@0,0:a fallback-miniroot fallback-miniroot disk net
{c00} ok
{c00} ok boot /pci@340/pci@2/usb@0/storage@1/disk@0,0:a
NOTICE: Entering OpenBoot.
...
SPARC M7-16, No Keyboard
Copyright (c) 1998, 2015, Oracle and/or its affiliates. All rights reserved.
OpenBoot 4.38.2, 477.0000 GB memory available, Serial #106805695.
Ethernet address 0:10:e0:5d:b9:cf, Host ID: 865db9bf.

Boot device: /pci@340/pci@2/usb@0/storage@1/disk@0,0:a File and args:
SunOS Release 5.11 Version 11.3 64-bit
Copyright (c) 1983, 2015, Oracle and/or its affiliates. All rights reserved.
NOTICE: Configuring iSCSI to access the root filesystem...
Hostname: pdom3

root@pdom3:~# bootadm boot-pool list
Boot pool name: bpool
Parameters: eviction_algorithm=lru
Current: /dev/dsk/c2t0d0, /dev/dsk/c4t0d0
Pending: /dev/dsk/c2t0d0, /dev/dsk/c4t0d0
Platform-specified devices excluded:
Platform-specified (auto-added, unless excluded): /dev/dsk/c2t0d0, /dev/dsk/c4t0d0

  • If booting from the fallback image fails as in the following example, try booting again.

Boot device: /pci@345/pci@1/pci@0/pci@8/usb@0/storage@2/disk@0 File and args:
SunOS Release 5.11 Version 11.3 64-bit
Copyright (c) 1983, 2015, Oracle and/or its affiliates. All rights reserved.
NOTICE: Configuring iSCSI to access the root filesystem...
NOTICE: successfully copied and retained the boot_archive into memory, rebooting ...

rebooting...
Resetting...
...
Boot device: /reboot-memory File and args:
SunOS Release 5.11 Version 11.3 64-bit
Copyright (c) 1983, 2015, Oracle and/or its affiliates. All rights reserved.
NOTICE: Configuring iSCSI to access the root filesystem...

Boot device: /reboot-memory File and args:
SunOS Release 5.11 Version 11.3 64-bit
Copyright (c) 1983, 2015, Oracle and/or its affiliates. All rights reserved.
NOTICE: Configuring iSCSI to access the root filesystem...

An inconsistency in the boot archive was detected and the boot archive has been successfully updated. Rebooting

syncing file systems... done
rebooting...
Resetting...
...
Boot device: /reboot-memory@0:nolabel File and args:
ERROR: /reboot-memory@0: No reboot memory segment.

Evaluating:

Can't open boot device

  • Try another boot on the fallback image

{c00} ok boot pci@345/pci@1/pci@0/pci@8/usb@0/storage@2/disk@0
Boot device: /pci@345/pci@1/pci@0/pci@8/usb@0/storage@2/disk@0 File and args:
SunOS Release 5.11 Version 11.3 64-bit
Copyright (c) 1983, 2015, Oracle and/or its affiliates. All rights reserved.
NOTICE: Configuring iSCSI to access the root filesystem...
NOTICE: successfully copied and retained the boot_archive into memory, rebooting ...

rebooting...
Resetting...
...
Boot device: /reboot-memory File and args:
SunOS Release 5.11 Version 11.3 64-bit
Copyright (c) 1983, 2015, Oracle and/or its affiliates. All rights reserved.
NOTICE: Configuring iSCSI to access the root filesystem...
Hostname: pdom3

 

References

SPARC T7 / M7 / M8 Servers : Information about VersaBoot - iSCSI over IPoIB (Doc ID 2094741.1)
SPARC M8 and SPARC M7 Series Servers: Device Paths (Doc ID 2063247.1)
SPARC M7-16 Server : Product Information Page (Doc ID 1967858.1)
SPARC M7-8 Server : Product Information Page (Doc ID 1967511.1)
SPARC M7 Series Servers : Interconnect - EoUSB (Doc ID 2063349.1)

Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.