Asset ID: 1-79-2107700.1
Update Date: 2017-10-16
Solution Type: Predictive Self-Healing Sure Solution
2107700.1: SPARC M8 and SPARC M7 Series Servers: iSCSI over IPoIB - CMIOU/eUSB replacement considerations
Related Items:
- SPARC M7-16
- SPARC M8-8
- SPARC M7-8
Related Categories:
- PLA-Support>Sun Systems>SPARC>Enterprise>SN-SPARC: M7
Applies to:
SPARC M7-8 - Version All Versions and later
SPARC M7-16 - Version All Versions and later
SPARC M8-8 - Version All Versions and later
Information in this document applies to any platform.
Purpose
When using iSCSI over IPoIB, the boot pool is a ZFS mirror of the eUSB disks available in the PDomain.
Other devices can be added to the boot pool when necessary, for instance virtual disks from another logical domain.
Best practice is to always have two or more devices in the boot pool.
Replacing a CMIOU or an eUSB disk in the domain may affect the boot of the domain, because a replacement CMIOU comes with an empty eUSB disk, and a replacement eUSB disk is likewise empty.
When accessible, the fallback miniroot image on the SP (also known as trampoline or tboot) can be used if no device/eUSB in the boot pool is valid.
Before replacing a CMIOU/eUSB disk, review the configuration to make sure that a valid boot device will still be available in the boot pool.
https://docs.oracle.com/cd/E53394_01/html/E54742/gppfw.html#scrolltoc
SPARC T7 / M7 / M8 Servers : Information about VersaBoot - iSCSI over IPoIB (Doc ID 2094741.1)
Scope
Details
The two main scenarios are:
- The domain is configured with more than one device in the boot pool. After replacing the CMIOU/eUSB disk:
  - From OBP, the replaced eUSB disk cannot be used as boot device.
  - From OBP, choose one of the other eUSB disks/devices or, as a last resort, use the fallback image.
  - The boot-device variable should be properly configured in any case, so that the proper device is used.
  - The empty eUSB disk joins the existing boot pool once the domain has booted and is synchronized automatically.
- The domain is configured with only one device in the boot pool (or all of the devices are replaced). After replacing the CMIOU/eUSB disk, there is no valid device in the boot pool.
  - If the domain has access to the SP, the fallback miniroot image can be used to boot (trampoline boot). The boot pool is then rebuilt using the available eUSB disk.
  - When the CMIOU being replaced is the only CMIOU in a guest logical domain that boots with iSCSI over IPoIB (VersaBoot), and the eUSB disk in that CMIOU is the only disk in the boot pool, the eUSB disk removed from the old CMIOU can be installed in the new CMIOU.
  - If the domain does not have access to the SP and the eUSB disk cannot be transferred from the old CMIOU to the new one, no device is available to boot from.
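The scenarios above can be summarized as a small decision function. This is only an illustrative sketch: the function name, its parameters and the returned strings are hypothetical, and the order in which the fallback image and the eUSB transfer are tried is an assumption (the document lists both without ranking them).

```python
def boot_option_after_replacement(valid_devices_left, sp_access, eusb_transferable):
    """Pick a boot strategy once the CMIOU/eUSB disk has been replaced.

    valid_devices_left: boot pool devices that still hold valid boot data
    sp_access:          True if the domain can reach the fallback miniroot on the SP
    eusb_transferable:  True if the old eUSB disk can be moved to the new CMIOU
    """
    if valid_devices_left < 0:
        raise ValueError("valid_devices_left must be >= 0")
    if valid_devices_left >= 1:
        # Scenario 1: another eUSB/device in the boot pool is still bootable.
        return "boot from a remaining boot pool device"
    if sp_access:
        # Scenario 2: no valid device, but the SP fallback image is reachable.
        return "boot from the fallback miniroot image"
    if eusb_transferable:
        # Scenario 2, no SP access: reuse the eUSB disk from the old CMIOU.
        return "move the old eUSB disk into the new CMIOU"
    # Scenario 2, worst case described in the text.
    return "no boot device available"


if __name__ == "__main__":
    # Non-primary domain with a single eUSB disk and no SP access.
    print(boot_option_after_replacement(0, False, True))
```

The function only encodes the decision; the tables that follow state which of these inputs apply to each CMIOU slot and domain type.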
To summarize the possible impacts when replacing a CMIOU/eUSB disk on the respective M7 platforms :
Reminder - SPARC M7-8 Server : Product Information Page (Doc ID 1967511.1) :
- The host is composed of at least CMIOU0 and CMIOU1.
- Only CMIOU0 and CMIOU1 are connected to SP0/SPM0 and SP1/SPM0
- The root complex hosting the path to SP and eUSB must be assigned to the primary/control domain
| CMIOU/eUSB to be replaced | Number of eUSB in bpool | LDOM type | Action required | Comment |
|---|---|---|---|---|
| CMIOU0-CMIOU1 | 2 (eUSB from the 2 CMIOUs) | primary | None | Access to tboot. Other eUSB from CMIOU0-1 available to boot. The replaced eUSB will automatically join the bpool. |
| CMIOU0-CMIOU1 | 1 | primary | Transfer eUSB disk or use tboot | M7 Supercluster only. |
| CMIOU[2-7] | 1 or more | primary | None | Access to tboot. Other eUSB from CMIOU0-1 available to boot. The replaced eUSB will automatically join the bpool. |
| CMIOU[2-7] | 1 | non-primary | Need another device from primary (vdisk) or eUSB from another CMIOU[2-7] in the same ldom (PCIe RC), or transfer the eUSB from the old to the new CMIOU | No access to tboot. |
| CMIOU[2-7] | 2 or more | non-primary | None | No access to tboot. Other eUSB from CMIOU2-7 available to boot. The replaced eUSB will automatically join the bpool. |
- M7-8 with two Pdomains - M7-16 :
Reminder - SPARC M7-16 Server : Product Information Page (Doc ID 1967858.1)
- Each host in the M7-8 with two PDomains is composed of one DCU.
- Each host in the M7-16 is composed of one or more of the 4 DCUs.
- Each DCU has at least its first 2 slots (0-1, 4-5, 8-9, 12-13) populated.
- The first 2 slots (0-1, 4-5, 8-9, 12-13) of each DCU are connected to the SP/SPP-SPM
- The root complex hosting the path to SP and eUSB must be assigned to the primary/control domain
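The slot layout in the reminder above can be expressed as simple arithmetic on the CMIOU slot number. The sketch below assumes 4 CMIOU slots per DCU, numbered 0 to 15 as in the ranges quoted in this document; the function names are hypothetical.

```python
SLOTS_PER_DCU = 4      # assumption: CMIOU[0-3], [4-7], [8-11], [12-15]
MAX_SLOT = 15          # highest CMIOU slot number on an M7-16


def dcu_index(cmiou_slot):
    """Return the DCU (0-3) that hosts the given CMIOU slot."""
    if not 0 <= cmiou_slot <= MAX_SLOT:
        raise ValueError("CMIOU slot must be between 0 and 15")
    return cmiou_slot // SLOTS_PER_DCU


def is_sp_connected(cmiou_slot):
    """True for the first 2 slots of each DCU, which are wired to the SP."""
    if not 0 <= cmiou_slot <= MAX_SLOT:
        raise ValueError("CMIOU slot must be between 0 and 15")
    return cmiou_slot % SLOTS_PER_DCU in (0, 1)


if __name__ == "__main__":
    for slot in range(MAX_SLOT + 1):
        print(slot, dcu_index(slot), is_sp_connected(slot))
```

A slot for which `is_sp_connected` returns False never provides a path to the fallback image by itself, which is why the table below lists "No access to tboot" for non-primary domains built on those slots.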
| CMIOU/eUSB to be replaced | Number of eUSB in bpool | LDOM type | Action required | Comment |
|---|---|---|---|---|
| CMIOU0-CMIOU1 | 2 (eUSB from the 2 CMIOUs) | primary | None | Access to tboot. Other eUSB from CMIOU0-1 available to boot. The replaced eUSB will automatically join the bpool. |
| CMIOU0-CMIOU1 | 1 | primary | Transfer eUSB disk or use tboot | M7 Supercluster only. |
| CMIOU2-CMIOU3 | 1 or more | primary | None | Access to tboot. Other eUSB from CMIOU0-1 available to boot. The replaced eUSB will automatically join the bpool. |
| CMIOU2-CMIOU3 | 1 | non-primary | Need another device from primary (vdisk) or eUSB from another CMIOU[2-3] in the same ldom (PCIe RC), or transfer the eUSB from the old to the new CMIOU | No access to tboot. |
| CMIOU2-CMIOU3 | 2 | non-primary | None | No access to tboot. Other eUSB from CMIOU2-3 available to boot. The replaced eUSB will automatically join the bpool. |
The above applies to each DCU : CMIOU[4-7], CMIOU[8-11], CMIOU[12-15]
Note about M7 SPARC Supercluster :
- The primary domain boots from eUSB,
- The last root domain boots from eUSB (set up 2 eUSB pools to allow these two domains to boot independently),
- In the event of total eUSB failure, the primary falls back to the SP trampoline boot image,
- In the event of total eUSB failure, the last root domain boots from a bpool vdisk provided by the primary domain,
- All other domains boot from a bpool vdisk via an mpgroup provided by the primary and last root domain,
Example where one of the 2 CMIOUs composing a host has been replaced
- Booting from the empty (replaced) eUSB fails
{c00} ok printenv boot-device
boot-device = /pci@340/pci@2/usb@0/storage@1/disk@0,0:a /pci@345/pci@2/usb@0/storage@1/disk@0,0:a fallback-miniroot disk net
{c00} ok boot /pci@345/pci@2/usb@0/storage@1/disk@0,0:a
Boot device: /pci@345/pci@2/usb@0/storage@1/disk@0,0:a File and args:
Can't open disk label package
Can't open boot device
- In this example, the other device (/pci@340/pci@2/usb@0/storage@1/disk@0,0:a) could be used, but the fallback image is booted as a test
{c00} ok boot fallback-miniroot
Boot device: /pci@345/pci@1/pci@0/pci@8/usb@0/storage@2/disk@0 File and args:
SunOS Release 5.11 Version 11.3 64-bit
Copyright (c) 1983, 2015, Oracle and/or its affiliates. All rights reserved.
NOTICE: Configuring iSCSI to access the root filesystem...
NOTICE: successfully copied and retained the boot_archive into memory, rebooting ...
rebooting...
Resetting...
NOTICE: Entering OpenBoot.
...
Boot device: /reboot-memory File and args:
SunOS Release 5.11 Version 11.3 64-bit
Copyright (c) 1983, 2015, Oracle and/or its affiliates. All rights reserved.
NOTICE: Configuring iSCSI to access the root filesystem...
Hostname: pdom3
pdom3 console login:
Due to some bugs (fixed in SysFW 9.5.4.a), some aliases are not correct.
Without that fix, the "rcdrom" alias must be used instead of fallback-miniroot to boot the fallback image.
- The new empty eUSB has joined the pool mirror and becomes bootable.
root@pdom3:~# zpool status bpool
pool: bpool
state: ONLINE
scan: resilvered 85.6M in 8s with 0 errors on Wed Feb 17 09:57:42 2016
config:
NAME STATE READ WRITE CKSUM
bpool ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
c2t0d0 ONLINE 0 0 0
c4t0d0 ONLINE 0 0 0
errors: No known data errors
- It is now possible to boot from the new eUSB, the one that was failing before
root@pdom3:~# init 0
...
{c00} ok boot /pci@345/pci@2/usb@0/storage@1/disk@0,0:a
Boot device: /pci@345/pci@2/usb@0/storage@1/disk@0,0:a File and args:
SunOS Release 5.11 Version 11.3 64-bit
Copyright (c) 1983, 2015, Oracle and/or its affiliates. All rights reserved.
NOTICE: Configuring iSCSI to access the root filesystem...
Hostname: pdom3
pdom3 console login: jack
Password:
Last login: Wed Feb 17 09:58:26 2016 on console
Oracle Corporation SunOS 5.11 11.3 December 2015
PLEASE DO NOT CHANGE THE CONFIGURATION
jack@pdom3:~$ zpool status
pool: bpool
state: ONLINE
scan: resilvered 85.6M in 8s with 0 errors on Wed Feb 17 09:57:42 2016
config:
NAME STATE READ WRITE CKSUM
bpool ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
c2t0d0 ONLINE 0 0 0
c4t0d0 ONLINE 0 0 0
errors: No known data errors
pool: rpool
state: ONLINE
scan: none requested
config:
NAME STATE READ WRITE CKSUM
rpool ONLINE 0 0 0
c0t600144F093355C6E0000566BD7470006d0 ONLINE 0 0 0
errors: No known data errors
root@pdom3:~# bootadm boot-pool list
Boot pool name: bpool
Parameters: eviction_algorithm=lru
Current: /dev/dsk/c2t0d0, /dev/dsk/c4t0d0
Pending: /dev/dsk/c2t0d0, /dev/dsk/c4t0d0
Platform-specified devices excluded:
Platform-specified (auto-added, unless excluded): /dev/dsk/c2t0d0, /dev/dsk/c4t0d0
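The check done by eye in this example (both eUSB disks ONLINE in the bpool mirror) can be scripted. The following is a minimal sketch that parses the text printed by `zpool status bpool` as shown above; it assumes the output of a single pool in that exact layout, and the function names are hypothetical.

```python
SAMPLE_STATUS = """\
  pool: bpool
 state: ONLINE
  scan: resilvered 85.6M in 8s with 0 errors on Wed Feb 17 09:57:42 2016
config:
        NAME        STATE     READ WRITE CKSUM
        bpool       ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            c2t0d0  ONLINE       0     0     0
            c4t0d0  ONLINE       0     0     0
errors: No known data errors
"""


def parse_zpool_devices(status_text, pool="bpool"):
    """Return {device: state} for the leaf devices listed under 'config:'."""
    devices = {}
    in_config = False
    for raw in status_text.splitlines():
        fields = raw.split()
        if not fields:
            continue
        if fields[0] == "NAME":          # header row of the device table
            in_config = True
            continue
        if fields[0] == "errors:":       # end of the device table
            in_config = False
            continue
        if in_config and len(fields) >= 2:
            name, state = fields[0], fields[1]
            # Skip the pool itself and the mirror vdev; keep only the disks.
            if name == pool or name.startswith("mirror"):
                continue
            devices[name] = state
    return devices


def boot_pool_is_redundant(status_text, pool="bpool"):
    """True when at least two boot pool disks are ONLINE."""
    states = parse_zpool_devices(status_text, pool)
    return sum(1 for s in states.values() if s == "ONLINE") >= 2


if __name__ == "__main__":
    print(parse_zpool_devices(SAMPLE_STATUS))
    print(boot_pool_is_redundant(SAMPLE_STATUS))
```

Running such a check after the resilver completes is one way to confirm that the replaced eUSB has joined the mirror before relying on it as a boot device.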
Example - All of the devices originally in the boot pool have been replaced
- Booting from the empty (replaced) eUSB fails
{c00} ok printenv boot-device
boot-device = /pci@340/pci@2/usb@0/storage@1/disk@0,0:a /pci@345/pci@2/usb@0/storage@1/disk@0,0:a fallback-miniroot disk net
{c00} ok boot /pci@340/pci@2/usb@0/storage@1/disk@0,0:a
Boot device: /pci@340/pci@2/usb@0/storage@1/disk@0,0:a File and args:
Can't open disk label package
Can't open boot device
- So the only option is to boot from the fallback image
{c00} ok boot fallback-miniroot
NOTICE: Entering OpenBoot.
NOTICE: Fetching Guest MD from HV.
NOTICE: Starting additional cpus.
NOTICE: Initializing LDC services.
NOTICE: Probing PCI devices.
NOTICE: Finished PCI probing.
SPARC M7-16, No Keyboard
Copyright (c) 1998, 2015, Oracle and/or its affiliates. All rights reserved.
OpenBoot 4.38.2, 477.0000 GB memory available, Serial #106805695.
Ethernet address 0:10:e0:5d:b9:cf, Host ID: 865db9bf.
Boot device: /pci@345/pci@1/pci@0/pci@8/usb@0/storage@2/disk@0 File and args:
SunOS Release 5.11 Version 11.3 64-bit
Copyright (c) 1983, 2015, Oracle and/or its affiliates. All rights reserved.
NOTICE: Configuring iSCSI to access the root filesystem...
NOTICE: successfully copied and retained the boot_archive into memory, rebooting ...
rebooting...
Resetting...
NOTICE: Entering OpenBoot.
...
SPARC M7-16, No Keyboard
Copyright (c) 1998, 2015, Oracle and/or its affiliates. All rights reserved.
OpenBoot 4.38.2, 477.0000 GB memory available, Serial #106805695.
Ethernet address 0:10:e0:5d:b9:cf, Host ID: 865db9bf.
Boot device: /reboot-memory File and args:
SunOS Release 5.11 Version 11.3 64-bit
Copyright (c) 1983, 2015, Oracle and/or its affiliates. All rights reserved.
NOTICE: Configuring iSCSI to access the root filesystem...
Hostname: pdom3
root@pdom3:~# bootadm boot-pool list
Boot pool name: bpool
Parameters: eviction_algorithm=lru
Current: /dev/dsk/c2t0d0, /dev/dsk/c4t0d0
Pending: /dev/dsk/c2t0d0, /dev/dsk/c4t0d0
Platform-specified devices excluded:
Platform-specified (auto-added, unless excluded): /dev/dsk/c2t0d0, /dev/dsk/c4t0d0
root@pdom3:~# zpool status bpool
pool: bpool
state: ONLINE
scan: none requested
config:
NAME STATE READ WRITE CKSUM
bpool ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
c2t0d0 ONLINE 0 0 0
c4t0d0 ONLINE 0 0 0
errors: No known data errors
- It is now possible to boot from the eUSB disks that were failing before
{c00} ok printenv boot-device
boot-device = /pci@340/pci@2/usb@0/storage@1/disk@0,0:a /pci@345/pci@2/usb@0/storage@1/disk@0,0:a fallback-miniroot fallback-miniroot disk net
{c00} ok
{c00} ok boot /pci@340/pci@2/usb@0/storage@1/disk@0,0:a
NOTICE: Entering OpenBoot.
...
SPARC M7-16, No Keyboard
Copyright (c) 1998, 2015, Oracle and/or its affiliates. All rights reserved.
OpenBoot 4.38.2, 477.0000 GB memory available, Serial #106805695.
Ethernet address 0:10:e0:5d:b9:cf, Host ID: 865db9bf.
Boot device: /pci@340/pci@2/usb@0/storage@1/disk@0,0:a File and args:
SunOS Release 5.11 Version 11.3 64-bit
Copyright (c) 1983, 2015, Oracle and/or its affiliates. All rights reserved.
NOTICE: Configuring iSCSI to access the root filesystem...
Hostname: pdom3
root@pdom3:~# bootadm boot-pool list
Boot pool name: bpool
Parameters: eviction_algorithm=lru
Current: /dev/dsk/c2t0d0, /dev/dsk/c4t0d0
Pending: /dev/dsk/c2t0d0, /dev/dsk/c4t0d0
Platform-specified devices excluded:
Platform-specified (auto-added, unless excluded): /dev/dsk/c2t0d0, /dev/dsk/c4t0d0
- If booting from the fallback image fails as in the following example, try booting again.
Boot device: /pci@345/pci@1/pci@0/pci@8/usb@0/storage@2/disk@0 File and args:
SunOS Release 5.11 Version 11.3 64-bit
Copyright (c) 1983, 2015, Oracle and/or its affiliates. All rights reserved.
NOTICE: Configuring iSCSI to access the root filesystem...
NOTICE: successfully copied and retained the boot_archive into memory, rebooting ...
rebooting...
Resetting...
...
Boot device: /reboot-memory File and args:
SunOS Release 5.11 Version 11.3 64-bit
Copyright (c) 1983, 2015, Oracle and/or its affiliates. All rights reserved.
NOTICE: Configuring iSCSI to access the root filesystem...
Boot device: /reboot-memory File and args:
SunOS Release 5.11 Version 11.3 64-bit
Copyright (c) 1983, 2015, Oracle and/or its affiliates. All rights reserved.
NOTICE: Configuring iSCSI to access the root filesystem...
An inconsistency in the boot archive was detected and the boot archive has been successfully updated. Rebooting
syncing file systems... done
rebooting...
Resetting...
...
Boot device: /reboot-memory@0:nolabel File and args:
ERROR: /reboot-memory@0: No reboot memory segment.
Evaluating:
Can't open boot device
- Try another boot on the fallback image
{c00} ok boot pci@345/pci@1/pci@0/pci@8/usb@0/storage@2/disk@0
Boot device: /pci@345/pci@1/pci@0/pci@8/usb@0/storage@2/disk@0 File and args:
SunOS Release 5.11 Version 11.3 64-bit
Copyright (c) 1983, 2015, Oracle and/or its affiliates. All rights reserved.
NOTICE: Configuring iSCSI to access the root filesystem...
NOTICE: successfully copied and retained the boot_archive into memory, rebooting ...
rebooting...
Resetting...
...
Boot device: /reboot-memory File and args:
SunOS Release 5.11 Version 11.3 64-bit
Copyright (c) 1983, 2015, Oracle and/or its affiliates. All rights reserved.
NOTICE: Configuring iSCSI to access the root filesystem...
Hostname: pdom3
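In the examples above, the device to boot from is chosen by hand from the boot-device list. The sketch below illustrates that selection: it splits the line printed by `printenv boot-device` and returns the first entry not known to be unusable. The function names are hypothetical, and the code only mirrors the manual procedure; it makes no claim about how OBP itself walks the list.

```python
SAMPLE_PRINTENV = (
    "boot-device = /pci@340/pci@2/usb@0/storage@1/disk@0,0:a "
    "/pci@345/pci@2/usb@0/storage@1/disk@0,0:a fallback-miniroot disk net"
)


def parse_boot_device(printenv_line):
    """Return the boot-device entries, in order, from a printenv output line."""
    _name, _sep, value = printenv_line.partition("=")
    return value.split()


def next_candidate(candidates, unusable):
    """Return the first entry that is not marked unusable, or None."""
    for entry in candidates:
        if entry not in unusable:
            return entry
    return None


if __name__ == "__main__":
    entries = parse_boot_device(SAMPLE_PRINTENV)
    replaced = {"/pci@345/pci@2/usb@0/storage@1/disk@0,0:a"}
    print(next_candidate(entries, replaced))
```

With one eUSB replaced, the other eUSB path is returned; with both eUSB paths marked unusable, the result is `fallback-miniroot`, matching the second example.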
References
<NOTE:2094741.1> - SPARC T7 / M7 / M8 Servers : Information about VersaBoot - iSCSI over IPoIB
<NOTE:2063247.1> - SPARC M8 and SPARC M7 Series Servers: Device Paths
<NOTE:1967858.1> - SPARC M7-16 Server : Product Information Page
<NOTE:1967511.1> - SPARC M7-8 Server : Product Information Page
<NOTE:2063349.1> - SPARC M7 Series Servers : Interconnect - EoUSB