Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition
Solution 1487791.1: SuperCluster - How to cleanly shutdown and startup an Oracle SuperCluster T4-4 or T5-8
Solution Type: Technical Instruction (Sure Solution)
In this Document
Applies to:
SPARC SuperCluster T4-4 - Version All Versions to All Versions [Release All Releases]
SPARC SuperCluster T4-4 Half Rack - Version All Versions to All Versions [Release All Releases]
Oracle SuperCluster T5-8 Full Rack - Version All Versions to All Versions [Release All Releases]
Oracle SuperCluster T5-8 Half Rack - Version All Versions to All Versions [Release All Releases]
Oracle Solaris on SPARC (64-bit)

Goal
Describe the recommended procedure for cleanly powering down and powering up an Oracle SuperCluster T4-4 or T5-8.
Solution
This note addresses the standard offered configurations of Oracle SuperCluster. If your machine has any variations approved as exceptions, your steps may vary.
Shutdown Procedures

If running Oracle Solaris Cluster OSC 3.3u1/S10 or OSC 4.0/S11, you need to shut down the clustering service. Run the following on all global zones involved in clustering to prevent failover while shutting down applications and zones.
# /usr/cluster/bin/cluster shutdown -g 0 -y
If running Ops Center 12c in SuperCluster mode, you will also have to halt the Enterprise Controller so it does not attempt to fail over while CRS is brought down.
# /opt/SUNWxvmoc/bin/ecadm ha-stop-no-relocate
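If you want to confirm the Enterprise Controller has actually stopped before moving on, its state can be queried. This is a minimal sketch; ecadm status is the standard Ops Center query, though its output wording varies between Ops Center releases.
# Should report the Enterprise Controller as stopped once ha-stop-no-relocate completes
/opt/SUNWxvmoc/bin/ecadm status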
Follow applicable documentation to cleanly shut down all user applications and databases running in zones or LDoms.

Obtain a list of all running zones
# zoneadm list
global
sol10_zone
Shut down all running zones
# zoneadm -z sol10_zone shutdown
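If several non-global zones are running, a small loop can shut them all down and confirm they have halted. This is a minimal sketch, assuming only the global zone should remain running:
# Shut down every running non-global zone, then confirm none remain running
for z in $(zoneadm list | grep -v '^global$'); do
  zoneadm -z "$z" shutdown
done
zoneadm list -cv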
Obtain a list of all running LDoms
# ldm list
NAME            STATE    FLAGS   CONS  VCPU  MEMORY   UTIL  UPTIME
primary         active   -n-cv-  UART  128   523776M  1.1%  4d 19h 50m
orlscclldm01    active   -n----  5001  64    32G      0.0%  11d 2h 59m
ssccn2-app1     active   -t--v-  5000  64    256G     1.6%  3d 23h 45m
The T4-4 and T5-8 LDom layouts vary based on the configuration chosen during installation. If running with a single LDom, shut down the machine as you would any other server, by cleanly shutting down the OS. If running with two LDoms, shut down the guest domain first and then the primary (control) domain. If running with three or more domains, first identify and shut down the domain(s) running on virtualized hardware, then shut down the guest domain with direct hardware access, and finally the primary (control) domain.
Obtain the names of the LDoms with direct hardware access.
T4-4
# ldm list-io |egrep "pci@400|pci@700"
pci@400  pci_0  primary
pci@700  pci_3  ssccn2-app1
...
The T5-8 is built a bit differently, but you can identify the same information by searching on SASHBA
# ldm list-io |grep SASHBA
/SYS/MB/SASHBA0  PCIE  pci_0   primary      OCC
/SYS/MB/SASHBA1  PCIE  pci_15  ssccn1-dom3  OCC

Stop the domains from the ldm list output that are not on this list
# ldm stop-domain orlscclldm01
Stop the guest domain with hardware access
# ldm stop-domain ssccn2-app1
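Before shutting down CRS, it can help to confirm the guest domains have actually left the active state. This is a minimal sketch, assuming the primary domain is the only one expected to remain active at this point:
# Wait until no domain other than primary is still reported as active
while ldm list | awk 'NR > 1 && $1 != "primary" && $2 == "active"' | grep -q .; do
  echo "waiting for guest domains to stop..."
  sleep 10
done
ldm list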
Shut down the CRS stack on all domains running Oracle CRS
# /u01/app/11.2.0.3/grid/bin/crsctl stop crs
Verify all Oracle processes are stopped, and remediate as necessary if they are not
# ps -ef | grep oracle
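To confirm the stack is down on each domain before touching the storage cells, a quick check such as the following can be used. This is a minimal sketch; the Grid Infrastructure home is the same path used elsewhere in this note and may differ on your system.
# CRS should report that it is not running
/u01/app/11.2.0.3/grid/bin/crsctl check crs
# The bracketed pattern keeps grep from matching itself; no output means no database or ASM instances remain
ps -ef | grep '[o]ra_pmon'
ps -ef | grep '[a]sm_pmon'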
Shut down the Exadata storage cell services and operating systems
# cd /opt/oracle.SupportTools/onecommand
# dcli -g cell_group -l root 'cellcli -e "alter cell shutdown services all"'
# dcli -g cell_group -l root 'shutdown -h now'
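Before removing power, it is worth confirming the cells have actually gone down. This is a minimal sketch run from the same onecommand directory; it assumes the cell_group file lists one cell hostname per line and uses the Solaris ping syntax of ping host timeout.
# Any cell that still answers is not yet fully shut down
for cell in $(cat cell_group); do
  ping "$cell" 2 >/dev/null 2>&1 && echo "$cell is still responding"
done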
Shut down the operating system of the control LDom
# shutdown -g0 -i0 -y
Connect to the compute node ILOM and stop /SYS
-> stop /SYS
Show, and set if need be, the SP power policies so the T4-4 or T5-8 machines DO NOT power on automatically when rack power is restored. The following shows the settings that you want to reach.
-> show /SP/policy
/SP/policy
  Targets:
  Properties:
    HOST_AUTO_POWER_ON = disabled
    HOST_COOLDOWN = disabled
    HOST_LAST_POWER_STATE = disabled
    HOST_POWER_ON_DELAY = disabled
    PARALLEL_BOOT = enabled

If any of the power-on properties are set to enabled, modify them as follows
-> set /SP/policy HOST_AUTO_POWER_ON=disabled

Shut down the ZFS Storage Appliance
Browse to the BUI of both storage heads and, from the dashboard, select the power off appliance button in the upper left section of the screen below the Oracle logo.
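If the BUI is not reachable, the appliance CLI can generally be used instead. This is a minimal sketch, assuming the appliance software provides the maintenance system poweroff CLI command and using zfssa-h1 as a placeholder head name; repeat for the second head.
# ssh root@zfssa-h1
zfssa-h1:> maintenance system poweroff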
The switches do not have specific power off instructions; they will be powered off when power is removed from the rack.

Flip the breakers on the PDUs to the off position.

Startup Procedures

Please note that if you are running switch firmware 1.1.3-x you will need to run steps to correct the switch InfiniBand partitioning. This is documented in <Document 1452277.1> SPARC SuperCluster Critical Issues. It is highly advisable to upgrade your rack to the latest Quarterly Maintenance Bundle to bring the switches to version 2.0.6 or above and prevent this issue. The link to the download can be found in the references below.
Flip the breakers on all PDUs to the on position.

Only perform the following steps if the switches are at a firmware version below 2.0.6.
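If you are unsure of the current switch firmware level, it can be checked before deciding whether the partitioning steps apply. This is a minimal sketch, assuming root ssh access to each InfiniBand switch and the nm2version command provided on the NM2-36P switch image; the switch hostnames are placeholders.
# Report the firmware version of each InfiniBand switch
for sw in ibswitch1 ibswitch2 ibswitch3; do
  ssh root@"$sw" nm2version
done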
Verify, and if necessary fix, the partitioning on the IB switches
# smpartition list active
# getmaster
The smpartition command should show 3 or more partitions on 0x0501, 0x0502, 0x0503, etc., depending on configuration. getmaster should show the spine switch as the master. If it does not, run the next command
# smpartition start; smpartition commit
If this does not remediate the issue, please open an SR with your SuperCluster CSI and serial number and request an engineer to assist you with the more in-depth remediation steps. Reference this document ID in your SR.
Internal remediation steps: before proceeding, check /conf/configvalid and verify that it contains 1. If at any point during these steps it does not, run:
# echo 1 > /conf/configvalid
Start up the ZFS Storage Appliance
Browse to the BUI of both storage heads; if you can connect, proceed to the Exadata storage cell steps.
If you cannot connect to the BUI, verify the 7320 has started by connecting via ssh as root to the SP of each head and issuing the following:
-> start /SYS
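Once /SYS has been started on both heads, a simple reachability check from a compute node can confirm they are back before retrying the BUI. This is a minimal sketch; zfssa-h1 and zfssa-h2 are placeholder head names, and the Solaris ping syntax of ping host timeout is assumed.
# Each head should eventually report "is alive"
for head in zfssa-h1 zfssa-h2; do
  ping "$head" 5
done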
Verify the startup of the Exadata storage cells
Run the following from cel01 of your SuperCluster as celladmin, and verify that the cell services are online and that all griddisks are active.
dcli -g cell_group -l celladmin 'cellcli -e "list cell"'
dcli -g cell_group -l celladmin 'cellcli -e "list griddisk"'
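To also confirm the grid disks are ready for the ASM instances, the status attributes can be listed. This is a minimal sketch; the attribute names are standard cellcli griddisk attributes, but confirm them against your cell software version.
dcli -g cell_group -l celladmin 'cellcli -e "list griddisk attributes name,status,asmmodestatus"'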
Bring up the T4-4 or T5-8 systems
Log into the ILOM for each T4-4 or T5-8, start /SYS, and then monitor the progress via the /SP/console.
-> start /SYS
-> start /SP/console
Verify the system
Unless configured otherwise by the site database administrators or system administrators, all LDoms, zones, Clusterware, and database-related items should come up automatically as the system boots. If they fail to do so, manually start these components per your site's standard operating procedures. Please verify the system is all the way up via the console before checking dependent items. If for any reason you cannot restart something, gather appropriate diagnostic data and file an SR after consulting with your local administrators. The svcs -xv command will show which system services, if any, did not start and assist in debugging why.
# ldm list
# zoneadm list
# /u01/app/11.2.0.3/grid/bin/crsctl status res -t
# svcs -xv
Restart all applicable applications and test.

References
<NOTE:1452277.1> - SuperCluster Critical Issues
<NOTE:1567979.1> - Oracle SuperCluster Supported Software Versions - All Hardware Types