Asset ID: 1-71-1902710.1
Update Date: 2017-04-30
Solution Type: Technical Instruction Sure
Solution 1902710.1: How to configure Datacenter InfiniBand Switch 36 & QDR InfiniBand Gateway Switches for ASR
Related Items
- Exadata X4-2 Hardware
- Sun Datacenter InfiniBand Switch 36
- Exadata X3-2 Half Rack
- Exadata Database Machine X2-2 Hardware
- SPARC SuperCluster T4-4
- Sun Network QDR InfiniBand Gateway Switch
Related Categories
- PLA-Support>Sun Systems>SAND>Network>SN-SND: Sun Network Infiniband
Applies to:
Sun Datacenter InfiniBand Switch 36 - Version All Versions to All Versions [Release All Releases]
Exadata X4-2 Hardware - Version All Versions to All Versions [Release All Releases]
Exadata X3-2 Half Rack - Version All Versions to All Versions [Release All Releases]
Exadata Database Machine X2-2 Hardware - Version All Versions to All Versions [Release All Releases]
SPARC SuperCluster T4-4 - Version All Versions to All Versions [Release All Releases]
Information in this document applies to any platform.
Goal
ASR for Datacenter InfiniBand Switch 36 & QDR InfiniBand Gateway Switches
This document describes the process for setting up Automated Service Request (ASR) for both of the following products:
- Sun Datacenter InfiniBand Switch 36 (nm2-36)
- Sun Network QDR InfiniBand Gateway Switch (nm2-gw).
Solution
Verify Your Switch Information
Verify the switch:
To verify that the switch firmware (FW) is at version 2.1.2-2 or higher, log in to the switch as root and issue the 'version' command:
FabMan@ib-switch-> version
SUN DCS 36p version: 2.1.2-2
Build time: Feb 19 2013 13:29:01
SP board info:
Manufacturing Date: 2012.09.06
Serial Number: "NCDA11784"
Hardware Revision: 0x0007
Firmware Revision: 0x0000
BIOS version: SUN0R100
BIOS date: 06/22/2010
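The version check above can also be scripted. The following is a minimal Python sketch, assuming the 'version' output has already been captured as text and that the firmware string always has the x.y.z-n form shown in the example; the function names are illustrative only and are not part of any Oracle tool.

```python
import re

# Minimum firmware level stated in this document: 2.1.2-2
MIN_FW = (2, 1, 2, 2)


def parse_fw_version(version_output):
    """Return the firmware version as a tuple of ints, e.g. (2, 1, 2, 2).

    Assumes a line such as 'SUN DCS 36p version: 2.1.2-2'.
    """
    match = re.search(r"version:\s*(\d+(?:\.\d+)*)-(\d+)", version_output)
    if match is None:
        raise ValueError("no firmware version line found")
    dotted, build = match.groups()
    return tuple(int(part) for part in dotted.split(".")) + (int(build),)


def fw_supports_asr(version_output):
    """True when the parsed firmware version is at least MIN_FW."""
    return parse_fw_version(version_output) >= MIN_FW


sample = "SUN DCS 36p version: 2.1.2-2\nBuild time: Feb 19 2013 13:29:01\n"
print(fw_supports_asr(sample))
```

Comparing tuples of integers avoids the pitfalls of comparing version strings character by character.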
Verify the Serial Number:
The Entitlement Serial number of the switch may be found in three locations:
- The white tab located to the left of the leftmost PSU on the back of the switch, viewed from the front of the rack in Engineered Systems
- A label on the Chassis
- Stored internally for switches newer than September 2012.
IMPORTANT: In order for ASR to create a case, the internal serial number MUST match that indicated by the white tab and the label.
Upgrading your Firmware
If the firmware version is not at revision 2.1.2-2 or higher, you must update it in order to proceed. Instructions for upgrading the FW may be found in the Switch Remote Administration or Gateway Remote Administration documentation, under upgrading the firmware. (Go to http://docs.oracle.com and search for “Sun Datacenter InfiniBand”)
Note: If the switch/gateway is part of an engineered system, be sure that the new FW is compatible with the image version you are running and that it is supported by that Engineered System. Also, be sure to use the FW update procedures provided by the Engineered System's Owner's Guide.
Verify the Serial Number in the system
Verifying if the Serial Number exists
For older switches shipped before April 2013, the serial number most likely does not exist yet and will need to be set. Always check first whether the serial number already exists by using the 'showfruinfo' command. If the S/N is already present in the record Sun_FRU_LabelR/Sun_Serial_Number, you do not need to set the value.
For newer switches released after the first of April 2013, the correct serial number is most likely already set, but updated FW may be required to verify it.
The procedure for setting the serial number (if need be) is provided in this document in an Internal-only section, as it can only be performed in the field by an Oracle Support Engineer. Please contact Oracle Support if you need the serial number set.
NOTE: The serial number of switches with "Sun SpecPartNo: 885-1507-01" ("-01" only) cannot be set in the field; the switch must be replaced.
NOTE: If Sun_FRU_LabelR/Sun_Serial_Number is not set, please contact Oracle, as an On-Site task will need to be opened for an Oracle engineer to set this serial number.
Example:
FabMan@ib-switch-> showfruinfo
Sun_Man1R:
UNIX_Timestamp32 : Fri Mar 8 10:16:35 2013
Sun_Fru_Description : ASSY,NM2-36P
Vendor_ID_Code : 13 A6
Vendor_ID_Code_Source : 01
Vendor_Name_And_Site_Location : 5030 CELESTICA CORP. SRIRACHA CHONBURI TH
Sun_Part_Number : 7057247
Sun_Serial_Number : 465769T+1309RR046Y
Serial_Number_Format : 4V3F1-2Y2W2X4S
Initial_HW_Dash_Level : 99
Initial_HW_Rev_Level : 01
Sun_Fru_Shortname : NM2, 36 ports
Sun_Hazard_Class_Code : Y
Sun_SpecPartNo : 7042787 <<------Sun_SpecPartNo:
Sun_FRU_LabelR:
Sun_Serial_Number : AK00086322
FRU_Part_Dash_Number : 7052970
(in this example the SN and part number are good)
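When many switches have to be checked, the Sun_FRU_LabelR/Sun_Serial_Number lookup described above can be automated. The sketch below is one possible approach in Python, assuming the 'showfruinfo' output has been saved as text in the layout shown in the example; the function names are hypothetical.

```python
import re


def parse_showfruinfo(text):
    """Group 'key : value' lines under their record headers (e.g. 'Sun_Man1R:')."""
    records, current = {}, None
    for raw in text.splitlines():
        line = raw.strip()
        if re.fullmatch(r"\w+:", line):
            # A bare name followed by a colon starts a new record.
            current = records.setdefault(line[:-1], {})
        elif ":" in line and current is not None:
            # The first colon separates the key from the value.
            key, _, value = line.partition(":")
            current[key.strip()] = value.strip()
    return records


def label_serial(text):
    """Return Sun_FRU_LabelR/Sun_Serial_Number, or None when it is not set."""
    record = parse_showfruinfo(text).get("Sun_FRU_LabelR", {})
    return record.get("Sun_Serial_Number") or None


fru_sample = """Sun_Man1R:
UNIX_Timestamp32 : Fri Mar 8 10:16:35 2013
Sun_Serial_Number : 465769T+1309RR046Y
Sun_FRU_LabelR:
Sun_Serial_Number : AK00086322
FRU_Part_Dash_Number : 7052970
"""
print(label_serial(fru_sample))
```

Note that both records contain a Sun_Serial_Number field; only the one under Sun_FRU_LabelR is the serial number that this section is concerned with.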
If the serial number appears correct but ASR does not recognize it as a valid S/N, rebuild the servicetag.xml as follows:
Logged in as ilom-admin:
Last login: Fri Aug 12 11:52:55 2016 from 10.152.19.177
Oracle(R) Integrated Lights Out Manager
Version ILOM 3.0 r47111
Copyright (c) 2012, Oracle and/or its affiliates. All rights reserved.
-> set /SP/services/servicetag state=disabled
Set 'state' to 'disabled'
-> set /SP/services/servicetag state=enabled
Set 'state' to 'enabled'
->
Setting the Serial Number in the system (Internal-only section).
NOTE: Only an Oracle Badged Field Engineer is allowed to run the commands to set the serial number for these switches. A Customer is not allowed to run this command.
NOTE: The serial number of switches with "Sun SpecPartNo: 885-1507-01" cannot be set in the field; the switch must be replaced.
See "Configuring Sun Serial Number returns with Error: static FRUID header at offset 0x1800 is wrong!" (Doc ID 1943393.1)
Setting the Serial Number on FW version 2.1.2-2 or higher:
IMPORTANT: If you make a mistake while entering the S/N and hit the backspace key, both the mistake and the backspace become part of the S/N and ASR will NOT work! You will not see this when running showfruinfo, but you will if you open the "@usr@local@bin@showfruinfo.out" file from a snapshot in vi.
You MUST re-enter the S/N with NO typos. Enter just the clean S/N.
1) log in to the switch as 'root':
$ ssh root@my_switch
2) execute the command, providing the new S/N (retrieved from the white pull tab) when prompted:
# /usr/local/util/update__FRU_LabelR_SN.sh
Example:
# /usr/local/util/update__FRU_LabelR_SN.sh
update__FRU_LabelR_SN.sh version 1.9 12/06/20
###################################################################
# WARNING: This utility is reserved for Oracle service personnel. #
# Any other use can damage the switch and is strictly #
# prohibited. #
###################################################################
Do you wish to continue?
If you do, please answer 'y' (timeout 10 seconds) [N/y]:y
Please provide a value for Sun_FRU_LabelR/Sun_Serial_Number: 1013AK208D
Environment daemon running (PID 2594)
Stopping Environment daemon. [ OK ]
Environment daemon is stopped
Checking the current FRUID content ... OK
Getting existing FRUID contents, please wait ...
OK
Programming new FRUID contents, please wait ...
Writing 8192 bytes to address 0
OK
Environment daemon is stopped
Starting Environment daemon. [ OK ]
Environment daemon running (PID 26559)
3) Validate the change with 'showfruinfo':
Example:
FabMan@ib-switch-> showfruinfo
Sun_Man1R:
UNIX_Timestamp32 : Fri Mar 19 16:29:59 2010
Sun_Fru_Description : ASSY,NM2-GW
Vendor_ID_Code : 11 E1
Vendor_ID_Code_Source : 01
Vendor_Name_And_Site_Location : 4577 CELESTICA CORP. SAN JOSE CA US
Sun_Part_Number : 5111402
Sun_Serial_Number : 0110SJC-1010NG0040
Serial_Number_Format : 4V3F1-2Y2W2X4S
Initial_HW_Dash_Level : 03
Initial_HW_Rev_Level : 50
Sun_Fru_Shortname : NM2 gateway
Sun_Hazard_Class_Code : Y
Sun_SpecPartNo : 885-1655-01
Sun_FRU_LabelR:
Sun_Serial_Number : 1013AK208D
FRU_Part_Dash_Number : 541-4188-01
IMPORTANT: Once you have verified the Serial number is correct, the servicetag.xml file must be updated before proceeding with ASR:
Example:
Logged in as ilom-admin:
Last login: Fri Aug 12 11:52:55 2016 from 10.152.19.177
Oracle(R) Integrated Lights Out Manager
Version ILOM 3.0 r47111
Copyright (c) 2012, Oracle and/or its affiliates. All rights reserved.
-> set /SP/services/servicetag state=disabled
Set 'state' to 'disabled'
-> set /SP/services/servicetag state=enabled
Set 'state' to 'enabled'
->
NOTE: The following telnet test does not appear to work when PuTTY is used as the telnet client. It does work with either the Solaris or Linux telnet client.
After the serial number is configured, wait about two minutes and verify the following also returns the correct S/N:
telnet <IP of the switch> 6481
< NOTE: you will only have a blank screen here >
GET /stv1/agent/ HTTP/
You should see the following:
GET /stv1/agent/ HTTP/
HTTP/1.0 200 OK
Content-type: text/xml;charset=UTF-8
Connection: Close
<?xml version="1.0" encoding="UTF-8"?>
<st1:response xmlns:st1="http://www.sun.com/stv1/svctag">
<agent>
<agent_urn>urn:st:2119b1e0-1dd2-11b2-a9b2-9f002081a491</agent_urn>
<agent_version>1.1.3</agent_version>
<registry_version>1.0</registry_version>
<system_info>
<system>Linux</system>
<host>infiniband36gw</host>
<release>2.6.27.13-nm2</release>
<version></version>
<hostid>980ae8e4</hostid>
<architecture></architecture>
<platform></platform>
<manufacturer></manufacturer>
<serial_number>1013AK208D</serial_number>
</system_info>
</agent>
</st1:response>
Connection to 192.168.1.222 closed by foreign host.
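Where no suitable telnet client is available, the same check can be approximated with a short Python script. This is a sketch only: it assumes the agent on port 6481 accepts the request line exactly as typed in the telnet session above, followed by a blank line, and the function names are illustrative.

```python
import re
import socket


def extract_serial(response_text):
    """Return the contents of the <serial_number> element, or None if absent."""
    match = re.search(r"<serial_number>(.*?)</serial_number>", response_text, re.S)
    return match.group(1).strip() if match else None


def query_servicetag(host, port=6481, request_line="GET /stv1/agent/ HTTP/", timeout=10.0):
    """Send the request line from this document and return the raw response text.

    Assumption: terminating the request with a blank line mirrors pressing
    Enter in the telnet session.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(request_line.encode("ascii") + b"\r\n\r\n")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


# Example (requires network access to the switch):
#   print(extract_serial(query_servicetag("<IP of the switch>")))

agent_sample = (
    "HTTP/1.0 200 OK\n"
    "<system_info>\n"
    "<hostid>980ae8e4</hostid>\n"
    "<serial_number>1013AK208D</serial_number>\n"
    "</system_info>\n"
)
print(extract_serial(agent_sample))
```

The value returned should match the serial number on the white tab and the chassis label.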
Setting the Product Level Identity for Engineered Systems
If the switch or gateway being activated is contained within an Engineered System such as Exadata, you will also need to set the “/SP/system_identifier” field in the ILOM.
The /SP/system_identifier field is transferred with any fault events sent and identifies the switch as part of an Engineered System.
(If this is a standalone switch that is not part of an Engineered System, the system_identifier field does not need to be set.)
To set the system identifier, start by typing 'spsh' at the prompt to enter into the ILOM environment.
# spsh
->
-> show /SP
/SP
Targets:
alertmgmt
cli
clients
clock
config
diag
faultmgmt
logs
network
serial
services
sessions
users
Properties:
hostname = MySwitch
system_contact = (none)
system_description = Sun Network QDR InfiniBand Gateway Switch...
system_identifier = (none)
system_location = (none)
Commands:
cd
reset
set
show
version
In the above example the system_identifier is empty (none).
It should be set to the Product Serial Number of the Engineered System that contains the switch.
The product serial number may be found on the left front vertical rail of the system, just to the left of the switches.
The same Serial Number may be found in the rear of the rack in the top left corner above the PDU. This label is much more difficult to read and may require additional light.
Once this serial number has been determined, it should be added to the system_identifier field.
The system_identifier field will look like one of the following, depending on the engineered system that contains the switch:
Exadata Database Machine X3-2 AK00000001
Oracle Big Data Appliance AK00000002
Oracle Exalogic X2-2 AK0000003
Oracle SPARC SuperCluster T44 AK00000004
Oracle Exalytic X2-4 AK00000005
Set the /SP/system_identifier in the appropriate format above using the Serial Number for the Product:
-> set /SP system_identifier="Exadata Database Machine X3-2 AK00000001"
-> show /SP
/SP
Targets:
alertmgmt
cli
clients
clock
config
diag
faultmgmt
logs
network
serial
services
sessions
users
Properties:
hostname = MySwitch
system_contact = (none)
system_description = Sun Network QDR InfiniBand Gateway Switch...
system_identifier = Exadata Database Machine X3-2 AK00000001
system_location = (none)
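Because the identifier is free text, stray whitespace is easy to introduce when it is typed by hand. The Python sketch below shows one way to assemble the string and the matching ILOM command; it assumes the "product name, space, rack serial number" layout listed above and an alphanumeric rack serial, and the function names are hypothetical.

```python
def build_system_identifier(product_name, rack_serial):
    """Join product name and rack serial with single spaces, as in the examples above."""
    product_name = " ".join(product_name.split())
    rack_serial = rack_serial.strip()
    if not product_name:
        raise ValueError("product name is empty")
    if not rack_serial.isalnum():
        # Assumption: rack serial numbers contain only letters and digits.
        raise ValueError("rack serial should be alphanumeric")
    return f"{product_name} {rack_serial}"


def ilom_set_command(identifier):
    """Return the ILOM command that stores the identifier under /SP."""
    return f'set /SP system_identifier="{identifier}"'


identifier = build_system_identifier(" Exadata Database Machine  X3-2 ", "AK00000001\n")
print(ilom_set_command(identifier))
```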
Activating ASR on the Infiniband Switch/Gateway
Verify FW levels, Serial Number and ASRM
Requirements:
ASR support for the Sun Datacenter InfiniBand Switch 36P and the Sun Network QDR InfiniBand Gateway Switch requires a FW level of 2.1.2-2 or higher.
It is also required that an ASR manager already exists to which the switch may be pointed. Be sure the connections between the switch and the ASR manager are open, specifically for ports 162 and 6481.
Note: If the switch/gateway is part of an engineered system be sure that the new FW is compatible with the image version you are running and that it is supported by that Engineered System.
Set the Trap Destinations.
Log in as ilom-admin to the switch or gateway.
Go to the /SP/alertmgmt/rules location
cd /SP/alertmgmt/rules/
These are the “rules” for sending the SNMP packets used by ASR. In ASR documentation these “rules” designate the “trap Destination”.
Find an Unused Rule
An unused rule will show all zeros or the word “none” in the records. In the example below Rule #1 is in use, but rule #2 is available for use.
-> show 1
/SP/alertmgmt/rules/1
Targets:
Properties:
community_or_username = public
destination = 10.10.10.123
destination_port = 0
email_custom_sender = (none)
email_message_prefix = (none)
event_class_filter = (none)
event_type_filter = (none)
level = minor
snmp_version = 2c
testrule = (Cannot show property)
type = snmptrap
-> show 2
/SP/alertmgmt/rules/2
Targets:
Properties:
community_or_username = public
destination = 0.0.0.0
destination_port = 0
email_custom_sender = (none)
email_message_prefix = (none)
event_class_filter = (none)
event_type_filter = (none)
level = disable
snmp_version = 1
testrule = (Cannot show property)
type = snmptrap
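The "unused rule" test above can be expressed in a few lines of Python. This sketch assumes the output of 'show <n>' has been captured as text, and it treats a rule as unused when its destination is 0.0.0.0 or (none); that criterion is an interpretation of the description above, and the function names are illustrative.

```python
def parse_rule_properties(show_output):
    """Collect the 'name = value' lines of an ILOM 'show' listing into a dict."""
    props = {}
    for line in show_output.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def rule_is_unused(props):
    """Assumption: a rule with no real destination is free to use."""
    return props.get("destination") in ("0.0.0.0", "(none)")


rule1 = "destination = 10.10.10.123\ndestination_port = 0\nlevel = minor\n"
rule2 = "destination = 0.0.0.0\ndestination_port = 0\nlevel = disable\n"
for number, text in ((1, rule1), (2, rule2)):
    print(number, rule_is_unused(parse_rule_properties(text)))
```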
Set the unused rule to point to the ASR Manager
You will need to set the following values in the rule in order to properly identify the destination and other information about the SNMP packet. These records and their values are:
- community_or_username = public (should already be the default, but verify it just in case)
- destination = <IP_of_the_ASR_Manager>
- destination_port = 162
- level = minor
- snmp_version = 2c
- all other records should remain “none”
Move into the unused Rule location and set the records to the correct values:
-> cd /SP/alertmgmt/rules/2
-> set destination=<IP_of_the_ASR_Manager>
Set 'destination' to '<IP_of_the_ASR_Manager>'
-> set destination_port=162
Set 'destination_port' to '162'
-> set level=minor
Set 'level' to 'minor'
-> set snmp_version=2c
Set 'snmp_version' to '2c'
-> show
/SP/alertmgmt/rules/2
Targets:
Properties:
community_or_username = public
destination = <IP_of_the_ASR_Manager>
destination_port = 162
email_custom_sender = (none)
email_message_prefix = (none)
event_class_filter = (none)
event_type_filter = (none)
level = minor
snmp_version = 2c
testrule = (Cannot show property)
type = snmptrap
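For sites with many switches, the command sequence above can be generated rather than typed. The following Python sketch only builds the ILOM command strings shown in this section; it does not connect to the switch, and the function name is hypothetical.

```python
import ipaddress


def asr_rule_commands(rule_number, asr_manager_ip):
    """Return the ILOM commands that point one alert rule at the ASR Manager."""
    # Raises ValueError if the address is malformed.
    ipaddress.ip_address(asr_manager_ip)
    return [
        f"cd /SP/alertmgmt/rules/{int(rule_number)}",
        f"set destination={asr_manager_ip}",
        "set destination_port=162",
        "set level=minor",
        "set snmp_version=2c",
    ]


for command in asr_rule_commands(2, "10.10.10.123"):
    print(command)
```

Validating the address before generating the commands catches typing errors that would otherwise only surface when the test trap fails to arrive.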
Activate the IB Switch on your ASR Manager.
Note: Path for the asr command:
ASRM lower than 4.9:
/opt/SUNWswasr/bin/asr
ASRM 5.0 and higher
/opt/asrmanager/bin/asr
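Rather than working out the ASR Manager version, a script can simply check which of the two paths exists. This is a minimal Python sketch using only the two paths listed above; the function name and the injectable "exists" parameter are illustrative.

```python
import os

# The two install locations named in this document, newest first.
ASR_PATHS = ("/opt/asrmanager/bin/asr", "/opt/SUNWswasr/bin/asr")


def find_asr(paths=ASR_PATHS, exists=os.path.isfile):
    """Return the first asr path that exists, or None if neither is present."""
    for path in paths:
        if exists(path):
            return path
    return None


print(find_asr())
```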
1) To activate the nodes on the ASR manager, you should run the following command for each of the switches or gateways.
Using the IP address to activate the switch:
/opt/SUNWswasr/bin/asr activate_asset -i <ip_address_of_the_switch>
or by using the hostname to activate the switch:
/opt/SUNWswasr/bin/asr activate_asset -h <hostname_of_the_switch>
Don't run both of the above - either will work!
If "asr activate_asset" complains about an invalid or unknown serial number, there may be hidden invalid characters surrounding the actual serial number. In that case, it is recommended to repeat the first step of running /usr/local/util/update__FRU_LabelR_SN.sh to set the serial number again. To look for hidden characters, dump the showfruinfo output with od:
# showfruinfo | od -c
A proper example is given here:
[root@ib_gw_au ~]# showfruinfo
Sun_Man1R:
UNIX_Timestamp32 : Wed Nov 27 11:28:25 2013
Sun_Fru_Description : ASSY,NM2-GW
Vendor_ID_Code : 13 A6
Vendor_ID_Code_Source : 01
Vendor_Name_And_Site_Location : 5030 CELESTICA CORP. SRIRACHA CHONBURI TH
Sun_Part_Number : 7057249
Sun_Serial_Number : 465769T+1335RT02YC
Serial_Number_Format : 4V3F1-2Y2W2X4S
Initial_HW_Dash_Level : 99
Initial_HW_Rev_Level : 01
Sun_Fru_Shortname : NM2 gateway
Sun_Hazard_Class_Code : Y
Sun_SpecPartNo : 7054735
Sun_FRU_LabelR:
Sun_Serial_Number : AK00221891
FRU_Part_Dash_Number : 7054724
[root@ib_gw_au ~]#
[root@ib_gw_au ~]# showfruinfo | od -c
...
0001260 4 7 3 5 \0 \0 \0 \0 \n \n S u n _ F R
0001300 U _ L a b e l R : \n S u n _
0001320 S e r i a l _ N u m b e r
0001340 : A K 0
0001360 0 2 2 1 8 9 1 \0 \0 \0 \0 \0 \0 \0 \0 \0
0001400 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \n F R
0001420 U _ P a r t _ D a s h _ N u m b
0001440 e r : 7
0001460 0 5 4 7 2 4 \0 \0 \0 \0 \n \n
0001474
...
[root@ib_gw_au ~]#
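The same inspection can be done programmatically once the raw serial number field has been extracted. The Python sketch below assumes, based on the examples in this document, that a valid serial number consists only of ASCII letters and digits and that trailing NUL padding is normal; the function name is hypothetical.

```python
def hidden_characters(raw_serial):
    """Return (index, repr) pairs for characters that are not ASCII letters or digits.

    Trailing NUL padding and whitespace are ignored, since the proper example
    above shows the field padded with NUL bytes.
    """
    serial = raw_serial.rstrip("\0 \t\r\n")
    return [
        (index, repr(char))
        for index, char in enumerate(serial)
        if not (char.isascii() and char.isalnum())
    ]


print(hidden_characters("AK00221891\0\0\0\0"))
print(hidden_characters("1013AK2\x0808D"))
```

An empty list indicates a clean value; any entry, such as an embedded backspace (\x08), means the serial number must be re-entered.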
2) To validate ASR assets:
/opt/SUNWswasr/bin/asr list_asset -i <ip_address_of_the_switch>
In the preceding command, <ip_address_of_the_switch> is the IP address of a switch. The command lists only the asset at that IP address, provided it has been successfully added to the list.
Example: (output modified to fit page)
# /opt/SUNWswasr/bin/asr list_asset -i <ip_address_of_the_switch>
IP_ADDRESS HOST_NAME SERIAL_NUMBER ASR PROTOCOL SOURCE PRODUCT_NAME
------------- ---------- -------------- ------- -------- ------ -----------------------
10.172.144.76 MySwitch 1013AK208D Enabled SNMP ILOM Sun Network QDR InfiniBand GW Switch
To list all assets, enter the command without any options. Caution: on a large server this could produce hundreds of lines of output.
/opt/SUNWswasr/bin/asr list_asset
If no assets are listed, then verify that all steps of this setup document have run successfully.
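To check several switches at once, the list_asset output can be parsed and compared against the expected serial numbers. The Python sketch below assumes the seven-column layout of the example above, which the document notes was modified to fit the page, so the real output may need a different parser; the field and function names are illustrative.

```python
FIELDS = ("ip_address", "host_name", "serial_number", "asr",
          "protocol", "source", "product_name")


def parse_list_asset(output):
    """Turn each data row into a dict, skipping the header and the dashed rule."""
    assets = []
    for line in output.splitlines():
        # The product name may contain spaces, so limit the number of splits.
        parts = line.strip().split(None, len(FIELDS) - 1)
        if len(parts) != len(FIELDS):
            continue
        if parts[0] == "IP_ADDRESS" or set(parts[0]) == {"-"}:
            continue
        assets.append(dict(zip(FIELDS, parts)))
    return assets


asset_sample = """IP_ADDRESS HOST_NAME SERIAL_NUMBER ASR PROTOCOL SOURCE PRODUCT_NAME
------------- ---------- -------------- ------- -------- ------ -----------------------
10.172.144.76 MySwitch 1013AK208D Enabled SNMP ILOM Sun Network QDR InfiniBand GW Switch
"""
for asset in parse_list_asset(asset_sample):
    print(asset["ip_address"], asset["serial_number"], asset["asr"])
```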
Confirm Activation.
Confirm activation and assign contacts to nodes in My Oracle Support. This must be done by the Customer. For more information on the process, see ASR MOS 5.3+ Activation Process (Doc ID 1329200.1).
Send a "test" trap for final verification
Test traps are needed to verify that all the settings have been set correctly in My Oracle Support and within the "Sun Datacenter InfiniBand Switch 36" and "Sun Network QDR InfiniBand Gateway Switch" themselves.
Contact the ASR Backline to assist in verification.
Follow the guidelines provided in:
How to open a Service Request for Auto Service Request (ASR) installation and configuration problems (Doc ID 1909589.1)
Log in as ilom-admin to the switch or gateway.
Activate a "test" procedure on the previously activated rule.
The previously activated rule should already be pointing to the ASR Manager (see instruction on "Set the unused rule to point to the ASR Manager" as above).
cd /SP/alertmgmt/rules/2
set testrule=true
Verify that the test ASR event has been created and sent.