Asset ID: 1-71-1008327.1
Update Date: 2017-02-28
Solution Type
Technical Instruction Sure
Solution 1008327.1: How to validate Sun Storage 6000, 2500 and Flexline Array Controller Out of Band Communication
Related Items
- Sun Storage 6580 Array
- Sun Storage 6180 Array
- Sun Storage Flexline 280 Array
- Sun Storage 2510 Array
- Sun Storage 2540 Array
- Sun Storage 2540-M2 Array
- Sun Storage 6780 Array
- Sun Storage 6140 Array
- Sun Storage 2530 Array
- Sun Storage 2530-M2 Array
- Sun Storage Flexline 380 Array
- Sun Storage 6540 Array
- Sun Storage Flexline 240 Array
- Sun Storage 6130 Array
Related Categories
- PLA-Support>Sun Systems>DISK>Arrays>SN-DK: 6140_6180
PreviouslyPublishedAs
211394
Applies to:
Sun Storage 6780 Array - Version Not Applicable and later
Sun Storage 2510 Array - Version Not Applicable and later
Sun Storage 2540 Array - Version Not Applicable and later
Sun Storage 6140 Array - Version Not Applicable and later
Sun Storage 2530 Array - Version Not Applicable and later
All Platforms
Goal
The purpose of this document is to determine whether SANtricity, Common Array Manager (CAM), or other management-capable software for the 2500, 6000, and Flexline array family can communicate properly with each RAID controller's IP-based management port.
Symptoms include but are not limited to:
- Unresponsive array in Sun StorageTek SANtricity (SANtricity).
- Array Communication Out of Band (OOB) event in Sun StorageTek Common Array Manager (CAM).
- Cannot Register Array in CAM.
Solution
Please validate that each troubleshooting step below is true for your environment. Each step will provide instructions via a link to a document, for validating the step and taking corrective action as necessary.
The steps are ordered in the most appropriate sequence to isolate the issue and identify the proper resolution. Please do not skip a step.
A. Verify communication with the array controllers.
With CAM you can use the 'service' command to validate the communication. If you are using SANtricity Storage Manager, go to Step B.
Location for 'service' command :
Solaris : /opt/SUNWsefms/bin/
Windows : C:\Program Files\Sun\Common Array Manager\Component\fms\bin
Linux : /opt/sun/cam/private/fms/bin
Syntax :
service -d <array_name> -c contact [-t <a|b>]
Example :
# ./ras_admin device_list
Monitored On Device Type IP Address WWN Active ASR
------------ --------------- ---- ------------ ---------------- ------ ---
mycamhost my6180 6180 10.30.16.158 200400a0b8168e11 Y N
# ./service -d my6180 -c contact
Executing the contact command on my6180
Attempting to contact the array using the following address(es):
10.30.16.158
10.30.16.159
Controller A is accessible via:
10.30.16.158 (oob)
Controller B is accessible via:
10.30.16.159 (oob)
Completion Status: Finished
- If you are able to contact all controllers in the array, then you have validated communication, and no further work is required.
- If one or both controllers cannot be contacted, continue to Step C.
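When Step A is repeated for several arrays, the contact output can be condensed with a small filter. The following is a minimal sketch that assumes the output layout matches the example above; `summarize_contact` is a hypothetical helper name and is not part of CAM. A controller that does not appear in the summary should be treated as not contacted.

```shell
#!/bin/sh
# summarize_contact: read 'service -c contact' output on stdin and print one
# line per reachable address, for example "A 10.30.16.158 (oob)".
# Assumption: the output layout matches the example shown in Step A.
summarize_contact() {
    awk '
        # Remember which controller the following address lines belong to.
        /^Controller [A-Z] is accessible via:/ { ctrl = $2; next }
        # Address lines are only reported when they follow a controller header.
        /^ *[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+/ { if (ctrl != "") print ctrl, $1, $2; next }
        # Any other line ends the current controller section.
        { ctrl = "" }
    '
}

# Usage (run from the CAM bin directory listed above):
#   ./service -d my6180 -c contact | summarize_contact
```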
B. Telnet to both controller IPs on port 2463
Telnet to the IP address of both Controller A and Controller B on port 2463.
For arrays that are running firmware 06.70.54.11 or lower, a normal telnet connection will last a full sixty (60) seconds. This can include the following arrays:
- Sun StorEdge 6130
- Sun StorageTek 6140
- Sun StorageTek 6540
- Sun StorageTek 2510
- Sun StorageTek 2530
- Sun StorageTek 2540
- StorageTek Flexline 240
- StorageTek Flexline 280
- StorageTek Flexline 380
For arrays that are running firmware 07.10.25.10 or above, a normal telnet connection will last a full five minutes (300 seconds). This can include the following arrays:
- Sun StorageTek 6140
- Sun StorageTek 6540
- Sun StorageTek 2510
- Sun StorageTek 2530
- Sun StorageTek 2540
- Sun Storage 2530-M2
- Sun Storage 2540-M2
- Sun Storage 6180
- Sun Storage 6580
- Sun Storage 6780
- StorageTek Flexline 380
NOTE 1: It is not absolutely necessary to know the firmware version of an array. If the telnet session lasts longer than 60 seconds, keep timing the connection until it closes on its own.
Example (Solaris) :
# time telnet 10.30.16.158 2463
Trying 10.30.16.158...
Connected to 10.30.16.158.
Escape character is '^]'.
Connection to 10.30.16.158 closed by foreign host.
real 5:02.4
user 0.0
sys 0.0
Important notes about the above example:
After running the command, the initial output "Connected to <IP address>. Escape character is '^]'" should be displayed, and the session will then wait (appear to hang).
PLEASE LEAVE THE COMMAND RUNNING. Do not enter anything into the telnet session.
Eventually (after up to 5 minutes) the connection should close automatically with the message "Connection to <IP address> closed by foreign host".
After this, the time command reports how long the connection remained open. This is the required information to gather.
- If both controllers hold the connection for a full 60 or 300 seconds, the array is capable of establishing a connection via the management interface. Check that the management software can communicate with the array. Reference <Document 1007046.1>: Sun Storage 2500, 2500-M2, 6000 and Flexline Arrays: Troubleshooting Management Communication Faults
- If one controller maintains a full 60 second connection and the other cannot, the controller that cannot communicate may be either offline or without a network connection. Check that the management software can register the array using the IP address of the controller that can establish a connection on port 2463, and check the status of the alternate controller. Reference <Document 1007046.1>: Sun Storage 2500, 2500-M2, 6000 and Flexline Arrays: Troubleshooting Management Communication Faults
- If both controllers cannot stay connected for a full 60 seconds, this indicates that both of the controllers are going through a boot cycle/loop. Go to Step J.
- If you get 'Connection Refused', there is a route to the array, but the management service on the controller is not listening. Go to Step D.
- If you get 'nodename nor servname provided, or not known' you do not appear to have a route to the IP specified. Go to Step C.
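The timing-based outcomes of Step B can be sketched as a small decision helper. This is an illustration only: the function names are hypothetical, a session of at least 60 seconds is treated as "full" (the shorter of the two normal session lengths), and the 'Connection Refused' and name-resolution cases are not covered because they produce no timing value.

```shell
#!/bin/sh
# to_seconds: convert a 'time' value such as "5:02.4" or "58.9" to whole
# seconds (fraction truncated).
to_seconds() {
    echo "$1" | awk -F: '{ s = 0; for (i = 1; i <= NF; i++) s = s * 60 + $i; printf "%d\n", s }'
}

# held_full_session: succeed if the connection stayed open for at least 60 s.
held_full_session() {
    [ "$(to_seconds "$1")" -ge 60 ]
}

# step_b_result: map the measured durations for controller A ($1) and
# controller B ($2) to the outcome described in the Step B list.
step_b_result() {
    if held_full_session "$1" && held_full_session "$2"; then
        echo "both controllers hold a session: check the management software (Document 1007046.1)"
    elif held_full_session "$1" || held_full_session "$2"; then
        echo "one controller holds a session: register via that controller and check the other"
    else
        echo "neither controller holds a session: go to Step J"
    fi
}

# Usage, with the 'real' values reported by time for each controller:
#   step_b_result 5:02.4 5:01.9
```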
C. Ping both controller IPs
Use ping from the management host to ping the IP address of both Controller A and Controller B.
Syntax :
ping <ip-address>
Example :
# ping 10.30.16.158
10.30.16.158 is alive
# ping 10.30.16.159
10.30.16.159 is alive
- If you can ping an array controller, this may indicate that port 2463 is either blocked or not listening on the controller. Please continue to Step J.
- If you cannot ping one or both array controllers, continue to Step D.
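The ping check for both controllers can be scripted as in the sketch below. It is an assumption-laden illustration: `check_controllers` and the `PING_CMD` variable are hypothetical names, and the default probe is Linux-style `ping -c 1`. On Solaris, plain `ping` (as in the example above) already probes once and returns, so `PING_CMD=ping` would be the likely setting there.

```shell
#!/bin/sh
# check_controllers: probe each controller IP given as an argument and print
# the next step suggested by Step C.
# PING_CMD is the probe command; assumption: defaults to "ping -c 1".
check_controllers() {
    unreachable=0
    for ip in "$@"; do
        # PING_CMD is intentionally unquoted so its options are split into words.
        if ${PING_CMD:-ping -c 1} "$ip" >/dev/null 2>&1; then
            echo "$ip is reachable"
        else
            echo "$ip is NOT reachable"
            unreachable=$((unreachable + 1))
        fi
    done
    if [ "$unreachable" -eq 0 ]; then
        echo "next: Step J (port 2463 may be blocked or not listening)"
    else
        echo "next: Step D"
    fi
}

# Usage:
#   check_controllers 10.30.16.158 10.30.16.159
```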
D. Validate the RAID Controller Fault LED
Using the location table above, validate whether the fault LED is on or off for a given controller.
Observe the LED for approximately 2 minutes, to ensure that the fault status is stable.
For 6130, FLX240, FLX280, D-Series, B-Series, and 2500 Storage Arrays:
- If the Amber LED is lit for a full two minutes, this indicates that the controller in question is in an offline status. Go to Step J.
- If the Amber LED is lit, but cycles between Off and On states this indicates that the controller in question may be continually rebooting, and require service intervention. Go to Step J.
- If the Amber LED is off, go to Step E.
For 6140, 6540, 6180, 6580, and 6780:
- If the Amber LED is lit for a full two minutes, this indicates that the controller in question is in an offline status. Go to Step J.
- If the Amber LED is lit, but cycles between Off and On and the seven-segment tray ID is constantly changing, this indicates that the controller in question may be continually rebooting, and require service intervention. Go to Step J.
- If the Amber LED is off, go to Step E.
E. Validate physical network link
Check the Ethernet cables to ensure that there is no visible damage and that they are securely connected.
Check that link LEDs on network ports are lit and green.
Be advised that there is a closed CR (Bug 16241591) for CAM 6.9.0.20: when the array controllers' IP addresses are under Network Address Translation (NAT) control, CAM cannot manage the array. Arrays under NAT did work in CAM 6.8. This bug will remain unresolved.
- If physical connection is good, continue to Step F.
- If the physical connection is not good, do one of the following, then repeat Steps A through E:
- Replace Cable
- Swap port on Ethernet Router/HUB/Switch
- Directly connect using a Cross-Over Cable.
NOTE 2: Do not change ethernet ports on the array. Each one uses its own IP address, and may not be set properly for your LAN.
F. Validate that ethernet switch/router is set to Auto-Negotiate
The method for this differs, based on Ethernet switch model and vendor.
Please contact your switch/router vendor for documentation and support on this.
- If auto-negotiate is set, continue to Step G.
- If auto-negotiate is not set, set it on your switch, reset the network link (a cable pull is fine), and repeat Steps A through F.
While auto-negotiation is a best practice, there are cases where forcing a fixed setting, such as 100 Mb/s full duplex, works better for a particular customer with a particular switch. In those cases the switch may be part of the problem, but if the fixed setting works it can serve as a temporary workaround.
G. Validate current array controller network settings via serial port
For arrays running 06.16.81.10 or later, the array has a Service Menu which is accessible via the serial port on the array.
Use this serial connection to access the service interface, and check the following settings on each controller :
- DHCP Setting - ON or OFF?
- IP Address
- Subnet Netmask
- Gateway IP Address
- For 2500 Arrays, Reference Sun StorageTek 2500 Series Array Hardware Installation Guide.
- For 2500-M2 Arrays, Reference Sun Storage 2500-M2 Arrays Hardware Installation Guide.
- For 6130/FLX240/FLX280 Arrays, there was no service menu until after 06.16.81.10. The serial settings are: 19200 baud, 8-bit, No Parity, 1 stop bit, No flow control. The menu is identical to any of the arrays in this list.
- For 6140 Arrays, Reference Sun StorageTek 6140 Array Getting Started Guide.
- For 6540/FLX380 Arrays, Reference Sun StorageTek 6540 Array Hardware Installation Guide.
- For 6580/6780 Arrays, Reference Hardware Installation Guide for Sun Storage 6580 and 6780 Arrays.
- For 6180 Arrays, Reference Sun Storage 6180 Array Hardware Installation Guide.
NOTE 3: For 6140, 6180, 6540, 6580, 6780 and FLX380 Arrays, Network Port 1 is the INNER MOST port on the controller. Port 2 is the OUTER MOST port on the controller.
NOTE 4: The 6130/FLX240/FLX280 uses an RS232 Null Modem Cable.
- If your array has firmware below 06.16.81.10, go to Step J.
- If you are using DHCP, continue to Step H.
- If the network settings are wrong, change your network settings as necessary to work on your LAN, and repeat Steps A through D.
- If a gateway is set, make sure that you can ping the gateway address from the management host. If you cannot, correct this issue, and go to Step A.
- If no changes were made and the port settings are correct, or if you cannot get a serial connection, go to Step I.
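One quick way to sanity-check the values read from the service menu is to confirm that the controller IP and the gateway IP fall in the same subnet for the configured netmask. The sketch below is illustrative only: `ip_to_int` and `same_subnet` are hypothetical helpers, and the gateway and netmask in the usage line are made-up example values, not settings taken from this document.

```shell
#!/bin/sh
# ip_to_int: convert a dotted-quad IPv4 address (no leading zeros) to an integer.
ip_to_int() {
    _old_ifs=$IFS
    IFS=.
    set -- $1            # split the address into its four octets
    IFS=$_old_ifs
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# same_subnet: succeed if address $1 and address $2 share the same network
# under netmask $3.
same_subnet() {
    _a=$(ip_to_int "$1")
    _b=$(ip_to_int "$2")
    _m=$(ip_to_int "$3")
    [ $(( _a & _m )) -eq $(( _b & _m )) ]
}

# Usage (gateway and netmask are hypothetical example values):
#   same_subnet 10.30.16.158 10.30.16.1 255.255.255.0 && echo "gateway is on-link"
```

A gateway that is not in the controller's subnet is one of the "wrong network settings" cases to correct before repeating the earlier steps.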
H. Validate DHCP server
Verify that your DHCP server does not require the array to release and renew its IP address. If it does, you will want to either:
- Set a static IP on the controller
- Set a static IP on the DHCP server
- Accept that you will have to search a specific IP range and re-register the array every time you need to monitor it. The management software has no way to pick up a new IP address that the DHCP server assigns to the array. DHCP was originally intended as a way to discover an array for the first time without setting up a private LAN, as the controllers use BOOTP by default at startup.
- If you changed your IP to static and made new settings, repeat Step A.
- If you have a dynamic IP, repeat Step A using the current IP table from your DHCP server.
- If you are using the dynamically assigned IP for Steps A through C, go to Step I.
- If you are not using DHCP, go to Step I.
I. Verify known issues documented in existing Alerts
- Review Alert <Document 1019498.1>: Sun StorageTek 25x0 and 6140 Arrays may send wrong Network packets, causing the Sun StorageTek Common Array Manager (CAM) Host to Lose the Network Connection to the Array.
- If you are under the affected products list, please attempt to either clear the ARP table for the network port(s) the controller(s) is/are plugged into, or reboot the network switch, and repeat Step A.
- If you are not under the affected products list of the Sun Alert, continue with Step I-2 below.
- Review Alert <Document 1000651.1> Sun StorageTek 6130/6140/6540 and Sun StorageTek Flexline 240/280/380 Arrays May Experience Loss of Network Access From Management Host.
- If you are under the affected products list, apply the workaround or resolution described in the Alert, and repeat Step A.
- If you are not under the affected products list of the Sun Alert, continue to Step J.
- Review Alert <Document 1598709.1> Sun Storage Common Array Manager (CAM): Applying Patch 147416-02/147419-02 May Cause Alarm "Lost out-of-band communication".
- If you are under the affected products list, apply the workaround or resolution described in the Alert, and repeat Step A.
- If you are not under the affected products list of the Sun Alert, continue to Step J.
J. Data Collection
Please collect and compile the following information:
- Results of Step A for both controllers
- Results of Step B for both controllers
- Results of Step C for both controllers
- Results of Step D for both controllers
- Results of Step F for both controllers
- Results of Step G for both controllers
- IP address, netmask, and gateway setting of management host
- IP address, netmask, and gateway setting of Controller A
- IP address, netmask, and gateway setting of Controller B
- Which Network port is connected to the LAN from each controller
- Array Model
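Part of this list can be gathered on the management host with a short script. The following is a rough sketch, not a supported collection tool: `collect_step_j` is a hypothetical name, the arguments are example values, and the choice between `ip` and `ifconfig`/`netstat` depends on what the host provides. The step results and the controller netmask and gateway settings still need to be added by hand.

```shell
#!/bin/sh
# collect_step_j: print a plain-text report skeleton for Step J.
# Arguments: array model, Controller A IP, Controller B IP.
collect_step_j() {
    echo "Array model: $1"
    echo "Controller A IP: $2"
    echo "Controller B IP: $3"
    echo "Collected on: $(uname -n) at $(date)"
    echo "== Management host network settings =="
    if command -v ip >/dev/null 2>&1; then
        ip addr show 2>&1 || true       # addresses and netmasks
        ip route show 2>&1 || true      # gateway
    elif command -v ifconfig >/dev/null 2>&1; then
        ifconfig -a 2>&1 || true        # addresses and netmasks
        netstat -rn 2>&1 || true        # gateway
    else
        echo "no ip or ifconfig command found; record the settings manually"
    fi
}

# Usage:
#   collect_step_j 6180 10.30.16.158 10.30.16.159 > step_j_report.txt
```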
References
<NOTE:1007046.1> - Sun Storage 2500, 2500-M2, 6000 and Flexline Arrays: Troubleshooting Management Communication Faults
<NOTE:1019498.1> - Sun StorageTek 25x0 and 6140 Arrays May Send Wrong Network Packets, Causing the Sun StorageTek Common Array Manager (CAM) Host to Lose the Network Connection to the Array
<NOTE:1598709.1> - Sun Storage Common Array Manager (CAM): Applying Patch 147416-02/147419-02 May Cause Alarm "Lost out-of-band communication"