Asset ID: 1-75-1007046.1
Update Date: 2017-10-05
Solution Type: Troubleshooting Sure Solution
1007046.1: Sun Storage 2500, 2500-M2, 6000 and Flexline Arrays: Troubleshooting Management Communication Faults
Related Items
- Sun Storage 6180 Array
- Sun Storage Flexline 280 Array
- Sun Storage 2540-M2 Array
- Sun Storage Common Array Manager (CAM)
- Sun Storage 2540 Array
- Sun Storage 2510 Array
- Sun Storage 6140 Array
- Sun Storage Flexline 210 Array
- Sun Storage 2530-M2 Array
- Sun Storage 2530 Array
- Sun Storage Flexline 380 Array
- SANtricity Storage Manager
- Sun Storage 6540 Array
- Sun Storage Flexline 240 Array
- Sun Storage 6130 Array
Related Categories
- PLA-Support>Sun Systems>DISK>Arrays>SN-DK: 6130
- _Old GCS Categories>Sun Microsystems>Storage Software>Modular Disk Device Software
Previously Published As
209726
Applies to:
Sun Storage 6140 Array - Version Not Applicable and later
Sun Storage Flexline 210 Array - Version Not Applicable and later
Sun Storage 2530-M2 Array - Version Not Applicable to Not Applicable [Release N/A]
SANtricity Storage Manager - Version 10.50 and later
Sun Storage 2530 Array - Version Not Applicable and later
All Platforms
Purpose
This document assists in the identification and resolution of communication issues between Sun Storage Arrays and Sun Storage Common Array Manager (CAM) or SANtricity.
Symptoms include:
- ASR Summary with SCRK:oob Component Name:OutOfBand Id:oob
- ASR Summary with ASR:oob
- ASR Summary with ASR:ib
- CAM Alert or ASR event - xx.12.31 xxx.CommunicationLostEvent.oob
- CAM Alert or ASR event - xx.12.21 xxxx.CommunicationLostEvent.ib
- Failure to register an array in SANtricity or CAM
- Array is listed as Unresponsive in the Enterprise Management window in SANtricity
- Array is listed as Unresponsive in the Array Summary Page in CAM
Troubleshooting Steps
Validate each troubleshooting step below against your environment. Each step links to a document with instructions for validating the step and taking corrective action as necessary. The steps are ordered to isolate the issue and identify the proper resolution, so do not skip any step.
A. Verify whether the issue is based on an Alert/Alarm, or a problem with Registration/Adding.
- If the issue is based on an Alert/Alarm received from your management host, or observed in the Array Summary (CAM) or Enterprise Management window (SANtricity), go to Step B.
- If the issue is based on a problem with registering/adding the array to the management software, go to Step C.
B. Verify whether the array is being managed in band or out of band.
For CAM:
In the Array Summary window, the management type is listed in parentheses as In-Band or Out-of-Band. Alternatively, the alarm code, which has the form XX.12.YY, indicates whether the array is managed in band or out of band: XX is the array type, and YY is 21 for In-Band or 31 for Out-of-Band.
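The alarm-code rule above can be sketched as a small shell function. This is only an illustration: the function name `mgmt_type_from_alarm` is hypothetical, and the array-type prefix used in the usage line is a made-up value, not a real array type code.

```shell
# Map a CAM alarm code of the form XX.12.YY to a management type.
# Per the text above: YY = 21 means In-Band, YY = 31 means Out-of-Band.
# The function name is hypothetical; it is not part of CAM.
mgmt_type_from_alarm() {
    code=$1
    case "$code" in
        *.12.21) echo "in-band" ;;
        *.12.31) echo "out-of-band" ;;
        *)       echo "unknown"; return 1 ;;
    esac
}
```

For example, `mgmt_type_from_alarm 99.12.31` classifies the alarm as out-of-band, which leads to Step C; any code that does not end in .12.21 or .12.31 is reported as unknown.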
For SANtricity via GUI:
This is displayed in the Enterprise Management window under management connections, as either Out of Band or In Band.
NOTE: There is no easy way to identify whether the array is managed in-band or out-of-band via the CLI for either application.
- If the array is being managed Out-of-Band, go to Step C.
- If the array is being managed In-Band, go to Step D.
C. Validate that you can communicate with each array controller out of band
Reference: <Document 1008327.1> How to validate Sun Storage 6000, 2500 and Flexline Array Controller Out of Band Communication
- If the controllers can be communicated with properly, but the array still cannot be registered, continue to Step F.
- If the controllers can be communicated with properly, but the array still shows up as unresponsive, go to Step E.
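As a rough first pass before working through the referenced document, each controller address can be probed from the management host. The sketch below is an assumption, not the procedure from Document 1008327.1: the function name `check_controllers` and the `PROBE` variable are hypothetical, the default probe assumes a Linux-style `ping -c 1` (Solaris ping takes different arguments), and the addresses in the usage line are documentation placeholders. A reachable address does not by itself prove that the management software can talk to the controller.

```shell
# Command used to probe one address. The default is an assumption
# (Linux-style ping); override PROBE for other platforms.
PROBE=${PROBE:-"ping -c 1"}

# Probe every controller address given as an argument.
# Prints one line per address and returns non-zero if any probe failed.
check_controllers() {
    failed=0
    for addr in "$@"; do
        if $PROBE "$addr" >/dev/null 2>&1; then
            echo "$addr reachable"
        else
            echo "$addr UNREACHABLE"
            failed=1
        fi
    done
    return "$failed"
}
```

A call such as `check_controllers 192.0.2.10 192.0.2.11` checks both controllers in one pass, so a failure on only one of them is visible immediately.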
D. Validate that you can communicate with each array controller in band
Reference: <Document 1021058.1> Validating Sun StorageTek[TM] 2500, 6000, and Flexline Array Controller In Band Proxy Agent Communication
- If the array and the in-band agent can be communicated with properly, but the array still cannot be registered, go to Step F.
- If the array and the in-band agent can be communicated with properly, but the array still shows up as unresponsive, go to Step E.
E. Validate array status after restarting CAM or SANtricity services
CAM
Solaris 10 : svcadm restart svc:/system/fmservice:default
Solaris 8,9: /opt/SUNWsefms/sbin/fmservice.sh restart
Linux : /opt/sun/cam/private/fms/sbin/fmservice.sh restart
Windows : Use control panel to restart Sun_STK_FMS
Then check status:
Solaris 10 : svcs svc:/system/fmservice:default
Solaris 8,9: /opt/SUNWsefms/sbin/fmservice.sh status
Linux : /opt/sun/cam/private/fms/sbin/fmservice.sh status
Windows : Use control panel to check status of Sun_STK_FMS
Status should be online.
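The per-platform commands above can be collected into one lookup, which may help when the same check is scripted across several management hosts. This is a minimal sketch: the function name `fms_command` and the platform keys (`solaris10`, `solaris8`, `solaris9`, `linux`) are hypothetical labels, while the command strings are the ones listed above. Windows is left out because the service is managed through the control panel there. The function only prints the command and does not run it.

```shell
# Print the CAM fault management service command for a platform.
#   $1 = platform key (hypothetical labels): solaris10 | solaris8 | solaris9 | linux
#   $2 = action: restart | status
fms_command() {
    platform=$1
    action=$2
    case "$action" in
        restart|status) ;;
        *) echo "unsupported action: $action" >&2; return 2 ;;
    esac
    case "$platform" in
        solaris10)
            if [ "$action" = restart ]; then
                echo "svcadm restart svc:/system/fmservice:default"
            else
                echo "svcs svc:/system/fmservice:default"
            fi
            ;;
        solaris8|solaris9)
            echo "/opt/SUNWsefms/sbin/fmservice.sh $action"
            ;;
        linux)
            echo "/opt/sun/cam/private/fms/sbin/fmservice.sh $action"
            ;;
        *)
            echo "unsupported platform: $platform" >&2; return 2 ;;
    esac
}
```

Printing rather than executing keeps the sketch safe to try on any host; the printed command can then be reviewed and run with the appropriate privileges.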
SANtricity
Closing the Enterprise Management window to exit the application, and then launching it again, accomplishes this.
- If the array is still unresponsive, or shows an alert/alarm, go to Step F.
- If the array alert/alarm is gone, you have corrected the problem and no further action is required.
F. Re-register the array
If possible, remove the array from CAM or SANtricity and register it again. Try each controller IP address in turn during registration.
- If you can register the array, continue to Step G.
- If you cannot register the array, using either IP address, continue to Step H.
G. Validate whether issue is intermittent or not
- If the array status alternates between having a communication issue and communicating normally, check whether your CAM version is 6.4.1 or earlier. Those versions have issues with long-running array jobs and with the scripting client that are addressed in 6.5 and later.
- If the array status alternates between having a communication issue and communicating normally, check your network for the following traits:
- Whether your management LAN is a public rather than a private LAN. A public LAN exposes the network software on the array controllers to attack, and network congestion can cause the poll to fail.
- Whether any type of port scanning is taking place on the LAN. Port scanning can cause TCP port connections to max out, which will result in a failed poll of the array.
- If you suspect either of the above issues, and your connection is intermittent, try increasing the polling interval.
CAM
- Click General Configuration
- Click Health Monitoring
- Change Monitoring Frequency
- Click Save
By default, CAM polls every five (5) minutes and retries twice, so after fifteen (15) minutes without a response it raises an alarm for loss of communication.
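The default figures above fit a simple relationship: one initial poll plus the retries, each one polling interval apart. The sketch below works that arithmetic out. It assumes the delay keeps scaling the same way when the monitoring frequency is changed, which matches the stated defaults but is not confirmed here; the function name `minutes_until_alarm` is hypothetical.

```shell
# Estimate the minutes of lost communication before CAM raises an alarm.
#   $1 = polling interval in minutes
#   $2 = number of retries after the first failed poll
# Assumption: delay = interval * (retries + 1).
minutes_until_alarm() {
    interval=$1
    retries=$2
    echo $(( interval * (retries + 1) ))
}
```

With the defaults, `minutes_until_alarm 5 2` gives 15, the figure quoted above. Under the same assumption, a ten-minute interval with two retries would delay the alarm to 30 minutes, which is the trade-off to weigh when increasing the interval.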
SANtricity
You cannot tune the polling interval in SANtricity Storage Manager.
If your issue is not intermittent, or if tuning the polling interval has not helped, continue to Step H.
H. Collect Data
At this point, if you have validated that each troubleshooting step above is true for your environment, and the issue still exists, further troubleshooting is required.
- If possible, collect CAM array support data (this will not be available if the array cannot be communicated with). Reference: <Document 1002514.1> Collecting Support Data for Arrays Using Sun StorageTek[TM] Common Array Manager
- If possible, collect SANtricity support data (this will not be available if the array cannot be communicated with). Reference: <Document 1014074.1> Collecting Support Data for Arrays Using Sun StorageTek[TM] SANtricity Storage Manager
- If using CAM, collect CAM host support data. Reference: <Document 1021091.1> Collecting Sun Storage Common Array Manager Host Support Data
- Provide a network map of the management LAN
- Provide a network map of the in-band management if applicable
- Provide Polling Interval
- Indicate whether array is on a public or private LAN
- Indicate whether array has a Static or DHCP assigned IP address
- Indicate which of the above steps were attempted
- Provide host type of management software
- Provide Array Model
- Provide Management Software name and version
Then contact Support.
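The non-file items in the list above can be gathered into a plain-text template before contacting Support. This is a convenience sketch only: the function name `support_checklist` and the field labels are hypothetical wording based on the list, not a format that Support requires.

```shell
# Print a fill-in template covering the items requested in Step H.
# Field labels are illustrative; adjust them as needed.
support_checklist() {
    cat <<'EOF'
Array model:
Management software name and version:
Host type of management software:
Polling interval:
Array LAN (public or private):
Controller IP assignment (static or DHCP):
Troubleshooting steps attempted (A-H):
Network map of management LAN attached (yes/no):
Network map of in-band management attached (yes/no/not applicable):
EOF
}
```

Redirecting the output to a file, for example `support_checklist > array-case-notes.txt`, gives a starting point that can be filled in alongside the collected support data.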
Do you still have questions? You can use My Oracle Support Communities. Communities put you in touch with industry professionals like yourself. They are monitored by Oracle support engineers, so you can expect reliable and correct answers. Ask questions and see what others are asking about in the Disk Storage 2000, 3000, 6000 RAID Arrays & JBODs Community.
This document contains normalized content and is managed by the Domain Lead(s) of the respective domains. To notify content owners of a knowledge gap contained in this document, and/or prior to updating this document, please add a comment to the document and it will be processed.
Most customer issues are resolved by following Step C or Step D. The remaining few involve either CAM services issues or an intermittent problem on the customer's network.
Due to CR6830106, running several long-running jobs, or using the sscs scripting client for multiple operations in succession, can cause the array to stop communicating for a period of time. Upgrade to release 6.5 or later and re-evaluate your scripts/jobs.
Ensure that customers are running the latest version of Common Array Manager.
Previously Published As
91322
References
<NOTE:1008327.1> - How to validate Sun Storage 6000, 2500 and Flexline Array Controller Out of Band Communication
<NOTE:1014074.1> - Collecting Support Data for Arrays Using Sun StorageTek SANtricity Storage Manager
<NOTE:1021058.1> - Validating Sun StorageTek[TM] 2500 and 6000 Array Controller In Band Proxy Agent Communication
<NOTE:1002514.1> - Collecting Sun Storage Common Array Manager Support Data for Arrays
Attachments
This solution has no attachment