Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition
Solution Type: Sun Alert Sure Solution

1019002.1 : Collecting Support Data On Certain Arrays May Cause One or Both Array Controllers to Reboot

Previously Published As: 231741
Bug Id: <BUG:15433279>
Product:
Sun StorageTek Flexline 240 Array
Sun StorageTek Flexline 280 Array
Sun StorageTek Flexline 380 Array
Sun StorageTek 6130 Array
Sun StorageTek 6140 Array
Sun StorageTek 6540 Array
Date of Workaround Release: 15-Feb-2008
Date of Resolved Release: 16-Jun-2008

***Checked for relevance 13-Jan-2014***

1. Impact

For Sun StorageTek 6130, 6140, 6540 and Flexline 240, 280, and 380 arrays, collecting support data may cause one or both array controllers to reboot. Results can range from device path failovers to loss of access due to both primary and secondary paths to a device being inaccessible.

2. Contributing Factors

This issue can occur on the following platforms and in the following releases:
This issue can only occur during array support data collection using SANtricity or Common Array Manager (CAM), specifically during the "Capture State" function of the collection, which generates the stateCaptureData.dmp file. The following specific circumstances lead to this bug:

1) The Host Port/HBA must be, or have been, seen in the SAN by one or both controllers. This can be verified if it is listed in the WWPN drop-down in CAM, or as an un-aliased Host Port in SANtricity. This means that "fake" WWPNs that are aliased on the array, or proactively created before the SAN is zoned, WILL NOT cause this issue.

2) The Host Port/Initiator must have an existing alias. This means that not only has the HBA been seen by the array, but at some point the user created an alias for it to assign to a specific host for Volume-to-LUN mapping on the array.

3) Since having been aliased, as in condition 2, the Host Port/Initiator must have either:
A) been removed (un-aliased) and placed in the "Free" list (available as a drop-down in CAM), or
B) had the alias name changed
4) After conditions 1-3 are met, collecting "All Support Data", or specifically the "State Capture" (SANtricity only), will cause the controllers to panic.

3. Symptoms

At a minimum, users will see device path failovers. At worst, they will see a loss of access due to both primary and secondary paths to a device being inaccessible.

This issue can only occur during array support data collection. The most common symptom is host message events showing SCSI or Fibre Channel messages that indicate loss of access to one or both array controllers. SANtricity or CAM may throw an error during collection, similar to the following (CAM error shown):

ionShow 99 fails!

The controller(s) will reboot because a defined initiator or host port is not seen by the array at the time the data collection command is run.

4. Workaround

This issue can be avoided by not collecting support data when you know that:
A) An Initiator/Host Port alias for a connected HBA has been changed
B) An Initiator/Host Port alias has been deleted from the array. This is normal if you replace/remove an HBA as part of reallocation or hardware replacement.

If either of the above applies, consider performing a controller reset to avoid an unplanned outage. Otherwise, individual collections of "Alarms", "Array Profiles", and "Major Event Logs" are possible within each of the management utilities.

The following CLI commands can be run to obtain support data for CAM:

service -d <array_name> -c print -t arrayprofile > arrayProfileSummary.txt

The following SMcli or script editor commands can be run under SANtricity:
5. Resolution

This issue is addressed in the following releases:
The above firmware releases are available with CAM 6.1.0 at:

http://www.sun.com/download/index.jsp

View by Category -> Systems Administration -> Storage Management

Modification History
16-Jun-2008: Updated Contributing Factors and Resolution sections; Now Resolved
14-Jul-2008: Updated Workaround section for corrections
13-JAN-2014: Checked for currency/relevance/formatting; no change in content

Internal Comments
Questions regarding this document should be addressed to sunalertpublication_us_grp@oracle.com, with a copy to the responsible engineer/submitter listed below.

Internal Contributor/submitter: siegfried.hepp@oracle.com
Internal Eng Responsible Engineer: rich.floyd@oracle.com
Internal Services Knowledge Engineer: david.mariotto@oracle.com

Note: Firmware 07.10.25.10 is the next revision up, but is not bundled with CAM. In order to obtain this firmware, a call with Sun Support is required. If stateCapture is required, please escalate the issue, and a tailored serial collection sequence will be made available.

Controllers will report an exception in the exception log (excLogShow) as in the following example:

---- Log Entry #33 OCT-10-2007 08:27:55 AM ----
Exception: Invalid Opcode
pc: 0x00000780 (Unknown Program Counter)
Registers:
edi = 1e330955 esi = 0 ebp = 1d549f08 esp = 1d549ec4
ebx = 14 edx = 0 ecx = 8 eax = 0
eflags = 246 pc = 780
Stack Trace:
53cdba vxTaskEntry +a : vkiTask (1d54a8cc, 0, 0, 0, 0, 0, 0, 0, 0, 0)
4bb6eb vkiTask +bb : srcOpTask (1e33b2f0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1d54a6c0, 4bb640, 0, 0, eeeeeeee, 1d54a6a4, ...)
1d9c325b srcOpTask +db : cmdProcess ([17b651d8, ffffffff, 1d54a640, 297, 1d54a8cc, ...]) ???
1dbb32ff cmdProcess +6f : symSYMbolCommandHandler (17b651d8, &cmdE0, 0, 53cf13, 17b651d8, 3429334d)
1dd9fd54 symSYMbolCommandHandler+44 : svciov_dispatch (17a8ba10, 17b651d8, 1d54a5a4, &cmdE0)
1ddb26bf svciov_dispatch +3cf : stateCapture_1 ([17a7b780, 17b651d8, 54, 17a7b780, &cmdE0, ...]) ???
1dd9dd91 stateCapture_1 +41 : systemStateCapture ([17a7b75c, 1, a010241, 1ddb28b9, 0, ...]) ???
1ddf5737 systemStateCapture +6b7 : ionShow (63, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
1e0defa2 ionShow +52 : showState__CQ23ion10IonManager ([1d264434, 1d54a39c, 1d54a390, 1e0def5f, &ionShow, ...]) ???
1e0df0e6 showState__CQ23ion10IonManager+a6 : tditnShow__CQ23ion10IonManageri (1d264434, 0, 1ddf4fa0, 1d54a39c)
1e103977 tditnShow__CQ23ion10IonManageri+a7 : tditn +340 ([0, 43108e, 1d54a310, 1e1038e4, 1, ...]) ???
1e102f2e tditn +42e : tditn +70 (0, 4, 0, 1e3308c7)
********
Task Id: 0x1d54a6f0
Name: "symTask1"
Status: 0x00 (ready)
Options: 0x0004 (dealloc_stk)
Priority: 125
Stack base: 0x1d54a6f0
Stack end: 0x1d5476f0 (adjusted for name)
Stack size: 0x3000 (12288)
Stack margin: 0x17c (380)
Stack limit: 0x1d5479b8
Pend queue: 0x17a7b498
Last errno: 0x1c0001
value = 1 = 0x1
->

Internal Eng Business Unit Group: NWS (Network Storage)
Internal Escalation ID: 37959518, 37970982, 1-22943001, 65766342

References

Attachments
This solution has no attachment
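The conditions listed under Contributing Factors can be read as a simple pre-check to run before collecting support data. The following Python sketch is illustrative only: the `HostPort` record, its field names, and the functions `is_at_risk` and `risky_ports` are hypothetical and do not correspond to any CAM or SANtricity interface. It assumes the administrator has gathered each host port's history by hand from the management utility.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class HostPort:
    """Hypothetical, manually gathered history of one host port/initiator."""
    wwpn: str
    seen_by_array: bool   # condition 1: seen in the SAN by one or both controllers
    ever_aliased: bool    # condition 2: an alias was created for it at some point
    alias_removed: bool   # condition 3A: alias removed, port now in the "Free" list
    alias_renamed: bool   # condition 3B: alias name changed


def is_at_risk(port: HostPort) -> bool:
    """Return True when conditions 1, 2, and 3 (A or B) all hold for this port."""
    return (
        port.seen_by_array
        and port.ever_aliased
        and (port.alias_removed or port.alias_renamed)
    )


def risky_ports(ports: List[HostPort]) -> List[str]:
    """List the WWPNs for which a state capture should be avoided."""
    return [p.wwpn for p in ports if is_at_risk(p)]


if __name__ == "__main__":
    # Example WWPNs are made up for illustration.
    inventory = [
        HostPort("10:00:00:00:c9:00:00:01", True, True, False, True),
        HostPort("10:00:00:00:c9:00:00:02", False, True, True, False),
        HostPort("10:00:00:00:c9:00:00:03", True, True, False, False),
    ]
    flagged = risky_ports(inventory)
    if flagged:
        print("Avoid 'All Support Data' / state capture; at-risk ports:", flagged)
    else:
        print("No host port meets conditions 1-3.")
```

In this sketch the second port is not flagged because it was never seen by the array (a "fake" or pre-zoning WWPN), and the third is not flagged because its alias was neither removed nor renamed.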