Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition
Solution Type: Problem Resolution Sure Solution

2235315.1 : Solaris Panic BAD TRAP - occurred in module "qlc" due to a NULL pointer dereference
Created from <SR 3-14315005841>

Applies to:
Sun SPARC Enterprise M3000 Server - Version All Versions and later
Qlogic FC HBA - Version All Versions and later
Information in this document applies to any platform.

Symptoms
This is a Solaris 11.1 GA server with two Oracle Qlogic 8Gb FC HBAs. The server panics, and before the panic the messages log shows repeated Loop OFFLINE / ONLINE notices on one FC HBA port:
Feb 16 15:28:27 server01 qlc: [ID 439991 kern.info] NOTICE: Qlogic qlc(2,0): Loop OFFLINE
Feb 16 15:28:27 server01 qlc: [ID 439991 kern.info] NOTICE: Qlogic qlc(2,0): Loop ONLINE
Feb 16 15:28:27 server01 fp: [ID 517869 kern.info] NOTICE: fp(0): PLOGI to ef failed state=Packet Transport error, reason=No Connection
Feb 16 15:28:27 server01 fctl: [ID 517869 kern.warning] WARNING: fp(0)::PLOGI to ef failed. state=e reason=5.
Feb 16 15:28:27 server01 qlc: [ID 439991 kern.info] NOTICE: Qlogic qlc(2,0): Loop OFFLINE
Feb 16 15:28:37 server01 qlc: [ID 439991 kern.info] NOTICE: Qlogic qlc(2,0): Loop ONLINE
Feb 16 15:28:37 server01 fp: [ID 517869 kern.info] NOTICE: fp(0): PLOGI to ef failed state=Packet Transport error, reason=No Connection
Feb 16 15:28:37 server01 fctl: [ID 517869 kern.warning] WARNING: fp(0)::PLOGI to ef failed. state=e reason=5.
Feb 16 15:28:37 server01 qlc: [ID 439991 kern.info] NOTICE: Qlogic qlc(2,0): Loop OFFLINE
Feb 16 15:28:42 server01 qlc: [ID 439991 kern.info] NOTICE: Qlogic qlc(2,0): Loop ONLINE
Feb 16 15:28:42 server01 fp: [ID 517869 kern.info] NOTICE: fp(0): PLOGI to ef failed state=Packet Transport error, reason=No Connection
Feb 16 15:28:42 server01 fctl: [ID 517869 kern.warning] WARNING: fp(0)::PLOGI to ef failed. state=e reason=5.
Feb 16 15:28:42 server01 qlc: [ID 439991 kern.info] NOTICE: Qlogic qlc(2,0): Loop OFFLINE
Feb 16 15:28:42 server01 qlc: [ID 439991 kern.info] NOTICE: Qlogic qlc(2,0): Loop ONLINE
Feb 16 15:28:42 server01 qlc: [ID 439991 kern.info] NOTICE: Qlogic qlc(2,0): Loop OFFLINE
Feb 16 15:28:44 server01 unix: [ID 836849 kern.notice]
Feb 16 15:28:44 server01 ^Mpanic[cpu2]/thread=2a100577c60:
Feb 16 15:28:44 server01 unix: [ID 340138 kern.notice] BAD TRAP: type=31 rp=2a1005776a0 addr=10 mmu_fsr=0 occurred in module "qlc" due to a NULL pointer dereference
Feb 16 15:28:44 server01 unix: [ID 100000 kern.notice]
Feb 16 15:28:44 server01 unix: [ID 839527 kern.notice] sched:
Feb 16 15:28:44 server01 unix: [ID 520581 kern.notice] trap type = 0x31
Feb 16 15:28:44 server01 unix: [ID 381800 kern.notice] addr=0x10
Feb 16 15:28:44 server01 unix: [ID 101969 kern.notice] pid=0, pc=0x7b6649cc, sp=0x2a100576f41, tstate=0x9980001601, context=0x0
Feb 16 15:28:44 server01 unix: [ID 743441 kern.notice] g1-g7: 0, 64003afde000, 2427, ffffffffffefffff, 406, 0, 2a100577c60
Feb 16 15:28:44 server01 unix: [ID 100000 kern.notice]
Feb 16 15:28:44 server01 genunix: [ID 723222 kern.notice] 000002a1005773f0 unix:die+7c (31, 2a1005776a0, 10, 0, 0, 10ac000)
Feb 16 15:28:44 server01 genunix: [ID 702911 kern.notice] %l0-3: 0000000000000031 0000000001000000 0000000000002000 00000000010ac3d8
Feb 16 15:28:44 server01 %l4-7: 00000000010ac000 0000000000000000 0000000000000005 000002a1005774b0
Feb 16 15:28:45 server01 genunix: [ID 723222 kern.notice] 000002a1005774d0 unix:trap+a40 (2a1005776a0, f25a6010, 1fff, 0, 1c00, 4420f8)
Feb 16 15:28:45 server01 genunix: [ID 702911 kern.notice] %l0-3: 0000000000000010 0000000000000031 00000000c1780000 0000000000000001
Feb 16 15:28:45 server01 %l4-7: 000000007b6960e2 0000000000000005 0000000000000001 0000000000000000
Feb 16 15:28:45 server01 genunix: [ID 723222 kern.notice] 000002a1005775f0 unix:ktl0+48 (64003afde528, 0, ffffffffffffffff, 0, 64003b02cd65, 0)
Feb 16 15:28:45 server01 genunix: [ID 702911 kern.notice] %l0-3: 0000000000000002 0000000000001400 0000009980001601 000000000101acb0
Feb 16 15:28:45 server01 %l4-7: 000000007b69743c 0000000001076c00 0000000000000000 000002a1005776a0
Feb 16 15:28:45 server01 genunix: [ID 562518 kern.notice] 000002a100577740 800 (64003b134a40, 2a100577c60, 40002332000, 640062f1eb00, 22e, fffffffffeffffff)
Feb 16 15:28:45 server01 genunix: [ID 702911 kern.notice] %l0-3: 000064003afdea60 000000000000022e 0000000000000000 000000000000022f
Feb 16 15:28:45 server01 %l4-7: 0000000000000002 0000000000001170 000064003afde000 00000000ffff7c00
Feb 16 15:28:45 server01 genunix: [ID 723222 kern.notice] 000002a1005777f0 qlc:qlc_task_thread+318 (64003b134a40, 406, 100000, 2000, 64003afde000, 1000)
Feb 16 15:28:46 server01 genunix: [ID 702911 kern.notice] %l0-3: 000064003afde572 000064003afde550 000064003afde000 0000000000000406
Feb 16 15:28:46 server01 %l4-7: 0000000000000406 0000000000002427 000064003afde000 0000000000000001
Feb 16 15:28:46 server01 genunix: [ID 723222 kern.notice] 000002a1005778b0 qlc:qlc_driver_thread+2c (64003b134a40, 6400392390d8, 7b6637c4, 0, 64003afde000, 64003afde568)
Feb 16 15:28:46 server01 genunix: [ID 702911 kern.notice] %l0-3: 000064003afde000 000064003afde570 0000000000000406 0000000000000006
Feb 16 15:28:46 server01 %l4-7: 0000000000000006 0000000000000006 0000000000000000 0000000000000001
Feb 16 15:28:46 server01 genunix: [ID 723222 kern.notice] 000002a100577960 genunix:taskq_thread+3a8 (fff7fc00, 6400391e0c08, 22a93172b8, 6400391e0c3a, 6400391e0c3c, 6400392390d8)
Feb 16 15:28:46 server01 genunix: [ID 702911 kern.notice] %l0-3: 0000000000080000 0000000000010000 00006400391e0c38 0000000000000001
Feb 16 15:28:46 server01 %l4-7: 00006400391e0c28 00006400391e0c78 00006400391e0c30 00000000fffeffff
Feb 16 15:28:46 server01 unix: [ID 100000 kern.notice]
Feb 16 15:28:46 server01 genunix: [ID 672855 kern.notice] syncing file systems...
Feb 16 15:28:46 server01 genunix: [ID 904073 kern.notice] done
Feb 16 15:28:47 server01 genunix: [ID 111219 kern.notice] dumping to /dev/zvol/dsk/rpool/dump, offset 65536, content: kernel
Feb 16 15:29:01 server01 genunix: [ID 100000 kern.notice]
Feb 16 15:29:01 server01 genunix: [ID 665016 kern.notice] ^M100% done: 257334 pages dumped,
Feb 16 15:29:01 server01 genunix: [ID 851671 kern.notice] dump succeeded
Feb 16 15:31:37 server01 genunix: [ID 540533 kern.notice] ^MSunOS Release 5.11 Version 11.1 64-bit
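
The symptom pattern in the messages above (repeated Loop OFFLINE / ONLINE notices on one qlc instance, followed by a BAD TRAP in module "qlc") can be checked for mechanically when reviewing a messages file. The following is a minimal Python sketch, not an Oracle tool: the function name summarize and the SAMPLE list are illustrative assumptions, and the regular expressions are derived only from the line formats shown in this document.

```python
import re
from collections import Counter

# Link-state notices, in the form they take in the excerpt above.
LOOP_RE = re.compile(r"Qlogic qlc\((\d+),(\d+)\): Loop (OFFLINE|ONLINE)")
# BAD TRAP panic line; captures the trap type and the module name.
TRAP_RE = re.compile(r'BAD TRAP: type=(\w+) .*occurred in module "([^"]+)"')


def summarize(lines):
    """Count Loop OFFLINE/ONLINE notices per qlc instance and
    return the first BAD TRAP found (or None)."""
    transitions = Counter()
    trap = None
    for line in lines:
        m = LOOP_RE.search(line)
        if m:
            port = f"qlc({m.group(1)},{m.group(2)})"
            transitions[(port, m.group(3))] += 1
            continue
        m = TRAP_RE.search(line)
        if m and trap is None:
            trap = {"type": m.group(1), "module": m.group(2)}
    return transitions, trap


# A few lines copied from the excerpt above, used only as sample input.
SAMPLE = [
    "Feb 16 15:28:42 server01 qlc: [ID 439991 kern.info] NOTICE: Qlogic qlc(2,0): Loop OFFLINE",
    "Feb 16 15:28:42 server01 qlc: [ID 439991 kern.info] NOTICE: Qlogic qlc(2,0): Loop ONLINE",
    "Feb 16 15:28:42 server01 qlc: [ID 439991 kern.info] NOTICE: Qlogic qlc(2,0): Loop OFFLINE",
    'Feb 16 15:28:44 server01 unix: [ID 340138 kern.notice] BAD TRAP: type=31 '
    'rp=2a1005776a0 addr=10 mmu_fsr=0 occurred in module "qlc" due to a NULL pointer dereference',
]

if __name__ == "__main__":
    transitions, trap = summarize(SAMPLE)
    for (port, state), count in sorted(transitions.items()):
        print(port, state, count)
    print("trap:", trap)
```

To run it against a real log, pass the open file object to summarize in place of SAMPLE. Several transitions on a single port within seconds, followed by a trap in module qlc, matches the symptom described here.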
The crash dump analysis shows the panic stack in more detail. The trap occurs in the qlc function qlc:qlc_abort_queues+0x88:

CAT(vmcore.1139/11U)> panic
pc: unix:panicsys+0x48: call unix:setjmp
void unix:panicsys+0x48((const char *)0x10ac388, (va_list)0x2a1007f7478, (struct regs *)0x191fda0, (int)1, 0x9900001602, , , , , , , , 0x10ac388, 0x2a1007f7478)
CAT(vmcore.1139/11U)>

Cause
There are two different issues here:
1. There is a hardware link problem between the FC HBA port and the FC switch it is connected to, which produces the Loop OFFLINE / ONLINE errors.
2. The qlc driver causes a panic when the link goes down. This is a known defect tracked by the bugs listed under References, all of which share the same root cause.

Solution
These bugs have been fixed on Solaris 10 SPARC in qlc patch <SunPatch:149175-05> (or greater).

References
<BUG:16174012> - SUNBT7199879 QLC DRIVER CAUSES THE PANIC WHEN THE LINK DOWN OCCURS.
<BUG:15817320> - BACKPORT 16174012 TO 11.2 QLC DRIVER PANIC
<BUG:15973480> - QLC DRIVER CAUSES THE PANIC WHEN THE LINK DOWN OCCURS S10U11
<BUG:16868908> - BAD TRAP PANIC IN MODULE QLC
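
For the Solaris 10 SPARC patch level named in the Solution (149175-05 or greater), the "or greater" comparison can be sketched as below. This is an illustrative Python helper under stated assumptions: the list of installed patch IDs is assumed to have been collected separately, the function names are hypothetical, and every patch ID other than 149175-05 is made up for the example.

```python
def parse_patch_id(patch_id):
    """Split a patch ID of the form 'BBBBBB-RR' into (base, revision) integers."""
    base, sep, rev = patch_id.strip().partition("-")
    if not sep or not base.isdigit() or not rev.isdigit():
        raise ValueError(f"not a patch ID: {patch_id!r}")
    return int(base), int(rev)


def meets_minimum(installed_ids, required="149175-05"):
    """Return True if any installed patch has the same base number as
    `required` and a revision that is equal or higher."""
    req_base, req_rev = parse_patch_id(required)
    for patch_id in installed_ids:
        base, rev = parse_patch_id(patch_id)
        if base == req_base and rev >= req_rev:
            return True
    return False


if __name__ == "__main__":
    # Hypothetical inventories, for illustration only.
    print(meets_minimum(["149175-03"]))
    print(meets_minimum(["149175-07"]))
```

Comparing the revision as an integer rather than as a string keeps the check correct if revisions ever differ in digit count.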