Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition
Solution Type: Problem Resolution Sure
Solution 1990394.1 : Brocade CP Set to Faulty Because CP ERROR Asserted - WARNING, SilkWorm48000, Detected termination of process fwd
Created from <SR 3-10281225391>

Applies to:
Brocade 48000 Director - Version All Versions and later
Brocade SAN Switch Hardware - Version All Versions and later
Information in this document applies to any platform.

Symptoms
This is a Brocade 48K running FOS v6.4.3b. CP1 in slot 6 is currently active and CP0 in slot 5 is standby, and both are working normally.

firmwareshow -v :
Slot Name  Appl  Primary/Secondary Versions  Status
--------------------------------------------------------------------------
  5  CP0   FOS   v6.4.3b                     STANDBY
                 v6.4.3b
  6  CP1   FOS   v6.4.3b                     ACTIVE *
                 v6.4.3b

2015/02/11-03:56:09, [FSSM-1003], 18787, SLOT 6 | CHASSIS, WARNING, SilkWorm48000, HA State out of sync.
2015/02/11-03:56:42, [ISNS-1011], 18788, SLOT 6 | FID 128, INFO, WRO_CORE_48K_BLUE, iSNS Client Service is disabled.
2015/02/11-03:57:08, [EM-1033], 18789, SLOT 6 | CHASSIS, ERROR, SilkWorm48000, CP in Slot 5 set to faulty because CP ERROR asserted.
2015/02/11-03:57:46, [HAMK-1004], 18790, SLOT 6 | CHASSIS, INFO, SilkWorm48000, Resetting standby CP (double reset may occur)
2015/02/11-03:57:50, [EM-1047], 18791, SLOT 6 | CHASSIS, INFO, SilkWorm48000, CP in slot 5 not faulty, CP ERROR deasserted.
2015/02/11-03:58:05, [FW-1424], 18792, SLOT 6 | FID 128, WARNING, WRO_CORE_48K_BLUE, Switch status changed from HEALTHY to MARGINAL.
2015/02/11-03:58:05, [FW-1433], 18793, SLOT 6 | FID 128, WARNING, WRO_CORE_48K_BLUE, Switch status change contributing factor CP: CP non-redundant (Slot5/CP0) faulty.
2015/02/11-03:58:53, [HAM-1004], 18794, SLOT 5 | CHASSIS, INFO, SilkWorm48000, Processor rebooted - Software Fault:Kernel Panic
2015/02/11-03:59:01, [TRCE-1001], 18795, SLOT 5 | CHASSIS, WARNING, SilkWorm48000, Trace dump available (Slot 5)! (reason: PANIC)
2015/02/11-03:59:01, [TRCE-1004], 18796, SLOT 5 | CHASSIS, WARNING, SilkWorm48000, Trace dump (Slot 5) was not transferred because trace auto-FTP disabled.
2015/02/11-03:59:02, [TRCE-1001], 18797, SLOT 6 | CHASSIS, WARNING, SilkWorm48000, Trace dump available (Slot 5)! (reason: PANIC)
2015/02/11-03:59:02, [TRCE-1004], 18798, SLOT 6 | CHASSIS, WARNING, SilkWorm48000, Trace dump (Slot 5) was not transferred because trace auto-FTP disabled.
2015/02/11-03:59:38, [FSSM-1002], 18799, SLOT 6 | CHASSIS, INFO, SilkWorm48000, HA State is in sync.
2015/02/11-03:59:38, [FSSM-1002], 18800, SLOT 5 | CHASSIS, INFO, SilkWorm48000, HA State is in sync.
2015/02/11-03:59:39, [FW-1425], 18801, SLOT 6 | FID 128, INFO, WRO_CORE_48K_BLUE, Switch status changed from MARGINAL to HEALTHY.

slotshow -m :
Slot  Blade Type  ID  Model Name  Status
--------------------------------------------------
  1   SW BLADE    18  FC4-32      ENABLED
  2   SW BLADE    18  FC4-32      ENABLED
  3   SW BLADE    18  FC4-32      ENABLED
  4   UNKNOWN                     VACANT
  5   CP BLADE    16  CP256       ENABLED
  6   CP BLADE    16  CP256       ENABLED
  7   UNKNOWN                     VACANT
  8   SW BLADE    18  FC4-32      ENABLED
  9   SW BLADE    17  FC4-16      ENABLED
 10   SW BLADE    18  FC4-32      ENABLED

*** CORE FILES WARNING (02/11/15 - 03:00:18 ) ***
5376 KBytes in 1 file(s)
use "supportsave" command to upload

ASSERT - Failed expression: size == sizeof (fwDump_t), file = thresh_agent.c, line = 2422, user mode
Call backtrace:
/fabos/lib/libutils.so.1.0(do_assert+0x250) [0xfed47dc]
fwd(fwDumpCB+0xa8) [0x100238f4]
/fabos/lib/libipc.so.1.0 [0xf3defc4]
/fabos/lib/libipc.so.1.0 [0xf3df140]
/fabos/lib/libgiot.so.1.0 [0xfe33524]
/lib/libpthread.so.0 [0xfe02470]
/lib/libc.so.6(clone+0x84) [0xf19a610]
do_assert: forcing segv to get core file
2015/02/11-03:56:09, [RAS-1005], 28090, SLOT 5 | FFDC | FID 128, WARNING, WRO_CORE_48K_BLUE, Software 'assert' error detected.
2015/02/11-03:56:09, [RAS-1001], 28091, SLOT 5 | CHASSIS, INFO, SilkWorm48000, First failure data capture (FFDC) event occurred.
2015/02/11-03:56:11, [TRCE-1001], 28092, SLOT 5 | CHASSIS, WARNING, SilkWorm48000, Trace dump available (Slot 5)! (reason: FFDC)
Detected termination of fwd:1205 (1) exit code:11, exit sig:17, parent sig:0
2015/02/11-03:56:11, [TRCE-1004], 28093, SLOT 5 | CHASSIS, WARNING, SilkWorm48000, Trace dump (Slot 5) was not transferred because trace auto-FTP disabled.

== Dumping debug information ==
  PID     VSZ    RSS  COMMAND
    1    1696    592  init
    2       0      0  ksoftirqd/0
    3       0      0  events/0
    4       0      0  khelper
    5       0      0  kthread
   27       0      0  kblockd/0
   56       0      0  pdflush
   59       0      0  aio/0
   58       0      0  kswapd0
   66       0      0  kseriod
  243       0      0  kjournald
  263    1676    412  wdtd
  335       0      0  kjournald
  508    2116    652  inetd
  521    2556   1092  kmsghandler
  535    1700    384  klogd
  536    1944    688  syslogd
  537    1808    620  crond
  566       0      0  RASLOGK_TH
  583       0      0  krscmon
  689       0      0  kwt_nb_thread
  770       0      0  module-99-th
  773       0      0  module-107-th
  776       0      0  module-146-th
  779       0      0  module-126-th
  782       0      0  module-162-th
  801       0      0  kmtracer
  932   20692   2708  ipadmd
  935   11488   1852  telnetmond
  936   47480   4908  hasmd
 1000       0      0  FSSK_TH
 1043    4988   1076  sshd
 1044    1720    560  getty
 1045    1720    560  getty
 1054   29328   3676  pdmd
 1049       0      0  ISCK_TH
 1050       0      0  XCP_TX
 1051       0      0  XCP_RX
 1052       0      0  XCP_TX
 1053       0      0  XCP_RX
 1057   12384   1240  proxy
 1058   73472   6200  raslogd
 1059   33380   7708  traced
 1060   46016   3664  bmd
 1061   12300   2932  diagd
 1056       0      0  RTEK_TH
 1065   88852   4880  emd
 1067   12724   3116  porttestd
 1081       0      0  porttestd
 1195   80924   6356  webd
 1196   29472   3592  arrd
 1197  124032  10696  cald
 1198   73292   5760  essd
 1199   73668   5704  evmd
 1200   67176   7256  fabricd
 1201   57088   5960  fcpd
 1202   88184   4068  fdmid
 1203   47784   4892  ficud
 1204   65976   6640  fspfd
 1206   72684   4740  rcsd
 1207   63768   3708  ipsd
 1208  107696  45792  iswitchd
 1209   98316   6420  msd
 1210   81348  13684  nsd
 1211   30124   4164  pdmd
 1214   85476   6932  psd
 1215   43092   6552  rpcd
 1216   80780   6824  secd
 1217   73012   3864  authd
 1236  116788  12460  snmpd
 1237   97664  21836  trafd
 1238   39940   3692  tsd
 1239   99648   9048  zoned
 1278    8332   2168  httpd.0
 1281  105884  26444  0.weblinker.fcg
 1294   68980   4784  icpd
 1295   46528   3760  isnscd
 1320   38744   3628  scpd
13663       0      0  pdflush
18033    2564   1080  sh
18034    2228    784  ps
2015/02/11-03:56:11, [KSWD-1002], 28094, SLOT 5 | FFDC | CHASSIS, WARNING, SilkWorm48000, Detected termination of process fwd:1205 <<<----!!!
2015/02/11-03:56:11, [HAM-1014], 28095, SLOT 5 | CHASSIS, CRITICAL, SilkWorm48000, Non restartable component (fw (pid=1205)) died.
2015/02/11-03:56:11, [FSSM-1003], 28096, SLOT 6 | CHASSIS, WARNING, SilkWorm48000, HA State out of sync.
Time=2:56:14-716194 Total:0KB Used:0KB Free:0KB Buffers:0KB Cached:0KB
Time=2:56:14-716194 Total:0KB Used:0KB Free:0KB Buffers:0KB Cached:0KB

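When reviewing a long errdump or supportsave log, the key signature in the symptoms above is a KSWD-1002 entry reporting termination of fwd, accompanied by a HAM-1014 entry on the same slot. The following Python sketch shows one way to search saved log text for that pair. It is not an Oracle or Brocade tool: the field layout in the regular expression is an assumption inferred from the entries quoted in this document, and the function names are hypothetical.

```python
import re

# Assumed RASlog layout, inferred from the entries quoted in this document:
# timestamp, [MSG-ID], sequence, location fields, severity, switch name, text
RASLOG_RE = re.compile(
    r"^(?P<ts>\d{4}/\d{2}/\d{2}-\d{2}:\d{2}:\d{2}), "
    r"\[(?P<msg_id>[A-Z]+-\d+)\], "
    r"(?P<seq>\d+), "
    r"(?P<location>.+?), "
    r"(?P<severity>INFO|WARNING|ERROR|CRITICAL), "
    r"(?P<switch>[^,]+), "
    r"(?P<text>.*)$"
)


def parse_raslog(lines):
    """Return a dict per line that matches the assumed layout; skip the rest."""
    entries = []
    for line in lines:
        match = RASLOG_RE.match(line.strip())
        if match:
            entries.append(match.groupdict())
    return entries


def slots_with_fwd_death(entries):
    """Slots that logged both KSWD-1002 for fwd and HAM-1014."""
    def slot(entry):
        # The slot is the first location field, e.g. "SLOT 5 | FFDC | CHASSIS"
        return entry["location"].split(" | ")[0]

    kswd = {slot(e) for e in entries
            if e["msg_id"] == "KSWD-1002" and "process fwd:" in e["text"]}
    ham = {slot(e) for e in entries if e["msg_id"] == "HAM-1014"}
    return sorted(kswd & ham)


# Two entries copied from the symptoms in this document
SAMPLE = [
    "2015/02/11-03:56:11, [KSWD-1002], 28094, SLOT 5 | FFDC | CHASSIS, "
    "WARNING, SilkWorm48000, Detected termination of process fwd:1205",
    "2015/02/11-03:56:11, [HAM-1014], 28095, SLOT 5 | CHASSIS, CRITICAL, "
    "SilkWorm48000, Non restartable component (fw (pid=1205)) died.",
]

if __name__ == "__main__":
    print(slots_with_fwd_death(parse_raslog(SAMPLE)))
```

Lines that do not match the assumed layout, such as the backtrace or the process list, are skipped rather than treated as errors.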
Cause
The CP panicked because the Fabric Watch daemon (fwd) terminated. FOS has a number of daemons that are termed 'non restartable': when one of them dies or fails, the only way to recover or restart it is to panic and reboot the operating system, hence the CP failover in this case.

This is a known Brocade defect that is fixed in the FOS 7.x code; the fix is not being back-ported to 6.4.3x.

Solution
Brocade has identified this as a known issue that will not be fixed in FOS 6.4.3x, the latest FOS version supported on the Brocade 48000. The fix requires an upgrade to FOS 7.3.x.
Recommendation for Brocade 48000: leave the switch as it is, because there is no fix for this issue on FOS 6.4.3 for the Brocade 48K.

Attachments
This solution has no attachment
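The version guidance in the Solution section can be expressed as a small check against the version strings that firmwareshow reports. This is only an illustrative sketch: the function name and return strings are hypothetical, and the rule that any 7.x release carries the fix is taken from this note's statement that the defect is fixed in the 7.x code, not from a Brocade release matrix.

```python
import re


def classify_fos_version(version):
    """Classify a FOS version string such as 'v6.4.3b' against this note."""
    match = re.match(r"^v?(\d+)\.(\d+)\.(\d+)", version.strip())
    if not match:
        return "unknown"
    major, minor, patch = (int(group) for group in match.groups())
    if (major, minor, patch) == (6, 4, 3):
        # Per this note: the fix is not back-ported to 6.4.3x
        return "affected, no fix planned"
    if major == 7:
        # Per this note: fixed in the 7.x code (7.3.x is the named target)
        return "fixed per this note"
    # Any other release is outside what this note covers
    return "unknown"


if __name__ == "__main__":
    for version in ("v6.4.3b", "v7.3.1"):
        print(version, "->", classify_fos_version(version))
```

The check says nothing about whether a given chassis can run the target release; that still has to be confirmed against the platform's supported FOS versions.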