Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-72-1507966.1
Update Date: 2017-11-08

Solution Type  Problem Resolution Sure

Solution  1507966.1 :   Sun SPARC Sun Fire 12K/15K/E20K/E25K (Starcat): NOTICE: Solaris failed to send a message (0x4/0x5309) to the System Controller. Error: 145  


Related Items
  • Sun Fire 15K Server
  • Sun Fire E20K Server
  • Sun Fire E25K Server
  • Sun Fire 12K Server
Related Categories
  • PLA-Support>Sun Systems>SPARC>Enterprise>SN-SPARC: SF-Exxk




In this Document
Symptoms
Cause
Solution
References


Created from <SR 3-6479721981>

Applies to:

Sun Fire E20K Server - Version All Versions and later
Sun Fire E25K Server - Version All Versions and later
Sun Fire 15K Server - Version All Versions and later
Sun Fire 12K Server - Version All Versions and later
Information in this document applies to any platform.

Symptoms

/var/adm/messages on the domain shows:

Nov 22 10:42:20 sec17-e scosmb: [ID 855069 kern.info] NOTICE: Solaris failed to send a message (0x4/0x5309) to the System Controller. Error: 145
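
When the messages file is large, it can help to pull out every occurrence of this notice together with its message identifiers and error number. The following is a minimal sketch, not an Oracle-supplied tool: the function name and the field labels (msg_type, msg_subtype, error) are assumptions made for illustration, and the pattern is derived only from the single sample line shown above.

```python
import re

# Pattern built from the sample scosmb notice above. The group names are
# illustrative labels only, not official terminology for the two hex values.
NOTICE_RE = re.compile(
    r"scosmb: .*NOTICE: Solaris failed to send a message "
    r"\((?P<msg_type>0x[0-9a-fA-F]+)/(?P<msg_subtype>0x[0-9a-fA-F]+)\) "
    r"to the System Controller\. Error: (?P<error>\d+)"
)


def find_scosmb_notices(lines):
    """Return one dict per syslog line that matches the scosmb notice."""
    hits = []
    for line in lines:
        match = NOTICE_RE.search(line)
        if not match:
            continue
        fields = line.split()
        hits.append({
            # Syslog prefix: month, day, time, then the host name.
            "timestamp": " ".join(fields[:3]),
            "host": fields[3],
            "msg_type": match.group("msg_type"),
            "msg_subtype": match.group("msg_subtype"),
            "error": int(match.group("error")),
        })
    return hits
```

To use it on a domain, pass an open file handle for /var/adm/messages (or a copy taken from an explorer) to find_scosmb_notices() and print the returned list.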

 

Cause

The fmadm faulty -a output may not show an error on the domain.

fmdump -e output may show a list of correctable error (ce) events:

Nov 23 04:41:14.2136 ereport.cpu.ultraSPARC-IVplus.ce
Nov 23 04:41:20.2099 ereport.cpu.ultraSPARC-IVplus.ce
Nov 23 04:46:14.9670 ereport.cpu.ultraSPARC-IVplus.ce
Nov 23 05:06:17.9180 ereport.cpu.ultraSPARC-IVplus.ce
Nov 23 05:06:23.9087 ereport.cpu.ultraSPARC-IVplus.ce
Nov 23 05:06:24.0818 ereport.cpu.ultraSPARC-IVplus.ce
Nov 23 05:06:30.0786 ereport.cpu.ultraSPARC-IVplus.ce
Nov 23 06:49:20.1044 ereport.cpu.ultraSPARC-IVplus.ce
Nov 23 06:49:26.1036 ereport.cpu.ultraSPARC-IVplus.ce
Nov 23 11:24:39.4755 ereport.cpu.ultraSPARC-IVplus.ce
Nov 23 11:24:39.4763 ereport.cpu.ultraSPARC-IVplus.ce
Nov 23 11:24:39.4770 ereport.cpu.ultraSPARC-IVplus.ce
Nov 23 11:24:45.4694 ereport.cpu.ultraSPARC-IVplus.ce
Nov 23 11:24:45.4695 ereport.cpu.ultraSPARC-IVplus.ce
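
To see how the ce events cluster over time, the fmdump -e listing can be summarized per hour. This is a rough sketch under stated assumptions: the function name is hypothetical, the input is assumed to be saved fmdump -e text in the four-column layout shown above, and no threshold for what counts as a storm is implied, since this document does not give one.

```python
from collections import Counter


def ce_events_per_hour(fmdump_lines,
                       ereport_class="ereport.cpu.ultraSPARC-IVplus.ce"):
    """Count ereports of one class per hour from saved `fmdump -e` text."""
    counts = Counter()
    for line in fmdump_lines:
        fields = line.split()
        # Expected layout: "Nov 23 04:41:14.2136 <class>". Header lines and
        # other ereport classes are skipped.
        if len(fields) != 4 or fields[3] != ereport_class:
            continue
        month, day, time_of_day = fields[:3]
        hour = time_of_day.split(":")[0]
        counts[f"{month} {day} {hour}:00"] += 1
    return counts
```

Applied to the listing above, this groups the 14 events into four hourly buckets (04:00, 05:00, 06:00 and 11:00 on Nov 23), which makes the bursts easier to compare with the timestamps of the communication errors.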

However, check the System Controller for rstops (recordstops).

An rstop flood was found on the System Controller and appears to be the cause of the communication error(s):

Nov 22 10:42:20 sec17-e scosmb: [ID 855069 kern.info] NOTICE: Solaris failed to send a message (0x4/0x5309) to the System Controller. Error: 145

Normally, replacing the affected DIMM(s) resolves the Notice/Warning.
 

Solution

1. Collect a System Controller explorer and a domain explorer for analysis.

2. Open an SR with Oracle and have the data analyzed to verify that a DIMM is at fault.

    See Oracle Diagnostic File Upload (Doc ID 1547088.2) for information on how to provide data to Oracle.

3. Replace the DIMM(s) associated with the domain getting the communication error(s), and verify that the Kernel and FMA patches are up to date.

 

The ECC storm discussed in this document can also lead to the behaviour described in:
Systems With UltraSPARC IV+ Processors Running Solaris 9 or 10 May Experience "send mondo timeout" Panic (Doc ID 1019109.1)

References

<NOTE:1312847.1> - Oracle Explorer Data Collector Resource Center
<NOTE:1547088.2> - How to Upload Files to Oracle Support

Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.