Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-72-2192792.1
Update Date:2017-10-11

Solution Type  Problem Resolution Sure

Solution  2192792.1 :   SPARC M7 Series Servers : Warning "Standby Service Processor Bus Error" when starting a host  


Related Items
  • SPARC M7-16
  • SPARC M7-8
Related Categories
  • PLA-Support>Sun Systems>SPARC>Enterprise>SN-SPARC: M7




In this Document
Symptoms
Changes
Cause
Solution
References


Applies to:

SPARC M7-8 - Version All Versions and later
SPARC M7-16 - Version All Versions and later
Information in this document applies to any platform.

Symptoms

The following warning may be reported in the host console logs when starting an M7 server host, during step 4 of 7.

2016-10-13 10:47:44.065 4:0:0:0>WARNING:
      reporting PCPU ID=0
      TestTitle=HB PCIE LINK ENABLE
      TestInfo=Standby Service Processor Bus Error
      Standby Service Processor=/SYS/SP0
      SP SSI 0 BUS ERROR CODE ADR=0x0003fffff2e50422
      SP SSI 0 BUS ERROR CODE VAL=0x86
      SP SSI 0 BUS ERROR ADDR ADR=0x0003fffff2e50428
      SP SSI 0 BUS ERROR ADDR VAL=0x02d15030
      SP SSI 0 BUS ERROR COUNT=0x0001
      SP SSI 1 BUS ERROR CODE ADR=0x0003fffff2e50432
      SP SSI 1 BUS ERROR CODE VAL=0x00
      SP SSI 1 BUS ERROR ADDR ADR=0x0003fffff2e50438
      SP SSI 1 BUS ERROR ADDR VAL=0x00000000
      SP SSI 1 BUS ERROR COUNT=0x0000
      SP0 SSI OIC MUX IN STATUS ADR=0x0003fffff2e59020
      SP0 SSI OIC MUX IN STATUS VAL=0x20
      SP1 SSI OIC MUX IN STATUS ADR=0x0003fffff2e59021
      SP1 SSI OIC MUX IN STATUS VAL=0x20
      SP SSI OIC MUX OUT STATUS ADR=0x0003fffff2e59022
      SP SSI OIC MUX OUT STATUS VAL=0x20
      SP HOT PLUG STATUS ADR=0x0003fffff2e59010
      SP HOT PLUG STATUS VAL=0x03
END_WARNING

After five occurrences of the above warning, the following messages are reported:

WARNING: Standby /SYS/SP1 is not accessible, reached Max Bus Error Count=5

Standby /SYS/SP1 inaccessible: skipping topology checks for/SYS/SP1/PCIE_SWITCH/PCIE_LINK6
Standby /SYS/SP1 inaccessible: skipping topology checks for/SYS/SP1/PCIE_SWITCH/PCIE_LINK8
Standby /SYS/SP1 inaccessible: skipping topology checks for/SYS/SP1/USB_CTRL1
Standby /SYS/SP1 inaccessible: skipping topology checks for/SYS/SP1/PCIE_SWITCH/PCIE_LINK12
Standby /SYS/SP1 inaccessible: skipping topology checks for/SYS/SP1/SPM1/VIDEO

/SYS/SP1/PCIE_SWITCH/PCIE_LINK6 111d 8091 1 G2 Untested
/SYS/SP1/PCIE_SWITCH/PCIE_LINK8 111d 8091 1 G2 Untested
/SYS/SP1/USB_CTRL1 104c 8241 1 G2 Untested
/SYS/SP1/PCIE_SWITCH/PCIE_LINK12 111d 8091 1 G1 Untested
/SYS/SP1/SPM1/VIDEO 102b 0522 1 G1 Untested

 

Changes

The fix for Bug 22707384 was introduced in SysFW 9.7.3.b.

Cause

When an M7 server host is started, the system tries to access the devices and links on the Standby SP. If this access test fails, a bus error is reported during step 4 of 7 of the startup sequence.

Starting with SysFW 9.7.3.b, a single access error no longer generates an ereport or fault.

Instead, the Standby SP is reported as inaccessible after five failed attempts to access it.

Unless other problems indicting the SP are reported, this warning and the inability to access the Standby SP devices can be safely ignored.
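
The behavior described above can be checked against a saved host console log. The following Python sketch is illustrative only and is not an Oracle tool: the function name `summarize_console_log` is hypothetical, and the message strings it matches are assumed to appear exactly as in the excerpts in this document. It counts the bus error warnings and lists any Standby SP that reached the maximum bus error count.

```python
import re

# Strings copied from the console excerpts in this document.
BUS_ERROR_INFO = "TestInfo=Standby Service Processor Bus Error"
MAX_COUNT_RE = re.compile(
    r"WARNING: Standby (\S+) is not accessible, reached Max Bus Error Count=(\d+)"
)


def summarize_console_log(text):
    """Count bus error warnings and list SPs declared inaccessible.

    Hypothetical helper: it only scans text that was already captured
    from the host console; it does not talk to the SP.
    """
    bus_errors = sum(1 for line in text.splitlines() if BUS_ERROR_INFO in line)
    inaccessible = [m.group(1) for m in MAX_COUNT_RE.finditer(text)]
    return {"bus_errors": bus_errors, "inaccessible": inaccessible}
```

A result with a non-zero `bus_errors` count and an empty `inaccessible` list would correspond to the case where the warning was logged but the Standby SP was still reached before the fifth failure.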

 

Solution

For example, when host1 is started, two occurrences of the access failure are reported:

2016-10-13 10:47:44.065 4:0:0:0>WARNING:
reporting PCPU ID=0
TestTitle=HB PCIE LINK ENABLE
TestInfo=Standby Service Processor Bus Error
Standby Service Processor=/SYS/SP0
SP SSI 0 BUS ERROR CODE ADR=0x0003fffff2e50422
SP SSI 0 BUS ERROR CODE VAL=0x86
SP SSI 0 BUS ERROR ADDR ADR=0x0003fffff2e50428
SP SSI 0 BUS ERROR ADDR VAL=0x02d15030
SP SSI 0 BUS ERROR COUNT=0x0001
SP SSI 1 BUS ERROR CODE ADR=0x0003fffff2e50432
SP SSI 1 BUS ERROR CODE VAL=0x00
SP SSI 1 BUS ERROR ADDR ADR=0x0003fffff2e50438
SP SSI 1 BUS ERROR ADDR VAL=0x00000000
SP SSI 1 BUS ERROR COUNT=0x0000
SP0 SSI OIC MUX IN STATUS ADR=0x0003fffff2e59020
SP0 SSI OIC MUX IN STATUS VAL=0x20
SP1 SSI OIC MUX IN STATUS ADR=0x0003fffff2e59021
SP1 SSI OIC MUX IN STATUS VAL=0x20
SP SSI OIC MUX OUT STATUS ADR=0x0003fffff2e59022
SP SSI OIC MUX OUT STATUS VAL=0x20
SP HOT PLUG STATUS ADR=0x0003fffff2e59010
SP HOT PLUG STATUS VAL=0x03
END_WARNING

2016-10-13 10:47:45.630 5:0:0:0>WARNING:
reporting PCPU ID=256
TestTitle=HB PCIE LINK ENABLE
TestInfo=Standby Service Processor Bus Error
Standby Service Processor=/SYS/SP0
SP SSI 0 BUS ERROR CODE ADR=0x0003fffff2e50422
SP SSI 0 BUS ERROR CODE VAL=0x86
SP SSI 0 BUS ERROR ADDR ADR=0x0003fffff2e50428
SP SSI 0 BUS ERROR ADDR VAL=0x02d15030
SP SSI 0 BUS ERROR COUNT=0x0001
SP SSI 1 BUS ERROR CODE ADR=0x0003fffff2e50432
SP SSI 1 BUS ERROR CODE VAL=0x81
SP SSI 1 BUS ERROR ADDR ADR=0x0003fffff2e50438
SP SSI 1 BUS ERROR ADDR VAL=0x6027fb00
SP SSI 1 BUS ERROR COUNT=0x0001
SP0 SSI OIC MUX IN STATUS ADR=0x0003fffff2e59020
SP0 SSI OIC MUX IN STATUS VAL=0x20
SP1 SSI OIC MUX IN STATUS ADR=0x0003fffff2e59021
SP1 SSI OIC MUX IN STATUS VAL=0x30
SP SSI OIC MUX OUT STATUS ADR=0x0003fffff2e59022
SP SSI OIC MUX OUT STATUS VAL=0x30
SP HOT PLUG STATUS ADR=0x0003fffff2e59010
SP HOT PLUG STATUS VAL=0x03
END_WARNING
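
Warning blocks like the two above are easier to compare once they are split into fields. The sketch below is a hedged illustration, not an Oracle utility: `parse_warning_blocks` and `bus_error_counts` are hypothetical names, and the code assumes each block starts with a line ending in `>WARNING:`, ends with `END_WARNING`, and contains `KEY=VALUE` lines as shown in this document.

```python
import re


def parse_warning_blocks(text):
    """Split console text into a list of dicts, one per WARNING block."""
    blocks, current = [], None
    for raw in text.splitlines():
        line = raw.strip()
        if line.endswith(">WARNING:"):
            # Keep the timestamp and node prefix that precede ">WARNING:".
            current = {"header": line[: -len(">WARNING:")]}
        elif line == "END_WARNING":
            if current is not None:
                blocks.append(current)
            current = None
        elif current is not None and "=" in line:
            key, _, value = line.partition("=")
            current[key.strip()] = value.strip()
    return blocks


def bus_error_counts(block):
    """Return {ssi_index: count} from the 'SP SSI n BUS ERROR COUNT' fields."""
    counts = {}
    for key, value in block.items():
        m = re.fullmatch(r"SP SSI (\d+) BUS ERROR COUNT", key)
        if m:
            counts[int(m.group(1))] = int(value, 16)  # values are hex, e.g. 0x0001
    return counts
```

Applied to the two blocks above, this would make it straightforward to see that only SSI 0 recorded an error in the first warning, while both SSI 0 and SSI 1 recorded one in the second.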

The SPM managing DCU#1 is SP1/SPM1 (see the sp_name properties below), so SP0/SPM1 is the standby DCU#1 SPM. From the perspective of host1 and DCU#1, SP0 is the Standby SP that hosts the devices and the standby DCU#1 SPM to be checked.

/HOST1
...
sp_name = /SYS/SP1/SPM1

/System/DCUs/DCU_1
...
sp_name = /SYS/SP1/SPM1
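
The mapping from the `sp_name` value to the standby SPM can be written down as a small helper. This is a minimal sketch under stated assumptions: the function name `standby_spm` is hypothetical, and it assumes the platform has exactly two SPs named SP0 and SP1 with the same SPM index on each, as in the example in this document.

```python
import re


def standby_spm(sp_name):
    """Return the standby SPM path for a given sp_name value.

    Assumes two SPs (SP0 and SP1); the standby SPM is taken to be the
    SPM with the same index on the other SP.
    """
    m = re.fullmatch(r"/SYS/SP([01])/SPM(\d+)", sp_name)
    if m is None:
        raise ValueError(f"unexpected sp_name format: {sp_name!r}")
    other_sp = 1 - int(m.group(1))
    return f"/SYS/SP{other_sp}/SPM{m.group(2)}"
```

For the values shown above, `standby_spm("/SYS/SP1/SPM1")` yields `/SYS/SP0/SPM1`, which matches the Standby SP named in the warnings for host1.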

SP0 was then accessed successfully, so SPM1 and its associated links are reported as "Clean":

2016-10-13 10:49:44.357 4:0:0:0> /SYS/SP0/PCIE_SWITCH/PCIE_LINK6 111d 8091 1 G2 Clean
2016-10-13 10:49:44.377 4:0:0:0> /SYS/SP0/PCIE_SWITCH/PCIE_LINK8 111d 8091 1 G2 Clean
2016-10-13 10:49:44.397 4:0:0:0> /SYS/SP0/USB_CTRL1 104c 8241 1 G2 Clean
2016-10-13 10:49:44.417 4:0:0:0> /SYS/SP0/PCIE_SWITCH/PCIE_LINK12 111d 8091 1 G1 Clean
2016-10-13 10:49:44.436 4:0:0:0> /SYS/SP0/SPM1/VIDEO 102b 0522 1 G1 Clean

The above warning can be safely ignored.

If the bus error occurs five consecutive times, the following message is printed on the fifth occurrence:

WARNING: Standby /SYS/SP1 is not accessible, reached Max Bus Error Count=5

As a result, the SPM is reported as "inaccessible" and the devices as "untested".

Standby /SYS/SP1 inaccessible: skipping topology checks for/SYS/SP1/PCIE_SWITCH/PCIE_LINK6
Standby /SYS/SP1 inaccessible: skipping topology checks for/SYS/SP1/PCIE_SWITCH/PCIE_LINK8
Standby /SYS/SP1 inaccessible: skipping topology checks for/SYS/SP1/USB_CTRL1
Standby /SYS/SP1 inaccessible: skipping topology checks for/SYS/SP1/PCIE_SWITCH/PCIE_LINK12
Standby /SYS/SP1 inaccessible: skipping topology checks for/SYS/SP1/SPM1/VIDEO

/SYS/SP1/PCIE_SWITCH/PCIE_LINK6 111d 8091 1 G2 Untested
/SYS/SP1/PCIE_SWITCH/PCIE_LINK8 111d 8091 1 G2 Untested
/SYS/SP1/USB_CTRL1 104c 8241 1 G2 Untested
/SYS/SP1/PCIE_SWITCH/PCIE_LINK12 111d 8091 1 G1 Untested
/SYS/SP1/SPM1/VIDEO 102b 0522 1 G1 Untested
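
To see at a glance which Standby SP devices were checked, the topology result lines can be reduced to a path-to-status table. The following sketch is illustrative only: `topology_status` is a hypothetical name, and the line layout (device path, two 4-digit hexadecimal IDs, two further columns, then a status word such as "Clean" or "Untested") is inferred from the excerpts in this document rather than from a published format.

```python
import re

# Layout inferred from the excerpts: path, two 4-digit hex IDs,
# two further columns, then the status word at the end of the line.
TOPO_RE = re.compile(
    r"(/SYS/\S+)\s+[0-9a-f]{4}\s+[0-9a-f]{4}\s+\S+\s+\S+\s+(\w+)\s*$"
)


def topology_status(text):
    """Map each device path to its reported topology status."""
    result = {}
    for line in text.splitlines():
        m = TOPO_RE.search(line)
        if m:
            result[m.group(1)] = m.group(2)
    return result
```

The "skipping topology checks" lines do not match the pattern, so only the result lines contribute to the table. Devices with the status "Untested" on a Standby SP that reached the maximum bus error count are the expected outcome described in this document.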

 

No ereport or fault results from this situation.

Only the host console logs report the warning and errors.

Unless other problems indicting the SP are reported, no action is required and the messages can be safely ignored.

 

References

<BUG:22707384> - CHANGE FPGA ACCESSES FROM 2C# TO 2E5 AND TOPO CHECKS FOR INACCESSIBLE SPS

Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.