Asset ID: 1-71-1002109.1
Update Date: 2018-04-02
Solution Type: Technical Instruction Sure Solution
1002109.1: Sun Fire[TM] 12K/15K/E20K/E25K: POST Overview
Related Items
- Sun Fire 15K Server
- Sun Fire E20K Server
- Sun Fire E25K Server
- Sun Fire 12K Server
Related Categories
- PLA-Support>Sun Systems>SPARC>Enterprise>SN-SPARC: SF-Exxk
- _Old GCS Categories>Sun Microsystems>Servers>High-End Servers
PreviouslyPublishedAs
203006
Applies to:
Sun Fire 15K Server - Version All Versions and later
Sun Fire E20K Server - Version All Versions and later
Sun Fire 12K Server - Version All Versions and later
Sun Fire E25K Server - Version All Versions and later
All Platforms
Goal
This document is an overview of POST (Power-On Self Test) on the Sun Fire[TM] 12K/15K/E20K/E25K platforms.
Solution
When lpost executes for a CPU, it checks the lpost versions on the FPROM and in SMS.
POST is the software that takes control of the hardware in Sun Fire 12K/15K/E20K/E25K domains at power-on or an equivalent reset. It probes, tests, and configures the domain resources, then transfers control to OBP.
POST is a multi-threaded application (number of threads = number of processors + 1 for hpost). Domain processors are typically sequenced in parallel by hpost using local tests called lpost.
The hpost component
hpost (Host POST) is the controlling entity of POST and is part of the SMS package. hpost communicates with several SMS daemons, most notably hwad and pcd. All platform hardware communication from hpost goes through hwad, across the console bus.
The lpost components
lpost (local POST) is a component of hpost that is executed by a domain CPU. The CPU lpost tests are stored in FPROM on slot 0. The SC also has a disk copy of each lpost file in /opt/SUNWSMS/hostobjs.
For non-system boards, such as I/O and expander boards, the appropriate lpost image is downloaded from the SC's disk to the domain's memory.
In all cases lpost is a slave to hpost. Communication between the two is through SRAM on a specific I/O board.
CPU lpost version
To check the CPU lpost version on the SC and in FPROM (on the SB), use the command below.
Example for domain A SBs:
v4u-15ka-sc0:sms-svc:1> flashupdate -d a -f /opt/SUNWSMS/hostobjs/sgcpu.flash -n
Current System Board FPROM Information
========================================
CPU at SB2, FPROM 0:
POST 03/05/04 11:28:00 Release 5.17.0 Build 6.4 I/F 12
OBP 03/05/04 11:27:00 Release 5.17.0 Build 6.4
Ver 03/05/04 11:28:00 Release 5.17.0 Build 6.4
CPU at SB2, FPROM 1:
POST 03/05/04 11:28:00 Release 5.17.0 Build 6.4 I/F 12
OBP 03/05/04 11:27:00 Release 5.17.0 Build 6.4
Ver 03/05/04 11:28:00 Release 5.17.0 Build 6.4
Flash Image Information
==========================
POST 03/05/04 11:28:00 Release 5.17.0 Build 6.4 I/F 12
OBP 03/05/04 11:27:00 Release 5.17.0 Build 6.4
Ver 03/05/04 11:28:00 Release 5.17.0 Build 6.4
Do you wish to update the FPROM (yes/no) N
- If the version of lpost on the FPROM is the same as the version on SMS: no issues.
- If the FPROM lpost is an earlier version than the SMS lpost: CPUs associated with that FPROM may fail POST.
- If the FPROM lpost is a later version than the SMS lpost: hpost will log an uprev warning. SMS needs to be patched if this warning is seen.
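The three cases above can be sketched as a small comparison helper. This is a minimal illustration, not part of SMS: the function names are hypothetical, and it assumes that comparing only the `Release` field of the `POST` lines printed by flashupdate is sufficient (the build number and date are ignored).

```python
import re

# Matches the "Release x.y.z" field of a flashupdate POST line, e.g.
# "POST 03/05/04 11:28:00 Release 5.17.0 Build 6.4 I/F 12"
RELEASE_RE = re.compile(r"Release\s+(\d+(?:\.\d+)*)")


def parse_release(line):
    """Return the release as a tuple of ints, e.g. (5, 17, 0)."""
    match = RELEASE_RE.search(line)
    if match is None:
        raise ValueError(f"no Release field in: {line!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def classify(fprom_line, sms_line):
    """Compare the FPROM lpost release with the SMS flash image release.

    Hypothetical helper: the returned strings paraphrase the three cases
    listed in this document.
    """
    fprom, sms = parse_release(fprom_line), parse_release(sms_line)
    if fprom == sms:
        return "match: no issues"
    if fprom < sms:
        return "FPROM older: CPUs on this FPROM may fail POST"
    return "FPROM newer: hpost logs an uprev warning; patch SMS"


# Lines copied from the example output above (SB2 FPROM 0 and the flash image).
fprom_post = "POST 03/05/04 11:28:00 Release 5.17.0 Build 6.4 I/F 12"
image_post = "POST 03/05/04 11:28:00 Release 5.17.0 Build 6.4 I/F 12"
print(classify(fprom_post, image_post))  # match: no issues
```

Releases are compared as integer tuples rather than strings so that, for example, 5.9.0 sorts before 5.17.0.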
To check the lpost version for non-system boards, use these commands:
starcat-sc0:sms-svc:29> cd /opt/SUNWSMS/hostobjs
starcat-sc0:sms-svc:31> mcs -p pcilpost.elf | grep elf
pcilpost.elf:
SMI sun4u_Sun_Fire_15K pcilpost.elf 5.17.0 Fri Mar 5 19:30:41 GMT 2004
starcat-sc0:sms-svc:33> mcs -p caged_pcilpost.elf | grep elf
caged_pcilpost.elf:
SMI sun4u_Sun_Fire_15K caged_pcilpost.elf 5.17.0 Fri Mar 5 19:30:41 GMT 2004
starcat-sc0:sms-svc:34> mcs -p explpost.elf | grep elf
explpost.elf:
SMI sun4u_Sun_Fire_15K explpost.elf 5.17.0 Fri Mar 5 19:31:30 GMT 2004
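When several images need checking, the `mcs -p` output can be collected and parsed offline. The sketch below is illustrative only: the helper names are hypothetical, and it assumes every comment line has the same field order as the sample output above (vendor, platform, file name, version, then the build timestamp).

```python
def parse_mcs_line(line):
    """Split one comment line into (file name, version, build timestamp)."""
    fields = line.split()
    # Assumed layout: vendor, platform, file name, version, timestamp words.
    if len(fields) < 5:
        raise ValueError(f"unexpected comment line: {line!r}")
    name, version = fields[2], fields[3]
    timestamp = " ".join(fields[4:])
    return name, version, timestamp


def lpost_versions(text):
    """Map each lpost image name to its version string."""
    versions = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Only the "SMI ..." lines carry version data; skip "name.elf:" headers.
        if line.startswith("SMI "):
            name, version, _ = parse_mcs_line(line)
            versions[name] = version
    return versions


# Sample text copied from the command output shown above.
sample = """\
pcilpost.elf:
SMI sun4u_Sun_Fire_15K pcilpost.elf 5.17.0 Fri Mar 5 19:30:41 GMT 2004
explpost.elf:
SMI sun4u_Sun_Fire_15K explpost.elf 5.17.0 Fri Mar 5 19:31:30 GMT 2004
"""
versions = lpost_versions(sample)
all_same = len(set(versions.values())) == 1
print(versions, all_same)
```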
POST order of operation
- POST reads information from PCD and determines which slots (0/1) are assigned to the domain.
- POST assumes that all hardware resources assigned to the domain are powered on. Powering on the components is the responsibility of SMS (setkeyswitch).
- The error state in the domain resources is cleared. If the error state cannot be cleared, the component is marked as failed.
- The domain's resources are scanned to inventory the components present (such as CPUs, memory, and I/O adapters) and their characteristics (such as type, speed, and size). POST's findings are compared to the entries in the PCD to confirm agreement between the two. Incompatibilities, such as those between part revisions or sizes and between actual and operating frequencies, are detected and handled. Components may be failed out of the configuration to maintain an internally consistent system.
- Built-in self tests (LBIST and IBIST) are executed within and between ASICs to confirm individual ASIC operation and the communication paths between them.
- lpost tests are downloaded and executed for slot 0 and slot 1 boards.
- At the end, POST assembles all of the components that have been successfully configured and passed all tests into the final domain configuration. This information is then passed to OBP.
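The sequence above behaves like a staged filter: a resource that fails any stage is dropped, and whatever survives every stage forms the final configuration handed to OBP. The toy model below only illustrates that flow. It is not how hpost is implemented; the function name, the board-level granularity, and the per-stage check callables are all assumptions made for the example.

```python
def run_post(assigned_boards, stages):
    """Toy model of the POST sequence.

    assigned_boards: boards the PCD assigns to the domain (assumed already
        powered on by SMS, as the document states).
    stages: ordered list of (stage name, check function) pairs; each check
        returns True if the board passes that stage.
    Returns (configured boards, {failed board: stage where it failed}).
    """
    candidates = list(assigned_boards)
    failed = {}
    for stage_name, check in stages:
        survivors = []
        for board in candidates:
            if check(board):
                survivors.append(board)
            else:
                # A failed board is configured out and skips later stages.
                failed[board] = stage_name
        candidates = survivors
    return candidates, failed


# Hypothetical domain: SB3 disagrees with the PCD during the inventory stage.
boards = ["SB2", "SB3", "IO2"]
stages = [
    ("clear error state", lambda board: True),
    ("inventory vs PCD", lambda board: board != "SB3"),
    ("LBIST/IBIST", lambda board: True),
    ("lpost", lambda board: True),
]
configured, failed = run_post(boards, stages)
print(configured)  # ['SB2', 'IO2']
print(failed)      # {'SB3': 'inventory vs PCD'}
```

Real POST works at a finer granularity (individual CPUs, memory, and I/O adapters), so a single bad component does not necessarily remove the whole board.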
Invoking POST
- setkeyswitch: This is the most common invocation of POST.
- DR attach: When a slot 0 or slot 1 board is DR'ed into a running domain.
- Dstop/Rstop: If a failure causes a correctable error (rstop), POST is responsible for recording the state of the hardware at the time of the error. If a failure causes a dstop of the domain, POST records the current state of the hardware, and hpost is then executed to (hopefully) identify the faulty component and configure it out of the domain.
- Panic/reboot: After a domain reset, POST must ensure that OBP is delivered a valid set of resources from which to build a device tree.
- Manually: In some instances, a support specialist may execute POST directly.
POST special cases
1. Split expander: To deal with this, POST examines the PCD for information on all 18 possible domains. If a domain is listed as active, then the expander boards associated with all active slots in that domain are considered active. These expander boards and the ASICs on them are assumed to have been tested and configured by a previous POST process. POST will modify the configuration of the split expander, but in a manner that does not impact the running domain. If POST finds discrepancies or errors when examining the split expander, such as misconfigured ASICs or a domain stop condition, it will abandon the slot it is using on that expander and report it as failed.
If the expander is not in use, POST assumes ownership of the full EXB and configures it. To avoid a race condition in which two different POST processes take ownership of the same expander, hpost creates a lock file ($SMSVAR/.lock/hpost.lock.nn).
2. DR attach of an I/O board: At attach time, DR arranges for a CPU and memory from a slot 0 SB to run hpost. POST creates a transaction/error cage consisting of the loaned processor, the memory, and the I/O board being tested. During POST, the cage prevents transactions from the board under test from being routed to any other configured components of the running domain. After POST is complete, the CPU and memory are released to SunOS and the I/O board is introduced into the running domain.
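The lock file mentioned for the split expander case is a common way to serialize ownership between processes. The document does not say how hpost creates its lock, so the sketch below simply shows one widely used technique, exclusive file creation, under that assumption. The helper names and the lock file name are hypothetical, and a temporary directory stands in for $SMSVAR/.lock.

```python
import os
import tempfile


def try_lock(lock_path):
    """Atomically create lock_path; return True if this caller now owns it."""
    try:
        # O_CREAT | O_EXCL fails if the file already exists, so only one
        # process can win the race for a given path.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as handle:
        # Record the owner's PID to help diagnose stale locks.
        handle.write(str(os.getpid()))
    return True


def release_lock(lock_path):
    """Give up ownership by removing the lock file."""
    os.remove(lock_path)


lock_dir = tempfile.mkdtemp()                          # stand-in directory
lock_path = os.path.join(lock_dir, "hpost.lock.demo")  # hypothetical name

first = try_lock(lock_path)    # first claimant takes ownership
second = try_lock(lock_path)   # a competing claimant is refused
release_lock(lock_path)
third = try_lock(lock_path)    # the lock is free again after release
release_lock(lock_path)
os.rmdir(lock_dir)
print(first, second, third)    # True False True
```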
Internal information and links
There are a few hpost command-line options that are useful to know about and understand.
Be aware that running hpost directly will not bring a domain online. POST only configures and tests hardware. It does not download OBP or initiate the SunOS boot process.