Asset ID: 1-72-1577361.1
Update Date: 2017-09-29
Solution Type: Problem Resolution Sure Solution
1577361.1: Downgrading system firmware invalidates the TLI with FDD/FMA reporting the following faults [fault.chassis.tli.invalid, ILOM-8000-30]
Related Items
- SPARC T4-1
- Netra SPARC T4-1 Server
- Netra SPARC T4-2 Server
- Netra SPARC T4-1B
- SPARC T4-4
- SPARC T4-2
Related Categories
- PLA-Support>Sun Systems>SPARC>CMT>SN-SPARC: T4
After downgrading the T4 platform to system firmware 8.2.2.c, the system is unbootable and FDD/FMA reports the following fault: [fault.chassis.tli.invalid, ILOM-8000-30]
Created from <SR 3-7658553331>
Applies to:
SPARC T4-2 - Version All Versions to All Versions [Release All Releases]
SPARC T4-1 - Version All Versions to All Versions [Release All Releases]
Netra SPARC T4-1B - Version All Versions to All Versions [Release All Releases]
Netra SPARC T4-1 Server - Version All Versions to All Versions [Release All Releases]
Netra SPARC T4-2 Server - Version All Versions to All Versions [Release All Releases]
Information in this document applies to any platform.
Symptoms
After downgrading the system firmware, the system is unbootable and FDD/FMA reports the following fault: [fault.chassis.tli.invalid, ILOM-8000-30]
Cause
SPARC systems require a valid TLI before the system is allowed to power on. Allowing a SPARC system to boot with an invalid TLI (MACADDR and HOSTID) may be destructive to both the application and the network.
Solution
The SPARC system uses the Top Level Identifier (TLI) to store unique attributes of a particular server. The TLI contains the following critical system information; the PSN, MACADDR and HOSTID are unique to each server.
Product Part Number (PPN)
Product Serial Number(PSN)
Host MAC address (MACADDR)
System Hostid (HOSTID)
SPARC T-series systems store three copies of the TLI in different FRU locations. The TLI data must be valid before the system is allowed to power on: ILOM prevents the system from being powered on (start /SYS) when there is no valid MACADDR and HOSTID.
A system with invalid TLI data displays the following information on the ILOM console.
Oracle(R) Integrated Lights Out Manager
Version 3.0.16.9.b r77667
Copyright (c) 2012, Oracle and/or its affiliates. All rights reserved.
Warning: Product identification data missing. System may not function properly.
Service must update product identification data. Contact Service immediately.
->
When a system with an invalid TLI is powered on (start /SYS), the power-on fails with the following FDD diagnosis: fault.chassis.tli.invalid (ILOM-8000-30).
-> start /SYS
Are you sure you want to start /SYS (y/n)? y
Starting /SYS
-> show faulty
Target | Property | Value
--------------------+------------------------+---------------------------------
/SP/faultmgmt/0 | fru | /SYS
/SP/faultmgmt/0/ | class | fault.chassis.tli.invalid
faults/0 | |
/SP/faultmgmt/0/ | sunw-msg-id | ILOM-8000-30
faults/0 | |
/SP/faultmgmt/0/ | component | /SYS
faults/0 | |
/SP/faultmgmt/0/ | uuid | e400e251-e94a-c047-a2cf-a12e111b
faults/0 | | 33b4
/SP/faultmgmt/0/ | timestamp | 2013-08-15/14:56:34
faults/0 | |
/SP/faultmgmt/0/ | product_serial_number | unknown
faults/0 | |
/SP/faultmgmt/0/ | chassis_serial_number | unknown
faults/0 | |
->
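The `show faulty` table wraps long Target and Value cells onto a second row, which makes the output awkward to search by eye or with grep. The following is a minimal Python sketch, assuming the three-column `|`-separated layout shown above, that rejoins the wrapped cells. The function names `parse_show_faulty` and `has_tli_fault` are illustrative and are not part of ILOM.

```python
def parse_show_faulty(text):
    """Parse captured `show faulty` text into {target: {property: value}}.

    Assumes three '|'-separated columns, with wrapped Target/Value cells
    continued on a following row whose Property column is empty.
    """
    entries = []  # each entry: [target, property, value]
    for line in text.splitlines():
        cells = [c.strip() for c in line.split("|")]
        if len(cells) != 3 or cells[1] == "Property":
            continue  # prompt, separator, or header row
        if cells[1]:
            entries.append(cells)  # first row of a new property
        elif entries:
            # Continuation row: append the wrapped pieces to the previous entry.
            entries[-1][0] += cells[0]
            entries[-1][2] += cells[2]
    faults = {}
    for target, prop, value in entries:
        faults.setdefault(target, {})[prop] = value
    return faults


def has_tli_fault(faults):
    """True if any parsed entry has class fault.chassis.tli.invalid."""
    return any(props.get("class") == "fault.chassis.tli.invalid"
               for props in faults.values())
```

Applied to the transcript above, the wrapped uuid is returned as a single string and the fault class can be tested directly.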
For SPARC T3-1 and T4-1, the TLI is stored in the following locations:
Primary TLI : Power Distribution Board ( /SYS/PDB )
Backup 1 : ILOM ( /persist/psnc_backup1.xml )
Backup 2 : Connector Board (/SYS/CONNBD)
For SPARC T3-2 and T4-2, the TLI is stored in the following locations:
Primary TLI : fruid:///SYS/FANBD
Backup 1 : file:///persist/psnc_backup1.xml
Backup 2 : fruid:///SYS/SASBP
For SPARC T5-2, the TLI is stored in the following locations:
Primary TLI : SASBP ( /SYS/SASBP )
Backup 1 : ILOM ( /persist/psnc_backup1.xml )
Backup 2 : Motherboard (/SYS/MB)
For SPARC T3-4, T4-4, T5-4 and T5-8, the TLI is stored in the following locations:
Primary TLI : Rear Chassis Subassembly ( /SYS/PDB ; on T5 servers this is listed as /SYS/RCSA )
Backup 1 : ILOM ( /persist/psnc_backup1.xml )
Backup 2 : Front I/O Assembly (/SYS/MB/FIO)
Note: On these systems the PDB is part of the RCSA, and the FIO is located in the Main Module.
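The per-platform locations above can be kept as a small lookup table so that later steps use the correct FRU path. This is a sketch only: the paths are copied from the lists above, the names `TLI_LOCATIONS` and `tli_locations` are illustrative, and the blade servers (T4-1B, T5-1B) are left out because they do not appear in those lists.

```python
BACKUP1 = "/persist/psnc_backup1.xml"

# (primary, backup 1, backup 2) as listed above for each platform.
TLI_LOCATIONS = {
    "T3-1": ("/SYS/PDB", BACKUP1, "/SYS/CONNBD"),
    "T4-1": ("/SYS/PDB", BACKUP1, "/SYS/CONNBD"),
    "T3-2": ("/SYS/FANBD", BACKUP1, "/SYS/SASBP"),
    "T4-2": ("/SYS/FANBD", BACKUP1, "/SYS/SASBP"),
    "T5-2": ("/SYS/SASBP", BACKUP1, "/SYS/MB"),
    "T3-4": ("/SYS/PDB", BACKUP1, "/SYS/MB/FIO"),
    "T4-4": ("/SYS/PDB", BACKUP1, "/SYS/MB/FIO"),
    # On T5 servers the primary is listed as /SYS/RCSA.
    "T5-4": ("/SYS/RCSA", BACKUP1, "/SYS/MB/FIO"),
    "T5-8": ("/SYS/RCSA", BACKUP1, "/SYS/MB/FIO"),
}


def tli_locations(model):
    """Return the TLI locations for a model such as 'T4-1' or 'SPARC T4-1'.

    Raises KeyError for platforms that are not in the table.
    """
    key = model.upper().replace("SPARC", "").strip()
    primary, backup1, backup2 = TLI_LOCATIONS[key]
    return {"primary": primary, "backup1": backup1, "backup2": backup2}
```

The paths reported by showpsnc on the system itself remain the authority; the table is only a convenience for preparing commands in advance.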
The error is unlikely to be a hardware issue; it could be due to a recent System Firmware upgrade or downgrade, or a recent hardware replacement. Contact Oracle and open a new service request.
The quickest way to check the status of the TLI is to get into restricted shell from ILOM and run "showpsnc".
In the following example all the elements in the TLI are valid.
-> set SESSION mode=restricted
WARNING: The "Restricted Shell" account is provided solely
to allow Services to perform diagnostic tasks.
[(restricted_shell) t4-1-sin06-b-sp:~]# showpsnc
Primary: fruid:///SYS/PDB
Backup 1: file:///persist/psnc_backup1.xml
Backup 2: fruid:///SYS/CONNBD
Element | Primary | Backup1 | Backup2
------------------+-------------------+-------------------+-------------------
PPN 30056845+1+1 30056845+1+1 30056845+1+1
PSN 1141XXXXXX 1141XXXXXX 1141XXXXXX
MACADDR 00:21:XX:XX:XX:XX 00:21:XX:XX:XX:XX 00:21:XX:XX:XX:XX
HOSTID 85dXXXXX 85dXXXXX 85dXXXXX
[(restricted_shell) t4-1-sin06-b-sp:~]#
Example of a valid TLI on a SPARC T5-8. Note that the T5 uses the new 8PN TLI format. Unless the FRUs are bilingual, the showpsnc output for 7PN and 8PN is almost identical; the 8PN TLI has one additional entry, "Product Name".
-> set SESSION mode=restricted
WARNING: The "Restricted Shell" account is provided solely
to allow Services to perform diagnostic tasks.
[(restricted_shell) t5-8-sin06-a-sp:~]# showpsnc
Primary: fruid:///SYS/RCSA
Backup 1: file:///persist/psnc_backup1.xml
Backup 2: fruid:///SYS/FIO
Element | Primary | Backup1 | Backup2
------------------+-------------------+-------------------+-------------------
PPN 31373242+1+1 31373242+1+1 31373242+1+1
PSN AK000XXXXX AK000XXXXX AK000XXXXX
MACADDR 00:10:XX:XX:XX:XX 00:10:XX:XX:XX:XX 00:10:XX:XX:XX:XX
HOSTID 86XXXXXX 86XXXXXX 86XXXXXX
Product Name SPARC T5-8 SPARC T5-8 SPARC T5-8
[(restricted_shell) t5-8-sin06-a-sp:~]#
showpsnc output collected from T4-1B and T5-1B blades:
[(restricted_shell) t4-1b-bur09-b-sp:~]# showpsnc
Primary: fruid:///SYS/MB
Backup 1: file:///persist/psnc_backup1.xml
Element | Primary | Backup1
------------------+-------------------+-------------------
PPN 30065641+1+1 30065641+1+1
PSN 1136NN10H7 1136NN10H7
MACADDR 00:21:28:D6:B9:D0 00:21:28:D6:B9:D0
HOSTID 85d6b9d0 85d6b9d0
[(flash)root@t5-1b-bur09-p1-sp:~]# showpsnc
Primary: fruid:///SYS/MB
Backup 1: file:///persist/psnc_backup1.xml
Element | Primary | Backup1
------------------+-------------------+-------------------
PPN ATO-2BB ATO-2BB
PSN 1249NN315P 1249NN315P
MACADDR 00:10:E0:23:23:82 00:10:E0:23:23:82
HOSTID 86232382 86232382
Product Name SPARC T5-1B SPARC T5-1B
When the elements in the TLI are invalid, the container status is listed as "Not Valid". In the following example all the TLI containers are shown as not valid and the system will not power on (start /SYS).
bash-2.05b# showpsnc
Primary: fruid:///SYS/PDB
read TLI failed for /SYS/PDB(fru_status: 14): Data could not be found
read Ethernet_Addr failed for /SYS/PDB(fru_status: 14): Data could not be found
read HOSTID failed for /SYS/PDB(fru_status: 14): Data could not be found
Backup 1: file:///persist/psnc_backup1.xml
Backup 2: fruid:///SYS/CONNBD
read TLI failed for /SYS/CONNBD(fru_status: 14): Data could not be found
Element | Primary | Backup 1 | Backup 2
------------------+-------------------+-------------------+-------------------
Container Status Not Valid Not Valid Not Valid
PPN unknown 30056845+1+1 unknown
PSN unknown 1141XXXXXX unknown
MACADDR unknown 00:21:XX:XX:XX:XX unknown
HOSTID 0 85dXXXXX 0
bash-2.05b#
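When showpsnc output is captured to a log, the element table can be read programmatically to see which copies are flagged. The sketch below is in Python and assumes the captured text keeps its original column alignment, so that the `+` characters in the separator row mark the column boundaries; it handles a single table, not the split tables printed for bilingual FRUs. `parse_showpsnc` and `invalid_copies` are illustrative names, not ILOM commands.

```python
def parse_showpsnc(text):
    """Return {element: {column_header: value}} for one showpsnc table."""
    lines = text.splitlines()
    # The separator row ("-----+-----+...") fixes the column boundaries.
    sep = next(i for i, l in enumerate(lines)
               if l and set(l) <= {"-", "+"} and "+" in l)
    bounds = [-1] + [i for i, ch in enumerate(lines[sep]) if ch == "+"] + [None]

    def cells(line):
        # Slice the line at the separator positions and trim padding.
        return [line[a + 1:b].strip(" |") for a, b in zip(bounds, bounds[1:])]

    headers = cells(lines[sep - 1])[1:]
    table = {}
    for line in lines[sep + 1:]:
        stripped = line.strip()
        # Stop at a blank line or at a shell prompt (assumed to end in # or $).
        if not stripped or stripped.endswith(("#", "$")):
            break
        name, *values = cells(line)
        table[name] = dict(zip(headers, values))
    return table


def invalid_copies(table):
    """List the copies whose Container Status is anything other than Valid."""
    status = table.get("Container Status", {})
    return [copy for copy, value in status.items() if value.lower() != "valid"]
```

Output that has no "Container Status" row, as in the valid examples earlier, yields an empty list from `invalid_copies`.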
The TLI format has changed: the legacy format is 7PN (Sun legacy 7-digit identity record) and the new format is 8PN (Oracle 8-digit identity record).
For SPARC T3 and T4, System Firmware 8.2.2.c and below uses the 7PN format exclusively. System Firmware 8.3.0.b onwards uses a bilingual ILOM that understands both the 7PN and 8PN TLI formats. A bilingual FRU is needed so that parts can be interchanged between the older 7PN and the newer 8PN systems.
The SPARC T5 platform uses the 8PN TLI format. We do not expect to see the "fault.chassis.tli.invalid" issue when downgrading or upgrading firmware on the SPARC T5.
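The firmware boundary described above for SPARC T3 and T4 can be expressed as a simple version comparison. The Python sketch below assumes version strings of the form "8.2.2.c" (three numbers and a letter). It returns None for versions between 8.2.2.c and 8.3.0.b, because this document does not state which formats those releases understand. The function names are illustrative.

```python
def parse_sysfw(version):
    """Split a version such as '8.2.2.c' into (8, 2, 2, 'c')."""
    major, minor, micro, letter = version.strip().split(".")
    return int(major), int(minor), int(micro), letter.lower()


def tli_formats(version):
    """TLI formats understood by T3/T4 System Firmware `version`.

    Returns {"7PN"} for 8.2.2.c and below, {"7PN", "8PN"} for 8.3.0.b
    onwards, and None for anything in between (not covered here).
    """
    v = parse_sysfw(version)
    if v <= (8, 2, 2, "c"):
        return {"7PN"}
    if v >= (8, 3, 0, "b"):
        return {"7PN", "8PN"}
    return None
```

For example, `tli_formats("8.2.2.c")` gives only 7PN, which is why 8PN records left behind by a later firmware release are not understood after a downgrade.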
Below is an example of a bilingual FRU, taken from a T4-1 with a 7PN TLI after upgrading from 8.2.2.c to 8.3.0.c.
[(restricted_shell) t4-1-sin06-b-sp:~]# showpsnc
Primary: fruid:///SYS/PDB
Backup 1: file:///persist/psnc_backup1.xml
Backup 2: fruid:///SYS/CONNBD
Element | Primary (7) | Primary (8) <<<<<<<<<< Bilingual FRU Format
------------------+-------------------+-------------------
PPN 30056845+1+1 30056845+1+1
PSN 1141XXXXXX 1141XXXXXX
MACADDR 00:21:XX:XX:XX:XX 00:21:XX:XX:XX:XX
HOSTID 85dXXXXX 85dXXXXX
Element | Backup1
------------------+-------------------
PPN 30056845+1+1
PSN 1141XXXXXX
MACADDR 00:21:XX:XX:XX:XX
HOSTID 85dXXXXX
Element | Backup2 (7) | Backup2 (8) <<<<<<<<<< Bilingual FRU Format
------------------+-------------------+-------------------
PPN 30056845+1+1 30056845+1+1
PSN 1141XXXXXX 1141XXXXXX
MACADDR 00:21:XX:XX:XX:XX 00:21:XX:XX:XX:XX
HOSTID 85dXXXXX 85dXXXXX
On some T4 systems, after upgrading from System Firmware 8.2.2.c to 8.3.0.b (onwards), downgrading back to 8.2.2.c may trigger a fault.chassis.tli.invalid (ILOM-8000-30) fault. The workaround for this issue is as follows.
STEP 1.) Ensure that the downgrade to 8.2.2.c has completed.
T4 System Firmware 8.2.2.c
SPARC T4-1 Patch 148822-05
SPARC T4-2 Patch 148823-05
SPARC T4-4 Patch 148824-05
SPARC T4-1B Patch 148825-04
Netra SPARC T4-1 Patch 148826-05
Netra SPARC T4-2 Patch 148827-04
Netra SPARC T4-1B Patch 148828-04
STEP 2.) Log in to ILOM as root and run "show faulty".
For Example
-> show faulty
Target | Property | Value
--------------------+------------------------+---------------------------------
/SP/faultmgmt/0 | fru | /SYS
/SP/faultmgmt/0/ | class | fault.chassis.tli.invalid
faults/0 | |
/SP/faultmgmt/0/ | sunw-msg-id | ILOM-8000-30
faults/0 | |
/SP/faultmgmt/0/ | component | /SYS
faults/0 | |
/SP/faultmgmt/0/ | uuid | e400e251-e94a-c047-a2cf-a12e111b
faults/0 | | 33b4
/SP/faultmgmt/0/ | timestamp | 2013-08-15/14:56:34
faults/0 | |
/SP/faultmgmt/0/ | product_serial_number | unknown
faults/0 | |
/SP/faultmgmt/0/ | chassis_serial_number | unknown
faults/0 | |
STEP 3.) From the restricted shell, run "showpsnc". Every container status should be listed as "Not Valid".
For Example
Oracle(R) Integrated Lights Out Manager
Version 3.0.16.9.b r77667
Copyright (c) 2012, Oracle and/or its affiliates. All rights reserved.
Warning: Product identification data missing. System may not function properly.
Service must update product identification data. Contact Service immediately.
-> set SESSION mode=restricted
WARNING: The "Restricted Shell" account is provided solely
to allow Services to perform diagnostic tasks.
[(restricted_shell) t4-1-sin06-b-sp:~]$ showpsnc
Primary: fruid:///SYS/PDB
Backup 1: file:///persist/psnc_backup1.xml
Backup 2: fruid:///SYS/CONNBD
Element | Primary (7) | Primary (8)
------------------+-------------------+-------------------
Container Status Not Valid Not Valid
PPN 30056845+1+1 30056845+1+1
PSN 1141XXXXXX 1141XXXXXX
MACADDR 00:21:XX:XX:XX:XX
HOSTID 85dXXXXX 0
Element | Backup1
------------------+-------------------
Container Status Not Valid
PPN 30056845+1+1
PSN 1141XXXXXX
MACADDR 00:21:XX:XX:XX:XX
HOSTID 85dXXXXX
Element | Backup2 (7) | Backup2 (8)
------------------+-------------------+-------------------
Container Status Not Valid Not Valid
PPN 30056845+1+1 30056845+1+1
PSN 1141XXXXXX 1141XXXXXX
MACADDR 00:21:XX:XX:XX:XX
HOSTID 85dXXXXX 0
[(restricted_shell) t4-1-sin06-b-sp:~]$
->
STEP 4.) Log in to ILOM in escalation mode ( Doc id 1019946 ).
STEP 5.) In escalation mode, run "clearfru -d Top_Level_IdentifierR" and "clearfru -d System_IdentityR" on the PRIMARY fruid.
For Example ( From a T4-1 )
bash-2.05b#
bash-2.05b# clearfru -d Top_Level_IdentifierR /SYS/PDB
Clear FRUs (y or [n])? y
Deleted /SYS/PDB:FL:Top_Level_IdentifierR
bash-2.05b# clearfru -d System_IdentityR /SYS/PDB
Clear FRUs (y or [n])? y
Deleted /SYS/PDB:FL:System_IdentityR
bash-2.05b#
STEP 6.) In escalation mode, run "clearfru -d Top_Level_IdentifierR" and "clearfru -d System_IdentityR" on the BACKUP2 fruid.
For Example ( From a T4-1 )
bash-2.05b#
bash-2.05b# clearfru -d Top_Level_IdentifierR /SYS/CONNBD
Clear FRUs (y or [n])? y
Deleted /SYS/CONNBD:FL:Top_Level_IdentifierR
bash-2.05b# clearfru -d System_IdentityR /SYS/CONNBD
Clear FRUs (y or [n])? y
Deleted /SYS/CONNBD:FL:System_IdentityR
bash-2.05b#
Note: If this is a T4-4 system the fruid should be "/SYS/MB/FIO"
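Steps 5 and 6 run the same two clearfru commands against two different FRUs, so it is easy to mistype a path or skip a record. As a sketch, the Python helper below only builds the command strings for review; it does not run anything on the SP. The record names and command form are taken from the examples above, and `clearfru_commands` is an illustrative name.

```python
# The two FRU records cleared in steps 5 and 6.
TLI_RECORDS = ("Top_Level_IdentifierR", "System_IdentityR")


def clearfru_commands(primary_fru, backup2_fru):
    """Return the four clearfru command lines, primary FRU first."""
    return [f"clearfru -d {record} {fru}"
            for fru in (primary_fru, backup2_fru)
            for record in TLI_RECORDS]


if __name__ == "__main__":
    # T4-1 paths from the examples above; on a T4-4 the Backup 2 path
    # would be /SYS/MB/FIO, per the note.
    for command in clearfru_commands("/SYS/PDB", "/SYS/CONNBD"):
        print(command)
```

Each command still prompts for confirmation on the SP, as shown in the transcripts.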
STEP 7.) Verify that the TLI elements for PRIMARY and BACKUP2 are cleared by running the TLI command "showpsnc". The PPN, PSN and MACADDR entries will be marked as "unknown" and the HOSTID will be shown as 0.
For Example
bash-2.05b# showpsnc
Primary: fruid:///SYS/PDB
read TLI failed for /SYS/PDB(fru_status: 14): Data could not be found
read Ethernet_Addr failed for /SYS/PDB(fru_status: 14): Data could not be found
read HOSTID failed for /SYS/PDB(fru_status: 14): Data could not be found
Backup 1: file:///persist/psnc_backup1.xml
Backup 2: fruid:///SYS/CONNBD
read TLI failed for /SYS/CONNBD(fru_status: 14): Data could not be found
Element | Primary | Backup 1 | Backup 2
------------------+-------------------+-------------------+-------------------
Container Status Not Valid Not Valid Not Valid
PPN unknown 30056845+1+1 unknown
PSN unknown 1141XXXXXX unknown
MACADDR unknown 00:21:XX:XX:XX:XX unknown
HOSTID 0 85dXXXXXX 0
bash-2.05b#
STEP 8.) Recreate the PRIMARY TLI (in system firmware 8.2.2.c) with the command "setpsnc".
For Example
bash-2.05b# setpsnc
Reading fruid:///SYS/PDB...
read TLI failed for /SYS/PDB(fru_status: 14): Data could not be found
read Ethernet_Addr failed for /SYS/PDB(fru_status: 14): Data could not be found
read HOSTID failed for /SYS/PDB(fru_status: 14): Data could not be found
Warning: Could not read container.
PPN ['unknown']: 30056845+1+1
PSN ['unknown']: 1141BDY9A6
MACADDR ['unknown']: 00:21:XX:XX:XX:XX
HOSTID [0x0]: 85dXXXXX
PPN 30056845+1+1
PSN 1141XXXXXX
MACADDR 00:21:XX:XX:XX:XX
HOSTID 85dXXXXXX
Is the above correct? (y|n) [n]: y
Writing fruid:///SYS/PDB...
You will need to reboot the SP for these changes to take fully effect.
bash-2.05b#
bash-2.05b# showpsnc
Primary: fruid:///SYS/PDB
Backup 1: file:///persist/psnc_backup1.xml
Backup 2: fruid:///SYS/CONNBD
read TLI failed for /SYS/CONNBD(fru_status: 14): Data could not be found
Element | Primary | Backup 1 | Backup 2
------------------+-------------------+-------------------+-------------------
Container Status Valid Not Valid Not Valid
PPN 30056845+1+1 30056845+1+1 unknown
PSN 1141XXXXXX 1141XXXXXX unknown
MACADDR 00:21:XX:XX:XX:XX 00:21:XX:XX:XX:XX unknown
HOSTID 85dXXXXX 85dXXXXX 0
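Because setpsnc writes whatever is typed at its prompts, it can help to sanity-check the MACADDR and HOSTID values beforehand. The Python sketch below checks the formats seen in this document (six colon-separated hex octets and eight hex digits). It also applies a heuristic that is an observation, not a documented rule: in the two unmasked blade samples earlier in this document, the last three MAC octets equal the low six hex digits of the HOSTID. A mismatch is worth a second look but is not proof of an error. `check_identity` is an illustrative name.

```python
import re

MAC_RE = re.compile(r"^[0-9A-Fa-f]{2}(:[0-9A-Fa-f]{2}){5}$")
HOSTID_RE = re.compile(r"^[0-9A-Fa-f]{8}$")


def check_identity(mac, hostid):
    """Return a list of problems found in a MACADDR/HOSTID pair."""
    problems = []
    if not MAC_RE.match(mac):
        problems.append("MACADDR is not six colon-separated hex octets")
    if not HOSTID_RE.match(hostid):
        problems.append("HOSTID is not eight hex digits")
    if not problems:
        # Heuristic only: pattern observed in this document's samples.
        mac_tail = mac.replace(":", "").lower()[-6:]
        if mac_tail != hostid.lower()[-6:]:
            problems.append("HOSTID low digits differ from MACADDR tail")
    return problems
```

An empty list means the pair passed all three checks; the values should still be confirmed against the surviving Backup 1 copy or the system's records.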
STEP 9.) Copy the Primary TLI into Backup 1 and Backup 2 with the "copypsnc" command.
For Example
bash-2.05b# copypsnc PRIMARY BACKUP1
bash-2.05b# copypsnc PRIMARY BACKUP2
bash-2.05b# showpsnc
Primary: fruid:///SYS/PDB
Backup 1: file:///persist/psnc_backup1.xml
Backup 2: fruid:///SYS/CONNBD
Element | Primary | Backup 1 | Backup 2
------------------+-------------------+-------------------+-------------------
Container Status Valid Valid Valid
PPN 30056845+1+1 30056845+1+1 30056845+1+1
PSN 1141XXXXXX 1141XXXXXX 1141XXXXXX
MACADDR 00:21:XX:XX:XX:XX 00:21:XX:XX:XX:XX 00:21:XX:XX:XX:XX
HOSTID 85dXXXXX 85dXXXXX 85dXXXXX
bash-2.05b#
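After copypsnc, every element should read the same in all three copies. A minimal Python sketch of that comparison follows, assuming the table has already been transcribed or parsed into a dictionary of the form {element: {copy: value}}. The name `mismatched_elements` is illustrative.

```python
def mismatched_elements(table):
    """Return the elements whose values differ between TLI copies.

    `table` maps each element name (PPN, PSN, ...) to a dict of
    {copy name: value}. An empty result means all copies agree.
    """
    return sorted(name for name, copies in table.items()
                  if len(set(copies.values())) > 1)


if __name__ == "__main__":
    after_copy = {
        "Container Status": {"Primary": "Valid", "Backup 1": "Valid", "Backup 2": "Valid"},
        "PPN": {"Primary": "30056845+1+1", "Backup 1": "30056845+1+1", "Backup 2": "30056845+1+1"},
    }
    print(mismatched_elements(after_copy))
```

With the step 8 output, where Backup 2 is still "unknown", the same function would report every element as mismatched.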
STEP 10.) Exit escalation mode, reset the SP and power-cycle the platform.
For Example
bash-2.05b# exit
exit
->
-> reset /SP
Are you sure you want to reset /SP (y/n)? y
Performing reset on /SP
-> stop -f /SYS
Are you sure you want to immediately stop /SYS (y/n)? y
stop: Target already stopped
-> show faulty
Target | Property | Value
--------------------+------------------------+---------------------------------
-> start /SYS
Are you sure you want to start /SYS (y/n)? y
Starting /SYS
-> start /SP/console
Are you sure you want to start /SP/console (y/n)? y
Serial console started. To stop, type #.
[CPU 0:0:0] NOTICE: Initializing TOD: 2013/08/15 08:09:47
[CPU 0:0:0] NOTICE: Loaded ASR status DB data. Ver. 3.