Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-72-2353961.1
Update Date:2018-01-27
Keywords:

Solution Type  Problem Resolution Sure

Solution  2353961.1 :   T7-1 WARNING: nvmex 1 IO timeout (CSTS 1)! (CSTS 2) (CSTS 3)  


Related Items
  • SPARC T8-1
  • SPARC T7-2
  • Netra SPARC S7-2
  • MiniCluster S7-2 Hardware
  • SPARC T8-4
  • SPARC T8-2
  • SPARC T7-4
  • SPARC S7-2L
  • SPARC T7-1
Related Categories
  • PLA-Support>Sun Systems>SPARC>CMT>SN-SPARC: T7




In this Document
Symptoms
Changes
Cause
Solution
References


Created from <SR 3-16692393861>

Applies to:

SPARC T7-1 - Version All Versions to All Versions [Release All Releases]
SPARC T7-2 - Version All Versions to All Versions [Release All Releases]
SPARC T7-4 - Version All Versions to All Versions [Release All Releases]
Netra SPARC S7-2 - Version All Versions to All Versions [Release All Releases]
SPARC S7-2L - Version All Versions to All Versions [Release All Releases]
Information in this document applies to any platform.

Symptoms

nvme: WARNING: nvmex 1 IO timeout (CSTS 1)!

Changes

The value following "nvmex" (in this case a "1") indicates the nvmex instance.

The value following "CSTS" (in this case a "1") is the content of the controller status register at the time of the IO timeout. The register is a bit field, so more than one status can be reported at once. The bit positions are defined within the header file:

/usr/include/sys/nvme/nvme_reg.h

/* CSTS (Offset 1Ch) - Controller Status */

#define NVR_CSTS_RDY    (1)     /* Controller is ready for commands */
        NVR_CSTS_RDY_SFT = 0,   /* Ready */
        NVR_CSTS_CFS_SFT = 1,   /* Controller Fatal Status */
        NVR_CSTS_SHST_SFT = 2,  /* Shutdown Status */

Each _SFT constant is a bit position (shift) within the register, not a status value, so the reported values decode as follows:

nvme: WARNING: nvmex 1 IO timeout (CSTS 1) *bit 0 set: nvmex instance 1 suffered an IO timeout while the controller reported "Ready"*

nvme: WARNING: nvmex 1 IO timeout (CSTS 2) *bit 1 set: "Controller Fatal Status"*

nvme: WARNING: nvmex 1 IO timeout (CSTS 3) *bits 0 and 1 set: "Ready" and "Controller Fatal Status"*

"Shutdown Status" starts at bit 2, so it contributes 4 or more to the CSTS value and is not indicated by the values 1 to 3.
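
The header excerpt above can be turned into a small decoder for the warning line. The following is a minimal sketch, not an Oracle tool: it assumes the number printed after "CSTS" is the raw register content in decimal and that the _SFT constants are bit positions. The names parse_warning and decode_csts are illustrative, and the 2-bit width used for the Shutdown Status field comes from the NVMe specification rather than from the excerpt.

```python
import re

# Bit positions copied from the nvme_reg.h excerpt.
NVR_CSTS_RDY_SFT = 0   # Ready
NVR_CSTS_CFS_SFT = 1   # Controller Fatal Status
NVR_CSTS_SHST_SFT = 2  # Shutdown Status

# Assumption: SHST is a 2-bit field (per the NVMe specification).
SHST_MASK = 0x3

# Assumption: instance and CSTS value are printed in decimal.
WARNING_RE = re.compile(r"nvmex (\d+) IO timeout \(CSTS (\d+)\)")


def decode_csts(value):
    """Return the names of the status fields set in a CSTS value."""
    flags = []
    if (value >> NVR_CSTS_RDY_SFT) & 1:
        flags.append("Ready")
    if (value >> NVR_CSTS_CFS_SFT) & 1:
        flags.append("Controller Fatal Status")
    if (value >> NVR_CSTS_SHST_SFT) & SHST_MASK:
        flags.append("Shutdown Status")
    return flags


def parse_warning(line):
    """Extract (instance, csts, flags) from a warning line, or None."""
    match = WARNING_RE.search(line)
    if match is None:
        return None
    instance = int(match.group(1))
    csts = int(match.group(2))
    return instance, csts, decode_csts(csts)


print(parse_warning("nvme: WARNING: nvmex 1 IO timeout (CSTS 3)!"))
# prints (1, 3, ['Ready', 'Controller Fatal Status'])
```

Testing individual bits rather than comparing the whole value keeps the decoder correct when several fields are set at once.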

Cause

Fault Fault critical Fault detected at time = Fri Jan 19 15:41:59 2018. The suspect component: /SYS/DBP/NVME1 has fault.io.pciex.device-interr-deg with probability=100. Refer to http://support.oracle.com/msg/PCIEX-8000-ND for details.
 
-> show faulty
Target             | Property              | Value
-------------------+-----------------------+-----------------------------------
/SP/faultmgmt/0    | fru                   | /SYS/DBP/NVME1
/SP/faultmgmt/0/   | class                 | fault.io.pciex.device-interr-deg
 faults/0          |                       |
/SP/faultmgmt/0/   | sunw-msg-id           | PCIEX-8000-ND
 faults/0          |                       |
/SP/faultmgmt/0/   | component             | hc:///chassis=0/motherboard=0/host
 faults/0          |                       | bridge=3/pciexrc=12/pciexbus=0/pci
                   |                       | exdev=0/pciexfn=0/pciexbus=0/pciex
                   |                       | dev=0/pciexfn=56/pciexbus=0/pciexd
                   |                       | ev=0/pciexfn=0
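
The show faulty table wraps long Target and Value cells onto continuation rows, which makes the component path hard to read or search for. The following is a minimal sketch for rejoining those rows, assuming that a continuation row is any row whose Property cell is empty. The names join_wrapped_rows and SAMPLE are illustrative only; SAMPLE is the table shown above.

```python
SAMPLE = """\
Target | Property | Value
-------------------+-----------------------+-----------------------------------
/SP/faultmgmt/0 | fru | /SYS/DBP/NVME1
/SP/faultmgmt/0/ | class | fault.io.pciex.device-interr-deg
 faults/0 | |
/SP/faultmgmt/0/ | sunw-msg-id | PCIEX-8000-ND
 faults/0 | |
/SP/faultmgmt/0/ | component | hc:///chassis=0/motherboard=0/host
 faults/0 | | bridge=3/pciexrc=12/pciexbus=0/pci
  | | exdev=0/pciexfn=0/pciexbus=0/pciex
  | | dev=0/pciexfn=56/pciexbus=0/pciexd
  | | ev=0/pciexfn=0
"""


def join_wrapped_rows(table_text):
    """Rejoin table rows whose cells wrap onto continuation lines."""
    rows = []
    for line in table_text.splitlines():
        if "|" not in line:
            continue  # blank lines and the ---+--- rule have no cell separator
        cells = [cell.strip() for cell in line.split("|")]
        if len(cells) != 3 or cells == ["Target", "Property", "Value"]:
            continue  # skip malformed lines and the column header
        target, prop, value = cells
        if prop:
            # Assumption: a non-empty Property cell starts a new logical row.
            rows.append([target, prop, value])
        elif rows:
            # Continuation row: append the fragments without a separator.
            rows[-1][0] += target
            rows[-1][2] += value
    return [tuple(row) for row in rows]


for target, prop, value in join_wrapped_rows(SAMPLE):
    print(f"{target}  {prop} = {value}")
```

The rejoined component value should match the Resource Location reported by the faultmgmt shell for the same fault.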

PROPERTIES:
 /SYS/DBP/NVME1
  Properties:
  type = PCIE Module
  fault_state = Faulted
  clear_fault_action = (none)

faultmgmt shell:

Problem Status : open
Diag Engine : eft 1.16
System
  Manufacturer : Oracle Corporation
  Name : SPARC T7-1
  Part_Number : 35172098+1+1
  Serial_Number : 1733NN72E4

----------------------------------------
Suspect 1 of 1
  Problem class : fault.io.pciex.device-interr-deg
  Certainty : 100%
  Affects : dev:////pci@303/pci@1/pci@0/pci@7/nvme@0
  Status : faulted

  FRU
    Status : faulty
    Location : /SYS/DBP/NVME1
    Chassis
      Manufacturer : Oracle Corporation
      Name : SPARC T7-1
      Part_Number : 35172098+1+1
      Serial_Number : 1733NN72E4
  Resource
    Location : hc:///chassis=0/motherboard=0/hostbridge=3/pciexrc=12/pciexbus=0/pciexdev=0/pciexfn=0/pciexbus=0/pciexdev=0/pciexfn=56/pciexbus=0/pciexdev=0/pciexfn=0

Description : A fault was diagnosed by the Host Operating System.

Action : Please refer to the associated reference document at
  http://support.oracle.com/msg/PCIEX-8000-ND for a complete,
  detailed description and the latest service procedures and
  policies regarding this diagnosis.

To check the status of the NVMe drives prior to the OS reboot, open the Explorer output, go to disks, then nvme, and review:
nvmeadm_getlog_-e.out
and
nvmeadm_getlog_-h.out
 

Solution

Replace NVME1, then clear the faults in ILOM and in the faultmgmt shell if needed.
 

References

<NOTE:1021335.1> - PCIEX-8000-ND - PCIEX subsystem problem

Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.