Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition
Solution Type: Technical Instruction Sure Solution

2032240.1 : How to Replace an Oracle Server X5-4 NVMe Disk [VCAP]
Applies to:
Oracle Server X5-4 - Version All Versions to All Versions [Release All Releases]
Information in this document applies to any platform.

Goal
How to Replace an Oracle Server X5-4 NVMe Disk.

Solution
DISPATCH INSTRUCTIONS
WHAT SKILLS DOES THE FIELD ENGINEER/ADMINISTRATOR NEED:
TIME ESTIMATE: 30 minutes
TASK COMPLEXITY: 0

FIELD ENGINEER/ADMINISTRATOR INSTRUCTIONS:
PROBLEM OVERVIEW: An Oracle Server X5-4 NVMe Disk needs replacement

WHAT STATE SHOULD THE SYSTEM BE IN TO BE READY TO PERFORM THE RESOLUTION ACTIVITY? :
NVMe drives combine a controller and a storage device in one unit, so their failure modes differ considerably from those of SAS devices. In particular, the controller can report a Healthy status and a failure code at the same time. If the controller believes the internal state of the drive metadata could allow the drive to return incorrect data to the host, the drive goes into Disable Logical mode. This mode shuts down the drive's storage device, but the controller remains visible to the NVMe driver. This is also known as ASSERT or BAD_CONTEXT mode.
The Oracle Server X5-4 supports NVMe disks on Oracle Solaris or Oracle Linux. [See the procedures below]
If you cannot use the hotplug command due to "command not found" or similar, then enable it like this:
# svcadm enable hotplug
For a list of the virtual PCIe slots of NVMe drives as seen by the operating system, see NVMe Storage Drive Virtual PCIe Slot Designation: https://docs.oracle.com/cd/E56388_01/html/E56396/gomph.html#scrolltoc

NVMe Storage Drive Virtual PCIe Slot Designation
If NVMe storage drives are installed, they are labeled on the system front panel as NVMe0, NVMe1, NVMe2, and NVMe3. However, the server BIOS internally identifies these drives by their virtual PCIe slot numbers. When using operating system commands to power NVMe drives off before removal, you need to know the virtual PCIe slot number of the drive. The following table lists the drive front panel label and its corresponding virtual PCIe slot number used by the operating system.
Note that the virtual PCIe slot name is not the same as the name on the server front panel label.

1. Log in to the Oracle Solaris host.
2. Find the NVMe drive virtual PCIe slot number. Type:
# hotplug list -lc
This command produces output similar to the following for each of the NVMe drives installed in the server:
Connection State   Description Path
-------------------------------------------------------
pcie100    ENABLED PCIe-Native /pci@0,0/pci8086,2f06@2,2/pci111d,80b5@0/pci111d,80b5@4
pcie101    ENABLED PCIe-Native /pci@0,0/pci8086,2f06@2,2/pci111d,80b5@0/pci111d,80b5@5
pcie102    ENABLED PCIe-Native /pci@0,0/pci8086,2f06@2,2/pci111d,80b5@0/pci111d,80b5@6
pcie103    ENABLED PCIe-Native /pci@0,0/pci8086,2f06@2,2/pci111d,80b5@0/pci111d,80b5@7
3. Prepare the NVMe drive for removal by powering off the drive slot. For example, to prepare NVMe0 for removal, type the following commands:
# hotplug poweroff pcie100
# hotplug list -lc
The following output appears for the NVMe0 drive that has been powered off (its state changes to PRESENT):
Connection State   Description Path
-------------------------------------------------------
pcie100    PRESENT PCIe-Native /pci@0,0/pci8086,2f06@2,2/pci111d,80b5@0/pci111d,80b5@4
pcie101    ENABLED PCIe-Native /pci@0,0/pci8086,2f06@2,2/pci111d,80b5@0/pci111d,80b5@5
pcie102    ENABLED PCIe-Native /pci@0,0/pci8086,2f06@2,2/pci111d,80b5@0/pci111d,80b5@6
pcie103    ENABLED PCIe-Native /pci@0,0/pci8086,2f06@2,2/pci111d,80b5@0/pci111d,80b5@7
4. Verify that the blue OK to Remove indicator on the NVMe drive is lit.
5. On the drive you plan to remove, push the latch release button to open the drive latch.
6. Grasp the latch and pull the drive out of the drive slot.
7. Verify that the NVMe drive has been removed. Type:
# hotplug list -lc
The following output appears (the removed drive will show the EMPTY state):
Connection State   Description Path
-------------------------------------------------------
pcie100    EMPTY   PCIe-Native /pci@0,0/pci8086,2f06@2,2/pci111d,80b5@0/pci111d,80b5@4
pcie101    ENABLED PCIe-Native /pci@0,0/pci8086,2f06@2,2/pci111d,80b5@0/pci111d,80b5@5
pcie102    ENABLED PCIe-Native /pci@0,0/pci8086,2f06@2,2/pci111d,80b5@0/pci111d,80b5@6
pcie103    ENABLED PCIe-Native /pci@0,0/pci8086,2f06@2,2/pci111d,80b5@0/pci111d,80b5@7
8. Align the replacement drive with the drive slot.
9. Slide the drive into the slot until the drive is fully seated.
10. Close the drive latch to lock the drive in place.
11. Power on the slot for the drive. This step may happen automatically; the command is given here in case it does not. Type:
# hotplug enable pcie100
12. Confirm that the drive has been enabled and is seen by the system. Type:
# hotplug list -lc
The following status is displayed (installed NVMe drives show the ENABLED state):
Connection State   Description Path
-------------------------------------------------------
pcie100    ENABLED PCIe-Native /pci@0,0/pci8086,2f06@2,2/pci111d,80b5@0/pci111d,80b5@4
pcie101    ENABLED PCIe-Native /pci@0,0/pci8086,2f06@2,2/pci111d,80b5@0/pci111d,80b5@5
pcie102    ENABLED PCIe-Native /pci@0,0/pci8086,2f06@2,2/pci111d,80b5@0/pci111d,80b5@6
pcie103    ENABLED PCIe-Native /pci@0,0/pci8086,2f06@2,2/pci111d,80b5@0/pci111d,80b5@7
13. To check NVMe drive health, firmware level, or temperature, to retrieve the error log or SMART data, or to perform a low-level format, use the nvmeadm command. For example, type:
# nvmeadm list
root@x5-4-bur09-a:~# nvmeadm list
SUNW-NVME-1
SUNW-NVME-2
SUNW-NVME-3
SUNW-NVME-4
root@x5-4-bur09-a:~# nvmeadm getlog -h SUNW-NVME-1
SUNW-NVME-1
SMART/Health Information:
Critical Warning: 0
Temperature: 294 Kelvin
Available Spare: 100 percent
Available Spare Threshold: 10 percent
Percentage Used: 1 percent
Data Unit Read: 0x1ddb62 of 512k bytes.
Data Unit Written: 0x147179 of 512k bytes.
Number of Host Read Commands: 0x6387c6d3
Number of Host Write Commands: 0x7066af00
Controller Busy Time in Minutes: 0x34e
Number of Power Cycle: 0x93
Number of Power On Hours: 0x197
Number of Unsafe Shutdown: 0x89
Number of Media Errors: 0x0
Number of Error Info Log Entries: 0x0
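The state checks in the Solaris procedure above (ENABLED, PRESENT, EMPTY) and the SMART/Health review in step 13 could be scripted roughly as follows. This is only a sketch: the function names slot_state, smart_field, smart_ok, and temp_celsius are hypothetical, the functions only parse text supplied on standard input in the layouts shown above, and the pass/fail criteria in smart_ok are illustrative assumptions rather than Oracle guidance.

```shell
#!/bin/sh
# Hypothetical helpers. They never call hotplug or nvmeadm themselves;
# pipe the command output into them instead.

# slot_state CONNECTION
# Reads `hotplug list -lc` output on stdin and prints the State column
# (for example ENABLED, PRESENT, or EMPTY) for the named connection.
# Returns non-zero if the connection is not found.
slot_state() {
    awk -v conn="$1" '
        $1 == conn { print $2; found = 1; exit }
        END { if (!found) exit 1 }'
}

# smart_field NAME
# Reads `nvmeadm getlog -h` output on stdin and prints the first word of
# the value that follows "NAME:" (units such as "percent" are dropped).
smart_field() {
    awk -F': *' -v name="$1" '
        { key = $1; sub(/^[ \t]+/, "", key) }
        key == name { split($2, parts, " "); print parts[1]; exit }'
}

# smart_ok
# Succeeds only if no critical warning and no media errors are reported
# and the available spare is above its threshold (assumed criteria).
# A missing field is treated as a failure.
smart_ok() {
    log=$(cat)
    warn=$(printf '%s\n' "$log" | smart_field "Critical Warning")
    spare=$(printf '%s\n' "$log" | smart_field "Available Spare")
    limit=$(printf '%s\n' "$log" | smart_field "Available Spare Threshold")
    media=$(printf '%s\n' "$log" | smart_field "Number of Media Errors")
    # Arithmetic expansion converts hexadecimal counters such as 0x0.
    [ "$((${warn:-1}))" -eq 0 ] &&
        [ "$((${media:-1}))" -eq 0 ] &&
        [ "${spare:-0}" -gt "${limit:-100}" ]
}

# temp_celsius
# Prints the reported temperature in whole degrees Celsius, obtained by
# subtracting 273 from the Kelvin value (an approximation).
temp_celsius() {
    kelvin=$(smart_field "Temperature")
    [ -n "$kelvin" ] || return 1
    echo "$((kelvin - 273))"
}

# Example usage on the Solaris host:
#   hotplug list -lc | slot_state pcie100
#   nvmeadm getlog -h SUNW-NVME-1 | smart_ok && echo "no warnings reported"
#   nvmeadm getlog -h SUNW-NVME-1 | temp_celsius
```

Keeping the helpers as pure text filters means they can be tried against saved command output before being used on a live system.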
How to Replace an NVMe Disk from the Oracle Linux Operating System
Note that address 12:00.0, which represents PCIe slot 100 (the drive labeled NVMe0 on the system front panel), is not listed, because that drive has been powered off.

After you physically remove an NVMe drive from the server, wait at least 10 seconds before installing a replacement drive.

9. Align the replacement drive with the drive slot.
10. Slide the drive into the slot until the drive is fully seated.
11. Close the drive latch to lock the drive in place.
Power On an NVMe Storage Drive

Before You Begin
# lspci -nnd :0953
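Checking whether a drive's PCIe address appears in the lspci listing, as described for address 12:00.0 above, could be sketched as a small filter. The function name nvme_listed is hypothetical, and the sketch assumes the address is the first whitespace-separated field of each lspci line, without a PCI domain prefix.

```shell
#!/bin/sh
# nvme_listed ADDRESS
# Hypothetical helper: reads `lspci -nnd :0953` output on stdin and
# succeeds if a device is listed at ADDRESS (for example 12:00.0).
# It only parses text and never calls lspci itself.
nvme_listed() {
    awk -v addr="$1" '
        $1 == addr { found = 1 }
        END { exit !found }'
}

# Example usage on the Oracle Linux host:
#   lspci -nnd :0953 | nvme_listed 12:00.0 && echo "NVMe0 is visible"
```

A powered-off or removed drive should make the check fail, and a powered-on drive should make it succeed.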
PARTS NOTE:
Attachments
This solution has no attachment