Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-72-2048891.1
Update Date:2017-10-05

Solution Type  Problem Resolution Sure

Solution  2048891.1 :   Exalytics OVM Server Mount Points (/u01 and /u02) Are Starting Up in Read Only Mode on Reboot and Performance is Very Slow  


Related Items
  • Oracle Enterprise Linux
  • Oracle Exalytics Software
  • Exalytics In-Memory Machine X3-4
Related Categories
  • PLA-Support>Eng Systems>Exalytics>Oracle Exalytics>DB: Exalytics_EST




In this Document
Symptoms
Changes
Cause
Solution
References


Created from <SR 3-11151789241>

Applies to:

Exalytics In-Memory Machine X3-4 - Version All Versions and later
Oracle Exalytics Software - Version 1.0.0.5.0 and later
Oracle Enterprise Linux - Version 5.0 and later
Information in this document applies to any platform.

Symptoms

On an Exalytics OVM server, the /u01 and /u02 mount points come up read-only when the VM reboots, and writes to the underlying OVM repository are very slow compared with another Exalytics OVM host at the same site.  Running "# time dd if=/dev/zero of=test bs=1M count=512 conv=fdatasync" on both servers shows the difference: about 2.0 MB/s on the affected server versus 531 MB/s on the other.
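The read-only symptom can be confirmed from the guest's mount table. The following is a minimal sketch, assuming a Linux guest where /proc/mounts is available; the function name ro_mounts is hypothetical and not part of any Oracle tooling.

```shell
#!/bin/bash
# Print which of the given mount points are mounted read-only.
# Reads /proc/mounts-format text on stdin:
#   device mountpoint fstype options dump pass
# ro_mounts is a hypothetical helper name used only for this illustration.
ro_mounts() {
    awk -v targets="$*" '
        BEGIN {
            n = split(targets, t, " ")
            for (i = 1; i <= n; i++) want[t[i]] = 1
        }
        ($2 in want) {
            # Match the exact option "ro", not substrings such as
            # "errors=remount-ro".
            m = split($4, opts, ",")
            for (i = 1; i <= m; i++)
                if (opts[i] == "ro") print $2
        }'
}

# Usage on the affected guest:
#   ro_mounts /u01 /u02 < /proc/mounts
```

An empty result means neither mount point is currently read-only.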

On the slow server (VM1), the command and its output are:

# time dd if=/dev/zero of=test bs=1M count=512 conv=fdatasync
512+0 records in
512+0 records out
536870912 bytes (537 MB) copied, 263.265 seconds, 2.0 MB/s               --- Note elapsed time of 263.265 seconds ---

real    4m23.287s
user    0m0.001s
sys     0m1.511s

On the other Exalytics OVM host (VM2), where the VMs are working fine, the same command produces:

# time dd if=/dev/zero of=test bs=1M count=512 conv=fdatasync
512+0 records in
512+0 records out
536870912 bytes (537 MB) copied, 1.0105 seconds, 531 MB/s              --- Note elapsed time of 1.0105 seconds ---

real    0m1.153s
user    0m0.001s
sys     0m0.952s
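
To put the two results side by side, the throughput can be recomputed from the byte count and elapsed time that dd reports. This is a small illustrative sketch; throughput_mbs is a hypothetical helper name, and 1 MB is taken as 1,000,000 bytes to match dd's own units.

```shell
#!/bin/bash
# Throughput in MB/s, where 1 MB = 1,000,000 bytes (the unit dd uses).
# throughput_mbs is a hypothetical helper name for this illustration.
throughput_mbs() {   # usage: throughput_mbs BYTES SECONDS
    awk -v b="$1" -v s="$2" 'BEGIN { printf "%.1f\n", b / s / 1000000 }'
}

slow=$(throughput_mbs 536870912 263.265)   # VM1, the affected server
fast=$(throughput_mbs 536870912 1.0105)    # VM2, the healthy server
echo "VM1: ${slow} MB/s, VM2: ${fast} MB/s"

# Same byte count on both hosts, so the slowdown is the ratio of the times.
awk 'BEGIN { printf "VM1 took about %.0f times longer\n", 263.265 / 1.0105 }'
```

The same 512 MB write takes roughly 260 times longer on VM1 than on VM2.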

Changes

None

Cause

The dmesg output contained repeated warnings: "mpt2sas 0000:11:00.0: vpd r/w failed. This is likely a firmware bug on this device. Contact the card vendor for a firmware update."
Analysis of the ILOM snapshot and the exalyticsdiag.sh output showed that the backup battery on the LSI card was failing, which caused the slow performance.
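
A quick way to see how often the warning quoted above appears is to filter the kernel log for it. The sketch below assumes the message text matches the one in this note; count_vpd_warnings is a hypothetical helper name.

```shell
#!/bin/bash
# Count mpt2sas "vpd r/w failed" warnings in kernel log text read from stdin.
# count_vpd_warnings is a hypothetical helper name for this illustration.
count_vpd_warnings() {
    # grep -c prints 0 and exits non-zero when nothing matches;
    # "|| true" keeps the function's exit status clean in that case.
    grep -c 'mpt2sas.*vpd r/w failed' || true
}

# Usage on the affected host:
#   dmesg | count_vpd_warnings
```

The count only shows that the warning is present; identifying the failing battery here still relied on the ILOM snapshot and exalyticsdiag.sh output.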

When the Backup Battery Unit (BBU) is present and operating normally, the virtual disks are placed in Write-Back (WB) mode. In this mode, writes are held in the controller's cache memory and later flushed to disk in blocks, which optimizes data placement and write speed. As soon as a write reaches cache memory, it is acknowledged to the O/S as completed, which improves O/S-level performance.

If power is lost, the BBU preserves the cache contents so that they can be written to disk once power is restored.

When the BBU is absent or currently unusable (as in this case), the HBA switches the virtual disks to Write-Through (WT) mode. This mode bypasses the cache memory, writes every I/O directly to disk, and waits for the disk to acknowledge that the write has physically completed. Performance is slower, but every acknowledged write is on non-volatile disk storage, so no data is lost if power fails.
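
Whether the virtual disks are currently in WB or WT mode can be read from the controller's logical drive report. The following is a rough sketch under stated assumptions: it assumes the LSI MegaCli utility is installed (commonly as /opt/MegaRAID/MegaCli/MegaCli64, which this note does not confirm) and that its -LDInfo output includes a "Current Cache Policy:" line per virtual disk. The function name current_write_policy is hypothetical.

```shell
#!/bin/bash
# Print the current write policy (for example WriteBack or WriteThrough) of
# each virtual disk, in the order listed, from MegaCli -LDInfo text on stdin.
# current_write_policy is a hypothetical helper name for this illustration.
current_write_policy() {
    awk -F': *' '/^Current Cache Policy/ {
        split($2, p, ",")   # the first comma-separated field is the write policy
        print p[1]
    }'
}

# Usage (assumes MegaCli64 is installed at this path; adjust as needed):
#   /opt/MegaRAID/MegaCli/MegaCli64 -LDInfo -Lall -aALL | current_write_policy
# Battery state can be checked with:
#   /opt/MegaRAID/MegaCli/MegaCli64 -AdpBbuCmd -GetBbuStatus -aALL
```

A result of WriteThrough on a disk whose default policy is WriteBack would be consistent with the controller having fallen back because of the battery.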
 

Solution

The hardware team dispatched a Field Engineer to replace the failing battery.

After the battery was replaced, write speed improved and the OVM server's performance became comparable to that of the OVM server on the other Exalytics host.
 

References

<NOTE:1500235.1> - How To Collect an Sosreport on Oracle Linux
<NOTE:1527167.1> - Oracle Exalytics In-Memory Machine diagnostic data collection script
<NOTE:1674265.1> - SRDC - ES ILOM Snapshot Collection

Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.