How to Troubleshoot BDA X5-2 Disk Replacement Resynchronization/Partition Format with 4TB Disk Taking Excessive Time

Asset ID:	1-71-2123034.1
Update Date:	2016-04-04

Solution Type:	Technical Instruction Sure

Applies to:

Big Data Appliance X5-2 Hardware - All Versions
Linux x86-64

Goal

This note gives some tips on how to troubleshoot a case where:

1. The resynchronization step of an OS disk replacement is taking a very long time.

This happens while configuring an OS disk, at the resynchronization step in the "Repairing the RAID Arrays" section of either of these notes:

a) How to Configure a Server Disk After Disk Replacement as an Operating System Disk for /u02 and /dev/sdb on Oracle Big Data Appliance V2.2.*/V2.3.1/V2.4.0/V2.5.0/V3.0.0/V3.0.1/V3.1.0/V4.0.0/V4.1.0/V4.2.0/V4.3.0 (Doc ID 1581373.1).
b) How to Configure a Server Disk After Disk Replacement as an Operating System Disk for /u01 and /dev/sda on Oracle Big Data Appliance V2.2.*/V2.3.1/V2.4.0/V2.5.0/V3.0.0/V3.0.1/V3.1.0/V4.0.0/V4.1.0/V4.2.0/V4.3.0 (Doc ID 1581338.1).

2. The partition format of an HDFS disk is taking a very long time.

This happens while configuring an HDFS disk, at the step that formats the partition in the "Partitioning a Disk for HDFS or Oracle NoSQL Database" section of:

How to Configure a Server Disk After Replacement as an HDFS Disk or Oracle NoSQL Database Disk on Oracle Big Data Appliance V2.2.*/V2.3.1/V2.4.0/V2.5.0/V3.0.0/V3.0.1/V3.1.0/V4.0.0/V4.1.0/V4.2.0/V4.3.0 (Doc ID 1581583.1).

Solution

The resynchronization step is not expected to take excessive time. Its duration depends on the size of the / partition, which is typically 500GB, not on the size of the disk.
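
As a rough illustration of what "excessive" means here, the sketch below estimates a lower bound for the resynchronization time from the partition size and a sync speed in KB/sec. The function name is hypothetical, the 200000 KB/sec figure is the sync_speed_max value shown in this note, and the calculation assumes decimal units and a resynchronization that runs at the maximum speed throughout, which a real disk may not sustain.

```shell
# Hypothetical helper: lower-bound resync time in whole minutes.
# $1 = partition size in GB (decimal), $2 = sync speed in KB/sec.
estimate_resync_minutes() {
    size_gb=$1
    speed_kb=$2
    # GB -> KB, divide by speed to get seconds, then convert to minutes.
    echo $(( size_gb * 1000 * 1000 / speed_kb / 60 ))
}

# Example: a 500GB / partition at 200000 KB/sec
# estimate_resync_minutes 500 200000
```

Under these assumptions a 500GB partition needs on the order of 40 minutes, so a resynchronization that runs for many hours is worth investigating.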

If the resynchronization step is taking a long time, troubleshoot with the following:

1. Check the resynchronization speed settings. They could have been altered.

a) Check sync_speed_max. Expected output is:

# cat /sys/block/md2/md/sync_speed_max
200000 (system)

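b) Check sync_speed. When no resynchronization is running the expected output is shown below; while a resynchronization is running, the file reports the current speed in KB/sec instead: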

# cat /sys/block/md2/md/sync_speed
none
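
The checks above can be gathered in one place. The following is a minimal sketch, not part of the original procedure: the function name is hypothetical, md2 is the array used in this note, and the optional second argument (a sysfs root other than /sys) exists only so the function can be tried against a copy of the files. It also prints sync_action and sync_speed_min, which are standard md sysfs attributes, when they are present.

```shell
# Hypothetical helper: print the md resync attributes for one array.
# $1 = md device name (defaults to md2), $2 = sysfs root (defaults to /sys).
show_md_sync_speed() (
    md_dir="${2:-/sys}/block/${1:-md2}/md"
    for f in sync_action sync_speed_min sync_speed_max sync_speed; do
        # Skip attributes that do not exist or cannot be read.
        if [ -r "$md_dir/$f" ]; then
            printf '%s: %s\n' "$f" "$(cat "$md_dir/$f")"
        fi
    done
)

# Example (as root on the server):
# show_md_sync_speed md2
# cat /proc/mdstat    # also shows resync progress and current speed
```

A "(system)" suffix on sync_speed_max means the array is using the system-wide limit rather than a value set specifically for that array.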

2. Check the health of the disk with MegaCli64 or storcli. Verify whether any media errors are reported.
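
One way to scan for media errors is to filter the physical drive listing. The sketch below is an illustration only: the function name is hypothetical, and it assumes the listing comes from "MegaCli64 -PDList -aALL" with "Slot Number: N" and "Media Error Count: N" lines. storcli output is formatted differently and is not handled here.

```shell
# Hypothetical helper: read a MegaCli64 physical drive listing on stdin and
# print only the slots whose media error count is greater than zero.
list_media_errors() {
    awk -F': *' '
        /^Slot Number/       { slot = $2 }
        /^Media Error Count/ { if ($2 + 0 > 0) print "slot " slot ": " $2 " media errors" }
    '
}

# Example (assumes MegaCli64 is on the PATH):
# MegaCli64 -PDList -aALL | list_media_errors
```

No output means no slot reported a non-zero media error count.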

3. Check the buffered read throughput of the new disk and compare it with the other disks. Check with: hdparm -t /dev/disk/by-hba-slot/s<x>. For example:

# hdparm -t /dev/disk/by-hba-slot/s0

/dev/disk/by-hba-slot/s0:
Timing buffered disk reads: 524 MB in 3.01 seconds = 174.09 MB/sec
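
To compare the new disk against the others, the same test can be repeated for each slot. This is a sketch under assumptions: both function names are hypothetical, and the slot list 0 through 11 assumes a server with 12 disks. hdparm -t results are only meaningful on a system that is otherwise mostly idle.

```shell
# Hypothetical helper: extract the MB/sec figure from "hdparm -t" output on stdin.
hdparm_mb_per_sec() {
    awk '/Timing buffered disk reads/ { print $(NF-1) }'
}

# Hypothetical helper: time buffered reads on every slot that exists.
compare_slot_reads() {
    for n in 0 1 2 3 4 5 6 7 8 9 10 11; do
        dev=/dev/disk/by-hba-slot/s$n
        [ -e "$dev" ] || continue
        printf 's%s: %s MB/sec\n' "$n" "$(hdparm -t "$dev" | hdparm_mb_per_sec)"
    done
}

# Example (as root on the server):
# compare_slot_reads
```

A replaced disk that reads much more slowly than its peers is a candidate for further hardware checks.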

4. Check disk activity with iostat.

A lot of activity on the other disks slows down the HBA.
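
A quick way to spot busy disks is to filter the extended iostat report on its %util column. The sketch below is illustrative: the function name and the default threshold of 80 are assumptions, and the column is located by its header name because the column layout differs between sysstat versions.

```shell
# Hypothetical helper: read "iostat -x" output on stdin and print the devices
# whose %util is at or above a threshold ($1, defaults to 80).
busy_devices() {
    awk -v limit="${1:-80}" '
        # Header line: remember which column holds %util.
        /^Device/ { for (i = 1; i <= NF; i++) if ($i == "%util") col = i; next }
        # Device lines: print name and %util when at or above the threshold.
        col && NF >= col && $col + 0 >= limit { print $1, $col }
    '
}

# Example: two reports 5 seconds apart; the first report shows averages since boot.
# iostat -x 5 2 | busy_devices 80
```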
