Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition
Solution Type: Predictive Self-Healing Sure Solution

1581583.1 : How to Configure a Server Disk After Replacement as an HDFS Disk or Oracle NoSQL Database Disk on Oracle Big Data Appliance V2.2.*/V2.3.1/V2.4.0/V2.5.0/V3.x/V4.x
Applies to:

Big Data Appliance X5-2 Full Rack - Version All Versions to All Versions [Release All Releases]
Big Data Appliance X3-2 Starter Rack - Version All Versions to All Versions [Release All Releases]
Big Data Appliance X3-2 In-Rack Expansion - Version All Versions to All Versions [Release All Releases]
Big Data Appliance X4-2 In-Rack Expansion - Version All Versions to All Versions [Release All Releases]
Big Data Appliance X5-2 In-Rack Expansion - Version All Versions to All Versions [Release All Releases]
Linux x86-64

Purpose

This document describes the steps for configuring a server disk as an HDFS disk or Oracle NoSQL Database disk after disk drive replacement on Oracle Big Data Appliance V2.2.*/V2.3.1/V2.4.0/V2.5.0/V3.x/V4.x.

Scope

This document is intended for anyone configuring a replaced disk. If you attempt these steps and need further assistance, please log a service request to contact support for help.

Details

Overview
Failure of a single disk is never catastrophic on Oracle Big Data Appliance; no user data should be lost, because data stored in HDFS or Oracle NoSQL Database is automatically replicated. The basic steps for replacing a server disk drive and configuring it as an HDFS or Oracle NoSQL Database disk are:

1. Replace the failed disk drive.
2. Perform the basic configuration steps for the new disk. If multiple disks are unconfigured, then configure them in order from the lowest to the highest slot number, finishing all the steps for one disk before starting the next.
3. Identify the dedicated function of the replaced disk: operating system disk, HDFS disk, or Oracle NoSQL Database disk.

Steps 1, 2, and 3 are listed in "Steps for Replacing a Disk Drive and Determining its Function on the Oracle Big Data Appliance V2.2.*/V2.3.1/V2.4.0/V2.5.0/V3.x/V4.x (Doc ID 1581331.1)."

4. Configure the disk for its dedicated function, in this case HDFS or Oracle NoSQL Database.
5. Verify that the configuration is correct.

Steps 4 and 5 are listed here in this document.
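Because data in HDFS is automatically re-replicated, it can be useful to gauge the replication state of the cluster before and after the replacement. A minimal sketch for HDFS clusters, assuming the HDFS client tools are available on the node and that the commands are run via sudo as the hdfs user:

# sudo -u hdfs hdfs fsck / | tail -20            # the summary includes the count of under-replicated blocks
# sudo -u hdfs hdfs dfsadmin -report | head -20  # overall configured capacity and DFS remaining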
About Disk Drive Identifiers

Each Oracle Big Data Appliance server includes a disk enclosure cage that holds 12 disk drives and is controlled by the HBA (Host Bus Adapter). The drives in this enclosure are identified by slot numbers (0..11) and can serve different purposes; for example, the drives in slots 0 and 1 carry the RAID 1 operating system and boot partitions. The drives can be dedicated to specific functions, as shown in Table 1. Use the following tables to identify the function of the drive, its slot number, and its mount point, which are used later in the procedure.
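For reference, the mapping between enclosure slot numbers and physical drives can also be read directly from the HBA. A minimal sketch, assuming MegaCli64 is installed under /opt/MegaRAID/MegaCli as on BDA servers (adjust the path if necessary); each physical drive is reported with its Slot Number and its current firmware state:

# /opt/MegaRAID/MegaCli/MegaCli64 -PDList -aALL | grep -E 'Slot Number|Firmware state'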
Disk Drive Identifiers

The following table (Table 1) shows the mappings between the RAID logical drives, the probable initial kernel device names, and the dedicated function of each drive in an Oracle Big Data Appliance server. The server with the failed drive is part of either a CDH cluster (HDFS) or an Oracle NoSQL Database cluster. This information is used in a later procedure when partitioning the disk for its appropriate function, so note which mapping is applicable to the disk drive being replaced.

Table 1 - Disk Drive Identifiers
Standard Mount Points

The following table (Table 2) shows the mappings between HDFS partitions and mount points. This information is used later in the procedure, so note which mapping is applicable to the disk drive being replaced.

Table 2 - Mount Points
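The slot-to-device and device-to-mount-point relationships in Tables 1 and 2 can be re-confirmed at any time on a running server. A minimal sketch using the slot 4 / /u05 example from this document:

# ls -l /dev/disk/by-hba-slot/ | grep -w s4p1   # slot 4, partition 1 -> current kernel device, e.g. ../../sde1
# mount -l | grep sde1                          # kernel device -> mount point and label, e.g. /u05 ... [/u05]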
Note: MegaCli64, mount, umount, and many of the other commands require root privileges, so the recommendation is to run the entire procedure as root.
Note: The code examples provided here are based on replacing /dev/disk/by-hba-slot/s4 == /dev/sde == /dev/disk/by-hba-slot/s4p1 == /dev/sde1 == /u05. These four mappings are an easy way to capture the information that will be needed throughout the procedure, so it is best to work out the mapping for the replaced disk and write it down before starting. (In this example, the slot number is one less than the number in the mount point.) Every disk replacement will vary, so substitute the correct values for the disk replacement being done.

Helpful tip: you can re-confirm the relationship among the disk slot number, the current kernel device name, and the mount point at any time, as in the sketch following Table 2 above.

Configuring an HDFS or Oracle NoSQL Database Disk

Complete the following steps for any disk not used by the operating system. See Table 1 to determine how the disk is configured; most disks are used for HDFS or Oracle NoSQL Database. If multiple disks are unconfigured, then configure them in order from the lowest to the highest slot number, finishing all the steps for one disk before starting the next. Verify that the failed disk was not used by the operating system before configuring it for a particular function. See the following note to determine the function of the disk drive: "Steps for Replacing a Disk Drive and Determining its Function on the Oracle Big Data Appliance V2.2.*/V2.3.1/V2.4.0/V2.5.0/V3.x/V4.x (Doc ID 1581331.1)."

The configuration consists of:

1. Unmounting an HDFS or Oracle NoSQL Database Partition
2. Partitioning a Disk for HDFS or Oracle NoSQL Database
3. Mount HDFS or Oracle NoSQL Database Partition
4. Verifying the Disk Configuration

Unmounting an HDFS or Oracle NoSQL Database Partition

To dismount an HDFS or Oracle NoSQL Database partition:

1. Log in as root to the server with the failed drive.

2. List the mounted partitions:

# mount -l
Sample output:

# mount -l
/dev/md2 on / type ext3 (rw,noatime)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
/dev/md0 on /boot type ext3 (rw)
tmpfs on /dev/shm type tmpfs (rw)
/dev/sda4 on /u01 type ext4 (rw,nodev,noatime) [/u01]
/dev/sdb4 on /u02 type ext4 (rw,nodev,noatime) [/u02]
/dev/sdc1 on /u03 type ext4 (rw,nodev,noatime) [/u03]
/dev/sdd1 on /u04 type ext4 (rw,nodev,noatime) [/u04]
/dev/sde1 on /u05 type ext4 (rw,nodev,noatime) [/u05]
/dev/sdf1 on /u06 type ext4 (rw,nodev,noatime) [/u06]
/dev/sdg1 on /u07 type ext4 (rw,nodev,noatime) [/u07]
/dev/sdh1 on /u08 type ext4 (rw,nodev,noatime) [/u08]
/dev/sdi1 on /u09 type ext4 (rw,nodev,noatime) [/u09]
/dev/sdj1 on /u10 type ext4 (rw,nodev,noatime) [/u10]
/dev/sdk1 on /u11 type ext4 (rw,nodev,noatime) [/u11]
/dev/sdl1 on /u12 type ext4 (rw,nodev,noatime) [/u12]
fuse_dfs on /mnt/hdfs-nnmount type fuse.fuse_dfs (rw,nosuid,nodev,allow_other,allow_other,default_permissions)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
nfsd on /proc/fs/nfsd type nfsd (rw)

3. Dismount the HDFS mount point for the failed disk as the root user. Replace mountpoint below with the mount point obtained earlier, as shown in the Standard Mount Points table (Table 2) above:

# umount mountpoint
Example of dismounting /u05:

# umount /u05
If the umount command succeeds, verify that the partition is no longer listed by listing the mounted partitions again:

# mount -l
Sample output shows that /u05 has been dismounted:

# mount -l
/dev/md2 on / type ext3 (rw,noatime)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
/dev/md0 on /boot type ext3 (rw)
tmpfs on /dev/shm type tmpfs (rw)
/dev/sda4 on /u01 type ext4 (rw,nodev,noatime) [/u01]
/dev/sdb4 on /u02 type ext4 (rw,nodev,noatime) [/u02]
/dev/sdc1 on /u03 type ext4 (rw,nodev,noatime) [/u03]
/dev/sdd1 on /u04 type ext4 (rw,nodev,noatime) [/u04]
/dev/sdf1 on /u06 type ext4 (rw,nodev,noatime) [/u06]
/dev/sdg1 on /u07 type ext4 (rw,nodev,noatime) [/u07]
/dev/sdh1 on /u08 type ext4 (rw,nodev,noatime) [/u08]
/dev/sdi1 on /u09 type ext4 (rw,nodev,noatime) [/u09]
/dev/sdj1 on /u10 type ext4 (rw,nodev,noatime) [/u10]
/dev/sdk1 on /u11 type ext4 (rw,nodev,noatime) [/u11]
/dev/sdl1 on /u12 type ext4 (rw,nodev,noatime) [/u12]
fuse_dfs on /mnt/hdfs-nnmount type fuse.fuse_dfs (rw,nosuid,nodev,allow_other,allow_other,default_permissions)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
nfsd on /proc/fs/nfsd type nfsd (rw)

If the umount command fails with a "device is busy" message, then the partition is still in use; for example, an HDFS partition could be in use by the DataNode service. Continue to the next step.

Example:

# umount /u05
umount: /u05: device is busy
umount: /u05: device is busy

4. Open a browser window to Cloudera Manager. For example:

http://bda1node03.example.com:7180
5. Complete these steps in Cloudera Manager: locate the DataNode role instance on the host with the failed disk, open its Configuration sub-tab, remove the affected mount point from the DataNode data directory list, and click Save Changes. In this example /u05/hadoop/dfs has been removed.
"Restart this DataNode i. Click on the button that says "Restart this DataNode." Note: If you removed the mount point in Cloudera Manager, then you must restore the mount point in Cloudera Manager after finishing all other configuration procedures.
6. Return to your session on the server with the failed drive and reissue the umount command:

# umount mountpoint
Example of dismounting /u05, which now succeeds:

# umount /u05
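If umount still reports that the device is busy even after the mount point has been removed from the DataNode configuration and the DataNode has been restarted, it can help to identify which processes hold files open on the partition before retrying. A minimal troubleshooting sketch, assuming the standard lsof and fuser utilities are installed:

# lsof /u05        # list processes with files open under /u05
# fuser -vm /u05   # alternative: show processes currently using the mount point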
Partitioning a Disk for HDFS or Oracle NoSQL Database
To configure a disk, you must partition and format it. Having verified that the drive is not an operating system disk, proceed to partition it. To format a disk for use by HDFS or Oracle NoSQL Database:

Note: Replace sn or snp1 in the following commands with the appropriate operating system location name that was determined from Table 1, such as s4 or s4p1.
1. Complete the steps in the document "Steps for Replacing a Disk Drive and Determining its Function on the Oracle Big Data Appliance V2.2.*/V2.3.1/V2.4.0/V2.5.0/V3.x/V4.x (Doc ID 1581331.1)."

2. Partition the drive as root. Replace /dev/disk/by-hba-slot/sn with the operating system location name that was determined from Table 1.

On OL6:

# parted /dev/disk/by-hba-slot/sn -s mklabel gpt mkpart primary ext4 0% 100%
On OL5:

# parted /dev/disk/by-hba-slot/sn -s mklabel gpt mkpart primary ext3 0% 100%
Optional sanity check: confirm the partition was fully created. Replace sn with the appropriate slot number for the partition:

# parted /dev/disk/by-hba-slot/sn
This is sample output which indicates the partition was fully created:

# parted /dev/disk/by-hba-slot/s4
GNU Parted 1.8.1
Using /dev/sde
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) print

Model: LSI MR9261-8i (scsi)
Disk /dev/sde: 3000GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt

Number  Start   End     Size    File system  Name     Flags
 1      17.4kB  3000GB  3000GB  ext3         primary

(parted) quit
Information: Don't forget to update /etc/fstab, if necessary.

Note: there is nothing to update in /etc/fstab.

3. Format the partition for an ext4 file system as the root user. Replace /dev/disk/by-hba-slot/snp1 with the proper HDFS partition name determined from Table 2 above:

# mkfs -t ext4 /dev/disk/by-hba-slot/snp1
Example using /dev/disk/by-hba-slot/s4p1:

# mkfs -t ext4 /dev/disk/by-hba-slot/s4p1
mkfs 1.41.12 (17-May-2010)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
122011648 inodes, 488036855 blocks
24401842 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=4294967296
14894 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
        4096000, 7962624, 11239424, 20480000, 23887872, 71663616, 78675968,
        102400000, 214990848

Writing inode tables: done
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 28 mounts or 180 days, whichever comes first. Use tune4fs -c or -i to override.

4. Check whether the device and its label are missing from /dev/disk/by-label:

# ls -l /dev/disk/by-label
In this example output, u05 and ../../sde1 are missing:

# ls -l /dev/disk/by-label
total 0
lrwxrwxrwx 1 root root 10 Aug 28 09:45 BDAUSB -> ../../sdm1
lrwxrwxrwx 1 root root 10 Aug 28 09:45 u01 -> ../../sda4
lrwxrwxrwx 1 root root 10 Aug 28 09:45 u02 -> ../../sdb4
lrwxrwxrwx 1 root root 10 Aug 28 09:45 u03 -> ../../sdc1
lrwxrwxrwx 1 root root 10 Aug 28 09:45 u04 -> ../../sdd1
lrwxrwxrwx 1 root root 10 Aug 28 09:45 u06 -> ../../sdf1
lrwxrwxrwx 1 root root 10 Aug 28 09:45 u07 -> ../../sdg1
lrwxrwxrwx 1 root root 10 Aug 28 09:45 u08 -> ../../sdh1
lrwxrwxrwx 1 root root 10 Aug 28 09:45 u09 -> ../../sdi1
lrwxrwxrwx 1 root root 10 Aug 28 09:45 u10 -> ../../sdj1
lrwxrwxrwx 1 root root 10 Aug 28 09:45 u11 -> ../../sdk1
lrwxrwxrwx 1 root root 10 Aug 28 09:45 u12 -> ../../sdl1

It is possible that the device will not show as missing, as seen in the following output:

# ls -l /dev/disk/by-label
total 0
lrwxrwxrwx 1 root root 10 Aug 28 09:45 BDAUSB -> ../../sdm1
lrwxrwxrwx 1 root root 10 Aug 28 09:45 u01 -> ../../sda4
lrwxrwxrwx 1 root root 10 Aug 28 09:45 u02 -> ../../sdb4
lrwxrwxrwx 1 root root 10 Aug 28 09:45 u03 -> ../../sdc1
lrwxrwxrwx 1 root root 10 Aug 28 09:45 u04 -> ../../sdd1
lrwxrwxrwx 1 root root 10 Aug 28 09:45 u05 -> ../../sde1
lrwxrwxrwx 1 root root 10 Aug 28 09:45 u06 -> ../../sdf1
lrwxrwxrwx 1 root root 10 Aug 28 09:45 u07 -> ../../sdg1
lrwxrwxrwx 1 root root 10 Aug 28 09:45 u08 -> ../../sdh1
lrwxrwxrwx 1 root root 10 Aug 28 09:45 u09 -> ../../sdi1
lrwxrwxrwx 1 root root 10 Aug 28 09:45 u10 -> ../../sdj1
lrwxrwxrwx 1 root root 10 Aug 28 09:45 u11 -> ../../sdk1
lrwxrwxrwx 1 root root 10 Aug 28 09:45 u12 -> ../../sdl1

5. Reset the appropriate partition label, reserved space, and file system check options on the missing device as root. Replace /unn with the correct mount point and replace /dev/disk/by-hba-slot/snp1 with the proper symbolic link to the physical slot and partition determined from Table 2 above.

Note: For OL6 use tune2fs. For OL5 use tune4fs.

On OL6 use tune2fs:

# tune2fs -c -1 -i 0 -m 0.2 -L /unn /dev/disk/by-hba-slot/snp1
On OL5 use tune4fs:

# tune4fs -c -1 -i 0 -m 0.2 -L /unn /dev/disk/by-hba-slot/snp1

For example, this command resets the label for /dev/disk/by-hba-slot/s4p1 to /u05:

# tune4fs -c -1 -i 0 -m 0.2 -L /u05 /dev/disk/by-hba-slot/s4p1
tune4fs 1.41.12 (17-May-2010)
Setting maximal mount count to -1
Setting interval between checks to 0 seconds
Setting reserved blocks percentage to 0.2% (976073 blocks)

Note: If an incorrect tune2fs command is run, correct it by running the correct tune2fs command afterwards. For example, if you are configuring the disk in slot 5 (/u06) and incorrectly label it with another slot's mount point, such as:

# tune2fs -c -1 -i 0 -m 0.2 -L /u05 /dev/disk/by-hba-slot/s5p1

then run the correct tune2fs command "tune2fs -c -1 -i 0 -m 0.2 -L /u06 /dev/disk/by-hba-slot/s5p1" and proceed. The reason is that tune2fs only sets the label on the partition it is run against, so the incorrect command will not affect any other slot or mount point. In the example here it will not affect "slot 4/u05", which will still have the same label set (having the same label on two partitions temporarily is OK as long as you do not try to do a mount).

6. Verify that the replaced disk is listed in the 'ls -l /dev/disk/by-label' output. If the replaced disk is listed, skip to the 'Mount HDFS or Oracle NoSQL Database Partition' section. If the replaced disk is NOT listed, continue to the next step (7).

# ls -l /dev/disk/by-label
For example, if the disk in slot 4 (/u05) was replaced, the following output shows that the replaced disk is still missing:

# ls -l /dev/disk/by-label
total 0
lrwxrwxrwx 1 root root 10 Aug 28 09:45 BDAUSB -> ../../sdm1
lrwxrwxrwx 1 root root 10 Aug 28 09:45 u01 -> ../../sda4
lrwxrwxrwx 1 root root 10 Aug 28 09:45 u02 -> ../../sdb4
lrwxrwxrwx 1 root root 10 Aug 28 09:45 u03 -> ../../sdc1
lrwxrwxrwx 1 root root 10 Aug 28 09:45 u04 -> ../../sdd1
lrwxrwxrwx 1 root root 10 Aug 28 09:45 u06 -> ../../sde1
lrwxrwxrwx 1 root root 10 Aug 28 09:45 u07 -> ../../sdf1
lrwxrwxrwx 1 root root 10 Aug 28 09:45 u08 -> ../../sdg1
lrwxrwxrwx 1 root root 10 Aug 28 09:45 u09 -> ../../sdh1
lrwxrwxrwx 1 root root 10 Aug 28 09:45 u10 -> ../../sdi1
lrwxrwxrwx 1 root root 10 Aug 28 09:45 u11 -> ../../sdj1
lrwxrwxrwx 1 root root 10 Aug 28 09:45 u12 -> ../../sdk1

7. Trigger kernel device uevents to replay the events that were missed at system coldplug.

a) For Linux OS 5, execute the command below:

# udevtrigger
b) For Linux OS 6, execute the command below:

# udevadm trigger
Note: With both commands, the --verbose option can be used to check which events are triggered.

8. Verify that the replaced disk is listed in the 'ls -l /dev/disk/by-label' output:

# ls -l /dev/disk/by-label
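Putting the partitioning steps together, a consolidated sketch of the commands for the slot 4 / /u05 example on OL6 looks like the following (substitute the correct slot, partition, and mount point for the disk being replaced):

# parted /dev/disk/by-hba-slot/s4 -s mklabel gpt mkpart primary ext4 0% 100%
# mkfs -t ext4 /dev/disk/by-hba-slot/s4p1
# tune2fs -c -1 -i 0 -m 0.2 -L /u05 /dev/disk/by-hba-slot/s4p1
# udevadm trigger              # only needed if the new label does not appear
# ls -l /dev/disk/by-label     # confirm that u05 now points to the new kernel device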
Mount HDFS or Oracle NoSQL Database Partition
1. Mount the HDFS or Oracle NoSQL Database partition as root, entering the appropriate mount point by replacing /unn below with the correct mount point from Table 2 above:

# mount /unn
For example:

# mount /u05
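To confirm that the partition mounted with the expected size and label, an optional quick check (a sketch using the /u05 example):

# df -h /u05              # the partition on the replaced disk should be mounted at /u05 with roughly the expected size
# mount -l | grep /u05    # the trailing label should read [/u05]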
Optional sanity checks on an HDFS cluster only:

1. Run the following, replacing /unn with the correct mount point:

# ls -la /unn
Example:

# ls -la /u05
total 28
drwxr-xr-x  4 root root  4096 Jul 29 21:58 .
drwxr-xr-x 39 root root  4096 Sep 26 06:49 ..
drwxr-xr-x  4 root root  4096 Jul 29 21:58 hadoop
drwx------  2 root root 16384 Jul 29 18:43 lost+found

2. Run the following, replacing /unn/hadoop with the correct mount point:

# ls -la /unn/hadoop
Example:

# ls -la /u05/hadoop
total <X>
drwxr-xr-x 4 root   root   4096 Jul 29 21:58 .
drwxr-xr-x 4 root   root   4096 Jul 29 21:58 ..
drwx------ 3 hdfs   hadoop 4096 Sep 26 06:50 dfs
drwxr-xr-x 7 mapred hadoop 4096 Sep 26 15:54 mapred

In the case where the mount point needs to be added back into Cloudera Manager, no dfs subdirectory will be seen. Example:

# ls -la /u05/hadoop
total <X>
drwxr-xr-x 4 root   root   4096 Jul 29 21:58 .
drwxr-xr-x 4 root   root   4096 Jul 29 21:58 ..
drwxr-xr-x 7 mapred hadoop 4096 Sep 26 15:54 mapred

2. If you are configuring multiple drives, then repeat the previous steps.

3. If you previously removed a mount point in Cloudera Manager, then restore it to the list. A restart of the DataNode is needed after the disk is replaced and configured; if the DataNode is NOT restarted, the replaced disk will not be recognized by HDFS.
a. Open a browser window to Cloudera Manager. For example:

http://bda1node03.example.com:7180
b. Open Cloudera Manager and log in as admin.

In the DataNode configuration for the host with the replaced disk, add the mount point back to the data directory list. In this example the list changes as follows:

Before:

/u10/hadoop/dfs,/u09/hadoop/dfs,/u08/hadoop/dfs,/u07/hadoop/dfs,/u06/hadoop/dfs,/u04/hadoop/dfs,/u03/hadoop/dfs
After:

/u10/hadoop/dfs,/u09/hadoop/dfs,/u08/hadoop/dfs,/u07/hadoop/dfs,/u06/hadoop/dfs,/u05/hadoop/dfs,/u04/hadoop/dfs,/u03/hadoop/dfs

1. To add the entry, click the plus (+) sign just above the position where the new entry will be added.
2. An empty box is added that you can fill in (here, with /u05/hadoop/dfs).
h. Click Save Changes.
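After the changes are saved and the DataNode has been restarted, you can optionally confirm from the command line that HDFS sees the restored volume. A minimal sketch, assuming the command is run via sudo as the hdfs user:

# sudo -u hdfs hdfs dfsadmin -report | head -40    # the per-DataNode section should show the expected configured capacity on this host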
4. If you previously removed a mount point from NodeManager Local Directories, then also restore it to the list using Cloudera Manager (on BDA V3.* and higher):

a. On the Services page, click Yarn.
b. In the Status Summary, click NodeManager.
c. From the list, click to select the NodeManager that is on the host with the failed disk.
d. Click the Configuration sub-tab.
e. If the mount point is missing from the NodeManager Local Directories field, then add it to the list.
f. Click Save Changes.
g. From the Actions list, choose Restart this NodeManager.
Optional sanity checks on an HDFS cluster only:

1. If the mount point was added back into Cloudera Manager in the last step, the following check confirms that this succeeded and that the dfs directory shows up. Replace /unn with the appropriate mount point:

# ls -la /unn
Example:

# ls -la /u05
total 28
drwxr-xr-x  4 root root  4096 Jul 29 21:58 .
drwxr-xr-x 39 root root  4096 Sep 26 06:49 ..
drwxr-xr-x  4 root root  4096 Jul 29 21:58 hadoop
drwx------  2 root root 16384 Jul 29 18:43 lost+found

2. Replace /unn/hadoop with the appropriate mount point to show the dfs directory:

# ls -la /unn/hadoop
Example:

# ls -la /u05/hadoop
total <X>
drwxr-xr-x 4 root   root   4096 Jul 29 21:58 .
drwxr-xr-x 4 root   root   4096 Jul 29 21:58 ..
drwx------ 3 hdfs   hadoop 4096 Sep 26 06:50 dfs
drwxr-xr-x 7 mapred hadoop 4096 Sep 26 15:54 mapred

3. For the new disk, after it is added back into Cloudera Manager, /unn/hadoop/dfs will contain almost nothing compared with the other disks; this will change over time as the disk is used. To compare with a disk that has not been replaced, replace /unn/hadoop/dfs with a mount point that has not been replaced:

# du -ms /unn/hadoop/dfs
Example:

# du -ms /u04/hadoop/dfs
118841  /u04/hadoop/dfs

4. Replace /unn/hadoop/dfs with the mount point that has been replaced:

# du -ms /unn/hadoop/dfs
Example:

# du -ms /u05/hadoop/dfs
1       /u05/hadoop/dfs

Oracle NoSQL Database Disk Configuration

The following steps apply only to an Oracle NoSQL Database disk in an Oracle NoSQL Database cluster. This does not apply to a CDH cluster (HDFS).

1. Re-create the storage directories. Replace /unn with the appropriate mount point for the disk you have replaced:

# mkdir -p /unn/kvdata
# chmod 755 /unn/kvdata
# chown oracle:oinstall /unn/kvdata

For example, if it is the disk mounted at /u04 that went down, run the following commands:

# mkdir -p /u04/kvdata
# chmod 755 /u04/kvdata
# chown oracle:oinstall /u04/kvdata

2. Start the Oracle NoSQL Database service:
# service nsdbservice start
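As a quick check before relying on the restarted service, confirm that the storage directory was re-created with the expected ownership and permissions (a sketch for the /u04 example):

# ls -ld /u04/kvdata    # expect permissions drwxr-xr-x and owner oracle:oinstall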
Verifying the Disk Configuration

To verify the disk configuration:

1. Check the software configuration:

# bdachecksw
Example successful output from running bdachecksw:

# bdachecksw
SUCCESS: Correct OS disk sda partition info : 1 ext3 raid 2 ext3 raid 3 linux-swap 4 ext3 primary
SUCCESS: Correct OS disk sdb partition info : 1 ext3 raid 2 ext3 raid 3 linux-swap 4 ext3 primary
SUCCESS: Correct data disk sdc partition info : 1 ext3 primary
SUCCESS: Correct data disk sdd partition info : 1 ext3 primary
SUCCESS: Correct data disk sde partition info : 1 ext3 primary
SUCCESS: Correct data disk sdf partition info : 1 ext3 primary
SUCCESS: Correct data disk sdg partition info : 1 ext3 primary
SUCCESS: Correct data disk sdh partition info : 1 ext3 primary
SUCCESS: Correct data disk sdi partition info : 1 ext3 primary
SUCCESS: Correct data disk sdj partition info : 1 ext3 primary
SUCCESS: Correct data disk sdk partition info : 1 ext3 primary
SUCCESS: Correct data disk sdl partition info : 1 ext3 primary
SUCCESS: Correct software RAID info : /dev/md2 level=raid1 num-devices=2 /dev/md0 level=raid1 num-devices=2
SUCCESS: Correct mounted partitions : /dev/md0 /boot ext3 /dev/md2 / ext3 /dev/sda4 /u01 ext4 /dev/sdb4 /u02 ext4 /dev/sdc1 /u03 ext4 /dev/sdd1 /u04 ext4 /dev/sde1 /u05 ext4 /dev/sdf1 /u06 ext4 /dev/sdg1 /u07 ext4 /dev/sdh1 /u08 ext4 /dev/sdi1 /u09 ext4 /dev/sdj1 /u10 ext4 /dev/sdk1 /u11 ext4 /dev/sdl1 /u12 ext4
SUCCESS: Correct swap partitions : /dev/sdb3 partition /dev/sda3 partition
SUCCESS: Correct internal USB device (sdm) : 1
SUCCESS: Correct internal USB partitions : 1 primary ext3
SUCCESS: Correct internal USB ext3 partition check : clean
SUCCESS: Correct Linux kernel version : Linux 2.6.32-200.21.1.el5uek
SUCCESS: Correct Java Virtual Machine version : HotSpot(TM) 64-Bit Server 1.6.0_29
SUCCESS: Correct puppet version : 2.6.11
SUCCESS: Correct MySQL version : 5.5.17
SUCCESS: All required programs are accessible in $PATH
SUCCESS: All required RPMs are installed and valid
SUCCESS: Correct bda-monitor status : bda monitor is running
SUCCESS: Big Data Appliance software validation checks succeeded

2. If there are errors, then redo the configuration steps as necessary to correct the problem.

a) If an error like the one below occurs, i.e. the replaced disk partition is listed at the end but all partitions are recognized, then the error can be ignored; it is caused by Bug 17899101 in the bdachecksw script.

ERROR: Wrong mounted partitions : /dev/md0 /boot ext3 /dev/md2 / ext3 /dev/sd4 /u01 ext4 /dev/sd4 /u02 ext4 /dev/sd1 /u03 ext4 /dev/sd1 /u04 ext4 /dev/sd1 /u06 ext4 /dev/sd1 /u07 ext4 /dev/sd1 /u08 ext4 /dev/sd1 /u09 ext4 /dev/sd1 /u10 ext4 /dev/sd1 /u11 ext4 /dev/sd1 /u12 ext4 /dev/sd1 /u05 ext4
INFO: Expected mounted partitions : 12 data partitions, /boot and /

Bug 17899101 is fixed in the V2.4 release of BDA. Patch 17924936 contains a one-off patch for Bug 17899101 for the V2.3.1 release of BDA, and Patch 17924887 contains a one-off patch for Bug 17899101 for the V2.2.1 release of BDA. Refer to the patch Readme file for instructions on how to apply the patch; the Readme file also contains uninstall instructions as needed.

b) If an incorrect tune2fs command was entered above and then corrected by re-running the correct command, but you find that although bdachecksw and bdacheckhw are successful, mount -l shows output like the following:

/dev/sdl1 on /u05 type ext4 (rw,nodev,noatime) [/u06]
/dev/sdl1 on /u06 type ext4 (rw,nodev,noatime) [/u06]

instead of:

/dev/sdl1 on /u05 type ext4 (rw,nodev,noatime) [/u05]
/dev/sdl1 on /u06 type ext4 (rw,nodev,noatime) [/u06]

then this indicates that the list of mounts contains old information. Unmount the device and remount the correct mount point:
# umount /dev/sdl1
# mount /u05

If this does not resolve the issue, then as long as bdachecksw and bdacheckhw are completely successful, you can try a reboot to clear any stale information.

What If Firmware Warnings or Errors Occur?

If the bdacheckhw utility reports errors or warnings with regard to the HDD (Hard Disk Drive) firmware information, indicating that the HDD firmware needs to be updated, follow the instructions in "Firmware Usage and Upgrade Information for BDA Software Managed Components on Oracle Big Data Appliance V2 [ID 1542871.1]".

What If a Server Fails to Restart?

The server may restart during the disk replacement procedures, either because you issued a reboot command or made an error in a MegaCli64 command. In most cases, the server restarts successfully, and you can continue working. However, in other cases, an error occurs so that you cannot reconnect using ssh. In this case, you must complete the reboot using Oracle ILOM:

1. Open a browser window to the Oracle ILOM address of the server. For example:

http://bda1node12-c.example.com
Note: Your browser must have a JDK plug-in installed. If you do not see the Java coffee cup on the log-in page, then you must install the plug-in before continuing.

2. Log in using your Oracle ILOM credentials.

See the following documentation for more information: Oracle Integrated Lights Out Manager (ILOM) 3.0 documentation at http://docs.oracle.com/cd/E19860-01/
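If a browser with the required Java plug-in is not available, the ILOM command-line interface over ssh can be used instead to watch the host console during the reboot. A minimal sketch, assuming the same ILOM host name as in the example above and valid ILOM credentials:

# ssh root@bda1node12-c.example.com
-> start /SP/console     # attach to the host serial console (press ESC ( to exit)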
References

<NOTE:1542871.1> - Firmware Usage and Upgrade Information for BDA Software Managed Components on Oracle Big Data Appliance
<NOTE:1581331.1> - Steps for Replacing a Disk Drive and Determining its Function on the Oracle Big Data Appliance V2.2.*/V2.3.1/V2.4.0/V2.5.0/V3.x/V4.x

Attachments

This solution has no attachments.