Asset ID: 1-72-1479681.1
Update Date: 2014-11-12
Keywords:
Solution Type: Problem Resolution Sure
Solution 1479681.1: Error During Exadata 11.2.3.1.1 Upgrade: Unable to backup file into /boot/cellboot.backup.11.2.2.3.5.110815.tar
Related Items
- Oracle Exadata Storage Server Software
- Exadata Database Machine X2-2 Qtr Rack
- Exadata Database Machine X2-2 Full Rack
- Exadata Database Machine X2-8
- Exadata Database Machine X2-2 Half Rack
- Exadata Database Machine X2-2 Hardware
Related Categories
- PLA-Support>Eng Systems>Exadata/ODA/SSC>Oracle Exadata>DB: Exadata_EST
Created from <SR 3-5996964501>
Applies to:
Exadata Database Machine X2-2 Full Rack - Version All Versions and later
Oracle Exadata Storage Server Software - Version 11.2.2.3.5 to 11.2.3.1.1 [Release 11.2]
Exadata Database Machine X2-2 Half Rack - Version All Versions and later
Exadata Database Machine X2-2 Hardware - Version All Versions and later
Exadata Database Machine X2-2 Qtr Rack - Version All Versions and later
Information in this document applies to any platform.
Symptoms
Patching of one Exadata cell failed with the error: [ERROR] Unable to backup file into /boot/cellboot.backup.11.2.2.3.5.110815.tar
dmXXcelYY: [ERROR] Unable to backup file into /boot/cellboot.backup.11.2.2.3.5.110815.tar
dmXXcelYY: _EXIT_ERROR_Cell dmXXcelYY AA.BB.CC.DD 2012-07-28 06:51:27: Patch or rollback failed as reported by /root/_patch_hctap_/_p_/install.sh -query state on the cell.
dmXXcelYY:
dmXXcelYY: [INFO] Patchmgr was launched from dmXXdbZZ.Company.Country_AA.BB.CC.EE_tmp_patch_11.2.3.1.1.120607.
dmXXcelYY: Cell dmXXcelYY AA.BB.CC.DD
dmXXcelYY: _EXIT_ERROR_Cell dmXXcelYY AA.BB.CC.DD 2012-07-28 06:51:27: Patch or rollback failed as reported by /root/_patch_hctap_/_p_/install.sh -query state on the cell.
FAILED for following cells
dmXXcelYY: dmXXcelYY AA.BB.CC.DD 2012-07-28 06:51:27: Patch or rollback failed as reported by /root/_patch_hctap_/_p_/install.sh -query state on the cell.
2012-07-28 06:51:28 4 of 5 :FAILED: Details in files <cell_name>.log, /tmp/patch_11.2.3.1.1.120607/patchmgr.stdout, /tmp/patch_11.2.3.1.1.120607/patchmgr.stderr.
2012-07-28 06:51:28 4 of 5 :FAILED: DONE: Wait for cells to reboot and come online.
[ERROR] This patchmgr run failed. Please run cleanup before retrying.
================PatchMgr run ended Sat Jul 28 06:51:28 EDT 2012 ===========
Current imageinfo
# imageinfo
Kernel version: 2.6.18-194.3.1.0.4.el5 #1 SMP Sat Feb 19 03:38:37 EST 2011 x86_64
Cell version: CELL-01514: Connect Error. Verify that Management Server is listening at the specified HTTP port: 8888.
Cell rpm version: cell-11.2.2.3.5_LINUX.X64_110815-1
Active image version: 11.2.2.3.5.110815
Active image activated: 2011-10-15 21:09:02 -0400
Active image status: success
Active system partition on device: /dev/md6
Active software partition on device: /dev/md8
In partition rollback: Impossible
Cell boot usb partition: /dev/sdm1
Cell boot usb version: 11.2.2.3.5.110815
Inactive image version: 11.2.3.1.1.120607
[WARNING] File not found /opt/oracle.cellos/patch/history/image.id.11.2.3.1.1.120607
Inactive system partition on device: /dev/md5
Inactive software partition on device: /dev/md7
Boot area has rollback archive for the version: undefined
Rollback to the inactive partitions: Impossible
Please note:
In this example, the IP addresses are replaced with AA.BB.CC.DD and AA.BB.CC.EE.
The machine names are replaced with dmXXcelYY and dmXXdbZZ, and the domain name is replaced with Company.Country.
Changes
Upgrading cell image from 11.2.2.3.5 to 11.2.3.1.1
Cause
The file system journal on /dev/md4 was missing.
Output of tune2fs -l /dev/md4 from the bad cell and from a good cell:
Bad cell
=======
Filesystem features: filetype sparse_super
...
Filesystem created: Sat Feb 5 21:36:44 2011
Last mount time: Sat Oct 15 21:04:22 2011
Last write time: Sun Apr 22 01:22:41 2012
Mount count: 0
...
(no "Journal inode" line: the file system has no journal)
good cell
========
Filesystem features: has_journal ext_attr filetype needs_recovery sparse_super
...
Filesystem created: Sat Feb 5 21:37:06 2011
Last mount time: Sat Jul 28 01:00:27 2012
Last write time: Sat Jul 28 01:00:27 2012
Mount count: 35
...
Journal inode: 8
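The difference between the two cells comes down to whether has_journal appears in the "Filesystem features" line. A minimal sketch of that check, assuming the tune2fs -l output format shown above; the function name has_journal is hypothetical and is not part of any Oracle tool.

```shell
#!/bin/sh
# Hypothetical helper (not an Oracle utility): reads `tune2fs -l` output
# on stdin and succeeds only if the "Filesystem features" line lists
# has_journal. The match is case-insensitive on the label because the
# bad-cell sample above shows it in lower case.
has_journal() {
    grep -i '^filesystem features:' | grep -qw 'has_journal'
}

# Usage on a cell, as root, assuming /dev/md4 is the /boot device:
#   tune2fs -l /dev/md4 | has_journal \
#       && echo "journal present" || echo "journal missing"
```

With the bad-cell output above the function returns non-zero; with the good-cell output it returns zero.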
mdadm --detail /dev/md$x (substitute 1, 2, 5, 6, 7, 8 and 11 for x); sample output for md2 and md5:
/dev/md2:
Version : 0.90
Creation Time : Sat Feb 5 21:35:55 2011
Raid Level : raid1
Array Size : 2096384 (2047.59 MiB 2146.70 MB)
Used Dev Size : 2096384 (2047.59 MiB 2146.70 MB)
Raid Devices : 2
Total Devices : 2
Preferred Minor : 2
Persistence : Superblock is persistent
Update Time : Sun Jul 22 04:24:36 2012
State : clean
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0
UUID : 44f506c7:5d4d1b71:a9f3cadd:69dc557b
Events : 0.116
Number Major Minor RaidDevice State
0 8 9 0 active sync /dev/sda9
1 8 25 1 active sync /dev/sdb9
/dev/md5:
Version : 0.90
Creation Time : Sat Feb 5 21:36:04 2011
Raid Level : raid1
Array Size : 10482304 (10.00 GiB 10.73 GB)
Used Dev Size : 10482304 (10.00 GiB 10.73 GB)
Raid Devices : 2
Total Devices : 2
Preferred Minor : 5
Persistence : Superblock is persistent
Update Time : Sat Jul 28 17:00:15 2012
State : clean
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0
UUID : a67a2ad1:05c76851:623dfe7f:ff0c322d
Events : 0.106
Number Major Minor RaidDevice State
0 8 5 0 active sync /dev/sda5
1 8 21 1 active sync /dev/sdb5
/dev/md4, which is the /boot partition, is missing.
Note: The cell still boots because /boot is still found on the USB recovery drive.
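Running mdadm --detail once per device can be wrapped in a small loop. The sketch below is one way to do it, assuming the device numbers from the mdadm line above; md_summary and MD_NUMBERS are hypothetical names, and the awk pattern assumes the "State : ..." line format seen in the sample output.

```shell
#!/bin/sh
# Device numbers taken from this note; add 4 to also inspect /dev/md4.
MD_NUMBERS="1 2 5 6 7 8 11"

# Hypothetical helper: print one line per array with its reported state,
# or "not present" when the block device does not exist on this host.
md_summary() {
    for x in $MD_NUMBERS; do
        dev="/dev/md$x"
        if [ -b "$dev" ]; then
            # Pick the value of the "State :" line from mdadm --detail.
            state=$(mdadm --detail "$dev" 2>/dev/null |
                awk -F' : ' '/^ *State :/ { print $2; exit }')
            echo "$dev: ${state:-unknown}"
        else
            echo "$dev: not present"
        fi
    done
}

# Usage on a cell, as root:
#   md_summary
```

The full mdadm --detail output is still the authoritative view; the summary only helps spot an array that is absent or not clean.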
Solution
Isolate the faulty cell by manually dropping all grid disks that belong to it.
Fix the journal using the commands below.
df -h (before the fix, /boot is not listed)
====
dmXXcelYY: Filesystem Size Used Avail Use% Mounted on
dmXXcelYY: /dev/md6 9.9G 4.9G 4.5G 52% /
dmXXcelYY: tmpfs 12G 0 12G 0% /dev/shm
dmXXcelYY: /dev/md8 2.0G 645M 1.3G 34% /opt/oracle
dmXXcelYY: /dev/md11 2.3G 182M 2.0G 9% /var/log/oracle
tune2fs -j /dev/md4
mount -a
df -h must now show /boot mounted.
Reboot the faulty cell.
Then attempt to patch the faulty cell:
cd /tmp/patch_11.2.3.1.1.120607
echo "dmXXcelYY" > dmXXcelYY
./patchmgr -cells dmXXcelYY -cleanup
./patchmgr -cells dmXXcelYY -patch_check_prereq
./patchmgr -cells dmXXcelYY -patch
./patchmgr -cells dmXXcelYY -cleanup
After patching, add the grid disks back manually.
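The requirement that df -h shows /boot mounted can be checked mechanically before patchmgr is run again. A minimal sketch, assuming df output in which the mount point is the last field on each line, as in the listing above; boot_is_mounted is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: reads df output on stdin and succeeds only if
# some line has /boot as its mount point (the last field). A leading
# host prefix such as "dmXXcelYY:" does not affect the check.
boot_is_mounted() {
    awk '$NF == "/boot" { found = 1 } END { exit (found ? 0 : 1) }'
}

# Usage on the cell, after tune2fs -j /dev/md4 and mount -a:
#   df -P | boot_is_mounted || {
#       echo "/boot is not mounted; do not start patchmgr yet" >&2
#       exit 1
#   }
```

Applied to the df -h listing in this note, which has no /boot line, the check fails, which matches the state of the faulty cell before the journal was recreated.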
Attachments
This solution has no attachment