Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-72-1479681.1
Update Date:2014-11-12
Keywords:

Solution Type  Problem Resolution Sure

Solution  1479681.1 :   Error During Exadata 11.2.3.1.1 Upgrade Unable to backup file into /boot/cellboot.backup.11.2.2.3.5.110815.tar  


Related Items
  • Oracle Exadata Storage Server Software
  • Exadata Database Machine X2-2 Qtr Rack
  • Exadata Database Machine X2-2 Full Rack
  • Exadata Database Machine X2-8
  • Exadata Database Machine X2-2 Half Rack
  • Exadata Database Machine X2-2 Hardware
Related Categories
  • PLA-Support>Eng Systems>Exadata/ODA/SSC>Oracle Exadata>DB: Exadata_EST




Created from <SR 3-5996964501>

Applies to:

Exadata Database Machine X2-2 Full Rack - Version All Versions and later
Oracle Exadata Storage Server Software - Version 11.2.2.3.5 to 11.2.3.1.1 [Release 11.2]
Exadata Database Machine X2-2 Half Rack - Version All Versions and later
Exadata Database Machine X2-2 Hardware - Version All Versions and later
Exadata Database Machine X2-2 Qtr Rack - Version All Versions and later
Information in this document applies to any platform.

Symptoms

Patching of one Exadata cell failed with the following error: [ERROR] Unable to backup file into /boot/cellboot.backup.11.2.2.3.5.110815.tar

dmXXcelYY: [ERROR] Unable to backup file into /boot/cellboot.backup.11.2.2.3.5.110815.tar
dmXXcelYY: _EXIT_ERROR_Cell dmXXcelYY AA.BB.CC.DD 2012-07-28 06:51:27: Patch or rollback failed as reported by /root/_patch_hctap_/_p_/install.sh -query state on the cell.
dmXXcelYY:
dmXXcelYY: [INFO] Patchmgr was launched from dmXXdbZZ.Company.Country_AA.BB.CC.EE_tmp_patch_11.2.3.1.1.120607.
dmXXcelYY: Cell dmXXcelYY AA.BB.CC.DD
dmXXcelYY: _EXIT_ERROR_Cell dmXXcelYY AA.BB.CC.DD 2012-07-28 06:51:27: Patch or rollback failed as reported by /root/_patch_hctap_/_p_/install.sh -query state on the cell.

FAILED for following cells
dmXXcelYY:  dmXXcelYY AA.BB.CC.DD 2012-07-28 06:51:27: Patch or rollback failed as reported by /root/_patch_hctap_/_p_/install.sh -query state on the cell.
2012-07-28 06:51:28 4 of 5 :FAILED: Details in files <cell_name>.log, /tmp/patch_11.2.3.1.1.120607/patchmgr.stdout, /tmp/patch_11.2.3.1.1.120607/patchmgr.stderr.
2012-07-28 06:51:28 4 of 5 :FAILED: DONE: Wait for cells to reboot and come online.
[ERROR] This patchmgr run failed. Please run cleanup before retrying.
================PatchMgr run ended Sat Jul 28 06:51:28 EDT 2012 ===========

 

Current imageinfo

 

# imageinfo

Kernel version: 2.6.18-194.3.1.0.4.el5 #1 SMP Sat Feb 19 03:38:37 EST 2011 x86_64
Cell version: CELL-01514: Connect Error. Verify that Management Server is listening at the specified HTTP port: 8888.
Cell rpm version: cell-11.2.2.3.5_LINUX.X64_110815-1

Active image version: 11.2.2.3.5.110815
Active image activated: 2011-10-15 21:09:02 -0400
Active image status: success
Active system partition on device: /dev/md6
Active software partition on device: /dev/md8

In partition rollback: Impossible

Cell boot usb partition: /dev/sdm1
Cell boot usb version: 11.2.2.3.5.110815

Inactive image version: 11.2.3.1.1.120607
[WARNING] File not found /opt/oracle.cellos/patch/history/image.id.11.2.3.1.1.120607
Inactive system partition on device: /dev/md5
Inactive software partition on device: /dev/md7

Boot area has rollback archive for the version: undefined
Rollback to the inactive partitions: Impossible

 

Please note:

In this example, the IP addresses are replaced with AA.BB.CC.DD and AA.BB.CC.EE.

The machine names are replaced with dmXXcelYY and dmXXdbZZ, and the domain name is replaced with Company.Country.

Changes

Upgrading the cell image from 11.2.2.3.5 to 11.2.3.1.1.

Cause

The journal on /dev/md4 was missing.

Output of tune2fs -l /dev/md4 from the bad cell and from a good cell:

bad cell
=======
filesystem features:      filetype sparse_super
...
Filesystem created:       Sat Feb  5 21:36:44 2011
Last mount time:          Sat Oct 15 21:04:22 2011
Last write time:          Sun Apr 22 01:22:41 2012
Mount count:              0
...
(no "Journal inode" line; the journal is missing)

good cell
========
Filesystem features:      has_journal ext_attr filetype needs_recovery sparse_super
...
Filesystem created:       Sat Feb  5 21:37:06 2011
Last mount time:          Sat Jul 28 01:00:27 2012
Last write time:          Sat Jul 28 01:00:27 2012
Mount count:              35
...
Journal inode:            8
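
The comparison above can be automated. The following is a minimal sketch, not part of the original note: has_journal is a hypothetical helper name, and it only inspects the "Filesystem features" line of tune2fs -l output supplied on standard input.

```shell
# has_journal: read `tune2fs -l` output on stdin and succeed only if the
# "Filesystem features" line lists the has_journal feature.
# The match on the label is case-insensitive because the bad-cell sample
# above shows the label in lower case.
has_journal() {
    grep -i '^Filesystem features:' | grep -qw 'has_journal'
}

# Example use on a cell (requires root):
#   tune2fs -l /dev/md4 | has_journal && echo "journal present" || echo "journal missing"
```

On the bad cell shown above this check would report the journal as missing; on the good cell it would report it as present.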

 

Output of mdadm --detail /dev/md$x (note: substitute x with 1 2 5 6 7 8 11):

 

/dev/md2:
        Version : 0.90
  Creation Time : Sat Feb  5 21:35:55 2011
     Raid Level : raid1
     Array Size : 2096384 (2047.59 MiB 2146.70 MB)
  Used Dev Size : 2096384 (2047.59 MiB 2146.70 MB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 2
    Persistence : Superblock is persistent

    Update Time : Sun Jul 22 04:24:36 2012
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

           UUID : 44f506c7:5d4d1b71:a9f3cadd:69dc557b
         Events : 0.116

    Number   Major   Minor   RaidDevice State
       0       8        9        0      active sync   /dev/sda9
       1       8       25        1      active sync   /dev/sdb9
/dev/md5:
        Version : 0.90
  Creation Time : Sat Feb  5 21:36:04 2011
     Raid Level : raid1
     Array Size : 10482304 (10.00 GiB 10.73 GB)
  Used Dev Size : 10482304 (10.00 GiB 10.73 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 5
    Persistence : Superblock is persistent

    Update Time : Sat Jul 28 17:00:15 2012
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

           UUID : a67a2ad1:05c76851:623dfe7f:ff0c322d
         Events : 0.106

    Number   Major   Minor   RaidDevice State
       0       8        5        0      active sync   /dev/sda5
       1       8       21        1      active sync   /dev/sdb5
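
The per-array output above could be collected with a small loop. This is a sketch under assumptions: md_details is a hypothetical function name, and the MDADM variable is an illustrative override (default: mdadm) that exists only so the loop can be dry-run without touching real devices.

```shell
# md_details: print `mdadm --detail` for each array number listed in this note.
# MDADM is a hypothetical override; leave it unset to call the real mdadm.
md_details() {
    for x in 1 2 5 6 7 8 11; do
        "${MDADM:-mdadm}" --detail "/dev/md$x"
    done
}

# Dry run that only prints the arguments that would be passed:
#   MDADM=echo md_details
```

The array list is taken verbatim from the note; it does not include md4, so that device would have to be queried separately.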

 

/dev/md4, which is the /boot partition, is missing.

 

Note: The cell still boots because /boot is found on the USB recovery drive.

Solution

Isolate the faulty cell by manually dropping all grid disks that belong to it.

Recreate the journal using the tune2fs command below. Before the fix, df -h does not show /boot:

df -h
====
dmXXcelYY: Filesystem            Size  Used Avail Use% Mounted on
dmXXcelYY: /dev/md6              9.9G  4.9G  4.5G  52% /
dmXXcelYY: tmpfs                  12G     0   12G   0% /dev/shm
dmXXcelYY: /dev/md8              2.0G  645M  1.3G  34% /opt/oracle
dmXXcelYY: /dev/md11             2.3G  182M  2.0G   9% /var/log/oracle

tune2fs -j /dev/md4

mount -a

After mount -a, df -h must show /boot mounted.

Reboot the faulty cell.

Then attempt to patch the faulty cell again:
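
The mount check can also be scripted instead of read from df -h by eye. A minimal sketch, assuming a Linux mounts table in the usual /proc/mounts layout; is_mounted and its optional second parameter are hypothetical names introduced here for illustration.

```shell
# is_mounted MOUNTPOINT [MOUNTS_FILE]
# Succeeds if MOUNTPOINT appears as the second field of a mounts table.
# MOUNTS_FILE defaults to /proc/mounts; the parameter exists so the check
# can be run against a saved copy of the table.
is_mounted() {
    awk -v mp="$1" '$2 == mp { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# Example use after `mount -a`:
#   is_mounted /boot && echo "/boot is mounted" || echo "/boot is NOT mounted"
```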

cd /tmp/patch_11.2.3.1.1.120607
echo "dmXXcelYY" > dmXXcelYY
./patchmgr -cells dmXXcelYY -cleanup
./patchmgr -cells dmXXcelYY -patchcheck_prereq
./patchmgr -cells dmXXcelYY -patch
./patchmgr -cells dmXXcelYY -cleanup
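
The four patchmgr calls above can be wrapped so that a failing step stops the sequence. This is a sketch only: patch_one_cell and the PATCHMGR override are hypothetical names, and it assumes patchmgr returns a non-zero exit status when a step fails, which this note does not confirm. Check patchmgr.stdout and patchmgr.stderr in any case.

```shell
# patch_one_cell CELL_LIST_FILE
# Runs the patchmgr steps from this note in order and stops at the first
# step that reports failure through its exit status (an assumption).
# PATCHMGR is a hypothetical override (default: ./patchmgr) for dry runs.
patch_one_cell() {
    cells_file=$1
    for step in -cleanup -patchcheck_prereq -patch -cleanup; do
        echo "Running: ${PATCHMGR:-./patchmgr} -cells $cells_file $step"
        "${PATCHMGR:-./patchmgr}" -cells "$cells_file" "$step" || {
            echo "Step $step failed; stopping." >&2
            return 1
        }
    done
}

# Example use from the patch directory:
#   cd /tmp/patch_11.2.3.1.1.120607 && patch_one_cell dmXXcelYY
```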

After patching, add the grid disks back manually.


Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.