
Asset ID: 1-71-1526585.1
Update Date: 2013-02-14

Solution Type: Technical Instruction (Sure)

Solution 1526585.1: Managing SuperCluster Boot Environments


Related Items
  • SPARC SuperCluster T4-4
Related Categories
  • PLA-Support>Eng Systems>Exadata/ODA/SSC>SPARC SuperCluster>DB: SuperCluster_EST




In this Document
Goal
Fix
 Managing Space
 Best Practices


Applies to:

SPARC SuperCluster T4-4 - All versions
Oracle Solaris on SPARC (64-bit)

Goal

Multiple boot environments reduce risk when updating software because system administrators can create a backup boot environment before applying any software updates to the system. Each time the SuperCluster is patched with the latest QMU (Quarterly Maintenance Update), the maintenance update script creates a new QMU boot environment on each compute and general purpose domain, which the administrator then activates during a maintenance window. If these older boot environments are not managed, they can consume a significant amount of space in the root zpool. To ensure adequate space, we recommend removing older boot environments after each successful upgrade to the latest QMU so that the root zpool does not run out of space. Before removing anything, make sure you have a complete backup of the system.
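For reference, activating a newly created QMU boot environment during the maintenance window typically looks like the following minimal sketch (the BE name SSCMU_2013.01 is illustrative and not taken from this system):

# beadm list
# beadm activate SSCMU_2013.01
# init 6

The newly activated BE is used on the next boot, and the previously active BE is retained as a fallback.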

Fix

In this example, we have a SuperCluster domain that still has some of the original installation boot environments, which are no longer required:

# beadm list
BE                     Active Mountpoint Space   Policy Created
--                     ------ ---------- -----   ------ -------
S11-bld55              -      -          66.0K   static 2012-07-02 14:27
SSCMU_2012.07          -      -          32.85M  static 2012-09-12 17:57
SSCMU_2012.10          NR     /          260.25G static 2012-11-27 10:31
SSCMU_2012.10-backup-1 -      -          229.0K  static 2012-11-27 16:57
solaris                -      -          22.90M  static 2012-04-30 22:48
solaris11-sru55        -      -          60.51M  static 2012-07-02 14:14

It is advisable to keep at least the previous QMU boot environment each time the system is upgraded, in case of a problem with the current BE (a rollback sketch follows this example). In this case we will remove the "solaris", "solaris11-sru55" and "S11-bld55" BEs:

# beadm destroy S11-bld55
Are you sure you want to destroy S11-bld55?  This action cannot be undone(y/[n]): y
# beadm destroy solaris11-sru55
Are you sure you want to destroy solaris11-sru55?  This action cannot be undone(y/[n]): y
# beadm destroy solaris
Are you sure you want to destroy solaris?  This action cannot be undone(y/[n]): y

That should then leave us with the following:

# beadm list
BE                     Active Mountpoint Space   Policy Created
--                     ------ ---------- -----   ------ -------
SSCMU_2012.07          -      -          32.85M  static 2012-09-12 17:57
SSCMU_2012.10          NR     /          260.25G static 2012-11-27 10:31
SSCMU_2012.10-backup-1 -      -          229.0K  static 2012-11-27 16:57
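
If a problem is later found with the current BE, the retained previous QMU boot environment can be activated as a fallback. A minimal sketch using the BE names from this example (confirm the intended BE name with beadm list first):

# beadm activate SSCMU_2012.07
# init 6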

Managing Space

Depending on how the system and local disk space are being used, you may run low on space. A common cause is not being aware of the snapshot clones that are created during a beadm create. A snapshot created with zfs snapshot can be deleted as long as it is not cloned or referenced elsewhere. In this example the system administrator has noticed that they are running low on space for some applications to run:

# zpool list
NAME        SIZE  ALLOC   FREE  CAP  DEDUP  HEALTH  ALTROOT
BIrpool-1   279G   257G  22.2G  92%  1.00x  ONLINE  -
zonespool  19.9G   906M  19.0G   4%  1.00x  ONLINE  -

We have only 22 GB free, so where has all the space gone?

# beadm list
BE            Active Mountpoint Space   Policy Created          
--            ------ ---------- -----   ------ -------          
SSCMU_2012.07 -      -          401.45M static 2012-09-03 07:09 
SSCMU_2012.10 NR     /          285.00G static 2012-11-26 07:49 
backup        -      -          192.0K  static 2013-02-05 08:55 

The current boot environment SSCMU_2012.10 appears to hold the majority of the space. Knowing which large files need to be deleted, let's go through each boot environment, delete the file from each one so that the space can be released, and then list the snapshots to check:

# rm /root/large_space_consuming_file
# beadm mount backup /mnt
# rm /mnt/root/large_space_consuming_file
# beadm unmount backup
# beadm mount SSCMU_2012.07 /mnt
# rm /mnt/root/large_space_consuming_file
# beadm unmount SSCMU_2012.07

# zfs list -t snapshot
NAME                                                                   USED  AVAIL  REFER  MOUNTPOINT
BIrpool-1/ROOT/SSCMU_2012.10@2012-11-26-15:49:39                       774M      -  2.01G  -
BIrpool-1/ROOT/SSCMU_2012.10@2013-02-05-16:55:22                       140G      -   142G  -
BIrpool-1/ROOT/SSCMU_2012.10/var@2012-11-26-15:49:39                  92.1M      -   545M  -
BIrpool-1/ROOT/SSCMU_2012.10/var@2013-02-05-16:55:22                    23K      -   580M  -
BIrpool-1/zones/zc-zone/rpool/ROOT/solaris-1@2012-11-26-15:49:41       245M      -   549M  -
BIrpool-1/zones/zc-zone/rpool/ROOT/solaris-1/var@2012-11-26-15:49:41  23.6M      -  35.3M  -
zonespool/zc-zone/rpool/ROOT/solaris-1@2012-11-26-15:49:41             253M      -   548M  -
zonespool/zc-zone/rpool/ROOT/solaris-1@2013-02-05-16:55:26                0      -   579M  -
zonespool/zc-zone/rpool/ROOT/solaris-1/var@2012-11-26-15:49:41        23.6M      -  35.2M  -
zonespool/zc-zone/rpool/ROOT/solaris-1/var@2013-02-05-16:55:26            0      -  47.6M  -

We still have 140 GB in BIrpool-1/ROOT/SSCMU_2012.10@2013-02-05-16:55:22 that has not been freed, so there must be some other reason why we cannot reclaim the space even though we have deleted the file from all boot environments:

# zfs destroy BIrpool-1/ROOT/SSCMU_2012.10@2013-02-05-16:55:22
cannot destroy 'BIrpool-1/ROOT/SSCMU_2012.10@2013-02-05-16:55:22': snapshot has dependent clones
use '-R' to destroy the following datasets:
BIrpool-1/ROOT/backup/var
BIrpool-1/ROOT/backup

We cannot delete the snapshot because it is part of our backup boot environment: beadm create takes a snapshot of the current BE and uses it as the origin of a clone for the new BE.

# zfs get origin BIrpool-1/ROOT/SSCMU_2012.10@2013-02-05-16:55:22
NAME                                              PROPERTY  VALUE   SOURCE
BIrpool-1/ROOT/SSCMU_2012.10@2013-02-05-16:55:22  origin    -       -

Ah, that snapshot was taken from our current BE when the backup BE was created, and the backup BE's datasets are clones of it, so it cannot be removed until we destroy the backup BE.
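
To confirm the relationship from the other direction, you can query the origin of the backup BE's dataset; since a clone's origin property points to the snapshot it was cloned from, it should report this snapshot (shown as a sketch only; the output was not captured on this system):

# zfs get origin BIrpool-1/ROOT/backup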

# beadm destroy backup
Are you sure you want to destroy backup?  This action cannot be undone(y/[n]): y

# zpool list
NAME        SIZE  ALLOC   FREE  CAP  DEDUP  HEALTH  ALTROOT
BIrpool-1   279G   198G  81.1G  70%  1.00x  ONLINE  -
zonespool  19.9G   906M  19.0G   4%  1.00x  ONLINE  -

And now we have reclaimed the space.

Best Practices

  • Aim to keep at least one previous QMU boot environment.
  • Ensure regular system backups are performed.
  • Avoid storing application data on the boot/root disks; such data is better placed on the ZFS Storage Appliance (ZFS-SA).
  • Monitor available space frequently and take action early to avoid problems (a simple check is sketched after this list).
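
A minimal sketch of a routine space check on each domain (pool, BE and snapshot names will differ per system):

# zpool list
# beadm list
# zfs list -t snapshot

Reviewing these outputs after each QMU, and removing boot environments that are no longer needed, keeps the root zpool from filling up.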

For further details on managing boot environments, please review the Creating and Administering Oracle Solaris 11 Boot Environments guide and the ZFS Administration Guide.


Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.