How to Fix Solaris Panics Caused by "freeing free", "ufs_putapage: bn == UFS_HOLE", or "alloccgblk: can't find blk in cyl, pos"

Asset ID:	1-71-1017680.1
Update Date:	2018-04-07

Solution Type Technical Instruction Sure

Applies to:

OpenSolaris Operating System - Version 2008.05 and later
Solaris Operating System - Version 8.0 and later
SPARC T4-4
All Platforms
***Checked for relevance on 08-Jul-2014***
Document is still relevant and appropriate, as UFS is still in use and UFS corruption still occurs.

Goal

File system corruption can lead to system panics with the following types of panic strings:

freeing free frag
freeing free inode
freeing free block
or
ufs_putapage: bn == UFS_HOLE
or
alloccgblk: can't find blk in cyl, pos

The "freeing free" panic strings indicate that the system was trying to put a fragment, inode, or block onto the free list but found that it was already there. The other two strings likewise indicate that the file system's on-disk structures are inconsistent. The correct response for Solaris is to stop operating on the bad data by panicking the system.
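
As a quick way to spot these strings in saved console or log output, a minimal sketch follows. The function name is hypothetical, and it assumes a grep that supports -E (on some Solaris releases this may mean /usr/xpg4/bin/grep or egrep rather than /usr/bin/grep).

```shell
#!/bin/sh
# Print only the lines on standard input that contain one of the
# UFS corruption panic strings listed in this document.
# The "." in "can.t" is a regex wildcard standing in for the apostrophe,
# which cannot appear inside a single-quoted shell string.
ufs_corruption_panics() {
    grep -E 'freeing free (frag|inode|block)|ufs_putapage: bn == UFS_HOLE|alloccgblk: can.t find blk in cyl'
}

# Example (the input is whatever file holds the saved console or log text):
#   ufs_corruption_panics < saved-console-output.txt
```

The function reads standard input rather than a fixed path, so it makes no assumption about where the panic message was captured.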

NOTE: The default response of panicking the system can be disabled. See the following document for alternative responses:

Document 1009218.1 Troubleshooting the Cause of Solaris UFS File System Corruption and Preventing Future Corruption

The goal of this document is to provide guidance on using fsck(1M) to resolve the panics by cleaning up the file system corruption.

Solution

The fsck(1M) utility is the primary means to repair file system inconsistencies. The procedure to follow is slightly different, depending on whether the file system to repair is the root file system (containing the boot image) or a data file system. Both scenarios will be discussed.

Steps to Follow

1. Identify the corrupt file system. Here is an example of a complete panic string:

panic[cpu83]/thread=2a1007adca0: free: freeing free frag, dev:0x2000000008, blk:40988, cg:60, ino:350487, fs:/var/tmp/syscopy

The panic string is one of the types previously identified:

panic: free: freeing free * (frag, block, inode, etc.)
panic: ufs_putapage: bn == UFS_HOLE
panic: alloccgblk: can't find blk in cyl, pos

   The file system that is corrupt, causing the panic, is identified by the string:

     fs:/var/tmp/syscopy

   Therefore, in this example, /var/tmp/syscopy needs to be repaired.
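
The fs: field can also be extracted mechanically from a saved panic line. The following is a sketch only: the function name is hypothetical, and it assumes the panic string carries the mount point in an "fs:" field as in the example above, with no spaces or commas in the path.

```shell
#!/bin/sh
# Read text on standard input and print the mount point that follows
# the last "fs:" on any line that has one. Lines without "fs:" print nothing.
panic_fs() {
    sed -n 's/.*fs:\([^ ,]*\).*/\1/p'
}

# Example with the panic string shown above:
#   echo 'panic[cpu83]/thread=2a1007adca0: free: freeing free frag, dev:0x2000000008, blk:40988, cg:60, ino:350487, fs:/var/tmp/syscopy' | panic_fs
```

Because the function filters standard input, it can be fed from any command that prints the panic string.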

NOTE: The file system to repair may only be reported in the panic string in the system core file. It can be found by executing:

        strings vmcore.# | head

Look for the file system at the end of the panic string, in the form:

        fs:/file-system

2. Run fsck(1M) on the device containing the noted file system. If the panic string does not report the file system, run fsck(1M) on each file system separately, using the -o f option.

   fsck(1M) *MUST* be executed on an unmounted file system!

   For a non-root or data file system, execute the following:

     fsck -o f /dev/rdsk/<device>

   The -o f option forces fsck(1M) to check the whole file system, even if the file system is marked clean.
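
Before running the command, it can help to confirm that the file system really is unmounted. The sketch below assumes the usual Solaris /etc/mnttab layout, where the second field of each line is the mount point. The function name is hypothetical, and the awk -v option may require nawk or /usr/xpg4/bin/awk on some Solaris releases.

```shell
#!/bin/sh
# Return 0 if MOUNT_POINT appears as a mount point in the mount table,
# and non-zero otherwise.
# Usage: is_mounted MOUNT_POINT [MNTTAB_FILE]
# Assumption: MNTTAB_FILE defaults to /etc/mnttab and field 2 is the mount point.
is_mounted() {
    mp=$1
    tab=${2:-/etc/mnttab}
    awk -v mp="$mp" '$2 == mp { found = 1 } END { exit !found }' "$tab"
}

# Example:
#   if is_mounted /var/tmp/syscopy; then
#       echo "still mounted: unmount before running fsck" >&2
#   fi
```

The optional second argument exists only so the check can be tried against a copy of the mount table.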

NOTE: The format of the device to specify is dependent on whether or not the device is actually a volume. The following are examples of different device specifications:

   SVM:    # fsck /dev/md/rdsk/d0
   VxVM:   # fsck /dev/vx/rdsk/rootdg/rootvol
   Disk:   # fsck /dev/rdsk/c0t0d0s0

   The root file system cannot be unmounted while the system is booted from it. Therefore, booting from an alternate device will be necessary.

   For SPARC systems, the alternate boot device options are:

   boot cdrom -s
   boot net -s
   boot -F failsafe

NOTE: x86/x64 systems have similar options for booting into single-user mode from an alternate boot device.

   Once booted, execute the following:

     fsck -o f /dev/rdsk/<device>

   The -o f option forces fsck(1M) to check the whole file system, even if the file system is marked clean.

NOTE: If the corrupt root file system is on a mirrored metadevice using SVM, perform the fsck(1M) using the steps from the following document:

<Document 1340586.1> How to access (root) disk under Solaris Volume Manager Control from failsafe or CDROM

If the corrupt root file system is a volume under VxVM, please contact Symantec for assistance with running fsck(1M) on the file system.

   In all cases, repeat fsck(1M) on the corrupt file system until a pass completes without detecting any errors. If fsck(1M) still reports errors on subsequent passes, the file system may be corrupted to the point that it needs to be rebuilt and restored.
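
The repeat-until-clean procedure can be sketched as a small loop. This is an illustration, not a supported tool: the function name and pass limit are hypothetical, a POSIX shell is assumed, and a pass is treated as clean only when the command exits 0 and does not print "FILE SYSTEM WAS MODIFIED". Verify both assumptions against fsck(1M) on the release in use.

```shell
#!/bin/sh
# Rerun a check command until one pass is clean, or give up.
# Usage: fsck_until_clean MAX_PASSES COMMAND [ARGS...]
# Assumption: a clean pass exits 0 and prints no "FILE SYSTEM WAS MODIFIED" line.
fsck_until_clean() {
    max=$1
    shift
    pass=1
    log=${TMPDIR:-/tmp}/fsck_pass.$$
    while [ "$pass" -le "$max" ]; do
        echo "== pass $pass: $*"
        # Output is captured so it can be inspected, then shown to the operator.
        "$@" > "$log" 2>&1
        status=$?
        cat "$log"
        if [ "$status" -eq 0 ] && ! grep "FILE SYSTEM WAS MODIFIED" "$log" > /dev/null; then
            rm -f "$log"
            echo "clean after $pass pass(es)"
            return 0
        fi
        pass=$((pass + 1))
    done
    rm -f "$log"
    echo "errors remain after $max passes; rebuild and restore may be needed" >&2
    return 1
}

# Example (the -y choice is an assumption; it answers yes to every repair):
#   fsck_until_clean 5 fsck -y -o f /dev/rdsk/<device>
```

Because output is captured to a log before being displayed, the command has to run without prompting, which is why the example passes -y. Whether answering yes to every repair is acceptable is a decision for the administrator.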

   Once the corruption has been corrected, an attempt should be made to determine the source of the corruption and address the cause. Details on how to do this can be found in the following document:

<Document 1009218.1> Troubleshooting the Cause of Solaris UFS File System Corruption and Preventing Future Corruption

References

<NOTE:1009218.1> - Troubleshooting the Cause of Solaris UFS File System Corruption and Preventing Future Corruption
<NOTE:1340586.1> - How to Access (Root) Disk under Solaris Volume Manager Control (SVM) from Failsafe or CDROM and Update the boot_archive in Solaris 10
