Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-72-1504807.1
Update Date:2015-09-11

Solution Type  Problem Resolution Sure

Solution  1504807.1 :   Sun Storage 7000 Unified Storage System: Failing or moving readzilla SSD devices can lead to a panic  


Related Items
  • Sun ZFS Storage 7420
  • Sun Storage 7410 Unified Storage System
  • Sun Storage 7310 Unified Storage System
  • Sun ZFS Storage 7320
Related Categories
  • PLA-Support>Sun Systems>DISK>ZFS Storage>SN-DK: 7xxx NAS




In this Document
Symptoms
Changes
Cause
Solution
References


Created from <SR 3-6306787561>

Applies to:

Sun ZFS Storage 7320 - Version All Versions and later
Sun ZFS Storage 7420 - Version All Versions and later
Sun Storage 7310 Unified Storage System - Version All Versions and later
Sun Storage 7410 Unified Storage System - Version All Versions and later
7000 Appliance OS (Fishworks)

Symptoms

The system panics with a stack trace similar to the following:

panic[cpu1]/thread=ffffff00f5be1c40:
mutex_enter: bad mutex, lp=ffffff81ecd64e58 owner=5c00 thread=ffffff00f5be1c40

ffffff00f5be1950 unix:mutex_panic+73 ()
ffffff00f5be19b0 unix:mutex_vector_enter+446 ()
ffffff00f5be1a00 nv_sata:nv_power_reset+2a2 ()
ffffff00f5be1a60 nv_sata:nv_reset+b2 ()
ffffff00f5be1a90 nv_sata:nv_monitor_reset+2c8 ()
ffffff00f5be1af0 nv_sata:nv_timeout+55c ()
ffffff00f5be1b30 genunix:callout_list_expire+77 ()
ffffff00f5be1b60 genunix:callout_expire+31 ()
ffffff00f5be1b80 genunix:callout_execute+1e ()
ffffff00f5be1c20 genunix:taskq_thread+248 ()
ffffff00f5be1c30 unix:thread_start+8 ()

syncing file systems...
done

This panic may be triggered by a faulted readzilla (read cache SSD).

Check the system logs for errors that indicate a failure of one of the readzilla read cache SSDs in the NAS head unit. These errors will have been logged shortly before the panic:
Oct 11 09:52:10 dennas402 sata: [ID 801593 kern.warning] WARNING: /pci@0,0/pci10de,cb84@5,1:
Oct 11 09:52:10 dennas402  SATA device at port 1 - device failed
Oct 11 09:52:10 dennas402 scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci10de,cb84@5,1/disk@1,0 (sd195):
Oct 11 09:52:10 dennas402       Command failed to complete...Device is gone
Oct 11 09:52:10 dennas402 scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci10de,cb84@5,1/disk@1,0 (sd195):
Oct 11 09:52:10 dennas402       drive offline
Oct 11 09:52:10 dennas402 scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci10de,cb84@5,1/disk@1,0 (sd195):
Oct 11 09:52:10 dennas402       SYNCHRONIZE CACHE command failed (5)
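
As an illustration only, the check described above can be sketched as a small script that scans log lines for the messages quoted above. This is a minimal sketch, not an Oracle tool: the function name find_sata_failures, the SAMPLE_LOG list, and the regular expressions are hypothetical and are derived solely from the sample messages in this note. Other software releases may word these messages differently, and this sketch does not know how the logs are obtained from the appliance.

```python
import re

# Patterns derived only from the sample messages in this note (assumption:
# other releases may format these warnings differently).
PORT_FAILED = re.compile(r"SATA device at port (\d+) - device failed")
DEVICE_LINE = re.compile(r"WARNING: (\S+) \((sd\d+)\):")
FOLLOW_UPS = (
    "Device is gone",
    "drive offline",
    "SYNCHRONIZE CACHE command failed",
)

# Sample input copied from the log excerpt in this note.
SAMPLE_LOG = [
    "Oct 11 09:52:10 dennas402 sata: [ID 801593 kern.warning] WARNING: /pci@0,0/pci10de,cb84@5,1:",
    "Oct 11 09:52:10 dennas402  SATA device at port 1 - device failed",
    "Oct 11 09:52:10 dennas402 scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci10de,cb84@5,1/disk@1,0 (sd195):",
    "Oct 11 09:52:10 dennas402       Command failed to complete...Device is gone",
    "Oct 11 09:52:10 dennas402 scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci10de,cb84@5,1/disk@1,0 (sd195):",
    "Oct 11 09:52:10 dennas402       drive offline",
]


def find_sata_failures(lines):
    """Summarise SATA failure indicators found in syslog-style lines."""
    summary = {"failed_ports": [], "devices": set(), "follow_ups": []}
    for line in lines:
        port = PORT_FAILED.search(line)
        if port:
            summary["failed_ports"].append(int(port.group(1)))
        device = DEVICE_LINE.search(line)
        if device:
            # group(2) is the sd instance name, e.g. "sd195"
            summary["devices"].add(device.group(2))
        for text in FOLLOW_UPS:
            if text in line:
                summary["follow_ups"].append(text)
    return summary


if __name__ == "__main__":
    print(find_sata_failures(SAMPLE_LOG))
```

A non-empty failed_ports list whose timestamps fall shortly before the panic would be consistent with the symptom described here; it does not by itself confirm the root cause.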
 

The panic can also be triggered simply by moving or swapping the readzilla drives.

Changes

Faulted readzilla

Readzilla drives being swapped around

Cause

This is a known problem with interrupt servicing in the SATA driver code (the nv_sata module seen in the panic stack).

See <Bug 15794723> for further details.

Solution

Upgrade to appliance software version 2011.1.6.0 or later.

See the Patches and Updates tab of the My Oracle Support portal, or MOS Document ID 2021771.1 (Oracle ZFS Storage Appliance: Software Updates), for details.
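
As a rough illustration of the "2011.1.6.0 or later" check, the comparison can be sketched as follows. This is a minimal sketch under a stated assumption: the installed version is written in the same dotted release form used in this note (for example 2011.1.6.0), not as an internal build string. The names parse_version and has_fix are hypothetical and are not part of the appliance software.

```python
# Minimum release that contains the fix, as stated in this note.
MINIMUM_FIXED = "2011.1.6.0"


def parse_version(text):
    """Turn a dotted release string such as '2011.1.6.0' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def has_fix(installed):
    """Return True if the installed release is MINIMUM_FIXED or later.

    Assumption: 'installed' uses the same dotted release form as MINIMUM_FIXED.
    """
    return parse_version(installed) >= parse_version(MINIMUM_FIXED)


if __name__ == "__main__":
    for release in ("2011.1.5.0", "2011.1.6.0", "2013.1.0.0"):
        print(release, has_fix(release))
```

Comparing tuples of integers avoids the pitfalls of comparing the strings directly, where "2011.1.10.0" would sort before "2011.1.6.0".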
 

References

<BUG:15794723> - SUNBT7168171-AK-2011.04.24 PANIC: NV_POWER_RESET ATTEMPTS TO ACQUIRE A MUTEX FOR
<NOTE:2021771.1> - Oracle ZFS Storage Appliance: Software Updates

  Copyright © 2018 Oracle, Inc.  All rights reserved.