Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition
Solution Type: Problem Resolution (Sure Solution)
1489376.1: Exalogic ZFS Storage Pool Failure or Faulted
Created from <SR 3-5989624601>

Applies to:
Sun ZFS Storage 7320 - Version: All Versions to All Versions [Release: All Releases]
Oracle Exalogic Elastic Cloud Software - Version 1.0.0.0.0 and later
Sun ZFS Storage 7320
7000 Appliance OS (Fishworks)

Symptoms
When trying to view shares on the Storage Node, the following error is returned:

XXXXXXsnXX:> shares
error: The action could not be completed because the target 'exalogic/local' no
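Before making any changes, the pool state can be confirmed from the underlying shell of the affected Storage Node. This is a minimal sketch, assuming shell access is obtained with confirm shell from the appliance CLI as in step 1 of the Solution below:

XXXXXXsnXX:> confirm shell
XXXXXXsnXX# zpool status -x              # reports only pools that are not healthy
XXXXXXsnXX# zpool status -v exalogic     # full vdev listing, including the log devices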
Changes
A logzilla swap and memory upgrade were performed on the Storage Node where the issue is reported.

Cause
After the cluster takeover the logzillas were removed but were not added back.

Solution
Please contact Oracle Support for assistance with the manual steps needed to recover the situation.

For Exalogic Support Engineers: the following INTERNAL ONLY section of this note describes the steps that need to be performed under the supervision of Exalogic Support and a ZFS engineer.

Perform the steps below to resolve this issue:

1. Check the zpool list:

XXXXXXsnXX:> confirm shell
XXXXXXsnXX# zpool list
NAME       SIZE   ALLOC  FREE   CAP  DEDUP  HEALTH    ALTROOT
exalogic   16.3T  12.2T  4.09T  74%  1.00x  DEGRADED  -
system     464G   167G   297G   35%  1.00x  ONLINE    -

For more information, refer to the zpool documentation.

2. Generate an akd core (gcore `pgrep -xo akd`) and look at the ::nas_cache:

# echo ::nas_cache | mdb core.<PID>
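A minimal end-to-end sketch of this step, combining the two commands (assumption: gcore writes the core as ./core.<PID> of the akd process in the current directory):

XXXXXXsnXX# gcore `pgrep -xo akd`                        # dumps ./core.<PID> of the running akd
XXXXXXsnXX# echo ::nas_cache | mdb core.`pgrep -xo akd`  # inspect the NAS cache in that core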
You can see that ::nas_cache has given no output.

3. Verify whether zfs/exalogic is faulted in akd, as below:

> ::ak_rm_elem !grep exalo
93ea348   SINGLETON  FAULTED   ak:/zfs/exalogic
17848b48  SYMBIOTE   IMPORTED  ak:/nas/exalogic
178488c8  SYMBIOTE   FAULTED   ak:/replication/exalogic
17848648  SYMBIOTE   FAULTED   ak:/shadow/exalogic
178483c8  SYMBIOTE   IMPORTED  ak:/fct/exalogic

4. Disable and enable akd.

Note: In this step akd will be restarted. Before and after disabling and enabling akd, check the peer cluster state to verify that the states reported by the peer are proper and not in transition (such as rebooting or joining); they should be owner/stripped.

Disable akd:

XXXXXXsnXX# svcadm disable -t akd
Then enable it:

XXXXXXsnXX# svcadm enable akd
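While waiting for the restart, the SMF state of akd can be polled from the shell until it is back online. A small sketch (the service name akd is taken from the svcadm commands above; the poll interval is arbitrary):

# Poll the SMF state of the akd service until it reports online again
while [ "`svcs -H -o state akd`" != "online" ]; do
    sleep 10
done
svcs -p akd    # confirm the service and its akd process are up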
It will take a couple of minutes, but once akd is restarted the shares will be accessible.

NOTE:
Instead of disabling and enabling akd (which restarts akd), we can alternatively run the following command:

raw nas.discover({pool:"<pool>"})
where <pool> is the name of the pool (e.g. raw nas.discover({pool:"my pool"})).

5. After disabling/enabling akd, check the nas_cache again and you will see Entries and Mountpoints:
# echo ::nas_cache | mdb -p `pgrep -ox akd`
nas cache at 0x910b6c8
Entries:
ADDR      DATASET  STATE  FLAGS
19557208  NONE  /export/binaries/mw_home1
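After the restart and the nas_cache check, re-verify the peer cluster state as required by the note in step 4. A sketch from the appliance CLI, assuming the standard configuration cluster context is available on this software release (the property values shown are illustrative):

XXXXXXsnXX:> configuration cluster show
                 state = AKCS_OWNER      (this head owns the pool resources)
            peer_state = AKCS_STRIPPED   (peer head is up and holds no resources)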
6. Once the nas_cache looks fine, fix the logzillas which are UNAVAIL (note: the logzillas were replaced with larger logzillas).
> logs
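To focus on just the log devices at this stage, the logs section can be pulled out of the pool status. A hedged sketch (the awk range expression simply prints everything from the first line containing 'logs' to the end of the output):

XXXXXXsnXX# zpool status exalogic | awk '/logs/,0'     # show the log vdevs and their state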
7. Check for the GUID that needs to be cleaned up using ::spa -v:

> ::spa -v
...
ffffff826238f2c0      DEGRADED   -          replacing
  ffffff826238a080    REMOVED    -          /dev/dsk/c0t5000A72030022B98d0s0
  ffffff82623a40c0    CANT_OPEN  BAD_LABEL  /dev/dsk/c0t5000A7203004D24Fd0s0
...

8. To get the GUID, do the following:
> ffffff826238f2c0::print vdev_t vdev_guid |=E
1769560619761342386
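The two mdb steps above can also be run non-interactively from the shell. A sketch, assuming the ::spa and ::print walks are being run against the live kernel (mdb -k), which is where these dcmds apply; the address is the one reported in the step 7 output:

XXXXXXsnXX# echo "::spa -v" | mdb -k                                        # list pool vdevs and their state
XXXXXXsnXX# echo "ffffff826238f2c0::print vdev_t vdev_guid | =E" | mdb -k   # print that vdev's GUID in decimal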
9. Now remove the GUID from the pool:

XXXXXXsnXX# zpool remove exalogic 1769560619761342386 &
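Because the remove above is started in the background with '&', it is worth confirming it has returned before checking the status in step 10; a minimal sketch:

XXXXXXsnXX# jobs    # the zpool remove job should no longer be listed as running
XXXXXXsnXX# wait    # block until any remaining background job completes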
10. Now check the zpool status again:

        logs
          replacing-9              DEGRADED     0     2     0
            c0t5000A72030022B90d0  REMOVED      0     0     0
            c0t5000A7203004D200d0  UNAVAIL      0     0     0  corrupted data
          replacing-11             DEGRADED     0     2     0
            c0t5000A72030022315d0  REMOVED      0     0     0
            c0t5000A7203004D21Ed0  UNAVAIL      0     0     0  corrupted data
          c0t5000A7203004D221d0    ONLINE       0     0     0

11. Once that has all been removed, add the logzillas back:

XXXXXXsnXX# zpool add exalogic log c0t5000A7203004D21Ed0 &
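As a final hedged check, each replacement logzilla added back should appear ONLINE under the logs section, and the pool should return to a healthy state once the removal and resilver activity settles:

XXXXXXsnXX# zpool status -v exalogic     # new log devices should show ONLINE under 'logs'
XXXXXXsnXX# zpool status -x exalogic     # should eventually report the pool as healthy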