Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-72-2001263.1
Update Date:2017-11-18
Keywords:

Solution Type  Problem Resolution Sure

Solution  2001263.1 :   Oracle ZFS Storage Appliance: Performance issue after enabling ICAP Vscan (virus scanning) service on a share  


Related Items
  • Sun ZFS Storage 7420
  • Sun Storage 7110 Unified Storage System
  • Oracle ZFS Storage ZS3-2
  • Sun Storage 7210 Unified Storage System
  • Sun Storage 7410 Unified Storage System
  • Sun Storage 7310 Unified Storage System
  • Sun ZFS Storage 7120
  • Oracle ZFS Storage ZS3-4
  • Sun ZFS Storage 7320
  • Oracle ZFS Storage ZS3-BA
Related Categories
  • PLA-Support>Sun Systems>DISK>ZFS Storage>SN-DK: 7xxx NAS




In this Document
Symptoms
Changes
Cause
Solution
References


Created from <SR 3-9721766661>

Applies to:

Oracle ZFS Storage ZS3-2 - Version All Versions and later
Oracle ZFS Storage ZS3-BA - Version All Versions and later
Sun Storage 7410 Unified Storage System - Version All Versions and later
Sun Storage 7310 Unified Storage System - Version All Versions and later
Sun Storage 7110 Unified Storage System - Version All Versions and later
7000 Appliance OS (Fishworks)
Appliance Type: Sun Storage 7110
Appliance Version: 2011.04.24.9.0,1-1.46

Symptoms

After the Virus Scan option was enabled on a share, access to the filesystems became slow.

A high CPU load was observed (this can be checked via Analytics).

 

From the appliance Command Line Interface (CLI):

CLI> confirm shell prstat

   PID USERNAME  SIZE   RSS STATE  PRI NICE      TIME  CPU PROCESS/NLWP
 19098 daemon   4100K 2820K cpu6     0    0   0:05:53  12% vscand/85

 

Restarting the ICAP Vscan service resolved the issue only temporarily. The system log showed quarantine events and scan engine connection errors:

Oct  3 21:07:11 7110 vscand: [ID 540744 daemon.notice] quarantine /export/data/home/user1/Thunderbird/Profiles/khrcwfvs.default/Cache/_CACHE_003_ -8 - Malformed container violation
Oct  3 21:07:24 7110 vscand: [ID 540744 daemon.notice] quarantine /export/data/home/user2/Thunderbird/Profiles/d4lq8lsx.default/Cache/_CACHE_003_ -8 - Malformed container violation
Oct  3 21:31:28 7110 vscand: [ID 678180 daemon.notice] Scan Engine - connection error (192.x.x.45:1344) Connection timed out
Oct  3 21:41:25 7110 vscand: [ID 678180 daemon.notice] Scan Engine - connection error (192.x.x.45:1344) Connection refused
Oct  4 00:41:40 7110 vscand: [ID 678180 daemon.notice] Scan Engine - connection error (192.x.x.44:1344) Connection timed out

Oct  6 22:05:51 7110 vscand: [ID 678180 daemon.notice] Scan Engine - connection error (192.x.x.45:1344) Connection timed out
Oct  6 22:08:45 7110 vscand: [ID 940187 daemon.error] Error receiving data from Scan Engine: Connection timed out
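
To see which scan engine is failing and how often, the connection errors in log lines like those above can be tallied per engine. The following is a minimal sketch, assuming the log has been copied off the appliance as plain text; the function name `count_connection_errors` and the sample list are illustrative only, and the regular expression is derived solely from the message format shown in this note.

```python
import re
from collections import Counter

# Matches vscand lines of the form:
#   ... vscand: [ID ...] Scan Engine - connection error (host:port) reason
CONN_ERR = re.compile(
    r"vscand: .*Scan Engine - connection error "
    r"\((?P<engine>[^)]+)\) (?P<reason>.+)$"
)


def count_connection_errors(lines):
    """Count connection errors per (engine, reason) pair."""
    counts = Counter()
    for line in lines:
        match = CONN_ERR.search(line.rstrip())
        if match:
            counts[(match.group("engine"), match.group("reason"))] += 1
    return counts


# Sample lines taken from the log excerpt in this note.
SAMPLE = [
    "Oct  3 21:31:28 7110 vscand: [ID 678180 daemon.notice] Scan Engine - connection error (192.x.x.45:1344) Connection timed out",
    "Oct  3 21:41:25 7110 vscand: [ID 678180 daemon.notice] Scan Engine - connection error (192.x.x.45:1344) Connection refused",
    "Oct  4 00:41:40 7110 vscand: [ID 678180 daemon.notice] Scan Engine - connection error (192.x.x.44:1344) Connection timed out",
    "Oct  6 22:05:51 7110 vscand: [ID 678180 daemon.notice] Scan Engine - connection error (192.x.x.45:1344) Connection timed out",
]

if __name__ == "__main__":
    for (engine, reason), n in sorted(count_connection_errors(SAMPLE).items()):
        print(f"{engine}  {reason}: {n}")
```

A count that is concentrated on one engine address would suggest a problem with that particular scan engine host rather than with the appliance.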

 

Changes

Enabled ICAP virus scan on the project / share.

Cause

It was confirmed that the vscand service was in a waiting state, waiting for a response from the scan engine.

A significant increase in CPU usage was observed.

 

The mdb output from ::cpuinfo shows that all CPUs except one (CPU 3) are in use by vscand, handling interrupts:

> ::cpuinfo
ID ADDR             FLG NRUN BSPL PRI RNRN KRNRN SWITCH THREAD           PROC
 0 fffffffffbc303e0  1f    1    0  10   no    no t-1 fffff6000f324760 vscand
 1 fffff6000ae48040  1f    0    0  10   no    no t-1 fffff60093b5bb60 vscand
 2 fffff6000b89a080  1f    0    0   0   no    no t-0 fffff6000f369ae0 vscand
 3 fffffffffbc3aba0  1b    0    0   0   no    no t-0 fffff602b21b3bc0 uadmin
 4 fffff6000b883540  1f    1    0   0  yes    no t-0 fffff602b226f000 vscand
 5 fffff6000b87f580  1f    0    0  10   no    no t-1 fffff602b2432040 vscand
 6 fffff6000b87e080  1f    1    0   0  yes    no t-1 fffff602b12c37a0 vscand
 7 fffff6000b9f0a80  1f    0    0  11   no    no t-4 fffff602b24a4b80 vscand


The ONPROC threads are currently in vscan_drv_read:

fffff602b226f000 ONPROC   <NONE>                  1
                0xb
                0
                apic_setspl+0x5c
                do_splx+0x62
                apic_intr_exit+0x32
                hilevel_intr_epilog+0x123
                do_interrupt+0xfb
                _sys_rtt_ints_disabled+8
                tsc_read+3
                gethrtime+0xd
                pc_gethrestime+0x49
                0
                mutex_enter+0x10
                vscan_drv_read+0x8b <<<<<
                cdev_read+0x49
                spec_read+0x233
                fop_read+0xa7
                read+0x2b8
                read32+0x22
                sys_syscall32+0xff

 

Solution

Analysis of the vscand process core file confirmed that the Vscan service was in a waiting state, waiting for a response from the configured scan engine.

The Virus Scan request timeout messages appear only after vscand has waited 15 minutes for the scan engine system to respond.

During that time other requests queue up and their threads spin while waiting for the mutex, which causes the high CPU usage.

The customer was asked to contact the third-party scan engine manufacturer for help in understanding the delay in response and for the options available to address it.
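
Before involving the scan engine vendor, it can help to check from another host on the same network whether the scan engine accepts connections on its ICAP port and how long that takes. The following is a rough sketch under stated assumptions: it runs on a separate client machine (not on the appliance), the helper name `probe_scan_engine` is hypothetical, and it only times a plain TCP connect. It does not send an ICAP request, so it cannot show how long an actual scan takes.

```python
import socket
import time


def probe_scan_engine(host, port=1344, timeout=5.0):
    """Try one TCP connection and return (ok, elapsed_seconds, detail).

    Port 1344 is the ICAP port shown in this note's configuration.
    """
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - start, "connected"
    except OSError as exc:
        # Covers timeouts, refused connections and unreachable hosts.
        return False, time.monotonic() - start, str(exc)


if __name__ == "__main__":
    import sys

    # Pass the scan engine address on the command line; none is assumed here.
    if len(sys.argv) > 1:
        ok, elapsed, detail = probe_scan_engine(sys.argv[1])
        print(f"ok={ok} elapsed={elapsed:.3f}s detail={detail}")
```

A refused or timed-out connect here would match the "Connection refused" and "Connection timed out" messages logged by vscand. A fast connect with slow scans points at the engine's processing time instead.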



As a workaround, this customer has excluded certain file types from being scanned.

See Oracle's online documentation for details of how to set the file types to exclude:   http://docs.oracle.com/cd/E26502_01/html/E29031/vscanadm-1m.html

They found that they were able to reduce the timeouts after changing the file scanning to exclude the *.msf files (which are mail index files for Thunderbird).

From the appliance Command Line Interface (CLI), the vscanadm command was used to exclude these files. The resulting configuration is:

7110 CLI> confirm shell vscanadm show
max-size=20M
max-size-action=allow
types=-msf,+*

0:enable=on
0:host=192.x.x.44
0:port=1344
0:max-connection=200

1:enable=on
1:host=192.x.x.45
1:port=1344
1:max-connection=200
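
The effect of the `types=-msf,+*` setting above can be illustrated with a small model of the rule matching. This is only a sketch of our reading of the vscanadm documentation, not the appliance's implementation: it assumes rules are evaluated left to right, that the first matching rule decides, that matching is case-insensitive on the file extension, and that a file matching no rule is scanned. The function name `should_scan` is hypothetical.

```python
import os
from fnmatch import fnmatchcase


def should_scan(filename, types="-msf,+*"):
    """Return True if the file would be scanned under the given types list.

    Each rule is '+pattern' (include) or '-pattern' (exclude), where the
    pattern is compared with the file's extension.
    """
    ext = os.path.splitext(filename)[1].lstrip(".").lower()
    for rule in types.split(","):
        rule = rule.strip()
        if not rule:
            continue
        action, pattern = rule[0], rule[1:].lower()
        if action not in "+-":
            raise ValueError(f"rule must start with + or -: {rule!r}")
        if fnmatchcase(ext, pattern):
            # Assumption: the first matching rule wins.
            return action == "+"
    # Assumption: files that match no rule are scanned.
    return True


if __name__ == "__main__":
    for name in ("Inbox.msf", "report.pdf", "setup.exe"):
        print(name, "scan" if should_scan(name) else "skip")
```

Under these assumptions the order of the rules matters: with `+*,-msf` the `+*` rule would match every file first, so the `-msf` exclusion would never take effect.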

 

Also, depending on the virus scanner in use, some files are placed on the quarantine list, which can cause the CPU usage of vscand to grow.

In this case, the files listed in the virus scan logs were deleted.   Other virus scanners may flag other suspect files (to delete).

NAS# more vsc*
........
11/17/15 15:36:10 quarantine 85[/export/homedir/redirect/jewat/Downloads/GoToMeeting Launcher (6).exe.83zymj2.partial] 0 - 25[Win.Adware.Softpulse-223;]
11/17/15 15:36:36 quarantine 85[/export/homedir/redirect/jewat/Downloads/GoToMeeting Launcher (7).exe.opahu55.partial] 0 - 25[Win.Adware.Softpulse-223;]
11/17/15 15:36:45 quarantine 85[/export/homedir/redirect/jewat/Downloads/GoToMeeting Launcher (8).exe.31n3e1b.partial] 0 - 25[Win.Adware.Softpulse-223;]
11/17/15 15:43:58 quarantine 85[/export/homedir/redirect/jewat/Downloads/GoToMeeting Launcher (9).exe.yhn0wx3.partial] 0 - 25[Win.Adware.Softpulse-223;]
02/19/16 08:15:05 quarantine 81[/export/homedir/redirect/lehe/Documents/Systeembeheer/Software/wordview_nl-nl.exe] 0 - 23[Win.Trojan.Ramnit-6140;]
02/23/16 07:12:24 quarantine 84[/export/data/09 Algemeen/100 Postbussen/leon/Systeembeheer/Downloads/CF4-Generic.exe] 0 - 23[Win.Trojan.Ramnit-6286;]
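
To build a list of quarantined files for review, log entries like those above can be parsed and grouped by the reported violation. The sketch below is based only on the entry format visible in this excerpt, where each bracketed field appears to be prefixed by its length (for example `25[Win.Adware.Softpulse-223;]`). The names `parse_entry` and `group_by_violation` are illustrative, and the length prefixes are read but not verified.

```python
import re
from collections import defaultdict

# Format inferred from this note:
#   MM/DD/YY HH:MM:SS quarantine LEN[path] CODE - LEN[violation]
ENTRY = re.compile(
    r"^(?P<when>\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) quarantine "
    r"(?P<path_len>\d+)\[(?P<path>.*)\] (?P<code>-?\d+) - "
    r"(?P<vio_len>\d+)\[(?P<violation>.*)\]$"
)


def parse_entry(line):
    """Return (timestamp, path, violation), or None for other lines."""
    match = ENTRY.match(line.strip())
    if not match:
        return None
    return match.group("when"), match.group("path"), match.group("violation")


def group_by_violation(lines):
    """Map each violation name to the list of quarantined paths."""
    groups = defaultdict(list)
    for line in lines:
        entry = parse_entry(line)
        if entry:
            _, path, violation = entry
            groups[violation].append(path)
    return dict(groups)


# Sample lines taken from the log excerpt in this note.
SAMPLE = [
    "11/17/15 15:36:10 quarantine 85[/export/homedir/redirect/jewat/Downloads/GoToMeeting Launcher (6).exe.83zymj2.partial] 0 - 25[Win.Adware.Softpulse-223;]",
    "02/19/16 08:15:05 quarantine 81[/export/homedir/redirect/lehe/Documents/Systeembeheer/Software/wordview_nl-nl.exe] 0 - 23[Win.Trojan.Ramnit-6140;]",
]

if __name__ == "__main__":
    for violation, paths in group_by_violation(SAMPLE).items():
        print(violation)
        for path in paths:
            print("   ", path)
```

The resulting list is intended for review only. Whether a quarantined file should be deleted remains a decision for the administrator.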

 

References

<NOTE:1582979.1> - Sun Storage 7000 Unified Storage System: SMB I/O Stalls when ICAP Virus Scan Engine ( vscan ) Scans File
<BUG:19864355> - VSCAN IS HOGGING THE CPU
<BUG:22122994> - SMB CREATE REQUESTS TIMEOUT BEFORE VIRUS SCANNING IS COMPLETED.

Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.