Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-72-1632242.1
Update Date:2016-02-12
Keywords:

Solution Type  Problem Resolution Sure

Solution  1632242.1 :   Snapshot-based Backup via NFS Over Infiniband Network Will Freeze The Node  


Related Items
  • Exadata Database Machine V2
Related Categories
  • PLA-Support>Eng Systems>Exadata/ODA/SSC>Oracle Exadata>DB: Exadata_EST




In this Document
Symptoms
Cause
Solution
References


Created from <SR 3-7822700871>

Applies to:

Exadata Database Machine V2 - Version All Versions and later
Information in this document applies to any platform.

Symptoms

The backup server goes into a hung state when performing a snapshot-based backup from an Exadata Compute Node via NFS over the InfiniBand network. The same backup procedure works fine over a standard 10Gb Ethernet card with default settings.

While the backup is running, the messages file on the node shows the following stack information:

Aug 28 10:27:43 exahostdb01 lvm[79582]: Monitoring snapshot VGExaDb-u01_snap <<<
..
Aug 28 10:31:14 exahostdb01 kernel: nfs: server exaNFSbackup not responding, still trying
Aug 28 10:39:45 exahostdb01 kernel: nfs: server exaNFSbackup OK
Aug 28 10:55:00 exahostdb01 kernel: nfs: server exaNFSbackup not responding, still trying
...
Aug 28 11:01:10 exahostdb01 kernel: nfs: server exaNFSbackup OK
Aug 28 11:13:00 exahostdb01 kernel: nfs: server exaNFSbackup not responding, still trying
Aug 28 11:13:53 exahostdb01 kernel: nfs: server exaNFSbackup OK
Aug 28 12:41:20 exahostdb01 kernel: RDS/IB: re-connect to 169.XXX.XXX.XXX is stalling for more than 1 min...(drops=12 err=0)
Aug 28 12:41:20 exahostdb01 kernel: RDS/IB: re-connect to 169.XXX.XXX.XXX is stalling for more than 1 min...(drops=12 err=0)
Aug 28 12:41:58 exahostdb01 kernel: RDS/IB: re-connect to 10.XXX.XXX.XXX is stalling for more than 1 min...(drops=1 err=0)
Aug 28 14:13:49 exahostdb01 kernel: RDS/IB: connected to 10.XXX.XXX.XXX version 3.1
Aug 28 14:16:47 exahostdb01 kernel: RDS/IB: connected to 169.XXX.XXX.XXX version 3.1
Aug 28 14:16:47 exahostdb01 kernel: RDS/IB: connected to 169.XXX.XXX.XXX version 3.1
....
Sep  5 12:16:12 exahostdb01 kernel: INFO: task lsof:65691 blocked for more than 120 seconds.
Sep  5 12:16:12 exahostdb01 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep  5 12:16:12 exahostdb01 kernel: lsof          D 0000000000000000     0 65691  65680 0x00000080
Sep  5 12:16:12 exahostdb01 kernel:  ffff88116ec7bc08 0000000000000082 0000000000000000 ffffffffadf60c48
Sep  5 12:16:12 exahostdb01 kernel:  ffff88355e6ea080 ffffffff81aae4c0 ffff88355e6ea450 0000000176b6fa52
Sep  5 12:16:12 exahostdb01 kernel:  000000006ec7bc98 0000000000000000 0000000000000000 ffff88355e6ea080
Sep  5 12:16:12 exahostdb01 kernel: Call Trace:
Sep  5 12:16:12 exahostdb01 kernel:  [<ffffffff814569cc>] io_schedule+0x42/0x5c
Sep  5 12:16:12 exahostdb01 kernel:  [<ffffffffa0614b02>] nfs_wait_bit_uninterruptible+0xe/0x12 [nfs]
Sep  5 12:16:12 exahostdb01 kernel:  [<ffffffff81456efb>] __wait_on_bit+0x4a/0x7c
Sep  5 12:16:12 exahostdb01 kernel:  [<ffffffffa0614af4>] ? nfs_wait_bit_uninterruptible+0x0/0x12 [nfs]
Sep  5 12:16:12 exahostdb01 kernel:  [<ffffffffa0614af4>] ? nfs_wait_bit_uninterruptible+0x0/0x12 [nfs]
Sep  5 12:16:12 exahostdb01 kernel:  [<ffffffff81456fa0>] out_of_line_wait_on_bit+0x73/0x80
Sep  5 12:16:12 exahostdb01 kernel:  [<ffffffff8107706d>] ? wake_bit_function+0x0/0x2f
Sep  5 12:16:12 exahostdb01 kernel:  [<ffffffffa0614af2>] nfs_wait_on_request+0x2b/0x2d [nfs]
Sep  5 12:16:12 exahostdb01 kernel:  [<ffffffffa0618a6c>] nfs_sync_mapping_wait+0xec/0x1fa [nfs]
Sep  5 12:16:12 exahostdb01 kernel:  [<ffffffffa0619073>] nfs_write_mapping+0x77/0x9e [nfs]
Sep  5 12:16:12 exahostdb01 kernel:  [<ffffffff810432d6>] ? should_resched+0xe/0x2f
Sep  5 12:16:12 exahostdb01 kernel:  [<ffffffffa06190b4>] nfs_wb_nocommit+0x1a/0x1c [nfs]
Sep  5 12:16:12 exahostdb01 kernel:  [<ffffffffa060e184>] nfs_getattr+0x61/0xef [nfs]
Sep  5 12:16:12 exahostdb01 kernel:  [<ffffffff8111ea7b>] vfs_getattr+0x4c/0x69
Sep  5 12:16:12 exahostdb01 kernel:  [<ffffffff8111eae8>] vfs_fstatat+0x50/0x67
Sep  5 12:16:12 exahostdb01 kernel:  [<ffffffff8111ebe5>] vfs_stat+0x1b/0x1d
Sep  5 12:16:12 exahostdb01 kernel:  [<ffffffff8111ec06>] sys_newstat+0x1f/0x39
Sep  5 12:16:12 exahostdb01 kernel:  [<ffffffff810a9d23>] ? audit_syscall_entry+0x103/0x12f
Sep  5 12:16:12 exahostdb01 kernel:  [<ffffffff81011db2>] system_call_fastpath+0x16/0x1b
...
Sep 10 16:00:39 exahostdb01 kernel: ixgbe 0000:20:00.0: eth0: NIC Link is Up 1 Gbps, Flow Control: RX/TX
Sep 10 16:00:39 exahostdb01 kernel: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
Sep 10 16:00:41 exahostdb01 kernel: ib0: packet len 2398 (> 2048) too long to send, dropping
Sep 10 16:00:41 exahostdb01 last message repeated 2 times
Sep 10 16:11:34 exahostdb01 kernel: ixgbe 0000:30:00.1: eth5: NIC Link is Down <<<<
bondib0   Link encap:InfiniBand  HWaddr 80:00:00:48:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00  (ib0 + ib1)
         inet addr:10.x.x.x  Bcast:10.x.x.255  Mask:255.255.255.0
         inet6 addr: fe80::221:2800:1fc:b3ed/64 Scope:Link
         UP BROADCAST RUNNING MASTER MULTICAST  MTU:65520  Metric:1

 

 



Cause

The stack shows that the NFS process was waiting on task status in the "nfs_wait_bit_uninterruptible" function, stuck in an uninterruptible state because communication to the source location was failing. The logs above clearly show that the underlying IB devices that are part of bondib0 go down intermittently and then rejoin. This is therefore a communication issue on the client side: network communication to the source over bondib0 is being lost.

This is because the MTU of the IB device was left at the default 64K size. This is a common issue with IB when the MTU is set to a large value.
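As a quick check before changing anything, the active MTU on the bonded IB interface and its slaves can be inspected. This is only a sketch; the interface names (bondib0, ib0, ib1) are the Exadata defaults seen in the output above and may differ on your system:

```shell
# Show the current MTU on the IB bond and its slave interfaces
# (bondib0/ib0/ib1 are the default Exadata interface names)
ip link show bondib0 | grep -o 'mtu [0-9]*'
ip link show ib0     | grep -o 'mtu [0-9]*'
ip link show ib1     | grep -o 'mtu [0-9]*'
```

An MTU of 65520 on bondib0, as in the ifconfig output above, indicates the default connected-mode IPoIB setting that triggers this issue.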


 

Solution

1) Reduce the MTU of the IB device to 7000 and restart the network service.
2) Remount the shares exported from the NFS server.
3) Take the backup over the IB devices.
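The three steps above can be sketched as follows. This is a hedged outline, not the supported procedure (see note 1586212.1 for that): it assumes the default Exadata interface names (bondib0, ib0, ib1), the usual ifcfg file layout under /etc/sysconfig/network-scripts, and example NFS names (exaNFSbackup:/export/backup mounted at /backup):

```shell
# 1) Reduce the MTU on the IB bond and its slaves to 7000
#    (edits the MTU= line in each ifcfg file; paths assume the
#    standard /etc/sysconfig/network-scripts layout)
for i in bondib0 ib0 ib1; do
    sed -i 's/^MTU=.*/MTU=7000/' /etc/sysconfig/network-scripts/ifcfg-$i
done
service network restart

# 2) Remount the NFS share exported by the backup server
#    (server, export path, and mount point are example names)
umount /backup
mount -t nfs exaNFSbackup:/export/backup /backup

# 3) Re-run the snapshot-based backup over the IB network as usual
```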

References

<NOTE:1546861.1> - [Linux OS] System Hung with Large Numbers of Page Allocation Failures with "order:5" on Exadata Environments
<NOTE:1586212.1> - How to Change MTU Size in Exadata Environment

Attachments
This solution has no attachment
  Copyright © 2018 Oracle, Inc.  All rights reserved.