Sun Microsystems, Inc.  Oracle System Handbook - ISO 7.0 May 2018 Internal/Partner Edition

Asset ID: 1-72-2200379.1
Update Date: 2016-11-03
Keywords:

Solution Type  Problem Resolution Sure

Solution  2200379.1 :   Compute Node reprovisioning fails at 92%  


Related Items
  • Private Cloud Appliance X5-2 Hardware
Related Categories
  • PLA-Support>Eng Systems>Exalogic/OVCA>Oracle Virtual Compute Appliance>DB: OVCA_EST




In this Document
Symptoms
Changes
Cause
Solution
References


Created from <SR 3-13135348451>

Applies to:

Private Cloud Appliance X5-2 Hardware - Version All Versions to All Versions [Release All Releases]
Information in this document applies to any platform.

Symptoms

Compute Node reprovisioning fails at 92%.

Changes

The repository 'Rack1-Repository' was deleted.
The LUN 'iscsi_repository1' on the internal ZFS Storage Appliance, which backed this repository, was deleted as well.

Cause

The repository named 'Rack1-Repository' and the LUN named 'iscsi_repository1' are mandatory. Both must exist under exactly these names and must not be deleted.
When they are missing, reprovisioning produces the following entries in /var/log/ovca.log on the active Management Node:

Processing server ovcacn28r1.^M
Processing Generic Local Storage Array @ ovcacn28r1.^M
Found volume : 3600605b00a76f9c01f964660079842d1.^M
Volume 3600605b00a76f9c01f964660079842d1 does not contain a filesystem - creating.^M
Fileserver : Local FS ovcacn28r1.^M
Creating filesystem.^M
Creating repository ovcacn28r1-localfsrepo.^M
Presenting to server ovcacn28r1.^M

[2016-10-16 14:32:21 575687] DEBUG (service:114) call start: register_discovered_node('00:10:e0:96:03:29', '192.168.4.107')
[2016-10-16 14:32:21 575687] DEBUG (service:116) call complete: register_discovered_node
[2016-10-16 14:32:21 575694] DEBUG (service:114) call start: register_discovered_node('00:10:e0:96:03:29', '192.168.4.107')
[2016-10-16 14:32:21 575694] DEBUG (service:116) call complete: register_discovered_node
[2016-10-16 14:32:24 555687] DEBUG (nodestateserver:738) Examining node: ilom 00:10:e0:95:05:29 initializing_stage_create_shared_filesystem
[2016-10-16 14:32:24 555687] INFO (utils:85) Polling create_shared_filesystem: trial # 1, elapsed time: 0 seconds.
[2016-10-16 14:32:26 555687] INFO (nodestateserver:357) Calling OVM shell script create_shared_filesystem
[2016-10-16 14:32:26 555687] INFO (nodestateserver:360) create_shared_filesystem for ovcacn28r1 /dev/disk/by-id/scsi-3None
[2016-10-16 14:32:26 555687] WARNING (ovm:60) calling ovm_shell script create_shared_filesystem.py with args ('--server=ovcacn28r1', '--device=/dev/disk/by-id/scsi
[2016-10-16 14:32:28 555687] DEBUG (nodestateserver:364) create_shared_filesystem returned: ******^M
Password:****** for server with servername ovcacn28r1 and devicename /dev/disk/by-id/scsi-3None.^M
Processing server ovcacn26r1.^M
Processing server ovcacn13r1.^M

<..>

 

Eventually the polling times out and the node is marked DEAD:

[2016-10-16 14:37:24 555687] WARNING (nodestateserver:367) Shared filesystem was not created successfully on server: ovcacn28r1
[2016-10-16 14:37:29 555687] DEBUG (nodestateserver:738) Examining node: ilom 00:10:e0:95:05:29 initializing_stage_create_shared_filesystem
[2016-10-16 14:37:29 555687] INFO (utils:85) Polling create_shared_filesystem: trial # 37, elapsed time: 305 seconds.
[2016-10-16 14:37:31 555687] INFO (nodestateserver:357) Calling OVM shell script create_shared_filesystem
[2016-10-16 14:37:31 555687] INFO (nodestateserver:360) create_shared_filesystem for ovcacn28r1 /dev/disk/by-id/scsi-3None
[2016-10-16 14:37:31 555687] WARNING (ovm:60) calling ovm_shell script create_shared_filesystem.py with args ('--server=ovcacn28r1', '--device=/dev/disk/by-id/scsi-3None')
[2016-10-16 14:37:32 555687] DEBUG (nodestateserver:364) create_shared_filesystem returned: ******^M
Password:****** for server with servername ovcacn28r1 and devicename /dev/disk/by-id/scsi-3None.^M
Processing server ovcacn26r1.^M
Processing server ovcacn13r1.^M
Processing server ovcacn11r1.^M
Processing server ovcacn07r1.^M
Processing server ovcacn29r1.^M
Processing server ovcacn10r1.^M
Processing server ovcacn27r1.^M
Processing server ovcacn12r1.^M
Processing server ovcacn09r1.^M
Processing server ovcacn08r1.^M
Processing server ovcacn14r1.^M
Processing server ovcacn28r1.^M
Found LUN SUN (1).^M
Found LUN Rack1-Repository2.^M
Found LUN Rack1-Repository3.^M
Found LUN Rack1-Repository4.^M
Found LUN Rack1-Repository5.^M
Found LUN Rack1-Repository6.^M
Found LUN Rack1-Repository7.^M
Found LUN Rack1-Repository8.^M
Found LUN 10.196.32.110-112-ASM1.^M
Found LUN 10.196.32.110-112-ASM2.^M
Found LUN 10.196.32.110-112-ASM3.^M
Found LUN 10.196.32.110-112-ASM4.^M
Found LUN 10.196.32.110-112-ASM5.^M
Found LUN Rack1-Repository9.^M
Found LUN Rack1-Repository10.^M
Found LUN Rack1-Repository11.^M
Found LUN Test_LUN_SUN.^M
Found LUN 3600605b00a76f9c01f964660079842d1.^M

[2016-10-16 14:37:32 555687] WARNING (nodestateserver:367) Shared filesystem was not created successfully on server: ovcacn28r1
[2016-10-16 14:37:32 555687] ERROR (utils:106) Run # 37 after 305 secs: Function create_shared_filesystem.
Traceback (most recent call last):
File "/usr/lib/python2.6/site-packages/ovca/daemon/utils.py", line 98, in new_function
% (func.__name__, max_time, max_trial))
JobTimeoutError: Polling create_shared_filesystem exceeds 300 seconds and max trial count 5. Breaking loop.
[2016-10-16 14:37:32 555687] DEBUG (nodestateserver:1368) DEAD node: ilom 00:10:e0:95:05:29. Last good state: initializing_stage_create_shared_filesystem
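
The failure signatures in the excerpts above can be searched for mechanically. The following is a minimal sketch, not an Oracle-supplied tool: the function name scan_ovca_log, the result keys, and the SAMPLE text are illustrative assumptions, and the patterns are copied only from the log lines shown in this document. It is written for Python 3, which may differ from the interpreter installed on the Management Node.

```python
import re

# Patterns copied from the log excerpts in this note. All names below are
# illustrative assumptions, not part of any Oracle tool.
FS_NOT_CREATED = re.compile(
    r"Shared filesystem was not created successfully on server: (\S+)")
NONE_DEVICE = re.compile(
    r"create_shared_filesystem for (\S+) /dev/disk/by-id/scsi-3None\b")
TIMEOUT = re.compile(
    r"JobTimeoutError: Polling create_shared_filesystem exceeds")

# Three lines taken from the excerpts above, used as sample input.
SAMPLE = """\
[2016-10-16 14:37:31 555687] INFO (nodestateserver:360) create_shared_filesystem for ovcacn28r1 /dev/disk/by-id/scsi-3None
[2016-10-16 14:37:32 555687] WARNING (nodestateserver:367) Shared filesystem was not created successfully on server: ovcacn28r1
JobTimeoutError: Polling create_shared_filesystem exceeds 300 seconds and max trial count 5. Breaking loop.
"""


def scan_ovca_log(text):
    """Collect the servers that show each failure signature in the log text."""
    result = {"fs_not_created": set(), "none_device": set(), "timed_out": False}
    for line in text.splitlines():
        match = FS_NOT_CREATED.search(line)
        if match:
            result["fs_not_created"].add(match.group(1))
        match = NONE_DEVICE.search(line)
        if match:
            result["none_device"].add(match.group(1))
        if TIMEOUT.search(line):
            result["timed_out"] = True
    return result


print(scan_ovca_log(SAMPLE))
```

To check a real log, the contents of /var/log/ovca.log from the active Management Node would be read into a string and passed to scan_ovca_log in place of SAMPLE. A server that appears under both fs_not_created and none_device matches the symptom described here.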

 

Solution

- On the internal ZFS Storage Appliance, create a LUN (300 GB - 1 TB) with the following settings:
- name: iscsi_repository1
- initiatorgroup: OVM-igroup
- targetgroup: OVM

IMPORTANT: The new LUN MUST be created with exactly the names stated above.

- Refresh the storage (SAN Server: OVCA_ZFSSA_Rack1) in Oracle VM Manager (OVMM).
- Create a repository named 'Rack1-Repository' on the newly created LUN.
- Make sure the new repository is discovered on all compute nodes in the server pool.
- Restart the reprovisioning.
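
Because the names must match exactly, it can help to compare the intended settings against the required values before creating the LUN. The sketch below is only a checklist helper under stated assumptions: the dictionary keys and the function check_lun_settings are hypothetical and do not correspond to any ZFS Storage Appliance or Oracle VM API. Only the required values themselves come from this document.

```python
# Required values taken from the Solution steps in this note. The dictionary
# layout and function name are hypothetical, for illustration only.
REQUIRED_LUN_SETTINGS = {
    "name": "iscsi_repository1",
    "initiatorgroup": "OVM-igroup",
    "targetgroup": "OVM",
}
REQUIRED_REPOSITORY_NAME = "Rack1-Repository"


def check_lun_settings(settings, repository_name):
    """Return a list of mismatches; an empty list means all names match."""
    problems = []
    for key, expected in REQUIRED_LUN_SETTINGS.items():
        actual = settings.get(key)
        if actual != expected:  # comparison is case-sensitive on purpose
            problems.append(f"{key}: expected {expected!r}, got {actual!r}")
    if repository_name != REQUIRED_REPOSITORY_NAME:
        problems.append(
            f"repository: expected {REQUIRED_REPOSITORY_NAME!r}, "
            f"got {repository_name!r}")
    return problems


planned = {
    "name": "iscsi_repository1",
    "initiatorgroup": "OVM-igroup",
    "targetgroup": "OVM",
}
print(check_lun_settings(planned, "Rack1-Repository"))
```

A name that differs only in case or spelling is reported as a mismatch, which is in line with the requirement that the exact names be used.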

References

<BUG:24912612> - PROVISIONING OF COMPUTE NODE FAILS AT 92%

  Copyright © 2018 Oracle, Inc.  All rights reserved.