Bug 1329202

Summary: Adding 3rd node to the hosted engine cluster fails with "certificate enrollment failed" error
Product: [oVirt] ovirt-engine Reporter: Bhaskarakiran <byarlaga>
Component: BLL.HostedEngine Assignee: Simone Tiraboschi <stirabos>
Status: CLOSED WORKSFORME QA Contact: meital avital <mavital>
Severity: high Docs Contact:
Priority: high    
Version: 3.6.5 CC: bugs, byarlaga, mzywusko, oourfali, rnachimu, sabose, sasundar, ylavi
Target Milestone: ovirt-3.6.7 Flags: ylavi: ovirt-3.6.z?
rule-engine: planning_ack?
rule-engine: devel_ack?
rule-engine: testing_ack?
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-06-02 07:29:14 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Integration RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1258386    
Attachments:
Description Flags
vdsm log none
engine log none

Description Bhaskarakiran 2016-04-21 12:07:25 UTC
Created attachment 1149435: vdsm log

Description of problem:
======================

Adding the 3rd node always fails with a "certificate enrollment failed" error.


Version-Release number of selected component (if applicable):
=============================================================
3.6.5.3-0.1.el6

How reproducible:
=================
100%

Steps to Reproduce:
Add the 3rd node to the hosted engine cluster

RHEL RPMs installed:

[root@rhsqa5 ~]# rpm -qa |grep rhev
fence-agents-rhevm-4.0.11-27.el7_2.7.x86_64
qemu-kvm-tools-rhev-2.3.0-31.el7_2.10.x86_64
qemu-kvm-rhev-2.3.0-31.el7_2.10.x86_64
qemu-img-rhev-2.3.0-31.el7_2.10.x86_64
rhevm-sdk-python-3.6.3.0-1.el7ev.noarch
libcacard-rhev-2.3.0-31.el7_2.10.x86_64
qemu-kvm-common-rhev-2.3.0-31.el7_2.10.x86_64


Actual results:


Expected results:


Additional info:

Attaching the vdsm log.

Comment 1 Bhaskarakiran 2016-04-21 15:32:38 UTC
I have tried a couple of times, but it still fails with the same error. This is blocking us from starting ROBO testing.

Comment 2 Ramesh N 2016-04-22 06:17:44 UTC
Can you attach engine.log from the hosted engine VM?

Comment 3 Bhaskarakiran 2016-04-22 06:30:31 UTC
Created attachment 1149646: engine log

Comment 4 Sahina Bose 2016-04-22 06:42:05 UTC
As a workaround, you can copy /etc/pki/ovirt-engine/serial.txt.old to /etc/pki/ovirt-engine/serial.txt.
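
A minimal sketch of this workaround, assuming it is run as root on the engine VM and that serial.txt.old still exists and is non-empty. The function name restore_serial and its optional directory argument are illustrative only and are not part of oVirt; the argument exists so the copy can be tried against a scratch directory first.

```shell
#!/bin/sh
# Workaround sketch: restore the CA serial file from its .old copy.
# restore_serial is a hypothetical helper, not an oVirt command.
restore_serial() {
    # Default to the path named in this bug; a different directory can be
    # passed for a dry run.
    pki_dir="${1:-/etc/pki/ovirt-engine}"

    # Refuse to copy if the .old file is missing or empty.
    if [ ! -s "$pki_dir/serial.txt.old" ]; then
        echo "no usable $pki_dir/serial.txt.old" >&2
        return 1
    fi

    # -p preserves mode, ownership and timestamps of the source file.
    cp -p "$pki_dir/serial.txt.old" "$pki_dir/serial.txt"
}
```

Calling `restore_serial` with no argument acts on /etc/pki/ovirt-engine; host addition can then be retried.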

Comment 5 Ramesh N 2016-04-22 06:48:23 UTC
The file '/etc/pki/ovirt-engine/serial.txt' got deleted during host addition. As a result, the Sign Certificate request is failing. I can see the following error in the engine log:

2016-04-22 02:14:13,576 ERROR [org.ovirt.engine.core.utils.hostinstall.OpenSslCAWrapper] (VdsDeploy) [1d49336e] Sign Certificate request failed with exit code 1
2016-04-22 02:14:13,576 ERROR [org.ovirt.engine.core.utils.hostinstall.OpenSslCAWrapper] (VdsDeploy) [1d49336e] Sign Certificate request script errors:
Using configuration from openssl.conf
unable to load number from serial.txt
error while loading serial number
140695882463048:error:0D066096:asn1 encoding routines:a2i_ASN1_INTEGER:short line:f_int.c:215:
Cannot sign certificate


We are seeing this issue frequently during host addition in hosted engine setups with gluster.
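
Since the log shows OpenSSL failing to load a number from serial.txt, a quick check of that file before retrying host addition might look like the sketch below. It assumes the serial file is expected to hold a hexadecimal number on its first line; check_serial is a hypothetical helper name, not an oVirt or OpenSSL tool.

```shell
#!/bin/sh
# check_serial (hypothetical helper): succeeds only if the serial file
# exists and its first line consists solely of hex digits.
check_serial() {
    serial_file="${1:-/etc/pki/ovirt-engine/serial.txt}"

    if [ ! -f "$serial_file" ]; then
        echo "missing: $serial_file" >&2
        return 1
    fi

    # Only the first line is inspected.
    first_line=$(head -n 1 "$serial_file")

    case "$first_line" in
        '')
            echo "empty first line: $serial_file" >&2
            return 1
            ;;
        *[!0-9A-Fa-f]*)
            echo "not a hex number: $serial_file" >&2
            return 1
            ;;
    esac
    return 0
}
```

For example, `check_serial || echo "serial.txt needs attention"` on the engine VM reports whether the file is in a state that matches the error above.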

Comment 8 Sandro Bonazzola 2016-04-27 12:54:43 UTC
Bhaskarakiran, please provide a full sos report from the 3rd host and from the engine VM, thanks.

Also, please note that you're using 3.6.3 on the hosts (rhevm-sdk-python-3.6.3.0-1.el7ev.noarch) and 3.6.5 in the engine (3.6.5.3-0.1.el6).

Please try to reproduce with hosts and engine aligned to the same version.
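
A rough way to compare the two version strings quoted above is sketched below. It assumes "aligned" means the first three dot-separated components match (for example 3.6.5); zstream and same_zstream are hypothetical helper names used only for illustration.

```shell
#!/bin/sh
# zstream (hypothetical helper): print the first three dot-separated
# components of a version string, e.g. 3.6.5.3-0.1.el6 -> 3.6.5
zstream() {
    printf '%s\n' "$1" | cut -d. -f1-3
}

# same_zstream (hypothetical helper): succeed when both version strings
# share the same first three components.
same_zstream() {
    [ "$(zstream "$1")" = "$(zstream "$2")" ]
}

# Values taken from this report: host-side package vs. engine version.
same_zstream "3.6.3.0-1.el7ev" "3.6.5.3-0.1.el6" \
    || echo "hosts and engine are on different versions"
```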

Comment 9 Sandro Bonazzola 2016-04-27 13:06:16 UTC
Bhaskarakiran, also, how were you trying to add this 3rd host?
Running hosted-engine --deploy on it?
Just adding it using the web UI as a common host?
Using ansible / gdeploy?

Comment 10 Bhaskarakiran 2016-04-29 10:45:48 UTC
Sandro, I used hosted-engine --deploy to add the 3rd host. The setup is out now; I will try on a fresh setup and provide the logs.

Comment 11 Red Hat Bugzilla Rules Engine 2016-05-05 10:10:10 UTC
Bug tickets must have version flags set prior to targeting them to a release. Please ask maintainer to set the correct version flags and only then set the target milestone.

Comment 12 Bhaskarakiran 2016-05-20 11:29:12 UTC
I see that this is hit if a single network is used for both virt and gluster. I tried with separate networks for virt and gluster and didn't see this.

Comment 13 Sahina Bose 2016-06-02 07:29:14 UTC
Closing this, as the recommendation in the case of HC is to use multiple networks to separate virt and gluster traffic.