Bug 1254523 - Cannot add host when setting up Hosted Engine through a VLAN or bond network
Summary: Cannot add host when setting up Hosted Engine through a VLAN or bond network
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-node-plugin-hosted-engine
Version: 3.5.4
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 3.5.4
Assignee: Douglas Schilling Landgraf
QA Contact: Virtualization Bugs
URL:
Whiteboard: node
Depends On:
Blocks: 1250199
 
Reported: 2015-08-18 10:17 UTC by wanghui
Modified: 2016-03-24 08:12 UTC
CC List: 17 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-08-25 15:34:16 UTC
oVirt Team: Node
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
HE bond log files (267.42 KB, application/x-gzip)
2015-08-18 10:17 UTC, wanghui
HE vlan log files (316.38 KB, application/x-gzip)
2015-08-18 10:17 UTC, wanghui

Description wanghui 2015-08-18 10:17:13 UTC
Created attachment 1064276 [details]
HE bond log files

Description of problem:
Adding the host fails when setting up Hosted Engine through a VLAN or bonded network. The following errors are reported when adding the host to the cluster:

[ERROR] Cannot automatically add the host to cluster Default: Cannot add Host. Connecting to host via SSH has failed, verify that the host is reachable (IP address, routable address etc.) You may refer to the engine.log file for further details.
[ERROR] Failed to execute stage 'Closing up': Cannot add the host to cluster Default

Version-Release number of selected component (if applicable):
rhev-hypervisor6-6.7-20150813.0.el6ev
ovirt-node-3.2.3-18.el6.noarch
ovirt-node-plugin-hosted-engine-0.2.0-18.0.el6ev.noarch
ovirt-hosted-engine-setup-1.2.5.3-1.el6ev.noarch

How reproducible:
100%

Steps to Reproduce:
Scenario 1:
1. Clean install rhev-hypervisor6-6.7-20150813.0.el6ev
2. Create the network through a bond:

# cat /proc/net/bonding/bond1 
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: None
Currently Active Slave: p3p1
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0

Slave Interface: p3p1
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:1b:21:36:79:f0
Slave queue ID: 0

Slave Interface: p3p2
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:1b:21:36:79:f1
Slave queue ID: 0

3. Set up the hosted engine using the OVA type.
4. Set up the VM as RHEV-M.
5. Select the default cluster when adding the host:
Enter the name of the cluster to which you want to add the host (Default) [Default]:

Scenario 2:
1. Clean install rhev-hypervisor6-6.7-20150813.0.el6ev
2. Create the network through a VLAN
3. Set up the hosted engine using the OVA type.
4. Set up the VM as RHEV-M.
5. Select the default cluster when adding the host:
Enter the name of the cluster to which you want to add the host (Default) [Default]:

Actual results:
1. The following errors are reported:
[ERROR] Cannot automatically add the host to cluster Default: Cannot add Host. Connecting to host via SSH has failed, verify that the host is reachable (IP address, routable address etc.) You may refer to the engine.log file for further details.
[ERROR] Failed to execute stage 'Closing up': Cannot add the host to cluster Default

Expected results:
1. Hosted Engine setup through a VLAN or bond network succeeds.

Additional info:
No such issue occurs when using em1 as the network interface.

Comment 1 wanghui 2015-08-18 10:17:48 UTC
Created attachment 1064277 [details]
HE vlan log files

Comment 3 Sandro Bonazzola 2015-08-18 13:01:38 UTC
2015-08-17 05:10:56 DEBUG otopi.plugins.ovirt_hosted_engine_setup.engine.add_host add_host._closeup:654 Cannot add the host to cluster Default
Traceback (most recent call last):
  File "/usr/share/ovirt-hosted-engine-setup/plugins/ovirt-hosted-engine-setup/engine/add_host.py", line 645, in _closeup
  File "/usr/lib/python2.6/site-packages/ovirtsdk/infrastructure/brokers.py", line 13280, in add
  File "/usr/lib/python2.6/site-packages/ovirtsdk/infrastructure/proxy.py", line 88, in add
  File "/usr/lib/python2.6/site-packages/ovirtsdk/infrastructure/proxy.py", line 118, in request
  File "/usr/lib/python2.6/site-packages/ovirtsdk/infrastructure/proxy.py", line 146, in __doRequest
  File "/usr/lib/python2.6/site-packages/ovirtsdk/web/connection.py", line 134, in doRequest
RequestError: 
status: 409
reason: Conflict
detail: Cannot add Host. SSH authentication failed, verify authentication parameters are correct (Username/Password, public-key etc.) You may refer to the engine.log file for further details.


Not sure what Conflict means here. Juan?

Comment 4 Juan Hernández 2015-08-18 13:26:17 UTC
The 409 HTTP code and the "Conflict" reason are the translation into HTTP terms of the error type associated with that error message. From ErrorMessage.java:

  VDS_CANNOT_AUTHENTICATE_TO_SERVER(ErrorType.CONFLICT)

In theory "conflict" means that there is a conflict between the requested action and the state of the system, and that the conflict can be resolved by the user:

  http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html - 10.4.10

In this particular case I think you shouldn't focus on that, but rather on what the error message is saying.
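
As a minimal sketch of where this surfaces (using the oVirt Python SDK v3 API that appears in the traceback above; the URL, credentials, and host details are placeholders, not values from this report), the add-host call and the resulting RequestError look roughly like this:

# Minimal sketch, not the actual hosted-engine-setup code: reproduces the
# shape of the hosts.add() call from add_host._closeup and shows where the
# 409 surfaces. All names, addresses, and credentials are placeholders.
from ovirtsdk.api import API
from ovirtsdk.xml import params
from ovirtsdk.infrastructure.errors import RequestError

api = API(url='https://engine.localdomain/api',
          username='admin@internal',
          password='secret',
          insecure=True)
try:
    api.hosts.add(params.Host(
        name='hosted_engine_1',
        address='node.localdomain',   # the engine must be able to reach this
        cluster=api.clusters.get(name='Default'),
        root_password='secret'))
except RequestError as e:
    # For this bug: e.status == 409, e.reason == 'Conflict', and e.detail
    # carries the "SSH authentication failed" message quoted above.
    print('%s %s: %s' % (e.status, e.reason, e.detail))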

Comment 5 Douglas Schilling Landgraf 2015-08-21 16:09:45 UTC
Hi wanghui,

Could you please share the log from engine vm?  (/var/log/ovirt-engine/engine.log)

I was able to reproduce your report only when the hostname of RHEV-H was 'localhost.localdomain', so the engine was not able to communicate with RHEV-H when it tried to connect via SSH as root. It would be nice to double-check that the RHEV-H hostname is reachable from RHEV-M.


Steps I took
----------------------------
#1 Installed rhev-hypervisor6-6.7-20150813.0.el6ev
#2 (TUI) Configured the hostname in the Network tab
#3 (TUI) Configured bond0 and assigned eth0 and eth1 to it
#4 (TUI) In the Hosted Engine tab, provided http://address/rhel67.iso
         and started the setup

#5 Installed RHEL 6.7 and set up rhevm 3.5.4.2-1.3.el6ev
  <At this stage node.localdomain and engine.localdomain are configured
   in both machines in /etc/hosts; see the sketch below>

#6 Asked hosted-engine to connect to the engine and finish the setup.
   Everything should be okay. If I skip step #2, I can reproduce your
   report of "status: 409" as described in comment #3.
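
A concrete sketch of the step #5 state (the node address is the one shown in the ifconfig output below; the engine VM address is a placeholder, marked XX): both machines carry /etc/hosts entries along these lines.

# cat /etc/hosts
127.0.0.1        localhost localhost.localdomain
192.168.122.30   node.localdomain node
192.168.122.XX   engine.localdomain engine   # engine VM address (placeholder)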

Additional data (from working scenario)
===========================================
# cat /etc/redhat-release 
Red Hat Enterprise Virtualization Hypervisor release 6.7 (20150813.0.el6ev)

[root@node ~]# hosted-engine --vm-status


--== Host 1 status ==--

Status up-to-date                  : True
Hostname                           : node.localdomain
Host ID                            : 1
Engine status                      : {"health": "good", "vm": "up", "detail": "up"}
Score                              : 2400
Local maintenance                  : False
Host timestamp                     : 10596
Extra metadata (valid at timestamp):
	metadata_parse_version=1
	metadata_feature_version=1
	timestamp=10596 (Fri Aug 21 16:08:32 2015)
	host-id=1
	score=2400
	maintenance=False
	state=EngineUp


# cat /proc/net/bonding/bond0 
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: load balancing (round-robin)
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0

Slave Interface: eth0
MII Status: up
Speed: 100 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 52:54:00:59:17:ec
Slave queue ID: 0

Slave Interface: eth1
MII Status: up
Speed: 100 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 52:54:00:6a:67:2d
Slave queue ID: 0

[root@node ~]# ifconfig 
bond0     Link encap:Ethernet  HWaddr 52:54:00:59:17:EC  
          inet6 addr: fe80::5054:ff:fe59:17ec/64 Scope:Link
          UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
          RX packets:5686303 errors:0 dropped:6579 overruns:0 frame:0
          TX packets:7391175 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:6632432255 (6.1 GiB)  TX bytes:6658130300 (6.2 GiB)

eth0      Link encap:Ethernet  HWaddr 52:54:00:59:17:EC  
          inet6 addr: fe80::5054:ff:fe59:17ec/64 Scope:Link
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:2843258 errors:0 dropped:2994 overruns:0 frame:0
          TX packets:3695604 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:3318764852 (3.0 GiB)  TX bytes:3329191078 (3.1 GiB)
          Interrupt:11 Base address:0x4000 

eth1      Link encap:Ethernet  HWaddr 52:54:00:59:17:EC  
          inet6 addr: fe80::5054:ff:fe59:17ec/64 Scope:Link
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:2843110 errors:0 dropped:3585 overruns:0 frame:0
          TX packets:3695598 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:3313670435 (3.0 GiB)  TX bytes:3328941500 (3.1 GiB)
          Interrupt:11 Base address:0x2000 

lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:23184 errors:0 dropped:0 overruns:0 frame:0
          TX packets:23184 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:10535033 (10.0 MiB)  TX bytes:10535033 (10.0 MiB)

rhevm     Link encap:Ethernet  HWaddr 52:54:00:59:17:EC  
          inet addr:192.168.122.30  Bcast:192.168.122.255  Mask:255.255.255.0
          inet6 addr: fe80::5054:ff:fe59:17ec/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:2020935 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1588440 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:1338074021 (1.2 GiB)  TX bytes:6216474993 (5.7 GiB)

vnet0     Link encap:Ethernet  HWaddr FE:16:3E:03:BD:98  
          inet6 addr: fe80::fc16:3eff:fe03:bd98/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:1647 errors:0 dropped:0 overruns:0 frame:0
          TX packets:2375 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:500 
          RX bytes:505453 (493.6 KiB)  TX bytes:494602 (483.0 KiB)


Thanks!

Comment 6 Yaniv Lavi 2015-08-23 12:41:05 UTC
I also see that localhost.localdomain was used in the setup.
Can you please make sure your hostname is DNS resolvable and retest ASAP?

Comment 7 wanghui 2015-08-24 01:37:52 UTC
(In reply to Yaniv Dary from comment #6)
> I also see that localhost.localdomain was used in the setup.
> Can you please make sure your hostname is DNS resolvable and retest ASAP?

Hi Yaniv,

The RHEV-H hostname is localhost.localdomain, which can only be resolved locally. In the self-hosted engine doc, I didn't see that the RHEV-H hostname needs to be resolvable. Also, adding RHEV-H from the RHEV-M side should not require resolving the hostname; we only need to provide it.

So I need to make this clear: why should the RHEV-H hostname be DNS resolvable?

Thanks,
Hui Wang

Comment 8 Yaniv Lavi 2015-08-24 09:53:06 UTC
(In reply to wanghui from comment #7)
> (In reply to Yaniv Dary from comment #6)
> > I also see that localhost.localdomain was used in the setup.
> > Can you please make sure your hostname is DNS resolvable and retest ASAP?
> 
> Hi Yaniv,
> 
> The RHEV-H hostname is localhost.localdomain, which can only be resolved
> locally. In the self-hosted engine doc, I didn't see that the RHEV-H
> hostname needs to be resolvable. Also, adding RHEV-H from the RHEV-M side
> should not require resolving the hostname; we only need to provide it.
> 
> So I need to make this clear: why should the RHEV-H hostname be DNS
> resolvable?

It is one of our support requirements; you should test as in a customer environment. Issues that arise from an unsupported flow are not real issues.
The engine and its hosts must be DNS resolvable. When you provide a hostname, the engine needs to be able to reach the host, which requires DNS.

> 
> Thanks,
> Hui Wang
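
A quick way to verify this requirement before running the setup is to check that the host's name resolves to a non-loopback address. A minimal sketch using only the Python standard library (the hostname here is an example):

import socket

def dns_resolvable(hostname):
    """Return True if hostname resolves to at least one non-loopback address."""
    try:
        addrs = set(info[4][0] for info in socket.getaddrinfo(hostname, None))
    except socket.gaierror:
        return False
    # Reject names that only resolve to loopback.
    return any(not a.startswith('127.') and a != '::1' for a in addrs)

# 'localhost.localdomain' resolves only to loopback, so this returns False;
# a properly registered name such as node.localdomain should return True.
print(dns_resolvable('node.localdomain'))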

Comment 9 wanghui 2015-08-25 09:08:52 UTC
Retest on rhev-hypervisor6-6.7-20150813.0.el6ev.

Set the RHEV-H hostname to a DNS-resolvable name.
Set the RHEV-M hostname to a DNS-resolvable name.

The test passes with bond (mode=1) as the RHEV-H network.

Comment 10 Yaniv Lavi 2015-08-25 15:34:16 UTC
Virt QE has informed me this is resolved when hostnames are DNS resolvable. Closing.

Comment 11 wanghui 2015-08-31 05:14:19 UTC
I have one question about this issue: do we provide a way to add the host again after we resolve the hostname? By that point we have almost finished the HE setup process (the engine is up now). Can we handle the host-not-reachable case instead of quitting directly?

Thanks,
Hui Wang

Comment 12 Yaniv Lavi 2015-08-31 08:31:48 UTC
(In reply to wanghui from comment #11)
> I have one question about this issue: do we provide a way to add the host
> again after we resolve the hostname? By that point we have almost finished
> the HE setup process (the engine is up now). Can we handle the
> host-not-reachable case instead of quitting directly?

For hosted engine, the best practice would be to reinstall cleanly: remove any old installation and start over.

> 
> Thanks,
> Hui Wang

