Created attachment 1190819 [details]
engine.log

Description of problem:
Failed to add RHVH to engine with a VLAN configured

Version-Release number of selected component (if applicable):
redhat-virtualization-host-4.0-20160812.0.x86_64
imgbased-0.8.4-1.el7ev.noarch
vdsm-4.18.11-1.el7ev.x86_64
Red Hat Virtualization Manager Version: 4.0.2.6-0.1.el7ev

How reproducible:
100%

Steps to Reproduce:
1. Install RHVH.
2. Configure a VLAN on RHVH.
3. On the RHEVM portal, create a new data center and a new cluster.
4. On the RHEVM portal, select "Network", select the target ovirtmgmt network, click "Edit", and check "Enable VLAN tagging".
5. Add RHVH from the engine side using the VLAN IP configured in step 2.

Actual results:
After step 5, adding RHVH to the engine fails. The error shown is "Failed to install Host dguo_vlan. Processing stopped due to timeout."

Expected results:
After step 5, RHVH is added to the engine successfully.

Additional info:
During installation, the VLAN IP on RHVH disappears.
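For readers unfamiliar with step 2, a DHCP VLAN on a RHVH NIC is typically expressed as an initscripts ifcfg file along these lines. This is an illustrative sketch only — the device name, VLAN tag, and file path are assumptions, not taken from the reporter's host:

```ini
# /etc/sysconfig/network-scripts/ifcfg-p3p1.20  (illustrative path and name)
# VLAN sub-interface with tag 20 on top of the physical NIC p3p1,
# obtaining its address via DHCP.
DEVICE=p3p1.20
VLAN=yes
ONBOOT=yes
BOOTPROTO=dhcp
```

The VLAN tag is encoded in the device name (`<nic>.<tag>`), and `VLAN=yes` tells the network service to create it as an 802.1Q sub-interface.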
Created attachment 1190821 [details]
network_script on rhvh
Created attachment 1190822 [details]
sosreport on rhvh
Created attachment 1190823 [details]
rhvh: /var/log
Can you have a look?
(In reply to dguo from comment #0)
> Description of problem:
> Failed to add RHVH to engine with vlan configured
> [...]
> Additional info:
> During the installing period, the VLAN IP on RHVH were disappear.

Please provide more details on exactly what was done:
- How was the VLAN added, and to what device? (And where?)
- If the management network was moved onto a VLAN, how could the host still be reached? (It was obviously reachable before the VLAN existed, so why do you expect to still be able to access it over the same address once the VLAN is in place?)

As a general point: the management network is a special one and should be edited with care, as you can end up losing the host.
If the management network needs to be on a VLAN, this must be configured before adding the host to the engine.
Comment on attachment 1190823 [details]
rhvh: /var/log

No VDSM logs.
(In reply to Edward Haas from comment #5)
> Please provide more details on what exactly has been done:
> - How a VLAN has been added and to what? (and where?).
> [...]

Indeed, more than one NIC exists on the RHVH side: one of them (em1) is connected to a public switch, while the other (p3p1) is connected to a VLAN switch that also connects to the RHEVM machine.
So the RHVH can always be accessed through em1, while the VLAN is created over p3p1:
1. From Cockpit, create a DHCP VLAN over p3p1, named p3p1.20, which received the IP 192.168.20.99.
2. From the engine side, which also has an internal IP 192.168.20.41, add the RHVH with VLAN tag "20".

For details, I will attach the Cockpit screenshots.
Created attachment 1191036 [details]
Creating vlan over p3p1
Created attachment 1191038 [details]
Details of "p3p1.20"
Created attachment 1191039 [details]
After creating the vlan
Thank you for the clarification.

Please provide the VDSM and engine logs from when the failure occurred (the VDSM logs were missing from the previous set).

We will also need the list of ifcfg files (with their content) after the Cockpit configuration has been completed, just before you attempt to add the host to the engine. Please also run 'ip addr' and provide the output.

Final question (to give context to all of this): is this a regression test? That is, was this scenario of adding a host through a VLAN management interface tested in 3.6?
Results from the tests conducted by mburman:
- Avoiding the masking of NetworkManager allowed the host deployment to finish.
- Configuring a VLAN or bond device through Cockpit creates an ifcfg file with a UUID as its name. VDSM does not support such naming; it expects the ifcfg file to be named after the device.
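The naming mismatch can be worked around manually by renaming each ifcfg file to match its DEVICE line. A minimal sketch of that workaround (the function name and the UUID filename are hypothetical; this is not something VDSM does itself — it simply expects files named ifcfg-<device>):

```shell
# Hypothetical helper: rename any ifcfg file whose filename does not match
# its DEVICE= entry (e.g. a UUID-named file written by Cockpit) to the
# ifcfg-<device> form that VDSM expects.
rename_ifcfg_by_device() {
    dir=$1
    for f in "$dir"/ifcfg-*; do
        [ -f "$f" ] || continue
        # Extract the device name from the DEVICE= line, stripping quotes.
        dev=$(sed -n 's/^DEVICE=//p' "$f" | tr -d '"')
        [ -n "$dev" ] || continue
        target="$dir/ifcfg-$dev"
        [ "$f" = "$target" ] && continue
        mv "$f" "$target"
    done
}
```

Run against /etc/sysconfig/network-scripts (or a test directory) before adding the host to the engine.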
Great findings. We'll revert the masking.
This bug will be used to track the issue causing the deployment failure. The ifcfg naming issue is covered in bug 1367378.

One question: how did you define the bond/VLAN?
*** Bug 1366562 has been marked as a duplicate of this bug. ***
(In reply to Edward Haas from comment #11)
> Final question (to get a context to all of this): Is this a regression test?
> I mean, has this scenario of adding a host through a VLAN management
> interface has been tested in 3.6?

Please see the logs and config files attached.

For the final question: do you mean NGN 3.6? This scenario was blocked by bug 1329956.
Created attachment 1191475 [details]
host deploy log
Created attachment 1191476 [details]
ifcfg file after creating bond

[root@dell-op790-01 ~]# ip a s
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: p4p1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
    link/ether 00:10:18:81:a4:a0 brd ff:ff:ff:ff:ff:ff
3: p4p2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
    link/ether 00:10:18:81:a4:a2 brd ff:ff:ff:ff:ff:ff
4: em1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether d4:be:d9:95:61:ca brd ff:ff:ff:ff:ff:ff
    inet 10.66.148.7/22 brd 10.66.151.255 scope global dynamic em1
       valid_lft 13589sec preferred_lft 13589sec
    inet6 2620:52:0:4294:d6be:d9ff:fe95:61ca/64 scope global noprefixroute dynamic
       valid_lft 2591968sec preferred_lft 604768sec
    inet6 fe80::d6be:d9ff:fe95:61ca/64 scope link
       valid_lft forever preferred_lft forever
5: p3p1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:1b:21:27:47:0b brd ff:ff:ff:ff:ff:ff
6: p3p1.20@p3p1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
    link/ether 00:1b:21:27:47:0b brd ff:ff:ff:ff:ff:ff
    inet 192.168.20.99/24 brd 192.168.20.255 scope global dynamic p3p1.20
       valid_lft 85751sec preferred_lft 85751sec
    inet6 fe80::21b:21ff:fe27:470b/64 scope link
       valid_lft forever preferred_lft forever
Created attachment 1191477 [details]
engine.log_part_aa
Created attachment 1191478 [details]
engine.log_part_ab
Created attachment 1191479 [details]
engine.log_part_ac
Re-tested the scenarios below with build 20160817.0; in all of them the host was added to the engine successfully.
1. DHCP VLAN
2. Static VLAN
3. DHCP bond
4. Static bond
5. DHCP bond + VLAN
6. Static bond + VLAN

Test steps:
1. Install RHVH.
2. Update the ifcfg files for each scenario.
3. Restart the network service to bring up the bond/VLAN.
4. Add the RHVH to the engine.

Actual result:
1. The RHVH is added successfully.
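For reference, the "static bond + VLAN" scenario above can be expressed with ifcfg files along these lines. This is only a sketch: the device names, bond options, and static address are illustrative assumptions, not the tester's actual values, and each section would live in its own file:

```ini
# /etc/sysconfig/network-scripts/ifcfg-bond0 — the bond itself (illustrative options)
DEVICE=bond0
BONDING_OPTS="mode=active-backup miimon=100"
ONBOOT=yes
BOOTPROTO=none

# /etc/sysconfig/network-scripts/ifcfg-p4p1 — one slave (repeat for each slave NIC)
DEVICE=p4p1
MASTER=bond0
SLAVE=yes
ONBOOT=yes

# /etc/sysconfig/network-scripts/ifcfg-bond0.20 — VLAN 20 on the bond, static address
DEVICE=bond0.20
VLAN=yes
ONBOOT=yes
BOOTPROTO=none
IPADDR=192.168.20.50
NETMASK=255.255.255.0
```

Restarting the network service (step 3) brings up the bond, then the VLAN sub-interface on top of it.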
The above verification was done based on creating the ifcfg files manually. The issue of VDSM handling the ifcfg file name is covered in bug 1367378.
Ryan, could you please provide the patch for this?