Created attachment 1242108 [details]
all logs

Description of problem:
Failed to add rhvh4.1 to engine via the bond+vlan configured by NM during anaconda installation

Version-Release number of selected component (if applicable):
Red Hat Virtualization Manager Version: 4.1.0-0.3.beta2.el7
redhat-virtualization-host-4.1-20170116
imgbased-0.9.4-0.1.el7ev.noarch
vdsm-4.19.1-1.el7ev.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Install rhvh4.1 via the anaconda UI
2. On the network page, first set up bond0 (IPv4 setting: disabled; IPv6 setting: ignore), then set up a VLAN bond0.50 over this bond0
3. Reboot the rhvh
4. Log in to rhvh and set "VDSM/disableNetworkManager=bool:False" in /etc/ovirt-host-deploy.conf.d/90-ngn-do-not-keep-networkmanager.conf
5. Add the host to engine

Actual results:
1. After step #5, adding the host to engine fails

Expected results:
1. After step #5, the host is added to engine successfully

Additional info:
1. Both the DHCP and the static VLAN (over bond) configurations fail
I know there was a problem with one of the vdsm-jsonrpc versions. I think you should run with 1.3.6; please check which version you are running there.
From the VDSM side, there is nothing going on: it never gets a setupNetwork command, and the current caps report includes the bond and the VLAN. The ifcfg files also show that they have never been acquired by VDSM (as expected, because no setupNetwork was issued).
(In reply to Edward Haas from comment #1)
> I know there was a problem with one of the vdsm-jsonrpc versions.
> I think you should run with 1.3.6, please check what you run with there.
>

Yes, I had noticed this version issue in another mail thread and have already replaced the version with 1.3.6-1 on my test rhevm.

[root@rhvm41-vlan50-1 ~]# rpm -qa|grep vdsm-jsonrpc
vdsm-jsonrpc-java-1.3.6-1.el7ev.noarch

> From VDSM side, there is nothing going on there, it never gets a
> setupNetwork command and current caps report include the bond and vlan. The
> ifcfg files also show that they have never been acquired by VDSM (as
> expected, because there was no setupNetwork issued).

Yet all the other scenarios can be added to engine successfully. DHCP and static bond, and DHCP and static VLAN, configured during anaconda installation, are all added successfully; only this VLAN over bond fails.
(In reply to dguo from comment #2) I have tried to go over the Engine logs but they seem to be out of sync with the vdsm logs (different time periods). Could you please send synced logs and mention from what time to look in the logs?
huzhao, could you help provide the logs requested in comment #3, since dguo is on PTO? Thanks.
Created attachment 1242416 [details] All logs in engine side and RHVH side
(In reply to Edward Haas from comment #3)
> (In reply to dguo from comment #2)
>
> I have tried to go over the Engine logs but they seem to be out of sync with
> the vdsm logs (different time periods).
>
> Could you please send synced logs and mention from what time to look in the
> logs?

Edward, see the attachment "All logs in engine side and RHVH side".
For the logs on the engine side:
- engine.log, from 2017-01-19 03:44:32
- ovirt-host-deploy-20170119034458-192.168.50.138-52c3e735.log, which records the rhvh time, from 2017-01-19 08:44:34
(In reply to Huijuan Zhao from comment #6)
> Edward, see attachment " All logs in engine side and RHVH side".
> For the logs in engine side:
> - engine.log, from 2017-01-19 03:44:32
> - ovirt-host-deploy-20170119034458-192.168.50.138-52c3e735.log, record the
> rhvh time, from 2017-01-19 08:44:34

Do you mean that there is a 5-hour difference between the logs?
Would a record logged on Engine at 03:44 show up on the host side at 08:44?
(In reply to Edward Haas from comment #7)
> Do you mean that there is a 5 hours difference between the logs?
> A record on Engine at 03:44 will show up on the host side at 08:44?

Yes
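For anyone correlating the two logs, the 5-hour offset confirmed above can be applied mechanically. This is a minimal Python sketch, assuming both logs use the "YYYY-MM-DD HH:MM:SS" timestamp form quoted in this thread; the function and constant names are illustrative and not part of any oVirt tool.

```python
from datetime import datetime, timedelta

# Offset confirmed in this thread: host (rhvh) timestamps run
# 5 hours ahead of the timestamps in engine.log.
HOST_AHEAD_OF_ENGINE = timedelta(hours=5)
FMT = "%Y-%m-%d %H:%M:%S"


def engine_to_host(engine_ts: str) -> str:
    """Translate an engine.log timestamp into the matching host-side time."""
    shifted = datetime.strptime(engine_ts, FMT) + HOST_AHEAD_OF_ENGINE
    return shifted.strftime(FMT)


def host_to_engine(host_ts: str) -> str:
    """Translate a host-side timestamp into the matching engine.log time."""
    shifted = datetime.strptime(host_ts, FMT) - HOST_AHEAD_OF_ENGINE
    return shifted.strftime(FMT)


# Start of the relevant window in engine.log, expressed in host time.
print(engine_to_host("2017-01-19 03:44:32"))
```

The same helpers work in the other direction, for example to find the engine.log entries that surround a given supervdsm record.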
(In reply to Huijuan Zhao from comment #6)

In these logs we do see the setupNetwork.
It shows that the address with which the host was added (DHCP based) changed after the ovirtmgmt bridge was added.
I guess the MAC changed; perhaps the bond swapped its MAC with the other slave's MAC.
(To check this, you need to collect the output of "ip link" before adding the host and again afterwards, within the 120 sec window, since after that we roll back.)

If this is the case, there is not much we can do. The bond MAC may change and we do not control it. (Controlling it would be a full-blown RFE.)
Here is the relevant line from supervdsm log: sourceRoute::INFO::2017-01-19 08:45:58,252::sourceroute::76::root::(configure) Configuring gateway - ip: 192.168.50.144, network: 192.168.50.0/24, subnet: 255.255.255.0, gateway: 192.168.50.1, table: 3232248464, device: ovirtmgmt (Original address was 192.168.50.138)
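One detail in that log line is worth checking: the table number equals the integer form of the new IPv4 address. The Python sketch below only verifies that arithmetic with the standard ipaddress module. That VDSM derives the table ID from the address in this way is an inference from the numbers, not something stated in this report, and the helper name is illustrative.

```python
import ipaddress


def ipv4_to_int(addr: str) -> int:
    """Return the 32-bit integer form of a dotted-quad IPv4 address."""
    return int(ipaddress.IPv4Address(addr))


new_addr = "192.168.50.144"   # address seen on ovirtmgmt in the log line
old_addr = "192.168.50.138"   # address the host was originally added with
network = ipaddress.ip_network("192.168.50.0/24")

# Compare this value with the "table:" field of the supervdsm log line.
print(ipv4_to_int(new_addr))

# Both addresses belong to the same subnet, so only the DHCP lease changed.
print(ipaddress.ip_address(new_addr) in network,
      ipaddress.ip_address(old_addr) in network)
```

If the inference holds, a table numbered after .144 rather than .138 is one more sign that the host received a different DHCP lease once ovirtmgmt was created.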
(In reply to Edward Haas from comment #9)

Yes, the MAC of bond0.50 and bond0 changed while adding the host to engine.

1. Before adding the host to engine
# ip link
4: eno3: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP mode DEFAULT qlen 1000
    link/ether 08:94:ef:21:c0:4f brd ff:ff:ff:ff:ff:ff
5: eno4: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP mode DEFAULT qlen 1000
    link/ether 08:94:ef:21:c0:4f brd ff:ff:ff:ff:ff:ff
11: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT qlen 1000
    link/ether 08:94:ef:21:c0:4f brd ff:ff:ff:ff:ff:ff
12: bond0.50@bond0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT qlen 1000
    link/ether 08:94:ef:21:c0:4f brd ff:ff:ff:ff:ff:ff

2. The intermediate state while adding the host to engine
# ip link
4: eno3: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP mode DEFAULT qlen 1000
    link/ether 08:94:ef:21:c0:4f brd ff:ff:ff:ff:ff:ff
5: eno4: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP mode DEFAULT qlen 1000
    link/ether 08:94:ef:21:c0:4f brd ff:ff:ff:ff:ff:ff
11: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT qlen 1000
    link/ether 08:94:ef:21:c0:4f brd ff:ff:ff:ff:ff:ff
12: bond0.50@bond0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master ovirtmgmt state UP mode DEFAULT qlen 1000
    link/ether 12:fb:c4:7c:bd:08 brd ff:ff:ff:ff:ff:ff
13: ;vdsmdummy;: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT qlen 1000
    link/ether 7e:c2:98:de:0d:37 brd ff:ff:ff:ff:ff:ff
14: ovirtmgmt: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT qlen 1000
    link/ether 12:fb:c4:7c:bd:08 brd ff:ff:ff:ff:ff:ff

3. The final output after adding the host to engine failed
# ip link
4: eno3: <BROADCAST,MULTICAST> mtu 1500 qdisc mq state DOWN mode DEFAULT qlen 1000
    link/ether 08:94:ef:21:c0:4f brd ff:ff:ff:ff:ff:ff
5: eno4: <BROADCAST,MULTICAST> mtu 1500 qdisc mq state DOWN mode DEFAULT qlen 1000
    link/ether 08:94:ef:21:c0:50 brd ff:ff:ff:ff:ff:ff
13: ;vdsmdummy;: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT qlen 1000
    link/ether 7e:c2:98:de:0d:37 brd ff:ff:ff:ff:ff:ff
15: bond0: <NO-CARRIER,BROADCAST,MULTICAST,MASTER,UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT qlen 1000
    link/ether ca:73:47:3c:e2:21 brd ff:ff:ff:ff:ff:ff
16: bond0.50@bond0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state LOWERLAYERDOWN mode DEFAULT qlen 1000
    link/ether ca:73:47:3c:e2:21 brd ff:ff:ff:ff:ff:ff
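Comparing these listings by eye is error prone, so here is a small Python sketch that flags any VLAN device whose MAC differs from that of its parent. It assumes the usual two-line-per-device layout of "ip link" output (header line, then an indented link/ether line); the function names are illustrative, and the sample text is the bond0 and bond0.50 entries from the intermediate state above.

```python
import re

# "12: bond0.50@bond0: <...>" -> name "bond0.50", parent "bond0"
HEADER = re.compile(r"^\d+:\s+(?P<name>[^:@\s]+)(?:@(?P<parent>[^:\s]+))?:")
LINK = re.compile(r"link/ether\s+(?P<mac>[0-9a-f:]{17})")


def parse_ip_link(text):
    """Return ({device: mac}, {device: parent}) parsed from `ip link` output."""
    macs, parents = {}, {}
    current = None
    for line in text.splitlines():
        header = HEADER.match(line.strip())
        if header:
            current = header.group("name")
            if header.group("parent"):
                parents[current] = header.group("parent")
            continue
        link = LINK.search(line)
        if link and current:
            macs[current] = link.group("mac")
    return macs, parents


def vlan_mac_mismatches(text):
    """Map each child device to (child_mac, parent_mac) where the two differ."""
    macs, parents = parse_ip_link(text)
    return {
        child: (macs[child], macs[parent])
        for child, parent in parents.items()
        if child in macs and parent in macs and macs[child] != macs[parent]
    }


MIDDLE_STATE = """\
11: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT qlen 1000
    link/ether 08:94:ef:21:c0:4f brd ff:ff:ff:ff:ff:ff
12: bond0.50@bond0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master ovirtmgmt state UP mode DEFAULT qlen 1000
    link/ether 12:fb:c4:7c:bd:08 brd ff:ff:ff:ff:ff:ff
"""

print(vlan_mac_mismatches(MIDDLE_STATE))
```

Run against the "before" listing, the same function returns an empty dict, because bond0.50 and bond0 share one MAC there.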
(In reply to Huijuan Zhao from comment #11)
> (In reply to Edward Haas from comment #9)
>
> Yes, the MAC of bond0.50 and bond0 is changed during adding host to engine.
>

I am guessing that the middle state falls within the 120 sec after the setupNetwork has been applied and before the rollback.
It looks like only the VLAN MAC has changed: originally it was the same as the MAC of the bond itself; after setupNetworks it changed and no longer matches the bond's MAC.
It seems that this is related to the order of the actions taken to set up the device hierarchy.
With NM, the order and the end result seem to be consistent, with all addresses being the same.

When doing the same with the ip tool, the inconsistency caused by the step order can be seen:

All MAC addresses are identical:
ip link add dummy_88 type dummy
ip link add dummy_99 type dummy
ip link add bond99 type bond
ip link set dummy_88 master bond99
ip link set dummy_99 master bond99
ip link add link bond99 name bond99.101 type vlan id 101

VLAN MAC address is different from all the rest:
ip link add dummy_88 type dummy
ip link add dummy_99 type dummy
ip link add bond99 type bond
ip link add link bond99 name bond99.101 type vlan id 101
ip link set dummy_88 master bond99
ip link set dummy_99 master bond99

We need to investigate how to ensure that the VLAN interface is created after the slaves have been enslaved to the bond.
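The two orderings above differ only in where the VLAN creation step sits. As an illustration, this Python sketch builds either sequence as argument lists so the ordering constraint is explicit. It runs nothing by itself; the helper name and its parameters are hypothetical, and the device names are the ones used in the commands above.

```python
from typing import List, Sequence


def bond_vlan_commands(
    bond: str = "bond99",
    slaves: Sequence[str] = ("dummy_88", "dummy_99"),
    vlan_id: int = 101,
    vlan_last: bool = True,
) -> List[List[str]]:
    """Build the `ip link` commands for a VLAN over a bond of dummy devices.

    With vlan_last=True the VLAN is created only after every slave has been
    enslaved, which is the ordering reported above to keep all MACs identical.
    """
    create_slaves = [["ip", "link", "add", s, "type", "dummy"] for s in slaves]
    create_bond = [["ip", "link", "add", bond, "type", "bond"]]
    enslave = [["ip", "link", "set", s, "master", bond] for s in slaves]
    create_vlan = [[
        "ip", "link", "add", "link", bond,
        "name", f"{bond}.{vlan_id}", "type", "vlan", "id", str(vlan_id),
    ]]
    if vlan_last:
        return create_slaves + create_bond + enslave + create_vlan
    return create_slaves + create_bond + create_vlan + enslave


for argv in bond_vlan_commands(vlan_last=True):
    print(" ".join(argv))
```

To actually reproduce the behaviour, each argument list could be passed to subprocess.run as root on a disposable machine, followed by "ip link" to compare the resulting MAC addresses.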
Unfortunately, I could not reproduce it.
Could you reproduce it on a VM and share the image with us?

The original comment mentioned that this fails with a static IP as well.
The logs covered the DHCP scenario, which we identified as related to the MAC address change, but I would not expect a MAC change to have the same effect with a static IP.
Could you please also add logs for the static IP scenario, so we can check whether this is the same problem?
Please also add the 'ip link' output as you did last time, so we can see whether this is the same MAC address issue.
We seem to have managed to recreate this here with the help of mburman, so for the moment we do not need the VM requested in comment 14.
Just verified on rhvh-4.1-20170222.0(vdsm-4.19.6-1.el7ev)