Created attachment 1371898 [details]
Logs

Description of problem:
A host becomes non-operational if it has an unsynced network whose VM/non-VM setting differs from the logical network definition. It is not possible to activate a host that has such a network attached, where the network is unsynced because of the bridge property (false/true). If you try to activate the host before syncing the network, the host becomes non-operational.

Version-Release number of selected component (if applicable):
4.2.0.2-0.1.el7
vdsm-4.20.9.3-1.el7ev.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Create 2 DCs and run a host in DC1.
2. Create a network called 'net1' on both DCs. On DC1 set it as a VM network; on DC2 set it as a non-VM network. Attach net1 to the host in DC1.
3. Set the host to maintenance and move it to DC2 (net1 is now out-of-sync and is a non-VM network in DC2).
4. Try to activate the host.

Actual results:
The host becomes non-operational. It can come up again only after the network is synced.

Expected results:
The host should be up and operational even if the network has not been synced yet. It should not be non-operational.
is "net1" a required network? If it is, I think this is not a bug.
This bug report has Keywords: Regression or TestBlocker. Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.
(In reply to Red Hat Bugzilla Rules Engine from comment #2)
> This bug report has Keywords: Regression or TestBlocker.
> Since no regressions or test blockers are allowed between releases, it is
> also being identified as a blocker for this release. Please resolve ASAP.

(In reply to Dan Kenigsberg from comment #1)
> is "net1" a required network? If it is, I think this is not a bug.

I should have mentioned this important detail: 'net1' isn't a required network, and that is why it is a bug.
The host cannot be made operational without syncing the network first.
Taking back to 4.2 because it hurts our QE automation.
Hi Michael,

I'm closing the bug since neither you nor I can reproduce it on the latest engine.

If you manage to reproduce it, please reopen.
(In reply to Alona Kaplan from comment #6)
> Hi Michael,
>
> I'm closing the bug since you and me cannot reproduce it on the latest
> engine.
>
> If you manage to reproduce it, please reopen.

Managed to reproduce on 4.2.1.1-0.1.el7 with the same steps described in comment #0.
Created attachment 1380986 [details] engine log, new reproduction
To reproduce the issue, the network should be a 'vm' network in the DC and a 'non-vm' network on the host. (Step 2 in the bug description should be changed to: Create a network called 'net1' on both DCs; on DC1 set it as a non-VM network, on DC2 set it as a VM network. Attach net1 to the host in DC1.)

In this case, the host is marked as non-operational regardless of whether the network is required. Maybe we should consider keeping this logic only for required networks, but it doesn't look like a regression to me; this code hasn't changed since 2014. Are you sure the behaviour was different in previous versions?
Hi Alona,

I understand, you are right. This happens on 4.1.9 as well, so it's not a regression.

Currently the audit log will show:
- Host navy-vds1.qa.lab.tlv.redhat.com does not comply with the cluster Cluster1 networks, the following VM networks are non-VM networks: 'net-1'
- Host navy-vds1.qa.lab.tlv.redhat.com's following network(s) are not synchronized with their Logical Network configuration: net-1'
- Status of host navy-vds1.qa.lab.tlv.redhat.com was set to NonOperational.

Meni, how would you like to proceed?
We can leave it as is. If the host cannot run VMs, it should be NonOperational. But we need to make sure that the NonOperational host status is clearly tied to the non-VM network errors.
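For readers following the thread, the behaviour described in this comment can be sketched roughly as below. This is only an illustration in Python, not the engine's actual code: the names `Network`, `vm_networks_reported_as_non_vm` and `host_status_after_activation` are hypothetical, and the sketch assumes the check fires only in the direction described here (VM network in the DC/cluster, non-VM on the host).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Network:
    """Hypothetical, simplified view of a network (not an engine class)."""
    name: str
    vm_network: bool        # True: bridged VM network, False: non-VM network
    required: bool = False  # kept only to show that the check ignores it


def vm_networks_reported_as_non_vm(cluster_networks, host_networks):
    """Names of networks that are VM networks in the cluster definition
    but are configured as non-VM networks on the host."""
    host_by_name = {n.name: n for n in host_networks}
    return sorted(
        c.name
        for c in cluster_networks
        if c.vm_network
        and c.name in host_by_name
        and not host_by_name[c.name].vm_network
    )


def host_status_after_activation(cluster_networks, host_networks):
    """Any VM/non-VM mismatch makes the host NonOperational,
    whether or not the network is marked as required."""
    mismatched = vm_networks_reported_as_non_vm(cluster_networks, host_networks)
    return "NonOperational" if mismatched else "Up"


# Corrected step 2: VM network in the DC, non-VM network on the host.
cluster = [Network("net1", vm_network=True, required=False)]
host = [Network("net1", vm_network=False)]
status = host_status_after_activation(cluster, host)
```

Under these assumptions the opposite direction (non-VM in the DC, VM on the host) does not flag the host, which would fit the original step 2 not reproducing the problem.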
moved to 4.2.1 by mistake.
Changed the first message to - 'Host navy-vds1.qa.lab.tlv.redhat.com does not comply with the cluster Cluster1 networks, the following VM networks are non-VM networks: 'net-1'. The host will become NonOperational.'
*** Bug 1285785 has been marked as a duplicate of this bug. ***
The new first message doesn't appear in our latest d/s build, 4.2.2-0.1.el7.
I still see the old message: "Host orchid-vds1.qa.lab.tlv.redhat.com does not comply with the cluster Cluster1 networks, the following VM networks are non-VM networks: 'net-3'"
Hi Michael, Please attach the engine log.
Created attachment 1397561 [details] failed qa engine log
Hi Michael,

According to the attached logs, the version of your engine is 4.2.1.6-0.1.el7 and not 4.2.2-0.1.el7. Please upgrade your engine.

2018-02-18 11:20:43,637+02 INFO [org.ovirt.engine.core.uutils.config.ShellLikeConfd] (ServerService Thread Pool -- 61) [] Value of property 'PACKAGE_DISPLAY_VERSION' is '4.2.1.6-0.1.el7'.
2018-02-18 11:20:43,638+02 INFO [org.ovirt.engine.core.uutils.config.ShellLikeConfd] (ServerService Thread Pool -- 61) [] Value of property 'PACKAGE_NAME' is 'ovirt-engine'.
2018-02-18 11:20:43,638+02 INFO [org.ovirt.engine.core.uutils.config.ShellLikeConfd] (ServerService Thread Pool -- 61) [] Value of property 'PACKAGE_VERSION' is '4.2.1.6'.
Yes, I had an issue on my RHVM env and it is fixed now. Retesting.
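A quick way to check which engine version a log came from is to pull the PACKAGE_* properties out of lines like the ones quoted in this comment. The following is a minimal Python sketch; the helper name `package_properties` is hypothetical, and the regular expression only assumes the "Value of property '...' is '...'" wording seen in these log lines.

```python
import re

# Matches the "Value of property 'NAME' is 'VALUE'" part of an engine log line.
PROPERTY_LINE = re.compile(
    r"Value of property '(?P<name>[^']+)' is '(?P<value>[^']*)'"
)


def package_properties(log_lines):
    """Collect PACKAGE_* properties (name -> value) from engine log lines."""
    props = {}
    for line in log_lines:
        match = PROPERTY_LINE.search(line)
        if match and match.group("name").startswith("PACKAGE_"):
            props[match.group("name")] = match.group("value")
    return props


# The three log lines quoted in this comment.
sample = [
    "2018-02-18 11:20:43,637+02 INFO [org.ovirt.engine.core.uutils.config.ShellLikeConfd] "
    "(ServerService Thread Pool -- 61) [] Value of property 'PACKAGE_DISPLAY_VERSION' is '4.2.1.6-0.1.el7'.",
    "2018-02-18 11:20:43,638+02 INFO [org.ovirt.engine.core.uutils.config.ShellLikeConfd] "
    "(ServerService Thread Pool -- 61) [] Value of property 'PACKAGE_NAME' is 'ovirt-engine'.",
    "2018-02-18 11:20:43,638+02 INFO [org.ovirt.engine.core.uutils.config.ShellLikeConfd] "
    "(ServerService Thread Pool -- 61) [] Value of property 'PACKAGE_VERSION' is '4.2.1.6'.",
]
props = package_properties(sample)
```

To use it on a real log, pass an open file object instead of `sample`; the path to the engine log depends on the installation and is not assumed here.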
Verified on - 4.2.2-0.1.el7 "Host orchid-vds2.qa.lab.tlv.redhat.com does not comply with the cluster Cluster1 networks, the following VM networks are non-VM networks: 'net-3'. The host will become NonOperational."
This bugzilla is included in oVirt 4.2.2 release, published on March 28th 2018. Since the problem described in this bug report should be resolved in oVirt 4.2.2 release, it has been closed with a resolution of CURRENT RELEASE. If the solution does not work for you, please open a new bug report.