Bug 1344075
| Summary: | [3.5] VM split brain during networking issues | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Virtualization Manager | Reporter: | Michal Skrivanek <michal.skrivanek> |
| Component: | vdsm | Assignee: | Francesco Romani <fromani> |
| Status: | CLOSED ERRATA | QA Contact: | Nisim Simsolo <nsimsolo> |
| Severity: | high | Docs Contact: | |
| Priority: | high | | |
| Version: | 3.5.7 | CC: | adahms, agk, ahadas, bazulay, fromani, jentrena, lsurette, mgoldboi, michal.skrivanek, mkalinin, mtessun, nsimsolo, pkliczew, pstehlik, rbalakri, rhev-integ, Rhev-m-bugs, rhodain, srevivo, stirabos, tdosek, ycui, ykaul |
| Target Milestone: | --- | Keywords: | ZStream |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | Previously, when the VDSM service was restarted on a host, the host would still respond to queries from the Manager over the JSON-RPC protocol, which could lead to incorrectly reported virtual machine status in the engine database. For highly available virtual machines, this could cause the Manager to restart the virtual machine on another host even though it was still running, resulting in a split-brain condition. This issue has now been resolved, and API calls are correctly blocked while the VDSM service is starting. | | |
| Story Points: | --- | | |
| Clone Of: | 1342388 | Environment: | |
| Last Closed: | 2016-06-27 12:42:45 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | Virt | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | 1339291, 1342388 | | |
| Bug Blocks: | | | |
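The Doc Text above says the fix blocks API calls while the VDSM service is still starting. The snippet below is a minimal, hypothetical sketch of that idea only, not the actual VDSM code: a dispatcher that refuses JSON-RPC requests until recovery of existing VMs has finished. All names here (`ApiDispatcher`, `RecoveringError`, `recovery_done`) are illustrative assumptions.

```python
# Minimal sketch (not the actual VDSM implementation): refuse API calls
# while recovery of existing VMs is still in progress.
import threading


class RecoveringError(Exception):
    """Raised when a request arrives before recovery has finished."""


class ApiDispatcher:
    def __init__(self):
        self._recovering = True          # set when the service starts
        self._lock = threading.Lock()
        self._handlers = {}              # method name -> callable

    def register(self, name, func):
        self._handlers[name] = func

    def recovery_done(self):
        # Called once the running VMs have been re-discovered.
        with self._lock:
            self._recovering = False

    def dispatch(self, method, *args, **kwargs):
        with self._lock:
            if self._recovering:
                # Without this guard, a status query served during startup
                # could report VMs as missing, prompting the engine to
                # restart an HA VM that is in fact still running.
                raise RecoveringError("VDSM is recovering, try again later")
        return self._handlers[method](*args, **kwargs)
```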
Description
Michal Skrivanek
2016-06-08 16:54:18 UTC
https://gerrit.ovirt.org/#/c/58892/ merged -> MODIFIED

Verification build:
- rhevm-3.5.8-0.1.el6ev.noarch
- qemu-kvm-rhev-0.12.1.2-2.491.el6_8.1.x86_64
- libvirt-0.10.2-60.el6.x86_64
- vdsm-4.16.37-1.el6ev.x86_64
- sanlock-2.8-2.el6_5.x86_64

Verification scenarios:

# Add a 60-second sleep to /usr/share/vdsm/clientIf.py (the scenario that reproduced this bug before the fix; see the sketch at the end of this section):
1. Use 2 hosts in the same cluster. On the SPM host, edit /usr/share/vdsm/clientIf.py and add time.sleep(60) under def _recoverExistingVms(self):
2. Enable HA on a VM.
3. Run the VM.
4. Restart the vdsm service.
5. Verify the VM does not migrate to the second host. After the VDSM service has restarted, verify the same qemu-kvm process is still running on the SPM host and that no qemu-kvm process for the same VM exists on the second host. Verify the VM continues to run properly.

# Stop the VDSM service:
1. Stop the VDSM service on the host with the running VM.
2. Wait for the host to become non-responsive and the VM to enter the unknown state.
3. Verify soft fencing starts on the host and the VM status is restored to up.
4. Verify the VM continues to run properly.

# Power off the host:
1. Power off the host with the VM running on it.
2. Wait for the host to become non-responsive and the VM to enter the unknown state.
3. From the webadmin, confirm 'host has been rebooted'.
4. Verify the VM migrates to the active host and is restarted.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2016:1342

Fixed in vdsm-4.16.37-1.el6ev.x86_64, prior to 3.5.9. Engine bug for 3.5.9: https://bugzilla.redhat.com/show_bug.cgi?id=1352612

Sorry, the 3.5.9 bug is still vdsm-hostdeploy.
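For the first verification scenario above, the reproduction relies on delaying VM recovery so that the "recovering" window is long enough to observe. A rough sketch of that test-only edit to /usr/share/vdsm/clientIf.py follows; only the time.sleep(60) line is the change described in step 1, and the rest of the method body is paraphrased rather than copied from vdsm-4.16.

```python
# Rough sketch of the test-only edit from scenario step 1; the method body
# shown here is paraphrased, not copied from vdsm-4.16 clientIf.py.
import time


class clientIf(object):
    def _recoverExistingVms(self):
        # Injected delay (testing only): keeps VDSM in its recovery phase
        # for 60 seconds so the engine polls the host before recovery ends.
        time.sleep(60)
        # ... original recovery logic (re-discovering running VMs) follows ...
```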