Bug 1497173
Summary: | VM marked as non responsive if it has ISO from an inaccessible ISO domain | | |
---|---|---|---|
Product: | Red Hat Enterprise Virtualization Manager | Reporter: | nijin ashok <nashok> |
Component: | vdsm | Assignee: | Dan Kenigsberg <danken> |
Status: | CLOSED DUPLICATE | QA Contact: | Raz Tamir <ratamir> |
Severity: | high | Docs Contact: | |
Priority: | unspecified | | |
Version: | 4.1.6 | CC: | bazulay, fromani, gveitmic, lsurette, mavital, srevivo, tjelinek, ycui, ykaul |
Target Milestone: | ovirt-4.2.0 | | |
Target Release: | --- | | |
Hardware: | All | | |
OS: | Linux | | |
Whiteboard: | | | |
Fixed In Version: | | Doc Type: | If docs needed, set a value |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2017-10-03 07:34:26 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | Virt | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Description
nijin ashok
2017-09-29 11:39:42 UTC
(In reply to nijin ashok from comment #0)
> Description of problem:
>
> RHV VMs are marked as non-responsive if it's having an ISO from an
> inaccessible ISO domain. The vdsm will not be able to get any statistics
> from the VM and it will log in the vdsm log as "monitor become
> unresponsive". All monitoring calls will get discarded or blocked. Since
> it's in non-responsive state, we will not be able to detach the CD from the
> portal. The only option available is to shutdown the VM. Even we will not be
> able to use virsh commands to detach the CD as the call is blocked in
> GetAllDomainStats.
> Also Powering off the VM sometimes fails and we have to
> kill the qemu-kvm process manually. The qemu-kvm process of the VM will be
> in D state.

This last bit of information about libvirt/qemu is important. It means that QEMU and/or libvirt are stuck, so Vdsm is reacting accordingly to this state and -as far as I can understand- handling it well.

Problem is: we should not end up in this state in the first place. QEMU/libvirt should tolerate the unavailability of ISO domain.

What we can do is:
1. review the configuration we use for cdrom devices, check if RHV is compliant with best practices
Once #1 is correct, if we still see this behaviour, we need to
2. file a bug against libvirt/qemu

> Expected results:
>
> VMs should work fine or at least should put into

VM should keep working with I/O error on the inaccessible cdrom devices, and this is what we need to check with the two steps above, but I think this is the best behaviour we can get in this scenario.

(In reply to Francesco Romani from comment #2)
> (In reply to nijin ashok from comment #0)
> > Description of problem:
> >
> > RHV VMs are marked as non-responsive if it's having an ISO from an
> > inaccessible ISO domain. The vdsm will not be able to get any statistics
> > from the VM and it will log in the vdsm log as "monitor become
> > unresponsive". All monitoring calls will get discarded or blocked. Since
> > it's in non-responsive state, we will not be able to detach the CD from the
> > portal. The only option available is to shutdown the VM. Even we will not be
> > able to use virsh commands to detach the CD as the call is blocked in
> > GetAllDomainStats.
> > Also Powering off the VM sometimes fails and we have to
> > kill the qemu-kvm process manually. The qemu-kvm process of the VM will be
> > in D state.
>
> This last bit of information about libvirt/qemu is important. It means that
> QEMU and/or libvirt are stuck, so Vdsm is reacting accordingly to this state
> and -as far as I can understand- handling it well.
>
> Problem is: we should not end up in this state in the first place.
> QEMU/libvirt should tolerate the unavailability of ISO domain.
>
> What we can do is:
> 1. review the configuration we use for cdrom devices, check if RHV is
> compliant with best practices
> Once #1 is correct, if we still see this behaviour, we need to

so setting for now for 4.2 and we can decide later based on the result of this investigation.

> 2. file a bug against libvirt/qemu
>
> > Expected results:
> >
> > VMs should work fine or at least should put into
>
> VM should keep working with I/O error on the inaccessible cdrom devices, and
> this is what we need to check with the two steps above, but I think this is
> the best behaviour we can get in this scenario.

(In reply to Francesco Romani from comment #2)
> What we can do is:
> 1. review the configuration we use for cdrom devices, check if RHV is
> compliant with best practices

Isn't this the same as BZ1207992?
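
For anyone trying to confirm the "D state" symptom mentioned in the report, a minimal sketch of how the process state could be read on a Linux host follows. It relies only on the documented layout of /proc/<pid>/stat (pid, command name in parentheses, then a one-letter state). The helper names `parse_proc_state` and `is_uninterruptible` are made up for illustration and are not part of vdsm or libvirt; finding the PID of the affected qemu-kvm process is left to the reader.

```python
from pathlib import Path


def parse_proc_state(stat_line: str) -> str:
    """Return the one-letter process state from the contents of /proc/<pid>/stat.

    Hypothetical helper for illustration; not part of vdsm or libvirt.
    """
    # The command name is wrapped in parentheses and may itself contain
    # spaces or parentheses, so split after the LAST ')' instead of
    # splitting the whole line on whitespace.
    fields_after_comm = stat_line[stat_line.rindex(")") + 1:].split()
    return fields_after_comm[0]


def is_uninterruptible(pid: int) -> bool:
    """True if the process is in state 'D' (uninterruptible sleep).

    Hypothetical helper; reads /proc, so it only works on Linux.
    """
    return parse_proc_state(Path(f"/proc/{pid}/stat").read_text()) == "D"
```

A process that stays in state D across repeated checks is blocked inside the kernel, typically on I/O, which would be consistent with the report that powering off sometimes fails and the qemu-kvm process has to be killed manually.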