Created attachment 943999 [details]
snapshot fails

Description of problem:
Taking a snapshot of a rhel-guest-image-7.0-20140930.0.el7 VM fails while it is running on a RHEL 6.6 host (reproduced on a couple of RHEL 6.6 hosts).

Version-Release number of selected component (if applicable):
libvirt-client-0.10.2-46.el6.x86_64
libvirt-0.10.2-46.el6.x86_64
libvirt-python-0.10.2-46.el6.x86_64
libvirt-lock-sanlock-0.10.2-46.el6.x86_64
qemu-kvm-rhev-0.12.1.2-2.448.el6.x86_64
vdsm-4.16.5-2.el6ev.x86_64
sanlock-2.8-1.el6.x86_64

How reproducible:
100%

Steps to Reproduce:
1. First, disable SELinux on both RHEL 6.6 hosts with setenforce 0.
2. Try to take a live snapshot of the running guest-image VM on one of the RHEL 6.6 hosts.
3. The snapshot fails.

Actual results:
The snapshot fails (please see attachments).

Expected results:
The snapshot should succeed.

Additional info:
Logs attached.
Created attachment 944001 [details] logs
Francesco, this seems related to some of your latest changes, can you please have a look?
From the storage point of view, everything seems to be working as it should.
(In reply to Tal Nisan from comment #2)
> Francesco, this seems related to some of your latest changes, can you
> please have a look?
> From the storage point of view, everything seems to be working as it
> should.

I checked locally on the box (in read-only mode, so no harm is possible).

VDSM indeed reports that live snapshot is not supported, which causes the engine to prohibit the operation as shown. This is the expected and wanted behaviour:

# vdsClient -s 0 getVdsCaps | grep -i snapshot
	liveSnapshot = 'false'

However, VDSM is only acting as a middleman here. Let's check what libvirt reports:

# virsh -r capabilities | grep -i snapshot
      <disksnapshot default='off' toggle='no'/>
      <disksnapshot default='off' toggle='no'/>

And this is why the operation fails. So this is either a libvirt or a qemu issue.
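For comparison, on a host where live snapshot works, the same two queries would be expected to report something like the following (illustrative output, not captured from the affected box):

# vdsClient -s 0 getVdsCaps | grep -i snapshot
	liveSnapshot = 'true'

# virsh -r capabilities | grep -i snapshot
      <disksnapshot default='on' toggle='no'/>
      <disksnapshot default='on' toggle='no'/>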
Seems to be a libvirt issue. Still on the affected box:

# telnet localhost 4444
Trying ::1...
Connected to localhost.
Escape character is '^]'.
{"QMP": {"version": {"qemu": {"micro": 1, "minor": 12, "major": 0}, "package": "(qemu-kvm-rhev-0.12.1.2-2.448.el6)"}, "capabilities": []}}
{ "execute": "qmp_capabilities" }
{"return": {}}
{ "execute": "query-commands" }
{"return": [{"name": "quit"}, {"name": "eject"}, {"name": "__com.redhat_drive_del"}, {"name": "change"}, {"name": "screendump"}, {"name": "__com.redhat_qxl_screendump"}, {"name": "stop"}, {"name": "cont"}, {"name": "system_wakeup"}, {"name": "system_reset"}, {"name": "system_powerdown"}, {"name": "device_add"}, {"name": "device_del"}, {"name": "cpu"}, {"name": "memsave"}, {"name": "pmemsave"}, {"name": "inject-nmi"}, {"name": "migrate"}, {"name": "migrate_cancel"}, {"name": "migrate_set_speed"}, {"name": "client_migrate_info"}, {"name": "dump-guest-memory"}, {"name": "query-dump-guest-memory-capability"}, {"name": "migrate_set_downtime"}, {"name": "block_resize"}, {"name": "netdev_add"}, {"name": "netdev_del"}, {"name": "transaction"}, {"name": "blockdev-snapshot-sync"}, {"name": "__com.redhat_drive-mirror"}, {"name": "__com.redhat_drive-reopen"}, {"name": "balloon"}, {"name": "set_link"}, {"name": "getfd"}, {"name": "closefd"}, {"name": "block_passwd"}, {"name": "block_set_io_throttle"}, {"name": "set_password"}, {"name": "expire_password"}, {"name": "__com.redhat_set_password"}, {"name": "__com.redhat_spice_migrate_info"}, {"name": "qmp_capabilities"}, {"name": "human-monitor-command"}, {"name": "__com.redhat_drive_add"}, {"name": "block-stream"}, {"name": "__com.redhat_block-commit"}, {"name": "block-job-set-speed"}, {"name": "block-job-cancel"}, {"name": "chardev-add"}, {"name": "chardev-remove"}, {"name": "query-events"}, {"name": "query-pci"}, {"name": "query-version"}, {"name": "query-commands"}, {"name": "query-chardev"}, {"name": "query-block"}, {"name": "query-blockstats"}, {"name": "query-cpus"}, {"name": "query-kvm"}, {"name": "query-status"}, {"name": "query-mice"}, {"name": "query-vnc"}, {"name": "query-spice"}, {"name": "query-name"}, {"name": "query-uuid"}, {"name": "query-migrate"}, {"name": "query-balloon"}, {"name": "query-block-jobs"}]}

The "blockdev-snapshot-sync" command is reported, thus disk snapshot support should be advertised. Please move to libvirt for investigation.
(In reply to Nikolai Sednev from comment #0)
> Created attachment 943999 [details]
> snapshot fails
>
> Description of problem:
> Taking a snapshot of a rhel-guest-image-7.0-20140930.0.el7 VM fails while
> it is running on a RHEL 6.6 host.
>
> Version-Release number of selected component (if applicable):
> libvirt-client-0.10.2-46.el6.x86_64
> libvirt-0.10.2-46.el6.x86_64
> libvirt-python-0.10.2-46.el6.x86_64
> libvirt-lock-sanlock-0.10.2-46.el6.x86_64
> qemu-kvm-rhev-0.12.1.2-2.448.el6.x86_64
> vdsm-4.16.5-2.el6ev.x86_64
> sanlock-2.8-1.el6.x86_64
>
> How reproducible:
> 100%
>
> Steps to Reproduce:
> 1. First, disable SELinux on both RHEL 6.6 hosts with setenforce 0.
> 2. Try to take a live snapshot of the running guest-image VM on one of the
> RHEL 6.6 hosts.
> 3. The snapshot fails.

Sorry, just to make sure I understood things correctly.

You have a VM on a RHEL 6.6 host with the packages listed above (libvirt-0.10.2-46.el6.x86_64 et al.) installed. In turn, that VM happens to run rhel-guest-image-7.0-20140930.0.el7. But it is the RHEL 6.6 host that doesn't allow taking snapshots, right?

In other words, that RHEL 6.6 host shouldn't allow taking a snapshot of *any* VM, not only of the RHEL guest image, right?

If so, the reason is what I explained above: it looks like a libvirt issue and must be moved to libvirt.
(In reply to Francesco Romani from comment #5)
> (In reply to Nikolai Sednev from comment #0)
> [...]
>
> Sorry, just to make sure I understood things correctly.
>
> You have a VM on a RHEL 6.6 host with the packages listed above
> (libvirt-0.10.2-46.el6.x86_64 et al.) installed. In turn, that VM happens
> to run rhel-guest-image-7.0-20140930.0.el7. But it is the RHEL 6.6 host
> that doesn't allow taking snapshots, right?
>
> In other words, that RHEL 6.6 host shouldn't allow taking a snapshot of
> *any* VM, not only of the RHEL guest image, right?
>
> If so, the reason is what I explained above: it looks like a libvirt issue
> and must be moved to libvirt.

I'm importing the guest image from the export domain, creating the VM on a RHEL 7.0 host, then starting it on the RHEL 6.6 host with SELinux disabled, and then trying to take a snapshot.
BTW, with SELinux enforcing, a newly created VM without any OS doesn't start on the RHEL 6.6 host and goes to the RHEL 7.0 host instead, even though it was set to run on RHEL 6.6.
Only when I set SELinux on the RHEL 6.6 host to permissive does the VM start on RHEL 6.6, yet with this error in the WebUI:

2014-Oct-06, 15:01
The Balloon device on VM test2 on host alma03.qa.lab.tlv.redhat.com is inflated but the device cannot be controlled (guest agent is down).
(In reply to Nikolai Sednev from comment #6)
> > If so, the reason is what I explained above: it looks like a libvirt
> > issue and must be moved to libvirt.
>
> I'm importing the guest image from the export domain, creating the VM on a
> RHEL 7.0 host, then starting it on the RHEL 6.6 host with SELinux disabled,
> and then trying to take a snapshot.

OK, so the issue is live snapshot not working on RHEL 6.6, whatever OS is running in the VM. The reason is what I explained above: a libvirt issue.
(In reply to Nikolai Sednev from comment #7)
> BTW, with SELinux enforcing, a newly created VM without any OS doesn't
> start on the RHEL 6.6 host and goes to the RHEL 7.0 host instead, even
> though it was set to run on RHEL 6.6.

Should be caused by https://bugzilla.redhat.com/show_bug.cgi?id=1139873

(In reply to Nikolai Sednev from comment #8)
> Only when I set SELinux on the RHEL 6.6 host to permissive does the VM
> start on RHEL 6.6, yet with this error in the WebUI:
>
> 2014-Oct-06, 15:01
> The Balloon device on VM test2 on host alma03.qa.lab.tlv.redhat.com is
> inflated but the device cannot be controlled (guest agent is down).

This should not be critical. Let's focus on the libvirt snapshot misreporting.
(In reply to Nikolai Sednev from comment #8)
> The Balloon device on VM test2 on host alma03.qa.lab.tlv.redhat.com is
> inflated but the device cannot be controlled (guest agent is down).

so… is it running or not?
3.4.x not affected (relevant patches not in engine's 3.4.x branch)
(In reply to Michal Skrivanek from comment #11)
> (In reply to Nikolai Sednev from comment #8)
> > The Balloon device on VM test2 on host alma03.qa.lab.tlv.redhat.com is
> > inflated but the device cannot be controlled (guest agent is down).
>
> so… is it running or not?

The guest agent, you mean? It looks like it crashed, as it stopped reporting the FQDN and IP address of the host it was running on, but I don't know for sure: I have no access to the guest image (user/password), as it was imported into the environment.
I cannot reproduce this bug on:
rhevm-3.5.0-0.14.beta.el6ev.noarch
vdsm-4.14.7-3.el6ev.x86_64
qemu-kvm-rhev-0.12.1.2-2.415.el6_5.3.x86_64

I used rhevm-image-uploader instead of iso-uploader, following the steps in TCMS:

1. Upload the OVF with rhevm-image-uploader.
2. Import the template from the export domain.
3. Create a VM from the template and run it on a RHEL-6.6-20140926.0 host.
4. Click the "create snapshot" button to take a snapshot.

The snapshot was created successfully for the rhel-guest-image-7.0 (7.0-20140930.0) VM and the status shows OK.
(In reply to Wei Shi from comment #14)
> I cannot reproduce this bug on:
> rhevm-3.5.0-0.14.beta.el6ev.noarch
> vdsm-4.14.7-3.el6ev.x86_64
> qemu-kvm-rhev-0.12.1.2-2.415.el6_5.3.x86_64
>
> [...]
>
> The snapshot was created successfully for the rhel-guest-image-7.0
> (7.0-20140930.0) VM and the status shows OK.

The VM starts and the guest agent crashes. Snapshot creation failed.

Components:
rhevm-3.5.0-0.14.beta.el6ev.noarch
libvirt-0.10.2-46.el6.x86_64
vdsm-4.16.6-1.el6ev.x86_64
qemu-kvm-rhev-0.12.1.2-2.448.el6.x86_64
sanlock-2.8-1.el6.x86_64

Logs attached.
Created attachment 946382 [details] logs from 13_10_14
(In reply to Wei Shi from comment #14)
> The snapshot was created successfully for the rhel-guest-image-7.0
> (7.0-20140930.0) VM and the status shows OK.

Well, maybe you have the wrong libvirt? It is indeed not supposed to work ATM due to bug 1149667, which is what Nikolai is seeing, I believe.
*** Bug 1150428 has been marked as a duplicate of this bug. ***
I don't think there is anything else to track other than snapshots on 6.6. I'd close this as a duplicate of bug 1150609 if no one objects.
(In reply to Michal Skrivanek from comment #19)
> I don't think there is anything else to track other than snapshots on 6.6.
> I'd close this as a duplicate of bug 1150609 if no one objects.

I disagree: bug 1150609 was reported 2014-10-08 10:04 EDT by Jan Kurik, while mine was reported earlier (2014-10-05 06:42 EDT by Nikolai Sednev), and hence mine shouldn't be closed as a duplicate.
Yeah, but that one is a RHEL bug, already in VERIFIED, and addressing the real problem. Also, this should have been moved to RHEL instead of cloned as bug 1149667 anyway. For the sake of testing, feel free to give it a shot with the updated libvirt from RHBA-2014:19136-01.
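For testing, the update would be something along these lines on each RHEL 6.6 host (illustrative commands; the exact package set depends on what is installed):

# yum update libvirt libvirt-client libvirt-python libvirt-lock-sanlock
# service libvirtd restart
# service vdsmd restart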
I reproduced the bug with the latest components for RHEVM 3.5 over HE; it's not fixed yet:

qemu-kvm-rhev-0.12.1.2-2.448.el6.x86_64
ovirt-hosted-engine-setup-1.2.1-1.el6ev.noarch
libvirt-0.10.2-46.el6.x86_64
ovirt-hosted-engine-ha-1.2.4-1.el6ev.noarch
vdsm-4.16.7.1-1.el6ev.x86_64
sanlock-2.8-1.el6.x86_64
ovirt-host-deploy-1.3.0-1.el6ev.noarch
rhevm-3.5.0-0.15.beta.el6ev.noarch

BTW, bug 1149667 is in POST state, which makes this bug ASSIGNED rather than ON_QA anyway; I'm changing it now accordingly, until the issue is fixed.
Works for me on these components:

qemu-kvm-rhev-0.12.1.2-2.448.el6.x86_64
libvirt-0.10.2-46.el6_6.1.x86_64
vdsm-4.16.7.1-1.el6ev.x86_64
sanlock-2.8-1.el6.x86_64
*** Bug 1158049 has been marked as a duplicate of this bug. ***
*** Bug 1158469 has been marked as a duplicate of this bug. ***
I have upgraded from EL 6.5 to 6.6 and installed the latest libvirt. I am current with the ovirt-3.5 repo (not nightly).

# virsh -r capabilities | grep -i snapshot
<returns nothing>

# rpm -qa | grep libvirt
libvirt-0.10.2-46.el6_6.1.x86_64
libvirt-python-0.10.2-46.el6_6.1.x86_64
libvirt-client-0.10.2-46.el6_6.1.x86_64
libvirt-lock-sanlock-0.10.2-46.el6_6.1.x86_64

Linux 2.6.32-504.el6.x86_64 x86_64 x86_64 x86_64 GNU/Linux

Yet oVirt 3.5.0 still reports Live Snapshot Support: Inactive.

My packages:
qemu-kvm-rhev-0.12.1.2-2.415.el6_5.14.x86_64
vdsm-4.16.7-1.gitdb83943.el6.x86_64
libvirt-0.10.2-46.el6_6.1.x86_64
sanlock-2.8-1.el6.x86_64

In comment #24, Nikolai has components from 3.5 nightly. Should we be using components from 3.5 nightly, or will packages be ported back into the ovirt-3.5 repo?
(In reply to Scott Worthington from comment #27)
> Should we be using components from 3.5 nightly, or will packages be ported
> back into the ovirt-3.5 repo?

No, you don't need any change/update on the engine side, just the libvirt pkg, which looks ok.

Just to be sure, did you restart vdsm and the engine after the update? Also, regardless of what it says: does it actually work? (it should :-)
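On EL 6, restarting both would be roughly the following (illustrative; standard RHEL 6 service names):

on the host:
# service vdsmd restart

on the engine machine:
# service ovirt-engine restart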
To comment #28: Yes. I put the machine in maintenance, upgraded the packages, rebooted the VM host, and activated the machine. The UI still stated that Snapshot Support was Inactive.

I put the machine back into maintenance, clicked "Reinstall", and then re-activated the machine. The UI still stated that Snapshot Support was Inactive.

Next, I migrated a VM guest to the EL 6.6 host and then attempted to create a snapshot from the 3.5.0 engine interface. The UI responds:

Error while executing action:
Cannot create Snapshot. Operation not supported by QEMU.
On both the EL 6.6 and 6.5 KVM hosts, I ran:

# vdsClient -s 0 getVdsCaps | grep -i snapshot
# virsh -r capabilities | grep -i snapshot

and they both return nothing.
To comment #28 about restarting the engine: my oVirt 3.5.0 engine is on bare metal. Rebooting the machine hosting the oVirt 3.5.0 engine does not change the status of the EL 6.6 KVM host; the Live Snapshot status still says "Inactive".
(In reply to Scott Worthington from comment #31)
> To comment #28 about restarting the engine: my oVirt 3.5.0 engine is on
> bare metal. Rebooting the machine hosting the oVirt 3.5.0 engine does not
> change the status of the EL 6.6 KVM host; the Live Snapshot status still
> says "Inactive".

Unfortunately, it looks like you hit a related bug. If VDSM reports live snapshot as inactive once and nothing more after that, the Engine doesn't update its data structures and still reports the state as Inactive. A fix is underway.

In the meantime, you can alter the DB table like this:

update vds_dynamic set is_live_snapshot_support=true ...

for each of your affected hosts (I guess all of them). Make sure you do this with the Engine shut off.
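A slightly more complete sketch of that workaround, assuming the default 'engine' database owned by the postgres user and selecting the affected host by name (the psql invocation and the WHERE clause are illustrative; only the vds_dynamic column comes from the comment above):

# su - postgres -c "psql -d engine"
engine=# -- mark one host as live-snapshot capable
engine=# update vds_dynamic set is_live_snapshot_support = true
engine-#   where vds_id in (select vds_id from vds_static where vds_name = 'myhost');

Repeat for each affected host (or drop the WHERE clause to update them all), with the Engine shut off as noted above.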
master patch: http://gerrit.ovirt.org/#/c/34673/
Reproduced on the HE setup; both hosts are reported via the RHEVM WebUI as Live Snapshot Support: Inactive.

Components:
rhevm-3.5.0-0.18.beta.el6ev.noarch
ovirt-hosted-engine-setup-1.2.1-2.el6ev.noarch
qemu-kvm-rhev-0.12.1.2-2.448.el6.x86_64
libvirt-0.10.2-46.el6_6.1.x86_64
vdsm-4.16.7.2-1.el6ev.x86_64
sanlock-2.8-1.el6.x86_64
ovirt-host-deploy-1.3.0-1.el6ev.noarch
ovirt-hosted-engine-ha-1.2.4-1.el6ev.noarch

Logs attached.
Created attachment 953727 [details] logs
Created attachment 953728 [details] screenshot
Did you have a previous libvirt package installed on the host, within the same engine, without a reinstall? Remove & add the host if that is the case.
(In reply to Michal Skrivanek from comment #37)
> Did you have a previous libvirt package installed on the host, within the
> same engine, without a reinstall? Remove & add the host if that is the
> case.

Yep, here's why: https://bugzilla.redhat.com/show_bug.cgi?id=1159211#c0

Meanwhile I'm investigating the logs, to see if it is something else.
(In reply to Michal Skrivanek from comment #37)
> Did you have a previous libvirt package installed on the host, within the
> same engine, without a reinstall? Remove & add the host if that is the
> case.

With a host that was upgraded from 6.5 to 6.6 and had Live Snapshot = "Inactive" in the Engine, following the procedure of removing the VM host from the Engine and adding it back results in Live Snapshot = "Active".

Removing & adding the host resolves the issue.
(In reply to Scott Worthington from comment #39)
> With a host that was upgraded from 6.5 to 6.6 and had Live Snapshot =
> "Inactive" in the Engine, following the procedure of removing the VM host
> from the Engine and adding it back results in Live Snapshot = "Active".
>
> Removing & adding the host resolves the issue.

Alright, closing again.
*** Bug 1154364 has been marked as a duplicate of this bug. ***
(In reply to Scott Worthington from comment #39)
> With a host that was upgraded from 6.5 to 6.6 and had Live Snapshot =
> "Inactive" in the Engine, following the procedure of removing the VM host
> from the Engine and adding it back results in Live Snapshot = "Active".
>
> Removing & adding the host resolves the issue.

How can you remove and add the host if it was deployed via the hosted-engine deployment procedure at the prompt, not via the GUI? Both hosts run as HE hosts, hence they were added using HE deployment.
Remove/Add is a workaround. The missed update of the field is tracked in https://bugzilla.redhat.com/show_bug.cgi?id=1159211; the patch already landed on the 3.5 branch.
(In reply to Nikolai Sednev from comment #42)
> How can you remove and add the host if it was deployed via the
> hosted-engine deployment procedure at the prompt, not via the GUI? Both
> hosts run as HE hosts, hence they were added using HE deployment.

In my deployment, I am not using HE. I have 3 KVM hosts and one oVirt 3.5.0 Engine on bare metal.
(In reply to Scott Worthington from comment #44)
> In my deployment, I am not using HE. I have 3 KVM hosts and one oVirt 3.5.0
> Engine on bare metal.

Yeah, it has nothing to do with HE.
The workaround does not work with hosted engine; you can't add/remove the host.
So what? The bug is fixed and verified. The suitability of a workaround has no relevance to the bug itself once it's fixed and shipped live, which it has been for a month already.
shipped live in 6.6.z - https://rhn.redhat.com/errata/RHBA-2014-1720.html
(In reply to Michal Skrivanek from comment #48)
> shipped live in 6.6.z - https://rhn.redhat.com/errata/RHBA-2014-1720.html

Hi Michal,
1) Where is the fix for this bug? I don't see it within this bug; workarounds are not a solution. Please attach the fix to this bug, otherwise I will reopen it as not solved.
2) Regarding the hosted engine situation: you can't add/remove the host; you may only add it once at the prompt, not via the WebUI, as it is part of the HA setup and should be added only via "hosted-engine --deploy" at the prompt. Please describe how this scenario is fixed.
The fix is described in the comments, though it's getting a bit confusing to read all of them :)

OK, there is only one fix, in comment #48. If you manage to reproduce the problem then please feel free to reopen, but make sure you have a clean environment to start with.
Works for me on:
rhevm-3.5.0-0.20.el6ev.noarch
qemu-kvm-rhev-0.12.1.2-2.448.el6.x86_64
libvirt-0.10.2-46.el6_6.1.x86_64
vdsm-4.16.7.4-1.el6ev.x86_64
sanlock-2.8-1.el6.x86_64