Bug 1040979 - [RFE] enhance API to be able to freeze/thaw a VM before doing SAN based snapshots
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: RFEs
Version: 3.3.0
Hardware: Unspecified
OS: Unspecified
unspecified
medium
Target Milestone: ovirt-3.6.0-rc3
: 3.6.0
Assignee: Daniel Erez
QA Contact: Carlos Mestre González
URL:
Whiteboard:
Depends On: 1047142 1102457
Blocks:
 
Reported: 2013-12-12 13:02 UTC by Luc de Louw
Modified: 2016-03-09 20:34 UTC (History)
14 users

Fixed In Version:
Doc Type: Enhancement
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-03-09 20:34:11 UTC
oVirt Team: Storage
Target Upstream Version:
sherold: Triaged+


Attachments
Libvirtd, qemu, engine, packages logs (333.51 KB, application/x-gzip)
2016-02-18 13:51 UTC, Carlos Mestre González
no flags Details
libvirt and vdsm logs (2.68 MB, application/x-gzip)
2016-02-24 11:23 UTC, Carlos Mestre González
no flags Details
output of "egrep freeze|thaw vdsm.log" (5.61 KB, text/plain)
2016-02-24 13:11 UTC, Nir Soffer
no flags Details
libvirt, vdsm and engine logs (389.45 KB, application/x-gzip)
2016-02-25 13:18 UTC, Carlos Mestre González
no flags Details


Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHEA-2016:0376 normal SHIPPED_LIVE Red Hat Enterprise Virtualization Manager 3.6.0 2016-03-10 01:20:52 UTC
oVirt gerrit 44480 None None None Never

Description Luc de Louw 2013-12-12 13:02:37 UTC
Description of problem:
When doing SAN-based snapshots, there is no way to tell the VM to put the applications running on it into a backup mode (e.g. LOCK TABLES on databases).


Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. 
2.
3.

Actual results:


Expected results:
Having an API like:

<actions>
<link href="/api/vms/c957f846-d6e3-42cf-8957-8e244514f475/freeze" rel="freeze"/>
<link href="/api/vms/c957f846-d6e3-42cf-8957-8e244514f475/thaw" rel="thaw"/>
</actions>
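A client discovering these actions could parse the proposed `<actions>` listing like this (an illustrative sketch; the hrefs are taken from the example above):

```python
import xml.etree.ElementTree as ET

# The <actions> listing proposed above, as a REST API could return it.
ACTIONS_XML = """
<actions>
<link href="/api/vms/c957f846-d6e3-42cf-8957-8e244514f475/freeze" rel="freeze"/>
<link href="/api/vms/c957f846-d6e3-42cf-8957-8e244514f475/thaw" rel="thaw"/>
</actions>
"""

def action_links(actions_xml):
    """Map each action's rel attribute to its href."""
    root = ET.fromstring(actions_xml)
    return {link.get("rel"): link.get("href") for link in root.findall("link")}

links = action_links(ACTIONS_XML)
print(links["freeze"])  # /api/vms/c957f846-d6e3-42cf-8957-8e244514f475/freeze
```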


Additional info:
This could probably reuse the existing code used when doing RHEV-based snapshots.

Comment 2 Ayal Baron 2013-12-12 13:19:16 UTC
Michal, any input on this rfe?

Comment 3 Michal Skrivanek 2013-12-12 13:31:11 UTC
Given the support we added in 3.1, where vdsm makes libvirt take over the qemu-ga channel, and given that communicating with qemu-ga directly is discouraged (or not possible at the moment), we can either:
- use a custom option for qemu-ga and libvirt so libvirt doesn't "see" the agent; vdsm can then manage the connection and commands for qemu-ga
- implement the same functionality in ovirt-ga

Since the functionality is literally a few lines of code, I'd choose to implement this in ovirt-ga. There are already integration complaints about the number of different agents an oVirt guest needs to install and run.

Comment 7 Michal Skrivanek 2015-11-12 15:40:29 UTC
libvirt now has the necessary API virDomainFSFreeze and virDomainFSThaw so it can be exposed via vdsm

Comment 8 Allon Mureinik 2015-11-12 16:45:20 UTC
(In reply to Michal Skrivanek from comment #7)
> libvirt now has the necessary API virDomainFSFreeze and virDomainFSThaw so
> it can be exposed via vdsm

This integration was actually implemented in RHEV 3.6.0 as part of bug 1240636.
Moving to QA to verify.

Comment 9 Allon Mureinik 2015-11-12 16:46:18 UTC
Forgot to mention - the relevant APIs are vms/{vm_id}/freezefilesystems and vms/{vm_id}/thawfilesystems
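These per-VM action endpoints take a plain POST; a minimal sketch of building their URLs (the engine address is a hypothetical placeholder; only the path shape is taken from the comment above):

```python
def action_url(base, vm_id, action):
    """Build the REST URL for a per-VM action such as
    freezefilesystems or thawfilesystems."""
    return "{}/vms/{}/{}".format(base.rstrip("/"), vm_id, action)

base = "https://engine.example.com/api"  # hypothetical engine address
vm_id = "c957f846-d6e3-42cf-8957-8e244514f475"
print(action_url(base, vm_id, "freezefilesystems"))
# https://engine.example.com/api/vms/c957f846-d6e3-42cf-8957-8e244514f475/freezefilesystems
# The actual call would be an authenticated POST of an <action/> body to this URL.
```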

Comment 10 Tal Nisan 2016-01-04 16:03:23 UTC
Daniel, it might be worth documenting this; if you think so, please fill in the doc text

Comment 11 Daniel Erez 2016-01-10 10:08:53 UTC
(In reply to Tal Nisan from comment #10)
> Daniel, it might be worth to document it, if you think so please fill in the
> doc text

Should already be covered by bug 1285925 (handled by rhev-docs).

Comment 12 Carlos Mestre González 2016-02-17 15:36:23 UTC
Daniel,

I tried a simple call to freeze a VM (RHEL 6.7; check versions below). The response:
<action>
    <status>
        <state>failed</state>
    </status>
    <fault>
        <reason>Operation Failed</reason>
        <detail>[Guest agent non responsive]</detail>
    </fault>
</action>
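For scripted checks, the fault can be pulled out of such a response with standard XML parsing (a sketch using the response quoted above):

```python
import xml.etree.ElementTree as ET

# The failed-action response quoted above.
RESPONSE = """
<action>
    <status>
        <state>failed</state>
    </status>
    <fault>
        <reason>Operation Failed</reason>
        <detail>[Guest agent non responsive]</detail>
    </fault>
</action>
"""

def action_failed(response_xml):
    """Return (state, fault detail); detail is None when no fault is present."""
    root = ET.fromstring(response_xml)
    return root.findtext("status/state"), root.findtext("fault/detail")

state, detail = action_failed(RESPONSE)
print(state, detail)  # failed [Guest agent non responsive]
```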


engine: rhevm-3.6.2.6-0.1.el6.noarch

hypervisor:
libvirt-1.2.17-13.el7_2.2.x86_64
qemu-guest-agent-2.3.0-4.el7.x86_64
qemu-kvm-common-rhev-2.3.0-31.el7_2.6.x86_64
qemu-kvm-tools-rhev-2.3.0-31.el7_2.6.x86_64
libvirt-daemon-driver-qemu-1.2.17-13.el7_2.2.x86_64
ipxe-roms-qemu-20130517-7.gitc4bce43.el7.noarch
qemu-img-rhev-2.3.0-31.el7_2.6.x86_64
qemu-kvm-rhev-2.3.0-31.el7_2.6.x86_64
vdsm-4.17.21-0.el7ev.noarch
vdsm-xmlrpc-4.17.21-0.el7ev.noarch

guest:

[root@localhost ~]# rpm -qa | grep agent
ovirt-guest-agent-1.0.11-1.el6.noarch

on RHEL 6.7: Linux localhost.localdomain 2.6.32-555.el6.x86_64 #1 SMP Fri Apr 17 16:28:17 EDT 2015 x86_64 x86_64 x86_64 GNU/Linux

Guest agent logs (as you can see, the guest agent is running but then stops after the call at 2016-02-17 15:22:48,207::ovirt-guest-agent::102::root::Stopping oVirt guest agent):

Dummy-1::INFO::2016-02-17 15:09:27,207::OVirtAgentLogic::294::root::Received an external command: api-version...
Dummy-1::INFO::2016-02-17 15:09:27,208::OVirtAgentLogic::112::root::API Version updated from 0 to 2
MainThread::INFO::2016-02-17 15:22:48,207::ovirt-guest-agent::102::root::Stopping oVirt guest agent
CredServer::INFO::2016-02-17 15:22:48,310::CredServer::261::root::CredServer has stopped.
MainThread::INFO::2016-02-17 15:22:48,312::ovirt-guest-agent::76::root::oVirt guest agent is down.
MainThread::INFO::2016-02-17 15:22:50,081::ovirt-guest-agent::57::root::Starting oVirt guest agent
CredServer::INFO::2016-02-17 15:22:50,226::CredServer::257::root::CredServer is running...

The engine.log:
2016-02-17 17:23:45,042 WARN  [org.ovirt.engine.core.dal.job.ExecutionMessageDirector] (ajp-/127.0.0.1:8702-6) [3fc0f01f] The message key 'FreezeVm' is missing from 'bundles/ExecutionMessages'
2016-02-17 17:23:45,168 INFO  [org.ovirt.engine.core.bll.FreezeVmCommand] (ajp-/127.0.0.1:8702-6) [3fc0f01f] Running command: FreezeVmCommand internal: false. Entities affected :  ID: fc562f86-3b60-4a54-b351-dcbac2f0b997 Type: VMAction group CONFIGURE_VM_STORAGE with role type USER
2016-02-17 17:23:45,172 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.FreezeVDSCommand] (ajp-/127.0.0.1:8702-6) [3fc0f01f] START, FreezeVDSCommand(HostName = host_mixed_2, VdsAndVmIDVDSParametersBase:{runAsync='true', hostId='e1903445-198c-4c69-aae3-7aba812145ac', vmId='fc562f86-3b60-4a54-b351-dcbac2f0b997'}), log id: 7beb4d31
2016-02-17 17:23:51,181 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.FreezeVDSCommand] (ajp-/127.0.0.1:8702-6) [3fc0f01f] Failed in 'FreezeVDS' method
2016-02-17 17:23:51,209 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ajp-/127.0.0.1:8702-6) [3fc0f01f] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: VDSM host_mixed_2 command failed: Guest agent is not responding: Guest agent not available for now
2016-02-17 17:23:51,209 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.FreezeVDSCommand] (ajp-/127.0.0.1:8702-6) [3fc0f01f] Command 'org.ovirt.engine.core.vdsbroker.vdsbroker.FreezeVDSCommand' return value 'StatusOnlyReturnForXmlRpc [status=StatusForXmlRpc [code=19, message=Guest agent is not responding: Guest agent not available for now]]'
2016-02-17 17:23:51,209 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.FreezeVDSCommand] (ajp-/127.0.0.1:8702-6) [3fc0f01f] HostName = host_mixed_2
2016-02-17 17:23:51,209 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.FreezeVDSCommand] (ajp-/127.0.0.1:8702-6) [3fc0f01f] Command 'FreezeVDSCommand(HostName = host_mixed_2, VdsAndVmIDVDSParametersBase:{runAsync='true', hostId='e1903445-198c-4c69-aae3-7aba812145ac', vmId='fc562f86-3b60-4a54-b351-dcbac2f0b997'})' execution failed: VDSGenericException: VDSErrorException: Failed to FreezeVDS, error = Guest agent is not responding: Guest agent not available for now, code = 19
2016-02-17 17:23:51,209 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.FreezeVDSCommand] (ajp-/127.0.0.1:8702-6) [3fc0f01f] FINISH, FreezeVDSCommand, log id: 7beb4d31
2016-02-17 17:23:51,210 ERROR [org.ovirt.engine.core.bll.FreezeVmCommand] (ajp-/127.0.0.1:8702-6) [3fc0f01f] Command 'org.ovirt.engine.core.bll.FreezeVmCommand' failed: EngineException: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSErrorException: VDSGenericException: VDSErrorException: Failed to FreezeVDS, error = Guest agent is not responding: Guest agent not available for now, code = 19 (Failed with error nonresp and code 19)
2016-02-17 17:23:51,242 WARN  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ajp-/127.0.0.1:8702-6) [3fc0f01f] Correlation ID: 3fc0f01f, Job ID: 4c5c3aae-c0b4-4d4a-9480-0a2eed39d534, Call Stack: null, Custom Event ID: -1, Message: Failed to freeze guest filesystems on test_freeze (Host: host_mixed_2, User: admin@internal).
2016-02-17 17:23:51,271 ERROR [org.ovirt.engine.api.restapi.resource.AbstractBackendResource] (ajp-/127.0.0.1:8702-6) [] Operation Failed: [Guest agent non responsive]


Daniel, are those the proper guest agent/libvirt versions? If they are, I'm marking this as FAILED QA.

Comment 13 Michal Skrivanek 2016-02-17 15:41:30 UTC
(In reply to Carlos Mestre González from comment #12)
Carlos,
you are checking the wrong agent; qemu-guest-agent is the one that matters. What is/was the status of that one? It looks like it was down.

Comment 14 Carlos Mestre González 2016-02-17 16:01:58 UTC
checking seems qemu-ga is running (as root?): 

root       635  0.0  0.0  22576  1544 ?        Ss   13:04   0:00 /usr/bin/qemu-ga --method=virtio-serial --path=/dev/virtio-ports/org.qemu.guest_agent.0 --blacklist=guest-file-open guest-file-close guest-file-read guest-file-write guest-file-seek guest-file-flush -F/etc/qemu-ga/fsfreeze-hook

qemu      9066  8.7 28.3 1656464 1098740 ?     Sl   17:06   3:56 /usr/libexec/qemu-kvm -name test_freeze -S -machine pc-i440fx-rhel7.2.0,accel=kvm,usb=off -cpu Nehalem -m size=1048576k,slots=16,maxmem=4294967296k -realtime mlock=off -smp 1,maxcpus=16,sockets=16,cores=1,threads=1 -numa node,nodeid=0,cpus=0,mem=1024 -uuid fc562f86-3b60-4a54-b351-dcbac2f0b997 -smbios type=1,manufacturer=Red Hat,product=RHEV Hypervisor,version=7.2-9.el7,serial=e6b0ce83-bfae-403a-a6fc-6e3792d81101,uuid=fc562f86-3b60-4a54-b351-dcbac2f0b997 -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-test_freeze/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=2016-02-17T15:06:44,driftfix=slew -global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x5 -device virtio-serial-pci,id=virtio-serial0,max_ports=16,bus=pci.0,addr=0x6 -drive if=none,id=drive-ide0-1-0,readonly=on,format=raw,serial= -device ide-cd,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -drive file=/rhev/data-center/6b6602e0-bc14-41b8-8e4b-a1ef53f83750/7b00a096-46ae-4241-956d-9a8461f0227a/images/01007e14-0626-4a19-b8df-5f5c934d439e/eb7a3659-681f-4dfc-9c46-1ce175bea57b,if=none,id=drive-virtio-disk0,format=qcow2,serial=01007e14-0626-4a19-b8df-5f5c934d439e,cache=none,werror=stop,rerror=stop,aio=native -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x7,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,fd=26,id=hostnet0,vhost=on,vhostfd=27 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=00:1a:4a:16:81:bb,bus=pci.0,addr=0x3 -chardev socket,id=charchannel0,path=/var/lib/libvirt/qemu/channels/fc562f86-3b60-4a54-b351-dcbac2f0b997.com.redhat.rhevm.vdsm,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.rhevm.vdsm -chardev 
socket,id=charchannel1,path=/var/lib/libvirt/qemu/channels/fc562f86-3b60-4a54-b351-dcbac2f0b997.org.qemu.guest_agent.0,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=org.qemu.guest_agent.0 -chardev spicevmc,id=charchannel2,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=3,chardev=charchannel2,id=channel2,name=com.redhat.spice.0 -spice port=5900,tls-port=5901,addr=10.35.65.33,x509-dir=/etc/pki/vdsm/libvirt-spice,seamless-migration=on -device qxl-vga,id=video0,ram_size=67108864,vram_size=33554432,vgamem_mb=16,bus=pci.0,addr=0x2 -device intel-hda,id=sound0,bus=pci.0,addr=0x4 -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -msg timestamp=on


# file /dev/virtio-ports/org.qemu.guest_agent.0 
/dev/virtio-ports/org.qemu.guest_agent.0: symbolic link to `../vport2p2'

# file /var/lib/libvirt/qemu/channels/fc562f86-3b60-4a54-b351-dcbac2f0b997.org.qemu.guest_agent.0
/var/lib/libvirt/qemu/channels/fc562f86-3b60-4a54-b351-dcbac2f0b997.org.qemu.guest_agent.0: socket

# cat /var/log/libvirt/qemu/test_freeze.log 
2016-02-17 15:06:44.541+0000: starting up libvirt version: 1.2.17, package: 13.el7_2.2 (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>, 2015-11-23-07:46:04, x86-019.build.eng.bos.redhat.com), qemu version: 2.3.0 (qemu-kvm-rhev-2.3.0-31.el7_2.6)
LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin QEMU_AUDIO_DRV=spice /usr/libexec/qemu-kvm -name test_freeze -S -machine pc-i440fx-rhel7.2.0,accel=kvm,usb=off -cpu Nehalem -m size=1048576k,slots=16,maxmem=4294967296k -realtime mlock=off -smp 1,maxcpus=16,sockets=16,cores=1,threads=1 -numa node,nodeid=0,cpus=0,mem=1024 -uuid fc562f86-3b60-4a54-b351-dcbac2f0b997 -smbios type=1,manufacturer=Red Hat,product=RHEV Hypervisor,version=7.2-9.el7,serial=e6b0ce83-bfae-403a-a6fc-6e3792d81101,uuid=fc562f86-3b60-4a54-b351-dcbac2f0b997 -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-test_freeze/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=2016-02-17T15:06:44,driftfix=slew -global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x5 -device virtio-serial-pci,id=virtio-serial0,max_ports=16,bus=pci.0,addr=0x6 -drive if=none,id=drive-ide0-1-0,readonly=on,format=raw,serial= -device ide-cd,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -drive file=/rhev/data-center/6b6602e0-bc14-41b8-8e4b-a1ef53f83750/7b00a096-46ae-4241-956d-9a8461f0227a/images/01007e14-0626-4a19-b8df-5f5c934d439e/eb7a3659-681f-4dfc-9c46-1ce175bea57b,if=none,id=drive-virtio-disk0,format=qcow2,serial=01007e14-0626-4a19-b8df-5f5c934d439e,cache=none,werror=stop,rerror=stop,aio=native -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x7,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,fd=26,id=hostnet0,vhost=on,vhostfd=27 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=00:1a:4a:16:81:bb,bus=pci.0,addr=0x3 -chardev socket,id=charchannel0,path=/var/lib/libvirt/qemu/channels/fc562f86-3b60-4a54-b351-dcbac2f0b997.com.redhat.rhevm.vdsm,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.rhevm.vdsm -chardev 
socket,id=charchannel1,path=/var/lib/libvirt/qemu/channels/fc562f86-3b60-4a54-b351-dcbac2f0b997.org.qemu.guest_agent.0,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=org.qemu.guest_agent.0 -chardev spicevmc,id=charchannel2,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=3,chardev=charchannel2,id=channel2,name=com.redhat.spice.0 -spice port=5900,tls-port=5901,addr=10.35.65.33,x509-dir=/etc/pki/vdsm/libvirt-spice,seamless-migration=on -device qxl-vga,id=video0,ram_size=67108864,vram_size=33554432,vgamem_mb=16,bus=pci.0,addr=0x2 -device intel-hda,id=sound0,bus=pci.0,addr=0x4 -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -msg timestamp=on
2016-02-17 15:06:44.547+0000: 9066: info : libvirt version: 1.2.17, package: 13.el7_2.2 (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>, 2015-11-23-07:46:04, x86-019.build.eng.bos.redhat.com)
2016-02-17 15:06:44.547+0000: 9066: info : virObjectUnref:259 : OBJECT_UNREF: obj=0x7f8eb816a8c0
2016-02-17T15:06:44.807550Z qemu-kvm: warning: CPU(s) not present in any NUMA nodes: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
2016-02-17T15:06:44.808075Z qemu-kvm: warning: All CPU(s) up to maxcpus should be described in NUMA config
main_channel_link: add main channel client
main_channel_handle_parsed: net test: latency 190.350000 ms, bitrate 1620599 bps (1.545524 Mbps) LOW BANDWIDTH
inputs_connect: inputs channel client create
red_dispatcher_set_cursor_peer: 
red_channel_client_disconnect: rcc=0x7f96b7dfa000 (channel=0x7f96b706e000 type=3 id=0)
red_channel_client_disconnect: rcc=0x7f96b8c60000 (channel=0x7f96b7812600 type=2 id=0)
red_channel_client_disconnect: rcc=0x7f96b7e04000 (channel=0x7f96b7054580 type=4 id=0)
red_channel_client_disconnect: rcc=0x7f96b84d9000 (channel=0x7f96b7066000 type=1 id=0)
main_channel_client_on_disconnect: rcc=0x7f96b84d9000
red_client_destroy: destroy client 0x7f96b6f55c80 with #channels=6
red_dispatcher_disconnect_cursor_peer: 
red_channel_client_disconnect_dummy: rcc=0x7f96b7de3000 (channel=0x7f96b8d53b80 type=6 id=0)
snd_channel_put: SndChannel=0x7f96b7de8000 freed
red_channel_client_disconnect_dummy: rcc=0x7f96b7dca000 (channel=0x7f96b8d53a20 type=5 id=0)
snd_channel_put: SndChannel=0x7f96bc78e000 freed
red_dispatcher_disconnect_display_peer:

Comment 15 Michal Skrivanek 2016-02-17 16:58:22 UTC
(In reply to Carlos Mestre González from comment #14)
> checking seems qemu-ga is running (as root?): 

ok, then please include libvirt debug logs showing the call failed and we can move this bug down the stack

Comment 16 Carlos Mestre González 2016-02-18 13:51:49 UTC
Created attachment 1128224 [details]
Libvirtd, qemu, engine, packages logs

I made a new run and collected the logs; the results are the same as I mentioned before.

Attached are the libvirtd, qemu and engine logs, along with the packages installed on the hypervisor and the kernel version of the guest.

Comment 17 Carlos Mestre González 2016-02-18 14:03:59 UTC
(In reply to Michal Skrivanek from comment #15)
> (In reply to Carlos Mestre González from comment #14)
> > checking seems qemu-ga is running (as root?): 
> 
> ok, then please include libvirt debug logs showing the call failed and we
> can move this bug down the stack

Hi Michal,

I provided the logs; please take a look. I'll mark the bug as ASSIGNED if this is indeed a real issue.

Comment 18 Michal Skrivanek 2016-02-19 10:24:09 UTC
It still looks like qemu-guest-agent is not running in the VM, as your rpm query doesn't contain the package at all. How come you don't even have it installed?
The host side is irrelevant.

Comment 19 Carlos Mestre González 2016-02-22 12:54:49 UTC
Hi Michal,

You're right, our guest images didn't have the qemu-guest-agent package installed. I tested the basic flow and it seems to work. I need to run more tests to make sure it works properly and then I'll verify the bug.

Comment 20 Carlos Mestre González 2016-02-23 22:12:54 UTC
Hi,

I wrote a set of test cases for this feature (I've sent the Polarion plan with the cases to the list).

The feature works as expected in most cases, but there are a few issues (both may have the same root cause):

1) All thaw calls succeed, regardless of whether the guest filesystem is in a frozen state or not.

2) After a thaw call, every freeze POST returns:

<reason>Operation Failed</reason>
<detail>[Unable to freeze guest filesystems]</detail>

and the engine event:

VDSM host_mixed_1 command failed: internal error: unable to execute QEMU agent command 'guest-fsfreeze-freeze': The command guest-fsfreeze-freeze has been disabled for this instance.

Failed to freeze guest filesystems on test_freeze_multiple_disks

even though the operation actually SUCCEEDS (the guest file system is frozen).

Here are the scenarios to reproduce this issue:

Scenario 1)
1. Create a vm and start it
2. call the freeze action => POST succeeds, guest filesystem is frozen
3. call the thaw action   => POST succeeds, guest filesystem is unfrozen
4. call the freeze action => POST operation shows an error, but the guest filesystem is frozen.

Scenario 2)
1. Create a vm and start it
2. Call the thaw action => POST succeeds (even though the guest filesystem is not frozen)
3. Call the freeze action => POST operation shows an error, but the guest filesystem is frozen.

I'll submit libvirt logs.

Daniel, do you have any preliminary comments on this?

Comment 21 Nir Soffer 2016-02-24 07:30:42 UTC
(In reply to Carlos Mestre González from comment #20)
> Here are the scenarios to reproduce this issue:
> 
> Scenario 1)
> 1. Create a vm and start it
> 2. call the freeze action => POST succeeds, guest filesystem is frozen
> 3. call the thaw action   => POST succeeds, guest filesystem is unfrozen
> 4. call the freeze action => POST operation show an error, but the guest
> system is frozen.
> 
> Scenario 2)
> 1. Create a vm and start it
> 2. Call the thaw action => POST succeeds (even though the guest filesystem
> is not frozen)
> 3. Call the freeze action => POST operation show an error, but the guest
> system is frozen.

This looks like a libvirt issue.

To make sure that this is a libvirt issue, please try to reproduce this 
with virsh. You can freeze and thaw using virsh like this:

1. Start the vm
2. Enter virsh shell
   # virsh
3. List the vms
   virsh # list
4. Enter credentials
   user: vdsm@ovirt
   pass: shibboleth
5. Find the vm id
6. Freeze the vm
   virsh # domfsfreeze <domain id>
7. Thaw
   virsh # domfsthaw <domain id>

Adding Eric for advice on debugging this.

Comment 22 Carlos Mestre González 2016-02-24 11:14:10 UTC
Hi Nir,

I tested with virsh and it seems to work as expected (I don't see the error after a freeze command unless the guest is already frozen):

virsh # domfsfreeze 2
Froze 2 filesystem(s)

virsh # domfsfreeze 2
error: Unable to freeze filesystems
error: internal error: unable to execute QEMU agent command 'guest-fsfreeze-freeze': The command guest-fsfreeze-freeze has been disabled for this instance

virsh # domfsthaw 2
Thawed 2 filesystem(s)

virsh # domfsthaw 2
Thawed 0 filesystem(s)

virsh # domfsfreeze 2
Froze 2 filesystem(s)
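The transcript above is consistent with a simple state machine, sketched here (an illustrative toy model of the observed behavior, NOT qemu-ga's actual implementation): a second freeze while frozen is rejected with the "disabled" error, and thaw always succeeds, reporting how many filesystems it actually thawed.

```python
class FsFreezeModel:
    """Toy model of the observed guest-agent freeze/thaw behavior."""

    def __init__(self, nfs=2):
        self.nfs = nfs        # number of guest filesystems
        self.frozen = False

    def freeze(self):
        if self.frozen:
            # Matches: "The command guest-fsfreeze-freeze has been
            # disabled for this instance"
            raise RuntimeError("guest-fsfreeze-freeze has been disabled")
        self.frozen = True
        return self.nfs       # "Froze 2 filesystem(s)"

    def thaw(self):
        thawed = self.nfs if self.frozen else 0
        self.frozen = False
        return thawed         # "Thawed 2 filesystem(s)" or "Thawed 0 ..."

m = FsFreezeModel()
assert m.freeze() == 2        # Froze 2 filesystem(s)
try:
    m.freeze()                # second freeze fails while frozen
except RuntimeError as e:
    print(e)
assert m.thaw() == 2          # Thawed 2 filesystem(s)
assert m.thaw() == 0          # Thawed 0 filesystem(s)
assert m.freeze() == 2        # freeze works again after thaw
```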

The issue then seems to be vdsm reporting it wrong?

Packages versions:
vdsm-4.17.21-0.el7ev.noarch
libvirt-1.2.17-13.el7_2.3.x86_64
libvirt-client-1.2.17-13.el7_2.3.x86_64
qemu-kvm-common-rhev-2.3.0-31.el7_2.7.x86_64
qemu-kvm-tools-rhev-2.3.0-31.el7_2.7.x86_64
qemu-img-rhev-2.3.0-31.el7_2.7.x86_64
qemu-guest-agent-2.3.0-4.el7.x86_64
qemu-kvm-rhev-2.3.0-31.el7_2.7.x86_64
libvirt-daemon-driver-qemu-1.2.17-13.el7_2.3.x86_64

guest:
qemu-guest-agent-0.12.1.2-2.479.el6.x86_64

Comment 23 Carlos Mestre González 2016-02-24 11:23:43 UTC
Created attachment 1130181 [details]
libvirt and vdsm logs

Adding libvirtd and vdsm logs for the error shown when calling freeze (remember, the guest VM filesystem is frozen but vdsm/engine reports an error).

In the vdsm log it starts with:

jsonrpc.Executor/3::DEBUG::2016-02-24 13:05:36,811::__init__::503::jsonrpc.JsonRpcServer::(_serveRequest) Calling 'VM.freeze' in bridge with {u'vmID': u'52caf69e-eb99-4cba-bc23-4571262ac82a'}

The libvirt log is in a different timezone, sorry about that; it should be 2016-02-24 10:05:36.

Comment 24 Carlos Mestre González 2016-02-24 11:54:55 UTC
Adding Eric back for advice on debugging this.

Comment 25 Nir Soffer 2016-02-24 13:11:26 UTC
Created attachment 1130222 [details]
output of "egrep freeze|thaw vdsm.log"

Comment 26 Nir Soffer 2016-02-24 13:34:04 UTC
I found the issue - freeze and thaw return an error, but the error is
swallowed by the jsonrpc bridge.

Thanks for detecting this!
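The failure mode can be illustrated with a minimal sketch (not vdsm's actual code): a bridge that calls the API method but ignores its returned status will report success even when the method failed.

```python
def freeze(vmID):
    """Stand-in API method that reports failure via its return value,
    mimicking the status dict seen in the logs above."""
    return {"status": {"code": 19, "message": "Guest agent is not responding"}}

def serve_request_buggy(method, **params):
    """Sketch of the bug: the returned status is discarded and the
    bridge unconditionally reports success."""
    method(**params)
    return True

def serve_request_fixed(method, **params):
    """Sketch of the fix: a non-zero status code is propagated as an error."""
    result = method(**params)
    if result.get("status", {}).get("code", 0) != 0:
        raise RuntimeError(result["status"]["message"])
    return True

# The buggy bridge swallows the error and returns True.
print(serve_request_buggy(freeze, vmID="52caf69e-eb99-4cba-bc23-4571262ac82a"))
```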

Comment 27 Nir Soffer 2016-02-24 13:36:51 UTC
Moving to urgent, since freeze errors are hidden, which may lead to losing
user data.

Example flow:

1. Freeze fails but the jsonrpc bridge returns a successful response
2. The user performs a snapshot but the guest filesystems are not frozen
3. The user restores a snapshot with inconsistent filesystems

Comment 28 Nir Soffer 2016-02-24 13:50:41 UTC
Please attach engine logs - jsonrpc is not logging errors, so we don't know 
what happens after an error.

Looking at the log again, this may be only a vdsm logging issue: the errors
are not shown in the log.

Comment 29 Nir Soffer 2016-02-24 14:08:17 UTC
(In reply to Carlos Mestre González from comment #22)

Looking at filtered vdsm log (attachment 1130222 [details])

1. Successful freeze

jsonrpc.Executor/3::DEBUG::2016-02-24 13:05:36,811::__init__::503::jsonrpc.JsonRpcServer::(_serveRequest) Calling 'VM.freeze' in bridge with {u'vmID': u'52caf69e-eb99-4cba-bc23-4571262ac82a'}
jsonrpc.Executor/3::INFO::2016-02-24 13:05:36,812::vm::3018::virt.vm::(freeze) vmId=`52caf69e-eb99-4cba-bc23-4571262ac82a`::Freezing guest filesystems
jsonrpc.Executor/3::INFO::2016-02-24 13:05:36,879::vm::3033::virt.vm::(freeze) vmId=`52caf69e-eb99-4cba-bc23-4571262ac82a`::2 guest filesystems frozen
jsonrpc.Executor/3::DEBUG::2016-02-24 13:05:36,879::__init__::533::jsonrpc.JsonRpcServer::(_serveRequest) Return 'VM.freeze' in bridge with True

2. Successful thaw

jsonrpc.Executor/6::DEBUG::2016-02-24 13:05:50,130::__init__::503::jsonrpc.JsonRpcServer::(_serveRequest) Calling 'VM.thaw' in bridge with {u'vmID': u'52caf69e-eb99-4cba-bc23-4571262ac82a'}
jsonrpc.Executor/6::INFO::2016-02-24 13:05:50,131::vm::3041::virt.vm::(thaw) vmId=`52caf69e-eb99-4cba-bc23-4571262ac82a`::Thawing guest filesystems
jsonrpc.Executor/6::INFO::2016-02-24 13:05:50,140::vm::3056::virt.vm::(thaw) vmId=`52caf69e-eb99-4cba-bc23-4571262ac82a`::2 guest filesystems thawed
jsonrpc.Executor/6::DEBUG::2016-02-24 13:05:50,140::__init__::533::jsonrpc.JsonRpcServer::(_serveRequest) Return 'VM.thaw' in bridge with True

3. Successful thaw

jsonrpc.Executor/7::DEBUG::2016-02-24 13:05:50,367::__init__::503::jsonrpc.JsonRpcServer::(_serveRequest) Calling 'VM.thaw' in bridge with {u'vmID': u'52caf69e-eb99-4cba-bc23-4571262ac82a'}
jsonrpc.Executor/7::INFO::2016-02-24 13:05:50,368::vm::3041::virt.vm::(thaw) vmId=`52caf69e-eb99-4cba-bc23-4571262ac82a`::Thawing guest filesystems
jsonrpc.Executor/7::INFO::2016-02-24 13:05:50,370::vm::3056::virt.vm::(thaw) vmId=`52caf69e-eb99-4cba-bc23-4571262ac82a`::0 guest filesystems thawed
jsonrpc.Executor/7::DEBUG::2016-02-24 13:05:50,370::__init__::533::jsonrpc.JsonRpcServer::(_serveRequest) Return 'VM.thaw' in bridge with True

4. Successful freeze

jsonrpc.Executor/2::DEBUG::2016-02-24 13:06:03,623::__init__::503::jsonrpc.JsonRpcServer::(_serveRequest) Calling 'VM.freeze' in bridge with {u'vmID': u'52caf69e-eb99-4cba-bc23-4571262ac82a'}
jsonrpc.Executor/2::INFO::2016-02-24 13:06:03,624::vm::3018::virt.vm::(freeze) vmId=`52caf69e-eb99-4cba-bc23-4571262ac82a`::Freezing guest filesystems
jsonrpc.Executor/2::INFO::2016-02-24 13:06:03,655::vm::3033::virt.vm::(freeze) vmId=`52caf69e-eb99-4cba-bc23-4571262ac82a`::2 guest filesystems frozen
jsonrpc.Executor/2::DEBUG::2016-02-24 13:06:03,655::__init__::533::jsonrpc.JsonRpcServer::(_serveRequest) Return 'VM.freeze' in bridge with True

5. Failed freeze (expected)

jsonrpc.Executor/3::DEBUG::2016-02-24 13:06:03,723::__init__::503::jsonrpc.JsonRpcServer::(_serveRequest) Calling 'VM.freeze' in bridge with {u'vmID': u'52caf69e-eb99-4cba-bc23-4571262ac82a'}
jsonrpc.Executor/3::INFO::2016-02-24 13:06:03,724::vm::3018::virt.vm::(freeze) vmId=`52caf69e-eb99-4cba-bc23-4571262ac82a`::Freezing guest filesystems
jsonrpc.Executor/3::WARNING::2016-02-24 13:06:03,726::vm::3023::virt.vm::(freeze) vmId=`52caf69e-eb99-4cba-bc23-4571262ac82a`::Unable to freeze guest filesystems: internal error: unable to execute QEMU agent command 'guest-fsfreeze-freeze': The command guest-fsfreeze-freeze has been disabled for this instance

6. Successful thaw

jsonrpc.Executor/6::DEBUG::2016-02-24 13:06:22,358::__init__::503::jsonrpc.JsonRpcServer::(_serveRequest) Calling 'VM.thaw' in bridge with {u'vmID': u'52caf69e-eb99-4cba-bc23-4571262ac82a'}
jsonrpc.Executor/6::INFO::2016-02-24 13:06:22,358::vm::3041::virt.vm::(thaw) vmId=`52caf69e-eb99-4cba-bc23-4571262ac82a`::Thawing guest filesystems
jsonrpc.Executor/6::INFO::2016-02-24 13:06:22,363::vm::3056::virt.vm::(thaw) vmId=`52caf69e-eb99-4cba-bc23-4571262ac82a`::2 guest filesystems thawed
jsonrpc.Executor/6::DEBUG::2016-02-24 13:06:22,364::__init__::533::jsonrpc.JsonRpcServer::(_serveRequest) Return 'VM.thaw' in bridge with True

7. Successful thaw

jsonrpc.Executor/7::DEBUG::2016-02-24 13:06:22,460::__init__::503::jsonrpc.JsonRpcServer::(_serveRequest) Calling 'VM.thaw' in bridge with {u'vmID': u'52caf69e-eb99-4cba-bc23-4571262ac82a'}
jsonrpc.Executor/7::INFO::2016-02-24 13:06:22,461::vm::3041::virt.vm::(thaw) vmId=`52caf69e-eb99-4cba-bc23-4571262ac82a`::Thawing guest filesystems
jsonrpc.Executor/7::INFO::2016-02-24 13:06:22,463::vm::3056::virt.vm::(thaw) vmId=`52caf69e-eb99-4cba-bc23-4571262ac82a`::0 guest filesystems thawed
jsonrpc.Executor/7::DEBUG::2016-02-24 13:06:22,463::__init__::533::jsonrpc.JsonRpcServer::(_serveRequest) Return 'VM.thaw' in bridge with True

8. Successful freeze

jsonrpc.Executor/1::DEBUG::2016-02-24 13:06:30,277::__init__::503::jsonrpc.JsonRpcServer::(_serveRequest) Calling 'VM.freeze' in bridge with {u'vmID': u'52caf69e-eb99-4cba-bc23-4571262ac82a'}
jsonrpc.Executor/1::INFO::2016-02-24 13:06:30,277::vm::3018::virt.vm::(freeze) vmId=`52caf69e-eb99-4cba-bc23-4571262ac82a`::Freezing guest filesystems
jsonrpc.Executor/1::INFO::2016-02-24 13:06:30,304::vm::3033::virt.vm::(freeze) vmId=`52caf69e-eb99-4cba-bc23-4571262ac82a`::2 guest filesystems frozen
jsonrpc.Executor/1::DEBUG::2016-02-24 13:06:30,304::__init__::533::jsonrpc.JsonRpcServer::(_serveRequest) Return 'VM.freeze' in bridge with True

9. Failed freeze (expected)

jsonrpc.Executor/2::DEBUG::2016-02-24 13:06:30,358::__init__::503::jsonrpc.JsonRpcServer::(_serveRequest) Calling 'VM.freeze' in bridge with {u'vmID': u'52caf69e-eb99-4cba-bc23-4571262ac82a'}
jsonrpc.Executor/2::INFO::2016-02-24 13:06:30,359::vm::3018::virt.vm::(freeze) vmId=`52caf69e-eb99-4cba-bc23-4571262ac82a`::Freezing guest filesystems
jsonrpc.Executor/2::WARNING::2016-02-24 13:06:30,361::vm::3023::virt.vm::(freeze) vmId=`52caf69e-eb99-4cba-bc23-4571262ac82a`::Unable to freeze guest filesystems: internal error: unable to execute QEMU agent command 'guest-fsfreeze-freeze': The command guest-fsfreeze-freeze has been disabled for this instance

The only issue I see is a missing error log in the jsonrpc server.
We can check the vdsm return value in the engine log.

This also matches the libvirt output:

> virsh # domfsfreeze 2
> Froze 2 filesystem(s)
> 
> virsh # domfsfreeze 2
> error: Unable to freeze filesystems
> error: internal error: unable to execute QEMU agent command
> 'guest-fsfreeze-freeze': The command guest-fsfreeze-freeze has been disabled
> for this instance
> 
> virsh # domfsthaw 2
> Thawed 2 filesystem(s)
> 
> virsh # domfsthaw 2
> Thawed 0 filesystem(s)
> 
> virsh # domfsfreeze 2
> Froze 2 filesystem(s)

I don't see the bug you describe.

Carlos, can you explain where the bug is?

Moving back to ON_QA, since it seems that this was a false alarm.

Comment 30 Nir Soffer 2016-02-24 14:17:53 UTC
Returning the bug to the original owner, sorry for the noise.

Comment 31 Carlos Mestre González 2016-02-24 15:48:58 UTC
(In reply to Nir Soffer from comment #29)
> (In reply to Carlos Mestre González from comment #22)
> 
> Looking at filtered vdsm log (attachment 1130222 [details])

> 7. Successful thaw
> 
> jsonrpc.Executor/7::DEBUG::2016-02-24
> 13:06:22,460::__init__::503::jsonrpc.JsonRpcServer::(_serveRequest) Calling
> 'VM.thaw' in bridge with {u'vmID': u'52caf69e-eb99-4cba-bc23-4571262ac82a'}
> jsonrpc.Executor/7::INFO::2016-02-24 13:06:22,461::vm::3041::virt.vm::(thaw)
> vmId=`52caf69e-eb99-4cba-bc23-4571262ac82a`::Thawing guest filesystems
> jsonrpc.Executor/7::INFO::2016-02-24 13:06:22,463::vm::3056::virt.vm::(thaw)
> vmId=`52caf69e-eb99-4cba-bc23-4571262ac82a`::0 guest filesystems thawed
> jsonrpc.Executor/7::DEBUG::2016-02-24
> 13:06:22,463::__init__::533::jsonrpc.JsonRpcServer::(_serveRequest) Return
> 'VM.thaw' in bridge with True
> 
> 8. Successful freeze
> 
> jsonrpc.Executor/1::DEBUG::2016-02-24
> 13:06:30,277::__init__::503::jsonrpc.JsonRpcServer::(_serveRequest) Calling
> 'VM.freeze' in bridge with {u'vmID': u'52caf69e-eb99-4cba-bc23-4571262ac82a'}
> jsonrpc.Executor/1::INFO::2016-02-24
> 13:06:30,277::vm::3018::virt.vm::(freeze)
> vmId=`52caf69e-eb99-4cba-bc23-4571262ac82a`::Freezing guest filesystems
> jsonrpc.Executor/1::INFO::2016-02-24
> 13:06:30,304::vm::3033::virt.vm::(freeze)
> vmId=`52caf69e-eb99-4cba-bc23-4571262ac82a`::2 guest filesystems frozen
> jsonrpc.Executor/1::DEBUG::2016-02-24
> 13:06:30,304::__init__::533::jsonrpc.JsonRpcServer::(_serveRequest) Return
> 'VM.freeze' in bridge with True
> 
> 9. Failed freeze (expected)
> 
> jsonrpc.Executor/2::DEBUG::2016-02-24
> 13:06:30,358::__init__::503::jsonrpc.JsonRpcServer::(_serveRequest) Calling
> 'VM.freeze' in bridge with {u'vmID': u'52caf69e-eb99-4cba-bc23-4571262ac82a'}
> jsonrpc.Executor/2::INFO::2016-02-24
> 13:06:30,359::vm::3018::virt.vm::(freeze)
> vmId=`52caf69e-eb99-4cba-bc23-4571262ac82a`::Freezing guest filesystems
> jsonrpc.Executor/2::WARNING::2016-02-24
> 13:06:30,361::vm::3023::virt.vm::(freeze)
> vmId=`52caf69e-eb99-4cba-bc23-4571262ac82a`::Unable to freeze guest
> filesystems: internal error: unable to execute QEMU agent command
> 'guest-fsfreeze-freeze': The command guest-fsfreeze-freeze has been disabled
> for this instance

> Moving back to ON_QA, since it seems that this was false alarm.

Nir, as in my comment 20, the issue is that when I call freezefilesystems via a POST to the REST API, the engine returns "Unable to freeze the guest filesystem" even though the guest filesystem is indeed FROZEN. This happens after a thawfilesystems call (see my second scenario in comment 20).

As per my comment 22, I tested this with virsh and it seems fine (not an issue).

If you check the log, the successful freeze (8) is at 13:06:30,277 and the second (9) is at 13:06:30,358... 100 milliseconds later, so I'm guessing the REST POST is calling VM.freeze twice, even though there is only one POST call, and returning the result of the last one.

Comment 32 Nir Soffer 2016-02-24 16:28:13 UTC
(In reply to Carlos Mestre González from comment #31)
> (In reply to Nir Soffer from comment #29)
> > (In reply to Carlos Mestre González from comment #22)
> > 
> > Looking at filtered vdsm log (attachment 1130222 [details])
> 
> > 7. Successful thaw
> > 
> > jsonrpc.Executor/7::DEBUG::2016-02-24
> > 13:06:22,460::__init__::503::jsonrpc.JsonRpcServer::(_serveRequest) Calling
> > 'VM.thaw' in bridge with {u'vmID': u'52caf69e-eb99-4cba-bc23-4571262ac82a'}
> > jsonrpc.Executor/7::INFO::2016-02-24 13:06:22,461::vm::3041::virt.vm::(thaw)
> > vmId=`52caf69e-eb99-4cba-bc23-4571262ac82a`::Thawing guest filesystems
> > jsonrpc.Executor/7::INFO::2016-02-24 13:06:22,463::vm::3056::virt.vm::(thaw)
> > vmId=`52caf69e-eb99-4cba-bc23-4571262ac82a`::0 guest filesystems thawed
> > jsonrpc.Executor/7::DEBUG::2016-02-24
> > 13:06:22,463::__init__::533::jsonrpc.JsonRpcServer::(_serveRequest) Return
> > 'VM.thaw' in bridge with True
> > 
> > 8. Successful freeze
> > 
> > jsonrpc.Executor/1::DEBUG::2016-02-24
> > 13:06:30,277::__init__::503::jsonrpc.JsonRpcServer::(_serveRequest) Calling
> > 'VM.freeze' in bridge with {u'vmID': u'52caf69e-eb99-4cba-bc23-4571262ac82a'}
> > jsonrpc.Executor/1::INFO::2016-02-24
> > 13:06:30,277::vm::3018::virt.vm::(freeze)
> > vmId=`52caf69e-eb99-4cba-bc23-4571262ac82a`::Freezing guest filesystems
> > jsonrpc.Executor/1::INFO::2016-02-24
> > 13:06:30,304::vm::3033::virt.vm::(freeze)
> > vmId=`52caf69e-eb99-4cba-bc23-4571262ac82a`::2 guest filesystems frozen
> > jsonrpc.Executor/1::DEBUG::2016-02-24
> > 13:06:30,304::__init__::533::jsonrpc.JsonRpcServer::(_serveRequest) Return
> > 'VM.freeze' in bridge with True
> > 
> > 9. Failed freeze (expected)
> > 
> > jsonrpc.Executor/2::DEBUG::2016-02-24
> > 13:06:30,358::__init__::503::jsonrpc.JsonRpcServer::(_serveRequest) Calling
> > 'VM.freeze' in bridge with {u'vmID': u'52caf69e-eb99-4cba-bc23-4571262ac82a'}
> > jsonrpc.Executor/2::INFO::2016-02-24
> > 13:06:30,359::vm::3018::virt.vm::(freeze)
> > vmId=`52caf69e-eb99-4cba-bc23-4571262ac82a`::Freezing guest filesystems
> > jsonrpc.Executor/2::WARNING::2016-02-24
> > 13:06:30,361::vm::3023::virt.vm::(freeze)
> > vmId=`52caf69e-eb99-4cba-bc23-4571262ac82a`::Unable to freeze guest
> > filesystems: internal error: unable to execute QEMU agent command
> > 'guest-fsfreeze-freeze': The command guest-fsfreeze-freeze has been disabled
> > for this instance
> 
> > Moving back to ON_QA, since it seems that this was false alarm.
> 
> Nir, as my comment 20, the issue is when I call via POST rest API the
> freezefilesystem, the engine returns an "Unable to freeze the guest
> filesystem", even thought the guest filesystem is indeed FROZEN, this is
> after a thawfilesystem call (see my second scenario in comment 20)

There are no logs here showing how you called the REST API.

On the vdsm side, everything is fine, and behaves exactly like virsh.

> From my comment 22 I tested this with virsh and seems fine (not an issue)

This was your second freeze call in virsh:

> virsh # domfsfreeze 2
> error: Unable to freeze filesystems
> error: internal error: unable to execute QEMU agent command
> 'guest-fsfreeze-freeze': The command guest-fsfreeze-freeze has been disabled
> for this instance

It failed just like your second freeze call in vdsm.

> If you check the log, the successful freeze (8) is at  13:06:30,277, the
> second (9) is at 
> 13:06:30,358... 100 miliseconds after so I'm guessing the POST on rest is
> calling the Vm.freeze twice even though there's only one POST rest call and
> returning the last one.

There is no need to guess, we can see this in the engine log.

Please attach engine.log showing the rest api calls.

Comment 33 Carlos Mestre González 2016-02-24 17:06:54 UTC
(In reply to Nir Soffer from comment #32)
> (In reply to Carlos Mestre González from comment #31)
> > (In reply to Nir Soffer from comment #29)
> > > (In reply to Carlos Mestre González from comment #22)
> 
> There is no need to guess, we can see this in engine log.
> 
> Please attach engine.log showing the rest api calls.

That call was made with the REST API; I made it myself and remember the exact time, which is why I stated that in the log descriptions.

Still, I don't see how I could have made the two calls less than a second apart manually.

I had an issue with my environment, so I'll attach new logs including the engine log next time.

Comment 34 Carlos Mestre González 2016-02-25 13:16:55 UTC
I did a new run:

1. Create and start the VM
2. Run freeze and thaw a few times.

Result: after the 2nd or 3rd run of freeze/thaw, the freeze command returns a failure from REST.

From engine.log, after the thaw I call a freeze from the REST API:
2016-02-25 14:54:53,777 INFO  [org.ovirt.engine.core.bll.ThawVmCommand] (ajp-/127.0.0.1:8702-9) [42a5590c] Running command: ThawVmCommand internal: false. Entities affected :  ID: caf8ad91-8681-40bb-bc26-d76fed873a59 Type: VMAction group CONFIGURE_VM_STORAGE with role type USER
2016-02-25 14:54:53,783 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.ThawVDSCommand] (ajp-/127.0.0.1:8702-9) [42a5590c] START, ThawVDSCommand(HostName = host_mixed_1, VdsAndVmIDVDSParametersBase:{runAsync='true', hostId='2999eacf-7123-4b0e-bfc5-cd5be8073dbc', vmId='caf8ad91-8681-40bb-bc26-d76fed873a59'}), log id: 51bd018
2016-02-25 14:54:54,793 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.ThawVDSCommand] (ajp-/127.0.0.1:8702-9) [42a5590c] FINISH, ThawVDSCommand, log id: 51bd018
2016-02-25 14:54:54,812 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ajp-/127.0.0.1:8702-9) [42a5590c] Correlation ID: 42a5590c, Job ID: 5d65f356-4aa9-4283-aef5-3e9fdf5c8534, Call Stack: null, Custom Event ID: -1, Message: Guest filesystems on VM golden_env_mixed_virtio_1_0 have been thawed successfully.
2016-02-25 14:54:54,836 INFO  [org.ovirt.engine.core.bll.aaa.LogoutSessionCommand] (ajp-/127.0.0.1:8702-9) [3913b31f] Running command: LogoutSessionCommand internal: false.
2016-02-25 14:54:54,847 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ajp-/127.0.0.1:8702-9) [3913b31f] Correlation ID: 3913b31f, Call Stack: null, Custom Event ID: -1, Message: User admin@internal logged out.
2016-02-25 14:55:34,452 INFO  [org.ovirt.engine.core.bll.aaa.LoginUserCommand] (ajp-/127.0.0.1:8702-12) [] Running command: LoginUserCommand internal: false.
2016-02-25 14:55:34,459 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ajp-/127.0.0.1:8702-12) [] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: User admin@internal logged in.
2016-02-25 14:55:34,486 WARN  [org.ovirt.engine.core.dal.job.ExecutionMessageDirector] (ajp-/127.0.0.1:8702-12) [70589812] The message key 'FreezeVm' is missing from 'bundles/ExecutionMessages'
2016-02-25 14:55:34,557 INFO  [org.ovirt.engine.core.bll.FreezeVmCommand] (ajp-/127.0.0.1:8702-12) [70589812] Running command: FreezeVmCommand internal: false. Entities affected :  ID: caf8ad91-8681-40bb-bc26-d76fed873a59 Type: VMAction group CONFIGURE_VM_STORAGE with role type USER
2016-02-25 14:55:34,565 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.FreezeVDSCommand] (ajp-/127.0.0.1:8702-12) [70589812] START, FreezeVDSCommand(HostName = host_mixed_1, VdsAndVmIDVDSParametersBase:{runAsync='true', hostId='2999eacf-7123-4b0e-bfc5-cd5be8073dbc', vmId='caf8ad91-8681-40bb-bc26-d76fed873a59'}), log id: 63ff5f1a
2016-02-25 14:55:34,615 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.FreezeVDSCommand] (ajp-/127.0.0.1:8702-12) [70589812] FINISH, FreezeVDSCommand, log id: 63ff5f1a
2016-02-25 14:55:34,625 INFO  [org.ovirt.engine.core.bll.aaa.LoginUserCommand] (ajp-/127.0.0.1:8702-3) [] Running command: LoginUserCommand internal: false.
2016-02-25 14:55:34,645 WARN  [org.ovirt.engine.core.dal.job.ExecutionMessageDirector] (ajp-/127.0.0.1:8702-3) [8f2c49d] The message key 'FreezeVm' is missing from 'bundles/ExecutionMessages'
2016-02-25 14:55:34,652 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ajp-/127.0.0.1:8702-12) [70589812] Correlation ID: 70589812, Job ID: b6657dda-ed6b-4986-8a84-84af51b1559f, Call Stack: null, Custom Event ID: -1, Message: Guest filesystems on VM golden_env_mixed_virtio_1_0 have been frozen successfully.
2016-02-25 14:55:34,674 INFO  [org.ovirt.engine.core.bll.aaa.LogoutSessionCommand] (ajp-/127.0.0.1:8702-12) [3f25933a] Running command: LogoutSessionCommand internal: false.
2016-02-25 14:55:34,694 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ajp-/127.0.0.1:8702-12) [3f25933a] Correlation ID: 3f25933a, Call Stack: null, Custom Event ID: -1, Message: User admin@internal logged out.
2016-02-25 14:55:34,703 INFO  [org.ovirt.engine.core.bll.FreezeVmCommand] (ajp-/127.0.0.1:8702-3) [8f2c49d] Running command: FreezeVmCommand internal: false. Entities affected :  ID: caf8ad91-8681-40bb-bc26-d76fed873a59 Type: VMAction group CONFIGURE_VM_STORAGE with role type USER
2016-02-25 14:55:34,711 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.FreezeVDSCommand] (ajp-/127.0.0.1:8702-3) [8f2c49d] START, FreezeVDSCommand(HostName = host_mixed_1, VdsAndVmIDVDSParametersBase:{runAsync='true', hostId='2999eacf-7123-4b0e-bfc5-cd5be8073dbc', vmId='caf8ad91-8681-40bb-bc26-d76fed873a59'}), log id: 445ef917
2016-02-25 14:55:35,721 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.FreezeVDSCommand] (ajp-/127.0.0.1:8702-3) [8f2c49d] Failed in 'FreezeVDS' method
2016-02-25 14:55:35,735 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ajp-/127.0.0.1:8702-3) [8f2c49d] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: VDSM host_mixed_1 command failed: internal error: unable to execute QEMU agent command 'guest-fsfreeze-freeze': The command guest-fsfreeze-freeze has been disabled for this instance
2016-02-25 14:55:35,736 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.FreezeVDSCommand] (ajp-/127.0.0.1:8702-3) [8f2c49d] Command 'org.ovirt.engine.core.vdsbroker.vdsbroker.FreezeVDSCommand' return value 'StatusOnlyReturnForXmlRpc [status=StatusForXmlRpc [code=76, message=internal error: unable to execute QEMU agent command 'guest-fsfreeze-freeze': The command guest-fsfreeze-freeze has been disabled for this instance]]'
2016-02-25 14:55:35,736 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.FreezeVDSCommand] (ajp-/127.0.0.1:8702-3) [8f2c49d] HostName = host_mixed_1
2016-02-25 14:55:35,736 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.FreezeVDSCommand] (ajp-/127.0.0.1:8702-3) [8f2c49d] Command 'FreezeVDSCommand(HostName = host_mixed_1, VdsAndVmIDVDSParametersBase:{runAsync='true', hostId='2999eacf-7123-4b0e-bfc5-cd5be8073dbc', vmId='caf8ad91-8681-40bb-bc26-d76fed873a59'})' execution failed: VDSGenericException: VDSErrorException: Failed to FreezeVDS, error = internal error: unable to execute QEMU agent command 'guest-fsfreeze-freeze': The command guest-fsfreeze-freeze has been disabled for this instance, code = 76
2016-02-25 14:55:35,736 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.FreezeVDSCommand] (ajp-/127.0.0.1:8702-3) [8f2c49d] FINISH, FreezeVDSCommand, log id: 445ef917
2016-02-25 14:55:35,737 ERROR [org.ovirt.engine.core.bll.FreezeVmCommand] (ajp-/127.0.0.1:8702-3) [8f2c49d] Command 'org.ovirt.engine.core.bll.FreezeVmCommand' failed: EngineException: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSErrorException: VDSGenericException: VDSErrorException: Failed to FreezeVDS, error = internal error: unable to execute QEMU agent command 'guest-fsfreeze-freeze': The command guest-fsfreeze-freeze has been disabled for this instance, code = 76 (Failed with error freezeErr and code 76)
2016-02-25 14:55:35,767 WARN  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ajp-/127.0.0.1:8702-3) [8f2c49d] Correlation ID: 8f2c49d, Job ID: 8902cc3e-4852-4b37-83dc-eb909cea3ff4, Call Stack: null, Custom Event ID: -1, Message: Failed to freeze guest filesystems on golden_env_mixed_virtio_1_0 (Host: host_mixed_1, User: admin@internal).
2016-02-25 14:55:35,785 ERROR [org.ovirt.engine.api.restapi.resource.AbstractBackendResource] (ajp-/127.0.0.1:8702-3) [] Operation Failed: [Unable to freeze guest filesystems]
2016-02-25 14:55:35,791 INFO  [org.ovirt.engine.core.bll.aaa.LogoutSessionCommand] (ajp-/127.0.0.1:8702-3) [1185c4a7] Running command: LogoutSessionCommand internal: false.


and in vdsm.log you can see the freeze succeed and then fail right afterwards (two calls?):

jsonrpc.Executor/4::DEBUG::2016-02-25 14:55:34,567::__init__::503::jsonrpc.JsonRpcServer::(_serveRequest) Calling 'VM.freeze' in bridge with {u'vmID': u'caf8ad91-8681-40bb-bc26-d76fed873a59'}
jsonrpc.Executor/4::INFO::2016-02-25 14:55:34,568::vm::3018::virt.vm::(freeze) vmId=`caf8ad91-8681-40bb-bc26-d76fed873a59`::Freezing guest filesystems
jsonrpc.Executor/4::INFO::2016-02-25 14:55:34,611::vm::3033::virt.vm::(freeze) vmId=`caf8ad91-8681-40bb-bc26-d76fed873a59`::2 guest filesystems frozen
jsonrpc.Executor/4::DEBUG::2016-02-25 14:55:34,611::__init__::533::jsonrpc.JsonRpcServer::(_serveRequest) Return 'VM.freeze' in bridge with True
jsonrpc.Executor/6::DEBUG::2016-02-25 14:55:34,714::__init__::503::jsonrpc.JsonRpcServer::(_serveRequest) Calling 'VM.freeze' in bridge with {u'vmID': u'caf8ad91-8681-40bb-bc26-d76fed873a59'}
jsonrpc.Executor/6::INFO::2016-02-25 14:55:34,715::vm::3018::virt.vm::(freeze) vmId=`caf8ad91-8681-40bb-bc26-d76fed873a59`::Freezing guest filesystems
periodic/3::WARNING::2016-02-25 14:55:34,716::periodic::258::virt.periodic.VmDispatcher::(__call__) could not run <class 'virt.periodic.DriveWatermarkMonitor'> on [u'caf8ad91-8681-40bb-bc26-d76fed873a59']
jsonrpc.Executor/6::WARNING::2016-02-25 14:55:34,717::vm::3023::virt.vm::(freeze) vmId=`caf8ad91-8681-40bb-bc26-d76fed873a59`::Unable to freeze guest filesystems: internal error: unable to execute QEMU agent command 'guest-fsfreeze-freeze': The command guest-fsfreeze-freeze has been disabled for this instance

Even though I call the REST API ONCE, the engine shows two calls one after the other (?), and the return value of the call is a failure.

I'm moving this to assigned until this is fixed.
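One way to confirm the duplicate-call hypothesis from engine.log is to count the FreezeVmCommand runs by their correlation ID (the bracketed token before "Running command"). A minimal sketch, using two lines copied from the excerpt above as sample input (the file name engine.sample.log is just a placeholder):

```shell
# Sample lines copied from the engine.log excerpt in this comment;
# each FreezeVmCommand run carries a distinct correlation ID
# ([70589812] vs [8f2c49d]).
cat > engine.sample.log <<'EOF'
2016-02-25 14:55:34,557 INFO  [org.ovirt.engine.core.bll.FreezeVmCommand] (ajp-/127.0.0.1:8702-12) [70589812] Running command: FreezeVmCommand internal: false.
2016-02-25 14:55:34,703 INFO  [org.ovirt.engine.core.bll.FreezeVmCommand] (ajp-/127.0.0.1:8702-3) [8f2c49d] Running command: FreezeVmCommand internal: false.
EOF

# Extract the correlation ID of each FreezeVmCommand run and count the
# distinct IDs: a single REST POST should yield one, this window shows two.
grep -o '\[[0-9a-f]*\] Running command: FreezeVmCommand' engine.sample.log \
  | sort -u | wc -l
```

A count greater than the number of POSTs you issued in the same time window would support the theory that something between the client and the engine is re-submitting the action.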

Comment 35 Carlos Mestre González 2016-02-25 13:18:40 UTC
Created attachment 1130535 [details]
libvirt, vdsm and engine logs

New runs, shows error freeze.

Comment 36 Nir Soffer 2016-02-25 13:49:06 UTC
Freeze errors on vdsm are expected, and we can see in the engine log that they
are reported to the engine as they should be.

Carlos, how did you submit the requests?

Can you reproduce it with curl from the command line, or using Python sdk?
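For reference, a minimal sketch of such a curl reproduction. The engine address and credentials are assumptions; the VM ID is the one from the logs above; the freezefilesystems/thawfilesystems action names follow the oVirt 3.6 REST API:

```shell
# Sketch only: ENGINE and the admin credentials below are hypothetical
# placeholders; substitute your own environment's values.
ENGINE="https://engine.example.com/api"
VMID="caf8ad91-8681-40bb-bc26-d76fed873a59"   # VM ID from the logs above

FREEZE_URL="$ENGINE/vms/$VMID/freezefilesystems"
THAW_URL="$ENGINE/vms/$VMID/thawfilesystems"

# Each action is a single POST with an empty <action/> body, e.g.:
#   curl -k -u admin@internal:password -H 'Content-Type: application/xml' \
#        -d '<action/>' "$FREEZE_URL"
echo "$FREEZE_URL"
```

Issuing the freeze once from a plain curl like this, then checking how many VM.freeze calls vdsm logs for it, should tell whether the duplication comes from the client tool or from the engine.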

Comment 37 Carlos Mestre González 2016-02-25 14:22:53 UTC
Marking as verified. I tested it with curl and it doesn't seem to reproduce at all (before, I was using an external REST client tool for Firefox).

Tested the basic flow on iSCSI/NFS with multiple disks, and also negative cases with other VM statuses (off, suspended, migrating) and during snapshot operations.

Comment 40 errata-xmlrpc 2016-03-09 20:34:11 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2016-0376.html

