Bug 1634000

Summary: qemuDomainBlockJobSetSpeed is blocked (held by remoteDispatchDomainMigratePerform3Params)
Product: [Community] Virtualization Tools
Reporter: git.user
Component: libvirt
Assignee: Libvirt Maintainers <libvirt-maint>
Status: CLOSED DEFERRED
QA Contact:
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: unspecified
CC: berrange, libvirt-maint, pkrempa
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Last Closed: 2024-12-17 12:27:50 UTC
Type: Bug
Attachments:
Description Flags
proof of concept none

Description git.user 2018-09-28 11:43:28 UTC
Description of problem:
Hello!

When I tried to change the bandwidth of a running block migration job, I ended up with a timeout error:

# virsh blockjob dom disk --bandwidth 10000000
error: Timed out during operation: cannot acquire state change lock (held by remoteDispatchDomainMigratePerform3Params)

I dug around a bit with gdb, and it looks like qemuDomainObjBeginJobInternal (called by qemuDomainBlockJobSetSpeed) is what fails to acquire the lock:

#0  0x00007f3a8cdc2b80 in virCondWaitUntil () from /usr/lib/x86_64-linux-gnu/libvirt.so.0
#1  0x00007f3a449a2d2b in qemuDomainObjBeginJobInternal (driver=driver@entry=0x7f3a2812a040, obj=0x7f3a68011690, job=job@entry=QEMU_JOB_MODIFY, asyncJob=asyncJob@entry=QEMU_ASYNC_JOB_NONE)
    at ../../../src/qemu/qemu_domain.c:4797
#2  0x00007f3a449a6a3b in qemuDomainObjBeginJob (driver=driver@entry=0x7f3a2812a040, obj=<optimized out>, job=job@entry=QEMU_JOB_MODIFY) at ../../../src/qemu/qemu_domain.c:4916
#3  0x00007f3a44a18d61 in qemuDomainBlockJobSetSpeed (dom=0x7f3a78012a40, path=0x7f3a78007710 "/var/lib/nova/instances/e518a852-e38e-450c-a64a-f7c4f03c4cf8/disk", bandwidth=<optimized out>,

If I understand correctly, the real issue is the QEMU_JOB_MODIFY flag:
(v4.6.0 src/qemu/qemu_driver.c  +17292)

    if (qemuDomainObjBeginJob(driver, vm, QEMU_JOB_MODIFY) < 0)
        goto cleanup;

It is described as /* May change state */. I didn't look any further, but I doubt qemuDomainBlockJobSetSpeed can change the state of the VM. As far as I can see, it only changes the speed of block mirroring. Would it be acceptable to use QEMU_JOB_MIGRATION_OP here (just like qemuDomainMigrateSetMaxSpeed)?

Sorry for the lack of details. I just hope the issue is pretty obvious now.
Thanks

Version-Release number of selected component (if applicable):
libvirt >= 4.0.0

How reproducible:
always

Comment 1 git.user 2018-09-28 11:56:50 UTC
Created attachment 1488053 [details]
proof of concept

With this change I can adjust the block job bandwidth.

Comment 2 Peter Krempa 2018-10-01 10:48:22 UTC
(In reply to git.user from comment #0)
> Description of problem:
> Hello!
> 
> When I tried to change the bandwidth of a running block migration job, I
> ended up with a timeout error:

Note that the block job used for migration is an internal implementation detail, and normal APIs should not be used to interact with it.

> # virsh blockjob dom disk --bandwidth 10000000
> error: Timed out during operation: cannot acquire state change lock (held by
> remoteDispatchDomainMigratePerform3Params)
> 
> I dug around a bit with gdb, and it looks like
> qemuDomainObjBeginJobInternal (called by qemuDomainBlockJobSetSpeed) is what
> fails to acquire the lock:
> 
> #0  0x00007f3a8cdc2b80 in virCondWaitUntil () from
> /usr/lib/x86_64-linux-gnu/libvirt.so.0
> #1  0x00007f3a449a2d2b in qemuDomainObjBeginJobInternal

[...]

> If I understand correctly, the real issue is the QEMU_JOB_MODIFY flag:
> (v4.6.0 src/qemu/qemu_driver.c  +17292)
> 
>     if (qemuDomainObjBeginJob(driver, vm, QEMU_JOB_MODIFY) < 0)
>         goto cleanup;
> 
> It is described as /* May change state */. I didn't look any further, but I doubt
> qemuDomainBlockJobSetSpeed can change the state of the VM. As far as I can see, it

It certainly DOES change state, as it modifies the migration.

> only changes the speed of block mirroring. Would it be acceptable to use
> QEMU_JOB_MIGRATION_OP here (just like qemuDomainMigrateSetMaxSpeed)?

No. The qemuDomainMigrateSetMaxSpeed API should also be used to modify the speed of the block job used for migration, as that block job is an internal implementation detail.
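
To illustrate why the blockjob call times out while the migration speed call goes through, here is a minimal toy model in C of a job compatibility check. It is a sketch only and not libvirt code: the enum, the JOB_MASK macro, struct domain_jobs, job_may_start() and the contents of MIGRATION_OUT_MASK are hypothetical and merely borrow names from this report. The assumption is that an active migration async job admits only a fixed set of job types, which includes the migration-op type but not the modify type.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical job types; the names only echo those mentioned in this report. */
enum job {
    JOB_NONE = 0,
    JOB_QUERY,
    JOB_DESTROY,
    JOB_SUSPEND,
    JOB_MODIFY,
    JOB_ABORT,
    JOB_MIGRATION_OP,
};

/* One bit per job type; JOB_NONE has no bit of its own. */
#define JOB_MASK(job) (1u << ((job) - 1))

/* Assumption: job types that may run while an outgoing migration is active.
 * JOB_MODIFY is deliberately absent from this set. */
#define MIGRATION_OUT_MASK \
    (JOB_MASK(JOB_QUERY) | JOB_MASK(JOB_DESTROY) | JOB_MASK(JOB_ABORT) | \
     JOB_MASK(JOB_SUSPEND) | JOB_MASK(JOB_MIGRATION_OP))

/* Per-domain job state, reduced to what this sketch needs. */
struct domain_jobs {
    bool async_active;          /* is an async job (e.g. migration) running? */
    unsigned int allowed_mask;  /* job types admitted while it runs */
};

/* Returns true if the job may start now. A caller that gets false would
 * wait and eventually time out, as in the error shown in this report.
 * The sketch ignores contention between ordinary (non-async) jobs. */
static bool job_may_start(const struct domain_jobs *d, enum job job)
{
    assert(job > JOB_NONE);
    if (!d->async_active)
        return true;
    return (d->allowed_mask & JOB_MASK(job)) != 0;
}
```

With async_active set and allowed_mask set to MIGRATION_OUT_MASK, job_may_start() returns false for JOB_MODIFY and true for JOB_MIGRATION_OP. Under the stated assumption, that is why a blockjob bandwidth request has to wait during migration while a migration speed request does not.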

Comment 3 Daniel Berrangé 2024-12-17 12:27:50 UTC
Thank you for reporting this issue to the libvirt project. Unfortunately, we have been unable to resolve this issue due to insufficient maintainer capacity, and it will now be closed. This is not a reflection on the possible validity of the issue, merely the lack of resources to investigate and address it, for which we apologise. If you nonetheless feel the issue is still important, you may choose to report it again at the new project issue tracker (https://gitlab.com/libvirt/libvirt/-/issues). The project also welcomes contributions from anyone who believes they can provide a solution.