Bug 828633
Summary: | Sanlock locking failed for readonly devices | |||
---|---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Mark Wu <wudxw> | |
Component: | libvirt | Assignee: | Peter Krempa <pkrempa> | |
Status: | CLOSED ERRATA | QA Contact: | Fedora Extras Quality Assurance <extras-qa> | |
Severity: | high | Docs Contact: | ||
Priority: | unspecified | |||
Version: | 17 | CC: | abaron, acathrow, bazulay, berrange, clalancette, crobinso, dallan, dougsland, dyasny, fsimonce, iheim, itamar, jbrooks, jforbes, jyang, laine, libvirt-maint, lvroyce, mburns, mgoldboi, robert, shuming, veillard, virt-maint, ykaul | |
Target Milestone: | --- | |||
Target Release: | --- | |||
Hardware: | Unspecified | |||
OS: | Linux | |||
Whiteboard: | storage | |||
Fixed In Version: | Doc Type: | Bug Fix | ||
Doc Text: | Story Points: | --- | ||
Clone Of: | ||||
: | 837659 (view as bug list) | Environment: | ||
Last Closed: | 2012-06-30 22:04:31 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | ||||
Bug Blocks: | 822145, 837659 |
Description
Mark Wu
2012-06-05 05:17:15 UTC
I don't think this is related to VDSM, could you please check that qemu-sanlock.conf is properly configured? # egrep -v ^# /etc/libvirt/qemu-sanlock.conf auto_disk_leases=0 # by vdsm-4.9.6 require_lease_for_disks=0 # by vdsm-4.9.6 Eventually take the latest VDSM build and run: # service vdsmd reconfigure Here's my qemu-sanlock.conf, the same as yours. sudo egrep -v ^# /etc/libvirt/qemu-sanlock.conf auto_disk_leases=0 require_lease_for_disks=0 And I also run `/usr/lib/systemd/systemd-vdsmd reconfigure` to configure libvirt. It doesn't help. This problem happens in add a disk resource to sanlock, and should not be related to if the lease file is specified or required. I also encountered same problem in my latest vdsm:vdsm-4.9.6-0.269.git3e44fe0.fc16,libvirt:libvirt-0.9.12-1.fc16.x86_64, I fix this problem by adding configure to qemu-sanlock.conf: ## beginning of configuration section by vdsm-4.9.6 auto_disk_leases=0 require_lease_for_disks=0 ignore_readonly_and_shared_disks=1 ## end of configuration section by vdsm-4.9.6 It is verified in my test env,see change here:http://gerrit.ovirt.org/#/c/5057/ (In reply to comment #3) > I also encountered same problem in my latest > vdsm:vdsm-4.9.6-0.269.git3e44fe0.fc16,libvirt:libvirt-0.9.12-1.fc16.x86_64, > I fix this problem by adding configure to qemu-sanlock.conf: > ## beginning of configuration section by vdsm-4.9.6 > auto_disk_leases=0 > require_lease_for_disks=0 > ignore_readonly_and_shared_disks=1 > ## end of configuration section by vdsm-4.9.6 This is probably a different libvirt bug since when you set require_lease_for_disks=0 the ignore_readonly_and_shared_disks value shouldn't matter. Daniel what do you think? Federico, I think it should be the same bug. The checking for readonly happens in adding resource and the requireLeaseForDisks is checked in acquiring resource. Adding resource is earlier than acquiring resource, so even ignore_readonly_and_shared_disks is set, the problem still happens. virDomainLockProcessResume virDomainLockManagerNew virDomainLockManagerAddDisk virLockManagerSanlockAddResource .. ... if (flags & VIR_LOCK_MANAGER_RESOURCE_READONLY) { /* Problem happens here */ virLockError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", _("Readonly leases are not supported")); return -1; } virLockManagerAcquire ... virLockManagerSanlockAcquire ... if (priv->res_count == 0 && priv->hasRWDisks && driver->requireLeaseForDisks) { /* requireLeaseForDisks is checked here */ virLockError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", _("Read/write, exclusive access, disks were present, but no leases specified")); return -1; } Here's the full gdb call trace: Breakpoint 1, virDomainLockManagerAddDisk (lock=0x7ff0bc002da0, disk=0x7ff0bc00b7b0) at locking/domain_lock.c:71 71 { (gdb) bt #0 virDomainLockManagerAddDisk (lock=0x7ff0bc002da0, disk=0x7ff0bc00b7b0) at locking/domain_lock.c:71 #1 0x0000003e03cfc134 in virDomainLockManagerNew (plugin=<optimized out>, dom=dom@entry=0x7ff0bc002f00, withResources=withResources@entry=true) at locking/domain_lock.c:142 #2 0x0000003e03cfc223 in virDomainLockProcessResume (plugin=<optimized out>, dom=dom@entry=0x7ff0bc002f00, state=0x0) at locking/domain_lock.c:199 #3 0x000000000048d400 in qemuProcessStartCPUs (driver=driver@entry=0x7ff0cc0094f0, vm=vm@entry=0x7ff0bc002f00, conn=conn@entry=0x7ff0cc1800e0, reason=reason@entry=VIR_DOMAIN_RUNNING_BOOTED, asyncJob=asyncJob@entry=QEMU_ASYNC_JOB_NONE) at qemu/qemu_process.c:2656 #4 0x0000000000490a21 in qemuProcessStart (conn=conn@entry=0x7ff0cc1800e0, driver=driver@entry=0x7ff0cc0094f0, vm=vm@entry=0x7ff0bc002f00, migrateFrom=migrateFrom@entry=0x0, cold_boot=cold_boot@entry=true, start_paused=start_paused@entry=false, autodestroy=autodestroy@entry=false, stdin_fd=stdin_fd@entry=-1, stdin_path=stdin_path@entry=0x0, snapshot=snapshot@entry=0x0, vmop=vmop@entry=VIR_NETDEV_VPORT_PROFILE_OP_CREATE) at qemu/qemu_process.c:3718 #5 0x00000000004643e7 in qemudDomainCreate (conn=0x7ff0cc1800e0, xml=<optimized out>, flags=<optimized out>) at qemu/qemu_driver.c:1386 #6 0x0000003e03cd912f in virDomainCreateXML (conn=0x7ff0cc1800e0, xmlDesc=0x7ff0bc001e40 "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<domain type=\"kvm\">\n\t<name>vm1</name>\n\t<uuid>f7c1b02f-b304-4e68-8565-3659b1214cba</uuid>\n\t<memory>1048576</memory>\n\t<currentMemory>1048576</currentMemory>\n\t<vcpu"..., flags=0) at libvirt.c:2008 #7 0x000000000043b711 in remoteDispatchDomainCreateXML (ret=0x7ff0bc00c8a0, args=0x7ff0bc00c540, rerr=0x7ff0d52bac30, client=0x9848a0, server=<optimized out>, msg=<optimized out>) at remote_dispatch.h:958 #8 remoteDispatchDomainCreateXMLHelper (server=<optimized out>, client=0x9848a0, msg=<optimized out>, rerr=0x7ff0d52bac30, args=0x7ff0bc00c540, ret=0x7ff0bc00c8a0) at remote_dispatch.h:938 #9 0x0000003e03d2d2b6 in virNetServerProgramDispatchCall (msg=0x9908a0, client=0x9848a0, server=0x9710e0, prog=0x972660) at rpc/virnetserverprogram.c:416 #10 virNetServerProgramDispatch (prog=0x972660, server=server@entry=0x9710e0, client=0x9848a0, msg=0x9908a0) at rpc/virnetserverprogram.c:289 #11 0x0000003e03d28e61 in virNetServerHandleJob (jobOpaque=<optimized out>, opaque=0x9710e0) at rpc/virnetserver.c:168 #12 0x0000003e03c66dee in virThreadPoolWorker (opaque=opaque@entry=0x97a420) at util/threadpool.c:144 #13 0x0000003e03c66879 in virThreadHelper (data=<optimized out>) at util/threads-pthread.c:161 #14 0x0000003e1f807d14 in start_thread (arg=0x7ff0d52bb700) at pthread_create.c:309 #15 0x0000003e1f4f199d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115 Latest upstream libvirt code (to be in 0.9.13 release) has a new config param to explicitly skip any readonly/shared disks when doing locking commit b8012ce9312f00947c5ca7250a7a96534c85835f Author: David Weber <wb> Date: Mon May 14 09:53:02 2012 +0000 sanlock: fix locking for readonly devices Add ignore param for readonly and shared disk in sanlock I hit this bug again when I create a VM from engine GUI. I am testing oVirt 3.1 beta. So I suggest we should resolve this problem before the 3.1 release. this change depends on libvirt 0.9.13 release, which is not officially available. Dave - any eta on 0.9.13 release for fc17 to resolve this for ovirt users? I don't think this necessarily requires all of 0.9.13. Cole, is the fix backportable? Yeah I think so, but it's not a libvirt-maint candidate so we'll need to backport it separately. I'm just moving this bug to F17 libvirt, if that's wrong please feel free to move it back, clone it, etc. Hope we can be informed here if there's any available libvirt candidate.'cause patch for vdsm (http://gerrit.ovirt.org/#/c/5057/)is highly depend on it.Thanks a lot! Royce, I believe you will get email notification on the status change of this bug since you're in the cc list. So don't worry about it. Cole, any eta on a release with a fix for this BZ? You can workaround the problem by commenting out the this line in /etc/libvirt/qemu.conf lock_manager="sanlock" And then reconfigure and restart libvirt service. reconfigure vdsmd service didn't work because it will enable the line again automatically. I'll push an update in a minute. The version number will be 0.9.11.4-2, so that's what the vdsm spec should require (or higher) libvirt-0.9.11.4-2.fc17 has been submitted as an update for Fedora 17. https://admin.fedoraproject.org/updates/libvirt-0.9.11.4-2.fc17 I just installed libvirt-0.9.11.4-2.fc17 on my oVirt 3.1 setup, and I'm getting the same "Readonly leases are not supported" behavior as before. Jason, it doesn't look like http://gerrit.ovirt.org/#/c/5057/ is applied to ovirt yet After installing the updated libvirt, I noticed that "ignore_readonly_and_shared_disks = 1" was still commented out in my qemu-sanlock.conf file, so I uncommented it, and it didn't resolve the issue. I just built and installed the ovirt 3.1 packages from git, I have libvirt-0.9.11.4-2.fc17 installed, ignore_readonly_and_shared_disks = 1 in my qemu-sanlock.conf, and selinux in permissive mode. I was able to boot from an iso image. Not sure if the newer ovirt pkgs made the difference, or if it was cleaning and reinstalling the engine config. A lot of moving parts... I'm happy to test any specific things. Jason, The libvirt fix just introduced the option "ignore_readonly_and_shared_disks", but didn't enable it by default. It will be enabled by the vdsm patch in http://gerrit.ovirt.org/#/c/5057/. Perhaps you forgot to restart libvirtd in the test you described in comment #21. So you could have another test when vdsm patch is merged. At that time you needn't manually enable "ignore_readonly_and_shared_disks". Daniel, why should we also set ignore_readonly_and_shared_disks=1 if we already declare: auto_disk_leases=0 require_lease_for_disks=0 It is todo with the order in which things are processed. When the virLockManagerSanlockAddResource method is called, the first thing that is done is to check the flags & filter out those which are not supported (ie readonly + shared). Only later on does the auto_disk_lease flag get checked.The ignore_readonly_and_shared_disks parameter takes effect before all of those, avoiding the first check which raises the error. What we really need todo is a. implement support for shared leases b. completely ignore readonly disks at all times at that point the "ignore_readonly_and_shared_disks" parameter can go away, and more sensible default behaviour will apply. I understand what is in the code does but there's probably a semantic that I don't get here: auto_disk_leases: whether libvirt should manage and automatically provide the leases for the disks require_lease_for_disks: whether libvirt should enforce the presence of leases for the disks ignore_readonly_and_shared_disks: do not consider readonly and shared disks for the locking mechanism I see why this is useful for: auto_disk_leases=(0|1) require_lease_for_disks=1 ignore_readonly_and_shared_disks=(0|1) But I don't see how this configuration can be valid: auto_disk_leases=0 require_lease_for_disks=0 ignore_readonly_and_shared_disks=0 In this case libvirt is not managing the leases nor is checking their presence, why should it even care about the readonly/shared disks? If auto_disk_leases/require_lease_for_disks is 0, then ignore_readonly_and_shared_disks should be assumed 1 (or not be considered at all). > In this case libvirt is not managing the leases nor is checking their presence, > why should it even care about the readonly/shared disks?
As I mentioned above, the check for readonly/shared disks occurs before we get around to require_lease_for_disks/auto_disk_leases. So you need to have both set.
Package libvirt-0.9.11.4-2.fc17: * should fix your issue, * was pushed to the Fedora 17 testing repository, * should be available at your local mirror within two days. Update it with: # su -c 'yum update --enablerepo=updates-testing libvirt-0.9.11.4-2.fc17' as soon as you are able to. Please go to the following url: https://admin.fedoraproject.org/updates/FEDORA-2012-9708/libvirt-0.9.11.4-2.fc17 then log in and leave karma (feedback). (In reply to comment #27) > > In this case libvirt is not managing the leases nor is checking their presence, > why should it even care about the readonly/shared disks? > > As I mentioned above, the check for readonly/shared disks occurs before we > get around to require_lease_for_disks/auto_disk_leases. So you need to have > both set. Given that the position of the check should probably go deeper in the stack, eg: virLockManagerSanlockAddDisk/virLockManagerSanlockCreateLease (you'll need to pass the VIR_LOCK_MANAGER_RESOURCE_SHARED/VIR_LOCK_MANAGER_RESOURCE_READONLY parameter in the future anyway), it's also possible to fix the issue performing the early checks only if driver->autoDiskLease is true: if (type != VIR_LOCK_MANAGER_RESOURCE_TYPE_DISK || driver->autoDiskLease) { if (flags & VIR_LOCK_MANAGER_RESOURCE_READONLY) { virLockError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", _("Readonly leases are not supported")); return -1; } if (flags & VIR_LOCK_MANAGER_RESOURCE_SHARED) { virLockError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", _("Shareable leases are not supported")); return -1; } } Dan posted a fix upstream: https://www.redhat.com/archives/libvir-list/2012-June/msg00916.html We will need to drop the backported sanlock.conf patch, since Dan's patch reverts it. The fix is now pushed upstream: commit acbd4965c44c4dbc676dfe89aff970052e376073 Author: Daniel P. Berrange <berrange> Date: Thu Jun 21 15:34:46 2012 +0100 Add support for shared sanlock leases A sanlock lease can be marked as shared (rather than exclusive) using SANLK_RES_SHARED flag. This adds support for that flag and ensures that in auto disk mode, any shared disks use shared leases. This also makes any read-only disks be completely ignored. These changes remove the need for the option ignore_readonly_and_shared_disks so that is removed Signed-off-by: Daniel P. Berrange <berrange> Sorry Peter, I intended to do a build yesterday but didn't get around to it. I'll do one now. libvirt-0.9.11.4-3.fc17 has been submitted as an update for Fedora 17. https://admin.fedoraproject.org/updates/libvirt-0.9.11.4-3.fc17 libvirt-0.9.11.4-3.fc17 has been pushed to the Fedora 17 stable repository. If problems still persist, please make note of it in this bug report. |