Description of problem: In an oVirt environment using iSCSI storage, after live snapshots are taken for a VM, a subsequent virDomainBlockCommit operation will always fail with the following message: error : qemuMonitorJSONCheckError:352 : internal error: unable to execute QEMU command 'block-commit': Top image file /rhev/data-center/mnt/blockSD/a246386b-f31d-451d-b8a6-2f4c0dfa27b4/images/dad706d6-e1e9-4f31-87d4-610b3511a932/59354326-443a-42dc-b21d-8649056ced83 not found It seems that some recent changes to how relative paths are handled has caused a regression. libvirt-1.1.3.5-1.fc20 fails but libvirt-1.1.1-29.el7.x86_64 works correctly. See subsequent attachments for the script I used to produce this and the resulting debugging output. Sorry, but I was unable to produce this outside of an oVirt environment. Version-Release number of selected component (if applicable): libvirt-1.1.3.5-1.fc20 qemu-kvm-1.6.2-5.fc20 How reproducible: Always, given an oVirt environment. Steps to Reproduce: 1. Launch a single VM having a single disk that is hosted on an iSCSI oVirt storage domain 2. Execute the attached test script to create the two live snapshots. The parameters are <storagepool ID> <host 1> <host 2>. 3. Execute the blockcommit virsh command as instructed by the test script on the host where the VM is running Actual results: blockcommit fails (see this transcript) sudo PYTHONPATH=lib python vdsm-test.py be744655-2a19-4919-92f3-e7c20ac03193 ale lager Running test: merge internal snapshot Basic test parameters: vm=af8df38e-2e99-4a89-9483-a540027ea6b1 sp=be744655-2a19-4919-92f3-e7c20ac03193 sd=a246386b-f31d-451d-b8a6-2f4c0dfa27b4 img=dad706d6-e1e9-4f31-87d4-610b3511a932 vol=7e0ec106-f2dc-4567-b050-e3289bf2d47f Creating volume: 59354326-443a-42dc-b21d-8649056ced83 Waiting for task c871c288-1201-4142-b95b-1eab3a48a51c c871c288-1201-4142-b95b-1eab3a48a51c: finished Creating snapshot 59354326-443a-42dc-b21d-8649056ced83 (base 7e0ec106-f2dc-4567-b050-e3289bf2d47f) Current volume chain: 7e0ec106-f2dc-4567-b050-e3289bf2d47f, 59354326-443a-42dc-b21d-8649056ced83, 378fdb5e-1853-4946-9a79-fd7bd00bb466 Creating volume: ea12203c-36a5-43d7-b351-492ec8c7944d Waiting for task a3aeeef7-3abf-4408-a1aa-74f7d06108c1 a3aeeef7-3abf-4408-a1aa-74f7d06108c1: finished Creating snapshot ea12203c-36a5-43d7-b351-492ec8c7944d (base 59354326-443a-42dc-b21d-8649056ced83) Current volume chain: 7e0ec106-f2dc-4567-b050-e3289bf2d47f, ea12203c-36a5-43d7-b351-492ec8c7944d, 59354326-443a-42dc-b21d-8649056ced83, 378fdb5e-1853-4946-9a79-fd7bd00bb466 Issue the following command on the host where the vm is running: virsh blockcommit af8df38e-2e99-4a89-9483-a540027ea6b1 /rhev/data-center/mnt/blockSD/a246386b-f31d-451d-b8a6-2f4c0dfa27b4/images/dad706d6-e1e9-4f31-87d4-610b3511a932/ea12203c-36a5-43d7-b351-492ec8c7944d --base /rhev/data-center/mnt/blockSD/a246386b-f31d-451d-b8a6-2f4c0dfa27b4/images/dad706d6-e1e9-4f31-87d4-610b3511a932/7e0ec106-f2dc-4567-b050-e3289bf2d47f --top /rhev/data-center/mnt/blockSD/a246386b-f31d-451d-b8a6-2f4c0dfa27b4/images/dad706d6-e1e9-4f31-87d4-610b3511a932/59354326-443a-42dc-b21d-8649056ced83 ... on the host running vm ... sudo virsh blockcommit af8df38e-2e99-4a89-9483-a540027ea6b1 /rhev/data-center/mnt/blockSD/a246386b-f31d-451d-b8a6-2f4c0dfa27b4/images/dad706d6-e1e9-4f31-87d4-610b3511a932/ea12203c-36a5-43d7-b351-492ec8c7944d --base /rhev/data-center/mnt/blockSD/a246386b-f31d-451d-b8a6-2f4c0dfa27b4/images/dad706d6-e1e9-4f31-87d4-610b3511a932/7e0ec106-f2dc-4567-b050-e3289bf2d47f --top /rhev/data-center/mnt/blockSD/a246386b-f31d-451d-b8a6-2f4c0dfa27b4/images/dad706d6-e1e9-4f31-87d4-610b3511a932/59354326-443a-42dc-b21d-8649056ced83 Please enter your authentication name: vdsm@ovirt Please enter your password: error: internal error: unable to execute QEMU command 'block-commit': Top image file /rhev/data-center/mnt/blockSD/a246386b-f31d-451d-b8a6-2f4c0dfa27b4/images/dad706d6-e1e9-4f31-87d4-610b3511a932/59354326-443a-42dc-b21d-8649056ced83 not found Expected results: blockcommit should start successfully Additional info: version of vdsm: vdsm-4.15.0-11.gitcd200ae.fc20.x86_64 (which contains some local changes that should be unrelated to this issue since the same version works with libvirt-1.1.1-29.el7.x86_64 on an el7 host)
Created attachment 900481 [details] Test script to create two live snapshots You may use this test script to produce the snapshots and help you construct the correct virsh blockcommit command to use.
Created attachment 900482 [details] libvirtd.log when the problem occurs
Created attachment 900483 [details] vdsm log when the problem happens
Created attachment 900484 [details] libvirt qemu log for the affected vm
Could it be related to https://bugzilla.redhat.com/show_bug.cgi?id=1061827#c13 ?
(In reply to Richard W.M. Jones from comment #5) > Could it be related to > https://bugzilla.redhat.com/show_bug.cgi?id=1061827#c13 ? Related but not quite the same thing. Here we are deliberately passing in valid absolute path names but qemu (bdrv_find_backing_file) path cannot be resolved into one of the actual backing images. I'm attaching a debug patch that I have applied to qemu and here is the output for a particular run of the code: bdrv_find_backing_image: looking for /rhev/data-center/mnt/blockSD/a246386b-f31d-451d-b8a6-2f4c0dfa27b4/images/907e7803-233f-47ad-b48b-232e1aea8697/196cd6c6-252d-4ada-b898-bc8652d3e3f6 -> /rhev/data-center/mnt/blockSD/a246386b-f31d-451d-b8a6-2f4c0dfa27b4/images/907e7803-233f-47ad-b48b-232e1aea8697/196cd6c6-252d-4ada-b898-bc8652d3e3f6 2 tmp: /rhev/data-center/mnt/blockSD/a246386b-f31d-451d-b8a6-2f4c0dfa27b4/images/907e7803-233f-47ad-b48b-232e1aea8697/196cd6c6-252d-4ada-b898-bc8652d3e3f6 full: /dev/dm-12 -> /rhev/data-center/mnt/blockSD/a246386b-f31d-451d-b8a6-2f4c0dfa27b4/images/907e7803-233f-47ad-b48b-232e1aea8697/acc6f7c4-9b6d-4052-abf5-cab1af5f2e8f 2 tmp: /rhev/data-center/mnt/blockSD/a246386b-f31d-451d-b8a6-2f4c0dfa27b4/images/907e7803-233f-47ad-b48b-232e1aea8697/196cd6c6-252d-4ada-b898-bc8652d3e3f6 full: /dev/dm-12 -> ../907e7803-233f-47ad-b48b-232e1aea8697/378fdb5e-1853-4946-9a79-fd7bd00bb466 2 tmp: /rhev/data-center/mnt/blockSD/a246386b-f31d-451d-b8a6-2f4c0dfa27b4/images/907e7803-233f-47ad-b48b-232e1aea8697/196cd6c6-252d-4ada-b898-bc8652d3e3f6 full: /dev/dm-12 Clearly it should have matched on the first iteration of the loop but realpath() is failing to resolve it. Prior to launching the blockcommit I ensured that all images in the chain were reachable (no broken symlinks). I wonder if somehow the qemu backing chain structures have been corrupted by the live snapshot process.
Created attachment 900523 [details] qemu debug patch referenced in comment #6
OK, my second guess: Could it be to do with the way that RHEV backing images are not "realized" always? You have to do some faffing around with VDSM, see: https://www.redhat.com/archives/libguestfs/2014-January/msg00153.html
(In reply to Adam Litke from comment #6) > (In reply to Richard W.M. Jones from comment #5) > > Could it be related to > > https://bugzilla.redhat.com/show_bug.cgi?id=1061827#c13 ? > ... > Clearly it should have matched on the first iteration of the loop but > realpath() is failing to resolve it. Prior to launching the blockcommit I > ensured that all images in the chain were reachable (no broken symlinks). I > wonder if somehow the qemu backing chain structures have been corrupted by > the live snapshot process. The libvirtd log you've posted contains an I/O error notification: 2014-05-29 16:26:50.996+0000: 925: debug : qemuMonitorIOProcess:393 : QEMU_MONITOR_IO_PROCESS: mon=0x7f09dc00fc60 buf={"timestamp": {"seconds": 1401380810, "microseconds": 996357}, "event": "BLOCK_IO_ERROR", "data": {"device": "drive-virtio-disk0", "operation": "write", "action": "stop"}} Isn't your storage broken actually? Also libvirt recognized the the image name and passed it through to qemu. You would get a different error if the image wouldn't be in libvirt's internal struct. Could you also post output of `virsh qemu-monitor-command --pretty VMNAME '{"execute": "query-block"}'` after the failed block-commit? Also the mentioned libvirt version libvirt-1.1.3.5-1.fc20 certainly doesn't have any of the backing-store changes you've mentioned.
Moving to qemu to investigate further. Libvirt seems to do the right thing here.
(In reply to Richard W.M. Jones from comment #8) > OK, my second guess: Could it be to do with the way that RHEV > backing images are not "realized" always? You have to do some > faffing around with VDSM, see: > https://www.redhat.com/archives/libguestfs/2014-January/msg00153.html This cannot be the problem. I've checked that each volume in the backing chain leads to an activated block device (eg. top == /dev/dm-12).
Here is the query-block output. Regarding the I/O errors, those only start to happen after the failed block-commit attempt. $ sudo virsh qemu-monitor-command --pretty brewer-2 '{"execute": "query-block"}' Please enter your authentication name: vdsm@ovirt Please enter your password: { "return": [ { "io-status": "ok", "device": "drive-ide0-1-0", "locked": false, "removable": true, "inserted": { "iops_rd": 0, "image": { "virtual-size": 75497472, "filename": "/rhev/data-center/mnt/192.168.2.1:_home_storage_iso/e453999e-ec9f-420b-941c-a9f9cc4432bb/images/11111111-1111-1111-1111-111111111111/CorePlus-current.iso", "format": "raw", "actual-size": 75501568 }, "iops_wr": 0, "ro": true, "backing_file_depth": 0, "drv": "raw", "iops": 0, "bps_wr": 0, "encrypted": false, "bps": 0, "bps_rd": 0, "file": "/rhev/data-center/mnt/192.168.2.1:_home_storage_iso/e453999e-ec9f-420b-941c-a9f9cc4432bb/images/11111111-1111-1111-1111-111111111111/CorePlus-current.iso", "encryption_key_missing": false }, "tray_open": false, "type": "unknown" }, { "io-status": "failed", "device": "drive-virtio-disk0", "locked": false, "removable": false, "inserted": { "iops_rd": 0, "image": { "backing-image": { "backing-image": { "backing-image": { "virtual-size": 2147483648, "filename": "/rhev/data-center/mnt/blockSD/a246386b-f31d-451d-b8a6-2f4c0dfa27b4/images/58cc8d73-0804-41b7-a457-93e7f2280478/../58cc8d73-0804-41b7-a457-93e7f2280478/378fdb5e-1853-4946-9a79-fd7bd00bb466", "cluster-size": 65536, "format": "qcow2", "actual-size": 0, "dirty-flag": false }, "backing-filename-format": "qcow2", "virtual-size": 2147483648, "filename": "/rhev/data-center/mnt/blockSD/a246386b-f31d-451d-b8a6-2f4c0dfa27b4/images/58cc8d73-0804-41b7-a457-93e7f2280478/0ee5ce39-c24f-4f7d-8067-6fce06a0fdfa", "cluster-size": 65536, "format": "qcow2", "actual-size": 0, "full-backing-filename": "/rhev/data-center/mnt/blockSD/a246386b-f31d-451d-b8a6-2f4c0dfa27b4/images/58cc8d73-0804-41b7-a457-93e7f2280478/../58cc8d73-0804-41b7-a457-93e7f2280478/378fdb5e-1853-4946-9a79-fd7bd00bb466", "backing-filename": "../58cc8d73-0804-41b7-a457-93e7f2280478/378fdb5e-1853-4946-9a79-fd7bd00bb466", "dirty-flag": false }, "backing-filename-format": "qcow2", "virtual-size": 2147483648, "filename": "/rhev/data-center/mnt/blockSD/a246386b-f31d-451d-b8a6-2f4c0dfa27b4/images/58cc8d73-0804-41b7-a457-93e7f2280478/6f455f42-bac4-4439-8f2f-a2044bc5e708", "cluster-size": 65536, "format": "qcow2", "backing-filename": "/rhev/data-center/mnt/blockSD/a246386b-f31d-451d-b8a6-2f4c0dfa27b4/images/58cc8d73-0804-41b7-a457-93e7f2280478/0ee5ce39-c24f-4f7d-8067-6fce06a0fdfa", "dirty-flag": false }, "backing-filename-format": "qcow2", "virtual-size": 2147483648, "filename": "/rhev/data-center/mnt/blockSD/a246386b-f31d-451d-b8a6-2f4c0dfa27b4/images/58cc8d73-0804-41b7-a457-93e7f2280478/2d47ee0c-ccbc-4a54-ad1e-53270861a480", "cluster-size": 65536, "format": "qcow2", "actual-size": 0, "backing-filename": "/rhev/data-center/mnt/blockSD/a246386b-f31d-451d-b8a6-2f4c0dfa27b4/images/58cc8d73-0804-41b7-a457-93e7f2280478/6f455f42-bac4-4439-8f2f-a2044bc5e708", "dirty-flag": false }, "iops_wr": 0, "ro": false, "backing_file_depth": 3, "drv": "qcow2", "iops": 0, "bps_wr": 0, "backing_file": "/rhev/data-center/mnt/blockSD/a246386b-f31d-451d-b8a6-2f4c0dfa27b4/images/58cc8d73-0804-41b7-a457-93e7f2280478/6f455f42-bac4-4439-8f2f-a2044bc5e708", "encrypted": false, "bps": 0, "bps_rd": 0, "file": "/rhev/data-center/mnt/blockSD/a246386b-f31d-451d-b8a6-2f4c0dfa27b4/images/58cc8d73-0804-41b7-a457-93e7f2280478/2d47ee0c-ccbc-4a54-ad1e-53270861a480", "encryption_key_missing": false }, "type": "unknown" } ], "id": "libvirt-70" }
Here is the error message from libvirt (to match the query-block output in the previous comment. $ sudo virsh blockcommit f356f6a7-6c22-48d6-8e33-84eba7521bb1 /rhev/data-center/mnt/blockSD/a246386b-f31d-451d-b8a6-2f4c0dfa27b4/images/58cc8d73-0804-41b7-a457-93e7f2280478/2d47ee0c-ccbc-4a54-ad1e-53270861a480 --base /rhev/data-center/mnt/blockSD/a246386b-f31d-451d-b8a6-2f4c0dfa27b4/images/58cc8d73-0804-41b7-a457-93e7f2280478/0ee5ce39-c24f-4f7d-8067-6fce06a0fdfa --top /rhev/data-center/mnt/blockSD/a246386b-f31d-451d-b8a6-2f4c0dfa27b4/images/58cc8d73-0804-41b7-a457-93e7f2280478/6f455f42-bac4-4439-8f2f-a2044bc5e708 Please enter your authentication name: vdsm@ovirt Please enter your password: error: internal error: unable to execute QEMU command 'block-commit': Top image file /rhev/data-center/mnt/blockSD/a246386b-f31d-451d-b8a6-2f4c0dfa27b4/images/58cc8d73-0804-41b7-a457-93e7f2280478/6f455f42-bac4-4439-8f2f-a2044bc5e708 not found
Adam, can you try latest qemu from fedora virt preview? Might help to determine if qemu is indeed the culprit: http://fedoraproject.org/wiki/Virtualization_Preview_Repository CCing some block layer folks
(In reply to Adam Litke from comment #6) > (In reply to Richard W.M. Jones from comment #5) > > Could it be related to > > https://bugzilla.redhat.com/show_bug.cgi?id=1061827#c13 ? > > Related but not quite the same thing. Here we are deliberately passing in > valid absolute path names but qemu (bdrv_find_backing_file) path cannot be > resolved into one of the actual backing images. > > I'm attaching a debug patch that I have applied to qemu and here is the > output for a particular run of the code: > > bdrv_find_backing_image: looking for > /rhev/data-center/mnt/blockSD/a246386b-f31d-451d-b8a6-2f4c0dfa27b4/images/ > 907e7803-233f-47ad-b48b-232e1aea8697/196cd6c6-252d-4ada-b898-bc8652d3e3f6 > -> > /rhev/data-center/mnt/blockSD/a246386b-f31d-451d-b8a6-2f4c0dfa27b4/images/ > 907e7803-233f-47ad-b48b-232e1aea8697/196cd6c6-252d-4ada-b898-bc8652d3e3f6 > 2 > tmp: > /rhev/data-center/mnt/blockSD/a246386b-f31d-451d-b8a6-2f4c0dfa27b4/images/ > 907e7803-233f-47ad-b48b-232e1aea8697/196cd6c6-252d-4ada-b898-bc8652d3e3f6 > full: /dev/dm-12 > -> > /rhev/data-center/mnt/blockSD/a246386b-f31d-451d-b8a6-2f4c0dfa27b4/images/ > 907e7803-233f-47ad-b48b-232e1aea8697/acc6f7c4-9b6d-4052-abf5-cab1af5f2e8f > 2 > tmp: > /rhev/data-center/mnt/blockSD/a246386b-f31d-451d-b8a6-2f4c0dfa27b4/images/ > 907e7803-233f-47ad-b48b-232e1aea8697/196cd6c6-252d-4ada-b898-bc8652d3e3f6 > full: /dev/dm-12 > -> > ../907e7803-233f-47ad-b48b-232e1aea8697/378fdb5e-1853-4946-9a79-fd7bd00bb466 > 2 > tmp: > /rhev/data-center/mnt/blockSD/a246386b-f31d-451d-b8a6-2f4c0dfa27b4/images/ > 907e7803-233f-47ad-b48b-232e1aea8697/196cd6c6-252d-4ada-b898-bc8652d3e3f6 > full: /dev/dm-12 > > Clearly it should have matched on the first iteration of the loop but > realpath() is failing to resolve it. Prior to launching the blockcommit I > ensured that all images in the chain were reachable (no broken symlinks). I > wonder if somehow the qemu backing chain structures have been corrupted by > the live snapshot process. This is interesting, because although realpath() is returning NULL, filename_full has contents in the buffer (the contents of filename_full is undefined on NULL). So it got close :) Obviously, on that first iteration, some error happened in realpath. Could you modify your debug patch to print errno (e.g. strerror(errno)), and post the results? That could help determine why realpath failed.
Hi Jeff: Here is the new output. Note that new volumes were used in this test so don't try comparing volume IDs from the previous comments: (The error is -EPERM) sudo virsh blockcommit b6d88230-7c33-4484-9d50-6a392030e33d /rhev/data-center/mnt/blockSD/a246386b-f31d-451d-b8a6-2f4c0dfa27b4/images/119241f1-d78d-49e0-96c5-375ada61e4e2/b0f8ff80-7633-4eaf-ba7f-07eb00081820 --base /rhev/data-center/mnt/blockSD/a246386b-f31d-451d-b8a6-2f4c0dfa27b4/images/119241f1-d78d-49e0-96c5-375ada61e4e2/67845f45-a4e3-4b12-8d79-62d4b33aca61 --top /rhev/data-center/mnt/blockSD/a246386b-f31d-451d-b8a6-2f4c0dfa27b4/images/119241f1-d78d-49e0-96c5-375ada61e4e2/3d826cd5-97a1-4c1b-9701-4a3de8ff2262 ... bdrv_find_backing_image: looking for /rhev/data-center/mnt/blockSD/a246386b-f31d-451d-b8a6-2f4c0dfa27b4/images/119241f1-d78d-49e0-96c5-375ada61e4e2/3d826cd5-97a1-4c1b-9701-4a3de8ff2262 -> /rhev/data-center/mnt/blockSD/a246386b-f31d-451d-b8a6-2f4c0dfa27b4/images/119241f1-d78d-49e0-96c5-375ada61e4e2/3d826cd5-97a1-4c1b-9701-4a3de8ff2262 2 tmp: /rhev/data-center/mnt/blockSD/a246386b-f31d-451d-b8a6-2f4c0dfa27b4/images/119241f1-d78d-49e0-96c5-375ada61e4e2/3d826cd5-97a1-4c1b-9701-4a3de8ff2262 full: /dev/dm-19 realpath failed with error: Permission denied -> /rhev/data-center/mnt/blockSD/a246386b-f31d-451d-b8a6-2f4c0dfa27b4/images/119241f1-d78d-49e0-96c5-375ada61e4e2/67845f45-a4e3-4b12-8d79-62d4b33aca61 2 tmp: /rhev/data-center/mnt/blockSD/a246386b-f31d-451d-b8a6-2f4c0dfa27b4/images/119241f1-d78d-49e0-96c5-375ada61e4e2/3d826cd5-97a1-4c1b-9701-4a3de8ff2262 full: /dev/dm-19 realpath failed with error: Permission denied -> ../119241f1-d78d-49e0-96c5-375ada61e4e2/56f4d756-4ac3-4477-b786-d730aed9e591 2 tmp: /rhev/data-center/mnt/blockSD/a246386b-f31d-451d-b8a6-2f4c0dfa27b4/images/119241f1-d78d-49e0-96c5-375ada61e4e2/3d826cd5-97a1-4c1b-9701-4a3de8ff2262 full: /dev/dm-19 realpath failed with error: Permission denied
Created attachment 901471 [details] New qemu debug patch
qemu-kvm-2.0.0-5.fc20.x86_64 from the virt-preview repo appears to fail in the same way: sudo virsh blockcommit 83d5c6db-0af9-4d1c-93d0-1dde75550a70 /rhev/data-center/mnt/blockSD/a246386b-f31d-451d-b8a6-2f4c0dfa27b4/images/649ec064-7818-458e-994f-52df39a066e8/77df4dd9-a079-4dd5-9d7a-623d2c4c1bc4 --base /rhev/data-center/mnt/blockSD/a246386b-f31d-451d-b8a6-2f4c0dfa27b4/images/649ec064-7818-458e-994f-52df39a066e8/b85f711d-830b-417e-9025-85b1904d4c96 --top /rhev/data-center/mnt/blockSD/a246386b-f31d-451d-b8a6-2f4c0dfa27b4/images/649ec064-7818-458e-994f-52df39a066e8/6af6013f-29f6-4b88-8db4-d662fdfd8bca Please enter your authentication name: vdsm@ovirt Please enter your password: error: internal error: unable to execute QEMU command 'block-commit': Top image file /rhev/data-center/mnt/blockSD/a246386b-f31d-451d-b8a6-2f4c0dfa27b4/images/649ec064-7818-458e-994f-52df39a066e8/6af6013f-29f6-4b88-8db4-d662fdfd8bca not found
Thanks Adam, Since the error from realpath() is EPERM, could you check the system logs to see if there are any selinux errors?
Hey Jeff, Looks like issuing a 'setenforce 0' on the host makes the problem go away. I will attach the audit.log in the next comment.
Created attachment 901490 [details] audit.log with selinux enforcing Here is the audit.log when the operation is failing. The qemu pid is 5703 libvirtd is pid 4862
Example AVCs: type=AVC msg=audit(1401724213.736:688): avc: denied { read } for pid=5364 comm="qemu-system-x86" path="/dev/dm-12" dev="devtmpfs" ino=91464 scontext=system_u:system_r:svirt_t:s0:c166,c273 tcontext=system_u:object_r:fixed_disk_device_t:s0 tclass=blk_file type=AVC msg=audit(1401724243.864:701): avc: denied { read } for pid=5364 comm="qemu-system-x86" path="/dev/dm-18" dev="devtmpfs" ino=91807 scontext=system_u:system_r:svirt_t:s0:c166,c273 tcontext=system_u:object_r:fixed_disk_device_t:s0 tclass=blk_file
hmm, selinux labeling problem will most certainly be a libvirt issue.
Hi Peter, Anything else I can collect for you to aid in resolving this?
Do you have any un-commented options in /etc/libvirt/qemu.conf ? Looks like the selinux driver in libvirt isn't working for some reason.
Just these (which seem to be unrelated) sudo grep -v '^#' /etc/libvirt/qemu.conf dynamic_ownership=0 spice_tls=1 save_image_format="lzop" spice_tls_x509_cert_dir="/etc/pki/vdsm/libvirt-spice" remote_display_port_min=5900 remote_display_port_max=6923 lock_manager="sanlock" auto_dump_path="/var/log/core" sudo grep -v '^#' /etc/libvirt/libvirtd.conf listen_addr="0.0.0.0" unix_sock_group="qemu" unix_sock_rw_perms="0770" auth_unix_rw="sasl" host_uuid="efd0b34a-1fd6-4e25-87a4-cfbd23ca64a7" keepalive_interval=-1 log_outputs="1:file:/var/log/libvirt/libvirtd.log" log_filters="3:virobject 3:virfile 2:virnetlink 3:cgroup 3:event 3:json 1:libvirt 1:util 1:qemu" ca_file="/etc/pki/vdsm/certs/cacert.pem" cert_file="/etc/pki/vdsm/certs/vdsmcert.pem" key_file="/etc/pki/vdsm/keys/vdsmkey.pem"
Hi Peter, What can we do to move this forward? It's blocking virDomainBlockCommit for oVirt on block storage domains. Would be nice to have this fixed in the July release.
Created attachment 922603 [details] libvirtd log error is at: 2014-07-30 14:15:04.470+0000: 1363: error : qemuMonitorJSONCheckError:377 : internal error: unable to execute QEMU command 'block-commit': Top image file /rhev/data-center/aab45993-c5cf-436e-b55a-873bd83681d2/ae92afa6-66e8-479e-bc4d-bb0ac06423de/images/eefd510e-2491-4d68-bfbf-31c47dd3bf65/../eefd510e-2491-4d68-bfbf-31c47dd3bf65/7804a559-f27d-4072-8c3e-525f75052837 not found
Created attachment 922604 [details] libvirtd log error message: 2014-07-30 14:15:04.470+0000: 1363: error : qemuMonitorJSONCheckError:377 : internal error: unable to execute QEMU command 'block-commit': Top image file /rhev/data-center/aab45993-c5cf-436e-b55a-873bd83681d2/ae92afa6-66e8-479e-bc4d-bb0ac06423de/images/eefd510e-2491-4d68-bfbf-31c47dd3bf65/../eefd510e-2491-4d68-bfbf-31c47dd3bf65/7804a559-f27d-4072-8c3e-525f75052837 not found
Created attachment 922606 [details] audit log I don't see any denials here, but the same issue happens as before.
Created attachment 922607 [details] libvirt log for affected vm
I came across one more possible problem that could cause this bug: I'm suspecting that somewhere in the path there's something that can't be stat-ed/readlink'd - due to selinux - or whatever realpath() does to canonicalize the path. As nothing showed up in the AVC log it's possible that some of the selinux rules may have "dontaudit" option enabled and thus don't show up in the auditlog. Would you mind trying https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Security-Enhanced_Linux/sect-Security-Enhanced_Linux-Fixing_Problems-Possible_Causes_of_Silent_Denials.html and reproducing the issue and then posting the selinux log? Thanks.
Created attachment 923330 [details] audit.log with all denials logged Aha, now I am seeing denials for devicemapper devices.
I can confirm that the problem only occurs with live-snapshots that have been created during the current generation of VM execution. In other words, if snapshots are created, the VM is then stopped and restarted, the commit operation will succeed. If snapshots are created and a merge is attempted immediately, the commit operation will fail.
*** Bug 1124445 has been marked as a duplicate of this bug. ***
The problem should be fixed by the following patch: commit f9ea3d60119e82c02c00fbf3678c3ed20634dea1 Author: Eric Blake <eblake> Date: Wed Mar 11 14:37:04 2015 -0600 qemu: read backing chain names from qemu https://bugzilla.redhat.com/show_bug.cgi?id=1199182 documents that after a series of disk snapshots into existing destination images, followed by active commits of the top image, it is possible for qemu 2.2 and earlier to end up tracking a different name for the image than what it would have had when opening the chain afresh. That is, when starting with the chain 'a <- b <- c', the name associated with 'b' is how it was spelled in the metadata of 'c', but when starting with 'a', taking two snapshots into 'a <- b <- c', then committing 'c' back into 'b', the name associated with 'b' is now the name used when taking the first snapshot. Sadly, older qemu doesn't know how to treat different spellings of the same filename as identical files (it uses strcmp() instead of checking for the same inode), which means libvirt's attempt to commit an image using solely the names learned from qcow2 metadata fails with a cryptic: error: internal error: unable to execute QEMU command 'block-commit': Top image file /tmp/images/c/../b/b not found even though the file exists. Trying to teach libvirt the rules on which name qemu will expect is not worth the effort (besides, we'd have to remember it across libvirtd restarts, and track whether a file was opened via metadata or via snapshot creation for a given qemu process); it is easier to just always directly ask qemu what string it expects to see in the first place. As a safety valve, we validate that any name returned by qemu still maps to the same local file as we have tracked it, so that a compromised qemu cannot accidentally cause us to act on an incorrect file. * src/qemu/qemu_monitor.h (qemuMonitorDiskNameLookup): New prototype. * src/qemu/qemu_monitor_json.h (qemuMonitorJSONDiskNameLookup): Likewise. * src/qemu/qemu_monitor.c (qemuMonitorDiskNameLookup): New function. * src/qemu/qemu_monitor_json.c (qemuMonitorJSONDiskNameLookup) (qemuMonitorJSONDiskNameLookupOne): Likewise. * src/qemu/qemu_driver.c (qemuDomainBlockCommit) (qemuDomainBlockJobImpl): Use it. Please note that it might have some dependencies so it's questionable whether it's worth backporting this or just wait for new release.
*** Bug 1206456 has been marked as a duplicate of this bug. ***
Peter, I see a few patches of yours referencing backing chains that hit libvirt.git after Eric's fix, are there any other patches that should be backported on top of that? I'm just looking at backporting to v1.2.13 for now since that's what we are sticking with in F22
I'll prepare a list of commits that you'll need on top of v1.2.13. I move this back to assigned to me until I do that.
ping pkrempa, help with the patch list? or just backport the changes to v1.2.13-maint. I'd like to get a build out ASAP for virt test day on thursday. thanks
Unfortunately the patches that fix the relative naming problem currently break non-shared storage migration so I'd rather not backport anything until we sort it out upstream.
This message is a reminder that Fedora 21 is nearing its end of life. Approximately 4 (four) weeks from now Fedora will stop maintaining and issuing updates for Fedora 21. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a Fedora 'version' of '21'. Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version. Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 21 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged change the 'version' to a later Fedora version prior this bug is closed as described in the policy above. Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.
Since f21 is end-of-life soon, and it doesn't sound like this was going to be an easy backport anyways, just closing. If anyone is still hitting this on later fedora versions, please reopen