1643707 – LSM is failing with L2 permissions failure.

RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.

Bug 1643707 - LSM is failing with L2 permissions failure.

Summary: LSM is failing with L2 permissions failure.

Keywords:
Status:	CLOSED INSUFFICIENT_DATA
Alias:	None
Product:	Red Hat Enterprise Linux 7
Classification:	Red Hat
Component:	libvirt
Sub Component:
Version:	7.5
Hardware:	Unspecified
OS:	Unspecified
Priority:	urgent
Severity:	urgent
Target Milestone:	rc
Target Release:	---
Assignee:	Peter Krempa
QA Contact:	Han Han
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	1417161 1597019 1660265
TreeView+	depends on / blocked

Reported:	2018-10-27 16:51 UTC by Frank DeLorey
Modified:	2022-08-02 08:07 UTC (History)
CC List:	29 users (show)
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:	2019-09-10 10:59:02 UTC
Target Upstream Version:
Embargoed:

Attachments	(Terms of Use)

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Issue Tracker	CEECBA-5636	0	None	None	None	2021-08-06 06:19:41 UTC
Red Hat Knowledge Base (Solution)	3771541	0	None	None	None	2018-12-21 22:11:51 UTC

Description Frank DeLorey 2018-10-27 16:51:49 UTC

Description of problem:

 Customer is unable to do Live Storage Migration of the disk  "diskname", it fails. 

Version-Release number of selected component (if applicable):

4.0.7

How reproducible:

Currently stuck and cannot migrate this drive. 

Steps to Reproduce:
1.
2.
3.

Actual results:

LSM of a drive fails and reports unable to flush L2 cache: permission denied

Expected results:

LSM should move the drive

Additional info:

From strace:

2:34:44 read(10, "\x01\x00\x00\x00\x00\x00\x00\x00", 512) = 8
02:34:44 ppoll([{fd=24, events=POLLIN|POLLERR|POLLHUP}, {fd=10, events=POLLIN|POLLERR|POLLHUP}], 2, {0, 0}, NULL, 8) = 0 (Timeout)
02:34:44 ppoll([{fd=24, events=POLLIN|POLLERR|POLLHUP}, {fd=10, events=POLLIN|POLLERR|POLLHUP}], 2, {0, 0}, NULL, 8) = 0 (Timeout)
02:34:44 io_submit(140227814273024, 1, [{pwritev, fildes=23, iovec=[{"\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01"..., 65536}], offset=131072, resfd=24}]
) = -1 EACCES (Permission denied)
02:34:44 io_submit(140227814273024, 1, [{pwritev, fildes=23, iovec=[{"\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01"..., 65536}], offset=131072, resfd=24}]
) = -1 EACCES (Permission denied)
02:34:44 ppoll([{fd=24, events=POLLIN|POLLERR|POLLHUP}, {fd=10, events=POLLIN|POLLERR|POLLHUP}], 2, {0, 0}, NULL, 8) = 0 (Timeout)
02:34:44 ppoll([{fd=24, events=POLLIN|POLLERR|POLLHUP}, {fd=10, events=POLLIN|POLLERR|POLLHUP}], 2, {0, 0}, NULL, 8) = 0 (Timeout)
02:34:44 ppoll([{fd=24, events=POLLIN|POLLERR|POLLHUP}, {fd=10, events=POLLIN|POLLERR|POLLHUP}], 2, {0, 0}, NULL, 8) = 0 (Timeout)
02:34:44 io_submit(140227814273024, 1, [{pwritev, fildes=23, iovec=[{"\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01"..., 65536}], offset=131072, resfd=24}]
) = -1 EACCES (Permission denied)
02:34:44 ppoll([{fd=24, events=POLLIN|POLLERR|POLLHUP}, {fd=10, events=POLLIN|POLLERR|POLLHUP}], 2, {0, 0}, NULL, 8) = 0 (Timeout)
02:34:44 io_submit(140227814273024, 1, [{pwritev, fildes=23, iovec=[{"\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01"..., 65536}], offset=131072, resfd=24}]
) = -1 EACCES (Permission denied)
02:34:44 ppoll([{fd=24, events=POLLIN|POLLERR|POLLHUP}, {fd=10, events=POLLIN|POLLERR|POLLHUP}], 2, {0, 0}, NULL, 8) = 0 (Timeout)
02:34:44 write(2, "2018-10-26T18:34:44.718660Z ", 28) = 28
02:34:44 write(2, "qemu-kvm:", 9)       = 9
02:34:44 write(2, " ", 1)               = 1
02:34:44 write(2, "Failed to flush the L2 table cac"..., 53) = 53
02:34:44 write(2, "\n", 1)              = 1
02:34:44 io_submit(140227814273024, 1, [{pwritev, fildes=23, iovec=[{"\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01"..., 65536}], offset=131072, resfd=24}]
) = -1 EACCES (Permission denied)


The strace is too large to attach to this case.

Comment 2 Michal Skrivanek 2018-10-28 06:18:46 UTC

Please add logs. Bug without logs is pretty much useless

Comment 3 Dan Kenigsberg 2018-10-28 06:44:42 UTC

When filing RHV bugs, please include relevant software versions. I have extracted these from sosreport-ucaprdrvh05.hq.tnb.com.my.02239297-20181027030125 but ovirt-engine is interesting, too.

vdsm-4.19.51-1.el7ev.x86_64
libvirt-daemon-kvm-3.9.0-14.el7_5.2.x86_64
qemu-kvm-rhev-2.10.0-21.el7_5.1.x86_64
running kernel-3.10.0-862.el7.x86_64

what is the backing storage for source and destination? NFS? iSCSI?

(I'm only trying to collect more data, it is unlikely that I continue the research here)

Comment 10 nijin ashok 2018-10-29 19:53:32 UTC

The migration is from the storage domain "f219a555-9187-4158-b7ed-60ba2cc51098" to "2cde64b7-aa85-45fe-846f-65dcd3e5ced7".

It failed when it does "blockJobAbort" with the error below.

===
2018-10-27 02:34:42,578+0800 INFO  (jsonrpc/1) [vdsm.api] START diskReplicateFinish(srcDisk={u'device': u'disk', u'poolID': u'33f59a82-e273-433f-a59f-4ae661527d04', u'volumeID': u'2cde64b7-aa85-45fe-846f-65dcd3e5ced7', u'domainID': u'f219a555-9187-4158-b7ed-60ba2cc51098', u'imageID': u'733646b4-de58-4d09-96eb-8b698b759dd1'}, dstDisk={u'device': u'disk', u'poolID': u'33f59a82-e273-433f-a59f-4ae661527d04', u'volumeID': u'2cde64b7-aa85-45fe-846f-65dcd3e5ced7', u'domainID': u'326f1ee8-cdaf-4ae5-abae-2d37b6fc9cdf', u'imageID': u'733646b4-de58-4d09-96eb-8b698b759dd1'}) from=::ffff: (api:46)

libvirtError: internal error: unable to execute QEMU command 'block-job-complete': Block job 'drive-virtio-disk2' not found
===


As Yuval Turgeman mentioned in the mailing list, there are AVC denied error in the audit log at the time of issue which seems to be the root cause.

===
type=AVC msg=audit(1540578884.718:9218518): avc:  denied  { write } for  pid=6841 comm="qemu-kvm" path="/dev/dm-233" dev="tmpfs" ino=470420506 scontext=system_u:system_r:svirt_t:s0:c107,c132 tcontext=system_u:object_r:fixed_disk_device_t:s0 tclass=blk_file
type=SYSCALL msg=audit(1540578884.718:9218518): arch=c000003e syscall=209 success=no exit=-13 a0=7f89550eb000 a1=1 a2=7f89410f9820 a3=1 items=0 ppid=1 pid=6841 auid=4294967295 uid=107 gid=107 euid=107 suid=107 fsuid=107 egid=107 sgid=107 fsgid=107 tty=(none) ses=4294967295 comm="qemu-kvm" exe="/usr/libexec/qemu-kvm" subj=system_u:system_r:svirt_t:s0:c107,c132 key=(null)
===

It's showing error on "/dev/dm-233" and the context was having is fixed_disk_device_t instead of "svirt_image_t" as per the log above. The dm-233 is the destination storage LV.


lrwxrwxrwx.  1 root root       9 Oct 27 01:25 326f1ee8--cdaf--4ae5--abae--2d37b6fc9cdf-2cde64b7--aa85--45fe--846f--65dcd3e5ced7 -> ../dm-233


The server is having libvirt 3.9.0 where libvirt will use mount namespace. So it's unlikely that any other process(like udev, vdsm) changes the SELinux context.  The strace and qemu logs clearly shows "permission denied" error message.

2018-10-26T18:34:44.718660Z qemu-kvm: Failed to flush the L2 table cache: Permission denied
2018-10-26T18:34:44.719401Z qemu-kvm: Failed to flush the refcount block cache: Permission denied

Since the SELinux context is set by libvirtd, I don't think this is an RHV bug. However, I was not able to reproduce this issue yet.

Do we have any known issue with libvirt not setting correct context during blockCopy/blockJobAbort?

Also, the customer was able to migrate the disk without any changes in the next attempt.

Comment 11 Han Han 2018-10-31 02:59:31 UTC

(In reply to nijin ashok from comment #10)
> The migration is from the storage domain
> "f219a555-9187-4158-b7ed-60ba2cc51098" to
> "2cde64b7-aa85-45fe-846f-65dcd3e5ced7".
> 
> It failed when it does "blockJobAbort" with the error below.
> 
> ===
> 2018-10-27 02:34:42,578+0800 INFO  (jsonrpc/1) [vdsm.api] START
> diskReplicateFinish(srcDisk={u'device': u'disk', u'poolID':
> u'33f59a82-e273-433f-a59f-4ae661527d04', u'volumeID':
> u'2cde64b7-aa85-45fe-846f-65dcd3e5ced7', u'domainID':
> u'f219a555-9187-4158-b7ed-60ba2cc51098', u'imageID':
> u'733646b4-de58-4d09-96eb-8b698b759dd1'}, dstDisk={u'device': u'disk',
> u'poolID': u'33f59a82-e273-433f-a59f-4ae661527d04', u'volumeID':
> u'2cde64b7-aa85-45fe-846f-65dcd3e5ced7', u'domainID':
> u'326f1ee8-cdaf-4ae5-abae-2d37b6fc9cdf', u'imageID':
> u'733646b4-de58-4d09-96eb-8b698b759dd1'}) from=::ffff: (api:46)
> 
> libvirtError: internal error: unable to execute QEMU command
> 'block-job-complete': Block job 'drive-virtio-disk2' not found
> ===
> 
There is one libvirt bug BZ1506072 similar to this bug.
> 
> As Yuval Turgeman mentioned in the mailing list, there are AVC denied error
> in the audit log at the time of issue which seems to be the root cause.
> 
The audit log of BZ1506072:
type=AVC msg=audit(1508873458.895:3099876): avc:  denied  { write } for  pid=4211 comm="qemu-kvm" name="dm-143" dev="tmpfs" ino=176589443 scontext=system_u:system_r:svirt_t:s0:c339,c582 tcontext=system_u:object_r:fixed_disk_device_t:s0 tclass=blk_file


type=SYSCALL msg=audit(1508873458.895:3099876): arch=x86_64 syscall=open success=no exit=EACCES a0=56074c4f8031 a1=84002 a2=0 a3=0 items=0 ppid=1 pid=4211 auid=4294967295 uid=107 gid=107 euid=107 suid=107 fsuid=107 egid=107 sgid=107 fsgid=107 tty=(none) ses=4294967295 comm=qemu-kvm exe=/usr/libexec/qemu-kvm subj=system_u:system_r:svirt_t:s0:c339,c582 key=(null)
> ===
> type=AVC msg=audit(1540578884.718:9218518): avc:  denied  { write } for 
> pid=6841 comm="qemu-kvm" path="/dev/dm-233" dev="tmpfs" ino=470420506
> scontext=system_u:system_r:svirt_t:s0:c107,c132
> tcontext=system_u:object_r:fixed_disk_device_t:s0 tclass=blk_file
> type=SYSCALL msg=audit(1540578884.718:9218518): arch=c000003e syscall=209
> success=no exit=-13 a0=7f89550eb000 a1=1 a2=7f89410f9820 a3=1 items=0 ppid=1
> pid=6841 auid=4294967295 uid=107 gid=107 euid=107 suid=107 fsuid=107
> egid=107 sgid=107 fsgid=107 tty=(none) ses=4294967295 comm="qemu-kvm"
> exe="/usr/libexec/qemu-kvm" subj=system_u:system_r:svirt_t:s0:c107,c132
> key=(null)
> ===
> 
> It's showing error on "/dev/dm-233" and the context was having is
> fixed_disk_device_t instead of "svirt_image_t" as per the log above. The
> dm-233 is the destination storage LV.
> 
> 
> lrwxrwxrwx.  1 root root       9 Oct 27 01:25
> 326f1ee8--cdaf--4ae5--abae--2d37b6fc9cdf-2cde64b7--aa85--45fe--846f--
> 65dcd3e5ced7 -> ../dm-233
> 
> 
> The server is having libvirt 3.9.0 where libvirt will use mount namespace.
> So it's unlikely that any other process(like udev, vdsm) changes the SELinux
> context.  The strace and qemu logs clearly shows "permission denied" error
> message.
> 
> 2018-10-26T18:34:44.718660Z qemu-kvm: Failed to flush the L2 table cache:
> Permission denied
> 2018-10-26T18:34:44.719401Z qemu-kvm: Failed to flush the refcount block
> cache: Permission denied
> 
> Since the SELinux context is set by libvirtd, I don't think this is an RHV
> bug. However, I was not able to reproduce this issue yet.
> 
> Do we have any known issue with libvirt not setting correct context during
> blockCopy/blockJobAbort?
> 
However, BZ1506072 is fixed in libvirt-3.9.0-3.el7. I will try to reproduce this bug on the libvirt version of reporter.

Could you tell me what is the exact libvirt version of reporter? I don't know how to find it in the sos reports....

> Also, the customer was able to migrate the disk without any changes in the
> next attempt.

Comment 12 Han Han 2018-10-31 03:35:41 UTC

I find the libvirt-libs version of sos report is: libvirt-libs-3.2.0-14.el7_4.3.x86_64. However, the similar bug is fixed since libvirt-3.2.0-14.el7_4.5(BZ1516717).
I will keep tracking and try to reproduce it.

Comment 13 Han Han 2018-10-31 07:56:07 UTC

Sorry for the wrong libvirt version. I mistook history libvirt version as current libvirt version...

I try to reproduce it on libvirt-3.9.0-14.el7_5.2.x86_64 qemu-kvm-rhev-2.10.0-21.el7_5.1.x86_64 via virsh:
Since LSM in rhv actually invokes libvirt blockCopy with --shallow --reuse-external flag to copy storage, then invokes blockJobAbort with --pivot to switch to new storage. So I tried with following

1. Prepare a running VM, lvm backend, virtio type
# virsh dumpxml rhel|grep '/<disk/,/<\disk/'
    <disk type='block' device='disk'>
      <driver name='qemu' type='raw'/>
      <source dev='/dev/test/base'/>
      <target dev='vda' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </disk>

2. Create backing file
# qemu-img create -b /dev/test/base -f qcow2 /dev/test/lv1 -o backing_fmt=raw

# qemu-img create -b /dev/test/base -f qcow2 /dev/test/lv2 -o backing_fmt=raw

3. Start vm and create disk snapshot with reused external
# virsh start rhel
Domain rhel started

# virsh snapshot-create-as rhel s1 --no-metadata --reuse-external --disk-only --diskspec vda,file=/dev/test/lv1
Domain snapshot s1 created


4. Do shallow blockcopy with reused external disk. Use --pivot to switch to new storage after storage copied.
# virsh blockcopy rhel vda /dev/test/lv2 --shallow  --wait --verbose --transient-job --reuse-external --pivot
Block Copy: [100 %]
Successfully pivoted


# virsh blockcopy rhel vda /dev/test/lv1 --shallow  --wait --verbose --transient-job --reuse-external --pivot
Block Copy: [100 %]
Successfully pivoted


This bug is not reproduced.

Comment 14 nijin ashok 2018-10-31 09:10:09 UTC

(In reply to Han Han from comment #13)
> This bug is not reproduced.

I was not able to reproduce in my test environment as well. I tested it multiple times using RHV with same vdsm/libvirt version. 

From the customer's vdsm log, I can see that the issue has happened on multiple VMs. And the customer is having libvirt 3.9.0-14.el7_5.2 which already have the fix for BZ 1516717 and BZ 1506072. Also, the issue is fixed automatically if he tried the LSM multiple times.

I was trying to get libvirt debug logs and output of "ls -lRZ /dev/*" within the mount namespace in a loop during the migration to have more understanding on this if he is able to reproduce the issue again.

Do you need any other specific information if the customer is able to reproduce the issue again so that we can understand the root cause?

Comment 15 Han Han 2018-10-31 10:14:48 UTC

(In reply to nijin ashok from comment #14)
> (In reply to Han Han from comment #13)
> > This bug is not reproduced.
> 
> I was not able to reproduce in my test environment as well. I tested it
> multiple times using RHV with same vdsm/libvirt version. 
> 
> From the customer's vdsm log, I can see that the issue has happened on
> multiple VMs. And the customer is having libvirt 3.9.0-14.el7_5.2 which
> already have the fix for BZ 1516717 and BZ 1506072. Also, the issue is fixed
> automatically if he tried the LSM multiple times.
> 
> I was trying to get libvirt debug logs and output of "ls -lRZ /dev/*" within
> the mount namespace in a loop during the migration to have more
> understanding on this if he is able to reproduce the issue again.
> 
> Do you need any other specific information if the customer is able to
> reproduce the issue again so that we can understand the root cause?
No more extra info needed. It is better to get libvirtd.log, but libvirtd log was not enable in the env of reporter:)

I will check the libvirt part of vdsm.log with problem tomorrow. And try to reproduce it by following the vdsm log...

Comment 16 nijin ashok 2018-10-31 13:32:26 UTC

Also, the bug 1597019 looks similar to this one which was opened by RHV QE. I can see same "permission denied" error messages in the qemu logs for the VM in that BZ as well. Looks like they were not able to reproduce it after that.

Comment 17 Han Han 2018-11-01 04:17:45 UTC

Following the operations in vdsm, I wrote a script to reproduce it:

#!/bin/bash -x
DOM=rhel
DISK=vda
IMG_SRC=/var/lib/libvirt/images/rhel.qcow2
BASE=/datacenter/storage/base
MIR1=/datacenter/storage/lv1
MIR2=/datacenter/storage/lv2

function blockcopy_check() {
    virsh blockcopy $DOM $DISK $1 --shallow --wait --verbose --pivot --transient-job --reuse-external
    if [ $? -ne 0 ];then
        echo fail
        exit 2
    fi
}

#qemu-img convert $IMG_SRC -O raw $BASE
qemu-img create -b $BASE -f qcow2 $MIR1 -o backing_fmt=raw
qemu-img create -b $BASE -f qcow2 $MIR2 -o backing_fmt=raw
# preparation
# Please prepare symlink of block backend: $BASE $MIR1 $MIR2
useradd vdsm -u 36 -G kvm
DIR=$(dirname $BASE)
chmod 777 $DIR -R
chcon -u system_u -r object_r -t virt_var_run_t -l s0 $DIR -R

while true;do
    virsh start $DOM && sleep 30
    virsh snapshot-create-as $DOM s1 --no-metadata --disk-only --reuse-external --diskspec $DISK,file=$MIR1
    blockcopy_check $MIR2
    blockcopy_check $MIR1
    virsh destroy $DOM
    virsh detach-disk $DOM $DISK --config
    virsh attach-disk $DOM $BASE $DISK --config
done


Hope it can hit this bug sometime :)

Comment 29 Han Han 2019-01-14 10:30:47 UTC

I tried to produce it on:
libvirt-3.9.0-14.el7_5.8.x86_64
qemu-kvm-rhev-2.10.0-21.el7_5.7.x86_64
selinux-policy-3.13.1-192.el7_5.7.noarch

1. Create logical volumes on fc storage
2. Create symbol links of the LVs
# ls /rhev/ -alZ
drwxr-xr-x. root root unconfined_u:object_r:mnt_t:s0   .
dr-xr-xr-x. root root system_u:object_r:root_t:s0      ..
lrwxrwxrwx. root root unconfined_u:object_r:mnt_t:s0   lv1 -> /dev/fc/lv
lrwxrwxrwx. root root unconfined_u:object_r:mnt_t:s0   lv2 -> /dev/fc/lv1

# ls /dev/fc -alZ
drwxr-xr-x. root root system_u:object_r:device_t:s0    .
drwxr-xr-x. root root system_u:object_r:device_t:s0    ..
lrwxrwxrwx. root root system_u:object_r:device_t:s0    lv -> ../dm-3
lrwxrwxrwx. root root system_u:object_r:device_t:s0    lv1 -> ../dm-4

3. Create a VM based /rhev/lv1
Disk xml:
<disk type='block' device='disk' snapshot='no'>
<driver name='qemu' type='raw' cache='none' error_policy='stop' io='native'/>
      <source dev='/rhev/lv1'/>
      <backingStore/>
      <target dev='vda' bus='virtio'/>
    </disk>

Start vm and blockcopy to another logical volume:
# virsh blockcopy fc vda /rhev/lv2 --wait --verbose --pivot
Block Copy: [100 %]
Successfully pivoted

Copy back:
# virsh blockcopy fc vda /rhev/lv1 --wait --verbose --pivot                                                                               
Block Copy: [100 %]
Successfully pivoted

Bug not reproduced.

Hi Raz, are they that vdsm does when preparing fc storage? What vdsm exactly does when preparing fc?

Comment 30 Raz Tamir 2019-01-27 09:29:52 UTC

Hi Han Han,

+Elad who discovered this bug (https://bugzilla.redhat.com/show_bug.cgi?id=1597019) on RHV QE side and looks exactly the same.

We probably can reproduce it on our side?

Comment 31 Elad 2019-01-27 11:13:51 UTC

Hi Han Han, the following are the commands used for volume and VM creation. All taken from the logs uploaded to bug 1597019 in https://bugzilla.redhat.com/attachment.cgi?id=1455763

hypervisor-lynx01_vdsm.log:
LV creation:

2018-06-29 14:58:20,773+0300 DEBUG (tasks/4) [storage.Misc.excCmd] /usr/bin/taskset --cpu-list 0-7 /usr/bin/sudo -n /sbin/lvm lvcreate --config ' devices { preferred_names = ["^/dev/mapper/"] ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3 filter = [ '\''a|/dev/mapper/3600a09803830447a4f244c4657595057|'\'', '\''r|.*|'\'' ] }  global {  locking_type=1  prioritise_write_locks=1  wait_for_locks=1  use_lvmetad=0 }  backup {  retain_min = 50  retain_days = 0 } ' --autobackup n --contiguous n --size 1024m --addtag OVIRT_VOL_INITIALIZING --name 45468d0d-77ab-4ff1-a60c-6e68e1c7e0bb f54afc10-74ad-4ae5-99b5-32f1ac874c44 (cwd None) (commands:65)


hypervisor-lynx03_vdsm.log:
VM XML for domain create (start VM):

<domain type="kvm" xmlns:ovirt-tune="http://ovirt.org/vm/tune/1.0" xmlns:ovirt-vm="http://ovirt.org/vm/1.0"><name>vm_type_2914583186_from_both</name><uuid>a09ab6e7-57bf-4010-b616-003df33ff5b5</uuid><memory>1048576</memory><currentMemory>1048576</currentMemory><maxMemory slots="16">4194304</maxMemory><vcpu current="1">16</vcpu><sysinfo type="smbios"><system><entry name="manufacturer">oVirt</entry><entry name="product">OS-NAME:</entry><entry name="version">OS-VERSION:</entry><entry name="serial">HOST-SERIAL:</entry><entry name="uuid">a09ab6e7-57bf-4010-b616-003df33ff5b5</entry></system></sysinfo><clock offset="variable" adjustment="0"><timer name="rtc" tickpolicy="catchup"></timer><timer name="pit" tickpolicy="delay"></timer><timer name="hpet" present="no"></timer></clock><features><acpi></acpi><vmcoreinfo></vmcoreinfo></features><cpu match="exact"><model>Westmere</model><topology cores="1" threads="1" sockets="16"></topology><numa><cell id="0" cpus="0" memory="1048576"></cell></numa></cpu><cputune></cputune><devices><input type="mouse" bus="ps2"></input><channel type="unix"><target type="virtio" name="ovirt-guest-agent.0"></target><source mode="bind" path="/var/lib/libvirt/qemu/channels/a09ab6e7-57bf-4010-b616-003df33ff5b5.ovirt-guest-agent.0"></source></channel><channel type="unix"><target type="virtio" name="org.qemu.guest_agent.0"></target><source mode="bind" path="/var/lib/libvirt/qemu/channels/a09ab6e7-57bf-4010-b616-003df33ff5b5.org.qemu.guest_agent.0"></source></channel><controller type="virtio-serial" index="0" ports="16"><alias name="ua-1e1a4b3e-822e-4474-bcef-9a7a062f7fbe"></alias></controller><video><model type="qxl" vram="8192" heads="1" ram="65536" vgamem="16384"></model><alias name="ua-3caa0ebc-f90a-4450-a72c-3006e313c9ed"></alias></video><memballoon model="virtio"><stats period="5"></stats><alias name="ua-4f67f598-d26c-4700-b2b1-71443e4fb2f4"></alias></memballoon><sound model="ich6"><alias name="ua-58891468-8889-43ac-8661-f1b02fcb610e"></alias></sound><controller type="usb" model="piix3-uhci" index="0"></controller><graphics type="spice" port="-1" autoport="yes" passwd="*****" passwdValidTo="1970-01-01T00:00:01" tlsPort="-1"><channel name="main" mode="secure"></channel><channel name="inputs" mode="secure"></channel><channel name="cursor" mode="secure"></channel><channel name="playback" mode="secure"></channel><channel name="record" mode="secure"></channel><channel name="display" mode="secure"></channel><channel name="smartcard" mode="secure"></channel><channel name="usbredir" mode="secure"></channel><listen type="network" network="vdsm-ovirtmgmt"></listen></graphics><rng model="virtio"><backend model="random">/dev/urandom</backend><alias name="ua-ab6e48fd-84ee-4681-95e9-5bf3ac10be0e"></alias></rng><controller type="scsi" model="virtio-scsi" index="0"><alias name="ua-b54c7cc0-0810-46ef-a893-977030002d54"></alias></controller><channel type="spicevmc"><target type="virtio" name="com.redhat.spice.0"></target></channel><interface type="bridge"><model type="virtio"></model><link state="up"></link><source bridge="ovirtmgmt"></source><alias name="ua-aceaac08-f924-4f3a-ab96-a7a1ac0cc711"></alias><mac address="00:1a:4a:16:88:0f"></mac><filterref filter="vdsm-no-mac-spoofing"></filterref><bandwidth></bandwidth></interface><disk type="file" device="cdrom" snapshot="no"><driver name="qemu" type="raw" error_policy="report"></driver><source file="" startupPolicy="optional"></source><target dev="hdc" bus="ide"></target><readonly></readonly><alias name="ua-84ff7fe0-9caf-4c6b-8d02-9e76ec9fe314"></alias></disk><disk snapshot="no" type="block" device="disk"><target dev="vda" bus="virtio"></target><source dev="/rhev/data-center/mnt/blockSD/f54afc10-74ad-4ae5-99b5-32f1ac874c44/images/fee6faec-ebfc-440a-8e88-84fe4ecf68d1/cad8fb18-bfdc-4585-aee6-1a8218d18c09"></source><driver name="qemu" io="native" type="qcow2" error_policy="stop" cache="none"></driver><alias name="ua-fee6faec-ebfc-440a-8e88-84fe4ecf68d1"></alias><boot order="1"></boot><serial>fee6faec-ebfc-440a-8e88-84fe4ecf68d1</serial></disk></devices><pm><suspend-to-disk enabled="no"></suspend-to-disk><suspend-to-mem enabled="no"></suspend-to-mem></pm><os><type arch="x86_64" machine="pc-i440fx-rhel7.5.0">hvm</type><smbios mode="sysinfo"></smbios></os><metadata><ovirt-tune:qos></ovirt-tune:qos><ovirt-vm:vm><minGuaranteedMemoryMb type="int">1024</minGuaranteedMemoryMb><clusterVersion>4.2</clusterVersion><ovirt-vm:custom></ovirt-vm:custom><ovirt-vm:device mac_address="00:1a:4a:16:88:0f"><ovirt-vm:custom></ovirt-vm:custom></ovirt-vm:device><ovirt-vm:device devtype="disk" name="vda"><ovirt-vm:poolID>5eb0ed42-7acf-11e8-ae76-00163e7be006</ovirt-vm:poolID><ovirt-vm:volumeID>cad8fb18-bfdc-4585-aee6-1a8218d18c09</ovirt-vm:volumeID><ovirt-vm:imageID>fee6faec-ebfc-440a-8e88-84fe4ecf68d1</ovirt-vm:imageID><ovirt-vm:domainID>f54afc10-74ad-4ae5-99b5-32f1ac874c44</ovirt-vm:domainID></ovirt-vm:device><launchPaused>false</launchPaused><resumeBehavior>auto_resume</resumeBehavior></ovirt-vm:vm></metadata></domain>'}




The failure:

2018-06-29 15:04:31,212+0300 INFO  (jsonrpc/3) [virt.vm] (vmId='a09ab6e7-57bf-4010-b616-003df33ff5b5') Disabling drive monitoring (drivemonitor:58)
2018-06-29 15:04:31,276+0300 ERROR (jsonrpc/3) [virt.vm] (vmId='a09ab6e7-57bf-4010-b616-003df33ff5b5') Unable to stop the replication for the drive: vda (vm:4750)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 4747, in diskReplicateFinish
    self._dom.blockJobAbort(drive.name, blockJobFlags)
  File "/usr/lib/python2.7/site-packages/vdsm/virt/virdomain.py", line 98, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/common/libvirtconnection.py", line 130, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/common/function.py", line 92, in wrapper
    return func(inst, *args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 773, in blockJobAbort
    if ret == -1: raise libvirtError ('virDomainBlockJobAbort() failed', dom=self)
libvirtError: internal error: unable to execute QEMU command 'block-job-complete': Block job 'drive-ua-fee6faec-ebfc-440a-8e88-84fe4ecf68d1' not found

Comment 33 Elad 2019-02-11 07:41:28 UTC

Hi Han Han, unfortunately, we still don't have the exact steps that trigger the issue. 
You can track the correspondings in bug 1660265 which has several customer tickets attached.

Comment 35 Peter Krempa 2019-02-19 16:27:26 UTC

I looked through some of the logs attached but none of those I looked at contained libvirtd debug logs. Without that I can't help much.

It is fishy though that 'block-job-complete' was called in the situations where the job is missing as libvirt makes sure that the job is in ready state before sending the command or that it can't be completed. Both should not have happened.

Unfortunately without debug logs or an reproducer it's very hard to figure out what's happening.

Comment 37 Nir Soffer 2019-03-11 22:37:51 UTC

(In reply to Peter Krempa from comment #35)
> It is fishy though that 'block-job-complete' was called in the situations
> where the job is missing as libvirt makes sure that the job is in ready
> state before sending the command or that it can't be completed. Both should
> not have happened.

Peter, we had a similar issue in blockCommit in the past, when we failed to check
if the block job is ready in the xml before trying to pivot. Maybe we have similar
race?

What vdsm i doing is (simplified):

    # Looking for the replication blockJob info (checking its presence)
    blkJobInfo = self._dom.blockJobInfo(drive.name, 0)

    # Checking if we reached the replication mode ("mirroring" in libvirt
    # and qemu terms)
    if blkJobInfo['cur'] != blkJobInfo['end']:                                                                                                                           
        fail...

    # Stopping the replication                                                                                                                                       
    self._dom.blockJobAbort(drive.name, blockJobFlags)

In active layer commit, we do:

    def _activeLayerCommitReady(self, jobInfo, drive):                                                                                                                       
        if (jobInfo['cur'] == jobInfo['end'] and jobInfo['type'] == pivot):

            # Check the job state in the xml to make sure the job is
            # ready. We know about two interesting corner cases:
            #
            # - cur == 0 and end == 0 when a job starts. Trying to pivot
            #   succeeds, but the xml never updates after that.
            #   See https://bugzilla.redhat.com/1442266.
            #
            # - cur == end and cur != 0, but the job is not ready yet, and
            #   blockJobAbort raises an error.
            #   See https://bugzilla.redhat.com/1376580

            self.log.debug("Checking xml for drive %r", drive.name)
            root = ET.fromstring(self._dom.XMLDesc(0))
            disk_xpath = "./devices/disk/target[@dev='%s'].." % drive.name
            disk = root.find(disk_xpath)
            if disk is None:
                self.log.warning("Unable to find %r in vm xml", drive)
                return False
            return disk.find("./mirror[@ready='yes']") is not None
        return False

Should we do something similar for blockCopy?

Or maybe we should register VIR_DOMAIN_EVENT_ID_BLOCK_JOB_2 and 
VIR_DOMAIN_BLOCK_JOB_READY to detect that that a blockCopy job
is ready?

Currently we register libvirt.VIR_DOMAIN_EVENT_ID_BLOCK_JOB, but we only
log the event, and take no action.

Comment 38 Peter Krempa 2019-03-12 13:03:27 UTC

(In reply to Nir Soffer from comment #37)
> (In reply to Peter Krempa from comment #35)
> > It is fishy though that 'block-job-complete' was called in the situations
> > where the job is missing as libvirt makes sure that the job is in ready
> > state before sending the command or that it can't be completed. Both should
> > not have happened.
> 
> Peter, we had a similar issue in blockCommit in the past, when we failed to
> check
> if the block job is ready in the xml before trying to pivot. Maybe we have
> similar
> race?

You are right in the regard that active block-commit and block-copy have exactly the same internals in libvirt ...

> 
> What vdsm i doing is (simplified):
> 
>     # Looking for the replication blockJob info (checking its presence)
>     blkJobInfo = self._dom.blockJobInfo(drive.name, 0)
> 
>     # Checking if we reached the replication mode ("mirroring" in libvirt
>     # and qemu terms)
>     if blkJobInfo['cur'] != blkJobInfo['end']:                              
> 
>         fail...
> 
>     # Stopping the replication                                              
> 
>     self._dom.blockJobAbort(drive.name, blockJobFlags)

I presume this includes _PIVOT here. You are right that the scenario described below can happen. The thing is though that it will have different behaviour regarding the error messages reported:

If we assume that the _PIVOT flag is used we perform the following check:

17223     if (disk->mirrorState != VIR_DOMAIN_DISK_MIRROR_STATE_READY) {              
17224         virReportError(VIR_ERR_BLOCK_COPY_ACTIVE,                               
17225                        _("disk '%s' not ready for pivot yet"),                  
17226                        disk->dst);                                              
17227         goto cleanup;                                                           
17228     }                                                                           

Libvirt updates the mirror state only according to events.

(Historically (e.g. in libvirt-3.9.0 as described ) there was also code which updated the state if it was not READY, but that actually checked the 'ready' flag in qemu (if qemu supported it) and thus should not have caused problems)). I'm not sure which version introduced the 'ready' state.

The error that you can see e.g. in comment 31 happens later as a response to using block-job-complete command:

libvirtError: internal error: unable to execute QEMU command 'block-job-complete': Block job 'drive-ua-fee6faec-ebfc-440a-8e88-84fe4ecf68d1' not found

> In active layer commit, we do:
> 
>     def _activeLayerCommitReady(self, jobInfo, drive):                      
> 
>         if (jobInfo['cur'] == jobInfo['end'] and jobInfo['type'] == pivot):
> 
>             # Check the job state in the xml to make sure the job is
>             # ready. We know about two interesting corner cases:
>             #
>             # - cur == 0 and end == 0 when a job starts. Trying to pivot
>             #   succeeds, but the xml never updates after that.
>             #   See https://bugzilla.redhat.com/1442266.
>             #
>             # - cur == end and cur != 0, but the job is not ready yet, and
>             #   blockJobAbort raises an error.
>             #   See https://bugzilla.redhat.com/1376580

There is also a case when cur == end and its non-zero, but the job is not ready for pivot yet. I've recently pushed a fix to this which decrements "cur" by one if the job is not ready:

commit 73ce3911aa0404f070468ae00c78369a5b1ac071
Author: Peter Krempa <pkrempa>
Date:   Tue Jan 29 17:17:29 2019 +0100

    qemu: blockjob: Don't report block job progress at 100% if job isn't ready
    
    Some clients poll virDomainGetBlockJobInfo rather than wait for the
    VIR_DOMAIN_BLOCK_JOB_READY event. In some cases qemu can get to 100% and
    still not reach the synchronised phase. Initiating a pivot in that case
    will fail.
    
    Given that computers are interacting here, the error that the job
    can't be finalized yet is not handled very well by those specific
    implementations.
    
    Our docs now correctly state to use the event. We already do a similar
    output adjustment in case when the progress is not available from qemu
    as in that case we'd report 0 out of 0, which some apps also incorrectly
    considered as 100% complete.
    
    In this case we subtract 1 from the progress if the ready state is not
    signalled by qemu if the progress was at 100% otherwise.
    
    Signed-off-by: Peter Krempa <pkrempa>

diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 96cb108a0e..2485dd93eb 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -17440,6 +17440,14 @@ qemuBlockJobInfoTranslate(qemuMonitorBlockJobInfoPtr rawInfo,
         }
     }

+    /* If qemu reports that it's not ready yet don't make the job go to
+     * cur == end as some apps wrote code polling this instead of waiting for
+     * the ready event */
+    if (rawInfo->ready == 0 &&
+        info->cur == info->end &&
+        info->cur > 0)
+        info->cur -= 1;
+
     info->type = rawInfo->type;
     if (info->type == VIR_DOMAIN_BLOCK_JOB_TYPE_COMMIT &&
         disk->mirrorJob == VIR_DOMAIN_BLOCK_JOB_TYPE_ACTIVE_COMMIT)

v5.0.0-180-g73ce3911aa



> 
>             self.log.debug("Checking xml for drive %r", drive.name)
>             root = ET.fromstring(self._dom.XMLDesc(0))
>             disk_xpath = "./devices/disk/target[@dev='%s'].." % drive.name
>             disk = root.find(disk_xpath)
>             if disk is None:
>                 self.log.warning("Unable to find %r in vm xml", drive)
>                 return False
>             return disk.find("./mirror[@ready='yes']") is not None
>         return False
> 
> Should we do something similar for blockCopy?
> 
> Or maybe we should register VIR_DOMAIN_EVENT_ID_BLOCK_JOB_2 and 
> VIR_DOMAIN_BLOCK_JOB_READY to detect that that a blockCopy job
> is ready?

That is the best solution. Also new libvirt munges the output if the job is not ready for pivot now.

> 
> Currently we register libvirt.VIR_DOMAIN_EVENT_ID_BLOCK_JOB, but we only
> log the event, and take no action.

VIR_DOMAIN_EVENT_ID_BLOCK_JOB_2 is better as it reports the disk 'target' and thus works also for network disks where we don't report the first event.

Comment 50 Marina Kalinin 2019-08-29 18:48:11 UTC

Since we cannot reproduce after multiple attempts and we do not have the logs available, let's close the bug for now.
The issue is documented in KCS: https://access.redhat.com/solutions/3771541.

If anyone hits the issue again and has the libvirt logs from that same time, please upload to the BZ and reopen.

Comment 51 Marina Kalinin 2019-09-09 20:29:42 UTC

FWIW, it is possible to edit libvirt log level settings without restarting libvirt, using virt-admin command, as documented here:
https://access.redhat.com/solutions/53568

Comment 52 Germano Veit Michel 2019-09-10 02:39:10 UTC

Hi Peter and Jaroslav,

Would you be kind enough to specify the log_filters needed to further investigate this problem?

We are currently proposing adding a small subset of libvirt logs in RHV by default via BZ1749634 ("1:qemu" at the moment) and are interested in what filters could help solving this bug.
This is because this bug is very hard to reproduce and enabling debug logs after the problem happened is not very useful, so the idea is if we could run by default with some logging that would help solve this bug too.

Or we don't know the specifics and something more broad is needed, like below?
~~~
log_filters="1:libvirt 1:qemu 1:conf 1:security 3:event 3:json 3:file 3:object 1:util"
~~~

Thanks,

Comment 54 Jaroslav Suchanek 2019-09-10 10:55:10 UTC

(In reply to Germano Veit Michel from comment #52)
> Hi Peter and Jaroslav,
> 
> Would you be kind enough to specify the log_filters needed to further
> investigate this problem?
> 
> We are currently proposing adding a small subset of libvirt logs in RHV by
> default via BZ1749634 ("1:qemu" at the moment) and are interested in what
> filters could help solving this bug.
> This is because this bug is very hard to reproduce and enabling debug logs
> after the problem happened is not very useful, so the idea is if we could
> run by default with some logging that would help solve this bug too.
> 
> Or we don't know the specifics and something more broad is needed, like
> below?
> ~~~
> log_filters="1:libvirt 1:qemu 1:conf 1:security 3:event 3:json 3:file
> 3:object 1:util"
> ~~~

Yes, this setting should be fine.

Thanks.

Comment 57 Marina Kalinin 2020-03-16 15:54:10 UTC

I propose to add insights rule to check whether libvirt logs are enabled.
Should the rule check if libvirt logs are size 0 or don't exist? Or check the `/etc/libvirt/libvirtd.conf` if it exists and what is the config in it?

Petr, would you have any suggestions maybe on what to check?

Comment 58 Peter Krempa 2020-03-16 16:31:43 UTC

Logging can be done in various ways. The best way is to check whether the proper logging is configured in /etc/libvirt/libvirtd.conf

Depending on version the 'log_level', 'log_filters' and 'log_outputs' variables configure logging.

Please see https://wiki.libvirt.org/page/DebugLogs for more details.

Note You need to log in before you can comment on or make changes to this bug.

abpatil
aperotti
aromito
bcholler
branpise
bzlotnik
dyuan
fdelorey
gveitmic
gwatson
hhan
jdenemar
jsuchane
lichen
lsurette
michal.skrivanek
mjankula
mkalinin
mtessun
nashok
nsoffer
pkrempa
smaudet
srevivo
tnisan
xuzhang
yafu
yalzhang
ycui