Bug 1866707 - qemu-kvm is crashing with error "scsi_target_emulate_report_luns: Assertion `i == n + 8' failed"
Summary: qemu-kvm is crashing with error "scsi_target_emulate_report_luns: Assertion `i == n + 8' failed"
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux Advanced Virtualization
Classification: Red Hat
Component: qemu-kvm
Version: ---
Hardware: All
OS: Linux
Priority: low
Severity: high
Target Milestone: rc
Assignee: Maxim Levitsky
QA Contact: qing.wang
URL:
Whiteboard:
Depends On:
Blocks: 1888132
 
Reported: 2020-08-06 07:52 UTC by nijin ashok
Modified: 2021-04-13 02:17 UTC (History)
12 users

Fixed In Version: qemu-kvm-5.1.0-15.module+el8.3.1+8772+a3fdeccd
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1888132 (view as bug list)
Environment:
Last Closed: 2021-02-22 15:39:38 UTC
Type: Bug
Target Upstream Version:


Attachments (Terms of Use)
Full threads backtrace of comment 15 (24.63 KB, text/plain)
2020-08-11 02:45 UTC, Han Han


Links
System ID Private Priority Status Summary Last Updated
Red Hat Knowledge Base (Solution) 5341981 0 None None None 2020-08-24 07:58:45 UTC

Description nijin ashok 2020-08-06 07:52:53 UTC
Description of problem:

qemu-kvm crashes with the error below if the VM issues a "REPORT LUNS" command while any of its disks is being hot plugged or unplugged at the same time.

===
qemu-kvm: hw/scsi/scsi-bus.c:415: scsi_target_emulate_report_luns: Assertion `i == n + 8' failed.
2020-08-06 04:03:04.940+0000: shutting down, reason=crashed
===
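The assertion suggests a time-of-check/time-of-use race: scsi_target_emulate_report_luns sizes the REPORT LUNS response from one pass over the attached LUNs and then fills it in a second pass, so a hot plug/unplug landing between the two passes leaves the fill index `i` disagreeing with the precomputed length `n + 8`. A minimal Python sketch of that suspected mechanism (illustrative only, not QEMU's actual C code):

```python
# Rough illustration of why `i == n + 8` can fail when the device set
# changes between the sizing pass and the fill pass. `devices` is a
# callable so each pass re-enumerates the bus, as a hotplug race would.

def report_luns(devices):
    # Pass 1: count LUNs and size the buffer (8-byte header + 8 bytes/LUN).
    n = len(devices()) * 8
    buf = bytearray(n + 8)
    # Pass 2: fill in one 8-byte entry per LUN.
    i = 8
    for lun in devices():
        buf[i:i + 8] = lun.to_bytes(8, "big")
        i += 8
    assert i == n + 8, "device set changed between passes"
    return bytes(buf)

# Stable device set: the invariant holds.
report_luns(lambda: [0, 1, 2])

# A "hot unplug" between the two passes breaks it.
calls = {"n": 0}
def racy_devices():
    calls["n"] += 1
    return [0, 1, 2] if calls["n"] == 1 else [0, 1]  # LUN 2 unplugged

try:
    report_luns(racy_devices)
except AssertionError:
    pass  # in qemu-kvm this assert aborts the whole process, crashing the VM
```

In QEMU the failed assert calls abort(), which is why the guest dies rather than the command merely failing.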

For the customer who reported this issue, a Commvault agent VM in RHV crashes intermittently. This agent VM runs Windows, and the snapshot disks of all other VMs are attached to and detached from it while those VMs are backed up. The agent VM crashed twice with the above error during the customer's backup runs. It appears this agent VM issues "REPORT LUNS" frequently.

I was also able to reproduce this issue with the steps below.

On the host, start a while loop that attaches and detaches the disk:

# while true; do virsh detach-device myagent test.xml; sleep 1; virsh attach-device myagent test.xml;sleep 1;done

In the VM, start 3 instances of a while loop running sg_luns (the VM has 4 vCPUs and 2 GB RAM):

# while true;do sg_luns /dev/sda;done

Within 10 minutes, the VM crashed with the mentioned error.

Version-Release number of selected component (if applicable):

qemu-kvm-rhev-2.12.0-33.el7_7.8.x86_64

How reproducible:

100%

Steps to Reproduce:

Please refer above.

Actual results:

qemu-kvm is crashing with error "scsi_target_emulate_report_luns: Assertion `i == n + 8' failed"

Expected results:

The VM should not crash.

Additional info:

Many backup vendors for RHV have a similar flow for backing up VMs, where the snapshot base disks of all VMs are attached to and detached from the agent VM during the backup process. If the agent VM dies during this time, the whole backup process is affected.

Comment 3 CongLi 2020-08-06 09:07:57 UTC
Hi Qing,

Could you please try to reproduce this bug?

Thanks.

Comment 4 qing.wang 2020-08-06 09:58:57 UTC
Hi nijin, could you please provide the XML of the agent and the disk you used to reproduce this?

Comment 8 qing.wang 2020-08-07 06:12:00 UTC
I failed to reproduce this issue:
My ENV:
host:3.10.0-1062.el7.x86_64
qemu-kvm-common-rhev-2.12.0-33.el7_7.8.x86_64
Guest: 3.10.0-1062.21.1.el7.x86_64

Test steps:
1.create images:
qemu-img create -f qcow2 /home/kvm_autotest_root/images/stg0.qcow2 1G
qemu-img create -f qcow2 /home/kvm_autotest_root/images/stg1.qcow2 1.1G
qemu-img create -f qcow2 /home/kvm_autotest_root/images/stg2.qcow2 1.2G
qemu-img create -f qcow2 /home/kvm_autotest_root/images/stg3.qcow2 1.3G
qemu-img create -f qcow2 /home/kvm_autotest_root/images/stg4.qcow2 1.4G
qemu-img create -f qcow2 /home/kvm_autotest_root/images/stg5.qcow2 1.5G
qemu-img create -f qcow2 /home/kvm_autotest_root/images/stg6.qcow2 1.6G
qemu-img create -f qcow2 /home/kvm_autotest_root/images/stg7.qcow2 1.7G
qemu-img create -f qcow2 /home/kvm_autotest_root/images/stg8.qcow2 1.8G
qemu-img create -f qcow2 /home/kvm_autotest_root/images/stg9.qcow2 1.9G
qemu-img create -f qcow2 /home/kvm_autotest_root/images/stg10.qcow2 2G
qemu-img create -f qcow2 /home/kvm_autotest_root/images/stg11.qcow2 2.1G
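The twelve create commands above follow a fixed pattern (stg0..stg11, sizes stepping from 1G to 2.1G in 0.1G increments). As a small convenience sketch (paths as in this comment; the helper name is made up), they can be generated instead of typed out:

```python
# Generate the twelve `qemu-img create` commands used above:
# stg0.qcow2 at 1G through stg11.qcow2 at 2.1G, in 0.1G steps.
def image_create_commands(base="/home/kvm_autotest_root/images", count=12):
    cmds = []
    for i in range(count):
        # Format 1.0 -> "1G", 1.1 -> "1.1G", 2.0 -> "2G", matching the comment.
        size = f"{1.0 + i * 0.1:.1f}".rstrip("0").rstrip(".") + "G"
        cmds.append(f"qemu-img create -f qcow2 {base}/stg{i}.qcow2 {size}")
    return cmds

for cmd in image_create_commands():
    print(cmd)
```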

2.boot vm
virsh destroy myagent
virsh undefine myagent
virsh define agent-pc.xml
virsh start myagent

3.execute script after guest reboot
date; echo `date` > t
a=1
while true; do
    let a=a+1; echo "a=$a"
    virsh detach-device myagent diskb.xml || exit
    sleep 1
    virsh attach-device myagent diskb.xml || exit
    sleep 1
done
echo `date` >> t; date

4.open 3 shell in guest and run
 while true;do sg_luns /dev/sda;done

Ran for about 1 hour; this issue was not reproduced.


=========================================
agent-pc.xml and diskb.xml can be found at:
http://fileshare.englab.nay.redhat.com/pub/section2/images_backup/qbugs/1866707/2020-08-07-0154/1866707/

<domain type='kvm' id='32'>
    <name>myagent</name>
    <uuid>f5e388b4-7abb-4e7f-a949-fbbe4887d92a</uuid>
    <memory unit='KiB'>2097152</memory>
    <currentMemory unit='KiB'>2097152</currentMemory>
    <vcpu placement='static'>4</vcpu>
    <resource>
        <partition>/machine</partition>
    </resource>
    <os>
        <type arch='x86_64' machine='pc'>hvm</type>
        <boot dev='hd'/>
    </os>
    <iothreads>1</iothreads>
    <iothreadids>
        <iothread id='1'/>
    </iothreadids>
    <clock offset='utc'/>
    <on_poweroff>destroy</on_poweroff>
    <on_reboot>restart</on_reboot>
    <on_crash>restart</on_crash>
    <devices>
        <emulator>/usr/libexec/qemu-kvm</emulator>
        <input type='tablet' bus='usb'/>
        <disk type='block' device='disk' snapshot='no'>
            <driver name='qemu' type='qcow2' cache='none' error_policy='stop' io='native'/>
            <source dev='/home/kvm_autotest_root/images/rhel77-64-virtio-scsi.qcow2'/>
            <backingStore/>
            <target dev='sda' bus='scsi'/>
            <alias name='scsi0-0-0-0'/>
            <address type='drive' controller='0' bus='0' target='0' unit='0'/>
        </disk>


         <disk type='block' device='disk' snapshot='no'>
            <driver name='qemu' type='qcow2' cache='none' error_policy='stop' io='native'/>
            <source dev='/home/kvm_autotest_root/images/stg1.qcow2'/>
            <backingStore/>
            <target dev='sdb' bus='scsi'/>
            <alias name='scsi0-0-0-1'/>
            <address type='drive' controller='0' bus='0' target='0' unit='1'/>
        </disk>
        <disk type='block' device='disk' snapshot='no'>
            <driver name='qemu' type='qcow2' cache='none' error_policy='stop' io='native'/>
            <source dev='/home/kvm_autotest_root/images/stg2.qcow2'/>
            <backingStore/>
            <target dev='sdc' bus='scsi'/>
            <alias name='scsi0-0-0-2'/>
            <address type='drive' controller='0' bus='0' target='0' unit='2'/>
        </disk>
        <disk type='block' device='disk' snapshot='no'>
            <driver name='qemu' type='qcow2' cache='none' error_policy='stop' io='native'/>
            <source dev='/home/kvm_autotest_root/images/stg3.qcow2'/>
            <backingStore/>
            <target dev='sdd' bus='scsi'/>
            <alias name='scsi0-0-0-3'/>
            <address type='drive' controller='0' bus='0' target='0' unit='3'/>
        </disk>
        <disk type='block' device='disk' snapshot='no'>
            <driver name='qemu' type='qcow2' cache='none' error_policy='stop' io='native'/>
            <source dev='/home/kvm_autotest_root/images/stg4.qcow2'/>
            <backingStore/>
            <target dev='sde' bus='scsi'/>
            <alias name='scsi0-0-0-4'/>
            <address type='drive' controller='0' bus='0' target='0' unit='4'/>
        </disk>
        <disk type='block' device='disk' snapshot='no'>
            <driver name='qemu' type='qcow2' cache='none' error_policy='stop' io='native'/>
            <source dev='/home/kvm_autotest_root/images/stg5.qcow2'/>
            <backingStore/>
            <target dev='sdf' bus='scsi'/>
            <alias name='scsi0-0-0-5'/>
            <address type='drive' controller='0' bus='0' target='0' unit='5'/>
        </disk>
        <disk type='block' device='disk' snapshot='no'>
            <driver name='qemu' type='qcow2' cache='none' error_policy='stop' io='native'/>
            <source dev='/home/kvm_autotest_root/images/stg6.qcow2'/>
            <backingStore/>
            <target dev='sdg' bus='scsi'/>
            <alias name='scsi0-0-0-6'/>
            <address type='drive' controller='0' bus='0' target='0' unit='6'/>
        </disk>
        <disk type='block' device='disk' snapshot='no'>
            <driver name='qemu' type='qcow2' cache='none' error_policy='stop' io='native'/>
            <source dev='/home/kvm_autotest_root/images/stg7.qcow2'/>
            <backingStore/>
            <target dev='sdh' bus='scsi'/>
            <alias name='scsi0-0-0-7'/>
            <address type='drive' controller='0' bus='0' target='0' unit='7'/>
        </disk>
        <disk type='block' device='disk' snapshot='no'>
            <driver name='qemu' type='qcow2' cache='none' error_policy='stop' io='native'/>
            <source dev='/home/kvm_autotest_root/images/stg8.qcow2'/>
            <backingStore/>
            <target dev='sdi' bus='scsi'/>
            <alias name='scsi0-0-0-8'/>
            <address type='drive' controller='0' bus='0' target='0' unit='8'/>
        </disk>
        <disk type='block' device='disk' snapshot='no'>
            <driver name='qemu' type='qcow2' cache='none' error_policy='stop' io='native'/>
            <source dev='/home/kvm_autotest_root/images/stg9.qcow2'/>
            <backingStore/>
            <target dev='sdj' bus='scsi'/>
            <alias name='scsi0-0-0-9'/>
            <address type='drive' controller='0' bus='0' target='0' unit='9'/>
        </disk>
        <disk type='block' device='disk' snapshot='no'>
            <driver name='qemu' type='qcow2' cache='none' error_policy='stop' io='native'/>
            <source dev='/home/kvm_autotest_root/images/stg10.qcow2'/>
            <backingStore/>
            <target dev='sdk' bus='scsi'/>
            <alias name='scsi0-0-0-a'/>
            <address type='drive' controller='0' bus='0' target='0' unit='10'/>
        </disk>
         <disk type='block' device='disk' snapshot='no'>
            <driver name='qemu' type='qcow2' cache='none' error_policy='stop' io='native'/>
            <source dev='/home/kvm_autotest_root/images/stg11.qcow2'/>
            <backingStore/>
            <target dev='sdl' bus='scsi'/>
            <alias name='scsi0-0-0-b'/>
            <address type='drive' controller='0' bus='0' target='0' unit='11'/>
        </disk>
         <disk type='file' device='disk'>
            <driver name='qemu' type='qcow2' cache='none' iothread='1'/>
            <source file='/home/kvm_autotest_root/images/stg0.qcow2'/>
            <target dev='sdm' bus='virtio'/>
        </disk>
        <disk type='file' device='cdrom'>
            <driver name='qemu' type='raw'/>
            <backingStore/>
            <target dev='hda' bus='ide'/>
            <readonly/>
            <alias name='ide0-0-0'/>
            <address type='drive' controller='0' bus='0' target='0' unit='0'/>
        </disk>
        <controller type='scsi' index='0' model='virtio-scsi'>
            <alias name='scsi0'/>
            <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
        </controller>

        <controller type='ide' index='0'>
            <alias name='ide'/>
            <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
        </controller>
        <controller type='usb' index='0'>
            <alias name='usb'/>
            <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
        </controller>
        <controller type='pci' index='0' model='pci-root'>
            <alias name='pci.0'/>
        </controller>
        <interface type='bridge'>
            <mac address='52:54:00:17:c7:f6'/>
            <source bridge='switch'/>
            <target dev='vnet19'/>
            <model type='virtio'/>
            <driver name='vhost' queues='4'/>
            <alias name='net0'/>
            <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
        </interface>
        <serial type='pty'>
            <source path='/dev/pts/3'/>
            <target type='isa-serial' port='0'/>
            <alias name='serial0'/>
        </serial>
        <console type='pty' tty='/dev/pts/3'>
            <source path='/dev/pts/3'/>
            <target type='serial' port='0'/>
            <alias name='serial0'/>
        </console>
        <input type='mouse' bus='ps2'>
            <alias name='input0'/>
        </input>
        <input type='keyboard' bus='ps2'>
            <alias name='input1'/>
        </input>
        <graphics type='vnc' port='5901' autoport='yes' listen='0.0.0.0'>
            <listen type='address' address='0.0.0.0'/>
        </graphics>
        <video>
            <model type='cirrus' vram='16384' heads='1' primary='yes'/>
            <alias name='video0'/>
            <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
        </video>
        <memballoon model='none'>
            <alias name='balloon0'/>
        </memballoon>
    </devices>
    <seclabel type='dynamic' model='selinux' relabel='yes'>
        <label>system_u:system_r:svirt_t:s0:c47,c608</label>
        <imagelabel>system_u:object_r:svirt_image_t:s0:c47,c608</imagelabel>
    </seclabel>
    <seclabel type='dynamic' model='dac' relabel='yes'>
        <label>+107:+107</label>
        <imagelabel>+107:+107</imagelabel>
    </seclabel>
</domain>
============================================================================================
 <disk type='block' device='disk' snapshot='no'>
            <driver name='qemu' type='qcow2' cache='none' error_policy='stop' io='native'/>
            <source dev='/home/kvm_autotest_root/images/stg1.qcow2'/>
            <backingStore/>
            <target dev='sdb' bus='scsi'/>
            <alias name='scsi0-0-0-1'/>
            <address type='drive' controller='0' bus='0' target='0' unit='1'/>
        </disk>

Comment 9 CongLi 2020-08-10 06:30:09 UTC
Hi Ademar and John,

This is a customer critical case, but QE could not reproduce this bug in our environment;
could someone from the developer side help check this bug?

Thanks.

Comment 10 CongLi 2020-08-10 06:35:40 UTC
(In reply to nijin ashok from comment #0)
> Description of problem:
> For the customer who reported this issue, the commvault agent VM in RHV is
> getting crashed intermittently. This agent VM is having Windows installed
> where the snapshot disk of all other VMs will get attached and detached
> during the backup process of the VMs. This agent VM crashed twice during the
> backup process with the above error for customer. It looks like this agent
> VM is frequently requesting "REPORT LUNS".

Hi Qing,

Could you please have a try with a Windows VM and attach with a snapshot disk as described?

Thanks.

Comment 11 nijin ashok 2020-08-10 09:58:13 UTC
(In reply to qing.wang from comment #8)
> I failed to reproduce this issue:

I tried again and it failed on the 311th attempt.

====
a=311
Device detached successfully

error: Failed to attach device from test.xml
error: Unable to read from monitor: Connection reset by peer
====

The XML I used for attaching the disk is below.


===
<disk device="disk" snapshot="no" type="file">
    <address bus="0" controller="0" target="0" type="drive" unit="4" />
    <source file="/var/lib/vdsm/transient/106fd6c1-a215-457e-b8c5-a08bb6448f58-0e91c3ce-423a-4448-a554-8b554c8453c6.6h64rC">
        <seclabel model="dac" relabel="no" type="none" />
    </source>
    <target bus="scsi" dev="sdl" />
    <serial>f84aa7c9-5769-4a87-87c5-5a79fd4de42d</serial>
    <driver cache="writethrough" error_policy="stop" io="threads" name="qemu" type="qcow2" />
    <alias name="ua-f84aa7c9-5769-4a87-87c5-5a79fd4de42d" />
</disk>
===


This "106fd6c1-a215-457e-b8c5-a08bb6448f58-0e91c3ce-423a-4448-a554-8b554c8453c6.6h64rC" is a qcow2 image with a backing file (a snapshot overlay).

===
qemu-img info /var/lib/vdsm/transient/106fd6c1-a215-457e-b8c5-a08bb6448f58-0e91c3ce-423a-4448-a554-8b554c8453c6.6h64rC -U
image: /var/lib/vdsm/transient/106fd6c1-a215-457e-b8c5-a08bb6448f58-0e91c3ce-423a-4448-a554-8b554c8453c6.6h64rC
file format: qcow2
virtual size: 1.0G (1073741824 bytes)
disk size: 196K
cluster_size: 65536
backing file: /rhev/data-center/mnt/blockSD/106fd6c1-a215-457e-b8c5-a08bb6448f58/images/f84aa7c9-5769-4a87-87c5-5a79fd4de42d/0e91c3ce-423a-4448-a554-8b554c8453c6
backing file format: raw
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

===

Can you please try with the above changes?

Comment 12 qing.wang 2020-08-10 10:55:02 UTC
Tried on a Windows 2019 guest; did not hit this issue, but I am not sure what the snapshot disk in libvirt should look like:

1.install sg_utils in windows
2.execute run.bat:
:redo
sg_luns c:
goto redo

updated xml for myagent:

<domain type='kvm' id='32'>
    <name>myagent</name>
    <memory unit='KiB'>2097152</memory>
    <currentMemory unit='KiB'>2097152</currentMemory>
    <vcpu placement='static' current='4'>16</vcpu>
    <iothreads>1</iothreads>
    <resource>
        <partition>/machine</partition>
    </resource>
    <sysinfo type='smbios'>
        <system>
            <entry name='manufacturer'>Red Hat</entry>
            <entry name='product'>RHEV Hypervisor</entry>
            <entry name='version'>7.7-10.el7</entry>
            <entry name='serial'>5b6975b0-b76f-401c-8923-24ff9f146c69</entry>
            <entry name='uuid'>4a474e30-dcde-4541-8f2f-5607ae22f563</entry>
        </system>
    </sysinfo>
    <os>
        <type arch='x86_64' machine='pc-i440fx-rhel7.6.0'>hvm</type>
        <smbios mode='sysinfo'/>
    </os>
    <features>
        <acpi/>
    </features>

    <clock offset='variable' adjustment='0' basis='utc'>
        <timer name='rtc' tickpolicy='catchup'/>
        <timer name='pit' tickpolicy='delay'/>
        <timer name='hpet' present='no'/>
    </clock>
    <on_poweroff>destroy</on_poweroff>
    <on_reboot>restart</on_reboot>
    <on_crash>destroy</on_crash>
    <pm>
        <suspend-to-mem enabled='no'/>
        <suspend-to-disk enabled='no'/>
    </pm>
    <devices>
        <emulator>/usr/libexec/qemu-kvm</emulator>
        <input type='tablet' bus='usb'/>
        <disk type='block' device='disk' snapshot='no'>
            <driver name='qemu' type='qcow2' cache='none' error_policy='stop' io='native'/>
            <source dev='/home/kvm_autotest_root/images/win2019-64-virtio-scsi.qcow2'>
                <seclabel model='dac' relabel='no'/>
            </source>
            <backingStore/>
            <target dev='sda' bus='scsi'/>
            <alias name='scsi0-0-0-0'/>
            <address type='drive' controller='0' bus='0' target='0' unit='0'/>
        </disk>


        <disk type='block' device='disk' snapshot='no'>
            <driver name='qemu' type='qcow2' cache='none' error_policy='stop' io='native'/>
            <source dev='/home/kvm_autotest_root/images/stg1.qcow2'>
                <seclabel model='dac' relabel='no'/>
            </source>
            <backingStore/>
            <target dev='sdb' bus='scsi'/>
            <alias name='scsi0-0-0-1'/>
            <address type='drive' controller='0' bus='0' target='0' unit='1'/>
        </disk>
        <disk type='block' device='disk' snapshot='no'>
            <driver name='qemu' type='qcow2' cache='none' error_policy='stop' io='native'/>
            <source dev='/home/kvm_autotest_root/images/stg2.qcow2'>
                <seclabel model='dac' relabel='no'/>
            </source>
            <backingStore/>
            <target dev='sdc' bus='scsi'/>
            <alias name='scsi0-0-0-2'/>
            <address type='drive' controller='0' bus='0' target='0' unit='2'/>
        </disk>
        <disk type='block' device='disk' snapshot='no'>
            <driver name='qemu' type='qcow2' cache='none' error_policy='stop' io='native'/>
            <source dev='/home/kvm_autotest_root/images/stg3.qcow2'>
                <seclabel model='dac' relabel='no'/>
            </source>
            <backingStore/>
            <target dev='sdd' bus='scsi'/>
            <alias name='scsi0-0-0-3'/>
            <address type='drive' controller='0' bus='0' target='0' unit='3'/>
        </disk>
        <disk type='block' device='disk' snapshot='no'>
            <driver name='qemu' type='qcow2' cache='none' error_policy='stop' io='native'/>
            <source dev='/home/kvm_autotest_root/images/stg4.qcow2'>
                <seclabel model='dac' relabel='no'/>
            </source>
            <backingStore/>
            <target dev='sde' bus='scsi'/>
            <alias name='scsi0-0-0-4'/>
            <address type='drive' controller='0' bus='0' target='0' unit='4'/>
        </disk>
        <disk type='block' device='disk' snapshot='no'>
            <driver name='qemu' type='qcow2' cache='none' error_policy='stop' io='native'/>
            <source dev='/home/kvm_autotest_root/images/stg5.qcow2'>
                <seclabel model='dac' relabel='no'/>
            </source>
            <backingStore/>
            <target dev='sdf' bus='scsi'/>
            <alias name='scsi0-0-0-5'/>
            <address type='drive' controller='0' bus='0' target='0' unit='5'/>
        </disk>
        <disk type='block' device='disk' snapshot='no'>
            <driver name='qemu' type='qcow2' cache='none' error_policy='stop' io='native'/>
            <source dev='/home/kvm_autotest_root/images/stg6.qcow2'>
                <seclabel model='dac' relabel='no'/>
            </source>
            <backingStore/>
            <target dev='sdg' bus='scsi'/>
            <alias name='scsi0-0-0-6'/>
            <address type='drive' controller='0' bus='0' target='0' unit='6'/>
        </disk>
        <disk type='block' device='disk' snapshot='no'>
            <driver name='qemu' type='qcow2' cache='none' error_policy='stop' io='native'/>
            <source dev='/home/kvm_autotest_root/images/stg7.qcow2'>
                <seclabel model='dac' relabel='no'/>
            </source>
            <backingStore/>
            <target dev='sdh' bus='scsi'/>
            <alias name='scsi0-0-0-7'/>
            <address type='drive' controller='0' bus='0' target='0' unit='7'/>
        </disk>
        <disk type='block' device='disk' snapshot='no'>
            <driver name='qemu' type='qcow2' cache='none' error_policy='stop' io='native'/>
            <source dev='/home/kvm_autotest_root/images/stg8.qcow2'>
                <seclabel model='dac' relabel='no'/>
            </source>
            <backingStore/>
            <target dev='sdi' bus='scsi'/>
            <alias name='scsi0-0-0-8'/>
            <address type='drive' controller='0' bus='0' target='0' unit='8'/>
        </disk>
        <disk type='block' device='disk' snapshot='no'>
            <driver name='qemu' type='qcow2' cache='none' error_policy='stop' io='native'/>
            <source dev='/home/kvm_autotest_root/images/stg9.qcow2'>
                <seclabel model='dac' relabel='no'/>
            </source>
            <backingStore/>
            <target dev='sdj' bus='scsi'/>
            <alias name='scsi0-0-0-9'/>
            <address type='drive' controller='0' bus='0' target='0' unit='9'/>
        </disk>
        <disk type='block' device='disk' snapshot='no'>
            <driver name='qemu' type='qcow2' cache='none' error_policy='stop' io='native'/>
            <source dev='/home/kvm_autotest_root/images/stg10.qcow2'>
                <seclabel model='dac' relabel='no'/>
            </source>
            <backingStore/>
            <target dev='sdk' bus='scsi'/>
            <alias name='scsi0-0-0-a'/>
            <address type='drive' controller='0' bus='0' target='0' unit='10'/>
        </disk>
        <disk type='block' device='disk' snapshot='no'>
            <driver name='qemu' type='qcow2' cache='none' error_policy='stop' io='native'/>
            <source dev='/home/kvm_autotest_root/images/stg11.qcow2'>
                <seclabel model='dac' relabel='no'/>
            </source>
            <backingStore/>
            <target dev='sdl' bus='scsi'/>
            <alias name='scsi0-0-0-b'/>
            <address type='drive' controller='0' bus='0' target='0' unit='11'/>
        </disk>
        <disk type='file' device='disk'>
            <driver name='qemu' type='qcow2' cache='none' iothread='1'/>
            <source file='/home/kvm_autotest_root/images/stg0.qcow2'>
                <seclabel model='dac' relabel='no'/>
            </source>
            <target dev='sdm' bus='virtio'/>
        </disk>
        <!--        <disk type='file' device='cdrom'>-->
        <!--            <driver name='qemu'/>-->
        <!--            <source file='/var/lib/libvirt/images/winutils.iso'/>-->
        <!--            <target dev='hdb'/>-->
        <!--            <readonly/>-->
        <!--        </disk>-->
        <disk type='file' device='cdrom'>
            <driver name='qemu' type='raw'/>
            <source file='/home/kvm_autotest_root/iso/windows/winutils.iso'/>
            <backingStore/>
            <target dev='hdb' bus='ide'/>
            <readonly/>
            <alias name='ide0-0-0'/>
            <address type='drive' controller='0' bus='0' target='0' unit='0'/>
        </disk>
        <controller type='scsi' index='0' model='virtio-scsi'>
            <alias name='scsi0'/>
            <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
        </controller>

        <controller type='ide' index='0'>
            <alias name='ide'/>
            <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
        </controller>
        <controller type='usb' index='0'>
            <alias name='usb'/>
            <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
        </controller>
        <controller type='pci' index='0' model='pci-root'>
            <alias name='pci.0'/>
        </controller>
        <interface type='bridge'>
            <mac address='52:54:00:17:c7:f6'/>
            <source bridge='switch'/>
            <target dev='vnet19'/>
            <model type='virtio'/>
            <driver name='vhost' queues='4'/>
            <alias name='net0'/>
            <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
        </interface>
        <serial type='pty'>
            <source path='/dev/pts/3'/>
            <target type='isa-serial' port='0'/>
            <alias name='serial0'/>
        </serial>
        <console type='pty' tty='/dev/pts/3'>
            <source path='/dev/pts/3'/>
            <target type='serial' port='0'/>
            <alias name='serial0'/>
        </console>
        <input type='mouse' bus='ps2'>
            <alias name='input0'/>
        </input>
        <input type='keyboard' bus='ps2'>
            <alias name='input1'/>
        </input>
        <graphics type='vnc' port='5901' autoport='yes' listen='0.0.0.0'>
            <listen type='address' address='0.0.0.0'/>
        </graphics>
        <video>
            <model type='cirrus' vram='16384' heads='1' primary='yes'/>
            <alias name='video0'/>
            <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
        </video>
        <memballoon model='none'>
            <alias name='balloon0'/>
        </memballoon>
    </devices>
<!--    <seclabel type='dynamic' model='selinux' relabel='yes'>-->
<!--        <label>system_u:system_r:svirt_t:s0:c47,c608</label>-->
<!--        <imagelabel>system_u:object_r:svirt_image_t:s0:c47,c608</imagelabel>-->
<!--    </seclabel>-->
    <seclabel type='dynamic' model='dac' relabel='yes'>
        <label>+107:+107</label>
        <imagelabel>+107:+107</imagelabel>
    </seclabel>
</domain>


Could you please help verify this bug with "attach/detach snapshot disks of all other VMs"?

Comment 14 John Ferlan 2020-08-10 11:38:59 UTC
qemu-kvm-rhev-2.12.0-33.el7_7.8.x86_64 is RHEL 7.7 zstream release 8 - IOW, it's an older RHEL version.

Although the assert exists in the top of the qemu source tree, I have a feeling this 

fwiw: comment 11 - the xml snippet provided is more related to the agent_vm.xml that's attached via comment 5. I also see that agent_vm.xml uses IOThreads on the scsi-controller while the xml from comment 8 uses IOThreads on just one disk/volume.  Might be interesting to try the test without IOThreads - just to rule that out.  Which address the attach/detach scsi xml uses would also have been interesting to know...

I will have someone look at this from the dev side, but it could be difficult to repro/solve too as there's quite a lot of changes since rhel 7.7.

Comment 15 Han Han 2020-08-11 02:44:19 UTC
Reproduced a SIGSEGV on libvirt-4.5.0-36.el7_9.2.x86_64 with qemu-kvm-rhev-2.12.0-48.el7.x86_64

Steps:
on host:
attach/detach a scsi disk (drive_add & device_add, device_del & drive_del in QMP):
while true;do virsh attach-disk seabios /tmp/scsi sda; virsh detach-disk seabios /tmp/scsi;done

on guest:
while true;do sg_luns /dev/sda;done

Backtrace:
#0  qemu_strnlen (max_len=8, s=0x0) at util/cutils.c:107

(gdb) bt
#0  0x000055a10d8de2a6 in strpadcpy (max_len=8, s=0x0) at util/cutils.c:107
#1  0x000055a10d8de2a6 in strpadcpy (buf=0x55a112324008 "", buf_size=8, str=0x0, pad=32 ' ') at util/cutils.c:37
#2  0x000055a10d77386f in scsi_disk_emulate_command (outbuf=0x55a112324000 "", req=0x55a111791600) at hw/scsi/scsi-disk.c:774
#3  0x000055a10d77386f in scsi_disk_emulate_command (req=0x55a111791600, buf=<optimized out>) at hw/scsi/scsi-disk.c:1903
#4  0x000055a10d776fd2 in scsi_req_enqueue (req=req@entry=0x55a111791600) at hw/scsi/scsi-bus.c:802
#5  0x000055a10d614b59 in virtio_scsi_handle_cmd_vq (s=0x55a112806170, req=<optimized out>) at /usr/src/debug/qemu-2.12.0/hw/scsi/virtio-scsi.c:580
#6  0x000055a10d614b59 in virtio_scsi_handle_cmd_vq (s=s@entry=0x55a112806170, vq=vq@entry=0x55a112812100) at /usr/src/debug/qemu-2.12.0/hw/scsi/virtio-scsi.c:620
#7  0x000055a10d6157ba in virtio_scsi_data_plane_handle_cmd (vdev=<optimized out>, vq=0x55a112812100) at /usr/src/debug/qemu-2.12.0/hw/scsi/virtio-scsi-dataplane.c:60
#8  0x000055a10d8e2148 in aio_dispatch_handlers (ctx=ctx@entry=0x55a110a11900) at util/aio-posix.c:410
#9  0x000055a10d8e2bba in aio_poll (ctx=ctx@entry=0x55a110a11900, blocking=blocking@entry=true) at util/aio-posix.c:707
#10 0x000055a10d8498c5 in blk_prw (blk=blk@entry=0x55a110ae3080, offset=offset@entry=0, buf=buf@entry=0x7ffd9f9b2220 "", bytes=bytes@entry=512, co_entry=co_entry@entry=0x55a10d84ad90 <blk_read_entry>, flags=flags@entry=0)
    at block/block-backend.c:1263
#11 0x000055a10d84aeea in blk_pread_unthrottled (count=512, buf=0x7ffd9f9b2220, offset=0, blk=0x55a110ae3080) at block/block-backend.c:1433
#12 0x000055a10d84aeea in blk_pread_unthrottled (blk=blk@entry=0x55a110ae3080, offset=offset@entry=0, buf=buf@entry=0x7ffd9f9b2220 "", count=count@entry=512) at block/block-backend.c:1280
#13 0x000055a10d6f701f in guess_disk_lchs (blk=blk@entry=0x55a110ae3080, pcylinders=pcylinders@entry=0x7ffd9f9b246c, pheads=pheads@entry=0x7ffd9f9b2470, psectors=psectors@entry=0x7ffd9f9b2474) at hw/block/hd-geometry.c:71
#14 0x000055a10d6f7187 in hd_geometry_guess (blk=0x55a110ae3080, pcyls=pcyls@entry=0x55a112d9e82c, pheads=pheads@entry=0x55a112d9e830, psecs=psecs@entry=0x55a112d9e834, ptrans=ptrans@entry=0x0) at hw/block/hd-geometry.c:136
#15 0x000055a10d6f6d01 in blkconf_geometry (conf=conf@entry=0x55a112d9e810, ptrans=ptrans@entry=0x0, cyls_max=cyls_max@entry=65535, heads_max=heads_max@entry=255, secs_max=secs_max@entry=255, errp=errp@entry=0x7ffd9f9b2570)
    at hw/block/block.c:126
#16 0x000055a10d770a88 in scsi_realize (dev=dev@entry=0x55a112d9e780, errp=errp@entry=0x7ffd9f9b2570) at hw/scsi/scsi-disk.c:2338
#17 0x000055a10d770d70 in scsi_hd_realize (dev=0x55a112d9e780, errp=0x7ffd9f9b2570) at hw/scsi/scsi-disk.c:2403
#18 0x000055a10d778242 in scsi_qdev_realize (errp=0x7ffd9f9b2570, s=0x55a112d9e780) at hw/scsi/scsi-bus.c:54
#19 0x000055a10d778242 in scsi_qdev_realize (qdev=<optimized out>, errp=0x7ffd9f9b25d0) at hw/scsi/scsi-bus.c:204
#20 0x000055a10d70823b in device_set_realized (obj=<optimized out>, value=<optimized out>, errp=0x7ffd9f9b2708) at hw/core/qdev.c:852
#21 0x000055a10d7ffa7e in property_set_bool (obj=0x55a112d9e780, v=<optimized out>, name=<optimized out>, opaque=0x55a111136210, errp=0x7ffd9f9b2708) at qom/object.c:1925
#22 0x000055a10d803aaf in object_property_set_qobject (obj=0x55a112d9e780, value=<optimized out>, name=0x55a10d9b7c2f "realized", errp=0x7ffd9f9b2708) at qom/qom-qobject.c:27
#23 0x000055a10d8016a5 in object_property_set_bool (obj=0x55a112d9e780, value=<optimized out>, name=0x55a10d9b7c2f "realized", errp=0x7ffd9f9b2708) at qom/object.c:1188
#24 0x000055a10d6b10b9 in qdev_device_add (opts=opts@entry=0x55a110ca0d20, errp=errp@entry=0x7ffd9f9b27e0) at qdev-monitor.c:626
#25 0x000055a10d6b1673 in qmp_device_add (qdict=<optimized out>, ret_data=<optimized out>, errp=0x7ffd9f9b2828) at qdev-monitor.c:806
#26 0x000055a10d8d59da in qmp_dispatch (errp=0x7ffd9f9b2820, request=0x7ffd9f9b2820, cmds=<optimized out>) at qapi/qmp-dispatch.c:111
#27 0x000055a10d8d59da in qmp_dispatch (cmds=<optimized out>, request=request@entry=0x55a112e04400) at qapi/qmp-dispatch.c:160
#28 0x000055a10d5cc5a1 in monitor_qmp_dispatch_one (req_obj=<optimized out>) at /usr/src/debug/qemu-2.12.0/monitor.c:4102
#29 0x000055a10d5cc805 in monitor_qmp_bh_dispatcher (data=<optimized out>) at /usr/src/debug/qemu-2.12.0/monitor.c:4160
#30 0x000055a10d8df921 in aio_bh_poll (bh=0x55a110a60240) at util/async.c:90
#31 0x000055a10d8df921 in aio_bh_poll (ctx=ctx@entry=0x55a110a10140) at util/async.c:118
#32 0x000055a10d8e29d0 in aio_dispatch (ctx=0x55a110a10140) at util/aio-posix.c:440
#33 0x000055a10d8df7fe in aio_ctx_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at util/async.c:261
#34 0x00007f942e61a099 in g_main_context_dispatch () at /lib64/libglib-2.0.so.0
#35 0x000055a10d8e1cc7 in main_loop_wait () at util/main-loop.c:215
#36 0x000055a10d8e1cc7 in main_loop_wait (timeout=<optimized out>) at util/main-loop.c:238
#37 0x000055a10d8e1cc7 in main_loop_wait (nonblocking=nonblocking@entry=0) at util/main-loop.c:497
#38 0x000055a10d580647 in main () at vl.c:2050
#39 0x000055a10d580647 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4813

Comment 16 Han Han 2020-08-11 02:45:57 UTC
Created attachment 1711027 [details]
Full threads backtrace of comment15

Comment 17 Han Han 2020-08-11 04:54:51 UTC
For a host with libvirt-6.6.0-1.fc33.x86_64 qemu-5.0.0-5.fc33.x86_64 and a guest with sg3_utils-1.44-5.el8.x86_64 kernel-sg3_utils-1.44-5.el8.x86_64, the bug did not reproduce, but I hit another issue:

After many cycles of these two while loops, stop them and attach the scsi disk to the VM.
Then check it in the VM:
[root@localhost ~]# lsblk /dev/sda
NAME MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda    8:0    0  100M  0 disk 


[root@localhost ~]# lsscsi
[0:0:0:0]    disk    QEMU     QEMU HARDDISK    2.5+  /dev/sda 


[root@localhost ~]# sg_luns /dev/sda
open error: /dev/sda: No such device or address


And I found kernel call trace messages in dmesg:
[  739.133642] INFO: task kworker/0:6:1362 blocked for more than 120 seconds.
[  739.142061]       Tainted: G                 ---------r-  - 4.18.0-221.el8.x86_64 #1
[  739.151685] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  739.160969] kworker/0:6     D    0  1362      2 0x80004080
[  739.167190] Workqueue: events_freezable virtscsi_handle_event [virtio_scsi]
[  739.175050] Call Trace:
[  739.178625]  __schedule+0x27b/0x690
[  739.183123]  schedule+0x40/0xb0
[  739.188373]  schedule_preempt_disabled+0xa/0x10
[  739.194260]  __mutex_lock.isra.5+0x2d0/0x4a0
[  739.199543]  __scsi_add_device+0xaa/0x130
[  739.204262]  scsi_add_device+0xd/0x30
[  739.208648]  virtscsi_handle_event+0x222/0x280 [virtio_scsi]
[  739.215268]  process_one_work+0x1a7/0x360
[  739.220073]  worker_thread+0x30/0x390
[  739.224849]  ? create_worker+0x1a0/0x1a0
[  739.229595]  kthread+0x112/0x130
[  739.233764]  ? kthread_flush_work_fn+0x10/0x10
[  739.238922]  ret_from_fork+0x35/0x40

Comment 18 qing.wang 2020-08-11 05:39:24 UTC
I did not reproduce this issue.
qemu-img create -f qcow2 -b /home/kvm_autotest_root/images/stg1.qcow2 /home/kvm_autotest_root/images/stg1-1.qcow2 1G

Attach stg1-1 as a data disk in the guest:

<disk type='block' device='disk' snapshot='no'>
            <driver name='qemu' type='qcow2' cache='none' error_policy='stop' io='native'/>
            <source dev='/home/kvm_autotest_root/images/win2019-64-virtio-scsi.qcow2'>
                <seclabel model='dac' relabel='no'/>
            </source>
            <backingStore/>
            <target dev='sda' bus='scsi'/>
            <alias name='scsi0-0-0-0'/>
            <address type='drive' controller='0' bus='0' target='0' unit='0'/>
        </disk>


        <disk type='block' device='disk' snapshot='no'>
            <driver name='qemu' type='qcow2' cache='none' error_policy='stop' io='native'/>
            <source dev='/home/kvm_autotest_root/images/stg1-1.qcow2'>
                <seclabel model='dac' relabel='no'/>
            </source>
            <backingStore/>
            <target dev='sdb' bus='scsi'/>
            <alias name='scsi0-0-0-1'/>
            <address type='drive' controller='0' bus='0' target='0' unit='1'/>
        </disk>

===================================
disk.xml 

 <disk type='block' device='disk' snapshot='no'>
            <driver name='qemu' type='qcow2' cache='none' error_policy='stop' io='native'/>
            <source dev='/home/kvm_autotest_root/images/stg1-1.qcow2'/>
            <backingStore/>
            <target dev='sdb' bus='scsi'/>
            <alias name='scsi0-0-0-1'/>
            <address type='drive' controller='0' bus='0' target='0' unit='1'/>
        </disk>


qemu-img info /home/kvm_autotest_root/images/stg1-1.qcow2 -U
image: /home/kvm_autotest_root/images/stg1-1.qcow2
file format: qcow2
virtual size: 1.0G (1073741824 bytes)
disk size: 196K
cluster_size: 65536
backing file: /home/kvm_autotest_root/images/stg1.qcow2
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false



Hi nijin, I cannot find the disk configuration in your agent_vm.xml. Could you please share your agent_vm.xml and disk.xml? What is your guest? Could you reproduce this with my agent_vm configuration?

Does the attached/detached disk have any relationship with the system image?


<disk device="disk" snapshot="no" type="file">
>     <address bus="0" controller="0" target="0" type="drive" unit="4" />
>     <source
> file="/var/lib/vdsm/transient/106fd6c1-a215-457e-b8c5-a08bb6448f58-0e91c3ce-
> 423a-4448-a554-8b554c8453c6.6h64rC">
>         <seclabel model="dac" relabel="no" type="none" />
>     </source>
>     <target bus="scsi" dev="sdl" />
>     <serial>f84aa7c9-5769-4a87-87c5-5a79fd4de42d</serial>
>     <driver cache="writethrough" error_policy="stop" io="threads"
> name="qemu" type="qcow2" />
>     <alias name="ua-f84aa7c9-5769-4a87-87c5-5a79fd4de42d" />
> </disk>

Comment 19 John Ferlan 2020-08-11 11:07:56 UTC
w/r/t: Comment 15, that's what will be in RHEL 7.9 and looks fairly similar to bug 1812399

w/r/t: Comment 17, the libvirt is close to upstream, but the qemu is behind a bit - still perhaps related to the same bug though.  There were a number of adjustments made to upstream qemu later in the 5.1 cycle to the device realize/unrealize paths. 

Based on the above, Maxim is aware of this bz and believes it's a similar root cause. He's in the process of setting up an environment to reproduce it and see if he can backport/test his current upstream series. Could be quite challenging - it's a tricky and long-standing problem related to device realization processing.

Comment 20 qing.wang 2020-08-12 01:53:58 UTC
(In reply to John Ferlan from comment #19)
> w/r/t: Comment 15, that's what will be in RHEL 7.9 and looks fairly similar
> to bug 1812399
> 
> w/r/t: Comment 17, the libvirt is close to upstream, but the qemu is behind
> a bit - still perhaps related to the same bug though.  There were a number
> of adjustments made to upstream qemu later in the 5.1 cycle to the device
> realize/unrealize paths. 
> 
> Based on the above Maxim is aware of this bz and believes it's a similar
> root cause. He's in the process of setting up an environment to reproduce
> and see if he can backport/test his current upstream series. Could be quite
> challenging - it's a tricky and long standing problem related to device
> realization processing.

(In reply to qing.wang from comment #18)
> I did not reproduce this issue 
> qemu-img create -f qcow2 -b /home/kvm_autotest_root/images/stg1.qcow2
> /home/kvm_autotest_root/images/stg1-1.qcow2 1G
> 
> attach stg1-1 as data disk in guest
> 
> <disk type='block' device='disk' snapshot='no'>
>             <driver name='qemu' type='qcow2' cache='none'
> error_policy='stop' io='native'/>
>             <source
> dev='/home/kvm_autotest_root/images/win2019-64-virtio-scsi.qcow2'>
>                 <seclabel model='dac' relabel='no'/>
>             </source>
>             <backingStore/>
>             <target dev='sda' bus='scsi'/>
>             <alias name='scsi0-0-0-0'/>
>             <address type='drive' controller='0' bus='0' target='0'
> unit='0'/>
>         </disk>
> 
> 
>         <disk type='block' device='disk' snapshot='no'>
>             <driver name='qemu' type='qcow2' cache='none'
> error_policy='stop' io='native'/>
>             <source dev='/home/kvm_autotest_root/images/stg1-1.qcow2'>
>                 <seclabel model='dac' relabel='no'/>
>             </source>
>             <backingStore/>
>             <target dev='sdb' bus='scsi'/>
>             <alias name='scsi0-0-0-1'/>
>             <address type='drive' controller='0' bus='0' target='0'
> unit='1'/>
>         </disk>
> 
> ===================================
> disk.xml 
> 
>  <disk type='block' device='disk' snapshot='no'>
>             <driver name='qemu' type='qcow2' cache='none'
> error_policy='stop' io='native'/>
>             <source dev='/home/kvm_autotest_root/images/stg1-1.qcow2'/>
>             <backingStore/>
>             <target dev='sdb' bus='scsi'/>
>             <alias name='scsi0-0-0-1'/>
>             <address type='drive' controller='0' bus='0' target='0'
> unit='1'/>
>         </disk>
> 
> 
> qemu-img info /home/kvm_autotest_root/images/stg1-1.qcow2 -U
> image: /home/kvm_autotest_root/images/stg1-1.qcow2
> file format: qcow2
> virtual size: 1.0G (1073741824 bytes)
> disk size: 196K
> cluster_size: 65536
> backing file: /home/kvm_autotest_root/images/stg1.qcow2
> Format specific information:
>     compat: 1.1
>     lazy refcounts: false
>     refcount bits: 16
>     corrupt: false
> 
> 
> 
> Hi nijin, i can not found the disk configuration in your agent_vm.xml, could
> you please share your agent_vm.xml and disk.xml ?what is your guest ? could
> you reproduce this with my agent_vm configuration ?
> 
> The attached/detached disk have any relationship with system image?
> 
> 
> <disk device="disk" snapshot="no" type="file">
> >     <address bus="0" controller="0" target="0" type="drive" unit="4" />
> >     <source
> > file="/var/lib/vdsm/transient/106fd6c1-a215-457e-b8c5-a08bb6448f58-0e91c3ce-
> > 423a-4448-a554-8b554c8453c6.6h64rC">
> >         <seclabel model="dac" relabel="no" type="none" />
> >     </source>
> >     <target bus="scsi" dev="sdl" />
> >     <serial>f84aa7c9-5769-4a87-87c5-5a79fd4de42d</serial>
> >     <driver cache="writethrough" error_policy="stop" io="threads"
> > name="qemu" type="qcow2" />
> >     <alias name="ua-f84aa7c9-5769-4a87-87c5-5a79fd4de42d" />
> > </disk>

Yes, I mean this disk does not exist in your agent_vm.xml.

Comment 21 John Ferlan 2020-08-13 19:20:41 UTC
Maxim - since you've been working upstream in this area, I figure you could at least own this bug.

The issue here is timing related between two loops in the code on the host where there is no guarantee that the list of devices doesn't change. If device realization happens asynchronously, then calculations are off and the assert occurs. The INQUIRY loop on the guest keeps it busy so that can throw off device realization timing. Maxim can reproduce the same issue with his upstream patches using a high guest IO load. Using IOThreads for the scsi device also plays a role in the timing.

Generating a RHEL7 downstream solution may not be possible as there's been a great deal of change in the device realization code in upstream qemu-5.1 compared to the qemu-2.12 based code that is the basis for RHEL7. Once/If an upstream patch can resolve the issue, we will first clone this bz to backport the changes RHEL 8/AV and then determine what is possible for RHEL7.

Comment 22 Maxim Levitsky 2020-08-31 15:29:51 UTC
I was able to reproduce this upstream, though it usually fails in various other ways,
due to a race between iothread and main thread.

qemu-system-x86_64: ../src/hw/scsi/scsi-bus.c:433: scsi_target_emulate_report_luns: Assertion `i == n + 8' failed.

I posted a patch as part of my broader patch series to fix scsi device removal races.

https://lists.gnu.org/archive/html/qemu-devel/2020-08/msg08151.html

Best regards,
     Maxim Levitsky

Comment 23 John Ferlan 2020-09-14 20:12:45 UTC
FYI: Update to previous patch found at: https://lists.nongnu.org/archive/html/qemu-devel/2020-09/msg04595.html

Comment 24 John Ferlan 2020-09-17 12:56:36 UTC
Looks like the upstream patch series will be merged soon - so I need to figure out what the expectation from the customer is.

Do they want a RHEL7 patch, or will fixing this for RHEL-AV be "good enough", with removing IOThreads on RHEL7 as the "workaround" to avoid the issue?

If RHEL-AV is good enough, then I will move this bz to RHEL-AV with the plan to get the patch in for the next/8.3.0 release. If a RHEL7 solution is desired, then I would just clone this bz for RHEL-AV.

Comment 28 CongLi 2020-09-27 00:50:13 UTC
(In reply to Maxim Levitsky from comment #22)
> I was able to reproduce this upstream, though it usually fails in various
> other ways,
> due to a race between iothread and main thread.
> 
> qemu-system-x86_64: ../src/hw/scsi/scsi-bus.c:433:
> scsi_target_emulate_report_luns: Assertion `i == n + 8' failed.
> 
> I posted a patch as part of my broader patch series to fix scsi device
> removal races.
> 
> https://lists.gnu.org/archive/html/qemu-devel/2020-08/msg08151.html
> 
> Best regards,
>      Maxim Levitsky

Hi Maxim,

Could you please provide the reproducer for QE?

Thanks.

Comment 32 Maxim Levitsky 2020-10-01 15:16:25 UTC
So these are roughly the steps that I use to reproduce this bug:

1. Add a dummy scsi controller with an iothread and a dummy blockdev to the qemu command line
   (an iothread is a must for this bug to reproduce)

# iothread for the virtio-scsi controller
-object iothread,id=iothread1

# the virtio-scsi controller
-device virtio-scsi,id=scsi-test,iothread=iothread1

# dummy blockdev that we will use - you can use any file for it
-blockdev node-name=test_disk,driver=file,filename=./tests/image.raw

# LUN0 scsi disk that we will keep to avoid problems - I recently fixed one
# in the guest kernel, and that fix might not be included yet
-device scsi-hd,drive=test_disk,bus=scsi-test.0,bootindex=-1,id=scsi_disk,channel=0,scsi-id=0,lun=0


2. Start the guest, note which /dev/sd* device the scsi test device belongs to 
(in my case it was /dev/sdb) 

3. Run 'sg_luns /dev/sdb' just in case, to see that the device is present


4. Run the following test in it (as root).
This starts 32 background loops that each hammer LUN0
with REPORT LUNS; you can tweak the number of loops
depending on how many CPUs you assign to the VM.


#! /bin/bash

trap 'kill $(jobs -p)' EXIT SIGINT

for i in `seq 0 32` ; do
	while true ; do
		sg_luns /dev/sdb > /dev/null 2>&1
	done &
done

wait



5. Now you need to add a bunch of scsi disks based on the 'test_disk' blockdev to the scsi-test bus.

I add about 40 disks, then wait a bit for the guest to add them, then remove them and wait again.
It is, sadly, important to wait, since otherwise the guest eventually gets confused. I'll
investigate why this happens soon.


This is the script I use with my own home-brewed 'vmspawn' qemu wrapper, but you can do something
similar with virsh, a QMP socket, or the like:


NUM_LUNS=40

add_devices()
{
    cmd_array=()

    for i in $(seq 1 $NUM_LUNS) ;  do
        cmd_array+=("device_add scsi-hd,drive=test_disk,bus=scsi-test.0,bootindex=-1,id=scsi_disk$i,channel=0,scsi-id=0,lun=$i,share-rw")
    done
    vm adm hmp "${cmd_array[@]}"
}

remove_devices()
{
    cmd_array=()

    for i in $(seq 1 $NUM_LUNS) ; do
        cmd_array+=("device_del scsi_disk$i")
    done
    vm adm hmp "${cmd_array[@]}"
}


while true ; do
    echo "adding devices"
    add_devices
    sleep 3
    echo "removing devices"
    remove_devices
    sleep 3
done


Now this doesn't always reproduce, and the VM often crashes in other ways,
since the race between removal and the iothread can cause a crash in many places,
but sometimes the assertion does fire, as it did a few minutes ago:

qemu-system-x86_64: ../src/hw/scsi/scsi-bus.c:433: scsi_target_emulate_report_luns: Assertion `i == n + 8' failed.

I hope that helps.

Comment 34 CongLi 2020-10-04 10:38:00 UTC
(In reply to Maxim Levitsky from comment #32)
> So these are roughly the steps that I use to reproduce this bug
> 
> 1. Add a dummy scsi controller with iothread and a dummy blockdev to the qemu
>    (iothread is a must for this bug to reproduce)
> 
> # iothread for the virtio-scsi controller
> -object iothread,id=iothread1
> 
> # the virtio-scsi controller
> -device virtio-scsi,id=scsi-test,iothread=iothread1
> 
> # dummy blockdev that we will use - you can use any file for it
> -blockdev node-name=test_disk,driver=file,filename=./tests/image.raw
> 
> # LUN0 scsi disk that we will keep to avoid problems - I fixed one recently,
> # in the guest kernel, fix for which might not be yet included
> -device
> scsi-hd,drive=test_disk,bus=scsi-test.0,bootindex=-1,id=scsi_disk,channel=0,
> scsi-id=0,lun=0
> 
> 
> 2. Start the guest, note which /dev/sd* device the scsi test device belongs
> to 
> (in my case it was /dev/sdb) 
> 
> 3. Run 'sg_luns /dev/sdb' just in case to see that device is present
> 
> 
> 4. Run the following test in it (as root):
> (this test starts 32 threads that each hammer the LUN0
> with REPORT LUNS) you can tweak the number of threads,
> depending on how many threads you assign to the VM.
> 
> 
> #! /bin/bash
> 
> trap 'kill $(jobs -p)' EXIT SIGINT
> 
> for i in `seq 0 32` ; do
> 	while true ; do
> 		sg_luns /dev/sdb > /dev/null 2>&1
> 	done &
> done
> 
> wait
> 
> 
> 
> 5. Now you need to add a bunch of scsi disks based on 'test_disk' blockdev
> to scsi-test bus
> 
> I add like 40 disks, then wait a bit for the guest to add them, then remove
> them and wait again.
> It is sadly important to wait since otherwise the guest eventually gets
> confused. I'll investigate
> this soon why this happens.
> 
> 
> This is the script I use with my own home brewn 'vmspawn' qemu wrapper, but
> you can use something
> similiar with virsh or qmp socket or something:
> 
> 
> NUM_LUNS=40
> 
> add_devices()
> {
>     cmd_array=()
> 
>     for i in $(seq 1 $NUM_LUNS) ;  do
>         cmd_array+=("device_add
> scsi-hd,drive=test_disk,bus=scsi-test.0,bootindex=-1,id=scsi_disk$i,
> channel=0,scsi-id=0,lun=$i,share-rw")
>     done
>     vm adm hmp "${cmd_array[@]}"
> }
> 
> remove_devices()
> {
>     cmd_array=()
> 
>     for i in $(seq 1 $NUM_LUNS) ; do
>         cmd_array+=("device_del scsi_disk$i")
>     done
>     vm adm hmp "${cmd_array[@]}"
> }
> 
> 
> while true ; do
>     echo "adding devices"
>     add_devices
>     sleep 3
>     echo "removing devices"
>     remove_devices
>     sleep 3
> done
> 
> 
> Now this doesn't always reproduce, and often VM crashes differently,
> since the race between removal and iothread can cause a crash in many places,
> but sometimes it does happen, as it happened few minutes ago
> 
> qemu-system-x86_64: ../src/hw/scsi/scsi-bus.c:433:
> scsi_target_emulate_report_luns: Assertion `i == n + 8' failed.
> 
> I hope that helps.

Hi Qing,

Can you please reproduce this bug again?

Thanks.

Comment 35 qing.wang 2020-10-20 09:44:02 UTC
I can reproduce this issue on 

4.18.0-234.el8.x86_64
qemu-kvm-common-5.1.0-6.module+el8.3.0+8041+42ff16b8.x86_64
seabios-1.14.0-1.module+el8.3.0+7638+07cf13d2.x86_64
edk2-ovmf-20200602gitca407c7246bf-3.el8.noarch



#0  0x00007f873e3767ff in raise () at /lib64/libc.so.6
#1  0x00007f873e360c35 in abort () at /lib64/libc.so.6
#2  0x00007f873e360b09 in _nl_load_domain.cold.0 () at /lib64/libc.so.6
#3  0x00007f873e36ede6 in .annobin_assert.c_end () at /lib64/libc.so.6
#4  0x0000556b232c2d08 in scsi_target_emulate_report_luns (r=0x7f87280049a0)
    at /usr/src/debug/qemu-kvm-5.1.0-6.module+el8.3.0+8041+42ff16b8.x86_64/hw/scsi/scsi-bus.c:433
#5  0x0000556b232c2d08 in scsi_target_send_command
    (req=0x7f87280049a0, buf=<optimized out>)
    at /usr/src/debug/qemu-kvm-5.1.0-6.module+el8.3.0+8041+42ff16b8.x86_64/hw/scsi/scsi-bus.c:517
#6  0x0000556b232c1e86 in scsi_req_enqueue (req=req@entry=0x7f87280049a0)
    at /usr/src/debug/qemu-kvm-5.1.0-6.module+el8.3.0+8041+42ff16b8.x86_64/hw/scsi/scsi-bus.c:1291
#7  0x0000556b23147f1a in virtio_scsi_handle_cmd_req_submit
    (s=0x556b2674d950, req=<optimized out>)
    at /usr/src/debug/qemu-kvm-5.1.0-6.module+el8.3.0+8041+42ff16b8.x86_64/hw/scsi/virtio-scsi.c:634
#8  0x0000556b23147f1a in virtio_scsi_handle_cmd_vq
    (s=s@entry=0x556b2674d950, vq=vq@entry=0x7f87340db140)
    at /usr/src/debug/qemu-kvm-5.1.0-6.module+el8.3.0+8041+42ff16b8.x86_64/hw/scsi/virtio-scsi.c:634
#9  0x0000556b23148cde in virtio_scsi_data_plane_handle_cmd
    (vdev=<optimized out>, vq=0x7f87340db140)
    at /usr/src/debug/qemu-kvm-5.1.0-6.module+el8.3.0+8041+42ff16b8.x86_64/hw/scsi/virtio-scsi-dataplane.c:60
#10 0x0000556b2315686e in virtio_queue_notify_aio_vq (vq=<optimized out>)
    at /usr/src/debug/qemu-kvm-5.1.0-6.module+el8.3.0+8041+42ff16b8.x86_64/hw/virtio/virtio.c:2325
#11 0x0000556b2342d577 in run_poll_handlers_once
    (timeout=<synthetic pointer>, now=2335880720813188, ctx=0x556b252928a0)
    at /usr/src/debug/qemu-kvm-5.1.0-6.module+el8.3.0+8041+42ff16b8.x86_64/util/aio-posix.c:396
#12 0x0000556b2342d577 in run_poll_handlers
    (timeout=<synthetic pointer>, max_ns=4000, ctx=0x556b252928a0)
    at /usr/src/debug/qemu-kvm-5.1.0-6.module+el8.3.0+8041+42ff16b8.x86_64/util/aio-posix.c:496
#13 0x0000556b2342d577 in try_poll_mode
    (timeout=<synthetic pointer>, ctx=0x556b252928a0)
    at /usr/src/debug/qemu-kvm-5.1.0-6.module+el8.3.0+8041+42ff16b8.x86_64/util/aio-posix.c:541
#14 0x0000556b2342d577 in aio_poll
    (ctx=0x556b252928a0, blocking=blocking@entry=true)
    at /usr/src/debug/qemu-kvm-5.1.0-6.module+el8.3.0+8041+42ff16b8.x86_64/util/aio-posix.c:593
#15 0x0000556b232096d4 in iothread_run (opaque=0x556b251e2f00)
    at /usr/src/debug/qemu-kvm-5.1.0-6.module+el8.3.0+8041+42ff16b8.x86_64/iothread.c:75
#16 0x0000556b2342fb34 in qemu_thread_start (args=0x556b2528ffe0)
    at /usr/src/debug/qemu-kvm-5.1.0-6.module+el8.3.0+8041+42ff16b8.x86_64/util/qemu-thread-posix.c:521
#17 0x00007f873e70a14a in start_thread () at /lib64/libpthread.so.0
#18 0x00007f873e43bf23 in clone () at /lib64/libc.so.6


My test steps:

1. Create images:
qemu-img create -f qcow2 stg0.qcow2 1G
...
qemu-img create -f qcow2 stg40.qcow2 1G

2. Boot the VM:
/usr/libexec/qemu-kvm \
    -name 'avocado-vt-vm1'  \
    -sandbox on  \
    -machine pc \
    -device pcie-root-port,id=pcie-root-port-0,multifunction=on,bus=pci.0,addr=0x2,chassis=1 \
    -device pcie-pci-bridge,id=pcie-pci-bridge-0,addr=0x0,bus=pcie-root-port-0  \
    -nodefaults \
    -device VGA,bus=pci.0,addr=0x3 \
    -m 2048  \
    -smp 12,maxcpus=12,cores=6,threads=1,sockets=2  \
    -device pcie-root-port,id=pcie-root-port-1,bus=pci.0,chassis=2 \
    -device pcie-root-port,id=pcie-root-port-2,port=0x2,bus=pci.0,chassis=3 \
    -device qemu-xhci,id=usb1,bus=pcie-root-port-1,addr=0x0 \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
    -object iothread,id=iothread1 \
    -device virtio-scsi,id=scsi0 \
    -device virtio-scsi,id=scsi1,iothread=iothread1 \
    -drive id=drive_image1,if=none,snapshot=off,aio=threads,cache=none,format=qcow2,file=/home/kvm_autotest_root/images/rhel820-64-virtio-scsi.qcow2 \
    -device scsi-hd,id=image1,drive=drive_image1,bootindex=0,bus=scsi0.0 \
    \
    -blockdev node-name=test_disk0,driver=file,filename=/home/kvm_autotest_root/images/stg0.qcow2 \
    -device scsi-hd,drive=test_disk0,bus=scsi1.0,bootindex=-1,id=scsi_disk0,channel=0,scsi-id=0,lun=0,share-rw \
    -blockdev node-name=test_disk1,driver=file,filename=/home/kvm_autotest_root/images/stg1.qcow2 \
    -blockdev node-name=test_disk2,driver=file,filename=/home/kvm_autotest_root/images/stg2.qcow2 \
    -blockdev node-name=test_disk3,driver=file,filename=/home/kvm_autotest_root/images/stg3.qcow2 \
    -blockdev node-name=test_disk4,driver=file,filename=/home/kvm_autotest_root/images/stg4.qcow2 \
    -blockdev node-name=test_disk5,driver=file,filename=/home/kvm_autotest_root/images/stg5.qcow2 \
    -blockdev node-name=test_disk6,driver=file,filename=/home/kvm_autotest_root/images/stg6.qcow2 \
    -blockdev node-name=test_disk7,driver=file,filename=/home/kvm_autotest_root/images/stg7.qcow2 \
    -blockdev node-name=test_disk8,driver=file,filename=/home/kvm_autotest_root/images/stg8.qcow2 \
    -blockdev node-name=test_disk9,driver=file,filename=/home/kvm_autotest_root/images/stg9.qcow2 \
    -blockdev node-name=test_disk10,driver=file,filename=/home/kvm_autotest_root/images/stg10.qcow2 \
    -blockdev node-name=test_disk11,driver=file,filename=/home/kvm_autotest_root/images/stg11.qcow2 \
    -blockdev node-name=test_disk12,driver=file,filename=/home/kvm_autotest_root/images/stg12.qcow2 \
    -blockdev node-name=test_disk13,driver=file,filename=/home/kvm_autotest_root/images/stg13.qcow2 \
    -blockdev node-name=test_disk14,driver=file,filename=/home/kvm_autotest_root/images/stg14.qcow2 \
    -blockdev node-name=test_disk15,driver=file,filename=/home/kvm_autotest_root/images/stg15.qcow2 \
    -blockdev node-name=test_disk16,driver=file,filename=/home/kvm_autotest_root/images/stg16.qcow2 \
    -blockdev node-name=test_disk17,driver=file,filename=/home/kvm_autotest_root/images/stg17.qcow2 \
    -blockdev node-name=test_disk18,driver=file,filename=/home/kvm_autotest_root/images/stg18.qcow2 \
    -blockdev node-name=test_disk19,driver=file,filename=/home/kvm_autotest_root/images/stg19.qcow2 \
    -blockdev node-name=test_disk20,driver=file,filename=/home/kvm_autotest_root/images/stg20.qcow2 \
    -blockdev node-name=test_disk21,driver=file,filename=/home/kvm_autotest_root/images/stg21.qcow2 \
    -blockdev node-name=test_disk22,driver=file,filename=/home/kvm_autotest_root/images/stg22.qcow2 \
    -blockdev node-name=test_disk23,driver=file,filename=/home/kvm_autotest_root/images/stg23.qcow2 \
    -blockdev node-name=test_disk24,driver=file,filename=/home/kvm_autotest_root/images/stg24.qcow2 \
    -blockdev node-name=test_disk25,driver=file,filename=/home/kvm_autotest_root/images/stg25.qcow2 \
    -blockdev node-name=test_disk26,driver=file,filename=/home/kvm_autotest_root/images/stg26.qcow2 \
    -blockdev node-name=test_disk27,driver=file,filename=/home/kvm_autotest_root/images/stg27.qcow2 \
    -blockdev node-name=test_disk28,driver=file,filename=/home/kvm_autotest_root/images/stg28.qcow2 \
    -blockdev node-name=test_disk29,driver=file,filename=/home/kvm_autotest_root/images/stg29.qcow2 \
    -blockdev node-name=test_disk30,driver=file,filename=/home/kvm_autotest_root/images/stg30.qcow2 \
    -blockdev node-name=test_disk31,driver=file,filename=/home/kvm_autotest_root/images/stg31.qcow2 \
    -blockdev node-name=test_disk32,driver=file,filename=/home/kvm_autotest_root/images/stg32.qcow2 \
    -blockdev node-name=test_disk33,driver=file,filename=/home/kvm_autotest_root/images/stg33.qcow2 \
    -blockdev node-name=test_disk34,driver=file,filename=/home/kvm_autotest_root/images/stg34.qcow2 \
    -blockdev node-name=test_disk35,driver=file,filename=/home/kvm_autotest_root/images/stg35.qcow2 \
    -blockdev node-name=test_disk36,driver=file,filename=/home/kvm_autotest_root/images/stg36.qcow2 \
    -blockdev node-name=test_disk37,driver=file,filename=/home/kvm_autotest_root/images/stg37.qcow2 \
    -blockdev node-name=test_disk38,driver=file,filename=/home/kvm_autotest_root/images/stg38.qcow2 \
    -blockdev node-name=test_disk39,driver=file,filename=/home/kvm_autotest_root/images/stg39.qcow2 \
    -blockdev node-name=test_disk40,driver=file,filename=/home/kvm_autotest_root/images/stg40.qcow2 \
    \
    -device pcie-root-port,id=pcie-root-port-3,port=0x3,bus=pci.0,chassis=4 \
    -device virtio-net-pci,mac=9a:21:f7:4a:1e:bd,id=idRuZxfv,netdev=idOpPVAe,bus=pcie-root-port-3,addr=0x0  \
    -netdev tap,id=idOpPVAe,vhost=on  \
    -rtc base=localtime,clock=host,driftfix=slew  \
    -boot menu=off,order=cdn,once=c,strict=off \
    -enable-kvm \
    -vnc :5  \
    -device pcie-root-port,id=pcie_extra_root_port_0,bus=pci.0 \
    -monitor stdio \
    -chardev file,id=qmp_id_qmpmonitor1,path=/var/tmp/monitor-qmpdbg.log,server,nowait \
    -mon chardev=qmp_id_qmpmonitor1,mode=control  \
    -qmp tcp:0:5955,server,nowait  \
    -chardev file,path=/var/tmp/monitor-serialdbg.log,id=serial_id_serial0 \
    -device isa-serial,chardev=serial_id_serial0  \


3. Run sg_luns in the guest:
root@vm-198-17 ~ # cat run.sh
trap 'kill $(jobs -p)' EXIT SIGINT

for i in `seq 0 32` ; do
	while true ; do
		sg_luns /dev/sdb > /dev/null 2>&1
	done &
done
wait

4. Repeatedly hot plug/unplug disks on the same bus:
root@dell-per440-07 /home/rworkdir/vbugs/bug/1866707 # cat run.sh 
NUM_LUNS=40
add_devices() {
  exec 3<>/dev/tcp/localhost/5955
  echo "$@"
  echo -e "{'execute':'qmp_capabilities'}" >&3
  read response <&3
  echo $response
  for i in $(seq 1 $NUM_LUNS) ; do
  cmd="{'execute':'device_add', 'arguments': {'driver':'scsi-hd','drive':'test_disk$i','id':'scsi_disk$i','bus':'scsi1.0','lun':$i}}"
  echo "$cmd"
  echo -e "$cmd" >&3
  read response <&3
  echo "$response"
  done
}

remove_devices() {
  exec 3<>/dev/tcp/localhost/5955
  echo "$@"
  echo -e "{'execute':'qmp_capabilities'}" >&3
  read response <&3
  echo $response
  for i in $(seq 1 $NUM_LUNS) ; do
  cmd="{'execute':'device_del', 'arguments': {'id':'scsi_disk$i'}}"
  echo "$cmd"
  echo -e "$cmd" >&3
  read response <&3
  echo "$response"
  done
}


while true ; do
    echo "adding devices"
    add_devices
    sleep 3
    echo "removing devices"
    remove_devices
    sleep 3
done

5. Keep running until qemu crashes.

Comment 37 John Ferlan 2020-11-11 15:28:10 UTC
Can we get the qa_ack+ please?

Comment 42 qing.wang 2020-11-20 06:09:17 UTC
Verified on Red Hat Enterprise Linux release 8.3 (Ootpa)
4.18.0-240.4.1.el8_3.x86_64
qemu-kvm-common-5.1.0-15.module+el8.3.1+8772+a3fdeccd.x86_64

Scenario 1: refer to bug 1812399 comment 0
1. Boot the VM:
virsh define pc.xml;virsh start pc

2.hotplug-unplug disk repeatly
while true;do virsh attach-device pc disk.xml; virsh detach-device pc disk.xml;done

Ran for over 1 hour; no crash observed.

Scenario 2: 

1. Create the image files (stg0 through stg40)
qemu-img create -f qcow2 /home/kvm_autotest_root/images/stg0.qcow2 1G
...
qemu-img create -f qcow2 /home/kvm_autotest_root/images/stg40.qcow2 1G
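The elided commands above follow an obvious pattern; a loop like the following prints each qemu-img command (drop the echo to actually run them):

```shell
# Print the qemu-img create commands for stg0..stg40
# (remove "echo" to execute them for real).
for i in $(seq 0 40); do
  echo "qemu-img create -f qcow2 /home/kvm_autotest_root/images/stg$i.qcow2 1G"
done
```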

2. Boot the VM
/usr/libexec/qemu-kvm \
    -name 'avocado-vt-vm1'  \
    -sandbox on  \
    -machine pc \
    -device pcie-root-port,id=pcie-root-port-0,multifunction=on,bus=pci.0,addr=0x2,chassis=1 \
    -device pcie-pci-bridge,id=pcie-pci-bridge-0,addr=0x0,bus=pcie-root-port-0  \
    -nodefaults \
    -device VGA,bus=pci.0,addr=0x3 \
    -m 2048  \
    -smp 12,maxcpus=12,cores=6,threads=1,sockets=2  \
    -device pcie-root-port,id=pcie-root-port-1,bus=pci.0,chassis=2 \
    -device pcie-root-port,id=pcie-root-port-2,port=0x2,bus=pci.0,chassis=3 \
    -device qemu-xhci,id=usb1,bus=pcie-root-port-1,addr=0x0 \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
    -object iothread,id=iothread1 \
    -device virtio-scsi,id=scsi0 \
    -device virtio-scsi,id=scsi1,iothread=iothread1 \
    -drive id=drive_image1,if=none,snapshot=off,aio=threads,cache=none,format=qcow2,file=/home/kvm_autotest_root/images/rhel831-64-virtio-scsi.qcow2 \
    -device scsi-hd,id=image1,drive=drive_image1,bootindex=0,bus=scsi0.0 \
    \
    -blockdev node-name=test_disk0,driver=file,filename=/home/kvm_autotest_root/images/stg0.qcow2 \
    -device scsi-hd,drive=test_disk0,bus=scsi1.0,bootindex=-1,id=scsi_disk0,channel=0,scsi-id=0,lun=0,share-rw \
    -blockdev node-name=test_disk1,driver=file,filename=/home/kvm_autotest_root/images/stg1.qcow2 \
    -blockdev node-name=test_disk2,driver=file,filename=/home/kvm_autotest_root/images/stg2.qcow2 \
    -blockdev node-name=test_disk3,driver=file,filename=/home/kvm_autotest_root/images/stg3.qcow2 \
    -blockdev node-name=test_disk4,driver=file,filename=/home/kvm_autotest_root/images/stg4.qcow2 \
    -blockdev node-name=test_disk5,driver=file,filename=/home/kvm_autotest_root/images/stg5.qcow2 \
    -blockdev node-name=test_disk6,driver=file,filename=/home/kvm_autotest_root/images/stg6.qcow2 \
    -blockdev node-name=test_disk7,driver=file,filename=/home/kvm_autotest_root/images/stg7.qcow2 \
    -blockdev node-name=test_disk8,driver=file,filename=/home/kvm_autotest_root/images/stg8.qcow2 \
    -blockdev node-name=test_disk9,driver=file,filename=/home/kvm_autotest_root/images/stg9.qcow2 \
    -blockdev node-name=test_disk10,driver=file,filename=/home/kvm_autotest_root/images/stg10.qcow2 \
    -blockdev node-name=test_disk11,driver=file,filename=/home/kvm_autotest_root/images/stg11.qcow2 \
    -blockdev node-name=test_disk12,driver=file,filename=/home/kvm_autotest_root/images/stg12.qcow2 \
    -blockdev node-name=test_disk13,driver=file,filename=/home/kvm_autotest_root/images/stg13.qcow2 \
    -blockdev node-name=test_disk14,driver=file,filename=/home/kvm_autotest_root/images/stg14.qcow2 \
    -blockdev node-name=test_disk15,driver=file,filename=/home/kvm_autotest_root/images/stg15.qcow2 \
    -blockdev node-name=test_disk16,driver=file,filename=/home/kvm_autotest_root/images/stg16.qcow2 \
    -blockdev node-name=test_disk17,driver=file,filename=/home/kvm_autotest_root/images/stg17.qcow2 \
    -blockdev node-name=test_disk18,driver=file,filename=/home/kvm_autotest_root/images/stg18.qcow2 \
    -blockdev node-name=test_disk19,driver=file,filename=/home/kvm_autotest_root/images/stg19.qcow2 \
    -blockdev node-name=test_disk20,driver=file,filename=/home/kvm_autotest_root/images/stg20.qcow2 \
    -blockdev node-name=test_disk21,driver=file,filename=/home/kvm_autotest_root/images/stg21.qcow2 \
    -blockdev node-name=test_disk22,driver=file,filename=/home/kvm_autotest_root/images/stg22.qcow2 \
    -blockdev node-name=test_disk23,driver=file,filename=/home/kvm_autotest_root/images/stg23.qcow2 \
    -blockdev node-name=test_disk24,driver=file,filename=/home/kvm_autotest_root/images/stg24.qcow2 \
    -blockdev node-name=test_disk25,driver=file,filename=/home/kvm_autotest_root/images/stg25.qcow2 \
    -blockdev node-name=test_disk26,driver=file,filename=/home/kvm_autotest_root/images/stg26.qcow2 \
    -blockdev node-name=test_disk27,driver=file,filename=/home/kvm_autotest_root/images/stg27.qcow2 \
    -blockdev node-name=test_disk28,driver=file,filename=/home/kvm_autotest_root/images/stg28.qcow2 \
    -blockdev node-name=test_disk29,driver=file,filename=/home/kvm_autotest_root/images/stg29.qcow2 \
    -blockdev node-name=test_disk30,driver=file,filename=/home/kvm_autotest_root/images/stg30.qcow2 \
    -blockdev node-name=test_disk31,driver=file,filename=/home/kvm_autotest_root/images/stg31.qcow2 \
    -blockdev node-name=test_disk32,driver=file,filename=/home/kvm_autotest_root/images/stg32.qcow2 \
    -blockdev node-name=test_disk33,driver=file,filename=/home/kvm_autotest_root/images/stg33.qcow2 \
    -blockdev node-name=test_disk34,driver=file,filename=/home/kvm_autotest_root/images/stg34.qcow2 \
    -blockdev node-name=test_disk35,driver=file,filename=/home/kvm_autotest_root/images/stg35.qcow2 \
    -blockdev node-name=test_disk36,driver=file,filename=/home/kvm_autotest_root/images/stg36.qcow2 \
    -blockdev node-name=test_disk37,driver=file,filename=/home/kvm_autotest_root/images/stg37.qcow2 \
    -blockdev node-name=test_disk38,driver=file,filename=/home/kvm_autotest_root/images/stg38.qcow2 \
    -blockdev node-name=test_disk39,driver=file,filename=/home/kvm_autotest_root/images/stg39.qcow2 \
    -blockdev node-name=test_disk40,driver=file,filename=/home/kvm_autotest_root/images/stg40.qcow2 \
    \
    -device pcie-root-port,id=pcie-root-port-3,port=0x3,bus=pci.0,chassis=4 \
    -device virtio-net-pci,mac=9a:21:f7:4a:1e:bd,id=idRuZxfv,netdev=idOpPVAe,bus=pcie-root-port-3,addr=0x0  \
    -netdev tap,id=idOpPVAe,vhost=on  \
    -rtc base=localtime,clock=host,driftfix=slew  \
    -boot menu=off,order=cdn,once=c,strict=off \
    -enable-kvm \
    -vnc :5  \
    -device pcie-root-port,id=pcie_extra_root_port_0,bus=pci.0 \
    -monitor stdio \
    -chardev file,id=qmp_id_qmpmonitor1,path=/var/tmp/monitor-qmpdbg.log,server,nowait \
    -mon chardev=qmp_id_qmpmonitor1,mode=control  \
    -qmp tcp:0:5955,server,nowait  \
    -chardev file,path=/var/tmp/monitor-serialdbg.log,id=serial_id_serial0 \
    -device isa-serial,chardev=serial_id_serial0  \

3. Log in to the guest and run sg_luns in multiple instances
trap 'kill $(jobs -p)' EXIT SIGINT

for i in `seq 0 32` ; do
	while true ; do
		sg_luns /dev/sdb
	done &
done
echo "wait"
wait

4. Hot plug/unplug multiple disks repeatedly every 3 seconds
NUM_LUNS=40
add_devices() {
  exec 3<>/dev/tcp/localhost/5955
  echo "$@"
  echo -e "{'execute':'qmp_capabilities'}" >&3
  read response <&3
  echo $response
  for i in $(seq 1 $NUM_LUNS) ; do
  cmd="{'execute':'device_add', 'arguments': {'driver':'scsi-hd','drive':'test_disk$i','id':'scsi_disk$i','bus':'scsi1.0','lun':$i}}"
  echo "$cmd"
  echo -e "$cmd" >&3
  read response <&3
  echo "$response"
  done
}

remove_devices() {
  exec 3<>/dev/tcp/localhost/5955
  echo "$@"
  echo -e "{'execute':'qmp_capabilities'}" >&3
  read response <&3
  echo $response
  for i in $(seq 1 $NUM_LUNS) ; do
  cmd="{'execute':'device_del', 'arguments': {'id':'scsi_disk$i'}}"
  echo "$cmd"
  echo -e "$cmd" >&3
  read response <&3
  echo "$response"
  done
}


while true ; do
    echo "adding devices"
    add_devices
    sleep 3
    echo "removing devices"
    remove_devices
    sleep 3
done

Ran for over 1 hour; no crash observed.

Comment 44 errata-xmlrpc 2021-02-22 15:39:38 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (virt:8.3 bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:0639
