Bug 2128585

Summary: VM crash when hot unplug pci serial device during boot
Product: Red Hat Enterprise Linux 9
Reporter: yalzhang <yalzhang>
Component: qemu-kvm
Assignee: Amnon Ilan <ailan>
qemu-kvm sub component: PCI
QA Contact: Chao Yang <chayang>
Status: CLOSED NOTABUG
Docs Contact:
Severity: unspecified
Priority: unspecified
CC: coli, dzheng, jinzhao, juzhang, nanliu, virt-maint, zhetang
Version: 9.1
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2022-10-24 03:03:57 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments: The backtrace (flags: none)

Description yalzhang@redhat.com 2022-09-21 07:36:38 UTC
Created attachment 1913271 [details]
The backtrace

Description of problem:
The VM crashes when a PCI serial device is hot-unplugged during boot.

Version-Release number of selected component (if applicable):
libvirt-8.5.0-6.el9.x86_64
qemu-kvm-7.1.0-1.el9.x86_64

How reproducible:
100% 

Steps to Reproduce:
1. Prepare a VM with a file-type serial device as below (no other serial or console devices):
# virsh dumpxml avocado-vt-vm1 --xpath //serial
<serial type="file">
  <source path="/var/lib/libvirt/virt-test"/>
  <target type="pci-serial" port="0">
    <model name="pci-serial"/>
  </target>
  <address type="pci" domain="0x0000" bus="0x10" slot="0x01" function="0x0"/>
</serial>

2. Undefine and define the VM again:
# virsh dumpxml avocado-vt-vm1 > avocado-vt-vm1.xml; virsh undefine avocado-vt-vm1 --nvram; virsh define avocado-vt-vm1.xml
(After undefining with --nvram and defining again, the guest goes through a reset the next time it starts.)

3. Start the VM and hot-unplug the file-type serial device immediately after start; the VM crashes:
# cat /tmp/test.xml
<serial type="file">
  <source path="/var/lib/libvirt/virt-test"/>
  <target type="pci-serial" port="0">
    <model name="pci-serial"/>
  </target>
</serial>
# virsh start avocado-vt-vm1; virsh detach-device avocado-vt-vm1 /tmp/test.xml --live; virsh domstate avocado-vt-vm1; sleep 10; virsh domstate avocado-vt-vm1
Domain 'avocado-vt-vm1' has been undefined

Domain 'avocado-vt-vm1' defined from avocado-vt-vm1.xml

Domain 'avocado-vt-vm1' started

Device detached successfully

running

shut off

4. Check the coredump list:
# coredumpctl list
TIME                          PID UID GID SIG     COREFILE EXE                   SIZE
Wed 2022-09-21 01:53:18 EDT 45883 107 107 SIGSEGV present  /usr/libexec/qemu-kvm 1.6M
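
The start / detach / domstate sequence from step 3 is timing-sensitive, so it may help to script it and poll the domain state instead of using a single fixed sleep. The following is only a minimal sketch: the helper names (virsh, watch_after_detach) and the poll count and interval are illustrative assumptions, not part of the original reproducer. It uses only the virsh subcommands already shown in step 3.

```python
import subprocess
import time


def virsh(*args):
    """Run a virsh subcommand and return its stripped stdout."""
    result = subprocess.run(
        ["virsh", *args], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


def watch_after_detach(domain, device_xml, run=virsh, polls=10,
                       interval=1.0, sleep=time.sleep):
    """Start the domain, detach the device right away, then poll domstate.

    Returns the list of observed states; polling stops early as soon as
    the domain is no longer reported as "running".
    The polls/interval defaults are arbitrary values chosen for illustration.
    """
    run("start", domain)
    run("detach-device", domain, device_xml, "--live")
    states = []
    for _ in range(polls):
        states.append(run("domstate", domain))
        if states[-1] != "running":
            break
        sleep(interval)
    return states


# Hypothetical usage on a host where the domain from this report exists:
#   watch_after_detach("avocado-vt-vm1", "/tmp/test.xml")
```

A returned list that ends in "shut off" corresponds to the running / shut off pair shown in the step 3 output.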

Actual results:
The VM crashes when a PCI serial device is hot-unplugged during boot.
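
To tell a qemu crash apart from an ordinary guest shutdown, the coredumpctl listing from step 4 can be filtered for SIGSEGV entries of the qemu-kvm binary. This is a rough sketch that assumes the column layout shown in this report (TIME, PID, UID, GID, SIG, COREFILE, EXE, SIZE); other systemd versions may print different columns. The function name segv_dumps is made up for illustration.

```python
LISTING = """\
TIME                          PID UID GID SIG     COREFILE EXE                   SIZE
Wed 2022-09-21 01:53:18 EDT 45883 107 107 SIGSEGV present  /usr/libexec/qemu-kvm 1.6M
"""


def segv_dumps(listing, exe="/usr/libexec/qemu-kvm"):
    """Return SIGSEGV entries for `exe` from `coredumpctl list` text output.

    Assumes the column order PID UID GID SIG COREFILE EXE SIZE, as in the
    listing quoted in this report.
    """
    entries = []
    for line in listing.splitlines():
        fields = line.split()
        # Skip the header, blank lines, other signals and other executables.
        if "SIGSEGV" not in fields or fields[-2:-1] != [exe]:
            continue
        sig = fields.index("SIGSEGV")
        entries.append({
            "pid": int(fields[sig - 3]),   # PID is three columns before SIG
            "corefile": fields[sig + 1],   # e.g. "present"
            "size": fields[-1],
        })
    return entries
```

For the listing above, segv_dumps(LISTING) yields one entry with PID 45883.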

Expected results:
VM should not crash

Additional info:

Comment 1 liunana 2022-09-21 09:01:11 UTC
Hi,

I can't reproduce the crash issue with qemu-kvm-7.1.0-1.el9.x86_64.
qemu command line:
     -device pcie-root-port,id=pcie-root-port-0,multifunction=on,bus=pcie.0,addr=0x1,chassis=1 \
     -device pcie-pci-bridge,id=pcie-pci-bridge-0,addr=0x0,bus=pcie-root-port-0  \
     -chardev socket,id=channel1,path=/tmp/helloworld1,server=on,wait=off \
     -device pci-serial,id=serial0,chardev=channel1,bus=pcie-pci-bridge-0,addr=0x1 \


Could you please try other PCI devices to check whether the issue reproduces with them?
Thanks a lot!


Best regards
Nana Liu

Comment 2 yalzhang@redhat.com 2022-09-22 01:39:46 UTC
I tried an rtl8139 interface, which is also a PCI device; the crash cannot be reproduced with it.

The qemu command line that libvirt generates for the XML in comment 0 looks like this:
-device '{"driver":"pcie-root-port","port":23,"chassis":8,"id":"pci.8","bus":"pcie.0","addr":"0x2.0x7"}' \
-device '{"driver":"pcie-pci-bridge","id":"pci.16","bus":"pci.8","addr":"0x0"}' \
-add-fd set=0,fd=23,opaque=serial0-source \
-chardev file,id=charserial0,path=/dev/fdset/0,append=on \
-device '{"driver":"pci-serial","chardev":"charserial0","id":"serial0","bus":"pci.16","addr":"0x1"}' \
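
To compare this with the command line in comment 1, the JSON -device arguments can be parsed to show where the serial device sits in the PCI topology. The sketch below is illustrative only: bus_chain is a hypothetical helper, and the device strings are copied from the command line above.

```python
import json

# The three -device arguments quoted above, verbatim.
DEVICE_ARGS = [
    '{"driver":"pcie-root-port","port":23,"chassis":8,"id":"pci.8","bus":"pcie.0","addr":"0x2.0x7"}',
    '{"driver":"pcie-pci-bridge","id":"pci.16","bus":"pci.8","addr":"0x0"}',
    '{"driver":"pci-serial","chardev":"charserial0","id":"serial0","bus":"pci.16","addr":"0x1"}',
]


def bus_chain(device_id, device_args):
    """Walk from a device up through its parent buses to the root bus."""
    devices = {d["id"]: d for d in map(json.loads, device_args)}
    chain = []
    current = device_id
    while current in devices:
        dev = devices[current]
        chain.append(f'{dev["driver"]}({dev["id"]})')
        current = dev["bus"]
    chain.append(current)  # a bus not created by any listed -device
    return chain


print(" -> ".join(bus_chain("serial0", DEVICE_ARGS)))
# pci-serial(serial0) -> pcie-pci-bridge(pci.16) -> pcie-root-port(pci.8) -> pcie.0
```

The chain (pci-serial behind a pcie-pci-bridge behind a pcie-root-port) has the same shape as in comment 1. One visible difference between the two command lines is the chardev backend: file here, socket in comment 1. Whether that difference matters for the crash is not established in this report.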

Comment 4 yalzhang@redhat.com 2022-10-24 03:03:57 UTC
(In reply to jingzhao from comment #3)
> Hello, yalan
> 
> So can we close it based on comment1 & 2
> 
> Thanks
> Jing

I agree. Unplugging a device during boot is not a normal operation, since it needs the OS's cooperation. This is a negative scenario, and I cannot reproduce it on the latest RHEL 9.2. Let me close it. Thank you!