Bug 1522682

Summary: libvirtd crashes when starting a guest with a non-existing disk source and driver type='raw'

Product: Red Hat Enterprise Linux 7
Reporter: Fangge Jin <fjin>
Component: libvirt
Assignee: Peter Krempa <pkrempa>
Status: CLOSED ERRATA
QA Contact: lijuan men <lmen>
Severity: medium
Docs Contact:
Priority: unspecified
Version: 7.5
CC: jiyan, juzhou, lmen, lmiksik, pkrempa, rbalakri, xuzhang, yisun
Target Milestone: rc
Keywords: Regression
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version: libvirt-3.9.0-6.el7
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-04-10 11:02:08 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Attachments:
- coredump of libvirtd (flags: none)
- libvirtd log (flags: none)

Description Fangge Jin 2017-12-06 09:15:10 UTC
Created attachment 1363568 [details]
coredump of libvirtd

Description of problem:
libvirtd crashes when starting a guest with a non-existing disk source and driver type='raw'

Version-Release number of selected component:
libvirt-3.9.0-5.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Prepare a guest with a raw disk image:
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' cache='none'/>
      <source file='/nfs/fjin/RHEL-7.5-x86_64-latest.qcow2'/>
      <target dev='vda' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </disk>
    
    Note: The image /nfs/fjin/RHEL-7.5-x86_64-latest.qcow2 doesn't exist

2. Start guest
    # virsh start foo=1
    error: Disconnected from qemu:///system due to end of file
    error: Failed to start domain foo=1
    error: End of file while reading data: Input/output error
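
For reference, the same start path can also be driven through the libvirt C API (virDomainCreate is frame #7 in the backtrace under "Additional info"). A minimal sketch; the domain name matches the transcript above, and the build command is an assumption:

/* repro.c - drives the same qemuProcessStart path as 'virsh start'.
 * Build (assumption): gcc repro.c -o repro $(pkg-config --cflags --libs libvirt) */
#include <stdio.h>
#include <libvirt/libvirt.h>
#include <libvirt/virterror.h>

int main(void)
{
    virConnectPtr conn = virConnectOpen("qemu:///system");
    if (!conn) {
        fprintf(stderr, "failed to connect to qemu:///system\n");
        return 1;
    }

    virDomainPtr dom = virDomainLookupByName(conn, "foo=1");
    if (dom) {
        /* On the broken build this request crashes libvirtd, so the
         * client only sees the "end of file" errors shown above. */
        if (virDomainCreate(dom) < 0) {
            virErrorPtr err = virGetLastError();
            fprintf(stderr, "start failed: %s\n",
                    err && err->message ? err->message : "unknown error");
        }
        virDomainFree(dom);
    }

    virConnectClose(conn);
    return 0;
}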


Actual results:
libvirtd crashed

Expected results:
libvirtd reports the correct error when the guest fails to start, and does not crash:
# virsh start foo=1
error: Failed to start domain foo=1
error: Cannot access storage file '/nfs/fjin/RHEL-7.5-x86_64-latest.qcow2' (as uid:107, gid:107): No such file or directory


Additional info:
1. With the disk driver type changed to "qcow2" (i.e. <driver name='qemu' type='qcow2' cache='none'/>), the guest fails to start and libvirtd does not crash.
2. The backtrace of libvirtd:
(gdb) bt
#0  virStorageFileReportBrokenChain (errcode=2, src=src@entry=0x7fc86801a7f0, parent=parent@entry=0x7fc86801a7f0) at storage/storage_source.c:407
#1  0x00007fc8741a3d90 in qemuDomainDetermineDiskChain (driver=driver@entry=0x7fc86c103c40, vm=vm@entry=0x7fc86c2efe30, disk=disk@entry=0x7fc86801ab50, force_probe=force_probe@entry=true, 
    report_broken=report_broken@entry=true) at qemu/qemu_domain.c:6369
#2  0x00007fc8741cc2ba in qemuProcessPrepareHostStorage (flags=17, vm=0x7fc86c2efe30, driver=0x7fc86c103c40) at qemu/qemu_process.c:5561
#3  qemuProcessPrepareHost (driver=driver@entry=0x7fc86c103c40, vm=vm@entry=0x7fc86c2efe30, flags=flags@entry=17) at qemu/qemu_process.c:5667
#4  0x00007fc8741d2083 in qemuProcessStart (conn=conn@entry=0x7fc868000b90, driver=driver@entry=0x7fc86c103c40, vm=vm@entry=0x7fc86c2efe30, updatedCPU=updatedCPU@entry=0x0, 
    asyncJob=asyncJob@entry=QEMU_ASYNC_JOB_START, migrateFrom=migrateFrom@entry=0x0, migrateFd=migrateFd@entry=-1, migratePath=migratePath@entry=0x0, snapshot=snapshot@entry=0x0, 
    vmop=vmop@entry=VIR_NETDEV_VPORT_PROFILE_OP_CREATE, flags=17, flags@entry=1) at qemu/qemu_process.c:6127
#5  0x00007fc874236436 in qemuDomainObjStart (conn=0x7fc868000b90, driver=driver@entry=0x7fc86c103c40, vm=0x7fc86c2efe30, flags=flags@entry=0, asyncJob=QEMU_ASYNC_JOB_START)
    at qemu/qemu_driver.c:7291
#6  0x00007fc874236b76 in qemuDomainCreateWithFlags (dom=0x7fc85c000e20, flags=0) at qemu/qemu_driver.c:7345
#7  0x00007fc891ba945c in virDomainCreate (domain=domain@entry=0x7fc85c000e20) at libvirt-domain.c:6531
#8  0x000056396f223a73 in remoteDispatchDomainCreate (server=0x563970329f90, msg=0x5639703442f0, args=<optimized out>, rerr=0x7fc8810d5c10, client=0x563970344000) at remote_dispatch.h:4222
#9  remoteDispatchDomainCreateHelper (server=0x563970329f90, client=0x563970344000, msg=0x5639703442f0, rerr=0x7fc8810d5c10, args=<optimized out>, ret=0x7fc85c001280) at remote_dispatch.h:4198
#10 0x00007fc891c19ab2 in virNetServerProgramDispatchCall (msg=0x5639703442f0, client=0x563970344000, server=0x563970329f90, prog=0x563970341950) at rpc/virnetserverprogram.c:437
#11 virNetServerProgramDispatch (prog=0x563970341950, server=server@entry=0x563970329f90, client=0x563970344000, msg=0x5639703442f0) at rpc/virnetserverprogram.c:307
#12 0x000056396f234c7d in virNetServerProcessMsg (msg=<optimized out>, prog=<optimized out>, client=<optimized out>, srv=0x563970329f90) at rpc/virnetserver.c:148
#13 virNetServerHandleJob (jobOpaque=<optimized out>, opaque=0x563970329f90) at rpc/virnetserver.c:169
#14 0x00007fc891af4161 in virThreadPoolWorker (opaque=opaque@entry=0x56397031dc20) at util/virthreadpool.c:167
#15 0x00007fc891af34e8 in virThreadHelper (data=<optimized out>) at util/virthread.c:206
#16 0x00007fc88eefce25 in start_thread () from /lib64/libpthread.so.0
#17 0x00007fc88ec26ccd in clone () from /lib64/libc.so.6
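
Frame #0 is the crash site: per the fix noted in comment 4, virStorageFileReportBrokenChain reads the uid/gid for its error message out of the storage-driver private data attached to the virStorageSource, which was never initialized on this code path. A self-contained model of that shape (miniature hypothetical types, not the actual libvirt source):

#include <stdio.h>

typedef struct { unsigned int uid, gid; } DrvData;      /* models the driver private data */
typedef struct { DrvData *drv; const char *path; } Src; /* models virStorageSource */

/* Pre-fix shape: dereferences drv unconditionally, so a NULL drv
 * (storage backend never initialized) is a NULL-pointer dereference. */
static void report_broken_chain(Src *src)
{
    printf("Cannot access storage file '%s' (as uid:%u, gid:%u)\n",
           src->path, src->drv->uid, src->drv->gid);   /* segfaults when drv == NULL */
}

int main(void)
{
    Src src = { NULL, "/nfs/fjin/RHEL-7.5-x86_64-latest.qcow2" };
    report_broken_chain(&src);   /* mirrors the libvirtd crash above */
    return 0;
}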

Comment 2 Fangge Jin 2017-12-06 09:16:31 UTC
Created attachment 1363569 [details]
libvirtd log

Comment 4 Peter Krempa 2017-12-07 12:44:38 UTC
Fixed upstream:

commit 2d07f1f0ebd44b0348daa61afa0de34f3f838c22 
Author: Peter Krempa <pkrempa>
Date:   Wed Dec 6 16:20:07 2017 +0100

    storage: Don't dereference driver object if virStorageSource is not initialized
    
    virStorageFileReportBrokenChain uses data from the driver private data
    pointer to print the user and group. This would lead to a crash in call
    paths where we did not initialize the storage backend as recently added
    in commit 24e47ee2b93 to qemuDomainDetermineDiskChain.
    
    Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1522682
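
An illustrative model of the guard this fix adds, reusing the miniature DrvData/Src types from the sketch in the description (the upstream patch differs in detail; this is not the verbatim code):

#include <stdio.h>

typedef struct { unsigned int uid, gid; } DrvData;      /* models the driver private data */
typedef struct { DrvData *drv; const char *path; } Src; /* models virStorageSource */

/* Post-fix shape: only dereference the driver private data when the
 * storage backend was initialized; otherwise report without uid/gid. */
static void report_broken_chain(Src *src)
{
    if (src->drv)
        printf("Cannot access storage file '%s' (as uid:%u, gid:%u)\n",
               src->path, src->drv->uid, src->drv->gid);
    else
        printf("Cannot access storage file '%s'\n", src->path);
}

int main(void)
{
    Src src = { NULL, "/var/lib/libvirt/images/abc.img" };
    report_broken_chain(&src);   /* degrades to the uid/gid-less message */
    return 0;
}

Note that this matches the verification output in comment 8, where the error message appears without the uid/gid suffix.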

Comment 7 jiyan 2018-01-08 10:41:10 UTC
Adding two more scenarios that can hit this bug:
https://bugzilla.redhat.com/show_bug.cgi?id=1510323#c10

Comment 8 lijuan men 2018-01-18 06:49:00 UTC
Verified the bug.

version:
libvirt-3.9.0-8.el7.x86_64
qemu-kvm-rhev-2.10.0-17.el7.x86_64

steps:

Scenario 1: start a guest with a non-existing disk source and driver type='raw'

1. Prepare a guest with a non-existing raw disk image:
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw'/>
      <source file='/var/lib/libvirt/images/abc.img'/>
      <target dev='vda' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </disk>
2. Start the guest:
[root@lmen1 ~]# virsh start test
error: Failed to start domain test
error: Cannot access storage file '/var/lib/libvirt/images/abc.img': No such file or directory

PS: A non-existing qcow2 image gives the same result.



Scenario 2: insert a source file into the cdrom device with 'virsh change-media'
1. Start a guest with this XML:
    <disk type='file' device='cdrom'>
      <driver name='qemu' type='raw'/>
      <target dev='hdc' bus='ide'/>
      <readonly/>
      <address type='drive' controller='0' bus='1' target='0' unit='0'/>
    </disk>

[root@lmen1 images]# virsh start test
Domain test started

2. Insert a source file into the cdrom device with 'virsh change-media':
[root@lmen1 images]# virsh change-media test hdc  --insert --current b.iso
error: Failed to complete action insert on media
error: Cannot access storage file 'b.iso': No such file or directory


Scenario 3: update the source file of the cdrom device with 'virsh change-media'
1. Start a guest with this XML:
    <disk type='file' device='cdrom'>
      <driver name='qemu' type='raw'/>
      <source file='/var/lib/libvirt/images/a.iso'/>
      <target dev='hdc' bus='ide'/>
      <readonly/>
      <address type='drive' controller='0' bus='1' target='0' unit='0'/>
    </disk>

[root@lmen1 images]# virsh start test
Domain test started

2. Update the source file of the cdrom device with 'virsh change-media':
[root@lmen1 images]# virsh change-media test hdc  --update --current b.iso
error: Failed to complete action update on media
error: Cannot access storage file 'b.iso': No such file or directory

Comment 12 errata-xmlrpc 2018-04-10 11:02:08 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:0704