Bug 2001404 - CVE-2021-4145 qemu-kvm: QEMU: NULL pointer dereference in mirror_wait_on_conflicts() in block/mirror.c [rhel-9.0]
Summary: CVE-2021-4145 qemu-kvm: QEMU: NULL pointer dereference in mirror_wait_on_conflicts() in block/mirror.c [rhel-9.0]
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 9
Classification: Red Hat
Component: qemu-kvm
Version: 9.0
Hardware: x86_64
OS: Unspecified
Priority: high
Severity: low
Target Milestone: rc
Target Release: ---
Assignee: Stefano Garzarella
QA Contact: aihua liang
URL:
Whiteboard:
Duplicates: 2030708 (view as bug list)
Depends On:
Blocks: 2002607 2018367 CVE-2021-4145
 
Reported: 2021-09-06 02:20 UTC by Yanan Fu
Modified: 2022-05-17 12:29 UTC (History)
CC List: 11 users

Fixed In Version: qemu-kvm-6.2.0-1.el9
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 2002607 (view as bug list)
Environment:
Last Closed: 2022-05-17 12:24:17 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Attachments


Links
  Red Hat Issue Tracker RHELPLAN-96266 (last updated 2021-09-06 02:32:50 UTC)
  Red Hat Product Errata RHBA-2022:2307 (last updated 2022-05-17 12:24:50 UTC)

Description Yanan Fu 2021-09-06 02:20:26 UTC
Description of problem:
QEMU crashes when committing a snapshot to the base image while fio is running in the guest

Version-Release number of selected component (if applicable):
host:
qemu-kvm-6.1.0-1.el9
kernel-5.14.0-1.el9.x86_64


guest:
kernel-5.14.0-1.el9.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Boot the guest with a system disk:
    -device pcie-root-port,id=pcie-root-port-3,port=0x3,addr=0x1.0x3,bus=pcie.0,chassis=4 \
    -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pcie-root-port-3,addr=0x0 \
    -blockdev node-name=file_image1,driver=file,auto-read-only=on,discard=unmap,aio=threads,filename=/home/kvm_autotest_root/images/rhel900-64-virtio-scsi.qcow2,cache.direct=on,cache.no-flush=off \
    -blockdev node-name=drive_image1,driver=qcow2,read-only=off,cache.direct=on,cache.no-flush=off,file=file_image1 \
    -device scsi-hd,id=image1,drive=drive_image1,write-cache=on \



2. Create a snapshot via QMP:
{'execute': 'blockdev-create', 'arguments': {'options': {'driver': 'file', 'filename': '/root/avocado/data/avocado-vt/sn1.qcow2', 'size': 21474836480}, 'job-id': 'file_sn1'}, 'id': 'GEFOF0Ok'}
{"timestamp": {"seconds": 1630893646, "microseconds": 640248}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "file_sn1"}}
{"timestamp": {"seconds": 1630893646, "microseconds": 640428}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "file_sn1"}}
{"return": {}, "id": "GEFOF0Ok"}
{"timestamp": {"seconds": 1630893646, "microseconds": 641472}, "event": "JOB_STATUS_CHANGE", "data": {"status": "waiting", "id": "file_sn1"}}
{"timestamp": {"seconds": 1630893646, "microseconds": 641542}, "event": "JOB_STATUS_CHANGE", "data": {"status": "pending", "id": "file_sn1"}}
{"timestamp": {"seconds": 1630893646, "microseconds": 641601}, "event": "JOB_STATUS_CHANGE", "data": {"status": "concluded", "id": "file_sn1"}}
{"timestamp": {"seconds": 1630893646, "microseconds": 924017}, "event": "RTC_CHANGE", "data": {"offset": 0}}
{'execute': 'job-dismiss', 'arguments': {'id': 'file_sn1'}, 'id': 'y4Og0JTS'}
{"timestamp": {"seconds": 1630893652, "microseconds": 808242}, "event": "JOB_STATUS_CHANGE", "data": {"status": "null", "id": "file_sn1"}}
{"return": {}, "id": "y4Og0JTS"}
{"execute": "blockdev-add", "arguments": {"node-name": "file_sn1", "driver": "file", "filename": "/root/avocado/data/avocado-vt/sn1.qcow2", "aio": "threads", "auto-read-only": true, "discard": "unmap"}, "id": "7122LIdl"}
{"return": {}, "id": "7122LIdl"}
{"execute": "blockdev-create", "arguments": {"options": {"driver": "qcow2", "file": "file_sn1", "size": 21474836480}, "job-id": "drive_sn1"}, "id": "DuGg41Qk"}
{"timestamp": {"seconds": 1630893663, "microseconds": 559471}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "drive_sn1"}}
{"timestamp": {"seconds": 1630893663, "microseconds": 559602}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "drive_sn1"}}
{"return": {}, "id": "DuGg41Qk"}
{"timestamp": {"seconds": 1630893663, "microseconds": 564397}, "event": "JOB_STATUS_CHANGE", "data": {"status": "waiting", "id": "drive_sn1"}}
{"timestamp": {"seconds": 1630893663, "microseconds": 564473}, "event": "JOB_STATUS_CHANGE", "data": {"status": "pending", "id": "drive_sn1"}}
{"timestamp": {"seconds": 1630893663, "microseconds": 564527}, "event": "JOB_STATUS_CHANGE", "data": {"status": "concluded", "id": "drive_sn1"}}
{"execute": "job-dismiss", "arguments": {"id": "drive_sn1"}, "id": "ltXHq8Bu"}
{"timestamp": {"seconds": 1630893667, "microseconds": 278363}, "event": "JOB_STATUS_CHANGE", "data": {"status": "null", "id": "drive_sn1"}}
{"return": {}, "id": "ltXHq8Bu"}
{"execute": "blockdev-add", "arguments": {"node-name": "drive_sn1", "driver": "qcow2", "file": "file_sn1", "read-only": false}, "id": "MHmlEvUW"}
{"return": {}, "id": "MHmlEvUW"}
{"execute": "blockdev-snapshot", "arguments": {"overlay": "drive_sn1", "node": "drive_image1"}, "id": "B4UIbVn9"}

3. Run an fio test in the guest:
/usr/bin/fio --name=stress --filename=/home/atest --ioengine=libaio --rw=write --direct=1 --bs=4K --size=2G --iodepth=256 --numjobs=256 --runtime=1800

4. Commit the snapshot to the base while fio is running:
{'execute': 'block-commit', 'arguments': {'device': 'drive_sn1', 'job-id': 'drive_sn1_vwgZ'}, 'id': 'nstFLfKJ'}
{"timestamp": {"seconds": 1630893702, "microseconds": 96539}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "drive_sn1_vwgZ"}}
{"timestamp": {"seconds": 1630893702, "microseconds": 96617}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "drive_sn1_vwgZ"}}
{"return": {}, "id": "nstFLfKJ"}

QEMU crashes at this step.
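
(For context, and not part of the captured log: in a run where QEMU does not crash, this active commit eventually reaches the ready state, QEMU emits a BLOCK_JOB_READY event for job drive_sn1_vwgZ, and the commit is finished by pivoting to the base with job-complete, roughly as below; the "id" value is just a placeholder.)

{"execute": "job-complete", "arguments": {"id": "drive_sn1_vwgZ"}, "id": "example"}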



Actual results:
QEMU crashes.

Expected results:
The commit of the snapshot to the base completes successfully while fio is running.

Additional info:
qemu command line:
/usr/libexec/qemu-kvm \
    -S  \
    -name 'avocado-vt-vm1'  \
    -sandbox on  \
    -machine q35,memory-backend=mem-machine_mem \
    -device pcie-root-port,id=pcie-root-port-0,multifunction=on,bus=pcie.0,addr=0x1,chassis=1 \
    -device pcie-pci-bridge,id=pcie-pci-bridge-0,addr=0x0,bus=pcie-root-port-0  \
    -nodefaults \
    -device VGA,bus=pcie.0,addr=0x2 \
    -device i6300esb,bus=pcie-pci-bridge-0,addr=0x1 \
    -m 30720 \
    -object memory-backend-ram,size=30720M,id=mem-machine_mem  \
    -smp 20,maxcpus=20,cores=10,threads=1,dies=1,sockets=2  \
    -cpu 'Cascadelake-Server-noTSX',+kvm_pv_unhalt \
    -chardev socket,server=on,path=/tmp/avocado_2x9u27d2/monitor-qmpmonitor1-20210905-044708-NK08vaC8,id=qmp_id_qmpmonitor1,wait=off  \
    -mon chardev=qmp_id_qmpmonitor1,mode=control \
    -device ich9-usb-ehci1,id=usb1,addr=0x1d.0x7,multifunction=on,bus=pcie.0 \
    -device ich9-usb-uhci1,id=usb1.0,multifunction=on,masterbus=usb1.0,addr=0x1d.0x0,firstport=0,bus=pcie.0 \
    -device ich9-usb-uhci2,id=usb1.1,multifunction=on,masterbus=usb1.0,addr=0x1d.0x2,firstport=2,bus=pcie.0 \
    -device ich9-usb-uhci3,id=usb1.2,multifunction=on,masterbus=usb1.0,addr=0x1d.0x4,firstport=4,bus=pcie.0 \
    -device pcie-root-port,id=pcie-root-port-2,port=0x2,addr=0x1.0x2,bus=pcie.0,chassis=3 \
    -device qemu-xhci,id=usb2,bus=pcie-root-port-2,addr=0x0 \
    -device usb-tablet,id=usb-tablet1,bus=usb2.0,port=1 \
    -device pcie-root-port,id=pcie-root-port-3,port=0x3,addr=0x1.0x3,bus=pcie.0,chassis=4 \
    -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pcie-root-port-3,addr=0x0 \
    -blockdev node-name=file_image1,driver=file,auto-read-only=on,discard=unmap,aio=threads,filename=/home/kvm_autotest_root/images/rhel900-64-virtio-scsi.qcow2,cache.direct=on,cache.no-flush=off \
    -blockdev node-name=drive_image1,driver=qcow2,read-only=off,cache.direct=on,cache.no-flush=off,file=file_image1 \
    -device scsi-hd,id=image1,drive=drive_image1,write-cache=on \
    -device pcie-root-port,id=pcie-root-port-4,port=0x4,addr=0x1.0x4,bus=pcie.0,chassis=5 \
    -device virtio-net-pci,mac=9a:59:66:96:dd:3d,id=idU6Widf,netdev=idHu8x4m,bus=pcie-root-port-4,addr=0x0  \
    -netdev tap,id=idHu8x4m,vhost=on \
    -vnc :0  \
    -monitor stdio \

Comment 5 Stefano Garzarella 2021-09-09 15:42:11 UTC
(In reply to Yanan Fu from comment #4)
> Here is the backtrace:
> # gdb qemu-kvm core-qemu-kvm-380249-1630831981
> ...
> ...
> Core was generated by `/usr/libexec/qemu-kvm -S -name avocado-vt-vm1
> -sandbox on -machine q35,memory-b'.
> Program terminated with signal SIGSEGV, Segmentation fault.
> #0  mirror_wait_on_conflicts (self=0x0, s=<optimized out>, offset=<optimized
> out>, bytes=<optimized out>)
>     at ../block/mirror.c:172
> 172	                self->waiting_for_op = op;
> [Current thread is 1 (Thread 0x7f0908931ec0 (LWP 380249))]
> (gdb) bt
> #0  mirror_wait_on_conflicts (self=0x0, s=<optimized out>, offset=<optimized
> out>, bytes=<optimized out>)
>     at ../block/mirror.c:172
> #1  0x00005610c5d9d631 in mirror_run (job=0x5610c76a2c00, errp=<optimized
> out>) at ../block/mirror.c:491
> #2  0x00005610c5d58726 in job_co_entry (opaque=0x5610c76a2c00) at
> ../job.c:917
> #3  0x00005610c5f046c6 in coroutine_trampoline (i0=<optimized out>,
> i1=<optimized out>)
>     at ../util/coroutine-ucontext.c:173
> #4  0x00007f0909975820 in ?? () at
> ../sysdeps/unix/sysv/linux/x86_64/__start_context.S:91
>    from /usr/lib64/libc.so.6
> #5  0x00007f090892e980 in ?? ()
> #6  0x0000000000000000 in ?? ()

The issue seems related to the following commit released with QEMU 6.1:
d44dae1a7c block/mirror: fix active mirror dead-lock in mirror_wait_on_conflicts
https://gitlab.com/qemu-project/qemu/-/commit/d44dae1a7cf782ec9235746ebb0e6c1a20dd7288

In mirror_iteration() we call mirror_wait_on_conflicts() with the `self` parameter set to NULL.
Starting from commit d44dae1a7c, we access `self` in mirror_wait_on_conflicts() without checking whether it can be NULL.

I'll fix it.
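
(Illustrative sketch only, not the QEMU code and not the upstream patch referenced in comment 8: a minimal standalone C program showing the kind of NULL guard mirror_wait_on_conflicts() needs before dereferencing `self`, with simplified stand-in types.)

/* null_guard_sketch.c - hypothetical illustration, simplified stand-ins */
#include <stdio.h>
#include <stddef.h>

typedef struct MirrorOp MirrorOp;
struct MirrorOp {
    MirrorOp *waiting_for_op;   /* stand-in for the field assigned at mirror.c:172 */
};

static void wait_on_conflicts(MirrorOp *self, MirrorOp *op)
{
    /* QEMU 6.1 assigned self->waiting_for_op unconditionally here; on the
     * mirror_iteration() call path self is NULL, which is the SIGSEGV. */
    if (self) {
        self->waiting_for_op = op;
    }
    /* ...wait for `op` to finish either way... */
    printf("waiting on %p (self=%p)\n", (void *)op, (void *)self);
}

int main(void)
{
    MirrorOp op = { NULL };
    wait_on_conflicts(NULL, &op);  /* would have crashed without the guard */
    wait_on_conflicts(&op, &op);
    return 0;
}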

Comment 8 Stefano Garzarella 2021-09-10 10:23:18 UTC
Patch posted upstream: https://lists.nongnu.org/archive/html/qemu-devel/2021-09/msg02750.html

Comment 14 aihua liang 2021-09-26 03:09:22 UTC
In qemu-kvm-6.1.0-2.el9 and qemu-kvm-6.1.0-1.module+el8.6.0+12535+4e2af250, image clusters leak after the QEMU crash.
[root@dell-per440-09 images]# qemu-img check rhel900-64-virtio.qcow2 
Leaked cluster 175239 refcount=1 reference=0
Leaked cluster 175240 refcount=1 reference=0
Leaked cluster 175241 refcount=1 reference=0
Leaked cluster 175242 refcount=1 reference=0
Leaked cluster 175243 refcount=1 reference=0
Leaked cluster 175244 refcount=1 reference=0
Leaked cluster 175245 refcount=1 reference=0
Leaked cluster 175246 refcount=1 reference=0
Leaked cluster 175247 refcount=1 reference=0
Leaked cluster 175248 refcount=1 reference=0
Leaked cluster 175249 refcount=1 reference=0
Leaked cluster 175250 refcount=1 reference=0
Leaked cluster 175251 refcount=1 reference=0
Leaked cluster 175252 refcount=1 reference=0
Leaked cluster 175253 refcount=1 reference=0
Leaked cluster 175254 refcount=1 reference=0
Leaked cluster 175255 refcount=1 reference=0
Leaked cluster 175256 refcount=1 reference=0
Leaked cluster 175257 refcount=1 reference=0
Leaked cluster 175258 refcount=1 reference=0
Leaked cluster 175259 refcount=1 reference=0
Leaked cluster 175260 refcount=1 reference=0
Leaked cluster 175261 refcount=1 reference=0
Leaked cluster 175262 refcount=1 reference=0
Leaked cluster 175263 refcount=1 reference=0
Leaked cluster 175264 refcount=1 reference=0
Leaked cluster 175265 refcount=1 reference=0
Leaked cluster 175266 refcount=1 reference=0
Leaked cluster 175267 refcount=1 reference=0
Leaked cluster 175268 refcount=1 reference=0
Leaked cluster 175269 refcount=1 reference=0
Leaked cluster 175270 refcount=1 reference=0
Leaked cluster 175271 refcount=1 reference=0
Leaked cluster 175272 refcount=1 reference=0
Leaked cluster 175273 refcount=1 reference=0
Leaked cluster 175274 refcount=1 reference=0
Leaked cluster 175275 refcount=1 reference=0
Leaked cluster 175276 refcount=1 reference=0
Leaked cluster 175277 refcount=1 reference=0
Leaked cluster 175278 refcount=1 reference=0
Leaked cluster 175279 refcount=1 reference=0
Leaked cluster 175280 refcount=1 reference=0
Leaked cluster 175281 refcount=1 reference=0
Leaked cluster 175282 refcount=1 reference=0
Leaked cluster 175283 refcount=1 reference=0
Leaked cluster 175284 refcount=1 reference=0
Leaked cluster 175285 refcount=1 reference=0
Leaked cluster 175286 refcount=1 reference=0
Leaked cluster 175287 refcount=1 reference=0
Leaked cluster 175288 refcount=1 reference=0
Leaked cluster 175289 refcount=1 reference=0
Leaked cluster 175290 refcount=1 reference=0
Leaked cluster 175291 refcount=1 reference=0
Leaked cluster 175292 refcount=1 reference=0
Leaked cluster 175293 refcount=1 reference=0
Leaked cluster 175294 refcount=1 reference=0
Leaked cluster 175295 refcount=1 reference=0
Leaked cluster 175296 refcount=1 reference=0
Leaked cluster 175297 refcount=1 reference=0
Leaked cluster 175298 refcount=1 reference=0
Leaked cluster 175299 refcount=1 reference=0
Leaked cluster 175300 refcount=1 reference=0
Leaked cluster 175301 refcount=1 reference=0
Leaked cluster 175302 refcount=1 reference=0
Leaked cluster 175303 refcount=1 reference=0
Leaked cluster 175304 refcount=1 reference=0
Leaked cluster 175305 refcount=1 reference=0
Leaked cluster 175306 refcount=1 reference=0
Leaked cluster 175307 refcount=1 reference=0
Leaked cluster 175308 refcount=1 reference=0
Leaked cluster 175309 refcount=1 reference=0
Leaked cluster 175310 refcount=1 reference=0
Leaked cluster 175311 refcount=1 reference=0
Leaked cluster 175312 refcount=1 reference=0
Leaked cluster 175313 refcount=1 reference=0
Leaked cluster 175314 refcount=1 reference=0
Leaked cluster 175315 refcount=1 reference=0
Leaked cluster 175316 refcount=1 reference=0
Leaked cluster 175317 refcount=1 reference=0
Leaked cluster 175318 refcount=1 reference=0
Leaked cluster 175319 refcount=1 reference=0
Leaked cluster 175320 refcount=1 reference=0
Leaked cluster 175321 refcount=1 reference=0
Leaked cluster 175322 refcount=1 reference=0
Leaked cluster 175323 refcount=1 reference=0
Leaked cluster 175324 refcount=1 reference=0
Leaked cluster 175325 refcount=1 reference=0
Leaked cluster 175326 refcount=1 reference=0
Leaked cluster 175327 refcount=1 reference=0
Leaked cluster 175328 refcount=1 reference=0
Leaked cluster 175329 refcount=1 reference=0
Leaked cluster 175330 refcount=1 reference=0
Leaked cluster 175331 refcount=1 reference=0
Leaked cluster 175332 refcount=1 reference=0
Leaked cluster 175333 refcount=1 reference=0
Leaked cluster 175334 refcount=1 reference=0
Leaked cluster 175335 refcount=1 reference=0
Leaked cluster 175336 refcount=1 reference=0
Leaked cluster 175337 refcount=1 reference=0
Leaked cluster 175338 refcount=1 reference=0
Leaked cluster 175339 refcount=1 reference=0
Leaked cluster 175340 refcount=1 reference=0
Leaked cluster 175341 refcount=1 reference=0
Leaked cluster 175342 refcount=1 reference=0
Leaked cluster 175343 refcount=1 reference=0
Leaked cluster 175344 refcount=1 reference=0
Leaked cluster 175345 refcount=1 reference=0
Leaked cluster 175346 refcount=1 reference=0
Leaked cluster 175347 refcount=1 reference=0
Leaked cluster 175348 refcount=1 reference=0
Leaked cluster 175349 refcount=1 reference=0
Leaked cluster 175350 refcount=1 reference=0
Leaked cluster 175351 refcount=1 reference=0
Leaked cluster 175352 refcount=1 reference=0
Leaked cluster 175353 refcount=1 reference=0
Leaked cluster 175354 refcount=1 reference=0
Leaked cluster 175355 refcount=1 reference=0
Leaked cluster 175356 refcount=1 reference=0
Leaked cluster 175357 refcount=1 reference=0
Leaked cluster 175358 refcount=1 reference=0
Leaked cluster 175359 refcount=1 reference=0
Leaked cluster 175360 refcount=1 reference=0
Leaked cluster 175361 refcount=1 reference=0
Leaked cluster 175362 refcount=1 reference=0
Leaked cluster 175363 refcount=1 reference=0
Leaked cluster 175364 refcount=1 reference=0
Leaked cluster 175365 refcount=1 reference=0
Leaked cluster 175366 refcount=1 reference=0
Leaked cluster 175367 refcount=1 reference=0
Leaked cluster 175368 refcount=1 reference=0
Leaked cluster 175369 refcount=1 reference=0
Leaked cluster 175370 refcount=1 reference=0
Leaked cluster 175371 refcount=1 reference=0
Leaked cluster 175372 refcount=1 reference=0
Leaked cluster 175373 refcount=1 reference=0
Leaked cluster 175374 refcount=1 reference=0
Leaked cluster 175375 refcount=1 reference=0
Leaked cluster 175376 refcount=1 reference=0
Leaked cluster 175377 refcount=1 reference=0
Leaked cluster 175378 refcount=1 reference=0
Leaked cluster 175379 refcount=1 reference=0
Leaked cluster 175380 refcount=1 reference=0
Leaked cluster 175381 refcount=1 reference=0
Leaked cluster 175382 refcount=1 reference=0
Leaked cluster 175383 refcount=1 reference=0
Leaked cluster 175384 refcount=1 reference=0
Leaked cluster 175385 refcount=1 reference=0
Leaked cluster 175386 refcount=1 reference=0
Leaked cluster 175387 refcount=1 reference=0
Leaked cluster 175388 refcount=1 reference=0
Leaked cluster 175389 refcount=1 reference=0
Leaked cluster 175390 refcount=1 reference=0
Leaked cluster 175391 refcount=1 reference=0
Leaked cluster 175392 refcount=1 reference=0
Leaked cluster 175393 refcount=1 reference=0
Leaked cluster 175394 refcount=1 reference=0

156 leaked clusters were found on the image.
This means waste of disk space, but no harm to data.
175203/327680 = 53.47% allocated, 5.20% fragmented, 0.00% compressed clusters
Image end offset: 11494752256

The latest versions, qemu-kvm-core-6.1.0-1.module+el8.6.0+12721+8d053ff2.x86_64 and qemu-kvm-6.1.0-3.el9, also hit this issue.
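
(Side note: as qemu-img reports above, leaked clusters only waste disk space. If desired, they can be reclaimed offline, with the guest shut down, using qemu-img's repair mode, for example:)

# qemu-img check -r leaks rhel900-64-virtio.qcow2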

Comment 15 Han Han 2021-12-10 01:58:52 UTC
*** Bug 2030708 has been marked as a duplicate of this bug. ***

Comment 16 Han Han 2021-12-10 02:05:25 UTC
Hi, another instance of this bug (https://bugzilla.redhat.com/show_bug.cgi?id=2030708#c0) shows that an unprivileged user can cause the qemu process to crash.
So it is a possible DoS vulnerability.
Adding the Security keyword here.

Comment 17 Mauro Matteo Cascella 2021-12-15 16:00:26 UTC
Hi Han,

> bug(https://bugzilla.redhat.com/show_bug.cgi?id=2030708#c0), an unprivileged
> user could help to cause the crash of qemu processs.
> So it is a possible DOS vulnerability.
> Add security keyword here.

Could you please elaborate on this? Doesn't the user need proper privileges to create a snapshot of the guest and/or execute a block commit operation? I'm not sure we should call it a security issue if that's the case, because I don't see any trust boundary crossed.

Comment 18 Han Han 2021-12-17 09:54:36 UTC
(In reply to Mauro Matteo Cascella from comment #17)
> Hi Han,
> 
> > bug(https://bugzilla.redhat.com/show_bug.cgi?id=2030708#c0), an unprivileged
> > user could help to cause the crash of qemu processs.
> > So it is a possible DOS vulnerability.
> > Add security keyword here.
> 
> Could you please elaborate on this? Doesn't the user need proper privileges
OK. Let's see a method to reproduce this bug: https://bugzilla.redhat.com/show_bug.cgi?id=2030708#c0

The QEMU segmentation fault is caused by the step `ssh hhan@$IP dd if=/dev/urandom of=file bs=1G count=1`. This means that, under some conditions, an unprivileged user **hhan** inside the VM can cause the QEMU segmentation fault.
> to create a snapshot of the guest and/or execute a block commit operation?
> I'm not sure we should call it a security issue if that's the case, because
> I don't see any trust boundary crossed.
The trust boundary is that an unprivileged user inside the VM should not be able to crash QEMU.

Comment 19 Yanan Fu 2021-12-20 12:45:07 UTC
QE bot(pre verify): Set 'Verified:Tested,SanityOnly' as gating/tier1 test pass.

Comment 20 aihua liang 2021-12-21 08:18:16 UTC
Tested with qemu-kvm-6.2.0-1.el9; the core dump issue no longer occurs.

Comment 21 Mauro Matteo Cascella 2021-12-21 09:45:33 UTC
(In reply to Han Han from comment #18)
> The qemu segment fault is caused by the step of `ssh hhan@$IP dd
> if=/dev/urandom of=file bs=1G count=1`. It means at some conditions, a
> unprivileged user **hhan** inside the VM can cause the qemu segment fault.

> The trust boundary is that an unprivileged user inside VM shouldn't make
> qemu crash.

OK, so if I understand correctly we can consider the various operations on the host (e.g., snapshot-create, blockcommit, domblkthreshold, etc.) as a precondition for this bug to happen. Under such circumstances the guest user writes a huge file and triggers the flaw. No matter how likely these preconditions are, I agree that this should not happen. I think we can opt for a low-severity CVE here.

Comment 24 aihua liang 2021-12-23 06:05:29 UTC
Hi Hanhan,

 Please help check whether the security issue still exists in qemu-kvm-6.2.0-1.el9. If not, I will change the bug's status to "VERIFIED".

Thanks,
Aliang

Comment 25 Han Han 2021-12-23 07:06:47 UTC
(In reply to aihua liang from comment #24)
> Hi,Hanhan
> 
>  Please help to check if the security issue still exist in
> qemu-kvm-6.2.0-1.el9? If not, will change bug's status to "VERIFIED".
> 
> Thanks,
> Aliang

Well, my machine resources are occupied by other testing right now.
You can try to reproduce it with the scripts from https://bugzilla.redhat.com/show_bug.cgi?id=2030708
If it is not reproduced after thousands of loops, I think that means the bug has been fixed.

Comment 26 aihua liang 2021-12-23 07:32:33 UTC
As noted in comment 25, checking the CVE issue needs more time and there is no idle resource at present, so re-setting the ITM.

Comment 27 aihua liang 2022-01-06 11:02:22 UTC
Reproduced the security issue that Han Han reported in bz2030708.

Even if we don't set the block threshold and we use the root user, we can still trigger this issue.

Reproduction ratio:
  1/44

Test Env:
  kernel version:5.14.0-30.el9.x86_64
  qemu-kvm version:qemu-kvm-6.1.0-8.el9
  libvirt version:libvirt-7.10.0-1.el9.x86_64

Steps to Reproduce:
1. Prepare a VM named avocado-vt-vm1
2. Monitor libvirt events:
# virsh event avocado-vt-vm1 --loop --all

3. Run the script below to write data when the block job reaches the ready state:
#!/bin/bash -
IP=192.168.122.156 # the IP of the guest
VM=avocado-vt-vm1
while true;do
    virsh start $VM
    sleep 30
    # take an external disk-only snapshot, then start an active block commit
    virsh snapshot-create $VM --no-metadata --disk-only
    virsh blockcommit $VM vda --active
    # write data inside the guest while the commit job is running
    ssh root@$IP dd if=/dev/urandom of=file bs=1G count=1
    sleep $(shuf -i 1-10 -n1)
    # pivot to the base image and shut the guest down
    virsh blockjob $VM vda --pivot
    virsh destroy $VM
    # if QEMU has already crashed, 'virsh destroy' fails and the loop stops
    if [ $? -ne 0 ];then
        break
    fi
done

Actual results:
Sometimes QEMU will get a segmentation fault:

Domain 'avocado-vt-vm1' started

Domain snapshot 1641461222 created
Active Block Commit started
0+1 records in
0+1 records out
33554431 bytes (34 MB, 32 MiB) copied, 0.212443 s, 158 MB/s
error: Requested operation is not valid: domain is not running

error: Failed to destroy domain 'avocado-vt-vm1'
error: Requested operation is not valid: domain is not running

The event log:
event 'agent-lifecycle' for domain 'avocado-vt-vm1': state: 'disconnected' reason: 'domain started'
event 'lifecycle' for domain 'avocado-vt-vm1': Resumed Unpaused
event 'lifecycle' for domain 'avocado-vt-vm1': Started Booted
event 'agent-lifecycle' for domain 'avocado-vt-vm1': state: 'connected' reason: 'channel event'
event 'rtc-change' for domain 'avocado-vt-vm1': -1
event 'block-job' for domain 'avocado-vt-vm1': Active Block Commit for /var/lib/avocado/data/avocado-vt/images/rhel900-64-virtio-scsi.1641461222 ready
event 'block-job-2' for domain 'avocado-vt-vm1': Active Block Commit for vda ready
event 'lifecycle' for domain 'avocado-vt-vm1': Stopped Failed



Hi, Mauro

 From my reproduction results:
  The root cause of this issue is that data is written while the block job is running, no matter whether the user is root or a common user.
  So is it still a security issue?

BR,
Aliang

Comment 28 aihua liang 2022-01-07 11:46:24 UTC
Tested the issue that Han Han reported in bz2030708 900 times; all passed.

So setting the bug's status to "Verified".

Comment 29 Mauro Matteo Cascella 2022-01-10 15:03:52 UTC
Hi Aliang,

(In reply to aihua liang from comment #27)
>  From my reproduce result:
>   The root cause of this issue is data-writing during block job running, no
> matter the user is root user or common user.
>   So is it still a security-re issue?

Well, considering that an unprivileged user can trigger this bug (see comment 18), it doesn't come as a surprise that root is also able to trigger it. As long as you can make QEMU crash from within the guest, I think we should treat this as a (low) security issue.

Regards.

Comment 31 errata-xmlrpc 2022-05-17 12:24:17 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (new packages: qemu-kvm), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:2307

