Bug 1860627
| Summary: | qemu-img convert exit with exit code 1 without an error after successful operation | ||||||
|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux Advanced Virtualization | Reporter: | Nir Soffer <nsoffer> | ||||
| Component: | qemu-kvm | Assignee: | Eric Blake <eblake> | ||||
| qemu-kvm sub component: | NBD | QA Contact: | zixchen | ||||
| Status: | CLOSED ERRATA | Docs Contact: | |||||
| Severity: | medium | ||||||
| Priority: | medium | CC: | coli, ddepaula, eblake, jinzhao, juzhang, virt-maint, xuwei | ||||
| Version: | 8.2 | Flags: | pm-rhel: mirror+ | ||||
| Target Milestone: | rc | ||||||
| Target Release: | 8.3 | ||||||
| Hardware: | Unspecified | ||||||
| OS: | Unspecified | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | qemu-kvm-5.1.0-2.module+el8.3.0+7652+b30e6901 | Doc Type: | If docs needed, set a value | ||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2020-11-17 17:50:17 UTC | Type: | Bug | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
| Attachments: | ||||||
Description
Nir Soffer
2020-07-25 22:47:05 UTC
Note: the correct attachment is attachment 1702415.
Disabling compression works:
$ qemu-img convert -f raw -O qcow2 disk1.raw nbd+unix:///?socket=/tmp/nbd.sock; echo $?
0
$ qemu-img info nbd+unix:///?socket=/tmp/nbd.sock
image: nbd+unix://?socket=/tmp/nbd.sock
file format: qcow2
virtual size: 1 GiB (1073741824 bytes)
disk size: unavailable
cluster_size: 65536
Format specific information:
compat: 1.1
lazy refcounts: false
refcount bits: 16
corrupt: false
$ qemu-img check nbd+unix:///?socket=/tmp/nbd.sock
No errors were found on the image.
1/16384 = 0.01% allocated, 0.00% fragmented, 0.00% compressed clusters
Image end offset: 393216
So this seems to be an issue in qcow2 (or qemu-img), not in nbd.
Debugging qemu-img from git shows that the error comes from:
# qemu-img.c
2119 if (s->compressed && !s->ret) {
2120 /* signal EOF to align */
2121 ret = blk_pwrite_compressed(s->target, 0, NULL, 0);
2122 if (ret < 0) {
2123 return ret;
2124 }
2125 }
(gdb) bt
#0 bdrv_co_truncate (child=0x555555866670, offset=1073740288, exact=false, prealloc=PREALLOC_MODE_OFF, flags=0, errp=0x0) at /home/nsoffer/src/qemu/block/io.c:3397
#1 0x00005555555cb57c in qcow2_co_pwritev_compressed_part (bs=0x5555558b70d0, offset=0, bytes=0, qiov=0x7fffffffcfc0, qiov_offset=0)
at /home/nsoffer/src/qemu/block/qcow2.c:4522
#2 0x0000555555615bfe in bdrv_driver_pwritev_compressed (bs=0x5555558b70d0, offset=0, bytes=0, qiov=0x7fffffffcfc0, qiov_offset=0)
at /home/nsoffer/src/qemu/block/io.c:1277
#3 0x0000555555617ba4 in bdrv_aligned_pwritev (child=0x5555558c6bc0, req=0x7fffe50d7e10, offset=0, bytes=0, align=1, qiov=0x7fffffffcfc0, qiov_offset=0, flags=32)
at /home/nsoffer/src/qemu/block/io.c:2015
#4 0x00005555556183e8 in bdrv_co_pwritev_part (child=0x5555558c6bc0, offset=0, bytes=0, qiov=0x7fffffffcfc0, qiov_offset=0, flags=BDRV_REQ_WRITE_COMPRESSED)
at /home/nsoffer/src/qemu/block/io.c:2186
#5 0x00005555555fddcb in blk_do_pwritev_part (blk=0x5555558442a0, offset=0, bytes=0, qiov=0x7fffffffcfc0, qiov_offset=0, flags=BDRV_REQ_WRITE_COMPRESSED)
at /home/nsoffer/src/qemu/block/block-backend.c:1260
#6 0x00005555555fdf40 in blk_write_entry (opaque=0x7fffffffcfa0) at /home/nsoffer/src/qemu/block/block-backend.c:1310
#7 0x00005555556d35e6 in coroutine_trampoline (i0=1436206016, i1=21845) at /home/nsoffer/src/qemu/util/coroutine-ucontext.c:173
#8 0x00007ffff6b84d20 in ?? () from /lib64/libc.so.6
#9 0x00007fffffffc760 in ?? ()
#10 0x0000000000000000 in ?? ()
(gdb) frame 0
#0 bdrv_co_truncate (child=0x555555866670, offset=1073740288, exact=false, prealloc=PREALLOC_MODE_OFF, flags=0, errp=0x0) at /home/nsoffer/src/qemu/block/io.c:3397
3397 ret = -ENOTSUP;
(gdb) list
3392 ret = drv->bdrv_co_truncate(bs, offset, exact, prealloc, flags, errp);
3393 } else if (bs->file && drv->is_filter) {
3394 ret = bdrv_co_truncate(bs->file, offset, exact, prealloc, flags, errp);
3395 } else {
3396 error_setg(errp, "Image format driver does not support resize");
3397 ret = -ENOTSUP;
So we have 2 issues:
- Failure to truncate the image, as the last step of the convert when using
the nbd backend. Not sure why we try to truncate.
The call to truncate looks wrong:
4513 if (bytes == 0) {
4514 /*
4515 * align end of file to a sector boundary to ease reading with
4516 * sector based I/Os
4517 */
4518 int64_t len = bdrv_getlength(bs->file->bs);
4519 if (len < 0) {
4520 return len;
4521 }
4522 return bdrv_co_truncate(bs->file, len, false, PREALLOC_MODE_OFF, 0,
4523 NULL);
4524 }
The comment suggests that we are aligning the end of the file to a sector
boundary, but we truncate it to the current length of the file, so there is
no alignment. We can see that we called bdrv_co_truncate() with
len=1073740288, which is the size reported by qemu-nbd:
1024**3 - 1536 = 1073740288.
- The error message "Image format driver does not support resize" is not
written to stderr. This is really bad, making debugging very hard.
4522 return bdrv_co_truncate(bs->file, len, false, PREALLOC_MODE_OFF, 0,
4523 NULL);
Here we call bdrv_co_truncate() with a NULL errp, so we hide any error set
in bdrv_co_truncate().
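The length arithmetic from the first issue can be double-checked with a short Python sketch. It assumes a 512-byte sector size, which the report does not state, and `align_up` is a hypothetical helper written for illustration, not a QEMU function.

```python
SECTOR_SIZE = 512  # assumed sector size; the report does not state it


def align_up(length: int, alignment: int) -> int:
    """Round length up to the next multiple of alignment (hypothetical helper)."""
    return -(-length // alignment) * alignment


# Size reported by qemu-nbd: a 1 GiB file exported with --offset 1536.
export_len = 1024**3 - 1536

print(export_len)                         # the length passed to bdrv_co_truncate()
print(export_len % SECTOR_SIZE)           # remainder against the assumed sector size
print(align_up(export_len, SECTOR_SIZE))  # what an explicit round-up would request
```

Under that assumption the export length is already a multiple of 512, so an explicit round-up would request the same size. Either way, the quoted code passes the current length through unchanged.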
I posted a fix to the nbd driver here:
https://lists.nongnu.org/archive/html/qemu-block/2020-07/msg01543.html
Reproduced with qemu-kvm-5.0.0-2.module+el8.3.0+7379+0505d6ca.x86_64: with compression disabled it did not hit the issue; with compression enabled it hit the issue.
Version:
kernel-4.18.0-227.el8.x86_64
qemu-kvm-5.0.0-2.module+el8.3.0+7379+0505d6ca.x86_64
Test steps:
1. echo "disk data" > disk1.raw
2. truncate -s 1g disk1.raw
3. truncate -s 1g test.tar
4. qemu-nbd --socket=/tmp/nbd.sock --persistent --format=raw --offset 1536 test.tar
With compression
5. qemu-img convert -f raw -O qcow2 -c disk1.raw nbd+unix:///?socket=/tmp/nbd.sock; echo $?
Without compression
6. qemu-img convert -f raw -O qcow2 disk1.raw nbd+unix:///?socket=/tmp/nbd.sock; echo $?
Actual results:
After step 5, it returns 1. After executing steps 1-4 and 6, it returns 1.
Expected result
1. returns 0
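A check for the symptom in the summary (non-zero exit code with nothing on stderr) could be scripted roughly as follows. This is only a sketch: `run_and_classify` is a hypothetical helper, and the demo uses a stand-in Python command rather than qemu-img so that it runs anywhere.

```python
import subprocess
import sys


def run_and_classify(cmd):
    """Run cmd; return (exit_code, silent_failure).

    silent_failure is True when the command exits non-zero and writes
    nothing to stderr, which is the symptom described in this bug.
    """
    proc = subprocess.run(cmd, capture_output=True, text=True)
    silent_failure = proc.returncode != 0 and not proc.stderr.strip()
    return proc.returncode, silent_failure


if __name__ == "__main__":
    # Stand-in for the failing command: exits 1 and prints nothing.
    stand_in = [sys.executable, "-c", "import sys; sys.exit(1)"]
    print(run_and_classify(stand_in))
```

To use it against the reproducer, pass the command from step 5 as a list, for example `["qemu-img", "convert", "-f", "raw", "-O", "qcow2", "-c", "disk1.raw", "nbd+unix:///?socket=/tmp/nbd.sock"]`, with the qemu-nbd export from step 4 running.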
Additional info:
After Step 5
1. # qemu-img info nbd+unix:///?socket=/tmp/nbd.sock
image: nbd+unix://?socket=/tmp/nbd.sock
file format: qcow2
virtual size: 1 GiB (1073741824 bytes)
disk size: unavailable
cluster_size: 65536
Format specific information:
compat: 1.1
compression type: zlib
lazy refcounts: false
refcount bits: 16
corrupt: false
2. # qemu-img check nbd+unix:///?socket=/tmp/nbd.sock
No errors were found on the image.
Image end offset: 262144
After Step 6,
[root@hp-dl388pg8-01 bug_test]# qemu-img info nbd+unix:///?socket=/tmp/nbd.sock
image: nbd+unix://?socket=/tmp/nbd.sock
file format: qcow2
virtual size: 1 GiB (1073741824 bytes)
disk size: unavailable
cluster_size: 65536
Format specific information:
compat: 1.1
compression type: zlib
lazy refcounts: false
refcount bits: 16
corrupt: false
[root@hp-dl388pg8-01 bug_test]# qemu-img check nbd+unix:///?socket=/tmp/nbd.sock
No errors were found on the image.
1/16384 = 0.01% allocated, 0.00% fragmented, 0.00% compressed clusters
Image end offset: 393216
NeedInfo, does this bug need to be copied to 8.3.0 av?
I see Eric has posted PULL requests for qemu-5.1:
https://lists.nongnu.org/archive/html/qemu-block/2020-07/msg01706.html
https://lists.nongnu.org/archive/html/qemu-block/2020-07/msg01709.html
Once accepted, we can move this bz straight to POST with the upstream commit id in Devel Whiteboard.
Tested with qemu-kvm-5.1.0-2.module+el8.3.0+7652+b30e6901; this issue is fixed.
Test version:
kernel-4.18.0-232.el8.x86_64
qemu-kvm-5.1.0-2.module+el8.3.0+7652+b30e6901.x86_64
Steps:
1. prepare stg0.raw on nbd server.
2. create a tar file.
# tar -cvf test.tar image_convert.raw
3. export nbd img;
# qemu-nbd --socket=/tmp/nbd.sock --persistent --format=raw --offset 1536 test.tar
4. convert with compression:
# qemu-img convert -f raw -O qcow2 -c /home/kvm_autotest_root/images/stg0.raw nbd+unix:///?socket=/tmp/nbd.sock; echo $?
0
5. convert without compression:
qemu-img convert -f raw -O qcow2 -c /home/kvm_autotest_root/images/stg0.raw nbd+unix:///?socket=/tmp/nbd.sock; echo $?
0
Actual result:
1. After steps 4 and 5, the command returns 0.
2. Check img info without error:
# qemu-img info nbd+unix:///?socket=/tmp/nbd.sock
image: nbd+unix://?socket=/tmp/nbd.sock
file format: qcow2
virtual size: 20 GiB (21478375424 bytes)
disk size: unavailable
cluster_size: 65536
Format specific information:
compat: 1.1
compression type: zlib
lazy refcounts: false
refcount bits: 16
corrupt: false
3. check img without error:
# qemu-img check nbd+unix:///?socket=/tmp/nbd.sock
No errors were found on the image.
86939/327734 = 26.53% allocated, 80.51% fragmented, 79.23% compressed clusters
Image end offset: 3180789760
Expected result:
Same as the actual result.
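The cluster counts printed by qemu-img check can be cross-checked against the image geometry from qemu-img info. The sketch below assumes the denominator is the virtual size divided by the cluster size, rounded up; that assumption matches the numbers in this report, but it is inferred from the output rather than taken from the QEMU source. Both function names are hypothetical.

```python
def total_clusters(virtual_size: int, cluster_size: int) -> int:
    """Number of clusters needed to cover virtual_size (ceiling division)."""
    return -(-virtual_size // cluster_size)


def allocated_percent(allocated: int, total: int) -> float:
    """Allocated clusters as a percentage of total, rounded to two decimals."""
    return round(100.0 * allocated / total, 2)


# Values copied from the qemu-img info / qemu-img check output above.
virtual_size = 21478375424  # bytes
cluster_size = 65536        # bytes
allocated = 86939           # clusters

total = total_clusters(virtual_size, cluster_size)
print(total)
print(allocated_percent(allocated, total))
```

The same calculation applied to the 1 GiB image used in the original reproduction (1073741824 bytes, 65536-byte clusters) gives the 16384 clusters shown in the earlier check output.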
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (virt:8.3 bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2020:5137