Bug 2026747

Summary: I/O errors on f34 s390x kvm guests under f35 hypervisor
Product: [Fedora] Fedora
Reporter: Kevin Fenzi <kevin>
Component: qemu
Assignee: Fedora Virtualization Maintainers <virt-maint>
Status: CLOSED ERRATA
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 35
CC: berrange, cfergeau, crobinso, ondrejj, pampelmuse, pbonzini, philmd, rjones, smitterl, virt-maint
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: qemu-6.1.0-14.fc35
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2022-02-15 01:37:44 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
  Description                          Flags
  libvirt xml for one of the guests    none
  log for guest                        none

Description Kevin Fenzi 2021-11-25 17:51:28 UTC
kvm lpar on a z15 mainframe.

L0 is a Fedora 35 hypervisor with qemu-6.1.0-10.fc35.s390x.
The L1s are Fedora 34 guests.

No particular errors that seem related on the host, but guests get: 

[Thu Nov 25 09:04:30 2021] EXT4-fs warning (device vda3): ext4_end_bio:342: I/O error 10 writing to inode 133536 starting block 16028014)
[Thu Nov 25 09:04:30 2021] buffer_io_error: 1341 callbacks suppressed
[Thu Nov 25 09:04:30 2021] Buffer I/O error on device vda3, logical block 15371721
[Thu Nov 25 09:04:30 2021] Buffer I/O error on device vda3, logical block 15371722
[Thu Nov 25 09:04:30 2021] Buffer I/O error on device vda3, logical block 15371723
[Thu Nov 25 09:04:30 2021] Buffer I/O error on device vda3, logical block 15371724
[Thu Nov 25 09:04:30 2021] Buffer I/O error on device vda3, logical block 15371725
[Thu Nov 25 09:04:30 2021] Buffer I/O error on device vda3, logical block 15371726
[Thu Nov 25 09:04:30 2021] Buffer I/O error on device vda3, logical block 15371727
[Thu Nov 25 09:04:30 2021] Buffer I/O error on device vda3, logical block 15371728
[Thu Nov 25 09:04:30 2021] Buffer I/O error on device vda3, logical block 15371729
[Thu Nov 25 09:04:30 2021] Buffer I/O error on device vda3, logical block 15371730
[Thu Nov 25 09:04:30 2021] blk_update_request: I/O error, dev vda, sector 128224112 op 0x1:(WRITE) flags 0x4000 phys_seg 126 prio class 0
[Thu Nov 25 09:04:30 2021] blk_update_request: I/O error, dev vda, sector 128225120 op 0x1:(WRITE) flags 0x4000 phys_seg 126 prio class 0
[Thu Nov 25 09:04:30 2021] blk_update_request: I/O error, dev vda, sector 128226136 op 0x1:(WRITE) flags 0x0 phys_seg 4 prio class 0
[Thu Nov 25 09:04:30 2021] EXT4-fs warning (device vda3): ext4_end_bio:342: I/O error 10 writing to inode 133536 starting block 16028271)
[Thu Nov 25 09:04:30 2021] blk_update_request: I/O error, dev vda, sector 128226168 op 0x1:(WRITE) flags 0x4000 phys_seg 126 prio class 0
[Thu Nov 25 09:04:30 2021] blk_update_request: I/O error, dev vda, sector 128227176 op 0x1:(WRITE) flags 0x4000 phys_seg 126 prio class 0
[Thu Nov 25 09:04:30 2021] blk_update_request: I/O error, dev vda, sector 128228184 op 0x1:(WRITE) flags 0x0 phys_seg 4 prio class 0
[Thu Nov 25 09:04:30 2021] EXT4-fs warning (device vda3): ext4_end_bio:342: I/O error 10 writing to inode 133536 starting block 16028527)
[Thu Nov 25 09:04:30 2021] EXT4-fs warning (device vda3): ext4_end_bio:342: I/O error 10 writing to inode 133536 starting block 16027337)
[Thu Nov 25 09:04:30 2021] EXT4-fs warning (device vda3): ext4_end_bio:342: I/O error 10 writing to inode 133536 starting block 16028785)

(or xfs errors on xfs hosts)

After downgrading the host to qemu-5.2.0-8.fc34.s390x, everything seems fine again.

Happy to provide more info...
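
For anyone comparing guests, here is a minimal Python sketch that tallies the `blk_update_request: I/O error` lines shown above per device and operation. The regular expression is derived only from the log lines quoted in this report, so other kernel versions may format these messages differently; the function name `summarize_blk_errors` is made up for illustration.

```python
import re
from collections import Counter

# Pattern inferred from the "blk_update_request" lines quoted in this report.
# Assumption: other kernels may print these messages in a different format.
BLK_ERR = re.compile(
    r"blk_update_request: I/O error, dev (?P<dev>\S+), "
    r"sector (?P<sector>\d+) op (?P<op>0x[0-9a-f]+):\((?P<opname>\w+)\) "
    r"flags (?P<flags>0x[0-9a-f]+) phys_seg (?P<segs>\d+)"
)


def summarize_blk_errors(lines):
    """Count failed requests per (device, operation) and collect their sectors."""
    counts = Counter()
    sectors = {}
    for line in lines:
        m = BLK_ERR.search(line)
        if m is None:
            continue  # not a blk_update_request error line
        key = (m["dev"], m["opname"])
        counts[key] += 1
        sectors.setdefault(key, []).append(int(m["sector"]))
    return counts, sectors


# Three lines copied from the guest log above; the last one is not a
# blk_update_request line and is therefore skipped.
sample = [
    "[Thu Nov 25 09:04:30 2021] blk_update_request: I/O error, dev vda, "
    "sector 128224112 op 0x1:(WRITE) flags 0x4000 phys_seg 126 prio class 0",
    "[Thu Nov 25 09:04:30 2021] blk_update_request: I/O error, dev vda, "
    "sector 128226136 op 0x1:(WRITE) flags 0x0 phys_seg 4 prio class 0",
    "[Thu Nov 25 09:04:30 2021] Buffer I/O error on device vda3, "
    "logical block 15371721",
]
counts, sectors = summarize_blk_errors(sample)
for (dev, op), n in sorted(counts.items()):
    print(f"{dev} {op}: {n} failed requests")
```

The same function accepts any iterable of lines, for example the saved output of `dmesg` from an affected guest.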

Comment 1 Daniel Berrangé 2021-11-25 18:00:21 UTC
If you're using libvirt, please provide the libvirt XML config and the corresponding QEMU log file /var/log/libvirt/qemu/$GUEST.log.

If using QEMU direct, just the full QEMU command line you use.

Comment 4 Kevin Fenzi 2021-11-25 18:24:37 UTC
Created attachment 1843618 [details]
libvirt xml for one of the guests

Here's the libvirt xml for a guest. 

I did also try changing between the s390-ccw-virtio-6.1 (what f35 virt-install puts there) and s390-ccw-virtio-4.1 (what the f34 virt-install has there) and that didn't seem to matter.

Comment 5 Kevin Fenzi 2021-11-25 18:27:15 UTC
Created attachment 1843619 [details]
log for guest

Here's the /var/log/libvirt/qemu/buildvm-s390x-23.s390.fedoraproject.org.log

Comment 6 Thomas Huth 2021-11-25 18:28:03 UTC
Hi Kevin! We've seen a similar problem in downstream RHEL ... The fix is likely https://gitlab.com/qemu-project/qemu/-/commit/cc071629539dc1f303175a7 ... do you happen to know how to rebuild the package to check whether that fixes the issue for you?

Comment 7 smitterl 2021-11-25 18:29:19 UTC
(In reply to Thomas Huth from comment #6)
> Hi Kevin! We've seen a similar problem in downstream RHEL ... The fix is
> likely https://gitlab.com/qemu-project/qemu/-/commit/cc071629539dc1f303175a7
> ... do you happen to know how to rebuild the package to check whether
> that fixes the issue for you?

Agreed.
A) Could you try with qemu 6.2-rc to see if it still reproduces?
B) Can you confirm
 a) that the device attached to target 'vda' is a virtio-blk device backed by a dasd block device (/dev/dasdX)?
 b) if so, with the qemu 6.1 version you mention this should be easily reproducible without nested virtualization:
  1. attach disk /dev/dasdX as virtio-blk to the L1 guest
  2. inside the L1 guest, format the disk and create an ext4 filesystem
  3. read and write data, then check the logs for I/O errors
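
Step 3 above could be exercised with something like the following Python sketch, run inside the guest against a file on the freshly created filesystem. It is only an illustration: the function name, the 64 MiB default size, and the byte pattern are arbitrary choices, and the read-back may be served from the page cache, so the `fsync()` call and the guest kernel log remain the more reliable indicators.

```python
import os


def write_and_verify(path, size_mib=64, chunk=1 << 20):
    """Write a known pattern to `path`, force it to disk, then read it back.

    Returns True if the data read back matches what was written. An OSError
    (for example EIO) raised by write() or fsync() means the kernel reported
    an I/O error for this file.
    """
    # `chunk` is assumed to be a multiple of 256 so the pattern fills it exactly.
    pattern = bytes(range(256)) * (chunk // 256)
    with open(path, "wb") as f:
        for _ in range(size_mib):
            f.write(pattern)
        f.flush()
        os.fsync(f.fileno())  # writeback errors surface here as OSError
    with open(path, "rb") as f:
        for _ in range(size_mib):
            if f.read(chunk) != pattern:
                return False
        return f.read(1) == b""  # nothing should follow the written data
```

For example, `write_and_verify("/mnt/test/probe.bin")` (a hypothetical mount point), followed by checking `dmesg` in the guest for `blk_update_request: I/O error` lines.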

Comment 8 Daniel Berrangé 2021-11-25 19:03:54 UTC
Here is a scratch build with the mentioned upstream fix:

  https://koji.fedoraproject.org/koji/taskinfo?taskID=79265813

If you can confirm that it solves the I/O errors, I'll do a formal build.

Comment 9 Richard W.M. Jones 2021-11-25 19:28:03 UTC
I know everyone's talking about s390x and dasd's and such, but is it
possible this same bug could affect armv7 guest on aarch64 host (Fedora
Rawhide in both cases)?  For about a month I've been seeing errors
like this inside the guest, and they look very similar to this BZ.
Also Dan, if there's a scratch build including aarch64 I could test it.

[16784.102801] blk_update_request: I/O error, dev vda, sector 9689856 op 0x1:(WRITE) flags 0x800 phys_seg 16 prio class 0
[16784.130038] blk_update_request: I/O error, dev vda, sector 9690368 op 0x1:(WRITE) flags 0x4800 phys_seg 80 prio class 0
[16784.134209] blk_update_request: I/O error, dev vda, sector 9692928 op 0x1:(WRITE) flags 0x4800 phys_seg 80 prio class 0
[16784.138263] blk_update_request: I/O error, dev vda, sector 9695488 op 0x1:(WRITE) flags 0x800 phys_seg 96 prio class 0
[16784.142771] blk_update_request: I/O error, dev vda, sector 9696752 op 0x1:(WRITE) flags 0x4800 phys_seg 205 prio class 0
[16784.146855] blk_update_request: I/O error, dev vda, sector 9699312 op 0x1:(WRITE) flags 0x800 phys_seg 52 prio class 0
[16784.150877] blk_update_request: I/O error, dev vda, sector 9700136 op 0x1:(WRITE) flags 0x4800 phys_seg 157 prio class 0
[16784.155033] blk_update_request: I/O error, dev vda, sector 9702696 op 0x1:(WRITE) flags 0x4800 phys_seg 80 prio class 0
[16784.159096] blk_update_request: I/O error, dev vda, sector 9705256 op 0x1:(WRITE) flags 0x800 phys_seg 19 prio class 0
[16784.163163] blk_update_request: I/O error, dev vda, sector 9705864 op 0x1:(WRITE) flags 0x4800 phys_seg 80 prio class 0
[16784.188541] vda3: writeback error on inode 4750392, offset 43192320, sector 9682176

Comment 10 Richard W.M. Jones 2021-11-25 19:30:02 UTC
(In reply to Richard W.M. Jones from comment #9)
> I know everyone's talking about s390x and dasd's and such, but is it
> possible this same bug could affect armv7 guest on aarch64 host (Fedora
> Rawhide in both cases)?  For about a month I've been seeing errors
> like this inside the guest, and they look very similar to this BZ.

Which was filed as:
https://bugzilla.redhat.com/show_bug.cgi?id=2004120

Comment 11 Richard W.M. Jones 2021-11-25 19:37:05 UTC
Scratch build with the patch, on top of Rawhide, all arches:
https://koji.fedoraproject.org/koji/taskinfo?taskID=79266649

Comment 12 Richard W.M. Jones 2021-11-25 21:28:35 UTC
Can confirm it fixes the armv7/aarch64 problem here.

Comment 13 Richard W.M. Jones 2021-11-25 21:29:06 UTC
*** Bug 2004120 has been marked as a duplicate of this bug. ***

Comment 15 Richard W.M. Jones 2021-11-25 21:51:16 UTC
(In reply to Richard W.M. Jones from comment #14)
> Build for F35:    
> https://koji.fedoraproject.org/koji/taskinfo?taskID=79268098

The GCC / systemtap / armv7 bug of doom is now affecting F35 too.  New
F35 build including my workaround for that bug:

https://koji.fedoraproject.org/koji/taskinfo?taskID=79268858

Comment 16 Fedora Update System 2021-11-26 08:11:22 UTC
FEDORA-2021-12f6c46ad8 has been submitted as an update to Fedora 35. https://bodhi.fedoraproject.org/updates/FEDORA-2021-12f6c46ad8

Comment 17 Fedora Update System 2021-11-27 21:58:09 UTC
FEDORA-2021-12f6c46ad8 has been pushed to the Fedora 35 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --advisory=FEDORA-2021-12f6c46ad8`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2021-12f6c46ad8

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 18 Kevin Fenzi 2021-11-27 23:34:38 UTC
I've upgraded to that version of qemu and restarted a guest. Will see how it looks in a while. :) Thanks for the quick fixes.

Comment 19 Kevin Fenzi 2021-11-29 21:29:43 UTC
So far that guest is fine. :) So, looks good here...
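
To check whether a Fedora 35 host already carries at least the build named in this bug's "Fixed In Version" field (qemu-6.1.0-14.fc35), a rough Python sketch follows. It is a simplified comparison for strings shaped like the ones in this report, not rpm's real version-comparison algorithm, and the helper names are invented for illustration. It deliberately refuses to compare across Fedora releases, since the report says qemu-5.2.0-8.fc34 was not affected even though it is numerically older.

```python
import re

FIXED = "qemu-6.1.0-14.fc35"  # "Fixed In Version" from this bug


def parse_nvr(nvr):
    """Split 'qemu-6.1.0-14.fc35[.arch]' into ((6, 1, 0), 14, 'fc35')."""
    m = re.fullmatch(r"qemu-(\d+(?:\.\d+)*)-(\d+)\.(fc\d+)(?:\.\w+)?", nvr)
    if m is None:
        raise ValueError(f"unrecognized package string: {nvr!r}")
    version = tuple(int(part) for part in m.group(1).split("."))
    return version, int(m.group(2)), m.group(3)


def has_fixed_build(nvr, fixed=FIXED):
    """Return True if `nvr` is the fixed build or newer, within the same dist tag."""
    version, release, dist = parse_nvr(nvr)
    fixed_version, fixed_release, fixed_dist = parse_nvr(fixed)
    if dist != fixed_dist:
        # Simplification: other Fedora releases are out of scope for this check.
        raise ValueError(f"cannot compare {dist} against {fixed_dist}")
    return (version, release) >= (fixed_version, fixed_release)


print(has_fixed_build("qemu-6.1.0-10.fc35.s390x"))  # build from the original report
print(has_fixed_build("qemu-6.1.0-14.fc35.s390x"))  # build listed as fixed
```

The input string is what `rpm -q qemu` would typically print on the host; that mapping is an assumption here, and subpackage names may differ.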

Comment 20 Christoph Karl 2022-02-08 18:22:57 UTC
Seems to be a duplicate of this bug:
https://bugzilla.redhat.com/show_bug.cgi?id=2041227

Comment 21 Christoph Karl 2022-02-08 18:24:13 UTC
*** Bug 2041227 has been marked as a duplicate of this bug. ***

Comment 22 Fedora Update System 2022-02-11 02:14:08 UTC
FEDORA-2022-3a60c34473 has been pushed to the Fedora 35 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --advisory=FEDORA-2022-3a60c34473`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2022-3a60c34473

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 23 Fedora Update System 2022-02-15 01:37:44 UTC
FEDORA-2022-3a60c34473 has been pushed to the Fedora 35 stable repository.
If the problem still persists, please make a note of it in this bug report.