Bug 2190442

Summary: libguestfs appliance crashes: init[1]: segfault at 55a72a57e000 ip 00007f7054ab9cd7 sp 00007ffc3134f298 error 6 in libc.so.6[7f7054a28000+175000] likely on CPU 0 (core 0, socket 0)
Product: Red Hat Enterprise Linux 9
Reporter: Richard W.M. Jones <rjones>
Component: glibc
Assignee: glibc team <glibc-bugzilla>
Status: CLOSED CURRENTRELEASE
QA Contact: qe-baseos-tools-bugs
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 9.3
CC: ashankar, codonell, coli, dj, fweimer, jinzhao, lersek, mnewsome, pbonzini, pfrankli, qcheng, rjones, sipoyare, virt-maint, yoguo
Target Milestone: rc
Flags: pm-rhel: mirror+
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: glibc-2.34-67.el9
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2023-05-04 08:15:22 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 2177705
Bug Blocks: 2190387
Attachments:
Description                                Flags
build.log                                  none
root.log                                   none
__memcpy_avx_unaligned_erms disassembly    none

Description Richard W.M. Jones 2023-04-28 12:58:45 UTC
Description of problem:

The libguestfs appliance crashes when running in C9S.  The stack
trace indicates that the initramfs segfaults somewhere.

supermin: deleting initramfs files
[    3.821870] EXT4-fs (sdb): mounted filesystem without journal. Quota mode: none.
supermin: chroot
[    3.933317] init[1]: segfault at 55a72a57e000 ip 00007f7054ab9cd7 sp 00007ffc3134f298 error 6 in libc.so.6[7f7054a28000+175000] likely on CPU 0 (core 0, socket 0)
[    3.933791] Code: 00 00 c5 7d e7 8f 20 20 00 00 c5 7d e7 97 40 20 00 00 c5 7d e7 9f 60 20 00 00 c5 7d e7 a7 00 30 00 00 c5 7d e7 af 20 30 00 00 <c5> 7d e7 b7 40 30 00 00 c5 7d e7 bf 60 30 00 00 48 83 ef 80 ff c9
[    3.937628] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
[    3.937744] CPU: 0 PID: 1 Comm: init Not tainted 5.14.0-303.el9.x86_64 #1
[    3.937844] Hardware name: Red Hat KVM/RHEL, BIOS 1.16.1-1.el9 04/01/2014
[    3.938030] Call Trace:
[    3.938120]  <TASK>
[    3.938227]  dump_stack_lvl+0x34/0x48
[    3.938527]  panic+0xea/0x2e4
[    3.938557]  do_exit.cold+0x15/0x15
[    3.938575]  do_group_exit+0x2d/0x90
[    3.938595]  get_signal+0x9b5/0xa00
[    3.938618]  arch_do_signal_or_restart+0x25/0x100
[    3.938638]  ? schedule+0x5a/0xc0
[    3.938655]  exit_to_user_mode_loop+0x9c/0x130
[    3.938673]  exit_to_user_mode_prepare+0xb6/0x100
[    3.938687]  irqentry_exit_to_user_mode+0x5/0x30
[    3.938700]  asm_exc_page_fault+0x22/0x30
[    3.938818] RIP: 0033:0x7f7054ab9cd7
[    3.939014] Code: 00 00 c5 7d e7 8f 20 20 00 00 c5 7d e7 97 40 20 00 00 c5 7d e7 9f 60 20 00 00 c5 7d e7 a7 00 30 00 00 c5 7d e7 af 20 30 00 00 <c5> 7d e7 b7 40 30 00 00 c5 7d e7 bf 60 30 00 00 48 83 ef 80 ff c9
[    3.939036] RSP: 002b:00007ffc3134f298 EFLAGS: 00000203
[    3.939070] RAX: 000055a72a55e580 RBX: 0000000000000026 RCX: 000000000000000c
[    3.939087] RDX: 00000000000000e8 RSI: 000055a72a57ade0 RDI: 000055a72a57afc0
[    3.939104] RBP: 0000000000000026 R08: ffffffffffffffc0 R09: 0000000000000008
[    3.939119] R10: fffffffffffffff9 R11: 0000000000000000 R12: 000055a72a55e6b0
[    3.939134] R13: ffffffffffffff80 R14: 0000000000000030 R15: 000055a72a55e580
[    3.939205]  </TASK>
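
For readers unfamiliar with the "error 6" field in the segfault line: it is the x86 page-fault error code, printed in hex. The sketch below decodes its low bits; the function name decode_pf_error is made up for illustration. For this crash it reads as a user-mode write to a page that was not present.

```python
def decode_pf_error(code: int) -> dict:
    """Decode the low bits of an x86 page-fault error code.

    Hypothetical helper for illustration only; the bit layout is the
    architectural one (P, W/R, U/S, RSVD, I/D).
    """
    return {
        "page_present": bool(code & 0x01),       # 0 = page not present, 1 = protection violation
        "write": bool(code & 0x02),              # 0 = read access, 1 = write access
        "user_mode": bool(code & 0x04),          # 0 = kernel mode, 1 = user mode
        "reserved_bit": bool(code & 0x08),       # reserved bit set in a paging structure
        "instruction_fetch": bool(code & 0x10),  # fault happened on an instruction fetch
    }


if __name__ == "__main__":
    # "error 6" from the segfault line above
    for name, value in decode_pf_error(0x6).items():
        print(f"{name}: {value}")
```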

Version-Release number of selected component (if applicable):

kernel 5.14.0-303.el9.x86_64
supermin 5.3.3-1.el9
libguestfs-1:1.50.1-3.el9

How reproducible:

100% in brew, but not locally

Comment 1 Richard W.M. Jones 2023-04-28 12:59:21 UTC
Created attachment 1960828 [details]
build.log

Comment 2 Richard W.M. Jones 2023-04-28 12:59:48 UTC
Created attachment 1960829 [details]
root.log

Comment 3 Richard W.M. Jones 2023-04-28 14:01:38 UTC
This happens under TCG.  Reproducer:

$ LIBGUESTFS_BACKEND_SETTINGS=force_tcg libguestfs-test-tool

Comment 4 Richard W.M. Jones 2023-04-28 14:04:57 UTC
Yongkui, would you mind testing again if bug 2179033 has really been
fixed, and whether this might be a dupe of that bug?  I am supposed to
be using the new qemu package which is supposed to fix this ...

Comment 5 Laszlo Ersek 2023-05-02 14:46:26 UTC
According to <https://defuse.ca/online-x86-assembler.htm>, the byte string "c5 7d e7 b7 40 30 00" corresponds to the instruction "vmovntdq YMMWORD PTR [rdi+0x3040],ymm14". According to <https://www.felixcloutier.com/x86/movntdq>, this is "Store Packed Integers Using Non-Temporal Hint", it belongs to the AVX family, and due to the "non-temporal hint", it could be related to crypto.
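
The decoding can also be checked by hand. Below is a minimal sketch that handles only this one encoding (2-byte VEX prefix, opcode E7, ModRM with a 32-bit displacement and no SIB byte); the function name decode_vex2_movntdq is made up for illustration. Note that the full instruction is eight bytes, because the displacement "40 30 00 00" is four bytes long. Adding that displacement to RDI from the register dump gives exactly the faulting address.

```python
import struct

REGS64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]


def decode_vex2_movntdq(code: bytes) -> str:
    """Decode one VEX-encoded (V)MOVNTDQ store; illustrative, not a general decoder."""
    if code[0] != 0xC5:
        raise ValueError("not a 2-byte VEX prefix")
    vex = code[1]
    r = ((vex >> 7) & 1) ^ 1   # VEX.R is stored inverted; extends ModRM.reg
    vec_256 = (vex >> 2) & 1   # VEX.L: 1 = 256-bit (ymm), 0 = 128-bit (xmm)
    pp = vex & 3               # 01 = implied 0x66 prefix
    if code[2] != 0xE7 or pp != 1:
        raise ValueError("not VEX.66.0F E7 (vmovntdq)")
    modrm = code[3]
    mod = modrm >> 6
    reg = ((modrm >> 3) & 7) | (r << 3)
    rm = modrm & 7
    if mod != 2 or rm == 4:
        raise ValueError("only [reg + disp32] without SIB is handled here")
    disp = struct.unpack_from("<i", code, 4)[0]  # little-endian signed disp32
    vec = "ymm" if vec_256 else "xmm"
    return f"vmovntdq [{REGS64[rm]}{disp:+#x}], {vec}{reg}"


if __name__ == "__main__":
    print(decode_vex2_movntdq(bytes.fromhex("c57de7b740300000")))
    # RDI from the register dump plus the displacement is the faulting address.
    rdi = 0x000055A72A57AFC0
    print(hex(rdi + 0x3040))
```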

According to root.log (comment 2), the glibc version is 2.34-66.el9. The file "/usr/lib/debug/lib64/libc.so.6-2.34-66.el9.x86_64.debug" is provided by package "glibc-debuginfo-2.34-66.el9.x86_64".

According to the dmesg, the fault occurs at the following offset in glibc: IP (00007f7054ab9cd7) - base (7f7054a28000) = 91CD7.
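
The subtraction can be scripted. This is a rough sketch that pulls the fields out of the segfault line with a regular expression and subtracts the bracketed start address from the instruction pointer; the pattern and the name parse_segfault are made up for illustration and only cover the line format shown in this bug.

```python
import re

# Matches the x86 "segfault at ... ip ... sp ... error ... in obj[start+size]" line.
SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) "
    r"error (?P<err>[0-9a-f]+) in (?P<obj>[^\[\s]+)\[(?P<start>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def parse_segfault(line: str) -> dict:
    """Return the numeric fields of a segfault line as integers, plus the object name."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        raise ValueError("not a segfault line")
    fields = {k: int(v, 16) for k, v in m.groupdict().items() if k != "obj"}
    fields["obj"] = m["obj"]
    return fields


if __name__ == "__main__":
    line = (
        "[    3.933317] init[1]: segfault at 55a72a57e000 ip 00007f7054ab9cd7 "
        "sp 00007ffc3134f298 error 6 in libc.so.6[7f7054a28000+175000] "
        "likely on CPU 0 (core 0, socket 0)"
    )
    f = parse_segfault(line)
    # Offset of the faulting instruction relative to the bracketed start address.
    print(f["obj"], hex(f["ip"] - f["start"]))
```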

addr2line -p -i -f -e /usr/lib/debug/lib64/libc.so.6-2.34-66.el9.x86_64.debug 0x91CD7

__GI__IO_wfile_seekoff at /usr/src/debug/glibc-2.34-66.el9.x86_64/libio/wfileops.c:744

    742 off64_t
    743 _IO_wfile_seekoff (FILE *fp, off64_t offset, int dir, int mode)
    744 {
    745   off64_t result;
    746   off64_t delta, new_offset;
    747   long int count;
    748 

Ugh... this doesn't add up :(

Comment 6 Laszlo Ersek 2023-05-03 09:03:16 UTC
Paolo,

is it possible that TCG mishandles the following instruction (from discussion with Rich):


   b9cd7:       c5 7d e7 b7 40 30 00    vmovntdq %ymm14,0x3040(%rdi)
   b9cde:       00 

(Rich located this in __memcpy_avx_unaligned_erms in libc).

It does not seem like a "BMI" operation, hence bug 2179033 does not immediately appear relevant. The instruction is still part of one of the AVX families, and I seem to recall earlier problems there.

The simple reproducer (from comment 3) is:

LIBGUESTFS_BACKEND_SETTINGS=force_tcg libguestfs-test-tool

The symptom is that the appliance crashes (from comment 0):

[    3.821870] EXT4-fs (sdb): mounted filesystem without journal. Quota mode: none.
supermin: chroot
[    3.933317] init[1]: segfault at 55a72a57e000 ip 00007f7054ab9cd7 sp 00007ffc3134f298 error 6 in libc.so.6[7f7054a28000+175000] likely on CPU 0 (core 0, socket 0)
[    3.933791] Code: 00 00 c5 7d e7 8f 20 20 00 00 c5 7d e7 97 40 20 00 00 c5 7d e7 9f 60 20 00 00 c5 7d e7 a7 00 30 00 00 c5 7d e7 af 20 30 00 00 <c5> 7d e7 b7 40 30 00 00 c5 7d e7 bf 60 30 00 00 48 83 ef 80 ff c9
[    3.937628] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
[    3.937744] CPU: 0 PID: 1 Comm: init Not tainted 5.14.0-303.el9.x86_64 #1
[    3.937844] Hardware name: Red Hat KVM/RHEL, BIOS 1.16.1-1.el9 04/01/2014
[    3.938030] Call Trace:
[    3.938120]  <TASK>
[    3.938227]  dump_stack_lvl+0x34/0x48
[    3.938527]  panic+0xea/0x2e4
[    3.938557]  do_exit.cold+0x15/0x15
[    3.938575]  do_group_exit+0x2d/0x90
[    3.938595]  get_signal+0x9b5/0xa00
[    3.938618]  arch_do_signal_or_restart+0x25/0x100
[    3.938638]  ? schedule+0x5a/0xc0
[    3.938655]  exit_to_user_mode_loop+0x9c/0x130
[    3.938673]  exit_to_user_mode_prepare+0xb6/0x100
[    3.938687]  irqentry_exit_to_user_mode+0x5/0x30
[    3.938700]  asm_exc_page_fault+0x22/0x30
[    3.938818] RIP: 0033:0x7f7054ab9cd7
[    3.939014] Code: 00 00 c5 7d e7 8f 20 20 00 00 c5 7d e7 97 40 20 00 00 c5 7d e7 9f 60 20 00 00 c5 7d e7 a7 00 30 00 00 c5 7d e7 af 20 30 00 00 <c5> 7d e7 b7 40 30 00 00 c5 7d e7 bf 60 30 00 00 48 83 ef 80 ff c9
[    3.939036] RSP: 002b:00007ffc3134f298 EFLAGS: 00000203
[    3.939070] RAX: 000055a72a55e580 RBX: 0000000000000026 RCX: 000000000000000c
[    3.939087] RDX: 00000000000000e8 RSI: 000055a72a57ade0 RDI: 000055a72a57afc0
[    3.939104] RBP: 0000000000000026 R08: ffffffffffffffc0 R09: 0000000000000008
[    3.939119] R10: fffffffffffffff9 R11: 0000000000000000 R12: 000055a72a55e6b0
[    3.939134] R13: ffffffffffffff80 R14: 0000000000000030 R15: 000055a72a55e580
[    3.939205]  </TASK>

The glibc version containing the "offending" instruction encoding is 2.34-66.el9.

I've grepped the QEMU tree for "vmovntdq", at version 7.2.1 anyway. At that version, the only hits are in "tests/tcg/i386/x86.csv", originating from commit 91117bc546b1 ("tests/tcg: i386: add SSE tests", 2022-09-01). The two related test cases appear to be

"VMOVNTDQ m256, ymm1","VMOVNTDQ ymm1, m256","vmovntdq ymm1, m256","EVEX.256.66.0F.W0 E7 /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale32","w,r","",""
"VMOVNTDQ m256, ymm1","VMOVNTDQ ymm1, m256","vmovntdq ymm1, m256","VEX.256.66.0F.WIG E7 /r","V","V","AVX","modrm_memonly","w,r","",""

Comment 8 Laszlo Ersek 2023-05-03 09:13:33 UTC
The qemu-kvm package version is 8.0.0-1.el9, from "root.log" (comment 2).

Comment 9 Richard W.M. Jones 2023-05-03 09:37:42 UTC
Created attachment 1961894 [details]
__memcpy_avx_unaligned_erms disassembly

This is the disassembly of __memcpy_avx_unaligned_erms, the glibc function
where the crash happens.  The crash location is at b9cd7.

This is from glibc-2.34-66.el9.x86_64

Comment 10 Laszlo Ersek 2023-05-03 09:56:43 UTC
Rich, can you please check / paste the disassembly around offset 0x91CD7? Does that area look related at all? Thanks.

Comment 11 Richard W.M. Jones 2023-05-03 10:11:22 UTC
(In reply to Laszlo Ersek from comment #10)
> Rich, can you please check / paste the disassembly around offset 0x91CD7?
> Does that area look related at all? Thanks.

That is the _IO_wfile_seekoff which seems completely unrelated.  I put the
full glibc disassembly (with interspersed source) here:

http://oirase.annexia.org/tmp/bz2190442/

Comment 13 Richard W.M. Jones 2023-05-03 13:44:49 UTC
To save anyone else the trouble, git bisect just points to this commit where we
enabled AVX bits in CPUID for TCG, so not interesting:

  2f8a21d8ff3af484a37edc8ea61d127ec1529ab5 is the first bad commit
  commit 2f8a21d8ff3af484a37edc8ea61d127ec1529ab5
  Author: Paul Brook
  Date:   Sun Apr 24 23:02:01 2022 +0100

    target/i386: Enable AVX cpuid bits when using TCG

Comment 14 YongkuiGuo 2023-05-04 07:38:46 UTC
(In reply to Richard W.M. Jones from comment #4)
> Yongkui, would you mind testing again if bug 2179033 has really been
> fixed, and whether this might be a dupe of that bug?  I am supposed to
> be using the new qemu package which is supposed to fix this ...

Thanks for the investigation. Seems like this issue is not a duplicate of bug 2179033. After upgrading glibc to 2.34-67 (https://brewweb.engineering.redhat.com/brew/buildinfo?buildID=2484520), this issue is fixed.

Tested with the following packages:
glibc-2.34-67.el9.x86_64
qemu-kvm-8.0.0-1.el9.x86_64
kernel-5.14.0-306.el9.x86_64
libguestfs-1.50.1-3.el9.x86_64


$ LIBGUESTFS_BACKEND_SETTINGS=force_tcg libguestfs-test-tool
     ************************************************************
     *                    IMPORTANT NOTICE
     *
     * When reporting bugs, include the COMPLETE, UNEDITED
     * output below in your bug report.
     *
     ************************************************************
LIBGUESTFS_BACKEND_SETTINGS=force_tcg
PATH=/home/yoguo/.local/bin:/home/yoguo/bin:/root/.local/bin:/root/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin
XDG_RUNTIME_DIR=/run/user/0
SELinux: Enforcing
guestfs_get_append: (null)
guestfs_get_autosync: 1
guestfs_get_backend: libvirt
guestfs_get_backend_settings: [force_tcg]
guestfs_get_cachedir: /var/tmp
guestfs_get_hv: /usr/libexec/qemu-kvm
guestfs_get_memsize: 1280
guestfs_get_network: 0
guestfs_get_path: /usr/lib64/guestfs
guestfs_get_pgroup: 0
guestfs_get_program: libguestfs-test-tool
guestfs_get_recovery_proc: 1
guestfs_get_smp: 1
guestfs_get_sockdir: /run/user/0
guestfs_get_tmpdir: /tmp
guestfs_get_trace: 0
guestfs_get_verbose: 1
host_cpu: x86_64
Launching appliance, timeout set to 600 seconds.
libguestfs: launch: program=libguestfs-test-tool
libguestfs: launch: version=1.50.1rhel=9,release=3.el9,libvirt
...
supermin: deleting initramfs files
supermin: chroot
Starting /init script ...
+ [[ panic=1 console=ttyS0 edd=off udevtimeout=6000 udev.event-timeout=6000 no_timer_check printk.time=1 cgroup_disable=memory usbcore.nousb cryptomgr.notests tsc=reliable 8250.nr_uarts=1 root=UUID=7df3f816-819b-4655-bd3b-92e0bd154753 selinux=0 guestfs_verbose=1 TERM=xterm-256color == *guestfs_network=1* ]]
+ [[ panic=1 console=ttyS0 edd=off udevtimeout=6000 udev.event-timeout=6000 no_timer_check printk.time=1 cgroup_disable=memory usbcore.nousb cryptomgr.notests tsc=reliable 8250.nr_uarts=1 root=UUID=7df3f816-819b-4655-bd3b-92e0bd154753 selinux=0 guestfs_verbose=1 TERM=xterm-256color == *guestfs_rescue=1* ]]
+ [[ panic=1 console=ttyS0 edd=off udevtimeout=6000 udev.event-timeout=6000 no_timer_check printk.time=1 cgroup_disable=memory usbcore.nousb cryptomgr.notests tsc=reliable 8250.nr_uarts=1 root=UUID=7df3f816-819b-4655-bd3b-92e0bd154753 selinux=0 guestfs_verbose=1 TERM=xterm-256color == *guestfs_noreboot=1* ]]
+ [[ panic=1 console=ttyS0 edd=off udevtimeout=6000 udev.event-timeout=6000 no_timer_check printk.time=1 cgroup_disable=memory usbcore.nousb cryptomgr.notests tsc=reliable 8250.nr_uarts=1 root=UUID=7df3f816-819b-4655-bd3b-92e0bd154753 selinux=0 guestfs_verbose=1 TERM=xterm-256color == *guestfs_boot_analysis=1* ]]
+ mkdir -p /dev/pts /dev/shm
+ mount -t devpts /dev/pts /dev/pts
+ mount -t tmpfs -o mode=1777 shmfs /dev/shm
...
===== TEST FINISHED OK =====

Comment 15 Richard W.M. Jones 2023-05-04 08:08:47 UTC
Thanks.  This was probably caused by the following glibc bug, not qemu:

https://sourceware.org/bugzilla/show_bug.cgi?id=29953

Comment 16 Richard W.M. Jones 2023-05-04 08:15:22 UTC
Fixed in glibc-2.34-67.el9

Comment 19 Laszlo Ersek 2023-05-04 11:18:51 UTC
(In reply to Richard W.M. Jones from comment #11)
> (In reply to Laszlo Ersek from comment #10)
> > Rich, can you please check / paste the disassembly around offset 0x91CD7?
> > Does that area look related at all? Thanks.
> 
> That is the _IO_wfile_seekoff which seems completely unrelated.  I put the
> full glibc disassembly (with interspersed source) here:
> 
> http://oirase.annexia.org/tmp/bz2190442/

Hmmm.

I realize that this problem has now been solved, but I'm not happy that my calculation of offset 0x91CD7 was wrong. It means that I don't understand the calculation well.

The original crash message was

[    3.933317] init[1]: segfault at 55a72a57e000 ip 00007f7054ab9cd7 sp 00007ffc3134f298 error 6 in libc.so.6[7f7054a28000+175000] likely on CPU 0 (core 0, socket 0)

- Address 55a72a57e000 was what PID 1 tried to access.
- The instruction was at virtual address 0x00007f7054ab9cd7.
- The stack pointer is now irrelevant.
- libc had been mapped into the process starting at virtual address 0x7f7054a28000, for length 0x175000.
- Therefore the offset into libc (at which the offending instruction was) was 0x00007f7054ab9cd7-0x7f7054a28000=0x91CD7.

However, from the libc disassembly, that offset does not point to bytes "c5 7d e7 b7 40 30 00" (aka "vmovntdq %ymm14,0x3040(%rdi)").

The first occurrence of that byte string / instruction is at offset 0xb9cd7, in function __memcpy_avx_unaligned_erms. Note that the page-relative offset is the same!

0xb9cd7 - 0x91CD7 = 0x28000

This means that my calculation, based on the dmesg, was off by 0x28 = 40 (decimal) pages. And I don't understand why.

Should I have offset the difference 0x91CD7 by some VMA or LMA, from the "objdump --headers" output? (E.g. the .text section?)

Comment 20 Paolo Bonzini 2023-05-04 12:39:17 UTC
> This means that my calculation, based on the dmesg, was off by 0x28 = 40 (decimal) pages. And I don't understand why.

> Should I have offset the difference 0x91CD7 by some VMA or LMA, from the "objdump --headers" output? (E.g. the .text section?)

Yes, probably. It also doesn't seem to be a coincidence that the bottom bits of the address are 0x28000.
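
A possible explanation, offered as a sketch rather than a confirmed diagnosis: the bracketed range in the segfault line appears to describe only the mapping (VMA) that contains the instruction pointer, i.e. libc's executable segment, not the start of the whole library. If that segment maps the file starting at offset 0x28000, the file offset is (ip - mapping start) + 0x28000. The value 0x28000 below is inferred from the observed difference, not read from the binary; it could be checked against the executable LOAD segment in "readelf -lW" output, or against the offset column of /proc/PID/maps on a live process. The function name file_offset is made up for illustration.

```python
PAGE_SIZE = 0x1000


def file_offset(ip: int, vma_start: int, vma_file_offset: int) -> int:
    """Translate a faulting IP into an offset within the mapped file.

    vma_file_offset is the file offset at which the containing mapping
    starts (an assumption here, see the lead-in).
    """
    return ip - vma_start + vma_file_offset


if __name__ == "__main__":
    ip = 0x00007F7054AB9CD7
    vma_start = 0x7F7054A28000        # start address from the dmesg line
    naive = ip - vma_start            # the original calculation
    observed = 0xB9CD7                # where the instruction was found in the disassembly

    delta = observed - naive
    print(hex(naive), hex(delta), delta // PAGE_SIZE)  # difference in bytes and in pages

    # Assumed: the executable segment starts at file offset 0x28000.
    assumed_pgoff = 0x28000
    print(hex(file_offset(ip, vma_start, assumed_pgoff)))

    # Under the same assumption, the library's load base ends in five zero
    # hex digits, which would match the "bottom bits" remark above.
    print(hex(vma_start - assumed_pgoff))
```

If the assumption holds, and the segment's virtual address equals its file offset (as is common for shared objects), then 0xb9cd7 is also the address to pass to addr2line or to look up in objdump output.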

(By the way thanks for identifying the glibc bug. This kind of issue where the emulator code is legit but the OS code doesn't take weird cpuid into account is one of the worst).