Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.

Bug 1970945

Summary:

systemd coredump in message_extend_fields -> realloc

Product:

Red Hat Enterprise Linux 8

Reporter:

Michele Baldessari <michele>

Component:

systemd

Assignee:

Michal Sekletar <msekleta>

Status:

CLOSED ERRATA

QA Contact:

Frantisek Sumsal <fsumsal>

Severity:

urgent

Docs Contact:

Priority:

urgent

Version:

8.6

CC:

achernet, adam.winberg, bdobreli, bwelterl, byodlows, dciabrin, ealcaniz, ffotorel, fkrska, fsun, gerald.prock, gnaik, h13958451065, horst.thaller, jamacku, jcalhoun, jeckersb, jwennerberg, mmatsuya, msekleta, ohudlick, peter.vreman, pratshar, qguo, rhodain, rmetrich, sbalasub, sbroz, shobbs, siggib, skolosov, smeyer, soutteri, stanislav.moravec, systemd-maint-list, tartosh, tcrider, toneata, yferszt, yoliynyk, yvan.broccard, yzheng

Target Milestone:

Keywords:

Triaged, ZStream

Target Release:

---

Flags:

pm-rhel: mirror+

Hardware:

Unspecified

OS:

Unspecified

Whiteboard:

Fixed In Version:

systemd-239-53.el8

Doc Type:

If docs needed, set a value

Doc Text:

Story Points:

---

Clone Of:

Clones:

2024903 2026244 (view as bug list)

Environment:

Last Closed:

2022-05-10 15:25:47 UTC

Type:

Bug

Regression:

---

Mount Type:

---

Documentation:

---

CRM:

Verified Versions:

Category:

---

oVirt Team:

---

RHEL 7.3 requirements from Atomic Host:

Cloudforms Team:

---

Target Upstream Version:

Embargoed:

Bug Depends On:

Bug Blocks:

2024903, 2026244

Attachments:

Description	Flags
Patch for dracut to let systemd ASAN start properly in the initramfs as well	none

Description Michele Baldessari 2021-06-11 14:19:28 UTC

Description of problem:
today we observed a coredump of systemd on a test ceph node (running in a rhel8.4 VM). The machine had been booted a day before and was relatively idle (besides a couple of ceph containers running).

[root@ceph-1 coredump]# coredumpctl info
Failed to check if any systemd-coredump@.service units are running: Connection timed out
           PID: 30920 (systemd)
           UID: 0 (root)
           GID: 0 (root)
        Signal: 11 (SEGV)
     Timestamp: Fri 2021-06-11 07:19:38 UTC (6h ago)
  Command Line: /usr/lib/systemd/systemd --switched-root --system --deserialize 17
    Executable: /usr/lib/systemd/systemd
 Control Group: /init.scope
          Unit: init.scope
         Slice: -.slice
       Boot ID: 042badcdc3d64775bd0b9f227bd23172
    Machine ID: 84415a69e17940f58f15b2a01ec76373
      Hostname: ceph-1
       Storage: /var/lib/systemd/coredump/core.systemd.0.042badcdc3d64775bd0b9f227bd23172.30920.1623395978000000.lz4
       Message: Process 30920 (systemd) of user 0 dumped core.

                Stack trace of thread 30920:
                #0  0x00007f072a82967b kill (libc.so.6)
                #1  0x0000560b02df2f7a crash (systemd)
                #2  0x00007f072abc9b20 __restore_rt (libpthread.so.0)
                #3  0x00007f072a876264 _int_malloc (libc.so.6)
                #4  0x00007f072a8773af _int_realloc (libc.so.6)
                #5  0x00007f072a87867b realloc (libc.so.6)
                #6  0x00007f072c006076 message_extend_fields (libsystemd-shared-239.so)
                #7  0x00007f072c00644f message_append_field_string (libsystemd-shared-239.so)
                #8  0x00007f072c0089a0 message_new_reply (libsystemd-shared-239.so)
                #9  0x00007f072c00a312 sd_bus_message_new_method_error (libsystemd-shared-239.so)
                #10 0x00007f072bff913e sd_bus_reply_method_error (libsystemd-shared-239.so)
                #11 0x00007f072bff9653 sd_bus_reply_method_errno (libsystemd-shared-239.so)
                #12 0x00007f072c002f1d bus_maybe_reply_error (libsystemd-shared-239.so)
                #13 0x00007f072c016340 object_find_and_run (libsystemd-shared-239.so)
                #14 0x00007f072c017afc bus_process_object (libsystemd-shared-239.so)
                #15 0x00007f072c026a2a bus_process_internal (libsystemd-shared-239.so)
                #16 0x00007f072c02871c io_callback (libsystemd-shared-239.so)
                #17 0x00007f072c04fdf9 source_dispatch (libsystemd-shared-239.so)
                #18 0x00007f072c051bed sd_event_dispatch (libsystemd-shared-239.so)
                #19 0x00007f072c051df8 sd_event_run (libsystemd-shared-239.so)
                #20 0x0000560b02e37ab2 manager_loop (systemd)
                #21 0x0000560b02dee471 main (systemd)
                #22 0x00007f072a815493 __libc_start_main (libc.so.6)
                #23 0x0000560b02df00ee _start (systemd)

(gdb) bt
#0  0x00007f072a82967b in kill () at ../sysdeps/unix/syscall-template.S:78
#1  0x0000560b02df2f7a in crash (sig=11) at ../src/core/main.c:194
#2  <signal handler called>
#3  _int_malloc (av=av@entry=0x7f072abb1bc0 <main_arena>, bytes=bytes@entry=33) at malloc.c:3655
#4  0x00007f072a8773af in _int_realloc (av=av@entry=0x7f072abb1bc0 <main_arena>, oldp=oldp@entry=0x560b03441770, oldsize=oldsize@entry=32, nb=nb@entry=48) at malloc.c:4612
#5  0x00007f072a87867b in __GI___libc_realloc (oldmem=0x560b03441780, bytes=bytes@entry=40) at malloc.c:3238
#6  0x00007f072c006076 in message_extend_fields (m=m@entry=0x560b034460b0, align=align@entry=8, sz=sz@entry=16, add_offset=add_offset@entry=false) at ../src/libsystemd/sd-bus/bus-message.c:163
#7  0x00007f072c00644f in message_append_field_string (m=0x560b034460b0, h=h@entry=6, type=type@entry=115 's', s=0x560b0348b100 ":1.1197", ret=0x560b034460e0) at ../src/libsystemd/sd-bus/bus-message.c:252
#8  0x00007f072c0089a0 in message_new_reply (call=0x560b0347c9d0, type=<optimized out>, m=0x7fff0ad56d50) at ../src/libsystemd/sd-bus/bus-message.c:725
#9  0x00007f072c00a312 in sd_bus_message_new_method_error (call=call@entry=0x560b0347c9d0, m=m@entry=0x7fff0ad56d80, e=e@entry=0x7fff0ad56f00) at ../src/libsystemd/sd-bus/bus-message.c:755
#10 0x00007f072bff913e in sd_bus_reply_method_error (call=0x560b0347c9d0, e=0x7fff0ad56f00) at ../src/libsystemd/sd-bus/bus-convenience.c:181
#11 0x00007f072bff9653 in sd_bus_reply_method_errno (call=0x560b0347c9d0, error=-3, p=0x7fff0ad56f00) at ../src/libsystemd/sd-bus/bus-convenience.c:236
#12 0x00007f072c002f1d in bus_maybe_reply_error (m=m@entry=0x560b0347c9d0, r=r@entry=-3, error=error@entry=0x7fff0ad56f00) at ../src/libsystemd/sd-bus/bus-internal.c:341
#13 0x00007f072c016340 in method_callbacks_run (found_object=0x7fff0ad56fb7, require_fallback=false, c=<optimized out>, m=0x560b0347c9d0, bus=0x560b03435a90) at ../src/libsystemd/sd-bus/bus-objects.c:409
#14 object_find_and_run (bus=0x560b03435a90, m=0x560b0347c9d0, p=0x560b0348b058 "/org/freedesktop/systemd1", require_fallback=false, found_object=0x7fff0ad56fb7) at ../src/libsystemd/sd-bus/bus-objects.c:1268
#15 0x00007f072c017afc in bus_process_object (bus=bus@entry=0x560b03435a90, m=m@entry=0x560b0347c9d0) at ../src/libsystemd/sd-bus/bus-objects.c:1388
#16 0x00007f072c026a2a in process_message (m=0x560b0347c9d0, bus=0x560b03435a90) at ../src/libsystemd/sd-bus/sd-bus.c:2705
#17 process_running (ret=0x0, priority=<optimized out>, hint_priority=<optimized out>, bus=0x560b03435a90) at ../src/libsystemd/sd-bus/sd-bus.c:2747
#18 bus_process_internal (bus=0x560b03435a90, hint_priority=<optimized out>, priority=<optimized out>, ret=0x0) at ../src/libsystemd/sd-bus/sd-bus.c:2966
#19 0x00007f072c02871c in io_callback (s=<optimized out>, fd=<optimized out>, revents=<optimized out>, userdata=0x560b03435a90) at ../src/libsystemd/sd-bus/sd-bus.c:3346
#20 0x00007f072c04fdf9 in source_dispatch (s=0x560b03402de0) at ../src/libsystemd/sd-event/sd-event.c:3365
#21 0x00007f072c051bed in sd_event_dispatch (e=0x560b032c1000) at ../src/libsystemd/sd-event/sd-event.c:3784
#22 0x00007f072c051df8 in sd_event_run (e=0x560b032c1000, timeout=18446744073709551615) at ../src/libsystemd/sd-event/sd-event.c:3844
#23 0x0000560b02e37ab2 in manager_loop (m=m@entry=0x560b032c0a70) at ../src/core/manager.c:2884
#24 0x0000560b02dee471 in invoke_main_loop (ret_error_message=0x7fff0ad57378, ret_switch_root_init=<synthetic pointer>, ret_switch_root_dir=<synthetic pointer>, ret_fds=0x7fff0ad57388, ret_shutdown_verb=<synthetic pointer>,
    ret_retval=<synthetic pointer>, ret_reexecute=<synthetic pointer>, saved_rlimit_memlock=0x7fff0ad573f0, saved_rlimit_nofile=0x7fff0ad573e0, m=0x560b032c0a70) at ../src/core/main.c:1665
#25 main (argc=<optimized out>, argv=0x7fff0ad57658) at ../src/core/main.c:2606


(gdb) fr 6
#6  0x00007f072c006076 in message_extend_fields (m=m@entry=0x560b034460b0, align=align@entry=8, sz=sz@entry=16, add_offset=add_offset@entry=false) at ../src/libsystemd/sd-bus/bus-message.c:163
163                     np = realloc(m->header, ALIGN8(new_size));
(gdb) p m->header
$1 = (struct bus_header *) 0x560b03441780
(gdb) p *( m->header)
$2 = {endian = 108 'l', type = 3 '\003', flags = 1 '\001', version = 1 '\001', {dbus1 = {body_size = 0, serial = 0, fields_size = 0}, dbus2 = {_reserved = 0, cookie = 0}}}
(gdb)


Version-Release number of selected component (if applicable):
systemd-239-45.el8.x86_64

Full coredump file and sosreport (taken after a VM reset as the system was fairly hosed) are here:
http://file.rdu.redhat.com/~mbaldess/systemd-coredump-rhel-8.4/

(Note we've only seen this once so far, so there is no way to easily reproduce this)

Comment 2 Renaud Métrich 2021-06-30 19:03:10 UTC

May be similar to BZ #1976833 (cannot tell since it's a memory coruption in malloc structures).

Comment 23 Michal Sekletar 2021-08-23 11:39:41 UTC

I've prepared the test build of systemd (latest RHEL-8.4) that is instrumented using AdressSanitizer and UndefinedBehaviorSanitizer,

https://msekleta.fedorapeople.org/systemd-rhel84-asan-ubsan/

To install the test packages run,

yum-config-manager --add-repo https://msekleta.fedorapeople.org/systemd-rhel84-asan-ubsan/
yum --nogpgcheck update -y systemd

After installing test rpms it is necessary to set sanitizer options (passed to systemd as env vars) on kernel command line and reboot.

grubby --update-kernel ALL --args='ASAN_OPTIONS=strict_string_checks=1:detect_stack_use_after_return=1:check_initialization_order=1:strict_init_order=1:halt_on_error=0:log_path=/run/log/systemd-asan.log'
grubby --update-kernel ALL --args='UBSAN_OPTIONS=print_stacktrace=1:print_summary=1:halt_on_error=0:log_path=/run/log/systemd-ubsan.log'

Note that any memory or undefined behavior related errors will be logged to files /run/log/systemd-asan.log.1 and /run/log/systemd-ubsan.log.1. Whoever is going to install the builds should setup monitoring and watch out for presence of those files as their existence means we have triggered systemd bug.

Comment 27 Renaud Métrich 2021-09-10 05:50:52 UTC

*** Bug 1976833 has been marked as a duplicate of this bug. ***

Comment 29 Renaud Métrich 2021-09-14 13:17:45 UTC

HEADS UP:

The ASAN enabled systemd will crash on boot if embedded in the initramfs.
This is due to libasan trying to read /proc/self/mountinfo, which fails since /proc is not mounted yet:
~~~
==1==AddressSanitizer CHECK failed: ../../../../libsanitizer/sanitizer_common/sanitizer_procmaps_common.cc:75 "((data_.proc_self_maps.len)) > ((0))" (0x0, 0x0)
    <empty stack>

X.XXXXXXX Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000100
[...]
~~~

I've submitted a dracut PR to "fix" this: https://github.com/dracutdevs/dracut/pull/1602

Meanwhile, if you hit this, the recovery steps are to perform the following operations:

1. Boot with "rdinit=/bin/sh"

2. At the prompt, mount /proc

  # mount -t proc proc /proc

3. Execute systemd

  # exec /usr/lib/systemd/systemd

Comment 30 Renaud Métrich 2021-09-14 13:22:54 UTC

Attached is a patch for dracut to make sure that systemd can load properly:

# patch /usr/lib/dracut/modules.d/00systemd/module-setup.sh 00systemd.patch
patching file /usr/lib/dracut/modules.d/00systemd/module-setup.sh

Comment 31 Renaud Métrich 2021-09-14 13:23:56 UTC

Created attachment 1822995 [details]
Patch for dracut to let systemd ASAN start properly in the initramfs as well

Comment 36 Sigurdur Bjornsson 2021-10-04 23:39:04 UTC

I have seen two systemd crashes on Rocky Linux 8.4, running systemd-239-45.el8_4.3.x86_64.
The backtrace isn't identical but it's similar and also points to sd_bus message handling.

I have encountered this problem on 2 VMs, one in VMware (2vCPU/4GB) and the other in RHV (4vCPU/32GB).
First occurred in the middle of the day, second in the middle of the night, both under idle conditions.
On both accounts this seems to have been triggered by a sysstat run.

Here you can see the journal and a backtrace.
https://pastebin.com/5yBSmmbP

Comment 40 Jamie Burchell 2021-10-05 10:23:43 UTC

Encountered this systemd crash on DigitalOcean Rocky Linux 8.4:

[root@test]# systemctl --version
systemd 239 (239-45.el8_4.3)
+PAM +AUDIT +SELINUX +IMA -APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD +IDN2 -IDN +PCRE2 default-hierarchy=legacy

[root@test]# coredumpctl info
Failed to check if any systemd-coredump@.service units are running: Connection timed out
           PID: 195921 (systemd)
           UID: 0 (root)
           GID: 0 (root)
        Signal: 6 (ABRT)
     Timestamp: Sun 2021-10-03 20:56:55 BST (1 day 14h ago)
  Command Line: /usr/lib/systemd/systemd --system --deserialize 32
    Executable: /usr/lib/systemd/systemd
 Control Group: /init.scope
          Unit: init.scope
         Slice: -.slice
       Boot ID: f14aab64ca624f5b9204bea3cc6b726b
    Machine ID: be1642266afc4781a3ce84889079d86b
      Hostname: test
       Storage: /var/lib/systemd/coredump/core.systemd.0.f14aab64ca624f5b9204bea3cc6b726b.195921.1633291015000000.lz4
       Message: Process 195921 (systemd) of user 0 dumped core.

                Stack trace of thread 195921:
                #0  0x00007f2b0c50f68b kill (libc.so.6)
                #1  0x000055de92e41f8a crash (systemd)
                #2  0x00007f2b0c8afb30 __restore_rt (libpthread.so.0)
                #3  0x00007f2b0c50f38f raise (libc.so.6)
                #4  0x00007f2b0c4f9dc5 abort (libc.so.6)
                #5  0x00007f2b0c5524f7 __libc_message (libc.so.6)
                #6  0x00007f2b0c5595fc malloc_printerr (libc.so.6)
                #7  0x00007f2b0c559e3c unlink_chunk.isra.2 (libc.so.6)
                #8  0x00007f2b0c55cb91 _int_malloc (libc.so.6)
                #9  0x00007f2b0c55dc82 malloc (libc.so.6)
                #10 0x00007f2b0c59c4ba __alloc_dir (libc.so.6)
                #11 0x00007f2b0c59c5bd opendir_tail (libc.so.6)
                #12 0x00007f2b0dc88487 cg_enumerate_subgroups (libsystemd-shared-239.so)
                #13 0x000055de92ec27f2 unit_watch_pids_in_path (systemd)
                #14 0x000055de92e970ff service_enter_signal (systemd)
                #15 0x000055de92e9ae65 service_sigchld_event (systemd)
                #16 0x000055de92e7d740 manager_dispatch_sigchld (systemd)
                #17 0x00007f2b0dd35abd source_dispatch (libsystemd-shared-239.so)
                #18 0x00007f2b0dd37b8d sd_event_dispatch (libsystemd-shared-239.so)
                #19 0x00007f2b0dd37d98 sd_event_run (libsystemd-shared-239.so)
                #20 0x000055de92e86ac2 manager_loop (systemd)
                #21 0x000055de92e3d481 main (systemd)
                #22 0x00007f2b0c4fb4a3 __libc_start_main (libc.so.6)
                #23 0x000055de92e3f0fe _start (systemd)

Comment 41 Sigurdur Bjornsson 2021-10-05 14:05:52 UTC

Found another one, this time CentOS 8.4, again triggered by sysstat.


Sep 21 22:30:33 localhost systemd[1]: Starting system activity accounting tool...
Sep 21 22:30:33 localhost systemd-coredump[633002]: Due to PID 1 having crashed coredump collection will now be turned off.
Sep 21 22:30:33 localhost systemd[1]: Caught <ABRT>, dumped core as pid 633001.


systemctl --version
systemd 239 (239-45.el8_4.3)
+PAM +AUDIT +SELINUX +IMA -APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD +IDN2 -IDN +PCRE2 default-hierarchy=legacy



Failed to check if any systemd-coredump@.service units are running: Connection timed out
           PID: 633001 (systemd)
           UID: 0 (root)
           GID: 0 (root)
        Signal: 6 (ABRT)
     Timestamp: Tue 2021-09-21 22:30:33 GMT (1 weeks 6 days ago)
  Command Line: /usr/lib/systemd/systemd --switched-root --system --deserialize 17
    Executable: /usr/lib/systemd/systemd
 Control Group: /init.scope
          Unit: init.scope
         Slice: -.slice
       Boot ID: dd20eda7cb844c77aa39ca13e9f01dbc
    Machine ID: 5a0c87db412d45d3b3f32bd9298e3d81
      Hostname: localhost.localdomain
       Storage: /var/lib/systemd/coredump/core.systemd.0.dd20eda7cb844c77aa39ca13e9f01dbc.633001.1632263433000000.lz4
       Message: Process 633001 (systemd) of user 0 dumped core.
                
                Stack trace of thread 633001:
                #0  0x00007fcf4dbee67b kill (libc.so.6)
                #1  0x000056487f647f7a crash (systemd)
                #2  0x00007fcf4df8eb20 __restore_rt (libpthread.so.0)
                #3  0x00007fcf4dbee37f raise (libc.so.6)
                #4  0x00007fcf4dbd8db5 abort (libc.so.6)
                #5  0x00007fcf4dc314e7 __libc_message (libc.so.6)
                #6  0x00007fcf4dc385ec malloc_printerr (libc.so.6)
                #7  0x00007fcf4dc3bc7d _int_malloc (libc.so.6)
                #8  0x00007fcf4dc3d8d6 __libc_calloc (libc.so.6)
                #9  0x00007fcf4f3ccf4c sd_bus_message_new (libsystemd-shared-239.so)
                #10 0x00007fcf4f3cd384 sd_bus_message_new_signal (libsystemd-shared-239.so)
                #11 0x00007fcf4f3d9fe4 emit_properties_changed_on_interface (libsystemd-shared-239.so)
                #12 0x00007fcf4f3ddc97 sd_bus_emit_properties_changed_strv (libsystemd-shared-239.so)
                #13 0x000056487f65195c send_changed_signal (systemd)
                #14 0x000056487f65a787 bus_foreach_bus (systemd)
                #15 0x000056487f653924 bus_unit_send_change_signal (systemd)
                #16 0x000056487f68ca3b manager_loop (systemd)
                #17 0x000056487f643471 main (systemd)
                #18 0x00007fcf4dbda493 __libc_start_main (libc.so.6)
                #19 0x000056487f6450ee _start (systemd)

Comment 42 Jamie Burchell 2021-10-05 14:14:53 UTC

Not sure what triggered my coredump, but the lines in /var/log/messages immediately before it were:

Oct  3 20:56:54 test systemd[1]: Starting dnf makecache...
Oct  3 20:56:55 test dnf[195919]: Metadata cache refreshed recently.
Oct  3 20:56:55 test systemd-coredump[195922]: Due to PID 1 having crashed coredump collection will now be turned off.

Comment 43 Renaud Métrich 2021-10-05 14:53:24 UTC

Hi all,

Just to be clear, the fact that it happens "randomly" is just due to how the memory allocator allocates memory:
at some point in time the corruption happens and later, once malloc() wants to reuse the corrupted memory slot, systemd crashes.

It may happen in a child of system as well, between fork and exec when systemd executes a ExecStart command for example (once exec'ed memory is replaced so corruption is gone as well).
That's why it can be noticed with "sysstat" or "dnf makecache": it's just still systemd at that time, shortly after fork().

Renaud.

Comment 49 Lukáš Nykrýn 2021-10-14 10:03:31 UTC

Can we ask a customer to try systemd-239-45 (rhel-8.4 version) on top of rhel-8.3 and/or systemd-239-41 (version from rhel-8.3) on top of rhel-8.4? We would like to rule out the possibility, that the bug is in a different component.

Comment 55 Michal Sekletar 2021-11-18 16:24:46 UTC

I was able to make sense out of provided sanitiser reports. I mapped addresses to symbols using gdb and top of the backtrace points to bug in function unit_fail_if_noncanonical(), specifically to line https://github.com/redhat-plumbers/systemd-rhel8/blob/master/src/core/unit.c#L4788. If chase_symlinks() fails then we free uninitialised pointer. Depending on the state of stack at the time of the function call the bug can subsequently trigger various crash behaviours. That is consistent with what we see here, i.e. some report abort() while other report segfaults. The bug is already fixed upstream.

https://github.com/systemd/systemd/commit/58d9d89b4b41189bdcea86c2ad5cf708b7d54aca

I think we should release z-stream update for RHEL-8.4 with the above fix together with the backport of https://github.com/systemd/systemd/pull/20670.

Comment 59 Michal Sekletar 2021-11-22 14:08:17 UTC

Here are test rpms with the backport of patches that I've mentioned in comment #55,

https://msekleta.fedorapeople.org/bz1970945-memory-corruption-test-rpms/

Comment 61 Plumber Bot 2021-11-24 13:35:28 UTC

fix merged to github master branch -> https://github.com/redhat-plumbers/systemd-rhel8/pull/235

Comment 71 errata-xmlrpc 2022-05-10 15:25:47 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (systemd bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:2069

Comment 72 David Tardon 2022-06-01 08:25:03 UTC

*** Bug 2005790 has been marked as a duplicate of this bug. ***

Comment 73 huyubiao 2022-11-14 07:48:52 UTC

hello,
I failed to start systemd in the way described above.
I updated the dracut and compiled the 249 version of systemd. After the __lsan_do_recoverable_leak_check of ASAN is executed or the memory overflows, an error is reported. The error is as follows:

localhost login: 
localhost login: [   74.683917][    T1] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000100
[   74.685137][    T1] CPU: 9 PID: 1 Comm: systemd Kdump: loaded Tainted: G           O      5.10.0-60.18.0.50.h551.eulerosv2r11.x86_64 #1
[   74.687019][    T1] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[   74.688009][    T1] Call Trace:
[   74.688540][    T1]  dump_stack+0x57/0x6e
[   74.689257][    T1]  panic+0x196/0x3ec
[   74.689863][    T1]  do_exit.cold+0x14/0xaf
[   74.690556][    T1]  ? sched_clock+0x5/0x10
[   74.691230][    T1]  do_group_exit+0x33/0xa0
[   74.691955][    T1]  __x64_sys_exit_group+0x14/0x20
[   74.692738][    T1]  do_syscall_64+0x30/0x40
[   74.693419][    T1]  entry_SYSCALL_64_after_hwframe+0x61/0xc6
[   74.694374][    T1] RIP: 0033:0x7f8db492b0e0
[   74.695079][    T1] Code: 05 c3 0f 1f 84 00 00 00 00 00 b8 18 00 00 00 0f 05 c3 0f 1f 84 00 00 00 00 00 50 58 b8 e7 00 00 00 48 83 ec 08 48 63 ff 0f 05 <e8> 3b f8 00 00 66 66 2e 0f 1f 84 00 00 00 00 00 48 83 ec 28 64 48
[   74.698149][    T1] RSP: 002b:00007ffd99c06a00 EFLAGS: 00000206 ORIG_RAX: 00000000000000e7
[   74.699524][    T1] RAX: ffffffffffffffda RBX: 00007f8db5234618 RCX: 00007f8db492b0e0
[   74.700775][    T1] RDX: 000000000000000f RSI: 0000000000000000 RDI: 0000000000000001
[   74.701992][    T1] RBP: 00007f8db5234618 R08: 00007f8dae000094 R09: 0000000000000000
[   74.703222][    T1] R10: 00007ffd99c064a0 R11: 0000000000000206 R12: 00007ffd99c06a90
[   74.704509][    T1] R13: 0000000000000000 R14: 0000000000000000 R15: 000061c000000880
[   74.705821][    T1] kernel fault(0x5) notification starting on CPU 9
[   74.706909][    T1] kernel fault(0x5) notification finished on CPU 9
[   74.709001][    T1] [kbox] catch panic event, panic reason:Attempted to kill init! exitcode=0x00000100
[   74.710091][    T1] [kbox] catch panic event, start logging
[   74.710750][    T1] [kbox] stack:
[   74.711150][    T1] [kbox]  ffffc01840013d68 0000000000000001 ffffc01840010000 ffffc01840014000
[   74.712160][    T1] [kbox]  0000000000000000 20000000fffffffe 3030303030303030 3030303030303030
[   74.713180][    T1] [kbox]  2020202020202020 2020202020202020 2020202020200030 3030303230323032
[   74.714203][    T1] [kbox]  2020202020202020 2020202020202020 2020202020202020 0020202020202020
[   74.715224][    T1] [kbox]  fb7984b329126600 0000000000000040 0000000000000014 ffffffffc0d09d69
[   74.716238][    T1] [kbox]  000000000000012c ffffffffc0d09efb ffffffffa69ddda0 0000000000000000
[   74.717250][    T1] [kbox]  ffffffffa632f5f8 ffffffffa510bbb6 ffffffffa50587d9 ffff9990801fa001
[   74.718265][    T1] [kbox]  ffffc01840013ed8 0000000000000001 0000000000000000 ffffffffa5a4e45c
[   74.719280][    T1] [kbox]  0000001100000010 ffffc01840013ee8 ffffc01840013e88 fb7984b329126600
[   74.720298][    T1] [kbox]  0000000000000000 0000000000000100 0000000000000000 ffff999081b75a00
[   74.721312][    T1] [kbox]  0000000000000000 0000000000000000 ffff9990801fa0c0 ffff9990801fa001
[   74.722328][    T1] [kbox]  0000000000000001 0000000000000000 0000000000000100 ffffffffa5a4e976
[   74.723352][    T1] [kbox]  ffffffffa50273f5 0000000000000100 ffff9990803150c0 ffff9990801fa0c0
[   74.724371][    T1] [kbox]  0000000000000000 ffffffffa50e62d3 0000000000000000 ffffc01840013f58
[   74.725392][    T1] [kbox]  0000000000000000 0000000000000000 ffffffffa50e6354 ffffffffa5a96a40
[   74.726403][    T1] [kbox]  0000000000000000 ffffffffa5c00099 000061c000000880 0000000000000000
[   74.727417][    T1] [kbox]  0000000000000000 00007ffd99c06a90 00007f8db5234618 00007f8db5234618
[   74.728434][    T1] [kbox]  0000000000000206 00007ffd99c064a0 0000000000000000 00007f8dae000094
[   74.729449][    T1] [kbox]  ffffffffffffffda 00007f8db492b0e0 000000000000000f 0000000000000000
[   74.730469][    T1] [kbox]  0000000000000001 00000000000000e7 00007f8db492b0e0 0000000000000033

Comment 74 huyubiao 2022-11-14 07:50:03 UTC

(In reply to huyubiao from comment #73)
> hello,
> I failed to start systemd in the way described above.
> I updated the dracut and compiled the 249 version of systemd. After the
> __lsan_do_recoverable_leak_check of ASAN is executed or the memory
> overflows, an error is reported. The error is as follows:
> 
> localhost login: 
> localhost login: [   74.683917][    T1] Kernel panic - not syncing:
> Attempted to kill init! exitcode=0x00000100
> [   74.685137][    T1] CPU: 9 PID: 1 Comm: systemd Kdump: loaded Tainted: G  #1
> [   74.687019][    T1] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
> [   74.688009][    T1] Call Trace:
> [   74.688540][    T1]  dump_stack+0x57/0x6e
> [   74.689257][    T1]  panic+0x196/0x3ec
> [   74.689863][    T1]  do_exit.cold+0x14/0xaf
> [   74.690556][    T1]  ? sched_clock+0x5/0x10
> [   74.691230][    T1]  do_group_exit+0x33/0xa0
> [   74.691955][    T1]  __x64_sys_exit_group+0x14/0x20
> [   74.692738][    T1]  do_syscall_64+0x30/0x40
> [   74.693419][    T1]  entry_SYSCALL_64_after_hwframe+0x61/0xc6
> [   74.694374][    T1] RIP: 0033:0x7f8db492b0e0
> [   74.695079][    T1] Code: 05 c3 0f 1f 84 00 00 00 00 00 b8 18 00 00 00 0f
> 05 c3 0f 1f 84 00 00 00 00 00 50 58 b8 e7 00 00 00 48 83 ec 08 48 63 ff 0f
> 05 <e8> 3b f8 00 00 66 66 2e 0f 1f 84 00 00 00 00 00 48 83 ec 28 64 48
> [   74.698149][    T1] RSP: 002b:00007ffd99c06a00 EFLAGS: 00000206 ORIG_RAX:
> 00000000000000e7
> [   74.699524][    T1] RAX: ffffffffffffffda RBX: 00007f8db5234618 RCX:
> 00007f8db492b0e0
> [   74.700775][    T1] RDX: 000000000000000f RSI: 0000000000000000 RDI:
> 0000000000000001
> [   74.701992][    T1] RBP: 00007f8db5234618 R08: 00007f8dae000094 R09:
> 0000000000000000
> [   74.703222][    T1] R10: 00007ffd99c064a0 R11: 0000000000000206 R12:
> 00007ffd99c06a90
> [   74.704509][    T1] R13: 0000000000000000 R14: 0000000000000000 R15:
> 000061c000000880
> [   74.705821][    T1] kernel fault(0x5) notification starting on CPU 9
> [   74.706909][    T1] kernel fault(0x5) notification finished on CPU 9
> [   74.709001][    T1] [kbox] catch panic event, panic reason:Attempted to
> kill init! exitcode=0x00000100
> [   74.710091][    T1] [kbox] catch panic event, start logging
> [   74.710750][    T1] [kbox] stack:
> [   74.711150][    T1] [kbox]  ffffc01840013d68 0000000000000001
> ffffc01840010000 ffffc01840014000
> [   74.712160][    T1] [kbox]  0000000000000000 20000000fffffffe
> 3030303030303030 3030303030303030
> [   74.713180][    T1] [kbox]  2020202020202020 2020202020202020
> 2020202020200030 3030303230323032
> [   74.714203][    T1] [kbox]  2020202020202020 2020202020202020
> 2020202020202020 0020202020202020
> [   74.715224][    T1] [kbox]  fb7984b329126600 0000000000000040
> 0000000000000014 ffffffffc0d09d69
> [   74.716238][    T1] [kbox]  000000000000012c ffffffffc0d09efb
> ffffffffa69ddda0 0000000000000000
> [   74.717250][    T1] [kbox]  ffffffffa632f5f8 ffffffffa510bbb6
> ffffffffa50587d9 ffff9990801fa001
> [   74.718265][    T1] [kbox]  ffffc01840013ed8 0000000000000001
> 0000000000000000 ffffffffa5a4e45c
> [   74.719280][    T1] [kbox]  0000001100000010 ffffc01840013ee8
> ffffc01840013e88 fb7984b329126600
> [   74.720298][    T1] [kbox]  0000000000000000 0000000000000100
> 0000000000000000 ffff999081b75a00
> [   74.721312][    T1] [kbox]  0000000000000000 0000000000000000
> ffff9990801fa0c0 ffff9990801fa001
> [   74.722328][    T1] [kbox]  0000000000000001 0000000000000000
> 0000000000000100 ffffffffa5a4e976
> [   74.723352][    T1] [kbox]  ffffffffa50273f5 0000000000000100
> ffff9990803150c0 ffff9990801fa0c0
> [   74.724371][    T1] [kbox]  0000000000000000 ffffffffa50e62d3
> 0000000000000000 ffffc01840013f58
> [   74.725392][    T1] [kbox]  0000000000000000 0000000000000000
> ffffffffa50e6354 ffffffffa5a96a40
> [   74.726403][    T1] [kbox]  0000000000000000 ffffffffa5c00099
> 000061c000000880 0000000000000000
> [   74.727417][    T1] [kbox]  0000000000000000 00007ffd99c06a90
> 00007f8db5234618 00007f8db5234618
> [   74.728434][    T1] [kbox]  0000000000000206 00007ffd99c064a0
> 0000000000000000 00007f8dae000094
> [   74.729449][    T1] [kbox]  ffffffffffffffda 00007f8db492b0e0
> 000000000000000f 0000000000000000
> [   74.730469][    T1] [kbox]  0000000000000001 00000000000000e7
> 00007f8db492b0e0 0000000000000033