Bug 1573931

Summary: systemd-238 crashes on arm when booting on Fedora 28 with: Caught <ABRT>, dumped core as pid 76. Freezing execution
Product: [Fedora] Fedora
Reporter: Mauro Carvalho Chehab <mchehab>
Component: systemd
Assignee: systemd-maint
Status: CLOSED EOL
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high
Docs Contact:
Priority: unspecified
Version: 28
CC: chotaire+redhat, linuxerianer, lnykryn, msekleta, rkudyba, s, systemd-maint, zbyszek
Target Milestone: ---
Target Release: ---
Hardware: arm
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2019-05-28 18:57:54 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description | Flags
Boot logs | none
dmesg when booting with Kernel 4.13.12 (custom) with Fedora 28 and systemd-235 | none
Coredump from systemd 238 on Fedora 28 ARM running in Docker | none

Description Mauro Carvalho Chehab 2018-05-02 14:32:31 UTC
Created attachment 1430089 [details]
Boot logs

Description of problem:

When booting a vanilla 4.16.6 kernel with a custom .config, systemd crashes on arm (Samsung Chromebook snow) with systemd-238-7.fc28.1.armv7hl.

Version-Release number of selected component (if applicable): systemd-238-7.fc28.1.armv7hl

Same Kernel works fine with systemd-234-8.fc27.armv7hl (Fedora 27). Problem appeared after doing a dnf system-upgrade to Fedora 28.

How reproducible: Always after upgrading from Fedora 27 to Fedora 28.

Additional info:

Please see https://bugzilla.redhat.com/show_bug.cgi?id=1573926 for more details about why I'm using a custom Kernel on snow.

Comment 1 Mauro Carvalho Chehab 2018-05-03 21:29:00 UTC
Indeed it is a systemd regression between version 235 and 238. Systemd still crashes on arm with systemd-238-1.fc28.src.rpm (from https://koji.fedoraproject.org/koji/buildinfo?buildID=1053905).

So, I became more aggressive and forced version 235-3 from:
   https://koji.fedoraproject.org/koji/buildinfo?buildID=989373
   https://koji.fedoraproject.org/koji/buildinfo?buildID=959141

I.e., by installing the following packages:

cryptsetup-1.7.5-5.fc28.armv7hl.rpm
cryptsetup-libs-1.7.5-5.fc28.armv7hl.rpm
systemd-235-3.fc28.armv7hl.rpm
systemd-libs-235-3.fc28.armv7hl.rpm
systemd-pam-235-3.fc28.armv7hl.rpm
systemd-udev-235-3.fc28.armv7hl.rpm

Fedora 28 now boots fine.
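
A minimal sketch of the downgrade described above, assuming the six RPMs listed in this comment have already been downloaded from the Koji build pages into the current directory. The `run` helper, the `DRY_RUN` variable, and the function name are illustrative and not part of any Fedora tool; `rpm -Uvh --oldpackage` is the standard rpm option for installing a build older than the one currently installed.

```shell
#!/bin/sh
# Sketch: install the older systemd 235-3 and cryptsetup 1.7.5-5 builds
# over the newer Fedora 28 packages.
# Assumption: the RPM files named in the comment above are in the current directory.

# Illustrative helper: print the command unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

downgrade_systemd() {
    arch=armv7hl
    # Package file names exactly as listed in the comment above.
    set -- \
        "cryptsetup-1.7.5-5.fc28.$arch.rpm" \
        "cryptsetup-libs-1.7.5-5.fc28.$arch.rpm" \
        "systemd-235-3.fc28.$arch.rpm" \
        "systemd-libs-235-3.fc28.$arch.rpm" \
        "systemd-pam-235-3.fc28.$arch.rpm" \
        "systemd-udev-235-3.fc28.$arch.rpm"
    # --oldpackage allows rpm to replace a newer package with an older one.
    run rpm -Uvh --oldpackage "$@"
}
```

Calling `downgrade_systemd` prints the rpm command; `DRY_RUN=0 downgrade_systemd` (as root) would actually run it. A later `dnf upgrade` would presumably pull systemd 238 back in unless those packages are excluded, for example with `dnf upgrade --exclude='systemd*' --exclude='cryptsetup*'`.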

Comment 2 Mauro Carvalho Chehab 2018-05-03 21:31:21 UTC
Created attachment 1430919 [details]
dmesg when booting with Kernel 4.13.12 (custom) with Fedora 28 and systemd-235

Comment 3 chotaire 2018-05-10 04:08:23 UTC
I can confirm the same problem when starting a Fedora 28 docker os container on armv7hl. 

# docker run -ti --restart=unless-stopped --cap-add=SYS_ADMIN --security-opt seccomp=unconfined --name fedora -e "container=docker" -v /sys/fs/cgroup:/sys/fs/cgroup:ro fedora:latest /sbin/init

systemd 238 running in system mode. (+PAM +AUDIT +SELINUX +IMA -APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD +IDN2 -IDN +PCRE2 default-hierarchy=hybrid)
Detected virtualization docker.
Detected architecture arm.
Caught <ABRT>, dumped core as pid 7.
Freezing execution.

Fedora 27, CentOS 7, and Rawhide (systemd 237) do not have this issue; they all boot fine.

On another armv7hl system where Fedora 28 is running natively (dist-upgraded from Fedora 27), everything appears to work fine though.

# rpm -qa systemd
systemd-238-7.fc28.1.armv7hl
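
When comparing images or builds as above, the crash signature from the log can be checked for mechanically. This is a minimal sketch, assuming the console output has been saved to a file first (for example with `docker logs fedora > boot.log`, using the container name from the command above); the function name and the file name are hypothetical.

```shell
#!/bin/sh
# Sketch: detect the PID 1 abort message in a saved console log.

# Succeeds (exit status 0) if the log contains the abort line printed by systemd.
has_pid1_abort() {
    grep -Eq 'Caught <ABRT>, dumped core as pid [0-9]+' "$1"
}

# Example use (boot.log is a hypothetical file name):
#   if has_pid1_abort boot.log; then echo "systemd aborted during startup"; fi
```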

Comment 4 chotaire 2018-05-10 04:42:17 UTC
Now after updating Rawhide to systemd 238 I am facing the exact same issues.

[root@18a22d8cf18e /]# rpm -qa systemd
systemd-238-7.fc29.1.armv7hl
[root@18a22d8cf18e /]# systemctl list-unit-files
Failed to connect to bus: No such file or directory

Some info about the host that is attempting to run the Fedora container:

Linux bleh.raspbian 4.14.34-v7+ #1110 SMP Mon Apr 16 15:18:51 BST 2018 armv7l GNU/Linux

# free -h
              total        used        free      shared  buff/cache   available
Mem:           976M        123M        179M        8.2M        674M        781M
Swap:           99M        5.2M         94M


processor       : 3
model name      : ARMv7 Processor rev 4 (v7l)
BogoMIPS        : 38.40
Features        : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm crc32
CPU implementer : 0x41
CPU architecture: 7
CPU variant     : 0x0
CPU part        : 0xd03
CPU revision    : 4

Hardware        : BCM2835
Revision        : a02082
Serial          : 0000000025fae0cf

# docker version

Client:
 Version:      18.05.0-ce
 API version:  1.37
 Go version:   go1.9.5
 Git commit:   f150324
 Built:        Wed May  9 22:24:36 2018
 OS/Arch:      linux/arm
 Experimental: false
 Orchestrator: swarm

Server:
 Engine:
  Version:      18.05.0-ce
  API version:  1.37 (minimum version 1.12)
  Go version:   go1.9.5
  Git commit:   f150324
  Built:        Wed May  9 22:20:37 2018
  OS/Arch:      linux/arm
  Experimental: false

# systemd --version

systemd 232
+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD +IDN

Comment 5 Zbigniew Jędrzejewski-Szmek 2018-05-10 07:37:14 UTC
> Caught <ABRT>, dumped core as pid 76.

The core should be in / or /var/lib/systemd/coredump/. Any chance you could attach it here? If you could load the coredump and also do a 'bt', that'd also make things easier.
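
One way to do what is asked here, as a sketch: look for core files in the two locations mentioned, then let gdb print the backtrace non-interactively. It assumes gdb and the systemd debuginfo are installed, that the core file is uncompressed, and that its name starts with `core` (this depends on the kernel's core_pattern setting); the helper names are illustrative.

```shell
#!/bin/sh
# Sketch: locate a core file and print a backtrace from it.

# List candidate core files in the given directories.
# Defaults to the two locations mentioned above.
# Assumption: core files are named core* (depends on core_pattern).
find_cores() {
    [ "$#" -gt 0 ] || set -- / /var/lib/systemd/coredump
    for dir in "$@"; do
        [ -d "$dir" ] || continue
        find "$dir" -maxdepth 1 -type f -name 'core*'
    done
}

# Print a backtrace for a core produced by /usr/lib/systemd/systemd.
# -batch makes gdb exit after running the commands given with -ex.
print_backtrace() {
    gdb -batch -ex bt /usr/lib/systemd/systemd "$1"
}

# Example use:
#   find_cores
#   print_backtrace /path/to/core/file
```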

Comment 6 chotaire 2018-05-10 14:17:08 UTC
Here is the requested backtrace:

[New LWP 7]
Reading symbols from /usr/lib/systemd/systemd...Reading symbols from /usr/lib/debug/usr/lib/systemd/systemd-238-7.fc28.1.arm.debug...done.
done.
warning: Ignoring non-absolute filename: <linux-vdso.so.1>
Missing separate debuginfo for linux-vdso.so.1
Try: dnf --enablerepo='*debug*' install /usr/lib/debug/.build-id/0b/5a4564a218ca50f61305fa9df45a51e95201ce
Missing separate debuginfo for /lib/libudev.so.1
Try: dnf --enablerepo='*debug*' install /usr/lib/debug/.build-id/7e/9dc1012829c12827563ff34e912f09f6865c02
[New LWP 1]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/libthread_db.so.1".
Core was generated by `/sbin/init'.
Program terminated with signal SIGABRT, Aborted.
#0  0x76a13c1c in kill () at ../sysdeps/unix/syscall-template.S:78
78      T_PSEUDO (SYSCALL_SYMBOL, SYSCALL_NAME, SYSCALL_NARGS)
[Current thread is 1 (LWP 7)]
(gdb) bt
#0  0x76a13c1c in kill () at ../sysdeps/unix/syscall-template.S:78
#1  0x0051bd60 in crash (sig=6) at ../src/core/main.c:196
#2  <signal handler called>
#3  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#4  0x769fef9c in __GI_abort () at abort.c:79
#5  0x76a4e7ec in __libc_message (action=action@entry=do_abort,
    fmt=<optimized out>) at ../sysdeps/posix/libc_fatal.c:181
#6  0x76a54fb8 in malloc_printerr (str=<optimized out>) at malloc.c:5350
#7  0x76a55580 in munmap_chunk (p=<optimized out>) at malloc.c:2846
#8  0x00564d3c in mount_cgroup_controllers (join_controllers=0x7ece1af0)
    at ../src/core/mount-setup.c:303
#9  0x00517124 in initialize_runtime (ret_error_message=0x7ece1b80,
    saved_rlimit_memlock=0x2000, saved_rlimit_nofile=0x7ece1bf0,
    skip_setup=false) at ../src/core/main.c:1825
#10 main (argc=1, argv=0x72704120) at ../src/core/main.c:2370

I am attaching the coredump.
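
In the backtrace above, the abort is raised by glibc's allocator checks (frames #4 to #7), reached from mount_cgroup_controllers at mount-setup.c:303 (frame #8). To look at that frame more closely, a small gdb command file can select it and dump its source context and local variables. This is a sketch only; the function name and the output file name are hypothetical, while `frame`, `list`, and `info locals` are standard gdb commands.

```shell
#!/bin/sh
# Sketch: emit gdb commands that select one frame and show its
# surrounding source lines and local variables.
# The frame number defaults to 8 (mount_cgroup_controllers in the backtrace above).
gdb_frame_script() {
    printf 'frame %s\nlist\ninfo locals\n' "${1:-8}"
}

# Example use (frame8.gdb is a hypothetical file name):
#   gdb_frame_script 8 > frame8.gdb
#   gdb -batch -x frame8.gdb /usr/lib/systemd/systemd /path/to/core/file
```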

Comment 7 chotaire 2018-05-10 14:18:25 UTC
Created attachment 1434405 [details]
Coredump from systemd 238 on Fedora 28 ARM running in Docker

Comment 8 chotaire 2018-05-10 14:28:52 UTC
Fyi:

# dnf --enablerepo='*debug*' install /usr/lib/debug/.build-id/0b/5a4564a218ca50f61305fa9df45a51e95201ce
No match for argument: /usr/lib/debug/.build-id/0b/5a4564a218ca50f61305fa9df45a51e95201ce

# dnf --enablerepo='*debug*' install /usr/lib/debug/.build-id/7e/9dc1012829c12827563ff34e912f09f6865c02
Package systemd-libs-debuginfo-238-7.fc28.1.armv7hl is already installed, skipping.

Comment 9 Roman 2018-06-10 13:28:50 UTC
(In reply to Mauro Carvalho Chehab from comment #1)
> Indeed it is a systemd regression between version 235 and 238. Systemd still
> crashes on arm with systemd-238-1.fc28.src.rpm (from
> https://koji.fedoraproject.org/koji/buildinfo?buildID=1053905).
> 
> So, I became more aggressive and forced version 235-3 from:
>    https://koji.fedoraproject.org/koji/buildinfo?buildID=989373
>    https://koji.fedoraproject.org/koji/buildinfo?buildID=959141
> 
> I.e., by installing the following packages:
> 
> cryptsetup-1.7.5-5.fc28.armv7hl.rpm
> cryptsetup-libs-1.7.5-5.fc28.armv7hl.rpm
> systemd-235-3.fc28.armv7hl.rpm
> systemd-libs-235-3.fc28.armv7hl.rpm
> systemd-pam-235-3.fc28.armv7hl.rpm
> systemd-udev-235-3.fc28.armv7hl.rpm
> 
> Fedora 28 now boots fine.

Just to add another data point:

I also ran into this problem on a raspi after upgrading from f27 to f28.

I did a chroot with qemu-arm-static to access the SD card from an x86_64 workstation and finally managed to install systemd-235 and cryptsetup-1.7.5 as mentioned above.

Thanks, the raspi now gets past the error and works again.
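
The chroot repair described in this comment might look roughly like the sketch below. It rests on several assumptions that are not stated in the comment: the host has qemu-arm-static at /usr/bin/qemu-arm-static with binfmt_misc handling ARM binaries, the older RPMs sit in a local directory, and the card's root partition is known. The device name, mount point, directory names, and the `run` helper are all hypothetical.

```shell
#!/bin/sh
# Sketch: downgrade packages inside an ARM root filesystem from an x86_64 host.

# Illustrative helper: print the command unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# $1: root partition of the SD card (hypothetical, e.g. /dev/sdX3)
# $2: local directory holding the older RPMs
# $3: mount point (defaults to /mnt/sd)
repair_sd_card() {
    root_dev=$1
    rpm_dir=$2
    mnt=${3:-/mnt/sd}
    run mount "$root_dev" "$mnt" &&
    run cp -r "$rpm_dir" "$mnt/root/downgrade" &&
    # The emulator must exist inside the chroot so ARM binaries can run.
    run cp /usr/bin/qemu-arm-static "$mnt/usr/bin/" &&
    # The glob is expanded by the shell inside the chroot.
    run chroot "$mnt" /bin/sh -c 'rpm -Uvh --oldpackage /root/downgrade/*.rpm' &&
    run rm "$mnt/usr/bin/qemu-arm-static"
    # Always try to unmount, even if an earlier step failed.
    run umount "$mnt"
}
```

By default the function only prints the commands it would run; `DRY_RUN=0 repair_sd_card /dev/sdX3 ./rpms` (as root, with the real device name) would execute them.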

Comment 10 Roman 2018-06-27 19:17:14 UTC
I recently updated the raspi to systemd.armv7hl-238-8.git0e0aa59.fc28 and rebooted without any problems.

Comment 11 Ben Cotton 2019-05-02 21:31:01 UTC
This message is a reminder that Fedora 28 is nearing its end of life.
On 2019-May-28 Fedora will stop maintaining and issuing updates for
Fedora 28. It is Fedora's policy to close all bug reports from releases
that are no longer maintained. At that time this bug will be closed as
EOL if it remains open with a Fedora 'version' of '28'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 28 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora 
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 12 Ben Cotton 2019-05-28 18:57:54 UTC
Fedora 28 changed to end-of-life (EOL) status on 2019-05-28. Fedora 28 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.