Bug 1251617 - systemd SIGFPE during boot
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: systemd
Version: 7.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assigned To: systemd-maint
QA Contact: qe-baseos-daemons
Reported: 2015-08-07 18:20 EDT by David Shaw
Modified: 2016-01-12 13:45 EST
Fixed In Version: systemd-219-19.el7
Doc Type: Bug Fix
Last Closed: 2016-01-12 13:45:55 EST
Type: Bug


Description David Shaw 2015-08-07 18:20:23 EDT
Description of problem:

systemd (PID 1) occasionally crashes with a SIGFPE during the boot process and then freezes execution.

Version-Release number of selected component (if applicable):

208-20.el7_1.3

How reproducible:

There does not seem to be a particular trigger.  It has happened to me twice so far, on two different boxes, against countless boots on many boxes where it did not happen.

Steps to Reproduce:
1.  Boot.
2.  If the crash occurs, the log looks like this:

2015-08-06 17:40:15.958 foo kernel: [140895.743494] traps: systemd[1] trap divide error ip:7f8422c1cd54 sp:7fff31024700 error:0 in systemd[7f8422bf5000+10e000]
2015-08-06 17:40:15.959 foo systemd[1]: Caught <FPE>, dumped core as pid 34222.
2015-08-06 17:40:15.960 foo systemd[1]: Freezing execution.
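
When comparing crashes from different boots, the absolute ip in the kernel line is of little use because the mapping base changes between runs. A minimal sketch, assuming the trap line always has the field layout seen in the single sample above (the regular expression and the helper name trap_offset are illustrative, not part of any tool), that turns the ip into an offset inside the systemd mapping:

```python
import re

# Pattern for the kernel "traps:" line quoted above. The field layout is
# taken from that one sample, not from a kernel format specification.
TRAP_RE = re.compile(
    r"ip:(?P<ip>[0-9a-f]+) sp:[0-9a-f]+ error:\d+ "
    r"in (?P<name>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def trap_offset(line):
    """Return (binary name, offset of the faulting ip inside its mapping)."""
    m = TRAP_RE.search(line)
    if m is None:
        raise ValueError("not a recognised trap line")
    ip = int(m.group("ip"), 16)
    base = int(m.group("base"), 16)
    size = int(m.group("size"), 16)
    # The ip must fall inside the mapping the kernel reported.
    if not base <= ip < base + size:
        raise ValueError("ip lies outside the reported mapping")
    return m.group("name"), ip - base


line = ("traps: systemd[1] trap divide error ip:7f8422c1cd54 "
        "sp:7fff31024700 error:0 in systemd[7f8422bf5000+10e000]")
name, offset = trap_offset(line)
print(name, hex(offset))  # prints: systemd 0x27d54
```

The low 12 bits of that offset (0xd54) match the frame #3 address 0x00007f49a787bd54 in the backtrace under "Additional info". That is consistent with both crashes hitting the same instruction, although it does not prove it.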

Additional info:

Digging around a bit led me to https://bugs.freedesktop.org/show_bug.cgi?id=87349, which looks very similar (it is even the same function).  There is a recent update of the EL7 systemd to 208-20.el7_1.5, but the source in that RPM does not contain the fix that was applied for the above bug.

Here's a 'bt full':

(gdb) bt full
#0  0x00007f49a5ae6ffb in raise () from /lib64/libpthread.so.0
No symbol table info available.
#1  0x00007f49a78751ae in crash (sig=8) at src/core/main.c:148
        rl = {rlim_cur = 18446744073709551615, rlim_max = 18446744073709551615}
        sa = {__sigaction_handler = {sa_handler = 0x0, sa_sigaction = 0x0}, sa_mask = {__val = {0 <repeats 16 times>}}, sa_flags = 0, sa_restorer = 0x0}
        pid = 0
        __func__ = "crash"
        __PRETTY_FUNCTION__ = "crash"
#2  <signal handler called>
No symbol table info available.
#3  0x00007f49a787bd54 in manager_print_jobs_in_progress (m=0x7f49a8e8f600) at src/core/manager.c:251
        i = 0x0
        job_of_n = 0x0
        print_nr = <optimized out>
        cylon = "\033[1;31m*\033[0m\033[31m*    \033[0m\000\033[0m"
        j = <optimized out>
        counter = 0
        cylon_pos = <optimized out>
#4  process_event (ev=0x7fff6f3622b0, m=0x7f49a8e8f600) at src/core/manager.c:1788
        v = 1
        r = <optimized out>
        w = <optimized out>
#5  manager_loop (m=0x7f49a8e8f600) at src/core/manager.c:1887
        event = {events = 1, data = {ptr = 0x7f49a8e8f708, fd = -1461127416, u32 = 2833839880, u64 = 139954343180040}}
        n = <optimized out>
        wait_msec = <optimized out>
        r = <optimized out>
        rl = {interval = 1000000, begin = 139714568111, burst = 50000, num = 14}
        __PRETTY_FUNCTION__ = "manager_loop"
        __func__ = "manager_loop"
#6  0x00007f49a787306c in main (argc=5, argv=0x7fff6f362c18) at src/core/main.c:1667
        m = 0x7f49a8e8f600
        r = <optimized out>
        retval = 1
        before_startup = 4907205
        after_startup = <optimized out>
        timespan = "151.391ms\000\000\000\000\000\000\000\020+6o\377\177\000\000\t\000\000\000\000\000\000\000\300l!\247I\177\000\000\345*x\245I\177\000\000P\342\350\250I\177\000\000\020\340\350\250I\177\000"
        fds = 0x0
        reexecute = false
        shutdown_verb = 0x0
        initrd_timestamp = {realtime = 0, monotonic = 0}
        userspace_timestamp = {realtime = 1438757509162598, monotonic = 4886579}
        kernel_timestamp = {realtime = 1438757504276020, monotonic = 0}
        systemd = "systemd"
        skip_setup = false
        j = <optimized out>
        loaded_policy = false
        arm_reboot_watchdog = false
        queue_default_job = <optimized out>
        switch_root_dir = 0x0
        switch_root_init = 0x0
        saved_rlimit_nofile = {rlim_cur = 1024, rlim_max = 4096}
        __func__ = "main"
        __PRETTY_FUNCTION__ = "main"
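
In frame #3, counter is still 0 and i is 0x0, which suggests the fault happened before the loop over jobs started, while print_nr was being computed. A minimal model of that failure mode follows. It assumes print_nr is an iteration count taken modulo the number of running jobs; this is an assumption and is not stated in this report. The function name, the job tuples, and the divisor value are hypothetical stand-ins, not systemd's code:

```python
def pick_running_job(jobs, iteration, period_divisor=3):
    """Pick which running job to report, rotating as `iteration` grows.

    `jobs` is a list of (name, is_running) pairs. The names and the
    default divisor are hypothetical, not systemd's values.
    """
    running = [name for name, is_running in jobs if is_running]
    if not running:
        # Guard: with zero running jobs the modulo below has a zero divisor.
        # Python would raise ZeroDivisionError; in C an integer modulo by
        # zero is undefined and on x86 typically traps as a divide error,
        # which the process sees as SIGFPE.
        return None
    print_nr = (iteration // period_divisor) % len(running)
    return running[print_nr]


jobs = [("a.service", True), ("b.service", False), ("c.service", True)]
print(pick_running_job(jobs, 0))  # prints: a.service
print(pick_running_job(jobs, 3))  # prints: c.service
print(pick_running_job([], 5))    # prints: None
```

The guard only illustrates why an empty set of running jobs would produce a divide error in such code. It is not a description of the actual fix.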
Comment 2 Michal Sekletar 2016-01-12 13:45:55 EST
This should be fixed by the following upstream commit:

https://github.com/systemd/systemd/commit/9c3349e23b14db27e7ba45f82cf647899c563ea9

which is included in the RHEL-7.2 version of systemd.
