Bug 1271832

Summary: systemd[1]: Caught <ABRT>, dumped core as pid 9204
Product: [Fedora] Fedora Reporter: Daniel Black <daniel>
Component: systemdAssignee: systemd-maint
Status: CLOSED EOL QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high Docs Contact:
Priority: unspecified    
Version: 22CC: johannbg, jsynacek, lnykryn, msekleta, s, systemd-maint, twillber, zbyszek
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-07-19 19:19:51 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
core.9204.gz none

Description Daniel Black 2015-10-14 20:22:37 UTC
Description of problem:

rapidly stop / starting a service segfaults systemd. Result was rather limited system functionality

$ sudo reboot
Failed to start reboot.target: Activation of org.freedesktop.systemd1 timed out
Failed to open /dev/initctl: No such device or address
Failed to talk to init daemon.


Version-Release number of selected component (if applicable):


Name        : systemd
Arch        : x86_64
Epoch       : 0
Version     : 219
Release     : 24.fc22


mariadb-server rpm (built from 10.1 branch - https://github.com/MariaDB/server/tree/10.1) - cmake .  -DBUILD_CONFIG=mysql_release -DRPM=fedora22 && make -j 8 package



Steps to Reproduce:

############################

sudo systemctl stop mariadb;sudo systemctl start mariadb

# needs to be fast, do not sleep more than 1 sec



# delay does not matter

sudo systemctl disable mariadb

# delay does not matter

sudo systemctl enable mariadb

# delay does not matter

sudo systemctl status mariadb

############################

Here the service starts showing the infamous

############################

● mariadb.service - MariaDB database server
    Loaded: loaded (/lib/systemd/system/mariadb.service; enabled; vendor 
preset: enabled)
   Drop-In: /etc/systemd/system/mariadb.service.d
            └─migrated-from-my.cnf-settings.conf
    Active: inactive (dead) since Wed 2015-10-14 15:04:58 UTC; 58s ago
  Main PID: 880 (mysqld)
    Status: "Taking your SQL requests now..."
    CGroup: /system.slice/mariadb.service
            └─880 /usr/sbin/mysqld

############################

If I then shut down the MariaDB server, e.g. via the shutdown command 
(or just SIGTERM the process),


Actual results:


17:40 root@spaceman ~ # systemctl stop mariadb.service ; systemctl start mariadb.service

17:40 root@spaceman ~ # systemctl status -l  mariadb.service                              
● mariadb.service - MariaDB database server
   Loaded: loaded (/usr/lib/systemd/system/mariadb.service; enabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/mariadb.service.d
           └─migrated-from-my.cnf-settings.conf
   Active: active (running) since Wed 2015-10-14 17:40:30 CEST; 3s ago
  Process: 8715 ExecStartPre=/usr/bin/sync (code=exited, status=0/SUCCESS)
 Main PID: 8748 (mysqld)
   Status: "Taking your SQL requests now..."
   CGroup: /system.slice/mariadb.service
           └─8748 /usr/sbin/mysqld

Oct 14 17:40:28 spaceman systemd[1]: Starting MariaDB database server...
Oct 14 17:40:29 spaceman mysqld[8748]: 2015-10-14 17:40:29 139704082127040 [Note] /usr/sbin/mysqld (mysqld 10.1.8-MariaDB-log) starting as process 8748 ...
Oct 14 17:40:30 spaceman systemd[1]: Started MariaDB database server.


17:40 root@spaceman ~ # systemctl stop mariadb.service ; systemctl start mariadb.service  


17:41 root@spaceman ~ # systemctl disable mariadb.service ; systemctl enable mariadb.service    
Removed symlink /etc/systemd/system/mysqld.service.
Removed symlink /etc/systemd/system/multi-user.target.wants/mariadb.service.
Removed symlink /etc/systemd/system/mysql.service.
Created symlink from /etc/systemd/system/mysql.service to /usr/lib/systemd/system/mariadb.service.
Created symlink from /etc/systemd/system/mysqld.service to /usr/lib/systemd/system/mariadb.service.
Created symlink from /etc/systemd/system/multi-user.target.wants/mariadb.service to /usr/lib/systemd/system/mariadb.service.


17:41 root@spaceman ~ # systemctl status -l  mariadb.service                                
● mariadb.service - MariaDB database server
   Loaded: loaded (/usr/lib/systemd/system/mariadb.service; enabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/mariadb.service.d
           └─migrated-from-my.cnf-settings.conf
   Active: inactive (dead) since Wed 2015-10-14 17:41:11 CEST; 26s ago
 Main PID: 8949 (mysqld)
   Status: "Taking your SQL requests now..."
   CGroup: /system.slice/mariadb.service
           └─8949 /usr/sbin/mysqld

Oct 14 17:41:11 spaceman systemd[1]: Starting MariaDB database server...
Oct 14 17:41:13 spaceman mysqld[8949]: 2015-10-14 17:41:13 140174393092288 [Note] /usr/sbin/mysqld (mysqld 10.1.8-MariaDB-log) starting as process 8949 ...
Oct 14 17:41:16 spaceman systemd[1]: Started MariaDB database server.


17:41 root@spaceman ~ #  mysql -u root -e 'shutdown'

Broadcast message from systemd-journald@spaceman (Wed 2015-10-14 17:42:15 CEST):

systemd[1]: Caught <ABRT>, dumped core as pid 9204.

Broadcast message from systemd-journald@spaceman (Wed 2015-10-14 17:42:15 CEST):

systemd[1]: Freezing execution.



Expected results:


Additional info:


 gdb --batch --eval-command="thread apply all bt"   /usr/lib/systemd/systemd core.9204  | tee systemd-backtrace.txt
[New LWP 9204]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/lib/systemd/systemd --switched-root --system --deserialize 22'.
Program terminated with signal SIGABRT, Aborted.
#0  0x00007fcb43f3ad07 in kill () at ../sysdeps/unix/syscall-template.S:81
81      T_PSEUDO (SYSCALL_SYMBOL, SYSCALL_NAME, SYSCALL_NARGS)

Thread 1 (LWP 9204):
#0  0x00007fcb43f3ad07 in kill () at ../sysdeps/unix/syscall-template.S:81
#1  0x0000558c8168a4dd in crash.lto_priv.227 (sig=6) at ../src/core/main.c:168
#2  <signal handler called>
#3  0x00007fcb43f3a9c8 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:55
#4  0x00007fcb43f3c65a in __GI_abort () at abort.c:89
#5  0x0000558c81720ff2 in log_assert_failed_unreachable (text=<optimized out>, file=<optimized out>, line=<optimized out>, func=<optimized out>) at ../src/shared/log.c:758
#6  0x0000558c8172cafd in service_sigchld_event.lto_priv.358 (u=0x558c83188fb0, pid=<optimized out>, code=<optimized out>, status=0) at ../src/core/service.c:2460
#7  0x0000558c8174b4ff in manager_dispatch_sigchld (m=m@entry=0x558c83071db0) at ../src/core/manager.c:1678
#8  0x0000558c8174d4e5 in manager_dispatch_signal_fd.lto_priv.949 (source=<optimized out>, fd=<optimized out>, revents=<optimized out>, userdata=0x558c83071db0) at ../src/core/manager.c:1937
#9  0x0000558c817435a0 in source_dispatch.lto_priv.632 (s=0x558c83072980) at ../src/libsystemd/sd-event/sd-event.c:2115
#10 0x0000558c816b4180 in sd_event_dispatch (e=0x558c83072580) at ../src/libsystemd/sd-event/sd-event.c:2472
#11 0x0000558c8174c8c0 in sd_event_run (timeout=18446744073709551615, e=0x558c83072580) at ../src/libsystemd/sd-event/sd-event.c:2501
#12 manager_loop (m=0x558c83071db0) at ../src/core/manager.c:2056
#13 0x0000558c816856d1 in main (argc=<optimized out>, argv=<optimized out>) at ../src/core/main.c:1763

Comment 1 Daniel Black 2015-10-15 07:56:17 UTC
rpms: http://hasky.askmonty.org/archive/pack/10.1/build-8889/kvm-rpm-fedora22-amd64/rpms/ - mariadb-server

Comment 2 Daniel Black 2015-10-15 12:45:14 UTC
Created attachment 1083254 [details]
core.9204.gz

Comment 3 Daniel Black 2015-11-03 21:17:40 UTC
mariadb reference: https://mariadb.atlassian.net/browse/MDEV-8956

from Elena:

Correction to the initial scenario: it turns out that the service restart (quick or not) is irrelevant. It was not obvious because the problem happens sporadically, so after adding a delay the test was passing by pure chance.

Further experiments have shown that to get the pseudo-dead service, it is enough to do

# After reboot, mariadb service is already running
sudo systemctl disable mariadb
# optional delay
sudo systemctl enable mariadb
# optional delay
sudo systemctl status mariadb

# At this point we have the "dead" service, further shutdown causes systemd abort.

On my machine it failed in ~30-40% of runs, but it is highly volatile.

Comment 5 Fedora End Of Life 2016-07-19 19:19:51 UTC
Fedora 22 changed to end-of-life (EOL) status on 2016-07-19. Fedora 22 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.