I used 'coredumpctl list' to review my Fedora 40 core dumps and found many repeated core dumps from smartctl (smartmontools-7.4-3.fc40.x86_64).

This is the latest 'coredumpctl info' listing:

PID: 2600 (smartctl)
UID: 0 (root)
GID: 0 (root)
Signal: 6 (ABRT)
Timestamp: Sat 2024-04-27 13:08:59 EDT (2h 57min ago)
Command Line: /usr/sbin/smartctl --all /dev/nvme0n1
Executable: /usr/sbin/smartctl
Control Group: /system.slice/system-dbus\x2d:1.3\x2dorg.kde.kded.smart.slice/dbus-:1.3-org.kde.kded.smart
Unit: dbus-:1.3-org.kde.kded.smart
Slice: system-dbus\x2d:1.3\x2dorg.kde.kded.smart.slice
Boot ID: 2a7263aecd1a41eea19c7831a678d1b4
Machine ID: 00ecfe8d976c4992b66770980e8d368a
Hostname: msi
Storage: /var/lib/systemd/coredump/core.smartctl.0.2a7263aecd1a41eea19c7831a678d1b4.2600.1714237739000000.zst (inaccessible)
Message: Process 2600 (smartctl) of user 0 dumped core.

Module libpcre2-8.so.0 from rpm pcre2-10.42-2.fc40.2.x86_64
Module libselinux.so.1 from rpm libselinux-3.6-4.fc40.x86_64
Stack trace of thread 2600:
#0 0x00007f37905c8144 __pthread_kill_implementation (libc.so.6 + 0x98144)
#1 0x00007f379057065e raise (libc.so.6 + 0x4065e)
#2 0x00007f3790558902 abort (libc.so.6 + 0x28902)
#3 0x00007f3790559767 __libc_message_impl.cold (libc.so.6 + 0x29767)
#4 0x00007f37905d2175 malloc_printerr (libc.so.6 + 0xa2175)
#5 0x00007f37905d450c _int_free (libc.so.6 + 0xa450c)
#6 0x00007f37905d6dce free (libc.so.6 + 0xa6dce)
#7 0x0000562a8d42a8f1 _ZN14drive_databaseD1Ev (smartctl + 0x658f1)
#8 0x00007f3790572bb1 __run_exit_handlers (libc.so.6 + 0x42bb1)
#9 0x00007f3790572c7e exit (libc.so.6 + 0x42c7e)
#10 0x00007f379055a08f __libc_start_call_main (libc.so.6 + 0x2a08f)
#11 0x00007f379055a14b __libc_start_main@@GLIBC_2.34 (libc.so.6 + 0x2a14b)
#12 0x0000562a8d3f6745 _start (smartctl + 0x31745)
ELF object binary architecture: AMD x86-64

The drive info from 'inxi -aD' is:

Drives:
Local Storage: total: 953.87 GiB used: 71.86 GiB (7.5%)
ID-1: /dev/nvme0n1 maj-min: 259:0 vendor: A-Data model: SX6000LNP size: 953.87 GiB block-size: physical: 512 B logical: 512 B speed: 31.6 Gb/s lanes: 4 tech: SSD serial: 2K2020091093 fw-rev: V9002s45 temp: 29.9 C scheme: GPT
SMART: yes health: PASSED on: 35d 16h cycles: 1,543 read-units: 6,104,462 [3.12 TB] written-units: 10,949,869 [5.60 TB]

Reproducible: Sometimes

Steps to Reproduce:
1. Have smartmontools package installed on host
2. Boot Fedora 40 host

Actual Results:
1. smartctl crashes leaving core information

Expected Results:
1. smartctl should not crash

Crashes occur on normal boots. The one exception among the newest five boots is the short boot used for a 'dnf5 offline reboot', which 'journalctl --list-boots' shows as "-3":

-3 5de9b0cf2f064d659eb349ea6df447f7 Fri 2024-04-26 13:12:38 EDT

# coredumpctl list |tail -5
Thu 2024-04-25 17:47:49 EDT 2441 1000 1000 SIGABRT present /usr/bin/plasmashell 27.0M
Fri 2024-04-26 12:24:18 EDT 2682 0 0 SIGABRT present /usr/sbin/smartctl 190.1K
Fri 2024-04-26 13:17:08 EDT 2606 0 0 SIGABRT present /usr/sbin/smartctl 190.3K
Fri 2024-04-26 17:27:35 EDT 2524 0 0 SIGABRT present /usr/sbin/smartctl 190.3K
Sat 2024-04-27 13:08:59 EDT 2600 0 0 SIGABRT present /usr/sbin/smartctl 190.5K

# journalctl --list-boots |tail -5
-4 90a57b349c584bdfa15aa57c17ac2b33 Fri 2024-04-26 12:23:40 EDT Fri 2024-04-26 13:12:25 EDT
-3 5de9b0cf2f064d659eb349ea6df447f7 Fri 2024-04-26 13:12:38 EDT Fri 2024-04-26 13:16:18 EDT
-2 7123b1d0571349dea5df5f4093b06ea9 Fri 2024-04-26 13:16:31 EDT Fri 2024-04-26 14:58:10 EDT
-1 6be890c9068c416abbd69d3925f87b45 Fri 2024-04-26 17:27:01 EDT Fri 2024-04-26 18:44:06 EDT
0 2a7263aecd1a41eea19c7831a678d1b4 Sat 2024-04-27 13:08:20 EDT Sat 2024-04-27 16:08:03 EDT
I suspect the short 'dnf5 offline reboot' boot at Fri 2024-04-26 13:12:38 EDT shows no smartctl crash because it never starts smartd.service: 'dnf5 offline reboot' does a minimal boot, upgrades the packages, and then does a full boot. So the crash at 13:17:08 belongs to the full boot "-2", which started at 2024-04-26 13:16:31 EDT.
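The boot-by-boot reasoning above can be checked mechanically. The following is a minimal sketch, not a general tool: it hard-codes the boot windows and smartctl crash times copied from the 'journalctl --list-boots' and 'coredumpctl list' output in this report (with the EDT suffix dropped), and the helper names `boot_for` and `crashes_per_boot` are made up for illustration.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

# Boot windows (index, first entry, last entry) copied from the
# 'journalctl --list-boots' output in this report; EDT suffix dropped.
BOOTS = [
    (-4, "2024-04-26 12:23:40", "2024-04-26 13:12:25"),
    (-3, "2024-04-26 13:12:38", "2024-04-26 13:16:18"),
    (-2, "2024-04-26 13:16:31", "2024-04-26 14:58:10"),
    (-1, "2024-04-26 17:27:01", "2024-04-26 18:44:06"),
    (0, "2024-04-27 13:08:20", "2024-04-27 16:08:03"),
]

# smartctl crash times copied from the 'coredumpctl list' output.
CRASHES = [
    "2024-04-26 12:24:18",
    "2024-04-26 13:17:08",
    "2024-04-26 17:27:35",
    "2024-04-27 13:08:59",
]


def parse(timestamp):
    """Parse a 'YYYY-MM-DD HH:MM:SS' string into a datetime."""
    return datetime.strptime(timestamp, FMT)


def boot_for(crash, boots=BOOTS):
    """Return the boot index whose window contains the crash, or None."""
    moment = parse(crash)
    for index, first, last in boots:
        if parse(first) <= moment <= parse(last):
            return index
    return None


def crashes_per_boot(crashes=CRASHES, boots=BOOTS):
    """Count how many crashes fall inside each boot window."""
    counts = {index: 0 for index, _, _ in boots}
    for crash in crashes:
        index = boot_for(crash, boots)
        if index is not None:
            counts[index] += 1
    return counts


if __name__ == "__main__":
    for index, count in crashes_per_boot().items():
        print(f"boot {index:>2}: {count} smartctl crash(es)")
```

With the data from this report, every boot except the short offline-upgrade boot "-3" contains exactly one smartctl crash.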
I ran the same command manually after boot had completed, and no core dump occurred:

sudo /usr/sbin/smartctl --all /dev/nvme0n1
These constant crashes are fairly annoying, so I've raised this to 'high'. On the affected system I've temporarily turned off the 'smartd' service.
I've bumped this up to Fedora 41 and smartmontools-7.4-6.fc41.x86_64, since I've confirmed that it also happens in the latest Fedora 41 pre-beta.
The "coredumpctl info" for Fedora 41, kernel 6.11.0-63.fc41.x86_64 and smartmontools-7.4-6 shows:

PID: 2338 (smartctl)
UID: 0 (root)
GID: 0 (root)
Signal: 6 (ABRT)
Timestamp: Tue 2024-10-01 10:28:21 EDT (27min ago)
Command Line: /usr/sbin/smartctl --all /dev/nvme0n1
Executable: /usr/sbin/smartctl
Control Group: /system.slice/system-dbus\x2d:1.3\x2dorg.kde.kded.smart.slice/dbus-:1.3-org.kde.kded.smart
Unit: dbus-:1.3-org.kde.kded.smart
Slice: system-dbus\x2d:1.3\x2dorg.kde.kded.smart.slice
Boot ID: c2d5998fcf6448889e7e4e2120e6e170
Machine ID: 606b1acf646545ed8a19c9cf0245d31e
Hostname: msi
Storage: /var/lib/systemd/coredump/core.smartctl.0.c2d5998fcf6448889e7e4e2120e6e170.2338.1727792901000000.zst (inaccessible)
Message: Process 2338 (smartctl) of user 0 dumped core.

Module libpcre2-8.so.0 from rpm pcre2-10.44-1.fc41.1.x86_64
Module libselinux.so.1 from rpm libselinux-3.7-5.fc41.x86_64
Stack trace of thread 2338:
#0 0x00007fc9dd651724 __pthread_kill_implementation (libc.so.6 + 0x72724)
#1 0x00007fc9dd5f8d0e raise (libc.so.6 + 0x19d0e)
#2 0x00007fc9dd5e0942 abort (libc.so.6 + 0x1942)
#3 0x00007fc9dd5e17a7 __libc_message_impl.cold (libc.so.6 + 0x27a7)
#4 0x00007fc9dd65b8a5 malloc_printerr (libc.so.6 + 0x7c8a5)
#5 0x00007fc9dd65dcdc _int_free (libc.so.6 + 0x7ecdc)
#6 0x00007fc9dd66060e free (libc.so.6 + 0x8160e)
#7 0x00005594d11e5de1 _ZN14drive_databaseD1Ev (smartctl + 0x65de1)
#8 0x00007fc9dd5fb461 __run_exit_handlers (libc.so.6 + 0x1c461)
#9 0x00007fc9dd5fb52e exit (libc.so.6 + 0x1c52e)
#10 0x00007fc9dd5e224f __libc_start_call_main (libc.so.6 + 0x324f)
#11 0x00007fc9dd5e230b __libc_start_main@@GLIBC_2.34 (libc.so.6 + 0x330b)
#12 0x00005594d11b1755 _start (smartctl + 0x31755)
ELF object binary architecture: AMD x86-64
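To back up the claim that the Fedora 41 crash is the same one seen on Fedora 40, the two stack traces can be compared while ignoring the load addresses. This is a rough sketch only: the frame lines are copied from the two 'coredumpctl info' listings in this report, and `signature` and `first_frame_outside` are hypothetical helper names. The one frame outside libc, `_ZN14drive_databaseD1Ev`, is the Itanium C++ ABI mangled name of the destructor `drive_database::~drive_database()`, reached here from exit handlers.

```python
import re

# One stack frame: "#N 0xADDR symbol (module + 0xOFFSET)"
FRAME = re.compile(r"#(\d+)\s+0x[0-9a-f]+\s+(\S+)\s+\((\S+) \+ 0x[0-9a-f]+\)")

# Frames copied from the Fedora 40 listing in this report.
F40_TRACE = """
#0 0x00007f37905c8144 __pthread_kill_implementation (libc.so.6 + 0x98144)
#1 0x00007f379057065e raise (libc.so.6 + 0x4065e)
#2 0x00007f3790558902 abort (libc.so.6 + 0x28902)
#3 0x00007f3790559767 __libc_message_impl.cold (libc.so.6 + 0x29767)
#4 0x00007f37905d2175 malloc_printerr (libc.so.6 + 0xa2175)
#5 0x00007f37905d450c _int_free (libc.so.6 + 0xa450c)
#6 0x00007f37905d6dce free (libc.so.6 + 0xa6dce)
#7 0x0000562a8d42a8f1 _ZN14drive_databaseD1Ev (smartctl + 0x658f1)
#8 0x00007f3790572bb1 __run_exit_handlers (libc.so.6 + 0x42bb1)
#9 0x00007f3790572c7e exit (libc.so.6 + 0x42c7e)
#10 0x00007f379055a08f __libc_start_call_main (libc.so.6 + 0x2a08f)
#11 0x00007f379055a14b __libc_start_main@@GLIBC_2.34 (libc.so.6 + 0x2a14b)
#12 0x0000562a8d3f6745 _start (smartctl + 0x31745)
"""

# Frames copied from the Fedora 41 listing in this report.
F41_TRACE = """
#0 0x00007fc9dd651724 __pthread_kill_implementation (libc.so.6 + 0x72724)
#1 0x00007fc9dd5f8d0e raise (libc.so.6 + 0x19d0e)
#2 0x00007fc9dd5e0942 abort (libc.so.6 + 0x1942)
#3 0x00007fc9dd5e17a7 __libc_message_impl.cold (libc.so.6 + 0x27a7)
#4 0x00007fc9dd65b8a5 malloc_printerr (libc.so.6 + 0x7c8a5)
#5 0x00007fc9dd65dcdc _int_free (libc.so.6 + 0x7ecdc)
#6 0x00007fc9dd66060e free (libc.so.6 + 0x8160e)
#7 0x00005594d11e5de1 _ZN14drive_databaseD1Ev (smartctl + 0x65de1)
#8 0x00007fc9dd5fb461 __run_exit_handlers (libc.so.6 + 0x1c461)
#9 0x00007fc9dd5fb52e exit (libc.so.6 + 0x1c52e)
#10 0x00007fc9dd5e224f __libc_start_call_main (libc.so.6 + 0x324f)
#11 0x00007fc9dd5e230b __libc_start_main@@GLIBC_2.34 (libc.so.6 + 0x330b)
#12 0x00005594d11b1755 _start (smartctl + 0x31755)
"""


def signature(trace):
    """Return [(symbol, module), ...] in frame order, ignoring addresses."""
    return [(m.group(2), m.group(3)) for m in FRAME.finditer(trace)]


def first_frame_outside(trace, module="libc.so.6"):
    """Return the first (symbol, module) pair that is not in `module`."""
    for symbol, frame_module in signature(trace):
        if frame_module != module:
            return symbol, frame_module
    return None


if __name__ == "__main__":
    print("same signature:", signature(F40_TRACE) == signature(F41_TRACE))
    print("first non-libc frame:", first_frame_outside(F41_TRACE))
```

Both traces give the same sequence of symbols and modules: free() aborts via malloc_printerr while the drive_database destructor runs from exit handlers.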
What are the exact steps to reproduce this? There is a smartmontools service called smartd.service which runs /usr/sbin/smartd, but the report shows /usr/sbin/smartctl, so something else must be starting it.
1. 'sudo systemctl enable smartd.service' is all I needed to reproduce it (that is, having the smartd service enabled)
The smartd service runs the smartd executable, not smartctl, so the crash can't originate from it. Looking at the report, this line:

> Control Group: /system.slice/system-dbus\x2d:1.3\x2dorg.kde.kded.smart.slice/dbus-:1.3-org.kde.kded.smart

indicates that this has something to do with KDE. After some checking, the file /usr/share/dbus-1/system-services/org.kde.kded.smart.service comes from the plasma-disks package.

My guess is that plasma-disks tries to do something that the SELinux policy does not allow; unfortunately the core dump does not provide much information. Is there any smartmontools related information in the journalctl log? You may also try booting with SELinux in permissive mode to see whether the crash happens again.
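The step from the Control Group line to the D-Bus service file can be sketched as follows. This is illustrative only: it assumes the last cgroup component has the form 'dbus-<connection>-<bus name>' (inferred from this report, not taken from any documentation), and the helper names are hypothetical. The \x2d sequences are systemd's escaping of the '-' character.

```python
import re

# Control Group value copied from the 'coredumpctl info' output in this report.
CGROUP = (
    r"/system.slice/system-dbus\x2d:1.3\x2dorg.kde.kded.smart.slice"
    r"/dbus-:1.3-org.kde.kded.smart"
)


def unescape_unit(text):
    """Undo systemd-style \\xNN escapes, e.g. \\x2d becomes '-'."""
    return re.sub(
        r"\\x([0-9a-fA-F]{2})",
        lambda match: chr(int(match.group(1), 16)),
        text,
    )


def bus_name_from_cgroup(cgroup):
    """Extract the D-Bus name from the last cgroup component.

    Assumption: that component looks like 'dbus-<connection>-<bus name>',
    as it does in this report.
    """
    unit = unescape_unit(cgroup).rsplit("/", 1)[-1]
    match = re.fullmatch(r"dbus-(:[\d.]+)-(.+)", unit)
    return match.group(2) if match else None


def service_file(bus_name):
    """Path of the system D-Bus activation file for a bus name."""
    return f"/usr/share/dbus-1/system-services/{bus_name}.service"


if __name__ == "__main__":
    name = bus_name_from_cgroup(CGROUP)
    print(name)
    print(service_file(name))
```

On the affected host, running 'rpm -qf' on the resulting path reports which package owns the file.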
> Is there any smartmontools related information in journalctl log?

sure:

journalctl -b 0 --no-pager -g smart
Oct 14 13:56:25 msi audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=dbus-:1.3-org.kde.kded.smart@0 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Oct 14 13:56:25 msi audit[2330]: ANOM_ABEND auid=4294967295 uid=0 gid=0 ses=4294967295 subj=system_u:system_r:unconfined_service_t:s0 pid=2330 comm="smartctl" exe="/usr/sbin/smartctl" sig=6 res=1
Oct 14 13:56:25 msi systemd-coredump[2346]: [🡕] Process 2330 (smartctl) of user 0 dumped core.

Module libpcre2-8.so.0 from rpm pcre2-10.44-1.fc41.1.x86_64
Module libselinux.so.1 from rpm libselinux-3.7-5.fc41.x86_64
Stack trace of thread 2330:
#0 0x00007fa8b699a724 __pthread_kill_implementation (libc.so.6 + 0x72724)
#1 0x00007fa8b6941d0e raise (libc.so.6 + 0x19d0e)
#2 0x00007fa8b6929942 abort (libc.so.6 + 0x1942)
#3 0x00007fa8b692a7a7 __libc_message_impl.cold (libc.so.6 + 0x27a7)
#4 0x00007fa8b69a48a5 malloc_printerr (libc.so.6 + 0x7c8a5)
#5 0x00007fa8b69a6cdc _int_free (libc.so.6 + 0x7ecdc)
#6 0x00007fa8b69a960e free (libc.so.6 + 0x8160e)
#7 0x000055b95189cde1 _ZN14drive_databaseD1Ev (smartctl + 0x65de1)
#8 0x00007fa8b6944461 __run_exit_handlers (libc.so.6 + 0x1c461)
#9 0x00007fa8b694452e exit (libc.so.6 + 0x1c52e)
#10 0x00007fa8b692b24f __libc_start_call_main (libc.so.6 + 0x324f)
#11 0x00007fa8b692b30b __libc_start_main@@GLIBC_2.34 (libc.so.6 + 0x330b)
#12 0x000055b951868755 _start (smartctl + 0x31755)
ELF object binary architecture: AMD x86-64
Oct 14 13:56:26 msi abrt-notification[2704]: [🡕] Process 2921 (smartctl) crashed in ??()
Oct 14 13:56:35 msi audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=dbus-:1.3-org.kde.kded.smart@0 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
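Since the SELinux question came up, it may help to pull the fields out of the ANOM_ABEND audit record in this log; its subj= field carries the SELinux context the crashing smartctl process ran under. A small sketch, assuming the usual key=value layout of audit records, with hypothetical helper names `parse_audit` and `selinux_type`:

```python
import re

# ANOM_ABEND record copied from the journalctl output in this report.
AUDIT_LINE = (
    "ANOM_ABEND auid=4294967295 uid=0 gid=0 ses=4294967295 "
    "subj=system_u:system_r:unconfined_service_t:s0 pid=2330 "
    'comm="smartctl" exe="/usr/sbin/smartctl" sig=6 res=1'
)


def parse_audit(line):
    """Split key=value pairs of an audit record into a dict of strings."""
    fields = {}
    for key, value in re.findall(r'(\w+)=("[^"]*"|\S+)', line):
        fields[key] = value.strip('"')
    return fields


def selinux_type(subj):
    """Return the type part of a 'user:role:type:level' context, or None."""
    parts = subj.split(":")
    return parts[2] if len(parts) >= 3 else None


if __name__ == "__main__":
    record = parse_audit(AUDIT_LINE)
    print(record["exe"], "signal", record["sig"])
    print("SELinux type:", selinux_type(record["subj"]))
```

For this record the type is unconfined_service_t, and the signal number 6 matches the SIGABRT shown by coredumpctl.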
I've stopped accumulating core dumps by simply removing the plasma-disks package. So you're right, it's not related to smartd.
I will close this and open a new bug under the correct component, "plasma-disks". New bug: https://bugzilla.redhat.com/show_bug.cgi?id=2324086

*** This bug has been marked as a duplicate of bug 2324086 ***