First, this was rather difficult to investigate to the at least partially actionable form, not to speak about about reproducibility. The main reason for this difficulty was the fact this occurred while the system was in a shutdown procedure, hence no coredump was preserved, likely because of some systemd-spotted conflict ass demonstrated in the journal at around the same time (for another PID, likely something entirely else crashed as well and took a precedence in getting logged about, otherwise it could be logged for this crashed alsactl daemon in the same fashion): > systemd[1]: Requested transaction contradicts existing jobs: Transaction for systemd-coredump/start is destructive (systemd-poweroff.service has 'start' job queued, but 'stop' is included in transact> > systemd[1]: systemd-coredump.socket: Failed to queue service startup job (Maybe the service file is missing or not a non-template unit?): Transaction for systemd-coredump/start is destructive (system> > systemd[1]: systemd-coredump.socket: Failed with result 'resources'. So, what suspicious I spotted first: > kernel: alsactl[1037]: segfault at 28 ip 000055d3526bc55a sp 00007fffca8baf40 error 4 in alsactl[55d3526ad000+11000] > kernel: Code: e9 2b ff ff ff 48 83 4c 14 f8 00 e9 fe fe ff ff e8 ab 25 ff ff 66 66 2e 0f 1f 84 00 00 00 00 00 41 54 55 53 48 8b 2f 48 89 fb <8b> 55 28 85 d2 7e 1e 45 31 e4 0f 1f 40 00 48 8b 45 20 4a 8b 3c e0 > audit[1037]: ANOM_ABEND auid=4294967295 uid=0 gid=0 ses=4294967295 subj=system_u:system_r:alsa_t:s0 pid=1037 comm="alsactl" exe="/usr/sbin/alsactl" sig=11 res=1 So, we have a standard SIGSEGV case here, but what's the cause? Note that the PID is for what's been running since this boot session: > systemd[1]: Started Manage Sound Card State (restore and store). > audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=alsa-state comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success' > systemd[1]: Reached target Sound Card. > alsactl[1037]: alsactl 1.2.2 daemon started Well, I tried to recover the "point of crash" from IP info, but it appeared unreliable: hex(0x55d3526bc55a - 0x55d3526ad000) = 0xf55a (gdb) disassemble /r 0xf55a Dump of assembler code for function replace_untrusted_chars: [...] 0x000000000000f547 <+183>: 75 89 jne 0xf4d2 <replace_untrusted_chars+66> 0x000000000000f549 <+185>: 48 83 c4 08 add $0x8,%rsp 0x000000000000f54d <+189>: 44 89 e0 mov %r12d,%eax 0x000000000000f550 <+192>: 5b pop %rbx 0x000000000000f551 <+193>: 5d pop %rbp 0x000000000000f552 <+194>: 41 5c pop %r12 0x000000000000f554 <+196>: 41 5d pop %r13 0x000000000000f556 <+198>: 41 5e pop %r14 0x000000000000f558 <+200>: 41 5f pop %r15 > 0x000000000000f55a <+202>: c3 retq $ addr2line -e /usr/sbin/alsactl 0xf55a > /usr/src/debug/alsa-utils-1.2.2-1.fc33.x86_64/alsactl/init_utils_string.c:183 $ grep -rnF replace_untrusted_chars /usr/src/debug/alsa-utils-1.2.2-1.fc33.x86_64 > /usr/src/debug/alsa-utils-1.2.2-1.fc33.x86_64/alsactl/init_parse.c:1171: count = replace_untrusted_chars(temp2); > /usr/src/debug/alsa-utils-1.2.2-1.fc33.x86_64/alsactl/init_parse.c:1399: count = replace_untrusted_chars(result); > /usr/src/debug/alsa-utils-1.2.2-1.fc33.x86_64/alsactl/init_utils_string.c:153:static int replace_untrusted_chars(char *str) Well, that'd be rather strange to parse configuration when about to be shut down, wouldn't it? I could, however, find the op code match (from "Code:" line above) and map it down to this offset within this exact alsactl binary (see at the end): 0x15530 Now, follow with me -- if I understand it correctly, this refers to the code that would get executed if the segfault didn't chime in: (gdb) disassemble /r 0x15530 Dump of assembler code for function card_events: [...] 0x000000000001552b <+443>: e8 b0 fb ff ff callq 0x150e0 <remove_from_list> > 0x0000000000015530 <+448>: e9 2b ff ff ff jmpq 0x15460 <card_events+240> > 0x0000000000015535 <+453>: 48 83 4c 14 f8 00 orq $0x0,-0x8(%rsp,%rdx,1) > 0x000000000001553b <+459>: e9 fe fe ff ff jmpq 0x1543e <card_events+206> > 0x0000000000015540 <+464>: e8 ab 25 ff ff callq 0x7af0 <__stack_chk_fail@plt> (you can verify the opcodes with "Code:" section above) $ addr2line -e /usr/sbin/alsactl 0x15530 > /usr/src/debug/alsa-utils-1.2.2-1.fc33.x86_64/alsactl/daemon.c:25 $ cat -n /usr/src/debug/alsa-utils-1.2.2-1.fc33.x86_64/alsactl/daemon.c | sed -n 250,265p 250 while (snd_ctl_read(card->handle, ev) == 1) { 251 type = snd_ctl_event_get_type(ev); 252 if (type != SND_CTL_EVENT_ELEM) 253 continue; 254 mask = snd_ctl_event_elem_get_mask(ev); 255 snd_ctl_event_elem_get_id(ev, id); 256 if (mask == SND_CTL_EVENT_MASK_REMOVE) { 257 remove_from_list(&card->whitelist, id); 258 remove_from_list(&card->blacklist, id); > 259 continue; 260 } 261 if (mask & SND_CTL_EVENT_MASK_INFO) { 262 remove_from_list(&card->whitelist, id); 263 remove_from_list(&card->blacklist, id); 264 } 265 if (mask & (SND_CTL_EVENT_MASK_VALUE| I was unable to find any link between remove_from_list function and replace_untrusted_chars, to it seems to me as if there's something fishy with the former way of nailing it, but this latter seems more plausible -- we can even reconstruct the whole backtrace solely based on logic of inevitability of caller-callee relations: > [???] > alsactl/daemon.c: remove_from_list() # see above > alsactl/daemon.c: card_events() # ditto > alsactl/daemon.c: state_daemon() > alsactl/alsactl.c: main() That's the best approximation of the problem I can provide. At the very least, I'd suggest reviewing every and all suspect case reported by static analyzers, perhaps the problem is already known at this level. $ rpm -q alsa-utils pulseaudio > alsa-utils-1.2.2-1.fc33.x86_64 > pulseaudio-13.99.1-2.fc33.x86_64 (the former is important to align on data the offsets above relate to, the latter may have something to do with disappearance of sound cards) snippet from alsa-info.sh: > !!Soundcards recognised by ALSA > !!----------------------------- > > 0 [PCH ]: HDA-Intel - HDA Intel PCH > HDA Intel PCH at 0xf2240000 irq 134 > > > !!PCI Soundcards installed in the system > !!-------------------------------------- > > 00:1f.3 Audio device [0403]: Intel Corporation 100 Series/C230 Series Chipset Family HD Audio Controller [8086:a170] (rev 31) > Subsystem: Lenovo Device [17aa:5050]
> The main reason for this difficulty was the fact this occurred > while the system was in a shutdown procedure, hence no coredump > was preserved, likely because of some systemd-spotted conflict > as demonstrated in the journal at around the same time Followed-up at https://github.com/systemd/systemd/issues/7176#issuecomment-604450737
This bug appears to have been reported against 'rawhide' during the Fedora 33 development cycle. Changing version to 33.
FEDORA-2021-e53b66a8c6 has been submitted as an update to Fedora 34. https://bodhi.fedoraproject.org/updates/FEDORA-2021-e53b66a8c6
FEDORA-2021-e53b66a8c6 has been pushed to the Fedora 34 testing repository. Soon you'll be able to install the update with the following command: `sudo dnf upgrade --enablerepo=updates-testing --advisory=FEDORA-2021-e53b66a8c6` You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2021-e53b66a8c6 See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.
FEDORA-2021-e53b66a8c6 has been pushed to the Fedora 34 stable repository. If problem still persists, please make note of it in this bug report.