Bug 466377 - alsactl segfaults in a boot sequence
Status: CLOSED WONTFIX
Product: Fedora
Classification: Fedora
Component: alsa-utils
Version: 12
Hardware: All Linux
Priority: medium
Severity: medium
Assigned To: Jaroslav Kysela
QA Contact: Fedora Extras Quality Assurance
Keywords: Triaged
Reported: 2008-10-09 18:20 EDT by Michal Jaegermann
Modified: 2010-12-05 02:07 EST
Last Closed: 2010-12-05 02:07:32 EST
Doc Type: Bug Fix

Attachments
an output of 'lspci -tv' from the affected machine (1.63 KB, text/plain)
2008-10-09 18:20 EDT, Michal Jaegermann
dmesg with alsactl segfaulting (28.90 KB, text/plain)
2009-02-04 14:05 EST, Michal Jaegermann
strace output for alsactl command (21.30 KB, text/plain)
2009-02-05 15:41 EST, Michal Jaegermann
an output from alsa-info.txt for the current rawhide (11.91) (27.91 KB, text/plain)
2009-09-07 14:46 EDT, Michal Jaegermann
patch preventing a segfault when /usr/share/alsa is not available (479 bytes, patch)
2009-12-13 20:51 EST, Michal Jaegermann

Description Michal Jaegermann 2008-10-09 18:20:26 EDT
Created attachment 319956 [details]
an output of 'lspci -tv' from the affected machine

Description of problem:

I found the following in the logs after booting 2.6.27-0.398.rc9.fc10.x86_64:


kernel: gameport: NS558 PnP Gameport is pnp00:0d/gameport0, io 0x200, speed 979kHz
kernel: VIA 82xx Modem 0000:00:11.6: enabling device (0000 -> 0001)
kernel: VIA 82xx Modem 0000:00:11.6: PCI INT C -> GSI 22 (level, low) -> IRQ 22
kernel: ppdev: user-space parallel port driver
kernel: VIA 82xx Modem 0000:00:11.6: PCI INT C disabled
kernel: VIA 82xx Modem: probe of 0000:00:11.6 failed with

kernel: VIA 82xx Audio 0000:00:11.5: PCI INT C -> GSI 22 (level, low) -> IRQ 22
kernel: ALSA sound/pci/via82xx.c:580: codec_read: codec 0 ]
kernel: ALSA sound/pci/via82xx.c:580: codec_read: codec 0 ]
kernel: ALSA sound/pci/via82xx.c:580: codec_read: codec 0 ]
kernel: ALSA sound/pci/via82xx.c:580: codec_read: codec 0 ]
kernel: alsactl[1478]: segfault at 0 ip 00000000004106a7 sp 00007fff38bab5a0 error 4 in alsactl[400000+14000]

The layout of the PCI bus is in an attachment.

Version-Release number of selected component (if applicable):
alsa-utils-1.0.18-3.rc3.fc10.x86_64

How reproducible:
No idea.  So far I have found this in the logs only once.
Comment 1 Bug Zapper 2008-11-25 22:43:57 EST
This bug appears to have been reported against 'rawhide' during the Fedora 10 development cycle.
Changing version to '10'.

More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Comment 2 Jaroslav Kysela 2009-02-04 05:29:17 EST
Does this bug exist in current F-10?
Comment 3 Michal Jaegermann 2009-02-04 14:02:05 EST
> Does this bug exist in current F-10?

The machine in question runs rawhide and here is a small sample of recent messages of that kind:

Feb  1 16:13:22 dyna0 kernel: alsactl[1331]: segfault at 0 ip 00000000004106b7 sp 00007fffa8fbfb40 error 4 in alsactl[400000+14000]
Feb  2 01:46:06 dyna0 kernel: alsactl[1577]: segfault at 0 ip 00000000004106b7 sp 00007ffff9156cd0 error 4 in alsactl[400000+14000]
Feb  2 13:41:04 dyna0 kernel: alsactl[1576]: segfault at 0 ip 00000000004106b7 sp 00007fffcf4d1050 error 4 in alsactl[400000+14000]
Feb  3 09:30:59 dyna0 kernel: alsactl[1585]: segfault at 0 ip 00000000004106b7 sp 00007fff9d42afa0 error 4 in alsactl[400000+14000]
Feb  3 11:00:23 dyna0 kernel: alsactl[1521]: segfault at 0 ip 00000000004106b7 sp 00007fff6f65f0f0 error 4 in alsactl[400000+14000]
Feb  3 11:09:18 dyna0 kernel: alsactl[1581]: segfault at 0 ip 00000000004106b7 sp 00007fff7e75d2e0 error 4 in alsactl[400000+14000]
Feb  3 17:47:55 dyna0 kernel: alsactl[1580]: segfault at 0 ip 00000000004106b7 sp 00007fff273f3f70 error 4 in alsactl[400000+14000]
Feb  4 11:01:43 dyna0 kernel: alsactl[1537]: segfault at 0 ip 00000000004106b7 sp 00007ffffdab7540 error 4 in alsactl[400000+14000]
Feb  4 11:01:43 dyna0 kernel: alsactl[1580]: segfault at 0 ip 00000000004106b7 sp 00007fffa92b3e30 error 4 in alsactl[400000+14000]

A dmesg from the most recent boot with segfaults like the above is attached.
Comment 4 Michal Jaegermann 2009-02-04 14:05:51 EST
Created attachment 330903 [details]
dmesg with alsactl segfaulting
Comment 5 Michal Jaegermann 2009-02-04 14:27:49 EST
I realized something.  Does alsactl try to read something from, say, /usr/share?  Judging from the ordering of entries in dmesg, anything but / may not be mounted yet when such access attempts happen during a boot.

Looking closer, even / is not there yet and we are still on the initrd. In this particular case / resides on /dev/sda11.
Comment 6 Jaroslav Kysela 2009-02-05 08:31:44 EST
No, the alsa udev rules read configuration from /etc/alsa/alsactl.conf and /lib/alsa/init.

Could you run the 'alsaunmute' script and:

/sbin/alsactl -E ALSA_CONFIG_PATH=/etc/alsa/alsactl.conf \
  --initfile=/lib/alsa/init/00main restore /dev/snd/controlC0

... to see if segfault can be reproduced after boot time?
Comment 7 Michal Jaegermann 2009-02-05 15:41:22 EST
Created attachment 331050 [details]
strace output for alsactl command

> ... to see if segfault can be reproduced after boot time?

No, unfortunately it does not.  'alsaunmute', which really means alsactl running the 'init' command, prints on stderr:

Unknown hardware: "VIA8237" "Analog Devices AD1985" "AC97a:41445375" "" ""
Hardware is initialized using a guess method

That 41445375 looks suspiciously like "uSDA". Is this expected?
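
For reference, the eight hex digits can be decoded one byte at a time. The following is only a minimal sketch; `hex_to_ascii` is a hypothetical helper written for this note and is not part of alsa-utils. Read in the order printed, the bytes 41 44 53 75 come out as "ADSu"; "uSDA" is the same four bytes read in reverse.

```shell
#!/bin/sh
# hex_to_ascii: hypothetical helper, not part of alsa-utils.
# Prints the ASCII characters for a string of hex digit pairs.
hex_to_ascii() {
    hex=$1
    out=
    while [ "${#hex}" -ge 2 ]; do
        byte=${hex%"${hex#??}"}    # first two hex digits
        hex=${hex#??}              # drop them from the input
        # Convert the pair to a three-digit octal escape and let printf emit it.
        out="$out$(printf "\\$(printf '%03o' "0x$byte")")"
    done
    printf '%s\n' "$out"
}

hex_to_ascii 41445375    # prints: ADSu
```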

The alsactl command you asked for just runs silently.  I looked at the strace output for it and do not see anything that jumps out at me as a "smoking gun" either.  Anyway, it is attached.  Maybe you can find some clues there?

Is it a timing issue, with some hardware only becoming available later?  Whatever it is, I keep collecting such things (this is logwatch output):

 WARNING:  Segmentation Faults in these executables
    alsactl :  4 Time(s)

 WARNING:  Kernel Errors Present
    VIA 82xx Modem: probe of 0000:00:11.6 failed with error -13 ...:  4 Time(s)

with 0000:00:11.6 being "VIA Technologies, Inc. AC'97 Modem Controller", PCI id 1106:3068.
Comment 8 Michal Jaegermann 2009-02-05 15:59:46 EST
I tried to blacklist snd_via82xx_modem, as it is of no real use on that hardware. It is not loaded anymore but on a reboot I got:

VIA 82xx Audio 0000:00:11.5: PCI INT C -> GSI 22 (level, low) -> IRQ 22
VIA 82xx Audio 0000:00:11.5: setting latency timer to 64
ALSA sound/pci/via82xx.c:580: codec_read: codec 0 is not valid [0xfe0000]
ALSA sound/pci/via82xx.c:580: codec_read: codec 0 is not valid [0xfe0000]
ALSA sound/pci/via82xx.c:580: codec_read: codec 0 is not valid [0xfe0000]
ALSA sound/pci/via82xx.c:580: codec_read: codec 0 is not valid [0xfe0000]
alsactl[1567]: segfault at 0 ip 00000000004106b7 sp 00007fff769444c0 error 4 in alsactl[400000+14000]

Sigh!
Comment 9 Bug Zapper 2009-06-09 05:45:59 EDT
This bug appears to have been reported against 'rawhide' during the Fedora 11 development cycle.
Changing version to '11'.

More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Comment 10 Michal Jaegermann 2009-06-09 14:20:11 EDT
The same bug continues to show up, currently with the 2.6.30-0.97.rc8.fc12 kernel.
Comment 11 Noura El hawary 2009-06-09 17:48:10 EDT
This bug appears to have been reported against 'rawhide' during the Fedora 11 development cycle.
Changing version to '11'.

More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Comment 12 Michal Jaegermann 2009-06-09 17:56:29 EDT
Sigh!  Can you read comments before messing things up?  In particular comment #9
and comment #10.
Comment 13 Michal Jaegermann 2009-09-07 14:46:31 EDT
Created attachment 360011 [details]
an output from alsa-info.txt for the current rawhide (11.91)

This segfault does not want to go away.  Here is how it looks now.

VIA 82xx Audio 0000:00:11.5: setting latency timer to 64
ALSA sound/pci/via82xx.c:587: codec_read: codec 0 is not valid [0xfe0000]
ALSA sound/pci/via82xx.c:587: codec_read: codec 0 is not valid [0xfe0000]
ALSA sound/pci/via82xx.c:587: codec_read: codec 0 is not valid [0xfe0000]
ALSA sound/pci/via82xx.c:587: codec_read: codec 0 is not valid [0xfe0000]
alsactl[700]: segfault at 0 ip 000000000040b454 sp 00007fff107b3fe0 error 4 in alsactl[400000+14000]

Attached is an output of 'alsa-info' for the current rawhide and
2.6.31-0.204.rc9.fc12.x86_64 kernel
Comment 14 Matěj Cepl 2009-09-24 10:34:04 EDT
I think missing sound deserves to be a blocker.
Comment 15 Adam Williamson 2009-10-21 11:30:48 EDT
This was discussed at a preliminary blocker bug review meeting today. Unfortunately we can't take every system-specific sound bug as a blocker. Also, this likely does not result in 'missing sound', exactly - even if sound fails to work at boot time, a manual 'alsaunmute' and/or 'alsactl init' after boot (or probably in /etc/rc.local) should do the trick, by the description.

-- 
Fedora Bugzappers volunteer triage team
https://fedoraproject.org/wiki/BugZappers
Comment 16 Bug Zapper 2009-11-16 04:30:18 EST
This bug appears to have been reported against 'rawhide' during the Fedora 12 development cycle.
Changing version to '12'.

More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Comment 17 Cătălin George Feștilă 2009-12-06 04:48:04 EST
PCI: setting IRQ 3 as level-triggered
VIA 82xx Audio 0000:00:11.5: PCI INT C -> Link[LNKF] -> GSI 3 (level, low) -> IRQ 3
VIA 82xx Audio 0000:00:11.5: setting latency timer to 64
ACPI: PCI Interrupt Link [LNKA] enabled at IRQ 11
PCI: setting IRQ 11 as level-triggered
nvidia 0000:01:00.0: PCI INT A -> Link[LNKA] -> GSI 11 (level, low) -> IRQ 11
NVRM: loading NVIDIA UNIX x86 Kernel Module  173.14.22  Sun Nov  8 20:26:31 PST 2009
ALSA sound/pci/via82xx.c:587: codec_read: codec 0 is not valid [0xfe0000]
ALSA sound/pci/via82xx.c:587: codec_read: codec 0 is not valid [0xfe0000]
ALSA sound/pci/via82xx.c:587: codec_read: codec 0 is not valid [0xfe0000]
ALSA sound/pci/via82xx.c:587: codec_read: codec 0 is not valid [0xfe0000]

Fedora 12 with kernel 2.6.31.6-145.fc12.i686 and Nvidia driver 173
Comment 18 Michal Jaegermann 2009-12-06 10:32:35 EST
Well, I see the same, plus an always present segfault, from rawhide kernels too.  This is from recent 2.6.32-0.65.rc8.git5.fc13.x86_64:

VIA 82xx Audio 0000:00:11.5: PCI INT C -> GSI 22 (level, low) -> IRQ 22
VIA 82xx Audio 0000:00:11.5: setting latency timer to 64
ALSA sound/pci/via82xx.c:588: codec_read: codec 0 is not valid [0xfe0000]
ALSA sound/pci/via82xx.c:588: codec_read: codec 0 is not valid [0xfe0000]
ALSA sound/pci/via82xx.c:588: codec_read: codec 0 is not valid [0xfe0000]
ALSA sound/pci/via82xx.c:588: codec_read: codec 0 is not valid [0xfe0000]
alsactl[902]: segfault at 0 ip 000000000040b454 sp 00007fff658b4850 error 4 in alsactl[400000+14000]

I keep promising myself that eventually I will dive into the code, but there is always something else to be done.  Originally I hoped that this would be pretty "obvious" to somebody familiar with the internals.
Comment 19 Adam Williamson 2009-12-11 15:39:05 EST
Unfortunately we have precisely one full-time ALSA developer (Jaroslav, who's in CC) and he's obviously pretty busy :( He will get to this in time, though.

-- 
Fedora Bugzappers volunteer triage team
https://fedoraproject.org/wiki/BugZappers
Comment 20 Jaroslav Kysela 2009-12-12 03:25:34 EST
The segfault comes from the alsactl user space utility so you might want to check the latest alsa-utils package. But the "not valid" messages are from the kernel.

The problem with this bug is that it is not easily reproducible. Maybe you can run 'alsactl' with the '-d' (debug) switch in udev rules (write your own script) and capture the output to a file.
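
A minimal sketch of such a wrapper script follows. It assumes the script is called from a custom udev rule in place of the direct alsactl call. The function name `run_alsactl_debug` and the default log location are hypothetical, and the log path has to be somewhere writable at the time the rule fires.

```shell
#!/bin/sh
# Hypothetical debug wrapper; not shipped with alsa-utils.
ALSACTL=${ALSACTL:-/sbin/alsactl}
# Assumed log location; pick any path that is writable this early in boot.
LOGFILE=${LOGFILE:-/dev/.alsactl-debug.log}

run_alsactl_debug() {
    {
        echo "=== $(date) alsactl -d $*"
        "$ALSACTL" -d "$@"              # same arguments, with debug output enabled
        echo "=== exit status: $?"      # 139 would indicate death by SIGSEGV
    } >>"$LOGFILE" 2>&1
}
```

It would then be invoked with the arguments the rule normally passes, for example `run_alsactl_debug restore /dev/snd/controlC0`.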
Comment 21 Michal Jaegermann 2009-12-13 20:50:12 EST
> ... check the latest alsa-utils package

This is from the current rawhide so that would be "the latest"

> The problem of this bug is that it is not easy reproducible.

It turns out that it is quite easy to reproduce once I started to look closer.  Here are the steps:

1) rename /usr/share/alsa to, say, /usr/share/alsax
2) run '/sbin/alsactl restore'
3) watch for a segfault
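
The steps above can be sketched as a small script that also puts the directory back afterwards. This is only an illustration; `with_dir_hidden` is a hypothetical helper, and renaming /usr/share/alsa requires root. Shells such as bash and dash report a process killed by SIGSEGV (signal 11) as exit status 128 + 11 = 139.

```shell
#!/bin/sh
# with_dir_hidden: hypothetical helper. Temporarily renames DIR, runs the
# given command, then restores DIR and returns the command's exit status.
with_dir_hidden() {
    dir=$1
    shift
    mv "$dir" "$dir.hidden" || return 1
    "$@"
    rc=$?
    mv "$dir.hidden" "$dir"
    if [ "$rc" -eq 139 ]; then
        echo "segfault reproduced (exit status 139)" >&2
    fi
    return "$rc"
}
```

For this report the call would be `with_dir_hidden /usr/share/alsa /sbin/alsactl restore`.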

When doing that under gdb you will see:

Program received signal SIGSEGV, Segmentation fault.
free_space (space=0x0) at init_parse.c:99
(gdb) where
#0  free_space (space=0x0) at init_parse.c:99
#1  0x000000000041052a in init (filename=0x410e87 "/usr/share/alsa/init/00main", cardname=0x7fffffffe270 "0") at init_parse.c:1752
#2  0x0000000000408009 in load_state (file=<value optimized out>, initfile=0x410e87 "/usr/share/alsa/init/00main", cardname=<value optimized out>, do_init=1) at state.c:1618
#3  0x00000000004051aa in main (argc=<value optimized out>, argv=<value optimized out>) at alsactl.c:184

The problem is that files like /usr/share/alsa/init/00main appear to be used blindly, without checks, or without sufficient checks, that they can indeed be read.  If the command is executed from initrd before /usr/share is mounted at all, as happens in my test case, then you get a consistent segfault.

This particular segfault can be prevented by a sanity check in free_space() as in the attached patch. Running the modified alsactl with /usr/share/alsa/ not available produces a bunch of "No such file or directory" errors and no segfault.  However, I am not sure if that is enough for every case.

"codec 0 is not valid [0xfe0000]" is another issue although I originally thought that they may be connected.
Comment 22 Michal Jaegermann 2009-12-13 20:51:40 EST
Created attachment 378131 [details]
patch preventing a segfault when /usr/share/alsa is not available
Comment 23 Jaroslav Kysela 2009-12-14 11:28:50 EST
Thanks for your problem identification. The more "correct" patch is in ALSA repository now:

http://git.alsa-project.org/?p=alsa-utils.git;a=commitdiff;h=1247c3d8ac6a3ab9b77b92de64e721948a473489
Comment 24 Cătălin George Feștilă 2009-12-14 11:39:22 EST
How does this help me?
I'm not a C programmer.
Comment 25 Cătălin George Feștilă 2009-12-14 11:39:45 EST
How does this help me?
I'm not a C programmer.
Comment 26 Michal Jaegermann 2009-12-14 19:51:09 EST
> The more "correct" patch is in ALSA repository now:

Are these the only call sites of free_space(), now and in the future?  I was thinking about doing it that way but chickened out for fear that I would miss something.  Maybe both types of guards should be applied on a "belt-and-suspenders" principle?  Up to you, obviously enough.

> How does this help me? I'm not a C programmer.

If you are hit by this and it happens from scripts which run automatically, then check that the configuration files are available before running /sbin/alsactl.
Something like this shell code should be sufficient in practice:

  [ -d /usr/share/alsa ] && /sbin/alsactl restore

even if not entirely correct (files could be present but not readable).  A future update will make that unnecessary.
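
A slightly stricter variant tests that the init file itself is readable rather than that the directory exists. This is only a sketch; the function name `restore_if_readable` is hypothetical and the default paths are the ones mentioned in this report. Note that for root, `-r` normally succeeds for any file that exists.

```shell
#!/bin/sh
# Hypothetical guard around 'alsactl restore'; paths are taken from this report.
ALSACTL=${ALSACTL:-/sbin/alsactl}
INITFILE=${INITFILE:-/usr/share/alsa/init/00main}

restore_if_readable() {
    if [ -r "$INITFILE" ]; then
        "$ALSACTL" restore
    else
        # Skip quietly rather than let alsactl run without its init files.
        echo "skipping alsactl restore: $INITFILE is not readable" >&2
        return 0
    fi
}
```

A boot script would simply call `restore_if_readable` where it previously called alsactl directly.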
Comment 27 Michal Jaegermann 2009-12-15 10:59:30 EST
Come to think of it, this issue reveals another, "location", bug. /sbin/alsactl strongly suggests that this utility can be used early in a boot sequence when /usr is still empty.  Similar considerations also apply to shutdown.  While alsactl can be prevented from segfaulting in such a situation, without the files in /usr/share/alsa/init it really does nothing.

So either this should be /usr/sbin/alsactl (only, and not a symbolic link) or files which it needs to operate should be available when other file systems are not mounted.  In principle /usr/share can be mounted even over a network to be shared between multiple systems and hence the name.  This detail that somebody may have a "flat" filesystem tree is neither here nor there.
Comment 28 Jaroslav Kysela 2009-12-15 11:24:34 EST
Note that in Fedora, the /usr/share/alsa/init tree is in /lib/alsa/init and all alsactl calls in init scripts (udev rules and /etc/init.d/halt) use this path.
Comment 29 Michal Jaegermann 2009-12-15 11:57:39 EST
> Note that in Fedora, the /usr/share/alsa/init tree is in /lib/alsa/init

Ah, indeed. /usr/share/alsa/init is a symlink.  Only in that case there is something funny with the compiled-in defaults.  If /lib/alsa/init were consulted then my recipe for segfaulting from comment #21 would not work, and I remember references to /usr/share/alsa in the sources (I do not have them handy at the moment but I can recheck later).  Maybe just a misconfiguration while compiling?  Yes, I am aware of the '-i' option to alsactl.
Comment 30 Michal Jaegermann 2009-12-15 16:21:38 EST
Re comment #28 and comment #29:

Just what I thought.  In configured sources for alsa-utils-1.0.21-2.fc12
'grep -r /init/00main alsactl' gives this:
alsactl/alsactl.1:The configuration file for init. By default, PREFIX/share/alsa/init/00main
alsactl/alsactl.c:      printf("  -i,--initfile #  main configuation file for init phase (default " DATADIR "/init/00main)\n");
alsactl/alsactl.c:      char *initfile = DATADIR "/init/00main";
alsactl/alsactl_init.xml:        configuration file is <filename>/usr/share/alsa/init/00main</filename>.
alsactl/alsactl_init.xml:          level configuration file is <filename>/usr/share/alsa/init/00main</filename>.

and you have these:
./config.log:#define DATADIR "/usr/share/alsa"
./include/aconfig.h:#define DATADIR "/usr/share/alsa"

so it appears that at least '--datarootdir=/lib' passed to ./configure is missing.  Passing it helps with the 'alsaunmute' script too, even if the content of alsactl/alsactl_init.xml or the manual page does not change.

Do you want that as a separate bug or is this fine?

OTOH without this I would not find out that free_space() may segfault. :-)
Problems in selinux policies may cause that too before fixes, for example.
Comment 31 Cătălin George Feștilă 2009-12-16 03:36:47 EST
 
> > How does this help me? I'm not a C programmer.
> 
> If you are hit by this and this happens from some scripts which run
> automatically  then check that configuration files are available before running
> /sbin/alsactl.
> Something like that shell code should be sufficient in practice:
> 
>   [ -d /usr/share/alsa ] && /sbin/alsactl restore
> 
> even if not entirely correct (files could be present but not readable).  A
> future update will make that unnecessary.  

 /sbin/alsactl restore
/sbin/alsactl: load_state:1569: Cannot open /etc/asound.state for reading: No such file or directory
Unknown hardware: "VIA8233" "Analog Devices AD1980" "AC97a:41445370" "0x1043" "0x80a1"
Hardware is initialized using a guess method
Comment 32 Bug Zapper 2010-11-04 07:46:00 EDT
This message is a reminder that Fedora 12 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 12.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '12'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 12's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 12 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Comment 33 Bug Zapper 2010-12-05 02:07:32 EST
Fedora 12 changed to end-of-life (EOL) status on 2010-12-02. Fedora 12 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.
