Bug 64005
Summary: | aRts sound subsystem fails with "CPU Overload, aborting" message on IBM NetVista Pentium 4. | ||
---|---|---|---|
Product: | [Retired] Red Hat Linux | Reporter: | Aaron Freed <vmorgo> |
Component: | kernel | Assignee: | Arjan van de Ven <arjanv> |
Status: | CLOSED DUPLICATE | QA Contact: | David Lawrence <dkl> |
Severity: | high | Docs Contact: | |
Priority: | high | ||
Version: | 7.2 | CC: | brosenkr, dledford |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | All | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2002-11-08 17:08:40 UTC | Type: | --- |
Description
Aaron Freed
2002-04-23 17:41:30 UTC
This is almost certainly an issue with the i810_audio driver and aRts. I need the i810 sound driver startup messages. Can you unload then reload the i810 sound driver then run the command:

    dmesg | tail -10

and put that output in this report?

Result of attempting to run sndconfig on one of the affected systems per dledford

I was not sure how to add and then remove the i810 sound driver, so I figured that running sndconfig would result in the requested information added to dmesg. Here is what I got....

    Intel 810 + AC97 Audio, version 0.05, 17:36:29 Sep 6 2001
    PCI: Setting latency timer of device 00:1f.5 to 64
    i810: Intel ICH2 found at IO 0x1880 and 0x1c00, IRQ 5
    ac97_codec: AC97 Audio codec, id: 0x4144:0x5362 (Unknown)
    i810_audio: setting clocking to 7031
    Intel 810 + AC97 Audio, version 0.05, 17:36:29 Sep 6 2001
    PCI: Setting latency timer of device 00:1f.5 to 64
    i810: Intel ICH2 found at IO 0x1880 and 0x1c00, IRQ 5
    ac97_codec: AC97 Audio codec, id: 0x4144:0x5362 (Unknown)
    i810_audio: setting clocking to 7031

This is a known problem that was fixed by the 0.21 version of the i810 sound driver (you have the old 0.05 version). Arjan, is the 0.21 version not in the 7.2 errata kernels?

I am currently using kernel 2.4.9-31. I believe this is the latest kernel ("errata kernel"). It does, indeed, include the version 0.05 of i810. If Arjan could compile a new kernel with i810_audio version 0.21, it would be VASTLY appreciated.... Thank you! Aaron. P. S. I assume that this would be 2.4.9-32....?

Can you post the results of running the command: uname -a on your machine?

    [root@clarke boot]# uname -a
    Linux clarke 2.4.9-31 #1 Tue Feb 26 07:11:02 EST 2002 i686 unknown
    [root@clarke boot]#

Something isn't making sense. Here, I just installed a fresh copy of the 2.4.9-31 i686 kernel RPM.
Running the command:

    strings /lib/modules/2.4.9-31/kernel/drivers/sound/i810_audio.c | grep version

produces this output:

    kernel_version=2.4.9-31
    <6>Intel 810 + AC97 Audio, version 0.21, 07:16:09 Feb 26 2002

In other words, the 0.21 version of the driver I mentioned is already in the 2.4.9-31 kernel. So, the question then becomes, how is it you are still getting the 0.05 version of the driver loaded on the machine where you ran sndconfig? Have you rebooted that machine since installing the 2.4.9-31 kernel RPMs and did you change the lilo.conf or grub.conf files to reflect that you wanted the 2.4.9-31 kernel to be the default kernel on that machine?

The command you give,

    strings /lib/modules/2.4.9-31/kernel/drivers/sound/i810_audio.c | grep version

produces

    strings: /lib/modules/2.4.9-31/kernel/drivers/sound/i810_audio.c: No such file or directory

I think you are searching for strings in a source file (.c) when you should be looking in a compiled "out" file (.o). See below:

    strings /lib/modules/2.4.9-31/kernel/drivers/sound/i810_audio.o | grep version

produces this result:

    kernel_version=2.4.9-31
    <6>Intel 810 + AC97 Audio, version 0.21, 07:16:09 Feb 26 2002

Here is the output from dmesg after running the sndconfig program on the SAME SYSTEM as the "strings" command was run on (Clarke). Clarke, like the other IBM NetVistas, DOES exhibit the problem with the sound driver, despite the presence of Version 0.21. (See error output from dmesg below).

    <6>Intel 810 + AC97 Audio, version 0.21, 07:16:09 Feb 26 2002
    [root@clarke sound]# sndconfig
    [root@clarke sound]# dmesg | tail -10
    i810: Intel ICH2 found at IO 0x1880 and 0x1c00, IRQ 5
    i810_audio: Audio Controller supports 6 channels.
    ac97_codec: AC97 Audio codec, id: 0x4144:0x5362 (Unknown)
    i810_audio: AC'97 codec 0 Unable to map surround DAC's (or DAC's not present), total channels = 2
    Intel 810 + AC97 Audio, version 0.21, 07:16:09 Feb 26 2002
    PCI: Setting latency timer of device 00:1f.5 to 64
    i810: Intel ICH2 found at IO 0x1880 and 0x1c00, IRQ 5
    i810_audio: Audio Controller supports 6 channels.
    ac97_codec: AC97 Audio codec, id: 0x4144:0x5362 (Unknown)
    i810_audio: AC'97 codec 0 Unable to map surround DAC's (or DAC's not present), total channels = 2

The problem remains an issue as described.

*** The references to output from dmesg showing 0.05 of the driver were my mistake. I accidentally sent the wrong information--I meant to send output from the machine "clarke" but accidentally sent it from a server I was building that I had forgotten that I had not yet upgraded. I apologize for any confusion I may have caused.

OK, since the problem is still happening with the 0.21 version of the driver, which had previously been reported to fix the problem, I'm going to have to look into it.

Any news on this bug? Last I heard, someone (I think Doug Ledford) was going to look into this problem. I don't mean to be pushy, but as I am the one who recommended the entire company switch over to the IBM NetVista Pentium 4's in lieu of another brand of PC, I am kind of under the gun to get this one fixed.... Appreciate all your help!

It will probably be a while before the problem is fixed. It doesn't have as high of a priority as it's labelled with since we know this is an artsd specific problem. If you were using esd instead of artsd, you wouldn't be noticing a problem. In the time between now and when this is fixed, you could actually use esd instead of artsd to make sound work on these machines.

On Thursday 25 April 2002 03:12 pm, you wrote:

How do I use esd instead of artsd? I thought artsd was required for KDE. (It certainly seems that a fuss is raised if ARTSD is overlooked when updating KDE.)
Is it simply a matter of turning off ARTSD sound server and then instructing the KDE media player or other sound-generating software to use esd instead? Please let me know.... It may not be so high a priority for you, but, regrettably, it could cost me my job, as I am the one who recommended the company shift away from the Dell machines we were using and to the IBM--which doesn't seem to like ARTSD! Thank you!

... If esd is

Maybe a better way to ask: How do I cause KDE to default to the ESD (Or OSS Sound driver) instead of artsd? Is there a how-to or a readme I should be looking at?

Is there any time estimate at all on the fix for this bug? Perhaps if I had an idea as to how long I will have to wait (e.g. until the next version of RHAD Linux is released) I could make a decision as to what to do. Please let me know.... (And, by the way, when IS the next Beta going to be posted? Your website STILL offers the old Roswell which was the beta for 7.2.) A.

Humm the Skipjack beta is out for some time now....

Hey, can you get the appropriate person to update your website? It still indicates that the beta to be downloaded is the old Roswell one, not the new Skipjack, which Arjan mentioned.... (I found it on one of your mirrors.) Thanks!

It's on my plate of things to be fixed, but it is behind a couple other items. I likely won't even get a chance to look at it until next week. I also don't have hardware that exhibits the problem so I'm going to be guessing at what is causing the issue on your hardware.

Hi! I see you have just released 7.3. (I didn't realize what "a couple of things" were!) Congratulations! If this bug is fixed in 7.3 (It was still NOT fixed under SkipJack), then you could close this bug report as far as I am concerned (unless others need this problem fixed).

In any event, I figured out that the problem appears to be, at least in part, due to this: IRQ 9 is being used by:
* Cascade from IRQ2
* USB/UHCI (Universal Serial Bus)
* the Sound board (!!!)
* ACPI power management firmware (These systems do NOT use APM, though RedHat insists upon installing it.)
* the Network Adapter (An Intel EtherExpress Pro 100 VE)

That's five devices all fighting for the same interrupt!

This problem appears on two separate machines, now, from two different manufacturers: The first machine is the IBM NetVista for which this ticket was originally opened. The second machine is a Sony VAIO PCG-GRX570 Laptop Computer.

Under SkipJack with no updates (as of April 4): I have had trouble with the following:

USB:
* (Both computers) USB hangs very easily.

SOUND:
* IBM NetVista desktop produces NO sound; only "FATAL ERROR: CPU OVERLOAD. Aborting" message.
* Sony VAIO PCG-GRX570 produces choppy sound full of dropouts. Frequently fails. Setting it to Full Duplex mode with Real time priority (in KDE ARTS Setup) produces "FATAL ERROR: CPU Overload. Aborting." though there may be a minute or so of good, clear sound before that error occurs.
(Both machines) Top DOES show the CPU running at 100 percent until the error comes up.

ACPI:
* IBM NetVista: ACPI does not work. EVEN WHEN DISABLED IN THE BIOS, SOUND (AND PRESUMABLY OTHER SERVICES) STILL FAILS ON THE IBM NETVISTA. Testing on this machine was more limited--it's not my machine!
* Sony VAIO PCG-GRX570 Laptop: Even when compiled into the kernel and APM deliberately OMITTED from the kernel, ACPI still appears to do nothing. The system still fails to detect a change from on-line to battery power. Indeed, it insists that there is no battery installed at all!
* Sony VAIO PCG-GRX570 Laptop: Machine will hang if booted with the network card enabled on boot, configured for DHCP and NOT plugged in to the network. This may or may not be related to the IRQ 9 problem discussed herein. (Boots okay if the card is disabled and then later brought up with ifup eth0.)

Interesting Note: I tried installing SuSE 8.0 on the Sony laptop. It installs, but locks up right after LILO finishes loading the kernel.
No screen, not even a blinking cursor. Both battery lights come on solid when this happens, suggesting that it is, once again, ACPI/APM locking up IRQ 9. RedHat, on the other hand, seems to work flawlessly, even to the point of running VMWare phenomenally fast, elegantly stepping around this problem. Kudos to you on that!

Conclusion: ALL the problems with both of these machines stem from the design of the motherboard wherein all these devices (listed above) are fighting for IRQ 9.

Please let me know your thoughts on this matter. I would be happy to test any patches or fixes that you may have available! I presently have SkipJack Beta on the laptop (without patches, because the patches break VMWare 3.1.1.) I will be putting Valhalla on as soon as I finish downloading the ISO's for it. Given the fact that this problem still manifested itself under the latest beta (Skipjack, patched), I am assuming that the problem will still exist under Valhalla. A.

The above described problem still exists in RedHat 7.3. I've listed a separate bug report for this situation for your convenience. The new bug report is based on the Sony Vaio PCG GRX570 laptop rather than the IBM NetVista.

Are there any plans or developments impending with this bug? I don't mean to be a pest about it, but I would like to know what the status is, other than "assigned" with no evident action since the beginning of May. I have sent several e-mails to arjanv, dkl, and dledford, but haven't heard back. As I had accidentally broken my e-mail for several days, I am wondering if any of you had written and I had just never gotten the response. If so, would you please reply again; if not, would you please reply, even if it's to tell me to "go away and stop bothering you about this issue." At least that way, I will know where I stand. Thank you. A.

Hey, guys! WORK-AROUND FOR THE SOUND PROBLEM ON THE SONY PCG-GRX-570. I WILL TRY THIS ON THE IBM NETVISTA SHORTLY AND SEE IF IT WORKS ON THAT MACHINE ALSO....
LET ME KNOW WHAT YOU THINK....

Accept default installation of RedHat 7.3, including the driver RHAD selects for the sound card. Now, make the following adjustments:

In the KDE control center:
  Select Sound
  Select Sound Server
  Select the General Tab. Select:
    Start aRts soundserver on KDE startup
    Autosuspend if idle for (60) seconds
    Display messages using artsmessage
    Message Display: WARNINGS
    Remaining settings need not be turned on.
  Select the Sound I/O tab
    Sound I/O method: Autodetect
    Enable Full Duplex operation: NOT SELECTED
    Use custom sound device: NOT SELECTED
    Use custom sampling rate: NOT SELECTED
    Other custom options: NOT SELECTED
    Sound quality: Autodetect
    (And this one seems to be the key!) Audio buffer size (response time): Set this to the FAR RIGHT, "as large as possible". This is the setting reading "Low CPU usage" "slow response" "more dropouts"

Yes, that's right. "More dropouts. Slow response. Low CPU." You would think that this setting would be terrible, but it works perfectly, at least insofar as I can see. My guess is that the buffer becomes large enough so that the system doesn't have to generate interrupts (and interrupt the sound driver) to get the next bit of data. So, in fact, you get the benefits of low CPU along with the benefits of a nice, big, fat buffer so no dropouts. YAY!!!

Other notes:

1. Yes, excessive disk activity or CPU activity may cause the odd dropouts, or maybe even a storm of them. But, it's not intolerable. You might want to give some tuning a try. Note: I am listening to music whilst writing this AND doing a big RSYNC of 3248213093 bytes of MP3 files and I only got one tiny dropout, and that was when I started (yet another) copy of Konsole.

2.
Here's /etc/modules.conf

    alias parport_lowlevel parport_pc
    alias eth0 eepro100
    alias usb-controller usb-uhci
    alias sound-slot-0 i810_audio
    post-install sound-slot-0 /bin/aumix-minimal -f /etc/.aumixrc -L >/dev/null 2>&1 || :
    pre-remove sound-slot-0 /bin/aumix-minimal -f /etc/.aumixrc -S >/dev/null 2>&1 || :
    [afreed@montpellier afreed]$

Modifications to modules.conf are not necessary.

3. The sound produced this way is remarkable in its clarity and quality--and the CPU usage is virtually nil! (Even my other workstation, a dual 933 SMP runs about 15 to 20% cpu to drive sound and this one is, maybe 1% at the top.)

I haven't been able (yet) to track down any hardware where I could reproduce this problem. However, I had suspected that it might have something to do with the artsd sound buffers (which your workaround confirms). Could you set the sound buffer size back to wherever it was when it was failing, then find out what the -F and -S arguments to artsd are? Also, I would like to know what they are when moved all the way over to the side that's working for you. Furthermore, since this may actually be an artsd bug and not a kernel bug (we aren't for sure yet, but it's possible that the CPU overload could have been artsd attempting to set the sound buffers in an infinite loop and never being happy with what the i810 sound driver sent back as the results of the sound buffer fragment ioctl), I'm going to add brosenkr to the Cc: list as he takes care of the arts package.

Hi! I would be glad to do this for you. However, I do not know how to find out what the -F and -S segments to artsd are. If you would tell me, I would be happy to report to you the results. Are these the one or two numbers that appear in the sound buffer size dialog when I move the slider bar? If so, please let me know. I can easily report these to you.

I should point out that this problem probably is not related to aRtsd, but rather to the driver or kernel because:

1.
Upon installation in the Sony Laptop (and, I am pretty sure, the IBM NetVista, too), RedHat, by default, installs an Intel Pentium III kernel even though the machines have a P4.

2. sndconfig still exhibits the poor sound quality (pops, dropouts, etc.) when used to configure the sound card on the Sony. I am pretty sure that the IBM NetVista generates no sound at all.

3. The CPU overload issue seems to only come up when the option to enable Full Duplex is turned on. If I have TOP, gtop, or ksysguard running whilst adjusting the sound properties, the CPU usage is almost non-existent, until I turn on the Full Duplex feature, at which point the CPU's proverbial pedal gets pressed to the equally proverbial metal, and remains there until either the CPU overload, sound-server aborting message appears, or I turn off the Full Duplex feature.

Other points: The problem with sound production exists when the slider for buffer size in the KDE sound dialog is left at its default setting, which is right in the middle of the slider. I shoved it all the way to the right (buffer as large as possible) and was rewarded with great sound.

I await your next e-mail! (And if this work-around really proves valid, maybe you'll put my name up in lights as one of the Great Fixers for RedHat (mmmm Good for my ego! ;-) ) A.

On the IBM NetVista, this was the result of attempting to apply the same fix as described above: The machine immediately showed 97% or more CPU being tied up by artsd. It did not matter whether artsd's Full Duplex option was set or not, the CPU load still stayed very high. It also did not matter what I did with the buffer size. I finally shut down the ARTSD server, with the following result: (Details show the entire session, using kcontrol to start the server, then trying to stop it.)

    kcontrol
    DCOPServer up and running.
    trying to create local folder: File exists
    trying to create local folder: File exists
    trying to create local folder: File exists
    trying to create local folder: File exists
    [root@clarke root]# Can't connect to sound server
    unable to connect to sound server
    warning: leaving MCOP Dispatcher and still 9 object references alive.
     - Arts::SampleStorage
     - Arts::Synth_MULTI_ADD
     - Arts::Synth_MULTI_ADD
     - Arts::Synth_PLAY
     - Arts::StereoVolumeControl
     - Arts::StereoEffectStack
     - Arts::Synth_BUS_DOWNLINK
     - Arts::SoundServerV2
     - Arts::MidiManager
    warning: leaving MCOP Dispatcher and still 69 types alive.
    sound server terminated
    -------

It is odd that my fix seems to work on the Sony laptop, but not on the IBM machine. Perhaps the IBM has a slightly different chipset than the Sony Vaio, even though the Sony and IBM both are detected by RedHat to have the AC'97. I do point out earlier in this bug that the chipsets detected by sndconfig are very slightly different. (I think the Sony has an "AAM" in the name, while the IBM has a name ending in "BAM") In any event, the Sony always DID produce sound, albeit originally of poor quality, while the IBM never produced any sound whatsoever.

At the risk of being killed by the kernel guys ;): Do the alsa drivers for your soundcard (http://www.alsa-project.org/) have the same problem?

Funny you should ask. Actually, I tried SuSE 8.0 (Please forgive me my momentary madness!) because I heard it came with the Alsa drivers. After numerous dismally unsuccessful attempts to install Alsa on RedHat, I finally gave in and tried the SuSE--only to find out that the problem is actually WORSE under SuSE. In fact, once one attempts to configure the Alsa driver (version 0.90, as I recall) or even the upgraded version SuSE suggests one obtain from their website, the machine invariably locks up hard on reboot while the S12alsasound script attempts to "restore previous settings".
I ended up having to remove the entire S12alsasound script and NEVER configure the drivers just to keep the laptop booting. I have not tried it on the IBM since that is not my machine, and given the atrocious results that I got with SuSE/Alsa, I have no great inclination to try it. You guys may not include all the goodies that SuSE provides with their distro, but you sure managed to provide a faster install, a newer kernel, a more updated version of KDE3, and, best of all, a SOUND DRIVER THAT CAN BE PERSUADED TO WORK QUITE WELL, at least on the Sony VAIO PCG-GRX570 Pentium 4 laptop.

I have been loaned an IBM Intellistation E Pro that exhibits the behaviour that yours does (according to the person that sent it to me), so I should be able to do something with the problem on the IBM now. The Sony is doing better because of interrupt latency issues. With a larger sound buffer, the DMA hardware is less likely to run out of data whenever the sound daemon is slow to feed it more data due to other programs running. The smaller the buffer, the more likely it is that the sound daemon will have occasional fall outs. The other hardware in your system has a large impact on this as well. Certain operations (such as ripping a CD-ROM disk) will cause *lots* of audio fall outs regardless of the sound buffer setting due to the amount of time the kernel spends with interrupts turned off during a normal ripping operation. However, since the Sony is doing better with the buffer changes, I'll consider that fixed. The IBM problem I'll work on...

Great news on your getting your hands on the IBM! Let me know if you really do have the same problem. I hear you are opening a development lab here in Massachusetts, so it MIGHT be possible to bring one of our IBMs there if you want to test/try it. On the other hand, I am certainly more than willing to try the patch/fix you come up with on the IBMs we have here. Just let me know when you want me to give your code a go.
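To put rough numbers on the buffer explanation above, here is a minimal sketch that converts a fragment count and fragment size into milliseconds of queued audio, which is about how long the sound daemon can be delayed before the DMA hardware runs dry. It assumes the -F and -S arguments to artsd mentioned earlier mean "number of fragments" and "fragment size in bytes", and it assumes 44100 Hz, 16-bit stereo output. The example values are illustrative only; nobody in this report posted the actual slider settings.

```python
def buffer_ms(fragments, fragment_bytes, rate_hz=44100, channels=2, bytes_per_sample=2):
    """Return how many milliseconds of audio the given fragment setup holds.

    Assumption: total buffer = fragments * fragment_bytes, played back at
    rate_hz * channels * bytes_per_sample bytes per second.
    """
    bytes_per_second = rate_hz * channels * bytes_per_sample
    return 1000.0 * fragments * fragment_bytes / bytes_per_second


if __name__ == "__main__":
    # Hypothetical settings, NOT taken from this bug report.
    for frags, size in [(7, 1024), (32, 4096)]:
        print("F=%d S=%d -> about %.1f ms of audio queued"
              % (frags, size, buffer_ms(frags, size)))
```

The larger the result, the longer the daemon can be held off by other interrupt activity without an audible dropout, at the price of slower response.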
As to the Sony, yes, I would expect dropouts while ripping CDs. But should I expect them while playing back a DVD movie? I am, at present, assuming that DVD playback would *NOT* work on the Sony because DVD playback is presumably sufficiently CPU-intensive that it would also cause *LOTS* of dropouts and maybe even crash. Perhaps we should apply the patch to the Sony as well (when you have it done), since my 'really big buffer' trick is just a work-around and I think we both know that there *WILL* be someone out there who wants to RIP CD's/play DVD's and listen to MP3's at the same time. Can I assume that RIPPING without any MP3's playing at the same time would work? And what are your thoughts RE: DVD playback, perhaps via OGLE (or LinDVD if you would care to loan me a copy.) Thanks again!

The problem has been confirmed on the IBM that I have here. Preliminary review suggests that the chipset (an ICH2, which is mostly the same as the ICH0 and ICH1 chipsets this driver was originally written for) is not happy with how it is being configured. The DMA engine on the card is not actually running. Once I have a fix for the problem I'll post a patch here (as well as indicate what version of the kernel to expect the fix to show up in).

Any progress on fixing this? It's been several months, guys....

This has been solved on our in house test box. Driver version 0.22 and later should enable these machines to work properly. The next errata kernel should have this updated driver in it. People wishing to test the driver sooner than that will need to contact me directly to get test modules for their kernel.

I am still seeing this problem with Redhat 7.2 on an IBM NetVista machine. All indications seem to point to the same problem, but I am running v0.22 of the driver. The situation is that all drivers appear to load cleanly, but no sound is output. Could the video and audio devices not be playing nice on the shared IRQ 5?
I can hear a little bit of occasional faint scratchy white noise from the card, but nothing intelligible. Here is some relevant information:

from uname:

    Linux vodka 2.4.18-17.7.x #1 Tue Oct 8 13:33:14 EDT 2002 i686 unknown

from dmesg:

    Intel 810 + AC97 Audio, version 0.22, 13:44:24 Oct 8 2002
    PCI: Setting latency timer of device 00:1f.5 to 64
    i810: Intel ICH2 found at IO 0x1880 and 0x1c00, IRQ 5
    i810_audio: Audio Controller supports 6 channels.
    i810_audio: Defaulting to base 2 channel mode.
    ac97_codec: AC97 Audio codec, id: 0x4144:0x5362 (Unknown)
    i810_audio: AC'97 codec 0 Unable to map surround DAC's (or DAC's not present), total channels = 2
    i810_audio: setting clocking to 41588

from /proc/pci:

    Bus 0, device 31, function 2:
      USB Controller: Intel Corp. 82801BA/BAM USB (Hub #1) (rev 18).
        IRQ 11.
        I/O at 0x1820 [0x183f].
    Bus 0, device 31, function 3:
      SMBus: Intel Corp. 82801BA/BAM SMBus (rev 18).
        IRQ 5.
        I/O at 0x1810 [0x181f].
    Bus 0, device 31, function 4:
      USB Controller: Intel Corp. 82801BA/BAM USB (Hub #2) (rev 18).
        IRQ 10.
        I/O at 0x1840 [0x185f].
    Bus 0, device 31, function 5:
      Multimedia audio controller: Intel Corp. 82801BA/BAM AC'97 Audio (rev 18).
        IRQ 5.
        I/O at 0x1c00 [0x1cff].
        I/O at 0x1880 [0x18bf].
    Bus 1, device 0, function 0:
      VGA compatible controller: nVidia Corporation Vanta [NV6] (rev 21).
        IRQ 5.
        Master Capable. Latency=40. Min Gnt=5.Max Lat=1.
        Non-prefetchable 32 bit memory at 0xf0000000 [0xf0ffffff].
        Prefetchable 32 bit memory at 0xf8000000 [0xf9ffffff].
    Bus 2, device 8, function 0:
      Ethernet controller: Intel Corp. 82801BA/BAM/CA/CAM Ethernet Controller (rev 3).
        IRQ 11.
        Master Capable. Latency=66. Min Gnt=8.Max Lat=56.
        Non-prefetchable 32 bit memory at 0xf2000000 [0xf2000fff].
        I/O at 0x2000 [0x203f].

Any ideas? Thanks.

Driver version 0.24 is the latest version and it includes changes made by other people on the internet instead of me. Alan Cox has that in his latest kernels and I'm suggesting it for inclusion into the next Red Hat errata kernel.
I'm marking this as a duplicate of another bug because the 0.24 driver *should* solve your problem (the other patches that make up the 0.24 driver version include support for the 845 chipset on the NetVista and also for 6 channel support mode). The error messages between this bug report and the one I'm marking this a duplicate of are slightly different, but both problems were addressed by the 0.24 driver version as I seem to recall.

*** This bug has been marked as a duplicate of 76830 ***
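Since much of this report turned on which i810_audio version was actually loaded (0.05, 0.21, 0.22, and finally 0.24), the following is a minimal sketch for pulling the version out of the driver banner seen in the dmesg and strings output above and comparing it with a minimum. The function names are made up for illustration, and the 0.24 threshold simply reflects the closing comment that the 0.24 driver "should" solve the problem; it is not a confirmed fix.

```python
import re

# Matches the banner printed by the driver, e.g.
# "Intel 810 + AC97 Audio, version 0.21, 07:16:09 Feb 26 2002"
BANNER = re.compile(r"Intel 810 \+ AC97 Audio, version (\d+)\.(\d+)")


def driver_version(text):
    """Return (major, minor) from the first banner found in text, or None."""
    match = BANNER.search(text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def at_least(text, minimum=(0, 24)):
    """True if a banner is present and its version is >= minimum (assumed threshold)."""
    version = driver_version(text)
    return version is not None and version >= minimum


if __name__ == "__main__":
    sample = "<6>Intel 810 + AC97 Audio, version 0.22, 13:44:24 Oct 8 2002"
    print(driver_version(sample), at_least(sample))
```

Feeding it the output of dmesg, or of strings run against the .o module file, avoids the kind of mix-up that happened earlier in this thread when output from the wrong machine was posted.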