Bug 217146
Summary: | Hard lock-up with 2GB RAM and b44
---|---
Product: | Red Hat Enterprise Linux 4
Component: | kernel
Version: | 4.4
Hardware: | ia32e
OS: | Linux
Status: | CLOSED CANTFIX
Severity: | medium
Priority: | medium
Reporter: | Andrew Gormanly <a.gormanly>
Assignee: | Neil Horman <nhorman>
QA Contact: | Brian Brock <bbrock>
CC: | andriusb, jbaron, linville
Doc Type: | Bug Fix
Last Closed: | 2007-02-05 18:33:11 UTC
Description
Andrew Gormanly
2006-11-24 11:31:58 UTC
That hardware is incapable of DMA above 1GB. As you observed, some hacks have been added to the driver to get around that problem. A hang on >1GB systems might indicate a problem with those hacks. I see one upstream b44 patch that might be helpful. I'll spin a test kernel w/ that patch and post here for you to test when it is available.

Created attachment 142959
jwltest-b44-dma_mapping_error.patch

Test kernels w/ the above patch are available here: http://people.redhat.com/linville/kernels/rhel4/ Please give them a try and post the results here...thanks!

Thanks for getting on to this so quickly. The patch you have is the same as the changes in the (latest available to the public, Jun 7 2006) Broadcom 1.00g driver, which I already tested in 2.6.9-42.0.2.EL, and your kernel has the same results as both that and the shipping RHEL4 one - i.e. panic when loaded, if booting with higher than mem=1040M (and sometimes below, only mem=1000M is reliable). There's a driver version 1.01 in 2.6.18.3 - might it be worth checking that?
#define DRV_MODULE_VERSION "1.01"
#define DRV_MODULE_RELDATE "Jun 16, 2006"
I'll try and check it out myself over the weekend, and maybe the FC6 kernel source too if I can.

Is the system at all responsive? Specifically, can you produce sysrq-t output when the system is hung? It would be helpful to have that info so we could confirm that it was actually b44 killing the system, and if so, what the system state looked like when it went down. Alternatively, if you can configure nmi_watchdog and capture a core dump via netdump, that would be great.

No, it's completely locked up; sysrq's don't work, and nothing appears via netdump. The only sign of life is flashing Caps- and Num-lock lights (and the Bluetooth light is on). I'm 99% sure it's b44, as the machine's fine without that module loaded but dies on bringing up the interface.

Any possibility of nmi_watchdog?

nmi_watchdog should be on by default - it's a Core 2 Duo (so SMP x86_64). I did, however, already try adding it to the boot line just in case, which made no difference.

Ok, so I'm running a bcm4401 card on a system with 2GB of ram on board using kernel 2.6.9-42.0.3, using the card to transfer data bi-directionally with this command running locally and on the remote host:
cat /dev/zero | ssh <peer> "cat > /dev/null"
Running for an hour now with no faults.
I'm going to let this run over the weekend to make sure, but I'm inclined to think that, unless something goes wrong over the weekend, your test may have been flawed, and that this was the b44 >1GB dma problem after all.

Well, it appears that a lockup occurred over the weekend, although I hadn't considered that John's patch isn't included in the -42.0.3.EL kernel. I've applied the patch and rebuilt the kernel, and am currently retesting.

Well, good news (in a manner of speaking). After applying John's patch, I seem to have locked up the box again, so I think I've reproduced your problem. I'll start debugging right away.

Note to self: I've been testing for a few days and I've been able to recreate the hang several times, but only using TCP. If I send UDP in one direction (to the b44 NIC or from the b44 NIC) then no hang. I'm currently testing bi-directional UDP to see if that causes the hang. If it does, it suggests that this is a problem resulting from some sort of tx/rx race. If not, perhaps a specific problem sending TCP frames (although I'm hesitant to believe that).

I've been doing some reading about alternate theories to this hang. I'm setting up a test here and was wondering if you could please do the same. Could you boot your kernel with the following kernel parameter included:
pci=noacpi
And see if the hang recurs? I'd appreciate it. Thanks!

Booting with pci=noacpi still gives a hang on bringing up the network card, with the same error message, "Kernel panic - not syncing: PCI-DMA: high address but no IOMMU."

Could you please add iommu=soft and try again? That should enable the soft iommu support in the kernel for you.

Well, by forcing the kernel to use the swiotlb it does actually stay alive after initializing the b44, and it seems stable so far.
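When retesting with different boot options it is easy to lose track of which ones the running kernel actually received. A minimal sketch in Python, assuming the options are read from /proc/cmdline and that iommu= may carry a comma-separated list; the helper names kernel_params and swiotlb_forced are hypothetical and not part of any tool mentioned in this bug:

```python
def kernel_params(cmdline):
    """Split a kernel command line into a {name: value} mapping.

    Flags without '=' (for example 'ro') map to an empty string.
    """
    params = {}
    for token in cmdline.split():
        name, _, value = token.partition("=")
        params[name] = value
    return params


def swiotlb_forced(cmdline):
    """Return True if the command line asks for the software IOMMU.

    Assumption: 'soft' may appear alone or inside a comma-separated list.
    """
    value = kernel_params(cmdline).get("iommu", "")
    return "soft" in value.split(",")
```

On a live system the input would be the contents of /proc/cmdline, e.g. swiotlb_forced(open("/proc/cmdline").read()).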
It's a tricky situation if this is the fix - should the kernel's default for all Intel x86_64 machines be changed to iommu=soft (rather than the present behaviour of iommu=off for machines with <3GB memory but iommu=soft for those with >3GB memory)? In some ways it's cleaner to have the kernel's behaviour be independent of the amount of RAM in the machine, especially when doing so removes the potential for broken hardware killing the kernel by failing to DMA above a device-dependent random number of bits that its designers decided to use as their DMA limitation (31 in this case). On the other hand, using up a chunk (64MB in 2.6.9) of low-end RAM on all Intel x86_64 systems is a waste when most won't need it... but not much of one given the normal memory for such machines is 512MB-4GB at present, and will grow.
Thanks for your time in solving this issue. The really silly thing is that if I'd had <1GB or >3GB of memory in this machine I'd never have seen this bug...

Unfortunately, if this is the case with your system, we're a little out of luck. Are you sure that this system has an iommu at all? It's possible that it doesn't. The presence of an iommu is detected in pci_iommu_init, which is always called on boot. It could also be that your system is misreporting the size or availability of your iommu. It might be worth your time to instrument pci_iommu_init on your system and print out the values returned from check_iommu_size. If a bad value gets reported back, we could perhaps explore adding a check for it to fall back to swiotlb if needed.

I don't think there's any point - I thought it did not have an IOMMU, as no Intel EM64T systems have one (neither does IA64), and that this is one of the differences with AMD64, where the AGP aperture is used as an IOMMU.
The panic message in comment #14 appears to confirm this, as does the statement in the RHEL3U2 release notes ( http://www.redhat.com/docs/manuals/enterprise/RHEL-3-Manual/release-notes/as-amd64/RELEASE-NOTES-U2-x86_64-en.html ): "Intel® EM64T does not support an IOMMU in hardware while AMD64 processors do."
Regardless, if I understand things correctly, the point is that the Linux kernel for the x86_64 arch (absent any boot switches) does not use any IOMMU if there's less than 3GB RAM in the system, and prints out "PCI-DMA: Disabling IOMMU" on boot. [On AMD64 and >3GB RAM, it uses (by default 64MB of) the GART and prints e.g. "PCI-DMA: Reserving 64MB of IOMMU area in the AGP aperture". On EM64T and >3GB RAM, it uses SWIOTLB and prints "PCI-DMA: Using software bounce buffering for IO (SWIOTLB)".]
This is normally fine, and saves us 64MB of RAM. In the case of broken hardware which can't DMA properly when the host system has "too much" physical memory, however, the lack of any (hardware or software) IOMMU kills the kernel. This is why forcing it to use swiotlb works. On an AMD system with this chip, setting it to iommu=force should do the trick.
So for machines with the Broadcom 4401 installed, the x86_64 kernel is fine with under 1GB of RAM, and fine with over 3GB of RAM. Anything in between and it panics.
Could this be fixed in the b44 driver? The following, from Documentation/DMA-mapping.txt, seems like it might be helpful, but I'm not a kernel hacker...
"Does your device have any DMA addressing limitations? For example, is your device only capable of driving the low order 24-bits of address on the PCI bus for SAC DMA transfers? If so, you need to inform the PCI layer of this fact. By default, the kernel assumes that your device can address the full 32-bits in a SAC cycle. For a 64-bit DAC capable device, this needs to be increased. And for a device with limitations, as discussed in the previous paragraph, it needs to be decreased."
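The default behaviour described above can be modelled in a few lines. This is only a sketch of the policy as stated in this comment, not the kernel's actual code: the function names are hypothetical, the 3GB threshold is taken from the description here (it is a parameter because the exact cut-off is not confirmed in this bug), and the model ignores memory holes, so it treats "RAM size" and "highest physical address" as the same thing.

```python
GIB = 1 << 30


def default_dma_backend(ram_bytes, has_gart, force_soft=False, threshold=3 * GIB):
    """Model which DMA remapping backend the x86_64 kernel picks by default.

    has_gart: True for AMD64 (AGP aperture usable as IOMMU), False for EM64T.
    force_soft: models booting with iommu=soft.
    threshold: assumed RAM size above which an IOMMU is enabled by default.
    """
    if force_soft:
        return "swiotlb"
    if ram_bytes <= threshold:
        return "none"  # "PCI-DMA: Disabling IOMMU"
    return "gart" if has_gart else "swiotlb"


def device_can_panic(ram_bytes, dma_bits, backend):
    """True when some RAM is beyond the device's reach and nothing remaps it."""
    return backend == "none" and ram_bytes > (1 << dma_bits)


if __name__ == "__main__":
    # A device limited to 30 address bits (1GB) on an EM64T machine.
    for ram in (GIB // 2, 2 * GIB, 4 * GIB):
        backend = default_dma_backend(ram, has_gart=False)
        print(ram // (1 << 20), "MB:", backend, device_can_panic(ram, 30, backend))
```

Under these assumptions only the 2GB case reports a possible panic, which matches the "fine under 1GB, fine over 3GB, panics in between" observation.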
Ultimately, though, I feel that the kernel behaviour is not right - consistently using the (hard- or software) IOMMU on x86_64 would avoid any problems with hardware DMA addressing limitations. I'm not sure how this affects i386 though.

Yeah, I was just tossing this around, and you're right. The iommu is getting explicitly disabled because you have less than 4GB of RAM installed, so the kernel decides that you don't need any iommu support. And there is really not a lot we could do about that. There are a lot of things you could do, such as adding flags that only enable swiotlb in the event that no real iommu is present, but that wouldn't change the fact that this is all because of the b44 hardware's need to DMA under 1GB of RAM, so it won't get much support.

We have fixed this (somewhat) in b44 already. There is that patch that is supposed to restrict memory allocations in the b44 driver to under 30 bits of address, and use GFP_DMA if that can't be obtained. It seems we are missing a case though (perhaps one we can't control, possibly in the rx path from the hardware). Either way, I think about the only foolproof solution in the b44 driver is to allocate memory for the card only from ZONE_DMA, which is going to have considerable performance impact, perhaps more so than just enabling swiotlb.

Perhaps this could be managed on install. If you install these systems with a kickstart file you could add a %post section to the install process to test for the presence of 1GB < x < 4GB of RAM and a b44 card. If both are true, you could append the swiotlb option (iommu=soft) to the boot command line arguments. That way you could at least be a little more selective about which systems use the soft iommu. But I think in the end this is going to have to be a CANTFIX. Sorry about that.
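The %post test suggested above could be sketched roughly as follows. This is an illustration under stated assumptions, not a tested kickstart script: the function names are hypothetical, the PCI IDs are assumed values for BCM4401 variants and should be checked against the actual hardware, and MemTotal in /proc/meminfo is slightly lower than the installed RAM, so machines right at the 1GB or 4GB boundary may be classified imprecisely.

```python
import re

KIB = 1024
GIB = 1 << 30

# Assumed (vendor, device) PCI IDs for BCM4401 variants; 14e4 is Broadcom.
# Verify these against the output of lspci -n on the target machines.
B44_IDS = {("14e4", "4401"), ("14e4", "4402"), ("14e4", "170c")}


def mem_total_bytes(meminfo_text):
    """Extract MemTotal from the text of /proc/meminfo, in bytes."""
    match = re.search(r"^MemTotal:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    if match is None:
        raise ValueError("MemTotal not found in meminfo text")
    return int(match.group(1)) * KIB


def needs_soft_iommu(meminfo_text, pci_ids):
    """True if a b44 card is present and RAM is between 1GB and 4GB.

    pci_ids: iterable of (vendor, device) pairs as lowercase hex strings.
    """
    ram = mem_total_bytes(meminfo_text)
    has_b44 = any(pair in B44_IDS for pair in pci_ids)
    return has_b44 and GIB < ram < 4 * GIB
```

A %post section could call needs_soft_iommu with the contents of /proc/meminfo and the IDs parsed from lspci -n, and only then append iommu=soft to the kernel line in the boot loader configuration.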