Bug 30839
Summary: | timeouts and hiccups with wvlan_cs (RC2 kernel)
---|---
Product: | [Retired] Red Hat Linux
Component: | kernel
Status: | CLOSED NOTABUG
Severity: | high
Priority: | medium
Version: | 7.1
Hardware: | i386
OS: | Linux
Reporter: | James Manning <jmm>
Assignee: | Michael K. Johnson <johnsonm>
QA Contact: | Brock Organ <borgan>
CC: | notting
Doc Type: | Bug Fix
Last Closed: | 2001-03-17 07:47:35 UTC
Description
James Manning
2001-03-06 19:37:08 UTC
Did any of this work on 0.1.9 or earlier? I've seen this a couple of times, but it's not very reproducible for me (this is with the kernel config.) I suppose the driver could be doing something bad that interacts poorly with zerocopy. What I've noticed is that when these timeouts happen, the entire machine freezes for the period of the timeout (e.g., interrupts are disabled.) One way to get the machine out of this state, if you don't feel like waiting for the watchdog, is to just eject the card. :)

1) I'm going to check with 0.1.9 as soon as I can get home and play with it tonight
2) to take the cardbus bridge out of the equation, I'd love to try an ad-hoc between laptops. Anyone at RH's 2600 Meridian location willing to do lunch wed/thurs at Sarah's Empanada's? :)
3) I saw the "freezes" under 2.2.17 (a freeze that would go away once the card was ejected), but only on the celeron (using the cardbus bridge). I haven't seen that again since going to 2.4.x on the celeron.

same behavior with 2.4.1-0.1.9:

Mar 7 02:39:18 laptop kernel: NETDEV WATCHDOG: eth0: transmit timed out
Mar 7 02:39:18 laptop kernel: wvlan_cs: eth0 Tx timed out! Resetting card
Mar 7 02:39:18 laptop kernel: wvlan_cs: MAC address on eth0 is 00 02 2d 09 35 46
Mar 7 02:39:18 laptop kernel: wvlan_cs: Valid channels: 1 2 3 4 5 6 7 8 9 10 11

Speaking of Sarah's, did you buy these cards and bridge next door? Are they Buffalo cards?

nope, all on-line.
bridge: http://www.amtron.com/reader/pcdrp202e.htm
cards: http://www.cdw.com/shop/products/default.asp?EDC=202683

Funny enough, i just got back from Sarah's for lunch :) mmmmmmm empanadas .... I have seen these messages with Buffalo cards (re-badged lucent) and the Ricoh bridge with the 2.2 kernel, but not with the 2.4 kernel.

Created attachment 12683
log of random packets on the wvlan_cs card
the attached log is a tcpdump of packets I noticed. The activity light on the cards stays fairly constant even with nothing going on. From the laptop, I did a tcpdump and saw nothing. I put it into promisc and saw the packets in the log I attached. I'll go ahead and try 0.1.28 on both ends, though.

trying on 2.4.2-0.1.28 from rawhide, the activity light still acts the same, but now even in promisc mode, tcpdump shows nothing! weird.

Could you try iwconfiging the card instead of re-starting pcmcia when this happens and see if that also correctly re-inits the card?

what do you mean by "when this happens"? On the laptop, I just get the eth timeouts, and it reinits the card and keeps going just fine. On the celeron (same card, through the Ricoh cardbus bridge using yenta_socket), I have to keep a loop that iwconfig's to set the essid (and only the essid) every 10 seconds, since it goes to garbage on some of the timeouts (not all).

Recap:
- nothing but waiting for the timeout on the laptop, resets fine.
- celeron needs the occasional iwconfig to reset essid from garbage.

OMFG! Based on a few entries from the forum over at pcmcia-cs on sourceforge, I hit wavelan.com and got the update for the firmware. I upgraded the firmware on both cards from 6.06 to 6.16, and it's like night and day! I've had the cards going solid for about an hour now, with nothing even close to a problem, aside from some kernel messages:

kernel: Undo loss 192.168.2.2/1079 c2 l0 ss2/65535 p0
last message repeated 2 times

Admittedly, that was under load testing and from what I've heard is probably an issue elsewhere given how many things have collapsed under load in early 2.4 kernels. Sorry it took so long to try the firmware upgrade, but I can only find it from wavelan.com, and only as a win 95/98/me/nt/2k program :(

Thank you for this information!
kernel: Undo loss 192.168.2.2/1079 c2 l0 ss2/65535 p0
is a harmless debugging message which we have now turned off.
Since this problem seemed to be a firmware issue, I'm closing this bug as "NOTABUG" (rather NOTOURBUG but that does not exist :) If you object to that, please reopen this bug.

NOTABUG is certainly fine by me... is it acceptable to open this up outside the beta program? I ask mainly because I'd like to make it as easy as possible for others to find this bug and try a firmware upgrade before bothering an already-overloaded kernel team at Red Hat. :) I'm going to go ahead and post in the forum of pcmcia-cs at sourceforge about my success, but anything that helps get fewer bugs like this filed is hopefully a Good Thing (tm)

If it's deemed *not* acceptable to kill off the beta-only checkbox from this bug, do you want me to open a similar public bug, mark this a dup, and paste at least the initial behavior and solution? Whatever works best for you is fine by me... I feel pretty bad having wasted time of Bill, Michael *and* Arjan over a NOTABUG :)

I will try to get this added to the knowledgebase; that is the place for such things.

Since this is ancient and knowledge-base fodder, i'm gonna try to remove the beta-only toggle
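
For anyone who cannot apply the firmware upgrade right away, the interim workaround described above for the celeron (re-running iwconfig to set only the essid every 10 seconds) could be sketched roughly as follows. This is not the script actually used in this bug; the function names, the `rounds` parameter, and the injectable `run`/`sleep` hooks are assumptions added for illustration, and the interface name and ESSID in the note below are placeholders.

```python
import subprocess
import time


def essid_command(interface, essid):
    """Build the iwconfig invocation that sets the ESSID and nothing else."""
    return ["iwconfig", interface, "essid", essid]


def keep_essid(interface, essid, interval=10.0, rounds=None,
               run=subprocess.call, sleep=time.sleep):
    """Re-apply the ESSID every `interval` seconds.

    rounds=None loops until interrupted; an integer stops after that many
    iwconfig calls. `run` and `sleep` are hypothetical hooks so the loop
    can be exercised without touching a real card.
    """
    done = 0
    while rounds is None or done < rounds:
        run(essid_command(interface, essid))
        done += 1
        # Skip the final sleep when a fixed number of rounds was requested.
        if rounds is None or done < rounds:
            sleep(interval)
    return done
```

Calling `keep_essid("eth0", "my-network")` as root would keep resetting the ESSID until interrupted. It only papers over the garbage-ESSID symptom; the 6.06 to 6.16 firmware upgrade is what made the timeouts go away in this report.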