Bug 285721
Summary: tg3: tg3_abort_hw timed out for eth0, TX_MODE_ENABLE will not clear MAC_TX_MODE=ffffffff
Product: Fedora
Component: kernel
Version: 7
Hardware: All
OS: Linux
Status: CLOSED CURRENTRELEASE
Severity: low
Priority: medium
Reporter: Matěj Cepl <mcepl>
Assignee: Andy Gospodarek <agospoda>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: chris.brown, mcepl, peterm
Keywords: Reopened
Fixed In Version: f8
Doc Type: Bug Fix
Last Closed: 2008-01-14 06:27:05 UTC
Bug Blocks: 715452

Description
Matěj Cepl
2007-09-11 10:06:26 UTC
Created attachment 192361 [details]
output of dmesg command
Of course, I am not sure what component this should go in -- kernel, hal?

Hm, that sounds more like the tg3 driver has a problem when you hibernate using this quirk. Have you tried hibernating without the quirk? Or doesn't the machine hibernate properly when you leave it out? But in general, it's a kernel module spitting out an error message, so I rather think this is a kernel problem; therefore I'm reassigning it to kernel.

Thanks,
Read ya, Phil

Yes, I tried to hibernate with every possible quirk and without any at all, and it produces this message every time.

Hello,

I'm reviewing this bug as part of the kernel bug triage project, an attempt to isolate current bugs in the Fedora kernel.

http://fedoraproject.org/wiki/KernelBugTriage

From the 2.6.23-rc3 Changelog:

    commit 3e0c95fd648c0d3175b9ff2232597d0b02eb7d46
    Author: Michael Chan <mchan>
    Date:   Fri Aug 3 20:56:54 2007 -0700

        [TG3]: Fix suspend/resume problem.

        Joachim Deguara <joachim.deguara> reported that tg3 devices
        would not resume properly if the device was shutdown before
        the system was suspended. In such scenario where the
        netif_running state is 0, tg3_suspend() would not save the PCI
        state and so the memory enable bit and bus master enable bit
        would be lost.

        We fix this by always saving and restoring the PCI state in
        tg3_suspend() and tg3_resume() regardless of netif_running()
        state.

        Signed-off-by: Michael Chan <mchan>
        Signed-off-by: David S. Miller <davem>

Matej, can you test with a kernel based off this? Also, could you clear up whether you are suspending/resuming or hibernating/waking? You mention suspend/resume but then indicate you are running pm-hibernate. Do you still see this issue with pm-suspend?

Cheers
Chris

Patch queued for next kernel update.

(In reply to comment #5)
> Matej, can you test with a kernel based off this?

Is there an RPM with the code somewhere -- I don't do Red Hat kernel building all the time ... :-)

Besides, I don't see where the appropriate patch is anyway.

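The fix described in the 2.6.23-rc3 changelog entry quoted above can be illustrated with a toy model. This is only a sketch in Python, not the driver's C code: the `ToyNic` class, its methods, and the `cycle` helper are hypothetical names invented for illustration. The two command-register bit values match the standard PCI definitions. The model also assumes that register reads return all ones while memory decoding is disabled, which is how PCI reads typically behave and would be consistent with the `MAC_TX_MODE=ffffffff` in this bug's summary, although nothing in this thread confirms that link.

```python
from dataclasses import dataclass
from typing import Optional

# Standard PCI command register bits.
PCI_COMMAND_MEMORY = 0x2  # memory space enable
PCI_COMMAND_MASTER = 0x4  # bus master enable


@dataclass
class ToyNic:
    """Hypothetical stand-in for a network device; not the real tg3 driver."""
    command: int = PCI_COMMAND_MEMORY | PCI_COMMAND_MASTER
    netif_running: bool = True
    saved_command: Optional[int] = None

    def suspend(self, always_save: bool) -> None:
        # Old behaviour: save PCI state only when the interface is up.
        # Fixed behaviour: save it regardless of netif_running.
        if always_save or self.netif_running:
            self.saved_command = self.command
        # Assumption: the device loses its configuration while asleep.
        self.command = 0

    def resume(self) -> None:
        # Restore only what was saved; with nothing saved, the bits stay lost.
        if self.saved_command is not None:
            self.command = self.saved_command
            self.saved_command = None

    def read_register(self) -> int:
        # Assumption: with memory decoding off, reads come back as all ones.
        if self.command & PCI_COMMAND_MEMORY:
            return 0x0
        return 0xFFFFFFFF


def cycle(always_save: bool, netif_running: bool) -> int:
    """Run one suspend/resume cycle and return a register read afterwards."""
    nic = ToyNic(netif_running=netif_running)
    nic.suspend(always_save)
    nic.resume()
    return nic.read_register()


if __name__ == "__main__":
    for always_save in (False, True):
        for running in (True, False):
            value = cycle(always_save, running)
            print(f"always_save={always_save} netif_running={running} "
                  f"read=0x{value:08x}")
```

In this model only the combination "interface down, state saved conditionally" leaves the device unreadable after resume, which is the scenario the commit message describes.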
(In reply to comment #7)
> (In reply to comment #5)
> > Matej, can you test with a kernel based off this?
>
> Is there RPM with the code somewhere -- I don't do Red Hat kernel building all
> the time ... :-)

Understandable.

> Besides, I don't see where is the appropriate patch anyway.

You can either run:

    # yum update kernel --enablerepo=development --nogpgcheck

which will pull the latest rawhide kernel (should be off 2.6.23) or let me know what arch you are running and I'll do a scratch build for you in Koji.

Tried kernel-2.6.23-0.217.rc9.git1.fc8 and the results are no good:

a) iwl3945 driver didn't work with NetworkManager-0.6.5-7.fc7 (it did with kernel-2.6.22.9-91.fc7), so ifup had to be used.
b) tg3 warning message went away, but after suspend no wireless card whatsoever (I have the computer at home, so I have no chance to test wired ethernet function actually), and there were some backtraces in dmesg (see attached).

Created attachment 216801 [details]
output of dmesg
Created attachment 216811 [details]
/var/log/messages
Matej,

As the original issue with tg3 appears resolved in 2.6.23, I'm changing the subject to reflect that. It might even be worth filing a new bug for the wireless, but to be honest there is so much work going into NetworkManager and Intel wifi at the moment that it will likely get lost in the noise. I'd suggest your best option would be to test with NetworkManager (again from development if you can) and see if this helps. In the meantime I'll re-assign to the wireless team.

For brevity, here is the backtrace, which is related to your touchpad rather than network driver issues:

    =============================================
    [ INFO: possible recursive locking detected ]
    2.6.23-0.217.rc9.git1.fc8 #1
    ---------------------------------------------
    kseriod/253 is trying to acquire lock:
     (&ps2dev->cmd_mutex){--..}, at: [<c0631cb3>] mutex_lock+0x21/0x24

    but task is already holding lock:
     (&ps2dev->cmd_mutex){--..}, at: [<c0631cb3>] mutex_lock+0x21/0x24

    other info that might help us debug this:
    4 locks held by kseriod/253:
     #0:  (serio_mutex){--..}, at: [<c0631cb3>] mutex_lock+0x21/0x24
     #1:  (&serio->drv_mutex){--..}, at: [<c0631cb3>] mutex_lock+0x21/0x24
     #2:  (psmouse_mutex){--..}, at: [<c0631cb3>] mutex_lock+0x21/0x24
     #3:  (&ps2dev->cmd_mutex){--..}, at: [<c0631cb3>] mutex_lock+0x21/0x24

    stack backtrace:
     [<c0406463>] show_trace_log_lvl+0x1a/0x2f
     [<c0406e4d>] show_trace+0x12/0x14
     [<c0406e65>] dump_stack+0x16/0x18
     [<c0449c56>] __lock_acquire+0x189/0xc67
     [<c044abae>] lock_acquire+0x7b/0x9e
     [<c0631ac0>] __mutex_lock_slowpath+0x10a/0x2dc
     [<c0631cb3>] mutex_lock+0x21/0x24
     [<c059bc3f>] ps2_command+0x92/0x30e
     [<c05a23c6>] psmouse_sliced_command+0x1c/0x5a
     [<c05a46eb>] synaptics_pt_write+0x21/0x46
     [<c059ba14>] ps2_sendbyte+0x39/0xcb
     [<c059bcbe>] ps2_command+0x111/0x30e
     [<c05a2001>] psmouse_probe+0x1d/0x6c
     [<c05a314d>] psmouse_connect+0xf8/0x20c
     [<c05993e0>] serio_connect_driver+0x1e/0x2e
     [<c0599406>] serio_driver_probe+0x16/0x18
     [<c05767bd>] driver_probe_device+0xf2/0x173
     [<c0576846>] __device_attach+0x8/0xa
     [<c0575b92>] bus_for_each_drv+0x3c/0x67
     [<c05768dc>] device_attach+0x75/0x8a
     [<c059941c>] serio_find_driver+0x14/0x3c
     [<c0599f59>] serio_thread+0x166/0x2b9
     [<c043e7f7>] kthread+0x3b/0x64
     [<c0405ee3>] kernel_thread_helper+0x7/0x10
     =======================

The logs then indicate the device coming back up and dhclient restarting a few times before sleeping. NetworkManager fares no better later on, it would seem.

FWIW, it is really bad form to simply hijack a bug for another problem, rename it, and expect it to be treated as an extension of the same bug. It isn't clear to me that this is the same issue at all, and having synaptics backtraces and tg3 stuff in what now claims to be an iwl3945 bug just creates confusion...

Sure, being a bugmaster, I thought that kernel folks have different mores ;-). No, seriously, should I file a new bug?

Matej, you weren't the one I was hoping to educate. :-)

Yes, please open a new bug. Restoring the name and assignee of this one (and closing it, if appropriate) seems like a good idea too.

OK, so let's close this bug as CLOSED/RAWHIDE, and on Monday I will upgrade my computer to F8test3, update that to the latest Rawhide, and I will file all bugs which will eventually be found and you will fix them. Is it a deal? :-)

(In reply to comment #13)
> FWIW, it is really bad form to simply hijack a bug for another problem, rename
> it, and expect it to be treated as an extension of the same bug. It isn't
> clear to me that this is the same issue at all, and having synaptics
> backtraces and tg3 stuff in what now claims to be an iwl3945 bug just creates
> confusion...

I'm no hi-jacker; you must have me confused with someone else. I don't *expect* it to be treated as an extension, just that the underlying issue may be the same. However, the initial tg3 errors were resolved, so I felt a change of subject was appropriate. It's your call to ask the reporter to file a new bug or continue on this one. Please don't accuse me of hi-jacking - as indicated I am attempting to triage kernel bugs.

(In reply to comment #16)
> OK, so let's close this bug as CLOSED/RAWHIDE, and on Monday I will upgrade my
> computer to F8test3, update that to the latest Rawhide, and I will file all
> bugs which will eventually be found and you will fix them. Is it a deal? :-)

Deal.

Cheers
Chris

Sorry guys, could I have my original summary back (I believe many people search by the error they get in logs)? Unfortunately, I also have to reopen this. I have upgraded to full Rawhide, so I now have kernel 2.6.23-0.224.rc9.git6.fc8 and NetworkManager-0.7.0-0.3.svn2914.fc8. Unfortunately, the error message is back.

Created attachment 223371 [details]
/var/log/messages
This /var/log/messages covers both a suspend/resume cycle and a reboot.
After the suspend/resume cycle, there is no network (neither wireless nor wired). I
will file a different bug about this.
Created attachment 223451 [details]
output of dmesg
after restart of the computer (network works)
Matej,

Can you attach the before and after suspend output of `lspci -xxxvvv` for this system when you get the tg3 failure?

Thanks!

Created attachment 226381 [details]
lspci -vvvxxxx after suspend
After resume from suspend to RAM I again got a very nice collection of crashes,
non-functional drivers, etc. When I ran 'modprobe -v -r iwl3945 tg3', the
situation turned very quickly to a working wireless network, even with
NetworkManager (not having wired Ethernet at hand, I cannot test the real
functionality of the tg3 driver).
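
The before/after comparison of `lspci -xxxvvv` output requested earlier in this thread can be automated rather than read by eye. The following is a minimal sketch, assuming the dumps use the usual lspci layout of a device header line (for example `02:00.0 Ethernet controller: ...`) followed by hex rows of the form `offset: byte byte ...`. The function names `parse_lspci_hex` and `diff_config` are hypothetical, and the sample data in the demo is invented, not taken from the attachments.

```python
import re
from typing import Dict, List, Tuple

# Device header, optionally prefixed with a PCI domain such as "0000:".
SLOT_RE = re.compile(r"^((?:[0-9a-f]{4}:)?[0-9a-f]{2}:[0-9a-f]{2}\.[0-7])\s")
# Hex dump row: a 2- or 3-digit offset, a colon, then space-separated bytes.
HEX_RE = re.compile(r"^([0-9a-f]{2,3}):((?: [0-9a-f]{2})+)\s*$")

ConfigSpace = Dict[int, int]  # offset -> byte value


def parse_lspci_hex(text: str) -> Dict[str, ConfigSpace]:
    """Collect the config-space bytes of every device in an lspci -x dump."""
    devices: Dict[str, ConfigSpace] = {}
    current = None
    for line in text.splitlines():
        slot = SLOT_RE.match(line)
        if slot:
            current = devices.setdefault(slot.group(1), {})
            continue
        row = HEX_RE.match(line)
        if row and current is not None:
            base = int(row.group(1), 16)
            for i, byte in enumerate(row.group(2).split()):
                current[base + i] = int(byte, 16)
    return devices


def diff_config(before: Dict[str, ConfigSpace],
                after: Dict[str, ConfigSpace]) -> List[Tuple[str, int, int, int]]:
    """Return (slot, offset, old, new) for each byte present in both dumps that differs."""
    changes = []
    for slot in sorted(before.keys() & after.keys()):
        old, new = before[slot], after[slot]
        for offset in sorted(old.keys() & new.keys()):
            if old[offset] != new[offset]:
                changes.append((slot, offset, old[offset], new[offset]))
    return changes


if __name__ == "__main__":
    # Invented sample: the command register (offset 0x04) loses its bits.
    sample_before = "02:00.0 Ethernet controller: example\n00: e4 14 00 00 06 01 b0 00\n"
    sample_after = "02:00.0 Ethernet controller: example\n00: e4 14 00 00 00 00 b0 00\n"
    for slot, offset, old, new in diff_config(parse_lspci_hex(sample_before),
                                              parse_lspci_hex(sample_after)):
        print(f"{slot} offset 0x{offset:02x}: {old:02x} -> {new:02x}")
```

Offsets 0x04 and 0x05 hold the PCI command register, so a change there after resume would be the first thing to look for when a driver reports that the device has stopped responding.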
Created attachment 226391 [details]
output of dmesg
I think some parts of this could be interesting. BTW, I am currently using
the kernel-2.6.23-6.fc8 package.
Created attachment 226401 [details]
lspci -vvvxxxx after fresh reboot
I think that should be it.

Created attachment 226761 [details]
output of dmesg after hibernation
I have tried hibernate (suspend to disk; the previous suspend data were from
suspend to RAM) and after resume the results were as bad as with suspend to
RAM. Actually, I haven't managed to get the network working at all and I had to
reboot the computer in order to get a net connection.
This is the output of dmesg, where I see some interesting backtraces.
Created attachment 226771 [details]
output of lspci -vvvvxxxx
Created attachment 226781 [details]
/var/log/messages after resume from hibernation
Hello Matej,

Any improvements with recent kernel updates? There have been plenty of wireless driver updates that may have resolved this issue for you.

You can also try adding:

    SUSPEND_MODULES="iwl3945 tg3"

to /etc/pm/config.d/unload_modules, which might help things a bit.

Cheers
Chris

I cannot find any error messages in any log for now. So, let's CLOSE this for now, and I will reopen it if ever needed again.