Bug 190586

Summary: internet down after kernel upgrade to 2107*
Product: [Fedora] Fedora
Reporter: chayane singh <chayane>
Component: kernel
Assignee: Dave Jones <davej>
Status: CLOSED ERRATA
QA Contact: Brian Brock <bbrock>
Severity: medium
Priority: medium
Version: 5
CC: ben, bojan, hongjiu.lu, jsmith.fedora, mtasaka, pfrields, rh-fc-bugzilla, wtogami
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: 2111
Doc Type: Bug Fix
Last Closed: 2006-05-07 00:00:31 UTC

Description chayane singh 2006-05-03 19:21:42 UTC
Description of problem:


Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3. Not able to ping Internet IP address, IP address locally assigned, after
kernel upgrade from kernel-2.6.16-1.2096_FC5 to kernel-2.6.16-1.2107_FC5.

Everything works fine with the old kernel.
  
Actual results:


Expected results:


Additional info:

Comment 1 Jared Smith 2006-05-03 22:02:30 UTC
I'm seeing the same thing:

NIC appears to be up, but can't ping its local IP, can't ping 127.0.0.1, and
can't ping anything else on the network.  I've double-checked to make sure
iptables is off, etc. but it's still the same.  Rebooting into a previous kernel
works just fine.

Comment 2 H.J. Lu 2006-05-03 23:47:56 UTC
*** Bug 190606 has been marked as a duplicate of this bug. ***

Comment 3 Bojan Smojver 2006-05-04 00:36:58 UTC
Same here.

Comment 4 Dave Jones 2006-05-04 03:36:35 UTC
Does booting with pci=nomsi make this go away?
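
For anyone trying this, a minimal sketch for confirming that the option really reached the running kernel. It assumes the boot command line is readable from /proc/cmdline; the helper name `has_boot_option` and the sample command line are made up for illustration.

```python
def has_boot_option(cmdline: str, option: str) -> bool:
    """Return True if `option` appears as a whole word on the kernel command line."""
    return option in cmdline.split()


def read_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the command line the running kernel was booted with (Linux only)."""
    with open(path) as f:
        return f.read().strip()


# Hypothetical command line, not taken from any reporter's machine.
sample = "ro root=/dev/VolGroup00/LogVol00 rhgb quiet pci=nomsi"
print(has_boot_option(sample, "pci=nomsi"))
```

On a live system, `has_boot_option(read_cmdline(), "pci=nomsi")` would do the same check against the real command line.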

Comment 5 Dave Jones 2006-05-04 04:13:30 UTC
Actually, that option is also broken in this kernel (sigh).

2108, available from http://people.redhat.com/davej/kernels/Fedora/FC5/
has this disabled by default. Give that a try?


Comment 6 chayane singh 2006-05-04 05:54:26 UTC
No luck.

Comment 7 jouni 2006-05-04 12:24:07 UTC
With 2107, the only problem I noticed was when loading web pages from another
client: the pages showed up OK when they contained only text, but if they
contained any images the pages did not load. (The browser just stalled and
showed nothing.) Bigger files/transmissions -> problem?

The Apache log showed no problems. The requests for the unshown images were
logged.

I was not able to test more. The ssh connection worked, btw. Rebooting to the
old kernel cured the problem.

Comment 8 Jerry James 2006-05-04 15:30:20 UTC
With the exception of comment #7, which appears to be a different problem, I'm
seeing this, too ... but only on machines using the tg3 driver.  What driver are
the rest of you using for your network cards?  For my tg3-using machines, the
symptom is that the DHCP step appears to fail.  However, when I look at the
lease file, I see that my machine actually did successfully talk to the DHCP
server and got a lease.  Right after the DHCP step, I see the message "no IPv6
routers present", which is silly since these machines are all configured to use
IPv4.  Somehow the machine becomes convinced that DHCP failed, even though it
succeeded, and it reverts to 127.0.0.1 as its primary address.  Following
bootup, I then get the symptoms the other reporters indicated, but it's due to
the machine thinking the network setup process failed.

I also see quite a few "could not allocate memory" messages in my log from
various kernel subsystems.  If kernel memory allocation is broken, that could
explain the widely varying bug reports for this kernel.
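
The lease check described above could be scripted roughly as follows. This is only a sketch: it assumes the lease file follows the usual dhclient format with `fixed-address` entries, and the sample lease text and the helper name `leased_addresses` are invented for illustration. The location of the lease file varies by setup and is not given here.

```python
import re
from typing import List

# Matches lines such as "  fixed-address 192.168.1.23;" in a dhclient lease file.
LEASE_ADDR = re.compile(
    r"^\s*fixed-address\s+(\d{1,3}(?:\.\d{1,3}){3})\s*;", re.MULTILINE
)


def leased_addresses(lease_text: str) -> List[str]:
    """Return every IPv4 address recorded as fixed-address, in file order."""
    return LEASE_ADDR.findall(lease_text)


# Made-up lease entry, for illustration only.
SAMPLE_LEASE = """\
lease {
  interface "eth0";
  fixed-address 192.168.1.23;
  option subnet-mask 255.255.255.0;
}
"""

print(leased_addresses(SAMPLE_LEASE))
```

If the last address listed there is not the one the interface ends up with after boot, that would match the "DHCP succeeded but the machine thinks it failed" symptom.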

Comment 9 H.J. Lu 2006-05-04 16:01:39 UTC
I saw it on machines with e1000.

Comment 10 Andrew Rucker Jones 2006-05-04 16:22:57 UTC
Dave, I'm sure you're a lot smarter than I am, so I'm sure you've noticed that
all of the problems with 2107 that sound like this one have to do with
networking. I just upgraded myself and discovered that all connections from a
host to itself (whether 127.0.0.1 or a non-loopback address) get "stuck".
Looking at them with netstat -an reveals that there are a lot of bytes in the
Send-Q on the server side of the connection.

For the record, this has happened to me with Thunderbird/Dovecot on the same
machine and Firefox/Squid on the same machine. Other machines connecting to my
Dovecot & Squid server with Thunderbird and Firefox have no problems.

Perhaps you've already figured this out, and I just didn't see it in all the bug
reports. If so, forgive the noise.
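
In case it helps others compare notes, here is a rough sketch of picking the stuck connections out of `netstat -an` output. It assumes the usual Linux column layout (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State); the function name `stuck_connections` and the sample output are made up and do not come from an affected machine.

```python
from typing import List, Tuple


def stuck_connections(netstat_output: str, min_send_q: int = 1) -> List[Tuple[str, str, int]]:
    """Return (local, remote, send_q) for TCP sockets with Send-Q >= min_send_q."""
    stuck = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Only TCP rows have the six columns we need; skip headers and other sockets.
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        try:
            send_q = int(fields[2])
        except ValueError:
            continue
        # Listening sockets are not connections, so leave them out.
        if fields[5] != "LISTEN" and send_q >= min_send_q:
            stuck.append((fields[3], fields[4], send_q))
    return stuck


# Made-up sample output, for illustration only.
SAMPLE = """\
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 0.0.0.0:143             0.0.0.0:*               LISTEN
tcp        0  16384 127.0.0.1:143           127.0.0.1:40012         ESTABLISHED
tcp        0      0 127.0.0.1:40012         127.0.0.1:143           ESTABLISHED
"""

for local, remote, send_q in stuck_connections(SAMPLE):
    print(f"{local} -> {remote}: {send_q} bytes waiting in Send-Q")
```

A Send-Q that stays non-zero across repeated runs is the interesting case; a briefly non-zero value on a busy connection is normal.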

Comment 11 Ben Webb 2006-05-04 18:49:46 UTC
I'm also seeing similar behavior (except comment #7) with the e1000 driver, 2107
i686 SMP kernel; the interface comes up (static IP), can ping other hosts, but
can't make any kind of TCP connection. Downgrading to 2096 makes the problem go
away.
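
Since ping and TCP behave differently in these reports, a self-contained loopback TCP check may be a quicker test than a browser or mail client. The sketch below pushes a payload through a TCP connection to 127.0.0.1 and reports whether all of it arrived before a timeout. The function name, the 256 KiB default payload, and the 5 second timeout are arbitrary choices for illustration, not values from this bug.

```python
import socket
import threading


def loopback_tcp_ok(payload_size: int = 256 * 1024, timeout: float = 5.0) -> bool:
    """Return True if payload_size bytes arrive intact over a TCP connection to 127.0.0.1."""
    payload = b"x" * payload_size
    received = bytearray()

    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
    server.listen(1)
    server.settimeout(timeout)
    port = server.getsockname()[1]

    def serve() -> None:
        try:
            conn, _ = server.accept()
            with conn:
                conn.settimeout(timeout)
                while len(received) < payload_size:
                    chunk = conn.recv(65536)
                    if not chunk:
                        break  # peer closed early
                    received.extend(chunk)
        except OSError:
            pass  # timeout or reset: `received` stays short

    thread = threading.Thread(target=serve, daemon=True)
    thread.start()
    try:
        with socket.create_connection(("127.0.0.1", port), timeout=timeout) as client:
            client.sendall(payload)
    except OSError:
        return False
    finally:
        thread.join(timeout + 1)
        server.close()
    return bytes(received) == payload


if __name__ == "__main__":
    print("loopback TCP OK" if loopback_tcp_ok() else "loopback TCP stalled or failed")
```

Running it under both the 2096 and 2107 kernels would show whether the stall reproduces without any particular application involved.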