Bug 524745 - Network hangs performing rsync with kernel-2.6.30.5-43.fc11.x86_64
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 11
Hardware: x86_64
OS: Linux
Priority: low
Severity: medium
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
Reported: 2009-09-22 03:06 UTC by Gustavo Maciel Dias Vieira
Modified: 2009-09-30 06:05 UTC (History)

Doc Type: Bug Fix
Last Closed: 2009-09-30 06:05:54 UTC


Attachments
Output of lspci (1.69 KB, text/plain)
2009-09-22 03:06 UTC, Gustavo Maciel Dias Vieira

Description Gustavo Maciel Dias Vieira 2009-09-22 03:06:07 UTC
Created attachment 362026
Output of lspci

When I run an rsync operation as part of an rsnapshot backup, the entire network interface hangs. No packets are sent or received. After I cancel the rsync command, it takes about 10 minutes for the network to recover.

It does recover, but the interface stays unusable for that whole period. Pings sent to the affected machine are either dropped or answered after huge delays, sometimes as long as 90 seconds.

This bug appears only in kernel-2.6.30.5-43.fc11.x86_64. I can reproduce it every time with the command below. If I boot with kernel-2.6.29.6-217.2.16.fc11.x86_64 it never appears.

The rsync command that triggers it is very basic, something like:
/usr/bin/rsync -a --delete --numeric-ids --relative --delete-excluded \
    --exclude=.gvfs --rsh=/usr/bin/ssh <auser>@<ahost>:<adir> \
    <backupdir>

I really don't know what kind of information may be useful for an issue like this. I'm attaching the output of lspci to describe the affected hardware. The affected network interface is an Intel PRO/100 using the e100 driver.
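
A minimal sketch for checking whether a machine matches the setup described above. The function names is_affected_kernel and iface_driver are hypothetical helpers, not part of any tool; the kernel release string is the one named in this report, and the sysfs path is the usual location of the driver link for a PCI network device.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given kernel release string is the
# build this report names as affected.
is_affected_kernel() {
    case "$1" in
        2.6.30.5-43.fc11.x86_64) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: print the name of the driver bound to a network
# interface, read from sysfs. Fails if the interface has no driver link.
iface_driver() {
    link=$(readlink "/sys/class/net/$1/device/driver" 2>/dev/null) || return 1
    basename "$link"
}

# Compare the running kernel against the affected build.
if is_affected_kernel "$(uname -r)"; then
    echo "running the kernel build named in this report"
else
    echo "running a different kernel build: $(uname -r)"
fi
```

Calling iface_driver with an interface name (for example eth0, which is an assumption about the local setup) should print e100 on hardware like that in the attached lspci output.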

Comment 1 Jakob Hirsch 2009-09-22 15:18:37 UTC
Same problem here with e100 since 2.6.30.5-43.fc11.x86_64; everything was fine with 2.6.29.* and below (at least since Fedora 9). I did not try rsync; a "wget some.host/bigfile" is enough. An ifdown/ifup cycle revives the interface.

lspci -vv output:

02:00.0 Ethernet controller: Intel Corporation 82557/8/9/0/1 Ethernet Pro 100 (rev 0c)
	Subsystem: Intel Corporation EtherExpress PRO/100 S Server Adapter
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 64 (2000ns min, 14000ns max), Cache Line Size: 64 bytes
	Interrupt: pin A routed to IRQ 16
	Region 0: Memory at fe7df000 (32-bit, non-prefetchable) [size=4K]
	Region 1: I/O ports at dcc0 [size=64]
	Region 2: Memory at fe7e0000 (32-bit, non-prefetchable) [size=128K]
	Capabilities: [dc] Power Management version 2
		Flags: PMEClk- DSI+ D1+ D2+ AuxCurrent=0mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=2 PME-
	Kernel driver in use: e100
	Kernel modules: e100

I'm about to go back to kernel-2.6.29.6-217.2.16.fc11.x86_64 for the time being, but I'll try kernel-2.6.30.6-53.fc11 if I get around to it (though the changelog mentions nothing about this).
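
The ifdown/ifup workaround mentioned at the start of this comment can be sketched roughly as follows. The function name bounce_iface and the interface name eth0 are assumptions for illustration; the optional second argument turns the call into a dry run that only prints the commands.

```shell
#!/bin/sh
# Hypothetical helper: take an interface down and bring it back up.
# Pass "echo" as the second argument for a dry run that only prints
# the commands instead of executing them.
bounce_iface() {
    iface=$1
    run=${2:-}
    $run ifdown "$iface" && $run ifup "$iface"
}

# Dry run; IFACE defaults to eth0, which is an assumption about the setup.
bounce_iface "${IFACE:-eth0}" echo
```

Dropping the echo argument and running as root would perform the actual cycle. This only revives the interface after a hang; it does not address the cause.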

Comment 2 Chuck Ebbert 2009-09-24 14:30:02 UTC
2.6.30.7 has an e100 patch but it's hard to tell if it fixes this. Do your kernel boot messages contain this line?:

PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
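
One way to look for that line, as a rough sketch: the function name uses_swiotlb is a hypothetical helper, and the search string is the message quoted above. If the kernel ring buffer has already wrapped, the boot log saved on disk (its location varies by system) may need to be searched instead.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the kernel log text on stdin contains
# the SWIOTLB bounce-buffering message quoted in this comment.
uses_swiotlb() {
    grep -qF 'PCI-DMA: Using software bounce buffering for IO (SWIOTLB)'
}

# Search the current kernel ring buffer.
if dmesg 2>/dev/null | uses_swiotlb; then
    echo "SWIOTLB bounce buffering is in use"
else
    echo "SWIOTLB line not found in the current kernel log"
fi
```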

Comment 3 Jakob Hirsch 2009-09-24 16:15:06 UTC
(In reply to comment #2)
> 2.6.30.7 has an e100 patch but it's hard to tell if it fixes this. Do your
> kernel boot messages contain this line?:
> 
> PCI-DMA: Using software bounce buffering for IO (SWIOTLB)  

It does indeed.

Do you mean this changelog entry?

commit 4d422a0590a44d9de3749dadb673b7b0561cc0d1
Author: Krzysztof Hałasa <khc@pm.waw.pl>
Date:   Sun Aug 23 19:02:13 2009 -0700

    E100: fix interaction with swiotlb on X86.
...
    This patch, while not yet making the driver conform to the PCI DMA API,
    allows it to work correctly on X86 with swiotlb (while not breaking
    other architectures).

My system is running x86_64, so I wonder whether that would help...
But judging by this thread, for example, it could: http://www.mail-archive.com/kernel-testers@vger.kernel.org/msg06246.html

I even thought about trying kernel-2.6.31-40.fc12 (hoping that it's installable on my FC11 system and won't break anything else)...

I also wonder why my system needs SWIOTLB at all. It seems that my Pentium E2160 has no IOMMU...

Comment 4 Chuck Ebbert 2009-09-25 03:17:55 UTC
Looks like that is the fix. And SWIOTLB is used when there is no IOMMU...

Comment 5 Jakob Hirsch 2009-09-25 11:44:21 UTC
(In reply to comment #4)

I see on koji that you built kernel-2.6.30.8-64.fc11. It runs fine here and fixes the problem. Thanks!

