Bug 524745

Summary: Network hangs performing rsync with kernel-2.6.30.5-43.fc11.x86_64
Product: [Fedora] Fedora Reporter: Gustavo Maciel Dias Vieira <gustavo>
Component: kernel Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED ERRATA QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium Docs Contact:
Priority: low    
Version: 11 CC: dougsland, gansalmon, itamar, jh.redhat-2018, kernel-maint
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2009-09-30 06:05:54 UTC Type: ---
Attachments:
Description Flags
Output of lspci none

Description Gustavo Maciel Dias Vieira 2009-09-22 03:06:07 UTC
Created attachment 362026 [details]
Output of lspci

When I run an rsync operation as part of an rsnapshot backup, the entire network interface hangs. No packets are sent or received. After I cancel the rsync command, it takes about 10 minutes for the network to recover.

It does recover, but it stays unusable for that whole period. Pings sent to the affected machine are either dropped or answered with huge delays, sometimes as long as 90 s.

This bug appears only in kernel-2.6.30.5-43.fc11.x86_64. I can reproduce it every time with the command below. If I boot with kernel-2.6.29.6-217.2.16.fc11.x86_64 it never appears.

The rsync command that triggers it is very basic, something like:
/usr/bin/rsync -a --delete --numeric-ids --relative --delete-excluded \
    --exclude=.gvfs --rsh=/usr/bin/ssh <auser>@<ahost>:<adir> \
    <backupdir>

I really don't know what kind of information may be useful for an issue like this. I'm attaching the output of lspci to describe the affected hardware. The affected network interface is an Intel PRO/100 using the e100 driver.

Comment 1 Jakob Hirsch 2009-09-22 15:18:37 UTC
Same problem here with e100 since 2.6.30.5-43.fc11.x86_64; everything is fine with 2.6.29.* and below (at least since Fedora 9). I did not try rsync; a "wget some.host/bigfile" is enough to trigger it. An ifdown/ifup cycle revives the interface.
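The ifdown/ifup cycle can be sketched as a small shell helper. This is only an illustration: the function name cycle_iface and the RUN switch are hypothetical, and the default interface name eth0 is an assumption, not something taken from the affected machine.

```shell
# Print (or, with RUN=1, execute) an ifdown/ifup cycle for one interface.
# The default interface name eth0 is an assumption; pass the real one.
cycle_iface() {
    iface="${1:-eth0}"
    if [ "${RUN:-0}" = "1" ]; then
        # Needs root; brings the interface down, then back up.
        ifdown "$iface" && ifup "$iface"
    else
        # Dry run: only show what would be executed.
        echo "ifdown $iface && ifup $iface"
    fi
}
```

Calling cycle_iface eth0 prints the commands; RUN=1 cycle_iface eth0 (as root) runs them.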

lspci -vv output:

02:00.0 Ethernet controller: Intel Corporation 82557/8/9/0/1 Ethernet Pro 100 (rev 0c)
	Subsystem: Intel Corporation EtherExpress PRO/100 S Server Adapter
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 64 (2000ns min, 14000ns max), Cache Line Size: 64 bytes
	Interrupt: pin A routed to IRQ 16
	Region 0: Memory at fe7df000 (32-bit, non-prefetchable) [size=4K]
	Region 1: I/O ports at dcc0 [size=64]
	Region 2: Memory at fe7e0000 (32-bit, non-prefetchable) [size=128K]
	Capabilities: [dc] Power Management version 2
		Flags: PMEClk- DSI+ D1+ D2+ AuxCurrent=0mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=2 PME-
	Kernel driver in use: e100
	Kernel modules: e100

I'm about to go back to kernel-2.6.29.6-217.2.16.fc11.x86_64 for the time being, but I'll try kernel-2.6.30.6-53.fc11 if I get around to it (though its changelog mentions nothing about this).

Comment 2 Chuck Ebbert 2009-09-24 14:30:02 UTC
2.6.30.7 has an e100 patch but it's hard to tell if it fixes this. Do your kernel boot messages contain this line?:

PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
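One way to look for that line is to search the kernel ring buffer or a saved boot log. A minimal sketch, assuming a POSIX shell; the function name uses_swiotlb and its optional log-file argument are hypothetical, and on some systems dmesg needs root:

```shell
# Succeeds (exit 0) if the kernel log mentions SWIOTLB bounce buffering.
# With a file argument, search that file; otherwise search dmesg output.
uses_swiotlb() {
    pattern='software bounce buffering for IO (SWIOTLB)'
    if [ -n "${1:-}" ]; then
        grep -qF "$pattern" "$1"
    else
        dmesg | grep -qF "$pattern"
    fi
}
```

For example, uses_swiotlb && echo "SWIOTLB in use" checks the running kernel. If the ring buffer has already wrapped, pass a saved boot log instead.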

Comment 3 Jakob Hirsch 2009-09-24 16:15:06 UTC
(In reply to comment #2)
> 2.6.30.7 has an e100 patch but it's hard to tell if it fixes this. Do your
> kernel boot messages contain this line?:
> 
> PCI-DMA: Using software bounce buffering for IO (SWIOTLB)  

It does indeed.

Do you mean this entry in the changelog?

commit 4d422a0590a44d9de3749dadb673b7b0561cc0d1
Author: Krzysztof Hałasa <khc.pl>
Date:   Sun Aug 23 19:02:13 2009 -0700

    E100: fix interaction with swiotlb on X86.
...
    This patch, while not yet making the driver conform to the PCI DMA API,
    allows it to work correctly on X86 with swiotlb (while not breaking
    other architectures).

My system is running x86_64, so I wonder if that would help...
But reading e.g. this thread, it could: http://www.mail-archive.com/kernel-testers@vger.kernel.org/msg06246.html

I even thought about trying kernel-2.6.31-40.fc12 (hoping that it's installable on my FC11 system and won't break anything else)...

I also wonder why my system needs SWIOTLB at all. It seems that my Pentium E2160 has no IOMMU...

Comment 4 Chuck Ebbert 2009-09-25 03:17:55 UTC
Looks like that is the fix. And SWIOTLB is used when there is no iommu...

Comment 5 Jakob Hirsch 2009-09-25 11:44:21 UTC
(In reply to comment #4)

I see on koji that you built kernel-2.6.30.8-64.fc11. It runs fine here and fixes the problem. Thanks!
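To check whether a machine is already running that build or a later one, the version strings can be compared with GNU sort -V. This is a rough sketch: the function name kernel_at_least is hypothetical, and it only tests "at or after the fixed build". It says nothing about the 2.6.29 kernels, which are older but were not affected.

```shell
# Succeeds if version $1 sorts at or after version $2 under GNU `sort -V`.
kernel_at_least() {
    # The smaller of the two versions ends up on the first line.
    lowest=$(printf '%s\n' "$1" "$2" | sort -V | head -n 1)
    [ "$lowest" = "$2" ]
}
```

A typical call would be kernel_at_least "$(uname -r)" 2.6.30.8-64.fc11, followed by a test of the exit status.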