Bug 123695

Summary: tg3 performance slows when running oracle and does not recover
Product: Red Hat Enterprise Linux 3
Reporter: john stultz <johnstul>
Component: kernel
Assignee: David Miller <davem>
Status: CLOSED WORKSFORME
Severity: medium
Priority: medium
Version: 3.0
CC: gone, lcm, petrides, riel, shillman, uthomas
Target Milestone: ---   
Target Release: ---   
Hardware: i686   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2004-09-14 20:23:44 UTC

Description john stultz 2004-05-20 03:41:32 UTC
From Bugzilla Helper: 
User-Agent: Mozilla/5.0 (compatible; Konqueror/3.2; Linux) (KHTML, 
like Gecko) 
 
Description of problem: 
I've been working with a customer on an 8-way x440 connected via its 
onboard tg3 to gigabit Ethernet. We started with RHEL3-U1 and upgraded 
to the RHEL3-U2 errata kernel (2.4.21-15.EL) to resolve SCSI RAID 
performance issues (bug #104633). After rebooting, scp (using 
blowfish) gets around 20-25MB/sec, and things run as expected. 
However, shortly after starting Oracle9i, network performance begins 
to suffer. System load is negligible on the box. Even after killing 
off Oracle, scp performance is terrible, ranging from 6MB/sec down to 
as low as 1.8MB/sec. 
 
So far we've seen it happen about 3 times. Bringing the interface 
down, removing the module, and bringing the interface back up does 
not help. I've verified that the problem isn't traffic on the 
network: I plugged my laptop (w/ gigabit) directly into the system 
before and after we started to see the problem, and the throughput 
matched what we saw over the network in both cases (fine before, 
terrible after). 
 
Version-Release number of selected component (if applicable): 
kernel-smp-2.4.21-15.EL 
 
How reproducible: 
3 of 3 times reproduced.  
 
Steps to Reproduce: 
1. Boot linux-2.4.21-15.EL.smp kernel 
2. Do some scp tests, note performance level 
3. Start Oracle 
4. Wait a while (the slowdown appears within 30 minutes) 
5. Stop Oracle 
6. Re-run scp tests 
     
 
Actual Results:  Low performance noted (6-1.5MB/s) 
 
Expected Results:  Normal performance noted (25-20MB/s) 
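
For anyone trying to reproduce this, a raw TCP throughput check can help separate network performance from scp's cipher and disk overhead. The following is only a minimal sketch, not the test used in this report: the names `run_sink` and `measure_throughput`, the chunk size, and the loopback demo are all assumptions made for illustration. In a real test the sink would run on the affected machine and the sender on a peer attached to the tg3 interface.

```python
import socket
import threading
import time

CHUNK = 64 * 1024  # bytes per send; an arbitrary choice for this sketch


def run_sink(server_sock):
    """Accept one connection and discard everything it sends."""
    conn, _ = server_sock.accept()
    with conn:
        while conn.recv(CHUNK):
            pass  # closing conn afterwards tells the sender we are done


def measure_throughput(host, port, total_bytes):
    """Send total_bytes to host:port and return the rate in MB/s (10**6 bytes)."""
    payload = b"\0" * CHUNK
    sent = 0
    with socket.create_connection((host, port)) as sock:
        start = time.perf_counter()
        while sent < total_bytes:
            n = min(CHUNK, total_bytes - sent)
            sock.sendall(payload[:n])
            sent += n
        # Signal end of data, then wait for the sink to close its side so
        # the timing covers delivery and not just local buffering.
        sock.shutdown(socket.SHUT_WR)
        sock.recv(1)
        elapsed = max(time.perf_counter() - start, 1e-9)
    return sent / elapsed / 1e6


if __name__ == "__main__":
    # Loopback demo only; it does not exercise the NIC or its driver.
    server = socket.socket()
    server.bind(("127.0.0.1", 0))
    server.listen(1)
    port = server.getsockname()[1]
    sink = threading.Thread(target=run_sink, args=(server,), daemon=True)
    sink.start()
    rate = measure_throughput("127.0.0.1", port, 16 * 1024 * 1024)
    sink.join(5)
    server.close()
    print(f"{rate:.1f} MB/s")
```

Running the same measurement before step 3 and again after step 5 gives two figures that can be compared directly against the 20-25MB/s and 6-1.5MB/s ranges above.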
 
Additional info: 
	This concerns a customer issue, so quick feedback would be greatly 
appreciated. :)

Comment 1 Suzanne Hillman 2004-05-21 19:42:48 UTC
If quick feedback is necessary, you would be wise to go through
support (presuming you have an up to date entitlement), at
https://www.redhat.com/apps/support/ - bugzilla is less for support
than for bug fixes.

Comment 2 john stultz 2004-05-21 20:23:25 UTC
Ah, good point. I'm just a kernel dev helping out on this issue, and
Bugzilla is the standard interface for me. I'll advise the customer to
take the support path (although the immediacy has passed; at the moment
they're happy backing up to the RHEL3U1 kernel). However, I'll continue
to follow and work this bug as I normally do in my testing and
development context. 

thanks

Comment 3 David Miller 2004-06-03 00:00:19 UTC
So RHEL3-U1 did not show the slowdown?  Thanks to your detailed
report it is clear that the tg3 driver itself is not the
problem; it seems to be something generic, perhaps something
to do with memory pressure.

There is no way to reproduce this other than running Oracle?


Comment 4 john stultz 2004-06-03 00:07:04 UTC
Correct, we never saw the slowdown w/ RHEL3-U1. I have not been able 
to reproduce the issue outside of the customer's setup (which I no 
longer have access to), but I haven't really tried too hard. If I get 
a chance later, I'd like to try to write a few test cases that may 
strain the system in the same way (large number of open connections, 
lots of processes, etc). 
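
One possible starting point for such a test case is a script that opens and holds a large number of TCP connections. This is a hypothetical sketch and not something that was run for this bug: `accept_loop`, `open_connections`, and `close_all` are names invented here, the connection count is arbitrary, and it does not attempt to model Oracle's actual behavior or memory footprint. The loopback demo only shows the mechanics; a real test would target the machine with the tg3 interface.

```python
import socket
import threading


def accept_loop(server_sock, held, stop):
    """Accept connections and keep them open until stop is set."""
    server_sock.settimeout(0.2)
    while not stop.is_set():
        try:
            conn, _ = server_sock.accept()
        except socket.timeout:
            continue  # no pending connection; re-check the stop flag
        except OSError:
            break  # e.g. out of file descriptors, or socket closed
        held.append(conn)


def open_connections(host, port, count):
    """Open up to count client connections; return the ones that succeeded."""
    clients = []
    try:
        for _ in range(count):
            clients.append(socket.create_connection((host, port), timeout=5))
    except OSError:
        pass  # typically the per-process file-descriptor limit
    return clients


def close_all(socks):
    """Close every socket in the list."""
    for s in socks:
        s.close()


if __name__ == "__main__":
    server = socket.socket()
    server.bind(("127.0.0.1", 0))
    server.listen(128)
    held, stop = [], threading.Event()
    acceptor = threading.Thread(
        target=accept_loop, args=(server, held, stop), daemon=True
    )
    acceptor.start()
    clients = open_connections("127.0.0.1", server.getsockname()[1], 100)
    print(f"holding {len(clients)} connections")
    close_all(clients)
    stop.set()
    acceptor.join(5)
    close_all(held)
    server.close()
```

Holding the connections open while re-running the scp tests would show whether connection count alone is enough to trigger the slowdown.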

Comment 5 Bob Johnson 2004-08-11 14:26:56 UTC
Need additional info from IBM.

Comment 6 Bob Johnson 2004-08-22 18:03:41 UTC
IBM needs to provide details here before we can proceed further.

Comment 7 john stultz 2004-08-25 17:09:26 UTC
I'm still trying to reproduce this issue, however I've run into 
unrelated hardware problems. I'll update this bug as soon as I know 
more.  

Comment 8 john stultz 2004-09-14 20:23:44 UTC
I have not been able to reproduce this issue, so I'm going to close 
this. It can be reopened if it is ever seen again.