Red Hat Bugzilla – Bug 123695
tg3 performance slows when running oracle and does not recover
Last modified: 2007-11-30 17:07:01 EST
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (compatible; Konqueror/3.2; Linux) (KHTML,
Description of problem:
I've been working with a customer on an 8way x440 connected w/
onboard tg3 to gigabit ethernet. Started with RHEL3-U1, and upgraded
to the RHEL3-U2 errata kernel (2.4.21-15.EL) to resolve SCSI RAID
performance issues (bug #104633). After rebooting, running scp (using
blowfish) gets around 20-25MB/sec, and things run well as expected.
However, shortly after starting Oracle9i, network performance begins
to suffer. System load is negligible on the box. Even after killing
off Oracle, scp performance is terrible, running from 6MB/sec to as
low as 1.8MB/sec.
So far we've seen it happen about 3 times. Bringing the interface
down and removing the module, then bringing the interface back up
does not help. I've verified that the problem isn't traffic on the
network: I plugged my laptop (w/ gigabit) directly into the system
before and after we started to see the problem, and in both cases the
direct-link throughput matched what we saw over the network (fine
before, terrible after).
Version-Release number of selected component (if applicable):
3 of 3 times reproduced.
Steps to Reproduce:
1. Boot linux-2.4.21-15.EL.smp kernel
2. Do some scp tests, note performance level
3. Start oracle
4. Wait a while (the slowdown appears within 30 minutes)
5. Stop oracle
6. re-run scp tests
Actual Results: Low performance noted (1.5-6MB/s)
Expected Results: Normal performance noted (20-25MB/s)
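The scp tests in steps 2 and 6 can be made repeatable with a small timing wrapper. The following is a minimal sketch, not the procedure actually used in this report: the function names are hypothetical, 1 MB is assumed to be 2**20 bytes, and the cipher is passed through scp's -c option only if you supply one.

```python
import subprocess
import time

MB = 2 ** 20  # assumption: 1 MB = 2**20 bytes


def throughput_mb_per_sec(num_bytes, seconds):
    """Convert a byte count and an elapsed time into MB/s."""
    if seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return num_bytes / seconds / MB


def timed_run(cmd):
    """Run cmd (a list of arguments) and return the wall-clock seconds it took."""
    start = time.monotonic()
    subprocess.run(cmd, check=True)
    return time.monotonic() - start


def scp_throughput(local_path, remote_target, size_bytes, cipher=None):
    """Time one scp copy of a file of known size and return MB/s.

    remote_target is a hypothetical "user@host:/path" string; cipher is
    handed to scp's -c option unchanged when given.
    """
    cmd = ["scp"]
    if cipher is not None:
        cmd += ["-c", cipher]
    cmd += [local_path, remote_target]
    return throughput_mb_per_sec(size_bytes, timed_run(cmd))
```

Running scp_throughput once before starting Oracle and again after stopping it gives two figures that can be compared directly against the expected and actual results above.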
This regards a customer issue, so quick feedback would be greatly
appreciated.
If quick feedback is necessary, you would be wise to go through
support (presuming you have an up to date entitlement), at
https://www.redhat.com/apps/support/ - bugzilla is less for support
than for bug fixes.
Ah, good point. I'm just a kernel dev helping out on this issue and
bugzilla is the standard interface for me. I'll advise the customer to
take the support path (although the immediacy has passed; at the moment
they're happy backing up to the RHEL3U1 kernel), however I'll continue
to follow and work this bug as normally done in my testing and
So RHEL3-U1 did not show the slowdown? Thanks to your detailed
report it is clear that the tg3 driver itself is not the
problem, it seems to be something generic. Perhaps something
to do with memory pressure.
There is no way to reproduce this other than running Oracle?
Correct, we never saw the slowdown w/ RHEL3-U1. I have not been able
to reproduce the issue outside of the customer's setup (which I no
longer have access to), but I haven't really tried too hard. If I get
a chance later, I'd like to try to write a few test cases that may
strain the system in the same way (large number of open connections,
lots of processes, etc).
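One possible starting point for such a test case is sketched below. It is only an assumption about what might approximate the load, not a known reproducer: it holds a large number of TCP connections open at once against a local listener, so it exercises kernel socket memory but not the tg3 NIC itself. All names are hypothetical.

```python
import socket


def open_loopback_connections(count):
    """Open `count` TCP connections to a local listener and keep them open.

    Returns (listener, clients, accepted) so the caller decides how long
    the connections stay alive.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
    listener.listen(128)
    port = listener.getsockname()[1]

    clients, accepted = [], []
    for _ in range(count):
        # Connect and accept in lockstep so the backlog never fills up.
        clients.append(socket.create_connection(("127.0.0.1", port)))
        conn, _addr = listener.accept()
        accepted.append(conn)
    return listener, clients, accepted


def close_all(listener, clients, accepted):
    """Close every socket created by open_loopback_connections."""
    for sock in clients + accepted + [listener]:
        sock.close()
```

Each connection uses two file descriptors in this process, so large counts may require raising the open-file limit. To involve the NIC, the client side would need to connect to a listener on another machine instead of 127.0.0.1.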
Need additional info from IBM.
IBM need details here to proceed further.
I'm still trying to reproduce this issue, however I've run into
unrelated hardware problems. I'll update this bug as soon as I know
I have not been able to reproduce this issue, so I'm going to close
this. It can be reopened if it is ever seen again.