Bug 480165

Summary: RHEL 4.8 kernel rhts fs_mark test hangs
Product: Red Hat Enterprise Linux 4
Reporter: Vivek Goyal <vgoyal>
Component: kernel
Assignee: Josef Bacik <jbacik>
Status: CLOSED NOTABUG
QA Contact: Martin Jenner <mjenner>
Severity: medium
Priority: medium
Version: 4.8
CC: esandeen, jburke, pbunyan
Target Milestone: rc   
Target Release: ---   
Hardware: All   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2012-06-07 14:42:15 UTC

Description Vivek Goyal 2009-01-15 15:07:22 UTC
Description of problem:

The fs_mark test hangs on a ppc64 machine until the localwatchdog fires. It looks like fs_mark is waiting for a journal commit, and kjournald is waiting for a bio transfer that never finishes. So it could be a device driver issue or something similar.

Interestingly, it happens most of the time on the ibm-l4b-lp1.test.redhat.com machine.

http://rhts.redhat.com/cgi-bin/rhts/test_list.cgi?test_filter=/kernel/performance/fs_mark&result=Warn&rwhiteboard=kernel%202.6.9-78.29.EL%20largesmp&arch=ppc&jobids=41890
http://rhts.redhat.com/cgi-bin/rhts/test_log.cgi?id=5746221
http://rhts.redhat.com/cgi-bin/rhts/test_log.cgi?id=5556212

There are more instances of failure. Not listing all of them.

Version-Release number of selected component (if applicable):
78.29.EL

How reproducible:
Saw it multiple times during various rhts runs.

Steps to Reproduce:
1. Run the fs_mark test on ibm-l4b-lp1.test.redhat.com.
  
Comment 3 Josef Bacik 2009-01-19 16:41:39 UTC
How does one get access to this box? I need an objdump of jbd so I can map where we hung. It seems to me, though, that this is just jbd getting hung up because it's waiting for IO to happen, so it's a device driver problem. Once I can map where we're hung in do_get_write_access, I can confirm whether that's the case.
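
For reference, mapping a hung stack frame onto an objdump listing comes down to adding the frame's offset to the function's start address. The Python sketch below illustrates that arithmetic only. The frame string, the start address, and the helper names (parse_frame, symbol_starts, frame_address) are hypothetical and not taken from this bug's logs. It assumes frames look like `symbol+0xoffset/0xsize` and that `objdump -d` prints symbol headers as `address <symbol>:`, with ppc64 text symbols possibly carrying a leading dot.

```python
import re

# Stack frames are assumed to look like "do_get_write_access+0x2c4/0x7a0".
FRAME_RE = re.compile(
    r"(?P<sym>[A-Za-z_][\w.]*)\+0x(?P<off>[0-9a-fA-F]+)"
    r"(?:/0x(?P<size>[0-9a-fA-F]+))?"
)

# objdump -d symbol headers are assumed to look like "0000000000001a40 <name>:".
HEADER_RE = re.compile(r"^(?P<addr>[0-9a-fA-F]+) <(?P<sym>[^>]+)>:$")


def parse_frame(frame):
    """Split a frame into (symbol, offset, size); size is None if absent."""
    m = FRAME_RE.search(frame)
    if m is None:
        raise ValueError("not a symbol+offset frame: %r" % frame)
    size = m.group("size")
    return (
        m.group("sym"),
        int(m.group("off"), 16),
        int(size, 16) if size else None,
    )


def symbol_starts(objdump_text):
    """Collect the start address of every symbol header in the listing."""
    starts = {}
    for line in objdump_text.splitlines():
        m = HEADER_RE.match(line.strip())
        if m:
            # Drop the leading dot that ppc64 text symbols may carry.
            starts[m.group("sym").lstrip(".")] = int(m.group("addr"), 16)
    return starts


def frame_address(frame, objdump_text):
    """Return the listing address that the frame points at."""
    sym, off, _size = parse_frame(frame)
    return symbol_starts(objdump_text)[sym] + off


# Hypothetical data, for illustration only.
listing = "0000000000001a40 <.do_get_write_access>:\n"
print(hex(frame_address(".do_get_write_access+0x2c4/0x7a0", listing)))  # 0x1d04
```

The resulting address is the line to look for in the disassembly, for example to see which call in do_get_write_access precedes it.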

Comment 4 Vivek Goyal 2009-01-19 18:04:19 UTC
Josef,

This machine is in rhts. You can reserve the machine and use it.

Comment 5 Josef Bacik 2009-01-27 14:16:57 UTC
OK, finally figured out how to get what I needed. The hang is just us waiting on the buffer lock, so we're waiting for IO to complete. It looks to be a problem at a lower layer, and I'm not sure who would be good to assign it to for that sort of thing.

Comment 6 PaulB 2010-11-01 14:01:43 UTC
All,

Retesting /kernel/performance/fs_mark on ibm-l4b-lp1.test.redhat.com with kernel 2.6.9-90.EL:
http://rhts.redhat.com/cgi-bin/rhts/jobs.cgi?id=177613

Kernel 2.6.9-90.EL was installed, and /kernel/performance/fs_mark was run on ibm-l4b-lp1.rhts.eng.rdu.redhat.com five times without issue. The results look good.

Best,
-pbunyan

Comment 7 Josef Bacik 2012-06-07 14:42:15 UTC
Closing out.