Well, I'll open another bug report since 221729 was closed. Maybe this is a different problem from the last deadlocks? Steve, like I said in another report, you're gonna hate me. I've been busy and hadn't tried recently, but I got some time this weekend to try this again, and I can still deadlock gfs2. I upgraded all 3 machines in the cluster to the latest kernel and updates. The kernel is 2.6.20-1.2944.fc6. I am attaching 3 backtraces, one from each machine in the cluster. I had a copy from an ext3 partition to a gfs2 partition running on spool7, a copy from an ocfs partition to the same gfs2 partition (to a different directory structure), and ran a 'df' command on virtual1b. All 3 machines were deadlocked after a few minutes. I'm not positive, but I think it deadlocked on spool8 first. Sorry... :( If you need more info, let me know.
Created attachment 152632: messages file with backtrace from spool7
Created attachment 152633: messages file with backtrace from spool8
Created attachment 152634: messages file with backtrace from virtual1b
Ummm... let me amend the first comment: I did a directory listing on virtual1b that hung, not a df command.
This looks just like bz #231910, which has a fix. However, 231910 is a RHEL bug. I'm not sure how Steve is handling bugs with respect to the differences between RHEL and Fedora. If he needs a Fedora version of that bug for tracking purposes, then this one will do fine. At any rate, there is a solution to this problem which will make it upstream shortly.
Looks like this is a Fedora build issue then. Reassigning to Chris Feist.
Reassigning to Steve Whitehouse, as he provides kernel patches for the Fedora kernel.
I'll try and sort this out now that the latest upstream patches have been accepted by Linus.
The patches have now been sent for both FC5/6 and FC7, so I'm just waiting to find out which version of the kernel RPM they'll appear in.
Still waiting on FC5/6, but it's in FC7 (pre-release) now and also in the current rawhide devel kernel. Also fixed upstream.
For FC5/6 that will be kernel 2952, which is committed and will be built shortly.