Bug 153015 - GFS 6.0 causes poor app performance and kernel panic when not using "localflock" mount option
Status: CLOSED INSUFFICIENT_DATA
Product: Red Hat Cluster Suite
Classification: Red Hat
Component: gfs (Show other bugs)
Version: 3
Hardware: i386 Linux
Priority: medium Severity: medium
Assigned To: Ben Marzinski
QA Contact: GFS Bugs
Reported: 2005-03-31 16:44 EST by Kyle Gonzales
Modified: 2010-01-11 22:04 EST (History)
Doc Type: Bug Fix
Last Closed: 2006-12-13 14:54:55 EST

Attachments
Strace of app over GFS (179.01 KB, text/plain)
2005-03-31 16:46 EST, Kyle Gonzales

Description Kyle Gonzales 2005-03-31 16:44:42 EST
Description of problem:
From a customer:

"I have narrowed down the issue with GFS.  I created my new file system
with the locking protocol of LOCK_GULM, and mounted it normally using
mount -t gfs -o acl ......  It showed the exact same problems as before
-- the app would freeze up at a certain point, and I would have to kill
it to continue.  I have created an strace of the app which ends at the
freeze.  It is attached to this email.  When I unmounted the file system
after killing my app, I got the following kernel panic:

Kernel panic: GFS: Assertion failed on line 1076 of file linux_super.c
GFS: assertion: "list_empty(&sdp->sd_plock_list)"
GFS: time = 1112216791
GFS: fsid=TheBorg:gfs_test.0

(The hardware is a Dell GX280 single proc with Hyperthreading enabled.
The running kernel is 2.4.21-27.0.2.ELsmp and GFS-6.0.2-25.)

After rebooting, I mounted the file system using mount -t gfs -o
acl,localflocks ...., and my app appears to be working perfectly -- the
performance is there, I don't see any hesitation, and most of all, I'm
able to get into the parts of the app which were hanging before.  Not
only that, I'm able to get multiple instances of the app running in the
same areas, as it was designed.  Everything looks really nice.

It appears that I am running into a locking performance problem.  I
would welcome any ideas on how to solve this, if there are any
available.  :)"

Version-Release number of selected component (if applicable):
GFS 6.0

How reproducible:
Always

Steps to Reproduce:
1. Mount GFS filesystem using "mount -t gfs -o acl <file system> <mount point>"
2. Run app
3. Stop app and unmount
  

Actual Results:  Poor application performance and lockups, and a kernel panic on unmount

Expected Results:  Good application performance, and no kernel panic
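
The reproduction steps amount to an A/B comparison of two mounts, which can be sketched as the dry-run script below. It only prints the commands and does not mount anything. The device path is taken from comment 2; the mount point /mnt/gfs_test and the app command ./run_app are hypothetical placeholders, since the report elides both. As far as I understand the GFS mount documentation, localflocks hands flock and fcntl locks to the local VFS, so they are no longer coordinated across nodes; treat that as an assumption to verify before relying on the workaround.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands for comparing the two mounts.
# DEVICE is from comment 2; MOUNT_POINT and APP_CMD are hypothetical placeholders.
DEVICE="/dev/pool/gfs_test"
MOUNT_POINT="/mnt/gfs_test"
APP_CMD="./run_app"

# Print the mount command for a given option string.
mount_cmd() {
    echo "mount -t gfs -o $1 $DEVICE $MOUNT_POINT"
}

# Print the full sequence for one pass: mount, run the app, unmount.
test_pass() {
    mount_cmd "$1"
    echo "$APP_CMD"
    echo "umount $MOUNT_POINT"
}

echo "# Pass 1: acl only (reported to hang, then panic on unmount)"
test_pass "acl"
echo "# Pass 2: acl,localflocks (reported to work)"
test_pass "acl,localflocks"
```

Running the printed commands by hand on a test node, one pass at a time, keeps the panic on unmount from taking down a scripted session.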

Additional info:

Comment 1 Kyle Gonzales 2005-03-31 16:46:25 EST
Created attachment 112546
Strace of app over GFS

Strace of the app that sees poor performance over GFS when the
localflocks mount option is not used

Comment 2 Peter Shearer 2005-03-31 17:48:03 EST
The filesystem was created using the following command:

mkfs_gfs -p lock_gulm -t TheBorg:gfs_test -j 10 /dev/pool/gfs_test

The cluster only has two computers in it, and both mount this file system.  The 
10 journals were created for expansion.  We will deploy it starting with 3 
servers in the cluster (all mounting the file system), and probably the same 
number of journals so that file system resizing is not needed in the near 
future when more servers are added to the cluster.

--Peter
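
For reference, the mkfs_gfs command in comment 2 can be broken into its parts as in the dry-run sketch below, which only prints the command. The variable names are illustrative and not part of any GFS tooling. GFS needs one journal per node that mounts the file system, which is why the journal count was set above the current node count.

```shell
#!/bin/sh
# Dry-run sketch: prints the mkfs_gfs command from comment 2 with each part named.
# Variable names are illustrative only.
LOCK_PROTO="lock_gulm"       # -p: locking protocol
CLUSTER="TheBorg"            # -t: cluster name half of the lock table name
FS_NAME="gfs_test"           # -t: file system name half of the lock table name
JOURNALS=10                  # -j: one journal per node that will mount the file system
DEVICE="/dev/pool/gfs_test"  # pool device to format

# Print the assembled command instead of running it.
mkfs_cmd() {
    echo "mkfs_gfs -p ${LOCK_PROTO} -t ${CLUSTER}:${FS_NAME} -j ${JOURNALS} ${DEVICE}"
}

mkfs_cmd
```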

Comment 3 Ben Marzinski 2005-04-22 17:57:13 EDT
Some questions

What is the App? Can I get a copy of it to play with? Is the performance bad right
from the start, or does it get worse over time? If over time, then how long does
it have to run until it starts having problems?

Comment 4 Kyle Gonzales 2005-04-22 18:02:00 EDT
> Some questions
> 
> What is the App? Can I get a copy of it to play with? Is the performance bad 
> right from the start, or does it get worse over time? If over time, then how 
> long does it have to run until it starts having problems?

Peter can provide more information about the app.  If I remember correctly, though,
the performance was bad right from the start.

Comment 5 Ben Marzinski 2005-04-22 18:47:27 EDT
More questions:

Is the App just a single process running on each machine? If so, getting
traces like the one attached earlier, but of both machines, would be really
helpful.
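
One possible way to collect comparable traces on both machines is sketched below. It is a dry run that prints the strace invocation for a node rather than executing it. The app command ./run_app and the /tmp output directory are hypothetical placeholders. The flags used are standard strace options: -f follows child processes, -tt adds microsecond timestamps, -T shows the time spent in each syscall, and -o writes the trace to a file.

```shell
#!/bin/sh
# Dry-run sketch: prints an strace invocation to run on each node.
# APP_CMD and OUT_DIR are hypothetical placeholders.
APP_CMD="./run_app"
OUT_DIR="/tmp"

# Build the strace command for a node; the node name tags the output file.
strace_cmd() {
    echo "strace -f -tt -T -o $OUT_DIR/app-$1.strace $APP_CMD"
}

# Print the command for the machine this script runs on.
strace_cmd "$(uname -n)"
```

If the clocks on both nodes are kept in sync, the -tt timestamps should make it possible to line up a stalled locking call on one node with activity on the other.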

Comment 6 Ben Marzinski 2006-09-15 15:49:02 EDT
This bug has been inactive for over a year. Does anyone object to me closing it out?
