Description of problem:
On a GNBD+GFS cluster, I am running a statfs() system call while crashing another node. The statfs() system call blocks (this is normal), but it also uses 100% of the CPU, as if it were busy-waiting. When the fencing request is acknowledged, the system call returns and everything is back to normal. I have looked at "stat_gfs_async" and "stat_gfs_sync" in gfs-kernel/src/gfs/super.c to search for an active wait, but I'm just a GFS user, so I don't understand everything it does.

Version-Release number of selected component (if applicable):
I'm currently using a packaged version of GFS-6.1.5, not the latest stable branch of the CVS.

How reproducible:
100% reproducible

Steps to Reproduce:
1. Run statfs() (a "df" should do the same).
2. Crash a node at the same time as (1).
3. Run "top" and watch how much CPU "df" consumes.

Actual results:
All CPU is consumed by "df".

Expected results:
No workload for "df".

Additional info:
When the fence request is answered, everything goes fine after the end of GFS recovery.
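For reference, here is a minimal userspace sketch of the call that "df" makes; the mount point "/mnt/gfs" is an assumption, adjust it to your setup:

/* Minimal statfs() reproducer, similar to what 'df' does.
 * The mount point "/mnt/gfs" is an assumption. */
#include <stdio.h>
#include <sys/vfs.h>

int main(void)
{
        struct statfs sfs;

        /* Per this report, the call blocks here while a fence
         * operation is pending -- but it should sleep, not spin
         * at ~100% CPU. */
        if (statfs("/mnt/gfs", &sfs) != 0) {
                perror("statfs");
                return 1;
        }
        printf("blocks: total=%ld free=%ld\n",
               (long)sfs.f_blocks, (long)sfs.f_bfree);
        return 0;
}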
I have seen something similar. There is a utility running which runs 'dt -kt gfs' every 5 seconds. Sometimes these take a VERY long time (many minutes) to complete and sometimes only a few seconds (16TB filesystem). We have also seen this occur when deleting several million files using 'rm -f *'; 'df' hangs indefinitely. strace shows that it is in 'sys_statfs()'. Unfortunately, we cannot get the stack because the process is consuming all the resources on its processor; we can only look at it from the other processors on the system. Sometimes the node running the 'df' command gets fenced during this period. Unfortunately, the system logs do not provide any information.
(In reply to comment #1)
> There is a utility running which runs 'dt -kt gfs' every 5 seconds.

What is 'dt'?
(In reply to comment #0)
> On a GNBD+GFS cluster, I am running a statfs() system call while crashing
> another node.

How are you crashing the other node?

> The statfs() system call blocks (this is normal), but it also uses 100% of
> the CPU, as if it were busy-waiting. When the fencing request is
> acknowledged, the system call returns and everything is back to normal.

What fencing method are you using? Also, are you using dlm or gulm?
Typo... meant to be 'df -kt gfs'.

Using DLM.
> How are you crashing the other node?

I'm crashing it using "echo b > /proc/sysrq-trigger". The node does not come back into the GFS cluster.

> What fencing method are you using?

For the purpose of the experiment, I configured the fencing method to be manual. Please note that the node was rebooted and did not rejoin the cluster immediately (I did that manually).

> Also, are you using dlm or gulm?

DLM. I also forgot to say it is a 3-node cluster.
Hmm. I am using a 5-node cluster with DLM and manual fencing, and I didn't see the problem. I need to continue to investigate. Note that 'df' is known to be slow on GFS(1) filesystems, and this is even mentioned in the FAQ. I did see that the CPU utilization was high (~99%) for the 'df' process.
> I didn't see the problem. I need to continue to investigate.

Did you mean that you didn't experience the 99% CPU usage? Or do you mean that this is not a bug? In my mind, having "df" take 99% CPU is not normal; I thought it should:
- either block, wait for the fence request to be acknowledged, and not take 99% CPU;
- or not block, compute its results, take 99% CPU for a given period of time (I don't mind if this is long), and then return.

My problem was that until I manually acknowledged the fence request, my "df" process was blocked and took 99% CPU. I intentionally waited a long time (a few minutes) before replying to the fencing request. "df" was blocked for that whole period, consuming 99% CPU.
(In reply to comment #7)
> Did you mean that you didn't experience the 99% CPU usage? Or do you mean
> that this is not a bug?

I did see 'df' using 99% CPU. But 'df' appears to use ~99% CPU even under normal circumstances (without introducing a fenced node). I did not 'reply' to the manual fencing request, and 'df' returned normally. I will continue to attempt to replicate this problem.
I've spent some time trying to recreate this problem and have not been able to do so. Let me explain what I attempted: in a 5-node cluster configured for manual fencing, I ran 'df' and then immediately killed a node. The failed node was fenced and I did *not* ack the fencing operation. Meanwhile, output from 'top' showed the 'df' process consuming about 99% CPU. I had several printk statements in the kernel code to watch the progress of the statfs. The 'df' command continued normally and returned successfully without my ack-ing the fencing operation. In comparison, it appeared to behave just as 'df' does on a normal GFS filesystem. Can you tell me what gfs-kernel rpm you have installed? I'd like to make sure we have the same kernel bits.
kernel 2.6.9-34
ccs-1.0.3-0.x86_64.rpm
cman-1.0.4-0.x86_64.rpm
cman-kernheaders-2.6.9-41.0.x86_64.rpm
dlm-1.0.0-5.x86_64.rpm
dlm-kernheaders-2.6.9-39.1.x86_64.rpm
dmraid-1.0.0.rc8-1_RHEL4_U2.x86_64.rpm
fence-1.32.18-0.x86_64.rpm
GFS-6.1.5-0.x86_64.rpm
GFS-kernheaders-2.6.9-45.0.x86_64.rpm
gnbd-1.0.2-0.x86_64.rpm
gnbd-kernheaders-2.6.9-9.14.x86_64.rpm
gulm-1.0.6-0.x86_64.rpm
iddev-2.0.0-3.x86_64.rpm
ipvsadm-1.24-6.x86_64.rpm
lvm2-cluster-2.02.01-1.2.RHEL4.x86_64.rpm
magma-1.0.4-U4pre1.x86_64.rpm
magma-plugins-1.0.6-U4pre1.x86_64.rpm
perl-Net-Telnet-3.03-3.noarch.rpm
rgmanager-1.9.46-U4pre1.x86_64.rpm
system-config-cluster-1.0.25-1.0.noarch.rpm
I don't see a gfs-kernel package, but I am assuming it matches the kernheaders version (2.6.9-45). I have this code running on my cluster now, and the performance of 'df' is quite good. Is there a heavy I/O load on the filesystem while the 'df' is running? In addition, I should point out that running 'df' every 5 seconds on a 16TB filesystem (or any size, for that matter) is a bad idea. This will have a severe impact on cluster/filesystem performance.
I've reproduced it on the "STABLE" branch. Here is the experiment: in a 3-node cluster, only 2 nodes have the GFS filesystem mounted. Both nodes are running I/O (these are performance tests run with bonnie++; I think any kind of intensive I/O should do). At the same time, I issue:
- "echo b > /proc/sysrq-trigger" on the 1st node
- "df" on the 2nd node

On one of the two remaining nodes, the fence request appears. At this time, I kill the bonnie++ on the second node. "df" continues to run indefinitely... until I reply to the fence request. I've performed the same test with no I/O: "df" exits immediately.
When a node is fenced, all activity on the filesystem is blocked until fencing succeeds. This should only be apparent when using fence_manual, since it requires a user to manually ack the fence operation. Running "df" while a fence operation is waiting to be ack'd will cause it to block. However, if "df" can get its information from cache, it will complete successfully even with a pending fence operation.

I believe that the real issue being reported here is not that "df" will block while waiting for a fence operation to be ack'd (that is expected), but rather that the "df" process spins, consuming ~99% of the CPU. I've tried to recreate this recently and have been unable to do so. Closing this for the time being.
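For anyone following along, here is an illustrative userspace sketch of the difference between the two waiting patterns being discussed. This is NOT the actual gfs-kernel code; the 'fence_pending' flag and the helpers around it are hypothetical:

/* Illustrative only -- not the actual gfs-kernel code. */
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t lock  = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  acked = PTHREAD_COND_INITIALIZER;
static bool fence_pending = true;   /* hypothetical state flag */

/* Busy wait: matches the reported symptom.  The thread never sleeps,
 * so it pins a CPU at ~99% until the fence operation is ack'd. */
static void wait_busy(void)
{
        bool pending = true;
        while (pending) {
                pthread_mutex_lock(&lock);
                pending = fence_pending;
                pthread_mutex_unlock(&lock);
                /* no sleep or yield here: this loop spins */
        }
}

/* Blocking wait: the expected behavior.  The thread sleeps on a
 * condition variable and consumes no CPU until it is signalled. */
static void wait_blocking(void)
{
        pthread_mutex_lock(&lock);
        while (fence_pending)
                pthread_cond_wait(&acked, &lock);
        pthread_mutex_unlock(&lock);
}

/* Ack-ing the fence would then wake the waiter: */
static void ack_fence(void)
{
        pthread_mutex_lock(&lock);
        fence_pending = false;
        pthread_cond_broadcast(&acked);
        pthread_mutex_unlock(&lock);
}

If statfs were taking the wait_busy() path somewhere while a fence operation is pending, that would explain a 'df' that is both blocked and at ~99% CPU at the same time.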