Bug 503938 - GFS2 with samba+ctdb: withdraw problem
Summary: GFS2 with samba+ctdb: withdraw problem
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.3
Hardware: x86_64
OS: Linux
Priority: low
Severity: medium
Target Milestone: ---
Assignee: Abhijith Das
QA Contact: Cluster QE
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2009-06-03 13:28 UTC by Flávio do Carmo Júnior
Modified: 2010-01-27 11:09 UTC
CC List: 7 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2010-01-27 11:09:06 UTC
Target Upstream Version:
Embargoed:


Attachments
gfs2_edit savemeta (60 bytes, text/plain)
2009-06-03 15:33 UTC, Flávio do Carmo Júnior

Description Flávio do Carmo Júnior 2009-06-03 13:28:30 UTC
Description of problem:

I'm using GFS2 with a samba+ctdb server. I keep getting problems, and one of them generates the call trace below:

Jun  2 19:42:32 athos ntpd[8364]: synchronized to LOCAL(0), stratum 10
Jun  2 19:42:32 athos ntpd[8364]: kernel time sync enabled 0001
Jun  3 04:04:04 athos kernel: GFS2: fsid=MUSKETEER:home.2: fatal: invalid metadata block
Jun  3 04:04:04 athos kernel: GFS2: fsid=MUSKETEER:home.2:   bh = 176251 (magic number)
Jun  3 04:04:04 athos kernel: GFS2: fsid=MUSKETEER:home.2:   function = gfs2_meta_indirect_buffer, file = fs/gfs2/meta_io.c, line = 334
Jun  3 04:04:04 athos kernel: GFS2: fsid=MUSKETEER:home.2: about to withdraw this file system
Jun  3 04:04:04 athos kernel: GFS2: fsid=MUSKETEER:home.2: telling LM to withdraw
Jun  3 04:04:05 athos kernel: GFS2: fsid=MUSKETEER:home.2: withdrawn
Jun  3 04:04:05 athos kernel:
Jun  3 04:04:05 athos kernel: Call Trace:
Jun  3 04:04:05 athos kernel:  [<ffffffff88513526>] :gfs2:gfs2_lm_withdraw+0xc1/0xd0
Jun  3 04:04:05 athos kernel:  [<ffffffff80063ae7>] __wait_on_bit+0x60/0x6e
Jun  3 04:04:05 athos kernel:  [<ffffffff80015008>] sync_buffer+0x0/0x3f
Jun  3 04:04:05 athos kernel:  [<ffffffff88510b5f>] :gfs2:glock_work_func+0x0/0xb8
Jun  3 04:04:05 athos kernel:  [<ffffffff80063b61>] out_of_line_wait_on_bit+0x6c/0x78
Jun  3 04:04:05 athos kernel:  [<ffffffff8009dbd2>] wake_bit_function+0x0/0x23
Jun  3 04:04:05 athos kernel:  [<ffffffff88510b5f>] :gfs2:glock_work_func+0x0/0xb8
Jun  3 04:04:05 athos kernel:  [<ffffffff8852668f>] :gfs2:gfs2_meta_check_ii+0x2c/0x38
Jun  3 04:04:05 athos kernel:  [<ffffffff88516de4>] :gfs2:gfs2_meta_indirect_buffer+0x104/0x160
Jun  3 04:04:05 athos kernel:  [<ffffffff800888e8>] __wake_up_common+0x3e/0x68
Jun  3 04:04:05 athos kernel:  [<ffffffff88511cff>] :gfs2:gfs2_inode_refresh+0x22/0x2cf
Jun  3 04:04:05 athos kernel:  [<ffffffff80089c9c>] dequeue_task+0x18/0x37
Jun  3 04:04:05 athos kernel:  [<ffffffff885112d5>] :gfs2:inode_go_lock+0x44/0xbe
Jun  3 04:04:05 athos kernel:  [<ffffffff8850efcd>] :gfs2:do_promote+0xad/0x137
Jun  3 04:04:05 athos kernel:  [<ffffffff88510344>] :gfs2:finish_xmote+0x28c/0x2b7
Jun  3 04:04:05 athos kernel:  [<ffffffff88510b7c>] :gfs2:glock_work_func+0x1d/0xb8
Jun  3 04:04:05 athos kernel:  [<ffffffff8004d159>] run_workqueue+0x94/0xe4
Jun  3 04:04:05 athos kernel:  [<ffffffff800499da>] worker_thread+0x0/0x122
Jun  3 04:04:05 athos kernel:  [<ffffffff8009d98c>] keventd_create_kthread+0x0/0xc4
Jun  3 04:04:05 athos kernel:  [<ffffffff80049aca>] worker_thread+0xf0/0x122
Jun  3 04:04:05 athos kernel:  [<ffffffff8008a4b3>] default_wake_function+0x0/0xe
Jun  3 04:04:05 athos kernel:  [<ffffffff8009d98c>] keventd_create_kthread+0x0/0xc4
Jun  3 04:04:05 athos kernel:  [<ffffffff8009d98c>] keventd_create_kthread+0x0/0xc4
Jun  3 04:04:05 athos kernel:  [<ffffffff80032380>] kthread+0xfe/0x132
Jun  3 04:04:05 athos kernel:  [<ffffffff8005dfb1>] child_rip+0xa/0x11
Jun  3 04:04:05 athos kernel:  [<ffffffff8009d98c>] keventd_create_kthread+0x0/0xc4
Jun  3 04:04:05 athos kernel:  [<ffffffff80032282>] kthread+0x0/0x132
Jun  3 04:04:05 athos kernel:  [<ffffffff8005dfa7>] child_rip+0x0/0x11
Jun  3 04:04:05 athos kernel:
Jun  3 09:41:24 athos syslogd 1.4.1: restart.


It happens at 4 AM, when nobody is working on the filesystem; at most an rsync from gfs2 to another server on the LAN is running.
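(For context, any scheduled activity at that hour would most likely come from cron. The commands below only list what is scheduled; the paths are the standard RHEL5 locations, and the actual jobs on this cluster are unknown:)

cat /etc/crontab                      # system-wide cron table
ls /etc/cron.d /etc/cron.daily        # packaged and daily jobs
for u in $(cut -d: -f1 /etc/passwd); do crontab -l -u $u 2>/dev/null; done   # per-user crontabs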

Version-Release number of selected component (if applicable):
[root@athos ~]# rpm -qa| grep -iE 'gfs2|cman|openais|clust'
system-config-cluster-1.0.55-1.0
cluster-snmp-0.12.1-2.el5
gfs2-utils-0.1.53-1.el5_3.3
cman-2.0.98-1.el5_3.1
Cluster_Administration-en-US-5.2-1
lvm2-cluster-2.02.40-7.el5
modcluster-0.12.1-2.el5
cluster-cim-0.12.1-2.el5
openais-0.80.3-22.el5_3.4
[root@athos ~]# uname -a
Linux athos.intranet.prosul 2.6.18-128.1.10.el5 #1 SMP Wed Apr 29 13:53:08 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux
[root@athos ~]# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 5.3 (Tikanga)

Comment 1 Robert Peterson 2009-06-03 13:51:00 UTC
This was opened against RHEL4, but since it's gfs2, it's clearly
RHEL5.  This is quite possibly a duplicate of bug #495799.
Can I get a copy of your gfs2 metadata to check whether that's the case?
To save off the metadata, do this:

gfs2_edit savemeta /dev/your/device /some/file/name
bzip2 /some/file/name
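
For example (the device and output path below are placeholders only; use your actual logical volume and a location with enough free space):

gfs2_edit savemeta /dev/your_vg/home_lv /tmp/home.meta   # placeholder device and path
bzip2 /tmp/home.meta                                     # produces /tmp/home.meta.bz2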

Then post the bzip2'd file to some server where I can retrieve it.
You may want to read through that bug.  It's currently in NEEDINFO
so perhaps you can provide the info requested.

Comment 2 Flávio do Carmo Júnior 2009-06-03 15:33:09 UTC
Created attachment 346410 [details]
gfs2_edit savemeta

Comment 3 Robert Peterson 2009-06-03 17:52:36 UTC
The saved metadata from comment #2 is not from the same file system
reported in the original problem.  The metadata is for "home"
whereas the problem was reported for "home.2".  Nonetheless, this
metadata does show a problem very much like bug #495799.  Here is an
excerpt from the output of a prototype version of fsck.gfs2:

Block 3016156 (0x2e05dc) has 2 inodes referencing it for a total of 2 duplicate references
Inode (null) (2961647/0x2d30ef) has 1 reference(s) to block 3016156 (0x2e05dc)
Clearing...
Block 3016759 (0x2e0837) has 2 inodes referencing it for a total of 2 duplicate references
Inode (null) (2961647/0x2d30ef) has 1 reference(s) to block 3016759 (0x2e0837)
Clearing...
Block 3017436 (0x2e0adc) has 2 inodes referencing it for a total of 2 duplicate references
Inode (null) (2961647/0x2d30ef) has 1 reference(s) to block 3017436 (0x2e0adc)
Clearing...
Block 2961648 (0x2d30f0) has 1 inodes referencing it for a total of 2 duplicate references
Inode (null) (2961647/0x2d30ef) has 2 reference(s) to block 2961648 (0x2d30f0)
Clearing...

So clearly there are four blocks that were each assigned to two purposes.
Flávio, can you tell us what happens to this file system to get these
duplicate references?  Can you tell us how many nodes are writing to
the file system and how the data is written?  Can you figure out a way
to get duplicate references from gfs2?
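(For reference, a read-only pass with the stock fsck.gfs2 should report duplicate references like the ones above without modifying anything. The filesystem must be unmounted on every node first, and the device path below is only an example:)

umount /home                          # on all nodes; never fsck a mounted GFS2
fsck.gfs2 -n /dev/your_vg/home_lv     # -n answers "no" to all repairs, so it only reports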

Comment 4 Flávio do Carmo Júnior 2009-06-03 18:44:13 UTC
Hi Robert,

I don't have any "home.2" filesystem in my cluster. I thought the ".2" suffix was related to the journal, and since I got the same messages on all nodes I didn't think it was relevant. Sorry.

I ran gfs2_fsck -vy on the home filesystem; it reported a lot of corrections and the filesystem seems to be working now.

About your questions:

Flávio, can you tell us what happens to this file system to get these
duplicate references?
 - I don't know. This filesystem is mounted read/write on a 4-node cluster using a Samba+CTDB setup; all nodes are active and can write to the filesystem.

Can you tell us how many nodes are writing to the file system and how the data is written?
 - 4 nodes. I have no idea "how" the data is written; if you tell me how to check it, I'll be happy to answer.

Can you figure out a way to get duplicate references from gfs2?
 - No. But since this share is the "home" samba share, I think no more than one node should write to the same directory at the same time. I have one user per computer, so only one machine will connect to a given directory (/home/loginname) at a time to read and write data there.
This filesystem uses quotas. I've noticed the [gfs2_quotad] process in the "D<" state, but I read on IRC that this is harmless.
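(A way to see where a D-state process such as [gfs2_quotad] is stuck, assuming the magic SysRq key is available, is to dump all task states to the kernel log:)

echo 1 > /proc/sys/kernel/sysrq   # make sure SysRq is enabled
echo t > /proc/sysrq-trigger      # dump every task's state and stack trace to dmesg/syslog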

If you need anything more, just let me know.

Comment 5 Steve Whitehouse 2009-06-04 10:20:55 UTC
Well, I guess the question is, given a clean, newly created filesystem, can you reproduce this issue? If so, then gathering an strace of the processes which are writing to the fs would be very helpful. Of course we realise that might be a lot of data, so it might not be practical.
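(A sketch of how such a trace could be gathered, assuming smbd is the main writer to the filesystem; the PID and output path are placeholders only:)

strace -f -tt -o /tmp/smbd.strace -p <smbd_pid>   # follow forks, timestamp each syscall

The resulting file can grow very quickly, so it is probably best limited to a short window around the time the problem appears.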

In that case, just some idea of the workload would be helpful: what is the average size of the I/O, how many nodes are doing I/O at once, what is the rate of the I/O, how full is the fs, has any recovery taken place on the fs, or anything else you can think of that might help shed some light on where the issue is.

We've had reports similar to this before, but they've always been put down to some config issue or similar and we have no hard evidence which can help us track down what causes it.

Comment 6 Flávio do Carmo Júnior 2009-06-09 12:50:24 UTC
Hi,

Folks, after the fsck I've had no more problems with this filesystem, but I had to remove all users from it, as it was really unstable.
Now I have fewer than 10 users doing tests with CTDB+GFS2, and at this level the system looks stable.

I'll try to perform new tests with more users, but this is a bit complicated for the company, as it isn't even an IT company.


About the workload: I have 250 users, 230 with a 50MB quota in /home and 20 with no limit. The home directories are just for personal files; I don't believe there is heavy I/O, nothing significant I think. The FS is 50% full, using LVM over RAID-5 on an IBM DS4700 storage array with 13 FC disks + 1 hot spare. About recoveries: no, I don't think so. The users who have enough space to do it have no idea how, and the other users have only 50MB, which should be less than one second of throughput.

If I get something new, I'll post it here.

Comment 7 Robert Peterson 2009-06-30 21:54:04 UTC
Abhi has been looking at this one, so I'm reassigning it to him,
although honestly I don't know what more we can do on it, short
of recreating the problem.  There is a chance this could be the
same thing as bug #471141 (albeit with a slightly different
symptom) where some blocks were somehow inexplicably assigned
to multiple purposes, but proving that might be impossible.

Comment 8 Abhijith Das 2009-07-07 14:24:15 UTC
I've been running samba+ctdb+gfs2 tests on my test cluster for a few weeks now and I've not been able to hit this problem. A fresh (or fsck'ed) filesystem seems to work ok. I'm flagging this needinfo so we can get more information when the reporter recreates this.

Comment 9 Flávio do Carmo Júnior 2009-07-07 18:15:44 UTC
Hi Abhi, 

As I was using this system in production, and due to other problems, we needed to remove that gfs2+ctdb+samba setup.

I recall that doing an fsck on that partition solved the problem, at least at that moment, but right after that I stopped using the cluster.

I'll start using it again this month, trying the patch from another bz (502531), and will keep you updated.

-- Flavio

Comment 10 Steve Whitehouse 2010-01-27 11:09:06 UTC
This bug has now been in needinfo for 6 months and I'm closing it on that basis. If there is further information on it, then please reopen it and we'll take a look at it.

