Description of problem:
I'm using GFS2 with a samba+ctdb server. I keep running into problems, and one of them generated the call trace below:

Jun 2 19:42:32 athos ntpd[8364]: synchronized to LOCAL(0), stratum 10
Jun 2 19:42:32 athos ntpd[8364]: kernel time sync enabled 0001
Jun 3 04:04:04 athos kernel: GFS2: fsid=MUSKETEER:home.2: fatal: invalid metadata block
Jun 3 04:04:04 athos kernel: GFS2: fsid=MUSKETEER:home.2: bh = 176251 (magic number)
Jun 3 04:04:04 athos kernel: GFS2: fsid=MUSKETEER:home.2: function = gfs2_meta_indirect_buffer, file = fs/gfs2/meta_io.c, line = 334
Jun 3 04:04:04 athos kernel: GFS2: fsid=MUSKETEER:home.2: about to withdraw this file system
Jun 3 04:04:04 athos kernel: GFS2: fsid=MUSKETEER:home.2: telling LM to withdraw
Jun 3 04:04:05 athos kernel: GFS2: fsid=MUSKETEER:home.2: withdrawn
Jun 3 04:04:05 athos kernel:
Jun 3 04:04:05 athos kernel: Call Trace:
Jun 3 04:04:05 athos kernel: [<ffffffff88513526>] :gfs2:gfs2_lm_withdraw+0xc1/0xd0
Jun 3 04:04:05 athos kernel: [<ffffffff80063ae7>] __wait_on_bit+0x60/0x6e
Jun 3 04:04:05 athos kernel: [<ffffffff80015008>] sync_buffer+0x0/0x3f
Jun 3 04:04:05 athos kernel: [<ffffffff88510b5f>] :gfs2:glock_work_func+0x0/0xb8
Jun 3 04:04:05 athos kernel: [<ffffffff80063b61>] out_of_line_wait_on_bit+0x6c/0x78
Jun 3 04:04:05 athos kernel: [<ffffffff8009dbd2>] wake_bit_function+0x0/0x23
Jun 3 04:04:05 athos kernel: [<ffffffff88510b5f>] :gfs2:glock_work_func+0x0/0xb8
Jun 3 04:04:05 athos kernel: [<ffffffff8852668f>] :gfs2:gfs2_meta_check_ii+0x2c/0x38
Jun 3 04:04:05 athos kernel: [<ffffffff88516de4>] :gfs2:gfs2_meta_indirect_buffer+0x104/0x160
Jun 3 04:04:05 athos kernel: [<ffffffff800888e8>] __wake_up_common+0x3e/0x68
Jun 3 04:04:05 athos kernel: [<ffffffff88511cff>] :gfs2:gfs2_inode_refresh+0x22/0x2cf
Jun 3 04:04:05 athos kernel: [<ffffffff80089c9c>] dequeue_task+0x18/0x37
Jun 3 04:04:05 athos kernel: [<ffffffff885112d5>] :gfs2:inode_go_lock+0x44/0xbe
Jun 3 04:04:05 athos kernel: [<ffffffff8850efcd>] :gfs2:do_promote+0xad/0x137
Jun 3 04:04:05 athos kernel: [<ffffffff88510344>] :gfs2:finish_xmote+0x28c/0x2b7
Jun 3 04:04:05 athos kernel: [<ffffffff88510b7c>] :gfs2:glock_work_func+0x1d/0xb8
Jun 3 04:04:05 athos kernel: [<ffffffff8004d159>] run_workqueue+0x94/0xe4
Jun 3 04:04:05 athos kernel: [<ffffffff800499da>] worker_thread+0x0/0x122
Jun 3 04:04:05 athos kernel: [<ffffffff8009d98c>] keventd_create_kthread+0x0/0xc4
Jun 3 04:04:05 athos kernel: [<ffffffff80049aca>] worker_thread+0xf0/0x122
Jun 3 04:04:05 athos kernel: [<ffffffff8008a4b3>] default_wake_function+0x0/0xe
Jun 3 04:04:05 athos kernel: [<ffffffff8009d98c>] keventd_create_kthread+0x0/0xc4
Jun 3 04:04:05 athos kernel: [<ffffffff8009d98c>] keventd_create_kthread+0x0/0xc4
Jun 3 04:04:05 athos kernel: [<ffffffff80032380>] kthread+0xfe/0x132
Jun 3 04:04:05 athos kernel: [<ffffffff8005dfb1>] child_rip+0xa/0x11
Jun 3 04:04:05 athos kernel: [<ffffffff8009d98c>] keventd_create_kthread+0x0/0xc4
Jun 3 04:04:05 athos kernel: [<ffffffff80032282>] kthread+0x0/0x132
Jun 3 04:04:05 athos kernel: [<ffffffff8005dfa7>] child_rip+0x0/0x11
Jun 3 04:04:05 athos kernel:
Jun 3 09:41:24 athos syslogd 1.4.1: restart.

It happened at 4 AM, when nobody was working on the filesystem; at most an rsync from the GFS2 volume to another server on the LAN was running.

Version-Release number of selected component (if applicable):
[root@athos ~]# rpm -qa | grep -iE 'gfs2|cman|openais|clust'
system-config-cluster-1.0.55-1.0
cluster-snmp-0.12.1-2.el5
gfs2-utils-0.1.53-1.el5_3.3
cman-2.0.98-1.el5_3.1
Cluster_Administration-en-US-5.2-1
lvm2-cluster-2.02.40-7.el5
modcluster-0.12.1-2.el5
cluster-cim-0.12.1-2.el5
openais-0.80.3-22.el5_3.4
[root@athos ~]# uname -a
Linux athos.intranet.prosul 2.6.18-128.1.10.el5 #1 SMP Wed Apr 29 13:53:08 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux
[root@athos ~]# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 5.3 (Tikanga)
This was opened against RHEL4, but since it's gfs2, it's clearly RHEL5. This is quite possibly a duplicate of bug #495799. Can I get a copy of your gfs2 metadata to check whether that's the case? To save off the metadata, do this:

gfs2_edit savemeta /dev/your/device /some/file/name
bzip2 /some/file/name

Then post the bzip2'd file to some server where I can retrieve it. You may want to read through that bug. It's currently in NEEDINFO, so perhaps you can provide the info requested.
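For convenience, the two steps above can be wrapped in a small helper. This is only a sketch: the function name is mine, and the device and output paths are the placeholders from the commands above, to be replaced with real paths. It prints the commands rather than executing them, so they can be reviewed before being run as root on a cluster node (or piped to sh).

```shell
#!/bin/sh
# Sketch of the metadata-collection steps requested above.
# build_savemeta_cmds prints the exact commands to run on one node.
build_savemeta_cmds() {
    device="$1"
    outfile="$2"
    # gfs2_edit savemeta copies only filesystem metadata, not file
    # contents, so the dump stays much smaller than a full image.
    echo "gfs2_edit savemeta $device $outfile"
    # Metadata dumps compress well; bzip2 before uploading.
    echo "bzip2 $outfile"
}

# Placeholders from the comment above -- substitute your real paths.
build_savemeta_cmds /dev/your/device /some/file/name
```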
Created attachment 346410 [details] gfs2_edit savemeta
The saved metadata from comment #2 is not from the same file system reported in the original problem. The metadata is for "home", whereas the problem was reported for "home.2". Nonetheless, this metadata does show a problem very much like bug #495799. Here is an excerpt from the output of a prototype version of fsck.gfs2:

Block 3016156 (0x2e05dc) has 2 inodes referencing it for a total of 2 duplicate references
Inode (null) (2961647/0x2d30ef) has 1 reference(s) to block 3016156 (0x2e05dc)
Clearing...
Block 3016759 (0x2e0837) has 2 inodes referencing it for a total of 2 duplicate references
Inode (null) (2961647/0x2d30ef) has 1 reference(s) to block 3016759 (0x2e0837)
Clearing...
Block 3017436 (0x2e0adc) has 2 inodes referencing it for a total of 2 duplicate references
Inode (null) (2961647/0x2d30ef) has 1 reference(s) to block 3017436 (0x2e0adc)
Clearing...
Block 2961648 (0x2d30f0) has 1 inodes referencing it for a total of 2 duplicate references
Inode (null) (2961647/0x2d30ef) has 2 reference(s) to block 2961648 (0x2d30f0)
Clearing...

So clearly it has four blocks that were each assigned to two different purposes.

Flávio, can you tell us what happens to this file system to get these duplicate references? Can you tell us how many nodes are writing to the file system and how the data is written? Can you figure out a way to get duplicate references from gfs2?
Hi Robert,

I don't have any "home.2" filesystem in my cluster; I thought the ".2" suffix was related to the journal, and since I got the same messages on all nodes, I didn't think it was relevant. Sorry.

I ran gfs2_fsck -vy on the home filesystem; it reported a lot of corrections, and the filesystem seems to be working now.

About your questions:

"Flávio, can you tell us what happens to this file system to get these duplicate references?"
- I don't know. This filesystem is mounted read/write on a 4-node cluster using a Samba+CTDB setup; all nodes are active and can write to the filesystem.

"Can you tell us how many nodes are writing to the file system and how the data is written?"
- 4 nodes. I have no idea "how" the data is written; if you tell me how to check, I'll be happy to answer.

"Can you figure out a way to get duplicate references from gfs2?"
- Nope. But since this share is the "home" Samba share, I think no more than one node should be writing to the same directory at the same time. There is one user per computer, so only one machine will connect to a given directory (/home/loginname) at a time and read/write data there.

This filesystem uses quotas. I've noticed the [gfs2_quotad] process in the "D<" state, but I read on IRC that this is harmless.

If you need anything more, just let me know.
Well, I guess the question is: given a clean, newly created filesystem, can you reproduce this issue? If so, gathering an strace of the processes that are writing to the fs would be very helpful. Of course we realise that might be a lot of data, so it might not be practical. In that case, just some idea of the workload would help: the average size of the I/O, how many nodes are doing I/O at once, the rate of the I/O, how full the fs is, whether any recovery has taken place on the fs, or anything else you can think of that might shed some light on where the issue is. We've had reports similar to this before, but they've always been put down to some config issue or similar, and we have no hard evidence to help us track down what causes it.
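The strace gathering requested above could look roughly like this. This is a minimal sketch and every detail is an assumption, not part of the original request: it assumes the writers are smbd processes (adjust PROC_NAME for a different workload), picks an arbitrary syscall filter, and prints the attach commands for review instead of running them.

```shell
#!/bin/sh
# Sketch: build strace attach commands for each writer process.
# PROC_NAME and OUTDIR are assumptions -- adjust for your setup.
PROC_NAME="${PROC_NAME:-smbd}"
OUTDIR="${OUTDIR:-/var/tmp/gfs2-traces}"

build_strace_cmd() {
    pid="$1"
    # -f follows forks, -tt adds timestamps, and the -e filter keeps
    # only the file-I/O syscalls relevant to writes on the fs.
    echo "strace -f -tt -e trace=open,read,write,close,fsync,rename,unlink -o $OUTDIR/$PROC_NAME.$pid.trace -p $pid"
}

mkdir -p "$OUTDIR"
for pid in $(pgrep "$PROC_NAME" 2>/dev/null); do
    build_strace_cmd "$pid"   # print for review; pipe to sh to attach
done
```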
Hi folks,

After the fsck I had no more problems with this filesystem, but I needed to move all users off it, as it was really unstable. Now I have fewer than 10 users testing CTDB+GFS2, and at this level the system looks stable. I'll try to run new tests with more users, but that is a bit complicated for the company, as it isn't even an IT company.

About the workload: I have 250 users, 230 with a 50 MB quota on /home and 20 with no limit. The home directory is just for personal files; I don't believe it sees heavy I/O, nothing relevant I think. The FS is 50% full, using LVM over RAID-5 on an IBM DS4700 storage array with 13 FC disks + 1 hot spare.

About recoveries: no, I don't think so. The users who have enough space to cause that have no idea how to do it, and the other users have only 50 MB, which should be less than one second's throughput.

If I find anything new, I'll post it here.
Abhi has been looking at this one, so I'm reassigning it to him, although honestly I don't know what more we can do on it, short of recreating the problem. There is a chance this could be the same thing as bug #471141 (albeit with a slightly different symptom) where some blocks were somehow inexplicably assigned to multiple purposes, but proving that might be impossible.
I've been running samba+ctdb+gfs2 tests on my test cluster for a few weeks now and I've not been able to hit this problem. A fresh (or fsck'ed) filesystem seems to work ok. I'm flagging this needinfo so we can get more information when the reporter recreates this.
Hi Abhi,

Since I was using this system in production, and because of other problems, we had to remove the gfs2+ctdb+samba setup. I recall that running fsck on that partition solved the problem, at least for the moment, but right after that I stopped using the cluster. I'll start using it again this month, trying the patch from another bz (502531), and will keep you updated.

-- Flavio
This bug has now been in needinfo for 6 months and I'm closing it on that basis. If there is further information on it, then please reopen it and we'll take a look at it.