Bug 682767 - GFS2 mount hangs after some time
Summary: GFS2 mount hangs after some time
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: kernel
Version: 6.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Ben Marzinski
QA Contact: Red Hat Kernel QE team
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2011-03-07 14:36 UTC by Jakub Hrozek
Modified: 2011-07-15 15:24 UTC (History)
CC List: 7 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-07-15 15:24:05 UTC
Target Upstream Version:


Attachments (Terms of Use)
/etc/multipath.conf (2.54 KB, text/plain)
2011-07-11 08:06 UTC, Jakub Hrozek

Description Jakub Hrozek 2011-03-07 14:36:58 UTC
Description of problem:
We use the GFS2 filesystem on an IBM DS3000/1726. From time to time we see a kernel call trace in the logs, and the mountpoint stalls afterwards (ls hangs, the mountpoint cannot be unmounted, etc.).

We use the RHEL6 packaged ql2400-firmware-5.03.02-1 firmware for the SAN.

Version-Release number of selected component (if applicable):
kernel-2.6.32-71.el6.x86_64
ql2400-firmware-5.03.02-1.el6.noarch

How reproducible:
Not reliably; sometimes it works fine for a week, sometimes it crashes sooner.

Steps to Reproduce:
1. multipath the disks provided by the SAN
2. format the multipathed device with GFS2
3. mount the GFS2 FS
4. write to the FS
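
The steps above can be sketched as shell commands. This is an illustrative sketch only: the cluster name ("mycluster") and file-system name ("storage") are assumptions that must match your cluster.conf, and -j 8 matches the journal count mentioned in comment 3.

```shell
# Illustrative sketch -- cluster name and FS name are assumptions;
# adjust to match your cluster.conf before running.
DEV=/dev/mapper/banana_tree

# 2. format the multipathed device with GFS2 (8 journals, DLM locking)
mkfs.gfs2 -p lock_dlm -t mycluster:storage -j 8 "$DEV"

# 3. mount the GFS2 FS
mount -t gfs2 "$DEV" /mnt/storage

# 4. write to the FS to generate load
dd if=/dev/zero of=/mnt/storage/load.img bs=1M count=1024
```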
  
Actual results:
Mar  3 05:30:11 sam kernel: INFO: task scsi_wq_1:317 blocked for more than 120 seconds.
Mar  3 05:30:11 sam kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Mar  3 05:30:11 sam kernel: scsi_wq_1     D ffff880551777ac0     0   317      2 0x00000000
Mar  3 05:30:11 sam kernel: ffff8805517779e0 0000000000000046 0000000000000000 ffff880551590000
Mar  3 05:30:11 sam kernel: ffff88055151f240 0000000000000246 ffff880551777a00 ffff880557191880
Mar  3 05:30:11 sam kernel: ffff880552dc1ad8 ffff880551777fd8 0000000000010518 ffff880552dc1ad8
Mar  3 05:30:11 sam kernel: Call Trace:
Mar  3 05:30:11 sam kernel: [<ffffffff814c8fc5>] schedule_timeout+0x225/0x2f0
Mar  3 05:30:11 sam kernel: [<ffffffff81255ecf>] ? cfq_set_request+0x18f/0x520
Mar  3 05:30:11 sam kernel: [<ffffffff8110e635>] ? mempool_alloc_slab+0x15/0x20
Mar  3 05:30:11 sam kernel: [<ffffffff814c8c33>] wait_for_common+0x123/0x180
Mar  3 05:30:11 sam kernel: [<ffffffff8105c490>] ? default_wake_function+0x0/0x20
Mar  3 05:30:11 sam kernel: [<ffffffff814c8d4d>] wait_for_completion+0x1d/0x20
Mar  3 05:30:12 sam kernel: [<ffffffff812464ec>] blk_execute_rq+0x8c/0xf0
Mar  3 05:30:12 sam kernel: [<ffffffff81241240>] ? blk_rq_bio_prep+0x30/0xc0
Mar  3 05:30:12 sam kernel: [<ffffffff81246066>] ? blk_rq_map_kern+0xd6/0x150
Mar  3 05:30:12 sam kernel: [<ffffffff8134ab5c>] scsi_execute+0xfc/0x160
Mar  3 05:30:12 sam kernel: [<ffffffff8134adb6>] scsi_execute_req+0xb6/0x190
Mar  3 05:30:12 sam kernel: [<ffffffff8134d380>] __scsi_scan_target+0x2c0/0x750
Mar  3 05:30:12 sam kernel: [<ffffffff81056630>] ? __dequeue_entity+0x30/0x50
Mar  3 05:30:12 sam kernel: [<ffffffff81059d12>] ? finish_task_switch+0x42/0xd0
Mar  3 05:30:12 sam kernel: [<ffffffff8134df40>] scsi_scan_target+0xd0/0xe0
Mar  3 05:30:12 sam kernel: [<ffffffffa016b8fd>] fc_scsi_scan_rport+0xbd/0xc0 [scsi_transport_fc]
Mar  3 05:30:12 sam kernel: [<ffffffffa016b840>] ? fc_scsi_scan_rport+0x0/0xc0 [scsi_transport_fc]
Mar  3 05:30:12 sam kernel: [<ffffffff8108c610>] worker_thread+0x170/0x2a0
Mar  3 05:30:12 sam kernel: [<ffffffff81091ca0>] ? autoremove_wake_function+0x0/0x40
Mar  3 05:30:12 sam kernel: [<ffffffff8108c4a0>] ? worker_thread+0x0/0x2a0
Mar  3 05:30:12 sam kernel: [<ffffffff81091936>] kthread+0x96/0xa0
Mar  3 05:30:12 sam kernel: [<ffffffff810141ca>] child_rip+0xa/0x20
Mar  3 05:30:12 sam kernel: [<ffffffff810918a0>] ? kthread+0x0/0xa0
Mar  3 05:30:12 sam kernel: [<ffffffff810141c0>] ? child_rip+0x0/0x20
Mar  3 05:30:12 sam kernel: INFO: task gfs2_logd:3735 blocked for more than 120 seconds.
Mar  3 05:30:12 sam kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Mar  3 05:30:12 sam kernel: gfs2_logd     D 0000000000000002     0  3735      2 0x00000000
Mar  3 05:30:12 sam kernel: ffff88032ba2dc80 0000000000000046 ffff88032ba2dc40 ffffffffa000471c
Mar  3 05:30:12 sam kernel: ffff8805511a2380 ffff8802e7568200 ffff88032ba2dd20 0000000000000002
Mar  3 05:30:12 sam kernel: ffff8803a78ca678 ffff88032ba2dfd8 0000000000010518 ffff8803a78ca678
Mar  3 05:30:12 sam kernel: Call Trace:
Mar  3 05:30:12 sam kernel: [<ffffffffa000471c>] ? dm_table_unplug_all+0x5c/0xd0 [dm_mod]
Mar  3 05:30:12 sam kernel: [<ffffffff8109b9a9>] ? ktime_get_ts+0xa9/0xe0
Mar  3 05:30:12 sam kernel: [<ffffffff8119dcf0>] ? sync_buffer+0x0/0x50
Mar  3 05:30:12 sam kernel: [<ffffffff814c8a23>] io_schedule+0x73/0xc0
Mar  3 05:30:12 sam kernel: [<ffffffff8119dd30>] sync_buffer+0x40/0x50
Mar  3 05:30:12 sam kernel: [<ffffffff814c929f>] __wait_on_bit+0x5f/0x90
Mar  3 05:30:12 sam kernel: [<ffffffff8119dcf0>] ? sync_buffer+0x0/0x50
Mar  3 05:30:12 sam kernel: [<ffffffff814c9348>] out_of_line_wait_on_bit+0x78/0x90
Mar  3 05:30:12 sam kernel: [<ffffffff81091ce0>] ? wake_bit_function+0x0/0x50
Mar  3 05:30:12 sam kernel: [<ffffffff8119dce6>] __wait_on_buffer+0x26/0x30
Mar  3 05:30:12 sam kernel: [<ffffffffa0625c1f>] log_write_header+0x1cf/0x520 [gfs2]
Mar  3 05:30:13 sam kernel: [<ffffffffa062653a>] gfs2_log_flush+0x2ea/0x6b0 [gfs2]
Mar  3 05:30:13 sam kernel: [<ffffffff81091ca0>] ? autoremove_wake_function+0x0/0x40
Mar  3 05:30:13 sam kernel: [<ffffffffa06269e1>] gfs2_logd+0xe1/0x150 [gfs2]
Mar  3 05:30:13 sam kernel: [<ffffffffa0626900>] ? gfs2_logd+0x0/0x150 [gfs2]
Mar  3 05:30:13 sam kernel: [<ffffffff81091936>] kthread+0x96/0xa0
Mar  3 05:30:13 sam kernel: [<ffffffff810141ca>] child_rip+0xa/0x20
Mar  3 05:30:13 sam kernel: [<ffffffff810918a0>] ? kthread+0x0/0xa0
Mar  3 05:30:13 sam kernel: [<ffffffff810141c0>] ? child_rip+0x0/0x20


Expected results:
mountpoint keeps working

Additional info:
The GFS2 filesystem is mounted with default options:
$ grep gfs2 /etc/fstab 
/dev/mapper/banana_tree             /mnt/storage            gfs2    defaults        0 0

The multipath device is configured as follows:
multipaths {
        multipath {
                wwid                    3600a0b80005a5a69000003614a1d19ee
                alias                   banana_tree
                path_grouping_policy    multibus
                path_checker            readsector0
                path_selector           "round-robin 0"
                failback                manual
                rr_weight               priorities
                no_path_retry           5
        }
}

Comment 2 Steve Whitehouse 2011-03-07 15:14:10 UTC
At first glance this looks like it is a block device issue. I would suggest adding noatime to the mount flags, although that is not going to solve this particular issue.

Could we have a brief description of the setup here? How many nodes are there, what are the specs, and what is the application?

Comment 3 Jakub Hrozek 2011-03-07 15:56:01 UTC
(In reply to comment #2)
> At first glance this looks like it is a block device issue. I would suggest
> adding noatime to the mount flags, although that is not going to solve this
> particular issue.
> 

Good suggestion, thanks.

> Could we have a brief description of the set up here? How many nodes are there,
> what are the specs and what is the application?

I could see the behaviour with 5 nodes, which is my desired full setup, and I can reproduce the issue with 2 nodes; I downscaled the cluster because every time this bug happened I had to reboot the machines. The GFS2 filesystem has 8 journals: one per machine plus 3 more just in case.

Not sure which specs in particular you would like to know; all machines are IBM eServer BladeCenter HS21 blades running RHEL 6.0. The cluster (and right now the machines individually) is used to host VM images, so the storage contains a smallish number of big files.

Here's what multipath -ll has got to say about the SAN:

# multipath -ll
banana_tree (3600a0b80005a5a69000003614a1d19ee) dm-2 IBM,1726-4xx  FAStT
size=1.1T features='1 queue_if_no_path' hwhandler='1 rdac' wp=rw
`-+- policy='round-robin 0' prio=2 status=active
  |- 2:0:1:0 sde 8:64 active ghost  running
  |- 1:0:1:0 sdc 8:32 active ready  running
  |- 2:0:0:0 sdd 8:48 active ready  running
  `- 1:0:0:0 sdb 8:16 active ghost  running

Comment 4 Steve Whitehouse 2011-03-09 12:05:22 UTC
Ben, does this look familiar to you? I think it is a multipath/scsi/storage/block layer issue, but I'd like some confirmation of that if possible.

Comment 5 Ben Marzinski 2011-07-05 15:41:09 UTC
This multipath setup looks wrong.  The ghost paths should not be in the same pathgroup as the active paths. Ghost paths are passive paths that are working but need a special command sent to the hardware to activate them.  These commands get sent when multipath switches pathgroups. However, in this setup they are in the same pathgroup, which keeps multipath from activating them before they are used.
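
This condition is visible directly in the `multipath -ll` output quoted in comment 3. As a minimal sketch (not part of multipath-tools), assuming the output format shown above, one can scan for pathgroups that mix ghost and ready paths:

```python
import re

def mixed_pathgroups(multipath_ll: str) -> list[int]:
    """Return indices of pathgroups that mix ghost (passive) and
    ready (active) paths -- the misconfiguration described above,
    which keeps multipath from activating a ghost path before use."""
    groups, current = [], None
    for line in multipath_ll.splitlines():
        if "policy=" in line:          # a policy line starts a new pathgroup
            current = []
            groups.append(current)
        elif current is not None and re.search(r"\b(ghost|ready)\b", line):
            current.append("ghost" if "ghost" in line else "ready")
    return [i for i, g in enumerate(groups) if "ghost" in g and "ready" in g]

# The topology reported in comment 3: one pathgroup with two ghost
# and two ready paths.
SAMPLE = """\
banana_tree (3600a0b80005a5a69000003614a1d19ee) dm-2 IBM,1726-4xx  FAStT
size=1.1T features='1 queue_if_no_path' hwhandler='1 rdac' wp=rw
`-+- policy='round-robin 0' prio=2 status=active
  |- 2:0:1:0 sde 8:64 active ghost  running
  |- 1:0:1:0 sdc 8:32 active ready  running
  |- 2:0:0:0 sdd 8:48 active ready  running
  `- 1:0:0:0 sdb 8:16 active ghost  running
"""

print(mixed_pathgroups(SAMPLE))  # -> [0]: the single pathgroup mixes both
```

A correctly autoconfigured RDAC array would instead place the ready paths and the ghost paths in separate pathgroups, so this function would return an empty list.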

This device should autoconfigure to set up the pathgroups correctly, unless you have overridden the settings for it in /etc/multipath.conf.

If you are still having this problem, could you please post your /etc/multipath.conf file?

Comment 6 Jakub Hrozek 2011-07-11 08:06:24 UTC
Created attachment 512142 [details]
/etc/multipath.conf

Comment 7 Jakub Hrozek 2011-07-11 08:06:59 UTC
(In reply to comment #5)
> This multipath setup looks wrong.  The ghost paths should not be in the same
> pathgroup as the active paths. ghost paths are passive paths that are working
> but need a special command sent to the hardware to activate them.  These
> commands get sent when multipath switches pathgroups. However, in this setup,
> they are in the same pathgroup, which will keep multipath from activating them
> before they are used.
> 
> This device should autoconfigure to correctly setup the pathgroups, unless you
> have overridden the settings for it in /etc/multipath.conf.
> 
> If you are still having this problem, could you please post your
> /etc/multipath.conf file.

Sorry for the vacation-induced delay. The /etc/multipath.conf file is attached.

Comment 8 Ben Marzinski 2011-07-12 14:55:53 UTC
This device is autoconfigured by device-mapper-multipath, based on guidance from IBM, so you shouldn't need to change anything to have it work correctly.

I would suggest changing your multipaths section to look like this:

multipaths {
        multipath {
                wwid                    3600a0b80005a5a69000003614a1d19ee
                alias                   banana_tree
        }
}

That just sets the alias the way you want it.

If you really need it configured the way it was, the following removes only the parts of your configuration that were causing problems. But I would definitely try the default configuration first.

defaults {
	user_friendly_names yes
}

devices {
        device {
                vendor "IBM"
                product "1724"
                path_checker readsector0
                failback manual
                no_path_retry 5
        }
}
multipaths {
        multipath {
                wwid                    3600a0b80005a5a69000003614a1d19ee
                alias                   banana_tree
        }
}

Comment 9 Steve Whitehouse 2011-07-15 15:00:50 UTC
Is this issue resolved now? If so, can we close the bug, assuming that this really is just a config issue and not a real bug?

Jakub, let us know if you need anything more from us.

Comment 10 Jakub Hrozek 2011-07-15 15:18:54 UTC
(In reply to comment #9)
> Is this issue resolved now? If so can we close the bug assuming that this
> really is just a config issue and not a real bug?
> 
> Jakub, let us know if you need anything more from us.

So far so good -- thanks a lot for the pointer!

I wasn't able to deploy the change to the full cluster yet, only to 2 nodes, but I haven't seen any hiccups so far.

Feel free to close this issue, I can reopen later if the issue persists.

Comment 11 Steve Whitehouse 2011-07-15 15:24:05 UTC
OK, sounds good; I will close this for now.

