Bug 433309
| Summary: | GFS2: attempt to grow filesystem segment faults | | |
| --- | --- | --- | --- |
| Product: | Red Hat Enterprise Linux 5 | Reporter: | Tom Tracy <ttracy> |
| Component: | gfs2-utils | Assignee: | Robert Peterson <rpeterso> |
| Status: | CLOSED INSUFFICIENT_DATA | QA Contact: | GFS Bugs <gfs-bugs> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | | |
| Version: | 5.2 | CC: | bmarzins, edamato |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2008-10-30 16:10:48 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
Tom Tracy 2008-02-18 15:33:30 UTC
kernel version: 2.6.18-71.el5
gfs2 utilities: gfs2-utils-0.1.38-1.el5

I was not able to recreate this failure using my latest code of the gfs2 kernel and gfs2 userland, which should be close to what's in 5.2. Can I get the exact commands you're using to recreate the failure, and the resulting call stack from the failure? I want to know whether this only happens under certain conditions, such as certain block sizes, RG sizes, numbers of journals, etc. Also, please check whether there are any messages in dmesg. Here is what I did:

```
[root@roth-01 ../RHEL5/cluster/gfs2/mkfs]# lvcreate --name roth_lv -L 200G /dev/roth_vg
  Logical volume "roth_lv" created
[root@roth-01 ../RHEL5/cluster/gfs2/mkfs]# mkfs.gfs2 -O -t bobs_roth:test_gfs -X -p lock_dlm -j 3 /dev/roth_vg/roth_lv
Expert mode:               on
Device:                    /dev/roth_vg/roth_lv
Blocksize:                 4096
Device Size                200.00 GB (52428800 blocks)
Filesystem Size:           200.00 GB (52428798 blocks)
Journals:                  3
Resource Groups:           800
Locking Protocol:          "lock_dlm"
Lock Table:                "bobs_roth:test_gfs"
[root@roth-01 ../RHEL5/cluster/gfs2/mkfs]# /usr/sbin/lvresize -L +200G /dev/roth_vg/roth_lv
  Extending logical volume roth_lv to 400.00 GB
  Logical volume roth_lv successfully resized
[root@roth-01 ../RHEL5/cluster/gfs2/mkfs]# mount -t gfs2 /dev/roth_vg/roth_lv /mnt/gfs2
[root@roth-01 ../RHEL5/cluster/gfs2/mkfs]# gfs2_grow /mnt/gfs2
FS: Mount Point: /mnt/gfs2
FS: Device:      /dev/mapper/roth_vg-roth_lv
FS: Size:        52428798 (0x31ffffe)
FS: RG size:     65535 (0xffff)
DEV: Size:       104857600 (0x6400000)
The file system grew by 204800MB.
gfs2_grow complete.
[root@roth-01 ../RHEL5/cluster/gfs2/mkfs]#
```

To reproduce this issue:

```
lvcreate --name archieve -L 200 /dev/mapper/msa-archieve
lvresize -L +200G /dev/mapper/msa-archieve
mount -t gfs2 /dev/mapper/msa-archieve /archive
gfs2_grow /archieve
```

From /var/log/messages.2:

```
Feb 8 14:02:15 et-virt08 kernel: multipathd[18936]: segfault at 000000000000000a rip 000000356b06fa7d rsp 0000000008772220 error 4
```

Did you try this with DM Multipath?
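As a quick sanity check on the gfs2_grow transcript, the reported sizes are internally consistent: with the 4096-byte block size reported by mkfs.gfs2, the device grew from 52428800 to 104857600 blocks, which matches the reported "204800MB" (the figure is in mebibytes, despite the "MB" label). A minimal Python sketch of that arithmetic, with the constants taken from the transcript:

```python
# Numbers taken from the mkfs.gfs2 / gfs2_grow output in the transcript.
BLOCK_SIZE = 4096           # bytes per filesystem block ("Blocksize: 4096")
OLD_DEV_BLOCKS = 52428800   # device size before lvresize (200.00 GB)
NEW_DEV_BLOCKS = 104857600  # "DEV: Size" after lvresize (0x6400000)

grown_bytes = (NEW_DEV_BLOCKS - OLD_DEV_BLOCKS) * BLOCK_SIZE
grown_mib = grown_bytes // 2**20
print(grown_mib)  # 204800, matching "The file system grew by 204800MB."
```

This suggests the size accounting in the successful run was correct; it says nothing about why the run against the multipath device segfaulted.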
That is the difference I am seeing between our two experiments.

The two differences between what you did and what I did are: (1) you used /dev/mapper devices in your commands, and (2) DM Multipath. I tried to recreate this problem using the device mapper device (/dev/mapper/whatever) in the commands and still didn't get it to fail. In other words, this is looking more and more like a DM Multipath problem, especially based on the segfault message posted in comment #3. Perhaps the DM Multipath kernel module died? I'm adding Ben M. to the cc list to get his input. I'd still like to get what appears in the console dmesg at the point of failure so we can see the complete call stacks.

Tom, are you still having this problem? Perhaps I can get access to your test system and debug gfs2_grow manually from there. Bob

Bob, since GFS2 was pushed back, I am using the storage on another cluster. Let me see if I have enough space so you can test it. If I have enough, send me your personal email and I will let you know the details. Tom

I'm waiting to hear back on this, but the test is impacted by the move in Westford. Until then, I'm putting the bug record in NEEDINFO.

This has been in NEEDINFO for long enough that it can't reasonably be considered high priority any more. This still looks like a possible DM Multipath problem to me. I'm assuming this is a duplicate of bug #426030, but I have no way to prove it.

This bug has hit the six-month limit in NEEDINFO, so I'm closing it as INSUFFICIENT_DATA. I'm assuming the problem will be pursued in bug #426030. If the problem can be reproduced, we can always reopen this bug record.