Description of problem:
While trying to make a 1T file system with a 1k file system block size, mkfs got stuck in a loop and did not complete. I attached GDB and stepped through the command:

(gdb) bt
#0  leaf_search (dip=0x1acc72f8, bh=0x1addb2b8, filename=0xbfcdb350 "inum_range5", len=11, dent_out=0xbfcdb24c, dent_prev=0x0) at fs_ops.c:1258
#1  0x08051f59 in linked_leaf_search (dip=0x1acc72f8, filename=0xbfcdb350 "inum_range5", len=11, dent_out=0xbfcdb24c, dent_prev=0x0, bh_out=0xbfcdb250) at fs_ops.c:1321
#2  0x08052023 in dir_e_search (dip=0x1acc72f8, filename=0xbfcdb350 "inum_range5", len=11, type=0x0, inum=0xbfcdb2b0) at fs_ops.c:1358
#3  0x080521a2 in dir_search (dip=0x1acc72f8, filename=0xbfcdb350 "inum_range5", len=11, type=0x0, inum=0xbfcdb2b0) at fs_ops.c:1423
#4  0x080525c1 in gfs2_lookupi (dip=0x1acc72f8, filename=0xbfcdb350 "inum_range5", len=11, ipp=0xbfcdb2f8) at fs_ops.c:1551
#5  0x08051b62 in createi (dip=0x1acc72f8, filename=0xbfcdb350 "inum_range5", mode=33152, flags=513) at fs_ops.c:1186
#6  0x080585b7 in build_inum_range (per_node=0x1acc72f8, j=5) at structures.c:170
#7  0x080588d9 in build_per_node (sdp=0xbfcdb4bc) at structures.c:249
#8  0x08049c6d in main_mkfs (argc=11, argv=0xbfcfda14) at main_mkfs.c:445
#9  0x08049339 in main (argc=11, argv=0xbfcfda14) at main.c:55
#10 0x0054ee8c in __libc_start_main () from /lib/libc.so.6
#11 0x080491b1 in ?? ()

(gdb) n
1275            } while (gfs2_dirent_next(dip, bh, &dent) == 0);
(gdb)
1257                    if (!dent->de_inum.no_formal_ino){
(gdb)
1258                            prev = dent;
(gdb)
1259                            continue;
(gdb)
1275            } while (gfs2_dirent_next(dip, bh, &dent) == 0);
(gdb) info locals
hash = 2442621791
dent = (struct gfs2_dirent *) 0x1addb3cc
prev = (struct gfs2_dirent *) 0x1addb3cc
entries = 0
x = 0
type = 2
(gdb) n
1257                    if (!dent->de_inum.no_formal_ino){
(gdb)
1258                            prev = dent;
(gdb) print *dent
$1 = {de_inum = {no_formal_ino = 0, no_addr = 0}, de_hash = 0, de_rec_len = 0, de_name_len = 0, de_type = 0, __pad = '\0' <repeats 13 times>}
(gdb) n
1259                            continue;
(gdb) n
1275            } while (gfs2_dirent_next(dip, bh, &dent) == 0);
(gdb) n
1257                    if (!dent->de_inum.no_formal_ino){
(gdb) print *dent
$2 = {de_inum = {no_formal_ino = 0, no_addr = 0}, de_hash = 0, de_rec_len = 0, de_name_len = 0, de_type = 0, __pad = '\0' <repeats 13 times>}
(gdb) info locals
hash = 2442621791
dent = (struct gfs2_dirent *) 0x1addb3cc
prev = (struct gfs2_dirent *) 0x1addb3cc
entries = 0
x = 0
type = 2

Version-Release number of selected component (if applicable):
gfs2-utils-0.1.51-1.el5

How reproducible:
Unknown, this is the first time I've hit this.

Steps to Reproduce:
1. lvcreate -l 1T -n brawl0 brawl
2. mkfs.gfs2 -O -b 1024 -j 6 -p lock_dlm -t tankmorph:brawl0 /dev/brawl/brawl0

Actual results:
I'll attach the 12M core file I generated.

Expected results:
mkfs.gfs2 should complete.

Additional info:
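The GDB session shows dent stuck at 0x1addb3cc across iterations, pointing at an all-zero dirent (de_rec_len = 0), so the do/while never advances. The following is a minimal, hypothetical sketch of how a zero record length can stall that kind of dirent walk; the struct and function are simplified stand-ins modeled loosely on gfs2_dirent_next(), not the real gfs2-utils code.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Simplified stand-in for struct gfs2_dirent: only the record length
 * matters for the walk. This is not the real on-disk layout. */
struct dirent_sketch {
    uint16_t de_rec_len;   /* bytes from this entry to the next one */
};

/* Advance to the next entry in the block. Returns 0 on success, -1 at
 * end of block. Without the rec_len == 0 guard, a zeroed entry (as in
 * the GDB session above) would leave *dent unchanged, and a caller
 * looping "while (next(...) == 0)" would spin forever. */
static int dirent_next(char *block, size_t blocksize,
                       struct dirent_sketch **dent)
{
    uint16_t rec_len = (*dent)->de_rec_len;

    if (rec_len == 0)                  /* corrupt/uninitialized entry */
        return -1;                     /* bail out instead of looping */

    char *next = (char *)*dent + rec_len;
    if (next >= block + blocksize)     /* would walk past the block */
        return -1;

    *dent = (struct dirent_sketch *)next;
    return 0;
}
```

In the actual bug the zeroed entry is a symptom, not the root cause: the directory data was laid out wrongly in the first place (see the patch comment below in this report), so the walker found garbage.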
Created attachment 326312 [details]
gzipped i386 core dump for mkfs.gfs2 from gfs2-utils-0.1.51-1.el5
While trying to verify that things are working correctly, I was able to make the file system with only five journals instead of six. But when I mount the gfs2meta file system, I find that I cannot do an "ls -l" inside of per_node; it does work in the root of the file system. I also tried running gfs2_jadd on the file system, and unmounting after that command took a long time.
After a journal add, the quota_change4 file disappeared.

[root@tank-01 ~]# gfs2_jadd -j 1 /mnt/brawl
Filesystem:            /mnt/brawl
Old Journals           5
New Journals           6
[root@tank-01 ~]# ls /mnt/meta/per_node
inum_range0  inum_range4    quota_change2  statfs_change1  statfs_change5
inum_range1  inum_range5    quota_change3  statfs_change2
inum_range2  quota_change0  quota_change5  statfs_change3
inum_range3  quota_change1  statfs_change0 statfs_change4
I dug through old logs to see when I last ran this test case; it was on Nov 19 with gfs2-utils-0.1.49-1.el5. I reinstalled gfs2-utils-0.1.49-1.el5 and 0.1.50-1.el5, and the test case passed with both of those versions.

NOTE: The 1k block size makes mkfs.gfs2 take a lot longer to create the file system. On a 1TB block device, a default mkfs.gfs2 takes 0:51, while mkfs.gfs2 -b 1024 takes 13:41.
Created attachment 326420 [details]
patch to fix the problem

This patch fixes the problem. It turned out to be a regression introduced by the fix for bug #471618. The new compute_heightsize function needs to work slightly differently when dealing with jdata, using sdp->sd_jbsize rather than the normal block size. This subtle difference caused the code that morphs a gfs2 directory from linear to exhash to stop working properly when the directory is jdata, as the per_node directory is.
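For context on why sd_jbsize matters: jdata blocks carry a per-block metadata header, so each data block holds slightly less than a full block of file payload, and a height table computed with the full block size overstates how much data a metadata tree of a given height can address. The sketch below illustrates only that arithmetic; the geometry constants and the function are hypothetical simplifications, not the real compute_heightsize from gfs2-utils.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical geometry, chosen for illustration only. */
#define BSIZE        1024ULL  /* -b 1024, as in this report            */
#define META_HEADER    24ULL  /* assumed per-block metadata header size */
#define DIPTRS          4ULL  /* assumed direct pointers in the dinode  */
#define INPTRS          8ULL  /* assumed pointers per indirect block    */

/* Bytes of file data addressable by a metadata tree of height >= 1,
 * where datablk is the payload each data block actually holds:
 * BSIZE for an ordinary file, but BSIZE - META_HEADER (the analogue
 * of sd_jbsize) for a jdata file, whose data blocks carry a header. */
static uint64_t height_capacity(unsigned height, uint64_t datablk)
{
    uint64_t cap = DIPTRS * datablk;   /* height 1: direct blocks only */
    while (--height)
        cap *= INPTRS;                 /* each extra level multiplies  */
    return cap;
}
```

With these toy numbers, a 4050-byte file fits in a height-1 tree if you count full blocks (4 x 1024 = 4096) but not if each block only holds 1000 bytes of payload (4 x 1000 = 4000). Building the table with the wrong per-block payload for jdata is the kind of off-by-a-header error that is consistent with the linear-to-exhash morph misbehaving on the jdata per_node directory.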
Seeing as how this is a regression with bug #471618's fix, I'm closing this bug as DUPLICATE of that one. *** This bug has been marked as a duplicate of bug 471618 ***
gfs2-utils-2.03.11-1.fc9, cman-2.03.11-1.fc9, and rgmanager-2.03.11-1.fc9 have been pushed to the Fedora 9 stable repository. If problems still persist, please make note of it in this bug report.