Description of problem:
While trying to make a 1T file system with a 1k file system block size, mkfs got stuck in a loop and did not complete. I attached GDB and stepped through the command:

(gdb) bt
#0  leaf_search (dip=0x1acc72f8, bh=0x1addb2b8, filename=0xbfcdb350 "inum_range5", len=11, dent_out=0xbfcdb24c, dent_prev=0x0) at fs_ops.c:1258
#1  0x08051f59 in linked_leaf_search (dip=0x1acc72f8, filename=0xbfcdb350 "inum_range5", len=11, dent_out=0xbfcdb24c, dent_prev=0x0, bh_out=0xbfcdb250) at fs_ops.c:1321
#2  0x08052023 in dir_e_search (dip=0x1acc72f8, filename=0xbfcdb350 "inum_range5", len=11, type=0x0, inum=0xbfcdb2b0) at fs_ops.c:1358
#3  0x080521a2 in dir_search (dip=0x1acc72f8, filename=0xbfcdb350 "inum_range5", len=11, type=0x0, inum=0xbfcdb2b0) at fs_ops.c:1423
#4  0x080525c1 in gfs2_lookupi (dip=0x1acc72f8, filename=0xbfcdb350 "inum_range5", len=11, ipp=0xbfcdb2f8) at fs_ops.c:1551
#5  0x08051b62 in createi (dip=0x1acc72f8, filename=0xbfcdb350 "inum_range5", mode=33152, flags=513) at fs_ops.c:1186
#6  0x080585b7 in build_inum_range (per_node=0x1acc72f8, j=5) at structures.c:170
#7  0x080588d9 in build_per_node (sdp=0xbfcdb4bc) at structures.c:249
#8  0x08049c6d in main_mkfs (argc=11, argv=0xbfcfda14) at main_mkfs.c:445
#9  0x08049339 in main (argc=11, argv=0xbfcfda14) at main.c:55
#10 0x0054ee8c in __libc_start_main () from /lib/libc.so.6
#11 0x080491b1 in ?? ()

(gdb) n
1275            } while (gfs2_dirent_next(dip, bh, &dent) == 0);
(gdb)
1257                    if (!dent->de_inum.no_formal_ino){
(gdb)
1258                            prev = dent;
(gdb)
1259                            continue;
(gdb)
1275            } while (gfs2_dirent_next(dip, bh, &dent) == 0);
(gdb) info locals
hash = 2442621791
dent = (struct gfs2_dirent *) 0x1addb3cc
prev = (struct gfs2_dirent *) 0x1addb3cc
entries = 0
x = 0
type = 2
(gdb) n
1257                    if (!dent->de_inum.no_formal_ino){
(gdb)
1258                            prev = dent;
(gdb) print *dent
$1 = {de_inum = {no_formal_ino = 0, no_addr = 0}, de_hash = 0, de_rec_len = 0, de_name_len = 0, de_type = 0, __pad = '\0' <repeats 13 times>}
(gdb) n
1259                            continue;
(gdb) n
1275            } while (gfs2_dirent_next(dip, bh, &dent) == 0);
(gdb) n
1257                    if (!dent->de_inum.no_formal_ino){
(gdb) print *dent
$2 = {de_inum = {no_formal_ino = 0, no_addr = 0}, de_hash = 0, de_rec_len = 0, de_name_len = 0, de_type = 0, __pad = '\0' <repeats 13 times>}
(gdb) info locals
hash = 2442621791
dent = (struct gfs2_dirent *) 0x1addb3cc
prev = (struct gfs2_dirent *) 0x1addb3cc
entries = 0
x = 0
type = 2

Version-Release number of selected component (if applicable):
gfs2-utils-0.1.51-1.el5

How reproducible:
Unknown, this is the first time I've hit this.

Steps to Reproduce:
1. lvcreate -l 1T -n brawl0 brawl
2. mkfs.gfs2 -O -b 1024 -j 6 -p lock_dlm -t tankmorph:brawl0 /dev/brawl/brawl0

Actual results:
I'll attach the 12M core file I generated.

Expected results:
mkfs.gfs2 should complete.

Additional info:
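The GDB session shows dent stuck at 0x1addb3cc across iterations, pointing at an all-zero dirent (de_rec_len = 0), so the do/while never advances. The following is a minimal, hypothetical sketch of how a zero record length can stall that kind of dirent walk; the struct and function are simplified stand-ins modeled loosely on gfs2_dirent_next(), not the real gfs2-utils code.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Simplified stand-in for struct gfs2_dirent: only the record length
 * matters for the walk. This is not the real on-disk layout. */
struct dirent_sketch {
    uint16_t de_rec_len;   /* bytes from this entry to the next one */
};

/* Advance to the next entry in the block. Returns 0 on success, -1 at
 * end of block. Without the rec_len == 0 guard, a zeroed entry (as in
 * the GDB session above) would leave *dent unchanged, and a caller
 * looping "while (next(...) == 0)" would spin forever. */
static int dirent_next(char *block, size_t blocksize,
                       struct dirent_sketch **dent)
{
    uint16_t rec_len = (*dent)->de_rec_len;

    if (rec_len == 0)                  /* corrupt/uninitialized entry */
        return -1;                     /* bail out instead of looping */

    char *next = (char *)*dent + rec_len;
    if (next >= block + blocksize)     /* would walk past the block */
        return -1;

    *dent = (struct dirent_sketch *)next;
    return 0;
}
```

In the actual bug the zeroed entry is a symptom, not the root cause: the directory data was laid out wrongly in the first place (see the patch comment below in this report), so the walker found garbage.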
Created attachment 326312 [details]
gzipped i386 core dump for mkfs.gfs2 from gfs2-utils-0.1.51-1.el5
While trying to verify that things are working correctly, I was able to make the file system with only five journals instead of six. But when I mount the gfs2meta file system, I find that I cannot do an "ls -l" inside of per_node; it does work in the root of the file system. I also tried running gfs2_jadd on the file system, and unmounting after that command took a long time.
After a journal add, the quota_change4 file disappeared.

[root@tank-01 ~]# gfs2_jadd -j 1 /mnt/brawl
Filesystem:            /mnt/brawl
Old Journals           5
New Journals           6
[root@tank-01 ~]# ls /mnt/meta/per_node
inum_range0  inum_range4    quota_change2  statfs_change1  statfs_change5
inum_range1  inum_range5    quota_change3  statfs_change2
inum_range2  quota_change0  quota_change5  statfs_change3
inum_range3  quota_change1  statfs_change0 statfs_change4
I dug through old logs to see when I last ran this test case; it was on Nov 19 with gfs2-utils-0.1.49-1.el5. I reinstalled gfs2-utils-0.1.49-1.el5 and 0.1.50-1.el5, and the test case passed with both of those versions.

NOTE: The 1k block size makes mkfs.gfs2 take a lot longer to create the file system. On a 1TB block device, a default mkfs.gfs2 takes 0:51, while mkfs.gfs2 -b 1024 takes 13:41.
Created attachment 326420 [details]
patch to fix the problem

This patch fixes the problem. It turned out to be a regression introduced by the fix for bug #471618. The new compute_heightsize function needs to work slightly differently when dealing with jdata, using sdp->sd_jbsize rather than the normal block size. This subtle difference caused the code that morphs a gfs2 directory from linear to exhash to stop working properly when the directory is jdata, as the per_node directory is.
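For context on why sd_jbsize matters: jdata blocks carry a per-block metadata header, so each data block holds slightly less than a full block of file payload, and a height table computed with the full block size overstates how much data a metadata tree of a given height can address. The sketch below illustrates only that arithmetic; the geometry constants and the function are hypothetical simplifications, not the real compute_heightsize from gfs2-utils.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical geometry, chosen for illustration only. */
#define BSIZE        1024ULL  /* -b 1024, as in this report            */
#define META_HEADER    24ULL  /* assumed per-block metadata header size */
#define DIPTRS          4ULL  /* assumed direct pointers in the dinode  */
#define INPTRS          8ULL  /* assumed pointers per indirect block    */

/* Bytes of file data addressable by a metadata tree of height >= 1,
 * where datablk is the payload each data block actually holds:
 * BSIZE for an ordinary file, but BSIZE - META_HEADER (the analogue
 * of sd_jbsize) for a jdata file, whose data blocks carry a header. */
static uint64_t height_capacity(unsigned height, uint64_t datablk)
{
    uint64_t cap = DIPTRS * datablk;   /* height 1: direct blocks only */
    while (--height)
        cap *= INPTRS;                 /* each extra level multiplies  */
    return cap;
}
```

With these toy numbers, a 4050-byte file fits in a height-1 tree if you count full blocks (4 x 1024 = 4096) but not if each block only holds 1000 bytes of payload (4 x 1000 = 4000). Building the table with the wrong per-block payload for jdata is the kind of off-by-a-header error that is consistent with the linear-to-exhash morph misbehaving on the jdata per_node directory.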
Seeing as how this is a regression with bug #471618's fix, I'm closing this bug as DUPLICATE of that one. *** This bug has been marked as a duplicate of bug 471618 ***
gfs2-utils-2.03.11-1.fc9, cman-2.03.11-1.fc9, and rgmanager-2.03.11-1.fc9 have been pushed to the Fedora 9 stable repository. If problems still persist, please make note of it in this bug report.