Bug 1020438
Summary: | xfs_repair segfaults in VM for 60T device when using ag_stride option. | | |
---|---|---|---|
Product: | Red Hat Enterprise Linux 6 | Reporter: | Boris Ranto <branto> |
Component: | xfsprogs | Assignee: | Eric Sandeen <esandeen> |
Status: | CLOSED ERRATA | QA Contact: | Eryu Guan <eguan> |
Severity: | medium | Docs Contact: | |
Priority: | unspecified | | |
Version: | 6.5 | CC: | branto, dchinner, dhe, eguan, esandeen, lnovich, netwiz |
Target Milestone: | rc | | |
Target Release: | --- | | |
Hardware: | All | | |
OS: | Linux | | |
Whiteboard: | | | |
Fixed In Version: | xfsprogs-3.1.1-15.el6 | Doc Type: | Bug Fix |
Doc Text: | | Story Points: | --- |
Clone Of: | 893904 | Environment: | |
Last Closed: | 2014-10-14 07:49:50 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Bug Depends On: | 893904 | | |
Bug Blocks: | 1023566 | | |
Comment 1
Boris Ranto
2013-10-17 16:25:57 UTC
The core file is attached to the cloned bug, correct? Why -m 9000?

Thanks,
-Eric

    (gdb) bt
    #0  0x0000000000426962 in progress_rpt_thread (p=0x67ad20) at progress.c:234
    #1  0x0000003b98a07851 in start_thread (arg=0x7f19d8e47700) at pthread_create.c:301
    #2  0x0000003b982e767d in ?? ()
    #3  0x0000000000000000 in ?? ()
    (gdb) p msgp
    $1 = (msg_block_t *) 0x67ad20
    (gdb) p msgp->format
    $2 = (progress_rpt_t *) 0x0
    (gdb)

Ok, easy enough to repro locally w/ a faster reporting interval:

    # mkfs.xfs -d size=60t,file,name=fsfile
    # xfs_repair -m 9000 -o ag_stride=32 -t 10 fsfile
    Phase 1 - find and verify superblock...
            - reporting progress in intervals of 10 seconds
    Phase 2 - using internal log
            - zero log...
    Segmentation fault

Bug persists upstream.

Ok, patch sent upstream. Probably needs a RHEL7 bug too...

commit 7f2d6b811755b6b91f18aa5bd9d5980848a81267
Author: Eric Sandeen <sandeen>
Date: Thu Oct 17 17:50:16 2013 +0000

    xfs_repair: avoid segfault if reporting progress early in repair

    For a very large filesystem, zeroing the log may take some time. If we
    ask for progress reports frequently enough that one fires before we
    finish with log zeroing, we try to use a progress format which has not
    yet been set up, and segfault:

    # mkfs.xfs -d size=60t,file,name=fsfile
    # xfs_repair -m 9000 -o ag_stride=32 -t 1 fsfile
    Phase 1 - find and verify superblock...
            - reporting progress in intervals of 1 seconds
    Phase 2 - using internal log
            - zero log...
    Segmentation fault

    (gdb) bt
    #0  0x0000000000426962 in progress_rpt_thread (p=0x67ad20) at progress.c:234
    #1  0x0000003b98a07851 in start_thread (arg=0x7f19d8e47700) at pthread_create.c:301
    #2  0x0000003b982e767d in ?? ()
    #3  0x0000000000000000 in ?? ()
    (gdb) p msgp
    $1 = (msg_block_t *) 0x67ad20
    (gdb) p msgp->format
    $2 = (progress_rpt_t *) 0x0
    (gdb)

    I suppose we could rig up progress reports for log zeroing, but that
    won't usually take terribly long; for now, be defensive and init the
    message->format to NULL, and just return early from the progress thread
    if we've not yet set up any message. (Sure, global_msgs is global, and
    ->format is already NULL, but to me it's worth being explicit since we
    will test it).

    Signed-off-by: Eric Sandeen <sandeen>
    Reviewed-by: Christoph Hellwig <hch>
    Signed-off-by: Rich Johnston <rjohnston>

Reproduced with xfsprogs-3.1.1-14.el6

    [root@hp-dl388g8-03 tmp]# mkfs.xfs -d size=60t,file,name=/mnt/xfs/fsfile
    meta-data=/mnt/xfs/fsfile isize=256    agcount=60, agsize=268435455 blks
             =                sectsz=512   attr=2, projid32bit=0
    data     =                bsize=4096   blocks=16106127300, imaxpct=1
             =                sunit=0      swidth=0 blks
    naming   =version 2       bsize=4096   ascii-ci=0
    log      =internal log    bsize=4096   blocks=521728, version=2
             =                sectsz=512   sunit=0 blks, lazy-count=1
    realtime =none            extsz=4096   blocks=0, rtextents=0
    [root@hp-dl388g8-03 tmp]# xfs_repair -m 9000 -o ag_stride=32 -t 1 /mnt/xfs/fsfile
    Phase 1 - find and verify superblock...
            - reporting progress in intervals of 1 second
    Phase 2 - using internal log
            - zero log...
    Segmentation fault (core dumped)
    [root@hp-dl388g8-03 tmp]# rpm -q xfsprogs
    xfsprogs-3.1.1-14.el6.x86_64

Verified with xfsprogs-3.1.1-16.el6, xfs_repair could repair the image with no segfault.

Set to VERIFIED.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2014-1564.html
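For readers who want to see the shape of the fix, the defensive pattern described in the commit message (initialize ->format to NULL, and bail out of the reporter if it is still NULL) can be sketched roughly as below. This is not the actual patch: only the type names msg_block_t and progress_rpt_t and the field name format come from the backtrace above. The other fields (label, done) and the helper functions msg_block_init and report_progress are hypothetical, made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the types seen in the backtrace; the real
 * definitions in xfs_repair's progress.c are not reproduced here. */
typedef struct progress_rpt {
    const char *label;          /* hypothetical: name of the phase being reported */
} progress_rpt_t;

typedef struct msg_block {
    progress_rpt_t *format;     /* NULL until a phase has set up reporting */
    unsigned long   done;       /* hypothetical: work items completed so far */
} msg_block_t;

/* Explicitly mark the block as "no report configured yet". */
void msg_block_init(msg_block_t *msgp)
{
    assert(msgp != NULL);
    msgp->format = NULL;
    msgp->done = 0;
}

/* One reporting tick. Returns 0 without touching buf when no format has
 * been set up yet (the case that crashed), 1 after writing a report. */
int report_progress(const msg_block_t *msgp, char *buf, size_t len)
{
    assert(msgp != NULL);
    if (msgp->format == NULL)
        return 0;               /* timer fired early: nothing to report */
    snprintf(buf, len, "%s: %lu done", msgp->format->label, msgp->done);
    return 1;
}
```

In this sketch a timer tick that arrives while the log is still being zeroed simply produces no output, instead of dereferencing a NULL format pointer. Later ticks report normally once a phase assigns a format.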