Bug 1123289
| Field | Value |
|---|---|
| Summary | crash on fsync |
| Product | [Community] GlusterFS |
| Component | core |
| Version | 3.4.4 |
| Status | CLOSED CURRENTRELEASE |
| Severity | urgent |
| Priority | unspecified |
| Reporter | Joe Julian <joe> |
| Assignee | Pranith Kumar K <pkarampu> |
| CC | bugs, gluster-bugs, kkeithle, pkarampu, rgowdapp |
| Target Milestone | --- |
| Target Release | --- |
| Hardware | Unspecified |
| OS | Unspecified |
| Doc Type | Bug Fix |
| Clones | 1126048 (view as bug list) |
| Type | Bug |
| Last Closed | 2015-02-12 20:09:33 UTC |
| Bug Depends On | 1126048, 1136827, 1136835 |
| Bug Blocks | 1125245 |
Description (Joe Julian, 2014-07-25 08:56:38 UTC)
Second client to fail in 12 hours:

```
[2014-07-25 07:00:29.287680] W [fuse-resolve.c:546:fuse_resolve_fd] 0-fuse-resolve: migration of basefd (ptr:0x7f17cb846444 inode-gfid:87544fde-9bad-46d8-b610-1a8c93b85113) did not complete, failing fop with EBADF (old-subvolume:gv-nova-3 new-subvolume:gv-nova-4)
pending frames:
frame : type(1) op(FSYNC)
frame : type(0) op(0)
frame : type(1) op(FSYNC)
frame : type(0) op(0)
frame : type(1) op(OPEN)
frame : type(1) op(OPEN)
frame : type(0) op(0)
[remaining pending frames are all "frame : type(0) op(0)"]

patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash: 2014-07-25 07:00:29
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.4.4
/lib/x86_64-linux-gnu/libc.so.6(+0x364a0)[0x7f17c763f4a0]
/usr/lib/x86_64-linux-gnu/glusterfs/3.4.4/xlator/cluster/distribute.so(dht_fsync+0x166)[0x7f17c26ee296]
/usr/lib/x86_64-linux-gnu/glusterfs/3.4.4/xlator/performance/write-behind.so(wb_fsync+0x2e5)[0x7f17c24b3965]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(default_fsync_resume+0x13f)[0x7f17c802e21f]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(call_resume+0x1f7)[0x7f17c8046457]
/usr/lib/x86_64-linux-gnu/glusterfs/3.4.4/xlator/performance/open-behind.so(open_and_resume+0xf0)[0x7f17c22a71a0]
/usr/lib/x86_64-linux-gnu/glusterfs/3.4.4/xlator/performance/open-behind.so(ob_fsync+0x40)[0x7f17c22a8130]
/usr/lib/x86_64-linux-gnu/glusterfs/3.4.4/xlator/debug/io-stats.so(io_stats_fsync+0x16a)[0x7f17c20909fa]
/usr/lib/x86_64-linux-gnu/glusterfs/3.4.4/xlator/mount/fuse.so(fuse_fsync_resume+0x230)[0x7f17c58647b0]
/usr/lib/x86_64-linux-gnu/glusterfs/3.4.4/xlator/mount/fuse.so(+0x8196)[0x7f17c5858196]
/usr/lib/x86_64-linux-gnu/glusterfs/3.4.4/xlator/mount/fuse.so(+0x7ed6)[0x7f17c5857ed6]
/usr/lib/x86_64-linux-gnu/glusterfs/3.4.4/xlator/mount/fuse.so(+0x81de)[0x7f17c58581de]
/usr/lib/x86_64-linux-gnu/glusterfs/3.4.4/xlator/mount/fuse.so(fuse_resolve_continue+0x41)[0x7f17c5857ab1]
/usr/lib/x86_64-linux-gnu/glusterfs/3.4.4/xlator/mount/fuse.so(+0x7e5e)[0x7f17c5857e5e]
/usr/lib/x86_64-linux-gnu/glusterfs/3.4.4/xlator/mount/fuse.so(+0x81be)[0x7f17c58581be]
/usr/lib/x86_64-linux-gnu/glusterfs/3.4.4/xlator/mount/fuse.so(fuse_resolve_and_resume+0x28)[0x7f17c58586c8]
/usr/lib/x86_64-linux-gnu/glusterfs/3.4.4/xlator/mount/fuse.so(+0x201f4)[0x7f17c58701f4]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x7e9a)[0x7f17c79d0e9a]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f17c76fd3fd]
```

I replaced a brick on Monday. Could this be related to that?

Joe,

Replacing a brick results in a graph switch to a new graph containing the new brick. That is when migration of fds gets triggered. So, replacing the brick uncovered this bug.

regards,
Raghavendra.
```
Volume Name: gv-nova
Type: Distributed-Replicate
Volume ID: 978c6843-1d6b-48dc-a498-862c57637916
Status: Started
Number of Bricks: 7 x 2 = 14
Transport-type: tcp
Bricks:
Brick1: storage02-stor:/gluster/brick02/nova
Brick2: storage06-stor:/gluster/brick02/nova
Brick3: storage03-stor:/gluster/brick03/nova
Brick4: storage07-stor:/gluster/brick03/nova
Brick5: storage03-stor:/gluster/brick04/nova
Brick6: storage08-stor:/gluster/brick03/nova
Brick7: storage04-stor:/gluster/brick03/nova
Brick8: storage07-stor:/gluster/brick04/nova
Brick9: storage04-stor:/gluster/brick04/nova
Brick10: storage08-stor:/gluster/brick04/nova
Brick11: storage02-stor:/gluster/brick03/nova
Brick12: storage07-stor:/gluster/brick02/nova
Brick13: storage02-stor:/gluster/brick04/nova
Brick14: storage08-stor:/gluster/brick02/nova
Options Reconfigured:
cluster.heal-timeout: 500
cluster.self-heal-window-size: 2
cluster.self-heal-readdir-size: 2KB
cluster.background-self-heal-count: 32
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: off
cluster.eager-lock: enable
network.remote-dio: enable
storage.owner-uid: 500
storage.owner-gid: 500
performance.io-thread-count: 24
cluster.self-heal-daemon: on
diagnostics.latency-measurement: on
diagnostics.count-fop-hits: on
```

Created attachment 921196 [details]
Client log from the second crash

The graph switch occurs at 2014-07-24 05:23:43 in the attached log file.
The series of:

```
[2014-07-29 06:39:13.410535] W [socket.c:514:__socket_rwv] 3-gv-nova-client-0: readv failed (No data available)
[2014-07-29 06:39:13.410560] I [client.c:2098:client_rpc_notify] 3-gv-nova-client-0: disconnected
```

is the client querying for the removed brick's port number:

```
116 4.036814 10.30.6.28 10.30.6.3 Gluster Portmap 154 V1 PORTBYBRICK Call (Reply In 117)
  + Gluster Portmap [Program Version: 1] [Gluster Portmap: PORTBYBRICK (1)]
  + Brick: /gluster/st01brick01/nova
117 4.036993 10.30.6.3 10.30.6.28 Gluster Portmap 98 V1 PORTBYBRICK Reply (Call In 116)
  + Gluster Portmap [Program Version: 1] [Gluster Portmap: PORTBYBRICK (1)]
    Return value: -1
    Errno: 0 (Success)
    Status: 0
    Port: 0
```

Since that brick no longer exists due to using the command "remove-brick ... commit force", this continues until the crash happens.

replace-brick, not remove-brick

Sorry, I was on leave till yesterday so I couldn't debug this issue. I am able to re-create the crash, and I have found its root cause. All I need to find now is why the nameless lookup is failing on some of the files, which leads to failures in fd migration to the new graph. For the crash to happen, fd migration needs to fail and an fsync needs to come in on the fd whose migration failed. There are checks in the fuse code to prevent sending fops on fds that failed migration, but that error-checking code has a bug introduced by http://review.gluster.org/6262. Since that patch fixes an issue with the 'link' fop, I need to revert it and fix the 'link' fop bug separately. This fixes the crash, but we still don't know why fd migration failed; we need to find the root cause for that as well.

Joe, please check whether there are any other mounts where you have fd-migration errors. They may also crash when fsync is issued on those fds.

Could you please tell me which files are represented by the following gfids, and on which bricks those files live?
Brick logs would be helpful as well.

```
pk1@pranithk-laptop - ~/Downloads/gluster-users/JoeJulian 18:21:12 :) ⚡ grep Invalid client.log
[2014-07-24 05:23:44.931310] W [fuse-bridge.c:3932:fuse_migrate_fd_open] 0-glusterfs-fuse: name-less lookup of gfid (b7e794e9-67c0-4d47-9ee5-2b41d1de792f) failed (Invalid argument)(old-subvolume:gv-nova-3 new-subvolume:gv-nova-4)
[2014-07-24 05:23:44.980643] W [fuse-bridge.c:3932:fuse_migrate_fd_open] 0-glusterfs-fuse: name-less lookup of gfid (9abcf733-05b4-45e1-9845-ff8f4b97fbc4) failed (Invalid argument)(old-subvolume:gv-nova-3 new-subvolume:gv-nova-4)
[2014-07-24 05:23:45.088577] W [fuse-bridge.c:3932:fuse_migrate_fd_open] 0-glusterfs-fuse: name-less lookup of gfid (87544fde-9bad-46d8-b610-1a8c93b85113) failed (Invalid argument)(old-subvolume:gv-nova-3 new-subvolume:gv-nova-4)
[2014-07-24 05:23:45.122628] W [fuse-bridge.c:3932:fuse_migrate_fd_open] 0-glusterfs-fuse: name-less lookup of gfid (37c5ffcb-3937-4bd5-a719-244526c278f5) failed (Invalid argument)(old-subvolume:gv-nova-3 new-subvolume:gv-nova-4)
```

Here is the backtrace when the crash happens after fsync on an fd that failed migration:

```
Program received signal SIGSEGV, Segmentation fault.
0x00007f1003b494a9 in dht_fsync (frame=0x7f100b2ef1a4, this=0x13f8620, fd=0x13cd5c8, datasync=0, xdata=0x0) at dht-inode-read.c:817
817         STACK_WIND (frame, dht_fsync_cbk, subvol, subvol->fops->fsync,
(gdb) bt
#0  0x00007f1003b494a9 in dht_fsync (frame=0x7f100b2ef1a4, this=0x13f8620, fd=0x13cd5c8, datasync=0, xdata=0x0) at dht-inode-read.c:817
#1  0x00007f100803ba8d in wb_fsync (frame=0x7f100b2ef3a8, this=0x13fdfc0, fd=0x13cd5c8, datasync=0, xdata=0x0) at write-behind.c:1523
#2  0x00007f10038ee95c in ra_fsync (frame=0x7f100b2ef704, this=0x13fea20, fd=0x13cd5c8, datasync=0, xdata=0x0) at read-ahead.c:628
#3  0x00007f100c4e5aca in default_fsync (frame=0x7f100b2ef704, this=0x13ff480, fd=0x13cd5c8, flags=0, xdata=0x0) at defaults.c:991
#4  0x00007f100c4e5aca in default_fsync (frame=0x7f100b2ef704, this=0x13ffee0, fd=0x13cd5c8, flags=0, xdata=0x0) at defaults.c:991
#5  0x00007f100c4dfd63 in default_fsync_resume (frame=0x7f100b2efbb8, this=0x1400a60, fd=0x13cd5c8, flags=0, xdata=0x0) at defaults.c:561
#6  0x00007f100c4fe7bc in call_resume_wind (stub=0x7f100ad78da4) at call-stub.c:2389
#7  0x00007f100c5064d5 in call_resume (stub=0x7f100ad78da4) at call-stub.c:2841
#8  0x00007f10032c2c1a in open_and_resume (this=0x1400a60, fd=0x13cd5c8, stub=0x7f100ad78da4) at open-behind.c:236
#9  0x00007f10032c3e9f in ob_fsync (frame=0x7f100b2efbb8, this=0x1400a60, fd=0x13cd5c8, flag=0, xdata=0x0) at open-behind.c:487
#10 0x00007f10030b644c in mdc_fsync (frame=0x7f100b2efb0c, this=0x1403710, fd=0x13cd5c8, datasync=0, xdata=0x0) at md-cache.c:1635
#11 0x00007f1002ea1a3c in io_stats_fsync (frame=0x7f100b2ef5ac, this=0x1404170, fd=0x13cd5c8, flags=0, xdata=0x0) at io-stats.c:2168
#12 0x00007f100a8eb1b3 in fuse_fsync_resume (state=0x7f0ffc016f80) at fuse-bridge.c:2384
#13 0x00007f100a8da00b in fuse_resolve_done (state=0x7f0ffc016f80) at fuse-resolve.c:663
#14 0x00007f100a8da0e1 in fuse_resolve_all (state=0x7f0ffc016f80) at fuse-resolve.c:692
#15 0x00007f100a8d9fd4 in fuse_resolve (state=0x7f0ffc016f80) at fuse-resolve.c:649
#16 0x00007f100a8da0b8 in fuse_resolve_all (state=0x7f0ffc016f80) at fuse-resolve.c:688
#17 0x00007f100a8da13f in fuse_resolve_continue (state=0x7f0ffc016f80) at fuse-resolve.c:708
#18 0x00007f100a8d9d8d in fuse_resolve_fd (state=0x7f0ffc016f80) at fuse-resolve.c:561
#19 0x00007f100a8d9f82 in fuse_resolve (state=0x7f0ffc016f80) at fuse-resolve.c:638
#20 0x00007f100a8da063 in fuse_resolve_all (state=0x7f0ffc016f80) at fuse-resolve.c:681
#21 0x00007f100a8da17d in fuse_resolve_and_resume (state=0x7f0ffc016f80, fn=0x7f100a8eaba4 <fuse_fsync_resume>) at fuse-resolve.c:721
#22 0x00007f100a8eb2fa in fuse_fsync (this=0x1338860, finh=0x7f0ffc016160, msg=0x7f0ffc016188) at fuse-bridge.c:2403
#23 0x00007f100a8f8af9 in fuse_thread_proc (data=0x1338860) at fuse-bridge.c:4798
#24 0x00000038e4807f33 in start_thread () from /lib64/libpthread.so.0
#25 0x00000038e40f4ded in clone () from /lib64/libc.so.6
(gdb) fr 0
#0  0x00007f1003b494a9 in dht_fsync (frame=0x7f100b2ef1a4, this=0x13f8620, fd=0x13cd5c8, datasync=0, xdata=0x0) at dht-inode-read.c:817
817         STACK_WIND (frame, dht_fsync_cbk, subvol, subvol->fops->fsync,
(gdb) p subvol      <<----- This is the reason for the crash.
$1 = (xlator_t *) 0x0
```

Created attachment 922813 [details]
brick log with the do_fd_cleanup on at least one of the gfids referenced
REVIEW: http://review.gluster.org/8562 (mount/fuse: Handle fd resolution failures) posted (#1) for review on release-3.4 by Kaleb KEITHLEY (kkeithle)

COMMIT: http://review.gluster.org/8562 committed in release-3.4 by Kaleb KEITHLEY (kkeithle)

```
commit 1d4ef0b891899e3a6dbc8c2087e73cee6f5a7fbe
Author: Kaleb S. KEITHLEY <kkeithle>
Date:   Thu Aug 28 12:33:01 2014 -0400

    mount/fuse: Handle fd resolution failures

    Backport from http://review.gluster.org/#/c/8402, BZ 1126048

    Problem:
    Even when the fd resolution failed, the fop is continuing on the new
    graph which may not have valid inode. This lead to NULL layout subvols
    in dht which lead to crash in fsync after graph migration.

    Fix:
    - Remove resolution error handling in FUSE_FOP as it was only added
      to handle fd migration failures.
    - Check in fuse_resolve_done for fd resolution failures and fail the
      fop right away.
    - loc resolution failures are already handled in the corresponding
      fops.
    - Return errno from state->resolve.op_errno in resume functions.
    - Send error to fuse on frame allocation failures.
    - Removed unused variable state->resolved.
    - Removed unused macro FUSE_FOP_COOKIE.

    Change-Id: I3f3129b9b4fc34ff1fe07218e6924a027346a95e
    BUG: 1123289
    Signed-off-by: Kaleb S. KEITHLEY <kkeithle>
    Reviewed-on: http://review.gluster.org/8562
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Raghavendra G <rgowdapp>
```