Bug 772552

Summary: [glusterfs3.3.0qa19] dbench failed with Input/Output error
Product: [Community] GlusterFS Reporter: M S Vishwanath Bhat <vbhat>
Component: replicate Assignee: Pranith Kumar K <pkarampu>
Status: CLOSED DUPLICATE QA Contact:
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: pre-release CC: gluster-bugs, mzywusko
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2012-01-27 11:55:20 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description M S Vishwanath Bhat 2012-01-09 07:38:25 UTC
Description of problem:
Created a 2*2 striped-replicated volume. Did a fuse mount from one machine and an nfs mount from another machine. Created a folder structure on the mountpoint, then started running dbench from both clients in different directories. On the fuse mount, dbench failed with an Input/Output error.

Version-Release number of selected component (if applicable):
glusterfs-3.3.0qa19

How reproducible:
1/1

Steps to Reproduce:
1. Create a 2*2 Striped-Replicated volume.
2. Do a fuse mount on one machine and nfs mount on another machine.
3. Create some directories on the mountpoint and untar the linux kernel there.
4. Start dbench from both clients (in different directories)
  
Actual results:
On the fuse mount, dbench failed with an Input/Output error. On the nfs mount, dbench hung in the warm-up stage.

Expected results:
dbench should pass on both the clients.

Additional info:
I saw the following errors/warnings in the fuse client log.


[2012-01-06 06:34:32.182194] W [fuse-bridge.c:2249:fuse_readdir_cbk] 0-glusterfs-fuse: 2192896: READDIR => -1 (Transport endpoint is not connected)
[2012-01-06 06:34:32.186620] W [dict.c:458:dict_ref] (-->/usr/local/lib/glusterfs/3.3.0qa19/xlator/cluster/replicate.so(+0x5c3cd) [0x7fe8260843cd] (-->/usr/local/lib/glusterfs/3.3.0qa19/xlator/cluster/replicate.so(afr_lookup_done_success_action+0x11b) [0x7fe826084111] (-->/usr/local/lib/glusterfs/3.3.0qa19/xlator/cluster/replicate.so(afr_lookup_build_response_params+0x140) [0x7fe82608201c]))) 0-dict: dict is NULL
[2012-01-06 06:34:32.186699] W [dict.c:458:dict_ref] (-->/usr/local/lib/glusterfs/3.3.0qa19/xlator/cluster/replicate.so(afr_lookup_cbk+0xed) [0x7fe826084cb7] (-->/usr/local/lib/glusterfs/3.3.0qa19/xlator/cluster/replicate.so(+0x5c5bc) [0x7fe8260845bc] (-->/usr/local/lib/glusterfs/3.3.0qa19/xlator/cluster/stripe.so(stripe_lookup_cbk+0x349) [0x7fe825e05c92]))) 0-dict: dict is NULL
[2012-01-06 06:34:32.186733] W [dict.c:458:dict_ref] (-->/usr/local/lib/glusterfs/3.3.0qa19/xlator/cluster/replicate.so(afr_lookup_cbk+0xed) [0x7fe826084cb7] (-->/usr/local/lib/glusterfs/3.3.0qa19/xlator/cluster/replicate.so(+0x5c5bc) [0x7fe8260845bc] (-->/usr/local/lib/glusterfs/3.3.0qa19/xlator/cluster/stripe.so(stripe_lookup_cbk+0x3d3) [0x7fe825e05d1c]))) 0-dict: dict is NULL
[2012-01-06 06:34:32.188350] W [fuse-bridge.c:2249:fuse_readdir_cbk] 0-glusterfs-fuse: 2192918: READDIR => -1 (Transport endpoint is not connected)
[2012-01-06 06:34:32.193038] W [fuse-bridge.c:232:fuse_entry_cbk] 0-glusterfs-fuse: 2192989: LOOKUP() /untar/clients/client40/~dmtmp/SEED/LARGE.FIL returning inode 0
[2012-01-06 06:34:32.193122] W [inode.c:833:inode_lookup] (-->/usr/local/lib/glusterfs/3.3.0qa19/xlator/debug/io-stats.so(io_stats_lookup_cbk+0x23c) [0x7fe82517d822] (-->/usr/local/lib/glusterfs/3.3.0qa19/xlator/mount/fuse.so(+0x9423) [0x7fe8289e2423] (-->/usr/local/lib/glusterfs/3.3.0qa19/xlator/mount/fuse.so(+0x8ec5) [0x7fe8289e1ec5]))) 0-fuse: inode not found
[2012-01-06 06:34:32.195056] W [fuse-bridge.c:2249:fuse_readdir_cbk] 0-glusterfs-fuse: 2192945: READDIR => -1 (Transport endpoint is not connected)
[2012-01-06 06:34:32.196880] W [fuse-bridge.c:2249:fuse_readdir_cbk] 0-glusterfs-fuse: 2192947: READDIR => -1 (Transport endpoint is not connected)
[2012-01-06 06:34:32.198780] W [fuse-bridge.c:2249:fuse_readdir_cbk] 0-glusterfs-fuse: 2192961: READDIR => -1 (Transport endpoint is not connected)
[2012-01-06 06:34:32.201799] W [fuse-bridge.c:2249:fuse_readdir_cbk] 0-glusterfs-fuse: 2192982: READDIR => -1 (Transport endpoint is not connected)
[2012-01-06 06:34:32.213763] W [fuse-bridge.c:2249:fuse_readdir_cbk] 0-glusterfs-fuse: 2193006: READDIR => -1 (Transport endpoint is not connected)
[2012-01-06 06:34:32.224789] W [fuse-bridge.c:2249:fuse_readdir_cbk] 0-glusterfs-fuse: 2193044: READDIR => -1 (Transport endpoint is not connected)
[2012-01-06 06:34:32.228806] W [fuse-bridge.c:2249:fuse_readdir_cbk] 0-glusterfs-fuse: 2193051: READDIR => -1 (Transport endpoint is not connected)
[2012-01-06 06:34:32.231873] W [fuse-bridge.c:2249:fuse_readdir_cbk] 0-glusterfs-fuse: 2193062: READDIR => -1 (Transport endpoint is not connected)
[2012-01-06 06:34:32.245414] W [fuse-bridge.c:2249:fuse_readdir_cbk] 0-glusterfs-fuse: 2193173: READDIR => -1 (Transport endpoint is not connected)
[2012-01-06 06:34:32.246501] W [fuse-bridge.c:2249:fuse_readdir_cbk] 0-glusterfs-fuse: 2193176: READDIR => -1 (Transport endpoint is not connected)
[2012-01-06 06:34:32.249668] W [fuse-bridge.c:2249:fuse_readdir_cbk] 0-glusterfs-fuse: 2193182: READDIR => -1 (Transport endpoint is not connected)
[2012-01-06 06:34:32.251189] W [fuse-bridge.c:2249:fuse_readdir_cbk] 0-glusterfs-fuse: 2193187: READDIR => -1 (Transport endpoint is not connected)
[2012-01-06 06:34:32.264303] W [fuse-bridge.c:2249:fuse_readdir_cbk] 0-glusterfs-fuse: 2193258: READDIR => -1 (Transport endpoint is not connected)
[2012-01-06 06:34:32.265739] W [fuse-bridge.c:2249:fuse_readdir_cbk] 0-glusterfs-fuse: 2193265: READDIR => -1 (Transport endpoint is not connected)
[2012-01-06 06:34:32.268821] W [fuse-bridge.c:2249:fuse_readdir_cbk] 0-glusterfs-fuse: 2193268: READDIR => -1 (Transport endpoint is not connected)
[2012-01-06 06:34:32.271536] W [fuse-bridge.c:2249:fuse_readdir_cbk] 0-glusterfs-fuse: 2193288: READDIR => -1 (Transport endpoint is not connected)
[2012-01-06 06:34:32.274285] W [fuse-bridge.c:2249:fuse_readdir_cbk] 0-glusterfs-fuse: 2193291: READDIR => -1 (Transport endpoint is not connected)
[2012-01-06 06:34:32.287619] W [fuse-bridge.c:2249:fuse_readdir_cbk] 0-glusterfs-fuse: 2193350: READDIR => -1 (Transport endpoint is not connected)
[2012-01-06 06:34:32.290444] W [fuse-bridge.c:2249:fuse_readdir_cbk] 0-glusterfs-fuse: 2193357: READDIR => -1 (Transport endpoint is not connected)
[2012-01-06 06:34:32.302157] W [fuse-bridge.c:2249:fuse_readdir_cbk] 0-glusterfs-fuse: 2193420: READDIR => -1 (Transport endpoint is not connected)
[2012-01-06 06:34:32.327122] W [fuse-bridge.c:2249:fuse_readdir_cbk] 0-glusterfs-fuse: 2193597: READDIR => -1 (Transport endpoint is not connected)
[2012-01-06 06:34:32.347827] W [fuse-bridge.c:2249:fuse_readdir_cbk] 0-glusterfs-fuse: 2193672: READDIR => -1 (Transport endpoint is not connected)
[2012-01-06 06:34:32.354542] W [fuse-bridge.c:2249:fuse_readdir_cbk] 0-glusterfs-fuse: 2193720: READDIR => -1 (Transport endpoint is not connected)
[2012-01-06 06:34:32.372744] W [fuse-bridge.c:2249:fuse_readdir_cbk] 0-glusterfs-fuse: 2193808: READDIR => -1 (Transport endpoint is not connected)
[2012-01-06 06:34:32.414551] W [fuse-bridge.c:2249:fuse_readdir_cbk] 0-glusterfs-fuse: 2193987: READDIR => -1 (Transport endpoint is not connected)
[2012-01-06 06:34:32.419434] W [fuse-bridge.c:2249:fuse_readdir_cbk] 0-glusterfs-fuse: 2194006: READDIR => -1 (Transport endpoint is not connected)
[2012-01-06 06:34:32.424443] W [fuse-bridge.c:2249:fuse_readdir_cbk] 0-glusterfs-fuse: 2194027: READDIR => -1 (Transport endpoint is not connected)
[2012-01-06 06:34:32.438201] W [fuse-bridge.c:2249:fuse_readdir_cbk] 0-glusterfs-fuse: 2194071: READDIR => -1 (Transport endpoint is not connected)
[2012-01-06 06:34:32.441463] W [fuse-bridge.c:2249:fuse_readdir_cbk] 0-glusterfs-fuse: 2194073: READDIR => -1 (Transport endpoint is not connected)
[2012-01-06 06:34:32.493393] W [fuse-bridge.c:2249:fuse_readdir_cbk] 0-glusterfs-fuse: 2194296: READDIR => -1 (Transport endpoint is not connected)
[2012-01-06 06:34:32.496733] W [fuse-bridge.c:2249:fuse_readdir_cbk] 0-glusterfs-fuse: 2194313: READDIR => -1 (Transport endpoint is not connected)
[2012-01-06 06:34:32.513492] W [fuse-bridge.c:2249:fuse_readdir_cbk] 0-glusterfs-fuse: 2194404: READDIR => -1 (Transport endpoint is not connected)
[2012-01-06 06:56:53.755587] E [rpc-clnt.c:216:call_bail] 0-hosdu-client-0: bailing out frame type(GF-DUMP) op(DUMP(1)) xid = 0x1664952x sent = 2012-01-06 06:26:52.839342. timeout = 1800
[2012-01-06 06:56:53.755770] W [client-handshake.c:1268:client_dump_version_cbk] 0-hosdu-client-0: received RPC status error
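
Since most of the excerpt above is the same READDIR warning repeated, a quick tally per log level and function can make the distinct messages easier to spot. The following is only a minimal sketch: the regular expression is an assumption based on the line layout visible in this excerpt (`[timestamp] LEVEL [file:line:function] message`), and the helper name `summarize` is hypothetical and not part of GlusterFS.

```python
import re
from collections import Counter

# Assumed layout, taken from the lines pasted above:
# [timestamp] LEVEL [source.c:line:function] message
LOG_LINE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] (?P<level>[A-Z]) "
    r"\[(?P<file>[^:\]]+):(?P<line>\d+):(?P<func>[^\]]+)\] (?P<msg>.*)$"
)


def summarize(lines):
    """Count entries per (level, function); lines that do not match are skipped."""
    counts = Counter()
    for line in lines:
        m = LOG_LINE.match(line.strip())
        if m:
            counts[(m.group("level"), m.group("func"))] += 1
    return counts


# Three lines copied from the fuse client log in this report.
SAMPLE = [
    "[2012-01-06 06:34:32.182194] W [fuse-bridge.c:2249:fuse_readdir_cbk] "
    "0-glusterfs-fuse: 2192896: READDIR => -1 (Transport endpoint is not connected)",
    "[2012-01-06 06:34:32.188350] W [fuse-bridge.c:2249:fuse_readdir_cbk] "
    "0-glusterfs-fuse: 2192918: READDIR => -1 (Transport endpoint is not connected)",
    "[2012-01-06 06:56:53.755587] E [rpc-clnt.c:216:call_bail] 0-hosdu-client-0: "
    "bailing out frame type(GF-DUMP) op(DUMP(1)) xid = 0x1664952x "
    "sent = 2012-01-06 06:26:52.839342. timeout = 1800",
]

if __name__ == "__main__":
    for (level, func), n in summarize(SAMPLE).most_common():
        print(level, func, n)
```

To run it against the attached client log, pass an open file object to `summarize` instead of `SAMPLE`; the log's location depends on the installation and is not stated in this report.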


The following are entries from the nfs log.

[2012-01-06 04:44:29.698005] I [client-handshake.c:1085:select_server_supported_programs] 0-hosdu-client-1: Using Program GlusterFS 3.3.0qa19, Num (1298437), Version (310)
[2012-01-06 04:44:29.698354] I [client-handshake.c:917:client_setvolume_cbk] 0-hosdu-client-1: Connected to 10.1.11.114:24010, attached to remote volume '/data/brick'.
[2012-01-06 04:44:29.698379] I [afr-common.c:3458:afr_notify] 0-hosdu-replicate-0: subvol 1 came up, start crawl
[2012-01-06 05:19:25.385782] W [rpcsvc.c:173:rpcsvc_program_actor] 0-rpc-service: RPC program not available (req 100227 3)
[2012-01-06 05:19:25.393484] W [rpcsvc.c:1093:rpcsvc_error_reply] (-->/usr/local/lib/libgfrpc.so.0(rpc_transport_notify+0x130) [0x7f6c8dd3bd04] (-->/usr/local/lib/libgfrpc.so.0(rpcsvc_notify+0x181) [0x7f6c8dd36432] (-->/usr/local/lib/libgfrpc.so.0(rpcsvc_handle_rpc_call+0x35f) [0x7f6c8dd360aa]))) 0-: sending a RPC error reply
[2012-01-06 05:20:34.485377] I [afr-common.c:1132:afr_detect_self_heal_by_iatt] 0-hosdu-replicate-0: permissions differ for /linux-3.0.2/Documentation/hwmon/twl4030-madc-hwmon 
[2012-01-06 05:20:43.553784] I [afr-common.c:1132:afr_detect_self_heal_by_iatt] 0-hosdu-replicate-0: permissions differ for /linux-3.0.2/Documentation/networking/gen_stats.txt 
[2012-01-06 05:21:02.403088] I [afr-common.c:1132:afr_detect_self_heal_by_iatt] 0-hosdu-replicate-0: permissions differ for /linux-3.0.2/arch/alpha/include/asm/machvec.h 
[2012-01-06 05:21:10.091967] I [afr-common.c:1132:afr_detect_self_heal_by_iatt] 0-hosdu-replicate-0: permissions differ for /linux-3.0.2/arch/arm/boot/compressed/mmcif-sh7372.c 
[2012-01-06 05:21:21.246462] I [afr-common.c:1132:afr_detect_self_heal_by_iatt] 0-hosdu-replicate-0: permissions differ for /linux-3.0.2/arch/arm/include/asm/tls.h 
[2012-01-06 05:21:32.444285] I [afr-common.c:1132:afr_detect_self_heal_by_iatt] 0-hosdu-replicate-0: permissions differ for /linux-3.0.2/arch/arm/mach-davinci/include/mach/gpio.h 
[2012-01-06 05:21:49.955338] I [afr-common.c:1132:afr_detect_self_heal_by_iatt] 0-hosdu-replicate-0: permissions differ for /linux-3.0.2/arch/arm/mach-omap2/board-omap3logic.c 
[2012-01-06 05:21:56.732124] I [afr-common.c:1132:afr_detect_self_heal_by_iatt] 0-hosdu-replicate-0: permissions differ for /linux-3.0.2/arch/arm/mach-orion5x/include/mach/memory.h 
[2012-01-06 05:21:57.934053] I [afr-common.c:1132:afr_detect_self_heal_by_iatt] 0-hosdu-replicate-0: permissions differ for /linux-3.0.2/arch/arm/mach-pxa/cm-x2xx-pci.h 
[2012-01-06 05:21:58.346962] I [afr-common.c:1132:afr_detect_self_heal_by_iatt] 0-hosdu-replicate-0: permissions differ for /linux-3.0.2/arch/arm/mach-pxa/icontrol.c 
[2012-01-06 05:22:11.355289] I [afr-common.c:1132:afr_detect_self_heal_by_iatt] 0-hosdu-replicate-0: permissions differ for /linux-3.0.2/arch/arm/mach-shmobile/include/mach/io.h 
[2012-01-06 05:22:22.227220] I [afr-common.c:1132:afr_detect_self_heal_by_iatt] 0-hosdu-replicate-0: permissions differ for /linux-3.0.2/arch/arm/mach-w90x900/nuc910.h 
[2012-01-06 05:22:23.854918] I [afr-common.c:1132:afr_detect_self_heal_by_iatt] 0-hosdu-replicate-0: permissions differ for /linux-3.0.2/arch/arm/nwfpe/fpmodule.c 
[2012-01-06 05:23:01.942715] I [afr-common.c:1132:afr_detect_self_heal_by_iatt] 0-hosdu-replicate-0: permissions differ for /linux-3.0.2/arch/h8300/mm/Makefile 
[2012-01-06 05:23:18.023476] I [afr-common.c:1132:afr_detect_self_heal_by_iatt] 0-hosdu-replicate-0: permissions differ for /linux-3.0.2/arch/m32r/include/asm/timex.h 

I have attached the fuse client log where dbench failed and archived the other logs.

Comment 1 Pranith Kumar K 2012-01-27 11:55:20 UTC

*** This bug has been marked as a duplicate of bug 773225 ***