Description of problem:
========================
On a 2 x 3 distributed-replicate volume, fs_sanity was run from 2 fuse mounts. After some time the Bonnie test (part of fs_sanity) failed on both mount points. The client log filled with warning messages, growing to 12 GB and in turn filling the root file system:

[2013-12-30 10:11:50.538894] W [fuse-bridge.c:2618:fuse_readv_cbk] 0-glusterfs-fuse: 57274844: READ => -1 (Input/output error)
[2013-12-30 10:11:50.538937] W [page.c:991:__ioc_page_error] 0-vol-io-cache: page error for page = 0x7ff5705d8150 & waitq = 0x7ff571214620

Failure on Mount1:
-----------------
executing bonnie
Using uid:0, gid:0.
Writing a byte at a time...done
Writing intelligently...done
Rewriting...done
Reading a byte at a time...done
Reading intelligently...done
start 'em...Can't write block.: Transport endpoint is not connected
Can't sync file.
Can't write block.: Transport endpoint is not connected
Can't sync file.
Can't write block.: Transport endpoint is not connected
Can't sync file.
Can't write block.: Transport endpoint is not connected
Can't sync file.

Failure on Mount2:
------------------
executing bonnie
Using uid:0, gid:0.
Writing a byte at a time...done
Writing intelligently...done
Rewriting...done
Reading a byte at a time...done
Reading intelligently...done
start 'em...Bonnie: drastic I/O error (re-write read): Transport endpoint is not connected
Can't read a full block, only got 4096 bytes.
Can't read a full block, only got 4096 bytes.
Can't read a full block, only got 4096 bytes.
Can't read a full block, only got 4096 bytes.

Version-Release number of selected component (if applicable):
============================================================
glusterfs 3.4.0.52rhs built on Dec 19 2013 12:20:16

How reproducible:

Steps to Reproduce:
=====================
1. Create a 2 x 3 distributed-replicate volume. Enable the volume option "linux-aio". Start the volume.
2. From the client node, create 2 fuse mounts.
3. Run fs_sanity on both mounts (a command sketch follows the "Actual results" section):
   a. Create an nfs mount of /opt from rhsqe-repo.lab.eng.blr.redhat.com
      ( rhsqe-repo.lab.eng.blr.redhat.com:/opt on /opt type nfs (rw,vers=4,addr=10.70.34.52,clientaddr=10.70.34.93) )
   b. Execute the command to run fs_sanity:
      "cd /opt/qa/tools/system_light ; ./run.sh -w /mnt/vol/ -l /root/fs_sanity_log.log"

Actual results:
===============
Warning messages fill the client log, in turn filling the root file system.
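For quick reference, a minimal sketch of the reproduction commands follows. It reuses the volume name, host names, and brick paths shown under "Additional info"; the second mount point (/mnt/vol2) and second log file name are assumptions, since the report only quotes the run.sh command for one mount.

# 1. Create the 2 x 3 distributed-replicate volume, enable linux-aio, start it.
gluster volume create vol replica 3 \
    dj:/rhs/brick1/b1 fan:/rhs/brick1/b1-rep1 mia:/rhs/brick1/b1-rep2 \
    dj:/rhs/brick1/b2 fan:/rhs/brick1/b2-rep1 mia:/rhs/brick1/b2-rep2
gluster volume set vol storage.linux-aio enable
gluster volume start vol

# 2. On the client node, create two fuse mounts of the same volume
#    (the second mount point is an assumption).
mount -t glusterfs fan:/vol /mnt/vol
mount -t glusterfs fan:/vol /mnt/vol2

# 3. Mount the test-tools repo over nfs and run fs_sanity against each fuse mount.
mount -t nfs rhsqe-repo.lab.eng.blr.redhat.com:/opt /opt
cd /opt/qa/tools/system_light
./run.sh -w /mnt/vol/  -l /root/fs_sanity_log.log
./run.sh -w /mnt/vol2/ -l /root/fs_sanity_log2.log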
Additional info:
===================
[root@fan ~]# gluster v info

Volume Name: vol
Type: Distributed-Replicate
Volume ID: 6136c5d9-9b40-4503-aa4a-11ff3da44e88
Status: Started
Number of Bricks: 2 x 3 = 6
Transport-type: tcp
Bricks:
Brick1: dj:/rhs/brick1/b1
Brick2: fan:/rhs/brick1/b1-rep1
Brick3: mia:/rhs/brick1/b1-rep2
Brick4: dj:/rhs/brick1/b2
Brick5: fan:/rhs/brick1/b2-rep1
Brick6: mia:/rhs/brick1/b2-rep2
Options Reconfigured:
storage.linux-aio: enable
[root@fan ~]#

[root@fan ~]# gluster v status
Status of volume: vol
Gluster process                                  Port    Online  Pid
------------------------------------------------------------------------------
Brick dj:/rhs/brick1/b1                          49158   Y       16632
Brick fan:/rhs/brick1/b1-rep1                    49156   Y       3313
Brick mia:/rhs/brick1/b1-rep2                    49158   Y       10439
Brick dj:/rhs/brick1/b2                          49159   Y       16644
Brick fan:/rhs/brick1/b2-rep1                    49157   Y       3325
Brick mia:/rhs/brick1/b2-rep2                    49159   Y       10451
NFS Server on localhost                          2049    Y       3338
Self-heal Daemon on localhost                    N/A     Y       3345
NFS Server on dj                                 2049    Y       16657
Self-heal Daemon on dj                           N/A     Y       16663
NFS Server on mia                                2049    Y       10466
Self-heal Daemon on mia                          N/A     Y       10470

Task Status of Volume vol
------------------------------------------------------------------------------
There are no active volume tasks

[root@fan ~]#
SOS Reports: http://rhsqe-repo.lab.eng.blr.redhat.com/bugs_necessary_info/1047449/
https://code.engineering.redhat.com/gerrit/#/c/17981/
Verified the fix on the build "glusterfs 3.4.0.54rhs built on Jan 5 2014 06:26:17". The bug is fixed. Moving the bug to VERIFIED.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHEA-2014-0208.html