Bug 859406

Summary: [RHEV-RHS] readv failures from posix-aio during rebalance
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Anush Shetty <ashetty>
Component: glusterd
Assignee: Ric Wheeler <rwheeler>
Status: CLOSED CURRENTRELEASE
QA Contact: shylesh <shmohan>
Severity: unspecified
Priority: high
Version: 2.0
CC: aavati, grajaiya, iheim, nsathyan, rhs-bugs, rwheeler, vbellur
Hardware: Unspecified
OS: Unspecified
Fixed In Version: glusterfs-3.3.0rhsvirt1-7.el6rhs
Doc Type: Bug Fix
Last Closed: 2015-08-10 07:47:31 UTC
Type: Bug

Description Anush Shetty 2012-09-21 13:34:51 UTC
Description of problem: We see posix-aio readv failures (EINVAL) while running rebalance tests.



Version-Release number of selected component (if applicable):

# rpm -qa | grep glusterfs
glusterfs-server-3.3.0rhsvirt1-5.el6rhs.x86_64
org.apache.hadoop.fs.glusterfs-glusterfs-0.20.2_0.2-1.noarch
glusterfs-fuse-3.3.0rhsvirt1-5.el6rhs.x86_64
glusterfs-rdma-3.3.0rhsvirt1-5.el6rhs.x86_64
glusterfs-3.3.0rhsvirt1-5.el6rhs.x86_64
glusterfs-geo-replication-3.3.0rhsvirt1-5.el6rhs.x86_64 


How reproducible: Intermittent


Steps to Reproduce:
1. Create a 2x2 distributed-replicate volume.
2. Add 2 bricks and run rebalance (a reproduction sketch follows these steps).
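
A reproduction along these lines (hostnames and brick paths here are placeholders, not the ones from the test setup):

# gluster volume create distrep2 replica 2 server1:/bricks/b1 server2:/bricks/b1 server1:/bricks/b2 server2:/bricks/b2
# gluster volume start distrep2
# mount -t glusterfs server1:/distrep2 /mnt/glusterfs
# <run I/O on the mount, e.g. untar and build glusterfs.git as in the logs>
# gluster volume add-brick distrep2 server1:/bricks/b3 server2:/bricks/b3
# gluster volume rebalance distrep2 start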
  
Actual results:

EINVAL errors due to readv failures


Expected results:

There should be no errors

Additional info:

Note: the glusterfs version on one of the hypervisors was glusterfs.x86_64 0:3.3.0rhsvirt1-2.el6_2 (an older build), but filing the bug nonetheless.


Server log:
[2012-09-21 18:29:39.229213] I [server-helpers.c:474:do_fd_cleanup] 0-distrep2-server: fd cleanup on /run10678/system_light/glusterfs.git/scheduler/alu
[2012-09-21 18:29:39.229232] I [server-helpers.c:474:do_fd_cleanup] 0-distrep2-server: fd cleanup on /run10678/system_light/glusterfs.git/libglusterfs/src/y.tab.c
[2012-09-21 18:29:39.229257] I [server-helpers.c:474:do_fd_cleanup] 0-distrep2-server: fd cleanup on /run10678/system_light/glusterfs.git/xlators/performance/symlink-cache/src
[2012-09-21 18:29:39.229284] I [server-helpers.c:629:server_connection_destroy] 0-distrep2-server: destroyed connection of rhs-gp-srv12.lab.eng.blr.redhat.com-29007-2012/09/21-18:14:16:056992-distrep2-client-3-0
[2012-09-21 18:29:39.239330] I [server-handshake.c:571:server_setvolume] 0-distrep2-server: accepted client from rhs-gp-srv12.lab.eng.blr.redhat.com-29007-2012/09/21-18:14:16:056992-distrep2-client-3-0 (version: 3.3.0rhsvirt1)
[2012-09-21 18:29:39.255455] W [posix.c:118:posix_lookup] (-->/usr/lib64/glusterfs/3.3.0rhsvirt1/xlator/performance/io-threads.so(iot_lookup_wrapper+0x113) [0x7fca7a4e13a3] (-->/usr/lib64/glusterfs/3.3.0rhsvirt1/xlator/features/locks.so(pl_lookup+0x1e4) [0x7fca7a6f3b04] (-->/usr/lib64/glusterfs/3.3.0rhsvirt1/xlator/features/access-control.so(posix_acl_lookup+0x1b1) [0x7fca7a90b6d1]))) 0-distrep2-posix: invalid argument: loc->path
[2012-09-21 18:29:39.255497] W [server-resolve.c:130:resolve_gfid_cbk] 0-distrep2-server: 00000000-0000-0000-0000-000000000000: failed to resolve (Success)
[2012-09-21 18:29:39.284917] E [posix-aio.c:79:posix_aio_readv_complete] 0-distrep2-posix: readv(async) failed fd=82,size=10593,offset=0 (-22/Invalid argument)
[2012-09-21 18:29:39.284985] I [server3_1-fops.c:1455:server_readv_cbk] 0-distrep2-server: 3807: READV 29 (454688b6-0142-4042-8e62-eae5060d3cf4) ==> -1 (Invalid argument)
[2012-09-21 18:29:39.285012] W [socket.c:195:__socket_rwv] 0-tcp.distrep2-server: writev failed (Bad address)
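
The -22 (EINVAL) from posix_aio_readv_complete is what Linux AIO returns on an O_DIRECT fd when the request size, offset, or buffer is not sector-aligned (note size=10593 above, not a multiple of 512). A minimal standalone demo of that kernel behaviour, assuming a 512-byte sector size (not from the glusterfs sources; build with gcc -laio):

#define _GNU_SOURCE
#include <libaio.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    io_context_t ctx = 0;
    struct iocb cb, *cbs[1] = { &cb };
    struct io_event ev;
    void *buf;

    /* O_DIRECT requires sector alignment of buffer, offset and size */
    int fd = open("/tmp/aio-demo.dat", O_RDWR | O_CREAT | O_DIRECT, 0644);
    if (fd < 0 || posix_memalign(&buf, 512, 16384))
        return 1;
    if (io_setup(1, &ctx) < 0)
        return 1;

    /* size 10593 is not a multiple of 512, like the readv in the log */
    io_prep_pread(&cb, fd, buf, 10593, 0);

    int ret = io_submit(ctx, 1, cbs);
    if (ret < 0) {
        /* some kernels reject the misaligned iocb at submit time */
        fprintf(stderr, "io_submit: %d\n", ret);
    } else {
        /* others report it in the completion event: res = -22 */
        io_getevents(ctx, 1, 1, &ev, NULL);
        printf("res = %ld\n", (long) ev.res);
    }

    io_destroy(ctx);
    close(fd);
    return 0;
}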

Comment 2 Amar Tumballi 2012-09-26 06:14:30 UTC
Patch http://review.gluster.org/3997 fixes the initial problem of lookup returning EINVAL; that makes the gfid resolution work correctly.

Comment 3 Amar Tumballi 2012-09-28 07:30:03 UTC
[root@rhs-arch-srv4 glusterfs]# dd if=/dev/zero of=/mnt/glusterfs/abcd oflag=direct bs=4k count=10
10+0 records in
10+0 records out
40960 bytes (41 kB) copied, 0.00637862 s, 6.4 MB/s
[root@rhs-arch-srv4 glusterfs]# dd if=/dev/zero of=/mnt/glusterfs/abcd1 bs=4k count=10
10+0 records in
10+0 records out
40960 bytes (41 kB) copied, 0.00794557 s, 5.2 MB/s
[root@rhs-arch-srv4 glusterfs]# dd if=/dev/zero of=/mnt/glusterfs/abcd2 bs=300 count=300
dd: writing `/mnt/glusterfs/abcd2': Invalid argument
1+0 records in
0+0 records out
0 bytes (0 B) copied, 0.00156413 s, 0.0 kB/s
======================

Noticed that if the block size is a multiple of 512 bytes, the 'linux-aio' option works. If it is not, we hit the failures described in the bug description.

[2012-09-28 00:22:44.111100] E [posix-aio.c:236:posix_aio_writev_complete] 0-test-vol-posix: writev(async) failed fd=17,offset=0 (-22/Invalid argument)
[2012-09-28 00:22:44.111156] I [server-rpc-fops.c:1421:server_writev_cbk] 0-test-vol-server: 3119: WRITEV 0 (1b66d283-3eb8-4392-b701-4b0a9e9c5fd3) ==> (Invalid argument)


I think we should fall back to the regular read/write path when the I/O size is not 512-byte aligned (a sketch of that idea follows).
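
A minimal sketch of that fallback, using a synchronous libaio wrapper for illustration (function names here are hypothetical; this is not the code from the posted patch):

#include <libaio.h>
#include <sys/types.h>
#include <unistd.h>

#define SECTOR 512

/* Synchronous AIO read: submit one iocb and wait for its completion. */
static ssize_t aio_pread_sync(int fd, void *buf, size_t size, off_t off)
{
    io_context_t ctx = 0;
    struct iocb cb, *cbs[1] = { &cb };
    struct io_event ev;

    if (io_setup(1, &ctx) < 0)
        return -1;
    io_prep_pread(&cb, fd, buf, size, off);
    if (io_submit(ctx, 1, cbs) != 1) {
        io_destroy(ctx);
        return -1;
    }
    io_getevents(ctx, 1, 1, &ev, NULL);
    io_destroy(ctx);
    return (ssize_t) ev.res;   /* negative errno on failure */
}

/* Take the AIO path only when offset and size meet the 512-byte
 * alignment O_DIRECT expects; otherwise fall back to plain pread().
 * (Buffer alignment matters too; omitted here for brevity.) */
ssize_t posix_pread_maybe_aio(int fd, void *buf, size_t size, off_t off)
{
    if ((off % SECTOR) || (size % SECTOR))
        return pread(fd, buf, size, off);
    return aio_pread_sync(fd, buf, size, off);
}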

Comment 4 Amar Tumballi 2012-10-01 16:18:34 UTC
http://review.gluster.org/4006 posted

Comment 5 Amar Tumballi 2012-10-08 07:14:43 UTC
Merged upstream; will send a backport to downstream.

Comment 6 Amar Tumballi 2012-10-08 10:24:44 UTC
https://code.engineering.redhat.com/gerrit/#/c/61/

Comment 7 Anush Shetty 2012-11-09 08:43:11 UTC
Verified on "Beta - RHS 2.0 with virt support" (RHS-2.0-20121031.0-RHS-x86_64-DVD1.iso) with build 1-8.