My configuration: glusterfs-volgen --name vboxstore1 --raid 1 vboxserver1:/space/glusterfs vboxserver2:/space/glusterfs, running 3.0.4 (I previously ran 3.0.5 and saw the same problems).

Scenario: the disk holding vboxserver2:/space crashed and was replaced with a new, blank filesystem. Self-heal was then attempted by running "find" on a client.

Suddenly, the client repeatedly says:

[2010-08-07 10:17:14] D [afr-self-heal-algorithm.c:152:sh_full_write_cbk] mirror-0: write to /oldcurie/virtualbox/bobtest/disk.vdi failed on subvolume vboxserver2-1 (Invalid argument)

On vboxserver2, at the same time, we get:

[2010-08-07 10:17:14] E [posix.c:2608:posix_writev] posix1: lseek(-2147483648) on fd=0x1047260 failed: Invalid argument
[2010-08-07 10:17:14] D [server-protocol.c:1939:server_writev_cbk] server-tcp: 617605: WRITEV 4 (5423176) ==> -1 (Invalid argument)
[2010-08-07 10:17:14] E [posix.c:2608:posix_writev] posix1: lseek(-2147418112) on fd=0x1047260 failed: Invalid argument
[2010-08-07 10:17:14] D [server-protocol.c:1939:server_writev_cbk] server-tcp: 617611: WRITEV 4 (5423176) ==> -1 (Invalid argument)
[2010-08-07 10:17:14] E [posix.c:2608:posix_writev] posix1: lseek(-2147352576) on fd=0x1047260 failed: Invalid argument
[2010-08-07 10:17:14] D [server-protocol.c:1939:server_writev_cbk] server-tcp: 617618: WRITEV 4 (5423176) ==> -1 (Invalid argument)
[2010-08-07 10:17:14] E [posix.c:2608:posix_writev] posix1: lseek(-2147287040) on fd=0x1047260 failed: Invalid argument
...

Looking at the filesystem where the file was still intact:

vboxserver1:/space/glusterfs/oldcurie/virtualbox/bobtest# ls -ld disk.vdi; du -cs disk.vdi
-rw------- 1 root root 4284522496 Jul 30 19:13 disk.vdi
4188200 disk.vdi
4188200 total
vboxserver1:/space/glusterfs/oldcurie/virtualbox/bobtest# getfattr -dm "" -e hex . disk.vdi
# file: .
trusted.afr.vboxserver1-1=0x000000000000000000000000
trusted.afr.vboxserver2-1=0x000000000000000000000000
trusted.posix1.gen=0x4c547d7c00000007

# file: disk.vdi
trusted.afr.vboxserver1-1=0x000000000000000000000000
trusted.afr.vboxserver2-1=0x000000020000000000000000
trusted.posix1.gen=0x4c547d7c0000007f

And on the replaced disk:

vboxserver2:/space/glusterfs/oldcurie/virtualbox/bobtest# ls -ld disk.vdi; du -cs disk.vdi
-rw------- 1 root root 2147483648 Jul 30 19:13 disk.vdi
2099208 disk.vdi
2099208 total

(note: 2147483648 = 2^31, the "large file" limit)

vboxserver2:/space/glusterfs/oldcurie/virtualbox/bobtest# getfattr -dm "" -e hex . disk.vdi
# file: .
trusted.afr.vboxserver1-1=0x000000000000000000000000
trusted.afr.vboxserver2-1=0x000000000000000000000000
trusted.posix1.gen=0x4c5c35c10000013c

# file: disk.vdi
trusted.afr.vboxserver1-1=0x000000000000000000000000
trusted.afr.vboxserver2-1=0x000000000000000000000000
trusted.posix1.gen=0x4c5c78030000007b

Some other files are clipped at 2G in the same way. For some other large (>2G) files, however, self-healing is successful.

I am on a Debian Lenny system, using the packages from www.backports.org. I saw the errors on 3.0.5 when both servers were still running a 32-bit OS; changing the second server to a 64-bit OS or downgrading to 3.0.4 apparently gives the same results. This bug report is against 3.0.4, with vboxserver1 running a 32-bit OS and vboxserver2 running a 64-bit OS.
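
The negative lseek offsets in the server log look like what you would get if a 64-bit file offset were squeezed into a signed 32-bit integer somewhere along the write path. That is only an inference from the numbers, not a confirmed diagnosis. A minimal Python sketch of the arithmetic follows; the helper name wrap_to_int32 and the 64 KiB step are illustrative assumptions, the step being simply the spacing between consecutive offsets in the log:

```python
import struct

def wrap_to_int32(offset: int) -> int:
    """Reinterpret the low 32 bits of `offset` as a signed 32-bit integer."""
    return struct.unpack("<i", struct.pack("<I", offset & 0xFFFFFFFF))[0]

TWO_GIB = 2**31      # 2147483648, the size of the clipped copy on vboxserver2
STEP = 64 * 1024     # 65536, the spacing between offsets in the log (assumed write size)

# The first four write offsets past the 2 GiB mark, as a 64-bit client would send them.
offsets = [TWO_GIB + i * STEP for i in range(4)]

# What those offsets become if truncated to a signed 32-bit value.
wrapped = [wrap_to_int32(o) for o in offsets]

for real, seen in zip(offsets, wrapped):
    print(f"{real} -> lseek({seen})")
```

The four wrapped values are -2147483648, -2147418112, -2147352576 and -2147287040, the same arguments that posix_writev reports, which would also explain why the copy on vboxserver2 stops growing at exactly 2147483648 bytes.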
The errors do not seem to appear when all performance translators are disabled.

Server (I suspect it's the most likely culprit):

#volume brick1
#  type performance/io-threads
#  option thread-count 8
#  subvolumes locks1
#end-volume

Client:

#volume readahead
#  type performance/read-ahead
#  option page-count 4
#  subvolumes mirror-0
#end-volume
#
#volume iocache
#  type performance/io-cache
#  option cache-size `echo $(( $(grep 'MemTotal' /proc/meminfo | sed 's/[^0-9]//g') / 5120 ))`MB
#  option cache-timeout 1
#  subvolumes readahead
#end-volume
#
#volume quickread
#  type performance/quick-read
#  option cache-timeout 1
#  option max-file-size 64kB
#  subvolumes iocache
#end-volume
#
#volume writebehind
#  type performance/write-behind
#  option cache-size 4MB
#  subvolumes quickread
#end-volume
#
#volume statprefetch
#  type performance/stat-prefetch
#  subvolumes writebehind
#end-volume
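
For reference, the backtick expression in the io-cache cache-size option takes MemTotal from /proc/meminfo (reported in kB) and divides it by 5120, which works out to roughly one fifth of RAM expressed in MB. A small Python sketch of the same calculation is below; the function name iocache_size_mb and the sample meminfo text are made up for illustration and are not taken from either server:

```python
def iocache_size_mb(meminfo_text: str) -> int:
    """Mimic the shell expression: strip non-digits from the MemTotal line, divide by 5120."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal"):
            kb = int("".join(ch for ch in line if ch.isdigit()))
            return kb // 5120  # kB / 1024 gives MB; / 5 takes one fifth of it
    raise ValueError("no MemTotal line found")

# Hypothetical machine with 4 GiB of RAM.
sample = "MemTotal:        4194304 kB\nMemFree:          123456 kB\n"
size = iocache_size_mb(sample)
print(f"option cache-size {size}MB")
```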
This is fixed in both release 3.0.5 and the master (3.1) branch. Please upgrade.