Bug 762785 (GLUSTER-1053)

Summary: Shared files occasionally unreadable from some nodes
Product: [Community] GlusterFS
Reporter: Lakshmipathi G <lakshmipathi>
Component: io-cache
Assignee: Raghavendra G <raghavendra>
Status: CLOSED CURRENTRELEASE
Severity: medium
Priority: low
Version: mainline
CC: gluster-bugs, tejas
Hardware: All
OS: Linux
Attachments:
  user modified vol file (flags: none)
  test script (flags: none)
  test script (flags: none)

Description Lakshmipathi G 2010-07-06 06:55:24 UTC
Created attachment 248 [details]
user modified vol file

user modified vol file to overcome his issue

Comment 1 Lakshmipathi G 2010-07-06 09:52:08 UTC
Reported on the mailing list:

Hello all,

I am new to Gluster and I've been seeing some inconsistent behavior. When I
write files to the Gluster volume, about 1 in 1000 will be unreadable on one
node. From that node I can see the file with ls, and ls does report the
correct size. However, running cat on the file produces no output, and vim
thinks that it is full of the ^@ character. If I try to read the file from
another node it is fine.

After some Googling I've read that an ls -lR can fix similar problems, but it
hasn't had any effect for me. Running touch on the file does restore its
contents. I am running GlusterFS 3.0.4 on RHEL 5.4. I generated the config
files with the volgen tool and didn't make any changes.

Is this a known issue or something that could've happened if I screwed up
the configuration?
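
(A quick way to flag files showing this symptom is to compare the size that
stat() reports against what a read actually returns. The following is a
hypothetical Ruby sketch, not something from the original report; the
check_files.rb name and the NUL-byte test are assumptions based on the
behavior described above.)

#!/usr/bin/env ruby
# check_files.rb (sketch): flag files whose stat size is non-zero but whose
# contents read back empty or as nothing but NUL bytes (the ^@ seen in vim).
dir = ARGV[0] || "."
Dir.glob(File.join(dir, "*")).each do |path|
  next unless File.file?(path)
  stat_size = File.size(path)
  data = File.binread(path)
  if stat_size > 0 && (data.empty? || data.each_byte.all? { |b| b == 0 })
    puts "suspect: #{path} (stat says #{stat_size} bytes, read #{data.bytesize})"
  end
end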

Here is my glusterfs.vol
## file auto generated by /usr/bin/glusterfs-volgen (mount.vol)
# Cmd line:
# $ /usr/bin/glusterfs-volgen -n warehouse --raid 1 gluster1:/export/warehouse gluster2:/export/warehouse gluster3:/export/warehouse gluster4:/export/warehouse

# RAID 1
# TRANSPORT-TYPE tcp
volume gluster4-1
    type protocol/client
    option transport-type tcp
    option remote-host gluster4
    option transport.socket.nodelay on
    option transport.remote-port 6996
    option remote-subvolume brick1
end-volume

volume gluster2-1
    type protocol/client
    option transport-type tcp
    option remote-host gluster2
    option transport.socket.nodelay on
    option transport.remote-port 6996
    option remote-subvolume brick1
end-volume

volume gluster3-1
    type protocol/client
    option transport-type tcp
    option remote-host gluster3
    option transport.socket.nodelay on
    option transport.remote-port 6996
    option remote-subvolume brick1
end-volume

volume gluster1-1
    type protocol/client
    option transport-type tcp
    option remote-host gluster1
    option transport.socket.nodelay on
    option transport.remote-port 6996
    option remote-subvolume brick1
end-volume

volume mirror-0
    type cluster/replicate
    subvolumes gluster1-1 gluster2-1
end-volume

volume mirror-1
    type cluster/replicate
    subvolumes gluster3-1 gluster4-1
end-volume

volume distribute
    type cluster/distribute
    subvolumes mirror-0 mirror-1
end-volume

volume readahead
    type performance/read-ahead
    option page-count 4
    subvolumes distribute
end-volume

volume iocache
    type performance/io-cache
    option cache-size `echo $(( $(grep 'MemTotal' /proc/meminfo | sed 's/[^0-9]//g') / 5120 ))`MB
    option cache-timeout 1
    subvolumes readahead
end-volume

volume quickread
    type performance/quick-read
    option cache-timeout 1
    option max-file-size 64kB
    subvolumes iocache
end-volume

volume writebehind
    type performance/write-behind
    option cache-size 4MB
    subvolumes quickread
end-volume

volume statprefetch
    type performance/stat-prefetch
    subvolumes writebehind
end-volume

------------------------------------------------------------

and here is my glusterfsd.vol

## file auto generated by /usr/bin/glusterfs-volgen (export.vol)
# Cmd line:
# $ /usr/bin/glusterfs-volgen -n warehouse --raid 1 gluster1:/export/warehouse gluster2:/export/warehouse gluster3:/export/warehouse gluster4:/export/warehouse

volume posix1
  type storage/posix
  option directory /export/warehouse
end-volume

volume locks1
    type features/locks
    subvolumes posix1
end-volume

volume brick1
    type performance/io-threads
    option thread-count 8
    subvolumes locks1
end-volume

volume server-tcp
    type protocol/server
    option transport-type tcp
    option auth.addr.brick1.allow *
    option transport.socket.listen-port 6996
    option transport.socket.nodelay on
    subvolumes brick1
end-volume

--------------
If I use md5sum I get two different results on two different hosts. On the
host where the file appears to be empty I got the md5sum of an empty file
(d41d8cd98f00b204e9800998ecf8427e). I did some experiments since my last
post and it looks like disabling the iocache plugin will eliminate these
errors. I've attached the logs from the host where the file appears empty.
-----------
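
(For context: in the client vol file above, quick-read sits on top of
io-cache, which sits on top of read-ahead. Disabling io-cache therefore
amounts to dropping the iocache volume definition and pointing quick-read's
subvolume at read-ahead directly. The snippet below is only an illustrative
sketch of such a change, not the user's attached vol file.)

volume quickread
    type performance/quick-read
    option cache-timeout 1
    option max-file-size 64kB
    # io-cache dropped from the stack; quick-read now reads through read-ahead
    subvolumes readahead
end-volume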

Comment 2 Raghavendra G 2010-07-06 12:36:20 UTC
What was the change to the volume file?

(In reply to comment #1)
> Created an attachment (id=248) [details]
> user modified vol file  
> 
> user modified vol file to overcome his issue

Comment 3 Raghavendra G 2010-12-28 01:55:36 UTC
Created attachment 405

Comment 4 Raghavendra G 2010-12-28 01:58:14 UTC
Created attachment 406


<Comment from Jonathan>
I left my scripts running for a couple of days and this time I'm not
seeing any errors. It looks as though the bug has been fixed. If you
want to run my test yourself, mkfiles.rb is a Ruby script that writes
one 10KB file every 10 seconds, and testfiles.rb is a script that
verifies the files are readable and deletes them if they are. Both
scripts read and write the files in the current directory.
</Comment from Jonathan>
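
(The attachments themselves are not inlined here, so the following is only a
minimal Ruby sketch of scripts matching that description; the actual
mkfiles.rb and testfiles.rb in attachments 405 and 406 may differ.)

#!/usr/bin/env ruby
# mkfiles.rb (sketch): write one 10KB file to the current directory every
# 10 seconds.
require 'securerandom'

loop do
  File.binwrite("testfile-#{Time.now.to_f}", SecureRandom.random_bytes(10 * 1024))
  sleep 10
end

#!/usr/bin/env ruby
# testfiles.rb (sketch): verify each test file reads back with as many bytes
# as stat reports, delete it if it does, and complain otherwise.
Dir.glob("testfile-*").each do |path|
  expected = File.size(path)
  data = File.binread(path)
  if !data.empty? && data.bytesize == expected
    File.delete(path)
  else
    warn "unreadable: #{path} (stat #{expected} bytes, read #{data.bytesize})"
  end
end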

Since the user has reported that the issue seems to be fixed, closing this bug for now.