Bug 762262 (GLUSTER-530)

Summary: infinite loop in libglusterfs/src/inode.c?
Product: [Community] GlusterFS
Reporter: Dave Hall <skwashd>
Component: libglusterfsclient
Assignee: Shehjar Tikoo <shehjart>
Status: CLOSED WONTFIX
Severity: medium
Priority: low
Version: 2.0.9
CC: anush, gluster-bugs
Hardware: x86_64   
OS: Linux   
Doc Type: Bug Fix
Regression: RTNR
Attachments: callgrind output

Description Dave Hall 2010-01-11 14:12:57 UTC
I first reported this on the users list; this bug report consolidates that discussion. The full thread is available at http://gluster.org/pipermail/gluster-users/2010-January/thread.html#3719 under the subject "maxing out cpu".

With a self-compiled version 2.0.9 on Ubuntu 9.10 (AMD64), the GlusterFS client appears to get stuck in an infinite loop in libglusterfs/src/inode.c, around lines 902-908. This seems somewhat similar to bug #761853.

Relevant Config:
$ cat /etc/glusterfs/glusterfsd.vol 
volume posix
  type storage/posix
  option directory /srv/glusterfs/export
end-volume

volume locks
  type features/locks
  subvolumes posix
end-volume

volume brick
  type performance/io-threads
  option thread-count 4
  subvolumes locks
end-volume

volume server
  type protocol/server
  option transport-type tcp
  option transport.socket.bind-address 192.168.XXX.123
  option auth.addr.brick.allow 192.168.XXX.122,192.168.XXX.123,192.168.XXX.124
  option auth.addr.locks.allow 192.168.XXX.*
  option auth.addr.posix.allow 192.168.XXX.*
  subvolumes brick
end-volume


$ cat /etc/glusterfs/glusterfs.vol 
# Note: the third node is missing because I removed it earlier today
volume my-storage-c
  type protocol/client
  option transport-type tcp
  option remote-host 192.168.201.123
  option remote-subvolume brick
end-volume

volume my-storage-d
  type protocol/client
  option transport-type tcp
  option remote-host 192.168.201.124
  option remote-subvolume brick
end-volume

volume replication
  type cluster/replicate
  subvolumes my-storage-c my-storage-d
  option read-subvolume my-storage-c
end-volume

volume writebehind
  type performance/write-behind
  option window-size 1MB
  subvolumes replication
end-volume

volume cache
  type performance/io-cache
  option cache-size 256MB
  subvolumes writebehind
end-volume

Selected content of /etc/fstab:

/dev/mapper/my--ui--c-glusterfs--export /srv/glusterfs/export ext4     noatime,nodev,nosuid 0       2
/etc/glusterfs/glusterfs.vol  /path/to/mount  glusterfs  defaults  0  0

Attached is the valgrind/callgrind output.

The client logs contained nothing of interest.  All it had was some basic environment info and the volfile.

Comment 1 Shehjar Tikoo 2010-01-28 04:36:21 UTC
Hi Dave,

This problem has been observed before. The fix involved major changes that are now part of the 3.0.x releases, so it would be best to move to those.

Thanks

Comment 2 Dave Hall 2010-01-30 08:24:49 UTC
Hi Shehjar,

I have been testing 3.0.0 pretty heavily over the last couple of days and I can no longer reproduce the bug. Thanks for the response. Glad it is fixed.

Cheers

Dave