Bug 761796 (GLUSTER-64) - Using cluster/replicate as a cluster/unify namespace brick crashes
Summary: Using cluster/replicate as a cluster/unify namespace brick crashes
Keywords:
Status: CLOSED WONTFIX
Alias: GLUSTER-64
Product: GlusterFS
Classification: Community
Component: unify
Version: mainline
Hardware: All
OS: Linux
Priority: low
Severity: low
Target Milestone: ---
Assignee: Amar Tumballi
QA Contact:
URL:
Whiteboard:
Depends On: GLUSTER-409
Blocks:
 
Reported: 2009-06-25 06:41 UTC by Basavanagowda Kanur
Modified: 2013-12-19 00:03 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:
Regression: RTNR
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:



Description Basavanagowda Kanur 2009-06-25 06:41:23 UTC
[Migrated from savannah BTS] - bug 26776 [https://savannah.nongnu.org/bugs/?26776]

Wed 10 Jun 2009 05:14:11 PM GMT, original submission by Jonathan Steffan <damaestro>:

When a cluster/replicate brick is used as the namespace of cluster/unify, glusterfs crashes shortly after data population.

How to crash:

1.) Start populating data.
2.) for i in {1..1000}; do ls -R /path/to/glusterfsmount; done
3.) Wait for short period.
4.) Client disconnects from servers and the filesystem is left in an unusable state.
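
The loop in step 2 can also be driven from a small script that records the pass on which the mount first returns an error. This is only a sketch: the function names and the idea of stopping at the first failure are illustrative assumptions, not part of the original report, and /path/to/glusterfsmount remains a placeholder.

```python
import os

def list_recursively(root):
    """Walk root like `ls -R`; return (entries_seen, errors)."""
    errors = []
    seen = 0
    # onerror collects OSError instances (for example after a client
    # disconnect) instead of letting os.walk silently skip them.
    for dirpath, dirnames, filenames in os.walk(root, onerror=errors.append):
        seen += len(dirnames) + len(filenames)
    return seen, errors

def hammer(root, passes=1000):
    """Repeat the listing; return the 1-based pass of the first failure, or None."""
    for n in range(1, passes + 1):
        seen, errors = list_recursively(root)
        if errors:
            print(f"pass {n}: {len(errors)} error(s), first: {errors[0]}")
            return n
    return None
```

Calling hammer("/path/to/glusterfsmount") while data is being populated would then show roughly how many listings it takes before the mount becomes unusable.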

--------------------------------------------------------------------------------
Wed 10 Jun 2009 11:38:32 PM GMT, comment #1 by Jonathan Steffan <damaestro>:

This also happens when using a cluster/afr volume for the namespace brick of a cluster/unify.

--------------------------------------------------------------------------------
Tue 16 Jun 2009 04:21:57 PM GMT, comment #2 by Jonathan Steffan <damaestro>:

Okay, this looks like it's less an issue about using these translators together and more about something crashing/leaking in the client. Using a replicated namespace brick just makes everything happen faster. The crash happens after a series of errors like the following appears in the server-side log:

[2009-06-15 19:54:41] E [posix.c:270:posix_lookup] unify_lun_disk0: lstat on /mpath/to/some/sort/of/content/20812.jpg/30 failed: Not a directory
[2009-06-15 19:55:05] E [posix.c:270:posix_lookup] unify_lun_disk0: lstat on /path/to/some/sort/of/content/real_estate.jpg/0 failed: Not a directory
[2009-06-15 19:55:08] E [posix.c:270:posix_lookup] unify_lun_disk0: lstat on /mpath/to/some/sort/of/content/real_estate.jpg/600 failed: Not a directory
[2009-06-15 19:58:55] E [posix.c:270:posix_lookup] unify_lun_disk0: lstat on /path/to/some/sort/of/content/720music.jpg/oliveCOVER.jpg failed: Not a directory
[2009-06-15 19:59:09] E [posix.c:270:posix_lookup] unify_lun_disk0: lstat on /mpath/to/some/sort/of/content/0/20812.jpg/30 failed: Not a directory
[2009-06-15 19:59:53] E [posix.c:270:posix_lookup] unify_lun_disk0: lstat on /path/to/some/sort/of/content/34971.jpg/400 failed: Not a directory
[2009-06-15 20:11:14] E [posix.c:270:posix_lookup] iops_lun_disk0: lstat on /path/to/some/sort/of/content/headlines failed: Not a directory
[..... many lstat .....]
[2009-06-15 20:11:15] E [posix.c:270:posix_lookup] unify_lun_disk0: lstat on /path/to/some/sort/of/content/headlines failed: Not a directory
[2009-06-15 20:28:38] E [posix.c:382:posix_opendir] unify_lun_disk0: opendir failed on /path/to/some/sort/of/other/content/bang_4.jpg: Not a directory
[2009-06-15 20:31:01] E [posix.c:1298:posix_utimens] iops_lun_disk0: utimes on /path/to/disk-ld1/path/to/some/sort/of/content/90/.112294.jpg.pBk95v failed: No such file or directory
[2009-06-15 21:19:12] E [posix.c:1147:posix_chmod] iops_lun_disk0: chmod on /path/to/some/sort/of/content/70/.279.jpg.XARvhk failed: No such file or directory
[2009-06-15 21:21:47] E [posix.c:1147:posix_chmod] iops_lun_disk0: chmod on /mpath/to/some/sort/of/content/80/.1180.jpg.49pYSO failed: No such file or directory
[2009-06-15 19:40:11] E [posix.c:382:posix_opendir] unify_lun_disk0: opendir failed on /mpath/to/some/sort/of/content/0/60/31068.jpg: Not a directory
[2009-06-15 19:54:41] E [posix.c:270:posix_lookup] unify_lun_disk0: lstat on /path/to/some/sort/of/content/20812.jpg/30 failed: Not a directory
[2009-06-15 19:55:05] E [posix.c:270:posix_lookup] unify_lun_disk0: lstat on /path/to/some/sort/of/content/real_estate.jpg/0 failed: Not a directory
[2009-06-15 19:55:08] E [posix.c:270:posix_lookup] unify_lun_disk0: lstat on /path/to/some/sort/of/content/real_estate.jpg/600 failed: Not a directory
[2009-06-15 19:58:55] E [posix.c:270:posix_lookup] unify_lun_disk0: lstat on /path/to/some/sort/of/content/720music.jpg/oliveCOVER.jpg failed: Not a directory
[2009-06-15 19:59:09] E [posix.c:270:posix_lookup] unify_lun_disk0: lstat on /path/to/some/sort/of/content/20812.jpg/30 failed: Not a directory
[2009-06-15 19:59:53] E [posix.c:270:posix_lookup] unify_lun_disk0: lstat on /path/to/some/sort/of/content/34971.jpg/400 failed: Not a directory
[2009-06-15 20:07:58] E [posix.c:1147:posix_chmod] iops_lun_disk0: chmod on /path/to/some/sort/of/content/.81445.jpg.gQiXuD failed: No such file or directory
[2009-06-15 20:11:15] E [posix.c:270:posix_lookup] unify_lun_disk0: lstat on /path/to/some/sort/of/content/headlines failed: Not a directory
[2009-06-15 20:26:12] E [posix.c:1147:posix_chmod] iops_lun_disk0: chmod on /mpath/to/some/sort/of/content/0/.105545.jpg.ZeY2ga failed: No such file or directory
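
To see which volume and operation dominate a log like the one above, the entries can be tallied by volume, function and error text. The sketch below assumes every entry follows the layout visible in this excerpt ([timestamp] level [file:line:function] volume: message); the regular expression and the tally helper are illustrative and not part of GlusterFS.

```python
import re
from collections import Counter

# Layout assumed from the excerpt above:
# [timestamp] LEVEL [file:line:function] volume: message
LOG_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] (?P<level>[A-Z]) "
    r"\[(?P<file>[^:\]]+):(?P<line>\d+):(?P<func>\w+)\] "
    r"(?P<volume>\S+): (?P<msg>.*)$"
)

def tally(lines):
    """Count log entries per (volume, function, error text)."""
    counts = Counter()
    for line in lines:
        m = LOG_RE.match(line.strip())
        if m is None:
            continue  # skip elided or unrelated lines
        # The error text is whatever follows the last ': ' in the message.
        reason = m.group("msg").rsplit(": ", 1)[-1]
        counts[(m.group("volume"), m.group("func"), reason)] += 1
    return counts
```

Feeding it the server log (for example tally(open(logfile)), where logfile is whatever path the server writes to) gives a quick view of whether the "Not a directory" lookups on the unify bricks or the "No such file or directory" errors on the iops bricks come first.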

It now looks possible that the issue is actually with the switch scheduler misbehaving.

volume main_storage
  type cluster/unify
  # unify_namespace is a cluster/replicate to two servers
  option namespace unify_namespace
  option scheduler switch
  # Anything not defined here ends up in 'bulk'
  option scheduler.switch.case jpg:iops;gif:iops;png:iops;flv:iops;swf:iops;css:iops;xml:iops;htm:iops;wav:bulkaudio
  subvolumes iops bulkaudio bulk
end-volume

--------------------------------------------------------------------------------

Tue 16 Jun 2009 04:27:36 PM GMT, comment #3 by Jonathan Steffan <damaestro>:

volume main_storage
  type cluster/unify
  # unify_namespace is a cluster/replicate to two servers
  option namespace unify_namespace
  option scheduler switch
  # Anything not defined here ends up in 'bulk'
  option scheduler.switch.case *jpg*:iops;*gif*:iops;*png*:iops;*flv*:iops;*swf*:iops;*css*:iops;*xml*:iops;*htm*:iops;*LOFI.mp3*:iops;*wav*:bulkaudio
  subvolumes iops bulkaudio bulk
end-volume
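
The two scheduler.switch.case values posted in this report differ only in their wildcards. The sketch below shows how they would behave if the switch scheduler matched file names with shell-style globbing, tried the rules in order, and fell back to 'bulk'. All of that is an assumption made for illustration; parse_switch_case and schedule are hypothetical helpers, not the scheduler's actual code.

```python
from fnmatch import fnmatchcase

def parse_switch_case(value):
    """Split 'pattern:subvol;pattern:subvol' into ordered (pattern, subvolume) pairs."""
    rules = []
    for item in value.split(";"):
        pattern, _, subvolume = item.partition(":")
        rules.append((pattern, subvolume))
    return rules

def schedule(name, rules, default="bulk"):
    """Return the subvolume of the first matching rule (assumed first-match order)."""
    for pattern, subvolume in rules:
        if fnmatchcase(name, pattern):
            return subvolume
    return default

# The two option values from the volfiles in this report.
BARE = parse_switch_case(
    "jpg:iops;gif:iops;png:iops;flv:iops;swf:iops;css:iops;xml:iops;htm:iops;wav:bulkaudio"
)
GLOB = parse_switch_case(
    "*jpg*:iops;*gif*:iops;*png*:iops;*flv*:iops;*swf*:iops;*css*:iops;"
    "*xml*:iops;*htm*:iops;*LOFI.mp3*:iops;*wav*:bulkaudio"
)
```

Under these assumptions schedule("20812.jpg", BARE) falls through to 'bulk' because a bare 'jpg' only matches a file named exactly jpg, while schedule("20812.jpg", GLOB) selects 'iops'. The wildcard form also matches temporary names such as .112294.jpg.pBk95v, and would match a whole path like /path/to/some/sort/of/content/20812.jpg/30 if the scheduler were given the full path rather than the file name.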

--------------------------------------------------------------------------------
Tue 16 Jun 2009 06:04:27 PM GMT, comment #4 by Jonathan Steffan <damaestro>:

http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=22

--------------------------------------------------------------------------------
Thu 18 Jun 2009 03:32:15 PM GMT, comment #5 by Jonathan Steffan <damaestro>:

We have removed the cluster/unify translator and are now using multiple mounts to segment content. Everything is working now, so I suspect this is either an issue with running cluster/replicate or cluster/distribute under the unify translator, or an issue with the switch scheduler, for which I have opened another bug.

Comment 1 Amar Tumballi 2009-11-26 00:45:53 UTC
Adding a dependency on bug 409; once it is committed, we can close all unify-related bugs.

