Bug 970224
Summary: | Under heavy load, Grid Engine array jobs fail; write permission | |
---|---|---|---
Product: | [Community] GlusterFS | Reporter: | Harry Mangalam <hjmangalam>
Component: | access-control | Assignee: | Nagaprasad Sathyanarayana <nsathyan>
Status: | CLOSED DEFERRED | QA Contact: |
Severity: | medium | Docs Contact: |
Priority: | unspecified | |
Version: | 3.3.0 | CC: | bugs, gluster-bugs, hjmangalam, smohan
Target Milestone: | --- | |
Target Release: | --- | |
Hardware: | x86_64 | |
OS: | Linux | |
Whiteboard: | | |
Fixed In Version: | | Doc Type: | Bug Fix
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2014-12-14 19:40:31 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Description
Harry Mangalam 2013-06-03 18:38:57 UTC
Does the workload involve renames or links? What is the type of volume? Can you please provide logs from the clients (if an NFS mount, the NFS server logs; if FUSE, the client logs)? Also, can you please provide the exact version being used? If possible, can you check with release 3.3.2.qa3 to see if the issue is fixed?

Re: comment 1:

- No renames or links in any of the cases I've checked. They are standard file creations.

- The volume type is:

      Volume Name: gl
      Type: Distribute
      Volume ID: 21f480f7-fc5a-4fd8-a084-3964634a9332
      Status: Started
      Number of Bricks: 8
      Transport-type: tcp,rdma
      Bricks:
      Brick1: bs2:/raid1
      Brick2: bs2:/raid2
      Brick3: bs3:/raid1
      Brick4: bs3:/raid2
      Brick5: bs4:/raid1
      Brick6: bs4:/raid2
      Brick7: bs1:/raid1
      Brick8: bs1:/raid2
      Options Reconfigured:
      performance.write-behind-window-size: 1024MB
      performance.flush-behind: on
      performance.cache-size: 268435456
      nfs.disable: on
      performance.io-cache: on
      performance.quick-read: on
      performance.io-thread-count: 64
      auth.allow: 10.2.*.*,10.1.*.*

- The exact version of the clients is:

      $ glusterfs --version
      glusterfs 3.3.1 built on Oct 11 2012 21:49:36

  The version on the servers is:

      glusterfs 3.3.0 built on May 31 2012 11:16:28

> If possible, can you check with release 3.3.2.qa3 to see if the issue is fixed?

That will take a bit more time, but I can start that process.

- The logs will take a bit of time to collect and post. I'll post them when they are ready.

The partial client logs (only the relevant bits) are at: <http://pastie.org/8005913>

compute-2-7 has an enormous number of error messages from past errors (~4 GB); I'm trying to figure that out now as well. The entire logs (except from compute-2-7) are gzipped together at: http://moo.nac.uci.edu/~hjm/82h529f7.tgz

It may not have been clear from previous notes (although the volume options show it: nfs.disable: on): we are only using the FUSE mounts, not NFS. Nevertheless, we're still getting NFS error messages associated with a lot of these errors, for example:

      [2013-05-23 17:50:43.632112] W [client3_1-fops.c:2630:client3_1_lookup_cbk] 0-gl-client-0: remote operation failed: Stale NFS file handle. Path: /cbcl/mengfant/Cmap/CMAP/LM1/run_train_svm_cmap.sh (be46f737-1fc5-48b3-9b75-8eb19f201272)

If you want the complete log of compute-2-7, it's here: http://moo.nac.uci.edu/~hjm/927nv74.bz2 (137 MB, ~4 GB uncompressed).

The version that this bug has been reported against does not get any updates from the Gluster Community anymore. Please verify if this report is still valid against a current (3.4, 3.5 or 3.6) release and update the version, or close this bug. If there has been no update before 9 December 2014, this bug will get automatically closed.
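Not part of the original report, but as a rough reproduction aid: the workload described above is many concurrent Grid Engine array-job tasks doing plain file creations (no renames or links) on the FUSE-mounted "gl" volume. Below is a minimal sketch of such an array job; the mount path `/gl/scratch/write-test`, the task count, and the file sizes are assumptions for illustration, not values taken from the report.

```bash
#!/bin/bash
#$ -S /bin/bash
#$ -N gl-write-test
#$ -t 1-500          # array job: 500 tasks, each identified by $SGE_TASK_ID
#$ -cwd

# Hypothetical directory on the FUSE-mounted gluster volume; adjust to the real mount.
OUTDIR=/gl/scratch/write-test
mkdir -p "$OUTDIR"

# Plain file creations only, matching the reported workload (no renames, no links).
for i in $(seq 1 100); do
    f="$OUTDIR/task${SGE_TASK_ID}_file${i}"
    if ! dd if=/dev/zero of="$f" bs=1M count=1 2>/dev/null; then
        echo "write failed: $f" >&2
    fi
done
```

Submitted with `qsub gl-write-test.sh`, this puts many simultaneous create/write streams on the volume; under the conditions described in the bug, the corresponding permission and "Stale NFS file handle" warnings would be expected to appear in the FUSE client logs (typically under /var/log/glusterfs/ on the compute nodes).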