Bug 763848 (GLUSTER-2116)

Summary: Access fails under load
Product: [Community] GlusterFS
Reporter: Tim <tp+glusterfs>
Component: replicate
Assignee: tcp
Status: CLOSED WORKSFORME
Severity: medium
Priority: low
Version: 3.1.1
CC: gluster-bugs, vijay
Target Milestone: ---   
Target Release: ---   
Hardware: i386   
OS: Linux   
Doc Type: Bug Fix
Mount Type: fuse

Description Tim 2010-11-16 12:56:58 UTC
When accessing the server under load (approx. 15 requests per second), we started getting errors accessing a file.

The logs show:

[2010-11-16 02:50:45.942302] I [afr-common.c:613:afr_lookup_self_heal_check] home-replicate-0: size differs for /s/a/xxx/custom/classes/EventHooksCompiled1.php
[2010-11-16 02:56:35.55443] I [afr-common.c:613:afr_lookup_self_heal_check] home-replicate-0: size differs for /s/a/xxx/custom/classes/EventHooksCompiled1.php
[2010-11-16 02:56:35.351556] I [afr-common.c:613:afr_lookup_self_heal_check] home-replicate-0: size differs for /s/a/xxx/custom/classes/EventHooksCompiled1.php
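The log lines above follow a regular pattern, so their frequency per file can be tallied with a short script. This is only a minimal sketch: the regular expression is inferred from the three lines quoted here (not from any GlusterFS log format specification), and `count_size_mismatches` is a hypothetical helper name.

```python
import re
from collections import Counter

# Pattern inferred from the three log lines quoted in this report (an assumption,
# not an official description of the GlusterFS log format).
LINE_RE = re.compile(
    r"^\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\] "
    r"(?P<level>[A-Z]) "
    r"\[(?P<src>[^\]]+)\] "
    r"(?P<xlator>\S+): "
    r"size differs for (?P<path>.+)$"
)


def count_size_mismatches(lines):
    """Count 'size differs' messages per file path (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            counts[match.group("path")] += 1
    return counts


# The three lines from the report, used as sample input.
sample = [
    "[2010-11-16 02:50:45.942302] I [afr-common.c:613:afr_lookup_self_heal_check] "
    "home-replicate-0: size differs for /s/a/xxx/custom/classes/EventHooksCompiled1.php",
    "[2010-11-16 02:56:35.55443] I [afr-common.c:613:afr_lookup_self_heal_check] "
    "home-replicate-0: size differs for /s/a/xxx/custom/classes/EventHooksCompiled1.php",
    "[2010-11-16 02:56:35.351556] I [afr-common.c:613:afr_lookup_self_heal_check] "
    "home-replicate-0: size differs for /s/a/xxx/custom/classes/EventHooksCompiled1.php",
]

for path, n in count_size_mismatches(sample).items():
    print(n, path)
```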

The file can't be accessed at this time. It does recover after some time.

The setup is 2 servers replicated, with 2 clients both accessing the file using the native/FUSE client.

Comment 1 Tim 2010-11-21 18:58:10 UTC
Any further information I can provide?

Comment 2 Anand Avati 2011-01-12 14:37:59 UTC
Is it just the log entries you are seeing? Are there other symptoms as well? These log entries can be ignored if they appear only during parallel reads/writes from multiple clients and no other issue is seen.

Avati

Comment 3 Tim 2011-01-12 18:07:18 UTC
Hi,

As per my report:

The file can't be accessed at this time. It does recover after some time.

I.e., access to the file is impossible; it fails. The log was the only information of note I had at that point.

It was repeatable and happened predictably.

Comment 4 Anand Avati 2011-01-12 18:19:49 UTC
Can you please describe the exact commands with which you were accessing the files? What were the different operations (exact commands) being performed on the files? What was the exact failure (error message? hang?) you faced during the high load? Please provide the details. It is very hard to debug with a vague, high-level description of the problem.

Avati
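The details requested above (the exact operation and the exact error) could be captured with a small probe run on a client during the load. The following is only a sketch under stated assumptions: `probe_reads` is a hypothetical helper, the default rate of roughly 15 reads per second mirrors the figure in the report, and the path passed in would be the affected file on the FUSE mount.

```python
import errno
import os
import time


def probe_reads(path, attempts=15, interval=1.0 / 15):
    """Read `path` repeatedly and record every failure (hypothetical helper).

    Returns a list of (attempt index, errno name, error message) tuples,
    so the exact error seen by the client can be pasted into the report.
    """
    failures = []
    for i in range(attempts):
        try:
            with open(path, "rb") as f:
                f.read()
        except OSError as e:
            # Translate the numeric errno into its symbolic name, e.g. ENOENT or EIO.
            name = errno.errorcode.get(e.errno, "UNKNOWN")
            message = os.strerror(e.errno) if e.errno else str(e)
            failures.append((i, name, message))
        time.sleep(interval)
    return failures
```

For example, `probe_reads("/path/on/mount/to/file.php")` (a placeholder path) returns an empty list when every read succeeds. A hang would show up as the call not returning, not as an entry in the list.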

Comment 5 tcp 2011-02-01 05:36:52 UTC
Can the bug submitter send the information requested in the earlier update?

Comment 6 tcp 2011-03-17 02:05:23 UTC
No response from submitter. Closing the bug.