Bug 1473887 - read call on shard should not go beyond the shard-size
Status: CLOSED NOTABUG
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: sharding
Version: 3.2
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: urgent
Assigned To: Krutika Dhananjay
QA Contact: SATHEESARAN
Depends On:
Blocks:
 
Reported: 2017-07-21 23:59 EDT by SATHEESARAN
Modified: 2017-07-24 02:12 EDT
CC List: 2 users

See Also:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-07-22 23:40:33 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments: None
Description SATHEESARAN 2017-07-21 23:59:04 EDT
Description of problem:
-----------------------
During testing with RHHI, it was observed that when VMs are created from a template one after the other, without waiting for the previous VM creation to complete, the VMs get stuck in the 'Image Locked' state.

Looking at the gluster logs, it was found that reads were going beyond the shard-size and that XFS was not returning a proper errno.

So clearly there are two issues here:
1. XFS issue (BZ 1473549)
2. gluster shard problem
This bug is to address the second issue.

Version-Release number of selected component (if applicable):
--------------------------------------------------------------
RHGS 3.2.0 async
glusterfs-3.8.4-18.6.el7rhgs

How reproducible:
-----------------
Hit it for the first time; have not tried to reproduce it.

Steps to Reproduce:
-------------------
This particular issue was seen in an RHHI setup.
There could be a simpler reproducer, but these are the steps as observed:
1. Create an RHV setup backed by a gluster replica 3 volume (XFS bricks)
2. Create a RHEL 7.4 template
3. Create multiple VMs from the RHEL 7.4 template (one after the other, before the previous VM creation completes)

Actual results:
---------------
read on shard has gone beyond the shard size
[2017-07-20 12:50:59.982435] E [MSGID: 113040] [posix.c:3119:posix_readv] 0-vmstore-posix: read failed on gfid=0aac1b8b-0ccc-4936-922c-efb524d226e3, fd=0x7fd7c401989c, offset=3584 size=2048, buf=0x7fd7d8af7000 [Unknown error 3072]

Expected results:
-----------------
read syscall on shard should not have gone beyond shard-size
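
For context, the check implied by this expectation can be sketched as below. This is a minimal illustration only, not the actual shard translator code; it assumes the 4 MB (4194304-byte) shard block size used in comment 3, with offset and size relative to the individual shard file as in the posix_readv log above.

#include <stdbool.h>
#include <stdint.h>

/* Assumed shard block size of 4 MB (4194304 bytes), the figure used in
 * comment 3; not necessarily the value configured on every volume. */
#define SHARD_BLOCK_SIZE (4ULL * 1024 * 1024)

/* A read landing on a single shard stays within the shard as long as it
 * does not run past the shard block size. For the log above, offset=3584
 * and size=2048 give 5632, well inside 4194304. */
static bool read_within_shard(uint64_t offset, uint64_t size)
{
    return offset + size <= SHARD_BLOCK_SIZE;
}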
Comment 3 Krutika Dhananjay 2017-07-22 23:40:33 EDT
Wrt the specific logs highlighted in comment #2, the offset is 4193792, which is 512 bytes short of 4194304 (= 4 * 1024 * 1024 = 4 MB).
So it is a legitimate read. Also, in light of the clarification provided at https://bugzilla.redhat.com/show_bug.cgi?id=1473549#c31, I'm closing this bug.

-Krutika
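
As a quick sanity check of the arithmetic above (a sketch only; the 512-byte read length is an assumption taken from the "512 bytes short" figure, since the comment gives only the offset):

#include <stdio.h>

int main(void)
{
    const unsigned long shard_size = 4UL * 1024 * 1024;  /* 4194304, 4 MB shard block size from comment 3 */
    const unsigned long offset     = 4193792UL;          /* offset reported in comment 3 */
    const unsigned long read_len   = 512UL;              /* assumed: the remaining 512 bytes */

    printf("bytes left in shard : %lu\n", shard_size - offset);        /* 512 */
    printf("read ends at offset : %lu\n", offset + read_len);          /* 4194304, exactly the boundary */
    printf("beyond shard size?  : %s\n",
           offset + read_len > shard_size ? "yes" : "no");             /* no */
    return 0;
}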
Comment 4 SATHEESARAN 2017-07-24 02:12:14 EDT
On Sat, Jul 22, 2017 at 11:09 PM, Pranith Kumar Karampuri <pkarampu@redhat.com> wrote:
I am sorry, I calculated 4MB wrong. Read came inside 4MB size, but I was under the impression it was beyond 4MB size. There is no bug in shard. I am going to update the bz as well.
