Bug 1272404 - Data Tiering:error "[2015-10-14 18:15:09.270483] E [MSGID: 122037] [ec-common.c:1502:ec_update_size_version_done] 0-tiervolume-disperse-1: Failed to update version and size [Input/output error]"
Keywords:
Status: CLOSED UPSTREAM
Alias: None
Product: GlusterFS
Classification: Community
Component: disperse
Version: 3.7.5
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: bugs@gluster.org
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: 1272407 1274629
Reported: 2015-10-16 10:37 UTC by Nag Pavan Chilakam
Modified: 2018-11-30 05:43 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 1272407 1274629 (view as bug list)
Environment:
Last Closed: 2016-05-17 12:36:16 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:



Description Nag Pavan Chilakam 2015-10-16 10:37:01 UTC
Description of problem:
=========================
On the longevity/stress setup, we are getting the following error messages:
[2015-10-14 18:15:09.270412] E [MSGID: 122034] [ec-common.c:439:ec_child_select] 0-tiervolume-disperse-1: Insufficient available childs for this request (have 1, need 6)
[2015-10-14 18:15:09.270483] E [MSGID: 122037] [ec-common.c:1502:ec_update_size_version_done] 0-tiervolume-disperse-1: Failed to update version and size [Input/output error]
[2015-10-14 18:15:10.873293] E [MSGID: 122034] [ec-common.c:439:ec_child_select] 0-tiervolume-disperse-0: Insufficient available childs for this request (have 1, need 6)
[2015-10-14 18:15:10.873470] E [MSGID: 122037] [ec-common.c:1502:ec_update_size_version_done] 0-tiervolume-disperse-0: Failed to update version and size [Input/output error]
[2015-10-14 18:15:10.875723] E [MSGID: 122034] [ec-common.c:439:ec_child_select] 0-tiervolume-disperse-0: Insufficient available childs for this request (have 1, need 6)
[2015-10-14 18:15:10.875742] E [MSGID: 122037] [ec-common.c:1502:ec_update_size_version_done] 0-tiervolume-disperse-0: Failed to update version and size [Input/output error]
[2015-10-14 18:15:10.876542] E [MSGID: 122034] [ec-common.c:439:ec_child_select] 0-tiervolume-disperse-0: Insufficient available childs for this request (have 1, need 6)
[2015-10-14 18:15:10.876567] E [MSGID: 122037] [ec-common.c:1502:ec_update_size_version_done] 0-tiervolume-disperse-0: Failed to update version and size [Input/output error]
[2015-10-14 18:15:10.882593] E [MSGID: 122034] [ec-common.c:439:ec_child_select] 0-tiervolume-disperse-1: Insufficient available childs for this request (have 1, need 6)
[2015-10-14 18:15:10.882645] E [MSGID: 122037] [ec-common.c:1502:ec_update_size_version_done] 0-tiervolume-disperse-1: Failed to update version and size [Input/output error]
[2015-10-14 18:15:10.885247] E [MSGID: 122034] [ec-common.c:439:ec_child_select] 0-tiervolume-disperse-1: Insufficient available childs for this request (have 1, need 6)
[2015-10-14 18:15:10.885293] E [MSGID: 122037] [ec-common.c:1502:ec_update_size_version_done] 0-tiervolume-disperse-1: Failed to update version and size [Input/output error]





Version-Release number of selected component (if applicable):
=============================================================

glusterfs-3.7.5-0.19.git0f5c3e8.el7.centos.x86_64



Steps Carried:
==============

1. Created a 12-node cluster
2. Created a tiered volume with the hot tier as (6 x 2) and the cold tier as (2 x (6 + 2) = 16)
3. FUSE-mounted the volume on 3 clients: RHEL7.2, RHEL7.1 and RHEL6.7
4. Started creating data from each client:

Client 1:
=========
[root@dj ~]# crefi --multi -n 10 -b 10 -d 10 --max=1024k --min=5k --random -T 5 -t text -I 5 --fop=create /mnt/fuse/

Client 2:
=========
[root@mia ~]# cd /mnt/fuse/
[root@mia fuse]# for i in {1..10}; do cp -rf /etc etc.$i ; sleep 100 ; done

Client 3:
=========
[root@wingo fuse]# for i in {1..999}; do dd if=/dev/zero of=dd.$i bs=1M count=1 ; sleep 10 ; done

5. After a while, data creation from client 1 and client 2 should be complete, while data creation from client 3 will still be in progress.

6. At this point, only client 3 is creating data, at a rate of 1 file every 10 seconds.
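
For reference, the brick counts implied by the layout in step 2 can be worked out as below. This is only a rough sketch: the helper names are made up for illustration, and it assumes the usual reading of the notation, i.e. (6 x 2) is 6 replica pairs and (2 x (6 + 2)) is 2 disperse subvolumes with 6 data and 2 redundancy bricks each.

```python
# Hypothetical helpers, not part of GlusterFS or its CLI.

def replicate_bricks(subvolumes: int, replica: int) -> int:
    """Total bricks in a distributed-replicate layout (subvolumes x replica)."""
    return subvolumes * replica


def disperse_bricks(subvolumes: int, data: int, redundancy: int) -> int:
    """Total bricks in a distributed-disperse layout (subvolumes x (data + redundancy))."""
    return subvolumes * (data + redundancy)


hot_tier = replicate_bricks(6, 2)       # (6 x 2)
cold_tier = disperse_bricks(2, 6, 2)    # (2 x (6 + 2))

# Assumption: a disperse subvolume needs at least as many bricks as it has
# data fragments, which would match the "need 6" in the log messages.
bricks_needed_per_disperse_subvolume = 6

print(hot_tier, cold_tier)
```

The two cold tier subvolumes would correspond to tiervolume-disperse-0 and tiervolume-disperse-1 in the log lines.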

Comment 2 Vijay Bellur 2015-10-28 10:32:38 UTC
REVIEW: http://review.gluster.org/12440 (cluster/ec: update version and size on good bricks) posted (#1) for review on release-3.7 by Ashish Pandey (aspandey@redhat.com)

Comment 3 Vijay Bellur 2015-11-02 05:51:05 UTC
COMMIT: http://review.gluster.org/12440 committed in release-3.7 by Pranith Kumar Karampuri (pkarampu@redhat.com) 
------
commit 9a0e3a7ecc61e47a0780708f86efc0170b8a85db
Author: Ashish Pandey <aspandey@redhat.com>
Date:   Fri Oct 23 13:27:51 2015 +0530

    cluster/ec: update version and size on good bricks
    
    Problem: readdir/readdirp fops call [f]xattrop with
    fop->good, which contains only one brick for these operations.
    That causes xattrop to fail, as it requires at least the
    "minimum" number of bricks.
    
    Solution: Use lock->good_mask to call xattrop. lock->good_mask
    contains all the good locked bricks on which the previous write
    operation was successful.
    
    Change-Id: If1b500391aa6fca6bd863702e030957b694ab499
    BUG: 1272404
    Signed-off-by: Ashish Pandey <aspandey@redhat.com>
    Reviewed-on: http://review.gluster.org/12419
    Tested-by: NetBSD Build System <jenkins@build.gluster.org>
    Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
    Tested-by: Xavier Hernandez <xhernandez@datalab.es>
    Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
    Reviewed-on: http://review.gluster.org/12440
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
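
To illustrate the problem and solution described in the commit message, here is a small sketch of the brick-mask check. It is not the actual ec-common.c code: the function names and mask values are hypothetical, and the minimum of 6 is taken from the "have 1, need 6" log messages in the description.

```python
# Illustrative only; names and values below are assumptions, not GlusterFS code.

def count_bricks(mask: int) -> int:
    """Number of bricks selected by a bitmask (one bit per brick)."""
    return bin(mask).count("1")


def has_enough_bricks(mask: int, minimum: int) -> bool:
    """True if the mask selects at least `minimum` bricks."""
    return count_bricks(mask) >= minimum


MINIMUM = 6  # from the log: "have 1, need 6"

# readdir/readdirp is served by a single brick, so a mask like fop->good
# has only one bit set (hypothetical value).
fop_good = 0b00000001

# A mask like lock->good_mask covers every brick on which the previous
# write succeeded; here we assume all 8 bricks of a 6 + 2 subvolume.
lock_good_mask = 0b11111111

print(has_enough_bricks(fop_good, MINIMUM))        # xattrop would be rejected
print(has_enough_bricks(lock_good_mask, MINIMUM))  # xattrop can proceed
```

With the single-brick mask the check fails, which is consistent with the "Insufficient available childs" error followed by the failed version and size update. With the wider mask the check passes.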

Comment 4 Mike McCune 2016-03-28 22:22:31 UTC
This bug was accidentally moved from POST to MODIFIED via an error in automation. Please contact mmccune@redhat.com with any questions.

