Bug 1272407

Summary: Data Tiering:error "[2015-10-14 18:15:09.270483] E [MSGID: 122037] [ec-common.c:1502:ec_update_size_version_done] 0-tiervolume-disperse-1: Failed to update version and size [Input/output error]"
Product: [Red Hat Storage] Red Hat Gluster Storage Reporter: Nag Pavan Chilakam <nchilaka>
Component: disperse Assignee: Ashish Pandey <aspandey>
Status: CLOSED ERRATA QA Contact: Nag Pavan Chilakam <nchilaka>
Severity: high Docs Contact:
Priority: unspecified    
Version: rhgs-3.1 CC: aspandey, asrivast, dlambrig, pkarampu, rhinduja, rhs-bugs, sankarshan, storage-qa-internal
Target Milestone: --- Keywords: ZStream
Target Release: RHGS 3.1.2   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: glusterfs-3.7.5-5 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 1272404 Environment:
Last Closed: 2016-03-01 05:41:42 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1272404    
Bug Blocks: 1260783, 1260923    

Description Nag Pavan Chilakam 2015-10-16 10:48:49 UTC
+++ This bug was initially created as a clone of Bug #1272404 +++

Description of problem:
=========================
On the longevity/stress setup, we are getting the following error messages:
[2015-10-14 18:15:09.270412] E [MSGID: 122034] [ec-common.c:439:ec_child_select] 0-tiervolume-disperse-1: Insufficient available childs for this request (have 1, need 6)
[2015-10-14 18:15:09.270483] E [MSGID: 122037] [ec-common.c:1502:ec_update_size_version_done] 0-tiervolume-disperse-1: Failed to update version and size [Input/output error]
[2015-10-14 18:15:10.873293] E [MSGID: 122034] [ec-common.c:439:ec_child_select] 0-tiervolume-disperse-0: Insufficient available childs for this request (have 1, need 6)
[2015-10-14 18:15:10.873470] E [MSGID: 122037] [ec-common.c:1502:ec_update_size_version_done] 0-tiervolume-disperse-0: Failed to update version and size [Input/output error]
[2015-10-14 18:15:10.875723] E [MSGID: 122034] [ec-common.c:439:ec_child_select] 0-tiervolume-disperse-0: Insufficient available childs for this request (have 1, need 6)
[2015-10-14 18:15:10.875742] E [MSGID: 122037] [ec-common.c:1502:ec_update_size_version_done] 0-tiervolume-disperse-0: Failed to update version and size [Input/output error]
[2015-10-14 18:15:10.876542] E [MSGID: 122034] [ec-common.c:439:ec_child_select] 0-tiervolume-disperse-0: Insufficient available childs for this request (have 1, need 6)
[2015-10-14 18:15:10.876567] E [MSGID: 122037] [ec-common.c:1502:ec_update_size_version_done] 0-tiervolume-disperse-0: Failed to update version and size [Input/output error]
[2015-10-14 18:15:10.882593] E [MSGID: 122034] [ec-common.c:439:ec_child_select] 0-tiervolume-disperse-1: Insufficient available childs for this request (have 1, need 6)
[2015-10-14 18:15:10.882645] E [MSGID: 122037] [ec-common.c:1502:ec_update_size_version_done] 0-tiervolume-disperse-1: Failed to update version and size [Input/output error]
[2015-10-14 18:15:10.885247] E [MSGID: 122034] [ec-common.c:439:ec_child_select] 0-tiervolume-disperse-1: Insufficient available childs for this request (have 1, need 6)
[2015-10-14 18:15:10.885293] E [MSGID: 122037] [ec-common.c:1502:ec_update_size_version_done] 0-tiervolume-disperse-1: Failed to update version and size [Input/output error]
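
For triage, the entries above can be tallied per message ID and disperse subvolume. The following is only a minimal Python sketch, not part of the original report: the regular expression is inferred from the log lines quoted here (it is not an official glusterfs log grammar), the name tally_errors is hypothetical, and each entry is assumed to sit on a single, unwrapped line.

```python
import re
from collections import Counter

# Layout inferred from the entries quoted in this report (an assumption):
# [timestamp] LEVEL [MSGID: n] [file:line:function] subvolume: message
LOG_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] (?P<level>[A-Z]) "
    r"\[MSGID: (?P<msgid>\d+)\] "
    r"\[(?P<src>[^:\]]+):(?P<line>\d+):(?P<func>[^\]]+)\] "
    r"(?P<subvol>\S+): (?P<msg>.*)$"
)


def tally_errors(lines):
    """Count error-level ("E") entries per (MSGID, subvolume) pair."""
    counts = Counter()
    for line in lines:
        match = LOG_RE.match(line.strip())
        if match and match.group("level") == "E":
            key = (int(match.group("msgid")), match.group("subvol"))
            counts[key] += 1
    return counts


# First four entries from the report, each joined onto one line.
sample = [
    "[2015-10-14 18:15:09.270412] E [MSGID: 122034] [ec-common.c:439:ec_child_select] "
    "0-tiervolume-disperse-1: Insufficient available childs for this request (have 1, need 6)",
    "[2015-10-14 18:15:09.270483] E [MSGID: 122037] [ec-common.c:1502:ec_update_size_version_done] "
    "0-tiervolume-disperse-1: Failed to update version and size [Input/output error]",
    "[2015-10-14 18:15:10.873293] E [MSGID: 122034] [ec-common.c:439:ec_child_select] "
    "0-tiervolume-disperse-0: Insufficient available childs for this request (have 1, need 6)",
    "[2015-10-14 18:15:10.873470] E [MSGID: 122037] [ec-common.c:1502:ec_update_size_version_done] "
    "0-tiervolume-disperse-0: Failed to update version and size [Input/output error]",
]

counts = tally_errors(sample)
for (msgid, subvol), n in sorted(counts.items()):
    print(msgid, subvol, n)
```

In the excerpt, every MSGID 122034 entry is followed by a MSGID 122037 entry on the same subvolume, so a per-pair tally makes it easy to see whether both disperse subvolumes are affected equally.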





Version-Release number of selected component (if applicable):
=============================================================

glusterfs-3.7.5-0.19.git0f5c3e8.el7.centos.x86_64



Steps Carried Out:
==================

1. Created a 12-node cluster
2. Created a tiered volume with hot tier as (6 x 2) and cold tier as (2 x (6 + 2) = 16)
3. FUSE-mounted the volume on 3 clients: RHEL 7.2, RHEL 7.1 and RHEL 6.7
4. Started creating data from each client:

Client 1:
=========
[root@dj ~]# crefi --multi -n 10 -b 10 -d 10 --max=1024k --min=5k --random -T 5 -t text -I 5 --fop=create /mnt/fuse/

Client 2:
=========
[root@mia ~]# cd /mnt/fuse/
[root@mia fuse]# for i in {1..10}; do cp -rf /etc etc.$i ; sleep 100 ; done

Client 3:
=========
[root@wingo fuse]# for i in {1..999}; do dd if=/dev/zero of=dd.$i bs=1M count=1 ; sleep 10 ; done

5. After a while, the data creation from client 1 and client 2 should be complete, while the data creation from client 3 will still be in progress.

6. At this point, the only data being created is 1 file from client 3 every 10 seconds.
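
The arithmetic behind the steps above can be spelled out as follows. This is only an illustrative Python sketch under stated assumptions: "6 x 2" is read as 6 distribute subvolumes with replica 2, "2 x (6 + 2)" as 2 disperse subvolumes of 6 data plus 2 redundancy bricks, and the "need 6" in the log is taken to correspond to the 6 data bricks. The helper has_enough_children is a hypothetical name, not a glusterfs function.

```python
# Hot tier: 6 x 2 (assumed: 6 distribute subvolumes, replica 2).
hot_subvols, hot_replica = 6, 2
# Cold tier: 2 x (6 + 2) (assumed: 2 disperse subvolumes, 6 data + 2 redundancy).
cold_subvols, data_bricks, redundancy_bricks = 2, 6, 2

hot_bricks = hot_subvols * hot_replica
cold_bricks = cold_subvols * (data_bricks + redundancy_bricks)


def has_enough_children(available, needed=data_bricks):
    """Hypothetical check mirroring the "have N, need 6" wording in the log."""
    return available >= needed


# Client 3 workload: 999 files of 1 MiB each, with a 10 second sleep after each.
dd_files, dd_file_mib, dd_sleep_s = 999, 1, 10
dd_total_mib = dd_files * dd_file_mib
dd_min_seconds = dd_files * dd_sleep_s  # lower bound; ignores the time dd itself takes

print("hot tier bricks:", hot_bricks)
print("cold tier bricks:", cold_bricks)
print("have 1, need 6 ->", has_enough_children(1))
print("client 3 total MiB:", dd_total_mib, "min seconds:", dd_min_seconds)
```

Under these assumptions, a disperse subvolume that reports only 1 available child is far below the 6 it needs, which matches the "Insufficient available childs" entries in the description.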

Comment 2 Dan Lambright 2015-10-17 20:31:33 UTC
I could not find these logs on the longevity test machines. Can this be recreated and the logs saved someplace?

Comment 5 Rahul Hinduja 2015-11-01 11:30:22 UTC
Verified with build: glusterfs-3.7.5-5.el7rhgs.x86_64

[root@dhcp37-165 glusterfs]# 
[root@dhcp37-165 glusterfs]# grep -i "ec_update_size_version_done" tiervolume-*
[root@dhcp37-165 glusterfs]# 


Moving the bug to verified state.
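
The grep check above can also be scripted so that it runs as part of a longevity test. The following is a minimal Python sketch, not the verification actually performed: find_matches is a hypothetical helper, and the file name tiervolume-demo.log is made up for the demonstration, which runs against a temporary directory rather than a real glusterfs log directory.

```python
import glob
import os
import tempfile


def find_matches(pattern, needle):
    """Return (path, line) pairs for lines containing needle, ignoring case (like grep -i)."""
    hits = []
    for path in sorted(glob.glob(pattern)):
        with open(path, errors="replace") as fh:
            for line in fh:
                if needle.lower() in line.lower():
                    hits.append((path, line.rstrip("\n")))
    return hits


with tempfile.TemporaryDirectory() as logdir:
    log_path = os.path.join(logdir, "tiervolume-demo.log")  # hypothetical file name
    pattern = os.path.join(logdir, "tiervolume-*")

    # A log without the error: the search returns nothing, as in the verification.
    with open(log_path, "w") as fh:
        fh.write("no matching entries here\n")
    clean_hits = find_matches(pattern, "ec_update_size_version_done")

    # Append one entry quoted from the bug description: the search now finds it.
    with open(log_path, "a") as fh:
        fh.write(
            "[2015-10-14 18:15:09.270483] E [MSGID: 122037] "
            "[ec-common.c:1502:ec_update_size_version_done] 0-tiervolume-disperse-1: "
            "Failed to update version and size [Input/output error]\n"
        )
    error_hits = find_matches(pattern, "ec_update_size_version_done")

print(len(clean_hits), len(error_hits))
```

An empty result corresponds to the empty grep output shown in this comment.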

Comment 7 errata-xmlrpc 2016-03-01 05:41:42 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0193.html