Bug 1427159

Summary: possible repeatedly recursive healing of same file with background heal not happening when IO is going on
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Nag Pavan Chilakam <nchilaka>
Component: disperse
Assignee: Ashish Pandey <aspandey>
Status: CLOSED ERRATA
QA Contact: Nag Pavan Chilakam <nchilaka>
Severity: urgent
Priority: unspecified
Version: rhgs-3.2
CC: amukherj, aspandey, pkarampu, rcyriac, rhinduja, rhs-bugs, storage-qa-internal
Keywords: Regression
Target Release: RHGS 3.3.0
Hardware: Unspecified
OS: Unspecified
Fixed In Version: glusterfs-3.8.4-28
Clones: 1428673
Last Closed: 2017-09-21 04:33:25 UTC
Type: Bug
Bug Depends On: 1428673, 1459392
Bug Blocks: 1417147

Description Nag Pavan Chilakam 2017-02-27 13:44:13 UTC
Description of problem:
=========================
When a file requires heal and continuous IO is happening on it, the heal never seems to finish: the same file is healed again and again until all IO to it, read or write, is stopped. The problem is more serious when the ongoing IO consists of appends.
This causes two problems:
1) The same file requires heal many times, so healing takes a very long time: in the ideal case a 1GB heal finishes in about 2 minutes, but here it can take hours.
2) Unnecessary CPU cycles are spent healing the same file again and again.



Steps to reproduce:
===================
1) Started an append to a file on a 2x(4+2) volume from a fuse client:
 dd if=/dev/urandom bs=1MB count=10000 >>ddfile2
2) getfattr output from one of the bricks (all bricks healthy at this point). Because of eager lock, dirty stays set to 1, as the lock is not released when there is no other request:
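The xattr dumps in this report were taken on the brick backend with the same command that appears in the watch header further below; as I understand the EC xlator, the first 8 bytes of trusted.ec.dirty and trusted.ec.version track data and the last 8 track metadata:

 getfattr -d -m . -e hex ddfile2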
# file: ddfile2
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.bit-rot.version=0x020000000000000058b3f6720001a78c
trusted.ec.config=0x0000080602000200
trusted.ec.dirty=0x00000000000000010000000000000001
trusted.ec.size=0x0000000000000000
trusted.ec.version=0x00000000000000000000000000000000
trusted.gfid=0x5637add23aba4f7a9c3b9535dd6639a3

3) File size seen from one of the clients before bringing b2 down:
[root@dhcp35-45 ecvol]# for i in {1..100};do du -sh ddfile2 ;echo "##########";ls -lh ddfile2 ;sleep 30;done
1.0G	ddfile2
##########
-rw-r--r--. 2 root root 1012M Feb 27 18:46 ddfile2

4) Brought down b2.

5) The EC xattrs get updated periodically on all the healthy bricks:

# file: ddfile2
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.bit-rot.version=0x020000000000000058b3f6720001a78c
trusted.ec.config=0x0000080602000200
trusted.ec.dirty=0x000000000000000c000000000000000c
trusted.ec.size=0x0000000119ded040
trusted.ec.version=0x00000000000093c800000000000093c8
trusted.gfid=0x5637add23aba4f7a9c3b9535dd6639a3
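
The live captures in this report (the blocks headed "Every 2.0s:") were taken with watch; a minimal sketch of the two invocations, matching the headers shown:

 watch -n 2 'getfattr -d -m . -e hex ddfile2'
 watch -n 2 'gluster v heal ecvol info'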



6) Heal info output:
Every 2.0s: gluster v heal ecvol info                                                         Mon Feb 27 18:48:13 2017

Brick dhcp35-45.lab.eng.blr.redhat.com:/rhs/brick3/ecvol
/ddfile2
Status: Connected
Number of entries: 1

Brick dhcp35-130.lab.eng.blr.redhat.com:/rhs/brick3/ecvol
Status: Transport endpoint is not connected
Number of entries: -

Brick dhcp35-122.lab.eng.blr.redhat.com:/rhs/brick3/ecvol
/ddfile2
Status: Connected
Number of entries: 1

Brick dhcp35-23.lab.eng.blr.redhat.com:/rhs/brick3/ecvol
/ddfile2
Status: Connected
Number of entries: 1

Brick dhcp35-112.lab.eng.blr.redhat.com:/rhs/brick3/ecvol
/ddfile2
Status: Connected
Number of entries: 1

Brick dhcp35-138.lab.eng.blr.redhat.com:/rhs/brick3/ecvol
/ddfile2
Status: Connected
Number of entries: 1




7) Brought the brick back up. Xattrs on a healthy brick:
Every 2.0s: getfattr -d -m . -e hex ddfile2                                                   Mon Feb 27 18:50:16 2017

# file: ddfile2
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.bit-rot.version=0x020000000000000058b3f6720001a78c
trusted.ec.config=0x0000080602000200
trusted.ec.dirty=0x000000000000000c0000000000000000
trusted.ec.size=0x000000012a05f200
trusted.ec.version=0x0000000000009c400000000000009c40
trusted.gfid=0x5637add23aba4f7a9c3b9535dd6639a3




Now, with IO still going on, if we check the xattrs of both a healthy brick and the brick requiring heal, the METADATA heal completes immediately, but the data heal keeps happening again and again until all IO to that file is stopped.

If we watch the file size (using du -sh), the file appears to get healed again and again; ls -lh shows the size consistently, because the metadata heal has completed.


=====brick requiring heal ====
(note: I reran the case to show it to dev; in this run the file is testme, but the problem is seen consistently)
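The monitoring loop is presumably the same as in step 3, with ddfile2 replaced by testme:

 for i in {1..100};do du -sh testme ;echo "##########";ls -lh testme ;sleep 30;done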
######### brick is down
256M	testme
##########
-rw-r--r--. 2 root root 248M Feb 27 18:56 testme
4.7M	testme
##########
-rw-r--r--. 2 root root 611M Feb 27 18:58 testme
477M	testme
##########
-rw-r--r--. 2 root root 705M Feb 27 18:59 testme
110M	testme
##########
-rw-r--r--. 2 root root 798M Feb 27 18:59 testme
552M	testme
##########
-rw-r--r--. 2 root root 891M Feb 27 19:00 testme
1019M	testme
##########
-rw-r--r--. 2 root root 985M Feb 27 19:00 testme
442M	testme
##########
-rw-r--r--. 2 root root 1.1G Feb 27 19:01 testme
899M	testme
##########
-rw-r--r--. 2 root root 1.2G Feb 27 19:01 testme
1.5G	testme
##########
-rw-r--r--. 2 root root 1.3G Feb 27 19:02 testme
458M	testme
##########
-rw-r--r--. 2 root root 1.4G Feb 27 19:02 testme
935M	testme
##########
-rw-r--r--. 2 root root 1.5G Feb 27 19:03 testme
1.6G	testme
##########
-rw-r--r--. 2 root root 1.6G Feb 27 19:03 testme
124M	testme
##########
-rw-r--r--. 2 root root 1.6G Feb 27 19:04 testme
558M	testme
##########
-rw-r--r--. 2 root root 1.7G Feb 27 19:04 testme
1.1G	testme
##########
-rw-r--r--. 2 root root 1.8G Feb 27 19:05 testme
^C



=====healthy brick b1====


######### brick is down
1.0G	testme
##########
-rw-r--r--. 2 root root 516M Feb 27 18:58 testme
1.0G	testme
##########
-rw-r--r--. 2 root root 611M Feb 27 18:58 testme
1.0G	testme
##########
-rw-r--r--. 2 root root 705M Feb 27 18:59 testme
1.0G	testme
##########
-rw-r--r--. 2 root root 798M Feb 27 18:59 testme
1.0G	testme
##########
-rw-r--r--. 2 root root 891M Feb 27 19:00 testme
1.0G	testme
##########
-rw-r--r--. 2 root root 985M Feb 27 19:00 testme
2.0G	testme
##########
-rw-r--r--. 2 root root 1.1G Feb 27 19:01 testme
2.0G	testme
##########
-rw-r--r--. 2 root root 1.2G Feb 27 19:01 testme
2.0G	testme
##########
-rw-r--r--. 2 root root 1.3G Feb 27 19:02 testme
2.0G	testme
##########
-rw-r--r--. 2 root root 1.4G Feb 27 19:02 testme
2.0G	testme
##########
-rw-r--r--. 2 root root 1.5G Feb 27 19:03 testme
2.0G	testme
##########
-rw-r--r--. 2 root root 1.6G Feb 27 19:03 testme
2.0G	testme
##########



Now the append is complete; the xattrs show different sizes on the bricks, even though the data/metadata dirty bits have been cleaned up.
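A minimal way to compare the size xattr across the bricks (the brick path is taken from the heal info output above; the file sitting directly under the brick root is an assumption; run on each brick host):

 getfattr -n trusted.ec.size -e hex /rhs/brick3/ecvol/testme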

Comment 2 Nag Pavan Chilakam 2017-02-27 13:49:45 UTC
Note that I used 'service glusterd restart' to bring the brick online, in order to avoid restarting all self-heal daemons with 'start force'.
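
For reference, the two options (the second one also restarts the self-heal daemons, which this test avoided):

 service glusterd restart
 gluster volume start ecvol force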

Comment 3 Nag Pavan Chilakam 2017-02-27 14:19:12 UTC
With very high probability, this seems to be a regression introduced between dev builds of 3.2.
Also note that this problem can escape notice because of the spurious heal info entries we have in EC:
https://bugzilla.redhat.com/show_bug.cgi?id=1347257#c9
 https://bugzilla.redhat.com/show_bug.cgi?id=1347251#c7

Comment 8 Atin Mukherjee 2017-05-08 05:35:51 UTC
upstream patch : https://review.gluster.org/#/c/16985/

Comment 11 Atin Mukherjee 2017-06-07 07:52:15 UTC
downstream patch : https://code.engineering.redhat.com/gerrit/#/c/108397/

Comment 13 Nag Pavan Chilakam 2017-07-27 11:29:31 UTC
on_qa validation:
1) In a 4+2 volume, when we bring down one brick and bring it back online after some time while an append is happening, I don't see the file getting recursively healed again and again. The file gets healed completely, hence marking this case as PASS.
2) Seeing an issue where a file never completes healing (but no recursive-healing problem), for which I raised bug#1475789.

Moving this bug to verified, as the main problem mentioned in the description (the same as case 1 here) is fixed.


Version: 3.8.4-35

Comment 15 errata-xmlrpc 2017-09-21 04:33:25 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:2774
