Bug 1536334 - [Disperse] Implement open fd heal for disperse volume
Summary: [Disperse] Implement open fd heal for disperse volume
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: disperse
Version: 3.12
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Xavi Hernandez
QA Contact:
URL:
Whiteboard:
Depends On: 1431955
Blocks:
 
Reported: 2018-01-19 07:40 UTC by Xavi Hernandez
Modified: 2018-03-05 07:14 UTC
4 users

Fixed In Version: glusterfs-3.12.6
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1431955
Environment:
Last Closed: 2018-03-05 07:14:08 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Xavi Hernandez 2018-01-19 07:40:44 UTC
+++ This bug was initially created as a clone of Bug #1431955 +++

Description of problem:

When EC opens a file, it gets an fd from every brick; if a brick is down at that time, it will not have an fd from that subvolume.

If the brick comes back UP before a write is sent on that fd, we should open the fd on that brick as well, so that the write reaches it and an unnecessary heal is avoided later.
 

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

--- Additional comment from Pranith Kumar K on 2017-03-27 07:28:29 CEST ---

Sunil,
   When you do a dd on a file, then as long as the file is open you will see something like the following in the statedump of the client:

[xlator.protocol.client.ec2-client-0.priv]
fd.0.remote_fd=0
connecting=0
connected=1
total_bytes_read=7288220
ping_timeout=42
total_bytes_written=11045016
ping_msgs_sent=3
msgs_sent=19812

This entry should be present for each fd that is open, one per client xlator.
So with a 3 = 2+1 configuration we will have one entry for each of the client xlators. But if the brick was down at the time the file was opened, the entry won't be present. After bringing the brick back up and operating on the file, the file should be opened on that brick again. I think at the moment the operation gets converted to an anonymous-fd based operation, so it may not fail, but it is important to open the file again for all operations (like lk etc.) to function properly.
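
For reference, such a statedump can be taken directly on the client; the commands below are only a sketch that assumes a FUSE mount of the ec-vol volume and the default statedump directory /var/run/gluster:

# kill -USR1 $(pgrep -f 'glusterfs.*ec-vol' | head -n 1)
# grep -E 'xlator\.protocol\.client.*priv|remote_fd' /var/run/gluster/glusterdump.*.dump.*

Each client xlator that holds the open fd reports an fd.N.remote_fd entry in its .priv section; a brick that was down when the file was opened shows no such entry.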

--- Additional comment from Sunil Kumar Acharya on 2017-03-27 13:56:27 CEST ---

Steps to re-create/test (a scripted sketch of these steps follows the brick output below):

1. Created and mounted an EC (2+1) volume. Heal disabled.

[root@server3 ~]# gluster volume info
 
Volume Name: ec-vol
Type: Disperse
Volume ID: b676891f-392d-49a6-891c-8e7e3790658d
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: server1:/LAB/store/ec-vol
Brick2: server2:/LAB/store/ec-vol
Brick3: server3:/LAB/store/ec-vol
Options Reconfigured:
cluster.disperse-self-heal-daemon: disable               <<<<<
transport.address-family: inet
nfs.disable: on
disperse.background-heals: 0                             <<<<<
[root@server3 ~]#

2. Touched a file on the mountpoint.

# touch file

3. Brought down one of the brick processes.

4. Opened a file descriptor for the file.

# exec 30<> file

5. Brought up the brick process which was down.

6. Wrote to the FD.

# echo "abc" >&30

7. File status on client and bricks after the write completes.

Client:

[root@varada mount]# ls -lh file 
-rw-r--r--. 1 root root 4 Mar 27 17:11 file
[root@varada mount]# du -kh file 
1.0K	file
[root@varada mount]# 

Bricks:

[root@server1 ~]# du -kh /LAB/store/ec-vol/file
4.0K	/LAB/store/ec-vol/file
[root@server1 ~]# ls -lh /LAB/store/ec-vol/file
-rw-r--r-- 2 root root 0 Mar 27 17:08 /LAB/store/ec-vol/file
[root@server1 ~]# cat /LAB/store/ec-vol/file
[root@server1 ~]# 

[root@server2 ~]# du -kh /LAB/store/ec-vol/file
8.0K	/LAB/store/ec-vol/file
[root@server2 ~]# ls -lh /LAB/store/ec-vol/file
-rw-r--r-- 2 root root 512 Mar 27 17:11 /LAB/store/ec-vol/file
[root@server2 ~]# cat /LAB/store/ec-vol/file
abc
[root@server2 ~]# 

[root@server3 ~]# du -kh /LAB/store/ec-vol/file
8.0K	/LAB/store/ec-vol/file
[root@server3 ~]# ls -lh /LAB/store/ec-vol/file
-rw-r--r-- 2 root root 512 Mar 27 17:11 /LAB/store/ec-vol/file
[root@server3 ~]# cat /LAB/store/ec-vol/file
abc
abc
[root@server3 ~]#
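
The sequence above can also be collected into a rough script. This is only a sketch: VOL, MNT and BRICK_PID are placeholders for the actual volume name, mount point and the pid of the brick process to kill (taken from 'gluster volume status').

#!/bin/bash
# Rough sketch of steps 1-6 above; VOL, MNT and BRICK_PID are placeholders.
VOL=ec-vol
MNT=/mnt/ec
BRICK_PID=$1                          # pid of the brick to kill, from 'gluster volume status'

gluster volume set "$VOL" cluster.disperse-self-heal-daemon disable
gluster volume set "$VOL" disperse.background-heals 0

touch "$MNT/file"                     # step 2: create the file
kill -9 "$BRICK_PID"                  # step 3: bring one brick process down
exec 30<> "$MNT/file"                 # step 4: open an fd while that brick is down
gluster volume start "$VOL" force     # step 5: restart the killed brick process
sleep 5                               # allow the brick to reconnect
echo "abc" >&30                       # step 6: write through the already-open fd
exec 30>&-                            # close the fd

(As noted in the next comment, performance.lazy-open and performance.read-after-open also have to be adjusted for the open in step 4 to actually reach the bricks.)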

--- Additional comment from Ashish Pandey on 2017-04-09 13:08:05 CEST ---


We will also need to change some performance options so that an FD is actually opened in step 4 on those bricks which are UP (by default the open-behind translator defers the open):

1 - gluster v set vol performance.lazy-open no
2 - gluster v set vol performance.read-after-open yes

[root@apandey /]# gluster v info
 
Volume Name: vol
Type: Disperse
Volume ID: d007c6c2-98da-4cd9-8d5e-99e0e3f37012
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: apandey:/home/apandey/bricks/gluster/vol-1
Brick2: apandey:/home/apandey/bricks/gluster/vol-2
Brick3: apandey:/home/apandey/bricks/gluster/vol-3
Options Reconfigured:
disperse.background-heals: 0
cluster.disperse-self-heal-daemon: disable
performance.read-after-open: yes
performance.lazy-open: no
transport.address-family: inet
nfs.disable: on


[root@apandey glusterfs]# gluster v status
Status of volume: vol
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick apandey:/home/apandey/bricks/gluster/
vol-1                                       49152     0          Y       6297 
Brick apandey:/home/apandey/bricks/gluster/
vol-2                                       49153     0          Y       5865 
Brick apandey:/home/apandey/bricks/gluster/
vol-3                                       49154     0          Y       5884 
 
Task Status of Volume vol
------------------------------------------------------------------------------
There are no active volume tasks


After bringing the brick vol-1 UP and writing data on the FD:


[root@apandey glusterfs]# cat /home/apandey/bricks/gluster/vol-1/dir/file
[root@apandey glusterfs]# cat /home/apandey/bricks/gluster/vol-2/dir/file
abc
abc
abc
abc
abc
abc
abc
abc
abc
abc
abc
[root@apandey glusterfs]# cat /home/apandey/bricks/gluster/vol-3/dir/file
abc
abc
abc
abc
abc
abc
abc
abc
abc
abc
abc
abc
abc
abc
abc
abc
abc
abc
abc
abc
abc
abc
[root@apandey glusterfs]# 




[root@apandey glusterfs]# getfattr -m. -d -e hex /home/apandey/bricks/gluster/vol-*/dir/file
getfattr: Removing leading '/' from absolute path names
# file: home/apandey/bricks/gluster/vol-1/dir/file
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a757365725f686f6d655f743a733000
trusted.ec.config=0x0000080301000200
trusted.ec.dirty=0x000000000000000b000000000000000b
trusted.ec.size=0x0000000000000000
trusted.ec.version=0x00000000000000000000000000000001
trusted.gfid=0xf8cf475afa5e4873bf2274f45278f74f

# file: home/apandey/bricks/gluster/vol-2/dir/file
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a757365725f686f6d655f743a733000
trusted.bit-rot.version=0x020000000000000058ea10fd0005a7ee
trusted.ec.config=0x0000080301000200
trusted.ec.dirty=0x000000000000000c000000000000000c
trusted.ec.size=0x000000000000002c
trusted.ec.version=0x000000000000000c000000000000000d
trusted.gfid=0xf8cf475afa5e4873bf2274f45278f74f

# file: home/apandey/bricks/gluster/vol-3/dir/file
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a757365725f686f6d655f743a733000
trusted.bit-rot.version=0x020000000000000058ea110100063c59
trusted.ec.config=0x0000080301000200
trusted.ec.dirty=0x000000000000000c000000000000000c
trusted.ec.size=0x000000000000002c
trusted.ec.version=0x000000000000000c000000000000000d
trusted.gfid=0xf8cf475afa5e4873bf2274f45278f74f
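
The ec xattrs are hex encoded, so they can be decoded with printf. For example, the trusted.ec.size value reported by vol-2 and vol-3:

# printf '%d\n' 0x2c
44

i.e. those two bricks have recorded a 44-byte file, while vol-1 still reports trusted.ec.size=0x0, confirming that the brick that was down at open time missed the writes and will need a heal.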

Comment 1 Worker Ant 2018-01-19 08:29:49 UTC
REVIEW: https://review.gluster.org/19247 (cluster/ec: OpenFD heal implementation for EC) posted (#1) for review on release-3.12 by Xavier Hernandez

Comment 2 Worker Ant 2018-02-02 06:51:00 UTC
COMMIT: https://review.gluster.org/19247 committed in release-3.12 by "jiffin tony Thottan" <jthottan> with the commit message: cluster/ec: OpenFD heal implementation for EC

The existing EC code does not re-open (heal) an open FD on a brick that
comes back up, which can lead to unnecessary healing of the data later.

The fix implements healing of open FDs before carrying out file
operations on them, by attempting to open the FDs on the required
up nodes.

Backport of:
>BUG: 1431955

BUG: 1536334
Change-Id: Ib696f59c41ffd8d5678a484b23a00bb02764ed15
Signed-off-by: Sunil Kumar Acharya <sheggodu>
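
With this change in place, repeating the reproduction from the description should leave the ec xattrs consistent on all three bricks, because the fd is re-opened on the brick that came back up before the write is wound to it. A rough check, reusing the hostnames and brick paths from the outputs above (ssh access to the brick nodes is assumed):

for h in server1 server2 server3; do
    # trusted.ec.size and trusted.ec.version should now match on every brick
    ssh "$h" getfattr -m. -d -e hex /LAB/store/ec-vol/file | grep -E 'ec\.(size|version)'
done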

Comment 3 Jiffin 2018-03-05 07:14:08 UTC
This bug is being closed because a release has been made available that should address the reported issue. If the problem is still not fixed with glusterfs-3.12.6, please open a new bug report.

glusterfs-3.12.6 has been announced on the Gluster mailing lists [1], and packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/gluster-users/2018-February/033552.html
[2] https://www.gluster.org/pipermail/gluster-users/

