Bug 1509810 - [Disperse] Implement open fd heal for disperse volume
Summary: [Disperse] Implement open fd heal for disperse volume
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: disperse
Version: rhgs-3.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: RHGS 3.4.0
Assignee: Sunil Kumar Acharya
QA Contact: Upasana
URL:
Whiteboard:
Depends On: 1431955 1533023
Blocks: 1503134
 
Reported: 2017-11-06 07:02 UTC by Sunil Kumar Acharya
Modified: 2018-09-04 06:54 UTC
CC List: 8 users

Fixed In Version: glusterfs-3.12.2-2
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1431955
Environment:
Last Closed: 2018-09-04 06:39:09 UTC
Embargoed:


Attachments


Links
Red Hat Product Errata RHSA-2018:2607 (last updated 2018-09-04 06:40:27 UTC)

Description Sunil Kumar Acharya 2017-11-06 07:02:38 UTC
+++ This bug was initially created as a clone of Bug #1431955 +++

Description of problem:

When EC opens a file, it gets an fd from each of the bricks. If a brick is down at that time, EC will not have an fd for that subvolume.

If the brick comes back up before a write is sent on that fd, we should open the fd on that brick as well, so that the write reaches it and an unnecessary heal is avoided later.
 

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

--- Additional comment from Pranith Kumar K on 2017-03-27 01:28:29 EDT ---

Sunil,
   When you do a dd on a file, then as long as the file is open you will see something like the following in the statedump of the client:

[xlator.protocol.client.ec2-client-0.priv]
fd.0.remote_fd=0
connecting=0
connected=1
total_bytes_read=7288220
ping_timeout=42
total_bytes_written=11045016
ping_msgs_sent=3
msgs_sent=19812

This entry should be present for each of the fds that are open, once per client xlator.
So with a 3 = (2 + 1) configuration we will have one for each of the three client xlators. But if a brick was down at the time the file was opened, the entry won't be present for that brick. After bringing the brick back up and operating on the file, the file should be opened again on that brick. At the moment the fop gets converted to an anonymous-fd based operation, so it may not fail, but it is important to open the file again on that brick for all operations (lk, etc.) to function properly.
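
For reference, a rough sketch of how such a statedump can be captured on the client side (this assumes a FUSE mount of the ec2 volume and the default statedump directory /var/run/gluster; <client-pid> is a placeholder):

# Find the glusterfs client (FUSE mount) process for the volume.
pgrep -af 'glusterfs.*ec2'

# SIGUSR1 asks a gluster process to write a statedump, by default under
# /var/run/gluster as glusterdump.<pid>.dump.<timestamp>.
kill -USR1 <client-pid>

# Each client xlator that holds a real fd shows an fd.N.remote_fd entry in
# its section; a brick that was down when the file was opened shows none.
grep remote_fd /var/run/gluster/glusterdump.<client-pid>.dump.*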

--- Additional comment from Sunil Kumar Acharya on 2017-03-27 07:56:27 EDT ---

Steps to re-create/test:

1. Created and mounted an EC (2+1) volume with heal disabled (a rough sketch of the commands for steps 1, 3 and 5 follows the outputs below).

[root@server3 ~]# gluster volume info
 
Volume Name: ec-vol
Type: Disperse
Volume ID: b676891f-392d-49a6-891c-8e7e3790658d
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: server1:/LAB/store/ec-vol
Brick2: server2:/LAB/store/ec-vol
Brick3: server3:/LAB/store/ec-vol
Options Reconfigured:
cluster.disperse-self-heal-daemon: disable               <<<<<
transport.address-family: inet
nfs.disable: on
disperse.background-heals: 0                             <<<<<
[root@server3 ~]#

2. Touched a file on the mountpoint.

# touch file

3. Brought down one of the brick processes.

4. Opened a file descriptor for the file.

# exec 30<> file

5. Brought up the brick process which was down.

6. Wrote to the FD.

# echo "abc" >&30

7. File status on client and bricks after the write completes:

Client:

[root@varada mount]# ls -lh file 
-rw-r--r--. 1 root root 4 Mar 27 17:11 file
[root@varada mount]# du -kh file 
1.0K	file
[root@varada mount]# 

Bricks:

[root@server1 ~]# du -kh /LAB/store/ec-vol/file
4.0K	/LAB/store/ec-vol/file
[root@server1 ~]# ls -lh /LAB/store/ec-vol/file
-rw-r--r-- 2 root root 0 Mar 27 17:08 /LAB/store/ec-vol/file
[root@server1 ~]# cat /LAB/store/ec-vol/file
[root@server1 ~]# 

[root@server2 ~]# du -kh /LAB/store/ec-vol/file
8.0K	/LAB/store/ec-vol/file
[root@server2 ~]# ls -lh /LAB/store/ec-vol/file
-rw-r--r-- 2 root root 512 Mar 27 17:11 /LAB/store/ec-vol/file
[root@server2 ~]# cat /LAB/store/ec-vol/file
abc
[root@server2 ~]# 

[root@server3 ~]# du -kh /LAB/store/ec-vol/file
8.0K	/LAB/store/ec-vol/file
[root@server3 ~]# ls -lh /LAB/store/ec-vol/file
-rw-r--r-- 2 root root 512 Mar 27 17:11 /LAB/store/ec-vol/file
[root@server3 ~]# cat /LAB/store/ec-vol/file
abc
abc
[root@server3 ~]#
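
For completeness, a rough sketch of the commands behind steps 1, 3 and 5 (assuming the server1/server2/server3 bricks from the volume info above; the /mnt/ec-vol mountpoint and <brick-pid> are placeholders, and 'force' on the create is only needed when the bricks sit on the root filesystem):

# Step 1: create, configure and mount the 2+1 disperse volume, heal disabled.
gluster volume create ec-vol disperse 3 redundancy 1 \
    server1:/LAB/store/ec-vol server2:/LAB/store/ec-vol server3:/LAB/store/ec-vol force
gluster volume start ec-vol
gluster volume set ec-vol disperse.background-heals 0
gluster volume set ec-vol cluster.disperse-self-heal-daemon disable
mount -t glusterfs server1:/ec-vol /mnt/ec-vol

# Step 3: bring one brick down by killing its brick process.
gluster volume status ec-vol        # note the PID of, e.g., server1:/LAB/store/ec-vol
kill <brick-pid>

# Step 5: bring only the killed brick back up ('force' restarts bricks that
# are not running without touching the ones that are).
gluster volume start ec-vol force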

--- Additional comment from Ashish Pandey on 2017-04-09 07:08:05 EDT ---


We also need to adjust a couple of performance options so that an FD is actually
opened in step 4 on those bricks which are up:

1 - gluster v set vol performance.lazy-open no
2 - gluster v set vol performance.read-after-open yes
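
For reference, the unabbreviated form of those two commands (assuming the volume is named vol, as in the info below):

gluster volume set vol performance.lazy-open no
gluster volume set vol performance.read-after-open yes

Both options belong to the open-behind translator; with lazy-open enabled the client defers the real open on the bricks until it is strictly needed, so step 4 would not necessarily create an fd on the bricks that are up.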

[root@apandey /]# gluster v info
 
Volume Name: vol
Type: Disperse
Volume ID: d007c6c2-98da-4cd9-8d5e-99e0e3f37012
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: apandey:/home/apandey/bricks/gluster/vol-1
Brick2: apandey:/home/apandey/bricks/gluster/vol-2
Brick3: apandey:/home/apandey/bricks/gluster/vol-3
Options Reconfigured:
disperse.background-heals: 0
cluster.disperse-self-heal-daemon: disable
performance.read-after-open: yes
performance.lazy-open: no
transport.address-family: inet
nfs.disable: on


[root@apandey glusterfs]# gluster v status
Status of volume: vol
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick apandey:/home/apandey/bricks/gluster/
vol-1                                       49152     0          Y       6297 
Brick apandey:/home/apandey/bricks/gluster/
vol-2                                       49153     0          Y       5865 
Brick apandey:/home/apandey/bricks/gluster/
vol-3                                       49154     0          Y       5884 
 
Task Status of Volume vol
------------------------------------------------------------------------------
There are no active volume tasks


After bringing the brick vol-1 up and writing data on the FD:


[root@apandey glusterfs]# cat /home/apandey/bricks/gluster/vol-1/dir/file
[root@apandey glusterfs]# cat /home/apandey/bricks/gluster/vol-2/dir/file
abc
abc
abc
abc
abc
abc
abc
abc
abc
abc
abc
[root@apandey glusterfs]# cat /home/apandey/bricks/gluster/vol-3/dir/file
abc
abc
abc
abc
abc
abc
abc
abc
abc
abc
abc
abc
abc
abc
abc
abc
abc
abc
abc
abc
abc
abc
[root@apandey glusterfs]# 




[root@apandey glusterfs]# getfattr -m. -d -e hex /home/apandey/bricks/gluster/vol-*/dir/file
getfattr: Removing leading '/' from absolute path names
# file: home/apandey/bricks/gluster/vol-1/dir/file
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a757365725f686f6d655f743a733000
trusted.ec.config=0x0000080301000200
trusted.ec.dirty=0x000000000000000b000000000000000b
trusted.ec.size=0x0000000000000000
trusted.ec.version=0x00000000000000000000000000000001
trusted.gfid=0xf8cf475afa5e4873bf2274f45278f74f

# file: home/apandey/bricks/gluster/vol-2/dir/file
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a757365725f686f6d655f743a733000
trusted.bit-rot.version=0x020000000000000058ea10fd0005a7ee
trusted.ec.config=0x0000080301000200
trusted.ec.dirty=0x000000000000000c000000000000000c
trusted.ec.size=0x000000000000002c
trusted.ec.version=0x000000000000000c000000000000000d
trusted.gfid=0xf8cf475afa5e4873bf2274f45278f74f

# file: home/apandey/bricks/gluster/vol-3/dir/file
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a757365725f686f6d655f743a733000
trusted.bit-rot.version=0x020000000000000058ea110100063c59
trusted.ec.config=0x0000080301000200
trusted.ec.dirty=0x000000000000000c000000000000000c
trusted.ec.size=0x000000000000002c
trusted.ec.version=0x000000000000000c000000000000000d
trusted.gfid=0xf8cf475afa5e4873bf2274f45278f74f
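
A quick way to read those values (assuming, as the names suggest, that trusted.ec.size holds the logical file size and trusted.ec.version the data/metadata version counters):

# The two healthy bricks report a logical size of 0x2c bytes.
printf '%d\n' 0x2c    # -> 44

vol-2 and vol-3 agree on trusted.ec.size (0x2c) and on the version counters, while vol-1 is still at size 0 and version 1 with a non-zero dirty counter: it never received the writes because no fd was (re)opened on it after it came back up, which is exactly what the open fd heal is meant to address.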

--- Additional comment from Worker Ant on 2017-04-18 11:40:54 EDT ---

REVIEW: https://review.gluster.org/17077 (cluster/ec: OpenFD heal implementation for EC) posted (#1) for review on master by Sunil Kumar Acharya (sheggodu)

--- Additional comment from Worker Ant on 2017-04-18 11:47:46 EDT ---

REVIEW: https://review.gluster.org/17077 (cluster/ec: OpenFD heal implementation for EC) posted (#2) for review on master by Sunil Kumar Acharya (sheggodu)

--- Additional comment from Worker Ant on 2017-04-20 09:21:48 EDT ---

REVIEW: https://review.gluster.org/17077 (cluster/ec: OpenFD heal implementation for EC) posted (#3) for review on master by Sunil Kumar Acharya (sheggodu)

--- Additional comment from Worker Ant on 2017-04-27 07:39:20 EDT ---

REVIEW: https://review.gluster.org/17077 (cluster/ec: OpenFD heal implementation for EC) posted (#4) for review on master by Sunil Kumar Acharya (sheggodu)

--- Additional comment from Worker Ant on 2017-05-03 09:22:11 EDT ---

REVIEW: https://review.gluster.org/17077 (cluster/ec: OpenFD heal implementation for EC) posted (#5) for review on master by Sunil Kumar Acharya (sheggodu)

--- Additional comment from Worker Ant on 2017-05-16 02:29:34 EDT ---

REVIEW: https://review.gluster.org/17077 (cluster/ec: OpenFD heal implementation for EC) posted (#6) for review on master by Sunil Kumar Acharya (sheggodu)

--- Additional comment from Worker Ant on 2017-05-16 10:06:24 EDT ---

REVIEW: https://review.gluster.org/17077 (cluster/ec: OpenFD heal implementation for EC) posted (#7) for review on master by Sunil Kumar Acharya (sheggodu)

--- Additional comment from Worker Ant on 2017-05-30 06:50:05 EDT ---

REVIEW: https://review.gluster.org/17077 (cluster/ec: OpenFD heal implementation for EC) posted (#8) for review on master by Sunil Kumar Acharya (sheggodu)

--- Additional comment from Worker Ant on 2017-05-31 11:08:27 EDT ---

REVIEW: https://review.gluster.org/17077 (cluster/ec: OpenFD heal implementation for EC) posted (#9) for review on master by Sunil Kumar Acharya (sheggodu)

--- Additional comment from Worker Ant on 2017-06-05 11:36:58 EDT ---

REVIEW: https://review.gluster.org/17077 (cluster/ec: OpenFD heal implementation for EC) posted (#10) for review on master by Sunil Kumar Acharya (sheggodu)

--- Additional comment from Worker Ant on 2017-06-06 07:58:57 EDT ---

REVIEW: https://review.gluster.org/17077 (cluster/ec: OpenFD heal implementation for EC) posted (#11) for review on master by Sunil Kumar Acharya (sheggodu)

--- Additional comment from Worker Ant on 2017-06-08 14:18:03 EDT ---

REVIEW: https://review.gluster.org/17077 (cluster/ec: OpenFD heal implementation for EC) posted (#12) for review on master by Sunil Kumar Acharya (sheggodu)

--- Additional comment from Worker Ant on 2017-07-20 09:51:07 EDT ---

REVIEW: https://review.gluster.org/17077 (cluster/ec: OpenFD heal implementation for EC) posted (#13) for review on master by Sunil Kumar Acharya (sheggodu)

--- Additional comment from Worker Ant on 2017-08-24 08:10:49 EDT ---

REVIEW: https://review.gluster.org/17077 (cluster/ec: OpenFD heal implementation for EC) posted (#14) for review on master by Sunil Kumar Acharya (sheggodu)

--- Additional comment from Worker Ant on 2017-09-12 09:05:28 EDT ---

REVIEW: https://review.gluster.org/17077 (cluster/ec: OpenFD heal implementation for EC) posted (#15) for review on master by Sunil Kumar Acharya (sheggodu)

--- Additional comment from Worker Ant on 2017-09-22 07:38:17 EDT ---

REVIEW: https://review.gluster.org/17077 (cluster/ec: OpenFD heal implementation for EC) posted (#16) for review on master by Sunil Kumar Acharya (sheggodu)

--- Additional comment from Worker Ant on 2017-10-11 11:42:36 EDT ---

REVIEW: https://review.gluster.org/17077 (cluster/ec: OpenFD heal implementation for EC) posted (#17) for review on master by Sunil Kumar Acharya (sheggodu)

Comment 2 Sunil Kumar Acharya 2017-11-06 07:04:09 UTC
Upstream Patch : https://review.gluster.org/17077

Comment 7 errata-xmlrpc 2018-09-04 06:39:09 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:2607

