Bug 798874
| Summary: | self-heal hangs in case of metadata, data self-heal w.o. any changelog | | |
|---|---|---|---|
| Product: | [Community] GlusterFS | Reporter: | Pranith Kumar K <pkarampu> |
| Component: | replicate | Assignee: | Pranith Kumar K <pkarampu> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Shwetha Panduranga <shwetha.h.panduranga> |
| Severity: | high | Docs Contact: | |
| Priority: | high | | |
| Version: | pre-release | CC: | gluster-bugs, rodrigo |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | glusterfs-3.4.0 | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2013-07-24 17:28:32 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 817967 | | |
Description
Pranith Kumar K
2012-03-01 07:08:45 UTC
CHANGE: http://review.gluster.com/2849 (cluster/afr: Reset re-usable sh args in sh_*_done) merged in master by Vijay Bellur (vijay)

CHANGE: http://review.gluster.com/2928 (replicate: backport of 0783ca994d9ea95fd9ab3dd95d6407918f19f255) merged in release-3.2 by Anand Avati (avati)

Waiting for inputs from pranith to verify the bug.

Steps to recreate the bug:-
---------------------------
1. Create a replicate volume (1x2: brick1 and brick2).
2. From the backend, create a file "file1" on both brick1 and brick2. The size and ownership of "file1" should differ between brick1 and brick2.
3. Set the "background-self-heal-count" option to "0" on the volume.
4. Start the volume.
5. Create a FUSE/NFS mount.
6. ls <file1> from the mount.
7. cat <file1> from the mount.

Expected Result:-
-----------------
1. ls should not hang, and lookup of the file should succeed (ls, ls -l, stat).
2. cat should report EIO.
3. GFIDs are assigned to the file; no AFR changelog extended attributes are set.
4. rm -f <file1> should be successful.

Verification run on a 1x3 replicate volume (a sketch of the helper scripts used here follows the volume-creation output below):-
---------------------------------------------------------------------------------------------------------------------------------
[06/12/12 - 08:20:45 root@AFR-Server1 ~]# glusterd
[06/12/12 - 08:21:11 root@AFR-Server1 ~]# ./peer_probe.sh
Probe successful
Probe successful
Number of Peers: 2
Hostname: 10.16.159.188
Uuid: b0784ecf-5412-4c6d-a9ca-f104c2a31497
State: Peer in Cluster (Connected)
Hostname: 10.16.159.196
Uuid: ac29a04f-35e0-4ec4-8caa-b3169c8a194d
State: Peer in Cluster (Connected)
[06/12/12 - 08:21:41 root@AFR-Server1 ~]# ./create_vol_1_3.sh
Creation of volume vol has been successful. Please start the volume to access data.
[06/12/12 - 08:21:52 root@AFR-Server1 ~]#
[06/12/12 - 08:21:57 root@AFR-Server1 ~]#
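The peer_probe.sh and create_vol_1_3.sh helper scripts used above are not attached to this report. A minimal sketch of equivalent commands, assuming the three servers and brick paths shown in the gluster v info output below, would be roughly:

    # hypothetical stand-in for peer_probe.sh: probe the other two servers from AFR-Server1
    gluster peer probe 10.16.159.188
    gluster peer probe 10.16.159.196
    gluster peer status

    # hypothetical stand-in for create_vol_1_3.sh: 1x3 replicate volume "vol"
    gluster volume create vol replica 3 \
        10.16.159.184:/export_b1/dir1 \
        10.16.159.188:/export_b1/dir1 \
        10.16.159.196:/export_b1/dir1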
[06/12/12 - 08:21:58 root@AFR-Server1 ~]# gluster v set vol background-self-heal-count 0
Set volume successful
[06/12/12 - 08:22:22 root@AFR-Server1 ~]# gluster v info
Volume Name: vol
Type: Replicate
Volume ID: 0c1cf7ba-abd9-47da-aba0-379776511854
Status: Created
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 10.16.159.184:/export_b1/dir1
Brick2: 10.16.159.188:/export_b1/dir1
Brick3: 10.16.159.196:/export_b1/dir1
Options Reconfigured:
cluster.background-self-heal-count: 0
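Setting cluster.background-self-heal-count to 0 stops self-heal from being handed off to the background, so any heal triggered by lookup/open runs in the foreground of that operation; this is what makes the reported hang directly visible as a stuck ls/cat. To return to the default after testing, a reset along these lines (not part of the original run) should work:

    gluster volume reset vol cluster.background-self-heal-count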
Create file "file1" on Brick1 from backend:-
-------------------------------------------
[06/12/12 - 08:22:29 root@AFR-Server1 ~]# echo "Data From Brick1" > /export_b1/dir1/file1
[06/12/12 - 08:23:47 root@AFR-Server1 ~]# ls -lh /export_b1/dir1/file1
-rw-r--r--. 1 root root 17 Jun 12 08:23 /export_b1/dir1/file1
Create file "file1" on Brick2 from backend:-
-------------------------------------------
[06/12/12 - 08:25:25 root@AFR-Server2 ~]# echo "Data From Brick2" > /export_b1/dir1/file1
[06/12/12 - 08:25:32 root@AFR-Server2 ~]# chown qa /export_b1/dir1/file1
[06/12/12 - 08:25:38 root@AFR-Server2 ~]# ls -lh /export_b1/dir1/file1
-rw-r--r--. 1 qa root 17 Jun 12 08:25 /export_b1/dir1/file1
Create file "file1" on Brick3 from backend:-
-------------------------------------------
[06/12/12 - 08:24:27 root@AFR-Server3 ~]# for i in {1..100}; do echo "This is from Brick3" >> /export_b1/dir1/file1; done
[06/12/12 - 08:24:59 root@AFR-Server3 ~]# ls -lh /export_b1/dir1/file1
-rw-r--r--. 1 root root 2.0K Jun 12 08:24 /export_b1/dir1/file1
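At this point the three backend copies intentionally differ in content, size and ownership, and none has been written through the volume. A quick divergence check before starting the volume (not part of the original run; brick path as above) could be:

    # run on each of the three servers; checksums and owners should all differ
    md5sum /export_b1/dir1/file1
    stat -c '%s bytes, owner %U' /export_b1/dir1/file1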
[06/12/12 - 08:25:53 root@AFR-Server1 ~]# gluster v start vol
Starting volume vol has been successful
[06/12/12 - 08:25:57 root@AFR-Server1 ~]# gluster v status
Status of volume: vol
Gluster process Port Online Pid
------------------------------------------------------------------------------
Brick 10.16.159.184:/export_b1/dir1 24009 Y 16069
Brick 10.16.159.188:/export_b1/dir1 24009 Y 10729
Brick 10.16.159.196:/export_b1/dir1 24009 Y 25019
NFS Server on localhost 38467 Y 16075
Self-heal Daemon on localhost N/A Y 16081
NFS Server on 10.16.159.188 38467 Y 10735
Self-heal Daemon on 10.16.159.188 N/A Y 10740
NFS Server on 10.16.159.196 38467 Y 25024
Self-heal Daemon on 10.16.159.196 N/A Y 25031
Mount Output:-
--------------
[06/12/12 - 08:25:49 root@ARF-Client1 ~]# mount -t glusterfs 10.16.159.184:/vol /mnt/gfsc1
[06/12/12 - 08:26:06 root@ARF-Client1 ~]# cd /mnt/gfsc1
[06/12/12 - 08:26:11 root@ARF-Client1 gfsc1]# ls
file1
[06/12/12 - 08:26:13 root@ARF-Client1 gfsc1]# ls -lh file1
-rw-r--r--. 1 qa root 17 Jun 12 08:25 file1
[06/12/12 - 08:26:19 root@ARF-Client1 gfsc1]# stat file1
File: `file1'
Size: 17 Blocks: 8 IO Block: 131072 regular file
Device: 15h/21d Inode: 12957359811459890708 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 501/ qa) Gid: ( 0/ root)
Access: 2012-06-12 08:26:19.226629936 -0400
Modify: 2012-06-12 08:25:32.996155593 -0400
Change: 2012-06-12 08:26:13.824264595 -0400
[06/12/12 - 08:26:23 root@ARF-Client1 gfsc1]# cat file1
cat: file1: Input/output error
[06/12/12 - 08:26:32 root@ARF-Client1 gfsc1]# rm file1
rm: remove regular file `file1'? y
[06/12/12 - 08:28:14 root@ARF-Client1 gfsc1]# ls
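The EIO from cat is the expected split-brain-style outcome once lookup no longer hangs. Because the copies carry no AFR changelog at all, the file may not appear in the usual heal listings, but the state can still be inspected from the CLI and the FUSE client log (volume and mount names taken from above; 3.3-era commands):

    # entries AFR has flagged as split-brain (may be empty in the no-changelog case)
    gluster volume heal vol info split-brain

    # FUSE client log for the /mnt/gfsc1 mount
    grep -i split-brain /var/log/glusterfs/mnt-gfsc1.log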
Brick1 xattrs:-
-------------
[06/12/12 - 08:26:00 root@AFR-Server1 ~]# getfattr -d -m. -ehex /export_b1/dir1/file1
getfattr: Removing leading '/' from absolute path names
# file: export_b1/dir1/file1
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
trusted.gfid=0x6dc5370120d14a0bb3d1ca18f4fd3e14
[06/12/12 - 08:26:38 root@AFR-Server1 ~]# getfattr -d -m. -ehex /export_b1/dir1/
getfattr: Removing leading '/' from absolute path names
# file: export_b1/dir1/
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.volume-id=0x0c1cf7baabd947daaba0379776511854
[06/12/12 - 08:26:41 root@AFR-Server1 ~]# ls -lh /export_b1/dir1/file1
-rw-r--r--. 2 root root 17 Jun 12 08:23 /export_b1/dir1/file1
[06/12/12 - 08:27:49 root@AFR-Server1 ~]# cat /export_b1/dir1/file1
Data From Brick1
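Note that the getfattr dump above (and the matching Brick2 and Brick3 dumps below) shows only the SELinux label and trusted.gfid, with no AFR changelog keys at all, which is the "w.o. any changelog" condition in the summary. For comparison, a copy written through the mount would normally also carry pending-counter keys named trusted.afr.<volname>-client-<N>; an illustrative check (example values, not from this run):

    # for volume "vol" a healthy replica copy would typically show keys such as
    #   trusted.afr.vol-client-0=0x000000000000000000000000
    #   trusted.afr.vol-client-1=0x000000000000000000000000
    #   trusted.afr.vol-client-2=0x000000000000000000000000
    getfattr -d -m trusted.afr -e hex /export_b1/dir1/file1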
Brick2 xattrs:-
---------------
[06/12/12 - 08:25:41 root@AFR-Server2 ~]# getfattr -d -m. -ehex /export_b1/dir1/file1
getfattr: Removing leading '/' from absolute path names
# file: export_b1/dir1/file1
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
trusted.gfid=0x6dc5370120d14a0bb3d1ca18f4fd3e14
[06/12/12 - 08:26:55 root@AFR-Server2 ~]# getfattr -d -m. -ehex /export_b1/dir1/
getfattr: Removing leading '/' from absolute path names
# file: export_b1/dir1/
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.volume-id=0x0c1cf7baabd947daaba0379776511854
[06/12/12 - 08:27:04 root@AFR-Server2 ~]# ls -lh /export_b1/dir1/file1
-rw-r--r--. 2 qa root 17 Jun 12 08:25 /export_b1/dir1/file1
[06/12/12 - 08:27:44 root@AFR-Server2 ~]# cat /export_b1/dir1/file1
Data From Brick2
Brick3 xattrs:-
---------------
[06/12/12 - 08:27:16 root@AFR-Server3 ~]# getfattr -d -m. -ehex /export_b1/dir1/file1
getfattr: Removing leading '/' from absolute path names
# file: export_b1/dir1/file1
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
trusted.gfid=0x6dc5370120d14a0bb3d1ca18f4fd3e14
[06/12/12 - 08:27:22 root@AFR-Server3 ~]# getfattr -d -m. -ehex /export_b1/dir1/
getfattr: Removing leading '/' from absolute path names
# file: export_b1/dir1/
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.volume-id=0x0c1cf7baabd947daaba0379776511854
[06/12/12 - 08:27:25 root@AFR-Server3 ~]# ls -lh /export_b1/dir1/file1
-rw-r--r--. 2 root root 2.0K Jun 12 08:24 /export_b1/dir1/file1
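For completeness, the per-brick checks above can be collected from one node in a single pass; a rough sketch, assuming passwordless ssh between the servers (which the report does not state):

    for h in 10.16.159.184 10.16.159.188 10.16.159.196; do
        echo "== $h =="
        ssh "$h" "ls -lh /export_b1/dir1/file1; getfattr -d -m. -e hex /export_b1/dir1/file1"
    done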
Verified the bug on 3.3.0qa45.