Bug 1146906 - glusterfs process crashed when performing stat on a file from a snap volume mount
Status: CLOSED ERRATA
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: distribute
Version: 3.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Release: RHGS 3.0.3
Assigned To: Nithya Balachandran
QA Contact: Anil Shah
Keywords: ZStream
Blocks: 1159280 1162694 1162767 1199057
 
Reported: 2014-09-26 06:26 EDT by spandura
Modified: 2015-05-13 13:42 EDT
CC: 4 users

Fixed In Version: glusterfs-3.6.0.32-1
Doc Type: Bug Fix
Last Closed: 2015-01-15 08:40:29 EST
Type: Bug




External Trackers:
  Tracker ID: Red Hat Product Errata RHBA-2015:0038
  Priority: normal
  Status: SHIPPED_LIVE
  Summary: Red Hat Storage 3.0 enhancement and bug fix update #3
  Last Updated: 2015-01-15 13:35:28 EST

Description spandura 2014-09-26 06:26:47 EDT
Description of problem:
========================
In a distribute-replicate volume, all files and directories were deleted from the actual volume via a FUSE mount (to simulate accidental removal of files). The files were then copied back to the actual volume from one of the snapshots taken earlier. While calculating the arequal-checksum on the snap volume mount point, the glusterfs mount process for the snap volume crashed with the log and backtrace below.
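A minimal sketch of that workflow, assuming hypothetical names (server1, testvol, /mnt/testvol, /mnt/snap) and the usual "arequal-checksum -p <path>" invocation:

# Mount the actual volume and the snapshot volume (names are examples).
mount -t glusterfs server1:/testvol /mnt/testvol
mount -t glusterfs server1:/snaps/snap2/testvol /mnt/snap
rm -rf /mnt/testvol/*              # simulate accidental deletion
cp -rp /mnt/snap/* /mnt/testvol/   # copy the data back from the snapshot
arequal-checksum -p /mnt/snap      # the snap mount process crashed here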

[2014-09-26 06:54:02.000347] I [dht-common.c:1892:dht_lookup_cbk] 0-84ffc336efc54efe893ab182bf8107bb-dht: linkfile not having link subvol for /E_new_dir.1/E_new_file.3
pending frames:
frame : type(1) op(LOOKUP)
frame : type(0) op(0)
patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash: 
2014-09-26 06:54:02
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.6.0.29
/usr/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xb6)[0x397981ff06]
/usr/lib64/libglusterfs.so.0(gf_print_trace+0x33f)[0x397983a59f]
/lib64/libc.so.6[0x34066329a0]
/usr/lib64/glusterfs/3.6.0.29/xlator/cluster/distribute.so(dht_lookup_everywhere_done+0x6e3)[0x7fd81ed43c03]
/usr/lib64/glusterfs/3.6.0.29/xlator/cluster/distribute.so(dht_lookup_everywhere_cbk+0x403)[0x7fd81ed485c3]
/usr/lib64/glusterfs/3.6.0.29/xlator/cluster/replicate.so(afr_lookup_cbk+0x558)[0x7fd81efcbb18]
/usr/lib64/glusterfs/3.6.0.29/xlator/protocol/client.so(client3_3_lookup_cbk+0x647)[0x7fd81f20a267]
/usr/lib64/libgfrpc.so.0(rpc_clnt_handle_reply+0xa5)[0x397a00e9c5]
/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x13f)[0x397a00fe4f]
/usr/lib64/libgfrpc.so.0(rpc_transport_notify+0x28)[0x397a00b668]
/usr/lib64/glusterfs/3.6.0.29/rpc-transport/socket.so(+0x9275)[0x7fd81f84e275]
/usr/lib64/glusterfs/3.6.0.29/rpc-transport/socket.so(+0xac5d)[0x7fd81f84fc5d]
/usr/lib64/libglusterfs.so.0[0x3979876367]
/usr/sbin/glusterfs(main+0x603)[0x407e93]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x340661ed1d]
/usr/sbin/glusterfs[0x4049a9]
---------
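The "linkfile not having link subvol" message just before the crash refers to a DHT linkfile: a zero-byte pointer file, marked with the sticky bit (mode ---------T) and a trusted.glusterfs.dht.linkto xattr, that DHT leaves on the hashed subvolume when the data actually lives on another subvolume. The backtrace lands in dht_lookup_everywhere_done, which suggests the lookup hit a linkfile whose pointer could not be resolved. A suspected linkfile can be inspected directly on the brick backend; the brick path below is hypothetical:

# A DHT linkfile shows mode ---------T and carries a linkto xattr.
ls -l /bricks/brick1/E_new_dir.1/E_new_file.3
getfattr -n trusted.glusterfs.dht.linkto -e text \
    /bricks/brick1/E_new_dir.1/E_new_file.3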


Version-Release number of selected component (if applicable):
=============================================================
glusterfs 3.6.0.29 built on Sep 18 2014 23:55:46

How reproducible:
====================
1/1

Steps to Reproduce:
======================
1. Create a 2 x 2 distribute-replicate volume and start it. (See the command sketch after this list.)

2. From 2 client machines, create 1 FUSE mount each.

3. From the 1st client machine's FUSE mount, execute the script "self_heal_sanity_create.sh".

4. From both client FUSE mount points, calculate the arequal-checksum.

5. Create snapshot snap1

6. Set self-heal-daemon to off

7. Crash brick1 and brick3 (using the godown utility).

8. From the 1st client machine's FUSE mount, execute the script "self_heal_sanity_modify.sh".

9. From both client FUSE mount points, calculate the arequal-checksum.

10. Bring back brick1 and brick3

11. Immediately create snapshot snap2.

12. From both client FUSE mount points, calculate the arequal-checksum. (This self-heals all the data.)

13. From the 1st client machine's FUSE mount, perform "rm -rf *" (simulating accidental deletion of data).

14. Create a FUSE mount for the snap volume "snap2" from one of the clients.

15. From the snap volume mount, calculate the arequal-checksum.

16. Copy the contents from the snap mount point to the actual volume mount point: "cp -rp * <actual_volume_mount>"

17. Calculate the arequal-checksum of both snap_mount_point and actual_volume_mount_point.

Expected: they should be the same.
Actual: they differ.

18. Unmount the snap volume and remount it with the option "use-readdirp=NO".

19. Calculate the arequal-checksum.
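A hypothetical command transcript for the key steps above; the server, brick, volume, and mount names are illustrative, not taken from this report, and godown is assumed to be the xfstests disk-shutdown utility:

# step 1: create and start a 2 x 2 distribute-replicate volume
gluster volume create testvol replica 2 \
    server1:/bricks/brick1 server2:/bricks/brick2 \
    server3:/bricks/brick3 server4:/bricks/brick4
gluster volume start testvol

# step 2: one FUSE mount per client
mount -t glusterfs server1:/testvol /mnt/testvol

# steps 5-7: snapshot, disable the self-heal daemon, crash two bricks
gluster snapshot create snap1 testvol
gluster volume set testvol cluster.self-heal-daemon off
godown /bricks/brick1    # repeat for brick3

# steps 11 and 14: second snapshot, activate it, mount it
gluster snapshot create snap2 testvol
gluster snapshot activate snap2
mount -t glusterfs server1:/snaps/snap2/testvol /mnt/snap2

# step 18: remount the snap volume with readdirp disabled
umount /mnt/snap2
mount -t glusterfs -o use-readdirp=no server1:/snaps/snap2/testvol /mnt/snap2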

Actual results:
===============
Observed the crash.

Subsequently, executing stat or ls on the file crashes the process.
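For example (mount point hypothetical; the file path is the one from the log above):

# Any subsequent metadata lookup on the affected file reproduces the crash.
stat /mnt/snap2/E_new_dir.1/E_new_file.3
ls -l /mnt/snap2/E_new_dir.1/E_new_file.3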

Expected results:
=================
The glusterfs process should not crash.
Comment 9 Anil Shah 2014-12-11 07:16:14 EST
Successfully verified the bug on build glusterfs 3.6.0.36.
Comment 11 errata-xmlrpc 2015-01-15 08:40:29 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-0038.html
