Bug 1292808 - [USS]: Snapd related core generated while accessing snapshot after its recreation.
Status: NEW
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: snapshot
Version: 3.1
Hardware: x86_64 Linux
Priority: unspecified
Severity: urgent
Assigned To: Bug Updates Notification Mailing List
QA Contact: storage-qa-internal@redhat.com
Keywords: Triaged, ZStream
Reported: 2015-12-18 07:15 EST by Shashank Raj
Modified: 2018-02-05 13:08 EST
CC List: 2 users

Doc Type: Bug Fix
Type: Bug

Attachments: None
Description Shashank Raj 2015-12-18 07:15:03 EST
Description of problem:
A snapd-related core is generated while accessing a snapshot after its recreation.

Version-Release number of selected component (if applicable):
glusterfs-3.7.5-12

How reproducible:
twice

Steps to Reproduce: 

This was observed during an automation run; snippets from the run are included at the steps where errors and failures appear. A consolidated shell sketch of the steps follows the list.

1. Create a volume and start it.
2. Attach a tier to the volume.
3. Mount the volume and create a file under the mount point:
   echo "Hello" > /mnt/glusterfs/file
4. Enable USS on the volume.
5. Create a snapshot and activate it.
6. Run the following on the client and observe that it gives the error message below:

   "stat /mnt/glusterfs/.snaps >/dev/null 2>&1 && cd /mnt/glusterfs/.snaps/snap0 && ls >/dev/null && cat file" on dhcp35-15.lab.eng.blr.redhat.com: RETCODE is 0
   2015-12-18 17:15:48,838 ERROR uss_check_file_content Content of file does not match

7. Delete the snapshot.
8. Write new content to the file:
   echo "Namaskara" > /mnt/glusterfs/file
9. Create the snapshot again with the same name and activate it.
10. Try to access the file from the snapshot and observe that it fails with "Transport endpoint is not connected":
    cat /mnt/glusterfs/.snaps/snap0/file
    cat: /mnt/glusterfs/.snaps/snap0/file: Transport endpoint is not connected
11. A core related to snapd is observed on the node.
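
For reference, the steps above roughly correspond to the shell commands below. This is only a sketch: the volume name (vol0), hostnames, and brick paths are placeholders, bricks must sit on thinly provisioned LVM for gluster snapshots to work, and the attach-tier syntax shown is the glusterfs 3.7-era form.

# Server side (placeholder hosts/bricks; bricks must be on thin-provisioned LVM for snapshots)
gluster volume create vol0 server1:/bricks/b1 server2:/bricks/b2
gluster volume start vol0
gluster volume attach-tier vol0 server1:/bricks/hot1 server2:/bricks/hot2
gluster volume set vol0 features.uss enable

# Client side
mount -t glusterfs server1:/vol0 /mnt/glusterfs
echo "Hello" > /mnt/glusterfs/file

# First snapshot and USS access
gluster snapshot create snap0 vol0
gluster snapshot activate snap0
stat /mnt/glusterfs/.snaps >/dev/null 2>&1 && cat /mnt/glusterfs/.snaps/snap0/file

# Delete, rewrite the file, recreate the snapshot with the same name, access again
gluster snapshot delete snap0          # answer 'y' at the confirmation prompt
echo "Namaskara" > /mnt/glusterfs/file
gluster snapshot create snap0 vol0
gluster snapshot activate snap0
cat /mnt/glusterfs/.snaps/snap0/file   # fails with "Transport endpoint is not connected"; snapd dumps core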

Actual results:
snapd crashed

Expected results:
Snapd should not crash.

Additional info:
The following backtrace was observed:

#0  pthread_spin_lock () at ../nptl/sysdeps/x86_64/pthread_spin_lock.S:24
#1  0x00007fb7b41f7059 in inode_ctx_get0 (inode=0x7fb7819e13d4, xlator=xlator@entry=0x7fb79c0228f0, value1=value1@entry=0x7fb7a48eeb90)
    at inode.c:2089
#2  0x00007fb7b41f70e8 in inode_needs_lookup (inode=0x7fb7819e13d4, this=0x7fb79c0228f0) at inode.c:1872
#3  0x00007fb7a6779286 in __glfs_resolve_inode (fs=fs@entry=0x7fb79c0008e0, subvol=subvol@entry=0x7fb77c024e20, object=object@entry=0x7fb79c03eb10)
    at glfs-resolve.c:997
#4  0x00007fb7a677938b in glfs_resolve_inode (fs=fs@entry=0x7fb79c0008e0, subvol=subvol@entry=0x7fb77c024e20, object=object@entry=0x7fb79c03eb10)
    at glfs-resolve.c:1023
#5  0x00007fb7a677a7d2 in pub_glfs_h_open (fs=0x7fb79c0008e0, object=object@entry=0x7fb79c03eb10, flags=flags@entry=0) at glfs-handleops.c:634
#6  0x00007fb7a698fbe5 in svs_open (frame=0x7fb7b1ce0230, this=0x7fb7a0005e80, loc=0x7fb7b178606c, flags=0, fd=0x7fb7a0021d2c, 
    xdata=<optimized out>) at snapview-server.c:1887
#7  0x00007fb7b41e4eba in default_open_resume (frame=0x7fb7b1ce002c, this=0x7fb7a0009850, loc=0x7fb7b178606c, flags=0, fd=0x7fb7a0021d2c, xdata=0x0)
    at defaults.c:1415
#8  0x00007fb7b420417d in call_resume (stub=0x7fb7b178602c) at call-stub.c:2576
#9  0x00007fb7a5ab8363 in iot_worker (data=0x7fb7a001cc70) at io-threads.c:215
#10 0x00007fb7b303cdc5 in start_thread (arg=0x7fb7a48ef700) at pthread_create.c:308
#11 0x00007fb7b29831cd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
Comment 2 Shashank Raj 2015-12-18 07:24:12 EST
sosreports and core are placed at http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/1292808
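
For anyone inspecting the core, a minimal gdb session along these lines should reproduce the backtrace in the description (the binary path, core filename, and debuginfo package name here are assumptions; snapd runs as a glusterfsd process, so the glusterfs debuginfo packages are what provide the symbols):

debuginfo-install glusterfs            # assumed package name; pulls glusterfs-debuginfo on RHEL
gdb /usr/sbin/glusterfsd /path/to/<core-file>
(gdb) thread apply all bt full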
Comment 3 Shashank Raj 2016-02-04 01:17:09 EST
This bug is reproducible every time we run the automated test and should be looked into.
