Bug 764920 (GLUSTER-3188) - Still getting gfid mismatches on "sed -i" renames
Summary: Still getting gfid mismatches on "sed -i" renames
Keywords:
Status: CLOSED DUPLICATE of bug 764196
Alias: GLUSTER-3188
Product: GlusterFS
Classification: Community
Component: replicate
Version: 3.1.5
Hardware: x86_64
OS: Linux
Priority: urgent
Severity: high
Target Milestone: ---
Assignee: Pranith Kumar K
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2011-07-18 16:12 UTC by Joe Julian
Modified: 2011-07-21 23:56 UTC
CC: 2 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:
Regression: ---
Mount Type: fuse
Documentation: ---
CRM:
Verified Versions:


Attachments
log with debug/trace (207.33 KB, application/x-gzip), 2011-07-20 11:47 UTC, Joe Julian
files and script that create mismatched gfids (3.94 KB, application/x-gzip), 2011-07-20 12:42 UTC, Joe Julian
One-time pass that created mismatched gfids (35.26 KB, application/x-gzip), 2011-07-20 12:43 UTC, Joe Julian

Description Joe Julian 2011-07-18 16:12:27 UTC
After upgrading to 3.1.5, "sed -i" scripts still create gfid mismatch errors.

The directory bridge/ is rm -rf'd and mkdir'd before create_tables.sql is created. The file is created with a stdout redirect, then processed with 'sed -i'.
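The pattern above can be sketched as follows (paths illustrative; run from the GlusterFS mount). GNU `sed -i` edits by writing a temporary file and renaming it over the original, so the file gets a new inode on every edit, and it is that rename that triggers the gfid mismatch:

```shell
#!/bin/sh
# Recreate the workflow from the description: bridge/ is removed and
# re-created, create_tables.sql is written via a stdout redirect, then
# edited in place. `sed -i` writes a temp file and renames it over the
# original, so the inode changes.
set -e
rm -rf bridge
mkdir bridge
echo 'CREATE TABLE t (id INT);' > bridge/create_tables.sql
before=$(stat -c %i bridge/create_tables.sql)
sed -i 's/INT/INTEGER/' bridge/create_tables.sql
after=$(stat -c %i bridge/create_tables.sql)
echo "inode before: $before, after: $after"
```

Running this in a loop against the mount reproduces the rename-heavy workload described in this report.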

Volume Name: share1
Type: Distributed-Replicate
Status: Started
Number of Bricks: 4 x 3 = 12
Transport-type: tcp
Bricks:
Brick1: ewcs2:/var/spool/glusterfs/a_share1
Brick2: ewcs4:/var/spool/glusterfs/a_share1
Brick3: ewcs7:/var/spool/glusterfs/a_share1
Brick4: ewcs2:/var/spool/glusterfs/b_share1
Brick5: ewcs4:/var/spool/glusterfs/b_share1
Brick6: ewcs7:/var/spool/glusterfs/b_share1
Brick7: ewcs2:/var/spool/glusterfs/c_share1
Brick8: ewcs4:/var/spool/glusterfs/c_share1
Brick9: ewcs7:/var/spool/glusterfs/c_share1
Brick10: ewcs2:/var/spool/glusterfs/d_share1
Brick11: ewcs4:/var/spool/glusterfs/d_share1
Brick12: ewcs7:/var/spool/glusterfs/d_share1

[2011-07-18 03:35:01.455389] W [dht-common.c:657:dht_lookup_linkfile_cbk] 0-share1-dht: /bridge/create_tables.sql: gfid different on data file on share1-replicate-1
[2011-07-18 03:35:01.489529] E [afr-lk-common.c:569:afr_unlock_inodelk_cbk] 0-share1-replicate-3: /bridge/create_tables.sql: unlock failed No such file or directory
[2011-07-18 03:35:01.489666] E [afr-lk-common.c:569:afr_unlock_inodelk_cbk] 0-share1-replicate-3: /bridge/create_tables.sql: unlock failed No such file or directory
[2011-07-18 03:35:01.489713] E [afr-lk-common.c:569:afr_unlock_inodelk_cbk] 0-share1-replicate-3: /bridge/create_tables.sql: unlock failed No such file or directory
[2011-07-18 03:35:01.688496] W [fuse-bridge.c:582:fuse_fd_cbk] 0-glusterfs-fuse: 10781: OPEN() /bridge/create_tables.sql => -1 (Invalid argument)
[2011-07-18 03:35:01.689867] W [fuse-bridge.c:582:fuse_fd_cbk] 0-glusterfs-fuse: 10782: OPEN() /bridge/create_tables.sql => -1 (Invalid argument)
[2011-07-18 03:35:13.652498] W [dht-common.c:657:dht_lookup_linkfile_cbk] 0-share1-dht: /bridge/create_tables.sql: gfid different on data file on share1-replicate-1
[2011-07-18 04:05:22.893785] E [rpc-clnt.c:199:call_bail] 0-share1-client-9: bailing out frame type(GlusterFS 3.1) op(INODELK(29)) xid = 0x6297x sent = 2011-07-18 03:35:13.679729. timeout = 1800
[2011-07-18 04:35:23.619224] E [rpc-clnt.c:199:call_bail] 0-share1-client-10: bailing out frame type(GlusterFS 3.1) op(INODELK(29)) xid = 0x6188x sent = 2011-07-18 04:05:22.893927. timeout = 1800
[2011-07-18 05:05:24.459560] E [rpc-clnt.c:199:call_bail] 0-share1-client-11: bailing out frame type(GlusterFS 3.1) op(INODELK(29)) xid = 0x5987x sent = 2011-07-18 04:35:23.632024. timeout = 1800
[2011-07-18 05:05:24.463618] W [fuse-bridge.c:582:fuse_fd_cbk] 0-glusterfs-fuse: 32281: OPEN() /bridge/create_tables.sql => -1 (Invalid argument)
[2011-07-18 07:26:51.743494] W [fuse-bridge.c:905:fuse_unlink_cbk] 0-glusterfs-fuse: 87650: UNLINK() /bridge/create_tables.sql => -1 (Invalid argument)
[2011-07-18 07:26:54.714613] W [fuse-bridge.c:905:fuse_unlink_cbk] 0-glusterfs-fuse: 87748: UNLINK() /bridge/create_tables.sql => -1 (Invalid argument)
[2011-07-18 08:02:51.611195] W [fuse-bridge.c:905:fuse_unlink_cbk] 0-glusterfs-fuse: 103497: UNLINK() /bridge/create_tables.sql => -1 (Invalid argument)
[2011-07-18 08:07:30.172676] I [fuse-bridge.c:3218:fuse_thread_proc] 0-fuse: unmounting /mnt/gluster/share1
[2011-07-18 08:07:30.184897] I [glusterfsd.c:712:cleanup_and_exit] 0-glusterfsd: shutting down

Comment 1 Joe Julian 2011-07-18 22:38:51 UTC
After this happens the first time, if it happens a second time the client will hang in readv() at fuse-bridge.c:3124.

Comment 2 Joe Julian 2011-07-18 22:40:44 UTC
#0  0x00007f96bbb7b265 in sigwait () from /lib64/libpthread.so.0
#1  0x0000000000403c6b in glusterfs_sigwaiter ()
#2  0x00007f96bbb737e1 in start_thread () from /lib64/libpthread.so.0
#3  0x00007f96bb8ce52d in clone () from /lib64/libc.so.6

Thread 3 (Thread 0x7f96b934a710 (LWP 27076)):
#0  0x00007f96bb893afd in nanosleep () from /lib64/libc.so.6
#1  0x00007f96bb8c78f4 in usleep () from /lib64/libc.so.6
#2  0x00007f96bc3de31d in gf_timer_proc (ctx=0xd73010) at timer.c:182
#3  0x00007f96bbb737e1 in start_thread () from /lib64/libpthread.so.0
#4  0x00007f96bb8ce52d in clone () from /lib64/libc.so.6

Thread 2 (Thread 0x7f96b4fcb710 (LWP 27077)):
#0  0x00007f96bb8c6667 in readv () from /lib64/libc.so.6
#1  0x00007f96ba7db765 in fuse_thread_proc (data=<value optimized out>) at fuse-bridge.c:3124
#2  0x00007f96bbb737e1 in start_thread () from /lib64/libpthread.so.0
#3  0x00007f96bb8ce52d in clone () from /lib64/libc.so.6

Thread 1 (Thread 0x7f96bc834700 (LWP 27072)):
#0  0x00007f96bb8ceb23 in epoll_wait () from /lib64/libc.so.6
#1  0x00007f96bc3eedf7 in event_dispatch_epoll (event_pool=0xd73350) at event.c:859
#2  0x00000000004049fb in main ()

Comment 3 Joe Julian 2011-07-20 11:47:55 UTC
Created attachment 563 (log with debug/trace)

Comment 4 Joe Julian 2011-07-20 12:42:20 UTC
Created attachment 564 (files and script that create mismatched gfids)

Comment 5 Joe Julian 2011-07-20 12:43:28 UTC
Created attachment 565 (One-time pass that created mismatched gfids)


I ran the script once and looked at the backend. The files had mismatched gfids.
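On the bricks, a mismatch shows up as differing trusted.gfid extended attributes on the replicas' copies of the file. A hypothetical helper for checking the local bricks of this volume (paths taken from the volume info above; trusted.* xattrs are readable only as root, and on a multi-node setup each host would check its own bricks):

```shell
#!/bin/sh
# Print trusted.gfid for each local brick copy of the file so a
# mismatch is visible at a glance. Bricks that do not exist on this
# host are skipped.
f=bridge/create_tables.sql
checked=0
for sub in a b c d; do
    p=/var/spool/glusterfs/${sub}_share1/$f
    [ -e "$p" ] || continue
    checked=$((checked + 1))
    printf '%s: ' "$p"
    getfattr --absolute-names --only-values -n trusted.gfid -e hex "$p" \
        || printf '(no trusted.gfid readable)'
    echo
done
echo "bricks checked: $checked"
```

If the hex values printed for two replicas of the same file differ, that file has the mismatch described in this report.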

Comment 6 Joe Julian 2011-07-21 20:56:02 UTC

*** This bug has been marked as a duplicate of bug 2464 ***

