Bug 1529237 - File can't access if wrong file mode is set in gluster filesystem
Summary: File can't access if wrong file mode is set in gluster filesystem
Keywords:
Status: CLOSED UPSTREAM
Alias: None
Product: GlusterFS
Classification: Community
Component: core
Version: mainline
Hardware: x86_64
OS: Linux
low
medium
Target Milestone: ---
Assignee: Kotresh HR
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2017-12-27 09:02 UTC by George
Modified: 2020-03-12 12:55 UTC (History)
4 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-03-12 12:55:29 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:


Attachments (Terms of Use)
Test script and log for this issue (19.35 MB, application/zip)
2017-12-27 09:02 UTC, George
no flags Details

Description George 2017-12-27 09:02:59 UTC
Created attachment 1372733 [details]
Test script and log for this issue

Description of problem:

write 2 long variable and other text into a new file which store in glusterfs filesystem, and then read and check whether the value is just wrote or not, after try about 100 times into different new files, it easy to reproduce that the read value is not correct like just wrote.

if test the same procedure in EXT4 filesytem, the issue will not be exist.

Version-Release number of selected component (if applicable):
3.12.3

How reproducible:


Steps to Reproduce:
1. create a gluster volume named "test", and mount to "/mnt/test" for this volume.  (we use replicate mode , have SN-0, SN-1 2 SN node, and a Client Node)
2. copy a shell script _loop.sh and file_test.c in attachment
3. compile file_test.c with binary output "file_test" (could use command "make file_test")
4. run _loop.sh , format is _loop.sh <try_times> <save_to_glusterwhere>

Actual results:


/mnt/test is mounted from gluster storage.

[root@mn-0:/home/robot]
# ./_loop.sh 100 /mnt/test/test__1
Loop completed 100 times without errors.
[root@mn-0:/home/robot]
# ./_loop.sh 100 /mnt/test/test__2
Read values do not match with written data: -2064763152 and -1074501584
Error 6 returned on loop 3
[root@mn-0:/home/robot]
# ./_loop.sh 100 /mnt/test/test__3
Read values do not match with written data: -24283408 and -1440483296
Error 6 returned on loop 39
[root@mn-0:/home/robot]
# ./_loop.sh 100 /mnt/test/test__4
Loop completed 100 times without errors.
[root@mn-0:/home/robot]
# ./_loop.sh 100 /mnt/test/test__5
Read values do not match with written data: 438253296 and -2074572768
Error 6 returned on loop 34
[root@mn-0:/home/robot]
# ./_loop.sh 100 /mnt/test/test__6
Read values do not match with written data: -972917008 and -231886816
Error 6 returned on loop 3
[root@mn-0:/home/robot]
# ./_loop.sh 100 /mnt/test/test__7
Read values do not match with written data: -434657552 and 906727456
Error 6 returned on loop 6



Expected results:
Expect the test result like run the same command in ext4 ("/mnt/tmp" is EXT4 filesystem)
[root@mn-0:/home/robot]
# ./_loop.sh 100 /mnt/tmp/test__1
Loop completed 100 times without errors.
[root@mn-0:/home/robot]
# ./_loop.sh 100 /mnt/tmp/test__2
Loop completed 100 times without errors.
[root@mn-0:/home/robot]
# ./_loop.sh 100 /mnt/tmp/test__3
Loop completed 100 times without errors.
[root@mn-0:/home/robot]
# ./_loop.sh 100 /mnt/tmp/test__4
Loop completed 100 times without errors.
[root@mn-0:/home/robot]
# ./_loop.sh 100 /mnt/tmp/test__5
Loop completed 100 times without errors.
[root@mn-0:/home/robot]
# ./_loop.sh 100 /mnt/tmp/test__6
Loop completed 100 times without errors.
[root@mn-0:/home/robot]
# ./_loop.sh 100 /mnt/tmp/test__7
Loop completed 100 times without errors.
[root@mn-0:/home/robot]



Additional info:
with the below performance related translator disabled , the issue still is exist:
gluster volume set test stat-prefetch off
gluster volume set test read-ahead off
gluster volume set test write-behind off
gluster volume set test io-cache off
gluster volume set test quick-read off
gluster volume set test open-behind off
# gluster v info test

Volume Name: test
Type: Replicate
Volume ID: 2f602a72-2836-45c5-a14f-aaedab65f95d
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: sn-0.local:/mnt/bricks/test/brick
Brick2: sn-1.local:/mnt/bricks/test/brick
Options Reconfigured:
diagnostics.brick-log-level: TRACE
diagnostics.client-log-level: TRACE
performance.open-behind: off
performance.quick-read: off
performance.io-cache: off
performance.write-behind: off
performance.read-ahead: off
performance.stat-prefetch: off
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off
cluster.server-quorum-ratio: 51%

I also attached the debug log when run the test FYI, and attach some I think useful log abstract from the attachment below FYI:

[2017-12-27 06:21:04.332524] T [MSGID: 0] [afr-inode-read.c:286:afr_fstat_cbk] 12-stack-trace: stack-address: 0x7fb5ec010680, test-replicate-0 returned 0
[2017-12-27 06:21:04.332724] T [MSGID: 0] [syncop.c:1715:syncop_fgetxattr] 12-stack-trace: stack-address: 0x7fb5fc0618d0, winding from test-dht to test-replicate-0
[2017-12-27 06:21:04.332768] D [MSGID: 0] [afr-read-txn.c:220:afr_read_txn] 12-test-replicate-0: 70e0f7ee-2f0a-4364-a38e-96e7031eee7b: generation now vs cached: 2, 2
[2017-12-27 06:21:04.332791] T [MSGID: 0] [afr-inode-read.c:1728:afr_fgetxattr_wind] 12-stack-trace: stack-address: 0x7fb5fc0618d0, winding from test-replicate-0 to test-client-0
[2017-12-27 06:21:04.332817] T [rpc-clnt.c:1496:rpc_clnt_record] 12-test-client-0: Auth Info: pid: 5257, uid: 0, gid: 0, owner: 0000000000000000
[2017-12-27 06:21:04.332837] T [rpc-clnt.c:1353:rpc_clnt_record_build_header] 12-rpc-clnt: Request fraglen 140, payload: 64, rpc hdr: 76
[2017-12-27 06:21:04.332862] T [rpc-clnt.c:1699:rpc_clnt_submit] 12-rpc-clnt: submitted request (XID: 0x2681 Program: GlusterFS 3.3, ProgVers: 330, Proc: 35) to rpc-transport (test-client-0)
[2017-12-27 06:21:04.333943] T [rpc-clnt.c:675:rpc_clnt_reply_init] 12-test-client-0: received rpc message (RPC XID: 0x2681 Program: GlusterFS 3.3, ProgVers: 330, Proc: 35) from rpc-transport (test-client-0)
[2017-12-27 06:21:04.334013] D [MSGID: 0] [client-rpc-fops.c:1143:client3_3_fgetxattr_cbk] 12-test-client-0: remote operation failed: No data available
[2017-12-27 06:21:04.334036] D [MSGID: 0] [client-rpc-fops.c:1151:client3_3_fgetxattr_cbk] 12-stack-trace: stack-address: 0x7fb5fc0618d0, test-client-0 returned -1 error: No data available [No data available]
[2017-12-27 06:21:04.334114] T [MSGID: 0] [afr-common.c:1306:afr_inode_refresh_subvol_with_fstat] 12-stack-trace: stack-address: 0x7fb5fc0618d0, winding from test-replicate-0 to test-client-0
[2017-12-27 06:21:04.334174] T [rpc-clnt.c:1496:rpc_clnt_record] 12-test-client-0: Auth Info: pid: 5257, uid: 0, gid: 0, owner: 0000000000000000
[2017-12-27 06:21:04.334196] T [rpc-clnt.c:1353:rpc_clnt_record_build_header] 12-rpc-clnt: Request fraglen 336, payload: 260, rpc hdr: 76
[2017-12-27 06:21:04.334229] T [rpc-clnt.c:1699:rpc_clnt_submit] 12-rpc-clnt: submitted request (XID: 0x2682 Program: GlusterFS 3.3, ProgVers: 330, Proc: 25) to rpc-transport (test-client-0)
[2017-12-27 06:21:04.334280] T [MSGID: 0] [afr-common.c:1306:afr_inode_refresh_subvol_with_fstat] 12-stack-trace: stack-address: 0x7fb5fc0618d0, winding from test-replicate-0 to test-client-1
[2017-12-27 06:21:04.334311] T [rpc-clnt.c:1496:rpc_clnt_record] 12-test-client-1: Auth Info: pid: 5257, uid: 0, gid: 0, owner: 0000000000000000
[2017-12-27 06:21:04.334330] T [rpc-clnt.c:1353:rpc_clnt_record_build_header] 12-rpc-clnt: Request fraglen 336, payload: 260, rpc hdr: 76
[2017-12-27 06:21:04.334354] T [rpc-clnt.c:1699:rpc_clnt_submit] 12-rpc-clnt: submitted request (XID: 0x252c Program: GlusterFS 3.3, ProgVers: 330, Proc: 25) to rpc-transport (test-client-1)
[2017-12-27 06:21:04.334904] T [rpc-clnt.c:675:rpc_clnt_reply_init] 12-test-client-1: received rpc message (RPC XID: 0x252c Program: GlusterFS 3.3, ProgVers: 330, Proc: 25) from rpc-transport (test-client-1)
[2017-12-27 06:21:04.334983] T [MSGID: 0] [client-rpc-fops.c:1462:client3_3_fstat_cbk] 12-stack-trace: stack-address: 0x7fb5fc0618d0, test-client-1 returned 0
[2017-12-27 06:21:04.335128] T [rpc-clnt.c:675:rpc_clnt_reply_init] 12-test-client-0: received rpc message (RPC XID: 0x2682 Program: GlusterFS 3.3, ProgVers: 330, Proc: 25) from rpc-transport (test-client-0)
[2017-12-27 06:21:04.335165] T [MSGID: 0] [client-rpc-fops.c:1462:client3_3_fstat_cbk] 12-stack-trace: stack-address: 0x7fb5fc0618d0, test-client-0 returned 0
[2017-12-27 06:21:04.335193] T [MSGID: 0] [afr-inode-read.c:1728:afr_fgetxattr_wind] 12-stack-trace: stack-address: 0x7fb5fc0618d0, winding from test-replicate-0 to test-client-1
[2017-12-27 06:21:04.335224] T [rpc-clnt.c:1496:rpc_clnt_record] 12-test-client-1: Auth Info: pid: 5257, uid: 0, gid: 0, owner: 0000000000000000
[2017-12-27 06:21:04.335253] T [rpc-clnt.c:1353:rpc_clnt_record_build_header] 12-rpc-clnt: Request fraglen 140, payload: 64, rpc hdr: 76
[2017-12-27 06:21:04.335284] T [rpc-clnt.c:1699:rpc_clnt_submit] 12-rpc-clnt: submitted request (XID: 0x252d Program: GlusterFS 3.3, ProgVers: 330, Proc: 35) to rpc-transport (test-client-1)
[2017-12-27 06:21:04.335870] T [rpc-clnt.c:675:rpc_clnt_reply_init] 12-test-client-1: received rpc message (RPC XID: 0x252d Program: GlusterFS 3.3, ProgVers: 330, Proc: 35) from rpc-transport (test-client-1)
[2017-12-27 06:21:04.335902] D [MSGID: 0] [client-rpc-fops.c:1143:client3_3_fgetxattr_cbk] 12-test-client-1: remote operation failed: No data available
[2017-12-27 06:21:04.335922] D [MSGID: 0] [client-rpc-fops.c:1151:client3_3_fgetxattr_cbk] 12-stack-trace: stack-address: 0x7fb5fc0618d0, test-client-1 returned -1 error: No data available [No data available]
[2017-12-27 06:21:04.335950] D [MSGID: 0] [afr-inode-read.c:1720:afr_fgetxattr_wind] 12-stack-trace: stack-address: 0x7fb5fc0618d0, test-replicate-0 returned -1 error: No data available [No data available]

Comment 1 George 2017-12-28 06:52:45 UTC
Sorry, some code error in test code while call system call open
wrong in attachment: open(argv[1], O_CREAT|O_RDWR|S_IRUSR|S_IWUSR) 
should be          : open(argv[1], O_CREAT|O_RDWR,S_IRUSR|S_IWUSR)

But it still seems has some issue while the file access in client with root privilege, in the brick, the file could be read , but in the client, the file has no content at all while the really size is "120".

in sn-0 storage brick, the file could be access:
# hexdump testfile.39
0000000 3039 0000 0000 0000 d431 0000 0000 0000
0000010 6d2f 746e 742f 7365 2f74 5f61 325f 742f
0000020 7365 6674 6c69 2e65 3933 0000 0000 0000
0000030 0000 0000 0000 0000 0000 0000 0000 0000
*
0000070 0000 0000 7ffc 0000
0000078
[root@sn-0:/mnt/bricks/test/brick/a__2]


But in mounted gluster fs in client, the file can't be read.

[root@mn-0:/mnt/test/a__2]
# hexdump testfile.39
[root@mn-0:/mnt/test/a__2]
# echo $?
0
[root@mn-0:/mnt/test/a__2]
# ls -l testfile.39
---------T 1 root root 120 Dec 28 07:16 testfile.39
[root@mn-0:/mnt/test/a__2]

Comment 2 Shyamsundar 2018-10-23 14:54:54 UTC
Release 3.12 has been EOLd and this bug was still found to be in the NEW state, hence moving the version to mainline, to triage the same and take appropriate actions.

Comment 3 Worker Ant 2020-03-12 12:55:29 UTC
This bug is moved to https://github.com/gluster/glusterfs/issues/966, and will be tracked there from now on. Visit GitHub issues URL for further details


Note You need to log in before you can comment on or make changes to this bug.