+++ This bug was initially created as a clone of Bug #1218940 +++

Description of problem:
tests/basic/fops-sanity.t fails randomly

Version-Release number of selected component (if applicable):

How reproducible:
Randomly on the regression systems.
However, it is consistently reproducible if tests/basic/fops-sanity.c is modified to add the following line immediately before the call to open():

fprintf(stdout, "Test string");

Steps to Reproduce:
1.
2.
3.

Actual results:
Result: PASS
tests/basic/fops-sanity.t ..
1..11
ok 1
ok 2
No volumes present
ok 3
ok 4
ok 5
ok 6
ok 7
ok 8
volume set: success
ok 9
read failed: No data available
read returning junk
fd based file operation 1 failed
read failed: No data available
read returning junk
fstat failed : No data available
fd based file operation 2 failed
read failed: No data available
read returning junk
dup fd based file operation failed
not ok 10
FAILED COMMAND: ./fops-sanity damn /home/nbalachandran/Projects/gluster/code/glusterfs
ok 11
Failed 1/11 subtests

Expected results:
Test should pass

Additional info:

--- Additional comment from Nithya Balachandran on 2015-05-06 04:59:51 EDT ---

The man page for open() states that the mode must be passed in if the flags include O_CREAT. The calls to open() in fops-sanity.c did not include the mode argument. As a result the file was created with an arbitrary mode, which sometimes produced a 'T' file (sticky bit set). DHT then treated the file as a linkto file instead of a data file, causing the test to fail.

Fix: Provide a mode argument to the open() calls.
Patch posted at: http://review.gluster.org/#/c/10590/
Additional information:

open() appears to pass in a random (?) value for the mode even if not provided in the call. This mode appears to include the T bit and ends up creating a DHT 'linkto' file.

Fragments of the strace outputs for both the failed and fixed test are as follows:

Working Strace output:
----------------------------------------------
arch_prctl(ARCH_SET_FS, 0x7f0fb9a8c740) = 0
mprotect(0x3062bb4000, 16384, PROT_READ) = 0
mprotect(0x604000, 4096, PROT_READ) = 0
mprotect(0x306261f000, 4096, PROT_READ) = 0
munmap(0x7f0fb9a8f000, 116173) = 0
open("temp-xattr-test-file_2", O_RDWR|O_CREAT, 0666) = 3 <==== correct mode
ftruncate(3, 0) = 0
write(3, "This is my second string\n", 25) = 25
lseek(3, 0, SEEK_SET) = 0
read(3, "This is my second string\n", 25) = 25
fstat(3, {st_mode=S_IFREG|0644, st_size=25, ...}) = 0
fchmod(3, 0640) = 0
fchown(3, 10001, 10001) = 0
fsync(3) = 0
fsetxattr(3, "trusted.xattr-test", "working", 8, 0) = 0
fdatasync(3) = 0
flistxattr(3, NULL, 0) = 36
fgetxattr(3, "trusted.xattr-test", 0x0, 0) = 8
fremovexattr(3, "trusted.xattr-test") = 0
close(3) = 0
unlink("temp-xattr-test-file_2") = 0
fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 9), ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f0fb9aab000
write(1, "fd based file operation 2 passed"..., 33fd based file operation 2 passed
) = 33
exit_group(0)
Strace without the fix. As this is random behaviour, the mode here maps to a non-linkto file so the tests pass in this case:

mprotect(0x604000, 4096, PROT_READ) = 0
mprotect(0x306261f000, 4096, PROT_READ) = 0
munmap(0x7fb8dd4f8000, 116173) = 0
open("temp-xattr-test-file_2", O_RDWR|O_CREAT, 03777775624535142) = 3 <=== weird mode
ftruncate(3, 0) = 0
write(3, "This is my second string\n", 25) = 25
lseek(3, 0, SEEK_SET) = 0
read(3, "This is my second string\n", 25) = 25
fstat(3, {st_mode=S_IFREG|S_ISVTX|0140, st_size=25, ...}) = 0
fchmod(3, 0640) = 0
fchown(3, 10001, 10001) = 0
fsync(3) = 0
fsetxattr(3, "trusted.xattr-test", "working", 8, 0) = 0
fdatasync(3) = 0
flistxattr(3, NULL, 0) = 36
fgetxattr(3, "trusted.xattr-test", 0x0, 0) = 8
fremovexattr(3, "trusted.xattr-test") = 0
close(3) = 0
unlink("temp-xattr-test-file_2") = 0
fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 9), ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fb8dd514000
write(1, "fd based file operation 2 passed"..., 33fd based file operation 2 passed
) = 33
exit_group(0) = ?
This test can be made to fail consistently by adding an fprintf(stdout, ..) before the open():

Strace without the fix and with additional fprintf(). The mode here maps to a linkto file so the tests fail:

mprotect(0x306261f000, 4096, PROT_READ) = 0
munmap(0x7f3e795c1000, 116173) = 0
fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 9), ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f3e795dd000
open("temp-xattr-test-file_2", O_RDWR|O_CREAT, 03014256725020) = 3
ftruncate(3, 0) = 0
write(3, "This is my second string\n", 25) = 25
lseek(3, 0, SEEK_SET) = 0
read(3, 0x7fffd6b88750, 25) = -1 ENODATA (No data available)
write(2, "read failed: No data available\n", 31read failed: No data available
) = 31
write(2, "read returning junk\n", 20read returning junk
) = 20
fstat(3, 0x7fffd6b887d0) = -1 ENODATA (No data available)
write(2, "fstat failed : No data available"..., 33fstat failed : No data available
) = 33
fchmod(3, 0640) = 0
fchown(3, 10001, 10001) = 0
fsync(3) = 0
fsetxattr(3, "trusted.xattr-test", "working", 8, 0) = 0
fdatasync(3) = 0
flistxattr(3, NULL, 0) = 36
fgetxattr(3, "trusted.xattr-test", 0x0, 0) = 8
fremovexattr(3, "trusted.xattr-test") = 0
close(3) = 0
unlink("temp-xattr-test-file_2") = 0
write(2, "fd based file operation 2 failed"..., 33fd based file operation 2 failed
) = 33
write(1, "Testing", 7Testing) = 7
exit_group(-1) = ?
+++ exited with 255 +++
I don't know how the strange mode value is selected, but passing in a correct mode solves the problem.
COMMIT: http://review.gluster.org/10590 committed in master by Raghavendra G (rgowdapp)
------
commit a661f7f54cef34aa39894818568a2c1b462e8cbc
Author: Nithya Balachandran <nbalacha>
Date: Tue May 5 23:47:58 2015 +0530

    tests: Spurious failure in fop-sanity.t

    Modified the calls to open in fops-sanity.c to pass in the
    mode as well if flags includes O_CREAT (as per man page).
    The missing mode randomly caused T files to be created causing
    DHT to treat them as linkto files and fail the fop.

    Modified 2 other files where the mode was not being provided.

    Change-Id: I047573d43655b4957d0703f7df36238f7e729c1f
    BUG: 1218951
    Signed-off-by: Nithya Balachandran <nbalacha>
    Reviewed-on: http://review.gluster.org/10590
    Tested-by: Gluster Build System <jenkins.com>
    Tested-by: NetBSD Build System
    Reviewed-by: Raghavendra G <rgowdapp>
    Tested-by: Raghavendra G <rgowdapp>
"tests" component is for tests framework only. File a bug under test component if you find a bug in 1. any of the *.rc files under tests/ 2. run-tests.sh For everything else, the bug should be filed on 1. component which is being tested by .t file if the .t file requires fix. 2. component which is causing a valid .t file to fail in regression. I have used my best judgement here to move the bug to right component. In case of ambiguity, I have placed the blame on the .t file component. Please consider test bugs under the same backlog list that tracks other bugs in your component.
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.8.0, please open a new bug report.

glusterfs-3.8.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://blog.gluster.org/2016/06/glusterfs-3-8-released/
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user