Description of problem: Used git-bisect to figure out that the following commit caused the problem. 20:33:06 ⚡ git bisect good 5a3d849b8563067d35c1422e43e605bd9533f3c2 is the first bad commit commit 5a3d849b8563067d35c1422e43e605bd9533f3c2 Author: Anand Avati <avati> Date: Tue Jun 26 21:06:39 2012 -0700 cluster/afr: post-op-delay support let me just verify that it is indeed the bad-one. Version-Release number of selected component (if applicable): How reproducible: Steps to Reproduce: 1. 2. 3. Actual results: Expected results: Additional info:
This is happening because of the following sequence of operations: 1) Write: takes a full lock on file with fd address. 2) setattr: issues blocking lock based on frame->address. Because of the delayed changelog the unlock happens after a second. Some how logs show 2 seconds delay though. Still need to figure that out. This delay is the reason for the bug. 3) Flush: takes a full lock on file with fd address. (this is not contributing to the bug) [2012-08-04 16:36:47.691995] I [server-helpers.c:807:get_frame_from_request] 0-server: 30 - ccad140100000000 [2012-08-04 16:36:47.692093] D [io-threads.c:266:iot_schedule] 0-r2-io-threads: FINODELK scheduled as normal fop [2012-08-04 16:36:47.692558] I [server-helpers.c:807:get_frame_from_request] 0-server: 34 - ccad140100000000 [2012-08-04 16:36:47.692693] D [io-threads.c:266:iot_schedule] 0-r2-io-threads: FXATTROP scheduled as slow fop [2012-08-04 16:36:47.694386] I [server-helpers.c:807:get_frame_from_request] 0-server: 13 - 0000000000000000 [2012-08-04 16:36:47.694515] D [io-threads.c:266:iot_schedule] 0-r2-io-threads: WRITE scheduled as slow fop [2012-08-04 16:36:47.695786] I [server-helpers.c:807:get_frame_from_request] 0-server: 29 - 98b8f48baf7f0000 [2012-08-04 16:36:47.695903] D [io-threads.c:266:iot_schedule] 0-r2-io-threads: INODELK scheduled as normal fop [2012-08-04 16:36:47.696479] I [server-helpers.c:807:get_frame_from_request] 0-server: 29 - 98b8f48baf7f0000 [2012-08-04 16:36:47.696565] D [io-threads.c:266:iot_schedule] 0-r2-io-threads: INODELK scheduled as normal fop [2012-08-04 16:36:49.508681] I [server-helpers.c:807:get_frame_from_request] 0-server: 34 - ccad140100000000 [2012-08-04 16:36:49.508798] D [io-threads.c:266:iot_schedule] 0-r2-io-threads: FXATTROP scheduled as slow fop [2012-08-04 16:36:49.509421] I [server-helpers.c:807:get_frame_from_request] 0-server: 30 - ccad140100000000 [2012-08-04 16:36:49.509483] D [io-threads.c:266:iot_schedule] 0-r2-io-threads: FINODELK scheduled as normal fop [2012-08-04 16:36:49.509530] D [inodelk.c:313:__inode_unlock_lock] 0-r2-locks: Matching lock found for unlock 0- 9223372036854775807, by ccad140100000000 on 0xc53bb0 [2012-08-04 16:36:49.510362] I [server-helpers.c:807:get_frame_from_request] 0-server: 38 - 0000000000000000 [2012-08-04 16:36:49.510462] D [io-threads.c:266:iot_schedule] 0-r2-io-threads: SETATTR scheduled as normal fop [2012-08-04 16:36:49.511198] I [server-helpers.c:807:get_frame_from_request] 0-server: 29 - 98b8f48baf7f0000 [2012-08-04 16:36:49.511298] D [io-threads.c:266:iot_schedule] 0-r2-io-threads: INODELK scheduled as normal fop [2012-08-04 16:36:49.511452] I [server-helpers.c:807:get_frame_from_request] 0-server: 29 - bcc0f48baf7f0000 [2012-08-04 16:36:49.511609] D [io-threads.c:2380:__iot_workers_scale] 0-r2-io-threads: scaled threads to 2 (queue_size=2/2) [2012-08-04 16:36:49.511672] D [io-threads.c:266:iot_schedule] 0-r2-io-threads: INODELK scheduled as normal fop [2012-08-04 16:36:49.511759] D [inodelk.c:313:__inode_unlock_lock] 0-r2-locks: Matching lock found for unlock 9223372036854775806-9223372036854775807, by 98b8f48baf7f0000 on 0xc53bb0 [2012-08-04 16:36:49.512325] I [server-helpers.c:807:get_frame_from_request] 0-server: 38 - 0000000000000000 [2012-08-04 16:36:49.512410] D [io-threads.c:266:iot_schedule] 0-r2-io-threads: SETATTR scheduled as normal fop [2012-08-04 16:36:49.512978] I [server-helpers.c:807:get_frame_from_request] 0-server: 29 - bcc0f48baf7f0000 [2012-08-04 16:36:49.513074] D [io-threads.c:266:iot_schedule] 0-r2-io-threads: INODELK scheduled as normal fop [2012-08-04 16:36:49.513117] D [inodelk.c:313:__inode_unlock_lock] 0-r2-locks: Matching lock found for unlock 9223372036854775806-9223372036854775807, by bcc0f48baf7f0000 on 0xc53bb0 [2012-08-04 16:36:49.513232] I [server-helpers.c:807:get_frame_from_request] 0-server: 29 - 74b0f48baf7f0000 [2012-08-04 16:36:49.513318] D [io-threads.c:266:iot_schedule] 0-r2-io-threads: INODELK scheduled as normal fop [2012-08-04 16:36:49.513914] I [server-helpers.c:807:get_frame_from_request] 0-server: 38 - 0000000000000000 [2012-08-04 16:36:49.514028] D [io-threads.c:266:iot_schedule] 0-r2-io-threads: SETATTR scheduled as normal fop [2012-08-04 16:36:49.514284] D [posix.c:262:posix_do_chmod] 0-r2-posix: /gfs/r2_0/linux-3.0.1/.gitignore (Function not implemented) [2012-08-04 16:36:49.513117] D [inodelk.c:313:__inode_unlock_lock] 0-r2-locks: Matching lock found for unlock 9223372036854775806-9223372036854775807, by bcc0f48baf7f0000 on 0xc53bb0 [2012-08-04 16:36:49.513232] I [server-helpers.c:807:get_frame_from_request] 0-server: 29 - 74b0f48baf7f0000 [2012-08-04 16:36:49.513318] D [io-threads.c:266:iot_schedule] 0-r2-io-threads: INODELK scheduled as normal fop [2012-08-04 16:36:49.513914] I [server-helpers.c:807:get_frame_from_request] 0-server: 38 - 0000000000000000 [2012-08-04 16:36:49.514028] D [io-threads.c:266:iot_schedule] 0-r2-io-threads: SETATTR scheduled as normal fop [2012-08-04 16:36:49.514284] D [posix.c:262:posix_do_chmod] 0-r2-posix: /gfs/r2_0/linux-3.0.1/.gitignore (Function not implemented) [2012-08-04 16:36:49.514887] I [server-helpers.c:807:get_frame_from_request] 0-server: 29 - 74b0f48baf7f0000 [2012-08-04 16:36:49.514983] D [io-threads.c:266:iot_schedule] 0-r2-io-threads: INODELK scheduled as normal fop [2012-08-04 16:36:49.515117] I [server-helpers.c:807:get_frame_from_request] 0-server: 30 - ccad140100000000 [2012-08-04 16:36:49.515186] D [io-threads.c:266:iot_schedule] 0-r2-io-threads: FINODELK scheduled as normal fop [2012-08-04 16:36:49.515265] D [inodelk.c:313:__inode_unlock_lock] 0-r2-locks: Matching lock found for unlock 9223372036854775806-9223372036854775807, by 74b0f48baf7f0000 on 0xc53bb0 [2012-08-04 16:36:49.515806] I [server-helpers.c:807:get_frame_from_request] 0-server: 27 - 0000000000000000 [2012-08-04 16:36:49.515928] D [io-threads.c:266:iot_schedule] 0-r2-io-threads: LOOKUP scheduled as fast fop [2012-08-04 16:36:49.516435] I [server-helpers.c:807:get_frame_from_request] 0-server: 15 - 2c3736089ab3ced2 [2012-08-04 16:36:49.516520] D [io-threads.c:266:iot_schedule] 0-r2-io-threads: FLUSH scheduled as normal fop [2012-08-04 16:36:49.517220] I [server-helpers.c:807:get_frame_from_request] 0-server: 30 - ccad140100000000 [2012-08-04 16:36:49.517295] D [io-threads.c:266:iot_schedule] 0-r2-io-threads: FINODELK scheduled as normal fop [2012-08-04 16:36:49.517330] D [inodelk.c:313:__inode_unlock_lock] 0-r2-locks: Matching lock found for unlock 0-9223372036854775807, by ccad140100000000 on 0xc53bb0
I think the steps were a bit confusing in the comment above: Here are the steps: 1) Write: takes a full lock on file with fd address. Completes fop and unwinds, write-post-op is delayed by 1 second. 2) setattr: issues non-blocking lock with frame->address. This fails because of the full lock by write in prev step. Now a blocking lock is issued . Because of the delayed changelog the unlock for the write-fop's full lock happens after a second. This delay is the reason for the bug. Some how logs show 2 seconds delay though. Still need to figure that out.
http://review.gluster.org/3975