Description of problem:
In a two node cluster, the lock master of two folders can move 100 files from one folder to the other in less than 1 second. If a server is not the lock master for that folder, it will take that server 3-5 seconds to perform the same task on GFS1, and 30-50 seconds on GFS2. While I am still somewhat disappointed in the 3-5 seconds on GFS1, I could live with it. This bug is simply to point out the speed degradation from GFS1 to GFS2.

Version-Release number of selected component (if applicable):
Comparison of GFS1 to GFS2
kmod-gfs-0.1.31-3.el5
kernel-2.6.18-128.el5
gfs-utils-0.1.18-1.el5
gfs2-utils-0.1.53-1.el5_3.3

How reproducible:
After formatting the filesystem as either GFS1 or GFS2, I cd into the root directory of the mounted file system (in my case /var/vmail), then on the first node I run:

time rm -fr test1 test2; time mkdir test1 test2; time for num in `seq 1 100`; do touch test1/foo.$num; done; time for num in `seq 1 100`; do mv test1/foo.$num test2/bar.$num; done;

On the second node I'll then run:

time for num in `seq 1 100`; do mv test2/bar.$num test1/file.$num; done

Steps to Reproduce:
1. mkfs.gfs[2] -t emailcluster:emailfs -p lock_dlm -j 3 /dev/sdb1
2. Modify /etc/fstab: "/dev/sdb1 /var/vmail gfs[2] defaults,noatime 0 0"
3. service gfs[2] start

Actual results:
For the non-"lock master" on GFS1, the move takes on the order of 10 times as long as on the "lock master". For the non-"lock master" on GFS2, the move takes on the order of 100 times as long as on the "lock master".

Expected results:
GFS1 and GFS2 should be comparable in this test.

Additional info:
This was reported against the wrong product. Updating to the correct product.
Update: I have been able to reproduce this on my shell cluster. gfs2 takes about 30 seconds while gfs1 takes about 4 seconds to complete the test mentioned above. When I changed the .go_min_hold_time from HZ/5 to HZ/100 for gfs2_inode_glops and gfs2_rgrp_glops in glops.c, gfs2 finished the test in about 6-7 seconds. However, the min-hold-time should ideally expire soon after the glocks are cached (as soon as the files are created) and by the time we move the files on the second node, the glocks should be ready to be released. I'm looking into this further.
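For reference, the tuning described above amounts to changing the per-glock-type minimum hold time in fs/gfs2/glops.c. The fragment below is only an abridged sketch of what that change looks like in the RHEL5-era code, with the other operation fields omitted; it is not the exact source.

/* Abridged sketch of fs/gfs2/glops.c (RHEL5-era GFS2); most fields omitted.
 * The experiment above drops go_min_hold_time from HZ/5 (200ms) to HZ/100
 * (10ms) for the inode and resource-group glock types. */
const struct gfs2_glock_operations gfs2_inode_glops = {
	/* ... sync/invalidate/demote callbacks elided ... */
	.go_type = LM_TYPE_INODE,
	.go_min_hold_time = HZ / 100,	/* was HZ / 5 */
};

const struct gfs2_glock_operations gfs2_rgrp_glops = {
	/* ... */
	.go_type = LM_TYPE_RGRP,
	.go_min_hold_time = HZ / 100,	/* was HZ / 5 */
};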
Created attachment 438434 [details]
First idea at trying to fix this

This is a bit subtle, since the information about the intent of a remote node is not available at the demoting node. All the demoting node knows is that two demote requests for the same lock have appeared, one from EX to SH and then one from SH to EX.

The patch introduces a new glock flag, the queued flag. This is set any time a holder is queued on a glock. It is reset upon a state change which does not involve the UN state, and only then if there are no queued holders at the time.

If the flag is not set, the minimum hold time will not be obeyed; instead the demote will be scheduled immediately. If the flag is set, the min hold time will be obeyed as usual.

This should, fingers crossed, be enough to solve the problem that we are seeing with multiple demotes, but also enough to ensure that we still don't run into issues where there is lock contention and no progress is made.
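In outline, the idea looks roughly like the sketch below. This is a paraphrase of the approach rather than the attached patch itself; the flag and helper names are illustrative, and locking and refcounting are omitted.

/* Sketch only, not the attached patch: record local interest in the glock
 * with a "queued" flag, and honour the minimum hold time on a remote demote
 * only when that flag is set. */

static void add_to_queue(struct gfs2_holder *gh)
{
	struct gfs2_glock *gl = gh->gh_gl;

	set_bit(GLF_QUEUED, &gl->gl_flags);	/* a holder has been queued */
	list_add_tail(&gh->gh_list, &gl->gl_holders);
	/* ... existing queueing logic ... */
}

/* On a state change that is not to UNLOCKED, the flag is cleared again if
 * no holders remain queued.  A remote demote request then only applies the
 * minimum hold time when the flag is set: */
static void handle_remote_demote(struct gfs2_glock *gl, unsigned int state)
{
	unsigned long delay = 0;

	if (test_bit(GLF_QUEUED, &gl->gl_flags)) {
		unsigned long holdtime = gl->gl_tchange +
					 gl->gl_ops->go_min_hold_time;
		if (time_before(jiffies, holdtime))
			delay = holdtime - jiffies;
	}
	handle_callback(gl, state, delay);	/* schedule demote after 'delay' */
}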
(In reply to comment #5)
> First idea at trying to fix this

No luck with this patch. Still seeing about 38 seconds for the mv commands on gfs2 on the second node.
Created attachment 438686 [details] Updated patch, including printing the new glock flag in traces
So far as I could tell from yesterday's results, the patch is doing what it was originally written to do. So the problem is likely to be elsewhere.
Another thought... Let's assume two nodes, A and B. Node A has a lock cached in EX but is idle. Node B tries a lookup followed by a write:

Node B sends a PR request to the dlm
The dlm sends a callback to node A
Node A drops to PR
The dlm grants node B's request for PR
Node B does the lookup
Node B sends a PR -> EX convert request to the dlm
The dlm sends a single callback to node A (and, if I've read the code correctly, also node B), and fails node B's request due to the blocking PR which node A is still holding

So I'd like to verify whether node B does indeed get a callback from the dlm when the convert fails. If so, then I think we might have found the issue.

The reason that is a problem is that node B has only just been granted a lock. If the demote request arrives before the reply to the convert request, it will schedule the glock workqueue according to the min hold time. The glock workqueue is of the delayed variety: the work is executed at some future time, depending on when the timer goes off, and any subsequent request to queue work while the timer is pending is ignored. So a demote request received by node B will prevent any further running of the workqueue for 200ms or thereabouts, assuming that the lookup took a very small amount of time (a simplified sketch of this path follows below).

So if you still have those traces from yesterday, I'd like to see whether any demote requests were received on the node you were doing the writes on.
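A simplified sketch of the callback path being described (paraphrased from the code of that era, not the exact RHEL5 source; locking and error handling omitted):

/* Why a demote arriving just after a grant can stall node B: the demote
 * schedules the glock work with a delay derived from the min hold time,
 * and queue_delayed_work() returns 0 for work that is already pending,
 * so nothing else can push the state machine forward until the timer
 * fires (~200ms with go_min_hold_time = HZ/5). */
static void gfs2_glock_cb(struct gfs2_glock *gl, unsigned int state)
{
	unsigned long now = jiffies;
	unsigned long holdtime = gl->gl_tchange + gl->gl_ops->go_min_hold_time;
	unsigned long delay = 0;

	if (time_before(now, holdtime))
		delay = holdtime - now;

	handle_callback(gl, state, delay);	/* note the pending demote */
	gfs2_glock_hold(gl);
	if (queue_delayed_work(glock_workqueue, &gl->gl_work, delay) == 0)
		gfs2_glock_put(gl);	/* work already pending; this request is ignored */
}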
west-10 - first node

# tracer: nop
#
#   TASK-PID    CPU#    TIMESTAMP   FUNCTION
#      | |       |          |          |
dlm_astd-2018 [002] 1885930.700013: gfs2_demote_rq: 253,3 glock 2:66314 demote EX to PR flags:DyIq
dlm_astd-2018 [002] 1885930.700019: gfs2_glock_abhi_debug: 253,3 glock 2:66314 state EX to XX tgt:EX dmt:PR flags:DyIq - gfs2_glock_cb delay:0
glock_workqueue-1854 [002] 1885930.715963: gfs2_glock_abhi_debug: 253,3 glock 2:66314 state EX to XX tgt:PR dmt:PR flags:lDpIq - do_xmote delay:0
glock_workqueue-1854 [002] 1885930.715996: gfs2_glock_abhi_debug: 253,3 glock 2:66314 state EX to XX tgt:PR dmt:PR flags:lDpIq - glock_work_func delay:0
glock_workqueue-1854 [002] 1885930.716012: gfs2_glock_state_change: 253,3 glock 2:66314 state EX to PR tgt:PR dmt:PR flags:lDpIq
glock_workqueue-1854 [002] 1885930.716016: gfs2_glock_abhi_debug: 253,3 glock 2:66314 state PR to XX tgt:PR dmt:EX flags:I - glock_work_func delay:0
dlm_astd-2018 [002] 1885930.740405: gfs2_demote_rq: 253,3 glock 2:66314 demote PR to NL flags:DI
dlm_astd-2018 [002] 1885930.740406: gfs2_glock_abhi_debug: 253,3 glock 2:66314 state PR to XX tgt:PR dmt:NL flags:DI - gfs2_glock_cb delay:0
glock_workqueue-1854 [002] 1885930.740448: gfs2_glock_abhi_debug: 253,3 glock 2:66314 state PR to XX tgt:NL dmt:NL flags:lDpI - do_xmote delay:0
glock_workqueue-1854 [002] 1885930.740457: gfs2_glock_abhi_debug: 253,3 glock 2:66314 state PR to XX tgt:NL dmt:NL flags:lDpI - glock_work_func delay:0
glock_workqueue-1854 [002] 1885930.740468: gfs2_glock_state_change: 253,3 glock 2:66314 state PR to NL tgt:NL dmt:NL flags:lDpI
glock_workqueue-1854 [002] 1885930.740470: gfs2_glock_abhi_debug: 253,3 glock 2:66314 state NL to XX tgt:NL dmt:EX flags:I - glock_work_func delay:0
glock_workqueue-1854 [002] 1885934.676075: gfs2_glock_state_change: 253,3 glock 2:66213 state NL to EX tgt:EX dmt:EX flags:lIq
glock_workqueue-1854 [002] 1885934.676428: gfs2_promote: 253,3 glock 2:66213 promote first EX
glock_workqueue-1854 [002] 1885934.676439: gfs2_glock_abhi_debug: 253,3 glock 2:66213 state EX to XX tgt:EX dmt:EX flags:Iq - glock_work_func delay:0
glock_workqueue-1854 [002] 1885934.676540: gfs2_glock_state_change: 253,3 glock 2:66213 state EX to NL tgt:NL dmt:NL flags:lDpIq
glock_workqueue-1854 [002] 1885934.676543: gfs2_glock_abhi_debug: 253,3 glock 2:66213 state NL to XX tgt:NL dmt:EX flags:Iq - glock_work_func delay:0
grep-11934 [000] 1885937.435063: gfs2_promote: 253,3 glock 2:22 promote first PR
grep-11934 [000] 1885937.435071: gfs2_glock_queue: 253,3 glock 2:22 dequeue PR
grep-11934 [000] 1885937.435081: gfs2_promote: 253,3 glock 2:22 promote first PR
grep-11934 [000] 1885937.435082: gfs2_glock_queue: 253,3 glock 2:22 dequeue PR
gfs2_quotad-11923 [001] 1885944.683912: gfs2_glock_abhi_debug: 253,3 glock 2:66213 state NL to XX tgt:EX dmt:EX flags:lIq - do_xmote delay:0
gfs2_quotad-11923 [001] 1885944.684409: gfs2_demote_rq: 253,3 glock 2:66213 demote EX to NL flags:DIq
gfs2_quotad-11923 [001] 1885944.684411: gfs2_glock_queue: 253,3 glock 2:66213 dequeue EX
gfs2_quotad-11923 [001] 1885944.684412: gfs2_glock_abhi_debug: 253,3 glock 2:66213 state EX to XX tgt:EX dmt:NL flags:DIq - gfs2_glock_dq delay:0
glock_workqueue-1851 [001] 1885944.684458: gfs2_glock_abhi_debug: 253,3 glock 2:66213 state EX to XX tgt:NL dmt:NL flags:lDpIq - do_xmote delay:0
glock_workqueue-1851 [001] 1885944.684471: gfs2_glock_abhi_debug: 253,3 glock 2:66213 state EX to XX tgt:NL dmt:NL flags:lDpIq - glock_work_func delay:0
find-11933 [003] 1885947.442094: gfs2_promote: 253,3 glock 2:22 promote first PR
find-11933 [003] 1885947.442102: gfs2_glock_queue: 253,3 glock 2:22 dequeue PR
find-11933 [003] 1885947.442111: gfs2_promote: 253,3 glock 2:22 promote first PR
find-11933 [003] 1885947.442112: gfs2_glock_queue: 253,3 glock 2:22 dequeue PR
find-11933 [001] 1885947.444452: gfs2_promote: 253,3 glock 2:22 promote first PR
find-11933 [001] 1885947.444460: gfs2_glock_queue: 253,3 glock 2:22 dequeue PR
find-11933 [001] 1885947.444464: gfs2_promote: 253,3 glock 2:22 promote first PR
find-11933 [001] 1885947.444465: gfs2_glock_queue: 253,3 glock 2:22 dequeue PR
find-11933 [001] 1885947.444470: gfs2_promote: 253,3 glock 2:22 promote first PR
find-11933 [001] 1885947.444471: gfs2_glock_queue: 253,3 glock 2:22 dequeue PR
find-11933 [001] 1885947.444521: gfs2_promote: 253,3 glock 2:22 promote first PR
find-11933 [001] 1885947.444522: gfs2_glock_queue: 253,3 glock 2:22 dequeue PR
find-11933 [001] 1885947.444523: gfs2_promote: 253,3 glock 2:22 promote first PR
find-11933 [001] 1885947.444524: gfs2_glock_queue: 253,3 glock 2:22 dequeue PR
find-11933 [001] 1885947.444533: gfs2_promote: 253,3 glock 2:22 promote first PR
find-11933 [001] 1885947.444534: gfs2_glock_queue: 253,3 glock 2:22 dequeue PR
find-11933 [001] 1885947.474184: gfs2_promote: 253,3 glock 2:22 promote first PR
find-11933 [001] 1885947.474191: gfs2_glock_queue: 253,3 glock 2:22 dequeue PR
find-11933 [001] 1885947.474199: gfs2_promote: 253,3 glock 2:22 promote first PR
find-11933 [001] 1885947.474200: gfs2_glock_queue: 253,3 glock 2:22 dequeue PR
west-11

# tracer: nop
#
#   TASK-PID    CPU#    TIMESTAMP   FUNCTION
#      | |       |          |          |
open_n_write-29159 [002] 1630943.457578: gfs2_promote: 253,3 glock 2:22 promote first PR
open_n_write-29159 [002] 1630943.457584: gfs2_glock_queue: 253,3 glock 2:22 dequeue PR
open_n_write-29159 [002] 1630943.457591: gfs2_promote: 253,3 glock 2:22 promote first PR
open_n_write-29159 [002] 1630943.457629: gfs2_glock_abhi_debug: 253,3 glock 5:66314 state NL to XX tgt:PR dmt:EX flags:lq - do_xmote delay:0
glock_workqueue-19106 [002] 1630943.457995: gfs2_glock_state_change: 253,3 glock 5:66314 state NL to PR tgt:PR dmt:EX flags:lIq
glock_workqueue-19106 [002] 1630943.457997: gfs2_promote: 253,3 glock 5:66314 promote other PR
glock_workqueue-19106 [002] 1630943.458000: gfs2_glock_abhi_debug: 253,3 glock 5:66314 state PR to XX tgt:PR dmt:EX flags:Iq - glock_work_func delay:0
open_n_write-29159 [002] 1630943.458005: gfs2_glock_queue: 253,3 glock 2:22 dequeue PR
open_n_write-29159 [002] 1630943.458007: gfs2_glock_abhi_debug: 253,3 glock 2:66314 state NL to XX tgt:PR dmt:EX flags:lq - do_xmote delay:0
glock_workqueue-19106 [002] 1630943.497774: gfs2_glock_state_change: 253,3 glock 2:66314 state NL to PR tgt:PR dmt:EX flags:lIq
glock_workqueue-19106 [002] 1630943.497984: gfs2_promote: 253,3 glock 2:66314 promote first PR
glock_workqueue-19106 [002] 1630943.497987: gfs2_glock_abhi_debug: 253,3 glock 2:66314 state PR to XX tgt:PR dmt:EX flags:Iq - glock_work_func delay:0
open_n_write-29159 [002] 1630943.497990: gfs2_glock_queue: 253,3 glock 2:66314 dequeue PR
open_n_write-29159 [002] 1630943.497994: gfs2_promote: 253,3 glock 2:66314 promote first PR
open_n_write-29159 [002] 1630943.498177: gfs2_glock_queue: 253,3 glock 2:66314 dequeue PR
open_n_write-29159 [002] 1630943.498313: gfs2_promote: 253,3 glock 2:66314 promote first PR
open_n_write-29159 [002] 1630943.498314: gfs2_glock_queue: 253,3 glock 2:66314 dequeue PR
open_n_write-29159 [002] 1630943.498319: gfs2_promote: 253,3 glock 2:66314 promote first PR
open_n_write-29159 [002] 1630943.498320: gfs2_glock_queue: 253,3 glock 2:66314 dequeue PR
open_n_write-29159 [002] 1630943.498332: gfs2_promote: 253,3 glock 2:66314 promote first PR
open_n_write-29159 [002] 1630943.498334: gfs2_glock_queue: 253,3 glock 2:66314 dequeue PR
open_n_write-29159 [002] 1630943.498338: gfs2_glock_abhi_debug: 253,3 glock 2:66314 state PR to XX tgt:EX dmt:EX flags:lIq - do_xmote delay:0
dlm_astd-2092 [002] 1630943.498552: gfs2_demote_rq: 253,3 glock 2:66314 demote PR to NL flags:ldrIq
dlm_astd-2092 [002] 1630943.498553: gfs2_glock_abhi_debug: 253,3 glock 2:66314 state PR to XX tgt:EX dmt:NL flags:ldrIq - gfs2_glock_cb delay:200
glock_workqueue-19106 [002] 1630943.498556: gfs2_glock_state_change: 253,3 glock 2:66314 state PR to PR tgt:EX dmt:NL flags:ldIq
glock_workqueue-19106 [002] 1630943.498572: gfs2_glock_abhi_debug: 253,3 glock 2:66314 state PR to XX tgt:EX dmt:NL flags:ldIq - do_xmote delay:0
glock_workqueue-19106 [002] 1630943.498583: gfs2_glock_abhi_debug: 253,3 glock 2:66314 state PR to XX tgt:EX dmt:NL flags:ldIq - glock_work_func delay:200
glock_workqueue-19106 [002] 1630943.697269: gfs2_glock_state_change: 253,3 glock 2:66314 state PR to NL tgt:EX dmt:NL flags:ldIq
glock_workqueue-19106 [002] 1630943.697271: gfs2_glock_abhi_debug: 253,3 glock 2:66314 state NL to XX tgt:EX dmt:NL flags:ldIq - do_xmote delay:0
glock_workqueue-19106 [002] 1630943.697279: gfs2_glock_abhi_debug: 253,3 glock 2:66314 state NL to XX tgt:EX dmt:NL flags:lIq - glock_work_func delay:0
glock_workqueue-19106 [002] 1630943.697485: gfs2_glock_state_change: 253,3 glock 2:66314 state NL to EX tgt:EX dmt:NL flags:lIq
glock_workqueue-19106 [002] 1630943.697740: gfs2_promote: 253,3 glock 2:66314 promote first EX
glock_workqueue-19106 [002] 1630943.697743: gfs2_glock_abhi_debug: 253,3 glock 2:66314 state EX to XX tgt:EX dmt:NL flags:Iq - glock_work_func delay:0
open_n_write-29159 [002] 1630943.697751: gfs2_promote: 253,3 glock 1:2 promote other PR
open_n_write-29159 [002] 1630943.697771: gfs2_glock_queue: 253,3 glock 1:2 dequeue PR
open_n_write-29159 [002] 1630943.697773: gfs2_glock_queue: 253,3 glock 2:66314 dequeue EX
glock_workqueue-19106 [002] 1630948.158332: gfs2_glock_state_change: 253,3 glock 2:66213 state NL to EX tgt:EX dmt:EX flags:lIq
glock_workqueue-19106 [002] 1630948.158578: gfs2_promote: 253,3 glock 2:66213 promote first EX
glock_workqueue-19106 [002] 1630948.158590: gfs2_glock_abhi_debug: 253,3 glock 2:66213 state EX to XX tgt:EX dmt:EX flags:Iq - glock_work_func delay:0
glock_workqueue-19106 [002] 1630948.158732: gfs2_glock_state_change: 253,3 glock 2:66213 state EX to NL tgt:NL dmt:NL flags:lDpIq
glock_workqueue-19106 [002] 1630948.158736: gfs2_glock_abhi_debug: 253,3 glock 2:66213 state NL to XX tgt:NL dmt:EX flags:Iq - glock_work_func delay:0
find-29161 [002] 1630950.204468: gfs2_promote: 253,3 glock 2:22 promote first PR
find-29161 [002] 1630950.204476: gfs2_glock_queue: 253,3 glock 2:22 dequeue PR
find-29161 [002] 1630950.204488: gfs2_promote: 253,3 glock 2:22 promote first PR
find-29161 [002] 1630950.204490: gfs2_glock_queue: 253,3 glock 2:22 dequeue PR
find-29161 [000] 1630950.208075: gfs2_promote: 253,3 glock 2:22 promote first PR
find-29161 [000] 1630950.208084: gfs2_glock_queue: 253,3 glock 2:22 dequeue PR
find-29161 [000] 1630950.208091: gfs2_promote: 253,3 glock 2:22 promote first PR
find-29161 [000] 1630950.208093: gfs2_glock_queue: 253,3 glock 2:22 dequeue PR
find-29161 [000] 1630950.208102: gfs2_promote: 253,3 glock 2:22 promote first PR
find-29161 [000] 1630950.208104: gfs2_glock_queue: 253,3 glock 2:22 dequeue PR
find-29161 [000] 1630950.208176: gfs2_promote: 253,3 glock 2:22 promote first PR
find-29161 [000] 1630950.208178: gfs2_glock_queue: 253,3 glock 2:22 dequeue PR
find-29161 [000] 1630950.208181: gfs2_promote: 253,3 glock 2:22 promote first PR
find-29161 [000] 1630950.208183: gfs2_glock_queue: 253,3 glock 2:22 dequeue PR
find-29161 [000] 1630950.208200: gfs2_promote: 253,3 glock 2:22 promote first PR
find-29161 [000] 1630950.208202: gfs2_glock_queue: 253,3 glock 2:22 dequeue PR
find-29161 [000] 1630950.267308: gfs2_promote: 253,3 glock 2:22 promote first PR
find-29161 [000] 1630950.267317: gfs2_glock_queue: 253,3 glock 2:22 dequeue PR
find-29161 [000] 1630950.267330: gfs2_promote: 253,3 glock 2:22 promote first PR
find-29161 [000] 1630950.267331: gfs2_glock_queue: 253,3 glock 2:22 dequeue PR
gfs2_quotad-29146 [003] 1630954.860305: gfs2_glock_abhi_debug: 253,3 glock 2:66213 state NL to XX tgt:EX dmt:EX flags:lIq - do_xmote delay:0
gfs2_quotad-29146 [003] 1630954.860929: gfs2_demote_rq: 253,3 glock 2:66213 demote EX to NL flags:DIq
gfs2_quotad-29146 [003] 1630954.860931: gfs2_glock_queue: 253,3 glock 2:66213 dequeue EX
gfs2_quotad-29146 [003] 1630954.860932: gfs2_glock_abhi_debug: 253,3 glock 2:66213 state EX to XX tgt:EX dmt:NL flags:DIq - gfs2_glock_dq delay:0
glock_workqueue-19107 [003] 1630954.861002: gfs2_glock_abhi_debug: 253,3 glock 2:66213 state EX to XX tgt:NL dmt:NL flags:lDpIq - do_xmote delay:0
glock_workqueue-19107 [003] 1630954.861034: gfs2_glock_abhi_debug: 253,3 glock 2:66213 state EX to XX tgt:NL dmt:NL flags:lDpIq - glock_work_func delay:0
grep-29162 [001] 1630956.907173: gfs2_promote: 253,3 glock 2:22 promote first PR
grep-29162 [001] 1630956.907182: gfs2_glock_queue: 253,3 glock 2:22 dequeue PR
grep-29162 [001] 1630956.907190: gfs2_promote: 253,3 glock 2:22 promote first PR
grep-29162 [001] 1630956.907192: gfs2_glock_queue: 253,3 glock 2:22 dequeue PR
Created attachment 442615 [details]
Another test patch

Patch to see if my hunch is correct.
Created attachment 442677 [details]
combined rhel5 patch

This is the combined rhel5 patch for the two upstream patches above. With both of these patches in place, gfs2 performs much better than before on the 'move 100 files from one dir to another' test:

GFS1
----
real    0m0.917s
user    0m0.036s
sys     0m0.261s

GFS2
----
real    0m2.511s
user    0m0.016s
sys     0m0.183s
http://porkchop.devel.redhat.com/brewroot/scratch/adas/task_2730019/

This is a rhel5 kernel build with the patch above. It's in my scratch space, so I don't know how long it'll be there before it's wiped away by brew housekeeping. However, I've saved off the rpms onto my work machine, so if you can't find the builds at the link above, ask me.
Patch for GFS2 has been pushed into the -nmw git tree. Patch for dlm has been sent to dct.
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
I'm trying to understand the dlm patch in comment 16.

send_blocking_asts() sends basts to locks on the grantqueue, and I don't see that the lkb arg would ever be on the grantqueue. So I don't think the patch has any effect on send_blocking_asts().

send_blocking_asts_all() sends basts to locks on both the grantqueue and the convertqueue, and I do expect that the lkb arg would be on the convertqueue, so the patch does make sense in that case. Is that what you've observed?

But send_blocking_asts_all() is only used when gfs uses the DLM_LKF_NOQUEUEBAST flag. Is gfs using that flag here?
Ah, I hadn't seen the email with the description in it:

From: Steven Whitehouse <swhiteho>
To: David Teigland <teigland>
Cc: cluster-devel, linux-kernel.org
Date: Fri, 03 Sep 2010 10:40:16 +0100
Message-ID: <1283506816.2500.21.camel@localhost>

When converting a lock, an lkb is in the granted state and also being used to request a new state. In the case that the conversion was a "try 1cb" type which has failed, and if the new state was incompatible with the old state, a callback was being generated to the requesting node. This is incorrect as callbacks should only be sent to all the other nodes holding blocking locks. The requesting node should receive the normal (failed) response to its "try 1cb" conversion request only.

This was discovered while debugging a performance problem on GFS2, however this fix also speeds up GFS as well. In the GFS2 case the performance gain is over 10x for cases of write activity to an inode whose glock is cached on another, idle (wrt that glock) node.
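In dlm terms, the description above amounts to skipping the converting lkb itself when walking the queues to send blocking callbacks. The fragment below is a sketch of that kind of change, written against the send_bast_queue() helper from fs/dlm/lock.c of that era; the actual change is the patch discussed in comments 16 and 19.

/* Sketch of the dlm-side change described above: when the converting lkb is
 * itself sitting on the queue being walked, don't send a blocking callback
 * back to the node that made the (failed) "try 1cb" request. */
static void send_bast_queue(struct dlm_rsb *r, struct list_head *head,
			    struct dlm_lkb *lkb)
{
	struct dlm_lkb *gr;

	list_for_each_entry(gr, head, lkb_statequeue) {
		/* skip self when sending basts to convertqueue */
		if (gr == lkb)
			continue;
		if (gr->lkb_bastfn && modes_require_bast(gr, lkb)) {
			queue_bast(r, gr, lkb->lkb_rqmode);
			gr->lkb_highbast = lkb->lkb_rqmode;
		}
	}
}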
Barry, since you are looking into performance on GFS2, I thought you might be interested in a heads-up on this particular bug. It only affects performance in cluster operation; it won't make any difference on a single node.
posted to rhkernel http://post-office.corp.redhat.com/archives/rhkernel-list/2010-September/msg00455.html
posted -part2.patch to rhkernel as well http://post-office.corp.redhat.com/archives/rhkernel-list/2010-September/msg00464.html
in kernel-2.6.18-221.el5
You can download this test kernel from http://people.redhat.com/jwilson/el5

Detailed testing feedback is always welcomed.
Sorry, I was logged into the wrong instance.
Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
In a two node cluster, the lock master of two folders can move 100 files from one folder to the other in less than 1 second. If a server is not the lock master for that folder, it would take that server 3-5 seconds to perform the same task on GFS1 (Global File System 1), and 30-50 seconds on GFS2 (Global File System 2). With this update, the aforementioned task takes less than 1 second on GFS1 and about 3 seconds on GFS2.
Technical note updated. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

Diffed Contents:
@@ -1 +1 @@
-In a two node cluster, the lock master of two folders can move 100 files from one folder to the other in less than 1 second. If a server is not the lock master for that folder, it would take that server 3-5 seconds to perform the same task on GFS1 (Global File System 1), and 30-50 seconds on GFS2 (Global File System 2). With this update, the aforementioned task takes less than 1 second on GFS1 and about 3 seconds on GFS2.
+In a two node cluster, moving 100 files between two folders using the lock master was nearly instantaneous. However, not using the lock master resulted in a considerably worse performance on both, GFS1 (Global File System 1) and GFS2 (Global File System 2) file systems. With this update, not using the lock master does not lead to worsened performance on either of the aforementioned file systems.
I added a new test case to our regression run for this bz. Running on a RHEL5.5 cluster and a RHEL5.6 cluster showed a huge improvement.

RHEL5.5:

Mounting gfs /dev/timetrials/lmrename on all nodes with opts '': buzz-01 buzz-02 buzz-03 buzz-04 buzz-05
Creating 1000 files for each node to move from buzz-01
Running mv on buzz-01
Running mv on buzz-02
Move from lock master took 1.12 seconds
Move from other node took 8.03 seconds
buzz-02 took 7.17x longer than buzz-01 (master)

Mounting gfs2 /dev/timetrials/lmrename on all nodes with opts '': buzz-01 buzz-02 buzz-03 buzz-04 buzz-05
Creating 1000 files for each node to move from buzz-01
Running mv on buzz-01
Running mv on buzz-02
Move from lock master took 1.13 seconds
Move from other node took 221.37 seconds
buzz-02 took 195.90x longer than buzz-01 (master)

RHEL5.6:

Mounting gfs /dev/timetrials/lmrename on all nodes with opts '': dash-01 dash-02 dash-03
Creating 1000 files for each node to move from dash-01
Running mv on dash-01
Running mv on dash-02
Move from lock master took 3.19 seconds
Move from other node took 9.58 seconds
dash-02 took 3.00x longer than dash-01 (master)

Mounting gfs2 /dev/timetrials/lmrename on all nodes with opts '': dash-01 dash-02 dash-03
Creating 1000 files for each node to move from dash-01
Running mv on dash-01
Running mv on dash-02
Move from lock master took 3.15 seconds
Move from other node took 31.30 seconds
dash-02 took 9.94x longer than dash-01 (master)
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0017.html