Description of problem:
----------------------

EC (4+2) mounted via FUSE, parallel readdir enabled.

Bonnie++ failed on 6 of my clients with the following error messages on the client side:

*CLIENT1,gqac015* :

[root@gqac015 ]# /opt/qa/tools/system_light/run.sh -w /gluster-mount -t bonnie -l /var/tmp/bonnie.log
/opt/qa/tools/system_light/scripts
/opt/qa/tools/system_light
(unreachable)/
/gluster-mount
/
-----
/gluster-mount
/gluster-mount/run7965/
Tests available:
arequal bonnie compile_kernel coverage dbench dd ffsb fileop fs_mark fsx fuse glusterfs glusterfs_build iozone locks ltp multiple_files openssl posix_compliance postmark read_large rpc syscallbench tiobench
===========================TESTS RUNNING===========================
Changing to the specified mountpoint
/gluster-mount/run7965
executing bonnie
Using uid:0, gid:0.
Writing a byte at a time... done
Writing intelligently...done
Rewriting... done
Reading a byte at a time...done
Reading intelligently...done
start 'em...done...done...done...done...done...
Create files in sequential order...done.
Stat files in sequential order...Expected 16384 files but only got 15706
Cleaning up test directory after error.

real 616m37.709s
user 1m59.851s
sys 30m51.004s
bonnie failed
0
Total 0 tests were successful

*CLIENT2,gqac028* :

===========================TESTS RUNNING===========================
Changing to the specified mountpoint
/gluster-mount/run5649
executing bonnie
Using uid:0, gid:0.
Writing a byte at a time... done
Writing intelligently...done
Rewriting... done
Reading a byte at a time...done
Reading intelligently...done
start 'em...done...done...done...done...done...
Create files in sequential order...done.
Stat files in sequential order...Expected 16384 files but only got 15748
Cleaning up test directory after error.

real 634m5.945s
user 2m1.608s
sys 34m42.844s
bonnie failed
0
Total 0 tests were successful
Switching over to the previous working directory
/opt/qa/tools/system_light/run.sh: line 97: cd: (unreachable)/: No such file or directory
Removing /gluster-mount/run5649/

*CLIENT3,gqac016* :

===========================TESTS RUNNING===========================
Changing to the specified mountpoint
/gluster-mount/run10519
executing bonnie
Using uid:0, gid:0.
Writing a byte at a time... done
Writing intelligently...done
Rewriting... done
Reading a byte at a time...done
Reading intelligently...done
start 'em...done...done...done...done...done...
Create files in sequential order...done.
Stat files in sequential order...Expected 16384 files but only got 15675
Cleaning up test directory after error.

real 625m20.753s
user 2m13.675s
sys 30m57.884s
bonnie failed
0
Total 0 tests were successful
Switching over to the previous working directory

*CLIENT4,gqac010*

===========================TESTS RUNNING===========================
Changing to the specified mountpoint
/gluster-mount/run8690
executing bonnie
Using uid:0, gid:0.
Writing a byte at a time... done
Writing intelligently...done
Rewriting... done
Reading a byte at a time...done
Reading intelligently...done
start 'em...done...done...done...done...done...
Create files in sequential order...done.
Stat files in sequential order...Expected 16384 files but only got 15756
Cleaning up test directory after error.

real 631m42.074s
user 2m15.927s
sys 32m6.464s
bonnie failed
0
Total 0 tests were successful
Switching over to the previous working directory

*CLIENT5,gqac012*:

Changing to the specified mountpoint
/gluster-mount/run5375
executing bonnie
Using uid:0, gid:0.
Writing a byte at a time... done
Writing intelligently...done
Rewriting... done
Reading a byte at a time...done
Reading intelligently...done
start 'em...done...done...done...done...done...
Create files in sequential order...done.
Stat files in sequential order...Expected 16384 files but only got 15713
Cleaning up test directory after error.

real 615m21.491s
user 2m8.174s
sys 30m19.569s
bonnie failed
0
Total 0 tests were successful
Switching over to the previous working directory

*CLIENT 6,gqac027* :

===========================TESTS RUNNING===========================
Changing to the specified mountpoint
/gluster-mount/run5594
executing bonnie
Using uid:0, gid:0.
Writing a byte at a time... done
Writing intelligently...done
Rewriting... done
Reading a byte at a time...done
Reading intelligently...done
start 'em...done...done...done...done...done...
Create files in sequential order...done.
Stat files in sequential order...Expected 16384 files but only got 15757
Cleaning up test directory after error.

real 622m24.676s
user 2m0.420s
sys 32m4.004s
bonnie failed
0
Total 0 tests were successful
Switching over to the previous working directory

Version-Release number of selected component (if applicable):
-------------------------------------------------------------

3.8.4-20

How reproducible:
-----------------

2/2

Steps to Reproduce:
-------------------

Run Bonnie from multiple clients.

Actual results:
---------------

Bonnie fails.

Expected results:
-----------------

A clean Bonnie run.
Additional info:
---------------

[root@gqas013 ~]# gluster v info

Volume Name: butcher
Type: Distributed-Disperse
Volume ID: 55902003-7ea9-4f58-987d-63c6c759a385
Status: Started
Snapshot Count: 0
Number of Bricks: 12 x (4 + 2) = 72
Transport-type: tcp
Bricks:
Brick1: gqas013.sbu.lab.eng.bos.redhat.com:/bricks1/brick
Brick2: gqas005.sbu.lab.eng.bos.redhat.com:/bricks1/brick
Brick3: gqas006.sbu.lab.eng.bos.redhat.com:/bricks1/brick
Brick4: gqas008.sbu.lab.eng.bos.redhat.com:/bricks1/brick
Brick5: gqas014.sbu.lab.eng.bos.redhat.com:/bricks1/brick
Brick6: gqas015.sbu.lab.eng.bos.redhat.com:/bricks1/brick
Brick7: gqas013.sbu.lab.eng.bos.redhat.com:/bricks2/brick
Brick8: gqas005.sbu.lab.eng.bos.redhat.com:/bricks2/brick
Brick9: gqas006.sbu.lab.eng.bos.redhat.com:/bricks2/brick
Brick10: gqas008.sbu.lab.eng.bos.redhat.com:/bricks2/brick
Brick11: gqas014.sbu.lab.eng.bos.redhat.com:/bricks2/brick
Brick12: gqas015.sbu.lab.eng.bos.redhat.com:/bricks2/brick
Brick13: gqas013.sbu.lab.eng.bos.redhat.com:/bricks3/brick
Brick14: gqas005.sbu.lab.eng.bos.redhat.com:/bricks3/brick
Brick15: gqas006.sbu.lab.eng.bos.redhat.com:/bricks3/brick
Brick16: gqas008.sbu.lab.eng.bos.redhat.com:/bricks3/brick
Brick17: gqas014.sbu.lab.eng.bos.redhat.com:/bricks3/brick
Brick18: gqas015.sbu.lab.eng.bos.redhat.com:/bricks3/brick
Brick19: gqas013.sbu.lab.eng.bos.redhat.com:/bricks4/brick
Brick20: gqas005.sbu.lab.eng.bos.redhat.com:/bricks4/brick
Brick21: gqas006.sbu.lab.eng.bos.redhat.com:/bricks4/brick
Brick22: gqas008.sbu.lab.eng.bos.redhat.com:/bricks4/brick
Brick23: gqas014.sbu.lab.eng.bos.redhat.com:/bricks4/brick
Brick24: gqas015.sbu.lab.eng.bos.redhat.com:/bricks4/brick
Brick25: gqas013.sbu.lab.eng.bos.redhat.com:/bricks5/brick
Brick26: gqas005.sbu.lab.eng.bos.redhat.com:/bricks5/brick
Brick27: gqas006.sbu.lab.eng.bos.redhat.com:/bricks5/brick
Brick28: gqas008.sbu.lab.eng.bos.redhat.com:/bricks5/brick
Brick29: gqas014.sbu.lab.eng.bos.redhat.com:/bricks5/brick
Brick30: gqas015.sbu.lab.eng.bos.redhat.com:/bricks5/brick
Brick31: gqas013.sbu.lab.eng.bos.redhat.com:/bricks6/brick
Brick32: gqas005.sbu.lab.eng.bos.redhat.com:/bricks6/brick
Brick33: gqas006.sbu.lab.eng.bos.redhat.com:/bricks6/brick
Brick34: gqas008.sbu.lab.eng.bos.redhat.com:/bricks6/brick
Brick35: gqas014.sbu.lab.eng.bos.redhat.com:/bricks6/brick
Brick36: gqas015.sbu.lab.eng.bos.redhat.com:/bricks6/brick
Brick37: gqas013.sbu.lab.eng.bos.redhat.com:/bricks7/brick
Brick38: gqas005.sbu.lab.eng.bos.redhat.com:/bricks7/brick
Brick39: gqas006.sbu.lab.eng.bos.redhat.com:/bricks7/brick
Brick40: gqas008.sbu.lab.eng.bos.redhat.com:/bricks7/brick
Brick41: gqas014.sbu.lab.eng.bos.redhat.com:/bricks7/brick
Brick42: gqas015.sbu.lab.eng.bos.redhat.com:/bricks7/brick
Brick43: gqas013.sbu.lab.eng.bos.redhat.com:/bricks8/brick
Brick44: gqas005.sbu.lab.eng.bos.redhat.com:/bricks8/brick
Brick45: gqas006.sbu.lab.eng.bos.redhat.com:/bricks8/brick
Brick46: gqas008.sbu.lab.eng.bos.redhat.com:/bricks8/brick
Brick47: gqas014.sbu.lab.eng.bos.redhat.com:/bricks8/brick
Brick48: gqas015.sbu.lab.eng.bos.redhat.com:/bricks8/brick
Brick49: gqas013.sbu.lab.eng.bos.redhat.com:/bricks9/brick
Brick50: gqas005.sbu.lab.eng.bos.redhat.com:/bricks9/brick
Brick51: gqas006.sbu.lab.eng.bos.redhat.com:/bricks9/brick
Brick52: gqas008.sbu.lab.eng.bos.redhat.com:/bricks9/brick
Brick53: gqas014.sbu.lab.eng.bos.redhat.com:/bricks9/brick
Brick54: gqas015.sbu.lab.eng.bos.redhat.com:/bricks9/brick
Brick55: gqas013.sbu.lab.eng.bos.redhat.com:/bricks10/brick
Brick56: gqas005.sbu.lab.eng.bos.redhat.com:/bricks10/brick
Brick57: gqas006.sbu.lab.eng.bos.redhat.com:/bricks10/brick
Brick58: gqas008.sbu.lab.eng.bos.redhat.com:/bricks10/brick
Brick59: gqas014.sbu.lab.eng.bos.redhat.com:/bricks10/brick
Brick60: gqas015.sbu.lab.eng.bos.redhat.com:/bricks10/brick
Brick61: gqas013.sbu.lab.eng.bos.redhat.com:/bricks11/brick
Brick62: gqas005.sbu.lab.eng.bos.redhat.com:/bricks11/brick
Brick63: gqas006.sbu.lab.eng.bos.redhat.com:/bricks11/brick
Brick64: gqas008.sbu.lab.eng.bos.redhat.com:/bricks11/brick
Brick65: gqas014.sbu.lab.eng.bos.redhat.com:/bricks11/brick
Brick66: gqas015.sbu.lab.eng.bos.redhat.com:/bricks11/brick
Brick67: gqas013.sbu.lab.eng.bos.redhat.com:/bricks12/brick
Brick68: gqas005.sbu.lab.eng.bos.redhat.com:/bricks12/brick
Brick69: gqas006.sbu.lab.eng.bos.redhat.com:/bricks12/brick
Brick70: gqas008.sbu.lab.eng.bos.redhat.com:/bricks12/brick
Brick71: gqas014.sbu.lab.eng.bos.redhat.com:/bricks12/brick
Brick72: gqas015.sbu.lab.eng.bos.redhat.com:/bricks12/brick
Options Reconfigured:
performance.parallel-readdir: on
transport.address-family: inet
nfs.disable: on
features.cache-invalidation: on
features.cache-invalidation-timeout: 600
performance.stat-prefetch: on
performance.cache-invalidation: on
performance.md-cache-timeout: 600
network.inode-lru-limit: 50000
cluster.lookup-optimize: on
server.event-threads: 4
client.event-threads: 4
[root@gqas013 ~]#
The test is really long and I am not sure how reproducible it is. However, I could not reproduce it with parallel readdir disabled (tried once, got a clean run). Also, I could reproduce this error on a replicate volume as well with parallel readdir enabled.
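Since the full Bonnie++ run takes around ten hours, a much shorter check of the same symptom (a directory listing returning fewer entries than were created) may be useful for triage. The following is only a rough sketch and not what Bonnie++ does internally: the function names `create_and_count` and `check_listing` are made up for illustration, and there is no evidence in this report that a single-client, short-lived check like this would hit the bug, which was seen with multiple clients running concurrently.

```python
import os
import tempfile


def create_and_count(directory, expected):
    """Create `expected` empty files in `directory`, then count the entries
    that a directory listing returns."""
    for i in range(expected):
        # Zero-padded sequential names; the naming scheme is an assumption,
        # not the one Bonnie++ uses.
        with open(os.path.join(directory, f"f{i:07d}"), "w"):
            pass
    # os.scandir() iterates the directory entries as reported by the
    # filesystem, so a short listing shows up as a lower count here.
    with os.scandir(directory) as entries:
        return sum(1 for _ in entries)


def check_listing(mount_point, expected=16384):
    """Run the create-then-list check in a scratch directory under
    `mount_point`; return (found, missing)."""
    with tempfile.TemporaryDirectory(dir=mount_point) as workdir:
        found = create_and_count(workdir, expected)
    return found, expected - found
```

On a client this could be called as `check_listing("/gluster-mount")`, once with parallel readdir enabled and once with it disabled; a non-zero `missing` value would correspond to the "Expected 16384 files but only got ..." message above.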
Can you please recheck this test case with the latest build? Quite a few fixes in this area have gone in.
Clean run of Bonnie++ on multiple iterations:

<snip>

=========
Changing to the specified mountpoint
/gluster-mount/run11097
executing bonnie
Using uid:0, gid:0.
Writing a byte at a time...done
Writing intelligently...done
Rewriting...done
Reading a byte at a time...done
Reading intelligently...done
start 'em...done...done...done...done...done...
Create files in sequential order...done.
Stat files in sequential order...done.
Delete files in sequential order...done.
Create files in random order...done.
Stat files in random order...done.
Delete files in random order...done.

real 635m28.172s
user 1m46.703s
sys 31m5.598s
1
Total 1 tests were successful
Switching over to the previous working directory
Removing /gluster-mount/run11097/
[root@gqac015 ~]#

===========================TESTS RUNNING===========================
Changing to the specified mountpoint
/gluster-mount/run11858
executing bonnie
Using uid:0, gid:0.
Writing a byte at a time...done
Writing intelligently...done
Rewriting...done
Reading a byte at a time...done
Reading intelligently...done
start 'em...done...done...done...done...done...
Create files in sequential order...done.
Stat files in sequential order...done.
Delete files in sequential order...done.
Create files in random order...done.
Stat files in random order...done.
Delete files in random order...done.

real 632m41.918s
user 2m17.743s
sys 30m21.370s
1
Total 1 tests were successful
Switching over to the previous working directory
Removing /gluster-mount/run11858/
[root@gqac005 ~]#

</snip>

This looks fixed on latest gluster bits as well.
Verified on 3.8.4-24.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:2774