Bug 1439039 - [Parallel Readdir] : Bonnie++ fails, complains about getting fewer files than expected.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: glusterfs
Version: rhgs-3.3
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: RHGS 3.3.0
Assignee: Poornima G
QA Contact: Ambarish
URL:
Whiteboard:
Depends On:
Blocks: 1417151
Reported: 2017-04-05 05:39 UTC by Ambarish
Modified: 2017-09-21 04:35 UTC (History)

Fixed In Version: glusterfs-3.8.4-22
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-09-21 04:35:56 UTC
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHBA-2017:2774 | 0 | normal | SHIPPED_LIVE | glusterfs bug fix and enhancement update | 2017-09-21 08:16:29 UTC

Description Ambarish 2017-04-05 05:39:30 UTC
Description of problem:
----------------------

EC (4+2) volume mounted via FUSE, with parallel readdir enabled.


Bonnie++ failed on 6 of my clients with the following error messages on the client side:


*CLIENT1,gqac015* :

[root@gqac015 ]# /opt/qa/tools/system_light/run.sh -w /gluster-mount -t bonnie -l /var/tmp/bonnie.log
/opt/qa/tools/system_light/scripts
/opt/qa/tools/system_light
(unreachable)/
/gluster-mount
/
----- /gluster-mount
/gluster-mount/run7965/
Tests available:
arequal
bonnie
compile_kernel
coverage
dbench
dd
ffsb
fileop
fs_mark
fsx
fuse
glusterfs
glusterfs_build
iozone
locks
ltp
multiple_files
openssl
posix_compliance
postmark
read_large
rpc
syscallbench
tiobench
===========================TESTS RUNNING===========================
Changing to the specified mountpoint
/gluster-mount/run7965
executing bonnie
Using uid:0, gid:0.
Writing a byte at a time...
done
Writing intelligently...done
Rewriting...
done
Reading a byte at a time...done
Reading intelligently...done
start 'em...done...done...done...done...done...
Create files in sequential order...done.
Stat files in sequential order...Expected 16384 files but only got 15706
Cleaning up test directory after error.

real    616m37.709s
user    1m59.851s
sys     30m51.004s
bonnie failed
0
Total 0 tests were successful


*CLIENT2,gqac028* :

===========================TESTS RUNNING===========================
Changing to the specified mountpoint
/gluster-mount/run5649
executing bonnie
Using uid:0, gid:0.
Writing a byte at a time...
done
Writing intelligently...done
Rewriting...
done
Reading a byte at a time...done
Reading intelligently...done
start 'em...done...done...done...done...done...
Create files in sequential order...done.
Stat files in sequential order...Expected 16384 files but only got 15748
Cleaning up test directory after error.

real    634m5.945s
user    2m1.608s
sys     34m42.844s
bonnie failed
0
Total 0 tests were successful
Switching over to the previous working directory
/opt/qa/tools/system_light/run.sh: line 97: cd: (unreachable)/: No such file or directory
Removing /gluster-mount/run5649/



*CLIENT3,gqac016* :

===========================TESTS RUNNING===========================
Changing to the specified mountpoint
/gluster-mount/run10519
executing bonnie
Using uid:0, gid:0.
Writing a byte at a time...
done
Writing intelligently...done
Rewriting...
done
Reading a byte at a time...done
Reading intelligently...done
start 'em...done...done...done...done...done...
Create files in sequential order...done.
Stat files in sequential order...Expected 16384 files but only got 15675
Cleaning up test directory after error.

real    625m20.753s
user    2m13.675s
sys     30m57.884s
bonnie failed
0
Total 0 tests were successful
Switching over to the previous working directory


*CLIENT4,gqac010*

===========================TESTS RUNNING===========================
Changing to the specified mountpoint
/gluster-mount/run8690
executing bonnie
Using uid:0, gid:0.
Writing a byte at a time...
done
Writing intelligently...done
Rewriting...
done
Reading a byte at a time...done
Reading intelligently...done
start 'em...done...done...done...done...done...
Create files in sequential order...done.
Stat files in sequential order...Expected 16384 files but only got 15756
Cleaning up test directory after error.

real    631m42.074s
user    2m15.927s
sys     32m6.464s
bonnie failed
0
Total 0 tests were successful
Switching over to the previous working directory


*CLIENT5,gqac012*:

Changing to the specified mountpoint
/gluster-mount/run5375
executing bonnie
Using uid:0, gid:0.
Writing a byte at a time...
done
Writing intelligently...done
Rewriting...
done
Reading a byte at a time...done
Reading intelligently...done
start 'em...done...done...done...done...done...
Create files in sequential order...done.
Stat files in sequential order...Expected 16384 files but only got 15713
Cleaning up test directory after error.

real    615m21.491s
user    2m8.174s
sys     30m19.569s
bonnie failed
0
Total 0 tests were successful
Switching over to the previous working directory


*CLIENT 6,gqac027* :

===========================TESTS RUNNING===========================
Changing to the specified mountpoint
/gluster-mount/run5594
executing bonnie
Using uid:0, gid:0.
Writing a byte at a time...
done
Writing intelligently...done
Rewriting...
done
Reading a byte at a time...done
Reading intelligently...done
start 'em...done...done...done...done...done...
Create files in sequential order...done.
Stat files in sequential order...Expected 16384 files but only got 15757
Cleaning up test directory after error.

real    622m24.676s
user    2m0.420s
sys     32m4.004s
bonnie failed
0
Total 0 tests were successful
Switching over to the previous working directory
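
To put the six failures side by side, the shortfall per client can be computed from the error lines above. The following is a minimal sketch in Python and is not part of the original test harness; `REPORTED`, `shortfall`, and `summarize` are names made up for illustration, and the strings are copied from the client logs in this report.

```python
import re

# Failure lines copied from the six client logs in this report.
REPORTED = {
    "gqac015": "Stat files in sequential order...Expected 16384 files but only got 15706",
    "gqac028": "Stat files in sequential order...Expected 16384 files but only got 15748",
    "gqac016": "Stat files in sequential order...Expected 16384 files but only got 15675",
    "gqac010": "Stat files in sequential order...Expected 16384 files but only got 15756",
    "gqac012": "Stat files in sequential order...Expected 16384 files but only got 15713",
    "gqac027": "Stat files in sequential order...Expected 16384 files but only got 15757",
}

PATTERN = re.compile(r"Expected (\d+) files but only got (\d+)")


def shortfall(line):
    """Return (expected, got, missing) for a failure line, or None if it does not match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    expected, got = int(match.group(1)), int(match.group(2))
    return expected, got, expected - got


def summarize(lines_by_client):
    """Map each client name to its (expected, got, missing) tuple."""
    return {client: shortfall(line) for client, line in lines_by_client.items()}


if __name__ == "__main__":
    for client, (expected, got, missing) in summarize(REPORTED).items():
        print(f"{client}: {missing} of {expected} files missing "
              f"({100 * missing / expected:.1f}%)")
```

On the numbers above, every client is missing between 627 and 709 of the 16384 expected files.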



Version-Release number of selected component (if applicable):
-------------------------------------------------------------

3.8.4-20

How reproducible:
-----------------

2/2

Steps to Reproduce:
-------------------

Run Bonnie++ from multiple clients.
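
Since a full Bonnie++ run takes over ten hours here, a much smaller check of the same symptom may be useful: create a known number of files in an empty directory, list the directory, and compare the counts. The sketch below is an assumption about what a quick check could look like, not Bonnie++'s actual implementation; `create_and_count` and `check` are hypothetical helpers, and the temporary directory stands in for a fresh directory on the gluster mount.

```python
import os
import tempfile


def create_and_count(directory, n_files):
    """Create n_files empty files in an empty directory, then count its entries."""
    for i in range(n_files):
        # Opening in write mode and closing immediately creates an empty file.
        with open(os.path.join(directory, f"f{i:07d}"), "w"):
            pass
    # os.scandir walks the directory listing; '.' and '..' are not included.
    with os.scandir(directory) as entries:
        return sum(1 for _ in entries)


def check(directory, n_files):
    """Return (ok, got), where ok is True if the listing shows every created file."""
    got = create_and_count(directory, n_files)
    return got == n_files, got


if __name__ == "__main__":
    # Assumption: replace the temporary directory with an empty directory
    # on the gluster mount to exercise the readdir path under test.
    with tempfile.TemporaryDirectory() as scratch:
        ok, got = check(scratch, 1024)
        print(f"expected 1024 files, listing returned {got}: "
              f"{'ok' if ok else 'MISMATCH'}")
```

A single small run like this may well pass even where the long multi-client run fails, so a clean result here does not rule the bug out.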

Actual results:
---------------

Bonnie++ fails in the sequential stat phase, reporting fewer files than expected.

Expected results:
-----------------

A clean Bonnie++ run.

Additional info:
---------------

[root@gqas013 ~]# gluster v info
 
Volume Name: butcher
Type: Distributed-Disperse
Volume ID: 55902003-7ea9-4f58-987d-63c6c759a385
Status: Started
Snapshot Count: 0
Number of Bricks: 12 x (4 + 2) = 72
Transport-type: tcp
Bricks:
Brick1: gqas013.sbu.lab.eng.bos.redhat.com:/bricks1/brick
Brick2: gqas005.sbu.lab.eng.bos.redhat.com:/bricks1/brick
Brick3: gqas006.sbu.lab.eng.bos.redhat.com:/bricks1/brick
Brick4: gqas008.sbu.lab.eng.bos.redhat.com:/bricks1/brick
Brick5: gqas014.sbu.lab.eng.bos.redhat.com:/bricks1/brick
Brick6: gqas015.sbu.lab.eng.bos.redhat.com:/bricks1/brick
Brick7: gqas013.sbu.lab.eng.bos.redhat.com:/bricks2/brick
Brick8: gqas005.sbu.lab.eng.bos.redhat.com:/bricks2/brick
Brick9: gqas006.sbu.lab.eng.bos.redhat.com:/bricks2/brick
Brick10: gqas008.sbu.lab.eng.bos.redhat.com:/bricks2/brick
Brick11: gqas014.sbu.lab.eng.bos.redhat.com:/bricks2/brick
Brick12: gqas015.sbu.lab.eng.bos.redhat.com:/bricks2/brick
Brick13: gqas013.sbu.lab.eng.bos.redhat.com:/bricks3/brick
Brick14: gqas005.sbu.lab.eng.bos.redhat.com:/bricks3/brick
Brick15: gqas006.sbu.lab.eng.bos.redhat.com:/bricks3/brick
Brick16: gqas008.sbu.lab.eng.bos.redhat.com:/bricks3/brick
Brick17: gqas014.sbu.lab.eng.bos.redhat.com:/bricks3/brick
Brick18: gqas015.sbu.lab.eng.bos.redhat.com:/bricks3/brick
Brick19: gqas013.sbu.lab.eng.bos.redhat.com:/bricks4/brick
Brick20: gqas005.sbu.lab.eng.bos.redhat.com:/bricks4/brick
Brick21: gqas006.sbu.lab.eng.bos.redhat.com:/bricks4/brick
Brick22: gqas008.sbu.lab.eng.bos.redhat.com:/bricks4/brick
Brick23: gqas014.sbu.lab.eng.bos.redhat.com:/bricks4/brick
Brick24: gqas015.sbu.lab.eng.bos.redhat.com:/bricks4/brick
Brick25: gqas013.sbu.lab.eng.bos.redhat.com:/bricks5/brick
Brick26: gqas005.sbu.lab.eng.bos.redhat.com:/bricks5/brick
Brick27: gqas006.sbu.lab.eng.bos.redhat.com:/bricks5/brick
Brick28: gqas008.sbu.lab.eng.bos.redhat.com:/bricks5/brick
Brick29: gqas014.sbu.lab.eng.bos.redhat.com:/bricks5/brick
Brick30: gqas015.sbu.lab.eng.bos.redhat.com:/bricks5/brick
Brick31: gqas013.sbu.lab.eng.bos.redhat.com:/bricks6/brick
Brick32: gqas005.sbu.lab.eng.bos.redhat.com:/bricks6/brick
Brick33: gqas006.sbu.lab.eng.bos.redhat.com:/bricks6/brick
Brick34: gqas008.sbu.lab.eng.bos.redhat.com:/bricks6/brick
Brick35: gqas014.sbu.lab.eng.bos.redhat.com:/bricks6/brick
Brick36: gqas015.sbu.lab.eng.bos.redhat.com:/bricks6/brick
Brick37: gqas013.sbu.lab.eng.bos.redhat.com:/bricks7/brick
Brick38: gqas005.sbu.lab.eng.bos.redhat.com:/bricks7/brick
Brick39: gqas006.sbu.lab.eng.bos.redhat.com:/bricks7/brick
Brick40: gqas008.sbu.lab.eng.bos.redhat.com:/bricks7/brick
Brick41: gqas014.sbu.lab.eng.bos.redhat.com:/bricks7/brick
Brick42: gqas015.sbu.lab.eng.bos.redhat.com:/bricks7/brick
Brick43: gqas013.sbu.lab.eng.bos.redhat.com:/bricks8/brick
Brick44: gqas005.sbu.lab.eng.bos.redhat.com:/bricks8/brick
Brick45: gqas006.sbu.lab.eng.bos.redhat.com:/bricks8/brick
Brick46: gqas008.sbu.lab.eng.bos.redhat.com:/bricks8/brick
Brick47: gqas014.sbu.lab.eng.bos.redhat.com:/bricks8/brick
Brick48: gqas015.sbu.lab.eng.bos.redhat.com:/bricks8/brick
Brick49: gqas013.sbu.lab.eng.bos.redhat.com:/bricks9/brick
Brick50: gqas005.sbu.lab.eng.bos.redhat.com:/bricks9/brick
Brick51: gqas006.sbu.lab.eng.bos.redhat.com:/bricks9/brick
Brick52: gqas008.sbu.lab.eng.bos.redhat.com:/bricks9/brick
Brick53: gqas014.sbu.lab.eng.bos.redhat.com:/bricks9/brick
Brick54: gqas015.sbu.lab.eng.bos.redhat.com:/bricks9/brick
Brick55: gqas013.sbu.lab.eng.bos.redhat.com:/bricks10/brick
Brick56: gqas005.sbu.lab.eng.bos.redhat.com:/bricks10/brick
Brick57: gqas006.sbu.lab.eng.bos.redhat.com:/bricks10/brick
Brick58: gqas008.sbu.lab.eng.bos.redhat.com:/bricks10/brick
Brick59: gqas014.sbu.lab.eng.bos.redhat.com:/bricks10/brick
Brick60: gqas015.sbu.lab.eng.bos.redhat.com:/bricks10/brick
Brick61: gqas013.sbu.lab.eng.bos.redhat.com:/bricks11/brick
Brick62: gqas005.sbu.lab.eng.bos.redhat.com:/bricks11/brick
Brick63: gqas006.sbu.lab.eng.bos.redhat.com:/bricks11/brick
Brick64: gqas008.sbu.lab.eng.bos.redhat.com:/bricks11/brick
Brick65: gqas014.sbu.lab.eng.bos.redhat.com:/bricks11/brick
Brick66: gqas015.sbu.lab.eng.bos.redhat.com:/bricks11/brick
Brick67: gqas013.sbu.lab.eng.bos.redhat.com:/bricks12/brick
Brick68: gqas005.sbu.lab.eng.bos.redhat.com:/bricks12/brick
Brick69: gqas006.sbu.lab.eng.bos.redhat.com:/bricks12/brick
Brick70: gqas008.sbu.lab.eng.bos.redhat.com:/bricks12/brick
Brick71: gqas014.sbu.lab.eng.bos.redhat.com:/bricks12/brick
Brick72: gqas015.sbu.lab.eng.bos.redhat.com:/bricks12/brick
Options Reconfigured:
performance.parallel-readdir: on
transport.address-family: inet
nfs.disable: on
features.cache-invalidation: on
features.cache-invalidation-timeout: 600
performance.stat-prefetch: on
performance.cache-invalidation: on
performance.md-cache-timeout: 600
network.inode-lru-limit: 50000
cluster.lookup-optimize: on
server.event-threads: 4
client.event-threads: 4
[root@gqas013 ~]#
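
When comparing setups across runs, it can help to pull the layout and the reconfigured options out of the `gluster v info` text. The following is an illustrative Python sketch only; `parse_layout` and `parse_options` are hypothetical helpers, and `VOLUME_INFO` is an abridged copy of the output above with the brick list omitted.

```python
import re

# Abridged copy of the `gluster v info` output above (brick list omitted).
VOLUME_INFO = """\
Volume Name: butcher
Type: Distributed-Disperse
Number of Bricks: 12 x (4 + 2) = 72
Options Reconfigured:
performance.parallel-readdir: on
performance.stat-prefetch: on
performance.md-cache-timeout: 600
cluster.lookup-optimize: on
"""


def parse_layout(text):
    """Return (subvolumes, data, redundancy, total) from the 'Number of Bricks' line."""
    match = re.search(r"Number of Bricks: (\d+) x \((\d+) \+ (\d+)\) = (\d+)", text)
    if match is None:
        raise ValueError("no disperse layout line found")
    return tuple(int(group) for group in match.groups())


def parse_options(text):
    """Return the options listed after 'Options Reconfigured:' as a dict."""
    _, _, tail = text.partition("Options Reconfigured:")
    options = {}
    for line in tail.splitlines():
        key, sep, value = line.partition(":")
        # Skip blank lines and anything that is not a 'key: value' pair.
        if sep and value.strip():
            options[key.strip()] = value.strip()
    return options


if __name__ == "__main__":
    subvols, data, redundancy, total = parse_layout(VOLUME_INFO)
    # 12 disperse subvolumes of (4 data + 2 redundancy) bricks give 72 bricks.
    assert subvols * (data + redundancy) == total
    print("parallel-readdir:", parse_options(VOLUME_INFO)["performance.parallel-readdir"])
```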

Comment 2 Ambarish 2017-04-05 05:44:23 UTC
The test is really long, and I am not sure how reproducible it is.

However, I could not reproduce it with parallel readdir disabled (tried once, got a clean run).

Also, I could reproduce this error on a replicate volume as well, with parallel readdir enabled.

Comment 5 Poornima G 2017-04-26 10:07:26 UTC
Can you please recheck this test case with the latest build? Quite a few fixes in this area have gone in.

Comment 6 Ambarish 2017-05-06 03:53:53 UTC
Clean runs of Bonnie++ over multiple iterations:

<snip>
=========
Changing to the specified mountpoint
/gluster-mount/run11097
executing bonnie
Using uid:0, gid:0.
Writing a byte at a time...done
Writing intelligently...done
Rewriting...done
Reading a byte at a time...done
Reading intelligently...done
start 'em...done...done...done...done...done...
Create files in sequential order...done.
Stat files in sequential order...done.
Delete files in sequential order...done.
Create files in random order...done.
Stat files in random order...done.
Delete files in random order...done.

real    635m28.172s
user    1m46.703s
sys     31m5.598s
1
Total 1 tests were successful
Switching over to the previous working directory
Removing /gluster-mount/run11097/
[root@gqac015 ~]# 

===========================TESTS RUNNING===========================
Changing to the specified mountpoint
/gluster-mount/run11858
executing bonnie
Using uid:0, gid:0.
Writing a byte at a time...done
Writing intelligently...done
Rewriting...done
Reading a byte at a time...done
Reading intelligently...done
start 'em...done...done...done...done...done...
Create files in sequential order...done.
Stat files in sequential order...done.
Delete files in sequential order...done.
Create files in random order...done.
Stat files in random order...done.
Delete files in random order...done.

real    632m41.918s
user    2m17.743s
sys     30m21.370s
1
Total 1 tests were successful
Switching over to the previous working directory
Removing /gluster-mount/run11858/
[root@gqac005 ~]# 


</snip>



This looks fixed on the latest gluster bits as well.

Comment 9 Ambarish 2017-05-08 06:52:20 UTC
Verified on 3.8.4-24.

Comment 11 errata-xmlrpc 2017-09-21 04:35:56 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:2774

