PATCH: http://patches.gluster.com/patch/4282 in master (storage/posix: calculate the correct size of each dirent in readdir/readdirp.)
When readdirp is done on a directory containing a large number of files (one example is during afr self-heal), encoding of the response fails. This happens because the iobuf into which the message is encoded is smaller than the response being encoded. Although the readdirp call sets its size argument to the maximum iobuf size (which is the page size), posix_readdir/readdirp does not calculate the correct size of each entry, and so produces a response bigger than the iobuf. Also, since the entire response is sent in a single iobuf (at least on sockets), the caller of readdirp should make sure the iobuf can hold the rpc and procedure headers along with the direntries.
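The size accounting described above can be modelled roughly as follows. This is only an illustrative sketch in Python, not the posix translator's C code: the per-entry header size, the 8-byte alignment, the 4096-byte page size and the 256 bytes reserved for rpc and procedure headers are all assumed values, and entry_size and fill_reply are hypothetical names.

```python
HEADER_BYTES = 24  # assumed fixed per-entry overhead (inode, offset, length, type)
ALIGN = 8          # assumed alignment of each encoded entry


def entry_size(name: str) -> int:
    """Encoded size of one entry: fixed header + name + NUL, rounded up to ALIGN."""
    raw = HEADER_BYTES + len(name.encode("utf-8")) + 1
    return (raw + ALIGN - 1) // ALIGN * ALIGN


def fill_reply(names, iobuf_size, reserved_for_headers):
    """Collect entries until the next one would overflow the usable part of the iobuf.

    reserved_for_headers is the space the caller sets aside for the rpc and
    procedure headers that share the same iobuf.
    """
    budget = iobuf_size - reserved_for_headers
    if budget <= 0:
        raise ValueError("iobuf too small to hold the headers")

    batch, used = [], 0
    for name in names:
        size = entry_size(name)
        if used + size > budget:
            break  # stop here; the caller resumes from this entry's offset
        batch.append(name)
        used += size
    return batch, used


if __name__ == "__main__":
    names = [f"file{i:05d}" for i in range(10000)]
    batch, used = fill_reply(names, iobuf_size=4096, reserved_for_headers=256)
    print(len(batch), "entries,", used, "bytes")
```

The point of the sketch is that the budget is charged with the full encoded size of each entry (header and padding included, not just the name length), and that the header reservation is subtracted before any entry is added.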
This patch successfully triggers self-heal (Bug #1365) and all the files are restored in the backend. But there is a mismatch in md5sum of the directories. The mismatch is seen only in the backend which got self-healed.

root@pitta:/mnt/gluster# arequal-checksum /mnt/exportnew1/hello2/

Entry counts
Regular files   : 10000
Directories     : 1
Symbolic links  : 0
Other           : 0
Total           : 10001

Metadata checksums
Regular files   : 3e9
Directories     : 24d74c
Symbolic links  : 3e9
Other           : 3e9

Checksums
Regular files   : 31e5447350a0c582c4f8f2ae5f51a8e1
Directories     : 30301e31
Symbolic links  : 0
Other           : 0
Total           : f51db6dd3fc17352

root@pitta:/mnt/gluster# arequal-checksum /mnt/gluster/hello2/

Entry counts
Regular files   : 10000
Directories     : 1
Symbolic links  : 0
Other           : 0
Total           : 10001

Metadata checksums
Regular files   : 3e9
Directories     : 24d74c
Symbolic links  : 3e9
Other           : 3e9

Checksums
Regular files   : 71b0553083fcdbca65c470265c2bd165
Directories     : 30301e31
Symbolic links  : 0
Other           : 0
Total           : 14742516efe7149e
*** This bug has been marked as a duplicate of bug 1422 ***