I had a distributed-replicate setup (2 distribute subvolumes, each with 2 replicas). I started copying /usr to the mount point. After some 50-60 MB had been copied, I ran rm -rf usr on the mount point, so rm and cp were running simultaneously. I was also doing some other operations, such as creating about 20 small files and removing them, and a find command was running. I then brought a server down, erased the contents of that server's export directory, and brought it back up. After that, ls on the usr directory shows no contents, and du on that directory shows 16K, whereas du on the backend shows a larger size. However, running ls on man, which is present in usr but not visible in the usr listing, does show the contents of the man directory. Sometimes this happens on the first server down/up iteration, and sometimes only after several iterations. This was observed in the 3.0.4rc series.
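The symptom above (entries present on the backend but missing from ls on the mount) can be checked with a small script. This is only a sketch: the function name and the example paths in the comment are hypothetical and were not part of the original report.

```python
import os

def missing_on_mount(mount_dir, backend_dir):
    """Return names present in the backend export directory but not listed on the mount.

    Each backend holds only a subset of the files, so in a healthy state
    everything on a backend should also be visible through the mount.
    """
    mount_entries = set(os.listdir(mount_dir))
    backend_entries = set(os.listdir(backend_dir))
    return sorted(backend_entries - mount_entries)

# Example call (hypothetical paths, adjust to the actual setup):
#   missing_on_mount("/mnt/glusterfs/usr", "/export/brick1/usr")
```

A non-empty result for the usr directory would correspond to the case described here, where man exists on the backend but is not listed on the mount.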
Do you have stat-prefetch loaded? With stat-prefetch on, self-heal might not be triggered by just doing "ls". Can you do "find | xargs stat" on the mountpoint and then try the ls? Also, are you killing the first server or the second?
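For reference, a rough Python equivalent of "find | xargs stat" is sketched below. It walks the tree and calls lstat on every entry, collecting errors instead of stopping at the first one. Whether this actually triggers self-heal depends on the GlusterFS version and configuration; the function name is hypothetical.

```python
import os

def stat_tree(root):
    """lstat every entry under root; return (entries_statted, errors)."""
    errors = []

    def on_error(exc):
        # os.walk reports directories it could not list through this callback.
        errors.append((exc.filename, exc.strerror))

    os.lstat(root)
    count = 1
    for dirpath, dirnames, filenames in os.walk(root, onerror=on_error):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                os.lstat(path)
                count += 1
            except OSError as exc:
                errors.append((path, exc.strerror))
    return count, errors
```

Running it on the mount point and then repeating the ls would be the same test as suggested above, with the added benefit that any lookup errors are listed.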
No, stat-prefetch was not loaded. I was bringing all the servers down and up. But yes, most of the time it was the first server, and this was most often observed when the first server of each of the DHT subvolumes was brought down. Once we even observed data corruption: running md5sum on the backend of the first server (which had been brought down, had its backend cleaned, and was brought back up) yielded the same value for all the entries in a directory. Running find while the server was brought down and brought back up reports a file system error. I will repeat the test and post the errors I am getting. The strange thing is that every time I run the test I get a new error (each of the things mentioned was observed in a different run, not in a single run). I will repeat the tests and update again.
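The md5sum observation can be checked systematically with something like the following sketch, which groups the regular files in one directory by MD5 digest and reports any digest shared by more than one file. The helper names are hypothetical, and this only detects identical checksums; it says nothing about which copy is correct.

```python
import hashlib
import os
from collections import defaultdict

def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def duplicate_checksums(directory):
    """Map each MD5 digest shared by two or more regular files to their names."""
    groups = defaultdict(list)
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.isfile(path):
            groups[md5_of(path)].append(name)
    return {d: names for d, names in groups.items() if len(names) > 1}
```

If every file in a backend directory ends up in a single group, that matches the corruption described in this comment.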
Whenever a big directory such as "/usr" is being copied and removed simultaneously while "find" is also running in parallel, bringing a server down and up causes "find" to print the following: "find: WARNING: Hard link count is wrong for `./usr/doc/HTML/en/kinfocenter' (saw only st_nlink=15 but we already saw 13 subdirectories): this may be a bug in your file system driver. Automatically turning on find's -noleaf option. Earlier results may have failed to include directories that should have been searched."
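The warning comes from find's leaf optimization, which assumes the traditional Unix convention that a directory's st_nlink equals 2 plus the number of its subdirectories. A minimal sketch of the same consistency check is shown below; the function names are hypothetical, and some file systems legitimately do not follow this convention, so a mismatch is a hint rather than proof of a bug.

```python
import os

def count_subdirs(path):
    """Count real subdirectories of path (symlinks to directories are ignored)."""
    with os.scandir(path) as entries:
        return sum(1 for e in entries if e.is_dir(follow_symlinks=False))

def nlink_too_small(nlink, subdirs):
    """True when more subdirectories exist than st_nlink - 2 allows."""
    return subdirs > nlink - 2

def nlink_mismatches(root):
    """Return (directory, st_nlink, subdir_count) for each inconsistent directory."""
    mismatches = []
    for dirpath, _dirnames, _filenames in os.walk(root):
        nlink = os.lstat(dirpath).st_nlink
        subdirs = count_subdirs(dirpath)
        if nlink_too_small(nlink, subdirs):
            mismatches.append((dirpath, nlink, subdirs))
    return mismatches
```

In the message above, st_nlink=15 allows for 13 subdirectories, so find complained as soon as it encountered one more.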
Please check with the new release.
(In reply to comment #4)
> Please check with new release..

Raghu, can you please test this out again with 3.1.0?
I think this should be closed with the first_up subvolume fix. But anyway, once Jonhy is back, please check with him whether it's fixed and close it.
Please update the status of this bug, as it has been more than 6 months since it was filed (bug id < 2000). Please resolve it with the proper resolution if it is not valid anymore. If it is still valid and not critical, move it to 'enhancement' severity.
*** This bug has been marked as a duplicate of bug 2684 ***