This bug has been seen a couple of times in the wild.

Scenario #1:

A pure-distribute setup with 6 servers. One of the server machines goes down and another machine assumes its responsibility. It starts its own GlusterFS server process and starts exporting the same LUNs that the now-dead server was exporting. The client starts seeing LOOKUP / => ESTALE.

Scenario #2:

A 4-server distribute+replicate setup. One of the servers is shut down, its disk taken out and replaced with a blank one. GlusterFS is started again, and self-heal is triggered from the client. The client starts seeing LOOKUP / => ESTALE.
I happened to reproduce this inadvertently myself. The client volume file had a mistake: distribute's two subvolumes were identical (2 subvolumes in total). Mounting and remounting multiple times still resulted in the ESTALE error.
(In reply to comment #0)

GlusterFS gets a different inode number from the new disk. In the lookup callback, the new inode number is checked against the cached one; if they do not match, errno is set to ESTALE. So before replacing a machine or a disk, you should flush the cache.
Sorry, it's not the inode number; it is st_dev that changed, and client_lookup_cbk checks it.
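To make that concrete, here is a minimal sketch (not the actual GlusterFS client code) of the kind of check described above: the st_dev/st_ino identity cached for an inode is compared against the values returned by a fresh lookup, and a mismatch is mapped to ESTALE. The struct and function names (cached_inode, lookup_cbk_check) are hypothetical.

/*
 * Sketch of a lookup-callback identity check, assuming the client
 * caches the st_dev/st_ino pair it saw when the inode was first
 * looked up.
 */
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/stat.h>

struct cached_inode {
    uint64_t cached_ino;   /* st_ino recorded at the first lookup */
    uint64_t cached_dev;   /* st_dev recorded at the same time */
};

/* Returns 0 on success, -ESTALE if the backend identity changed. */
static int lookup_cbk_check(const struct cached_inode *cache,
                            const struct stat *fresh)
{
    /* A replaced disk (or a different server exporting "the same"
     * path) reports a new st_dev, so the cached identity no longer
     * matches and the lookup is treated as stale. */
    if (cache->cached_dev != (uint64_t) fresh->st_dev ||
        cache->cached_ino != (uint64_t) fresh->st_ino)
        return -ESTALE;
    return 0;
}

int main(void)
{
    struct cached_inode cache = { .cached_ino = 1, .cached_dev = 2049 };
    struct stat fresh = { .st_ino = 1, .st_dev = 2050 }; /* new disk */

    int ret = lookup_cbk_check(&cache, &fresh);
    printf("lookup result: %d (%s)\n", ret,
           ret == -ESTALE ? "ESTALE" : "ok");
    return 0;
}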
The gfid changes invalidate this bug.