Description of problem:
I use two hosts, A and B, as gluster clients; they access the same gluster volume. On the volume there is a directory named dirxx, and a file under this directory named fileyy. On host A I run a program which opens file dirxx/fileyy. Then I remove directory dirxx from host B while host A is still accessing dirxx/fileyy. Here comes the issue: if I try to execute "mkdir dirxx" on host A, I receive the error "mkdir: cannot create directory `dirxx': File exists". I think this is a consistency issue, because if I remove the directory from host A and then recreate it on host A, the creation succeeds.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
0. mount glusterfs on hosts A and B
1. [hostA]# mkdir dirxx; touch dirxx/fileyy
2. open this file from host A, e.g. simply with vi dirxx/fileyy
3. [hostB]# rm -rf dirxx (make sure dirxx/fileyy is still in use by host A)
4. [hostA]# mkdir dirxx
5. observe the error "mkdir: cannot create directory `dirxx': File exists"

Actual results:

Expected results:

Additional info:
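The steps above can be condensed into a reproduction sketch. This is only an illustration of the sequence, assuming the volume is already mounted at /mnt/gluster on both clients (the mount point is a placeholder taken from the comments below); step 3 must be executed on host B, not in this script.

```shell
#!/bin/sh
# Reproduction sketch -- run on host A.
# Assumes the gluster volume is mounted at /mnt/gluster on both clients.
cd /mnt/gluster || exit 1

# Step 1: create the directory and file from host A.
mkdir dirxx
touch dirxx/fileyy

# Step 2: keep an open file descriptor on dirxx/fileyy from host A.
tail -f dirxx/fileyy &
TAIL_PID=$!

# Step 3: on host B (NOT in this script): rm -rf /mnt/gluster/dirxx

# Step 4: after step 3 has completed on host B, retry the mkdir here.
# Expected: success. Observed on 3.3.0:
#   mkdir: cannot create directory `dirxx': File exists
mkdir dirxx

kill "$TAIL_PID"
```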
Hi Albert, Can you provide the version you are using? On the latest git branch (master), I am able to create the directory irrespective of where the rmdir/mkdir is issued. If you continue to see the failure, please send across the client logs from when the error occurred.
Hi Shishir, Thanks for your response. The version I am using is gluster 3.3.0. The issue is very easy to reproduce; it reproduces 100% of the time on version 3.3.0, and there is no log output when the error occurs.
Hi Albert, Can you provide the volume information? i.e. the output of 'gluster volume info <volname>'.
Hi Shishir, Below is my volume information.

------------------------ volume information -----------------------
[root@localhost dhtvolume]# gluster volume info dhtvolume

Volume Name: dhtvolume
Type: Distribute
Volume ID: 277948ae-dea2-4074-9689-a2dec22a1d64
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: 192.168.6.230:/data/gluster/dhtbrick
Brick2: 192.168.6.232:/data/gluster/dhtbrick
[root@localhost dhtvolume]#
------------------------------------------------------------

------------------------- client mount point ---------------------------
[root@localhost dhtvolume]# df -h
/dev/sda3                            887G  294G  548G  35% /
/dev/sda1                             99M   12M   83M  13% /boot
tmpfs                                7.9G     0  7.9G   0% /dev/shm
192.168.6.77:/BrdCloud/binall/lib/   872G   49G  778G   6% /BrdCloud/binall/lib
glusterfs#192.168.6.230:/dhtvolume   2.1T  800G  1.2T  40% /mnt/gluster
[root@localhost dhtvolume]#
--------------------------------------------------------

I have two clients mounting the same volume. It is very important to note that the "rm -rf" must come from the other client: one client opens file /mnt/gluster/dirxx/fileyy, the other client executes "rm -rf /mnt/gluster/dirxx", and then you will receive this error. I have downloaded the latest git version and tried again, but the error is still there.

My guess at the cause: client B deletes the directory while client A is using a file under it; the server receives B's command and deletes the directory and the file under it, but client A does not know what has happened. When we later try to mkdir from client A, client A first searches its entry cache, finds that the entry already exists, and simply returns an error.
Hi Albert, This might be due to client-side caching. Can you try to mount with --entry-timeout=0 on both clients? Or drop caches after the rmdir, before doing the mkdir? (echo 3 > /proc/sys/vm/drop_caches)
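For reference, the two suggestions look roughly like this on a client. This is a sketch, not part of the original comment: the volfile server and volume name are taken from the volume info in this report, the mount point is assumed to be /mnt/gluster, and both commands require root.

```shell
# Mount with the FUSE entry cache disabled on this client
# (entry-timeout=0 forces a fresh lookup for every directory entry):
glusterfs --entry-timeout=0 \
          --volfile-server=192.168.6.230 \
          --volfile-id=dhtvolume \
          /mnt/gluster

# Alternatively, after the rmdir on the other client and before
# retrying the mkdir, drop the kernel's dentry/inode and page caches:
sync
echo 3 > /proc/sys/vm/drop_caches
```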
Hi Shishir, I tried again; neither entry-timeout=0 nor dropping caches makes it work. What should I do next?
Hi Albert, Thanks for trying it out. I will try to reproduce the issue in-house and get back to you.
I'll wait for good news from you. Have a nice weekend.
The version that this bug has been reported against does not get any updates from the Gluster Community anymore. Please verify whether this report is still valid against a current (3.4, 3.5 or 3.6) release and update the version, or close this bug. If there has been no update before 9 December 2014, this bug will get automatically closed.