Bug 1463191

Summary: gfapi: discard glfs object when volume is deleted
Product: [Community] GlusterFS
Reporter: Prasanna Kumar Kalever <prasanna.kalever>
Component: libgfapi
Assignee: Sanju <srakonde>
Status: CLOSED DEFERRED
Severity: low
Priority: low
Version: mainline
CC: amukherj, bugs, ndevos, rgowdapp, vpandey
Target Milestone: ---
Keywords: StudentProject
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Clones: 1463192
Last Closed: 2019-12-31 07:22:58 UTC
Type: Bug
Bug Blocks: 1463192
Attachments: sample patch file for review. (flags: none)

Description Prasanna Kumar Kalever 2017-06-20 11:09:16 UTC
Description of problem:

Currently, once a program holds a glfs object for a given volume, it can keep using that object after the volume has been deleted and recreated with the same name. The old glfs object then gives access to the new volume, which is wrong.

Version-Release number of selected component (if applicable):
mainline

How reproducible:
1. Write a gfapi program that calls glfs_init() and then creates a file in the volume. Set a breakpoint right after the file is created.

2. Delete the volume and recreate it with the same name.

3. Continue the program and, in the following lines, create another file in the volume using the same old glfs object.

4. Surprisingly, the create succeeds.

My use case was slightly different: glfs_get_volumeid() returned the old volume id. It should instead fail with an error saying the glfs object is no longer valid or, at worst, return the new volume id.

Refer to https://bugzilla.redhat.com/show_bug.cgi?id=1461808#c9 for more context and sample programs.


Actual results:
With the old glfs object we can still access the new volume.

Expected results:
An error indicating that the glfs object is invalid.

Comment 1 Niels de Vos 2017-06-21 08:55:09 UTC
This problem does not look like it is unique to gfapi, FUSE will most likely have the same issue. Once a connection to a brick is dropped, a re-connect should most likely verify if the volume-id is the same as before.

I think there are several approaches to solving this:

1. glusterd initiated through GF_CBK_EVENT_NOTIFY or similar (on volume delete)
2. handling this in protocol/client:RPC_CLNT_CONNECT and client_handshake()
3. in the master xlators like fuse and gfapi

Whatever is picked, any subsequent usage of the deleted volume should result in ESTALE errors.


My current preference goes to doing this in protocol/client. What do others think?
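
The volume-id check on re-connect can be modelled in a few lines. This is only a sketch in plain Python, not GlusterFS code: FakeServer and ClientConnection are hypothetical stand-ins for glusterd and the protocol/client handshake, and the only behaviour taken from this comment is "same name but different volume-id gives ESTALE".

```python
import errno
import uuid


class FakeServer:
    """Hypothetical stand-in for the server side: volume name -> volume-id."""

    def __init__(self):
        self.volumes = {}

    def create(self, name):
        # Every (re-)creation gets a fresh id, even when the name is reused.
        self.volumes[name] = uuid.uuid4()

    def delete(self, name):
        del self.volumes[name]


class ClientConnection:
    """Hypothetical client that pins the volume-id seen on first connect."""

    def __init__(self, server, name):
        self.server = server
        self.name = name
        self.pinned_id = None

    def handshake(self):
        current = self.server.volumes.get(self.name)
        if self.pinned_id is None:
            if current is None:
                raise OSError(errno.ENOENT, "volume does not exist")
            self.pinned_id = current  # first connect: remember the id
        elif current != self.pinned_id:
            # Volume was deleted, or re-created under the same name:
            # refuse instead of silently binding to the new volume.
            raise OSError(errno.ESTALE, "volume-id changed since first connect")
```

In this model a re-connect after "volume delete" plus "volume create" with the same name fails with ESTALE, while a re-connect to the untouched volume still succeeds.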

Comment 2 Venkata R Edara 2017-08-01 08:36:45 UTC
Reproduced the bug using the Python bindings. I will be working on a fix.

Comment 3 Venkata R Edara 2017-08-16 12:34:41 UTC
Created attachment 1314099 [details]
sample patch file for review.

I have a fix in the gfapi code. The actual change is in glfs-mgmt.c, which calls glfs_get_volume_info(); its callback checks whether the volume id is still the same and sets errno.

The fix could also go in xlator/protocol/client, but then we would have to register a new handshake protocol entry for fetching the volume id: clnt_handshake_procs would need a GF_HNDSK_GET_VOLUME_INFO entry with the respective call and callback. Since adding an entry to the protocol is sensitive, I made the changes in the gfapi code only.
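
The gfapi-side idea (a volume-info callback that compares ids and records an errno on the handle) can be sketched roughly as follows. This is a plain-Python model, not the attached patch: Handle, on_volume_info and ensure_usable are hypothetical names, and ENXIO is assumed as the errno because comment 9 reports that value for the fix.

```python
import errno
import os


class Handle:
    """Hypothetical stand-in for a glfs object."""

    def __init__(self, volume_id):
        self.volume_id = volume_id  # id recorded when the handle was initialised
        self.bad_errno = 0          # 0 means the handle is still usable


def on_volume_info(handle, reported_id):
    """Callback sketch: runs whenever a volume-info reply arrives."""
    if reported_id != handle.volume_id:
        # Mark the handle bad; this model never clears the mark again.
        handle.bad_errno = errno.ENXIO


def ensure_usable(handle):
    """Guard that each file operation would call before doing any work."""
    if handle.bad_errno:
        raise OSError(handle.bad_errno, os.strerror(handle.bad_errno))
```

Keeping the errno on the handle means every later call on the old object fails the same way, rather than only the first call after the volume was re-created.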

Comment 4 Worker Ant 2017-08-18 10:42:09 UTC
REVIEW: https://review.gluster.org/18064 (gfapi: Fix for bug 1463191) posted (#1) for review on master by Anonymous Coward

Comment 5 Venkata R Edara 2017-08-18 11:10:27 UTC
https://review.gluster.org/#/c/18064/

Comment 6 Worker Ant 2017-08-21 12:29:18 UTC
REVIEW: https://review.gluster.org/18064 (Discard glfs object if volume is recreated) posted (#2) for review on master by Anonymous Coward

Comment 7 Worker Ant 2017-08-22 09:20:27 UTC
REVIEW: https://review.gluster.org/18064 (Discard glfs object if volume is recreated) posted (#3) for review on master by Anonymous Coward

Comment 8 Worker Ant 2017-08-23 08:38:25 UTC
REVIEW: https://review.gluster.org/18064 (gfapi: mark glfs object as bad if volume is re-created) posted (#4) for review on master by Anonymous Coward

Comment 9 Venkata R Edara 2017-08-23 08:54:13 UTC
Steps to reproduce the issue:

Here is a gfapi Python program. The volume name is "test2" and the hostname is "gluster2".

Host 1, on the client machine:
>>> from gluster import gfapi
>>> volume = gfapi.Volume('gluster2','test2')
>>> volume.mount()
>>> volume.mkdir('/dir1')
>>> volume.mkdir('/dir2')

Host 2, on the server machine (gluster2):
Delete the test2 volume on the server side and re-create it:
gluster2 # gluster volume stop test2
gluster2 # gluster volume delete test2
gluster2 # rm -rf /storage/brick3/test2vol
gluster2 # mkdir /storage/brick3/test2vol
gluster2 # gluster volume create test2 gluster2:/storage/brick3/test2vol
gluster2 # gluster volume start test2

Host 1: back on the client machine, continue the Python program, which still holds the volume object. You can still create a directory through the old volume object, which should have been discarded.

>>> volume.listdir('/')
[]
>>> volume.mkdir('/dir3')
>>> volume.listdir('/')
['dir3']

This shows that if a volume is re-created with the same name on the server side, a client program can still access it through the old volume object. With the fix for this bug, the operation fails with ENXIO instead.
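
On the application side, a client could react to that ENXIO by dropping the old object and building a new one. The helper below is a sketch under assumptions: call_with_remount is a hypothetical name, it assumes the bindings surface the error as an OSError carrying errno.ENXIO, and whether an automatic remount is acceptable depends on the application.

```python
import errno


def call_with_remount(make_volume, volume, operation):
    """Run operation(volume); on ENXIO build a fresh object and retry once.

    make_volume is a caller-supplied factory that returns a newly mounted
    volume object. Returns (volume_in_use, result) so the caller can keep
    the fresh object for later calls.
    """
    try:
        return volume, operation(volume)
    except OSError as exc:
        if exc.errno != errno.ENXIO:
            raise  # unrelated failure: do not hide it
    fresh = make_volume()
    return fresh, operation(fresh)
```

With the bindings used above, make_volume would create gfapi.Volume('gluster2', 'test2'), call mount() on it and return it, and operation could be lambda v: v.listdir('/').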

Comment 10 Worker Ant 2017-09-04 11:20:35 UTC
REVIEW: https://review.gluster.org/18064 (gfapi: mark glfs object as bad if volume is re-created) posted (#5) for review on master by Venkata Ramarao Edara (redara)

Comment 12 Yaniv Kaul 2019-11-18 09:37:14 UTC
What's the next step here?

Comment 13 Vishal Pandey 2019-11-18 09:46:30 UTC
I had a chat with Prasanna about this bug. He said it's not a very critical issue for now, so I have put it on the back burner. I will take it up once I am finished with https://bugzilla.redhat.com/show_bug.cgi?id=1763030.

Comment 14 Yaniv Kaul 2019-11-18 09:50:28 UTC
(In reply to Vishal Pandey from comment #13)
> I had a chat with Prasanna about this bug. He said it's not a very critical
> issue for now, so I have put it on the back burner. I will take it up once
> I am finished with
> https://bugzilla.redhat.com/show_bug.cgi?id=1763030.

Why would you take it if it's low severity and priority? Can an intern look at it?

Comment 15 Vishal Pandey 2019-11-18 11:09:06 UTC
Yes, an intern can definitely take a look at it. The only concern is whether any gfapi folks are available to help with any issues they might face.

Comment 16 Yaniv Kaul 2019-12-31 07:22:58 UTC
(In reply to Vishal Pandey from comment #15)
> Yes, an intern can definitely take a look at it. The only concern is whether
> any gfapi folks are available to help with any issues they might face.

No one took it, closing.