Bug 1463191 - gfapi: discard glfs object when volume is deleted [NEEDINFO]
Status: ASSIGNED
Product: GlusterFS
Classification: Community
Component: libgfapi
Version: mainline
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Assigned To: Venkata R Edara
QA Contact: bugs@gluster.org
Depends On:
Blocks: 1463192
Reported: 2017-06-20 07:09 EDT by Prasanna Kumar Kalever
Modified: 2017-09-04 07:20 EDT

Clones: 1463192
Type: Bug
ndevos: needinfo? (rgowdapp)
ndevos: needinfo? (amukherj)


Attachments
sample patch file for review. (4.19 KB, patch)
2017-08-16 08:34 EDT, Venkata R Edara
Description Prasanna Kumar Kalever 2017-06-20 07:09:16 EDT
Description of problem:

Currently, once we hold a glfs object for a given volume, deleting that volume and recreating it with the same name does not invalidate the object: the old glfs object can still be used to access the new volume, which is wrong.

Version-Release number of selected component (if applicable):
mainline

How reproducible:
1. Write a gfapi program that calls glfs_init() and then creates a file in the volume. Set a breakpoint right after the file is created.

2. Delete the volume and recreate it with the same name.

3. Continue the program and, using the same old glfs object, try to create another file in the volume.

4. Surprisingly, the create succeeds.

In my use case, glfs_get_volumeid() returned the old volume id. It should have failed with an error saying that the glfs object is no longer valid or, in the worst case, returned the new volume id. Instead it returned the old UUID.

Refer to https://bugzilla.redhat.com/show_bug.cgi?id=1461808#c9 for more context and sample programs.


Actual results:
With the old glfs object we can still access the new volume.

Expected results:
Operations on the old glfs object fail with an error indicating that the object is invalid.
Comment 1 Niels de Vos 2017-06-21 04:55:09 EDT
This problem does not look like it is unique to gfapi, FUSE will most likely have the same issue. Once a connection to a brick is dropped, a re-connect should most likely verify if the volume-id is the same as before.

I think there are several approaches to solving this:

1. glusterd initiated through GF_CBK_EVENT_NOTIFY or similar (on volume delete)
2. handling this in protocol/client:RPC_CLNT_CONNECT and client_handshake()
3. in the master xlators like fuse and gfapi

Whatever is picked, any subsequent usage of the deleted volume should result in ESTALE errors.


My current preference goes to doing this in protocol/client. What do others think?
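
A minimal sketch of the idea behind option 2, written in Python for brevity rather than the C of protocol/client and not taken from the GlusterFS source. A toy client pins the volume-id it sees at first connect and compares it on every reconnect. The names `ToyBrick`, `ToyClient` and `handshake` are hypothetical; only the idea of failing later operations with ESTALE comes from the comment above.

```python
import errno
import os
import uuid


class ToyBrick:
    """Stand-in for a brick; holds the id of the volume it currently serves."""

    def __init__(self):
        self.volume_id = uuid.uuid4()

    def recreate_volume(self):
        # Deleting and recreating a volume with the same name yields a new id.
        self.volume_id = uuid.uuid4()


class ToyClient:
    """Stand-in for a client connection (hypothetical, not the real xlator)."""

    def __init__(self, brick):
        self.brick = brick
        self.known_volume_id = None
        self.stale = False

    def handshake(self):
        # Run on every (re)connect: pin the id the first time, compare afterwards.
        current = self.brick.volume_id
        if self.known_volume_id is None:
            self.known_volume_id = current
        elif current != self.known_volume_id:
            self.stale = True

    def mkdir(self, path):
        # Any operation on a stale connection fails with ESTALE.
        if self.stale:
            raise OSError(errno.ESTALE, os.strerror(errno.ESTALE), path)
        return path


brick = ToyBrick()
client = ToyClient(brick)
client.handshake()
client.mkdir("/dir1")        # succeeds: same volume as at first connect

brick.recreate_volume()
client.handshake()           # the reconnect notices the changed volume-id
try:
    client.mkdir("/dir3")
    stale_errno = None
except OSError as exc:
    stale_errno = exc.errno  # errno.ESTALE
```

Doing the comparison in the handshake keeps the check in one place, so every consumer of the connection (FUSE or gfapi) would see the same error.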
Comment 2 Venkata R Edara 2017-08-01 04:36:45 EDT
Reproduced the bug using the Python bindings. I will work on a fix.
Comment 3 Venkata R Edara 2017-08-16 08:34 EDT
Created attachment 1314099 [details]
sample patch file for review.

I made the fix in the gfapi code. The actual change is in glfs-mgmt.c, which calls glfs_get_volume_info(); its callback checks whether the volume id is still the same and sets errno if it is not.

The fix could also go in xlator/protocol/client, but then we would have to register a new handshake procedure for fetching the volume id: clnt_handshake_procs would need a GF_HNDSK_GET_VOLUME_INFO entry with the respective call and callbacks. Since adding an entry to the protocol is sensitive, I made the changes in the gfapi code only.
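
The callback approach could look roughly like the following, sketched in Python for brevity rather than C. This is not the attached patch: `ToyGlfs`, `on_volume_info` and `check_usable` are hypothetical names, and ENXIO is assumed as the error code because a later comment in this bug says the fix raises ENXIO.

```python
import errno
import os


class ToyGlfs:
    """Hypothetical stand-in for a glfs object; not the real libgfapi structure."""

    def __init__(self, volname, volume_id):
        self.volname = volname
        self.volume_id = volume_id   # id recorded when the object was initialised
        self.bad_errno = 0           # 0 means the object is still usable

    def on_volume_info(self, reported_volume_id):
        # Callback for a volume-info reply: a different id means the volume
        # was deleted and recreated, so remember an error for later calls.
        if reported_volume_id != self.volume_id:
            self.bad_errno = errno.ENXIO

    def check_usable(self):
        # A file operation would call this before doing any work.
        if self.bad_errno:
            raise OSError(self.bad_errno, os.strerror(self.bad_errno))


fs = ToyGlfs("test2", "id-before-delete")
fs.on_volume_info("id-before-delete")
fs.check_usable()                      # no error: the ids match

fs.on_volume_info("id-after-recreate")
try:
    fs.check_usable()
    marked_bad = False
except OSError as exc:
    marked_bad = exc.errno == errno.ENXIO
```

Recording the error on the object, instead of raising inside the callback, lets the asynchronous reply arrive at any time while the failure is reported on the caller's next operation.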
Comment 4 Worker Ant 2017-08-18 06:42:09 EDT
REVIEW: https://review.gluster.org/18064 (gfapi: Fix for bug 1463191) posted (#1) for review on master by Anonymous Coward
Comment 5 Venkata R Edara 2017-08-18 07:10:27 EDT
https://review.gluster.org/#/c/18064/
Comment 6 Worker Ant 2017-08-21 08:29:18 EDT
REVIEW: https://review.gluster.org/18064 (Discard glfs object if volume is recreated) posted (#2) for review on master by Anonymous Coward
Comment 7 Worker Ant 2017-08-22 05:20:27 EDT
REVIEW: https://review.gluster.org/18064 (Discard glfs object if volume is recreated) posted (#3) for review on master by Anonymous Coward
Comment 8 Worker Ant 2017-08-23 04:38:25 EDT
REVIEW: https://review.gluster.org/18064 (gfapi: mark glfs object as bad if volume is re-created) posted (#4) for review on master by Anonymous Coward
Comment 9 Venkata R Edara 2017-08-23 04:54:13 EDT
Steps to reproduce the issue:

Here is a gfapi Python session; the volume name is "test2" and the hostname is "gluster2".

Host 1, on the client machine:
>>> from gluster import gfapi
>>> volume = gfapi.Volume('gluster2','test2')
>>> volume.mount()
>>> volume.mkdir('/dir1')
>>> volume.mkdir('/dir2')

Host 2, on the server machine (gluster2):
Delete the test2 volume on the server side and recreate it:
gluster2 # gluster volume stop test2
gluster2 # gluster volume delete test2
gluster2 # rm -rf /storage/brick3/test2vol
gluster2 # mkdir /storage/brick3/test2vol
gluster2 # gluster volume create test2 gluster2:/storage/brick3/test2vol
gluster2 # gluster volume start test2

Host 1: back on the client machine, continue the Python session. We still have the volume object here, and we can create a directory through the old volume object, which should have been discarded.

>>> volume.listdir('/')
[]
>>> volume.mkdir('/dir3')
>>> volume.listdir('/')
['dir3']

This shows that if a volume is recreated with the same name on the server side, the client program can still access it through the old volume object. With the fix for this bug, such access raises ENXIO.
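
On the application side, handling the new error might look like the sketch below. It assumes the Python bindings surface the failure as an OSError whose errno is ENXIO; `FakeVolume` is a hypothetical test double used in place of gfapi.Volume so the example runs without a Gluster server, and `mkdir_or_report_stale` is an illustrative helper, not part of the bindings.

```python
import errno
import os


class FakeVolume:
    """Test double standing in for gfapi.Volume; always reports a recreated volume."""

    def mkdir(self, path):
        raise OSError(errno.ENXIO, os.strerror(errno.ENXIO), path)


def mkdir_or_report_stale(volume, path):
    """Return True on success, False if the volume object must be discarded."""
    try:
        volume.mkdir(path)
        return True
    except OSError as exc:
        if exc.errno == errno.ENXIO:
            # The volume behind this object no longer exists; the caller
            # should drop the object and mount the volume afresh.
            return False
        raise  # any other error is unrelated and is passed on unchanged


needs_remount = not mkdir_or_report_stale(FakeVolume(), "/dir3")
```

With a real volume object, a False return would be the signal to create a new gfapi.Volume and call mount() again.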
Comment 10 Worker Ant 2017-09-04 07:20:35 EDT
REVIEW: https://review.gluster.org/18064 (gfapi: mark glfs object as bad if volume is re-created) posted (#5) for review on master by Venkata Ramarao Edara (redara@redhat.com)
