Bug 1028672 - BD xlator
Status: CLOSED CURRENTRELEASE
Product: GlusterFS
Classification: Community
Component: core
Version: mainline
Hardware: Unspecified, OS: Unspecified
Priority: unspecified, Severity: unspecified
Assigned To: GlusterFS Bugs list
Depends On:
Blocks:
Reported: 2013-11-09 06:21 EST by M. Mohan Kumar
Modified: 2014-04-17 07:50 EDT (History)

See Also:
Fixed In Version: glusterfs-3.5.0
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-04-17 07:50:26 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Description M. Mohan Kumar 2013-11-09 06:21:25 EST
Description of problem:

The existing bd_map xlator has limitations: it supports only one brick per volume, cannot create directories, and so on.

Add a new storage xlator, the 'Block Device' (BD) xlator, that exports block devices as regular files to the client and overcomes these limitations.

Comment 1 Anand Avati 2013-11-09 08:27:16 EST
REVIEW: http://review.gluster.org/6050 (bd: Add test case for bd xlator) posted (#2) for review on master by M. Mohan Kumar (mohan@in.ibm.com)
Comment 2 Anand Avati 2013-11-09 08:27:28 EST
REVIEW: http://review.gluster.org/5747 (bd_map: Remove bd_map xlator) posted (#3) for review on master by M. Mohan Kumar (mohan@in.ibm.com)
Comment 3 Anand Avati 2013-11-09 08:27:34 EST
REVIEW: http://review.gluster.org/5235 (bd: Add BD support to other xlators) posted (#5) for review on master by M. Mohan Kumar (mohan@in.ibm.com)
Comment 4 Anand Avati 2013-11-09 08:27:40 EST
REVIEW: http://review.gluster.org/5748 (bd: Add aio support to BD xlator) posted (#3) for review on master by M. Mohan Kumar (mohan@in.ibm.com)
Comment 5 Anand Avati 2013-11-09 08:27:46 EST
REVIEW: http://review.gluster.org/5626 (bd: Add support to create clone, snapshot and merge of LV images.) posted (#4) for review on master by M. Mohan Kumar (mohan@in.ibm.com)
Comment 6 Anand Avati 2013-11-11 03:27:59 EST
REVIEW: http://review.gluster.org/4809 (bd: posix/multi-brick support to BD xlator) posted (#7) for review on master by M. Mohan Kumar (mohan@in.ibm.com)
Comment 7 Anand Avati 2013-11-11 03:28:08 EST
REVIEW: http://review.gluster.org/6050 (bd: Add test case for bd xlator) posted (#3) for review on master by M. Mohan Kumar (mohan@in.ibm.com)
Comment 8 Anand Avati 2013-11-11 03:28:15 EST
REVIEW: http://review.gluster.org/5747 (bd_map: Remove bd_map xlator) posted (#4) for review on master by M. Mohan Kumar (mohan@in.ibm.com)
Comment 9 Anand Avati 2013-11-11 03:28:21 EST
REVIEW: http://review.gluster.org/5235 (bd: Add BD support to other xlators) posted (#6) for review on master by M. Mohan Kumar (mohan@in.ibm.com)
Comment 10 Anand Avati 2013-11-11 03:28:27 EST
REVIEW: http://review.gluster.org/5748 (bd: Add aio support to BD xlator) posted (#4) for review on master by M. Mohan Kumar (mohan@in.ibm.com)
Comment 11 Anand Avati 2013-11-11 03:28:34 EST
REVIEW: http://review.gluster.org/5626 (bd: Add support to create clone, snapshot and merge of LV images.) posted (#5) for review on master by M. Mohan Kumar (mohan@in.ibm.com)
Comment 12 Anand Avati 2013-11-11 08:26:42 EST
REVIEW: http://review.gluster.org/4809 (bd: posix/multi-brick support to BD xlator) posted (#8) for review on master by M. Mohan Kumar (mohan@in.ibm.com)
Comment 13 Anand Avati 2013-11-11 08:26:48 EST
REVIEW: http://review.gluster.org/6050 (bd: Add test case for bd xlator) posted (#4) for review on master by M. Mohan Kumar (mohan@in.ibm.com)
Comment 14 Anand Avati 2013-11-11 08:26:54 EST
REVIEW: http://review.gluster.org/5747 (bd_map: Remove bd_map xlator) posted (#5) for review on master by M. Mohan Kumar (mohan@in.ibm.com)
Comment 15 Anand Avati 2013-11-11 08:27:00 EST
REVIEW: http://review.gluster.org/5235 (bd: Add BD support to other xlators) posted (#7) for review on master by M. Mohan Kumar (mohan@in.ibm.com)
Comment 16 Anand Avati 2013-11-11 08:27:06 EST
REVIEW: http://review.gluster.org/5748 (bd: Add aio support to BD xlator) posted (#5) for review on master by M. Mohan Kumar (mohan@in.ibm.com)
Comment 17 Anand Avati 2013-11-11 08:27:13 EST
REVIEW: http://review.gluster.org/5626 (bd: Add support to create clone, snapshot and merge of LV images.) posted (#6) for review on master by M. Mohan Kumar (mohan@in.ibm.com)
Comment 18 Anand Avati 2013-11-11 20:26:16 EST
REVIEW: http://review.gluster.org/4809 (bd: posix/multi-brick support to BD xlator) posted (#9) for review on master by M. Mohan Kumar (mohan@in.ibm.com)
Comment 19 Anand Avati 2013-11-11 20:26:22 EST
REVIEW: http://review.gluster.org/6050 (bd: Add test case for bd xlator) posted (#5) for review on master by M. Mohan Kumar (mohan@in.ibm.com)
Comment 20 Anand Avati 2013-11-11 20:26:28 EST
REVIEW: http://review.gluster.org/5747 (bd_map: Remove bd_map xlator) posted (#6) for review on master by M. Mohan Kumar (mohan@in.ibm.com)
Comment 21 Anand Avati 2013-11-11 20:26:34 EST
REVIEW: http://review.gluster.org/5235 (bd: Add BD support to other xlators) posted (#8) for review on master by M. Mohan Kumar (mohan@in.ibm.com)
Comment 22 Anand Avati 2013-11-11 20:26:45 EST
REVIEW: http://review.gluster.org/5748 (bd: Add aio support to BD xlator) posted (#6) for review on master by M. Mohan Kumar (mohan@in.ibm.com)
Comment 23 Anand Avati 2013-11-11 20:26:51 EST
REVIEW: http://review.gluster.org/5626 (bd: Add support to create clone, snapshot and merge of LV images.) posted (#7) for review on master by M. Mohan Kumar (mohan@in.ibm.com)
Comment 24 Anand Avati 2013-11-13 12:25:32 EST
REVIEW: http://review.gluster.org/4809 (bd: posix/multi-brick support to BD xlator) posted (#10) for review on master by M. Mohan Kumar (mohan@in.ibm.com)
Comment 25 Anand Avati 2013-11-13 12:25:45 EST
REVIEW: http://review.gluster.org/6050 (bd: Add test case for bd xlator) posted (#6) for review on master by M. Mohan Kumar (mohan@in.ibm.com)
Comment 26 Anand Avati 2013-11-13 12:25:52 EST
REVIEW: http://review.gluster.org/5747 (bd_map: Remove bd_map xlator) posted (#7) for review on master by M. Mohan Kumar (mohan@in.ibm.com)
Comment 27 Anand Avati 2013-11-13 12:25:58 EST
REVIEW: http://review.gluster.org/5235 (bd: Add BD support to other xlators) posted (#9) for review on master by M. Mohan Kumar (mohan@in.ibm.com)
Comment 28 Anand Avati 2013-11-13 12:26:04 EST
REVIEW: http://review.gluster.org/5748 (bd: Add aio support to BD xlator) posted (#7) for review on master by M. Mohan Kumar (mohan@in.ibm.com)
Comment 29 Anand Avati 2013-11-13 12:26:11 EST
REVIEW: http://review.gluster.org/5626 (bd: Add support to create clone, snapshot and merge of LV images.) posted (#8) for review on master by M. Mohan Kumar (mohan@in.ibm.com)
Comment 30 Anand Avati 2013-11-13 14:38:35 EST
COMMIT: http://review.gluster.org/5747 committed in master by Anand Avati (avati@redhat.com) 
------
commit 15a8ecd9b3eedf80881bd3dba81f16b7d2cb7c97
Author: M. Mohan Kumar <mohan@in.ibm.com>
Date:   Wed Nov 13 22:44:42 2013 +0530

    bd_map: Remove bd_map xlator
    
    Remove bd_map xlator and CLI related changes.
    
    Change-Id: If7086205df1907127c1a1fa4ba603f1c48421d09
    BUG: 1028672
    Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com>
    Reviewed-on: http://review.gluster.org/5747
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
Comment 31 Anand Avati 2013-11-13 14:38:49 EST
COMMIT: http://review.gluster.org/4809 committed in master by Anand Avati (avati@redhat.com) 
------
commit 48c40e1a42efe1b59126406084821947d139dd0e
Author: M. Mohan Kumar <mohan@in.ibm.com>
Date:   Wed Nov 13 22:44:42 2013 +0530

    bd: posix/multi-brick support to BD xlator
    
    Current BD xlator (block backend) has a few limitations such as
    * Creation of directories not supported
    * Supports only single brick
    * Does not use extended attributes (and client gfid) like posix xlator
    * Creation of special files (symbolic links, device nodes etc) not
      supported
    
    The basic limitation of not allowing directory creation blocks
    oVirt/VDSM from consuming the BD xlator as part of a Gluster domain, since
    VDSM creates multi-level directories when GlusterFS is used as the storage
    backend for storing VM images.
    
    To overcome these limitations a new BD xlator with following
    improvements is suggested.
    
    * New hybrid BD xlator that handles both regular files and block device
      files
    * The volume will have both POSIX and BD bricks. Regular files are
      created on POSIX bricks, block devices are created on the BD brick (VG)
    * BD xlator leverages the existing POSIX xlator for most POSIX calls and
      hence sits above the POSIX xlator
    * Block device file is differentiated from regular file by an extended
      attribute
    * The xattr 'user.glusterfs.bd' (BD_XATTR) plays a role in mapping a
      posix file to Logical Volume (LV).
    * When a client sends a request to set BD_XATTR on a posix file, a new
      LV is created and mapped to posix file. So every block device will
      have a representative file in POSIX brick with 'user.glusterfs.bd'
      (BD_XATTR) set.
    * Hereafter, all operations on this file result in LV-related
      operations.
    
    For example opening a file that has BD_XATTR set results in opening
    the LV block device, reading results in reading the corresponding LV
    block device.
    
    When the BD xlator gets a request to set BD_XATTR via a setxattr call, it
    creates an LV and places information about this LV in the xattr of the
    posix file. The xattr "user.glusterfs.bd" identifies a posix file that
    is mapped to a BD.
    
    Usage:
    Server side:
    [root@host1 ~]# gluster volume create bdvol host1:/storage/vg1_info?vg1 host2:/storage/vg2_info?vg2
    It creates a distributed gluster volume 'bdvol' with Volume Group vg1
    using posix brick /storage/vg1_info in host1 and Volume Group vg2 using
    /storage/vg2_info in host2.
    
    [root@host1 ~]# gluster volume start bdvol
    
    Client side:
    [root@node ~]# mount -t glusterfs host1:/bdvol /media
    [root@node ~]# touch /media/posix
    It creates regular posix file 'posix' in either host1:/vg1 or host2:/vg2 brick
    [root@node ~]# mkdir /media/image
    [root@node ~]# touch /media/image/lv1
    It also creates regular posix file 'lv1' in either host1:/vg1 or
    host2:/vg2 brick
    [root@node ~]# setfattr -n "user.glusterfs.bd" -v "lv" /media/image/lv1
    [root@node ~]#
    Above setxattr results in creating a new LV in corresponding brick's VG
    and it sets 'user.glusterfs.bd' with value 'lv:<default-extent-size>'
    [root@node ~]# truncate -s5G /media/image/lv1
    It results in resizing LV 'lv1' to 5G
    
    New BD xlator code is placed in xlators/storage/bd directory.
    
    Also add volume-uuid to the VG so that same VG can't be used for other
    bricks/volumes. After deleting a gluster volume, one has to manually
    remove the associated tag using vgchange <vg-name> --deltag
    <trusted.glusterfs.volume-id:<volume-id>>
    
    Changes from previous version V5:
    * Removed support for delayed deleting of LVs
    
    Changes from previous version V4:
    * Consolidated the patches
    * Removed usage of BD_XATTR_SIZE and consolidated it in BD_XATTR.
    
    Changes from previous version V3:
    * Added support in FUSE to support full/linked clone
    * Added support to merge snapshots and provide information about origin
    * bd_map xlator removed
    * iatt structure used in inode_ctx. iatt is cached and updated during
      fsync/flush
    * aio support
    * Type and capabilities of volume are exported through getxattr
    
    Changes from version 2:
    * Used inode_context for caching BD size and to check if loc/fd is BD or
      not.
    * Added GlusterFS server offloaded copy and snapshot through setfattr
      FOP. As part of this libgfapi is modified.
    * BD xlator supports stripe
    * During unlinking, if an LV file is already open, it is added to the
      delete list, and bd_del_thread tries to delete from this list when the
      last reference to that file is closed.
    
    Changes from previous version:
    * gfid is used as name of LV
    * ? is used to specify VG name for creating BD volume in volume
      create, add-brick. gluster volume create volname host:/path?vg
    * open-behind issue is fixed
    * A replicate brick can be added dynamically and LVs from source brick
      are replicated to destination brick
    * A distribute brick can be added dynamically and rebalance operation
      distributes existing LVs/files to the new brick
    * Thin provisioning support added.
    * bd_map xlator support retained
    * setfattr -n user.glusterfs.bd -v "lv" creates a regular LV and
      setfattr -n user.glusterfs.bd -v "thin" creates thin LV
    * Capability and backend information added to gluster volume info (and
      --xml) so that management tools can exploit BD xlator.
    * tracing support for bd xlator added
    
    TODO:
    * Add support to display snapshots for a given LV
    * Display posix filename for list-origin instead of gfid
    
    Change-Id: I00d32dfbab3b7c806e0841515c86c3aa519332f2
    BUG: 1028672
    Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com>
    Reviewed-on: http://review.gluster.org/4809
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
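
The commit message above introduces the brick syntax host:/path?vg for volume create and add-brick. As a rough illustration of how that spec decomposes, here is a minimal POSIX shell sketch. The function names brick_host, brick_path and brick_vg are hypothetical and are not part of the gluster CLI; only the spec syntax itself comes from the commit message.

```shell
#!/bin/sh
# Hypothetical helpers that split a BD brick spec "host:/path?vg".
# They only do string handling; nothing here talks to glusterd or LVM.

# Everything before the first ':' is the host.
brick_host() {
    printf '%s\n' "${1%%:*}"
}

# The posix brick path sits between "host:" and an optional "?vg".
brick_path() {
    rest=${1#*:}                     # drop "host:"
    printf '%s\n' "${rest%%[?]*}"    # drop "?vg" if present
}

# The Volume Group name follows '?'; a plain posix brick has none.
brick_vg() {
    case $1 in
        *[?]*) printf '%s\n' "${1##*[?]}" ;;
        *)     printf '\n' ;;
    esac
}

brick_host 'host1:/storage/vg1_info?vg1'
brick_path 'host1:/storage/vg1_info?vg1'
brick_vg   'host1:/storage/vg1_info?vg1'
```

With the spec from the usage example, the three calls print the host, the posix brick directory and the VG name, one per line. A spec without '?' yields an empty VG, which corresponds to an ordinary posix brick.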
Comment 32 Anand Avati 2013-11-13 14:39:02 EST
COMMIT: http://review.gluster.org/5235 committed in master by Anand Avati (avati@redhat.com) 
------
commit 6ec9c4599e96de9dcae9426eae6bb1dde4dc7549
Author: M. Mohan Kumar <mohan@in.ibm.com>
Date:   Wed Nov 13 22:44:42 2013 +0530

    bd: Add BD support to other xlators
    
    Make changes to the distributed xlator to work with the BD xlator. Unlike
    files, a block device can't be removed while it is open, so some parts of
    the code were moved down to avoid this situation. Also, before truncating
    a BD file, its BD_XATTR must be set; otherwise truncate ends up truncating
    the posix file. So the file is created with the needed BD_XATTR and then
    truncate is invoked. Also enables the BD xlator in the stripe volume type.
    
    Change-Id: If127516e261fac5fc5b137e7fe33e100bc92acc0
    BUG: 1028672
    Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com>
    Reviewed-on: http://review.gluster.org/5235
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
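
The ordering constraint described in this commit (BD_XATTR must be set before truncate, otherwise only the posix file is truncated) also applies on the client side. The following is a minimal sketch of that sequence, assuming a mounted BD volume. The helper name create_bd_image and the RUN variable are hypothetical; the touch, setfattr and truncate invocations are the ones shown in the usage section of the BD xlator commit. By default the sketch only prints the commands (dry run).

```shell
#!/bin/sh
# Hypothetical helper: create an LV-backed file on a mounted BD volume.
# RUN is a hypothetical command prefix. It defaults to "echo", so the
# commands are printed rather than executed; set RUN= (empty) to run them.
RUN=${RUN-echo}

create_bd_image() {
    file=$1 size=$2 type=${3:-lv}    # type: "lv" (regular) or "thin"

    # 1. Create the representative posix file.
    # 2. Set BD_XATTR so that an LV is created and mapped to the file.
    # 3. Only then truncate, so the resize is applied to the LV.
    $RUN touch "$file" &&
    $RUN setfattr -n user.glusterfs.bd -v "$type" "$file" &&
    $RUN truncate -s "$size" "$file"
}

create_bd_image /media/image/lv1 5G
```

Chaining the steps with && stops the sequence if setfattr fails, which avoids truncating a plain posix file by accident.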
Comment 33 Anand Avati 2013-11-13 14:39:27 EST
COMMIT: http://review.gluster.org/5748 committed in master by Anand Avati (avati@redhat.com) 
------
commit b222ce817f5f324fe20d4d3614001ed2f177afb8
Author: M. Mohan Kumar <mohan@in.ibm.com>
Date:   Wed Nov 13 22:44:42 2013 +0530

    bd: Add aio support to BD xlator
    
    Volume option bd-aio controls AIO feature for BD xlator. Code taken from
    posix-aio.c
    
    Change-Id: Ib049bd59c9d3f9101d33939838322cfa808de053
    BUG: 1028672
    Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com>
    Reviewed-on: http://review.gluster.org/5748
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
Comment 34 Anand Avati 2013-11-13 14:39:45 EST
COMMIT: http://review.gluster.org/5626 committed in master by Anand Avati (avati@redhat.com) 
------
commit 81a57679c20ac0ac9b48e313af75036132e3a5ad
Author: M. Mohan Kumar <mohan@in.ibm.com>
Date:   Wed Nov 13 22:44:43 2013 +0530

    bd: Add support to create clone, snapshot and merge of LV images.
    
    Special xattr names "clone" & "snapshot" can be used to create full and
    linked clone of the LV images. GFID of destination posix file (to be
    mapped) is passed as a value to the xattr. Destination posix file must
    exist before running this operation.
    
    These operations form a basis for offloading storage related operations
    from QEMU to GlusterFS.
    
    Syntax for full clone: xattr name: "clone" value: "gfid-of-dest-file"
    Syntax for linked clone: xattr name: "snapshot" value: "gfid-of-dest-file"
    Syntax for merging: xattr name: "merge" value: "path-to-snapshot-file"
    
    Example:
    	setfattr -n clone -v <gfid-of-dest-file> /media/source
    	setfattr -n snapshot -v <gfid-of-dest-file> /media/source
    	setfattr -n merge -v "/media/sn" /media/sn
    
    Change-Id: Id9f984a709d4c2e52a64ae75bb12a8ecb01f8776
    BUG: 1028672
    Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com>
    Reviewed-on: http://review.gluster.org/5626
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
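
A small wrapper can make the clone and snapshot syntax above harder to misuse. This is only a sketch: bd_offload and RUN are hypothetical names, the GFID of the destination file is taken as an input (how it is obtained is outside the sketch), and merge is left out because its value is a path rather than a GFID. The default is a dry run that prints the setfattr command.

```shell
#!/bin/sh
# Hypothetical wrapper around the "clone" and "snapshot" xattr names.
# RUN defaults to "echo" (dry run); set RUN= (empty) to execute setfattr.
RUN=${RUN-echo}

bd_offload() {
    op=$1 src=$2 dest_gfid=$3

    case $op in
        clone|snapshot)
            # The destination posix file must already exist; its GFID is
            # passed as the xattr value, as described in the commit message.
            $RUN setfattr -n "$op" -v "$dest_gfid" "$src"
            ;;
        *)
            echo "unsupported operation: $op" >&2
            return 1
            ;;
    esac
}

# Placeholder GFID, as in the commit message example.
bd_offload clone /media/source '<gfid-of-dest-file>'
```

Rejecting unknown operation names up front means a typo does not turn into an attempt to set an arbitrary xattr on the source file.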
Comment 35 Anand Avati 2013-11-13 14:39:51 EST
COMMIT: http://review.gluster.org/6050 committed in master by Anand Avati (avati@redhat.com) 
------
commit cc742479562f085034b1ea969d4a5869d28a7136
Author: M. Mohan Kumar <mohan@in.ibm.com>
Date:   Wed Nov 13 22:44:43 2013 +0530

    bd: Add test case for bd xlator
    
    Change-Id: I73a0bfa7085d2e71b2489687fa53f5fe7d1e8ea1
    BUG: 1028672
    Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com>
    Reviewed-on: http://review.gluster.org/6050
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Anand Avati <avati@redhat.com>
Comment 36 Anand Avati 2013-12-23 06:26:44 EST
REVIEW: http://review.gluster.org/6577 (bd: Check for capabilities for creating thin lv) posted (#1) for review on master by M. Mohan Kumar (mohan@in.ibm.com)
Comment 37 Anand Avati 2013-12-24 05:24:46 EST
COMMIT: http://review.gluster.org/6577 committed in master by Vijay Bellur (vbellur@redhat.com) 
------
commit f86c618cd0943930c391e6bf55fdf977b3245f36
Author: M. Mohan Kumar <mohan@in.ibm.com>
Date:   Mon Dec 23 16:27:42 2013 +0530

    bd: Check for capabilities for creating thin lv
    
    Check capabilities of the volume before trying to create thin LV.
    
    BUG: 1028672
    
    Change-Id: I1375f6f2a7576e223fc5d7cd40315999446db86a
    Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com>
    Reviewed-on: http://review.gluster.org/6577
    Reviewed-by: Vijay Bellur <vbellur@redhat.com>
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
Comment 38 Anand Avati 2013-12-24 08:22:17 EST
REVIEW: http://review.gluster.org/6590 (bd: Check for capabilities for creating thin lv) posted (#1) for review on release-3.5 by M. Mohan Kumar (mohan@in.ibm.com)
Comment 39 Anand Avati 2014-01-01 19:55:07 EST
COMMIT: http://review.gluster.org/6590 committed in release-3.5 by Vijay Bellur (vbellur@redhat.com) 
------
commit e1dd28c5b74d9687f48c5bc315423b054fc4ec7f
Author: M. Mohan Kumar <mohan@in.ibm.com>
Date:   Tue Dec 24 18:51:33 2013 +0530

    bd: Check for capabilities for creating thin lv
    
    Check capabilities of the volume before trying to create thin LV.
    
    BUG: 1028672
    
    Change-Id: Ie4e2281265e193458ccd16736960daf69d3e1b29
    Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com>
    Reviewed-on: http://review.gluster.org/6590
    Reviewed-by: Vijay Bellur <vbellur@redhat.com>
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
Comment 40 Niels de Vos 2014-04-17 07:50:26 EDT
This bug is being closed because a release has been made available that should address the reported issue. If the problem is still not fixed with glusterfs-3.5.0, please reopen this bug report.

glusterfs-3.5.0 has been announced on the Gluster Developers mailing list [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://thread.gmane.org/gmane.comp.file-systems.gluster.devel/6137
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user
