Bug 1505597 - ZFS backed bricks fail to rebalance distributed volumes [ubuntu?]
Summary: ZFS backed bricks fail to rebalance distributed volumes [ubuntu?]
Keywords:
Status: CLOSED DUPLICATE of bug 1516691
Alias: None
Product: GlusterFS
Classification: Community
Component: posix
Version: 3.12
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: bugs@gluster.org
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-10-24 00:25 UTC by Kevin Landreth
Modified: 2017-12-13 18:22 UTC

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2017-12-13 18:22:35 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:


Description Kevin Landreth 2017-10-24 00:25:34 UTC
Description of problem:
It appears that sys_fallocate() in posix.c is called with the FALLOC_FL_KEEP_SIZE flag regardless of whether the underlying filesystem supports it.  On Ubuntu 16.04 with ZFS on Linux, FALLOC_FL_KEEP_SIZE is available at compile time, but not every filesystem supports it at run time; ZFS in particular does not.

This bug appears to be triggered only by a rebalance on a distributed volume.  I could not reproduce it with normal store and unlink actions.

If this type of setup "just works" on RHEL/CentOS systems using ZFS, then I can only guess that there are some core differences which aren't accounted for at compile time.
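
To illustrate the compile-time versus run-time distinction described above, here is a minimal Python sketch (not GlusterFS code, and not the patch from the duplicate bug). It calls fallocate(2) through ctypes with FALLOC_FL_KEEP_SIZE and treats EOPNOTSUPP as a run-time answer from the filesystem rather than a hard failure. The helper names preallocate_keep_size and probe_directory are hypothetical, and the sketch assumes a 64-bit Linux system where off_t is 64 bits wide.

```python
import ctypes
import errno
import os
import tempfile

FALLOC_FL_KEEP_SIZE = 0x01  # value defined in <linux/falloc.h>

# Load the C library already linked into the interpreter (Linux only).
_libc = ctypes.CDLL(None, use_errno=True)
# int fallocate(int fd, int mode, off_t offset, off_t len);
# Assumption: off_t is 64 bits wide (true on x86_64 Linux).
_libc.fallocate.argtypes = [ctypes.c_int, ctypes.c_int,
                            ctypes.c_int64, ctypes.c_int64]
_libc.fallocate.restype = ctypes.c_int


def preallocate_keep_size(fd, offset, length):
    """Try to reserve space without changing the file size.

    Returns "reserved" on success and "unsupported" when the kernel or
    filesystem reports that it cannot do this (EOPNOTSUPP or ENOSYS).
    Any other error is raised as OSError.
    """
    if _libc.fallocate(fd, FALLOC_FL_KEEP_SIZE, offset, length) == 0:
        return "reserved"
    err = ctypes.get_errno()
    if err in (errno.EOPNOTSUPP, errno.ENOSYS):
        # The flag exists in the headers, but this filesystem rejects it.
        return "unsupported"
    raise OSError(err, os.strerror(err))


def probe_directory(path, length=1 << 20):
    """Probe the filesystem behind `path` using a throwaway temporary file."""
    with tempfile.TemporaryFile(dir=path) as handle:
        return preallocate_keep_size(handle.fileno(), 0, length)


if __name__ == "__main__":
    print(probe_directory(tempfile.gettempdir()))
```

Running probe_directory("/tmp") and probe_directory("/gluster") on the affected machine would be expected to mirror the manual fallocate -n test under Additional info. A caller that receives "unsupported" can skip preallocation and write the data normally instead of failing the whole operation.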

Version-Release number of selected component (if applicable): 3.12.2-ubuntu1~xenial2


How reproducible: Add a ZFS-backed brick to an existing distributed volume and ask gluster to rebalance that volume.


Steps to Reproduce:
1. install ubuntu xenial 16.04 on normal ext4/xfs root
2. install zfsutils-linux (will install a 0.6.5 variant)
3. install glusterfs-server from https://launchpad.net/~gluster/+archive/ubuntu/glusterfs-3.12
4. create a zpool "tank"
5. zfs set compression=lz4 tank
6. zfs set xattr=sa tank
7. zfs set sync=disabled tank
8. zfs create -o mountpoint=/gluster tank/gluster
9. zfs create tank/gluster/brick1
9a. zfs create tank/gluster/brick2
10. mkdir /gluster/brick{1,2}/brick
11. gluster volume create media server:/gluster/brick1/brick server:/gluster/brick2/brick
12. mount -t glusterfs localhost:/media
13. populate with test data
14. zfs create tank/gluster/brick3; mkdir /gluster/brick3/brick
15. gluster volume add-brick media server:/gluster/brick3/brick
16. gluster volume rebalance media start
17. observe the errors in the rebalance logs and brick logs.

Actual results:
[2017-10-23 20:09:57.258434] T [MSGID: 0] [server-rpc-fops.c:3229:server_fallocate_resume] 0-stack-trace: stack-address: 0x7ff138001940, winding from media-server to /gluster/media/brick
[2017-10-23 20:09:57.258449] T [MSGID: 0] [defaults.c:2606:default_fallocate] 0-stack-trace: stack-address: 0x7ff138001940, winding from /gluster/media/brick to media-io-stats
[2017-10-23 20:09:57.258462] T [MSGID: 0] [io-stats.c:3426:io_stats_fallocate] 0-stack-trace: stack-address: 0x7ff138001940, winding from media-io-stats to media-quota
[2017-10-23 20:09:57.258475] T [MSGID: 0] [quota.c:4925:quota_fallocate] 0-stack-trace: stack-address: 0x7ff138001940, winding from media-quota to media-index
[2017-10-23 20:09:57.258488] T [MSGID: 0] [defaults.c:2606:default_fallocate] 0-stack-trace: stack-address: 0x7ff138001940, winding from media-index to media-barrier
[2017-10-23 20:09:57.258501] T [MSGID: 0] [defaults.c:2606:default_fallocate] 0-stack-trace: stack-address: 0x7ff138001940, winding from media-barrier to media-marker
[2017-10-23 20:09:57.258515] T [MSGID: 0] [marker.c:2184:marker_fallocate] 0-stack-trace: stack-address: 0x7ff138001940, winding from media-marker to media-selinux
[2017-10-23 20:09:57.258528] T [MSGID: 0] [defaults.c:2606:default_fallocate] 0-stack-trace: stack-address: 0x7ff138001940, winding from media-selinux to media-io-threads
[2017-10-23 20:09:57.258564] T [MSGID: 0] [defaults.c:1900:default_fallocate_resume] 0-stack-trace: stack-address: 0x7ff138001940, winding from media-io-threads to media-upcall
[2017-10-23 20:09:57.258581] T [MSGID: 0] [upcall.c:1487:up_fallocate] 0-stack-trace: stack-address: 0x7ff138001940, winding from media-upcall to media-leases
[2017-10-23 20:09:57.258595] T [MSGID: 0] [leases.c:771:leases_fallocate] 0-stack-trace: stack-address: 0x7ff138001940, winding from media-leases to media-read-only
[2017-10-23 20:09:57.258608] T [MSGID: 0] [defaults.c:2606:default_fallocate] 0-stack-trace: stack-address: 0x7ff138001940, winding from media-read-only to media-worm
[2017-10-23 20:09:57.258621] T [MSGID: 0] [defaults.c:2606:default_fallocate] 0-stack-trace: stack-address: 0x7ff138001940, winding from media-worm to media-locks
[2017-10-23 20:09:57.258635] T [MSGID: 0] [posix.c:4294:pl_fallocate] 0-stack-trace: stack-address: 0x7ff138001940, winding from media-locks to media-access-control
[2017-10-23 20:09:57.258648] T [MSGID: 0] [defaults.c:2606:default_fallocate] 0-stack-trace: stack-address: 0x7ff138001940, winding from media-access-control to media-bitrot-stub
[2017-10-23 20:09:57.258661] T [MSGID: 0] [defaults.c:2606:default_fallocate] 0-stack-trace: stack-address: 0x7ff138001940, winding from media-bitrot-stub to media-changelog
[2017-10-23 20:09:57.258674] T [MSGID: 0] [defaults.c:2606:default_fallocate] 0-stack-trace: stack-address: 0x7ff138001940, winding from media-changelog to media-changetimerecorder
[2017-10-23 20:09:57.258687] T [MSGID: 0] [defaults.c:2606:default_fallocate] 0-stack-trace: stack-address: 0x7ff138001940, winding from media-changetimerecorder to media-trash
[2017-10-23 20:09:57.258707] T [MSGID: 0] [defaults.c:2606:default_fallocate] 0-stack-trace: stack-address: 0x7ff138001940, winding from media-trash to media-posix
[2017-10-23 20:09:57.258729] D [MSGID: 0] [posix.c:1038:_posix_fallocate] 0-stack-trace: stack-address: 0x7ff138001940, media-posix returned -1 error: Operation not supported [Operation not supported]
[2017-10-23 20:09:57.258747] D [MSGID: 0] [posix.c:4282:pl_fallocate_cbk] 0-stack-trace: stack-address: 0x7ff138001940, media-locks returned -1 error: Operation not supported [Operation not supported]
[2017-10-23 20:09:57.258764] D [MSGID: 0] [leases.c:736:leases_fallocate_cbk] 0-stack-trace: stack-address: 0x7ff138001940, media-leases returned -1 error: Operation not supported [Operation not supported]
[2017-10-23 20:09:57.258781] D [MSGID: 0] [upcall.c:1464:up_fallocate_cbk] 0-stack-trace: stack-address: 0x7ff138001940, media-upcall returned -1 error: Operation not supported [Operation not supported]
[2017-10-23 20:09:57.258798] D [MSGID: 0] [defaults.c:1290:default_fallocate_cbk] 0-stack-trace: stack-address: 0x7ff138001940, media-io-threads returned -1 error: Operation not supported [Operation not supported]
[2017-10-23 20:09:57.258826] D [MSGID: 0] [marker.c:2142:marker_fallocate_cbk] 0-stack-trace: stack-address: 0x7ff138001940, media-marker returned -1 error: Operation not supported [Operation not supported]
[2017-10-23 20:09:57.258843] D [MSGID: 0] [io-stats.c:2465:io_stats_fallocate_cbk] 0-stack-trace: stack-address: 0x7ff138001940, media-io-stats returned -1 error: Operation not supported [Operation not supported]
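
The trace above is easier to read once you find the first translator that unwinds with an error. The following Python sketch is one way to do that. The regular expression is an assumption based only on the lines shown in this report, not a documented GlusterFS log format, and first_failure is a hypothetical helper name.

```python
import re

# Matches unwind lines such as:
#   [posix.c:1038:_posix_fallocate] ..., media-posix returned -1 error: <msg> [<msg>]
UNWIND = re.compile(
    r"\[(?P<source>[\w.-]+):(?P<line>\d+):(?P<function>\w+)\] "
    r".*?, (?P<translator>\S+) returned (?P<ret>-?\d+) "
    r"error: (?P<error>.+?) \["
)


def first_failure(log_lines):
    """Return (translator, function, error) for the first failing unwind."""
    for line in log_lines:
        match = UNWIND.search(line)
        if match and int(match.group("ret")) < 0:
            return (match.group("translator"),
                    match.group("function"),
                    match.group("error"))
    return None


# Three lines copied from the trace in this report.
SAMPLE = [
    "[2017-10-23 20:09:57.258707] T [MSGID: 0] "
    "[defaults.c:2606:default_fallocate] 0-stack-trace: stack-address: "
    "0x7ff138001940, winding from media-trash to media-posix",
    "[2017-10-23 20:09:57.258729] D [MSGID: 0] "
    "[posix.c:1038:_posix_fallocate] 0-stack-trace: stack-address: "
    "0x7ff138001940, media-posix returned -1 error: "
    "Operation not supported [Operation not supported]",
    "[2017-10-23 20:09:57.258747] D [MSGID: 0] "
    "[posix.c:4282:pl_fallocate_cbk] 0-stack-trace: stack-address: "
    "0x7ff138001940, media-locks returned -1 error: "
    "Operation not supported [Operation not supported]",
]

if __name__ == "__main__":
    # prints ('media-posix', '_posix_fallocate', 'Operation not supported')
    print(first_failure(SAMPLE))
```

For this trace the first failure is in media-posix, which is consistent with the fallocate call being rejected by the filesystem under the brick rather than by a higher translator.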


Expected results:

sys_fallocate() should work on ZFS on Linux just as it does on the native Linux file systems.


Additional info:

root@sulley:~# cd /tmp/
root@sulley:/tmp# touch fallocate-test
root@sulley:/tmp# fallocate -n -l 50M fallocate-test
root@sulley:/tmp# ls -alh fallocate-test
-rw-r--r-- 1 root root 0 Oct 23 19:17 fallocate-test
root@sulley:/tmp# du -Sh fallocate-test
50M     fallocate-test
root@sulley:/tmp# cd /gluster
root@sulley:/gluster# touch fallocate-test
root@sulley:/gluster# fallocate -n -l 50M fallocate-test
fallocate: fallocate failed: keep size mode is unsupported


* Fallocate support for ZFS on linux -
 https://github.com/zfsonlinux/zfs/issues/326
* Original commit that includes the compile-time logic for posix_fallocate -
 https://github.com/gluster/glusterfs/commit/8e57090f7da4027c46176c9786372a00e22df69d
* GlusterFS zfs documentation -
 http://docs.gluster.org/en/latest/Administrator%20Guide/Gluster%20On%20ZFS/

Comment 1 Kevin Landreth 2017-10-24 00:35:58 UTC
Forgot to mention that zfs set acltype=posixacl is also used.  Included for completeness.

Comment 2 Xavi Hernandez 2017-12-13 18:22:35 UTC
This seems to be a duplicate of bug #1516691. There is already a patch for it, which will be available in 3.12.4.

*** This bug has been marked as a duplicate of bug 1516691 ***

