Bug 1120815

Summary: df reports incorrect space available and used.
Product: [Community] GlusterFS
Reporter: Ryan Larson <rlarson>
Component: fuse
Assignee: GlusterFS Bugs list <gluster-bugs>
Status: CLOSED INSUFFICIENT_DATA
Severity: medium
Priority: unspecified
Version: 3.4.1
CC: bugs, gluster-bugs, hartsjc
Hardware: x86_64
OS: Linux
Doc Type: Bug Fix
Type: Bug
Last Closed: 2014-10-23 16:05:03 UTC

Description Ryan Larson 2014-07-17 18:41:04 UTC
Description of problem:
The df command reports incorrect space available and used for a volume.

Version-Release number of selected component (if applicable):
glusterfs-libs-3.4.1-3.el6.x86_64
glusterfs-server-3.4.1-3.el6.x86_64
glusterfs-3.4.1-3.el6.x86_64
glusterfs-cli-3.4.1-3.el6.x86_64
glusterfs-fuse-3.4.1-3.el6.x86_64


Actual results:
[root@host1 ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/root_vg-root_lv
                      192G  5.0G  177G   3% /
tmpfs                  32G     0   32G   0% /dev/shm
/dev/sda1             485M   39M  421M   9% /boot
/dev/mapper/data_vg-data_lv
                       99G  188M   94G   1% /data
/dev/mapper/data_vg-replicated_lv
                      394G  105G  270G  28% /data/protected/replicated-DO-NOT-TOUCH
localhost:cluster-replicated
                       99G   99G     0 100% /data/replicated

[root@host1 ~]# gluster volume info cluster-replicated 
 
Volume Name: cluster-replicated
Type: Replicate
Volume ID: e0d695b2-9a32-4a84-85ce-cd65335e07a4
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp,rdma
Bricks:
Brick1: host1:/data/protected/replicated-DO-NOT-TOUCH
Brick2: host2:/data/protected/replicated-DO-NOT-TOUCH
[root@host1 ~]# lsblk 
NAME                             MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda                                8:0    0   1.1T  0 disk 
|-sda1                             8:1    0   500M  0 part /boot
|-sda2                             8:2    0 195.3G  0 part 
| |-root_vg-root_lv (dm-0)       253:0    0 194.3G  0 lvm  /
| `-root_vg-swap_lv (dm-1)       253:1    0     1G  0 lvm  [SWAP]
|-sda3                             8:3    0     1M  0 part 
`-sda4                             8:4    0 919.7G  0 part 
  |-data_vg-data_lv (dm-2)       253:2    0   100G  0 lvm  /data
  `-data_vg-replicated_lv (dm-3) 253:3    0   400G  0 lvm  /data/protected/replicated-DO-NOT-TOUCH
[root@host1 ~]# lvs
  LV            VG      Attr       LSize   Pool Origin Data%  Move Log Cpy%Sync Convert
  data_lv       data_vg -wi-ao---- 100.00g                                             
  replicated_lv data_vg -wi-ao---- 400.00g                                             
  root_lv       root_vg -wi-ao---- 194.28g                                             
  swap_lv       root_vg -wi-ao----   1.00g                                             


Expected results:
[root@host1 ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/root_vg-root_lv
                      192G  5.0G  177G   3% /
tmpfs                  32G     0   32G   0% /dev/shm
/dev/sda1             485M   39M  421M   9% /boot
/dev/mapper/data_vg-data_lv
                       99G  188M   94G   1% /data
/dev/mapper/data_vg-replicated_lv
                      394G  105G  270G  28% /data/protected/replicated-DO-NOT-TOUCH
localhost:cluster-replicated
                      394G  105G  270G  28% /data/replicated


Additional info:
I can continue to write to the volume even though df says it is full.  But this is breaking our monitoring system, because it now reports a volume that is 100% full.
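
Two quick cross-checks that might help narrow this down (hedged sketches, not commands from the original report): compare what statfs returns for the brick filesystem versus the FUSE mount, since the mount's df figures are, as far as I know, derived from the bricks' statfs results, and ask gluster itself what each brick reports:

# stat -f /data/protected/replicated-DO-NOT-TOUCH    # statfs of the brick filesystem
# stat -f /data/replicated                           # statfs as seen through the gluster FUSE mount
# df -h /data/protected/replicated-DO-NOT-TOUCH /data/replicated
# gluster volume status cluster-replicated detail    # per-brick total/free disk space (output fields may vary by version)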

Comment 1 James Hartsock 2014-07-25 21:13:27 UTC
Ryan,


I know you told me that you are seeing this when mounting from another node as well:

hostname-not-localhost:cluster-replicated
                       99G   99G     0 100% /data/replicated


I am guessing that was a gluster mount too. Are the results the same if you use NFS to mount it?
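
For reference, a sketch of the suggested NFS cross-check (the /tmp/test mount point matches what Ryan uses in the next comment; gluster's built-in NFS server is assumed to be enabled):

# mkdir -p /tmp/test
# mount -t nfs localhost:/cluster-replicated /tmp/test    # may need -o vers=3 depending on client defaults
# df -h /tmp/test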

Comment 2 Ryan Larson 2014-07-28 16:47:39 UTC
Yes, it shows the same data when mounted via NFS.

[root@host1 ~]# df -h 
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/root_vg-root_lv
                      192G  5.1G  177G   3% /
tmpfs                  32G     0   32G   0% /dev/shm
/dev/sda1             485M   39M  421M   9% /boot
/dev/mapper/data_vg-data_lv
                       99G  188M   94G   1% /data
/dev/mapper/data_vg-replicated_lv
                      394G  105G  270G  28% /data/protected/replicated-DO-NOT-TOUCH
localhost:cluster-replicated
                       99G   99G     0 100% /data/replicated
localhost:/cluster-replicated
                       99G   99G     0 100% /tmp/test

[root@host1 ~]# mount -v 
/dev/mapper/root_vg-root_lv on / type ext4 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
tmpfs on /dev/shm type tmpfs (rw)
/dev/sda1 on /boot type ext4 (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
/dev/mapper/data_vg-data_lv on /data type ext4 (rw)
/dev/mapper/data_vg-replicated_lv on /data/protected/replicated-DO-NOT-TOUCH type ext4 (rw)
/data/home on /home type none (rw,bind)
localhost:cluster-replicated on /data/replicated type fuse.glusterfs (rw,allow_other,max_read=131072)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
localhost:/cluster-replicated on /tmp/test type nfs (rw,addr=127.0.0.1)

Comment 3 James Hartsock 2014-07-31 15:55:27 UTC
Ryan,

First, would it be possible to mount the volume somewhere outside of the /data mount? The nesting seems like a possible trigger, since the size of /data appears to be what is being reported for /data/replicated:

# df
Filesystem                        Size Used Avail Use% Mounted on
/dev/mapper/data_vg-data_lv        99G 188M   94G   1% /data
/dev/mapper/data_vg-replicated_lv 394G 105G  269G  29% /data/protected/replicated-DO-NOT-TOUCH
localhost:cluster-replicated       99G  99G     0 100% /data/replicated
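
A sketch of that test (the /mnt/replicated-test path is only an example, not something from the report):

# mkdir -p /mnt/replicated-test
# mount -t glusterfs localhost:cluster-replicated /mnt/replicated-test
# df -h /mnt/replicated-test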




Secondly, I believe you mentioned you have another system with a similar setup and versions that is not seeing the issue. If you could share that information, I think that would be good data too.

Comment 4 James Hartsock 2014-08-12 21:12:39 UTC
Ryan,

While my names/paths are slightly different, I think you will see they are close enough to your setup for comparison. In any case, the only way I have been able to replicate this issue is by creating and starting the gluster volume with its brick under /data/protected before mounting the dedicated filesystem there (so the brick data actually ends up on the /data filesystem).


These on both nodes of my test systems
# mkdir /data #may need to rm -rf first if re-doing
# mkfs.xfs -f -i size=512 /dev/data/data
# mkfs.xfs -f -i size=512 /dev/data/replicated
# mount /dev/mapper/data-data /data

These on one node
# gluster volume create replicated replica 2 rhs1:/data/protected/replicated rhs2:/data/protected/replicated
# gluster volume start replicated
<make sure everything is started/status is good>

These on one node
# mkdir /data/replicated
# mount localhost:/replicated /data/replicated
<mount works, but df reports the size of the /data filesystem>
# date >> /data/replicated/james
# date >> /data/replicated/james2
# chmod 666 /data/replicated/*
# umount /data/replicated

This on both nodes
# mount /dev/mapper/data-replicated /data/protected/replicated

This on one node
# mount localhost:/replicated /data/replicated
# df
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/mapper/vg00-root
                       5684576   2011100   3384708  38% /
tmpfs                   961364         0    961364   0% /dev/shm
/dev/vda1               495844     37096    433148   8% /boot
/dev/mapper/data-data
                      10475520     33040  10442480   1% /data
/dev/mapper/data-replicated
                      52403200     32944  52370256   1% /data/protected/replicated
localhost:/replicated
                      10475520     33024  10442496   1% /data/replicated




While this may not be exactly how it happened, is it possible that an LV, filesystem, or mount got done in an odd order when you created this one?

I have tried a few other things to see if I can replicate this issue on RHEL with the same glusterfs RPMs, and only by doing something in an odd order like the above have I been able to reproduce it.
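
One way to check for that condition on your live system (a sketch using the paths from your original report; the /mnt/data-underneath path is just an example) is to confirm that the brick directory is really backed by the dedicated LV, and that no stray brick data is hiding underneath the mount point on the /data filesystem:

# df -h /data/protected/replicated-DO-NOT-TOUCH     # should report the 394G replicated_lv, not the 99G data_lv
# mkdir -p /mnt/data-underneath                      # example path for a bind mount of /data
# mount --bind /data /mnt/data-underneath            # a bind mount does not carry submounts, so it exposes what sits underneath them
# ls -la /mnt/data-underneath/protected/replicated-DO-NOT-TOUCH   # anything written before the LV was mounted shows up here
# umount /mnt/data-underneath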


Also after our last call I think you mentioned you might be able to test with /dev/mapper/data_vg-replicated_lv mounted outside of /data.  Were you able to get that done?

Comment 5 James Hartsock 2014-10-23 16:05:03 UTC
My understanding is that after the system was rebuilt this issue is no longer present. I will close this BZ as INSUFFICIENT_DATA for now. We can re-open it if the issue is seen again (and hopefully it can then be replicated).