Red Hat Bugzilla – Bug 764713
A share viewed through NFS mounting is the wrong size, and writing to it corrupts the file system
Last modified: 2011-08-03 11:18:17 EDT
I've got a couple of servers with several mirrored gluster volumes. Two volumes work fine from all perspectives. One, the most recently set up, mounts correctly remotely as glusterfs, but fails badly as nfs (served by Gluster's internal NFS capability). The mount appears to work when requested, but the filesystem size shown is totally wrong and it is not safely accessible. This is with 3.1.4:
So we have for instance on one external system:
192.168.1.242:/std 309637120 138276672 155631808 48% /mnt/std
19380692 2860644 15543300 16% /mnt/store
where the first nfs mount is correct and working, but the second is way off.
That was the same result as when /store was nfs mounted to another system
too. But on that same other system, /store mounts correctly as glusterfs:
vm2:/store 536704000 14459648 494981376 3% /mnt/store
with the real size shown, and the filesystem fully accessible.
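To put the two figures for /store side by side, the sizes can be converted to GiB. This is only a minimal sketch: it assumes the df output above is in the default 1K blocks (the report does not say which units were used), and the helper name kib_to_gib is made up for illustration.

```python
KIB_PER_GIB = 1024 * 1024


def kib_to_gib(blocks_1k: int) -> float:
    """Convert a df size given in 1K blocks to GiB."""
    return blocks_1k / KIB_PER_GIB


# Sizes copied from the df lines above (assumed to be 1K blocks).
nfs_size = 19380692       # /mnt/store as seen through the NFS mount
native_size = 536704000   # /mnt/store as seen through the glusterfs mount

print(f"NFS mount reports       {kib_to_gib(nfs_size):.2f} GiB")
print(f"glusterfs mount reports {kib_to_gib(native_size):.2f} GiB")
print(f"ratio native/NFS        {native_size / nfs_size:.1f}x")
```

Under that assumption the NFS view is more than an order of magnitude smaller than the glusterfs view of the same volume.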
The erroneous mount is also apparently dangerous. I tried writing a file to
it to see what would happen, and it garbaged the underlying filesystems. So
I did a full reformatting and recreation of the gluster volume before
retrying at that point - and still got the bad nfs mount for it.
The bad nfs mount happens no matter which of the two servers in the gluster
cluster the mount uses, too. And the problem occurs no matter what third server the NFS client runs on. All underlying file systems are ext4.
Any ideas what I'm hitting here? For the present purpose, we need to be able
to mount nfs, as we need some Macs to mount it.
Give me log files and vol files for your setup.
Also let me know if it's easily reproducible so that I can try to reproduce it on my setup.
The problem is reproducible on the systems in question. I deleted the setup, reformatted the underlying file systems (ext4), recreated the Gluster setup, and the mount still showed the wrong size - IIRC exactly the same wrong size.
There is however another Gluster NFS share on the same equipment, of similar size, which is fine. Could it be that this happens when there's more than one? I'll get you the volume info and logs a bit later.
The configuration files from /etc/glusterd are at http://transpect.com/glusterd.tar.gz. Please advise on what part of the logs you need, as file names within them contain sensitive business information, so I don't want to share more than is pertinent.
(In reply to comment #3)
> The configuration files from /etc/glusterd are at
> http://transpect.com/glusterd.tar.gz. Please advise on what part of the logs
> you need, as file names within them contain sensitive business information, so
> I don't want to share more than is pertinent.
Hi, are you still seeing this issue?
We allow multiple NFS shares from the same server, and it's working for me.
From your vol files, I am not seeing any volume named "store". Is it the same setup in question?
Haven't re-tested, but the problem had been consistently there. The share in question I renamed "recstore" when recreating it, just in case there was something stray associated with the original name causing the problem. That's probably the name in the files you have. The result was the same though.
(In reply to comment #5)
> Haven't re-tested, but the problem had been consistently there. The share in
> question I renamed "recstore" when recreating it, just in case there was
> something stray associated with the original name causing the problem. That's
> probably the name in the files you have. The result was the same though.
If you have the same setup, could you try to reproduce it and give me the nfs.log from the servers? One more thing: is "/mnt/gluster/store" some other mount point, or just a directory on the local filesystem?
Curious. Does not reproduce now. Trying NFS mounts from a couple of other systems, they show the right size. So some condition that brought out the bug must have changed.
/mnt/gluster/store is the backing store (an LVM volume), mounted locally.
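Since the backing store is a separately mounted LVM volume, one thing that can be checked on each server is whether that path really is a mount point and what size it reports locally. The following is a rough sketch only: the path is the one from this report, the function name describe_backend is hypothetical, and nothing in this report confirms that an unmounted backend was the cause.

```python
import os


def describe_backend(path: str) -> dict:
    """Report whether path is a mount point and its size in 1K blocks."""
    st = os.statvfs(path)
    return {
        "path": path,
        # False means the path is a plain directory on its parent filesystem.
        "is_mount_point": os.path.ismount(path),
        # f_blocks and f_bavail are counted in units of f_frsize bytes.
        "size_1k_blocks": st.f_blocks * st.f_frsize // 1024,
        "avail_1k_blocks": st.f_bavail * st.f_frsize // 1024,
    }


if __name__ == "__main__":
    brick = "/mnt/gluster/store"  # backing store path from this report
    if os.path.exists(brick):
        print(describe_backend(brick))
    else:
        print(f"{brick} does not exist on this host")
```

If is_mount_point is False, or the size matches the parent filesystem rather than the LVM volume, the brick directory is being served from the wrong filesystem.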
Ok, might be some problem with the backend.
Will mark this issue resolved for now.