Bug 762538 (GLUSTER-806)

Summary: trouble using sparse files
Product: [Community] GlusterFS
Reporter: ekoh <huang>
Component: core
Assignee: shishir gowda <sgowda>
Status: CLOSED NOTABUG
QA Contact:
Severity: low
Docs Contact:
Priority: low
Version: 3.0.3
CC: amarts, gluster-bugs, nsathyan, vijay
Target Milestone: ---
Target Release: ---
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed:
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:

Description ekoh 2010-04-07 03:34:27 UTC
I use GlusterFS to store my VM images and create VMs on the GlusterFS mount. When I create a qcow2 image file and then install a new VM with it, I get these errors:


Unable to complete install 'libvirt.libvirtError internal error unable to start guest: qemu: could not open disk image /srv/vm1/test-qcow2.img

Traceback (most recent call last):
  File "/usr/share/virt-manager/virtManager/create.py", line 724, in do_install
    dom = guest.start_install(False, meter = meter)
  File "/usr/lib/python2.4/site-packages/virtinst/Guest.py", line 541, in start_install
    return self._do_install(consolecb, meter, removeOld, wait)
  File "/usr/lib/python2.4/site-packages/virtinst/Guest.py", line 633, in _do_install
    self.domain = self.conn.createLinux(install_xml, 0)
  File "/usr/lib64/python2.4/site-packages/libvirt.py", line 974, in createLinux
    if ret is None:raise libvirtError('virDomainCreateLinux() failed', conn=self)
libvirtError: internal error unable to start guest: qemu: could not open disk image /srv/vm1/test-qcow2.img


There is no problem when I use a raw image, or when I use the qcow2 image on a local disk.
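
For reference, a minimal sketch of how the image is created on the GlusterFS mount (the path is taken from the error above; the size and options are assumptions):

# create the qcow2 image directly on the GlusterFS mount (size is assumed)
qemu-img create -f qcow2 /srv/vm1/test-qcow2.img 8G
# check that qemu can read back the format and virtual size
qemu-img info /srv/vm1/test-qcow2.img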

Comment 1 Vijay Bellur 2010-04-07 04:05:42 UTC
Hi Ekoh,

Can you please provide the GlusterFS client and server logs?

Thanks,
Vijay

Comment 2 ekoh 2010-04-07 06:17:53 UTC
Hi Vijay,
Thank you very much.

I have tried 3.0.4rc3, but I get the same results.

Here are my server and client logs:

------Server1------

[2010-04-07 18:10:12] N [server-protocol.c:5852:mop_setvolume] server-tcp: accepted client from 172.18.0.18:1023
[2010-04-07 18:10:12] N [server-protocol.c:5852:mop_setvolume] server-tcp: accepted client from 172.18.0.18:1022


-------Server2------

[2010-04-07 18:10:12] N [server-protocol.c:5852:mop_setvolume] server-tcp: accepted client from 172.18.0.18:1021
[2010-04-07 18:10:12] N [server-protocol.c:5852:mop_setvolume] server-tcp: accepted client from 172.18.0.18:1020


-------Client--------

================================================================================
Version      : glusterfs 3.0.4rc3 built on Apr  7 2010 11:39:11
git: v3.0.3-15-g391023d
Starting Time: 2010-04-07 18:10:12
Command line : /sbin/glusterfs --log-level=NORMAL --disable-direct-io-mode --volfile=/etc/glusterfs/glusterfs.vol /srv/vm1 
PID          : 23651
System name  : Linux
Nodename     : vm-test.localdomain
Kernel Release : 2.6.18-164.11.1.el5
Hardware Identifier: x86_64

Given volfile:
+------------------------------------------------------------------------------+
  1: ## file auto generated by /bin/glusterfs-volgen (mount.vol)
  2: # Cmd line:
  3: # $ /bin/glusterfs-volgen --name repstore1 --raid 1 gluster01:/home/fs gluster02:/home/fs
  4: 
  5: # RAID 1
  6: # TRANSPORT-TYPE tcp
  7: volume gluster02-1
  8:     type protocol/client
  9:     option transport-type tcp
 10:     option remote-host gluster02
 11:     option transport.socket.nodelay on
 12:     option transport.remote-port 6996
 13:     option remote-subvolume brick1
 14: end-volume
 15: 
 16: volume gluster01-1
 17:     type protocol/client
 18:     option transport-type tcp
 19:     option remote-host gluster01
 20:     option transport.socket.nodelay on
 21:     option transport.remote-port 6996
 22:     option remote-subvolume brick1
 23: end-volume
 24: 
 25: volume mirror-0
 26:     type cluster/replicate
 27:     subvolumes gluster01-1 gluster02-1
 28: end-volume
 29: 
 30: volume readahead
 31:     type performance/read-ahead
 32:     option page-count 4
 33:     subvolumes mirror-0
 34: end-volume
 35: 
 36: volume iocache
 37:     type performance/io-cache
 38:     option cache-size `echo $(( $(grep 'MemTotal' /proc/meminfo | sed 's/[^0-9]//g') / 5120 ))`MB
 39:     option cache-timeout 1
 40:     subvolumes readahead
 41: end-volume
 42: 
 43: volume quickread
 44:     type performance/quick-read
 45:     option cache-timeout 1
 46:     option max-file-size 64kB
 47:     subvolumes iocache
 48: end-volume
 49: 
 50: volume writebehind
 51:     type performance/write-behind
 52:     option cache-size 4MB
 53:     subvolumes quickread
 54: end-volume
 55: 
 56: volume statprefetch
 57:     type performance/stat-prefetch
 58:     subvolumes writebehind
 59: end-volume
 60: 

+------------------------------------------------------------------------------+
[2010-04-07 18:10:12] W [xlator.c:656:validate_xlator_volume_options] gluster01-1: option 'transport.remote-port' is deprecated, preferred is 'remote-port', continuing with correction
[2010-04-07 18:10:12] W [xlator.c:656:validate_xlator_volume_options] gluster02-1: option 'transport.remote-port' is deprecated, preferred is 'remote-port', continuing with correction
[2010-04-07 18:10:12] N [glusterfsd.c:1396:main] glusterfs: Successfully started
[2010-04-07 18:10:12] N [client-protocol.c:6246:client_setvolume_cbk] gluster01-1: Connected to 172.18.0.23:6996, attached to remote volume 'brick1'.
[2010-04-07 18:10:12] N [afr.c:2627:notify] mirror-0: Subvolume 'gluster01-1' came back up; going online.
[2010-04-07 18:10:12] N [client-protocol.c:6246:client_setvolume_cbk] gluster02-1: Connected to 172.18.0.24:6996, attached to remote volume 'brick1'.
[2010-04-07 18:10:12] N [fuse-bridge.c:2942:fuse_init] glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.13 kernel 7.10
[2010-04-07 18:10:12] N [client-protocol.c:6246:client_setvolume_cbk] gluster02-1: Connected to 172.18.0.24:6996, attached to remote volume 'brick1'.
[2010-04-07 18:10:12] N [client-protocol.c:6246:client_setvolume_cbk] gluster01-1: Connected to 172.18.0.23:6996, attached to remote volume 'brick1'.


And this is the server's configuration file:

# cat /etc/glusterfs/glusterfsd.vol
## file auto generated by /bin/glusterfs-volgen (export.vol)
# Cmd line:
# $ /bin/glusterfs-volgen --name repstore1 --raid 1 gluster01:/home/fs gluster02:/home/fs

volume posix1
  type storage/posix
  option directory /home/fs
end-volume

volume locks1
    type features/locks
    subvolumes posix1
end-volume

volume brick1
    type performance/io-threads
    option thread-count 8
    subvolumes locks1
end-volume

volume server-tcp
    type protocol/server
    option transport-type tcp
    option auth.addr.brick1.allow *
    option transport.socket.listen-port 6996
    option transport.socket.nodelay on
    subvolumes brick1
end-volume


----------------------------------


Thanks,

ekoh

Comment 3 Amar Tumballi 2010-04-20 08:24:51 UTC
Hi,

Can you mount glusterfs with '--disable-direct-io-mode' and try running this again? We suspect this is a direct-io related issue, where the request is not reaching GlusterFS.
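
For reference, a minimal sketch of such a mount, using the volfile path and mount point from the logs above (exact mount options may differ by version):

mount -t glusterfs -o direct-io-mode=disable /etc/glusterfs/glusterfs.vol /srv/vm1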

Comment 4 ekoh 2010-04-28 04:30:24 UTC
Hi Amar,
Thank you for your reply.

Yes, I have added the '--disable-direct-io-mode' option to '/etc/fstab' like this:

/etc/glusterfs/glusterfs.vol  /srv/vm1      glusterfs  defaults,direct-io-mode=disable  0  0

And in the client logs, I can see this:

Command line : /sbin/glusterfs --log-level=NORMAL --disable-direct-io-mode
--volfile=/etc/glusterfs/glusterfs.vol /srv/vm1 


PS: I have added my e-mail address to the CC list, but I have not received any mail. :(
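
For reference, a sketch of one other way to confirm the running client picked up the option (a hypothetical check, not from the original report):

# list the running glusterfs client and its arguments; --disable-direct-io-mode should appear
ps axww | grep '[g]lusterfs'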

Comment 5 shishir gowda 2010-07-27 06:26:02 UTC
We did not see this error with glusterfs, qemu, and qcow2 image files.

One possible usage error is running qemu with a copy-on-write (qcow2) image whose original (backing) image is not under the same path.

When the original image and the qcow2 overlay exist under different paths, this fails.

Please check for the above error and reopen if this is not the cause.
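
For illustration, a minimal sketch of the copy-on-write case described above (the file names are hypothetical; only the mount point is taken from the report):

# create a qcow2 overlay that records /srv/vm1/base.img as its backing file
qemu-img create -f qcow2 -b /srv/vm1/base.img /srv/vm1/test-qcow2.img
# the 'backing file' shown here must be a path qemu can open at run time
qemu-img info /srv/vm1/test-qcow2.img

If the backing image is moved, or the overlay is opened on a host where that recorded path does not exist, qemu fails to open the disk image with an error like the one reported above.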