Bug 1415693

Summary: libvirt does not autostart guests off gluster storage
Product: Red Hat Enterprise Linux 7
Reporter: lejeczek <peljasz>
Component: libvirt
Assignee: Peter Krempa <pkrempa>
Status: CLOSED NOTABUG
QA Contact: lijuan men <lmen>
Severity: high
Priority: unspecified
Version: 7.3
CC: carl, dyuan, jsuchane, peljasz, rbalakri, xuzhang
Target Milestone: rc
Target Release: ---
Hardware: x86_64
OS: Linux
Last Closed: 2017-03-06 07:13:05 UTC
Type: Bug

Description lejeczek 2017-01-23 13:04:01 UTC
Description of problem:

I've tried both the systemd and libvirt mailing lists, but nobody commented.

I have some simple domains:

...
  <devices>
    <emulator>/usr/libexec/qemu-kvm</emulator>
    <disk type='network' device='disk'>
      <driver name='qemu' type='raw'/>
      <source protocol='gluster' name='QEMU-VMs/rhel-work3.qcow2'>
        <host name='127.0.0.1'/>
      </source>
      <target dev='vda' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </disk>
...

They work fine but fail to autostart.
The errors I see:
...
failed to initialize gluster connection (src=0x7f9424266350 priv=0x7f94242922b0): Transport endpoint is
internal error: Failed to autostart VM 'rhel-work2': failed to initialize gluster connection (src=0x7f9
failed to initialize gluster connection (src=0x7f942423fef0 priv=0x7f9424256320): Transport endpoint is
internal error: Failed to autostart VM 'rhel-work3': failed to initialize gluster connection (src=0x7f9
failed to initialize gluster connection (src=0x7f9424261b20 priv=0x7f94242a18b0): Transport endpoint is
internal error: Failed to autostart VM 'rhel-work1': failed to initialize gluster connection (src=0x7f9
...

I tried to make systemd start libvirtd only after glusterd:

After=glusterd.service

but if that is all that's required, then it still fails.

Version-Release number of selected component (if applicable):

glusterfs-client-xlators-3.7.18-1.el7.x86_64
glusterfs-cli-3.7.18-1.el7.x86_64
glusterfs-api-3.7.18-1.el7.x86_64
glusterfs-server-3.7.18-1.el7.x86_64
glusterfs-3.7.18-1.el7.x86_64
glusterfs-fuse-3.7.18-1.el7.x86_64
glusterfs-libs-3.7.18-1.el7.x86_64

and also

glusterfs-fuse-3.8.8-1.el7.x86_64
glusterfs-cli-3.8.8-1.el7.x86_64
glusterfs-libs-3.8.8-1.el7.x86_64
glusterfs-server-3.8.8-1.el7.x86_64
glusterfs-3.8.8-1.el7.x86_64
glusterfs-client-xlators-3.8.8-1.el7.x86_64
glusterfs-api-3.8.8-1.el7.x86_64

libvirt-2.0.0-10.el7_3.4.x86_64

Comment 2 Jaroslav Suchanek 2017-01-26 11:05:34 UTC
(In reply to lejeczek from comment #0)
> Description of problem:
> 
> I've tried both the systemd and libvirt mailing lists, but nobody commented.
> 

Peter responded a few minutes before you created this bz. Does it help?

Comment 3 Carl T. Miller 2017-03-04 14:45:03 UTC
I'm not sure if you're seeing the same issue I saw.
Basically glusterd would not start properly, and
when it did, I often got errors about the transport.

My temporary fix is to disable glusterd and
libvirtd, then put the following into /etc/rc.local:

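# Retry up to 10 times until the gluster mount at /vms is usable.
counter=0
until [ -d /vms/.trashcan ] || [ "$counter" -eq 10 ]; do
  service glusterd restart
  umount /vms
  mount /vms
  counter=$((counter + 1))
  sleep 1
done
# Start libvirtd only once the mount is confirmed.
[ -d /vms/.trashcan ] && systemctl start libvirtd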

(This is a fully patched CentOS 7.3 server if that
makes any difference.)

Comment 4 lejeczek 2017-03-04 15:30:23 UTC
Actually, I think that After=glusterd.service works for me. There have been many updates since I filed the report; I am still on gluster 3.8 (3.8.9-1.el7.x86_64), but it works now. I edited the unit file with --full, like:

...
After=remote-fs.target
After=glusterd.service
Documentation=man:libvirtd(8)
Documentation=http://libvirt.org

Comment 5 Peter Krempa 2017-03-06 07:13:05 UTC
I did not manage to reproduce the issue with a host running glusterfs-3.8.4-1.el7. I suspect that the systemd service for glusterd is marked as started prior to glusterd actually working properly.

If the bug reproduces again, please reopen this BZ and also attach the glusterd logs, since libvirtd can't do much if the storage does not work.
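
If glusterd is indeed reported as started before it can serve volumes, unit ordering alone will not help, and something has to wait for the volume itself. A minimal sketch of such a wait follows. It assumes the volume name QEMU-VMs from the domain XML in the description, and it assumes that `gluster volume status` exits non-zero until the volume is reachable. The helper name wait_for and the limit of 30 tries are hypothetical.

```shell
#!/bin/sh
# wait_for TRIES CMD...: run CMD, pausing one second after each failure,
# until it succeeds or TRIES attempts have been made.
# Returns 0 on success, 1 if every attempt failed.
wait_for() {
  tries="$1"
  shift
  i=0
  while [ "$i" -lt "$tries" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}

# Example use (not run here): start libvirtd only once the volume answers.
#   wait_for 30 gluster volume status QEMU-VMs && systemctl start libvirtd
```

Unlike restarting glusterd inside the loop, this only polls, so it should be safe to run alongside the normal service startup.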