Bug 1392512

Summary: Add qemu-ga AF_VSOCK support
Product: [Community] Virtualization Tools
Reporter: Stefan Hajnoczi <stefanha>
Component: libvirt
Assignee: Libvirt Maintainers <libvirt-maint>
Status: NEW
QA Contact: Virtualization Bugs <virt-bugs>
Severity: unspecified
Priority: unspecified
Version: unspecified
CC: berrange, dyuan, fjin, jdenemar, jsuchane, libvirt-maint, pkrempa, xuzhang
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Type: Bug

Description Stefan Hajnoczi 2016-11-07 16:10:18 UTC
The QEMU guest agent supports AF_VSOCK communication from QEMU 2.8 onwards.  Libvirt currently does not support AF_VSOCK (virtio-vsock).

When qemu-ga is launched in vsock-listen mode it uses an AF_VSOCK socket to communicate with the host.  The socket acts like a UNIX domain socket or TCP connection (it is a reliable in-order stream).

On the host side an AF_VSOCK socket must be opened using socket(2), and non-blocking connect(2) should be used to connect to the agent.  If the connection is refused then the agent might not be running yet.  Libvirt should retry connecting (with exponential backoff) if the connection fails or is dropped.
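
The host-side steps above (socket(2), non-blocking connect(2), retry with exponential backoff) can be sketched roughly as follows.  This is only an illustration in Python, not libvirt code: the guest CID 3, port 1234, the delay values and all function names are assumptions made up for the example, and socket.AF_VSOCK needs Linux and Python 3.7 or later.

```python
import errno
import os
import select
import socket
import time

# Hypothetical address for illustration; the report names neither value.
GUEST_CID = 3
AGENT_PORT = 1234


def backoff_delays(initial=0.1, maximum=5.0, factor=2.0):
    """Yield an endless series of delays that grows geometrically up to a cap."""
    delay = initial
    while True:
        yield min(delay, maximum)
        delay = min(delay * factor, maximum)


def vsock_connect(cid, port, timeout=2.0):
    """Open an AF_VSOCK stream socket and connect without blocking in connect(2)."""
    sock = socket.socket(socket.AF_VSOCK, socket.SOCK_STREAM)
    try:
        sock.setblocking(False)
        err = sock.connect_ex((cid, port))
        if err == errno.EINPROGRESS:
            # The socket becomes writable once the connection attempt finishes.
            _, writable, _ = select.select([], [sock], [], timeout)
            if not writable:
                raise TimeoutError("connect timed out")
            # SO_ERROR carries the final result of the asynchronous connect.
            err = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
        if err:
            raise OSError(err, os.strerror(err))
        return sock
    except BaseException:
        sock.close()
        raise


def connect_with_backoff(connect, max_attempts=8, sleep=time.sleep, delays=None):
    """Call connect() until it succeeds, sleeping longer after each failure."""
    delays = delays if delays is not None else backoff_delays()
    last_error = None
    for attempt in range(max_attempts):
        try:
            return connect()
        except OSError as exc:
            # ECONNREFUSED here may simply mean the agent has not started yet.
            last_error = exc
            if attempt + 1 < max_attempts:
                sleep(next(delays))
    raise ConnectionError(
        f"agent not reachable after {max_attempts} attempts"
    ) from last_error
```

A caller would combine the two pieces as connect_with_backoff(lambda: vsock_connect(GUEST_CID, AGENT_PORT)) and run the same call again after a disconnect.  The connect step is passed in as a callable so the retry policy can be exercised without a vsock device.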

Comment 1 Stefan Hajnoczi 2016-11-07 16:11:40 UTC
This feature depends on virtio-vsock.  It must first be possible to launch guests with the virtio-vsock device.

Comment 2 Daniel Berrangé 2016-11-07 16:16:19 UTC
What's the actual compelling reason to want to use AF_VSOCK with qemu guest agent?

I can see downsides to exposing it with AF_VSOCK. In particular, any apps on the host would be able to connect and modify stuff behind libvirt's back, which could lead to hard-to-diagnose problems for libvirt.

So at first glance my vote would be to explicitly *not* support use of qemu guest agent with vsock. Leave vsock for other stuff where multiple clients are a must-have and libvirt is not involved, e.g. NFS, libguestfs agent, etc.

Comment 4 Stefan Hajnoczi 2016-11-08 17:22:02 UTC
(In reply to Daniel Berrange from comment #2)
> What's the actual compelling reason to want to use AF_VSOCK with qemu guest
> agent ?
> 
> I can see downsides to exposing it with AF_VSOCK. In particular any apps on
> the host would be able to connect and modify stuff behind libvirt's back
> which could lead to hard to diagnose problems for libvirt.

qemu-ga only accepts one client at a time.  Therefore it's not practical for users to directly access the socket.  They must go through libvirt.

The reason to support qemu-ga -vsock-listen in libvirt is that users will want to remove the virtio-serial device if their own agent is using AF_VSOCK.  Fewer devices to manage, less attack surface, etc.

Comment 5 Daniel Berrangé 2016-11-08 17:28:08 UTC
(In reply to Stefan Hajnoczi from comment #4)
> (In reply to Daniel Berrange from comment #2)
> > What's the actual compelling reason to want to use AF_VSOCK with qemu guest
> > agent ?
> > 
> > I can see downsides to exposing it with AF_VSOCK. In particular any apps on
> > the host would be able to connect and modify stuff behind libvirt's back
> > which could lead to hard to diagnose problems for libvirt.
> 
> qemu-ga only accepts one client at a time.  Therefore it's not practical for
> users to directly access the socket.  They must go through libvirt.

That's entirely desirable - libvirt doesn't want apps secretly making guest changes like freezing filesystems, CPU hot-unplug, etc. behind its back, so it wants to be in control of the QEMU guest agent - we don't want users directly accessing it.

> The reason to support qemu-ga -vsock-listen in libvirt is that users will
> want to remove the virtio-serial device if their own agent is using
> AF_VSOCK.  Fewer devices to manage, less attack surface, etc.

I'm not finding that a particularly compelling benefit given the downside to libvirt - we don't want the guest agent exposed directly to all host users.

Comment 6 Stefan Hajnoczi 2016-11-09 14:13:53 UTC
(In reply to Daniel Berrange from comment #5)
> (In reply to Stefan Hajnoczi from comment #4)
> > (In reply to Daniel Berrange from comment #2)
> > > What's the actual compelling reason to want to use AF_VSOCK with qemu guest
> > > agent ?
> > > 
> > > I can see downsides to exposing it with AF_VSOCK. In particular any apps on
> > > the host would be able to connect and modify stuff behind libvirt's back
> > > which could lead to hard to diagnose problems for libvirt.
> > 
> > qemu-ga only accepts one client at a time.  Therefore it's not practical for
> > users to directly access the socket.  They must go through libvirt.
> 
> That's entirely desirable - libvirt doesn't want apps secretly making guest
> changes like freezing filesystems, cpu hotunplug, etc behind its back, so it
> wants to be in control of the QEMU guest agent - we don't want users
> directly accessing it.
> 
> > The reason to support qemu-ga -vsock-listen in libvirt is that users will
> > want to remove the virtio-serial device if their own agent is using
> > AF_VSOCK.  Fewer devices to manage, less attack surface, etc.
> 
> I'm not finding that a particularly compelling benefit given the downside to
> libvirt - we don't want the guest agent exposed directly to all host users.

There is no downside to libvirt because host applications cannot directly talk to the agent alongside libvirt.  Anyone trying to connect directly would quickly notice that the agent will not communicate while libvirt is around.  They need to go via libvirt so I don't see the issue.

Comment 7 Daniel Berrangé 2016-11-09 14:20:20 UTC
(In reply to Stefan Hajnoczi from comment #6)
> (In reply to Daniel Berrange from comment #5)
> > (In reply to Stefan Hajnoczi from comment #4)
> > > The reason to support qemu-ga -vsock-listen in libvirt is that users will
> > > want to remove the virtio-serial device if their own agent is using
> > > AF_VSOCK.  Fewer devices to manage, less attack surface, etc.
> > 
> > I'm not finding that a particularly compelling benefit given the downside to
> > libvirt - we don't want the guest agent exposed directly to all host users.
> 
> There is no downside to libvirt because host applications cannot directly
> talk to the agent alongside libvirt.  Anyone trying to connect directly
> would quickly notice that the agent will not communicate while libvirt is
> around.  They need to go via libvirt so I don't see the issue.

I must be missing something then, because IIUC, the whole point of AF_VSOCK is that it allows multiple concurrent client connections to the same service, but here you seem to be saying only 1 client is allowed at any time.

Comment 8 Peter Krempa 2016-11-09 14:45:31 UTC
In addition to all this, we'd still need to keep all the "virtio-serial" based code and add decision code so that old and new guests will still work. The worst case is if the client downgrades the guest agent.

Comment 9 Stefan Hajnoczi 2016-11-10 09:58:04 UTC
(In reply to Daniel Berrange from comment #7)
> (In reply to Stefan Hajnoczi from comment #6)
> > (In reply to Daniel Berrange from comment #5)
> > > (In reply to Stefan Hajnoczi from comment #4)
> > > > The reason to support qemu-ga -vsock-listen in libvirt is that users will
> > > > want to remove the virtio-serial device if their own agent is using
> > > > AF_VSOCK.  Fewer devices to manage, less attack surface, etc.
> > > 
> > > I'm not finding that a particularly compelling benefit given the downside to
> > > libvirt - we don't want the guest agent exposed directly to all host users.
> > 
> > There is no downside to libvirt because host applications cannot directly
> > talk to the agent alongside libvirt.  Anyone trying to connect directly
> > would quickly notice that the agent will not communicate while libvirt is
> > around.  They need to go via libvirt so I don't see the issue.
> 
> I must be missing something then, because IIUC, the whole point of AF_VSOCK
> is that it allows multiple concurrent client connections to the same
> service, but here you seem to be saying only 1 client is allowed at any time.

The guest agent only accepts 1 client at a time.  Although we could change this, I think the policy makes sense to avoid interference between clients.

The reason to enable AF_VSOCK guest agents is to allow guests that have no virtio-serial device.

Comment 10 Jaroslav Suchanek 2016-12-02 10:54:30 UTC
Let's keep this for the upstream tracker for now.