Bug 962669

Summary: Windows guest agent service failed to be started
Product: Red Hat Enterprise Linux 6 Reporter: huiqingding <huding>
Component: qemu-kvm Assignee: Laszlo Ersek <lersek>
Status: CLOSED ERRATA QA Contact: Virtualization Bugs <virt-bugs>
Severity: high Docs Contact:
Priority: high    
Version: 6.5 CC: acathrow, bsarathy, chayang, huding, juzhang, lnovich, lveyde, mazhang, michen, mkenneth, pbonzini, qzhang, sluo, virt-maint, xfu
Target Milestone: rc Keywords: Regression, Reopened, TestBlocker
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: qemu-kvm-0.12.1.2-2.380.el6 Doc Type: Release Note
Doc Text:
Starting with this release, the qemu guest agent persistently saves file handle numbers in the state directory. The Windows build of the guest agent, qemu-ga.exe, determines the state directory in one of two ways. The guest administrator may set the state directory with the --statedir option on the command line. Alternatively, if that option is not specified, qemu-ga.exe queries the system for the local state directory at startup (refer to CSIDL_COMMON_APPDATA in <http://msdn.microsoft.com/en-us/library/bb762494.aspx>), and appends the "qemu-ga" subdirectory to it. In either case, the state directory is automatically created if it doesn't exist. The default value can be listed using the "--help" option. Retrieving CSIDL_COMMON_APPDATA has introduced a new DLL dependency for qemu-ga.exe. The README file in the guest agent package has been updated with details.
Story Points: ---
Clone Of:
: 964304 Environment:
Last Closed: 2013-11-21 06:56:50 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description huiqingding 2013-05-14 08:18:12 UTC
Description of problem:
When starting the Windows guest agent service "qemu-ga", the error message "The service is not responding to the control function" is output.

Version-Release number of selected component (if applicable):
kernel-2.6.32-376.el6.x86_64
qemu-kvm-0.12.1.2-2.369.el6.x86_64.rpm

How reproducible:
100%

Guest:
Win7-32
qemu-ga: qemu-guest-agent-win32-0.12.1.2-2.369.el6.x86_64.rpm

Steps to Reproduce:
1. Boot a Win7-32 guest
 /usr/libexec/qemu-kvm -M pc -cpu SandyBridge -enable-kvm -uuid 6530dbca-555f-4212-bee9-4423ef56066d -rtc base=localtime,clock=host,driftfix=slew -m 4G -smp 4,sockets=2,cores=2,threads=1 -name win7-32 -drive file=/home/win7-32.qcow2,format=qcow2,if=none,id=drive-ide0-0-0,werror=stop,rerror=stop,cache=none -device virtio-blk-pci,drive=drive-ide0-0-0,id=ide0-0-0,scsi=off,bus=pci.0 -chardev socket,path=/tmp/qga.sock,server,nowait,id=qga0 -device virtio-serial -device virtserialport,chardev=qga0,name=org.qemu.guest_agent.0 -netdev tap,id=hostnet0,script=/etc/qemu-ifup -device virtio-net-pci,netdev=hostnet0,id=net0,mac=04:17:21:e2:17:03,bus=pci.0,addr=0x6 -device virtio-balloon-pci,id=ballooning,bus=pci.0 -monitor stdio -spice port=5900,disable-ticketing -drive file=/home/driver.iso,if=none,media=cdrom,format=raw,id=drive-ide1-0-2 -device ide-drive,drive=drive-ide1-0-2,id=ide1-0-2,bus=ide.1,unit=1

2. Install qemu-guest-agent-win32-0.12.1.2-2.346.el6.x86_64 on a RHEL host and get the executable.

3. Install the DLLs it depends on
   libglib-2.0-0.dll
   libiconv.dll
   iconv.dll
   libintl-8.dll

4. Install the qemu-ga.exe inside windows guest.
#c:\qemu-ga>qemu-ga.exe --service install

5. Start agent manually inside guest via
#c:\qemu-ga>net start qemu-ga
  
Actual results:
The error message "The service is not responding to the control function" is output.

Expected results:
The service can be started successfully.

Additional info:
We tested qemu-kvm-0.12.1.2-2.351.el6.x86_64.rpm and qemu-guest-agent-win32-0.12.1.2-2.351.el6.x86_64.rpm. With these versions, the qemu-ga service can be started successfully.

Comment 2 Qingtang Zhou 2013-05-14 08:45:06 UTC
Running the qemu-ga.exe program from qemu-guest-agent-win32-0.12.1.2-2.369.el6.x86_64.rpm directly, I got the following error message:
"""
C:\qemu-ga>qemu-ga.369.exe
1368574769.448800: critical: failed to write persistent state to /var/run/qga.state: Failed to create file '/var/run/qga.state.JWD5WW': No such file or directory
1368574769.448800: critical: unable to create state file at path /var/run/qga.state
1368574769.448800: critical: failed to load persistent state
"""

The error was still there when I specified method (-m) and statedir (-t):
"""
C:\qemu-ga>qemu-ga.369.exe -m virtio-serial -t "C:\qemu-ga"
1368574948.100000: critical: error opening path
1368574948.100000: critical: error opening channel
1368574948.115600: critical: failed to create guest agent channel
1368574948.115600: critical: failed to initialize guest agent channel
"""

even with the socket path (-p):
"""
C:\qemu-ga>qemu-ga.369.exe -m virtio-serial -t "C:\qemu-ga" -p "\\.\Global\org.qemu.guest_agent.0"
1368574985.446400: critical: error opening path
1368574985.446400: critical: error opening channel
1368574985.446400: critical: failed to create guest agent channel
1368574985.446400: critical: failed to initialize guest agent channel
"""

Please correct me if I used these options incorrectly.

Comment 3 huiqingding 2013-05-14 09:10:39 UTC
We tested a RHEL 6.5 guest; the guest agent works successfully there.
The kernel and qemu versions are:
kernel-2.6.32-376.el6.x86_64
qemu-kvm-0.12.1.2-2.369.el6.x86_64.rpm

Comment 5 huiqingding 2013-05-14 09:19:18 UTC
The guest agent qemu-ga can be started successfully on a win7-32 guest using the following versions:
qemu-kvm-0.12.1.2-2.355.el6.x86_64.rpm
qemu-guest-agent-0.12.1.2-2.355.el6.x86_64.rpm
kernel-2.6.32-376.el6.x86_64

Comment 6 Laszlo Ersek 2013-05-16 20:22:32 UTC
This happens inside code added by

  qemu-ga: use key-value store to avoid recycling fd handles after restart

The key-value store is set as

  s->pstate_filepath = g_strdup_printf("%s/qga.state", state_dir);

and "state_dir" comes indeed from "-t".

I believe the RHEL-6 backport behaves identically to upstream.

I have two ideas:

(1) According to the glib documentation, both the forward slash / and the backslash \ work as directory separators on Windows. However, using both in one path might not work.

Can you please try the following?

  qemu-ga.369.exe -t C:/qemu-ga

(2) One web search engine is telling me that in cmd.exe, backslashes between quotation marks might work the same way as they do in C string literals. IOW you might have to double them:

  qemu-ga.369.exe -t "C:\\qemu-ga"


If none of these work, then I'll have to submit a patch like this upstream:

------------------------------------------------------------
diff --git a/qga/main.c b/qga/main.c
index 44a2836..09ff7df 100644
--- a/qga/main.c
+++ b/qga/main.c
@@ -1030,9 +1030,10 @@ int main(int argc, char **argv)
     g_log_set_default_handler(ga_log, s);
     g_log_set_fatal_mask(NULL, G_LOG_LEVEL_ERROR);
     ga_enable_logging(s);
-    s->state_filepath_isfrozen = g_strdup_printf("%s/qga.state.isfrozen",
-                                                 state_dir);
-    s->pstate_filepath = g_strdup_printf("%s/qga.state", state_dir);
+    s->state_filepath_isfrozen = g_strdup_printf(
+                       "%s" G_DIR_SEPARATOR_S "qga.state.isfrozen", state_dir);
+    s->pstate_filepath = g_strdup_printf("%s" G_DIR_SEPARATOR_S "qga.state",
+                                         state_dir);
     s->frozen = false;
 
 #ifndef _WIN32
------------------------------------------------------------
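
The patch above relies on the compiler concatenating adjacent string literals, so the platform separator ends up inside the format string. A minimal glib-free sketch of the same idea follows; DIR_SEP_S and join_state_file() are hypothetical stand-ins for G_DIR_SEPARATOR_S and g_strdup_printf(), not names from the qemu tree.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for glib's G_DIR_SEPARATOR_S. */
#ifdef _WIN32
#define DIR_SEP_S "\\"
#else
#define DIR_SEP_S "/"
#endif

/*
 * Returns a newly allocated "<state_dir><separator><name>" string, or NULL
 * if the allocation fails. The caller frees the result.
 */
char *join_state_file(const char *state_dir, const char *name)
{
    size_t len = strlen(state_dir) + strlen(DIR_SEP_S) + strlen(name) + 1;
    char *path = malloc(len);

    if (path != NULL) {
        /* "%s" DIR_SEP_S "%s" becomes a single format string at compile time. */
        snprintf(path, len, "%s" DIR_SEP_S "%s", state_dir, name);
    }
    return path;
}
```

With this, join_state_file(state_dir, "qga.state") would use a backslash on a Windows build and a forward slash elsewhere, so the two separators are never mixed by the agent itself.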

Thanks!

Comment 7 Laszlo Ersek 2013-05-17 00:57:36 UTC
Disregard comment 6.

I installed a win2k8r2sp1 guest and started testing the -369 build of
qemu-ga-win32. The symptoms reported in this BZ originate from three
independent configuration problems.

The first problem is the location of the state dir. This is correctly
averted with

  qemu-ga -t c:\qemu-ga

The second problem is that

  virt-install --os-type=windows --os-variant=win2k8

does not create a virtio-serial controller for the VM. However,
qemu-ga-win32 only communicates over virtio-serial.

The third problem is that the virtual machine must have a virtio-serial
driver installed.


Steps to fix the configuration:

(1) Add a virtio-serial controller to the guest XML with "virsh edit", under
the /domain/devices node:

    <controller type='virtio-serial' index='0'/>

(2) Add a virtio-serial channel that uses port 1 on the guest side, and a
unix domain socket on the host side (replace GUEST_NAME below as
appropriate):

    <channel type='unix'>
      <source mode='bind' path='/var/lib/libvirt/qemu/GUEST_NAME.agent'/>
      <target type='virtio' name='org.qemu.guest_agent.1'/>
      <address type='virtio-serial' controller='0' bus='0' port='1'/>
    </channel>

We use port 1 instead of port 0 so that libvirt is not tempted to interfere
with our channel. This way libvirt creates the channel but will not try to
connect to it.

(3) Install the "virtio-win" package on the host.

(4) Attach the file "/usr/share/virtio-win/virtio-win.iso" to the VM
configuration as an IDE CD-ROM, with "raw" storage format.

(5) Start the windows VM. Pull up

  Start | Administrative Tools | Computer Management

select

  Computer Management (Local) | System Tools | Device Manager

then

  WIN-XXX | Other devices | PCI Simple Communications Controller

Right-click it and select Update Driver Software.

From the CD-ROM attached in step (4), select

  E:\vioserial\2k8\amd64

(6) In the guest, start cmd.exe as administrator. Then issue

  c:
  cd \qemu-ga
  qemu-ga -t c:\qemu-ga -p \\.\Global\org.qemu.guest_agent.1

At this point the guest agent must be running.

(7) On the host side, run the following utility to communicate with the
guest agent -- again, replace GUEST_NAME so that it matches step (2):

  socat unix-connect:/var/lib/libvirt/qemu/GUEST_NAME.agent readline

(8) Send QMP commands from the host to the guest agent:

  {"execute":"guest-sync", "arguments":{"id":12345}}
  {"return": 12345}

  {"execute":"guest-ping"}
  {"return": {}}

  {"execute":"guest-get-time"}
  {"return": 1368751948923000000}

  {"execute":"guest-info"}
  /* ... bunch of info ... */
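
The guest-sync exchange in step (8) is useful because the agent echoes the id back, which lets the host confirm that it is reading the reply to its own request. A minimal sketch of both halves, assuming the reply has exactly the shape shown above; format_guest_sync() and sync_reply_matches() are hypothetical helper names, and the sscanf() matching is only an illustration, not a JSON parser.

```c
#include <stdio.h>

/*
 * Formats a guest-sync request carrying the caller-chosen id, terminated
 * by a newline. Returns 0 on success, -1 if buf is too small.
 */
int format_guest_sync(char *buf, size_t len, long id)
{
    int n = snprintf(buf, len,
                     "{\"execute\":\"guest-sync\", \"arguments\":{\"id\":%ld}}\n",
                     id);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Returns 1 if reply looks like {"return": <id>} with the expected id,
 * 0 otherwise. Whitespace between tokens is tolerated.
 */
int sync_reply_matches(const char *reply, long id)
{
    long got;
    char closing;

    if (sscanf(reply, " { \"return\" : %ld %c", &got, &closing) != 2) {
        return 0;
    }
    return closing == '}' && got == id;
}
```

The formatted request can be written to the unix socket from step (7); a reply that does not match is better treated as stale output from an earlier session.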


Closing as NOTABUG. Please feel free to reopen if you run into other
problems.

Comment 8 Paolo Bonzini 2013-05-17 10:05:52 UTC
In step (6) you are not following the reproduction instructions:

4. Install the qemu-ga.exe inside windows guest.
#c:\qemu-ga>qemu-ga.exe --service install

5. Start agent manually inside guest via
#c:\qemu-ga>net start qemu-ga

The bug is that before the "qemu-ga: use key-value store to avoid recycling fd handles after restart" patch, -t had no effect in Win32.  Now it does, and the default directory "/var/run" does not exist.

This is partly due to a problem in the spec file: these flags from qemu_ga_build_flags should not be used in the Win32 build:

             --prefix=%{_prefix} \\\
             --localstatedir=%{_localstatedir} \\\
             --sysconfdir=%{_sysconfdir}

However, I'm afraid that this alone won't fix the bug.

First, and not Windows-specific, the "make install" target is not creating the ${localstatedir}/run directory, which it probably should do even on POSIX.

Second, the default prefix is "c:/Program Files/QEMU" which is incorrect in localized versions of Windows and can anyway be changed at installation time.  So when we have an installer for qemu-ga-win32, that path may not exist at all, and I'm not sure that the service has permission to write to "c:/Program Files/QEMU/run".

The first problem only needs to be fixed upstream.

The second is a real regression in RHEL, however.  I guess the kosher Win32 way would be to use the registry instead of a file for the key-value store, but perhaps you can also use some well-known directory.  A quick search suggests adding "\Application Data" to the contents of the environment variable ALLUSERSPROFILE.

Comment 9 Paolo Bonzini 2013-05-17 10:06:19 UTC
(Of course the second problem is also a regression upstream, so we probably need a RHEL7 clone of this bug).

Comment 10 Paolo Bonzini 2013-05-17 10:09:22 UTC
More info... on XP and 2000 it should be as I wrote above, while starting from Vista (i.e. (GetVersion() & 0xFF) >= 6) it should be just ALLUSERSPROFILE.
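
A minimal sketch of the rule just described, written as pure functions so it can be tried off Windows. The version argument is assumed to be the value returned by GetVersion(), and the profile argument the contents of ALLUSERSPROFILE; is_vista_or_later() and common_appdata_from_profile() are hypothetical names.

```c
#include <stdio.h>

/* The major version sits in the low byte of GetVersion()'s return value. */
int is_vista_or_later(unsigned long version)
{
    return (version & 0xFF) >= 6;
}

/*
 * Derives the shared application data directory from the value of
 * ALLUSERSPROFILE. Before Vista, "\Application Data" is appended; from
 * Vista on, the variable is used as is.
 * Returns 0 on success, -1 if out is too small.
 */
int common_appdata_from_profile(const char *all_users_profile,
                                unsigned long version,
                                char *out, size_t outlen)
{
    const char *suffix = is_vista_or_later(version) ? "" : "\\Application Data";
    int n = snprintf(out, outlen, "%s%s", all_users_profile, suffix);

    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

Note that the suffix is hardcoded in English here, which is one reason this approach may be fragile on localized installations.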

Comment 11 Laszlo Ersek 2013-05-17 11:41:04 UTC
(In reply to comment #8)
> In step (6) you are not following the reproduction instructions:
>
> 4. Install the qemu-ga.exe inside windows guest.
> #c:\qemu-ga>qemu-ga.exe --service install
>
> 5. Start agent manually inside guest via
> #c:\qemu-ga>net start qemu-ga

According to README.txt,

> 4. Register the guest agent as a service by running it from the command
>    line with the '--service install' option, along with other desired
>    options.

I didn't test "--service install" / "net start" last night based on this,
but I have now, and it indeed fails. Replacing step (6) from comment 7:

  c:
  cd \qemu-ga
  qemu-ga -t c:\qemu-ga -p \\.\Global\org.qemu.guest_agent.1 --service install

This responds with

  ** (qemu-ga.exe:1352): DEBUG: service's cmdline: C:\qemu-ga\qemu-ga.exe
      -d -p \\.\Global\org.qemu.guest_agent.1
  Service was installed successfully.

Implying that the "-t c:\qemu-ga" option is not saved during service
registration. Then

  cd \
  net start qemu-ga

fails in fact with

  The service is not responding to the control function

> The bug is that before the "qemu-ga: use key-value store to avoid
> recycling fd handles after restart" patch, -t had no effect in Win32.

Correct. It is used internally to set up the pathname of the fsfreeze state
file (s->state_filepath_isfrozen), but the code loading/saving that file
depends on !_WIN32.

> Now it does, and the default directory "/var/run" does not exist.
>
> This is partly due to a problem in the spec file: these flags from
> qemu_ga_build_flags should not be used in the Win32 build:
>
>              --prefix=%{_prefix} \\\
>              --localstatedir=%{_localstatedir} \\\
>              --sysconfdir=%{_sysconfdir}
>
> However, I'm afraid that this alone won't fix the bug.
>
> First, and not Windows-specific, the "make install" target is not creating
> the ${localstatedir}/run directory, which it probably should do even on
> POSIX.

In the RHEL-6 build ${localstatedir} is /var, and /var/run is guaranteed to
exist. But I agree, I built upstream qemu with --prefix=zzzz; make install
creates all subdirectories needed for installation, but it doesn't create
the zzzz/var/run state dir where the pid file and the handle counter file
would be kept.

>
> Second, the default prefix is "c:/Program Files/QEMU" which is incorrect
> in localized versions of Windows and can anyway be changed at installation
> time.

We don't install qemu-ga.exe with an installer. README.txt says:

> 1. Create a directory/folder on the Windows guest to contain the guest
>    agent executable (e.g. c:\qemu-ga)
>
> 2. Copy the qemu-ga.exe, and any required DLL files, to the folder created
>    in  step 1
>
> 3. To see all valid options for the guest agent, run 'qemu-ga.exe --help'
>    from the folder where you copied the files in step 1.
>
> 4. Register the guest agent as a service by running it from the command
>    line with the '--service install' option, along with other desired
>    options.

The -t option must be specified by the guest admin. It wasn't required until
now, but the installation instructions (use --help, then change whatever you
need for --service install) haven't changed. The main problem is
ga_install_service() ignoring (not forwarding) option -t.
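
To illustrate what forwarding the option could look like, here is a minimal sketch that builds the registered command line and includes -t when a state directory was given. It is not the actual ga_install_service() code; build_service_cmdline() and appendf() are hypothetical helpers, and quoting of paths that contain spaces is deliberately left out.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Appends formatted text to buf. Returns 0 on success, -1 on truncation. */
static int appendf(char *buf, size_t len, const char *fmt, ...)
{
    size_t used = strlen(buf);
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = vsnprintf(buf + used, len - used, fmt, ap);
    va_end(ap);
    return (n < 0 || (size_t)n >= len - used) ? -1 : 0;
}

/*
 * Builds "<exe> -d [-p <channel_path>] [-t <state_dir>]". Optional
 * arguments are skipped when NULL. Returns 0 on success, -1 if out is
 * too small.
 */
int build_service_cmdline(char *out, size_t len, const char *exe,
                          const char *channel_path, const char *state_dir)
{
    if (len == 0) {
        return -1;
    }
    out[0] = '\0';
    if (appendf(out, len, "%s -d", exe) != 0) {
        return -1;
    }
    if (channel_path != NULL && appendf(out, len, " -p %s", channel_path) != 0) {
        return -1;
    }
    if (state_dir != NULL && appendf(out, len, " -t %s", state_dir) != 0) {
        return -1;
    }
    return 0;
}
```

With the arguments from the failing run, this would yield the command line printed in the DEBUG message plus a trailing "-t c:\qemu-ga".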


> So when we have an installer for qemu-ga-win32, that path may not
> exist at all, and I'm not sure that the service has permission to write to
> "c:/Program Files/QEMU/run".
>
> The first problem only needs to be fixed upstream.

I take it you mean the ${localstatedir}/run directory; I'll send a
patch for it.

> The second is a real regression in RHEL, however.  I guess the kosher
> Win32 way would be to use the registry instead of a file for the key-value
> store, but perhaps you can also use some well-known directory.  A quick
> search suggests adding "\Application Data" to the contents of the
> environment variable ALLUSERSPROFILE.

From comment 10:

> on XP and 2000 it should be as I wrote above, while starting from Vista
> (i.e. (GetVersion() & 0xFF) >= 6) it should be just ALLUSERSPROFILE

The discussion in
<http://stackoverflow.com/questions/166876/best-place-to-put-application-data>
suggests that this is not so clear cut, for example the "Application Data"
directory name could be localized too. Based on comment
<http://stackoverflow.com/a/191316> and MSDN,

  SHGetFolderPath(NULL, CSIDL_COMMON_APPDATA, NULL, SHGFP_TYPE_CURRENT,
                  path);
  + application name + file name

seems most portable across Windows versions.
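
A minimal sketch of that lookup, assuming a MinGW toolchain and the ANSI variant SHGetFolderPathA(). The Windows-only part is guarded so the path-joining helper can be compiled anywhere; state_dir_under() and default_state_dir() are hypothetical names, not functions from the qemu tree.

```c
#include <stdio.h>

#ifdef _WIN32
#include <windows.h>
#include <shlobj.h>
#endif

/*
 * Writes "<base>\qemu-ga" into out.
 * Returns 0 on success, -1 if out is too small.
 */
int state_dir_under(const char *base, char *out, size_t outlen)
{
    int n = snprintf(out, outlen, "%s\\qemu-ga", base);

    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}

#ifdef _WIN32
/*
 * Queries the common application data folder at run time and appends the
 * application name. Returns 0 on success, -1 on failure.
 */
int default_state_dir(char *out, size_t outlen)
{
    char base[MAX_PATH];

    if (SHGetFolderPathA(NULL, CSIDL_COMMON_APPDATA, NULL,
                         SHGFP_TYPE_CURRENT, base) != S_OK) {
        return -1;
    }
    return state_dir_under(base, out, outlen);
}
#endif
```

Because the base directory is only known at run time, anything derived from it (such as the default pidfile path) has to be computed at startup too, rather than coming from a compile-time string literal.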

It's going to be a huge pain to query this at qga startup time, because the
code currently uses fixed string literals disguised as several layers of
macros. Everything that depends on QGA_STATEDIR_DEFAULT, directly or
indirectly (eg. through QGA_PIDFILE_DEFAULT) needs to become dynamic.

I'll write a series for upstream, but I expect weeks of bikeshedding.

Comment 14 Laszlo Ersek 2013-05-18 05:11:53 UTC
Posted upstream patches (v1):
http://lists.nongnu.org/archive/html/qemu-devel/2013-05/msg02396.html

Comment 15 Laszlo Ersek 2013-05-30 19:19:21 UTC
Michael Roth's pull req:
http://thread.gmane.org/gmane.comp.emulators.qemu/214084

Comment 16 Ademar Reis 2013-06-03 17:52:39 UTC
*** Bug 969833 has been marked as a duplicate of this bug. ***

Comment 17 Laszlo Ersek 2013-06-03 17:58:57 UTC
Patches to backport:

1  e2ea351 osdep: add qemu_get_local_state_pathname()
2  c394ecb qga: determine default state dir and pidfile dynamically
3  5a699bb configure: don't save any fixed local_statedir for win32
4  bf12c1f qga: create state directory on win32
5  a880845 qga: remove undefined behavior in ga_install_service()
6  a839ee7 qga: save state directory in ga_install_service()
7  f2e3978 Makefile: create ".../var/run" when installing the POSIX guest agent

Comment 28 errata-xmlrpc 2013-11-21 06:56:50 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2013-1553.html