Bug 1127877 - vdsm-tool configure --force does not configure qemu.conf properly in the first run on a fresh install
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: oVirt
Classification: Retired
Component: vdsm
Version: 3.5
Hardware: Unspecified
OS: Unspecified
high
urgent
Target Milestone: ---
: 3.5.0
Assignee: Yaniv Bronhaim
QA Contact: Jiri Belka
URL:
Whiteboard: infra
Depends On:
Blocks: 1073943 1190692
Reported: 2014-08-07 18:07 UTC by Nir Soffer
Modified: 2019-04-28 09:40 UTC

Fixed In Version: v4.16.4
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 1190692 (view as bug list)
Environment:
Last Closed: 2016-11-08 06:25:04 UTC
oVirt Team: Infra


Attachments
Configurations files before and after vdsm-tool configure and /var/log/messages (61.13 KB, application/gzip)
2014-08-07 18:07 UTC, Nir Soffer
patch 31466 verification fe19 (269.31 KB, text/plain)
2014-08-14 10:37 UTC, Mooli Tayer


Links
System ID Priority Status Summary Last Updated
oVirt gerrit 31289 master MERGED configfile: sort dict items inserted to config files for consistency. Never
oVirt gerrit 31562 master MERGED vdsm-tool: roll out self signed certificates in vdsm-tool. Never
oVirt gerrit 32047 ovirt-3.5 MERGED vdsm-tool: reorgenize module configurers. Never
oVirt gerrit 32048 ovirt-3.5 MERGED vdsm-tool: simplify getting modules by names. Never
oVirt gerrit 32050 ovirt-3.5 MERGED vdsm-tool: suppoort dependencies between ModuleConfigure Never
oVirt gerrit 32052 ovirt-3.5 MERGED vdsm-tool: roll out self signed certificates in vdsm-tool. Never
oVirt gerrit 32053 ovirt-3.5 MERGED tool: Raise UsageError when used incorrectly Never
oVirt gerrit 32055 ovirt-3.5 MERGED tool: Fix error message for non-existing module Never
oVirt gerrit 32056 ovirt-3.5 MERGED tool: Fix compatibility with Python 2.6 Never
oVirt gerrit 32058 ovirt-3.5 MERGED tool: Use space after comma when formatting lists Never
oVirt gerrit 32059 ovirt-3.5 MERGED tool: Fix help message when is-configured fails Never
oVirt gerrit 32061 ovirt-3.5 MERGED tool: Fix TypeError when configuration check fails Never
oVirt gerrit 32062 ovirt-3.5 MERGED configfile: sort dict items inserted to config files for consistency. Never

Description Nir Soffer 2014-08-07 18:07:59 UTC
Created attachment 924994 [details]
Configurations files before and after vdsm-tool configure and /var/log/messages

Description of problem:

When configuring vdsm for the first time on a fresh image,
vdsm-tool configure --force does not configure qemu.conf properly,
causing vdsm to fail on startup.



Version-Release number of selected component (if applicable):
vdsm-4.16.1-6.gita4a4614.fc20.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Install a fresh image of Fedora 19/Fedora 20/RHEL 6.5
2. yum install vdsm
3. vdsm-tool configure --force
4. systemctl start vdsmd

Actual results:
Job for vdsmd.service failed. See 'systemctl status vdsmd.service' and 'journalctl -xn' for details.

Expected results:
vdsmd should start

Workaround:
1. Run vdsm-tool configure --force again
2. Start vdsm

Here are commands I used:

[root@dhcp-0-177 configure]# vdsm-tool configure --force

Checking configuration status...

libvirt is not configured for vdsm yet

Running configure...
Reconfiguration of libvirt is done.

Done configuring modules to VDSM.
[root@dhcp-0-177 configure]# systemctl start vdsmd
Job for vdsmd.service failed. See 'systemctl status vdsmd.service' and 'journalctl -xn' for details.
[root@dhcp-0-177 configure]# vdsm-tool configure --force

Checking configuration status...

libvirt is already configured for vdsm

Running configure...
Reconfiguration of libvirt is done.

Done configuring modules to VDSM.
[root@dhcp-0-177 configure]# systemctl start vdsmd


Checking /var/log/messages, we see that vdsm fails to validate the libvirt configuration:

Aug  7 17:29:02 dhcp-0-177 vdsmd_init_common.sh: vdsm: Running validate_configuration
Aug  7 17:29:02 dhcp-0-177 vdsmd_init_common.sh: Error:  Config is not valid. Check conf files
Aug  7 17:29:02 dhcp-0-177 vdsmd_init_common.sh: FAILED: conflicting vdsm and libvirt-qemu tls configuration.
Aug  7 17:29:02 dhcp-0-177 vdsmd_init_common.sh: vdsm.conf with ssl=True requires the following changes:
Aug  7 17:29:02 dhcp-0-177 vdsmd_init_common.sh: libvirtd.conf: listen_tcp=0, auth_tcp="sasl",
Aug  7 17:29:02 dhcp-0-177 vdsmd_init_common.sh: qemu.conf: spice_tls=1.
Aug  7 17:29:02 dhcp-0-177 vdsmd_init_common.sh: Modules libvirt contains invalid configuration
Aug  7 17:29:02 dhcp-0-177 vdsmd_init_common.sh: vdsm: stopped during execute validate_configuration task (task returned with error code 1).
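
The failed check can be approximated with a small sketch. This is not vdsm's validation code: parse_conf, missing_settings and the REQUIRED_WITH_SSL table are hypothetical names, and the required values are simply copied from the log message above.

```python
# Hypothetical sketch; not vdsm's actual validate_configuration code.


def parse_conf(text):
    """Parse 'key = value' lines, skipping blank lines and '#' comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


# Values taken from the error message in the log above (assumed complete
# only for the purpose of this sketch).
REQUIRED_WITH_SSL = {
    "libvirtd.conf": {"listen_tcp": "0", "auth_tcp": '"sasl"'},
    "qemu.conf": {"spice_tls": "1"},
}


def missing_settings(conf_name, text):
    """Return the required keys whose value is absent or different."""
    actual = parse_conf(text)
    required = REQUIRED_WITH_SSL[conf_name]
    return sorted(k for k, v in required.items() if actual.get(k) != v)
```

A commented-out default such as "#spice_tls = 1" counts as missing in this sketch, because only uncommented lines are parsed.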


To understand this issue, I copied libvirtd.conf and qemu.conf
before running vdsm-tool for the first and the second time. Diffing the
files shows:

[root@dhcp-0-177 configure]# diff -ur before-configure/ after-configure/
diff -ur before-configure/qemu.conf after-configure/qemu.conf
--- before-configure/qemu.conf	2014-08-07 17:27:24.031529700 +0300
+++ after-configure/qemu.conf	2014-08-07 17:28:46.152352048 +0300
@@ -435,3 +435,12 @@
 #
 #migration_port_min = 49152
 #migration_port_max = 49215
+## beginning of configuration section by vdsm-4.13.0
+save_image_format="lzop"
+remote_display_port_max=6923
+lock_manager="sanlock"
+remote_display_port_min=5900
+spice_tls=1
+auto_dump_path="/var/log/core"
+dynamic_ownership=0
+## end of configuration section by vdsm-4.13.0

[root@dhcp-0-177 configure]# diff -ur after-configure/ after-second-configure/
diff -ur after-configure/qemu.conf after-second-configure/qemu.conf
--- after-configure/qemu.conf	2014-08-07 17:28:46.152352048 +0300
+++ after-second-configure/qemu.conf	2014-08-07 17:29:57.826817086 +0300
@@ -436,11 +436,12 @@
 #migration_port_min = 49152
 #migration_port_max = 49215
 ## beginning of configuration section by vdsm-4.13.0
+spice_tls=1
 save_image_format="lzop"
 remote_display_port_max=6923
-lock_manager="sanlock"
+spice_tls_x509_cert_dir="/etc/pki/vdsm/libvirt-spice"
 remote_display_port_min=5900
-spice_tls=1
+lock_manager="sanlock"
 auto_dump_path="/var/log/core"
 dynamic_ownership=0
 ## end of configuration section by vdsm-4.13.0

We can see that the first vdsm-tool configure run added the vdsm section to
qemu.conf, but without spice_tls_x509_cert_dir. That setting was added only in
the second vdsm-tool configure run.
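
Because the key order also differs between the two runs, the diffs are easier to read when the vdsm sections are compared as dictionaries. The sketch below does that. The marker strings and the section contents are copied from the diffs above, while vdsm_section and added_keys are hypothetical helper names, not part of vdsm.

```python
# Hypothetical helpers; marker strings are copied from the diff above.
BEGIN = "## beginning of configuration section by vdsm-"
END = "## end of configuration section by vdsm-"

# Section contents after the first and second run, as shown in the diffs.
FIRST_RUN = """\
## beginning of configuration section by vdsm-4.13.0
save_image_format="lzop"
remote_display_port_max=6923
lock_manager="sanlock"
remote_display_port_min=5900
spice_tls=1
auto_dump_path="/var/log/core"
dynamic_ownership=0
## end of configuration section by vdsm-4.13.0
"""

SECOND_RUN = """\
## beginning of configuration section by vdsm-4.13.0
spice_tls=1
save_image_format="lzop"
remote_display_port_max=6923
spice_tls_x509_cert_dir="/etc/pki/vdsm/libvirt-spice"
remote_display_port_min=5900
lock_manager="sanlock"
auto_dump_path="/var/log/core"
dynamic_ownership=0
## end of configuration section by vdsm-4.13.0
"""


def vdsm_section(text):
    """Return the key/value pairs found between the vdsm markers."""
    entries = {}
    inside = False
    for line in text.splitlines():
        if line.startswith(BEGIN):
            inside = True
        elif line.startswith(END):
            inside = False
        elif inside and "=" in line:
            key, _, value = line.partition("=")
            entries[key.strip()] = value.strip()
    return entries


def added_keys(old_text, new_text):
    """Keys present in the new section but not in the old one."""
    old, new = vdsm_section(old_text), vdsm_section(new_text)
    return {k: new[k] for k in new if k not in old}
```

With this data, added_keys(FIRST_RUN, SECOND_RUN) contains only spice_tls_x509_cert_dir; every other key is present in both runs with the same value.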

Comment 1 Nir Soffer 2014-08-07 18:13:21 UTC
We can also see that the order of the parameters in qemu.conf changes between the runs. This does not affect the operation of the system, but it is an unwanted property because it makes checking the configuration harder.

The configuration should always use the same order, by sorting the keys.
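
A minimal sketch of the idea, assuming the entries are held in a dict before being written. render_entries is a hypothetical name and this is not the code of the merged configfile patch.

```python
def render_entries(entries):
    """Return 'key=value' lines sorted by key, so the output is deterministic."""
    return ["%s=%s" % (key, entries[key]) for key in sorted(entries)]
```

Sorting at write time means two runs with the same settings produce byte-identical sections, regardless of how the dict was built.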

Comment 2 Nir Soffer 2014-08-07 18:15:59 UTC
Another strange thing is adding the string:
## beginning of configuration section by vdsm-4.13.0

When configuring vdsm-4.16.1!

Comment 3 Yaniv Bronhaim 2014-08-08 19:46:31 UTC
(In reply to Nir Soffer from comment #2)
> Another strange thing is adding the string:
> ## beginning of configuration section by vdsm-4.13.0
> 
> When configuring vdsm-4.16.1!

lib/vdsm/tool/configurator.py line 349
    # version != PACKAGE_VERSION since we do not want to update configuration   
    # on every update. see 'configuration versioning:' at Configfile.py for     
    # details. 

and for the configuration versioning:
    configuration versioning:                                                   
    sections added (by prependSection() or addEntry()) will wrapped between:    
    'sectionStart'-'version'\n                                                  
    ...                                                                         
    'sectionEnd'-'version'\n.  
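
The wrapping described in that docstring could look roughly like the sketch below. The marker text is copied from the qemu.conf diff in this report; wrap_section and strip_section are hypothetical names and do not reflect the actual configfile.py API.

```python
# Marker text copied from the qemu.conf diff in this report.
# Function names are hypothetical, not the configfile.py API.
SECTION_START = "## beginning of configuration section by vdsm"
SECTION_END = "## end of configuration section by vdsm"


def wrap_section(lines, version):
    """Wrap lines between 'sectionStart-version' and 'sectionEnd-version'."""
    return (["%s-%s" % (SECTION_START, version)]
            + list(lines)
            + ["%s-%s" % (SECTION_END, version)])


def strip_section(lines, version):
    """Drop the section written for this configuration version, if any."""
    start = "%s-%s" % (SECTION_START, version)
    end = "%s-%s" % (SECTION_END, version)
    kept, inside = [], False
    for line in lines:
        if line == start:
            inside = True
        elif line == end and inside:
            inside = False
        elif not inside:
            kept.append(line)
    return kept
```

Because the markers carry the configuration version rather than the package version, a section written as 4.13.0 is still recognized after the package is updated to 4.16.1.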

About the bug itself: Mooli, please check that out.

Comment 4 Mooli Tayer 2014-08-10 13:06:33 UTC
Hi,

1. Regarding the bug: long story short, I'm pretty sure it was caused by a bug fixed by:
http://gerrit.ovirt.org/#/c/31293/

Any chance you can rebase on that and test?

To make debugging better cherry-pick http://gerrit.ovirt.org/#/c/31289/ and http://gerrit.ovirt.org/#/c/31290/ as well and run:

vdsm-tool -vvv configure --force

2. Regarding comment 1: I noticed this issue before too, and submitted:
http://gerrit.ovirt.org/#/c/31289/
Thanks!

Comment 5 Nir Soffer 2014-08-10 13:37:01 UTC
(In reply to Mooli Tayer from comment #4)
> Any chance you can rebase on that and test?
Very low chance :-)

Comment 6 Mooli Tayer 2014-08-10 13:40:02 UTC
OK turns out this is a different issue:

It happens because of this check, which affects the configuration vdsm will use in qemu.conf (spice_tls_x509_cert_dir=) and libvirtd.conf (auth_tcp=, listen_tcp=):

'certs_exist': all(os.path.isfile(f) for f in [
    self.CA_FILE,
    self.CERT_FILE,
    self.KEY_FILE
]),

The reason is that after install, /etc/pki/vdsm/certs is an empty dir.
The same is true after the first run of vdsm-tool configure --force. Only when vdsm runs for the first time does it create:
├── certs
│   ├── cacert.pem
│   └── vdsmcert.pem

This should probably be done inside vdsm-tool (and before or within libvirt configure).

Another unwanted side effect is that these files are not removed upon removal of vdsm.
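
The effect of the check on a fresh install can be reproduced in isolation. The sketch below is a standalone approximation: certs_exist here is a plain function rather than the configurator's code, the file names are those mentioned in this report, and the temporary directory stands in for /etc/pki/vdsm.

```python
import os
import tempfile


def certs_exist(ca_file, cert_file, key_file):
    """True only if all three files exist, mirroring the check quoted above."""
    return all(os.path.isfile(f) for f in [ca_file, cert_file, key_file])


def demo():
    """Return the check result before and after the pem files are created."""
    with tempfile.TemporaryDirectory() as ts:
        certs = os.path.join(ts, "certs")
        keys = os.path.join(ts, "keys")
        os.makedirs(certs)
        os.makedirs(keys)
        paths = [
            os.path.join(certs, "cacert.pem"),
            os.path.join(certs, "vdsmcert.pem"),
            os.path.join(keys, "vdsmkey.pem"),
        ]
        before = certs_exist(*paths)  # directories exist but are empty
        for path in paths:
            open(path, "w").close()   # stand-in for real certificate generation
        after = certs_exist(*paths)
        return before, after
```

demo() returns (False, True): the check fails while the directories are empty and passes once the files exist, which matches the difference between the first and second configure runs.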

Comment 7 Yaniv Bronhaim 2014-08-11 11:53:45 UTC
The original code was:
    # If the ssl flag is set, update the libvirt and qemu configuration files   
    # with the location for certificates and permissions.                       
    if [ -f $ts/certs/cacert.pem -a \                                           
         -f $ts/certs/vdsmcert.pem -a \                                         
         -f $ts/keys/vdsmkey.pem -a \                                           
         "${ssl}" = "true" ]; then                                              
        set_if_default "${lconf}" ca_file \"$ts/certs/cacert.pem\"              
        set_if_default "${lconf}" cert_file \"$ts/certs/vdsmcert.pem\"          
        set_if_default "${lconf}" key_file \"$ts/keys/vdsmkey.pem\"             
        set_if_default "${qconf}" spice_tls_x509_cert_dir \"$ts/libvirt-spice\" 
    else                                                                        
        set_if_default "${lconf}" auth_tcp \"none\"                             
        set_if_default "${lconf}" listen_tcp 1                                  
        set_if_default "${lconf}" listen_tls 0                                  
    fi 

which means that if the pem files were not there, it would configure libvirt and qemu to work without ssl (even if vdsm.conf says differently).

During the pre-start script (init/vdsmd_init_common.sh), vdsm runs the gencert step, which calls the script vdsm-gencerts.sh, which generates those pem files. I assume that host-deploy does it for us as well.
AFAIU, we should move this script call to the libvirt configure part (if ssl=true and the cert files are missing). Am I missing something?
Otherwise, in this state libvirt configure will always fail, of course (as vdsm.conf says ssl=true, but the libvirt and qemu conf files were not configured accordingly).
Dan, Alon, any comments?
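
One way to guarantee that certificates are generated before libvirt is configured is to let configurers declare what they require and to derive the run order from that. The sketch below is a generic depth-first ordering, not the implementation of the merged patches. configure_order is a hypothetical name, and the module names "certificates" and "libvirt" are taken from the vdsm-tool output quoted in this report.

```python
def configure_order(requires):
    """Return module names so that each module follows the ones it requires.

    `requires` maps a module name to an iterable of names it depends on.
    """
    ordered = []
    visiting = set()

    def visit(name):
        if name in ordered:
            return
        if name in visiting:
            raise ValueError("dependency cycle at %r" % name)
        visiting.add(name)
        for dep in sorted(requires.get(name, ())):
            visit(dep)
        visiting.discard(name)
        ordered.append(name)

    for name in sorted(requires):
        visit(name)
    return ordered
```

With requires = {"libvirt": ["certificates"]}, the result is ["certificates", "libvirt"], so the pem files exist by the time the libvirt configurer evaluates its certificate check.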

Comment 8 Alon Bar-Lev 2014-08-11 12:03:22 UTC
The self-generated certificate is to be used for development only. Allowing vdsm and/or libvirt to automatically accept unencrypted communication is something that should be banned, even at the cost of having vdsm exit while stating that the configuration is incomplete.

Comment 9 Mooli Tayer 2014-08-14 10:37:55 UTC
Created attachment 926736 [details]
patch 31466 verification fe19

Comment 10 Jiri Belka 2014-09-22 10:50:48 UTC
ok, tested on RHEL with vdsm-4.16.4-0.el6.x86_64

# vdsm-tool configure --force

Checking configuration status...

libvirt is not configured for vdsm yet

Running configure...
Reconfiguration of certificates is done.
Reconfiguration of libvirt is done.

Done configuring modules to VDSM.
# service vdsmd start
Starting multipathd daemon:                                [  OK  ]
Starting rpcbind:                                          [  OK  ]
Starting wdmd:                                             [  OK  ]
Starting sanlock:                                          [  OK  ]
libvirtd start/running, process 7840
supervdsm start                                            [  OK  ]
Starting iscsid:                                           [  OK  ]
vdsm: Running mkdirs
vdsm: Running configure_coredump
vdsm: Running configure_vdsm_logs
vdsm: Running run_init_hooks
vdsm: Running check_is_configured
libvirt is already configured for vdsm
vdsm: Running validate_configuration
SUCCESS: ssl configured to true. No conflicts
vdsm: Running prepare_transient_repository
vdsm: Running syslog_available
vdsm: Running nwfilter
libvir: Network Filter Driver error : Network filter not found: no nwfilter with matching name 'vdsm-no-mac-spoofing'
vdsm: Running dummybr
vdsm: Running load_needed_modules
vdsm: Running tune_system
vdsm: Running test_space
vdsm: Running test_lo
vdsm: Running unified_network_persistence_upgrade
vdsm: Running restore_nets
vdsm: Running upgrade_300_nets
Starting up vdsm daemon: 
vdsm start
# service vdsmd status
VDS daemon server is running

Comment 11 Sandro Bonazzola 2014-10-17 12:36:42 UTC
oVirt 3.5 has been released and should include the fix for this issue.

Comment 12 Gabriel Somlo 2016-11-04 15:17:43 UTC
I just tried this on Fedora 23 with vdsm-4.18.13-1.fc23.x86_64, and the bug is still present.

On a fresh and up-to-date F23 server install, I did:

dnf install http://plain.resources.ovirt.org/pub/yum-repo/ovirt-release40.rpm
dnf install vdsm

vdsmd.service fails to start, and it all boils down to "vdsm-tool configure" failing with:

FAILED: conflicting vdsm and libvirt-qemu tls configuration.
vdsm.conf with ssl=True requires the following changes:
libvirtd.conf: listen_tcp=0, auth_tcp="sasl", listen_tls=1
qemu.conf: spice_tls=1.

Running "vdsm-tool configure --force" twice in a row allows the service to subsequently start.

Comment 13 Oved Ourfali 2016-11-05 12:54:35 UTC
Yaniv, can you please take a look?

Comment 14 Yaniv Bronhaim 2016-11-07 16:23:35 UTC
Just did it on a fresh installation, and after running vdsm-tool configure --force once, it's all configured properly. Can you please re-verify your report ASAP?
install vdsm
run vdsm-tool configure --force
start the service

Comment 15 Gabriel Somlo 2016-11-07 19:08:29 UTC
Oh, with the caveat that one has to run "vdsm-tool configure --force" manually before "systemctl start vdsmd" can ever succeed, you're right.

I thought "systemctl start vdsmd" should succeed the first time it's attempted, but if a manual "vdsm-tool configure --force" is a documented precondition, then let's close the bug once again, with my apologies for the misunderstanding :)

Thanks,
--Gabriel

Comment 16 Nir Soffer 2016-11-07 19:25:29 UTC
(In reply to Gabriel Somlo from comment #15)
> Oh, with the caveat that one has to "vdsm-tool configure --force" manually
> before ever running "systemctl start vdsmd" successfully, you're right.

Yes, this is the supported way to use the vdsm service: you must
configure it before starting. When you add a host via engine,
engine uses host-deploy to configure vdsm before starting
the service.

If this is not documented, please open a documentation bug.

Comment 17 Oved Ourfali 2016-11-08 06:25:04 UTC
Closing again based on the above.
If relevant, please reopen.

