Created attachment 924994 [details]
Configurations files before and after vdsm-tool configure and /var/log/messages

Description of problem:
When configuring vdsm for the first time on a fresh image, vdsm-tool configure --force does not fully configure qemu.conf, causing vdsm to fail on startup.

Version-Release number of selected component (if applicable):
vdsm-4.16.1-6.gita4a4614.fc20.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Install a fresh image of Fedora 19/Fedora 20/RHEL 6.5
2. yum install vdsm
3. vdsm-tool configure --force
4. systemctl start vdsmd

Actual results:
Job for vdsmd.service failed. See 'systemctl status vdsmd.service' and 'journalctl -xn' for details.

Expected results:
vdsmd should start

Workaround:
1. Run vdsm-tool configure --force again
2. Start vdsm

Here are the commands I used:

[root@dhcp-0-177 configure]# vdsm-tool configure --force
Checking configuration status...
libvirt is not configured for vdsm yet
Running configure...
Reconfiguration of libvirt is done.
Done configuring modules to VDSM.
[root@dhcp-0-177 configure]# systemctl start vdsmd
Job for vdsmd.service failed. See 'systemctl status vdsmd.service' and 'journalctl -xn' for details.
[root@dhcp-0-177 configure]# vdsm-tool configure --force
Checking configuration status...
libvirt is already configured for vdsm
Running configure...
Reconfiguration of libvirt is done.
Done configuring modules to VDSM.
[root@dhcp-0-177 configure]# systemctl start vdsmd

Checking /var/log/messages, we see that vdsm fails to validate the libvirt configuration:

Aug 7 17:29:02 dhcp-0-177 vdsmd_init_common.sh: vdsm: Running validate_configuration
Aug 7 17:29:02 dhcp-0-177 vdsmd_init_common.sh: Error: Config is not valid. Check conf files
Aug 7 17:29:02 dhcp-0-177 vdsmd_init_common.sh: FAILED: conflicting vdsm and libvirt-qemu tls configuration.
Aug 7 17:29:02 dhcp-0-177 vdsmd_init_common.sh: vdsm.conf with ssl=True requires the following changes:
Aug 7 17:29:02 dhcp-0-177 vdsmd_init_common.sh: libvirtd.conf: listen_tcp=0, auth_tcp="sasl",
Aug 7 17:29:02 dhcp-0-177 vdsmd_init_common.sh: qemu.conf: spice_tls=1.
Aug 7 17:29:02 dhcp-0-177 vdsmd_init_common.sh: Modules libvirt contains invalid configuration
Aug 7 17:29:02 dhcp-0-177 vdsmd_init_common.sh: vdsm: stopped during execute validate_configuration task (task returned with error code 1).

To understand this issue, I copied the libvirt.conf and qemu.conf before running vdsm-tool the first and second time. Diffing the files shows:

[root@dhcp-0-177 configure]# diff -ur before-configure/ after-configure/
diff -ur before-configure/qemu.conf after-configure/qemu.conf
--- before-configure/qemu.conf 2014-08-07 17:27:24.031529700 +0300
+++ after-configure/qemu.conf 2014-08-07 17:28:46.152352048 +0300
@@ -435,3 +435,12 @@
 #
 #migration_port_min = 49152
 #migration_port_max = 49215
+## beginning of configuration section by vdsm-4.13.0
+save_image_format="lzop"
+remote_display_port_max=6923
+lock_manager="sanlock"
+remote_display_port_min=5900
+spice_tls=1
+auto_dump_path="/var/log/core"
+dynamic_ownership=0
+## end of configuration section by vdsm-4.13.0

[root@dhcp-0-177 configure]# diff -ur after-configure/ after-second-configure/
diff -ur after-configure/qemu.conf after-second-configure/qemu.conf
--- after-configure/qemu.conf 2014-08-07 17:28:46.152352048 +0300
+++ after-second-configure/qemu.conf 2014-08-07 17:29:57.826817086 +0300
@@ -436,11 +436,12 @@
 #migration_port_min = 49152
 #migration_port_max = 49215
 ## beginning of configuration section by vdsm-4.13.0
+spice_tls=1
 save_image_format="lzop"
 remote_display_port_max=6923
-lock_manager="sanlock"
+spice_tls_x509_cert_dir="/etc/pki/vdsm/libvirt-spice"
 remote_display_port_min=5900
-spice_tls=1
+lock_manager="sanlock"
 auto_dump_path="/var/log/core"
 dynamic_ownership=0
 ## end of configuration section by vdsm-4.13.0

We can see that the first vdsm-tool configure run left qemu.conf incompletely configured: the vdsm section was written, but spice_tls_x509_cert_dir was added only by the second vdsm-tool configure run.
We can also see that the order of the parameters in qemu.conf changes between the runs. This does not affect the operation of the system, but it is an unwanted property because it makes checking the configuration harder. The configuration should always use the same order, by sorting the keys.
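A minimal sketch of the sorted-key idea, not vdsm's actual code. The function name render_section and the dict-based input are hypothetical; only the marker strings and the parameter names are taken from the diff above.

```python
def render_section(entries, version):
    """Render a managed config section with keys in sorted order.

    `entries` maps parameter names to already-quoted values. The name
    and signature are illustrative assumptions, not a vdsm API.
    """
    header = "## beginning of configuration section by vdsm-%s" % version
    footer = "## end of configuration section by vdsm-%s" % version
    # sorted() makes the output independent of dict insertion order.
    body = ["%s=%s" % (key, entries[key]) for key in sorted(entries)]
    return "\n".join([header] + body + [footer]) + "\n"


# Same parameters, supplied in two different orders.
first_run = {"save_image_format": '"lzop"', "lock_manager": '"sanlock"', "spice_tls": 1}
second_run = {"spice_tls": 1, "lock_manager": '"sanlock"', "save_image_format": '"lzop"'}
assert render_section(first_run, "4.13.0") == render_section(second_run, "4.13.0")
```

With a stable order, a diff between two runs shows only the parameters that were really added or removed.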
Another strange thing is adding the string:
## beginning of configuration section by vdsm-4.13.0

When configuring vdsm-4.16.1!
(In reply to Nir Soffer from comment #2)
> Another strange thing is adding the string:
> ## beginning of configuration section by vdsm-4.13.0
>
> When configuring vdsm-4.16.1!

lib/vdsm/tool/configurator.py line 349:
# version != PACKAGE_VERSION since we do not want to update configuration
# on every update. see 'configuration versioning:' at Configfile.py for
# details.

and for the configuration versioning:

configuration versioning:
sections added (by prependSection() or addEntry()) will wrapped between:
'sectionStart'-'version'\n
...
'sectionEnd'-'version'\n.

About the bug itself: mooli, please check that out.
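To illustrate why the marker carries a configuration version rather than the package version, here is a rough sketch of locating a managed section by its markers. It is not the Configfile.py implementation; split_managed_section is a hypothetical helper, and only the marker text comes from this report.

```python
SECTION_START = "## beginning of configuration section by vdsm"
SECTION_END = "## end of configuration section by vdsm"


def split_managed_section(text, version):
    """Split config text into (lines outside, lines inside) the section
    wrapped by the start/end markers for `version`.

    Hypothetical helper for illustration only.
    """
    start = "%s-%s" % (SECTION_START, version)
    end = "%s-%s" % (SECTION_END, version)
    outside, inside = [], []
    in_section = False
    for line in text.splitlines():
        if line == start:
            in_section = True
        elif line == end:
            in_section = False
        elif in_section:
            inside.append(line)
        else:
            outside.append(line)
    return outside, inside


sample = "\n".join([
    "#migration_port_max = 49215",
    "## beginning of configuration section by vdsm-4.13.0",
    "spice_tls=1",
    "## end of configuration section by vdsm-4.13.0",
])
outside, inside = split_managed_section(sample, "4.13.0")
```

If the marker used the package version, a lookup for "4.16.1" would not find the section written as "4.13.0", and every package update would have to rewrite the files. That matches the quoted comment about not updating the configuration on every update.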
Hi,

1. Regarding the bug: long story short, I'm pretty sure it was caused by a bug fixed by:
http://gerrit.ovirt.org/#/c/31293/
Any chance you can rebase on that and test? To make debugging easier, cherry-pick http://gerrit.ovirt.org/#/c/31289/ and http://gerrit.ovirt.org/#/c/31290/ as well and run:
vdsm-tool -vvv configure --force

2. Regarding comment 1: I've noticed this issue before too, and I've submitted:
http://gerrit.ovirt.org/#/c/31289/

Thanks!
(In reply to Mooli Tayer from comment #4) > Any chance you can rebase on that and test? Very low chance :-)
OK, it turns out this is a different issue. It happens because of this check, which affects the configuration vdsm will use in qemu.conf and libvirtd.conf (spice_tls_x509_cert_dir= and auth_tcp=, listen_tcp= respectively):

'certs_exist': all(os.path.isfile(f) for f in [
    self.CA_FILE,
    self.CERT_FILE,
    self.KEY_FILE

The reason is that after install, /etc/pki/vdsm/certs is an empty dir. The same is true after the first run of vdsm-tool configure --force. Only when vdsm is started for the first time does it create:

├── certs
│   ├── cacert.pem
│   └── vdsmcert.pem

This should probably be done inside vdsm-tool (and before or within libvirt configure).

Another unwanted side effect is that these files are not removed upon removal of vdsm.
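A small self-contained sketch of how such a check behaves on an empty PKI directory versus a populated one. The helper names pki_files and certs_exist are hypothetical, and the three file locations are assumed from the shell snippet quoted elsewhere in this report ($ts/certs/cacert.pem, $ts/certs/vdsmcert.pem, $ts/keys/vdsmkey.pem); a temporary directory stands in for /etc/pki/vdsm.

```python
import os
import tempfile


def pki_files(ts):
    """Return the CA, cert and key paths under the trust store dir `ts`."""
    return (
        os.path.join(ts, "certs", "cacert.pem"),
        os.path.join(ts, "certs", "vdsmcert.pem"),
        os.path.join(ts, "keys", "vdsmkey.pem"),
    )


def certs_exist(ts):
    """True only if all three files are present as regular files."""
    return all(os.path.isfile(f) for f in pki_files(ts))


def demo():
    with tempfile.TemporaryDirectory() as ts:
        # Fresh install: the directories exist but are empty.
        os.makedirs(os.path.join(ts, "certs"))
        os.makedirs(os.path.join(ts, "keys"))
        before = certs_exist(ts)
        # Simulate the certificate generation done on first vdsm start.
        for path in pki_files(ts):
            open(path, "w").close()
        after = certs_exist(ts)
    return before, after
```

Under these assumptions the check is False on the first configure run and True on the second, which would explain why only the second run writes the TLS-related entries.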
The original code was:

    # If the ssl flag is set, update the libvirt and qemu configuration files
    # with the location for certificates and permissions.
    if [ -f $ts/certs/cacert.pem -a \
         -f $ts/certs/vdsmcert.pem -a \
         -f $ts/keys/vdsmkey.pem -a \
         "${ssl}" = "true" ]; then
        set_if_default "${lconf}" ca_file \"$ts/certs/cacert.pem\"
        set_if_default "${lconf}" cert_file \"$ts/certs/vdsmcert.pem\"
        set_if_default "${lconf}" key_file \"$ts/keys/vdsmkey.pem\"
        set_if_default "${qconf}" spice_tls_x509_cert_dir \"$ts/libvirt-spice\"
    else
        set_if_default "${lconf}" auth_tcp \"none\"
        set_if_default "${lconf}" listen_tcp 1
        set_if_default "${lconf}" listen_tls 0
    fi

which means that if the pem files were not there, it would configure libvirt and qemu to work without ssl (even if vdsm.conf says differently).

During the pre-start script (init/vdsmd_init_common.sh), vdsm runs the gencert step, which calls the script vdsm-gencerts.sh, which generates those pem files. I assume that host-deploy does it for us as well.

afaiu, we should move this script call to the libvirt configure part (if ssl=true and the cert files are missing). Am I missing something? Otherwise, in this state, validating the libvirt configuration will always fail on the first run, of course (as vdsm.conf says ssl=true but the libvirt and qemu conf files were not configured accordingly).

dan, alon, any comments?
The self-generated certificate is to be used for development only. Allowing vdsm and/or libvirt to automatically accept non-encrypted communication is something that should be banned, even at the cost of having vdsm exit while stating that the configuration is incomplete.
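One possible reading of this position, as a sketch only: when ssl is requested but the certificates are missing, refuse to configure instead of silently falling back to plaintext. This is not the patch that was merged; tls_settings and ConfigurationIncomplete are hypothetical names, and the setting names and paths are the ones from the shell snippet quoted in this report.

```python
class ConfigurationIncomplete(Exception):
    """Raised when ssl is requested but the certificates are missing."""


def tls_settings(ssl, certs_present, ts="/etc/pki/vdsm"):
    """Pick libvirtd.conf / qemu.conf entries for the given ssl mode.

    Hypothetical helper: it never downgrades to plaintext when ssl=True.
    """
    if not ssl:
        # Plaintext is used only when vdsm.conf explicitly disables ssl.
        return {
            "libvirtd.conf": {"auth_tcp": '"none"', "listen_tcp": 1, "listen_tls": 0},
            "qemu.conf": {},
        }
    if not certs_present:
        raise ConfigurationIncomplete(
            "ssl=true but certificates are missing under %s" % ts)
    return {
        "libvirtd.conf": {
            "ca_file": '"%s/certs/cacert.pem"' % ts,
            "cert_file": '"%s/certs/vdsmcert.pem"' % ts,
            "key_file": '"%s/keys/vdsmkey.pem"' % ts,
        },
        "qemu.conf": {"spice_tls_x509_cert_dir": '"%s/libvirt-spice"' % ts},
    }
```

The difference from the original shell logic is the middle branch: the old code treated "ssl=true, no certificates" the same as "ssl=false", while this sketch turns it into an explicit error.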
Created attachment 926736 [details] patch 31466 verification fe19
OK, tested on RHEL with vdsm-4.16.4-0.el6.x86_64

# vdsm-tool configure --force
Checking configuration status...
libvirt is not configured for vdsm yet
Running configure...
Reconfiguration of certificates is done.
Reconfiguration of libvirt is done.
Done configuring modules to VDSM.

# service vdsmd start
Starting multipathd daemon: [ OK ]
Starting rpcbind: [ OK ]
Starting wdmd: [ OK ]
Starting sanlock: [ OK ]
libvirtd start/running, process 7840
supervdsm start [ OK ]
Starting iscsid: [ OK ]
vdsm: Running mkdirs
vdsm: Running configure_coredump
vdsm: Running configure_vdsm_logs
vdsm: Running run_init_hooks
vdsm: Running check_is_configured
libvirt is already configured for vdsm
vdsm: Running validate_configuration
SUCCESS: ssl configured to true. No conflicts
vdsm: Running prepare_transient_repository
vdsm: Running syslog_available
vdsm: Running nwfilter
libvir: Network Filter Driver error : Network filter not found: no nwfilter with matching name 'vdsm-no-mac-spoofing'
vdsm: Running dummybr
vdsm: Running load_needed_modules
vdsm: Running tune_system
vdsm: Running test_space
vdsm: Running test_lo
vdsm: Running unified_network_persistence_upgrade
vdsm: Running restore_nets
vdsm: Running upgrade_300_nets
Starting up vdsm daemon:
vdsm start

# service vdsmd status
VDS daemon server is running
oVirt 3.5 has been released and should include the fix for this issue.
I just tried this on Fedora 23 with vdsm-4.18.13-1.fc23.x86_64, and the bug is still present.

On a fresh and up-to-date F23 server install, I did:

dnf install http://plain.resources.ovirt.org/pub/yum-repo/ovirt-release40.rpm
dnf install vdsm

vdsmd.service fails to start, and it all boils down to "vdsm-tool configure" failing with:

FAILED: conflicting vdsm and libvirt-qemu tls configuration.
vdsm.conf with ssl=True requires the following changes:
libvirtd.conf: listen_tcp=0, auth_tcp="sasl", listen_tls=1
qemu.conf: spice_tls=1.

Running "vdsm-tool configure --force" twice in a row allows the service to subsequently start.
Yaniv, can you please take a look?
I just did this on a fresh installation, and after running vdsm-tool configure --force once, it's all configured properly. Can you please re-verify your report asap?

1. Install vdsm
2. Run vdsm-tool configure --force
3. Start the service
Oh, with the caveat that one has to "vdsm-tool configure --force" manually before ever running "systemctl start vdsmd" successfully, you're right.

I thought "systemctl start vdsmd" should succeed the first time it's attempted, but if manual "vdsm-tool configure --force" is a documented precondition, then let's close the bug once again, with my apologies for the misunderstanding :)

Thanks,
--Gabriel
(In reply to Gabriel Somlo from comment #15)
> Oh, with the caveat that one has to "vdsm-tool configure --force" manually
> before ever running "systemctl start vdsmd" successfully, you're right.

Yes, this is the supported way to use the vdsm service: you must configure it before starting it.

When you add a host via engine, engine uses host-deploy to configure vdsm before starting the service.

If this is not documented, please open a documentation bug.
Closing again based on the above. If relevant, please reopen.