Created attachment 550728 rhevm.log

Environment: Devel env, last commit's hash: 4d4f2aec4fb767f29d9cf795dca7c47c57303d33

Scenario:
1. Set MaxNumberOfHostsInStoragePool to 250 in the vdc_options table. You can choose any value; 250 is the default. In the attached log the value was set to 126.

Results:
I managed to install 266 hosts in that data center, all hosts in Up status.

Expected Results:
Addition of the 251st host should fail during the CanDoAction.

Here is the installation of the 127th host:
2012-01-04 18:50:56,442 INFO [org.ovirt.engine.core.bll.LoginAdminUserCommand] (http-0.0.0.0-8080-1) Running command: LoginAdminUserCommand internal: false.
2012-01-04 18:50:56,781 INFO [org.ovirt.engine.core.utils.hostinstall.MinaInstallWrapper] (http-0.0.0.0-8080-1) Invoking /bin/echo -e `/bin/bash -c /usr/sbin/dmidecode|/bin/awk ' /UUID/{ print $2; } ' | /usr/bin/tr ' ' '_' && cat /sys/class/net/*/address | /bin/grep -v '00:00:00:00' | /bin/sort -u | /usr/bin/head --lines=1` on 10.35.99.21
2012-01-04 18:50:56,862 INFO [org.ovirt.engine.core.bll.VdsInstaller] (pool-12-thread-79) Installation of 10.35.99.32. successfully done sftp operation ( Stage: Upload Installation script to Host)
2012-01-04 18:50:56,863 INFO [org.ovirt.engine.core.utils.hostinstall.MinaInstallWrapper] (pool-12-thread-79) return true
2012-01-04 18:50:56,863 INFO [org.ovirt.engine.core.bll.VdsInstaller] (pool-12-thread-79) Installation of 10.35.99.32. Executing installation stage. (Stage: Running first installation script on Host)
ayal - what's the impact of a host with an ID larger than what the SPM gets in SPM start (anything other than lvextends will not work for it?)
(In reply to comment #1)
> ayal - what's the impact of a host with an ID larger than what the SPM gets in
> SPM start (anything other than lvextends will not work for it?)

The impact today is only that the mailbox would not work. Currently that means lvextends would fail very ungracefully (vdsm wouldn't know).
I think a CanDoAction check for adding a server to a cluster (either a new one or via edit) is in order. Not sure if it is important enough for 3.0.z, but a release note reminding customers that they need to change this field for block domains is in order.
(In reply to comment #3)
> i think a CanDoAction for adding a server to a cluster (either a new one or via
> edit) is in order.
> not sure if important enough for 3.0.z.
> but a release note reminding customer they need to change this field for block
> domains is in order.

Does the use case of bringing a host out of maintenance apply here too, since the limit is on active hosts and not on all hosts in the cluster?
Validation should fail the addition of a host if the ID generated for it is > config_limit (regardless of the number of hosts in the DC or their status).
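To make the rule above concrete, here is a minimal Java sketch of an ID-based check. It is not the engine's code: the class name, the method names, and the "lowest free ID starting from 1" allocation are assumptions for illustration only; the thread does not show how the engine actually generates the ID. The point is just that the decision depends on the generated ID versus the configured limit, not on how many hosts exist or what status they are in.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration; names and allocation strategy are assumptions,
// not taken from the oVirt engine source.
public class SpmIdValidation {

    /** Returns the lowest unused ID in 1..configLimit, or -1 if every ID is taken. */
    static int nextFreeId(Set<Integer> usedIds, int configLimit) {
        for (int id = 1; id <= configLimit; id++) {
            if (!usedIds.contains(id)) {
                return id;
            }
        }
        return -1;
    }

    /** The CanDoAction-style check: fail if no ID within the limit can be generated. */
    static boolean canAddHost(Set<Integer> usedIds, int configLimit) {
        return nextFreeId(usedIds, configLimit) != -1;
    }

    public static void main(String[] args) {
        Set<Integer> used = new TreeSet<>();
        for (int i = 1; i <= 250; i++) {
            used.add(i);
        }
        // All 250 IDs are in use, so one more host must be rejected.
        System.out.println("can add: " + canAddHost(used, 250));
        // Freeing an ID (e.g. a host was removed) makes room again.
        used.remove(17);
        System.out.println("next free id: " + nextFreeId(used, 250));
    }
}
```

With this shape, a host in maintenance still holds its ID, so moving it in or out of maintenance does not change the outcome of the check.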
Posted at http://gerrit.ovirt.org/#/c/6514/
Merged If375400f3e12e3e0452053dea12ac6e28bc0ff61
Tested on SI21. The new behaviour seems ok: MaxNumberOfHostsInStoragePool indeed blocks installation of host number MaxNumberOfHostsInStoragePool+1, and the following nice error message appears in webadmin:

"Error while executing action New Host: The maximum number of Hosts allowed in the Data Center has been reached".

HOWEVER:
1. I couldn't find any error message in engine.log; only the lines below appear whenever I try to install one more host.
2. Is it really necessary to run an ssh command on the host before blocking its installation? Can't it be done using a CanDoAction?

2012-10-18 23:50:07,075 INFO [org.ovirt.engine.core.utils.hostinstall.VdsInstallerSSH] (ajp-/127.0.0.1:8702-4) SSH execute puma38.scl.lab.tlv.redhat.com 'IDFILE=/etc/vdsm/vdsm.id; if [ -r "${IDFILE}" ]; then cat "${IDFILE}"; else UUID="$(dmidecode -s system-uuid 2> /dev/null | sed -e 's/.*Not.*//' )"; if [ -z "${UUID}" ]; then UUID="$(uuidgen 2> /dev/null)" && mkdir -p "$(dirname "${IDFILE}")" && echo "${UUID}" > "${IDFILE}" && chmod 0644 "${IDFILE}"; fi; [ -n "${UUID}" ] && echo "${UUID}"; fi'
2012-10-18 23:50:07,126 INFO [org.ovirt.engine.core.bll.AddVdsCommand] (ajp-/127.0.0.1:8702-4) [42c1f34d] Running command: AddVdsCommand internal: false. Entities affected : ID: d8512282-1970-11e2-ae49-13f7e553b2fd Type: VdsGroups
2012-10-18 23:50:07,169 INFO [org.ovirt.engine.core.bll.AddVdsSpmIdCommand] (ajp-/127.0.0.1:8702-4) [80fa6fe] Lock freed to object EngineLock [exclusiveLocks= key: 6acb5a2d-d920-4610-98d4-e4a22f60d3a8 value: REGISTER_VDS , sharedLocks= ]
2012-10-18 23:50:07,172 INFO [org.ovirt.engine.core.bll.AddVdsCommand] (ajp-/127.0.0.1:8702-4) [80fa6fe] Command [id=b507305f-fafd-429e-9c8d-e5524ba093dd]: Compensating NEW_ENTITY_ID of org.ovirt.engine.core.common.businessentities.VdsStatistics; snapshot: 29b8d4a8-1976-11e2-8404-d33241ec1e77.
2012-10-18 23:50:07,174 INFO [org.ovirt.engine.core.bll.AddVdsCommand] (ajp-/127.0.0.1:8702-4) [80fa6fe] Command [id=b507305f-fafd-429e-9c8d-e5524ba093dd]: Compensating NEW_ENTITY_ID of org.ovirt.engine.core.common.businessentities.VdsDynamic; snapshot: 29b8d4a8-1976-11e2-8404-d33241ec1e77.
2012-10-18 23:50:07,175 INFO [org.ovirt.engine.core.bll.AddVdsCommand] (ajp-/127.0.0.1:8702-4) [80fa6fe] Command [id=b507305f-fafd-429e-9c8d-e5524ba093dd]: Compensating NEW_ENTITY_ID of org.ovirt.engine.core.common.businessentities.VdsStatic; snapshot: 29b8d4a8-1976-11e2-8404-d33241ec1e77.
2012-10-18 23:50:11,054 ERROR [org.ovirt.engine.core.engineencryptutils.EncryptionUtils] (QuartzScheduler_Worker-61) Failed to decrypt Data must not be longer than 256 bytes
2012-10-18 23:50:11,080 ERROR [org.ovirt.engine.core.engineencryptutils.EncryptionUtils] (QuartzScheduler_Worker-87) Failed to decrypt Data must start with zero
I guess that we can perform the SSH command only when necessary (after we verify that the host can be added to the DC), but that would require performing the AddVdsCommand, or part of its logic, before running the VDS installer and, if the VdsInstaller fails, rolling back all the DB operations that were done. Regarding the logging, a message can be added for the failure case.
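The ordering described above could look roughly like the following Java sketch. Everything in it is hypothetical (the AddHostFlow class, the Installer interface, the in-memory list standing in for the DB, and the limit check simplified to a host count); it only illustrates the sequence: validate first and log the rejection, touch the DB next, contact the host last, and undo the DB changes if the installer fails.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of "validate, then persist, then install, roll back on failure".
// None of these names come from the oVirt engine source.
public class AddHostFlow {

    /** Stand-in for the step that contacts the host over SSH. */
    interface Installer {
        boolean install(String host);
    }

    final List<String> hostsInDb = new ArrayList<>(); // stand-in for the DB rows
    final List<String> log = new ArrayList<>();       // stand-in for engine.log
    final int maxHosts;

    AddHostFlow(int maxHosts) {
        this.maxHosts = maxHosts;
    }

    boolean addHost(String host, Installer installer) {
        // 1. Validation runs before any SSH command; a rejection is logged.
        if (hostsInDb.size() >= maxHosts) {
            log.add("ERROR cannot add " + host + ": maximum number of hosts reached");
            return false;
        }
        // 2. DB operations are performed only after validation passed.
        hostsInDb.add(host);
        // 3. The host is contacted last; on failure the DB changes are undone.
        if (!installer.install(host)) {
            hostsInDb.remove(host);
            log.add("ERROR installation of " + host + " failed, DB changes rolled back");
            return false;
        }
        return true;
    }
}
```

The trade-off is the one noted in the comment: avoiding the needless SSH round trip means the DB work moves ahead of the installer, so the installer failure path needs a reliable rollback.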
Verified according to Comment 8 & Comment 9.