Description of problem:

I was doing a deployment of RHV (self-hosted, 4 hosts) + OSE (3 nodes) + CFME. It got through the RHV and OSE deployments but failed in the CFME deployment with the following error:

CFME Launch failed with error ["Failed to power up a compute thistimeitwillwork-RHEV (RHEV) instance thistimeitwillwork-cfme.b.b: Cannot run VM. There is no host that satisfies current scheduling constraints. See below for details:, The host hosted_engine_1 did not satisfy internal filter Memory because its available memory is too low (3083.000000 MB) to run the VM., The host hosted_engine_1 did not satisfy internal filter Memory because its available memory is too low (3083.000000 MB) to run the VM., The host hosted_engine_1 did not satisfy internal filter Memory because its available memory is too low (3083.000000 MB) to run the VM., The host hosted_engine_1 did not satisfy internal filter Memory because its available memory is too low (3083.000000 MB) to run the VM."]

I actually could log in to CFME and found that, according to both CFME itself and RHV, the CFME guest did not exist (even though it was running). One host had 3083 MB of memory left, and the other three hosts had 7244 MB left. All of the hosts had 16 GB of memory each.

Version-Release number of selected component (if applicable):
QCI-1.0-RHEL-7-20160824.t.1

How reproducible:
I think always

Steps to Reproduce:
1. Do a RHV + OSE + CFME deployment where there is not quite enough memory left on the first RHV host.

Actual results:
It fails with the error above.

Expected results:
An error when configuring CFME saying there is not enough memory to deploy CFME.
Possible RHEV issue. We need to investigate this further and attempt to recreate it. Our simplified aggregate checks seem to have correctly determined that enough memory *was* available across the 4 hypervisors. Moving to v1.1.
Right now self-hosted does not work, and the original configuration I tested this with involved self-hosted (and indeed that was where the conflict occurred, because the engine and the CFME box ended up on the same host). So at present I am waiting for a QCI 1.1 compose where self-hosted works in order to test this.
Unfortunately, I can't validate this now because OCP deployments are failing due to this bug: https://bugzilla.redhat.com/show_bug.cgi?id=1403864
This is in compose QCI-1.1-RHEL-7-20161209.t.0.
Verified in QCI-1.1-RHEL-7-20161215.t.0
I don't think this should be marked as verified. On QCI-1.1-RHEL-7-20161215.t.0, in a deployment of RHV self-hosted + OCP + CFME with 4 hypervisors (16 GB RAM each) and 4 OCP nodes (1 master + 3 workers), the CFME task fails because there is not enough RAM available:

----
D, [2016-12-20T13:24:38.048941 #19406] DEBUG -- : ====== CFME Launch run method ======
I, [2016-12-20T13:25:46.069406 #19406] INFO -- : ["Failed to power up a compute tpapaioa_3-RHEV (RHEV) instance tpapaioa-3-rhv-cfme.cfme.lab.eng.rdu2.redhat.com: Cannot run VM. There is no host that satisfies current scheduling constraints. See below for details:, The host hosted_engine_2 did not satisfy internal filter Memory because its available memory is too low (7243.000000 MB) to run the VM., The host hosted_engine_2 did not satisfy internal filter Memory because its available memory is too low (7243.000000 MB) to run the VM., The host hosted_engine_2 did not satisfy internal filter Memory because its available memory is too low (7243.000000 MB) to run the VM., The host hosted_engine_2 did not satisfy internal filter Memory because its available memory is too low (7243.000000 MB) to run the VM."]
----

There is no warning during creation of the deployment that there is insufficient memory for all 6 VMs (1 engine + 4 OCP nodes + 1 CFME). On the OpenShift > Master/Nodes tab:

         Resources needed    Resources available
vCPU     5                   16
RAM      32 GB               58.04 GB
Disk     135 GB              657.54 GB

Hovering over the tooltip next to "Resources available" shows "0 vCPUs, 0GB RAM, 0GB Disk reserved for CloudForms", even though CFME has been selected. There is also no mention of the resources required for the self-hosted engine VM, nor does there appear to be any accounting for the fact that, even though a total of ~58 GB RAM is available, each individual host has at most 16 GB.
Will revisit by trying a RHV self-hosted deployment with 4 x 16 GB RAM hypervisors as you've described above. Thank you.
I am punting this from the 1.1 release; the underlying issue is more involved than we first realized.

The heart of the issue is that we are treating the memory requirements incorrectly. We assumed we could consider the total memory available as a single pool, similar to shared disk usage, and do simple arithmetic. We can't: a VM must satisfy its RAM requirement from free RAM on a single hypervisor; it can't split its requirement across a second hypervisor. The memory actually consumed is segmented per hypervisor, based on how VMs get scheduled to each hypervisor.

For example: if we have 4 hypervisors, each with 16 GB of RAM, that is a total of 64 GB. Each hypervisor has ~16 GB - 2 GB (assuming that is the RAM the hypervisor reserves for itself) = 14 GB free. When it comes to running 5 VMs of, say, 8 GB each, we need 40 GB. We __thought__ we had sufficient memory: 4 hypervisors x 14 GB = 56 GB > 40 GB. The issue is that we need to account for where the VMs will run, and a VM only runs on a single hypervisor. When we schedule this, each hypervisor ends up with a single 8 GB VM, so each hypervisor has 16 GB - 2 GB (hypervisor usage) - 8 GB (VM) = ~6 GB free. None of the hypervisors in this case has a full chunk of 8 GB left to allocate to the fifth VM. The "pool" says we have enough RAM free, but treating this memory as a pool is inaccurate since it doesn't model real-world usage. (See the sketch below.)

Also note: to implement this correctly, we need to be aware of how RHV will schedule each VM onto a hypervisor. For example, if hypervisors differ in capabilities, say free CPUs or other requirements, we need to address that in our algorithm. At the moment I'm aware of:

- Memory needs
- Number of CPUs

We need to address the above 2 requirements when estimating how the RHV scheduler will place VMs onto hypervisors.
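To make the difference concrete, here is a minimal illustrative sketch in Ruby (not the actual QCI/fusor code). It contrasts the old aggregate "pool" check with a per-hypervisor placement check, using the numbers from the example above: 4 hypervisors with 16 GB each, an assumed 2 GB reserve per hypervisor, and five 8 GB VMs. The 2 GB reserve and the first-fit-decreasing placement are assumptions for illustration only; the real RHV scheduler applies its own filters and weights.

----
# Illustrative sketch only: aggregate "pool" check vs. per-hypervisor
# placement check. The 2 GB per-host reserve and the first-fit-decreasing
# placement are assumptions for the example, not RHV's actual policy.

HYP_RESERVED_GB = 2  # assumed RAM each hypervisor keeps for itself

hypervisors    = Array.new(4) { { free_gb: 16 - HYP_RESERVED_GB } }  # 4 hosts x 14 GB free
vm_requests_gb = [8, 8, 8, 8, 8]                                     # five 8 GB VMs

# Old check: treat all free RAM as one pool.
pool_free_gb = hypervisors.sum { |h| h[:free_gb] }   # 56 GB
pool_ok      = pool_free_gb >= vm_requests_gb.sum    # 56 >= 40 -> true (misleading)

# New check: each VM must fit entirely on a single hypervisor
# (first-fit, largest VM first).
placement_ok = vm_requests_gb.sort.reverse.all? do |vm_gb|
  host = hypervisors.find { |h| h[:free_gb] >= vm_gb }
  next false unless host
  host[:free_gb] -= vm_gb
  true
end

puts "pool check:      #{pool_ok}"       # => true  (claims we have enough RAM)
puts "placement check: #{placement_ok}"  # => false (no host has 8 GB free for the 5th VM)
----

A fuller version would also track per-host vCPU availability and mirror the RHV scheduler's filters, per the two requirements listed above.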