Bug 1455884

Summary: Installation failed at task openshift_health_check with instance of 16GiB Memory
Product: OpenShift Container Platform Reporter: Weihua Meng <wmeng>
Component: Installer Assignee: Luke Meyer <lmeyer>
Status: CLOSED CURRENTRELEASE QA Contact: Weihua Meng <wmeng>
Severity: high Docs Contact:
Priority: high    
Version: 3.6.0 CC: aos-bugs, jokerman, mmccomas, pep
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-07-20 14:05:24 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Weihua Meng 2017-05-26 11:25:10 UTC
Description of problem:
Installation fails at the openshift_health_check task on an instance with 16 GiB of memory.

Version-Release number of selected component (if applicable):
openshift-ansible-3.6.85-1.git.0.109a54e.el7.noarch.rpm

How reproducible:
Always

Steps to Reproduce:
1. Launch instances on AWS EC2 of type t2.xlarge (16 GiB memory).
2. Set up a cluster using openshift-ansible-3.6.85-1.git.0.109a54e.el7.noarch.rpm

Actual results:
Installation failed.

"msg": "Available memory (15.6 GB) below recommended value (16.0 GB)"

Expected results:
Installation SUCCESS
16 GB RAM meets system requirements
https://docs.openshift.com/container-platform/3.5/install_config/install/prerequisites.html

Additional info:

Comment 1 Scott Dodson 2017-05-26 13:01:49 UTC
I bet a lot of cloud platforms will be shaving ~256-512MiB off the quoted size of the instance. We should probably account for that.

Comment 2 Luke Meyer 2017-05-26 15:11:01 UTC
There's already a little bit of fudge factor since we're measuring GB instead of GiB. But looking at a t2.xlarge instance it does look like AWS has some disappearing RAM:

# dmidecode --type memory
Handle 0x1000, DMI type 16, 15 bytes
Physical Memory Array
	Maximum Capacity: 16 GB

Handle 0x1100, DMI type 17, 21 bytes
Memory Device
	Size: 16384 MB


# cat /proc/meminfo 
MemTotal:       16005316 kB
MemFree:        14720336 kB
MemAvailable:   15595096 kB


MemTotal is what the check is looking at. It's substantially lower than the "hardware" RAM.

I don't think support would deny a customer with this memory setup, so we should probably introduce a bit more fudge factor for those with memory that's "close enough".
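
As a rough illustration of the numbers above (a sketch only, not the actual openshift-ansible check): the snippet converts the MemTotal value from /proc/meminfo to GiB, shows how much of the 16384 MB reported by dmidecode is missing, and applies a tolerance before comparing against the recommendation. The function names and the 1 GiB tolerance are assumptions made for illustration. The "kB" in /proc/meminfo means 1024 bytes, so the figures here are in GiB and will not match the "15.6 GB" in the error message exactly.

```python
# Values copied from the dmidecode and /proc/meminfo output above.
MEMTOTAL_KIB = 16005316   # MemTotal; the kernel's "kB" unit is 1024 bytes
DMI_SIZE_MIB = 16384      # "Size: 16384 MB" reported by dmidecode


def kib_to_gib(kib):
    """Convert KiB to GiB."""
    return kib / 1024 ** 2


def meets_recommendation(total_gib, recommended_gib, tolerance_gib=1.0):
    """Hypothetical check: pass if memory is within tolerance_gib of the
    recommended amount. The default tolerance is an assumed value."""
    return total_gib >= recommended_gib - tolerance_gib


total_gib = kib_to_gib(MEMTOTAL_KIB)
missing_mib = DMI_SIZE_MIB - MEMTOTAL_KIB / 1024

print(f"MemTotal: {total_gib:.2f} GiB, missing vs. hardware: {missing_mib:.0f} MiB")
print("strict check passes:  ", meets_recommendation(total_gib, 16, tolerance_gib=0.0))
print("tolerant check passes:", meets_recommendation(total_gib, 16))
```

With no tolerance the t2.xlarge instance fails the comparison; with some slack it passes, which is the "close enough" behaviour described above.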

Comment 3 Josep 'Pep' Turro Mauri 2017-05-26 16:30:46 UTC
I think this is expected and not specific to AWS:

https://access.redhat.com/solutions/3006511

So yes, we probably always have to allow a small % of margin.

Comment 5 Luke Meyer 2017-05-30 15:29:47 UTC
PR just merged to master, so should be available to test whenever there's a new openshift-ansible build.

Comment 7 Weihua Meng 2017-06-05 03:28:53 UTC
Verified on openshift-ansible-3.6.94-1.git.0.fff177b.el7.noarch.rpm
Installation SUCCESS
Fixed.

Comment 8 Luke Meyer 2017-07-20 14:05:24 UTC
never released