Description of problem:
Installation failed at the openshift_health_check task on an instance with 16 GiB of memory.

Version-Release number of selected component (if applicable):
openshift-ansible-3.6.85-1.git.0.109a54e.el7.noarch.rpm

How reproducible:
Always

Steps to Reproduce:
1. Launch instances on AWS EC2 with type t2.xlarge (16 GiB memory)
2. Set up the cluster with openshift-ansible-3.6.85-1.git.0.109a54e.el7.noarch.rpm

Actual results:
Installation failed:
"msg": "Available memory (15.6 GB) below recommended value (16.0 GB)"

Expected results:
Installation succeeds; 16 GB RAM meets the system requirements:
https://docs.openshift.com/container-platform/3.5/install_config/install/prerequisites.html

Additional info:
I bet a lot of cloud platforms will be shaving ~256-512MiB off the quoted size of the instance. We should probably account for that.
There's already a little bit of fudge factor since we're measuring GB instead of GiB. But looking at a t2.xlarge instance, it does look like AWS has some disappearing RAM:

# dmidecode --type memory
Handle 0x1000, DMI type 16, 15 bytes
Physical Memory Array
        Maximum Capacity: 16 GB

Handle 0x1100, DMI type 17, 21 bytes
Memory Device
        Size: 16384 MB

# cat /proc/meminfo
MemTotal:       16005316 kB
MemFree:        14720336 kB
MemAvailable:   15595096 kB

MemTotal is what the check is looking at. It's substantially lower than the "hardware" RAM. I don't think support would deny a customer with this memory setup, so we should probably introduce a bit more fudge factor for those with memory that's "close enough".
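For reference, the reported 15.6 GB falls out of the unit conversion. A minimal sketch of the arithmetic, assuming the check reads MemTotal via Ansible's memtotal_mb fact (MemTotal in MiB) and then reports it as "GB" by dividing by 1000; this is an illustration, not the actual check code:

# Illustration of why MemTotal: 16005316 kB shows up as "15.6 GB".
# The memtotal_mb-based path is an assumption inferred from the error message.
MEMTOTAL_KB = 16005316                 # MemTotal from /proc/meminfo on the t2.xlarge

memtotal_mib = MEMTOTAL_KB / 1024.0    # ~15630 MiB (what Ansible's memtotal_mb would report)
reported_gb = memtotal_mib / 1000.0    # ~15.6 "GB", matching the failure message

recommended_gb = 16.0
print("%.1f GB vs recommended %.1f GB -> %s"
      % (reported_gb, recommended_gb,
         "FAIL" if reported_gb < recommended_gb else "OK"))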
I think this is expected and not specific to AWS: https://access.redhat.com/solutions/3006511

So yes, we probably always have to allow a small % of margin.
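One way to allow that margin, sketched below purely for illustration (the function name and the 1 GB allowance are assumptions for this example, not necessarily what the upstream fix does), is to subtract a small allowance from the recommended value before comparing:

# Illustrative tolerance-based memory check; not the actual openshift-ansible code.
def memory_check(memtotal_mb, recommended_gb, margin_gb=1.0):
    """Pass if total memory is within margin_gb of the recommended amount."""
    required_mb = (recommended_gb - margin_gb) * 1000
    return memtotal_mb >= required_mb

# The t2.xlarge case from this report: ~15630 MiB against a 16 GB requirement.
print(memory_check(15630, 16.0))   # True with the margin; would fail without it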
https://github.com/openshift/openshift-ansible/pull/4301
PR just merged to master, so should be available to test whenever there's a new openshift-ansible build.
Verified on openshift-ansible-3.6.94-1.git.0.fff177b.el7.noarch.rpm.
Installation succeeded. Fixed.
never released