Bug 1730910 - crio service not enabled/started when node is initialized
Summary: crio service not enabled/started when node is initialized
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: RHCOS
Version: 4.1.z
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: ---
Assignee: Steve Milner
QA Contact: Micah Abbott
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-07-17 20:41 UTC by Chris Callegari
Modified: 2019-07-18 14:09 UTC

Fixed In Version:
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-07-18 13:25:35 UTC
Target Upstream Version:
Embargoed:



Description Chris Callegari 2019-07-17 20:41:19 UTC
Description of problem:
crio service not enabled/started when node is initialized.  This causes kubelet to fail to start.

Version-Release number of selected component (if applicable):
4.1.4

aws ami: rhcos-rhcos-420devel.8.20190717.0-hvm (ami-0093b77a3f424a8f9)

How reproducible:
Always

Steps to Reproduce:
Follow UPI guide - https://docs.openshift.com/container-platform/4.1/installing/installing_aws_user_infra/installing-aws-user-infra.html#installation-aws-user-infra-bootstrap_installing-aws-user-infra

Actual results:


Expected results:


Additional info:
Let me know what kind of logs we need to debug this

Comment 1 Chris Callegari 2019-07-17 20:43:29 UTC
Master and worker instances all come up without the crio service running.  Manually starting the service and restarting kubelet lets "wait-for bootstrap-complete" complete successfully.
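
The manual recovery described above can be sketched as a small shell function. This is only an illustration of the workaround, not a supported procedure: the function name `recover_node` and the overridable `SYSTEMCTL` variable are made up here, and the unit names `crio` and `kubelet` are assumed to match the node's systemd units.

```shell
#!/bin/sh
# Hypothetical helper: SYSTEMCTL is overridable only so the sketch can be dry-run.
SYSTEMCTL=${SYSTEMCTL:-systemctl}

recover_node() {
    # Start cri-o only when it is not already active.
    if ! $SYSTEMCTL is-active --quiet crio; then
        $SYSTEMCTL start crio || return 1
    fi
    # kubelet failed earlier because cri-o was down, so restart it afterwards.
    $SYSTEMCTL restart kubelet
}
```

Calling `recover_node` as root on an affected node would start cri-o if needed and then restart kubelet. It does not enable the service, so the change would not survive a reboot.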

Comment 2 Colin Walters 2019-07-17 20:46:16 UTC
You're using a 4.2 bootimage with a 4.1 installer?  This isn't...tested.  And I'd say we shouldn't support it unless there's a strong reason.

The change here came in https://src.osci.redhat.com/rpms/redhat-release-coreos/c/d34c31c10d025f04f1e04488844f1fe85a6b061c?branch=rhaos-4.2-rhel-8

Comment 3 Steve Milner 2019-07-17 21:44:22 UTC
Thanks for the report Chris!

> You're using a 4.2 bootimage with a 4.1 installer?  This isn't...tested.  And I'd say we shouldn't support it unless there's a strong reason.

Agreed. Having cri-o start when the kubelet starts up was done purposefully for 4.2+. Is there a specific need for the 4.1 installer to work for 4.2 boot images?

Comment 4 Chris Callegari 2019-07-18 13:17:01 UTC
Yikes, I didn't realize I had a version mismatch!  That is absolutely not what I'm trying to do.

Comment 5 Chris Callegari 2019-07-18 13:32:14 UTC
When you say "bootimage" do you mean the AWS AMI?

Comment 6 Steve Milner 2019-07-18 13:34:08 UTC
Correct. We have `machine-os-content` (aka `os-container`s) and boot images (e.g. AMI, qcow2, OVA).

Comment 8 Colin Walters 2019-07-18 13:55:47 UTC
No customers should be searching for AMIs by name.  Absolutely don't do that, you can easily be picking up development or pre-release builds, as happened here.

Use the AMIs from https://docs.openshift.com/container-platform/4.1/installing/installing_aws_user_infra/installing-aws-user-infra.html

Comment 9 Colin Walters 2019-07-18 13:58:40 UTC
OK, right, you want https://github.com/openshift/installer/issues/1399

Comment 10 Chris Callegari 2019-07-18 14:09:50 UTC
Re issue 1399 ... oh yeah, that would be a great feature to have.


This test is for customers in AWS trying to install OCP 4 in internet-isolated subnets. Sol Eng is ensuring the product can handle the scenario. Finance and telco customers are the ones demanding the functionality.


The UPI documentation is already stale.  We got overruled on a pull request to automate that AMI table.

We don't pick the AMI by name alone.  We narrow to rhcos* images and then filter by Description.  The filter is the following...
# params.REGION is set in the Jenkins pipeline parameters section
export AWS_DEFAULT_REGION=${params.REGION}
# Newest rhcos* image from this owner whose Description contains neither 'Beta' nor 'devel'
export RHCOSAMIID=$(aws ec2 describe-images --owners 531415883065 \
    --filters "Name=name,Values=rhcos*" \
    --query "sort_by(Images, &CreationDate) | [? ! contains(Description, 'Beta')] | [? ! contains(Description, 'devel')] | [-1].ImageId" \
    --output text)
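
One way a pipeline could guard against the mismatch discussed in this bug is to check the selected image's name before using it. The sketch below is hypothetical: the function names `ami_stream` and `check_bootimage` are made up for illustration, and it assumes image names embed the release stream as three digits after `rhcos-` (as in `rhcos-rhcos-420devel.8.20190717.0-hvm` above), which may not hold for every build.

```shell
#!/bin/sh
# Hypothetical helpers; the naming-scheme assumption is illustrative only.

# Print the stream (e.g. "4.2") parsed from a name containing "rhcos-XY0".
# Fails when no such pattern is found.
ami_stream() (
    stream=$(printf '%s\n' "$1" | sed -n 's/.*rhcos-\([0-9]\)\([0-9]\)0.*/\1.\2/p')
    [ -n "$stream" ] && printf '%s\n' "$stream"
)

# Succeed only if the image is not a devel build and its stream matches
# the installer's major.minor version (e.g. "4.1.4" is reduced to "4.1").
check_bootimage() (
    name=$1
    want=$(printf '%s\n' "$2" | sed -n 's/^\([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p')
    [ -n "$want" ] || { echo "cannot parse installer version: $2" >&2; exit 2; }
    case $name in
        *devel*) echo "refusing development build: $name" >&2; exit 1 ;;
    esac
    got=$(ami_stream "$name") || { echo "cannot parse stream from: $name" >&2; exit 1; }
    [ "$got" = "$want" ] || { echo "bootimage stream $got does not match installer $want" >&2; exit 1; }
)
```

With these definitions, `check_bootimage rhcos-rhcos-420devel.8.20190717.0-hvm 4.1.4` fails because the name contains `devel`, so a pipeline could stop before launching instances. Checking the name in addition to the Description is a design choice here, since the image in this report was selected even though the query excluded `devel` in the Description.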

