Bug 1585038

Summary: Invalid proxy environment assignment in cri-o systemd configurations
Product: OpenShift Container Platform Reporter: Gan Huang <ghuang>
Component: Containers Assignee: Giuseppe Scrivano <gscrivan>
Status: CLOSED CURRENTRELEASE QA Contact: DeShuai Ma <dma>
Severity: medium Docs Contact:
Priority: medium    
Version: 3.10.0 CC: amurdaca, aos-bugs, gscrivan, jokerman, mmccomas, sdodson, vlaad
Target Milestone: ---   
Target Release: 3.10.0   
Hardware: Unspecified   
OS: Unspecified   
Last Closed: 2018-09-11 19:20:15 UTC Type: Bug
Bug Depends On: 1589013    
Bug Blocks:    

Description Gan Huang 2018-06-01 08:12:54 UTC
Description of problem:
Installing cri-o with proxy variables set leaves the control plane unable to start. systemd reports the proxy assignments in /etc/sysconfig/crio-network as invalid and ignores them:

# systemctl status cri-o -l
● crio.service - Open Container Initiative Daemon
   Loaded: loaded (/usr/lib/systemd/system/crio.service; enabled; vendor preset: disabled)
   Active: active (running) since Fri 2018-06-01 04:00:06 EDT; 31s ago
     Docs: https://github.com/kubernetes-incubator/cri-o
 Main PID: 25139 (crio)
   CGroup: /system.slice/crio.service
           └─25139 /usr/bin/crio

Jun 01 04:00:06 qe-ghuangproxy-master-etcd-nfs-1 crio[16103]: 2018/06/01 04:00:06 transport: http2Server.HandleStreams failed to read frame: read unix /var/run/crio/crio.sock->@: use of closed network connection
Jun 01 04:00:06 qe-ghuangproxy-master-etcd-nfs-1 crio[16103]: time="2018-06-01 04:00:06.346626406-04:00" level=error msg="Failed to start streaming server: http: Server closed"
Jun 01 04:00:06 qe-ghuangproxy-master-etcd-nfs-1 systemd[1]: Ignoring invalid environment assignment 'export HTTP_PROXY=http://file.rdu.redhat.com:3128': /etc/sysconfig/crio-network
Jun 01 04:00:06 qe-ghuangproxy-master-etcd-nfs-1 systemd[1]: Ignoring invalid environment assignment 'export HTTPS_PROXY=http://file.rdu.redhat.com:3128': /etc/sysconfig/crio-network
Jun 01 04:00:06 qe-ghuangproxy-master-etcd-nfs-1 systemd[1]: Ignoring invalid environment assignment 'export NO_PROXY=.centralci.eng.rdu2.redhat.com,.cluster.local,.svc,169.254.169.254,172.16.120.153,172.30.0.1,qe-ghuangproxy-master-etcd-nfs-1,qe-ghuangproxy-node-registry-router-1': /etc/sysconfig/crio-network


Version-Release number of the following components:
openshift-ansible-3.10.0-0.56.0.git.0.b921fb9.el7.noarch.rpm
RHEL-7.5 

How reproducible:
always

Steps to Reproduce:
1. Run the installation behind a proxy with cri-o enabled, using the following inventory variables:

openshift_https_proxy=http://file.rdu.redhat.com:3128
openshift_http_proxy=http://file.rdu.redhat.com:3128
openshift_no_proxy="169.254.169.254,.centralci.eng.rdu2.redhat.com"

openshift_use_crio=true
openshift_crio_use_rpm=true


Actual results:
The control plane failed to start because systemd ignored the proxy-related environment assignments in the cri-o configuration.

# cat /etc/sysconfig/crio-network
export HTTP_PROXY=http://file.rdu.redhat.com:3128
export HTTPS_PROXY=http://file.rdu.redhat.com:3128
export NO_PROXY=.centralci.eng.rdu2.redhat.com,.cluster.local,.svc,169.254.169.254,172.16.120.153,172.30.0.1,qe-ghuangproxy-master-etcd-nfs-1,qe-ghuangproxy-node-registry-router-1

Expected results:

Additional info:
After removing the "export " prefix from each line in /etc/sysconfig/crio-network, reloading the systemd daemon, and restarting cri-o, the control plane starts up properly.
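
The manual workaround above could be scripted roughly as follows. This is only a sketch: the helper name strip_export_prefix is hypothetical, and it only rewrites text in memory. Writing the result back to /etc/sysconfig/crio-network, reloading systemd, and restarting cri-o are left to the operator.

```python
def strip_export_prefix(text: str) -> str:
    """Return `text` with a leading "export " removed from every line.

    Hypothetical helper: it mirrors the manual edit described in this
    report and leaves all other lines untouched.
    """
    fixed = []
    for line in text.splitlines():
        stripped = line.lstrip()
        if stripped.startswith("export "):
            # Keep only the NAME=VALUE part of the line.
            line = stripped[len("export "):].lstrip()
        fixed.append(line)
    return "".join(f"{line}\n" for line in fixed)


if __name__ == "__main__":
    sample = (
        "export HTTP_PROXY=http://file.rdu.redhat.com:3128\n"
        "export HTTPS_PROXY=http://file.rdu.redhat.com:3128\n"
    )
    print(strip_export_prefix(sample), end="")
```

Lines that do not start with "export " pass through unchanged, so running the helper on an already corrected file has no effect.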

Comment 1 Scott Dodson 2018-06-01 12:34:51 UTC
Seems as if the method to read /etc/sysconfig/crio-network has changed in 1.10?

Here's where that config template is https://github.com/openshift/openshift-ansible/blob/master/roles/container_runtime/templates/crio-network.j2

Please determine if this is to be considered a regression in crio or update the template.

Comment 2 Antonio Murdaca 2018-06-04 07:34:13 UTC
(In reply to Scott Dodson from comment #1)
> Seems as if the method to read /etc/sysconfig/crio-network has changed in
> 1.10?
> 
> Here's where that config template is
> https://github.com/openshift/openshift-ansible/blob/master/roles/
> container_runtime/templates/crio-network.j2
> 
> Please determine if this is to be considered a regression in crio or update
> the template.

Giuseppe added the "export" prefixes in https://github.com/openshift/openshift-ansible/commit/7ce6b62cfbdd4cde157ecc2154a43cf9e7afd56e#diff-5969c8474466370b083fc457c47a6844 so I'm not sure whether the regression is in the template or in cri-o itself. CC'ing Giuseppe here.

Comment 3 Antonio Murdaca 2018-06-04 07:35:20 UTC
Related bug: https://bugzilla.redhat.com/show_bug.cgi?id=1529478

Comment 4 Giuseppe Scrivano 2018-06-04 08:36:52 UTC
The original template file was written for the system container's run.sh script. Its "export " prefix breaks systemd's EnvironmentFile= directive, which accepts only lines of the form NAME=VALUE.

I've opened a PR here:

https://github.com/openshift/openshift-ansible/pull/8615

and a matching change for the system container, so that it still works once the template no longer contains "export":

https://github.com/kubernetes-incubator/cri-o/pull/1587
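
To illustrate the NAME=VALUE restriction, here is a rough checker for environment files. It is a simplified approximation, not systemd's actual parser: it skips blank lines and lines starting with "#" or ";", and flags any other line whose text before the first "=" is not a plain variable name. The function name rejected_lines and the name pattern are assumptions made for this sketch.

```python
import re

# Assumed shape of a valid variable name; systemd's real rules may differ.
_NAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def rejected_lines(text: str) -> list:
    """Return the lines that do not look like a plain NAME=VALUE assignment."""
    bad = []
    for line in text.splitlines():
        stripped = line.strip()
        # Blank lines and comment lines are not assignments; skip them.
        if not stripped or stripped[0] in "#;":
            continue
        name, sep, _value = stripped.partition("=")
        # "export HTTP_PROXY" contains a space, so it fails the name check.
        if not sep or not _NAME_RE.fullmatch(name):
            bad.append(line)
    return bad


if __name__ == "__main__":
    sample = (
        "export HTTP_PROXY=http://file.rdu.redhat.com:3128\n"
        "NO_PROXY=.cluster.local,.svc\n"
    )
    for line in rejected_lines(sample):
        print("would be rejected:", line)
```

Run against the file shown in this report, every "export ..." line would be flagged, matching the "Ignoring invalid environment assignment" messages in the journal.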

Comment 6 Gan Huang 2018-06-12 09:43:44 UTC
Fixed in openshift-ansible-3.10.0-0.66.0.git.79.68197f9.el7.noarch.rpm