Bug 1427040

Summary: STI build failed due to cert error
Product: OpenShift Container Platform
Component: Installer
Version: 3.5.0
Status: CLOSED CURRENTRELEASE
Severity: urgent
Priority: urgent
Keywords: Regression
Hardware: Unspecified
OS: Unspecified
Reporter: Wenkai Shi <weshi>
Assignee: Russell Teague <rteague>
QA Contact: Johnny Liu <jialiu>
CC: aos-bugs, ccoleman, jokerman, mifiedle, mmccomas
Type: Bug
Last Closed: 2017-03-08 13:46:05 UTC
Bug Depends On: 1427378

Description Wenkai Shi 2017-02-27 08:05:57 UTC
Description of problem:
STI build failed due to cert error.

Version-Release number of selected component (if applicable):
openshift-ansible-3.5.15-1.git.0.8d2a456.el7.noarch

How reproducible:
100%

Steps to Reproduce:
1. Prepare an environment
2. oc new-project test
3. oc new-app centos/ruby-22-centos7~https://github.com/openshift/ruby-ex.git

Actual results:
# oc logs -f bc/ruby-ex
...
---> Cleaning up unused ruby gems ...
Pushing image 172.30.146.9:5000/weshi/ruby-ex:latest ...
Registry server Address: 
Registry server User Name: serviceaccount
Registry server Email: serviceaccount
Registry server Password: <<non-empty>>
error: build error: Failed to push image: Get https://172.30.146.9:5000/v1/_ping: x509: cannot validate certificate for 172.30.146.9 because it doesn't contain any IP SANs

Expected results:
The STI build succeeds.

Additional info:
# ansible-playbook -i hosts -v /usr/share/ansible/openshift-ansible/playbooks/byo/config.yml
...

TASK [openshift_hosted : Retrieve registry service IP] *************************
Monday 27 February 2017  07:06:15 +0000 (0:00:03.276)       0:20:11.120 ******* 
ok: [qe-weshi-all-in-one-master-1.0227-7sv.qe.rhcloud.com] => {
    "changed": false, 
    "results": {
        "clusterip": "", 
        "cmd": "/usr/bin/oc -n default get service docker-registry -o json", 
        "results": [
            {}
        ], 
        "returncode": 0, 
        "stderr": "Error from server (NotFound): services \"docker-registry\" not found\n", 
        "stdout": ""
    }, 
    "state": "list"
}

TASK [openshift_hosted : Create registry certificates] *************************
Monday 27 February 2017  07:06:17 +0000 (0:00:01.708)       0:20:12.828 ******* 
changed: [qe-weshi-all-in-one-master-1.0227-7sv.qe.rhcloud.com] => {
    "changed": true, 
    "results": {
        "cmd": "/usr/bin/oc adm ca create-server-cert --cert=/etc/origin/master/registry.crt --hostnames=,docker-registry.default.svc.cluster.local,docker-registry-default.0227-7sv.qe.rhcloud.com --key=/etc/origin/master/registry.key --signer-key=/etc/origin/master/ca.key --signer-serial=/etc/origin/master/ca.serial.txt --signer-cert=/etc/origin/master/ca.crt --overwrite=True", 
        "results": "", 
        "returncode": 0
    }, 
    "state": "present"
}
...

# openssl x509 -in /etc/origin/master/registry.crt -text
...
  X509v3 Subject Alternative Name: 
    DNS:, DNS:docker-registry-default.0227-7sv.qe.rhcloud.com, DNS:docker-registry.default.svc.cluster.local
...
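
The push error in the build log follows from this SAN list: it has no "IP Address:" entry, and Go's x509 verification matches an IP host only against IP SANs, never against DNS entries. A minimal sketch for checking a SAN line for a given IP follows. san_has_ip is a hypothetical helper name, not part of oc or openssl, and it only inspects the text that `openssl x509 -text` prints:

```shell
#!/bin/sh
# Hypothetical helper: succeed if a SAN line (as printed by
# `openssl x509 -text`) contains an "IP Address:" entry for the given IP.
san_has_ip() {
    san_line=$1
    ip=$2
    # Split the comma-separated SAN list into one entry per line,
    # trim surrounding spaces, then look for an exact "IP Address:<ip>" entry.
    printf '%s\n' "$san_line" | tr ',' '\n' | sed 's/^ *//; s/ *$//' \
        | grep -Fxq "IP Address:$ip"
}

# SAN line from the failing cert above: no IP SAN, so the check fails.
bad='DNS:, DNS:docker-registry-default.0227-7sv.qe.rhcloud.com, DNS:docker-registry.default.svc.cluster.local'
if san_has_ip "$bad" 172.30.146.9; then
    echo "IP SAN present"
else
    echo "IP SAN missing"
fi
```

Note that a "DNS:<ip>" entry alone would not satisfy this check, which matches how the verification treats IP hosts.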

Comment 1 Clayton Coleman 2017-02-27 19:42:48 UTC
It also needs to sign for DNS:docker-registry.default.svc as well as the service IP

Comment 2 Russell Teague 2017-02-27 20:40:21 UTC
The docker-registry service did not exist when the installer tried to retrieve its service IP, so the lookup returned an empty IP. Added a task to create the service.

https://github.com/openshift/openshift-ansible/pull/3512
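
For illustration only: the empty "DNS:" SAN in the description comes from building the --hostnames list while the service IP was still empty, which left a leading comma. A minimal sketch of a join that skips empty entries follows. join_hostnames is a hypothetical helper and is not the code in the PR above, which addresses the cause by creating the service:

```shell
#!/bin/sh
# Hypothetical helper: join the non-empty arguments with commas so that
# a missing value (such as an empty cluster IP) leaves no empty entry.
join_hostnames() {
    out=
    for name in "$@"; do
        # Skip empty arguments instead of emitting a bare comma.
        [ -n "$name" ] || continue
        out="${out:+$out,}$name"
    done
    printf '%s\n' "$out"
}

# Empty cluster IP, as in the "Retrieve registry service IP" task output.
cluster_ip=""
join_hostnames "$cluster_ip" \
    docker-registry.default.svc.cluster.local \
    docker-registry-default.0227-7sv.qe.rhcloud.com
```

Skipping the entry only avoids the malformed SAN; the cert still needs the service IP for pushes to that IP to validate.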

Comment 4 Wenkai Shi 2017-02-28 04:04:15 UTC
The cert now seems to be created correctly, but I hit another bug: BZ#1427378

# openssl x509 -in /etc/origin/master/registry.crt -text
...
  X509v3 Subject Alternative Name: 
    DNS:docker-registry-default.0228-nqx.qe.rhcloud.com, DNS:docker-registry.default.svc.cluster.local, DNS:172.30.131.16, IP Address:172.30.131.16
...

Comment 5 Wenkai Shi 2017-02-28 06:36:46 UTC
The docker registry was not created during installation (BZ#1427378), and the docker registry is a secure registry by default, so a new cert must be created if a new docker registry is needed for STI build testing. Will test this when BZ#1427378 is fixed.

Comment 7 Johnny Liu 2017-03-02 09:15:43 UTC
Verified this bug with openshift-ansible-3.5.20-1.git.0.5a5fcd5.el7.noarch: PASS.

# openssl x509 -in /etc/origin/master/registry.crt -text
            X509v3 Subject Alternative Name: 
                DNS:docker-registry-default.0302-obm.qe.rhcloud.com, DNS:docker-registry.default.svc.cluster.local, DNS:172.30.136.182, IP Address:172.30.136.182


BZ#1427378 is only partially fixed and still needs some minor polish, so I followed the workaround from comments #6 and #7 of BZ#1427378 to continue verifying this bug. The STI build passed without any cert error.