Bug 1467775 - Unable to perform initial IP allocation check for container OCP
Status: CLOSED WORKSFORME
Product: OpenShift Container Platform
Classification: Red Hat
Component: Upgrade
Version: 3.6.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: ---
Assigned To: Tim Bielawa
QA Contact: Anping Li
Depends On:
Blocks:
Reported: 2017-07-05 01:45 EDT by Anping Li
Modified: 2017-08-16 15 EDT
CC List: 3 users

See Also:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-07-06 10:52:53 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments
The inventory and logs (76.45 KB, application/x-gzip)
2017-07-05 01:52 EDT, Anping Li

Description Anping Li 2017-07-05 01:45:03 EDT
Description of problem:
The second and third atomic-openshift-master-api instances fail to start with the error:

controller.go:128] Unable to perform initial IP allocation check: unable to refresh the service IP block: User "system:anonymous" cannot list all services in the cluster
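
For triage, one way to confirm that anonymous requests are really being rejected is to query the authorization layer from a working master (a hedged sketch; run with cluster-admin credentials, and the exact commands may vary across 3.x releases):

# List the subjects allowed to list services cluster-wide;
# system:anonymous is not expected to be among them.
oadm policy who-can list services

# Reproduce the failing check directly via impersonation
# (requires impersonation rights; expect "Forbidden" on an affected cluster).
oc get services --all-namespaces --as=system:anonymous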

Version-Release number of selected component (if applicable):
openshift-ansible-3.6.133

How reproducible:
Both install and upgrade hit this issue with OCP v3.6; at least 5 times so far.

Steps to Reproduce:
1. Containerized install of OCP v3.5 (see the inventory sketch after these steps)
2. Upgrade to OCP 3.6:
   ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/upgrades/v3_6/upgrade.yml
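
For reference, a hedged sketch of the inventory variables that drive a containerized install/upgrade (illustrative values only; the real inventory is in the attachment):

[OSEv3:vars]
# Run the master and node components in containers instead of from RPMs.
containerized=true
deployment_type=openshift-enterprise
# Image tag to upgrade to, matching the version reported above.
openshift_image_tag=v3.6.133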

Actual results:
[root@openshift-182 ~]# systemctl status atomic-openshift-master-api
● atomic-openshift-master-api.service - Atomic OpenShift Master API
   Loaded: loaded (/etc/systemd/system/atomic-openshift-master-api.service; enabled; vendor preset: disabled)
   Active: activating (start-post) (Result: exit-code) since Wed 2017-07-05 01:23:06 EDT; 8s ago
     Docs: https://github.com/openshift/origin
  Process: 16535 ExecStop=/usr/bin/docker stop atomic-openshift-master-api (code=exited, status=1/FAILURE)
  Process: 16574 ExecStart=/usr/bin/docker run --rm --privileged --net=host --name atomic-openshift-master-api --env-file=/etc/sysconfig/atomic-openshift-master-api -v /var/lib/origin:/var/lib/origin -v /var/log:/var/log -v /var/run/docker.sock:/var/run/docker.sock -v /etc/origin:/etc/origin -v /etc/pki:/etc/pki:ro openshift3/ose:${IMAGE_VERSION} start master api --config=${CONFIG_FILE} $OPTIONS (code=exited, status=255)
  Process: 16567 ExecStartPre=/usr/bin/docker rm -f atomic-openshift-master-api (code=exited, status=1/FAILURE)
 Main PID: 16574 (code=exited, status=255); Control PID: 16575 (sleep)
   Memory: 96.0K
   CGroup: /system.slice/atomic-openshift-master-api.service
           └─control
             └─16575 /usr/bin/sleep 10

Jul 05 01:23:10 openshift-182.lab.eng.nay.redhat.com atomic-openshift-master-api[16574]: Trace[1623241719]: [343.386731ms] [343.379485ms] Etcd node listed
Jul 05 01:23:10 openshift-182.lab.eng.nay.redhat.com atomic-openshift-master-api[16574]: Trace[1623241719]: [566.788618ms] [223.401887ms] Node list decoded
Jul 05 01:23:10 openshift-182.lab.eng.nay.redhat.com atomic-openshift-master-api[16574]: E0705 05:23:10.311299       1 reflector.go:201] github.com/openshift/origin/pkg/authorization/generated/informers/inter...
Jul 05 01:23:10 openshift-182.lab.eng.nay.redhat.com atomic-openshift-master-api[16574]: E0705 05:23:10.311781       1 reflector.go:201] github.com/openshift/origin/pkg/authorization/generated/informers/inter...
Jul 05 01:23:10 openshift-182.lab.eng.nay.redhat.com atomic-openshift-master-api[16574]: E0705 05:23:10.311998       1 reflector.go:201] github.com/openshift/origin/pkg/authorization/generated/informers/inter...
Jul 05 01:23:10 openshift-182.lab.eng.nay.redhat.com atomic-openshift-master-api[16574]: E0705 05:23:10.312187       1 reflector.go:201] github.com/openshift/origin/pkg/quota/generated/informers/internalversi...
Jul 05 01:23:10 openshift-182.lab.eng.nay.redhat.com atomic-openshift-master-api[16574]: E0705 05:23:10.312730       1 reflector.go:201] github.com/openshift/origin/pkg/authorization/generated/informers/inter...
Jul 05 01:23:10 openshift-182.lab.eng.nay.redhat.com atomic-openshift-master-api[16574]: I0705 05:23:10.349300       1 serve.go:86] Serving securely on 0.0.0.0:8443
Jul 05 01:23:10 openshift-182.lab.eng.nay.redhat.com atomic-openshift-master-api[16574]: F0705 05:23:10.387662       1 controller.go:128] Unable to perform initial IP allocation check: unable to ref...he cluster
Jul 05 01:23:10 openshift-182.lab.eng.nay.redhat.com systemd[1]: atomic-openshift-master-api.service: main process exited, code=exited, status=255/n/a
Hint: Some lines were ellipsized, use -l to show in full.
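
The journal output above is ellipsized; to capture the full fatal message for triage (standard journalctl flags; the time window is illustrative):

# Show complete, unwrapped log lines for the failing service.
journalctl -u atomic-openshift-master-api -l --no-pager --since "2017-07-05 01:23:00"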

Expected results:
The atomic-openshift-master-api service starts successfully on all masters after the upgrade.

Additional info:
The issue may disappear when you reinstall.
Comment 1 Anping Li 2017-07-05 01:52 EDT
Created attachment 1294447 [details]
The inventory and logs

Hit the same issue twice with the same inventory file:

1. Install OCP 3.5 and upgrade to OCP 3.6.133 with openshift-ansible v3.6.133.
   logs-20170705053310-upgrade
2. Install OCP 3.6 with openshift-ansible 3.6.96. logs-20170704145436-config
Comment 2 Scott Dodson 2017-07-05 11:11:17 EDT
I'd be very surprised if this were an installer issue, but let's verify we can reproduce it and then see if we can get some better logs.
Comment 3 Tim Bielawa 2017-07-05 16:31:57 EDT
Couldn't reproduce :-(


Started a fresh 3 node cluster at image tag (as specified in your inventory) v3.5.5.24 and then ran an upgrade with image tag (per your log) v3.6.133.


Initial install landed docker-1.12.6-32.git88a4867.el7.x86_64, upgraded manually to newer docker-1.12.6-32.git88a4867.el7.x86_64. Rebooted. No changes.


All atomic-openshift services are running and happy. Maybe it's just a fluke?
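
For comparison, a quick health check that can be repeated on each master (a minimal sketch, assuming the standard HA service names used by this installation):

# Report whether both master services are active on the local host.
for svc in atomic-openshift-master-api atomic-openshift-master-controllers; do
    echo -n "$svc: "; systemctl is-active "$svc"
done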
Comment 4 Anping Li 2017-07-06 06:19:52 EDT
At least 5 times. Both jiajliu and I have hit this issue. I will try to reproduce it and leave the environment up for inspection.
Comment 5 Anping Li 2017-07-06 10:52:53 EDT
Closing, as I couldn't reproduce it either. I will reopen it when I hit the same issue again.
