Bug 1306429 - unable to start kube-addons after using kube/contrib/ansible on RHELAH
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: kubernetes
Version: 7.2
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Assigned To: Jan Chaloupka
QA Contact: atomic-bugs@redhat.com
Keywords: Extras
Reported: 2016-02-10 15:53 EST by Micah Abbott
Modified: 2016-06-28 08:50 EDT (History)
CC List: 3 users

Doc Type: Bug Fix
Last Closed: 2016-06-28 08:50:51 EDT
Type: Bug


Attachments: None
Comment 5 Jason Brooks 2016-04-26 18:28:53 EDT
I'm encountering this same issue (issue #2) with CentOS Atomic -- turning off validation is what's required to make this work, but the real problem is that our kube 1.2 appears not to support failureThreshold and successThreshold. I've tested w/ the local docker-based kube install (http://kubernetes.io/docs/getting-started-guides/docker/), for instance, and these values work fine, no messing w/ validation required.
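For reference, a minimal liveness-probe sketch using the fields in question (pod and image names are illustrative; only the period/threshold fields matter here):

apiVersion: v1
kind: Pod
metadata:
  name: probe-demo            # illustrative name
spec:
  containers:
  - name: web
    image: nginx              # any image serving HTTP on port 80
    livenessProbe:
      httpGet:
        path: /
        port: 80
      periodSeconds: 10       # these fields were added upstream in 1.2;
      successThreshold: 1     # an apiserver built without them rejects
      failureThreshold: 3     # the manifest at validation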


For issue #1, this has been fixed upstream, I have a PR in to get the fix into the ansible scripts: https://github.com/kubernetes/contrib/pull/797.
Comment 6 Guohua Ouyang 2016-04-29 02:03:23 EDT
(In reply to Jason Brooks from comment #5)
> I'm encountering this same issue (issue #2) with CentOS Atomic -- turning
> off validation is what's required to make this work, but the real problem is
> that our kube 1.2 appears not to support failureThreshold and
> successThreshold. I've tested w/ the local docker-based kube install
> (http://kubernetes.io/docs/getting-started-guides/docker/), for instance,
> and these values work fine, no messing w/ validation required.
> 

https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG.md/#v120

"Liveness and readiness probes now support more configuration parameters: periodSeconds, successThreshold, failureThreshold"

So the problem seems to occur only in our kube 1.2.

> For issue #1, this has been fixed upstream, I have a PR in to get the fix
> into the ansible scripts: https://github.com/kubernetes/contrib/pull/797.
Comment 7 Guohua Ouyang 2016-05-04 20:41:42 EDT
https://github.com/kubernetes/contrib/issues/886
Same problem; it seems he is using an older kubernetes version.

The namespace issue is fixed in kubernetes-1.2.0-0.11.git738b760.el7, but the skydns issue still exists.
Comment 8 Micah Abbott 2016-05-16 12:17:17 EDT
On a freshly provisioned set of RHELAH 7.2.4 nodes, I'm still running into problems using the kubernetes/contrib/ansible script.

With only a few modifications to group_vars/all.yaml to allow for connectivity and prevent network collisions, the cluster comes up but none of the add-ons are started.
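(For context, a hypothetical sketch of the kind of overrides meant here; kube_service_addresses, networking, flannel_subnet and flannel_prefix were variables in the upstream group_vars/all.yaml of that era, but check your checkout, and the addresses below are illustrative:)

# group_vars/all.yaml (excerpt)
kube_service_addresses: 10.254.0.0/16  # keep the service range clear of the host network
networking: flannel
flannel_subnet: 172.16.0.0             # moved off the default to avoid a collision
flannel_prefix: 12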

I see the following in the journal on the master:

May 16 16:12:28 kube-master kube-addons.sh[18808]: Error from server: serviceaccounts "default" not found


But the account is there:

# kubectl get serviceaccounts --all-namespaces
NAMESPACE   NAME      SECRETS   AGE
default     default   1         6m


This may really be a problem with the playbook at this point.
Comment 9 Jason Brooks 2016-05-16 12:24:07 EDT
(In reply to Micah Abbott from comment #8)
> On a freshly provisioned set of RHELAH 7.2.4 nodes, I'm still running into
> problems using the kubernetes/contrib/ansible script.
> 
> With only a few modifications to group_vars/all.yaml to allow for
> connectivity and prevent network collisions, the cluster comes up but none
> of the add-ons are started.
> 
> I see the following in the journal on the master:
> 
> May 16 16:12:28 kube-master kube-addons.sh[18808]: Error from server:
> serviceaccounts "default" not found
> 
> 
> But the account is there:
> 
> # kubectl get serviceaccounts --all-namespaces
> NAMESPACE   NAME      SECRETS   AGE
> default     default   1         6m
> 
> 
> This may really be a problem with the playbook at this point.

Does the issue look like this: https://github.com/kubernetes/kubernetes/issues/23973

I have this PR waiting to address that:
https://github.com/kubernetes/contrib/pull/797
Comment 10 Micah Abbott 2016-05-16 14:22:52 EDT
(In reply to Jason Brooks from comment #9)
> Does the issue look like this:
> https://github.com/kubernetes/kubernetes/issues/23973
> 
> I have this PR waiting to address that:
> https://github.com/kubernetes/contrib/pull/797

Interestingly, I ran into this error first:

https://github.com/kubernetes/kubernetes/issues/25440

So I hacked around that (for better or worse):

$ git diff roles/kubernetes-addons/tasks/main.yml
diff --git a/ansible/roles/kubernetes-addons/tasks/main.yml b/ansible/roles/kubernetes-addons/tasks/main.yml
index b952af1..1bbc200 100644
--- a/ansible/roles/kubernetes-addons/tasks/main.yml
+++ b/ansible/roles/kubernetes-addons/tasks/main.yml
@@ -47,6 +47,13 @@
     http_proxy: "{{ http_proxy|default('') }}"
     https_proxy: "{{ https_proxy|default('') }}"
     no_proxy: "{{ no_proxy|default('') }}"
+
+- name: "HACK: Modify location of namespace file"
+  lineinfile:
+    dest: "{{ kube_script_dir }}/kube-addons.sh"
+    regexp: ^start_addon
+    line: "start_addon {{ kube_config_dir }}/addons/namespace.yaml 100 10 \"\" &"
+    state: present
   notify:
     - restart kube-addons
 



Then I hit the issue you referenced and used your patch to fix that.

And then....it worked!  :)


Still feels like one or more problems in the upstream playbook, so I'm not sure this is truly a problem in RHEL kube.
Comment 11 Jason Brooks 2016-05-16 14:31:40 EDT
(In reply to Micah Abbott from comment #10)

> 
> 
> Still feels like one or more problems in the upstream playbook, so I'm not
> sure this is truly a problem in RHEL kube.

The addons are awkward in general; one of the biggest problems w/ addons and the upstream ansible is that the ansible puts selinux into permissive mode! I've been tracking this and some other issues in my fork of the scripts, and upstreaming bits along the way: https://github.com/kubernetes/contrib/compare/master...jasonbrooks:atomic
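(For illustration, a sketch of one way to put SELinux back into enforcing mode after the upstream play runs, using Ansible's stock selinux module:)

- name: Keep SELinux enforcing rather than permissive
  selinux:
    policy: targeted
    state: enforcing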

There's a different way of starting up the addons for atomic in there, as well as some selinux workarounds (more notes on that at https://gist.github.com/jasonbrooks/4c2b2046443ec7e58b51f55d8e1d6e17). 

I've been thinking that the kube-addons might be better managed (for atomic, at least) as atomicapps...
Comment 12 Guohua Ouyang 2016-05-16 20:11:40 EDT
(In reply to Micah Abbott from comment #8)
> On a freshly provisioned set of RHELAH 7.2.4 nodes, I'm still running into
> problems using the kubernetes/contrib/ansible script.
> 
> With only a few modifications to group_vars/all.yaml to allow for
> connectivity and prevent network collisions, the cluster comes up but none
> of the add-ons are started.
> 
> I see the following in the journal on the master:
> 
> May 16 16:12:28 kube-master kube-addons.sh[18808]: Error from server:
> serviceaccounts "default" not found
> 
> 
> But the account is there:
> 
> # kubectl get serviceaccounts --all-namespaces
> NAMESPACE   NAME      SECRETS   AGE
> default     default   1         6m
> 
> 
> This may really be a problem with the playbook at this point.

The failure is from `kubectl get --namespace=kube-system serviceaccount default`; see line 74 of /usr/libexec/kubernetes/kube-addons.sh.

The 'kube-system' namespace is not created because of
https://github.com/kubernetes/contrib/issues/944
The PR for it is https://github.com/kubernetes/contrib/pull/945.
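(For reference, the namespace the script is waiting for can be created by hand as a workaround; a minimal manifest, applied with `kubectl create -f namespace.yaml`, looks like this:)

apiVersion: v1
kind: Namespace
metadata:
  name: kube-system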

Apart from the namespace issue, the other problem is that the DNS addon is not created; I filed a separate bug for it, bug #1336593.
Comment 13 Jan Chaloupka 2016-06-28 08:50:51 EDT
As per [1], I am able to get the addons up and running by running the ansible playbook against a VM with the latest kubernetes.

Micah, Jason, if you still experience the issue on AH, please open an issue on the kubernetes/contrib repository [2].

Micah, with regard to your patch, part of it is already merged upstream. What is not merged has to be applied before you run the playbook; I do the same as part of Red Hat CI.

Closing the issue, as the problem does not affect kubernetes itself.

[1] https://github.com/kubernetes/contrib/issues/886
[2] https://github.com/kubernetes/contrib
