Bug 1623989
Summary: | Openshift-on-OpenStack: document required Octavia LB timeout values for OCP 3.10 and OCP 3.11 on OSP 13 | ||
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Jon Uriarte <juriarte> |
Component: | Installer | Assignee: | Luis Tomas Bolivar <ltomasbo> |
Status: | CLOSED DUPLICATE | QA Contact: | Jon Uriarte <juriarte> |
Severity: | high | Docs Contact: | |
Priority: | high | ||
Version: | 3.10.0 | CC: | aos-bugs, jokerman, ltomasbo, mmccomas, rlopez, tsedovic |
Target Milestone: | --- | Keywords: | Reopened |
Target Release: | 3.10.z | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2019-03-12 10:27:47 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | 1636496, 1669078, 1685481 | ||
Bug Blocks: |
Description
Jon Uriarte
2018-08-30 15:34:34 UTC
Adding logs from infra-node-0.openshift.example.com where the pod docker-registry-1-rg9jj creation failed:

Aug 30 10:51:32 infra-node-0 atomic-openshift-node: I0830 10:51:32.049784 28245 kubelet.go:1869] SyncLoop (ADD, "api"): "docker-registry-1-rg9jj_default(2f05e71c-ac64-11e8-8a03-fa163ee43a34)"
Aug 30 10:51:32 infra-node-0 atomic-openshift-node: I0830 10:51:32.194738 28245 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "registry-certificates" (UniqueName: "kubernetes.io/secret/2f05e71c-ac64-11e8-8a03-fa163ee43a34-registry-certificates") pod "docker-registry-1-rg9jj" (UID: "2f05e71c-ac64-11e8-8a03-fa163ee43a34")
Aug 30 10:51:32 infra-node-0 atomic-openshift-node: I0830 10:51:32.194835 28245 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "registry-storage" (UniqueName: "kubernetes.io/empty-dir/2f05e71c-ac64-11e8-8a03-fa163ee43a34-registry-storage") pod "docker-registry-1-rg9jj" (UID: "2f05e71c-ac64-11e8-8a03-fa163ee43a34")
Aug 30 10:51:32 infra-node-0 atomic-openshift-node: I0830 10:51:32.194879 28245 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "registry-token-5rdwt" (UniqueName: "kubernetes.io/secret/2f05e71c-ac64-11e8-8a03-fa163ee43a34-registry-token-5rdwt") pod "docker-registry-1-rg9jj" (UID: "2f05e71c-ac64-11e8-8a03-fa163ee43a34")
Aug 30 10:51:32 infra-node-0 atomic-openshift-node: I0830 10:51:32.295272 28245 reconciler.go:252] operationExecutor.MountVolume started for volume "registry-certificates" (UniqueName: "kubernetes.io/secret/2f05e71c-ac64-11e8-8a03-fa163ee43a34-registry-certificates") pod "docker-registry-1-rg9jj" (UID: "2f05e71c-ac64-11e8-8a03-fa163ee43a34")
Aug 30 10:51:32 infra-node-0 atomic-openshift-node: I0830 10:51:32.295407 28245 reconciler.go:252] operationExecutor.MountVolume started for volume "registry-storage" (UniqueName: "kubernetes.io/empty-dir/2f05e71c-ac64-11e8-8a03-fa163ee43a34-registry-storage") pod "docker-registry-1-rg9jj" (UID: "2f05e71c-ac64-11e8-8a03-fa163ee43a34")
Aug 30 10:51:32 infra-node-0 atomic-openshift-node: I0830 10:51:32.295440 28245 reconciler.go:252] operationExecutor.MountVolume started for volume "registry-token-5rdwt" (UniqueName: "kubernetes.io/secret/2f05e71c-ac64-11e8-8a03-fa163ee43a34-registry-token-5rdwt") pod "docker-registry-1-rg9jj" (UID: "2f05e71c-ac64-11e8-8a03-fa163ee43a34")
Aug 30 10:51:32 infra-node-0 atomic-openshift-node: I0830 10:51:32.296654 28245 operation_generator.go:555] MountVolume.SetUp succeeded for volume "registry-storage" (UniqueName: "kubernetes.io/empty-dir/2f05e71c-ac64-11e8-8a03-fa163ee43a34-registry-storage") pod "docker-registry-1-rg9jj" (UID: "2f05e71c-ac64-11e8-8a03-fa163ee43a34")
Aug 30 10:51:32 infra-node-0 atomic-openshift-node: I0830 10:51:32.316504 28245 operation_generator.go:555] MountVolume.SetUp succeeded for volume "registry-certificates" (UniqueName: "kubernetes.io/secret/2f05e71c-ac64-11e8-8a03-fa163ee43a34-registry-certificates") pod "docker-registry-1-rg9jj" (UID: "2f05e71c-ac64-11e8-8a03-fa163ee43a34")
Aug 30 10:51:32 infra-node-0 atomic-openshift-node: I0830 10:51:32.329506 28245 operation_generator.go:555] MountVolume.SetUp succeeded for volume "registry-token-5rdwt" (UniqueName: "kubernetes.io/secret/2f05e71c-ac64-11e8-8a03-fa163ee43a34-registry-token-5rdwt") pod "docker-registry-1-rg9jj" (UID: "2f05e71c-ac64-11e8-8a03-fa163ee43a34")
Aug 30 10:51:32 infra-node-0 atomic-openshift-node: I0830 10:51:32.384930 28245 kuberuntime_manager.go:385] No sandbox for pod "docker-registry-1-rg9jj_default(2f05e71c-ac64-11e8-8a03-fa163ee43a34)" can be found. Need to start a new one
Aug 30 10:51:32 infra-node-0 atomic-openshift-node: + docker exec --env 'CNI_ARGS=IgnoreUnknown=1;K8S_POD_NAMESPACE=default;K8S_POD_NAME=docker-registry-1-rg9jj;K8S_POD_INFRA_CONTAINER_ID=31a90463a12df537f53f1d2a7f1b3e0aa186055c3140b838ecfc2028cfa55870' --env CNI_COMMAND=ADD --env CNI_IFNAME=eth0 --env CNI_NETNS=/proc/29328/ns/net --env CNI_CONTAINERID=31a90463a12df537f53f1d2a7f1b3e0aa186055c3140b838ecfc2028cfa55870 --env CNI_PATH=/opt/kuryr-cni/bin:/opt/cni/bin -i 0a99cd97702f363f1246ff43866f77345296f9e4c282bb2047616812f51f0d53 kuryr-cni --config-file /etc/kuryr/kuryr.conf
Aug 30 10:51:33 infra-node-0 atomic-openshift-node: I0830 10:51:33.405982 28245 kubelet.go:1876] SyncLoop (UPDATE, "api"): "docker-registry-1-rg9jj_default(2f05e71c-ac64-11e8-8a03-fa163ee43a34)"
Aug 30 10:51:33 infra-node-0 atomic-openshift-node: I0830 10:51:33.514494 28245 kubelet.go:1914] SyncLoop (PLEG): "docker-registry-1-rg9jj_default(2f05e71c-ac64-11e8-8a03-fa163ee43a34)", event: &pleg.PodLifecycleEvent{ID:"2f05e71c-ac64-11e8-8a03-fa163ee43a34", Type:"ContainerStarted", Data:"31a90463a12df537f53f1d2a7f1b3e0aa186055c3140b838ecfc2028cfa55870"}
Aug 30 10:51:33 infra-node-0 atomic-openshift-node: {'CNI_IFNAME': 'eth0', 'CNI_NETNS': '/proc/29328/ns/net', 'CNI_PATH': '/opt/kuryr-cni/bin:/opt/cni/bin', 'CNI_ARGS': 'IgnoreUnknown=1;K8S_POD_NAMESPACE=default;K8S_POD_NAME=docker-registry-1-rg9jj;K8S_POD_INFRA_CONTAINER_ID=31a90463a12df537f53f1d2a7f1b3e0aa186055c3140b838ecfc2028cfa55870', 'CNI_DAEMON': 'True', 'CNI_CONFIG_DIR_PATH': '/etc/cni/net.d', 'CNI_COMMAND': 'ADD', 'CNI_CONTAINERID': '31a90463a12df537f53f1d2a7f1b3e0aa186055c3140b838ecfc2028cfa55870', 'CNI_BIN_DIR_PATH': '/opt/cni/bin', 'config_kuryr': {u'debug': True, u'cniVersion': u'0.3.0', u'type': u'kuryr-cni', u'kuryr_conf': u'/etc/kuryr/kuryr.conf', u'name': u'kuryr'}} _make_request /usr/lib/python2.7/site-packages/kuryr_kubernetes/cni/api.py:169
Aug 30 10:54:09 infra-node-0 atomic-openshift-node: I0830 10:54:09.807448 28245 kubelet.go:1914] SyncLoop (PLEG): "docker-registry-1-rg9jj_default(2f05e71c-ac64-11e8-8a03-fa163ee43a34)", event: &pleg.PodLifecycleEvent{ID:"2f05e71c-ac64-11e8-8a03-fa163ee43a34", Type:"ContainerStarted", Data:"934b8227048c27591bc326362ddbc70a257ee9517fa1691b459a550faa9d4a3a"}
Aug 30 10:54:11 infra-node-0 atomic-openshift-node: I0830 10:54:11.345338 28245 kubelet.go:1885] SyncLoop (DELETE, "api"): "docker-registry-1-rg9jj_default(2f05e71c-ac64-11e8-8a03-fa163ee43a34)"
Aug 30 10:54:12 infra-node-0 atomic-openshift-node: + docker exec --env 'CNI_ARGS=IgnoreUnknown=1;K8S_POD_NAMESPACE=default;K8S_POD_NAME=docker-registry-1-rg9jj;K8S_POD_INFRA_CONTAINER_ID=31a90463a12df537f53f1d2a7f1b3e0aa186055c3140b838ecfc2028cfa55870' --env CNI_COMMAND=DEL --env CNI_IFNAME=eth0 --env CNI_NETNS=/proc/29328/ns/net --env CNI_CONTAINERID=31a90463a12df537f53f1d2a7f1b3e0aa186055c3140b838ecfc2028cfa55870 --env CNI_PATH=/opt/kuryr-cni/bin:/opt/cni/bin -i 0a99cd97702f363f1246ff43866f77345296f9e4c282bb2047616812f51f0d53 kuryr-cni --config-file /etc/kuryr/kuryr.conf
Aug 30 10:54:13 infra-node-0 atomic-openshift-node: {'CNI_IFNAME': 'eth0', 'CNI_NETNS': '/proc/29328/ns/net', 'CNI_PATH': '/opt/kuryr-cni/bin:/opt/cni/bin', 'CNI_ARGS': 'IgnoreUnknown=1;K8S_POD_NAMESPACE=default;K8S_POD_NAME=docker-registry-1-rg9jj;K8S_POD_INFRA_CONTAINER_ID=31a90463a12df537f53f1d2a7f1b3e0aa186055c3140b838ecfc2028cfa55870', 'CNI_DAEMON': 'True', 'CNI_CONFIG_DIR_PATH': '/etc/cni/net.d', 'CNI_COMMAND': 'DEL', 'CNI_CONTAINERID': '31a90463a12df537f53f1d2a7f1b3e0aa186055c3140b838ecfc2028cfa55870', 'CNI_BIN_DIR_PATH': '/opt/cni/bin', 'config_kuryr': {u'debug': True, u'cniVersion': u'0.3.0', u'type': u'kuryr-cni', u'kuryr_conf': u'/etc/kuryr/kuryr.conf', u'name': u'kuryr'}} _make_request /usr/lib/python2.7/site-packages/kuryr_kubernetes/cni/api.py:169
Aug 30 10:54:13 infra-node-0 atomic-openshift-node: I0830 10:54:13.994712 28245 kubelet.go:1914] SyncLoop (PLEG): "docker-registry-1-rg9jj_default(2f05e71c-ac64-11e8-8a03-fa163ee43a34)", event: &pleg.PodLifecycleEvent{ID:"2f05e71c-ac64-11e8-8a03-fa163ee43a34", Type:"ContainerDied", Data:"934b8227048c27591bc326362ddbc70a257ee9517fa1691b459a550faa9d4a3a"}
Aug 30 10:54:13 infra-node-0 atomic-openshift-node: I0830 10:54:13.994823 28245 kubelet.go:1914] SyncLoop (PLEG): "docker-registry-1-rg9jj_default(2f05e71c-ac64-11e8-8a03-fa163ee43a34)", event: &pleg.PodLifecycleEvent{ID:"2f05e71c-ac64-11e8-8a03-fa163ee43a34", Type:"ContainerDied", Data:"31a90463a12df537f53f1d2a7f1b3e0aa186055c3140b838ecfc2028cfa55870"}
Aug 30 10:54:15 infra-node-0 atomic-openshift-node: I0830 10:54:15.024339 28245 kubelet.go:1885] SyncLoop (DELETE, "api"): "docker-registry-1-rg9jj_default(2f05e71c-ac64-11e8-8a03-fa163ee43a34)"
Aug 30 10:54:15 infra-node-0 atomic-openshift-node: I0830 10:54:15.034589 28245 kubelet.go:1879] SyncLoop (REMOVE, "api"): "docker-registry-1-rg9jj_default(2f05e71c-ac64-11e8-8a03-fa163ee43a34)"
Aug 30 10:54:15 infra-node-0 atomic-openshift-node: I0830 10:54:15.034634 28245 kubelet.go:2081] Failed to delete pod "docker-registry-1-rg9jj_default(2f05e71c-ac64-11e8-8a03-fa163ee43a34)", err: pod not found

This seems to be a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=1618685.

Jon and I confirmed that it is a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=1618685, as Michal suggested. The workaround is to raise the Octavia K8s-API LB timeout so that the connection is not closed while waiting for the pods to become ready.
The steps for the workaround on OSP 13 to increase the Octavia LB timeouts are:

1. Log in to an overcloud controller.
2. Copy the files from https://github.com/openstack/octavia/tree/stable/queens/octavia/common/jinja/haproxy/templates into /var/lib/config-data/puppet-generated/octavia/.
3. Modify the base.j2 file to increase the default timeouts (for instance from 50000 to 500000, i.e., from 50 seconds to 500 seconds): https://github.com/openstack/octavia/blob/stable/queens/octavia/common/jinja/haproxy/templates/base.j2#L42-L43
4. Restart the octavia-worker container.
5. Trigger the openshift-ansible provisioning playbooks.

*** This bug has been marked as a duplicate of bug 1618685 ***

Oops, sorry: after step 3, another step is missing to change the octavia.conf configuration to point to the template:

3.1. Edit the file /var/lib/config-data/puppet-generated/octavia/etc/octavia/octavia.conf so it points to the haproxy template just copied (note it is mounted in a different directory inside the container):

[haproxy_amphora]
haproxy_template = /var/lib/kolla/config_files/src/haproxy.cfg.j2

Regarding the changes: since they are not made via Director, they would be lost on an update or upgrade of the OSP environment. What parameters need to be set in an overcloud update for them to persist?

Re-opening this BZ: the fix in [1] will not be backported to OCP 3.10 and OCP 3.11, so the Octavia default timeouts for load balancers need to be increased. The change must be made via Director in order to persist across updates and upgrades. The documentation in the OCP 3.10 and OCP 3.11 OpenStack playbooks needs to be updated accordingly to reflect these changes.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1618685

Filed a BZ against OSP 13 [1] requesting support for changing the Octavia LB timeouts in TripleO.
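Concretely, steps 3 and 3.1 amount to the following two edits on the controller. This is a sketch: the 500000 ms values are just the example from this comment, and the base.j2 excerpt assumes the stock stable/queens `defaults` block referenced above.

```
# base.j2 (defaults section), copied under
# /var/lib/config-data/puppet-generated/octavia/
# -- raise the client/server inactivity timeouts from 50 s to 500 s
#    (haproxy timeout values are in milliseconds)
timeout client 500000
timeout server 500000

# /var/lib/config-data/puppet-generated/octavia/etc/octavia/octavia.conf
# -- note the path is as seen from *inside* the octavia-worker container
[haproxy_amphora]
haproxy_template = /var/lib/kolla/config_files/src/haproxy.cfg.j2
```

After both edits, restarting the octavia-worker container (step 4) makes it render new amphora configurations from the modified template.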
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1636496

Closing the bug, as the documentation update has already been handled in https://bugzilla.redhat.com/show_bug.cgi?id=1685481.

*** This bug has been marked as a duplicate of bug 1685481 ***
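For reference, the Director-based (persistent) change requested in bug 1636496 takes the form of an overcloud environment file. The sketch below is hypothetical: the OctaviaTimeout* parameter names assume the tripleo-heat-templates support added for that bug, and the 20-minute values are illustrative only.

```yaml
# octavia-timeouts.yaml -- hedged example; values are in milliseconds
parameter_defaults:
  OctaviaTimeoutClientData: 1200000   # frontend (client) inactivity timeout
  OctaviaTimeoutMemberData: 1200000   # backend (member) inactivity timeout
```

Passing such a file with `-e` to `openstack overcloud deploy` would keep the timeouts across updates and upgrades, unlike the manual template edit described in the workaround.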