Created attachment 1219302
deployer_pod_log

Description of problem:
Upgrading the logging stack from the 3.2.0 level to the 3.4.0 level failed with:
Unable to find log message from cluster.service from pod logging-es-3bjvollr-4-mhyt5 within 300 seconds

# oc get po
NAME                          READY     STATUS             RESTARTS   AGE
logging-curator-1-rbae8       0/1       CrashLoopBackOff   4          18m
logging-deployer-cwpmt        0/1       Error              0          22m
logging-deployer-pdkwp        0/1       Completed          0          31m
logging-es-3bjvollr-4-mhyt5   0/1       CrashLoopBackOff   8          17m
logging-fluentd-f31ok         1/1       Running            0          18m

Version-Release number of selected component (if applicable):
brew registry:
openshift3/logging-deployer    3.4.0    c364ab9c2f75

# openshift version
openshift v3.4.0.23+24b1a58
kubernetes v1.4.0+776c994
etcd 3.1.0-rc.0

How reproducible:
Always

Steps to Reproduce:
1. Install openshift 3.2.0
2. Deploy logging at the 3.2.0 level:
IMAGE_PREFIX=brew...:xxxx/openshift3/,IMAGE_VERSION=3.2.0,MODE=install
3. Upgrade the logging stack:
$ oadm policy add-cluster-role-to-user cluster-admin xiazhao
$ oc delete template logging-deployer-account-template logging-deployer-template
$ oc create -f https://raw.githubusercontent.com/openshift/origin-aggregated-logging/master/deployer/deployer.yaml
$ oc new-app logging-deployer-account-template
$ oc get template logging-deployer-template -o yaml -n logging | sed 's/\(image:\s.*\)logging-deployment\(.*\)/\1logging-deployer\2/g' | oc apply -n logging -f -
$ oc policy add-role-to-user edit --serviceaccount logging-deployer
$ oc policy add-role-to-user daemonset-admin --serviceaccount logging-deployer
$ oadm policy add-cluster-role-to-user oauth-editor system:serviceaccount:logging:logging-deployer
$ oadm policy add-cluster-role-to-user rolebinding-reader system:serviceaccount:logging:aggregated-logging-elasticsearch
$ oc new-app logging-deployer-template -p \
PUBLIC_MASTER_URL=https://{master-domain}:8443,ENABLE_OPS_CLUSTER=false,IMAGE_PREFIX=brew...:xxxx/openshift3/,IMAGE_VERSION=3.4.0,ES_INSTANCE_RAM=1G,ES_CLUSTER_SIZE=1,KIBANA_HOSTNAME={kibana-route},KIBANA_OPS_HOSTNAME={kibana-ops-route},MASTER_URL=https://{master-domain}:8443,MODE=upgrade
4. Check the upgrade result

Actual results:
4. Upgrade failed

Expected results:
Upgraded to 3.4.0 successfully

Additional info:
deployer pod logs attached
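
For step 4, one quick way to check the result is to filter the `oc get po` listing for pods that are neither Running nor Completed. The following is only a sketch: `list_unhealthy_pods` is a hypothetical helper, not part of the deployer, and it assumes the default five-column output shown above, with STATUS in the third column.

```shell
# Hypothetical helper (not part of the logging deployer).
# Reads `oc get po` output on stdin, skips the header row, and prints
# the name and status of every pod whose STATUS column (assumed to be
# column 3) is neither Running nor Completed.
list_unhealthy_pods() {
  awk 'NR > 1 && $3 != "Running" && $3 != "Completed" { print $1, $3 }'
}

# Example usage (requires a logged-in oc client):
#   oc get po -n logging | list_unhealthy_pods
```

A pod that reports Running with READY 0/1 would pass this filter, so an empty result is a necessary check rather than proof that the upgrade succeeded.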
This looks to be a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=1393769

The deployer failed while waiting for the EFK components to scale up. The error message differs because there was a window of time in which the deployer could see that the ES pod had started, but it could not find a message in the pod's logs confirming that the service was available.
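
The kind of wait described above can be sketched as a polling loop with a deadline. This is not the deployer's actual code: `wait_for_log_message` is a hypothetical helper, and the pattern and the 5-second polling interval in the usage line are assumptions taken from, or chosen to match, the error text in this report.

```shell
# Hypothetical helper (a sketch, not the deployer's implementation).
# Usage: wait_for_log_message PATTERN TIMEOUT_SECS INTERVAL_SECS CMD [ARGS...]
# Runs CMD repeatedly and returns 0 as soon as its output contains
# PATTERN; returns 1 once TIMEOUT_SECS have elapsed without a match.
wait_for_log_message() {
  local pattern=$1 timeout=$2 interval=$3
  shift 3
  local deadline=$(( $(date +%s) + timeout ))
  while :; do
    # Errors from CMD (for example, a pod that is not up yet) are
    # ignored so that the loop keeps polling until the deadline.
    if "$@" 2>/dev/null | grep -q -- "$pattern"; then
      return 0
    fi
    if [ "$(date +%s)" -ge "$deadline" ]; then
      return 1
    fi
    sleep "$interval"
  done
}

# Example usage (requires a logged-in oc client; pattern is assumed):
#   wait_for_log_message 'cluster.service' 300 5 oc logs logging-es-3bjvollr-4-mhyt5
```

With a pod stuck in CrashLoopBackOff, as in the original report, a loop like this would keep polling and then give up at the deadline, which matches the "within 300 seconds" failure.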
It's fixed. Tested with the latest 3.4.0 deployer image: the upgrade succeeded, and the Kibana and Kibana ops UIs are accessible and show log entries:

$ oc get po
NAME                              READY     STATUS      RESTARTS   AGE
logging-curator-1-n27sm           1/1       Running     0          4m
logging-curator-ops-1-izno3       1/1       Running     0          4m
logging-deployer-o8b77            0/1       Completed   0          10m
logging-deployer-r8kpd            0/1       Completed   0          6m
logging-es-flruj8ta-4-4gnaz       1/1       Running     0          4m
logging-es-ops-rpbmoj63-4-mgqhe   1/1       Running     0          4m
logging-fluentd-qxxjh             1/1       Running     0          4m
logging-kibana-2-j5ohm            2/2       Running     0          3m
logging-kibana-ops-3-9pr69        2/2       Running     0          3m

I'm not sure why the log output of the upgrade pod was cut short by a "short write" error:

$ oc logs -f logging-deployer-r8kpd
++ oc get dc -l logging-infra=elasticsearch -o 'jsonpath={.items[*].metadata.name}'
+ for dc in '$(oc get dc -l $label -o jsonpath='\''{.items[*].metadata.name}'\'')'
+ patchDCImage logging-es-flruj8ta logging-elasticsearch false
+ local dc=logging-es-flruj8ta
+ local image=logging-elasticsearch
+ local kibana=false
++ oc get dc/logging-es-flruj8ta -o 'jsonpath={.status.latestVersion}'
+ local version=1
+ local authProxy_patch
+ '[' false = true ']'
+ patchIfValid dc/logging-es-flruj8ta '{.spec.template.spec.containers[0].image}=brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/logging-elasticsearch:3.4.0 '
error: short write

# openshift version
openshift v3.4.0.25+1f36858
kubernetes v1.4.0+776c994
etcd 3.1.0-rc.0

Images tested with:
brew....:xxxx/openshift3/logging-deployer    3.4.0    08eaf2753130    2 days ago    764.3 MB
Prerelease issue, no docs needed.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:0066