Description of problem:
When adding nodes with scaleup, the new nodes can't access the registry.

Version-Release number of selected component (if applicable):
Existing cluster:
atomic-openshift-node-3.1.1.6-1.git.0.b57e8bd.el7aos.x86_64
tuned-profiles-atomic-openshift-node-3.1.1.6-1.git.0.b57e8bd.el7aos.x86_64
atomic-openshift-3.1.1.6-1.git.0.b57e8bd.el7aos.x86_64
atomic-openshift-sdn-ovs-3.1.1.6-1.git.0.b57e8bd.el7aos.x86_64
atomic-openshift-clients-3.1.1.6-1.git.0.b57e8bd.el7aos.x86_64
atomic-openshift-utils-3.0.35-1.git.0.6a386dd.el7aos.noarch

New nodes:
atomic-openshift-3.1.1.6-4.git.21.cd70c35.el7aos.x86_64
tuned-profiles-atomic-openshift-node-3.1.1.6-4.git.21.cd70c35.el7aos.x86_64
atomic-openshift-utils-3.0.47-1.git.0.4498ce3.el7aos.noarch
atomic-openshift-clients-3.1.1.6-4.git.21.cd70c35.el7aos.x86_64
atomic-openshift-node-3.1.1.6-4.git.21.cd70c35.el7aos.x86_64
atomic-openshift-sdn-ovs-3.1.1.6-4.git.21.cd70c35.el7aos.x86_64

How reproducible:
The customer hit it twice; I couldn't reproduce it.

Steps to Reproduce:
1. Scale up as described in https://access.redhat.com/solutions/2150381

Actual results:
On new nodes:

builder.go:185] Error: build error: Failed to push image. Response from registry is: unable to ping registry endpoint https://172.30.66.111:5000/v0/
v2 ping attempt failed with error: Get https://172.30.66.111:5000/v2/: dial tcp 172.30.66.111:5000: i/o timeout
v1 ping attempt failed with error: Get https://172.30.66.111:5000/v1/_ping: dial tcp 172.30.66.111:5000: i/o timeout

Expected results:
No issue when scaling up.

Additional info:
Restarting the node service on affected nodes solved the issue.
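The workaround from "Additional info" can be sketched as follows. This is a hedged sketch, not a verified fix: it assumes the node service is the systemd unit `atomic-openshift-node` (the usual name on OSE 3.1 nodes), and it reuses the registry service IP 172.30.66.111 from the build error above; adjust both for your cluster.

```shell
# Assumed workaround on each affected node: restart the node service,
# then re-check whether the registry service IP is reachable.
systemctl restart atomic-openshift-node

# 172.30.66.111:5000 is the registry service IP from the error log above.
# -k: the registry serves a self-signed cert; --max-time avoids the long
# i/o timeout seen in the failing builds.
curl -kfsS --max-time 5 https://172.30.66.111:5000/v2/ \
  && echo "registry reachable" \
  || echo "registry still unreachable"
```

If the endpoint is still unreachable after the restart, the problem is likely elsewhere (SDN/iptables state, or the version mismatch discussed in the comments below).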
I failed to reproduce this as well. Without the ability to reproduce the error, it is not really possible to fix it. Our best theory is that it might be caused by the version mismatch between the old nodes and the new nodes, and a card has been added to fix that issue.
(In reply to Samuel Munilla from comment #1)
> I failed to reproduce this as well. Without the ability to reproduce the
> error, it is not really possible to fix it. Our best theory is that it might
> be caused by the version mismatch between the old nodes and the new nodes,
> and a card has been added to fix that issue.

For reference: https://trello.com/c/66AY6AoS/198-persist-installed-version-across-cluster