Description of problem:
Retries should be configurable for the steps where the stack updates, installs, or subscribes RPM packages from a remote server over an intermittent/unstable network.

Version-Release number of selected component (if applicable):
openshift-on-openstack-0.9.4-1.el7.centos.noarch

How reproducible:
50% in my test env

Steps to Reproduce:
1. Create a stack (3 master + 1 node) using an RHN account
   rhn_username: "xxx"
   rhn_password: "xxxxxxxxxxxx"

Actual results:
1) Stack failed at
Oct 21 03:24:11 localhost cloud-init: No Presto metadata available for rhel-7-server-rpms
Oct 21 03:25:41 localhost cloud-init: https://cdn.redhat.com/content/dist/rhel/server/7/7Server/x86_64/os/Packages/libuuid-2.23.2-26.el7_2.3.x86_64.rpm: [Errno 14] curl#35 - "Encountered end of file"
Oct 21 03:25:41 localhost cloud-init: Trying other mirror.
Oct 21 03:25:57 localhost cloud-init: https://cdn.redhat.com/content/dist/rhel/server/7/7Server/x86_64/os/Packages/libteam-1.17-7.el7_2.x86_64.rpm: [Errno 12] Timeout on https://cdn.redhat.com/content/dist/rhel/server/7/7Server/x86_64/os/Packages/libteam-1.17-7.el7_2.x86_64.rpm: (28, 'Operation too slow. Less than 1000 bytes/sec transferred the last 30 seconds')
Oct 21 03:25:57 localhost cloud-init: Trying other mirror.
Oct 21 03:27:43 localhost cloud-init: Error downloading packages:
Oct 21 03:27:43 localhost cloud-init: libuuid-2.23.2-26.el7_2.3.x86_64: [Errno 256] No more mirrors to try.
Oct 21 03:27:43 localhost cloud-init: + notify_failure 'could not update RPMs'

Installing the packages manually on the host succeeded.

2) Stack failed at
Oct 21 00:34:00 localhost cloud-init: Registering to: subscription.rhn.redhat.com:443/subscription
Oct 21 00:34:00 localhost cloud-init: The system has been registered with ID: 33ec5146-f2e8-4f8d-af16-db184ba2c2fb
Oct 21 00:34:00 localhost cloud-init: + '[' -n '' ']'
Oct 21 00:34:00 localhost cloud-init: + subscription-manager attach --auto
Oct 21 00:35:22 localhost cloud-init: Installed Product Current Status:
Oct 21 00:35:22 localhost cloud-init: Product Name: Red Hat Enterprise Linux Server
Oct 21 00:35:22 localhost cloud-init: Status: Not Subscribed
Oct 21 00:35:22 localhost cloud-init: Unable to find available subscriptions for all your installed products.

In fact, the server had been subscribed successfully.

Expected results:
Retries should be configurable for tasks that may fail because of the network (download, subscribe, etc.).

Additional info:
From the logs it seems that the failures occurred during the "yum update" and "subscription-manager" operations. The PR below adds retries to all yum operations and to all subscription-manager operations:
https://github.com/redhat-openstack/openshift-on-openstack/pull/286
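For readers who want a picture of what retrying these operations can look like, here is a minimal shell sketch. It is not taken from the PR: the `retry` function name, its argument order, and the attempt/delay values in the usage comments are all assumptions made for illustration.

```shell
#!/bin/bash
# Hypothetical helper (not the code from the PR): run a command up to
# <attempts> times, sleeping <delay> seconds between failed attempts.
# Usage: retry <attempts> <delay> <command> [args...]
retry() {
    local attempts=$1 delay=$2
    shift 2
    local n=1
    # Re-run the command until it exits with status 0.
    until "$@"; do
        if [ "$n" -ge "$attempts" ]; then
            echo "giving up after $n attempts: $*" >&2
            return 1
        fi
        echo "attempt $n failed, retrying in ${delay}s: $*" >&2
        n=$((n + 1))
        sleep "$delay"
    done
}

# Illustrative usage with the commands seen in the logs; the counts and
# delays are made-up values:
#   retry 5 10 yum -y update || notify_failure 'could not update RPMs'
#   retry 5 10 subscription-manager attach --auto
```

With a wrapper along these lines, a transient "No more mirrors to try" error makes the command run again, and the stack only fails once every attempt has failed.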
Fixed in 0.9.5
Verified with v0.9.5.

The stack can be created successfully under random networking issues. The cloud-init logs still show obvious network errors, but thanks to the retry policy they no longer break stack creation.

# cat /var/log/cloud-init.log
<--snip-->
Nov 1 23:09:04 localhost cloud-init: Downloading packages:
Nov 1 23:09:04 localhost cloud-init: No Presto metadata available for rhel-7-server-rpms
Nov 1 23:10:35 localhost cloud-init: https://cdn.redhat.com/content/dist/rhel/server/7/7Server/x86_64/os/Packages/mariadb-libs-5.5.50-1.el7_2.x86_64.rpm: [Errno 14] curl#35 - "Encountered end of file"
Nov 1 23:10:35 localhost cloud-init: Trying other mirror.
Nov 1 23:10:51 localhost cloud-init: https://cdn.redhat.com/content/dist/rhel/server/7/7Server/x86_64/os/Packages/logrotate-3.8.6-7.el7_2.x86_64.rpm: [Errno 12] Timeout on https://cdn.redhat.com/content/dist/rhel/server/7/7Server/x86_64/os/Packages/logrotate-3.8.6-7.el7_2.x86_64.rpm: (28, 'Operation too slow. Less than 1000 bytes/sec transferred the last 30 seconds')
Nov 1 23:10:51 localhost cloud-init: Trying other mirror.
Nov 1 23:12:26 localhost cloud-init: Error downloading packages:
Nov 1 23:12:26 localhost cloud-init: 1:mariadb-libs-5.5.50-1.el7_2.x86_64: [Errno 256] No more mirrors to try.
Nov 1 23:12:28 localhost cloud-init: Loaded plugins: product-id, search-disabled-repos, subscription-manager
Nov 1 23:12:32 localhost cloud-init: Resolving Dependencies
<--snip-->