Bug 1323218 - Upgrade failed to import image-streams on nativeha env
Summary: Upgrade failed to import image-streams on nativeha env
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Cluster Version Operator
Version: 3.2.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: low
Target Milestone: ---
Assignee: Brenton Leanhardt
QA Contact: Anping Li
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2016-04-01 14:54 UTC by Anping Li
Modified: 2016-05-12 15:23 UTC
CC: 5 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-05-12 15:23:14 UTC
Target Upstream Version:


Attachments
Upgrade failed to import image streams (1.52 MB, text/plain), 2016-04-05 13:21 UTC, Anping Li

Description Anping Li 2016-04-01 14:54:58 UTC
Description of problem:
Upgrade failed to import image-streams on a containerized nativeha env, even though the image stream files exist on disk:

# ll /usr/share/openshift/examples/image-streams/image-streams-rhel7.json
-rw-r--r--. 1 root root 14085 Apr  1 22:33 /usr/share/openshift/examples/image-streams/image-streams-rhel7.json
# oc version
oc v3.1.1.6-33-g81eabcc
kubernetes v1.1.0-origin-1107-g4c8e6f4


Version-Release number of selected component (if applicable):
atomic-openshift-utils-3.0.69-1.git.0.c818db9.el7.noarch


How reproducible:
always

Steps to Reproduce:
1.Install nativeha containerized OSE 3.1 on RHEL
2.Upgrade to OSE 3.2

Actual results:

TASK: [openshift_examples | Import RHEL streams] ******************************
<ha2-master2.example.com> ESTABLISH CONNECTION FOR USER: root
<ha2-master2.example.com> REMOTE_MODULE command oc create -n openshift -f /usr/share/openshift/examples/image-streams/image-streams-rhel7.json
<ha2-master2.example.com> EXEC ssh -C -tt -v -o ControlMaster=auto -o ControlPersist=60s -o ControlPath="/root/.ansible/cp/ansible-ssh-%h-%p-%r" -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o ConnectTimeout=10 ha2-master2.example.com /bin/sh -c 'mkdir -p $HOME/.ansible/tmp/ansible-tmp-1459521200.13-23709753680840 && echo $HOME/.ansible/tmp/ansible-tmp-1459521200.13-23709753680840'
<ha2-master2.example.com> PUT /tmp/tmpf4VYgx TO /root/.ansible/tmp/ansible-tmp-1459521200.13-23709753680840/command
<ha2-master2.example.com> EXEC ssh -C -tt -v -o ControlMaster=auto -o ControlPersist=60s -o ControlPath="/root/.ansible/cp/ansible-ssh-%h-%p-%r" -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o ConnectTimeout=10 ha2-master2.example.com /bin/sh -c 'LANG=C LC_CTYPE=C /usr/bin/python /root/.ansible/tmp/ansible-tmp-1459521200.13-23709753680840/command; rm -rf /root/.ansible/tmp/ansible-tmp-1459521200.13-23709753680840/ >/dev/null 2>&1'
failed: [ha2-master2.example.com] => {"changed": false, "cmd": ["oc", "create", "-n", "openshift", "-f", "/usr/share/openshift/examples/image-streams/image-streams-rhel7.json"], "delta": "0:00:11.891157", "end": "2016-04-01 22:33:31.599988", "failed": true, "failed_when_result": true, "rc": 1, "start": "2016-04-01 22:33:19.708831", "stdout_lines": [], "warnings": []}
stderr:
================================================================================
ATTENTION: You are running oc via a wrapper around 'docker run openshift3/ose'.
This wrapper is intended only to be used to bootstrap an environment. Please
install client tools on another host once you have granted cluster-admin
privileges to a user.
See https://docs.openshift.com/enterprise/latest/cli_reference/get_started_cli.html
=================================================================================

the path "/usr/share/openshift/examples/image-streams/image-streams-rhel7.json" does not exist

FATAL: all hosts have already failed -- aborting

PLAY RECAP ********************************************************************
           to retry, use: --limit @/root/upgrade.retry

ha2-master.example.com     : ok=6    changed=1    unreachable=0    failed=0
ha2-master1.example.com    : ok=97   changed=16   unreachable=0    failed=0
ha2-master2.example.com    : ok=216  changed=35   unreachable=0    failed=1
ha2-master3.example.com    : ok=181  changed=31   unreachable=0    failed=0
ha2-node1.example.com      : ok=84   changed=13   unreachable=0    failed=0
ha2-node2.example.com      : ok=84   changed=13   unreachable=0    failed=0
localhost                  : ok=41   changed=0    unreachable=0    failed=0

Expected results:
The image streams are imported successfully during the upgrade.

Additional info:
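The error above is notable because the wrapper banner shows `oc` running via `docker run openshift3/ose`: a path that exists on the host is not automatically visible inside the container. The following is a minimal, self-contained sketch of that failure mode; `run_in_fake_container`, the temp paths, and the simulated container root are all hypothetical stand-ins, not the real wrapper.

```shell
# Hypothetical sketch (NOT the real openshift3/ose wrapper): a file that exists
# on the host can be invisible to a containerized client, producing a
# "does not exist" error even though `ls` on the host succeeds.
fakeroot=$(mktemp -d)    # stands in for the container's filesystem root

run_in_fake_container() {
  # Resolve the path against the simulated container root, not the host root.
  if [ -f "$fakeroot$1" ]; then
    echo "created from $1"
  else
    echo "the path \"$1\" does not exist" >&2
    return 1
  fi
}

host_file=$(mktemp /tmp/image-streams-demo.XXXXXX)
echo '{}' > "$host_file"

ls -l "$host_file"     # the host sees the file ...
run_in_fake_container "$host_file" \
  || echo "container did not see $host_file (no bind mount)"
```

Bind-mounting the host directory into the simulated root (the analogue of `docker run -v`) makes the same call succeed.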

Comment 1 Brenton Leanhardt 2016-04-04 14:55:24 UTC
I'm not really sure how this happened.  Can you upload your inventory and the entire ansible log from the run?

I see that ha2-master2.example.com is where the job failed.  The way this works is that the example files are copied only to the first master and then the oc commands run there.  I'm worried that somehow things are confused in your environment and the example files were uploaded to master1 yet the commands ran on master2.

I've been running all my local tests in a multi-master environment, so I'm betting you're hitting some sort of edge case.

Comment 2 Jason DeTiberus 2016-04-05 02:11:30 UTC
Looking at the output (and the fact that ha2-master2 has the largest number of completed tasks), it looks like ha2-master2 is the host that was considered oo_first_master for the run.

That said, I agree that it looks like it might be an inventory-related issue, or possibly another issue during the run. Could you also include the full log output of the ansible run?

Comment 3 Anping Li 2016-04-05 13:21:15 UTC
Created attachment 1143838 [details]
Upgrade failed to import image streams

I commented out the first master in the inventory, but I don't think that is the root cause.

[root@anli config]# cat hostnative
[OSEv3:children]
masters
nodes
etcd
lb
nfs

[OSEv3:vars]
ansible_ssh_user=root
openshift_use_openshift_sdn=true
deployment_type=openshift-enterprise
osm_default_subdomain=ha2.example.com
openshift_master_identity_providers=[{'name': 'allow_all', 'login': 'true', 'challenge': 'true', 'kind': 'AllowAllPasswordIdentityProvider'}]
openshift_set_hostname=True
os_sdn_network_plugin_name=redhat/openshift-ovs-multitenant


cli_docker_additional_registries=virt-openshift-05.lab.eng.nay.redhat.com:5000
cli_docker_insecure_registries=virt-openshift-05.lab.eng.nay.redhat.com:5000
openshift_docker_additional_registries=virt-openshift-05.lab.eng.nay.redhat.com:5000
openshift_docker_insecure_registries=virt-openshift-05.lab.eng.nay.redhat.com:5000
#openshift_rolling_restart_mode=system

openshift_hosted_registry_storage_kind=nfs
openshift_hosted_registry_storage_nfs_directory=/var/export/
openshift_hosted_registry_storage_nfs_options='*(rw,sync,all_squash)'
openshift_hosted_registry_storage_volume_name=registry
openshift_hosted_registry_storage_volume_size=2G

openshift_master_cluster_method=native
openshift_master_cluster_hostname=ha2-master.example.com
openshift_master_cluster_public_hostname=ha2-master.example.com

[masters]
#ha2-master1.example.com 
ha2-master2.example.com
ha2-master3.example.com

[etcd]
ha2-master1.example.com
ha2-master2.example.com
ha2-master3.example.com

[nodes]
ha2-master1.example.com  openshift_node_labels="{'region': 'idle', 'zone': 'default'}" openshift_hostname=ha2-master1.example.com openshift_public_hostname=ha2-master1.example.com
ha2-master2.example.com  openshift_node_labels="{'region': 'infra', 'zone': 'default'}" openshift_hostname=ha2-master2.example.com openshift_public_hostname=ha2-master2.example.com openshift_schedulable=true
ha2-master3.example.com  openshift_node_labels="{'region': 'infra', 'zone': 'default'}" openshift_hostname=ha2-master3.example.com openshift_public_hostname=ha2-master3.example.com openshift_schedulable=true
ha2-node1.example.com  openshift_node_labels="{'region': 'primary', 'zone': 'west'}" openshift_hostname=ha2-node1.example.com openshift_public_hostname=ha2-node1.example.com
ha2-node2.example.com  openshift_node_labels="{'region': 'primary', 'zone': 'east'}" openshift_hostname=ha2-node2.example.com openshift_public_hostname=ha2-node2.example.com

[lb]
ha2-master.example.com

[nfs]
ha2-master1.example.co

Comment 4 Anping Li 2016-04-05 13:24:49 UTC
I had tried "oc create -f /usr/share/openshift/examples/image-streams/image-streams-rhel7.json" manually and got the same error. Unfortunately, the env wasn't kept, so I'm not sure what happened.
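A manual retry through the same wrapper would hit the same visibility problem. If the client really is a thin `docker run` wrapper (the image name `openshift3/ose` comes from the banner in the log; whether the real wrapper accepts extra mounts is an assumption), the host examples directory would need to be bind-mounted for the path to resolve. A sketch that only builds the command string, without running docker:

```shell
# Sketch only: build (but do not run) a docker invocation that would make the
# host examples directory visible inside a containerized `oc` client.
# Everything except the paths and image name from the log is illustrative.
examples=/usr/share/openshift/examples
stream=$examples/image-streams/image-streams-rhel7.json
cmd="docker run --rm -v $examples:$examples:ro openshift3/ose \
oc create -n openshift -f $stream"
echo "$cmd"
```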

Comment 5 Brenton Leanhardt 2016-04-12 21:15:05 UTC
I plan to investigate this more tomorrow.  Is this still happening?  I've never seen it happen, and I've installed dozens of multi-master and all-in-one environments in the last week, so it's a bit of a mystery right now.

Comment 6 Anping Li 2016-04-13 10:56:46 UTC
I never hit it again, so I downgraded the severity. It is OK to close it if we can't find the root cause within a short time.

