Bug 1461374 - Failed to attach the cinder volume after upgrade
Status: VERIFIED
Product: OpenShift Container Platform
Classification: Red Hat
Component: Storage
Version: 3.6.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Assigned To: hchen
QA Contact: Jianwei Hou
Reported: 2017-06-14 06:28 EDT by Anping Li
Modified: 2017-06-22 23:30 EDT
Type: Bug
Description Anping Li 2017-06-14 06:28:50 EDT
Description of problem:
After upgrading to v3.6, some pods cannot start because their Cinder volumes cannot be attached. The error message is: "Expected HTTP response code [200 204] when accessing [GET http://10.14.5.228:8776/], but got 300 instead"


Version-Release number of selected component (if applicable):
openshift-ansible-3.6.99
openstack-cinder-8.1.0-1.el7ost.noarch
The OCP version before upgrade: atomic-openshift-3.5.5.24
The OCP version after upgrade: atomic-openshift-3.6.106

How reproducible:
always

Steps to Reproduce:
1. Install OCP v3.5 on OpenStack and enable the cloud provider.
2. Create an application that uses a Cinder volume.
3. Upgrade to v3.6.
4. Check the pod status after the upgrade.
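
Step 4 can be scripted. The following is a minimal Python sketch and is not part of the original report. It assumes the pod list was obtained with `oc get pods --all-namespaces -o json` and loaded into a dictionary. The helper name `pending_pods` and the sample data (including the router pod name) are illustrative only.

```python
def pending_pods(pod_list):
    """Return (namespace, name) for every pod whose phase is Pending."""
    return [
        (pod["metadata"]["namespace"], pod["metadata"]["name"])
        for pod in pod_list.get("items", [])
        if pod.get("status", {}).get("phase") == "Pending"
    ]


# Hypothetical sample shaped like the JSON pod list; in practice this would
# come from json.loads() applied to the output of the oc command.
SAMPLE = {
    "kind": "List",
    "items": [
        {
            "metadata": {"namespace": "default", "name": "router-1-abcde"},
            "status": {"phase": "Running"},
        },
        {
            "metadata": {"namespace": "default", "name": "docker-registry-2-mk96x"},
            "status": {"phase": "Pending"},
        },
    ],
}

for namespace, name in pending_pods(SAMPLE):
    print("%s/%s is Pending" % (namespace, name))
```

Pods reported here are the ones worth inspecting with oc describe pod, since a volume that fails to attach leaves the pod in the Pending state.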

Actual results:
The volume backing the PVC cannot be attached. oc describe pod shows the following messages:

[root@openshift-147 ~]# oc describe pod docker-registry-2-mk96x
Name:            docker-registry-2-mk96x
Namespace:        default
Security Policy:    hostnetwork
Node:            openshift-127.lab.sjc.redhat.com/192.168.2.98
Start Time:        Wed, 14 Jun 2017 05:25:21 -0400
Labels:            deployment=docker-registry-2
            deploymentconfig=docker-registry
            docker-registry=default
Annotations:        kubernetes.io/created-by={"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ReplicationController","namespace":"default","name":"docker-registry-2","uid":"312fcac9-4f1a-11e7-81ef-fa1...
            openshift.io/deployment-config.latest-version=2
            openshift.io/deployment-config.name=docker-registry
            openshift.io/deployment.name=docker-registry-2
            openshift.io/scc=hostnetwork
Status:            Pending
IP:            
Controllers:        ReplicationController/docker-registry-2
Containers:
  registry:
    Container ID:    
    Image:        brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/ose-docker-registry:v3.5.5.24
    Image ID:        
    Port:        5000/TCP
    State:        Waiting
      Reason:        ContainerCreating
    Ready:        False
    Restart Count:    0
    Requests:
      cpu:    100m
      memory:    256Mi
    Liveness:    http-get http://:5000/healthz delay=10s timeout=5s period=10s #success=1 #failure=3
    Readiness:    http-get http://:5000/healthz delay=0s timeout=5s period=10s #success=1 #failure=3
    Environment:
      REGISTRY_HTTP_ADDR:                    :5000
      REGISTRY_HTTP_NET:                    tcp
      REGISTRY_HTTP_SECRET:                    3Twx9Icj+ukwYYrZ+9cPcyFzYHEdYhFPleaUe+aW4Lg=
      REGISTRY_MIDDLEWARE_REPOSITORY_OPENSHIFT_ENFORCEQUOTA:    false
    Mounts:
      /registry from registry-storage (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from registry-token-96h28 (ro)
Conditions:
  Type        Status
  Initialized     True 
  Ready     False 
  PodScheduled     True 
Volumes:
  registry-storage:
    Type:    PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:    platform-pvc
    ReadOnly:    false
  registry-token-96h28:
    Type:    Secret (a volume populated by a Secret)
    SecretName:    registry-token-96h28
    Optional:    false
QoS Class:    Burstable
Node-Selectors:    registry=enabled
        role=node
Tolerations:    <none>
Events:
  FirstSeen    LastSeen    Count    From            SubObjectPath    Type        Reason        Message
  ---------    --------    -----    ----            -------------    --------    ------        -------
  1m        1m        1    default-scheduler            Normal        Scheduled    Successfully assigned docker-registry-2-mk96x to openshift-127.lab.sjc.redhat.com
  3s        3s        1    attachdetach                Warning        FailedMount    Failed to attach volume "pvc-2de6038b-4f1a-11e7-81ef-fa163e97878a" on node "openshift-127.lab.sjc.redhat.com" with: Expected HTTP response code [200 204] when accessing [GET http://10.14.5.228:8776/], but got 300 instead
{"versions": [{"status": "SUPPORTED", "updated": "2014-06-28T12:20:21Z", "links": [{"href": "http://docs.openstack.org/", "type": "text/html", "rel": "describedby"}, {"href": "http://10.14.5.228:8776/v1/", "rel": "self"}], "min_version": "", "version": "", "media-types": [{"base": "application/json", "type": "application/vnd.openstack.volume+json;version=1"}, {"base": "application/xml", "type": "application/vnd.openstack.volume+xml;version=1"}], "id": "v1.0"}, {"status": "SUPPORTED", "updated": "2014-06-28T12:20:21Z", "links": [{"href": "http://docs.openstack.org/", "type": "text/html", "rel": "describedby"}, {"href": "http://10.14.5.228:8776/v2/", "rel": "self"}], "min_version": "", "version": "", "media-types": [{"base": "application/json", "type": "application/vnd.openstack.volume+json;version=1"}, {"base": "application/xml", "type": "application/vnd.openstack.volume+xml;version=1"}], "id": "v2.0"}, {"status": "CURRENT", "updated": "2016-02-08T12:20:21Z", "links": [{"href": "http://docs.openstack.org/", "type": "text/html", "rel": "describedby"}, {"href": "http://10.14.5.228:8776/v3/", "rel": "self"}], "min_version": "3.0", "version": "3.0", "media-types": [{"base": "application/json", "type": "application/vnd.openstack.volume+json;version=1"}, {"base": "application/xml", "type": "application/vnd.openstack.volume+xml;version=1"}], "id": "v3.0"}]}
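
The JSON body in the event above is the version list that the Cinder API root returned together with HTTP 300 (Multiple Choices). The following Python sketch shows one way a client could read that list instead of treating 300 as an error. It is an illustration under assumptions, not the fix from the product: the function names are hypothetical, the body is trimmed to the fields the sketch reads, and picking the version marked CURRENT is only an example, since the report does not say which API version the cloud provider should use.

```python
import json

# Body trimmed from the event above: only the fields this sketch reads are kept.
BODY = """
{"versions": [
  {"id": "v1.0", "status": "SUPPORTED",
   "links": [{"href": "http://docs.openstack.org/", "rel": "describedby"},
             {"href": "http://10.14.5.228:8776/v1/", "rel": "self"}]},
  {"id": "v2.0", "status": "SUPPORTED",
   "links": [{"href": "http://docs.openstack.org/", "rel": "describedby"},
             {"href": "http://10.14.5.228:8776/v2/", "rel": "self"}]},
  {"id": "v3.0", "status": "CURRENT",
   "links": [{"href": "http://docs.openstack.org/", "rel": "describedby"},
             {"href": "http://10.14.5.228:8776/v3/", "rel": "self"}]}
]}
"""


def list_endpoints(status_code, body):
    """Map version id -> (status, self URL) from a version-list response.

    Accepts 300 (Multiple Choices) as well as 200 rather than failing on 300.
    """
    if status_code not in (200, 300):
        raise ValueError("unexpected HTTP status %d" % status_code)
    endpoints = {}
    for version in json.loads(body)["versions"]:
        self_links = [link["href"] for link in version["links"]
                      if link["rel"] == "self"]
        if self_links:
            endpoints[version["id"]] = (version["status"], self_links[0])
    return endpoints


def current_endpoint(endpoints):
    """Return (version id, URL) of the version advertised as CURRENT."""
    for version_id, (status, url) in endpoints.items():
        if status == "CURRENT":
            return version_id, url
    raise LookupError("no CURRENT version advertised")


endpoints = list_endpoints(300, BODY)
print(current_endpoint(endpoints))
```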


Expected results:


Additional info:
Comment 1 Scott Dodson 2017-06-14 08:29:09 EDT
Re-assigning to the Storage component for triage. If there is something the upgrade playbooks need to take care of here, please let us know what we should be doing and re-assign to the Upgrade component.
Comment 9 Anping Li 2017-06-14 23:02:12 EDT
openshift-master failed to start once I used the /root/_output/local/bin/openshift binary:

Jun 14 21:45:00 openshift-147.lab.sjc.redhat.com atomic-openshift-master[28235]: W0614 21:45:00.087526   28235 start_master.go:291] Warning: oauthConfig.identityProvider[0].provider.insecure: Invalid value: true: validating passwords over an insecure connection could allow them to be intercepted, master start will continue.
Jun 14 21:45:00 openshift-147.lab.sjc.redhat.com atomic-openshift-master[28235]: W0614 21:45:00.087652   28235 start_master.go:291] Warning: auditConfig.auditFilePath: Required value: audit can now be logged to a separate file, master start will continue.
Jun 14 21:45:00 openshift-147.lab.sjc.redhat.com atomic-openshift-master[28235]: I0614 21:45:00.095902   28235 logs.go:41] warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
Jun 14 21:45:00 openshift-147.lab.sjc.redhat.com atomic-openshift-master[28235]: I0614 21:45:00.097367   28235 logs.go:41] warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
Jun 14 21:45:00 openshift-147.lab.sjc.redhat.com atomic-openshift-master[28235]: E0614 21:45:00.098477   28235 cacher.go:274] unexpected ListAndWatch error: github.com/openshift/origin/vendor/k8s.io/apiserver/pkg/storage/cacher.go:215: Failed to list *api.ClusterPolicy: client: etcd cluster is unavailable or misconfigured; error #0: dial tcp 10.14.6.147:4001: getsockopt: connection refused
Jun 14 21:45:00 openshift-147.lab.sjc.redhat.com atomic-openshift-master[28235]: I0614 21:45:00.098564   28235 logs.go:41] warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
Jun 14 21:45:00 openshift-147.lab.sjc.redhat.com atomic-openshift-master[28235]: E0614 21:45:00.098933   28235 cacher.go:274] unexpected ListAndWatch error: github.com/openshift/origin/vendor/k8s.io/apiserver/pkg/storage/cacher.go:215: Failed to list *api.ClusterPolicyBinding: client: etcd cluster is unavailable or misconfigured; error #0: dial tcp 10.14.6.147:4001: getsockopt: connection refused
Jun 14 21:45:00 openshift-147.lab.sjc.redhat.com atomic-openshift-master[28235]: I0614 21:45:00.099681   28235 logs.go:41] warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
Jun 14 21:45:00 openshift-147.lab.sjc.redhat.com atomic-openshift-master[28235]: E0614 21:45:00.099871   28235 cacher.go:274] unexpected ListAndWatch error: github.com/openshift/origin/vendor/k8s.io/apiserver/pkg/storage/cacher.go:215: Failed to list *api.Policy: client: etcd cluster is unavailable or misconfigured; error #0: dial tcp 10.14.6.147:4001: getsockopt: connection refused
Jun 14 21:45:00 openshift-147.lab.sjc.redhat.com atomic-openshift-master[28235]: I0614 21:45:00.100683   28235 logs.go:41] warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
Jun 14 21:45:00 openshift-147.lab.sjc.redhat.com atomic-openshift-master[28235]: E0614 21:45:00.100953   28235 cacher.go:274] unexpected ListAndWatch error: github.com/openshift/origin/vendor/k8s.io/apiserver/pkg/storage/cacher.go:215: Failed to list *api.PolicyBinding: client: etcd cluster is unavailable or misconfigured; error #0: dial tcp 10.14.6.147:4001: getsockopt: connection refused
Jun 14 21:45:00 openshift-147.lab.sjc.redhat.com atomic-openshift-master[28235]: E0614 21:45:00.101964   28235 cacher.go:274] unexpected ListAndWatch error: github.com/openshift/origin/vendor/k8s.io/apiserver/pkg/storage/cacher.go:215: Failed to list *api.Group: client: etcd cluster is unavailable or misconfigured; error #0: dial tcp 10.14.6.147:4001: getsockopt: connection refused
Jun 14 21:45:00 openshift-147.lab.sjc.redhat.com atomic-openshift-master[28235]: I0614 21:45:00.102512   28235 admission.go:107] Admission plugin ProjectRequestLimit is not enabled.  It will not be started.
Jun 14 21:45:00 openshift-147.lab.sjc.redhat.com atomic-openshift-master[28235]: I0614 21:45:00.102559   28235 admission.go:107] Admission plugin openshift.io/RestrictSubjectBindings is not enabled.  It will not be started.
Jun 14 21:45:00 openshift-147.lab.sjc.redhat.com atomic-openshift-master[28235]: I0614 21:45:00.102569   28235 admission.go:107] Admission plugin PodNodeConstraints is not enabled.  It will not be started.
Jun 14 21:45:00 openshift-147.lab.sjc.redhat.com atomic-openshift-master[28235]: I0614 21:45:00.102712   28235 admission.go:107] Admission plugin RunOnceDuration is not enabled.  It will not be started.
Jun 14 21:45:00 openshift-147.lab.sjc.redhat.com atomic-openshift-master[28235]: I0614 21:45:00.102723   28235 admission.go:107] Admission plugin PodNodeConstraints is not enabled.  It will not be started.
Jun 14 21:45:00 openshift-147.lab.sjc.redhat.com atomic-openshift-master[28235]: I0614 21:45:00.102774   28235 admission.go:107] Admission plugin ClusterResourceOverride is not enabled.  It will not be started.
Jun 14 21:45:00 openshift-147.lab.sjc.redhat.com atomic-openshift-master[28235]: I0614 21:45:00.104952   28235 admission.go:107] Admission plugin ImagePolicyWebhook is not enabled.  It will not be started.
Jun 14 21:45:00 openshift-147.lab.sjc.redhat.com atomic-openshift-master[28235]: I0614 21:45:00.105126   28235 admission.go:107] Admission plugin AlwaysPullImages is not enabled.  It will not be started.
Jun 14 21:45:00 openshift-147.lab.sjc.redhat.com atomic-openshift-master[28235]: I0614 21:45:00.105138   28235 admission.go:107] Admission plugin LimitPodHardAntiAffinityTopology is not enabled.  It will not be started.
Jun 14 21:45:00 openshift-147.lab.sjc.redhat.com atomic-openshift-master[28235]: I0614 21:45:00.106728   28235 logs.go:41] warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
Jun 14 21:45:00 openshift-147.lab.sjc.redhat.com atomic-openshift-master[28235]: I0614 21:45:00.107606   28235 logs.go:41] warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
Jun 14 21:45:00 openshift-147.lab.sjc.redhat.com atomic-openshift-master[28235]: I0614 21:45:00.108580   28235 logs.go:41] warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
Jun 14 21:45:00 openshift-147.lab.sjc.redhat.com atomic-openshift-master[28235]: I0614 21:45:00.109652   28235 logs.go:41] warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
Jun 14 21:45:00 openshift-147.lab.sjc.redhat.com atomic-openshift-master[28235]: E0614 21:45:00.109947   28235 cacher.go:274] unexpected ListAndWatch error: github.com/openshift/origin/vendor/k8s.io/apiserver/pkg/storage/cacher.go:215: Failed to list *api.OAuthAccessToken: client: etcd cluster is unavailable or misconfigured; error #0: dial tcp 10.14.6.147:4001: getsockopt: connection refused
Jun 14 21:45:00 openshift-147.lab.sjc.redhat.com atomic-openshift-master[28235]: E0614 21:45:00.111208   28235 cacher.go:274] unexpected ListAndWatch error: github.com/openshift/origin/vendor/k8s.io/apiserver/pkg/storage/cacher.go:215: Failed to list *api.User: client: etcd cluster is unavailable or misconfigured; error #0: dial tcp 10.14.6.147:4001: getsockopt: connection refused
Jun 14 21:45:00 openshift-147.lab.sjc.redhat.com atomic-openshift-master[28235]: I0614 21:45:00.234816   28235 master_config.go:327] Successfully initialized cloud provider: "openstack" from the config file: "/etc/origin/cloudprovider/openstack.conf"
Jun 14 21:45:00 openshift-147.lab.sjc.redhat.com atomic-openshift-master[28235]: I0614 21:45:00.729936   28235 master_config.go:505] Using the lease endpoint reconciler
Jun 14 21:45:00 openshift-147.lab.sjc.redhat.com atomic-openshift-master[28235]: I0614 21:45:00.730828   28235 logs.go:41] warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
Jun 14 21:45:00 openshift-147.lab.sjc.redhat.com atomic-openshift-master[28235]: I0614 21:45:00.731751   28235 logs.go:41] warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
Jun 14 21:45:00 openshift-147.lab.sjc.redhat.com atomic-openshift-master[28235]: I0614 21:45:00.731859   28235 start_master.go:418] Starting master on 0.0.0.0:8443 (v3.6.63-1+39af25c-dirty)
Jun 14 21:45:00 openshift-147.lab.sjc.redhat.com atomic-openshift-master[28235]: I0614 21:45:00.731870   28235 start_master.go:419] Public master address is https://openshift-147.lab.sjc.redhat.com:8443
Jun 14 21:45:00 openshift-147.lab.sjc.redhat.com atomic-openshift-master[28235]: I0614 21:45:00.731888   28235 start_master.go:423] Using images from "brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/ose-<component>:v3.6.63-1"
Jun 14 21:45:00 openshift-147.lab.sjc.redhat.com openshift[28235]: peerTLS: cert = /etc/origin/master/etcd.server.crt, key = /etc/origin/master/etcd.server.key, ca = /etc/origin/master/ca-bundle.crt, trusted-ca = , client-cert-auth = true
Jun 14 21:45:00 openshift-147.lab.sjc.redhat.com openshift[28235]: listening for peers on https://0.0.0.0:7001
Jun 14 21:45:00 openshift-147.lab.sjc.redhat.com openshift[28235]: listening for client requests on 0.0.0.0:4001
Jun 14 21:45:00 openshift-147.lab.sjc.redhat.com atomic-openshift-master[28235]: I0614 21:45:00.736115   28235 logs.go:41] warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
Jun 14 21:45:01 openshift-147.lab.sjc.redhat.com openshift[28235]: recovered store from snapshot at index 2700240
Jun 14 21:45:01 openshift-147.lab.sjc.redhat.com openshift[28235]: name = openshift.local
Jun 14 21:45:01 openshift-147.lab.sjc.redhat.com openshift[28235]: data dir = /var/lib/origin/openshift.local.etcd
Jun 14 21:45:01 openshift-147.lab.sjc.redhat.com openshift[28235]: member dir = /var/lib/origin/openshift.local.etcd/member
Jun 14 21:45:01 openshift-147.lab.sjc.redhat.com openshift[28235]: heartbeat = 100ms
Jun 14 21:45:01 openshift-147.lab.sjc.redhat.com openshift[28235]: election = 1000ms
Jun 14 21:45:01 openshift-147.lab.sjc.redhat.com openshift[28235]: snapshot count = 10000
Jun 14 21:45:01 openshift-147.lab.sjc.redhat.com openshift[28235]: advertise client URLs = https://openshift-147.lab.sjc.redhat.com:4001
Jun 14 21:45:01 openshift-147.lab.sjc.redhat.com systemd[1]: atomic-openshift-master.service: main process exited, code=exited, status=1/FAILURE
Jun 14 21:45:01 openshift-147.lab.sjc.redhat.com systemd[1]: Failed to start Atomic OpenShift Master.
Jun 14 21:45:01 openshift-147.lab.sjc.redhat.com systemd[1]: Unit atomic-openshift-master.service entered failed state.
Jun 14 21:45:01 openshift-147.lab.sjc.redhat.com systemd[1]: atomic-openshift-master.service failed.
Comment 10 N. Harrison Ripps 2017-06-19 10:24:47 EDT
Can we get a retest? A PR was submitted to fix this bug 3 days ago: https://github.com/openshift/origin/pull/14658
Comment 11 Anping Li 2017-06-20 00:47:26 EDT
Which packages or openshift version can I use for retest?
Comment 12 N. Harrison Ripps 2017-06-20 08:56:32 EDT
Huamin, can you please provide Anping with the packages / OpenShift version to be used for retest?
Comment 14 Jianwei Hou 2017-06-22 23:30:44 EDT
Tested with OCP v3.6.121; the issue is not reproducible on QEOS7.
