Bug 1889713 - pods in statefulset are not re-scheduled on other nodes when a node is down [NEEDINFO]
Summary: pods in statefulset are not re-scheduled on other nodes when a node is down
Keywords:
Status: ASSIGNED
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: kube-controller-manager
Version: 4.5
Hardware: x86_64
OS: Linux
Priority: medium
Severity: high
Target Milestone: ---
Target Release: 4.7.0
Assignee: Tomáš Nožička
QA Contact: zhou ying
URL:
Whiteboard: LifecycleReset
Depends On:
Blocks:
 
Reported: 2020-10-20 12:50 UTC by Subba Gaddamadugu
Modified: 2020-11-26 08:02 UTC (History)
2 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Target Upstream Version:
mfojtik: needinfo?



Description Subba Gaddamadugu 2020-10-20 12:50:05 UTC
Description of problem:
While performing a node power-down test, pods are stuck in the Terminating state.

status of nodes:
oc describe nodes|grep -i taint
Taints:             <none>
Taints:             <none>
Taints:             node.kubernetes.io/unreachable:NoExecute
Taints:             <none>
Taints:             <none>
Taints:             <none>

Tolerations on pods:
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
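
Given these tolerations, the taint-based eviction logic should mark the pods for deletion roughly 300s after the unreachable:NoExecute taint appears; whether the deletion ever completes is a separate question when the kubelet on that node is down. A hedged way to check (pod and namespace names taken from the attachments later in this bug):

```shell
# If a deletionTimestamp is printed, the eviction was issued and the
# pod is stuck completing termination, not stuck waiting for eviction.
oc get pod keycloak-postgresql-0 -n nautilus-system \
  -o jsonpath='{.metadata.deletionTimestamp}{"\n"}'
```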


Version-Release number of selected component (if applicable):
4.5


How reproducible:
Always


Steps to Reproduce:
1. Power down a node
2. Wait for 5 minutes
3. Check whether the pods are rescheduled on other nodes

Actual results:
Pods are not rescheduled to another node, so the cluster is not usable.

Expected results:
Pods are scheduled on another node.


Additional info:
Found similar issues with no resolution:

https://access.redhat.com/solutions/5278401
Operator catalog Pods created by CatalogSource do not schedule on other nodes even if the node is down.

https://access.redhat.com/solutions/3567851
One of my host nodes has failed, or was purposely shut down, and the node's pods are not being started on another available node.

Comment 1 Subba Gaddamadugu 2020-10-20 13:29:35 UTC
One notable observation is that all of the pods stuck in the Terminating state belong to StatefulSets.

Comment 2 Subba Gaddamadugu 2020-10-20 15:07:58 UTC
The node that was powered down has the following conditions; all other nodes have no taints and are in Ready status.
Conditions:
  Type             Status    LastHeartbeatTime                 LastTransitionTime                Reason              Message
  ----             ------    -----------------                 ------------------                ------              -------
  MemoryPressure   Unknown   Tue, 20 Oct 2020 00:07:40 -0700   Tue, 20 Oct 2020 00:10:42 -0700   NodeStatusUnknown   Kubelet stopped posting node status.
  DiskPressure     Unknown   Tue, 20 Oct 2020 00:07:40 -0700   Tue, 20 Oct 2020 00:10:42 -0700   NodeStatusUnknown   Kubelet stopped posting node status.
  PIDPressure      Unknown   Tue, 20 Oct 2020 00:07:40 -0700   Tue, 20 Oct 2020 00:10:42 -0700   NodeStatusUnknown   Kubelet stopped posting node status.
  Ready            Unknown   Tue, 20 Oct 2020 00:07:40 -0700   Tue, 20 Oct 2020 00:10:42 -0700   NodeStatusUnknown   Kubelet stopped posting node status.

and `oc get nodes` shows the powered-down node as NotReady.
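
For context (editorial note, consistent with documented Kubernetes behavior): once the kubelet stops posting status, the node lifecycle controller flips the node conditions to Unknown and applies the node.kubernetes.io/unreachable:NoExecute taint, which is what starts the pods' 300s tolerationSeconds countdown. A quick check against the node named later in this bug:

```shell
# Prints the taints on the powered-down node; expect to see
# node.kubernetes.io/unreachable with effect NoExecute.
oc get node master-ogden-sl.ocp.sl.sdp.hop.lab.emc.com \
  -o jsonpath='{.spec.taints}{"\n"}'
```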

Comment 3 Tomáš Nožička 2020-10-22 08:36:23 UTC
Please attach a must-gather for this case.

Also attach `oc get statefulset <name> -o yaml` and `oc get pod <name-of-the-pod-that-is-stuck> -o yaml`.

Comment 4 Subba Gaddamadugu 2020-10-26 14:25:59 UTC
oc get statefulset keycloak-postgresql -n nautilus-system -o yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  annotations:
    meta.helm.sh/release-name: keycloak
    meta.helm.sh/release-namespace: nautilus-system
  creationTimestamp: "2020-10-21T03:45:38Z"
  generation: 1
  labels:
    app: postgresql
    app.kubernetes.io/managed-by: Helm
    chart: postgresql-5.3.9
    heritage: Helm
    release: keycloak
  managedFields:
  - apiVersion: apps/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .: {}
          f:meta.helm.sh/release-name: {}
          f:meta.helm.sh/release-namespace: {}
        f:labels:
          .: {}
          f:app: {}
          f:app.kubernetes.io/managed-by: {}
          f:chart: {}
          f:heritage: {}
          f:release: {}
      f:spec:
        f:podManagementPolicy: {}
        f:replicas: {}
        f:revisionHistoryLimit: {}
        f:selector:
          f:matchLabels:
            .: {}
            f:app: {}
            f:release: {}
            f:role: {}
        f:serviceName: {}
        f:template:
          f:metadata:
            f:labels:
              .: {}
              f:app: {}
              f:chart: {}
              f:heritage: {}
              f:release: {}
              f:role: {}
            f:name: {}
          f:spec:
            f:containers:
              k:{"name":"keycloak-postgresql"}:
                .: {}
                f:env:
                  .: {}
                  k:{"name":"BITNAMI_DEBUG"}:
                    .: {}
                    f:name: {}
                    f:value: {}
                  k:{"name":"PGDATA"}:
                    .: {}
                    f:name: {}
                    f:value: {}
                  k:{"name":"POSTGRES_DB"}:
                    .: {}
                    f:name: {}
                    f:value: {}
                  k:{"name":"POSTGRES_PASSWORD"}:
                    .: {}
                    f:name: {}
                    f:valueFrom:
                      .: {}
                      f:secretKeyRef:
                        .: {}
                        f:key: {}
                        f:name: {}
                  k:{"name":"POSTGRES_USER"}:
                    .: {}
                    f:name: {}
                    f:value: {}
                  k:{"name":"POSTGRESQL_PORT_NUMBER"}:
                    .: {}
                    f:name: {}
                    f:value: {}
                f:image: {}
                f:imagePullPolicy: {}
                f:livenessProbe:
                  .: {}
                  f:exec:
                    .: {}
                    f:command: {}
                  f:failureThreshold: {}
                  f:initialDelaySeconds: {}
                  f:periodSeconds: {}
                  f:successThreshold: {}
                  f:timeoutSeconds: {}
                f:name: {}
                f:ports:
                  .: {}
                  k:{"containerPort":5432,"protocol":"TCP"}:
                    .: {}
                    f:containerPort: {}
                    f:name: {}
                    f:protocol: {}
                f:readinessProbe:
                  .: {}
                  f:exec:
                    .: {}
                    f:command: {}
                  f:failureThreshold: {}
                  f:initialDelaySeconds: {}
                  f:periodSeconds: {}
                  f:successThreshold: {}
                  f:timeoutSeconds: {}
                f:resources:
                  .: {}
                  f:requests:
                    .: {}
                    f:cpu: {}
                    f:memory: {}
                f:securityContext:
                  .: {}
                  f:runAsUser: {}
                f:terminationMessagePath: {}
                f:terminationMessagePolicy: {}
                f:volumeMounts:
                  .: {}
                  k:{"mountPath":"/bitnami/postgresql"}:
                    .: {}
                    f:mountPath: {}
                    f:name: {}
            f:dnsPolicy: {}
            f:initContainers:
              .: {}
              k:{"name":"init-chmod-data"}:
                .: {}
                f:command: {}
                f:image: {}
                f:imagePullPolicy: {}
                f:name: {}
                f:resources:
                  .: {}
                  f:requests:
                    .: {}
                    f:cpu: {}
                    f:memory: {}
                f:securityContext:
                  .: {}
                  f:runAsUser: {}
                f:terminationMessagePath: {}
                f:terminationMessagePolicy: {}
                f:volumeMounts:
                  .: {}
                  k:{"mountPath":"/bitnami/postgresql"}:
                    .: {}
                    f:mountPath: {}
                    f:name: {}
            f:restartPolicy: {}
            f:schedulerName: {}
            f:securityContext:
              .: {}
              f:fsGroup: {}
            f:terminationGracePeriodSeconds: {}
        f:updateStrategy:
          f:type: {}
        f:volumeClaimTemplates: {}
    manager: Go-http-client
    operation: Update
    time: "2020-10-21T03:45:38Z"
  - apiVersion: apps/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:status:
        f:collisionCount: {}
        f:currentReplicas: {}
        f:currentRevision: {}
        f:observedGeneration: {}
        f:readyReplicas: {}
        f:replicas: {}
        f:updateRevision: {}
        f:updatedReplicas: {}
    manager: kube-controller-manager
    operation: Update
    time: "2020-10-21T03:45:57Z"
  name: keycloak-postgresql
  namespace: nautilus-system
  resourceVersion: "23714440"
  selfLink: /apis/apps/v1/namespaces/nautilus-system/statefulsets/keycloak-postgresql
  uid: cfd78ae7-7437-4d13-9201-d389d23d55f9
spec:
  podManagementPolicy: OrderedReady
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: postgresql
      release: keycloak
      role: master
  serviceName: keycloak-postgresql-headless
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: postgresql
        chart: postgresql-5.3.9
        heritage: Helm
        release: keycloak
        role: master
      name: keycloak-postgresql
    spec:
      containers:
      - env:
        - name: BITNAMI_DEBUG
          value: "false"
        - name: POSTGRESQL_PORT_NUMBER
          value: "5432"
        - name: PGDATA
          value: /bitnami/postgresql/data
        - name: POSTGRES_USER
          value: keycloak
        - name: POSTGRES_PASSWORD
          valueFrom:
            secretKeyRef:
              key: postgresql-password
              name: keycloak-postgresql
        - name: POSTGRES_DB
          value: keycloak
        image: harbor.shipyard.local/1.1-hf2-rc3/postgresql:11.4.0-debian-9-r0
        imagePullPolicy: IfNotPresent
        livenessProbe:
          exec:
            command:
            - sh
            - -c
            - exec pg_isready -U "keycloak" -d "keycloak" -h 127.0.0.1 -p 5432
          failureThreshold: 6
          initialDelaySeconds: 30
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 5
        name: keycloak-postgresql
        ports:
        - containerPort: 5432
          name: postgresql
          protocol: TCP
        readinessProbe:
          exec:
            command:
            - sh
            - -c
            - |
              pg_isready -U "keycloak" -d "keycloak" -h 127.0.0.1 -p 5432
          failureThreshold: 6
          initialDelaySeconds: 5
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 5
        resources:
          requests:
            cpu: 250m
            memory: 256Mi
        securityContext:
          runAsUser: 1001
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /bitnami/postgresql
          name: data
      dnsPolicy: ClusterFirst
      initContainers:
      - command:
        - sh
        - -c
        - |
          mkdir -p /bitnami/postgresql/data
          chmod 700 /bitnami/postgresql/data
          find /bitnami/postgresql -mindepth 1 -maxdepth 1 -not -name ".snapshot" -not -name "lost+found" | \
            xargs chown -R 1001:1001
        image: harbor.shipyard.local/1.1-hf2-rc3/minideb:latest
        imagePullPolicy: Always
        name: init-chmod-data
        resources:
          requests:
            cpu: 250m
            memory: 256Mi
        securityContext:
          runAsUser: 0
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /bitnami/postgresql
          name: data
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext:
        fsGroup: 1001
      terminationGracePeriodSeconds: 30
  updateStrategy:
    type: RollingUpdate
  volumeClaimTemplates:
  - apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      creationTimestamp: null
      name: data
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 6Gi
      volumeMode: Filesystem
    status:
      phase: Pending
status:
  collisionCount: 0
  currentReplicas: 1
  currentRevision: keycloak-postgresql-6df4fd447b
  observedGeneration: 1
  readyReplicas: 1
  replicas: 1
  updateRevision: keycloak-postgresql-6df4fd447b
  updatedReplicas: 1

-------------------------------------------------------------------


oc get po keycloak-postgresql-0 -n nautilus-system -o yaml
apiVersion: v1
kind: Pod
metadata:
  annotations:
    k8s.v1.cni.cncf.io/network-status: |-
      [{
          "name": "openshift-sdn",
          "interface": "eth0",
          "ips": [
              "10.128.0.58"
          ],
          "default": true,
          "dns": {}
      }]
    k8s.v1.cni.cncf.io/networks-status: |-
      [{
          "name": "openshift-sdn",
          "interface": "eth0",
          "ips": [
              "10.128.0.58"
          ],
          "default": true,
          "dns": {}
      }]
    openshift.io/scc: anyuid
  creationTimestamp: "2020-10-21T03:45:38Z"
  deletionGracePeriodSeconds: 30
  deletionTimestamp: "2020-10-26T14:21:45Z"
  generateName: keycloak-postgresql-
  labels:
    app: postgresql
    chart: postgresql-5.3.9
    controller-revision-hash: keycloak-postgresql-6df4fd447b
    heritage: Helm
    release: keycloak
    role: master
    statefulset.kubernetes.io/pod-name: keycloak-postgresql-0
  managedFields:
  - apiVersion: v1
    fieldsType: FieldsV1
    fieldsV1:
      f:status:
        f:conditions:
          .: {}
          k:{"type":"PodScheduled"}:
            .: {}
            f:lastProbeTime: {}
            f:lastTransitionTime: {}
            f:message: {}
            f:reason: {}
            f:status: {}
            f:type: {}
    manager: kube-scheduler
    operation: Update
    time: "2020-10-21T03:45:38Z"
  - apiVersion: v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          f:k8s.v1.cni.cncf.io/network-status: {}
          f:k8s.v1.cni.cncf.io/networks-status: {}
    manager: multus
    operation: Update
    time: "2020-10-21T03:45:47Z"
  - apiVersion: v1
    fieldsType: FieldsV1
    fieldsV1:
      f:status:
        f:conditions:
          k:{"type":"ContainersReady"}:
            .: {}
            f:lastProbeTime: {}
            f:lastTransitionTime: {}
            f:status: {}
            f:type: {}
          k:{"type":"Initialized"}:
            .: {}
            f:lastProbeTime: {}
            f:lastTransitionTime: {}
            f:status: {}
            f:type: {}
          k:{"type":"Ready"}:
            .: {}
            f:lastProbeTime: {}
            f:type: {}
        f:containerStatuses: {}
        f:hostIP: {}
        f:initContainerStatuses: {}
        f:phase: {}
        f:podIP: {}
        f:podIPs:
          .: {}
          k:{"ip":"10.128.0.58"}:
            .: {}
            f:ip: {}
        f:startTime: {}
    manager: kubelet
    operation: Update
    time: "2020-10-21T03:45:57Z"
  - apiVersion: v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:generateName: {}
        f:labels:
          .: {}
          f:app: {}
          f:chart: {}
          f:controller-revision-hash: {}
          f:heritage: {}
          f:release: {}
          f:role: {}
          f:statefulset.kubernetes.io/pod-name: {}
        f:ownerReferences:
          .: {}
          k:{"uid":"cfd78ae7-7437-4d13-9201-d389d23d55f9"}:
            .: {}
            f:apiVersion: {}
            f:blockOwnerDeletion: {}
            f:controller: {}
            f:kind: {}
            f:name: {}
            f:uid: {}
      f:spec:
        f:containers:
          k:{"name":"keycloak-postgresql"}:
            .: {}
            f:env:
              .: {}
              k:{"name":"BITNAMI_DEBUG"}:
                .: {}
                f:name: {}
                f:value: {}
              k:{"name":"PGDATA"}:
                .: {}
                f:name: {}
                f:value: {}
              k:{"name":"POSTGRES_DB"}:
                .: {}
                f:name: {}
                f:value: {}
              k:{"name":"POSTGRES_PASSWORD"}:
                .: {}
                f:name: {}
                f:valueFrom:
                  .: {}
                  f:secretKeyRef:
                    .: {}
                    f:key: {}
                    f:name: {}
              k:{"name":"POSTGRES_USER"}:
                .: {}
                f:name: {}
                f:value: {}
              k:{"name":"POSTGRESQL_PORT_NUMBER"}:
                .: {}
                f:name: {}
                f:value: {}
            f:image: {}
            f:imagePullPolicy: {}
            f:livenessProbe:
              .: {}
              f:exec:
                .: {}
                f:command: {}
              f:failureThreshold: {}
              f:initialDelaySeconds: {}
              f:periodSeconds: {}
              f:successThreshold: {}
              f:timeoutSeconds: {}
            f:name: {}
            f:ports:
              .: {}
              k:{"containerPort":5432,"protocol":"TCP"}:
                .: {}
                f:containerPort: {}
                f:name: {}
                f:protocol: {}
            f:readinessProbe:
              .: {}
              f:exec:
                .: {}
                f:command: {}
              f:failureThreshold: {}
              f:initialDelaySeconds: {}
              f:periodSeconds: {}
              f:successThreshold: {}
              f:timeoutSeconds: {}
            f:resources:
              .: {}
              f:requests:
                .: {}
                f:cpu: {}
                f:memory: {}
            f:securityContext:
              .: {}
              f:runAsUser: {}
            f:terminationMessagePath: {}
            f:terminationMessagePolicy: {}
            f:volumeMounts:
              .: {}
              k:{"mountPath":"/bitnami/postgresql"}:
                .: {}
                f:mountPath: {}
                f:name: {}
        f:dnsPolicy: {}
        f:enableServiceLinks: {}
        f:hostname: {}
        f:initContainers:
          .: {}
          k:{"name":"init-chmod-data"}:
            .: {}
            f:command: {}
            f:image: {}
            f:imagePullPolicy: {}
            f:name: {}
            f:resources:
              .: {}
              f:requests:
                .: {}
                f:cpu: {}
                f:memory: {}
            f:securityContext:
              .: {}
              f:runAsUser: {}
            f:terminationMessagePath: {}
            f:terminationMessagePolicy: {}
            f:volumeMounts:
              .: {}
              k:{"mountPath":"/bitnami/postgresql"}:
                .: {}
                f:mountPath: {}
                f:name: {}
        f:restartPolicy: {}
        f:schedulerName: {}
        f:securityContext:
          .: {}
          f:fsGroup: {}
        f:subdomain: {}
        f:terminationGracePeriodSeconds: {}
        f:volumes:
          .: {}
          k:{"name":"data"}:
            .: {}
            f:name: {}
            f:persistentVolumeClaim:
              .: {}
              f:claimName: {}
      f:status:
        f:conditions:
          k:{"type":"Ready"}:
            f:lastTransitionTime: {}
            f:status: {}
    manager: kube-controller-manager
    operation: Update
    time: "2020-10-26T14:16:10Z"
  name: keycloak-postgresql-0
  namespace: nautilus-system
  ownerReferences:
  - apiVersion: apps/v1
    blockOwnerDeletion: true
    controller: true
    kind: StatefulSet
    name: keycloak-postgresql
    uid: cfd78ae7-7437-4d13-9201-d389d23d55f9
  resourceVersion: "29270037"
  selfLink: /api/v1/namespaces/nautilus-system/pods/keycloak-postgresql-0
  uid: c44679de-ec1c-4c40-ae3f-8a0883cfe99d
spec:
  containers:
  - env:
    - name: BITNAMI_DEBUG
      value: "false"
    - name: POSTGRESQL_PORT_NUMBER
      value: "5432"
    - name: PGDATA
      value: /bitnami/postgresql/data
    - name: POSTGRES_USER
      value: keycloak
    - name: POSTGRES_PASSWORD
      valueFrom:
        secretKeyRef:
          key: postgresql-password
          name: keycloak-postgresql
    - name: POSTGRES_DB
      value: keycloak
    image: harbor.shipyard.local/1.1-hf2-rc3/postgresql:11.4.0-debian-9-r0
    imagePullPolicy: IfNotPresent
    livenessProbe:
      exec:
        command:
        - sh
        - -c
        - exec pg_isready -U "keycloak" -d "keycloak" -h 127.0.0.1 -p 5432
      failureThreshold: 6
      initialDelaySeconds: 30
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 5
    name: keycloak-postgresql
    ports:
    - containerPort: 5432
      name: postgresql
      protocol: TCP
    readinessProbe:
      exec:
        command:
        - sh
        - -c
        - |
          pg_isready -U "keycloak" -d "keycloak" -h 127.0.0.1 -p 5432
      failureThreshold: 6
      initialDelaySeconds: 5
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 5
    resources:
      requests:
        cpu: 250m
        memory: 256Mi
    securityContext:
      capabilities:
        drop:
        - MKNOD
      runAsUser: 1001
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /bitnami/postgresql
      name: data
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: default-token-dn6vx
      readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  hostname: keycloak-postgresql-0
  imagePullSecrets:
  - name: default-dockercfg-xslj9
  initContainers:
  - command:
    - sh
    - -c
    - |
      mkdir -p /bitnami/postgresql/data
      chmod 700 /bitnami/postgresql/data
      find /bitnami/postgresql -mindepth 1 -maxdepth 1 -not -name ".snapshot" -not -name "lost+found" | \
        xargs chown -R 1001:1001
    image: harbor.shipyard.local/1.1-hf2-rc3/minideb:latest
    imagePullPolicy: Always
    name: init-chmod-data
    resources:
      requests:
        cpu: 250m
        memory: 256Mi
    securityContext:
      capabilities:
        drop:
        - MKNOD
      runAsUser: 0
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /bitnami/postgresql
      name: data
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: default-token-dn6vx
      readOnly: true
  nodeName: master-ogden-sl.ocp.sl.sdp.hop.lab.emc.com
  priority: 0
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext:
    fsGroup: 1001
    seLinuxOptions:
      level: s0:c26,c15
  serviceAccount: default
  serviceAccountName: default
  subdomain: keycloak-postgresql-headless
  terminationGracePeriodSeconds: 30
  tolerations:
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  - effect: NoSchedule
    key: node.kubernetes.io/memory-pressure
    operator: Exists
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: data-keycloak-postgresql-0
  - name: default-token-dn6vx
    secret:
      defaultMode: 420
      secretName: default-token-dn6vx
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2020-10-21T03:45:48Z"
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: "2020-10-26T14:16:10Z"
    status: "False"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: "2020-10-21T03:45:57Z"
    status: "True"
    type: ContainersReady
  - lastProbeTime: null
    lastTransitionTime: "2020-10-21T03:45:40Z"
    status: "True"
    type: PodScheduled
  containerStatuses:
  - containerID: cri-o://9166977270f7947153913adb34c6498131ce42c2eb953bfef312b83f8cdf8832
    image: harbor.shipyard.local/1.1-hf2-rc3/postgresql:11.4.0-debian-9-r0
    imageID: devops-repo.isus.emc.com:8116/nautilus/postgresql@sha256:78df827386fb801dffb0c30c3043d5c6ac3896ab500e7ecdb142d1924ca3695b
    lastState: {}
    name: keycloak-postgresql
    ready: true
    restartCount: 0
    started: true
    state:
      running:
        startedAt: "2020-10-21T03:45:48Z"
  hostIP: 10.243.57.7
  initContainerStatuses:
  - containerID: cri-o://a25e4289417ffd57d64a04ecb84e013d7206d9146e64604cd23e1436d677ee1d
    image: harbor.shipyard.local/1.1-hf2-rc3/minideb:latest
    imageID: devops-repo.isus.emc.com:8116/nautilus/minideb@sha256:9183acb6f31e36c3253b24351a52fdef5226177323f4938c1baebbf95c149825
    lastState: {}
    name: init-chmod-data
    ready: true
    restartCount: 0
    state:
      terminated:
        containerID: cri-o://a25e4289417ffd57d64a04ecb84e013d7206d9146e64604cd23e1436d677ee1d
        exitCode: 0
        finishedAt: "2020-10-21T03:45:48Z"
        reason: Completed
        startedAt: "2020-10-21T03:45:47Z"
  phase: Running
  podIP: 10.128.0.58
  podIPs:
  - ip: 10.128.0.58
  qosClass: Burstable
  startTime: "2020-10-21T03:45:40Z"






oc describe po keycloak-postgresql-0 -n nautilus-system
Name:                      keycloak-postgresql-0
Namespace:                 nautilus-system
Priority:                  0
Node:                      master-ogden-sl.ocp.sl.sdp.hop.lab.emc.com/10.243.57.7
Start Time:                Tue, 20 Oct 2020 20:45:40 -0700
Labels:                    app=postgresql
                           chart=postgresql-5.3.9
                           controller-revision-hash=keycloak-postgresql-6df4fd447b
                           heritage=Helm
                           release=keycloak
                           role=master
                           statefulset.kubernetes.io/pod-name=keycloak-postgresql-0
Annotations:               k8s.v1.cni.cncf.io/network-status:
                             [{
                                 "name": "openshift-sdn",
                                 "interface": "eth0",
                                 "ips": [
                                     "10.128.0.58"
                                 ],
                                 "default": true,
                                 "dns": {}
                             }]
                           k8s.v1.cni.cncf.io/networks-status:
                             [{
                                 "name": "openshift-sdn",
                                 "interface": "eth0",
                                 "ips": [
                                     "10.128.0.58"
                                 ],
                                 "default": true,
                                 "dns": {}
                             }]
                           openshift.io/scc: anyuid
Status:                    Terminating (lasts <invalid>)
Termination Grace Period:  30s
IP:                        10.128.0.58
IPs:
  IP:           10.128.0.58
Controlled By:  StatefulSet/keycloak-postgresql
Init Containers:
  init-chmod-data:
    Container ID:  cri-o://a25e4289417ffd57d64a04ecb84e013d7206d9146e64604cd23e1436d677ee1d
    Image:         harbor.shipyard.local/1.1-hf2-rc3/minideb:latest
    Image ID:      devops-repo.isus.emc.com:8116/nautilus/minideb@sha256:9183acb6f31e36c3253b24351a52fdef5226177323f4938c1baebbf95c149825
    Port:          <none>
    Host Port:     <none>
    Command:
      sh
      -c
      mkdir -p /bitnami/postgresql/data
      chmod 700 /bitnami/postgresql/data
      find /bitnami/postgresql -mindepth 1 -maxdepth 1 -not -name ".snapshot" -not -name "lost+found" | \
        xargs chown -R 1001:1001

    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Tue, 20 Oct 2020 20:45:47 -0700
      Finished:     Tue, 20 Oct 2020 20:45:48 -0700
    Ready:          True
    Restart Count:  0
    Requests:
      cpu:        250m
      memory:     256Mi
    Environment:  <none>
    Mounts:
      /bitnami/postgresql from data (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-dn6vx (ro)
Containers:
  keycloak-postgresql:
    Container ID:   cri-o://9166977270f7947153913adb34c6498131ce42c2eb953bfef312b83f8cdf8832
    Image:          harbor.shipyard.local/1.1-hf2-rc3/postgresql:11.4.0-debian-9-r0
    Image ID:       devops-repo.isus.emc.com:8116/nautilus/postgresql@sha256:78df827386fb801dffb0c30c3043d5c6ac3896ab500e7ecdb142d1924ca3695b
    Port:           5432/TCP
    Host Port:      0/TCP
    State:          Running
      Started:      Tue, 20 Oct 2020 20:45:48 -0700
    Ready:          True
    Restart Count:  0
    Requests:
      cpu:      250m
      memory:   256Mi
    Liveness:   exec [sh -c exec pg_isready -U "keycloak" -d "keycloak" -h 127.0.0.1 -p 5432] delay=30s timeout=5s period=10s #success=1 #failure=6
    Readiness:  exec [sh -c pg_isready -U "keycloak" -d "keycloak" -h 127.0.0.1 -p 5432
] delay=5s timeout=5s period=10s #success=1 #failure=6
    Environment:
      BITNAMI_DEBUG:           false
      POSTGRESQL_PORT_NUMBER:  5432
      PGDATA:                  /bitnami/postgresql/data
      POSTGRES_USER:           keycloak
      POSTGRES_PASSWORD:       <set to the key 'postgresql-password' in secret 'keycloak-postgresql'>  Optional: false
      POSTGRES_DB:             keycloak
    Mounts:
      /bitnami/postgresql from data (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-dn6vx (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   True
  PodScheduled      True
Volumes:
  data:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  data-keycloak-postgresql-0
    ReadOnly:   false
  default-token-dn6vx:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-dn6vx
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/memory-pressure:NoSchedule
                 node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:          <none>


-------------------------------------------------------------------

oc get nodes
NAME                                          STATUS     ROLES           AGE   VERSION
master-ogden-sl.ocp.sl.sdp.hop.lab.emc.com    NotReady   master,worker   16d   v1.18.3+6c42de8
master-orem-sl.ocp.sl.sdp.hop.lab.emc.com     Ready      master,worker   10d   v1.18.3+6c42de8
master-sandy-sl.ocp.sl.sdp.hop.lab.emc.com    Ready      master,worker   4d    v1.18.3+6c42de8
worker-layton-sl.ocp.sl.sdp.hop.lab.emc.com   Ready      worker          31d   v1.18.3+6c42de8
worker-logan-sl.ocp.sl.sdp.hop.lab.emc.com    Ready      worker          31d   v1.18.3+6c42de8

-----------------------------------------
oc get pvc -n nautilus-system
NAME                                               STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS                AGE
data-keycloak-postgresql-0                         Bound    pvc-af4a3784-0472-4315-bcba-67840db2452c   6Gi        RWO            ocs-storagecluster-cephfs   5d10h

--------------------------------------------
oc get pv|grep postgres
pvc-af4a3784-0472-4315-bcba-67840db2452c   6Gi        RWO            Delete           Bound       nautilus-system/data-keycloak-postgresql-0                         ocs-storagecluster-cephfs              5d10h
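
Editorial note on the data above: the pod YAML shows a deletionTimestamp already set, which matches the documented StatefulSet behavior on node failure. The StatefulSet controller guarantees at most one pod per ordinal, so it will not create a replacement until the old pod object is confirmed deleted, and with the kubelet down that confirmation never arrives. A minimal workaround sketch, safe only if the node is verified to be powered off (otherwise the pod may still be running there, risking two writers on the same volume):

```shell
# Force-remove the pod object from the API server so the StatefulSet
# controller can recreate keycloak-postgresql-0 on a healthy node.
# Only safe when the old node is confirmed shut down.
oc delete pod keycloak-postgresql-0 -n nautilus-system \
  --force --grace-period=0
```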

Comment 6 Michal Fojtik 2020-11-25 15:12:08 UTC
This bug hasn't had any activity in the last 30 days. Maybe the problem got resolved, was a duplicate of something else, or became less pressing for some reason - or maybe it's still relevant but just hasn't been looked at yet. As such, we're marking this bug as "LifecycleStale" and decreasing the severity/priority.

If you have further information on the current state of the bug, please update it; otherwise this bug can be closed in about 7 days. The information can be, for example, that the problem still occurs, that you still want the feature, that more information is needed, or that the bug is (for whatever reason) no longer relevant.

Additionally, you can add LifecycleFrozen into Keywords if you think this bug should never be marked as stale. Please consult with the bug assignee before you do that.

Comment 7 Subba Gaddamadugu 2020-11-25 15:54:11 UTC
The problem was not resolved.
If manual intervention is needed in this situation, how does OpenShift identify it and alert the operator?

Comment 8 Michal Fojtik 2020-11-25 16:12:13 UTC
The LifecycleStale keyword was removed because the bug got commented on recently.
The bug assignee was notified.

