Description of problem:
The openshift-state-metrics pod was scheduled on a RHEL worker but failed to start. This is the first time this error has been seen, and it is not reproducible every time.

# oc -n openshift-monitoring get pod -o wide
NAME                                       READY   STATUS              RESTARTS   AGE     IP       NODE                      NOMINATED NODE   READINESS GATES
...
openshift-state-metrics-5d4477d447-xjr5j   0/3     ContainerCreating   0          5h54m   <none>   qe-lpt-481-xmb5l-rhel-3   <none>           <none>

# oc -n openshift-monitoring describe pod openshift-state-metrics-5d4477d447-xjr5j
Events:
  Type     Reason                  Age                        From                              Message
  ----     ------                  ----                       ----                              -------
  Warning  FailedScheduling        5h38m (x9 over 5h45m)      default-scheduler                 0/3 nodes are available: 3 node(s) had taints that the pod didn't tolerate.
  Warning  FailedScheduling        5h37m (x2 over 5h37m)      default-scheduler                 0/4 nodes are available: 4 node(s) had taints that the pod didn't tolerate.
  Warning  FailedScheduling        5h37m (x2 over 5h37m)      default-scheduler                 0/5 nodes are available: 5 node(s) had taints that the pod didn't tolerate.
  Warning  FailedScheduling        5h37m                      default-scheduler                 0/7 nodes are available: 7 node(s) had taints that the pod didn't tolerate.
  Normal   Scheduled               5h37m                      default-scheduler                 Successfully assigned openshift-monitoring/openshift-state-metrics-5d4477d447-xjr5j to qe-lpt-481-xmb5l-rhel-3
  Warning  FailedMount             5h37m (x3 over 5h37m)      kubelet, qe-lpt-481-xmb5l-rhel-3  MountVolume.SetUp failed for volume "openshift-state-metrics-tls" : couldn't propagate object cache: timed out waiting for the condition
  Warning  FailedMount             5h37m (x3 over 5h37m)      kubelet, qe-lpt-481-xmb5l-rhel-3  MountVolume.SetUp failed for volume "openshift-state-metrics-token-g46pr" : couldn't propagate object cache: timed out waiting for the condition
  Warning  FailedCreatePodSandBox  5h36m                      kubelet, qe-lpt-481-xmb5l-rhel-3  Failed create pod sandbox: rpc error: code = Unknown desc = failed to create pod network sandbox k8s_openshift-state-metrics-5d4477d447-xjr5j_openshift-monitoring_8d5a2f0e-5e9c-11ea-8dc2-fa163effc51f_0(a2995de158cc47d5370774976284faba332fdb2589aa5ce011d45f3deaea1f1b): Multus: Err adding pod to network "openshift-sdn": Multus: error in invoke Delegate add - "openshift-sdn": could not set up pod iptables rules: Another app is currently holding the xtables lock. Perhaps you want to use the -w option?
  Warning  FailedCreatePodSandBox  <invalid> (x1559 over 5h36m)  kubelet, qe-lpt-481-xmb5l-rhel-3  Failed create pod sandbox: rpc error: code = Unknown desc = pod sandbox with name "k8s_openshift-state-metrics-5d4477d447-xjr5j_openshift-monitoring_8d5a2f0e-5e9c-11ea-8dc2-fa163effc51f_0" already exists

# oc get node --show-labels
NAME                               STATUS   ROLES    AGE     VERSION             LABELS
qe-lpt-481-xmb5l-control-plane-0   Ready    master   7h56m   v1.14.6+6f6155bd9   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=qe-lpt-481-xmb5l-control-plane-0,kubernetes.io/os=linux,node-role.kubernetes.io/master=,node.openshift.io/os_id=rhcos
qe-lpt-481-xmb5l-control-plane-1   Ready    master   7h56m   v1.14.6+6f6155bd9   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=qe-lpt-481-xmb5l-control-plane-1,kubernetes.io/os=linux,node-role.kubernetes.io/master=,node.openshift.io/os_id=rhcos
qe-lpt-481-xmb5l-control-plane-2   Ready    master   7h56m   v1.14.6+6f6155bd9   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=qe-lpt-481-xmb5l-control-plane-2,kubernetes.io/os=linux,node-role.kubernetes.io/master=,node.openshift.io/os_id=rhcos
qe-lpt-481-xmb5l-rhel-0            Ready    worker   6h58m   v1.14.6+6f6155bd9   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=qe-lpt-481-xmb5l-rhel-0,kubernetes.io/os=linux,node-role.kubernetes.io/worker=,node.openshift.io/os_id=rhel
qe-lpt-481-xmb5l-rhel-1            Ready    worker   6h58m   v1.14.6+6f6155bd9   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=qe-lpt-481-xmb5l-rhel-1,kubernetes.io/os=linux,node-role.kubernetes.io/worker=,node.openshift.io/os_id=rhel
qe-lpt-481-xmb5l-rhel-2            Ready    worker   6h58m   v1.14.6+6f6155bd9   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=qe-lpt-481-xmb5l-rhel-2,kubernetes.io/os=linux,node-role.kubernetes.io/worker=,node.openshift.io/os_id=rhel
qe-lpt-481-xmb5l-rhel-3            Ready    worker   6h58m   v1.14.6+6f6155bd9   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=qe-lpt-481-xmb5l-rhel-3,kubernetes.io/os=linux,node-role.kubernetes.io/worker=,node.openshift.io/os_id=rhel

The journal logs on qe-lpt-481-xmb5l-rhel-3 show the same error:
******************************************************
Mar 04 23:57:19 qe-lpt-481-xmb5l-rhel-3 hyperkube[1258]: I0304 23:57:19.547093 1258 status_manager.go:524] Status for pod "alertmanager-main-2_openshift-monitoring(8dcc91a3-5e9c-11ea-8dc2-fa163effc51f)" updated successfully: (1, {Phase:Pending Conditions:[{Type:Initialized Status:True LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2020-03-04 23:57:00 -0500 EST Reason: Message:} {Type:Ready Status:False LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2020-03-04 23:57:00 -0500 EST Reason:ContainersNotReady Message:containers with unready status: [alertmanager config-reloader alertmanager-proxy]} {Type:ContainersReady Status:False LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2020-03-04 23:57:00 -0500 EST Reason:ContainersNotReady Message:containers with unready status: [alertmanager config-reloader alertmanager-proxy]} {Type:PodScheduled Status:True LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2020-03-04 23:57:01 -0500 EST Reason: Message:}] Message: Reason: NominatedNodeName: HostIP:10.0.99.37 PodIP: StartTime:2020-03-04 23:57:00 -0500 EST InitContainerStatuses:[] ContainerStatuses:[{Name:alertmanager State:{Waiting:&ContainerStateWaiting{Reason:ContainerCreating,Message:,} Running:nil Terminated:nil} LastTerminationState:{Waiting:nil Running:nil Terminated:nil} Ready:false RestartCount:0 Image:quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:4abb5e5901b8a47af9c0e23455aeca604809fc25cd213af469bfe2ec82a3253 ImageID: ContainerID:} {Name:alertmanager-proxy State:{Waiting:&ContainerStateWaiting{Reason:ContainerCreating,Message:,} Running:nil Terminated:nil} LastTerminationState:{Waiting:nil Running:nil Terminated:nil} Ready:false RestartCount:0 Image:quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:7ee6e583b29b9879fe539674281a8c3a961c3599e76637299ec4265d08b4fd70 ImageID: ContainerID:} {Name:config-reloader State:{Waiting:&ContainerStateWaiting{Reason:ContainerCreating,Message:,} Running:nil Terminated:nil} LastTerminationState:{Waiting:nil Running:nil Terminated:nil} Ready:false RestartCount:0 Image:quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:2e6be7edcd47f45897b42e052bd9ceebac65f9caa460bf49c7450cb130e228a9 ImageID: ContainerID:}] QOSClass:Burstable})
Mar 04 23:57:19 qe-lpt-481-xmb5l-rhel-3 hyperkube[1258]: E0304 23:57:19.584910 1258 remote_runtime.go:109] RunPodSandbox from runtime service failed: rpc error: code = Unknown desc = failed to create pod network sandbox k8s_openshift-state-metrics-5d4477d447-xjr5j_openshift-monitoring_8d5a2f0e-5e9c-11ea-8dc2-fa163effc51f_0(a2995de158cc47d5370774976284faba332fdb2589aa5ce011d45f3deaea1f1b): Multus: Err adding pod to network "openshift-sdn": Multus: error in invoke Delegate add - "openshift-sdn": could not set up pod iptables rules: Another app is currently holding the xtables lock. Perhaps you want to use the -w option?
Mar 04 23:57:19 qe-lpt-481-xmb5l-rhel-3 hyperkube[1258]: E0304 23:57:19.585072 1258 kuberuntime_sandbox.go:68] CreatePodSandbox for pod "openshift-state-metrics-5d4477d447-xjr5j_openshift-monitoring(8d5a2f0e-5e9c-11ea-8dc2-fa163effc51f)" failed: rpc error: code = Unknown desc = failed to create pod network sandbox k8s_openshift-state-metrics-5d4477d447-xjr5j_openshift-monitoring_8d5a2f0e-5e9c-11ea-8dc2-fa163effc51f_0(a2995de158cc47d5370774976284faba332fdb2589aa5ce011d45f3deaea1f1b): Multus: Err adding pod to network "openshift-sdn": Multus: error in invoke Delegate add - "openshift-sdn": could not set up pod iptables rules: Another app is currently holding the xtables lock. Perhaps you want to use the -w option?
Mar 04 23:57:19 qe-lpt-481-xmb5l-rhel-3 hyperkube[1258]: E0304 23:57:19.585122 1258 kuberuntime_manager.go:697] createPodSandbox for pod "openshift-state-metrics-5d4477d447-xjr5j_openshift-monitoring(8d5a2f0e-5e9c-11ea-8dc2-fa163effc51f)" failed: rpc error: code = Unknown desc = failed to create pod network sandbox k8s_openshift-state-metrics-5d4477d447-xjr5j_openshift-monitoring_8d5a2f0e-5e9c-11ea-8dc2-fa163effc51f_0(a2995de158cc47d5370774976284faba332fdb2589aa5ce011d45f3deaea1f1b): Multus: Err adding pod to network "openshift-sdn": Multus: error in invoke Delegate add - "openshift-sdn": could not set up pod iptables rules: Another app is currently holding the xtables lock. Perhaps you want to use the -w option?
Mar 04 23:57:19 qe-lpt-481-xmb5l-rhel-3 hyperkube[1258]: E0304 23:57:19.585337 1258 pod_workers.go:190] Error syncing pod 8d5a2f0e-5e9c-11ea-8dc2-fa163effc51f ("openshift-state-metrics-5d4477d447-xjr5j_openshift-monitoring(8d5a2f0e-5e9c-11ea-8dc2-fa163effc51f)"), skipping: failed to "CreatePodSandbox" for "openshift-state-metrics-5d4477d447-xjr5j_openshift-monitoring(8d5a2f0e-5e9c-11ea-8dc2-fa163effc51f)" with CreatePodSandboxError: "CreatePodSandbox for pod \"openshift-state-metrics-5d4477d447-xjr5j_openshift-monitoring(8d5a2f0e-5e9c-11ea-8dc2-fa163effc51f)\" failed: rpc error: code = Unknown desc = failed to create pod network sandbox k8s_openshift-state-metrics-5d4477d447-xjr5j_openshift-monitoring_8d5a2f0e-5e9c-11ea-8dc2-fa163effc51f_0(a2995de158cc47d5370774976284faba332fdb2589aa5ce011d45f3deaea1f1b): Multus: Err adding pod to network \"openshift-sdn\": Multus: error in invoke Delegate add - \"openshift-sdn\": could not set up pod iptables rules: Another app is currently holding the xtables lock. Perhaps you want to use the -w option?\n"
Mar 04 23:57:19 qe-lpt-481-xmb5l-rhel-3 hyperkube[1258]: I0304 23:57:19.585395 1258 event.go:209] Event(v1.ObjectReference{Kind:"Pod", Namespace:"openshift-monitoring", Name:"openshift-state-metrics-5d4477d447-xjr5j", UID:"8d5a2f0e-5e9c-11ea-8dc2-fa163effc51f", APIVersion:"v1", ResourceVersion:"37857", FieldPath:""}): type: 'Warning' reason: 'FailedCreatePodSandBox' Failed create pod sandbox: rpc error: code = Unknown desc = failed to create pod network sandbox k8s_openshift-state-metrics-5d4477d447-xjr5j_openshift-monitoring_8d5a2f0e-5e9c-11ea-8dc2-fa163effc51f_0(a2995de158cc47d5370774976284faba332fdb2589aa5ce011d45f3deaea1f1b): Multus: Err adding pod to network "openshift-sdn": Multus: error in invoke Delegate add - "openshift-sdn": could not set up pod iptables rules: Another app is currently holding the xtables lock. Perhaps you want to use the -w option?
Mar 04 23:57:19 qe-lpt-481-xmb5l-rhel-3 hyperkube[1258]: I0304 23:57:19.637884 1258 kubelet_pods.go:1346] Generating status for "openshift-state-metrics-5d4477d447-xjr5j_openshift-monitoring(8d5a2f0e-5e9c-11ea-8dc2-fa163effc51f)"
******************************************************

Version-Release number of selected component (if applicable):
4.2.22

How reproducible:
Rarely

Steps to Reproduce:
1. See the description
2.
3.

Actual results:

Expected results:

Additional info:
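Triage note: to check whether other pods on a node hit the same xtables lock failure, the journal can be scanned for the error string. The following is only a rough sketch and not part of the original report. The function name pods_hit_by_xtables_lock is hypothetical, and the sandbox-name pattern (k8s_<pod>_<namespace>_<uid>_<attempt>) is an assumption inferred solely from the log lines above, so it may not match other runtimes or versions.

```python
import re
from collections import Counter

# Message printed when another process holds the xtables lock
# (copied from the log lines in this report).
XTABLES_LOCK_MSG = "Another app is currently holding the xtables lock"

# Assumed sandbox naming: k8s_<pod>_<namespace>_<uid>_<attempt>.
# Inferred from this report only; treat it as a heuristic.
SANDBOX_RE = re.compile(
    r"k8s_(?P<pod>[^_\s]+)_(?P<namespace>[^_\s]+)_[0-9a-f-]+_\d+"
)


def pods_hit_by_xtables_lock(lines):
    """Count xtables-lock sandbox failures per (namespace, pod).

    `lines` is any iterable of journal log lines. Lines that do not
    mention the lock error, or that carry no sandbox name, are skipped.
    """
    hits = Counter()
    for line in lines:
        if XTABLES_LOCK_MSG not in line:
            continue
        match = SANDBOX_RE.search(line)
        if match:
            hits[(match.group("namespace"), match.group("pod"))] += 1
    return hits
```

It could be fed the lines of a saved journal excerpt from the affected node. A count above one for a pod only shows that the error was logged more than once, not how many distinct attempts failed, because the kubelet logs the same failure from several call sites.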
Verified this bug on 4.5.0-0.nightly-2020-03-10-002435.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.5 image release advisory), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:2409