Description of problem:
If a node that mounts ceph PVs crashes, subsequent ceph PV mounts on other nodes are blocked until the locks are manually cleared.

Version-Release number of selected component (if applicable):
Reported in upstream issue referencing origin versions v1.1.3 - 1.3.3.

How reproducible:
I have not had time to reproduce with OSE, but based on the upstream report this should be 100% reproducible.

Steps to Reproduce:
1. Have a node running pods that access ceph-backed PVs
2. Crash that node

Actual results:
Pods starting on other nodes that use ceph PVs remain in ContainerCreating status; manual admin intervention is required to clean up the locks.

Expected results:
Automatic recovery.

Additional info:
Lacking a fencing mechanism, I'm not sure how much of this can be safely automated.
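For reference, the manual cleanup mentioned above looks roughly like the following sketch (the image name, lock id, and locker are placeholders, not values from this report; the real ones have to be read from the lock list output first):

  # list the lockers on the affected rbd image
  rbd lock list <image-name>
  # remove the stale lock left by the crashed node, using the lock id and locker from the list output
  rbd lock remove <image-name> <lock-id> <locker>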
Huamin, can you please take a look?
This is a known issue; there is a Trello card and k8s/OpenShift issues tracking it [1]. The plan is to use attach/detach, so that the controller on the master initiates the detach call and releases the lock.

1. https://trello.com/c/Y1j4dTBO/131-bug-the-ceph-rbd-volume-plugin-seems-to-hold-a-lock-when-the-container-fails
https://github.com/kubernetes/kubernetes/pull/12502/files
Upstream fix is proposed at https://github.com/kubernetes/kubernetes/pull/33660
33660 depends on the following:
https://github.com/kubernetes/kubernetes/pull/35433
https://github.com/kubernetes/kubernetes/pull/35434
*** Bug 1409237 has been marked as a duplicate of this bug. ***
Still reproducible in openshift v3.6.86

Steps:
1. Create a StorageClass for the rbd provisioner.
2. Create a PVC that dynamically provisions a PV, and create a ReplicationController (replicas=1).
3. After the Pod is running, stop the node service on its node.
4. A new Pod is recreated on another node but is stuck in status 'ContainerCreating'. The old Pod became 'Unknown'. The new Pod only becomes 'Running' when the original node is recovered or the lock is manually removed.

# oc get pods
NAME          READY     STATUS              RESTARTS   AGE
rbdpd-8n8zd   0/1       ContainerCreating   0          8m
rbdpd-xwn25   1/1       Unknown             0          1h

# oc describe pod rbdpd-8n8zd
Name:            rbdpd-8n8zd
Namespace:       jhou
Security Policy: restricted
Node:            ip-172-18-6-78.ec2.internal/172.18.6.78
Start Time:      Fri, 02 Jun 2017 16:12:37 +0800
Labels:          app=rbd
Annotations:     kubernetes.io/created-by={"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ReplicationController","namespace":"jhou","name":"rbdpd","uid":"31b871f0-475a-11e7-8550-0e259545e72a","api...
                 openshift.io/scc=restricted
Status:          Pending
IP:
Controllers:     ReplicationController/rbdpd
Containers:
  myfrontend:
    Container ID:
    Image:          jhou/hello-openshift
    Image ID:
    Port:           80/TCP
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /mnt/rbd from pvol (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-xk75q (ro)
Conditions:
  Type           Status
  Initialized    True
  Ready          False
  PodScheduled   True
Volumes:
  pvol:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  rbdc
    ReadOnly:   false
  default-token-xk75q:
    Type:       Secret (a volume populated by a Secret)
    SecretName: default-token-xk75q
    Optional:   false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     <none>
Events:
  FirstSeen  LastSeen  Count  From                                  SubObjectPath  Type     Reason       Message
  ---------  --------  -----  ----                                  -------------  ----     ------       -------
  9m         9m        1      default-scheduler                                    Normal   Scheduled    Successfully assigned rbdpd-8n8zd to ip-172-18-6-78.ec2.internal
  9m         1m        12     kubelet, ip-172-18-6-78.ec2.internal                 Warning  FailedMount  MountVolume.SetUp failed for volume "kubernetes.io/rbd/3d88d0cc-476b-11e7-8550-0e259545e72a-pvc-3874b558-4757-11e7-8550-0e259545e72a" (spec.Name: "pvc-3874b558-4757-11e7-8550-0e259545e72a") pod "3d88d0cc-476b-11e7-8550-0e259545e72a" (UID: "3d88d0cc-476b-11e7-8550-0e259545e72a") with: rbd: image kubernetes-dynamic-pvc-387acbcd-4757-11e7-8550-0e259545e72a is locked by other nodes
  7m         56s       4      kubelet, ip-172-18-6-78.ec2.internal                 Warning  FailedMount  Unable to mount volumes for pod "rbdpd-8n8zd_jhou(3d88d0cc-476b-11e7-8550-0e259545e72a)": timeout expired waiting for volumes to attach/mount for pod "jhou"/"rbdpd-8n8zd". list of unattached/unmounted volumes=[pvol]
  7m         56s       4      kubelet, ip-172-18-6-78.ec2.internal                 Warning  FailedSync   Error syncing pod, skipping: timeout expired waiting for volumes to attach/mount for pod "jhou"/"rbdpd-8n8zd". list of unattached/unmounted volumes=[pvol]

[root@ip-172-18-3-46 ~]# rbd lock list kubernetes-dynamic-pvc-387acbcd-4757-11e7-8550-0e259545e72a
There is 1 exclusive lock on this image.
Locker       ID                                               Address
client.4193  kubelet_lock_magic_ip-172-18-1-167.ec2.internal  172.18.1.167:0/1037989

[root@ip-172-18-3-46 ~]# rbd lock remove kubernetes-dynamic-pvc-387acbcd-4757-11e7-8550-0e259545e72a kubelet_lock_magic_ip-172-18-1-167.ec2.internal client.4193
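For anyone re-running these steps, the objects used above look roughly like the sketch below. Only the claim name (rbdc), controller name (rbdpd), container name/image/port, and mount path are taken from the describe output; the StorageClass name, monitor address, pool, and secret names are placeholders that have to match the actual ceph cluster:

  # oc create -f - <<EOF
  apiVersion: storage.k8s.io/v1
  kind: StorageClass
  metadata:
    name: rbd-sc                              # placeholder StorageClass name
  provisioner: kubernetes.io/rbd
  parameters:
    monitors: 192.168.0.1:6789                # placeholder ceph monitor
    adminId: admin
    adminSecretName: ceph-admin-secret        # placeholder secret
    adminSecretNamespace: default
    pool: rbd
    userId: admin
    userSecretName: ceph-user-secret          # placeholder secret
  ---
  apiVersion: v1
  kind: PersistentVolumeClaim
  metadata:
    name: rbdc                                # claim name from the output above
    annotations:
      volume.beta.kubernetes.io/storage-class: rbd-sc
  spec:
    accessModes: ["ReadWriteOnce"]
    resources:
      requests:
        storage: 1Gi
  ---
  apiVersion: v1
  kind: ReplicationController
  metadata:
    name: rbdpd                               # controller name from the output above
  spec:
    replicas: 1
    template:
      metadata:
        labels:
          app: rbd
      spec:
        containers:
        - name: myfrontend
          image: jhou/hello-openshift
          ports:
          - containerPort: 80
          volumeMounts:
          - name: pvol
            mountPath: /mnt/rbd
        volumes:
        - name: pvol
          persistentVolumeClaim:
            claimName: rbdc
  EOF

On releases that support it, spec.storageClassName can be used on the PVC instead of the beta storage-class annotation.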
Verified on openshift v3.6.106. Given the node is down (I shut it down), the replication controller creates a Pod on another functioning node and the Pod is able to become Running. The rbd lock no longer prevents other nodes from mounting.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2017:1716
This is resolved through the rbd attach/detach refactoring.
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days