Bug 1356478
Summary: | OpenShift should update the output error message when trying to re-format a volume | ||
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Chao Yang <chaoyang> |
Component: | Storage | Assignee: | Jan Safranek <jsafrane> |
Status: | CLOSED ERRATA | QA Contact: | Chao Yang <chaoyang> |
Severity: | low | Docs Contact: | |
Priority: | low | ||
Version: | 3.3.0 | CC: | aos-bugs, bchilds, eparis, jhou, jsafrane, lxia, screeley, smunilla |
Target Milestone: | --- | ||
Target Release: | 3.7.0 | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | atomic-openshift-3.6.74-1.git.0.e6d1637.el7 | Doc Type: | No Doc Update |
Doc Text: |
|
Story Points: | --- |
Clone Of: | Environment: | ||
Last Closed: | 2017-11-28 21:51:43 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
Chao Yang
2016-07-14 08:33:18 UTC
This is actually very similar to bug #1360236 and PR https://github.com/kubernetes/kubernetes/pull/27778. Once that PR is merged it should be trivial to extend it to add an error like 'failed to mount volume as "xfs", it's already formatted with "ext4". Mount error: ....'. I have an experimental fix at https://github.com/jsafrane/kubernetes/commit/3e658e22d0474a27d487c5f73c353929ddbada35, but it requires Scott's event recorder changes in #1360236 and perhaps some plumbing.

Pushed to Kubernetes: https://github.com/kubernetes/kubernetes/pull/31515

Steps to reproduce:

1. Create a PV with ext4 format.
2. Create a PVC and a pod, then check that the pod is running.
3. Delete the pod, PVC, and PV; check that the EBS volume shows as available in the web console.
4. Create a PV backed by the same EBS volume, this time with xfs format.
5. Create a PVC and a pod. The pod status stays ContainerCreating.
6. The EBS volume is in the "attached" state: Volume "kubernetes.io/aws-ebs/vol-0db9a2d474c9a9d71"/Node "ip-172-18-1-53.ec2.internal" is attached--touching

```
[root@ip-172-18-3-207 ~]# oc describe pods mypod
Name:            mypod
Namespace:       default
Security Policy: anyuid
Node:            ip-172-18-1-53.ec2.internal/172.18.1.53
Start Time:      Mon, 05 Jun 2017 02:41:40 -0400
Labels:          name=frontendhttp
Status:          Pending
IP:
Controllers:     <none>
Containers:
  myfrontend:
    Container ID:
    Image:         jhou/hello-openshift
    Image ID:
    Port:          80/TCP
    State:         Waiting
      Reason:      ContainerCreating
    Ready:         False
    Restart Count: 0
    Volume Mounts:
      /tmp from aws (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-xz5om (ro)
    Environment Variables: <none>
Conditions:
  Type          Status
  Initialized   True
  Ready         False
  PodScheduled  True
Volumes:
  aws:
    Type:      PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName: ebsc
    ReadOnly:  false
  default-token-xz5om:
    Type:       Secret (a volume populated by a Secret)
    SecretName: default-token-xz5om
QoS Tier: BestEffort
Events:
  FirstSeen  LastSeen  Count  From                                   SubobjectPath  Type     Reason       Message
  ---------  --------  -----  ----                                   -------------  -------  ------       -------
  2m         2m        1      {default-scheduler }                                  Normal   Scheduled    Successfully assigned mypod to ip-172-18-1-53.ec2.internal
  6s         6s        1      {kubelet ip-172-18-1-53.ec2.internal}                 Warning  FailedMount  Unable to mount volumes for pod "mypod_default(07e58ca2-49ba-11e7-81bb-0e813404bf12)": timeout expired waiting for volumes to attach/mount for pod "mypod"/"default". list of unattached/unmounted volumes=[aws]
  6s         6s        1      {kubelet ip-172-18-1-53.ec2.internal}                 Warning  FailedSync   Error syncing pod, skipping: timeout expired waiting for volumes to attach/mount for pod "mypod"/"default". list of unattached/unmounted volumes=[aws]

[root@ip-172-18-3-207 ~]# oc version
oc v3.3.1.34
kubernetes v1.3.0+52492b4
```

> Error syncing pod, skipping: timeout expired waiting for volumes to attach/mount for pod "mypod"/"default". list of unattached/unmounted volumes=[aws]

This is a different bug. Please try to find out as much as possible about what is wrong (the openshift-node logs in particular may be useful) and open a new bug, or try to find one that is already open. Changing the filesystem from ext4 to xfs makes a difference *after* the volume is attached to a node.

I've just checked that it works as described in comment #1 when using:

```
oc v3.6.74
kubernetes v1.6.1+5115d708d7
features: Basic-Auth GSSAPI Kerberos SPNEGO
Server https://ip-172-18-11-145.ec2.internal:8443
openshift v3.6.74
kubernetes v1.6.1+5115d708d7
```

Node log is as below:

```
Jun 5 07:24:36 ip-172-18-1-53 atomic-openshift-node: I0605 07:24:36.262049 9192 attacher.go:111] Successfully found attached AWS Volume "vol-0db9a2d474c9a9d71".
Jun 5 07:24:36 ip-172-18-1-53 atomic-openshift-node: I0605 07:24:36.262060 9192 operation_executor.go:704] MountVolume.WaitForAttach succeeded for volume "kubernetes.io/aws-ebs/vol-0db9a2d474c9a9d71" (spec.Name: "pv0002") pod "07e58ca2-49ba-11e7-81bb-0e813404bf12" (UID: "07e58ca2-49ba-11e7-81bb-0e813404bf12").
Jun 5 07:24:36 ip-172-18-1-53 atomic-openshift-node: I0605 07:24:36.262227 9192 mount_linux.go:258] Checking for issues with fsck on disk: /dev/xvdbh
Jun 5 07:24:36 ip-172-18-1-53 atomic-openshift-node: I0605 07:24:36.291940 9192 mount_linux.go:277] Attempting to mount disk: xfs /dev/xvdbh /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/aws-ebs/mounts/vol-0db9a2d474c9a9d71
Jun 5 07:24:36 ip-172-18-1-53 atomic-openshift-node: I0605 07:24:36.291957 9192 mount_linux.go:105] Mounting /dev/xvdbh /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/aws-ebs/mounts/vol-0db9a2d474c9a9d71 xfs [defaults]
Jun 5 07:24:36 ip-172-18-1-53 kernel: xvdbh: unknown partition table
Jun 5 07:24:36 ip-172-18-1-53 kernel: XFS (xvdbh): Invalid superblock magic number
Jun 5 07:24:36 ip-172-18-1-53 atomic-openshift-node: E0605 07:24:36.304776 9192 mount_linux.go:110] Mount failed: exit status 32
Jun 5 07:24:36 ip-172-18-1-53 atomic-openshift-node: Mounting arguments: /dev/xvdbh /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/aws-ebs/mounts/vol-0db9a2d474c9a9d71 xfs [defaults]
Jun 5 07:24:36 ip-172-18-1-53 atomic-openshift-node: Output: mount: wrong fs type, bad option, bad superblock on /dev/xvdbh,
Jun 5 07:24:36 ip-172-18-1-53 atomic-openshift-node: missing codepage or helper program, or other error
Jun 5 07:24:36 ip-172-18-1-53 atomic-openshift-node: In some cases useful info is found in syslog - try
Jun 5 07:24:36 ip-172-18-1-53 atomic-openshift-node: dmesg | tail or so.
Jun 5 07:24:36 ip-172-18-1-53 atomic-openshift-node: I0605 07:24:36.304834 9192 mount_linux.go:311] Attempting to determine if disk "/dev/xvdbh" is formatted using lsblk with args: ([-nd -o FSTYPE /dev/xvdbh])
Jun 5 07:24:36 ip-172-18-1-53 atomic-openshift-node: E0605 07:24:36.308453 9192 nestedpendingoperations.go:254] Operation for "\"kubernetes.io/aws-ebs/vol-0db9a2d474c9a9d71\"" failed. No retries permitted until 2017-06-05 07:26:36.308415436 -0400 EDT (durationBeforeRetry 2m0s). Error: MountVolume.MountDevice failed for volume "kubernetes.io/aws-ebs/vol-0db9a2d474c9a9d71" (spec.Name: "pv0002") pod "07e58ca2-49ba-11e7-81bb-0e813404bf12" (UID: "07e58ca2-49ba-11e7-81bb-0e813404bf12") with: mount failed: exit status 32
Jun 5 07:24:36 ip-172-18-1-53 atomic-openshift-node: Mounting arguments: /dev/xvdbh /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/aws-ebs/mounts/vol-0db9a2d474c9a9d71 xfs [defaults]
Jun 5 07:24:36 ip-172-18-1-53 atomic-openshift-node: Output: mount: wrong fs type, bad option, bad superblock on /dev/xvdbh,
```

I test this on:

```
oc v3.5.5.23
kubernetes v1.5.2+43a9be4
features: Basic-Auth GSSAPI Kerberos SPNEGO
Server https://ip-172-18-5-55.ec2.internal:443
openshift v3.5.5.23
kubernetes v1.5.2+43a9be4
```

After changing the filesystem from ext4 to xfs, the pod output is:

```
Events:
  FirstSeen  LastSeen  Count  From                                  SubObjectPath  Type     Reason       Message
  ---------  --------  -----  ----                                  -------------  -------  ------       -------
  16m        16m       1      default-scheduler                                    Normal   Scheduled    Successfully assigned mypod to ip-172-18-5-55.ec2.internal
  14m        1m        7      kubelet, ip-172-18-5-55.ec2.internal                 Warning  FailedMount  Unable to mount volumes for pod "mypod_chao(d9c717a3-49e3-11e7-b5c0-0e14d6e6ec50)": timeout expired waiting for volumes to attach/mount for pod "chao"/"mypod". list of unattached/unmounted volumes=[aws]
  14m        1m        7      kubelet, ip-172-18-5-55.ec2.internal                 Warning  FailedSync   Error syncing pod, skipping: timeout expired waiting for volumes to attach/mount for pod "chao"/"mypod". list of unattached/unmounted volumes=[aws]
  16m        16s       16     kubelet, ip-172-18-5-55.ec2.internal                 Warning  FailedMount  MountVolume.MountDevice failed for volume "kubernetes.io/aws-ebs/vol-0e3bb197dc5d02e48" (spec.Name: "pv0001") pod "d9c717a3-49e3-11e7-b5c0-0e14d6e6ec50" (UID: "d9c717a3-49e3-11e7-b5c0-0e14d6e6ec50") with: exit status 32
```

There is no output similar to "failed to mount volume as "xfs", it's already formatted with "ext4"". The log is as below:

```
Jun 5 07:49:38 ip-172-18-5-55 atomic-openshift-node: I0605 07:49:38.904207 12128 attacher.go:178] Successfully found attached AWS Volume "vol-0e3bb197dc5d02e48".
Jun 5 07:49:38 ip-172-18-5-55 atomic-openshift-node: I0605 07:49:38.904216 12128 operation_executor.go:988] MountVolume.WaitForAttach succeeded for volume "kubernetes.io/aws-ebs/vol-0e3bb197dc5d02e48" (spec.Name: "pv0001") pod "d9c717a3-49e3-11e7-b5c0-0e14d6e6ec50" (UID: "d9c717a3-49e3-11e7-b5c0-0e14d6e6ec50").
Jun 5 07:49:38 ip-172-18-5-55 atomic-openshift-node: I0605 07:49:38.904242 12128 nsenter_mount.go:175] findmnt: directory /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/aws-ebs/mounts/vol-0e3bb197dc5d02e48 does not exist
Jun 5 07:49:38 ip-172-18-5-55 atomic-openshift-node: I0605 07:49:38.904322 12128 mount_linux.go:330] Checking for issues with fsck on disk: /dev/xvdbf
Jun 5 07:49:38 ip-172-18-5-55 atomic-openshift-node: I0605 07:49:38.938543 12128 mount_linux.go:349] Attempting to mount disk: xfs /dev/xvdbf /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/aws-ebs/mounts/vol-0e3bb197dc5d02e48
Jun 5 07:49:38 ip-172-18-5-55 atomic-openshift-node: I0605 07:49:38.938558 12128 nsenter_mount.go:114] nsenter Mounting /dev/xvdbf /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/aws-ebs/mounts/vol-0e3bb197dc5d02e48 xfs [defaults]
Jun 5 07:49:38 ip-172-18-5-55 atomic-openshift-node: I0605 07:49:38.938569 12128 nsenter_mount.go:117] Mount command: nsenter [--mount=/rootfs/proc/1/ns/mnt -- /bin/mount -t xfs -o defaults /dev/xvdbf /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/aws-ebs/mounts/vol-0e3bb197dc5d02e48]
Jun 5 07:49:38 ip-172-18-5-55 kernel: XFS (xvdbf): Invalid superblock magic number
Jun 5 07:49:38 ip-172-18-5-55 atomic-openshift-node: I0605 07:49:38.948141 12128 nsenter_mount.go:121] Output of mounting /dev/xvdbf to /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/aws-ebs/mounts/vol-0e3bb197dc5d02e48: mount: wrong fs type, bad option, bad superblock on /dev/xvdbf,
Jun 5 07:49:38 ip-172-18-5-55 atomic-openshift-node: missing codepage or helper program, or other error
Jun 5 07:49:38 ip-172-18-5-55 atomic-openshift-node: In some cases useful info is found in syslog - try
Jun 5 07:49:38 ip-172-18-5-55 atomic-openshift-node: dmesg | tail or so.
Jun 5 07:49:38 ip-172-18-5-55 atomic-openshift-node: I0605 07:49:38.948180 12128 mount_linux.go:383] Attempting to determine if disk "/dev/xvdbf" is formatted using lsblk with args: ([-nd -o FSTYPE /dev/xvdbf])
Jun 5 07:49:38 ip-172-18-5-55 atomic-openshift-node: E0605 07:49:38.950859 12128 nestedpendingoperations.go:262] Operation for "\"kubernetes.io/aws-ebs/vol-0e3bb197dc5d02e48\"" failed. No retries permitted until 2017-06-05 07:51:38.950836492 -0400 EDT (durationBeforeRetry 2m0s). Error: MountVolume.MountDevice failed for volume "kubernetes.io/aws-ebs/vol-0e3bb197dc5d02e48" (spec.Name: "pv0001") pod "d9c717a3-49e3-11e7-b5c0-0e14d6e6ec50" (UID: "d9c717a3-49e3-11e7-b5c0-0e14d6e6ec50") with: exit status 32
Jun 5 07:49:38 ip-172-18-5-55 atomic-openshift-node: I0605 07:49:38.951298 12128 server.go:664] Event(api.ObjectReference{Kind:"Pod", Namespace:"chao", Name:"mypod", UID:"d9c717a3-49e3-11e7-b5c0-0e14d6e6ec50", APIVersion:"v1", ResourceVersion:"4357", FieldPath:""}): type: 'Warning' reason: 'FailedMount' MountVolume.MountDevice failed for volume "kubernetes.io/aws-ebs/vol-0e3bb197dc5d02e48" (spec.Name: "pv0001") pod "d9c717a3-49e3-11e7-b5c0-0e14d6e6ec50" (UID: "d9c717a3-49e3-11e7-b5c0-0e14d6e6ec50") with: exit status 32
```

I compared the logs from OCP 3.3 and OCP 3.5; the difference is that OCP 3.3 has the extra line "Jun 5 07:24:36 ip-172-18-1-53 kernel: xvdbh: unknown partition table". Is the OCP 3.5 output as expected?

> I test this on
> oc v3.5.5.23

The bugfix is part of 3.6.x, not 3.5.

There is probably some confusion about where the bug should be fixed. How can I see that from the bug fields? "Target release" was filled in *after* I put the bug into the MODIFIED state (expecting a fix in 3.6).
This is a low-priority bug; I personally don't think it makes sense to fix it in 3.3.z, 3.4.z, or 3.5.z.
Test passed on:

```
oc v3.6.86
kubernetes v1.6.1+5115d708d7
features: Basic-Auth GSSAPI Kerberos SPNEGO
Server https://ip-172-18-3-207.ec2.internal:8443
openshift v3.6.86
kubernetes v1.6.1+5115d708d7
```

The pod is ContainerCreating after mounting an ext4-formatted EBS volume as xfs. Some output of `oc describe pods mypod`:

```
FirstSeen  LastSeen  Count  From                                  SubObjectPath  Type     Reason       Message
---------  --------  -----  ----                                  -------------  -------  ------       -------
43s        43s       1      default-scheduler                                    Normal   Scheduled    Successfully assigned mypod to ip-172-18-5-45.ec2.internal
26s        5s        6      kubelet, ip-172-18-5-45.ec2.internal                 Warning  FailedMount  MountVolume.MountDevice failed for volume "kubernetes.io/aws-ebs/vol-0ec9b6417ce6c66a2" (spec.Name: "pv0001") pod "5618dc0e-4a95-11e7-93e4-0e7458766c3e" (UID: "5618dc0e-4a95-11e7-93e4-0e7458766c3e") with: failed to mount the volume as "xfs", it's already formatted with "ext4". Mount error: mount failed: exit status 32
Mounting command: mount
Mounting arguments: /dev/xvdbo /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/aws-ebs/mounts/vol-0ec9b6417ce6c66a2 xfs [defaults]
Output: mount: wrong fs type, bad option, bad superblock on /dev/xvdbo,
missing codepage or helper program, or other error
In some cases useful info is found in syslog - try
dmesg | tail or so.
```

This is passed per https://bugzilla.redhat.com/show_bug.cgi?id=1356478#c10

Tested and passed on:

```
oc v3.7.7
kubernetes v1.7.6+a08f5eeb62
features: Basic-Auth GSSAPI Kerberos SPNEGO
Server https://ip-172-18-2-90.ec2.internal:8443
openshift v3.7.7
kubernetes v1.7.6+a08f5eeb62
```

The output message is:

```
18s  18s  1  kubelet, ip-172-18-6-96.ec2.internal  Warning  FailedMount  MountVolume.MountDevice failed for volume "ebsc" : failed to mount the volume as "xfs", it already contains ext4. Mount error: mount failed: exit status 32
Mounting command: systemd-run
Mounting arguments: --description=Kubernetes transient mount for /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/aws-ebs/mounts/aws/us-east-1d/vol-00b26a69c8c31bb30 --scope -- mount -t xfs -o defaults /dev/xvdbe /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/aws-ebs/mounts/aws/us-east-1d/vol-00b26a69c8c31bb30
Output: Running scope as unit run-118277.scope.
mount: wrong fs type, bad option, bad superblock on /dev/xvdbe,
missing codepage or helper program, or other error
In some cases useful info is found in syslog - try
dmesg | tail or so.
```

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:3188