Bug 1889464 - CSI driver: Could not create file system on attached volume directory
Keywords:
Status: ASSIGNED
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: e2fsprogs
Version: 8.2
Hardware: ppc64le
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: rc
Target Release: 8.2
Assignee: Lukáš Czerner
QA Contact: Boyang Xue
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-10-19 17:58 UTC by Julie
Modified: 2020-11-23 13:33 UTC (History)

Fixed In Version:
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-10-20 08:11:06 UTC
Type: Bug
Target Upstream Version:



Description Julie 2020-10-19 17:58:12 UTC
Description of problem:

Test Scenario:
Create a pod and attach an OpenStack CSI driver volume to it.
  1. Deploy a PowerVM cluster with 3 masters and 2 workers on a PowerVM setup that is connected to a SAN.
  2. Create a volume.
  3. Provision a pod on the worker-0 node with this volume attached.

Issue:
The pod is stuck in the "ContainerCreating" state.
Describing the pod shows the error message:

"E1019 13:46:16.722484       1 utils.go:48] GRPC error: rpc error: 
code = InvalidArgument desc = Could not create file system on attached volume directory /dev/sr0. 
Error is exit status 1".

mkfs.ext4 does not work with the -F parameter.


Version-Release number of selected component (if applicable):

[root@mjulie-ocp46-4bc2-bastion powervc-csi]# oc version
Client Version: 4.6.0-rc.4
Server Version: 4.6.0-rc.4
Kubernetes Version: v1.19.0+d59ce34


How reproducible:
Always

Steps to Reproduce:
1. Install CSI driver on the OCP 4.6 cluster
2. Create a volume in the OpenStack environment

a. vi dynamic-pvc.yaml 
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: dyn-pvc
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 1Gi

b. oc apply -f dynamic-pvc.yaml
c. oc get pvc
```
NAME       STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS                     AGE
dyn-pvc    Bound    pvc-7bf53778-3569-47a2-91d9-ac13833bb967   1Gi        RWX            ibm-powervc-csi-volume-default   6h50m


3. Create a pod that attaches this volume
a. vi dynamic-pod.yaml

apiVersion: v1
kind: Pod
metadata:
  name: dyn-pod
  labels:
    app: nginx
spec:
  containers:
   - name: web-server
     image: nginx
     volumeMounts:
       - name: mypvc
         mountPath: /usr/share/nginx/html/powervc
     ports:
     - containerPort: 80
  volumes:
   - name: mypvc
     persistentVolumeClaim:
       claimName: dyn-pvc
       readOnly: false

b. oc apply -f dynamic-pod.yaml

[root@mjulie-ocp46-4bc2-bastion powervc-csi]# oc get pods
NAME                                   READY   STATUS              RESTARTS   AGE
dyn-pod                                0/1     ContainerCreating   0          6h43m


Actual results:
The pod is stuck in the "ContainerCreating" state.
Describing the pod shows the error message:

[root@mjulie-ocp46-4bc2-bastion powervc-csi]# oc describe pod dyn-pod

Events:
  Type     Reason                  Age                    From                     Message
  ----     ------                  ----                   ----                     -------
  Normal   Scheduled               9m50s                  default-scheduler        Successfully assigned myproject/dyn-pod to worker-0
  Normal   SuccessfulAttachVolume  8m54s                  attachdetach-controller  AttachVolume.Attach succeeded for volume 
  "pvc-7bf53778-3569-47a2-91d9-ac13833bb967"
  Warning  FailedMount             3m15s (x2 over 5m30s)  kubelet                  Unable to attach or mount volumes: unmounted volumes=[mypvc], 
  unattached volumes=[mypvc default-token-8pl67]: timed out waiting for the condition
  Warning  FailedMount             67s (x7 over 7m32s)    kubelet                  MountVolume.MountDevice failed for volume 
  "pvc-7bf53778-3569-47a2-91d9-ac13833bb967" : rpc error: code = InvalidArgument desc = 
  Could not create file system on attached volume directory /dev/sr0. Error is exit status 1
  Warning  FailedMount             57s (x2 over 7m47s)    kubelet                  Unable to attach or mount volumes: 
  unmounted volumes=[mypvc], unattached volumes=[default-token-8pl67 mypvc]: timed out waiting for the condition
[root@mjulie-ocp46-4bc2-bastion powervc-csi]#


Expected results:
If the CSI test case succeeds, /dev/sr0 should be mounted on the worker-0 node. Once mkfs and mount have completed, it should appear in the output of the "mount" command on that node.

Master Log:

Node Log (of failed PODs):

PV Dump:
PVC Dump:

[root@mjulie-ocp46-4bc2-bastion powervc-csi]# oc describe pvc  dyn-pvc


Events:
  Type    Reason                 Age                 From                                                                                       Message
  ----    ------                 ----                ----                                                                                       -------
  Normal  Provisioning           7h                  ibm-powervc-csi_ibm-powervc-csi-provisioner-plugin-0_a689c6ab-240d-415b-a0b9-4d4a1d1d0d5a  External provisioner is provisioning volume for claim "myproject/dyn-pvc"
  Normal  ExternalProvisioning   6h59m (x5 over 7h)  persistentvolume-controller                                                                waiting for a volume to be created, either by external provisioner "ibm-powervc-csi" or manually created by system administrator
  Normal  ProvisioningSucceeded  6h59m               ibm-powervc-csi_ibm-powervc-csi-provisioner-plugin-0_a689c6ab-240d-415b-a0b9-4d4a1d1d0d5a  Successfully provisioned volume pvc-7bf53778-3569-47a2-91d9-ac13833bb967


[root@mjulie-ocp46-4bc2-bastion powervc-csi]# oc describe pod dyn-pod

```
Events:
  Type     Reason                  Age                    From                     Message
  ----     ------                  ----                   ----                     -------
  Normal   Scheduled               9m50s                  default-scheduler        Successfully assigned myproject/dyn-pod to worker-0
  Normal   SuccessfulAttachVolume  8m54s                  attachdetach-controller  AttachVolume.Attach succeeded for volume 
  "pvc-7bf53778-3569-47a2-91d9-ac13833bb967"
  Warning  FailedMount             3m15s (x2 over 5m30s)  kubelet                  Unable to attach or mount volumes: unmounted volumes=[mypvc], 
  unattached volumes=[mypvc default-token-8pl67]: timed out waiting for the condition
  Warning  FailedMount             67s (x7 over 7m32s)    kubelet                  MountVolume.MountDevice failed for volume 
  "pvc-7bf53778-3569-47a2-91d9-ac13833bb967" : rpc error: code = InvalidArgument desc = 
  Could not create file system on attached volume directory /dev/sr0. Error is exit status 1
  Warning  FailedMount             57s (x2 over 7m47s)    kubelet                  Unable to attach or mount volumes: 
  unmounted volumes=[mypvc], unattached volumes=[default-token-8pl67 mypvc]: timed out waiting for the condition


StorageClass Dump (if StorageClass used by PV/PVC):


[root@mjulie-ocp46-4bc2-bastion powervc-csi]# oc describe sc ibm-powervc-csi-volume-default
Name:            ibm-powervc-csi-volume-default
IsDefaultClass:  Yes
Annotations:     kubectl.kubernetes.io/last-applied-configuration={"allowVolumeExpansion":true,"apiVersion":"storage.k8s.io/v1","kind":"StorageClass","metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"true"},"name":"ibm-powervc-csi-volume-default"},"parameters":{"csi.storage.k8s.io/fstype":"ext4","type":"svc-ocp base template"},"provisioner":"ibm-powervc-csi"}
,storageclass.kubernetes.io/is-default-class=true
Provisioner:           ibm-powervc-csi
Parameters:            csi.storage.k8s.io/fstype=ext4,type=svc-ocp base template
AllowVolumeExpansion:  True
MountOptions:          <none>
ReclaimPolicy:         Delete
VolumeBindingMode:     Immediate
Events:                <none>


Logs on csi-driver pod shows:
oc logs ibm-powervc-csi-plugin-dslwp -c ibm-powervc-csi


2020/10/19 13:46:16 Attached Volume does not has a file system
2020/10/19 13:46:16 Running command /usr/bin/sudo [/usr/sbin/mkfs.ext4 /dev/sr0 -F]
2020/10/19 13:46:16 Command output  Syntax error in mke2fs config file (<default>, line #22)
        Unknown code prof 17
E1019 13:46:16.722484       1 utils.go:48] GRPC error: rpc error: code =
InvalidArgument desc = Could not create file system on attached volume directory /dev/sr0. 
Error is exit status 1

Comment 1 Jan Safranek 2020-10-20 08:11:06 UTC
> Provisioner:           ibm-powervc-csi

Filesystem creation is the task of the CSI driver, ibm-powervc-csi in this case. Please fix the driver and how it calls mkfs.ext4. There is nothing wrong in OpenShift.

Comment 2 Sridhar Venkat (IBM) 2020-10-20 13:00:48 UTC
The command used in the CoreOS environment to create the file system has been

/usr/bin/sudo /usr/sbin/mkfs.ext4 /dev/sr0 -F

where /dev/sr0 in this case is the device on which the ext4 file system needs to be created. This worked up to the 4.5 release of RHEL/CoreOS but does not work in 4.6. This bug needs to be addressed by the owner of mkfs.ext4, or more precisely by the owner of mke2fs, since mkfs.ext4 calls mke2fs under the covers.

This is not an OpenShift Container Platform problem; it needs to go to LTC.

Comment 3 Boyang Xue 2020-10-21 03:40:08 UTC
Could you provide the e2fsprogs version from the last working environment and the first bad environment? Thanks!

Comment 4 Julie 2020-10-21 16:43:59 UTC
(In reply to Boyang Xue from comment #3)
> Could you provide the e2fsprogs version from the last working environment and
> the first bad environment? Thanks!

I will try this scenario on an OCP4.5 cluster to see if it works as desired, and let you know. Thanks.

Comment 5 Julie 2020-10-23 16:09:05 UTC
(In reply to Julie from comment #4)
> (In reply to Boyang Xue from comment #3)
> > Could you provide the e2fsprogs version from the last working environment and
> > the first bad environment? Thanks!
> 
> I will try this scenario on an OCP4.5 cluster to see if it works as desired,
> and let you know. Thanks.


Tested this on the Power platform with several versions of OCP: 4.6, 4.5.15, 4.4.27, and 4.4.9.

OCP versions 4.6, 4.5, and 4.4.27 include the same version of the 'e2fsprogs' package.
The mkfs.ext4 command does NOT work on the worker node.

[core@worker-1 ~]$ rpm -qa | grep e2fsprogs
e2fsprogs-libs-1.45.4-3.el8.ppc64le
e2fsprogs-1.45.4-3.el8.ppc64le


OCP version 4.4.9 includes the following version of 'e2fsprogs'.
The mkfs.ext4 command works fine on the worker node.

[core@worker-0 ~]$ rpm -qa | grep e2fsprogs
e2fsprogs-libs-1.44.6-3.el8.ppc64le
e2fsprogs-1.44.6-3.el8.ppc64le
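
To compare nodes quickly, each RPM name can be reduced to its upstream version. This is only a sketch that assumes names shaped like the rpm output above; the helper name e2fsprogs_version is made up for illustration.

```shell
# Hypothetical helper: reduce an RPM name such as
# "e2fsprogs-1.45.4-3.el8.ppc64le" to its upstream version.
# The body runs in a subshell, so nvra does not leak into the caller.
e2fsprogs_version() (
    nvra="${1#e2fsprogs-}"        # drop the package name prefix
    printf '%s\n' "${nvra%%-*}"   # drop the release and architecture
)

# On a node, the installed package name can be fed in directly:
#   e2fsprogs_version "$(rpm -q e2fsprogs)"
```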

Hope this clarifies your query, Boyang Xue.

Comment 6 Boyang Xue 2020-10-27 02:58:42 UTC
Thanks Julie!

From e2fsprogs 1.44 to 1.45, the default /etc/mke2fs.conf changed:

[root@ci-vm-10-0-137-252 ~]# diff -u /etc/mke2fs.144 /etc/mke2fs.145
--- /etc/mke2fs.144     2020-10-26 22:48:57.411428528 -0400
+++ /etc/mke2fs.145     2020-10-26 22:49:03.256428528 -0400
@@ -14,6 +14,16 @@
                features = has_journal,extent,huge_file,flex_bg,metadata_csum,64bit,dir_nlink,extra_isize
                inode_size = 256
        }
+       rhel6_ext4 = {
+               features = has_journal,extent,huge_file,flex_bg,uninit_bg,dir_nlink,extra_isize
+               inode_size = 256
+               enable_periodic_fsck = 1
+               default_mntopts = ""
+       }
+       rhel7_ext4 = {
+               features = has_journal,extent,huge_file,flex_bg,uninit_bg,dir_nlink,extra_isize,64bit
+               inode_size = 256
+       }
        small = {
                blocksize = 1024
                inode_size = 128

In comment #0, mke2fs quit with:

```
2020/10/19 13:46:16 Running command /usr/bin/sudo [/usr/sbin/mkfs.ext4 /dev/sr0 -F]
2020/10/19 13:46:16 Command output  Syntax error in mke2fs config file (<default>, line #22)
        Unknown code prof 17
```

I think this bug may be related to the above change to the default /etc/mke2fs.conf.

Have you modified /etc/mke2fs.conf on the node where mkfs.ext4 doesn't work? It would be best if you could paste or attach the /etc/mke2fs.conf.
What is /dev/sr0? Does it point to a CD-ROM?
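
The device question can be checked on the node itself. A minimal sketch, assuming the kernel exposes the usual /sys/class/block/<name>/ro attribute; the function name and the optional sysfs-root argument are hypothetical (the latter only exists so the check can be tried against a fake tree).

```shell
# Hypothetical helper: succeed if the kernel flags the block device read-only.
# $1 = device path (e.g. /dev/sr0), $2 = sysfs root (defaults to /sys).
device_is_readonly() (
    name="${1##*/}"                            # /dev/sr0 -> sr0
    ro_file="${2:-/sys}/class/block/$name/ro"
    [ -r "$ro_file" ] && [ "$(cat "$ro_file")" = "1" ]
)

# Example use on a node:
#   device_is_readonly /dev/sr0 && echo "/dev/sr0 is read-only"
```

Running "lsblk -o NAME,TYPE,RO /dev/sr0" gives the same information interactively; a TYPE of "rom" indicates an optical drive.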

Comment 7 Julie 2020-10-27 11:40:30 UTC
This is the worker node of the OCP 4.6 cluster where I hit the issue.

[core@worker-0 ~]$ rpm -qa | grep e2fsprogs
e2fsprogs-libs-1.45.4-3.el8.ppc64le
e2fsprogs-1.45.4-3.el8.ppc64le


[core@worker-0 ~]$ vi /etc/mke2fs.conf

[defaults]
        base_features = sparse_super,large_file,filetype,resize_inode,dir_index,ext_attr
        default_mntopts = acl,user_xattr
        enable_periodic_fsck = 0
        blocksize = 4096
        inode_size = 256
        inode_ratio = 16384

[fs_types]
        ext3 = {
                features = has_journal
        }
        ext4 = {
                features = has_journal,extent,huge_file,flex_bg,metadata_csum,64bit,dir_nlink,extra_isize
                inode_size = 256
        }
        rhel6_ext4 = {
                features = has_journal,extent,huge_file,flex_bg,uninit_bg,dir_nlink,extra_isize
                inode_size = 256
                enable_periodic_fsck = 1
                default_mntopts = ""
        }
        rhel7_ext4 = {
                features = has_journal,extent,huge_file,flex_bg,uninit_bg,dir_nlink,extra_isize,64bit
                inode_size = 256
        }
        small = {
                blocksize = 1024
                inode_size = 128
                inode_ratio = 4096
        }
        floppy = {
                blocksize = 1024
                inode_size = 128
                inode_ratio = 8192
        }
        big = {
                inode_ratio = 32768
        }
        huge = {
                inode_ratio = 65536
        }
        news = {
                inode_ratio = 4096
        }
        largefile = {
                inode_ratio = 1048576
                blocksize = -1
 }
        hurd = {
             blocksize = 4096
             inode_size = 128
        }

Comment 8 Boyang Xue 2020-10-27 14:03:36 UTC
(In reply to Julie from comment #7)
> This is the worker node of OCP 4.6 cluster where I hit the issue.
> 
> [core@worker-0 ~]$ rpm -qa | grep e2fsprogs
> e2fsprogs-libs-1.45.4-3.el8.ppc64le
> e2fsprogs-1.45.4-3.el8.ppc64le
> 
> 
> [core@worker-0 ~]$ vi /etc/mke2fs.conf
> 
> [defaults]
>         base_features =
> sparse_super,large_file,filetype,resize_inode,dir_index,ext_attr
>         default_mntopts = acl,user_xattr
>         enable_periodic_fsck = 0
>         blocksize = 4096
>         inode_size = 256
>         inode_ratio = 16384
> 
> [fs_types]
>         ext3 = {
>                 features = has_journal
>         }
>         ext4 = {
>                 features =
> has_journal,extent,huge_file,flex_bg,metadata_csum,64bit,dir_nlink,
> extra_isize
>                 inode_size = 256
>         }
>         rhel6_ext4 = {
>                 features =
> has_journal,extent,huge_file,flex_bg,uninit_bg,dir_nlink,extra_isize
>                 inode_size = 256
>                 enable_periodic_fsck = 1
>                 default_mntopts = ""
>         }

Line 22. There seems to be a problem in the above line, probably invisible characters. I can reproduce this by adding a character to make the line invalid:

[root@kvm103 ~]# head -n 22 /etc/mke2fs.conf
[defaults]
        base_features = sparse_super,large_file,filetype,resize_inode,dir_index,ext_attr
        default_mntopts = acl,user_xattr
        enable_periodic_fsck = 0
        blocksize = 4096
        inode_size = 256
        inode_ratio = 16384

[fs_types]
        ext3 = {
                features = has_journal
        }
        ext4 = {
                ifeatures = has_journal,extent,huge_file,flex_bg,metadata_csum,64bit,dir_nlink,extra_isize
                inode_size = 256
}
        rhel6_ext4 = {
                features = has_journal,extent,huge_file,flex_bg,uninit_bg,dir_nlink,extra_isize
                inode_size = 256
                enable_periodic_fsck = 1
                default_mntopts = ""
i}
[root@kvm103 ~]# mkfs.ext4 -F /dev/vda6
Syntax error in mke2fs config file (/etc/mke2fs.conf, line #22)
        Unknown code prof 15

Is it possible to copy a clean /etc/mke2fs.conf from another system to replace current /etc/mke2fs.conf, and see if it will work?
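
To test the invisible-character theory directly, the file can be scanned for bytes other than tab and printable ASCII. A sketch, assuming GNU grep is available on the node; the function name is hypothetical.

```shell
# Hypothetical helper: print "line-number:line" for every line of a file that
# contains a byte other than TAB or printable ASCII (space through tilde).
# Returns non-zero when no such line exists.
find_suspect_lines() (
    tab=$(printf '\t')
    # LC_ALL=C makes " -~" a plain byte range; -a treats the file as text
    # even if it contains stray binary bytes.
    LC_ALL=C grep -an "[^${tab} -~]" "$1"
)

# Example use on a node:
#   find_suspect_lines /etc/mke2fs.conf || echo "no suspect bytes found"
```

"cat -A /etc/mke2fs.conf" is a quick manual alternative that makes non-printing characters visible.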


Comment 9 Boyang Xue 2020-10-27 14:12:45 UTC
(In reply to Boyang Xue from comment #8)
> (In reply to Julie from comment #7)
> > This is the worker node of OCP 4.6 cluster where I hit the issue.
> > 
> > [core@worker-0 ~]$ rpm -qa | grep e2fsprogs
> > e2fsprogs-libs-1.45.4-3.el8.ppc64le
> > e2fsprogs-1.45.4-3.el8.ppc64le
> > 
> > 
> > [core@worker-0 ~]$ vi /etc/mke2fs.conf
> > 
> > [defaults]
> >         base_features =
> > sparse_super,large_file,filetype,resize_inode,dir_index,ext_attr
> >         default_mntopts = acl,user_xattr
> >         enable_periodic_fsck = 0
> >         blocksize = 4096
> >         inode_size = 256
> >         inode_ratio = 16384
> > 
> > [fs_types]
> >         ext3 = {
> >                 features = has_journal
> >         }
> >         ext4 = {
> >                 features =
> > has_journal,extent,huge_file,flex_bg,metadata_csum,64bit,dir_nlink,
> > extra_isize
> >                 inode_size = 256
> >         }
> >         rhel6_ext4 = {
> >                 features =
> > has_journal,extent,huge_file,flex_bg,uninit_bg,dir_nlink,extra_isize
> >                 inode_size = 256
> >                 enable_periodic_fsck = 1
> >                 default_mntopts = ""
> >         }
> 
> Line 22. There seems to be a problem in the above line, probably invisible
> characters. I can reproduce this by adding a character to make the line invalid:
> 
> [root@kvm103 ~]# head -n 22 /etc/mke2fs.conf
> [defaults]
>         base_features =
> sparse_super,large_file,filetype,resize_inode,dir_index,ext_attr
>         default_mntopts = acl,user_xattr
>         enable_periodic_fsck = 0
>         blocksize = 4096
>         inode_size = 256
>         inode_ratio = 16384
> 
> [fs_types]
>         ext3 = {
>                 features = has_journal
>         }
>         ext4 = {
>                 ifeatures =
> has_journal,extent,huge_file,flex_bg,metadata_csum,64bit,dir_nlink,
> extra_isize
>                 inode_size = 256
> }
>         rhel6_ext4 = {
>                 features =
> has_journal,extent,huge_file,flex_bg,uninit_bg,dir_nlink,extra_isize
>                 inode_size = 256
>                 enable_periodic_fsck = 1
>                 default_mntopts = ""
> i}
> [root@kvm103 ~]# mkfs.ext4 -F /dev/vda6
> Syntax error in mke2fs config file (/etc/mke2fs.conf, line #22)
>         Unknown code prof 15

I just noticed that this reproduction differs from the one in comment #0, where the error was:

2020/10/19 13:46:16 Command output  Syntax error in mke2fs config file (<default>, line #22)

Notice the "file name" part is <default>, rather than /etc/mke2fs.conf.

Lukas,

Could you suggest where the <default> config file lives?

Thanks,
Boyang


Comment 10 Lukáš Czerner 2020-10-29 09:50:41 UTC
(In reply to Boyang Xue from comment #9)

> > [root@kvm103 ~]# mkfs.ext4 -F /dev/vda6
> > Syntax error in mke2fs config file (/etc/mke2fs.conf, line #22)
> >         Unknown code prof 15
> 
> I just notice this reproduction has difference with those in comment#0,
> where the error was like:
> 
> 2020/10/19 13:46:16 Command output  Syntax error in mke2fs config file
> (<default>, line #22)
> 
> Notice the "file name" part is <default>, rather than /etc/mke2fs.conf.
> 
> Lukas,
> 
> Could you suggest where the <default> config file lives?
> 
> Thanks,
> Boyang

Hi Boyang,

<default> is not a file. It is just a placeholder name for an internal configuration string that is constructed at compile time from what we actually have in mke2fs.conf. A couple of things are going on here:

1. The <default> profile is only used when the normal configuration file (which should be /etc/mke2fs.conf) does not exist, or when the MKE2FS_CONFIG environment variable is set but no config file exists at that location.

2. The way the default profile is generated removes the double quotes from default_mntopts = "", and apparently the parser cannot deal with default_mntopts having no value. This is present upstream as well.


I have to think about what the proper fix should be and will send one upstream if needed. Meanwhile, the workaround is to make sure that a proper configuration file is present, whether it is the default /etc/mke2fs.conf or the one set by MKE2FS_CONFIG.
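
The selection rule in point 1 can be sketched as a small check. It mirrors only the rule as described here (MKE2FS_CONFIG if set, otherwise /etc/mke2fs.conf, with <default> as the fallback when that file does not exist); it is not taken from the mke2fs source, and the function name is hypothetical.

```shell
# Hypothetical helper: report which configuration mke2fs would be expected
# to use under the rule described in point 1.
mke2fs_config_source() (
    # No colon in the expansion: a variable that is set but empty still
    # counts as "set" and is treated as a path that does not exist.
    cfg="${MKE2FS_CONFIG-/etc/mke2fs.conf}"
    if [ -e "$cfg" ]; then
        printf 'file %s\n' "$cfg"
    else
        printf 'builtin <default> (%s does not exist)\n' "$cfg"
    fi
)
```

For a reproduction, pointing MKE2FS_CONFIG at a nonexistent path should be enough to force the fallback, for example "MKE2FS_CONFIG=/nonexistent mke2fs -n -t ext4 <device>", where -n asks mke2fs not to actually create the file system.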


Julie, could you please check whether the /etc/mke2fs.conf file exists, or, if the MKE2FS_CONFIG environment variable is set at all, that it points to an existing configuration file?

Thanks!

Comment 11 Julie 2020-10-29 14:51:25 UTC
(In reply to Lukáš Czerner from comment #10)

> 
> Julie, could you please check whether /etc/mke2fs.conf file exists, or that
> the MKE2FS_CONFIG environment variable points to existing configuration file
> if it is set at all ?
> 
> Thanks!


Hi Lukas,
On my OCP 4.6 cluster on Power, this conf file DOES EXIST on both the worker nodes.

[core@worker-0 ~]$ ls -al /etc/mke2fs.conf
-rw-r--r--. 1 root root 1108 Oct 19 06:32 /etc/mke2fs.conf

[core@worker-1 ~]$ ls -al /etc/mke2fs.conf
-rw-r--r--. 1 root root 1108 Oct 19 06:32 /etc/mke2fs.conf


I do NOT see any environment variable named "MKE2FS_CONFIG" on the worker nodes.
[core@worker-1 ~]$ echo $MKE2FS_CONFIG

[core@worker-1 ~]$

Comment 12 Lukáš Czerner 2020-10-30 09:16:27 UTC
(In reply to Julie from comment #11)
> (In reply to Lukáš Czerner from comment #10)
> 
> > 
> > Julie, could you please check whether /etc/mke2fs.conf file exists, or that
> > the MKE2FS_CONFIG environment variable points to existing configuration file
> > if it is set at all ?
> > 
> > Thanks!
> 
> 
> Hi Lukas,
> On my OCP 4.6 cluster on Power, this conf file DOES EXIST on both the worker
> nodes.
> 
> [core@worker-0 ~]$ ls -al /etc/mke2fs.conf
> -rw-r--r--. 1 root root 1108 Oct 19 06:32 /etc/mke2fs.conf
> 
> [core@worker-1 ~]$ ls -al /etc/mke2fs.conf
> -rw-r--r--. 1 root root 1108 Oct 19 06:32 /etc/mke2fs.conf
> 
> 
> I do NOT see any env variable with this name "MKE2FS_CONFIG" , on the worker
> nodes.
> [core@worker-1 ~]$ echo $MKE2FS_CONFIG
> 
> [core@worker-1 ~]$

OK, thanks for this information. I have an e2fsprogs fix ready and can provide you with a test e2fsprogs package to check whether it fixes the problem, if you're interested.

However, there is something else going on on that system as well.

2020/10/19 13:46:16 Running command /usr/bin/sudo [/usr/sbin/mkfs.ext4 /dev/sr0 -F]
2020/10/19 13:46:16 Command output  Syntax error in mke2fs config file (<default>, line #22)
        Unknown code prof 17

The fact that it is trying to use the <default> config means that it failed to use the system config file (/etc/mke2fs.conf) for some reason. If the file does in fact exist, then something else is going on. Could you please try running mke2fs manually on that system and provide the output? Also, could you provide the contents of /etc/mke2fs.conf? If mke2fs does fail when run manually, can you please provide strace output of the command as well?

Additionally, I assume that the output above is from some automated process. I am not sure what it is, but is it possible for you to log some additional information before mkfs.ext4 is run? Again, I am interested in whether, in that case, the system config file is accessible and whether the MKE2FS_CONFIG env variable is set.
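
One way to capture that information is to wrap the mkfs call so that the context is logged from the same environment that launches it. This is only a sketch: the wrapper name is hypothetical, and how it would be wired into the automated process is not shown.

```shell
# Hypothetical wrapper: log the caller's identity and the mke2fs config
# situation to stderr, then run the given command unchanged.
run_with_mke2fs_context() (
    cfg="${MKE2FS_CONFIG-/etc/mke2fs.conf}"
    {
        echo "uid: $(id -u)"
        echo "MKE2FS_CONFIG: ${MKE2FS_CONFIG-<unset>}"
        if [ -r "$cfg" ]; then
            echo "config $cfg: readable"
        else
            echo "config $cfg: missing or unreadable"
        fi
    } >&2
    "$@"
)

# Example use:
#   run_with_mke2fs_context /usr/sbin/mkfs.ext4 /dev/sr0 -F
```

If the command goes through sudo, the environment seen by mkfs.ext4 may differ from what is logged here, since sudo typically resets it.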

Thanks!
-Lukas

Comment 13 Julie 2020-10-30 17:50:37 UTC
(In reply to Lukáš Czerner from comment #12)

> 
> Ok, thanks for this information. I have a e2fsprogs fix ready and I can
> provide you with a testing e2fsprogs package to try and see if it fixes the
> problem if you're interested.
> 
> However, there is something else going on on that system as well.
> 
> 2020/10/19 13:46:16 Running command /usr/bin/sudo [/usr/sbin/mkfs.ext4
> /dev/sr0 -F]
> 2020/10/19 13:46:16 Command output  Syntax error in mke2fs config file
> (<default>, line #22)
>         Unknown code prof 17
> 
> The fact that it's trying to use <default> config file, means that it failed
> to use the system config file (/etc/mke2fs.conf) for some reason. It if in
> fact exists then there is something else going on. Could you please try
> running the mke2fs manually on that system and provide output ? Also could
> you provide output of the /etc/mke2fs.conf ? If the mke2fs does fail when
> run manually, can you please provide strace output of the command as well ?
> 
> Additionally, I assume that the output above is from some automated process.
> I am not sure what it is, but is it possible for you to log some additional
> information before the mkfs.ext4 is run ? Again I am interested if in that
> case the system config file is accessible and the MKE2FS_CONFIG env variable
> is set.
> 
> Thanks!
> -Lukas




Hi Lukas,
Alright, please share the "e2fsprogs" package with the fix; I will try it on the OCP 4.6 cluster and let you know the result. Thanks!

Output of the command when run manually:

[core@worker-0 ~]$ /usr/bin/sudo /usr/sbin/mkfs.ext4 /dev/sr0 -F
mke2fs 1.45.4 (23-Sep-2019)
/dev/sr0: Read-only file system while setting up superblock
[core@worker-0 ~]$


Strace output:
[core@worker-0 ~]$ strace /usr/bin/sudo /usr/sbin/mkfs.ext4 /dev/sr0 -F
execve("/usr/bin/sudo", ["/usr/bin/sudo", "/usr/sbin/mkfs.ext4", "/dev/sr0", "-F"], 0x7fffeb1399b8 /* 25 vars */) = 0
access(0x7fff98477680, F_OK)            = -1 ENOENT (No such file or directory)
brk(NULL)                               = 0x1002cfb0000
fcntl(0, F_GETFD)                       = 0
fcntl(1, F_GETFD)                       = 0
fcntl(2, F_GETFD)                       = 0
access(0x7fff98477680, F_OK)            = -1 ENOENT (No such file or directory)
access(0x7fff98478270, R_OK)            = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, 0x7fffe592f610, O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat(0x7fffe592f610, 0x7fffe592f650)    = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, 0x7fffe592f610, O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat(0x7fffe592f610, 0x7fffe592f650)    = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, 0x7fffe592f610, O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat(0x7fffe592f610, 0x7fffe592f650)    = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, 0x7fffe592f610, O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat(0x7fffe592f610, 0x7fffe592f650)    = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, 0x7fffe592f610, O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat(0x7fffe592f610, 0x7fffe592f650)    = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, 0x7fffe592f610, O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat(0x7fffe592f610, 0x7fffe592f650)    = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, 0x7fffe592f610, O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat(0x7fffe592f610, 0x7fffe592f650)    = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, 0x7fffe592f610, O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat(0x7fffe592f610, 0x7fffe592f650)    = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, 0x7fffe592f610, O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat(0x7fffe592f610, 0x7fffe592f650)    = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, 0x7fffe592f610, O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat(0x7fffe592f610, 0x7fffe592f650)    = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, 0x7fffe592f610, O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat(0x7fffe592f610, 0x7fffe592f650)    = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, 0x7fffe592f610, O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat(0x7fffe592f610, 0x7fffe592f650)    = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, 0x7fffe592f610, O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat(0x7fffe592f610, 0x7fffe592f650)    = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, 0x7fffe592f610, O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat(0x7fffe592f610, 0x7fffe592f650)    = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, 0x7fffe592f610, O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat(0x7fffe592f610, 0x7fffe592f650)    = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, 0x7fffe592f610, O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat(0x7fffe592f610, 0x7fffe592f650)    = 0
openat(AT_FDCWD, 0x7fff9847a240, O_RDONLY|O_CLOEXEC) = 3
fstat(3, 0x7fffe592f650)                = 0
mmap(NULL, 25265, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7fff98420000
close(3)                                = 0
openat(AT_FDCWD, 0x7fff98491eb0, O_RDONLY|O_CLOEXEC) = 3
read(3, 0x7fffe592f858, 832)            = 832
fstat(3, 0x7fffe592f680)                = 0
mmap(NULL, 311416, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fff983d0000
mmap(0x7fff98400000, 131072, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x20000) = 0x7fff98400000
close(3)                                = 0
openat(AT_FDCWD, 0x7fffe592f5f0, O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, 0x7fff984923b0, O_RDONLY|O_CLOEXEC) = 3
read(3, 0x7fffe592f838, 832)            = 832
fstat(3, 0x7fffe592f660)                = 0
mmap(NULL, 337280, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fff98370000
mmap(0x7fff983b0000, 131072, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x30000) = 0x7fff983b0000
close(3)                                = 0
openat(AT_FDCWD, 0x7fffe592f5d0, O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, 0x7fff984928b0, O_RDONLY|O_CLOEXEC) = 3
read(3, 0x7fffe592f818, 832)            = 832
fstat(3, 0x7fffe592f640)                = 0
mmap(NULL, 131368, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fff98340000
mmap(0x7fff98350000, 65536, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0) = 0x7fff98350000
mmap(0x7fff98360000, 296, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7fff98360000
close(3)                                = 0
openat(AT_FDCWD, 0x7fffe592f5b0, O_RDONLY|O_CLOEXEC) = 3
read(3, 0x7fffe592f7f8, 832)            = 832
fstat(3, 0x7fffe592f620)                = 0
mmap(NULL, 197720, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fff98300000
mmap(0x7fff98320000, 131072, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x10000) = 0x7fff98320000
close(3)                                = 0
openat(AT_FDCWD, 0x7fffe592f590, O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, 0x7fff984932d0, O_RDONLY|O_CLOEXEC) = 3
read(3, 0x7fffe592f7d8, 832)            = 832
fstat(3, 0x7fffe592f600)                = 0
mmap(NULL, 279840, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fff982b0000
mmap(0x7fff982e0000, 131072, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x20000) = 0x7fff982e0000
close(3)                                = 0
openat(AT_FDCWD, 0x7fffe592f570, O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, 0x7fff984937d0, O_RDONLY|O_CLOEXEC) = 3
read(3, 0x7fffe592f7b8, 832)            = 832
fstat(3, 0x7fffe592f5e0)                = 0
mmap(NULL, 131336, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fff98280000
mmap(0x7fff98290000, 65536, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0) = 0x7fff98290000
mmap(0x7fff982a0000, 264, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7fff982a0000
close(3)                                = 0
openat(AT_FDCWD, 0x7fffe592f550, O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, 0x7fff98493cd0, O_RDONLY|O_CLOEXEC) = 3
read(3, 0x7fffe592f798, 832)            = 832
fstat(3, 0x7fffe592f5c0)                = 0
mmap(NULL, 2183848, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fff98060000
mprotect(0x7fff98250000, 65536, PROT_NONE) = 0
mmap(0x7fff98260000, 131072, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1f0000) = 0x7fff98260000
close(3)                                = 0
openat(AT_FDCWD, 0x7fffe592f530, O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, 0x7fff984941d0, O_RDONLY|O_CLOEXEC) = 3
read(3, 0x7fffe592f778, 832)            = 832
fstat(3, 0x7fffe592f5a0)                = 0
mmap(NULL, 131080, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fff98030000
mmap(0x7fff98040000, 131072, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0) = 0x7fff98040000
close(3)                                = 0
openat(AT_FDCWD, 0x7fffe592f510, O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, 0x7fff98494720, O_RDONLY|O_CLOEXEC) = 3
read(3, 0x7fffe592f758, 832)            = 832
fstat(3, 0x7fffe592f580)                = 0
mmap(NULL, 655792, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fff97f80000
mmap(0x7fff98010000, 131072, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x80000) = 0x7fff98010000
close(3)                                = 0
mprotect(0x7fff98260000, 65536, PROT_READ) = 0
mprotect(0x7fff982e0000, 65536, PROT_READ) = 0
mprotect(0x7fff98010000, 65536, PROT_READ) = 0
mprotect(0x7fff98040000, 65536, PROT_READ) = 0
mprotect(0x7fff98290000, 65536, PROT_READ) = 0
mprotect(0x7fff98320000, 65536, PROT_READ) = 0
mprotect(0x7fff98350000, 65536, PROT_READ) = 0
mprotect(0x7fff983b0000, 65536, PROT_READ) = 0
mprotect(0x7fff98400000, 65536, PROT_READ) = 0
mprotect(0x131ae0000, 65536, PROT_READ) = 0
mprotect(0x7fff98480000, 65536, PROT_READ) = 0
munmap(0x7fff98420000, 25265)           = 0
set_tid_address(0x7fff98495b50)         = 3577691
set_robust_list(0x7fff98495b60, 24)     = 0
rt_sigaction(SIGRTMIN, 0x7fffe5930a78, NULL, 8) = 0
rt_sigaction(SIGRT_1, 0x7fffe5930a78, NULL, 8) = 0
rt_sigprocmask(SIG_UNBLOCK, 0x7fffe5930c18, NULL, 8) = 0
prlimit64(0, RLIMIT_STACK, NULL, 0x7fffe5930c00) = 0
statfs(0x7fff9839ffc8, 0x7fffe5930c20)  = 0
statfs(0x7fff9839ffc8, 0x7fffe5930af0)  = 0
brk(NULL)                               = 0x1002cfb0000
brk(0x1002cfe0000)                      = 0x1002cfe0000
access(0x7fff983a0008, F_OK)            = 0
prlimit64(0, RLIMIT_AS, NULL, 0x131af0028) = 0
prlimit64(0, RLIMIT_AS, 0x7fffe59307f8, NULL) = 0
prlimit64(0, RLIMIT_CPU, NULL, 0x131af0040) = 0
prlimit64(0, RLIMIT_CPU, 0x7fffe59307f8, NULL) = 0
prlimit64(0, RLIMIT_DATA, NULL, 0x131af0058) = 0
prlimit64(0, RLIMIT_DATA, 0x7fffe59307f8, NULL) = 0
prlimit64(0, RLIMIT_FSIZE, NULL, 0x131af0070) = 0
prlimit64(0, RLIMIT_FSIZE, 0x7fffe59307f8, NULL) = 0
prlimit64(0, RLIMIT_NOFILE, NULL, 0x131af0088) = 0
prlimit64(0, RLIMIT_NOFILE, 0x7fffe59307f8, NULL) = -1 EPERM (Operation not permitted)
prlimit64(0, RLIMIT_NOFILE, 0x7fffe5930808, NULL) = 0
prlimit64(0, RLIMIT_NPROC, NULL, 0x131af00a0) = 0
prlimit64(0, RLIMIT_NPROC, 0x7fffe59307f8, NULL) = -1 EPERM (Operation not permitted)
prlimit64(0, RLIMIT_NPROC, 0x7fffe5930808, NULL) = 0
prlimit64(0, RLIMIT_RSS, NULL, 0x131af00b8) = 0
prlimit64(0, RLIMIT_RSS, 0x7fffe59307f8, NULL) = 0
prlimit64(0, RLIMIT_STACK, NULL, 0x131af00d0) = 0
prlimit64(0, RLIMIT_STACK, 0x7fffe59307f8, NULL) = 0
fcntl(0, F_GETFL)                       = 0x402 (flags O_RDWR|O_APPEND)
fcntl(1, F_GETFL)                       = 0x402 (flags O_RDWR|O_APPEND)
fcntl(2, F_GETFL)                       = 0x402 (flags O_RDWR|O_APPEND)
openat(AT_FDCWD, 0x7fff9820c9b8, O_RDONLY|O_CLOEXEC) = 3
fstat(3, 0x7fff98271b30)                = 0
mmap(NULL, 3663584, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7fff97c00000
close(3)                                = 0
openat(AT_FDCWD, 0x7fff98210b08, O_RDONLY|O_CLOEXEC) = 3
fstat(3, 0x7fffe59306b0)                = 0
fstat(3, 0x7fffe5930458)                = 0
read(3, 0x1002cfb1670, 8192)            = 127
_llseek(3, -71, 0x7fffe5930460, SEEK_CUR) = 0
read(3, 0x1002cfb1670, 8192)            = 71
close(3)                                = 0
stat(0x7fff98319188, 0x7fffe59305f8)    = 0
openat(AT_FDCWD, 0x7fff98319188, O_RDONLY) = -1 EACCES (Permission denied)
geteuid()                               = 1000
geteuid()                               = 1000
stat(0x7fffe593f758, 0x7fffe592f788)    = 0
openat(AT_FDCWD, 0x7fffe592f100, O_RDONLY|O_CLOEXEC) = 3
fstat(3, 0x7fffe592ee68)                = 0
read(3, 0x1002cfb36c0, 8192)            = 2997
read(3, "", 8192)                       = 0
close(3)                                = 0
openat(AT_FDCWD, 0x1002cfb25c0, O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, 0x1002cfb27d0, O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, 0x1002cfb2650, O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, 0x1002cfb2750, O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, 0x1002cfb2860, O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, 0x1002cfb26d0, O_RDONLY) = -1 ENOENT (No such file or directory)
write(2, 0x7fffe593f761, 4sudo)             = 4
write(2, 0x7fff98318b00, 2: )             = 2
write(2, 0x7fffe592cb78, 133effective uid is not 0, is /usr/bin/sudo on a file system with the 'nosuid' option set or an NFS file system without root privileges?)           = 133
write(2, 0x7fff9827178b, 1
)             = 1
exit_group(1)                           = ?
+++ exited with 1 +++
[core@worker-0 ~]$



 vi  /etc/mke2fs.conf
[defaults]
        base_features = sparse_super,large_file,filetype,resize_inode,dir_index,ext_attr
        default_mntopts = acl,user_xattr
        enable_periodic_fsck = 0
        blocksize = 4096
        inode_size = 256
        inode_ratio = 16384

[fs_types]
        ext3 = {
                features = has_journal
        }
        ext4 = {
                features = has_journal,extent,huge_file,flex_bg,metadata_csum,64bit,dir_nlink,extra_isize
                inode_size = 256
        }
        rhel6_ext4 = {
                features = has_journal,extent,huge_file,flex_bg,uninit_bg,dir_nlink,extra_isize
                inode_size = 256
                enable_periodic_fsck = 1
                default_mntopts = ""
        }
        rhel7_ext4 = {
                features = has_journal,extent,huge_file,flex_bg,uninit_bg,dir_nlink,extra_isize,64bit
                inode_size = 256
        }
        small = {
                blocksize = 1024
                inode_size = 128
                inode_ratio = 4096
        }
        floppy = {
                blocksize = 1024
                inode_size = 128
                inode_ratio = 8192
        }
        big = {
                inode_ratio = 32768
        }
        huge = {
                inode_ratio = 65536
        }
        news = {
                inode_ratio = 4096
        }
        largefile = {
                inode_ratio = 1048576
                blocksize = -1
        }
        largefile4 = {
                inode_ratio = 4194304
                blocksize = -1
        }
        hurd = {
             blocksize = 4096
             inode_size = 128
        }

Comment 14 Lukáš Czerner 2020-11-02 08:17:13 UTC
(In reply to Julie from comment #13)
> (In reply to Lukáš Czerner from comment #12)
> 
> > 
> > Ok, thanks for this information. I have a e2fsprogs fix ready and I can
> > provide you with a testing e2fsprogs package to try and see if it fixes the
> > problem if you're interested.
> > 
> > However, there is something else going on on that system as well.
> > 
> > 2020/10/19 13:46:16 Running command /usr/bin/sudo [/usr/sbin/mkfs.ext4
> > /dev/sr0 -F]
> > 2020/10/19 13:46:16 Command output  Syntax error in mke2fs config file
> > (<default>, line #22)
> >         Unknown code prof 17
> > 
> > The fact that it's trying to use <default> config file, means that it failed
> > to use the system config file (/etc/mke2fs.conf) for some reason. It if in
> > fact exists then there is something else going on. Could you please try
> > running the mke2fs manually on that system and provide output ? Also could
> > you provide output of the /etc/mke2fs.conf ? If the mke2fs does fail when
> > run manually, can you please provide strace output of the command as well ?
> > 
> > Additionally, I assume that the output above is from some automated process.
> > I am not sure what it is, but is it possible for you to log some additional
> > information before the mkfs.ext4 is run ? Again I am interested if in that
> > case the system config file is accessible and the MKE2FS_CONFIG env variable
> > is set.
> > 
> > Thanks!
> > -Lukas
> 
> 
> 
> 
> Hi Lukas,
> Alright, please share the "e2fsprogs" package with the fix, I shall try it
> on the OCP 4.6 cluster and let you know the result. Thanks!

Ok, as soon as I have the build I'll send it to you.

> 
> Output of the command on running it manually,
> 
> [core@worker-0 ~]$ /usr/bin/sudo /usr/sbin/mkfs.ext4 /dev/sr0 -F
> mke2fs 1.45.4 (23-Sep-2019)
> /dev/sr0: Read-only file system while setting up superblock
> [core@worker-0 ~]$

The device used here is read-only; can you try one that's usable for creating a file system? Note that the read-only check should happen after the configuration file is parsed, so I'd say that it is working fine when run manually, but let's confirm by running on a usable device first.

-Lukas
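
Before retrying on another device, it may help to check whether the kernel flags the target as read-only. The following is a minimal sketch, assuming the sysfs attribute /sys/block/<name>/ro is available on the node; the function name and its optional second argument (an alternate sysfs root, useful for dry runs) are hypothetical.

```shell
#!/bin/sh
# Sketch: report whether a whole-disk block device is flagged read-only.
# Reads /sys/block/<name>/ro, where "1" means read-only.
# Returns 0 = read-only, 1 = writable, 2 = no sysfs entry found.
# The second argument (sysfs root) is a hypothetical hook for testing.
is_readonly_dev() {
    name=$(basename "$1")            # /dev/sr0 -> sr0
    sysroot=${2:-/sys}
    ro_file="$sysroot/block/$name/ro"
    [ -r "$ro_file" ] || return 2    # partition, unknown device, or no sysfs
    [ "$(cat "$ro_file")" = "1" ]
}
```

For example, `is_readonly_dev /dev/sr0 && echo "read-only, pick another device"`. Partitions such as /dev/sdc1 do not have an entry directly under /sys/block, so the sketch reports them as unknown.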

Comment 15 Fredrik Nyström 2020-11-02 09:54:20 UTC
Regarding broken internal mke2fs.conf, will this be fixed so that it is 
once again possible to run mkfs/fsck standalone without need for 
external config file?

We add mkfs.ext4 and fsck.ext4 to initramfs and run from 
dracut-initqueue.

Easy way to check if internal config is broken:
MKE2FS_CONFIG= mkfs.ext4
Syntax error in mke2fs config file (<default>, line #22)
        Unknown code prof 17

Kind Regards / Fredrik
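
The check above can be wrapped for use in scripts. This is only a sketch: it classifies text read from stdin by looking for the error string quoted in this comment, and the function name is hypothetical.

```shell
#!/bin/sh
# Sketch: read the output of `MKE2FS_CONFIG= mkfs.ext4` on stdin and
# succeed (exit 0) if it contains the built-in config parse error
# reported in this bug. -F matches the text literally, not as a regex.
has_builtin_config_error() {
    grep -qF 'Syntax error in mke2fs config file (<default>'
}
```

Usage, assuming mkfs.ext4 is on the PATH: `MKE2FS_CONFIG= mkfs.ext4 2>&1 | has_builtin_config_error && echo "internal config is broken"`.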

Comment 16 Lukáš Czerner 2020-11-02 10:13:47 UTC
(In reply to Fredrik Nyström from comment #15)
> Regarding broken internal mke2fs.conf, will this be fixed so that it is 
> once again possible to run mkfs/fsck standalone without need for 
> external config file?

Yes, that's the plan.

-Lukas

Comment 17 Boyang Xue 2020-11-02 10:33:09 UTC
(In reply to Fredrik Nyström from comment #15)
> Regarding broken internal mke2fs.conf, will this be fixed so that it is 
> once again possible to run mkfs/fsck standalone without need for 
> external config file?
> 
> We add mkfs.ext4 and fsck.ext4 to initramfs and run from 
> dracut-initqueue.
> 
> Easy way to check if internal config is broken:
> MKE2FS_CONFIG= mkfs.ext4
> Syntax error in mke2fs config file (<default>, line #22)
>         Unknown code prof 17
> 
> Kind Regards / Fredrik

Thanks for the information! A quick test shows:

e2fsprogs-1.44.6-3.el8:
```
[root@kvm103 ~]# MKE2FS_CONFIG= mkfs.ext4
Usage: mkfs.ext4 [-c|-l filename] [-b block-size] [-C cluster-size]
        [-i bytes-per-inode] [-I inode-size] [-J journal-options]
        [-G flex-group-size] [-N number-of-inodes] [-d root-directory]
        [-m reserved-blocks-percentage] [-o creator-os]
        [-g blocks-per-group] [-L volume-label] [-M last-mounted-directory]
        [-O feature[,...]] [-r fs-revision] [-E extended-option[,...]]
        [-t fs-type] [-T usage-type ] [-U UUID] [-e errors_behavior][-z undo_file]
        [-jnqvDFSV] device [blocks-count]
```

e2fsprogs-1.45.6-1.el8:
```
[root@kvm103 ~]# MKE2FS_CONFIG= mkfs.ext4
Syntax error in mke2fs config file (<default>, line #22)
        Unknown code prof 17
```

I'm setting qa_ack+

Comment 19 Lukáš Czerner 2020-11-02 11:46:15 UTC
> > Hi Lukas,
> > Alright, please share the "e2fsprogs" package with the fix, I shall try it
> > on the OCP 4.6 cluster and let you know the result. Thanks!
> 
> Ok, as soon as I have the build I'll send it to you.

Here is a scratch build of e2fsprogs with the proposed fix. Let me know if it fixes the problem.

https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=32723246

-Lukas

Comment 22 Julie 2020-11-02 17:47:00 UTC
(In reply to Lukáš Czerner from comment #14)
> (In reply to Julie from comment #13)
> > (In reply to Lukáš Czerner from comment #12)
> > 
> > > 
> > > Ok, thanks for this information. I have a e2fsprogs fix ready and I can
> > > provide you with a testing e2fsprogs package to try and see if it fixes the
> > > problem if you're interested.
> > > 
> > > However, there is something else going on on that system as well.
> > > 
> > > 2020/10/19 13:46:16 Running command /usr/bin/sudo [/usr/sbin/mkfs.ext4
> > > /dev/sr0 -F]
> > > 2020/10/19 13:46:16 Command output  Syntax error in mke2fs config file
> > > (<default>, line #22)
> > >         Unknown code prof 17
> > > 
> > > The fact that it's trying to use <default> config file, means that it failed
> > > to use the system config file (/etc/mke2fs.conf) for some reason. It if in
> > > fact exists then there is something else going on. Could you please try
> > > running the mke2fs manually on that system and provide output ? Also could
> > > you provide output of the /etc/mke2fs.conf ? If the mke2fs does fail when
> > > run manually, can you please provide strace output of the command as well ?
> > > 
> > > Additionally, I assume that the output above is from some automated process.
> > > I am not sure what it is, but is it possible for you to log some additional
> > > information before the mkfs.ext4 is run ? Again I am interested if in that
> > > case the system config file is accessible and the MKE2FS_CONFIG env variable
> > > is set.
> > > 
> > > Thanks!
> > > -Lukas
> > 
> > 
> > 
> > 
> > Hi Lukas,
> > Alright, please share the "e2fsprogs" package with the fix, I shall try it
> > on the OCP 4.6 cluster and let you know the result. Thanks!
> 
> Ok, as soon as I have the build I'll send it to you.
> 
> > 
> > Output of the command on running it manually,
> > 
> > [core@worker-0 ~]$ /usr/bin/sudo /usr/sbin/mkfs.ext4 /dev/sr0 -F
> > mke2fs 1.45.4 (23-Sep-2019)
> > /dev/sr0: Read-only file system while setting up superblock
> > [core@worker-0 ~]$
> 
> The device used here is read only, can you try something that's usable to
> create a file system ? Note the read only check should happen after the
> configuration file is parsed so I'd say that it is working fine when running
> manually, but let's confirm by running on a usable device first.
> 
> -Lukas


Lukas,
I created another pod and ran the same test.

The logs of the CSI driver plugin pod show the error:
[root@mjulie-ocp46-4bc2-bastion powervc-csi]# oc logs ibm-powervc-csi-plugin-dslwp -c ibm-powervc-csi | grep /dev/sdc

I1102 17:32:43.221364       1 nodeserver.go:193] 1 : Found directory of attached volume /dev/sdc
2020/11/02 17:32:43 Running command /usr/bin/sudo [/bin/lsblk /dev/sdc --noheadings -o FSTYPE -f]
2020/11/02 17:32:43 Running command /usr/bin/sudo [/usr/sbin/mkfs.ext4 /dev/sdc -F]
E1102 17:32:43.257799       1 utils.go:48] GRPC error: rpc error: code = InvalidArgument desc = 
Could not create file system on attached volume directory /dev/sdc. Error is exit status 1



I ran the same command manually on the worker-0 node, and it completed without any errors.
This confirms that the command works when run manually.

ssh core@worker-0

[core@worker-0 ~]$ /usr/bin/sudo /usr/sbin/mkfs.ext4 /dev/sdc -F
mke2fs 1.45.4 (23-Sep-2019)
Creating filesystem with 1048576 4k blocks and 262144 inodes
Filesystem UUID: d7b7d06d-6bc5-4c7a-b108-95e0afe889ce
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912, 819200, 884736

Allocating group tables: done
Writing inode tables: done
Creating journal (16384 blocks): done
Writing superblocks and filesystem accounting information: done
[core@worker-0 ~]$
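
Since the manual run succeeds but the run started by the plugin fails, the environment of the two runs is worth comparing. The following is a sketch of the logging requested in comment 14 (whether the system config file is accessible and whether MKE2FS_CONFIG is set); the function name and its optional path argument are hypothetical, and where to call it inside the plugin is not shown in this report.

```shell
#!/bin/sh
# Sketch: print the two facts asked for in comment 14 before mkfs.ext4 runs.
# The function name and the optional config-path argument are hypothetical.
mke2fs_env_report() {
    conf=${1:-/etc/mke2fs.conf}
    if [ -r "$conf" ]; then
        echo "config $conf: readable"
    else
        echo "config $conf: missing or unreadable"
    fi
    # ${VAR+x} expands to "x" only when VAR is set, even if it is empty.
    # An empty MKE2FS_CONFIG matters, as the check in comment 15 shows.
    if [ -n "${MKE2FS_CONFIG+x}" ]; then
        echo "MKE2FS_CONFIG: set to '$MKE2FS_CONFIG'"
    else
        echo "MKE2FS_CONFIG: unset"
    fi
}
```

Calling `mke2fs_env_report` both from the shell on worker-0 and from the context in which the plugin runs mkfs.ext4 would show whether the two differ.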

Comment 23 Julie 2020-11-03 14:17:14 UTC
(In reply to Lukáš Czerner from comment #19)
> > > Hi Lukas,
> > > Alright, please share the "e2fsprogs" package with the fix, I shall try it
> > > on the OCP 4.6 cluster and let you know the result. Thanks!
> > 
> > Ok, as soon as I have the build I'll send it to you.
> 
> Here is a scratch build of e2fsprogs with the proposed fix. Let me know if
> it fixes the problem.
> 
> https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=32723246
> 
> -Lukas


Lukas,
https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=32723246


'Parameters' show the following:
SRPM: tasks/3156/32723156/e2fsprogs-1.45.6-2.el8_3.src.rpm
Build Tag: rhel-8.3.0-z-build
Arch: x86_64

Arch shows as 'x86_64'. Will this package work on the IBM Power platform?

My OCP cluster is on POWER (ppc64le arch).

[core@worker-0 ~]$ uname -a
Linux worker-0 4.18.0-193.24.1.el8_2.dt1.ppc64le #1 SMP Thu Sep 24 14:48:17 EDT 2020 ppc64le ppc64le ppc64le GNU/Linux
[core@worker-0 ~]$

NAME="Red Hat Enterprise Linux CoreOS"
VERSION="46.82.202010082145-0"

Comment 24 Eric Sandeen 2020-11-03 19:30:48 UTC
(In reply to Julie from comment #23)

> Lukas,
> https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=32723246
> 
> 
> 'Parameters' show the following:
> SRPM: tasks/3156/32723156/e2fsprogs-1.45.6-2.el8_3.src.rpm
> Build Tag: rhel-8.3.0-z-build
> Arch: x86_64
> 
> Arch shows as 'x86_64'. Will this package work on IBM Power platform?

Julie - the ppc64le build is here:

https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=32723239

-Eric

Comment 25 Julie 2020-11-04 18:17:02 UTC
(In reply to Eric Sandeen from comment #24)
> (In reply to Julie from comment #23)
> 
> > Lukas,
> > https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=32723246
> > 
> > 
> > 'Parameters' show the following:
> > SRPM: tasks/3156/32723156/e2fsprogs-1.45.6-2.el8_3.src.rpm
> > Build Tag: rhel-8.3.0-z-build
> > Arch: x86_64
> > 
> > Arch shows as 'x86_64'. Will this package work on IBM Power platform?
> 
> Julie - the ppc64le build is here:
> 
> https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=32723239
> 
> -Eric

Hello Lukas/ Eric,

I tested the fix on an OCP 4.6 Power cluster.

Steps:
1. Installed new packages

[root@worker-0 tmp]# rpm-ostree override replace e2fsprogs-1.45.6-2.el8_3.ppc64le.rpm libcom_err-1.45.6-2.el8_3.ppc64le.rpm libss-1.45.6-2.el8_3.ppc64le.rpm e2fsprogs-libs-1.45.6-2.el8_3.ppc64le.rpm
Checking out tree acdfd3c... done
Enabled rpm-md repositories:
Importing rpm-md... done
Resolving dependencies... done
Applying 4 overrides
Processing packages... done
Running pre scripts... done
Running post scripts... done
Running posttrans scripts... done
Writing rpmdb... done
Writing OSTree commit... done
Staging deployment... done
Upgraded:
  e2fsprogs 1.45.4-3.el8 -> 1.45.6-2.el8_3
  e2fsprogs-libs 1.45.4-3.el8 -> 1.45.6-2.el8_3
  libcom_err 1.45.4-3.el8 -> 1.45.6-2.el8_3
  libss 1.45.4-3.el8 -> 1.45.6-2.el8_3
Run "systemctl reboot" to start a reboot
[root@worker-0 tmp]#

2. Rebooted the worker-0 node.
   After the reboot, the worker node reflects the new packages.

3. Ran the test to create a pod with an attached volume.

[root@mjulie-ocp46-4bc2-bastion powervc-csi]# oc describe pod dyn-pod3

Events:
  Type     Reason                  Age    From                     Message
  ----     ------                  ----   ----                     -------
  Normal   Scheduled               3m9s   default-scheduler        Successfully assigned myproject/dyn-pod3 to worker-0
  Normal   SuccessfulAttachVolume  2m15s  attachdetach-controller  AttachVolume.Attach succeeded for volume "pvc-5bcc4c26-9840-47a8-b515-7d6ead333fcd"
  Warning  FailedMount             66s    kubelet                  Unable to attach or mount volumes: unmounted volumes=[mypvc], unattached volumes=[mypvc default-token-8pl67]: timed out waiting for the condition
  Normal   AddedInterface          39s    multus                   Add eth0 [10.131.0.71/23]
  Normal   Pulling                 39s    kubelet                  Pulling image "nginx"
  Normal   Pulled                  38s    kubelet                  Successfully pulled image "nginx" in 618.885143ms
  Normal   Created                 37s    kubelet                  Created container web-server
  Normal   Started                 37s    kubelet                  Started container web-server


4. Verified that the mkfs and mount operations succeeded.
   Running "mount" on the worker-0 node shows /dev/sdc mounted:

[core@worker-0 ~]$ mount | grep /dev/sdc
/dev/sdc on /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-5bcc4c26-9840-47a8-b515-7d6ead333fcd/globalmount type ext4 (rw,relatime,seclabel)
/dev/sdc on /var/lib/kubelet/pods/fcd73178-3cc1-4048-9ae7-3ee6484540bd/volumes/kubernetes.io~csi/pvc-5bcc4c26-9840-47a8-b515-7d6ead333fcd/mount type ext4 (rw,relatime,seclabel)
[core@worker-0 ~]$


Note:
1. Verified the fix on OCP 4.6 cluster. Original issue is not reproduced.
2. I need to test the fix on OCP 4.5 cluster too. Will do this and let you know the results. Thanks.
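
The mount verification in step 4 could be scripted when repeating the test on other clusters. This is a sketch only: it parses `mount` output from stdin, the function name is hypothetical, and it assumes mount points without spaces.

```shell
#!/bin/sh
# Sketch: read `mount` output on stdin and succeed if the given device has
# at least one entry with the given filesystem type.
# Assumes mount points contain no spaces; the function name is hypothetical.
mounted_as() {
    dev=$1
    fstype=$2
    awk -v dev="$dev" -v fs="$fstype" '
        # mount(8) lines look like: <dev> on <dir> type <fstype> (<opts>)
        $1 == dev && $2 == "on" && $4 == "type" && $5 == fs { found = 1 }
        END { exit !found }
    '
}
```

For example, `mount | mounted_as /dev/sdc ext4 && echo "mkfs and mount succeeded"`.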

Comment 26 Julie 2020-11-06 14:27:03 UTC
(In reply to Julie from comment #25)
> 
> Hello Lukas/ Eric,
> 
> I tested the fix on OCP 4.6 Power cluster.
> 
> Steps:
> 1. Installed new packages
> 
> [root@worker-0 tmp]# rpm-ostree override replace
> e2fsprogs-1.45.6-2.el8_3.ppc64le.rpm libcom_err-1.45.6-2.el8_3.ppc64le.rpm
> libss-1.45.6-2.el8_3.ppc64le.rpm e2fsprogs-libs-1.45.6-2.el8_3.ppc64le.rpm
> Checking out tree acdfd3c... done
> Enabled rpm-md repositories:
> Importing rpm-md... done
> Resolving dependencies... done
> Applying 4 overrides
> Processing packages... done
> Running pre scripts... done
> Running post scripts... done
> Running posttrans scripts... done
> Writing rpmdb... done
> Writing OSTree commit... done
> Staging deployment... done
> Upgraded:
>   e2fsprogs 1.45.4-3.el8 -> 1.45.6-2.el8_3
>   e2fsprogs-libs 1.45.4-3.el8 -> 1.45.6-2.el8_3
>   libcom_err 1.45.4-3.el8 -> 1.45.6-2.el8_3
>   libss 1.45.4-3.el8 -> 1.45.6-2.el8_3
> Run "systemctl reboot" to start a reboot
> [root@worker-0 tmp]#
> 
> 2.Rebooted worker-0 node
>   After rebooting the worker node, it reflects the new packages.
> 
> 3.Executed the test to create a pod with attached volume.
> 
> [root@mjulie-ocp46-4bc2-bastion powervc-csi]# oc describe pod dyn-pod3
> 
> Events:
>   Type     Reason                  Age    From                     Message
>   ----     ------                  ----   ----                     -------
>   Normal   Scheduled               3m9s   default-scheduler       
> Successfully assigned myproject/dyn-pod3 to worker-0
>   Normal   SuccessfulAttachVolume  2m15s  attachdetach-controller 
> AttachVolume.Attach succeeded for volume
> "pvc-5bcc4c26-9840-47a8-b515-7d6ead333fcd"
>   Warning  FailedMount             66s    kubelet                  Unable to
> attach or mount volumes: unmounted volumes=[mypvc], unattached
> volumes=[mypvc default-token-8pl67]: timed out waiting for the condition
>   Normal   AddedInterface          39s    multus                   Add eth0
> [10.131.0.71/23]
>   Normal   Pulling                 39s    kubelet                  Pulling
> image "nginx"
>   Normal   Pulled                  38s    kubelet                 
> Successfully pulled image "nginx" in 618.885143ms
>   Normal   Created                 37s    kubelet                  Created
> container web-server
>   Normal   Started                 37s    kubelet                  Started
> container web-server
> 
> 
> 4.Verified that mount and mkfs commands are successfully done.
>   On executing "mount" command on worker-0 node, it shows the mount  /dev/sdc
> 
> [core@worker-0 ~]$ mount | grep /dev/sdc
> /dev/sdc on
> /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-5bcc4c26-9840-47a8-b515-
> 7d6ead333fcd/globalmount type ext4 (rw,relatime,seclabel)
> /dev/sdc on
> /var/lib/kubelet/pods/fcd73178-3cc1-4048-9ae7-3ee6484540bd/volumes/
> kubernetes.io~csi/pvc-5bcc4c26-9840-47a8-b515-7d6ead333fcd/mount type ext4
> (rw,relatime,seclabel)
> [core@worker-0 ~]$
> 
> 
> Note:
> 1. Verified the fix on OCP 4.6 cluster. Original issue is not reproduced.
> 2. I need to test the fix on OCP 4.5 cluster too. Will do this and let you
> know the results. Thanks.


Hi Lukas, Eric,

FYI:
I verified the fix on the following OCP clusters on Power (ppc64le arch). The new "e2fsprogs" package works fine on both.

1. OCP 4.6
Worker node on OCP 4.6:
Red Hat Enterprise Linux CoreOS VERSION="46.82.202010082145-0"
kernel: 4.18.0-193.24.1.el8_2.dt1.ppc64le

2. OCP 4.5.18  
Worker node on OCP 4.5.18:
Red Hat Enterprise Linux CoreOS:  VERSION="45.82.202011012059-0"
kernel: 4.18.0-193.28.1.el8_2.ppc64le

Comment 27 Lukáš Czerner 2020-11-11 09:43:21 UTC
(In reply to Julie from comment #26)

> Hi Lukas, Eric,
> 
> FYI:
> I verified the fix on the following OCP clusters on Power (ppc64le arch).
> New "e2fsprogs" package works fine on the following systems.
> 
> 1. OCP 4.6
> Worker node on OCP 4.6:
> Red Hat Enterprise Linux CoreOS VERSION="46.82.202010082145-0"
> kernel: 4.18.0-193.24.1.el8_2.dt1.ppc64le
> 
> 2. OCP 4.5.18  
> Worker node on OCP 4.5.18:
> Red Hat Enterprise Linux CoreOS:  VERSION="45.82.202011012059-0"
> kernel: 4.18.0-193.28.1.el8_2.ppc64le

Thank you for testing, Julie. The fix has been proposed upstream and will be included in the RHEL release.

-Lukas

Comment 28 Julie 2020-11-18 10:06:00 UTC
>> 
> Thank you for testing Julie. The fix has been proposed upstream and will be
> included in the RHEL release.
> 
> -Lukas


Lukas, please tell me which versions of RHEL will have this fix included. Thank you.

Comment 29 Lukáš Czerner 2020-11-23 13:33:01 UTC
(In reply to Julie from comment #28)
> >> 
> > Thank you for testing Julie. The fix has been proposed upstream and will be
> > included in the RHEL release.
> > 
> > -Lukas
> 
> 
> Lukas, please tell me which versions of RHEL will have this fix included.
> Thank you.

Ideally we would have that fixed in RHEL 8.4, z-stream, and of course RHEL 9, but it has not been merged upstream yet, so I can't tell you that with any certainty.

-Lukas

