Bug 1698858

Summary: mkfs.gfs2 can max out iovec limit with wide enough alignment holes
Product: Red Hat Enterprise Linux 8
Reporter: Andrew Price <anprice>
Component: gfs2-utils
Assignee: Andrew Price <anprice>
Status: CLOSED ERRATA
QA Contact: cluster-qe <cluster-qe>
Severity: unspecified
Priority: unspecified
Version: 8.0
CC: bmarson, cluster-maint, gfs2-maint, jpayne, rhandlin, rpeterso
Target Milestone: rc
Target Release: 8.0
Hardware: Unspecified
OS: Unspecified
Fixed In Version: gfs2-utils-3.2.0-5.el8
Last Closed: 2019-11-05 22:17:06 UTC
Type: Bug

Description Andrew Price 2019-04-11 11:40:08 UTC
Reported by Barry Marson in a sas test environment:

NAME            ALIGNMENT MIN-IO  OPT-IO PHY-SEC LOG-SEC ROTA SCHED RQ-SIZE   RA WSAME
vg_sas-lsaswork         0 131072 6291456     512     512    1           128 8192   32M

[root@bills _scripts]# mkfs -t gfs2 -O -j 8 -r 256 -t afc_cluster:work /dev/mapper/vg_sas-lsaswork
It appears to contain an existing filesystem (xfs)
This will destroy any data on /dev/dm-60
Discarding device contents (may take a while on large devices): Done
Adding journals: [1/8]Zeroing write failed at block 17
Failed to create journals

With this topology we get EINVAL from pwritev() because the iovec array required to zero the hole between the superblock and the first resource group exceeds the IOV_MAX limit. The write should either be chunked into IOV_MAX-sized batches or done using a larger zeroed buffer so that the vector can be smaller.

Comment 1 Andrew Price 2019-04-11 11:53:21 UTC
Impact:
- mkfs.gfs2 can fail with a vague error message on some striped devices
- A workaround is available (mkfs.gfs2 -o align=0), but it could have a small, workload-dependent performance impact

Simple test:

$ mkfs.gfs2 -O -o test_topology=0:512:131072:6291456:512 -p lock_nolock testvol 
This will destroy any data on testvol
Adding journals: [1/1]Zeroing write failed at block 17
Failed to create journals

(Will be added to the upstream test suite)

Comment 2 Andrew Price 2019-04-25 12:48:45 UTC
Patch posted upstream: https://www.redhat.com/archives/cluster-devel/2019-April/msg00040.html

Comment 5 Justin Payne 2019-10-01 23:59:00 UTC
Verified in gfs2-utils-3.2.0-5:

[root@p8-224-node4 ~]# rpm -q gfs2-utils
gfs2-utils-3.2.0-5.el8.ppc64le
[root@p8-224-node4 ~]# mkfs.gfs2 -O -o test_topology=0:512:131072:6291456:512 -p lock_nolock -t test:test /dev/sdb1
This will destroy any data on /dev/sdb1
Discarding device contents (may take a while on large devices): Done
Adding journals: Done 
Building resource groups: Done     
Creating quota file: Done
Writing superblock and syncing: Done
Device:                    /dev/sdb1
Block size:                4096
Device size:               99.97 GB (26206204 blocks)
Filesystem size:           99.97 GB (26206176 blocks)
Journals:                  1
Journal size:              128MB
Resource groups:           398
Locking protocol:          "lock_nolock"
Lock table:                "test:test"
UUID:                      3cce2766-8549-4598-88c9-6562331b7a9b

Comment 7 errata-xmlrpc 2019-11-05 22:17:06 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:3567