Bug 1383283

Summary: swapon failed: Operation not permitted
Product: Red Hat Enterprise Linux 7
Reporter: Jiri Jaburek <jjaburek>
Component: anaconda
Assignee: Anaconda Maintenance Team <anaconda-maint-list>
Status: CLOSED DEFERRED
QA Contact: Release Test Team <release-test-team>
Severity: medium
Priority: medium
Version: 7.3
CC: mkolman, ngu, sbueno
Target Milestone: rc
Hardware: x86_64
OS: Linux
Last Closed: 2019-11-18 14:07:03 UTC
Type: Bug
Attachments: the error picture

Description Jiri Jaburek 2016-10-10 11:04:31 UTC
Description of problem:

I'm running a 32-core x86_64 qemu-based VM, cgroups-limited, and am
seeing this when trying to install a RHEL-7.2 system:

------------------------------------------------------------
Creating xfs on /dev/mapper/vg0-varlog
.
Creating swap on /dev/mapper/vg0-swap
.
Creating xfs on /dev/mapper/vg0-root
.
Creating xfs on /dev/vda1
.
** (anaconda:2093): WARNING **: Could not open X display

An unknown error has occured, look at the /tmp/anaconda-tb* file(s)
for more details
------------------------------------------------------------

followed by a traceback and

SwapError: swapon failed for '/dev/mapper/vg0-swap'

Trying it manually indeed fails:

# swapon /dev/mapper/vg0-swap 
swapon: /dev/mapper/vg0-swap: swapon failed: Operation not permitted

Even when booting with selinux=0 (not that it would matter, no policy
is loaded) and creating a swap file manually, the error remains.

------------------------------------------------------------
# sestatus 
SELinux status:                 disabled
# dd if=/dev/zero of=/sw bs=16M count=1
1+0 records in
1+0 records out
16777216 bytes (17 MB) copied, 1.00315 s, 16.7 MB/s
# mkswap /sw
Setting up swapspace version 1, size = 16380 KiB
no label, UUID=debb8a87-9c84-4a19-8ce4-acffe2aef4ec
# chmod 0600 /sw
# swapon /sw
swapon: /sw: swapon failed: Operation not permitted
------------------------------------------------------------

Looking into the kernel code, there are not many places that return EPERM,
but one stands out:

        if (type >= MAX_SWAPFILES) {
                spin_unlock(&swap_lock);
                kfree(p);
                return ERR_PTR(-EPERM);
        }

and Anaconda seems to indeed create one zram device per CPU core, for
some weird reason:

------------------------------------------------------------
# cat /proc/swaps 
Filename                 Type            Size    Used    Priority
/dev/zram0               partition       63856   0       100
/dev/zram1               partition       63856   0       100
/dev/zram2               partition       63856   0       100
...
/dev/zram27              partition       63856   0       100
/dev/zram28              partition       63856   0       100
------------------------------------------------------------

but MAX_SWAPFILES is defined as

#define MAX_SWAPFILES_SHIFT     5
#define MAX_SWAPFILES \
        ((1 << MAX_SWAPFILES_SHIFT) - SWP_MIGRATION_NUM - SWP_HWPOISON_NUM)

which caps the number of swap devices below 32, depending on the kernel
config; with the migration and hwpoison entries reserved (2 and 1 slots on
typical configs) the limit works out to 29 - exactly the number of zram
devices (zram0 through zram28) listed above.
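To make the arithmetic concrete (a sketch; the reserved-slot counts below are assumed values for a kernel with migration and hwpoison support, not read from the actual RHEL7 config):

```shell
# Back-of-the-envelope check of the limit. The SWP_* values below are
# assumptions for a kernel with migration and hwpoison support compiled
# in; other configs reserve a different number of slots.
MAX_SWAPFILES_SHIFT=5
SWP_MIGRATION_NUM=2
SWP_HWPOISON_NUM=1
echo $(( (1 << MAX_SWAPFILES_SHIFT) - SWP_MIGRATION_NUM - SWP_HWPOISON_NUM ))
# prints 29 - so 29 zram swaps leave no free slot for vg0-swap
```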

Creating this many zram devices is probably unnecessary, and there should
be an upper limit on their number.


Version-Release number of selected component (if applicable):
anaconda-21.48.22.56-1.el7
anaconda-21.48.22.93-1.el7

How reproducible:
always

Actual results:
anaconda creates one zram swap device per CPU core, exhausting the kernel's
swap-device limit on machines with many cores

Expected results:
anaconda does not create zram devices as swaps or creates a smaller
number of them (why is 1 not enough?)

Additional info:
This bug would presumably happen on any machine with >=32 cores, not just
the unusual case of a testing VM.

Comment 1 Martin Kolman 2016-10-10 11:27:51 UTC
(In reply to Jiri Jaburek from comment #0)

<snip>

> Expected results:
> anaconda does not create zram devices as swaps or creates a smaller
> amount of them (why is 1 not enough?)
AFAIK the individual zram devices are single threaded, so you generally use as many devices as you have CPUs to parallelize the compression. But that kinda breaks down if you have 32 CPUs - a smaller number of bigger devices should work as well.

Comment 2 Martin Kolman 2016-10-10 12:02:05 UTC
BTW, looking at the zram docs[0], it looks like zram now also supports multiple compression streams per device & even sets an appropriate number of streams per device by default. Any idea if the version of zram in the RHEL7 kernel supports these features?

If it does, we could just make Anaconda set up a single large zram device with default parameters and let the zram logic sort out the number of compression streams.

[0] https://www.kernel.org/doc/Documentation/blockdev/zram.txt
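For reference, a single-device setup along those lines might look like the following (a sketch based on the zram docs linked above; the device size is illustrative, max_comp_streams exists only on kernels with the multistream feature, and it needs root):

```shell
# Sketch only - requires root and a zram-capable kernel.
modprobe zram num_devices=1
echo 4 > /sys/block/zram0/max_comp_streams   # or leave the kernel default
echo 2G > /sys/block/zram0/disksize          # illustrative size
mkswap /dev/zram0
swapon -p 100 /dev/zram0
```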

Comment 3 Jiri Jaburek 2016-10-10 12:48:53 UTC
I'm not a zram developer, but it seems to have been introduced by

http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=da9556a2367cf2261ab4d3e100693c82fb1ddb26

and grepping the 3.10.0-327.el7 source tarball seems to indicate that it has not been backported, so RHEL7 may still benefit from multiple zram swaps, assuming the load is actually distributed amongst them (which I'm not sure is the case).

Creating, e.g., up to 4 shouldn't hurt; creating any more would be questionable given that Anaconda doesn't perform any highly parallelized workload.

Comment 4 Martin Kolman 2017-01-18 14:19:46 UTC
So according to comment 3 (unless there are plans to backport newer zram to the 7.4 kernel - but I don't know of any), it looks like limiting the number of zram devices created during installation is the best way forward for RHEL7. What about a maximum of 16, to stay safely under the limit of 28 swap devices?
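A minimal sketch of such a clamp (the helper name is made up for illustration, not actual Anaconda code):

```shell
# Hypothetical helper illustrating the proposed cap: one zram device
# per CPU, clamped to a maximum of 16.
clamp_zram_count() {
    local cpus=$1 max=${2:-16}
    echo $(( cpus < max ? cpus : max ))
}

clamp_zram_count "$(nproc)"   # e.g. prints 16 on the reporter's 32-core VM
```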

As for Fedora - making use of the zram multistream feature seems like the best future-proof option.

Comment 5 Samantha N. Bueno 2017-05-26 16:29:09 UTC
Deferring this to 7.5 planning, because we did not get to it during 7.4.

Comment 6 Samantha N. Bueno 2017-09-01 09:43:40 UTC
We were unable to fit this into our workload for 7.5 either; deferring to 7.6 planning, since it'd be nice to fix this both here and upstream.

Comment 7 cliao 2018-09-29 09:29:50 UTC
Created attachment 1488331 [details]
the error picture

I got the same error in tree RHEL-7.6-20180926.0.

version:
qemu: qemu-img-1.5.3-160.el7.x86_64
kernel: kernel-3.10.0-954.el7.x86_64

guest boot command line:
/usr/libexec/qemu-kvm \
    -name 'avocado-vt'  \
    -sandbox off  \
    -machine pc  \
    -nodefaults \
    -device VGA,bus=pci.0,addr=0x2  \
    -device nec-usb-xhci,id=usb1,bus=pci.0,addr=0x3 \
    -drive id=drive_image1,if=none,snapshot=off,aio=native,cache=none,format=qcow2,file=/home/rhel76-2.qcow2 \
    -device virtio-blk-pci,id=image1,drive=drive_image1,bootindex=1,bus=pci.0,addr=0x4 \
    -device virtio-net-pci,mac=9a:9c:9d:9e:1f:a0,id=idTIjIEl,vectors=4,netdev=idrPPk5g,bus=pci.0,addr=0x5  \
    -netdev tap,id=idrPPk5g,vhost=on \
    -m 2048  \
    -smp 32,maxcpus=32,cores=16,threads=1,sockets=2  \
    -drive id=drive_cd1,if=none,snapshot=off,aio=native,cache=none,media=cdrom,file=/home/kvm_autotest_root/iso/linux/RHEL-7.6-20180926.0-Server-x86_64-dvd1.iso \
    -device ide-cd,id=cd1,drive=drive_cd1,bootindex=2,bus=ide.0,unit=0 \
    -drive id=drive_unattended,if=none,snapshot=off,aio=native,cache=none,media=cdrom,file=/home/rhel76-64/ks.iso \
    -device ide-cd,id=unattended,drive=drive_unattended,bootindex=3,bus=ide.0,unit=1 \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1  \
    -kernel '/home/rhel76-64/vmlinuz'  \
    -append 'ksdevice=link inst.repo=cdrom:/dev/sr0 inst.ks=cdrom:/dev/sr1:/ks.cfg nicdelay=60 biosdevname=0 net.ifnames=0 console=ttyS0,115200 console=tty0'  \
    -initrd '/home/rhel76-64/initrd.img'  \
    -vnc :2  \
    -rtc base=utc,clock=host,driftfix=slew  \
    -boot menu=off,strict=off,order=cdn,once=d  \
    -no-shutdown \
    -enable-kvm -monitor stdio

Comment 8 Samantha N. Bueno 2019-11-18 14:07:03 UTC
Closing this, as the report is stale. The last flags are from 7.6, and no customer cases are attached. If this is indeed still an issue, it affects a very small contingent of users, and we should work to address it upstream.