Bug 1691511

Summary: Repeated systemd-run --scope -- mount -t tmpfs tmpfs /<path> under directories bind-mounted to themselves result in E2BIG failures
Product: Red Hat Enterprise Linux 7 Reporter: Kyle Walker <kwalker>
Component: systemd Assignee: systemd-maint
Status: CLOSED ERRATA QA Contact: Frantisek Sumsal <fsumsal>
Severity: high Docs Contact:
Priority: urgent    
Version: 7.6 CC: aabhishe, aanjarle, ansverma, dchong, dtardon, erjones, fkrska, jrosenta, rupatel, sople, systemd-maint-list
Target Milestone: rc Keywords: ZStream
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: systemd-219-65.el7 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1722576 1786340 Environment:
Last Closed: 2019-08-06 12:43:57 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1722576, 1786340    

Description Kyle Walker 2019-03-21 18:58:52 UTC
Description of problem:
With OpenShift workloads, commands such as the following are issued repeatedly:

    systemd-run --description='Kubernetes transient mount for /var/lib/origin/openshift.local.volumes/pods/<val>/volumes/kubernetes.io~secret/<val>' --scope -- mount -t tmpfs tmpfs /var/lib/origin/openshift.local.volumes/pods/<val>/volumes/kubernetes.io~secret/<val>

When the mount point is within a bind-mounted directory, this leaves a large and growing number of "loaded inactive dead" mount units behind. Eventually the commands fail with the following error message in the system logs:

    systemd: Failed to set up mount unit: Argument list too long


This issue was reported upstream against Kubernetes:

    https://github.com/kubernetes/kubernetes/issues/57345


That report led the investigation to an upstream systemd issue:

    https://github.com/systemd/systemd/issues/7798


Version-Release number of selected component (if applicable):

    systemd-219-62.el7_6.2.x86_64


How reproducible:

    The E2BIG failure itself is difficult to trigger, though the suspected underlying problem is simple to demonstrate

Steps to Reproduce:
1. Create a directory and bind mount it to itself

    # mkdir /mnt/test && mount --bind /mnt/test /mnt/test

2. Create a subdirectory under it and mount a tmpfs there

    # mkdir /mnt/test/subdir1 && mount -t tmpfs tmpfs /mnt/test/subdir1

3. Check the state of the mount unit for the subdirectory

    # systemctl list-units --type mount --all | grep test


Actual results:

    # systemctl list-units --type mount --all | grep test
    mnt-test-subdir1.mount        loaded inactive dead    /mnt/test/subdir1
    mnt-test.mount                loaded active   mounted /mnt/test


Expected results:

    # systemctl list-units --type mount --all | grep test
    mnt-test-subdir1.mount        loaded active   mounted /mnt/test/subdir1
    mnt-test.mount                loaded active   mounted /mnt/test    
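
The difference between the actual and expected results can also be checked mechanically. The following is a minimal sketch, not part of systemd: count_dead_mount_units is a hypothetical helper name, and it assumes the column layout shown in the listings above (unit name, load state, active state, sub state).

```shell
#!/bin/sh
# Hypothetical helper (not a systemd tool): count mount units whose
# LOAD/ACTIVE/SUB columns read "loaded inactive dead".
# Input: `systemctl list-units --type mount --all` output on stdin.
count_dead_mount_units() {
    awk '$1 ~ /\.mount$/ && $2 == "loaded" && $3 == "inactive" && $4 == "dead" { n++ }
         END { print n + 0 }'
}
```

A possible invocation, after sourcing the function into the current shell:

    # systemctl list-units --type mount --all --no-legend --plain | count_dead_mount_units

A count that keeps rising while the transient mounts are repeated would be consistent with the behaviour described in this report.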


Additional info:

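    The "systemd: Failed to set up mount unit: Argument list too long" error is suspected to come from the steady accumulation of "inactive dead" mount units that are never removed. "Argument list too long" is the message text for E2BIG, so the eventual failure is suspected to be in the following code path, which returns -E2BIG once the number of entries in the manager's unit table reaches MANAGER_MAX_NAMES: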

    src/core/unit.c
    int unit_add_name(Unit *u, const char *text) {
            _cleanup_free_ char *s = NULL, *i = NULL;
            UnitType t;
            int r;
    <snip>
            if (hashmap_size(u->manager->units) >= MANAGER_MAX_NAMES)
                    return -E2BIG;
    <snip>
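
To judge how close a host is to that check, the number of loaded units can be compared against the limit. This is a rough sketch only: unit_headroom is a hypothetical helper name, the limit must be supplied by the caller, and the count from systemctl is an approximation because the manager's table may also hold alias names.

```shell
#!/bin/sh
# Hypothetical helper: compare a unit count against a limit, mirroring the
# ">=" comparison in the unit_add_name() excerpt above.
# $1: number of loaded units
# $2: limit (assumption: the MANAGER_MAX_NAMES value of the systemd build in use)
unit_headroom() {
    count=$1
    limit=$2
    if [ "$count" -ge "$limit" ]; then
        echo "at limit: $count of $limit units loaded; new units will fail with E2BIG"
    else
        echo "ok: $count of $limit units loaded, $((limit - count)) remaining"
    fi
}
```

A possible invocation, where 131072 is an assumed value for MANAGER_MAX_NAMES that should be verified against src/core/manager.h of the installed systemd source:

    # unit_headroom "$(systemctl list-units --all --no-legend --plain | wc -l)" 131072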

Comment 2 Kyle Walker 2019-03-21 19:25:04 UTC
Opened a downstream PR with a backport of the following commit:

    core: Fix edge case when processing /proc/self/mountinfo

    Currently, if there are two /proc/self/mountinfo entries with the same
    mount point path, the mount setup flags computed for the second of
    these two entries will overwrite the mount setup flags computed for
    the first of these two entries. This is the root cause of issue systemd#7798.
    This patch changes mount_setup_existing_unit to prevent the
    just_mounted mount setup flag from being overwritten if it is set to
    true. This will allow all mount units created from /proc/self/mountinfo
    entries to be initialized properly.

PR:

    https://github.com/lnykryn/systemd-rhel/pull/318
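
The commit message above describes two /proc/self/mountinfo entries that share the same mount point path. A minimal sketch for spotting that condition on a running system follows; duplicate_mount_points is a hypothetical helper name, and the sketch only lists duplicated paths without reproducing systemd's flag handling.

```shell
#!/bin/sh
# Hypothetical helper: print every mount point path that appears more than
# once in a mountinfo file.
# $1: path to a mountinfo file (defaults to /proc/self/mountinfo)
# Field 5 of each mountinfo line is the mount point. Spaces inside paths are
# escaped as \040, so splitting on whitespace is safe.
duplicate_mount_points() {
    awk '{ print $5 }' "${1:-/proc/self/mountinfo}" | sort | uniq -d
}
```

A possible invocation after running the reproduction steps from the description:

    # duplicate_mount_points

Any path printed has at least two mountinfo entries, which is the situation the backported patch is meant to handle.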

Comment 5 Lukáš Nykrýn 2019-04-02 09:10:30 UTC
Fix merged to the staging branch: https://github.com/lnykryn/systemd-rhel/pull/318. Moving the bug to POST.

Comment 12 errata-xmlrpc 2019-08-06 12:43:57 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2019:2091