Bug 1494014 - systemd automatically unmounts a filesystem mounted using the "mount <mountpoint>" command [NEEDINFO]
Keywords:
Status: NEW
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: systemd
Version: 7.6
Hardware: All
OS: All
Priority: urgent
Severity: urgent
Target Milestone: rc
Assignee: Jan Synacek
QA Contact: qe-baseos-daemons
URL:
Whiteboard:
Depends On:
Blocks: 1477664 1643104 1719445
 
Reported: 2017-09-21 10:43 UTC by Renaud Métrich
Modified: 2019-09-09 14:25 UTC
CC List: 26 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1660761 (view as bug list)
Environment:
Last Closed:
quentin.clarenne: needinfo? (jsynacek)
nmarjano: needinfo? (jsynacek)


Attachments
Bash script for bug reproduction (987 bytes, application/x-shellscript)
2018-03-26 20:24 UTC, Anderson
iSCSI reproducer (1.69 KB, application/x-gzip)
2018-07-06 13:48 UTC, Renaud Métrich


Links
GitHub systemd issue 8596 (Last Updated: 2018-07-06 08:49:04 UTC)
Red Hat Knowledge Base (Article) 4212551 (Last Updated: 2019-06-11 11:24:10 UTC)
GitHub systemd issue 9869 (Last Updated: 2018-08-14 12:49:52 UTC)

Description Renaud Métrich 2017-09-21 10:43:50 UTC
Description of problem:

Performing "mount <mountpoint>" succeeds with rc 0, but systemd automatically unmounts the filesystem immediately:

systemd[1]: Unit mnt.mount is bound to inactive unit dev-rhel_vm\x2drhel74-lv1.device. Stopping, too.
systemd[1]: Unmounting /mnt...
kernel: XFS (dm-3): Unmounting Filesystem
systemd[1]: Unmounted /mnt.


This happens when a mount point (e.g. /mnt) is reused after a device has been deleted.

Typical use case: admin wants to move a logical volume from one VG to another VG.
He will do:
1. create the LV (lv2) on new VG
2. mount lv2 on /tmplv2 temporarily
3. copy old LV (lv1) mounted on /mylv to /tmplv2
4. umount /tmplv2 /mylv
5. update /etc/fstab (changing lv1 into lv2)
6. mount /mylv

At step 6, the /mylv filesystem gets automatically unmounted again by systemd until "systemctl daemon-reload" is executed (a shell sketch of these steps follows below).
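
A minimal shell sketch of the steps above (the VG name "newvg" and the 1G size are assumptions for illustration):

  lvcreate -n lv2 -L 1G newvg
  mkfs.xfs /dev/newvg/lv2
  mkdir -p /tmplv2
  mount /dev/newvg/lv2 /tmplv2
  cp -a /mylv/. /tmplv2/
  umount /tmplv2 /mylv
  # edit /etc/fstab: replace the old lv1 device with /dev/newvg/lv2 for /mylv
  mount /mylv        # returns 0, but systemd immediately unmounts /mylv again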


Version-Release number of selected component (if applicable):

systemd-219-42.el7_4.1.x86_64


How reproducible:

Always


Steps to Reproduce:
1. Create 2 LVs and format them

lvcreate -n lv1 -L 1G vg
lvcreate -n lv2 -L 1G vg
mkfs.xfs /dev/vg/lv1
mkfs.xfs /dev/vg/lv2

2. Have lv1 be in /etc/fstab and mount it, then unmount it

# grep lv1 /etc/fstab
/dev/vg/lv1		/mnt			xfs	defaults	0 0

# mount /mnt
# umount /mnt

3. Edit /etc/fstab to use lv2 instead of lv1

# grep lv2 /etc/fstab
/dev/vg/lv2		/mnt			xfs	defaults	0 0

4. Delete lv1

# lvremove /dev/vg/lv1

5. Mount lv2

# mount /mnt; echo $?
0


Actual results:

/mnt gets automatically unmounted by systemd. Journal shows:

systemd[1]: Unit mnt.mount is bound to inactive unit dev-vg-lv1.device. Stopping, too.
systemd[1]: Unmounting /mnt...
systemd[1]: Unmounted /mnt.


Expected results:

No unmount


Additional info:

This seems to be due to some cache within systemd: even though /mnt has been unmounted, the unit still exists for systemd:
# systemctl --all | grep mnt.mount
  mnt.mount                                                                                                      loaded    inactive dead      /mnt

I would expect the unit to be destroyed upon umount, but that is not the case, presumably because it is referenced in /etc/fstab.

Comment 2 John Nielsen 2018-03-14 17:54:39 UTC
I just got bitten by this as well. Had an iSCSI volume mounted, needed to use different storage, transferred files, unmounted iSCSI and mounted the new storage at the same mount point. So far so good. But then, when I removed the iSCSI target volume and logged out of it, systemd "failed" the mount and unmounted the NEW storage, and kept unmounting it as soon as I tried to mount it again.

Worked around it by commenting the mount in fstab, running "systemctl daemon-reload" and "systemctl reset-failed", un-commenting the mount from fstab and running "systemctl daemon-reload" again (and then re-mounting, of course).
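
For reference, a sketch of that workaround sequence (the mount point /mnt is just a placeholder):

  # 1. comment out the /mnt line in /etc/fstab, then:
  systemctl daemon-reload
  systemctl reset-failed
  # 2. un-comment the /mnt line in /etc/fstab, then:
  systemctl daemon-reload
  mount /mnt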

Comment 3 Anderson 2018-03-26 20:14:34 UTC
I just discovered a 3-day data loss because of this bug. Last Friday, I finished a migration of a filesystem which stores a central log server. Then I removed the old logical volume and did not notice that systemd had unmounted the migrated filesystem and stopped the "rsyslog" daemon (*).


----------------

(*) Note: actually, "rsyslog" stopped because of a custom unit parameter:

[Unit]
Requires=dev-mapper-volg-lvol.device
After=dev-mapper-volg-lvol.device

Comment 4 Anderson 2018-03-26 20:24:59 UTC
Created attachment 1413329 [details]
Bash script for bug reproduction

The attached shell script allows reproduction of the bug. After a data migration simulation, systemd enters a loop that tries to unmount a busy filesystem.

Comment 5 rhayden 2018-05-07 13:43:30 UTC
I potentially ran into this issue and have logged a support ticket with Red Hat (CASE 02093462). In my situation, the SysAdmin incorrectly added an entry to /etc/fstab with the wrong logical volume name. They manually mounted the file system successfully. Later, after a reboot, the file system would not mount because systemd had associated the mount unit with the inactive logical volume. After "systemctl daemon-reload" was executed, the file system mounted properly.

Comment 6 Tan Xiaojun 2018-06-01 08:54:26 UTC
Hi, all,

We also have a systemd-related problem in RHEL-7.4 testing. It looks similar to this one. Can you provide a fixed systemd package so that we can upgrade and test it?

Thanks.
Xiaojun.

Comment 7 Tan Xiaojun 2018-06-01 09:22:33 UTC
(In reply to Tan Xiaojun from comment #6)
> Hi, all,
> 
> We also have a systemd-related problem in RHEL-7.4 testing. It looks
> similar to this one. Can you provide a fixed systemd package so that we
> can upgrade and test it?
> 
> Thanks.
> Xiaojun.

Oh. BTW, we need the AArch64(arm64) version of the package.

Thanks a lot.
Xiaojun.

Comment 8 Renaud Métrich 2018-06-01 10:03:55 UTC
There is no fix yet. Workaround is to run "systemctl daemon-reload" after modifying the mount point.

Comment 9 Thomas Gardner 2018-06-22 19:59:20 UTC
This behavior is unexpected, unnecessary, and frankly dangerous.

It is unexpected because ever since the existence of an /etc/fstab
file, it was only consulted under two conditions: 1) during boot, it
would be read to see what filesystems need to be mounted, and 2) the
"mount" command would read it if the parameters given to it required
that (for instance "mount -a" or "mount /some/mount/point"). These
really are the only two times I can think of off the top of my head
that this file has ever traditionally been consulted. In fact, since
boot generally has always just done a "mount -a", these could be
considered one, but I digress. This file was never baby-sat to make
sure that the things listed in it are always mounted, or to
automatically re-mount them when they are unmounted externally.
Certainly at no time has it ever been considered normal for the
_former_ contents of this file to be consulted and enforced by the
system (which in essence is what happens here).

It's unnecessary simply because it has never been anyone's expectation
that this file (or some idea of what might be in it) be enforced by
the system.

It's dangerous because people are just not expecting this; they don't
realize they have to change their behavior, and that is reasonable,
because the file looks the same as it always has. Here's a "for
instance" for you: let's say someone has a swap device that they feel
should be bigger. They add some storage, make that storage into a
new swap device, update the fstab, start swapping to the new device,
and stop swapping on the old one. They figure they're done: they're
swapping to the new one now, and the next time the machine boots it
will pick up the change in /etc/fstab, because that is how it has
always worked. They go on their merry way. At some point, maybe the
next day, maybe a week later, the storage admin sees the sysadmin in
the hall and asks if they finished that swap exchange. As far as the
sysadmin knows they're done, and says so. They talk about the weather,
the game last night, complain about management and whatever else. The
next time the storage admin thinks of it, they remove that old swap
LUN. Neither of them thought to check whether systemd had decided it
was smarter than everyone else and made the machine start swapping to
the old device again.

At this point, you've just irreversibly corrupted memory. Maybe you've
corrupted your database in the process... The machine crashes and we
all have a terrible time trying to figure out what the heck happened.
The customer doesn't make the connection between the removed LUN and
the crash (perhaps the storage admin tarried for a day or two more
before removing it, so the sysadmin had no idea there was even a
correlation in time with any other event), so they never even mention
it to us. What a nightmare for everyone involved. Well, everyone
except the developers who thought this was a good idea in the first
place. Since we never figure out that this is the reason we lost that
customer (because we never figured out this is what happened), the
programmers get off scot-free, completely oblivious and emboldened to
continue making decisions like this. Well, at least until the company
goes under because we keep doing things like this and all our
customers just leave when someone better comes along.

OK, so maybe I'm having a little fun with the story time, but I think
you get the idea:  This was just not a good move.  If you're going to
leave a well known file hanging around in your distribution, with the
same format it's always had, you really should continue to use it in
the same way it's always been used.  If you want to change behavior
in a drastic way like this, the old well-known file needs to go away
so that people don't continue to use it the way they always have and
expect it to work the way it always has.

Comment 12 Renaud Métrich 2018-07-06 11:45:20 UTC
My reproducer above is incomplete.

After creating the /dev/vg/lv1 entry in /etc/fstab, run "systemctl daemon-reload" so that the "mnt.mount" unit gets created.

Then, because /etc/fstab has been modified without reloading systemd afterwards, the mnt.mount unit points to an invalid (non-existent) device:

-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
# systemctl status mnt.mount
● mnt.mount - /mnt
   Loaded: loaded (/etc/fstab; bad; vendor preset: disabled)
   Active: inactive (dead) since Fri 2018-07-06 13:42:45 CEST; 4s ago
    Where: /mnt
     What: /dev/vg/lv1
     Docs: man:fstab(5)
           man:systemd-fstab-generator(8)
  Process: 1521 ExecUnmount=/bin/umount /mnt (code=exited, status=0/SUCCESS)

Jul 06 13:42:45 vm-systemd7-umount systemd[1]: About to execute: /bin/umount /mnt
Jul 06 13:42:45 vm-systemd7-umount systemd[1]: Forked /bin/umount as 1521
Jul 06 13:42:45 vm-systemd7-umount systemd[1]: mnt.mount changed mounted -> unmounting
Jul 06 13:42:45 vm-systemd7-umount systemd[1]: Unmounting /mnt...
Jul 06 13:42:45 vm-systemd7-umount systemd[1521]: Executing: /bin/umount /mnt
Jul 06 13:42:45 vm-systemd7-umount systemd[1]: Child 1521 belongs to mnt.mount
Jul 06 13:42:45 vm-systemd7-umount systemd[1]: mnt.mount mount process exited, code=exited status=0
Jul 06 13:42:45 vm-systemd7-umount systemd[1]: mnt.mount changed unmounting -> dead
Jul 06 13:42:45 vm-systemd7-umount systemd[1]: Job mnt.mount/stop finished, result=done
Jul 06 13:42:45 vm-systemd7-umount systemd[1]: Unmounted /mnt.
Warning: mnt.mount changed on disk. Run 'systemctl daemon-reload' to reload units.
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------

See also the "Warning" which clearly indicates the issue.

Comment 13 Renaud Métrich 2018-07-06 13:44:39 UTC
I can confirm that this issue has nothing to do with LVM; I can reproduce it with iSCSI (see below for the reproducer).

-------------------------------------------------------------------------------------------------
Also, the issue only reproduces if the filesystem is in /etc/fstab AND the "noauto" flag is NOT set.
-------------------------------------------------------------------------------------------------

I could also reproduce on Fedora 27. I will update the GitHub issue (https://github.com/systemd/systemd/issues/8596) also, hoping Lennart will reopen it.

Reproducer using iSCSI, with 2 targets and 1 LUN in each, so that a target can be deleted to mimic switching between targets (e.g. a Disaster Recovery scenario).

Server (RHEL7):

  - copy saveconfig.json to /etc/target

  - create disks:

      truncate -s 200M /tmp/disk1.img
      truncate -s 200M /tmp/disk2.img

  - restart "target" service:

      systemctl restart target

  - stop the firewall for convenience:

      systemctl stop firewalld

Client (Fedora 27 or RHEL7):

  - copy initiatorname.iscsi to /etc/iscsi/

  - discover the targets (XXX == iscsi server):

      iscsiadm -m discovery -t st -p XXX

  - attach the targets (replace XXX by what was found above):

      iscsiadm -m node -T iqn.2003-01.org.linux-iscsi.XXX.x8664:sn.c0e14b4d5602 -l
      iscsiadm -m node -T iqn.2003-01.org.linux-iscsi.XXX.x8664:sn.f6f77528fc0b -l

  - format luns:

      mkfs.xfs -L DISK1 /dev/disk/by-id/scsi-3600140529f7fada8dfd43babba397b96
      mkfs.xfs -L DISK2 /dev/disk/by-id/scsi-360014051f1fbab5955b4facafb2a36fc

Now, run the usual scenario. For convenience, I did the following:

1. Edit /etc/fstab and add line:

  LABEL=DISK1 /mnt xfs defaults 0 0

2. Fake a normal boot

  # systemctl daemon-reload; mount /mnt

3. Unmount /mnt

  # umount /mnt

4. Edit /etc/fstab and change /mnt line to use second disk:

  LABEL=DISK2 /mnt xfs defaults 0 0

5. Detach the 1st target with LUN DISK1 (replace XXX by what was found above):

  # iscsiadm -m node -T iqn.2003-01.org.linux-iscsi.XXX.x8664:sn.c0e14b4d5602 -u

6. Check that "systemctl status mnt.mount" complains about reloading (don't reload of course)

  # systemctl status mnt.mount | grep Warning
  Warning: mnt.mount changed on disk. Run 'systemctl daemon-reload' to reload units.

7. Mount the new disk:

  # mount /mnt

--> gets automatically unmounted until "systemctl daemon-reload" is performed.

Comment 14 Renaud Métrich 2018-07-06 13:48:07 UTC
Created attachment 1457011 [details]
iSCSI reproducer

Comment 15 Thomas Gardner 2018-07-09 12:06:40 UTC
Let's try this from a different angle:

Will someone from the systemd development team please explain what the design intent was behind this behavior?  What problem was this behavior intended to solve, or what does it bring to the user which was not available to the user before?

Is anyone from the systemd development team even getting any of this?  If you're out there, would you at least drop us a line and let us know that you see this?  I just did a quick scan and didn't see any response from anyone in development.  Either I missed it, or perhaps there is something wrong with the flags on this bug which make it unseen, or perhaps, well, something else.

If we get no response from development soon, I'll assume the best intentions and start trying to figure out why no one from the systemd development team is seeing this.

Comment 16 Alexandros Panagiotou 2018-08-13 09:16:44 UTC
Hello,
I have been testing this in order to understand when it pops up in more detail:

[1] The type of device being mounted is of little importance. I have successfully reproduced it even with SCSI devices on VMs and with loop devices (in addition to the iSCSI and LVM cases mentioned earlier). Details of the reproducers are below.

[2] Based on my tests, the key factor appears to be that the device described in the mount unit (in "What=") is invalid (non-existent). This triggers systemd to take action.

    There are other secondary artifacts related to this, such as systemd auto-mounting the filesystem as soon as the device re-appears. This is most likely not what the administrator expects: auto-mounting is not necessarily a good idea and breaks conventions dating back decades; for example, the administrator might wish to run a filesystem check on the device first.

[3] It is important to note that, as long as the device mentioned in "What=" in the mount systemd unit exists, systemd *will* accept other devices being mounted on the same mount point.

[4] This still appears on a fully updated RHEL 7.5 with systemd-219-57.el7.x86_64.



[5] Reproducers:

A. With loop devices:
[A.1] Create 2 disks for the loop devices:

      truncate -s 10g 10gdisk1
      truncate -s 10g 10gdisk2

[A.2] Create the loop devices:

      losetup -f 10gdisk1
      losetup -f 10gdisk2

      Confirm they are there:

      losetup -a

[A.3] Format them with mkfs (this is only necessary the first time the loop devices are created):

      mkfs.xfs /dev/loop0
      mkfs.xfs /dev/loop1

[A.4] In /etc/fstab, add the entry:

      /dev/loop0 /mnt xfs defaults 0 0

      and then reload the units:

      systemctl daemon-reload

[A.5] Remove /dev/loop0:

      losetup -d /dev/loop0

[A.6] Try to mount /dev/loop1 on /mnt

      mount /dev/loop1 /mnt

In the logs the following messages appear:

Aug 13 10:46:14 fastvm-rhel-7-5-100 kernel: XFS (loop1): Mounting V5 Filesystem
Aug 13 10:46:14 fastvm-rhel-7-5-100 kernel: XFS (loop1): Ending clean mount
Aug 13 10:46:14 fastvm-rhel-7-5-100 systemd: Unit mnt.mount is bound to inactive unit dev-loop0.device. Stopping, too.
Aug 13 10:46:14 fastvm-rhel-7-5-100 systemd: Unmounting /mnt...
Aug 13 10:46:14 fastvm-rhel-7-5-100 kernel: XFS (loop1): Unmounting Filesystem
Aug 13 10:46:14 fastvm-rhel-7-5-100 systemd: Unmounted /mnt.

--------------------------------------------

B. With plain scsi devices:

[B.1] Create a system with (at least) 2 additional SCSI devices apart from the one used for root (e.g. sdb & sdc). This can be a VM with 2 additional disks. Format the devices with a filesystem (I've been using xfs):

      mkfs.xfs /dev/sdb
      mkfs.xfs /dev/sdc

[B.2] Add the following entry in fstab and reload systemd units:

      /dev/sdb /mnt xfs defaults 0 0

      And then:

      systemctl daemon-reload

[B.3] Remove /dev/sdb from the system:

      echo 1 > /sys/block/sdb/device/delete

[B.4] Mount /dev/sdc on /mnt:

      mount /dev/sdc /mnt

The following appears in the logs:

Aug 13 11:08:09 fastvm-rhel-7-5-100 kernel: XFS (sdc): Mounting V5 Filesystem
Aug 13 11:08:09 fastvm-rhel-7-5-100 kernel: XFS (sdc): Ending clean mount
Aug 13 11:08:09 fastvm-rhel-7-5-100 systemd: Unit mnt.mount is bound to inactive unit dev-sdb.device. Stopping, too.
Aug 13 11:08:09 fastvm-rhel-7-5-100 systemd: Unmounting /mnt...
Aug 13 11:08:09 fastvm-rhel-7-5-100 kernel: XFS (sdc): Unmounting Filesystem
Aug 13 11:08:09 fastvm-rhel-7-5-100 systemd: Unmounted /mnt.

--------------------------------------------

Additional note for B:

It is interesting that after rescanning and re-detecting /dev/sdb, it gets auto-mounted. To be more specific, as soon as I run:

echo '- - -' > /sys/class/scsi_host/host2/scan 

I see in the logs:

Aug 13 11:09:49 fastvm-rhel-7-5-100 kernel: scsi 2:0:0:2: Direct-Access     QEMU     QEMU HARDDISK    1.5. PQ: 0 ANSI: 5
Aug 13 11:09:49 fastvm-rhel-7-5-100 kernel: sd 2:0:0:2: Attached scsi generic sg1 type 0
Aug 13 11:09:49 fastvm-rhel-7-5-100 kernel: sd 2:0:0:2: [sdb] 41943040 512-byte logical blocks: (21.4 GB/20.0 GiB)
Aug 13 11:09:49 fastvm-rhel-7-5-100 kernel: sd 2:0:0:2: [sdb] Write Protect is off
Aug 13 11:09:49 fastvm-rhel-7-5-100 kernel: sd 2:0:0:2: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Aug 13 11:09:49 fastvm-rhel-7-5-100 kernel: sd 2:0:0:2: [sdb] Attached SCSI disk
Aug 13 11:09:49 fastvm-rhel-7-5-100 systemd: mnt.mount: Directory /mnt to mount over is not empty, mounting anyway.
Aug 13 11:09:49 fastvm-rhel-7-5-100 systemd: Mounting /mnt...
Aug 13 11:09:49 fastvm-rhel-7-5-100 kernel: XFS (sdb): Mounting V5 Filesystem
Aug 13 11:09:49 fastvm-rhel-7-5-100 kernel: XFS (sdb): Ending clean mount
Aug 13 11:09:49 fastvm-rhel-7-5-100 systemd: Mounted /mnt.

Nobody issued a mount command here! And nobody would expect this to be mounted.

(In the case of my test system, sdb has host:bus:target:lun: 2:0:0:2 - therefore it is on host2. The host number may need to be adjusted on other systems)

Regards,
Alexandros

Comment 17 Renaud Métrich 2018-08-13 14:43:53 UTC
The issue happens as soon as the mount unit is part of the dependency tree (through WantedBy or RequiredBy).
Hence, to reproduce with an /etc/fstab entry, "noauto" doesn't work (because no dependency is created), but "nofail" is enough (because a WantedBy dependency is created); see the example entries below.
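
For illustration, /etc/fstab entries of both kinds (device and mount point are placeholders):

  # does NOT trigger the issue: "noauto" creates no dependency
  /dev/vg/lv2  /mnt  xfs  defaults,noauto  0 0

  # DOES trigger the issue: "nofail" still creates a WantedBy dependency
  /dev/vg/lv2  /mnt  xfs  defaults,nofail  0 0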

Simplest reproducer without /etc/fstab:

# truncate -s 1G /root/myloop
# losetup /dev/loop1 /root/myloop
# mkfs.xfs /dev/loop1

# cat /etc/systemd/system/mnt.mount 
[Mount]
What=/dev/loop0
Where=/mnt
Type=xfs
Options=defaults

[Install]
WantedBy=multi-user.target

# systemctl enable mnt.mount


/dev/loop0 doesn't exist.
Every time I try to "mount /dev/loop1 /mnt", "/mnt" gets automatically unmounted.

systemd backtrace when this happens:

(gdb) break unit_check_binds_to
Breakpoint 1 at 0x557d18317e70: file src/core/unit.c, line 1642.
(gdb) cont
Continuing.

Breakpoint 1, unit_notify (u=0x557d19fc9790, os=UNIT_ACTIVE, ns=UNIT_ACTIVE, 
    reload_success=<optimized out>) at src/core/unit.c:1990
1990	                unit_check_binds_to(u);
(gdb) bt
#0  unit_notify (u=0x557d19fc9790, os=UNIT_ACTIVE, ns=UNIT_ACTIVE, reload_success=<optimized out>)
    at src/core/unit.c:1990
#1  0x0000557d18324a01 in device_update_found_by_name (now=true, found=DEVICE_FOUND_MOUNT, add=true, 
    path=0x557d19efdd90 "\200\313\357\031}U", m=0x557d19efdd90) at src/core/device.c:523
#2  device_found_node (m=m@entry=0x557d19efdd90, node=node@entry=0x557d19f10440 "/dev/loop1", 
    add=add@entry=true, found=found@entry=DEVICE_FOUND_MOUNT, now=now@entry=true)
    at src/core/device.c:833
#3  0x0000557d18327757 in mount_load_proc_self_mountinfo (m=m@entry=0x557d19efdd90, 
    set_flags=set_flags@entry=true) at src/core/mount.c:1609
#4  0x0000557d18328447 in mount_dispatch_io (source=<optimized out>, fd=<optimized out>, 
    revents=<optimized out>, userdata=0x557d19efdd90) at src/core/mount.c:1756
#5  0x0000557d1830eae0 in source_dispatch (s=s@entry=0x557d19f066e0)
    at src/libsystemd/sd-event/sd-event.c:2115
#6  0x0000557d1830fb7a in sd_event_dispatch (e=0x557d19efe280)
    at src/libsystemd/sd-event/sd-event.c:2472
#7  0x0000557d1830fd1f in sd_event_run (e=<optimized out>, timeout=<optimized out>)
    at src/libsystemd/sd-event/sd-event.c:2501
#8  0x0000557d18270ec3 in manager_loop (m=0x557d19efdd90) at src/core/manager.c:2212
#9  0x0000557d1826546b in main (argc=5, argv=0x7ffdc938e5f8) at src/core/main.c:1773
(gdb)

Comment 18 Renaud Métrich 2018-08-14 08:40:24 UTC
After the mount occurs, systemd discovers the mount and we fall into this piece of code:

1669 static void retroactively_start_dependencies(Unit *u) {
1670         Iterator i;
1671         Unit *other;
1672 
1673         assert(u);
1674         assert(UNIT_IS_ACTIVE_OR_ACTIVATING(unit_active_state(u)));
1675 
...
1681         SET_FOREACH(other, u->dependencies[UNIT_BINDS_TO], i)
1682                 if (!set_get(u->dependencies[UNIT_AFTER], other) &&
1683                     !UNIT_IS_ACTIVE_OR_ACTIVATING(unit_active_state(other)))
1684                         manager_add_job(u->manager, JOB_START, other, JOB_REPLACE, true, NULL, NULL)     ;


Case of 'WantedBy=multi-user.target' as dependency:

At line 1681, we have "other" == dev-loop0.device, which is the non-existent device.
This is what causes the automatic umount. We should have gotten "dev-loop1.device" instead.

Case of no 'WantedBy':

At line 1681, we have "other" == dev-loop1.device (i.e. the correct device).


Apparently, the u->dependencies list doesn't get updated with the proper device.
Below is the backtrace when the device is the expected one (dev-loop1.device):

2204	int unit_add_dependency(Unit *u, UnitDependency d, Unit *other, bool add_reference) {
(gdb) p *other
$4 = {manager = 0x557d19efdd90, type = UNIT_DEVICE, load_state = UNIT_LOADED, merged_into = 0x0, 
  id = 0x557d19f099a0 "dev-loop1.device", instance = 0x0, names = 0x557d19f85e40, dependencies = {
    0x0 <repeats 24 times>}, requires_mounts_for = 0x0, description = 0x557d19fab290 "/dev/loop1", 
  documentation = 0x0, fragment_path = 0x0, source_path = 0x0, dropin_paths = 0x0, fragment_mtime = 0, 
  source_mtime = 0, dropin_mtime = 0, job = 0x0, nop_job = 0x0, job_timeout = 90000000, 
  job_timeout_action = EMERGENCY_ACTION_NONE, job_timeout_reboot_arg = 0x0, refs = 0x0, 
  conditions = 0x0, asserts = 0x0, condition_timestamp = {realtime = 0, monotonic = 0}, 
  assert_timestamp = {realtime = 0, monotonic = 0}, inactive_exit_timestamp = {
    realtime = 1534170428533938, monotonic = 291516132}, active_enter_timestamp = {
    realtime = 1534170428533938, monotonic = 291516132}, active_exit_timestamp = {realtime = 0, 
    monotonic = 0}, inactive_enter_timestamp = {realtime = 0, monotonic = 0}, slice = {unit = 0x0, 
    refs_next = 0x0, refs_prev = 0x0}, units_by_type_next = 0x557d19f936c0, 
  units_by_type_prev = 0x557d19f93ce0, has_requires_mounts_for_next = 0x0, 
  has_requires_mounts_for_prev = 0x0, load_queue_next = 0x0, load_queue_prev = 0x0, 
  dbus_queue_next = 0x557d19f394d0, dbus_queue_prev = 0x557d19fc5010, cleanup_queue_next = 0x0, 
  cleanup_queue_prev = 0x0, gc_queue_next = 0x0, gc_queue_prev = 0x0, cgroup_queue_next = 0x0, 
  cgroup_queue_prev = 0x0, pids = 0x0, sigchldgen = 0, gc_marker = 8714, deserialized_job = -1, 
  load_error = 0, unit_file_state = _UNIT_FILE_STATE_INVALID, unit_file_preset = -1, cgroup_path = 0x0, 
  cgroup_realized_mask = 0, cgroup_subtree_mask = 0, cgroup_members_mask = 0, 
  on_failure_job_mode = JOB_REPLACE, stop_when_unneeded = false, default_dependencies = true, 
  refuse_manual_start = false, refuse_manual_stop = false, allow_isolate = false, 
  ignore_on_isolate = true, ignore_on_snapshot = true, condition_result = false, assert_result = false, 
  transient = false, in_load_queue = false, in_dbus_queue = true, in_cleanup_queue = false, 
  in_gc_queue = false, in_cgroup_queue = false, sent_dbus_new_signal = true, no_gc = false, 
  in_audit = false, cgroup_realized = false, cgroup_members_mask_valid = true, 
  cgroup_subtree_mask_valid = true}
(gdb) bt
#0  unit_add_dependency (u=u@entry=0x557d19fc5010, d=UNIT_AFTER, other=other@entry=0x557d19f93a00, 
    add_reference=add_reference@entry=true) at src/core/unit.c:2204
#1  0x0000557d18313a57 in unit_add_two_dependencies (u=0x557d19fc5010, d=<optimized out>, 
    e=UNIT_BINDS_TO, other=0x557d19f93a00, add_reference=true) at src/core/unit.c:2314
#2  0x0000557d183154c9 in unit_add_node_link (u=u@entry=0x557d19fc5010, what=<optimized out>, 
    wants=<optimized out>, dep=UNIT_BINDS_TO) at src/core/unit.c:2869
#3  0x0000557d183271d1 in mount_add_device_links (m=0x557d19fc5010) at src/core/mount.c:348
#4  mount_add_extras (m=m@entry=0x557d19fc5010) at src/core/mount.c:521
#5  0x0000557d18328968 in mount_load (u=0x557d19fc5010) at src/core/mount.c:571
#6  0x0000557d18313db0 in unit_load (u=0x557d19fc5010) at src/core/unit.c:1209
#7  0x0000557d1826d00e in manager_dispatch_load_queue (m=m@entry=0x557d19efdd90)
    at src/core/manager.c:1394
#8  0x0000557d18328457 in mount_dispatch_io (source=<optimized out>, fd=<optimized out>, 
    revents=<optimized out>, userdata=0x557d19efdd90) at src/core/mount.c:1768
#9  0x0000557d1830eae0 in source_dispatch (s=s@entry=0x557d19f066e0)
    at src/libsystemd/sd-event/sd-event.c:2115
#10 0x0000557d1830fb7a in sd_event_dispatch (e=0x557d19efe280)
    at src/libsystemd/sd-event/sd-event.c:2472
#11 0x0000557d1830fd1f in sd_event_run (e=<optimized out>, timeout=<optimized out>)
    at src/libsystemd/sd-event/sd-event.c:2501
#12 0x0000557d18270ec3 in manager_loop (m=0x557d19efdd90) at src/core/manager.c:2212
#13 0x0000557d1826546b in main (argc=5, argv=0x7ffdc938e5f8) at src/core/main.c:1773
(gdb)

Comment 19 Renaud Métrich 2018-08-14 10:14:40 UTC
The difference comes from mount_setup_unit().

When mnt.mount is not known to systemd (not in the dependency tree), a new unit is created with the right device, dev-loop1.device.
When mnt.mount is known to systemd because it is in the dependency tree, the existing unit is reused, causing the wrong device, dev-loop0.device, to be used in the BindsTo dependency.

Comment 20 Renaud Métrich 2018-08-14 12:21:18 UTC
Summary of the findings:

At the time of the mount, when a mount unit already exists for systemd (typically because systemd-fstab-generator generated it and linked it to other units, such as remote-fs.target for a remote mount), the mount unit is reused, and nothing is changed except the "What" property (which is read from /proc/self/mounts).
This leads to an inconsistency between configured dependencies and expected ones:

# systemctl show mnt.mount | grep loop
What=/dev/loop1
BindsTo=dev-loop0.device
WantedBy=multi-user.target dev-loop0.device
After=local-fs-pre.target dev-loop0.device systemd-journald.socket system.slice -.mount
RequiresMountsFor=/ /dev/loop0


When the mount unit doesn't exist yet, there is of course no inconsistency, since a new unit is built.

The design issue is the assumption that a mount point name is unique, which explains why the mount unit for a given mount point is named "<mountpoint>.mount".
What is actually unique is the tuple "<device> + <mountpoint>".
If systemd used that as the unit name instead, there would be no issue at all (see the example below).
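
This naming scheme can be seen with systemd-escape, which builds the unit name from the mount point path alone, leaving no room for the device (the output shown is illustrative):

  # systemd-escape --path --suffix=mount /mnt
  mnt.mount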


Another scenario where the issue can occur (and that's even freakier!):

- Assume /mnt is defined in /etc/fstab with device /dev/loop0 and that device exists.
- Unmount /mnt
- Mount /dev/loop1 as /mnt instead: mount /dev/loop1 /mnt.
- Delete /dev/loop0: losetup -d /dev/loop0

Result: /mnt gets unmounted! (since there is the BindsTo=dev-loop0.device property).


I can reproduce on Fedora 28 as well; now preparing to re-open the issue upstream...

Comment 21 Anderson 2018-08-15 13:01:26 UTC
I am wondering why this bug report was tagged as "confidential" ( https://imgur.com/a/yHKWGPj ), because:

# I was able to subscribe to notifications from this bug report after finding it using Google Search. Therefore, this bug was originally public.

# If this bug is hidden, neither the systemd developers nor the Linux community will be able to retrieve additional information that was not forwarded to systemd's bug tracker ( https://github.com/systemd/systemd/issues/8596 and https://github.com/systemd/systemd/issues/9869 ).

# I believe this bug is not related to information security, as I do not see any security-related information.

A systemd developer wrote me the following comment: "I am sorry, but if you have issues with LVM2, please first contact your downstreams and the LVM community.". I have not contacted the LVM community because I do not believe it is needed; however, how will I prove that a previous contact with the distribution maintainers has already been made if this bug report is hidden?

Unless there is a valid reason which I have not seen yet, I request that Red Hat make this bug report public again.

Comment 22 John Marble 2018-08-16 17:25:38 UTC
I'm also seeing this bug. We have a process where we remove a test filesystem and volume group, take a snapshot of a production filesystem at the storage array level, and then mount it back where the original test filesystem was mounted. Basically, we are just re-using the same mount point with a new device. The only way I can get it to work reliably is to run "systemctl daemon-reload" before the mount. I don't really understand why there are no updates to this bug, which seems easy to reproduce.

Comment 23 Jan Synacek 2018-10-09 12:57:41 UTC
No solution is known as of now, see https://github.com/systemd/systemd/issues/9869.

Comment 24 Alexandros Panagiotou 2018-11-01 15:59:14 UTC
Hello,
There is one additional artifact resulting from this behaviour of systemd: if the filesystem is in use and systemd cannot unmount it, systemd gets into an endless loop, flooding the logs with messages about its failure to unmount the busy filesystem. On a test VM, I got about 20k lines from systemd (and umount) for these failures in about 2 minutes. A simple way to reproduce this is:

[1] Create the devices (loop devices in this example, but only because they can be deleted easily; any block device will do):

    truncate -s 10G /tmp/bigfile1
    truncate -s 10G /tmp/bigfile2
    losetup -f /tmp/bigfile1
    losetup -f /tmp/bigfile2
    losetup -a
    /dev/loop0: [64768]:4194370 (/tmp/bigfile1)
    /dev/loop1: [64768]:4674228 (/tmp/bigfile2)

    mkfs.xfs /dev/loop0
    mkfs.xfs /dev/loop1
    mkdir /test

[2] Add in /etc/fstab an entry to mount /dev/loop0 on /test, e.g.:

    /dev/loop0 /test                       xfs     defaults        0 0

[3] Reload systemd units and mount everything from fstab:

    systemctl daemon-reload
    mount -a
    grep test /proc/mounts 
    /dev/loop0 /test xfs rw,seclabel,relatime,attr2,inode64,noquota 0 0

[4] Unmount /dev/loop0 and mount /dev/loop1 on /test

    umount /dev/loop0
    mount /dev/loop1 /test
    grep test /proc/mounts 
    /dev/loop1 /test xfs rw,seclabel,relatime,attr2,inode64,noquota 0 0

[5] Keep /test busy (e.g. cd /test) and monitor /var/log/messages with tail -f

[6] Delete /dev/loop0:

    losetup -d /dev/loop0

Result: continuous messages from umount and systemd about /test being busy.

Regards,
Alexandros

Comment 25 Gerwin Krist 2019-02-06 09:09:57 UTC
Hi,

(I also put this update on the upstream project: https://github.com/systemd/systemd/issues/9869).

We are seeing this exact behaviour on our RHEL7 servers as well, but with another use case. The problem is triggered when the backup client starts a backup at the same time as the puppet client (which restarts services) runs.

We use R1Soft backup software, which makes a snapshot on a special partition (/dev/hcp1); when the client is done, it unmounts that partition (see 13:37:10 below). systemd then unmounts the /var partition and renders the server useless. IMHO this is very bad behaviour, and for us the only workaround is to stop the backup servers before we roll new updates/restarts out to our systems (using Puppet).

Feb  5 13:37:10 xxxxxxxx kernel: hcp: INFO: dec_device_user: 2,9942,cdp-2-6
Feb  5 13:37:10 xxxxxxxx kernel: hcp: INFO: stopping hcp session hcp1.
Feb  5 13:37:10 xxxxxxxx kernel: hcp: INFO: hcp session hcp1 stopped.
Feb  5 13:37:10 xxxxxxxx systemd: Stopping MariaDB database server...
Feb  5 13:37:10 xxxxxxxx systemd: Stopping Avahi mDNS/DNS-SD Stack...
Feb  5 13:37:10 xxxxxxxx systemd: Stopping NTP client/server...
Feb  5 13:37:10 xxxxxxxx systemd: Stopped target Local File Systems.
Feb  5 13:37:10 xxxxxxxx systemd: Stopping Load/Save Random Seed...
Feb  5 13:37:10 xxxxxxxx systemd: Stopped This is the timer to set the schedule for automated renewals.
Feb  5 13:37:10 xxxxxxxx systemd: Stopping The nginx HTTP and reverse proxy server...
Feb  5 13:37:10 xxxxxxxx systemd: Stopping Update UTMP about System Boot/Shutdown...
Feb  5 13:37:10 xxxxxxxx systemd: Stopped Flush Journal to Persistent Storage.
Feb  5 13:37:10 xxxxxxxx chronyd[2938]: chronyd exiting
Feb  5 13:37:10 xxxxxxxx systemd: Stopped Flexible branding.
Feb  5 13:37:10 xxxxxxxx systemd: Stopping The PHP FastCGI Process Manager...
Feb  5 13:37:10 xxxxxxxx systemd: Stopped The nginx HTTP and reverse proxy server.
Feb  5 13:37:10 xxxxxxxx systemd: Stopped Update UTMP about System Boot/Shutdown.
Feb  5 13:37:10 xxxxxxxx systemd: Stopped The PHP FastCGI Process Manager.
Feb  5 13:37:10 xxxxxxxx systemd: Stopped Load/Save Random Seed.
Feb  5 13:37:10 xxxxxxxx avahi-daemon[2887]: Got SIGTERM, quitting.
Feb  5 13:37:10 xxxxxxxx avahi-daemon[2887]: Leaving mDNS multicast group on interface eth0.IPv4 with address ip.ip.ip.ip.
Feb  5 13:37:10 xxxxxxxx avahi-daemon[2887]: avahi-daemon 0.6.31 exiting.
Feb  5 13:37:10 xxxxxxxx systemd: Stopped NTP client/server.
Feb  5 13:37:10 xxxxxxxx systemd: Stopped Avahi mDNS/DNS-SD Stack.
Feb  5 13:37:10 xxxxxxxx systemd: Closed Avahi mDNS/DNS-SD Stack Activation Socket.
Feb  5 13:37:11 xxxxxxxx systemd: Stopped The NGINX part of SMT.
Feb  5 13:37:15 xxxxxxxx systemd: Stopped MariaDB database server.
Feb  5 13:37:15 xxxxxxxx systemd: Unmounting /var...
Feb  5 13:37:15 xxxxxxxx umount: umount: /var: target is busy.
Feb  5 13:37:15 xxxxxxxx umount: (In some cases useful info about processes that use
Feb  5 13:37:15 xxxxxxxx umount: the device is found by lsof(8) or fuser(1))
Feb  5 13:37:15 xxxxxxxx systemd: var.mount mount process exited, code=exited status=32
Feb  5 13:37:15 xxxxxxxx systemd: Failed unmounting /var.
Feb  5 13:37:15 xxxxxxxx systemd: Unit var.mount is bound to inactive unit dev-disk-by\x2duuid-7ec6b55c\x2d5923\x2d4dd2\x2db8aa\x2da821e84f71ee.device. Stopping, too.
Feb  5 13:37:15 xxxxxxxx systemd: Unmounting /var...
Feb  5 13:37:15 xxxxxxxx umount: umount: /var: target is busy.
Feb  5 13:37:15 xxxxxxxx umount: (In some cases useful info about processes that use
Feb  5 13:37:15 xxxxxxxx umount: the device is found by lsof(8) or fuser(1))

Comment 26 Lukáš Nykrýn 2019-03-04 14:15:18 UTC
This is not fixed in upstream, so it won't be in rhel-7.7.

Comment 27 Gerwin Krist 2019-03-27 16:36:38 UTC
This one is hurting us pretty badly :-( I have filed a support ticket, so I hope this speeds things up.

Comment 28 Ph-Quentin 2019-03-29 18:26:26 UTC
Hello !

Do you have any news about this?

This bug is really critical when you have multiple partitions.

You need to push a fix to the repos ASAP!

Best regards,
Quentin

Comment 29 Renaud Métrich 2019-06-03 13:08:43 UTC
Below is yet another scenario where that mighty behaviour occurs:

1. Set up a system with "/var" on EXT4 filesystem (can be another file system of course)

  By default, Anaconda creates /etc/fstab entries using "UUID=<uuid>" devices

2. Record the UUID for "/var", say "87fda01b-eeaa-4702-806d-ca693f55d6ad"

3. Add a new disk (e.g. "/dev/vdb") to the system and create a file system on it with the same UUID as "/var" ("87fda01b-eeaa-4702-806d-ca693f55d6ad"); see the sketch after these steps

  NOTE: in the real world, this is legitimate, for example when you back up a partition "byte by byte".
  Another example is the third-party backup software https://www.r1soft.com, which creates a temporary device "/dev/hcp1" with the exact same UUID as the partition being backed up.

4. At this point, "/dev/disk/by-uuid/87fda01b-eeaa-4702-806d-ca693f55d6ad" will point to the new disk (e.g. "/dev/vdb") due to the udev rules running

5. Reload systemd

6. Unplug the "/dev/vdb" disk

  NOTE: in the real world, this is legitimate, for example when you remove the backup disk (e.g. if it's a USB device).
  Another example is finishing the backup when using the third-party backup software https://www.r1soft.com, which deletes the temporary device "/dev/hcp1".

7. Udev updates the "/dev/disk/by-uuid/87fda01b-eeaa-4702-806d-ca693f55d6ad" link to point back to initial partition hosting "/var"

8. Systemd tries unmounting "/var" in a loop (because this partition is busy in our case), **breaking** the system
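
A minimal sketch of step 3 above, assuming "/var" is EXT4 and /dev/vdb is blank (the -U option of mkfs.ext4 forces the duplicate UUID; the device paths in the illustrative blkid output are assumptions):

  # mkfs.ext4 -U 87fda01b-eeaa-4702-806d-ca693f55d6ad /dev/vdb
  # blkid | grep 87fda01b
  /dev/vda3: UUID="87fda01b-eeaa-4702-806d-ca693f55d6ad" TYPE="ext4"
  /dev/vdb: UUID="87fda01b-eeaa-4702-806d-ca693f55d6ad" TYPE="ext4"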


The only workaround for this new reproducer is to mount devices by device path instead of LABEL or UUID, but device paths may change ...
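
For example, an /etc/fstab line pinned to a device path rather than a UUID (the path /dev/vda3 is hypothetical and, as noted above, may change):

  /dev/vda3   /var   ext4   defaults   0 0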

I believe this now needs to be taken seriously upstream; this is a critical issue.

Comment 32 Lukáš Nykrýn 2019-08-05 12:10:54 UTC
There is still no solution upstream, so this bug will very likely miss 7.8.

