Bug 2258599 - Leak in uresourced inotify watches
Summary: Leak in uresourced inotify watches
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: uresourced
Version: 39
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: Benjamin Berg
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2024-01-16 13:23 UTC by Raman Gupta
Modified: 2024-11-13 13:08 UTC
CC List: 5 users

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2024-11-13 13:08:52 UTC
Type: ---
Embargoed:



Description Raman Gupta 2024-01-16 13:23:58 UTC
`uresourced` appears to leak inotify watches.

Using `inotify-info`, one can see that the number of watches consumed by `uresourced` keeps increasing, and that it drops sharply as soon as the service is restarted.

Before restart:

```
------------------------------------------------------------------------------
INotify Limits:
max_queued_events    16384
max_user_instances   512
max_user_watches     524288
------------------------------------------------------------------------------
Pid  App                        Watches   Instances
4782 uresourced                     43439   1
```

After restart:

```
Pid  App                        Watches   Instances
1365099 uresourced                     160   1
```


Reproducible: Always

Steps to Reproduce:
1. Wait
2. The leak happens fast enough that running `watch 'inotify-info | grep resourced'` shows the watch count rising every few seconds.
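If `inotify-info` is not available, the same count can be read straight from the kernel's fdinfo interface. A minimal sketch (it only assumes that the per-user uresourced instance belongs to, and is therefore readable by, the current user):

```
#!/bin/sh
# Count the inotify watches held by the current user's uresourced instance:
# every line starting with "inotify" in /proc/<pid>/fdinfo/* is one watch.
pid=$(pgrep -u "$(id -u)" uresourced)
cat /proc/${pid}/fdinfo/* 2>/dev/null | grep -c '^inotify'
```

The number printed should roughly track the Watches column that `inotify-info` reports.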



Another user confirmed the bug here: https://discussion.fedoraproject.org/t/uresourced-461312-inotify-watches-normal/72952/2.

Comment 1 Tom Hughes 2024-01-16 13:51:38 UTC
I wrote a script to demonstrate the problem: it looks at all the inodes being watched by the current user's uresourced and tries to find them in the cgroup tree, and many of them simply don't exist any more:

```
#!/bin/sh

# Find the current user's uresourced instance.
uid=$(id -u)
pid=$(pgrep -u ${uid} uresourced)

# fd 5 is uresourced's inotify descriptor here; every "inotify" line in its
# fdinfo is one active watch, and the third field is the watched inode ("ino:<hex>").
for watch in $(fgrep inotify /proc/${pid}/fdinfo/5 | awk '{ print $3 }')
do
  # Convert the hex inode number to decimal.
  inum=$((16#${watch##ino:}))
  # Look for a cgroup directory that still has this inode.
  name=$(find /sys/fs/cgroup -inum ${inum})

  echo "${inum} - ${name}"
done

exit 0
```

Comment 2 jyx21 2024-01-16 14:17:48 UTC
Twelve hours after restarting uresourced, it is back up to 12758 watches, which is consistent with my routine of restarting uresourced roughly once a week (after vscode starts complaining). Of those, only the following inodes can still be found in /sys:

    8353 /sys/fs/cgroup/user.slice/user-1000.slice/user/app.slice
    8408 /sys/fs/cgroup/user.slice/user-1000.slice/user/app.slice/app-podman\x2dcompose.slice
    8463 /sys/fs/cgroup/user.slice/user-1000.slice/user/app.slice/dbus.socket
    8573 /sys/fs/cgroup/user.slice/user-1000.slice/user/app.slice/app-podman\x2dcompose.slice/podman-compose
    8699 /sys/fs/cgroup/user.slice/user-1000.slice/user/app.slice/app-podman\x2dcompose.slice/podman-compose
   10491 /sys/fs/cgroup/user.slice/user-1000.slice/user/app.slice/run-rfb2b8060ecf24d3b925426e11f9ff9be.scope
   10766 /sys/fs/cgroup/user.slice/user-1000.slice/user/app.slice/podman.service
 8103720 /sys/fs/cgroup/user.slice/user-1000.slice/user/app.slice/app-gnome\x2dsession\x2dmanager.slice
 8103775 /sys/fs/cgroup/user.slice/user-1000.slice/user/app.slice/gnome-session-monitor.service
 8103885 /sys/fs/cgroup/user.slice/user-1000.slice/user/app.slice/app-gnome\x2dsession\x2dmanager.slice/gnome-session-manager
 8104302 /sys/fs/cgroup/user.slice/user-1000.slice/user/app.slice/dbus-:1.223-org.a11y.atspi.Registry
 8104454 /sys/fs/cgroup/user.slice/user-1000.slice/user/app.slice/dbus-:1.2-org.gnome.Shell.CalendarServer
 8104509 /sys/fs/cgroup/user.slice/user-1000.slice/user/app.slice/evolution-source-registry.service
 8104624 /sys/fs/cgroup/user.slice/user-1000.slice/user/app.slice/dbus-:1.2-org.gnome.Shell.Notifications
 8105461 /sys/fs/cgroup/user.slice/user-1000.slice/user/app.slice/dbus-:1.2-org.freedesktop.portal.IBus
 8105516 /sys/fs/cgroup/user.slice/user-1000.slice/user/app.slice/dbus-:1.2-org.freedesktop.problems.applet
 8105741 /sys/fs/cgroup/user.slice/user-1000.slice/user/app.slice/app-gnome-org.gnome.Evolution\x2dalarm\x2dnotify-3399597.scope
 8105796 /sys/fs/cgroup/user.slice/user-1000.slice/user/app.slice/app-gnome-org.gnome.SettingsDaemon.DiskUtilityNotify-3399562.scope
 8105851 /sys/fs/cgroup/user.slice/user-1000.slice/user/app.slice/app-gnome-org.gnome.Software-3399490.scope
 8105906 /sys/fs/cgroup/user.slice/user-1000.slice/user/app.slice/dbus-:1.2-org.gnome.ScreenSaver
 8106003 /sys/fs/cgroup/user.slice/user-1000.slice/user/app.slice/dbus-:1.2-org.gnome.OnlineAccounts
 8106100 /sys/fs/cgroup/user.slice/user-1000.slice/user/app.slice/dbus-:1.2-org.gnome.Identity
 8106155 /sys/fs/cgroup/user.slice/user-1000.slice/user/app.slice/evolution-calendar-factory.service
 8106396 /sys/fs/cgroup/user.slice/user-1000.slice/user/app.slice/evolution-addressbook-factory.service
 8106655 /sys/fs/cgroup/user.slice/user-1000.slice/user/app.slice/dconf.service
 8106854 /sys/fs/cgroup/user.slice/user-1000.slice/user/app.slice/xdg-desktop-portal-gnome.service
 8106909 /sys/fs/cgroup/user.slice/user-1000.slice/user/app.slice/xdg-desktop-portal-gtk.service
 8126880 /sys/fs/cgroup/user.slice/user-1000.slice/user/app.slice/dbus-:1.2-org.gnome.Calendar
 8127265 /sys/fs/cgroup/user.slice/user-1000.slice/user/app.slice/obex.service
 8127435 /sys/fs/cgroup/user.slice/user-1000.slice/user/app.slice/tracker-miner-fs-3.service
 8127885 /sys/fs/cgroup/user.slice/user-1000.slice/user/app.slice/app-cgroupify.slice
21813140 /sys/fs/cgroup/user.slice/user-1000.slice/user/app.slice/app-org.gnome.Terminal.slice

Comment 3 Benjamin Berg 2024-01-16 16:48:28 UTC
lol, I am surprised no one noticed this earlier … can you try this patch (seems to work here)?

diff --git a/src/r-app-monitor.c b/src/r-app-monitor.c
index 71f7244..4a39fb2 100644
--- a/src/r-app-monitor.c
+++ b/src/r-app-monitor.c
@@ -64,12 +64,17 @@ r_app_monitor_finalize (GObject *object)
   g_clear_pointer (&self->wd_to_path_map, g_hash_table_destroy);
   g_clear_pointer (&self->app_info_map, g_hash_table_destroy);
 
+  if (self->inotify_fd >= 0)
+    close (self->inotify_fd);
+
   G_OBJECT_CLASS (r_app_monitor_parent_class)->finalize (object);
 }
 
 gboolean
 inotify_add_cgroup_dir (RAppMonitor *self, gchar *path)
 {
+  gpointer old_path;
+  gpointer old_wd;
   gint wd;
 
   wd = inotify_add_watch (self->inotify_fd, path,
@@ -77,6 +82,14 @@ inotify_add_cgroup_dir (RAppMonitor *self, gchar *path)
   if (wd == -1)
     return FALSE;
 
+  if (g_hash_table_steal_extended(self->path_to_wd_map, path, &old_path, &old_wd))
+    {
+      g_free (old_path);
+      g_hash_table_remove (self->wd_to_path_map, old_wd);
+
+      inotify_rm_watch (self->inotify_fd, GPOINTER_TO_INT (old_wd));
+    }
+
   g_hash_table_replace (self->path_to_wd_map, g_strdup (path),
                         GINT_TO_POINTER (wd));
   g_hash_table_replace (self->wd_to_path_map, GINT_TO_POINTER (wd),
@@ -362,6 +375,8 @@ handle_inotify_event (RAppMonitor *self, struct inotify_event *i)
       g_hash_table_remove (self->path_to_wd_map, app_path);
       g_hash_table_remove (self->wd_to_path_map, wd_temp);
       g_hash_table_remove (self->app_info_map, app_path);
+
+      inotify_rm_watch (self->inotify_fd, GPOINTER_TO_INT (wd_temp));
     }
 }

Comment 4 Raman Gupta 2024-01-16 17:10:46 UTC
(In reply to Benjamin Berg from comment #3)
> lol, I am surprised no one noticed this earlier … can you try this patch
> (seems to work here)?

I installed the patch and can confirm that the watch counts now make sense: they go up *and* down and remain stable overall.

I can also confirm that the script from Tom Hughes (https://bugzilla.redhat.com/show_bug.cgi?id=2258599#c1) only outputs valid watches.
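For anyone who wants to double-check the behaviour over a longer period, here is a small sketch along the same lines (the 30-second interval is arbitrary): with the patch applied, the sampled count should fluctuate instead of growing monotonically.

```
#!/bin/sh
# Print a timestamped sample of uresourced's inotify watch count every 30 seconds.
pid=$(pgrep -u "$(id -u)" uresourced)
while sleep 30
do
  count=$(cat /proc/${pid}/fdinfo/* 2>/dev/null | grep -c '^inotify')
  echo "$(date '+%H:%M:%S')  ${count} watches"
done
```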

Comment 5 jyx21 2024-01-17 01:50:25 UTC
I'm not an expert, but is this "leak" tied to specific use cases, e.g. podman rootless containers? I'm surprised how little information turns up when searching for it.

Comment 6 Daniel Miranda 2024-02-06 17:37:20 UTC
I am observing this issue, but I am not using podman, rootless or otherwise.

Any chance we can get the patch applied and pushed to Fedora?

Comment 7 Aoife Moloney 2024-11-13 10:28:32 UTC
This message is a reminder that Fedora Linux 39 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora Linux 39 on 2024-11-26.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
'version' of '39'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, change the 'version' 
to a later Fedora Linux version. Note that the version field may be hidden.
Click the "Show advanced fields" button if you do not see it.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora Linux 39 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora Linux, you are encouraged to change the 'version' to a later version
prior to this bug being closed.

Comment 8 Raman Gupta 2024-11-13 13:08:52 UTC
Looks like this was fixed in 0.5.4 and it does not appear to be an issue any more in Fedora 41.

* https://gitlab.freedesktop.org/benzea/uresourced/-/compare/v0.5.3...v0.5.4
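For anyone landing here later, a quick way to check whether the installed build already contains the clean-up (assuming an rpm-based install):

```
# 0.5.4 or later includes the watch clean-up from the release diff above.
rpm -q uresourced
```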

