Bug 2210237 - systemd-239-31 prevents some units from stopping
Summary: systemd-239-31 prevents some units from stopping
Keywords:
Status: ASSIGNED
Alias: None
Product: Red Hat Enterprise Linux 9
Classification: Red Hat
Component: systemd
Version: 9.2
Hardware: All
OS: Linux
unspecified
medium
Target Milestone: rc
: ---
Assignee: Michal Sekletar
QA Contact: Frantisek Sumsal
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2023-05-26 08:51 UTC by Michal Sekletar
Modified: 2023-08-07 10:29 UTC (History)
2 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Type: Bug
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Github redhat-plumbers systemd-rhel9 pull 171 0 None open (#2210237) core/service: when resetting PID also reset known flag 2023-07-10 14:33:40 UTC
Github redhat-plumbers systemd-rhel9 pull 186 0 None open Revert "core/service: when resetting PID also reset known flag" 2023-08-01 07:42:22 UTC
Red Hat Issue Tracker RHELPLAN-158274 0 None None None 2023-05-26 08:55:34 UTC

Description Michal Sekletar 2023-05-26 08:51:26 UTC
This bug was initially created as a copy of Bug #1875056

I am copying this bug because: 



Description of problem:
postgresql unit fails to stop when running in container based on ubi8-init with systemd-239-31.

Version-Release number of selected component (if applicable):
RHEL 8.2
systemd-239-31
postgresql 12.1

After installing, enabling and starting of the postgresql

sh-4.4# systemctl status postgresql
● postgresql.service - PostgreSQL database server
   Loaded: loaded (/usr/lib/systemd/system/postgresql.service; enabled; vendor preset: disabled)
   Active: active (running) since Tue 2020-09-01 11:43:08 UTC; 39s ago
  Process: 41 ExecStartPre=/usr/libexec/postgresql-check-db-dir postgresql (code=exited, status=0/SUCCESS)
 Main PID: 62 (postmaster)
    Tasks: 8 (limit: 101454)
   Memory: 25.1M
   CGroup: /docker/c816f955e1014d6b6bce7eecf51b5267b3bab5679b256d4dbdd45fb203e12b56/system.slice/postgresql.service
           ├─62 /usr/bin/postmaster -D /var/lib/pgsql/data
           ├─63 postgres: logger   
           ├─65 postgres: checkpointer   
           ├─66 postgres: background writer   
           ├─67 postgres: walwriter   
           ├─68 postgres: autovacuum launcher   
           ├─69 postgres: stats collector   
           └─70 postgres: logical replication launcher   

Sep 01 11:43:07 rhv-flow-node.lab.eng.tlv2.redhat.com systemd[1]: Starting PostgreSQL database server...
Sep 01 11:43:08 rhv-flow-node.lab.eng.tlv2.redhat.com postmaster[62]: 2020-09-01 11:43:08.442 UTC LOG:  starting PostgreSQL 12.1 on x86_64-redhat-linux-gnu, compiled by gcc (GCC) 8.3.1 20190507 (Red Hat 8.3.1-4), 64-bit
Sep 01 11:43:08 rhv-flow-node.lab.eng.tlv2.redhat.com postmaster[62]: 2020-09-01 11:43:08.443 UTC LOG:  listening on IPv4 address "0.0.0.0", port 5432
Sep 01 11:43:08 rhv-flow-node.lab.eng.tlv2.redhat.com postmaster[62]: 2020-09-01 11:43:08.443 UTC LOG:  listening on IPv6 address "::", port 5432
Sep 01 11:43:08 rhv-flow-node.lab.eng.tlv2.redhat.com postmaster[62]: 2020-09-01 11:43:08.449 UTC LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
Sep 01 11:43:08 rhv-flow-node.lab.eng.tlv2.redhat.com postmaster[62]: 2020-09-01 11:43:08.459 UTC LOG:  listening on Unix socket "/tmp/.s.PGSQL.5432"
Sep 01 11:43:08 rhv-flow-node.lab.eng.tlv2.redhat.com postmaster[62]: 2020-09-01 11:43:08.521 UTC LOG:  redirecting log output to logging collector process
Sep 01 11:43:08 rhv-flow-node.lab.eng.tlv2.redhat.com postmaster[62]: 2020-09-01 11:43:08.521 UTC HINT:  Future log output will appear in directory "log".
Sep 01 11:43:08 rhv-flow-node.lab.eng.tlv2.redhat.com systemd[1]: Started PostgreSQL database server.

all the processes are in place:

sh-4.4# ps aux
USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root           1  0.5  0.0  94504  6968 ?        Ss   11:43   0:00 /sbin/init
root          31  0.8  0.0  93880  7828 ?        Ss   11:43   0:00 /usr/lib/systemd/systemd-journald
root          43  0.5  0.0  12172  3804 ?        Ss   11:43   0:00 /usr/sbin/haveged -w 1024 -v 1 --Foreground
root          44  0.0  0.0 106000  1636 ?        Ssl  11:43   0:00 /usr/sbin/gssproxy -D
root          45  1.0  0.0 225768  3900 ?        Ssl  11:43   0:00 /usr/sbin/rsyslogd -n
dbus          51  0.0  0.0  53952  2504 ?        Ss   11:43   0:00 /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation --syslog-only
chrony        53  0.0  0.0  97100  1504 ?        S    11:43   0:00 /usr/sbin/chronyd
postgres      62  0.2  0.0 295468  7568 ?        Ss   11:43   0:00 /usr/bin/postmaster -D /var/lib/pgsql/data
postgres      63  0.0  0.0 143828  1844 ?        Ss   11:43   0:00 postgres: logger   
postgres      65  0.0  0.0 295468  1880 ?        Ss   11:43   0:00 postgres: checkpointer   
postgres      66  0.0  0.0 295468  2092 ?        Ss   11:43   0:00 postgres: background writer   
postgres      67  0.0  0.0 295468  2084 ?        Ss   11:43   0:00 postgres: walwriter   
postgres      68  0.0  0.0 296016  2864 ?        Ss   11:43   0:00 postgres: autovacuum launcher   
postgres      69  0.0  0.0 145948  2068 ?        Ss   11:43   0:00 postgres: stats collector   
postgres      70  0.0  0.0 296036  2600 ?        Ss   11:43   0:00 postgres: logical replication launcher   
root          72  0.0  0.0  12008  1980 pts/1    Ss   11:43   0:00 sh
root          81  0.0  0.0  44584  1780 pts/1    R+   11:44   0:00 ps aux

when trying to stop the postgresql

sh-4.4# systemctl stop postgresql

the systemctl gets stuck. When connecting from other shell the postgresql is active (not stopped) although the main PID exited succesfully:

sh-4.4# systemctl status postgresql
● postgresql.service - PostgreSQL database server
   Loaded: loaded (/usr/lib/systemd/system/postgresql.service; enabled; vendor preset: disabled)
   Active: deactivating (final-sigkill) (thawing) since Tue 2020-09-01 11:46:07 UTC; 5min ago
  Process: 62 ExecStart=/usr/bin/postmaster -D ${PGDATA} (code=exited, status=0/SUCCESS)
  Process: 41 ExecStartPre=/usr/libexec/postgresql-check-db-dir postgresql (code=exited, status=0/SUCCESS)
 Main PID: 62 (code=exited, status=0/SUCCESS)
    Tasks: 0 (limit: 101454)
   Memory: 22.5M
   CGroup: /docker/c816f955e1014d6b6bce7eecf51b5267b3bab5679b256d4dbdd45fb203e12b56/system.slice/postgresql.service

Sep 01 11:43:08 rhv-flow-node.lab.eng.tlv2.redhat.com postmaster[62]: 2020-09-01 11:43:08.442 UTC LOG:  starting PostgreSQL 12.1 on x86_64-redhat-linux-gnu, compiled by gcc (GC>
Sep 01 11:43:08 rhv-flow-node.lab.eng.tlv2.redhat.com postmaster[62]: 2020-09-01 11:43:08.443 UTC LOG:  listening on IPv4 address "0.0.0.0", port 5432
Sep 01 11:43:08 rhv-flow-node.lab.eng.tlv2.redhat.com postmaster[62]: 2020-09-01 11:43:08.443 UTC LOG:  listening on IPv6 address "::", port 5432
Sep 01 11:43:08 rhv-flow-node.lab.eng.tlv2.redhat.com postmaster[62]: 2020-09-01 11:43:08.449 UTC LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
Sep 01 11:43:08 rhv-flow-node.lab.eng.tlv2.redhat.com postmaster[62]: 2020-09-01 11:43:08.459 UTC LOG:  listening on Unix socket "/tmp/.s.PGSQL.5432"
Sep 01 11:43:08 rhv-flow-node.lab.eng.tlv2.redhat.com postmaster[62]: 2020-09-01 11:43:08.521 UTC LOG:  redirecting log output to logging collector process
Sep 01 11:43:08 rhv-flow-node.lab.eng.tlv2.redhat.com postmaster[62]: 2020-09-01 11:43:08.521 UTC HINT:  Future log output will appear in directory "log".
Sep 01 11:43:08 rhv-flow-node.lab.eng.tlv2.redhat.com systemd[1]: Started PostgreSQL database server.
Sep 01 11:46:07 rhv-flow-node.lab.eng.tlv2.redhat.com systemd[1]: Stopping PostgreSQL database server...
Sep 01 11:46:07 rhv-flow-node.lab.eng.tlv2.redhat.com systemd[1]: postgresql.service: Killing process 63 (postmaster) with signal SIGKILL.

All the corresponding postgres processes got terminated as expected:

sh-4.4# ps aux
USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root           1  0.1  0.0  94504  6968 ?        Ss   11:43   0:00 /sbin/init
root          31  0.2  0.0  93880  7828 ?        Ss   11:43   0:00 /usr/lib/systemd/systemd-journald
root          43  0.2  0.0  12172  3804 ?        Ss   11:43   0:00 /usr/sbin/haveged -w 1024 -v 1 --Foreground
root          44  0.0  0.0 106000  1636 ?        Ssl  11:43   0:00 /usr/sbin/gssproxy -D
root          45  0.3  0.0 225768  3900 ?        Ssl  11:43   0:00 /usr/sbin/rsyslogd -n
dbus          51  0.0  0.0  53952  2504 ?        Ss   11:43   0:00 /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation --syslog-only
chrony        53  0.0  0.0  97100  1504 ?        S    11:43   0:00 /usr/sbin/chronyd
root          72  0.0  0.0  12008  1988 pts/1    Ss   11:43   0:00 sh
root          86  0.0  0.0  83632  3860 pts/1    S+   11:46   0:00 systemctl stop postgresql
root          87  0.0  0.0  83452  3340 pts/1    S+   11:46   0:00 /usr/bin/systemd-tty-ask-password-agent --watch
root          88  0.1  0.0  12008  1952 pts/2    Ss   11:46   0:00 sh
root          94  0.0  0.0  44584  1780 pts/2    R+   11:46   0:00 ps aux

Kind of weird is that systemd tries to kill postmaster as process 63 but postmaster was process 62.

This bug is regression - in systemd-239-18 is not present.

The same issue was identified in Centos 8.2 (https://bugs.centos.org/view.php?id=17461)


How reproducible:
100%


Steps to Reproduce:
1. Run container based on ubi8-init
2. Install postgresql-server
3. Start the postgres via systemctl start
4. Try to stop the postgres via systemctl stop

Actual results:
systemctl stop postgresql 
gets stuck and the unit is not stopped

Expected results:
systemctl stop postgresql
ends with success and unit is stopped


Additional info:

Comment 1 Plumber Bot 2023-07-13 19:08:48 UTC
fix merged to github main branch -> https://github.com/redhat-plumbers/systemd-rhel9/pull/171


Note You need to log in before you can comment on or make changes to this bug.