Somewhere between Fedora-Rawhide-20180303.n.0 and Fedora-Rawhide-20180308.n.2 , and between Fedora-28-20180302.n.0 and Fedora-28-20180310.n.0 , the openQA database server role deployment test started failing. The 'rolectl deploy' command fails, showing an error message "Error: Message recipient disconnected from message bus without replying". However, system logs just don't show a lot in the way of errors; there are none in the rolekit log, and nothing obvious in the system log, it just shows postgres deployment proceeding apparently successfully. I'm honestly not sure what's going on, but will attach the log tarball for debugging. Proposing as a Beta blocker, per Basic criterion "Release-blocking roles and the supported role configuration interfaces must meet the core functional Role Definition Requirements to the extent that supported roles can be successfully deployed, started, stopped, brought to a working configuration, and queried" - database server is one of the release-blocking roles, and it apparently can't be deployed properly...
Created attachment 1407408 [details] tarball of all of /var/log from a failure
So, I ran the rolekit service with `/usr/sbin/roled --nofork --nopid --debug 10 --persistent` (after ensuring that the service was not already running due to being autostarted) and then attempted to deploy a databaseserver with `rolectl deploy databaseserver --name="psql"` The full log is displayed below, but the short version is that some of the packages it needs to install are not properly signed (or the key is not properly imported) on an F28 system. CCing Kevin Fenzi and Patrick Uiterwijk to investigate. 2018-03-13 12:59:48 DEBUG1: role.databaseserver.deploy('psql', {}) 2018-03-13 12:59:48 DEBUG1: Created missing '/etc/rolekit/roles/databaseserver'. 2018-03-13 12:59:48 DEBUG1: role.databaseserver.psql.PropertiesChanged('org.fedoraproject.rolekit1.role.instance', '{'state': 'nascent'}', '[]') 2018-03-13 12:59:48 DEBUG1: role.databaseserver.InstanceAdded('psql') 2018-03-13 12:59:48 DEBUG1: role.databaseserver.psql.deploy({}) 2018-03-13 12:59:48 DEBUG1: role.databaseserver.psql.PropertiesChanged('org.fedoraproject.rolekit1.role.instance', '{'lasterror': ''}', '[]') 2018-03-13 12:59:48 DEBUG1: role.databaseserver.psql.PropertiesChanged('org.fedoraproject.rolekit1.role.instance', '{'state': 'deploying'}', '[]') 2018-03-13 12:59:48 DEBUG1: role.databaseserver.psql.StateChanged('deploying') 2018-03-13 12:59:48 DEBUG9: TRACE: Installing packages 2018-03-13 12:59:48 DEBUG1: role.databaseserver.psql.installPackages() 2018-03-13 12:59:48 DEBUG9: subprocess: ['dnf', '-y', 'install', 'rolekit', 'postgresql-server', 'postgresql-contrib', 'python3-psycopg2'] 2018-03-13 12:59:48 DEBUG1: b'Last metadata expiration check: 2:49:58 ago on Tue 13 Mar 2018 10:09:49 AM CET.' 2018-03-13 12:59:50 ERROR: b'Package rolekit-0.5.2-1.fc27.noarch is already installed, skipping.' 2018-03-13 12:59:50 DEBUG1: b'Dependencies resolved.' 2018-03-13 12:59:50 DEBUG1: b'================================================================================' 2018-03-13 12:59:50 DEBUG1: b' Package Arch Version Repository Size' 2018-03-13 12:59:50 DEBUG1: b'================================================================================' 2018-03-13 12:59:50 DEBUG1: b'Installing:' 2018-03-13 12:59:50 DEBUG1: b' postgresql-contrib x86_64 10.3-1.fc29 rawhide 804 k' 2018-03-13 12:59:50 DEBUG1: b' postgresql-server x86_64 10.3-1.fc29 rawhide 4.9 M' 2018-03-13 12:59:50 DEBUG1: b' python3-psycopg2 x86_64 2.7.4-1.fc28 rawhide 170 k' 2018-03-13 12:59:50 DEBUG1: b'Upgrading:' 2018-03-13 12:59:50 DEBUG1: b' glibc x86_64 2.27.9000-8.fc29 rawhide 3.7 M' 2018-03-13 12:59:50 DEBUG1: b' glibc-all-langpacks x86_64 2.27.9000-8.fc29 rawhide 24 M' 2018-03-13 12:59:50 DEBUG1: b' glibc-common x86_64 2.27.9000-8.fc29 rawhide 783 k' 2018-03-13 12:59:50 DEBUG1: b' libnsl x86_64 2.27.9000-8.fc29 rawhide 73 k' 2018-03-13 12:59:50 DEBUG1: b'Installing dependencies:' 2018-03-13 12:59:50 DEBUG1: b' libxcrypt x86_64 4.0.0-5.fc28 rawhide 77 k' 2018-03-13 12:59:50 DEBUG1: b' replacing libcrypt.x86_64 2.26.9000-41.fc28' 2018-03-13 12:59:50 DEBUG1: b' libxslt x86_64 1.1.32-2.fc28 rawhide 248 k' 2018-03-13 12:59:50 DEBUG1: b' perl-Carp noarch 1.42-395.fc28 rawhide 29 k' 2018-03-13 12:59:50 DEBUG1: b' perl-Exporter noarch 5.72-396.fc28 rawhide 33 k' 2018-03-13 12:59:50 DEBUG1: b' perl-libs x86_64 4:5.26.1-409.fc29 rawhide 1.6 M' 2018-03-13 12:59:50 DEBUG1: b' postgresql x86_64 10.3-1.fc29 rawhide 1.5 M' 2018-03-13 12:59:50 DEBUG1: b' postgresql-libs x86_64 10.3-1.fc29 rawhide 300 k' 2018-03-13 12:59:50 DEBUG1: b' uuid x86_64 1.6.2-40.fc28 rawhide 62 k' 2018-03-13 12:59:50 DEBUG1: b'' 2018-03-13 12:59:50 DEBUG1: b'Transaction Summary' 2018-03-13 12:59:50 DEBUG1: b'================================================================================' 2018-03-13 12:59:50 DEBUG1: b'Install 11 Packages' 2018-03-13 12:59:50 DEBUG1: b'Upgrade 4 Packages' 2018-03-13 12:59:50 DEBUG1: b'' 2018-03-13 12:59:50 DEBUG1: b'Total download size: 38 M' 2018-03-13 12:59:50 DEBUG1: b'Downloading Packages:' 2018-03-13 12:59:51 DEBUG1: b'(1/15): libxcrypt-4.0.0-5.fc28.x86_64.rpm 328 kB/s | 77 kB 00:00 ' 2018-03-13 12:59:51 DEBUG1: b'(2/15): libxslt-1.1.32-2.fc28.x86_64.rpm 2.7 MB/s | 248 kB 00:00 ' 2018-03-13 12:59:51 DEBUG1: b'(3/15): python3-psycopg2-2.7.4-1.fc28.x86_64.rp 372 kB/s | 170 kB 00:00 ' 2018-03-13 12:59:51 DEBUG1: b'(4/15): postgresql-contrib-10.3-1.fc29.x86_64.r 1.4 MB/s | 804 kB 00:00 ' 2018-03-13 12:59:51 DEBUG1: b'(5/15): perl-libs-5.26.1-409.fc29.x86_64.rpm 5.3 MB/s | 1.6 MB 00:00 ' 2018-03-13 12:59:51 DEBUG1: b'(6/15): postgresql-libs-10.3-1.fc29.x86_64.rpm 1.3 MB/s | 300 kB 00:00 ' 2018-03-13 12:59:51 DEBUG1: b'(7/15): perl-Carp-1.42-395.fc28.noarch.rpm 605 kB/s | 29 kB 00:00 ' 2018-03-13 12:59:51 DEBUG1: b'(8/15): uuid-1.6.2-40.fc28.x86_64.rpm 243 kB/s | 62 kB 00:00 ' 2018-03-13 12:59:51 DEBUG1: b'(9/15): postgresql-10.3-1.fc29.x86_64.rpm 3.1 MB/s | 1.5 MB 00:00 ' 2018-03-13 12:59:51 DEBUG1: b'(10/15): perl-Exporter-5.72-396.fc28.noarch.rpm 303 kB/s | 33 kB 00:00 ' 2018-03-13 12:59:52 DEBUG1: b'(11/15): glibc-common-2.27.9000-8.fc29.x86_64.r 3.9 MB/s | 783 kB 00:00 ' 2018-03-13 12:59:52 DEBUG1: b'(12/15): glibc-2.27.9000-8.fc29.x86_64.rpm 13 MB/s | 3.7 MB 00:00 ' 2018-03-13 12:59:52 DEBUG1: b'(13/15): postgresql-server-10.3-1.fc29.x86_64.r 12 MB/s | 4.9 MB 00:00 ' 2018-03-13 12:59:52 DEBUG1: b'(14/15): libnsl-2.27.9000-8.fc29.x86_64.rpm 242 kB/s | 73 kB 00:00 ' 2018-03-13 12:59:53 DEBUG1: b'(15/15): glibc-all-langpacks-2.27.9000-8.fc29.x 21 MB/s | 24 MB 00:01 ' 2018-03-13 12:59:53 DEBUG1: b'--------------------------------------------------------------------------------' 2018-03-13 12:59:53 DEBUG1: b'Total 14 MB/s | 38 MB 00:02 ' 2018-03-13 12:59:53 ERROR: b'warning: /var/cache/dnf/rawhide-2d95c80a1fa0a67d/packages/postgresql-contrib-10.3-1.fc29.x86_64.rpm: Header V3 RSA/SHA256 Signature, key ID 429476b4: NOKEY' 2018-03-13 12:59:53 ERROR: b'Importing GPG key 0x9DB62FB1:' 2018-03-13 12:59:53 ERROR: b' Userid : "Fedora 28 (28) <fedora-28>"' 2018-03-13 12:59:53 ERROR: b' Fingerprint: 128C F232 A937 1991 C8A6 5695 E08E 7E62 9DB6 2FB1' 2018-03-13 12:59:53 ERROR: b' From : /etc/pki/rpm-gpg/RPM-GPG-KEY-fedora-28-x86_64' 2018-03-13 12:59:53 DEBUG1: b'Key imported successfully' 2018-03-13 12:59:53 DEBUG1: b"Import of key(s) didn't help, wrong key(s)?" 2018-03-13 12:59:53 ERROR: b'Public key for postgresql-contrib-10.3-1.fc29.x86_64.rpm is not installed. Failing package is: postgresql-contrib-10.3-1.fc29.x86_64' 2018-03-13 12:59:53 ERROR: b' GPG Keys are configured as: file:///etc/pki/rpm-gpg/RPM-GPG-KEY-fedora-28-x86_64' 2018-03-13 12:59:53 ERROR: b'Public key for python3-psycopg2-2.7.4-1.fc28.x86_64.rpm is not installed. Failing package is: python3-psycopg2-2.7.4-1.fc28.x86_64' 2018-03-13 12:59:53 ERROR: b' GPG Keys are configured as: file:///etc/pki/rpm-gpg/RPM-GPG-KEY-fedora-28-x86_64' 2018-03-13 12:59:53 DEBUG1: b'The downloaded packages were saved in cache until the next successful transaction.' 2018-03-13 12:59:53 DEBUG1: b"You can remove cached packages by executing 'dnf clean packages'." 2018-03-13 12:59:53 ERROR: b'Public key for libxcrypt-4.0.0-5.fc28.x86_64.rpm is not installed. Failing package is: libxcrypt-4.0.0-5.fc28.x86_64' 2018-03-13 12:59:53 ERROR: b' GPG Keys are configured as: file:///etc/pki/rpm-gpg/RPM-GPG-KEY-fedora-28-x86_64' 2018-03-13 12:59:53 ERROR: b'Public key for libxslt-1.1.32-2.fc28.x86_64.rpm is not installed. Failing package is: libxslt-1.1.32-2.fc28.x86_64' 2018-03-13 12:59:53 ERROR: b' GPG Keys are configured as: file:///etc/pki/rpm-gpg/RPM-GPG-KEY-fedora-28-x86_64' 2018-03-13 12:59:53 ERROR: b'Public key for perl-libs-5.26.1-409.fc29.x86_64.rpm is not installed. Failing package is: perl-libs-4:5.26.1-409.fc29.x86_64' 2018-03-13 12:59:53 ERROR: b' GPG Keys are configured as: file:///etc/pki/rpm-gpg/RPM-GPG-KEY-fedora-28-x86_64' 2018-03-13 12:59:53 ERROR: b'Public key for postgresql-10.3-1.fc29.x86_64.rpm is not installed. Failing package is: postgresql-10.3-1.fc29.x86_64' 2018-03-13 12:59:53 ERROR: b' GPG Keys are configured as: file:///etc/pki/rpm-gpg/RPM-GPG-KEY-fedora-28-x86_64' 2018-03-13 12:59:53 ERROR: b'Public key for postgresql-libs-10.3-1.fc29.x86_64.rpm is not installed. Failing package is: postgresql-libs-10.3-1.fc29.x86_64' 2018-03-13 12:59:53 ERROR: b' GPG Keys are configured as: file:///etc/pki/rpm-gpg/RPM-GPG-KEY-fedora-28-x86_64' 2018-03-13 12:59:53 ERROR: b'Public key for uuid-1.6.2-40.fc28.x86_64.rpm is not installed. Failing package is: uuid-1.6.2-40.fc28.x86_64' 2018-03-13 12:59:53 ERROR: b' GPG Keys are configured as: file:///etc/pki/rpm-gpg/RPM-GPG-KEY-fedora-28-x86_64' 2018-03-13 12:59:53 ERROR: b'Public key for perl-Carp-1.42-395.fc28.noarch.rpm is not installed. Failing package is: perl-Carp-1.42-395.fc28.noarch' 2018-03-13 12:59:53 ERROR: b' GPG Keys are configured as: file:///etc/pki/rpm-gpg/RPM-GPG-KEY-fedora-28-x86_64' 2018-03-13 12:59:53 ERROR: b'Public key for perl-Exporter-5.72-396.fc28.noarch.rpm is not installed. Failing package is: perl-Exporter-5.72-396.fc28.noarch' 2018-03-13 12:59:53 ERROR: b' GPG Keys are configured as: file:///etc/pki/rpm-gpg/RPM-GPG-KEY-fedora-28-x86_64' 2018-03-13 12:59:53 ERROR: b'Public key for postgresql-server-10.3-1.fc29.x86_64.rpm is not installed. Failing package is: postgresql-server-10.3-1.fc29.x86_64' 2018-03-13 12:59:53 ERROR: b' GPG Keys are configured as: file:///etc/pki/rpm-gpg/RPM-GPG-KEY-fedora-28-x86_64' 2018-03-13 12:59:53 ERROR: b'Public key for glibc-2.27.9000-8.fc29.x86_64.rpm is not installed. Failing package is: glibc-2.27.9000-8.fc29.x86_64' 2018-03-13 12:59:53 ERROR: b' GPG Keys are configured as: file:///etc/pki/rpm-gpg/RPM-GPG-KEY-fedora-28-x86_64' 2018-03-13 12:59:53 ERROR: b'Public key for glibc-common-2.27.9000-8.fc29.x86_64.rpm is not installed. Failing package is: glibc-common-2.27.9000-8.fc29.x86_64' 2018-03-13 12:59:53 ERROR: b' GPG Keys are configured as: file:///etc/pki/rpm-gpg/RPM-GPG-KEY-fedora-28-x86_64' 2018-03-13 12:59:53 ERROR: b'Public key for glibc-all-langpacks-2.27.9000-8.fc29.x86_64.rpm is not installed. Failing package is: glibc-all-langpacks-2.27.9000-8.fc29.x86_64' 2018-03-13 12:59:53 ERROR: b' GPG Keys are configured as: file:///etc/pki/rpm-gpg/RPM-GPG-KEY-fedora-28-x86_64' 2018-03-13 12:59:53 ERROR: b'Public key for libnsl-2.27.9000-8.fc29.x86_64.rpm is not installed. Failing package is: libnsl-2.27.9000-8.fc29.x86_64' 2018-03-13 12:59:53 ERROR: b' GPG Keys are configured as: file:///etc/pki/rpm-gpg/RPM-GPG-KEY-fedora-28-x86_64' 2018-03-13 12:59:53 ERROR: b'Error: GPG check FAILED' 2018-03-13 12:59:53 DEBUG1: role.databaseserver.psql.PropertiesChanged('org.fedoraproject.rolekit1.role.instance', '{'lasterror': 'COMMAND_FAILED: 256'}', '[]') 2018-03-13 12:59:53 DEBUG1: role.databaseserver.psql.PropertiesChanged('org.fedoraproject.rolekit1.role.instance', '{'state': 'error'}', '[]') 2018-03-13 12:59:53 DEBUG1: role.databaseserver.psql.StateChanged('error') 2018-03-13 12:59:53 ERROR: RolekitError: COMMAND_FAILED: 256
2018-03-13 12:59:53 ERROR: b' GPG Keys are configured as: file:///etc/pki/rpm-gpg/RPM-GPG-KEY-fedora-28-x86_64' on a rawhide system right now means that it has an incorrect fedora-repos. Can you give me the RPM version of fedora-repos, and which exact image you booted please?
fedora-repos-28-0.3.noarch on a system I've had running since pre-branch. Yesterday I did a distro-sync, so it should have pulled everything from stable. I should probably try this on a fresh VM. I'll report back shortly.
OK, on a fresh install from last night's compose, the error is notably different. Looks like it might be related to the recent systemd update. ``` <...> 2018-03-13 08:38:27 DEBUG9: subprocess: ['/usr/lib/rolekit/roles/databaseserver/tools/rk_db_setpwd.py', '--database', 'psql', '--user', 'db_owner'] 2018-03-13 08:38:27 DEBUG1: b"False2018-03-13 08:38:27 Connected to local database 'psql'" 2018-03-13 08:38:27 DEBUG1: b'2018-03-13 08:38:27 Executing: ALTER ROLE db_owner WITH ENCRYPTED PASSWORD %(pwd)s;' 2018-03-13 08:38:27 DEBUG2: TRACE: Opening access to external addresses 2018-03-13 08:38:27 DEBUG2: TRACE: Restarting postgresql.service unit 2018-03-13 08:38:27 DEBUG1: systemd JobRemoved signal: (dbus.UInt32(758), dbus.ObjectPath('/org/freedesktop/systemd1/job/758'), dbus.String('postgresql.service'), dbus.String('done')) 2018-03-13 08:38:27 DEBUG2: TRACE: Database server deployed 2018-03-13 08:38:27 DEBUG9: TRACE: role-specific deployment complete, applying values 2018-03-13 08:38:27 DEBUG1: role.databaseserver.psql.PropertiesChanged('org.fedoraproject.rolekit1.role.instance', '{'database': 'psql'}', '[]') 2018-03-13 08:38:27 DEBUG1: role.databaseserver.psql.PropertiesChanged('org.fedoraproject.rolekit1.role.instance', '{'owner': 'db_owner'}', '[]') 2018-03-13 08:38:27 DEBUG1: role.databaseserver.psql.PropertiesChanged('org.fedoraproject.rolekit1.role.instance', '{'postgresql_conf': '/var/lib/pgsql/data/postgresql.conf'}', '[]') 2018-03-13 08:38:27 DEBUG1: role.databaseserver.psql.PropertiesChanged('org.fedoraproject.rolekit1.role.instance', '{'pg_hba_conf': '/var/lib/pgsql/data/pg_hba.conf'}', '[]') 2018-03-13 08:38:27 DEBUG1: role.databaseserver.psql.PropertiesChanged('org.fedoraproject.rolekit1.role.instance', '{'password': 'QmhJ8pgBTBeSkznK'}', '[]') 2018-03-13 08:38:27 DEBUG1: role.databaseserver.psql.PropertiesChanged('org.fedoraproject.rolekit1.role.instance', '{'database': 'psql', 'owner': 'db_owner', 'postgresql_conf': '/var/lib/pgsql/data/postgresql.conf', 'pg_hba_conf': '/var/lib/pgsql/data/pg_hba.conf', 'password': 'QmhJ8pgBTBeSkznK'}', '[]') 2018-03-13 08:38:27 DEBUG9: TRACE: Creating systemd target files 2018-03-13 08:38:27 DEBUG9: Creating target unit role-databaseserver-psql.target 2018-03-13 08:38:27 DEBUG9: Reloading systemd units 2018-03-13 08:38:27 DEBUG1: role.databaseserver.psql.PropertiesChanged('org.fedoraproject.rolekit1.role.instance', '{'lasterror': 'org.freedesktop.DBus.Error.NoReply: Message recipient disconnected from message bus without replying'}', '[]') 2018-03-13 08:38:27 DEBUG1: role.databaseserver.psql.PropertiesChanged('org.fedoraproject.rolekit1.role.instance', '{'state': 'error'}', '[]') 2018-03-13 08:38:27 DEBUG1: role.databaseserver.psql.StateChanged('error') ``` So, it looks like everything in the deployment succeeds except the final step; the one where we create a new systemd target file for the role and ask systemd to update its knowledge of installed units. Instead of replying with success on the D-BUS API equivalent of `systemctl daemon-reload`, systemd disconnects from the bus. CCing Zbigniew
Created attachment 1407681 [details] Python 3 reproducer OK, so the core of the problem is that sometime between systemd-234 (F27) and systemd-238 (F28 Beta), the D-BUS method org.freedesktop.systemd1.Manager.Reload() (which is the API equivalent of `systemctl daemon-reload`) became broken. It doesn't return a success message and instead seems to disconnect from D-BUS. This error then percolates up to rolekit which assumes the reload was unsuccessful and therefore it is unsafe to continue. I'm attaching a simple python3 reproducer which works on F27 and fails on F28. (Be aware that you obviously must run this under the context of a user who has privilege to call this API; root is easiest).
Even simpler reproducer: ``` gdbus call --system \ --dest org.freedesktop.systemd1 \ --object-path /org/freedesktop/systemd1 \ --method "org.freedesktop.systemd1.Manager.Reload" ``` Result: `Error: GDBus.Error:org.freedesktop.DBus.Error.NoReply: Message recipient disconnected from message bus without replying`
There's a much narrower delta, as I said - "between Fedora-Rawhide-20180303.n.0 and Fedora-Rawhide-20180308.n.2". So really, this clearly broke between 237 and 238, as 238 landed during that time.
With recent changes to make FreeIPA work better, this is now affecting FreeIPA role deployment. See this test of the FreeIPA update: https://openqa.fedoraproject.org/tests/205873 same failure.
Per sgallagh, zbigniew has been looking into this, he just hadn't mentioned it here...we're hoping he'll have a fix soon. We need one, as this is really blocking Server; we can't tell if there are bugs later in the postgres and freeipa tests with this blocking them.
https://github.com/systemd/systemd/pull/8473 I'm currently building systemd-238-4.fc28 and systemd-238-4.fc29 with that patch. I guess this could be pushed to F28 if a freeze exception is granted.
It's a clear Beta blocker - if you could submit an update and mark it as resolving this bug, that'd help a lot. Thanks!
+1 blocker, this seems like a clear blocker for "Database server role requirements".
+1 to being a Beta Blocker
That's +3, setting accepted.
systemd-238-4.fc28 has been submitted as an update to Fedora 28. https://bodhi.fedoraproject.org/updates/FEDORA-2018-066e9e600b
systemd-238-4.fc28 has been pushed to the Fedora 28 stable repository. If problems still persist, please make note of it in this bug report.