Bug 1568856
Summary: | DBUS Breaks on OS Point-Release Upgrade | |||
---|---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | Thomas Jones <redhat> | |
Component: | dbus | Assignee: | David King <dking> | |
Status: | CLOSED CURRENTRELEASE | QA Contact: | Desktop QE <desktop-qa-list> | |
Severity: | urgent | Docs Contact: | Marie Hornickova <mdolezel> | |
Priority: | urgent | |||
Version: | 7.5 | CC: | amike, cobrown, cww, dking, dkochuka, dwilloug, fweimer, jkoten, joboyer, jomurphy, kwalker, lmiksik, loren, masanari.iida, mclasen, mjahoda, mmckinst, ncrawford, pdwyer, ptalbert, redhat, rmullett, salmy, sdodson, shane.seymour, snavale, tajima1989, takirby, tpelka, vbenes | |
Target Milestone: | rc | Keywords: | Documentation, PrioBumpGSS, Regression, ZStream | |
Target Release: | --- | |||
Hardware: | x86_64 | |||
OS: | Unspecified | |||
Whiteboard: | ||||
Fixed In Version: | dbus-1.10.24-13.el7 | Doc Type: | Bug Fix | |
Doc Text: |
.Running *dbus-daemon* no longer fails to activate a system service
With the rebase of the D-Bus message bus daemon (*dbus-daemon*) to version 1.10.24, locations of several *dbus* tools were migrated. The `dbus-send` executable was moved from the `/bin` directory to the `/usr/bin` directory; the `dbus-daemon-launch-helper` executable was moved from the `libdir` directory to the `libexecdir` directory. Consequently, if a scriptlet in a package called the `dbus-send` command to send a message to D-Bus, and triggered a service activation, the activation could fail. With this update, the bug has been fixed by creating compatibility symlinks between the old and new locations of `dbus-daemon-launch-helper`. As a result, any running instance of *dbus-daemon* can now call the system bus and activate a system service.
|
Story Points: | --- | |
Clone Of: | ||||
: | 1660160 1660162 (view as bug list) | Environment: | ||
Last Closed: | 2019-08-28 08:43:09 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | ||||
Bug Blocks: | 1539427, 1565828, 1566309, 1566312, 1566313, 1630904, 1660160, 1660162 |
Description
Thomas Jones
2018-04-18 10:38:45 UTC
I would say this is basically a clone of bz1550582. Updating, and therefore restarting, dbus from e.g. a graphical session is not expected to work properly (and it is not clear from this bz whether you did the restart from the multi-user or the graphical target). But in this case it seems dbus is not running or not working after the update, correct? What does `yum reinstall dbus\*` do (in the multi-user target)? Can you see something from `systemctl status dbus` or in the journal? Thanks -Tom

We don't do graphical in AWS - AWS doesn't offer (virtual) console access. Due to the lack of out-of-band access (console: graphical, serial or otherwise), all update actions happen at run-level 3 (or whatever the systemd equivalent is). In general, updating this way works. It's really only when there's a (mercifully infrequent) DBUS update that we tend to have these issues. Typically what we see after a generic `yum -y update` is that DBUS becomes deranged and there's no communicating with it. The deranged state persists across reboots. While we haven't explicitly tried doing a `yum reinstall dbus\*` once things have reached this state, my suspicion is that it will fail. As to where we're observing logged symptoms/outputs: we're seeing them in the journals and the legacy log-files. Doing a `timedatectl status` is just a shorthand way of checking "is DBUS pissed off."

This is caused by dbus-daemon being rebased to a new version (dbus-1.10.24-7.el7) in RHEL 7.5, and the locations of several dbus tools being migrated. dbus-send moved from /bin to /usr/bin, and dbus-daemon-launch-helper moved from under libdir to under libexecdir. The location of dbus-daemon-launch-helper is described in a configuration file (/usr/share/dbus-1/system.conf), but until dbus-daemon is updated, any running dbus-daemon instance (for the system bus, specifically) will be unaware of the new location, and will fail when trying to launch a system service.
The running (old) dbus-daemon will fail to read the system.conf configuration file (because the canonical location changed, from under /etc to under /usr), and restarting dbus-daemon will disconnect all currently-connected services, which will not reconnect unless they are restarted after dbus. If a scriptlet in a package called dbus-send and triggered a service activation, the activation would likely fail (because the helper binary would not be found). I could not find any uses of dbus-send in the vim package (nor the vim-common subpackage), but it may be another package that is calling dbus-send, or it may be called as a side effect of the scriptlets in vim-common. A workaround for this problem would be to create a symlink between the old and new locations of dbus-daemon-launch-helper, so that the running dbus-daemon for the system bus can still call out to it. An alternative would be to update only the dbus packages, and then to restart the system immediately, before updating any other packages, although this may not be feasible if the shutdown process triggers any service activations.

Cool. Thanks for the detailed info. I'll try setting up a symlink as part of the upgrade process to see if that helps us. I'm reasonably certain that the vim landmark is simply "last thing yum actually completed" rather than being the stuck process. I'm at a different work-site for the next day or so: depending on after-hours time-constraints, I may not have a yay/neigh update till Thursday.

About to try some of the suggestions you made. Tested our problem across several AWS regions and found that simply doing a `yum update dbus` was sufficient to break a system (and it's reproducible 100% of the time with that). So, it's definitely in that subsystem that we're experiencing problems.

Alright, I'm on an instance launched from a 7.4 AMI.
In looking at the current "/bin/dbus-send", `readlink` is telling me that, even though the RPM's `-ql` output says "/bin/dbus-send", the true location is already "/usr/bin/dbus-send".

I'm currently a little unclear how pre-creating /usr/libexec/dbus-1/dbus-daemon-launch-helper as a symlink is going to help me. Won't updating the RPM blow away that symlink, leaving me in the same place I was before? Or are you saying I should do something more like `mv /lib64/dbus-1/dbus-daemon-launch-helper /usr/libexec/dbus-1/dbus-daemon-launch-helper && ln -s /usr/libexec/dbus-1/dbus-daemon-launch-helper /lib64/dbus-1/dbus-daemon-launch-helper`?

I'd have a similar question for the system.conf file, but the new RPM appears to have both an "/etc/dbus-1/system.conf" file (like the 7.4 packaging) and a "/usr/share/dbus-1/system.conf". That said, the file in /etc is only 833 bytes while the one in /usr/share is 4362 bytes. Since this is a test-rig, I can blow it up, so I'll probably try out any permutations I can think of. However, it might prove helpful if you were able to provide further, detailed instructions. Thanks in advance.

Launched instance from AMI. Placed the following into the instance's UserData:

```
#!/bin/bash
if [[ -d /usr/libexec/dbus-1 ]]
then
   echo "Directory already exists"
else
   printf "Creating new directory"
   install -d -m 000755 /usr/libexec/dbus-1 -o root -g root && echo Success || echo FAILED
   printf "Updating SEL labels... "
   chcon --reference /lib64/dbus-1 /usr/libexec/dbus-1 && echo Success || echo FAILED
fi

printf "Moving dbus-daemon-launch-helper... "
mv /lib64/dbus-1/dbus-daemon-launch-helper /usr/libexec/dbus-1/dbus-daemon-launch-helper && echo Success || echo FAILED
printf "Creating symlink... "
ln -s /usr/libexec/dbus-1/dbus-daemon-launch-helper mv /lib64/dbus-1/dbus-daemon-launch-helper && echo Success || echo FAILED
sleep 10
yum update -y dbus && init 6
```

After rebooting, the system was in the same broken state as it gets to without attempting to fix paths. Any other tests to run or fixes to try?

Please share the AMI ID that you are using to test. I've tried this in a local VM and in AWS but haven't been able to reproduce.

Any AMI returned by: https://console.aws.amazon.com/ec2/v2/home?region=us-east-1#Images:visibility=public-images;search=spel-minimal-rhel-7.4-hvm;sort=desc:creationDate exhibits the issue. The most recent of the above would be ami-05aa42022a79b86e7

We had no projects that were using 7.3. No DBUS issues were reported when going from 7.3 to 7.4. That said, as part of testing for this BugZilla submission, we verified that the issue is triggered when upgrading from 7.3 directly to 7.5. At this point, all but one of the 7.3 AMIs (ami-28b5b23e) have aged off.

The most recent AMI from that query is technically RHEL 7.5 and does not reproduce the issue in question. Use any AMI from March or earlier, such as `ami-0338b428e333e97eb`.

(In reply to Loren Gordon from comment #12)
> The most recent AMI from that query is technically RHEL 7.5 and does not
> reproduce the issue in question. Use any AMI from March or earlier, such as
> `ami-0338b428e333e97eb`.

These are public AMIs so shouldn't need to share them to you. Lemme know if you run into access issues with the AMI ID Loren noted.
(In reply to Thomas Jones from comment #14)
> (In reply to Loren Gordon from comment #12)
> > The most recent AMI from that query is technically RHEL 7.5 and does not
> > reproduce the issue in question. Use any AMI from March or earlier, such as
> > `ami-0338b428e333e97eb`.
>
> These are public AMIs so shouldn't need to share them to you. Lemme know if
> you run into access issues with the AMI ID Loren noted.

Any news? Thanks -Tom

(In reply to Tomas Pelka from comment #15)
> Any news?
>

From us? No. We're currently waiting on you guys to see if the AMI listed by @loren was locatable and usable and if Red Hat had had a chance to use it to reproduce the problem and start diagnostics. Was actually checking the case to see if I needed to Bueller it as I'd not received any case update notifications. If your diagnosticians prefer to do work in different regions, we can provide AMI-IDs equivalent to `ami-0338b428e333e97eb` in us-east-2, us-west-1, us-west-2 and even us-gov-west-1. They're all created by the same processes.

@Thomas: There might be a syntax error in your UserData.

    printf "Creating symlink... "
    ln -s /usr/libexec/dbus-1/dbus-daemon-launch-helper mv /lib64/dbus-1/dbus-daemon-launch-helper && echo Success || echo FAILED

Three arguments to `ln`?

Thank you for that catch. Copy-paystah error. Fixed to:

```
mv /lib64/dbus-1/dbus-daemon-launch-helper \
   /usr/libexec/dbus-1/dbus-daemon-launch-helper && \
ln -s /usr/libexec/dbus-1/dbus-daemon-launch-helper \
   /lib64/dbus-1/dbus-daemon-launch-helper
```

Either way: it does not seem to change the defective behavior encountered when the dbus RPM is updated. Probably worth noting that all of the UserData actions return success. However, after the `yum update` (and associated `init 6`) runs, the /lib64/dbus-1/ directory wholly disappears. When the system (eventually) returns from the `init 6`, DBUS is in its usual, "unhappy" state.

I've run into the same problem and addressed it.
I noticed that dbus.socket had changed from /var/run/dbus/system_bus_socket to /run/dbus/system_bus_socket. In CentOS 7.X, /var/run is originally a symlink pointing to the /run directory, so this change wouldn't be a problem in most cases. But in my case, unintentionally, /var/run wasn't a symbolic link, so many processes failed to handle the socket. After fixing /var/run to be a symlink, the upgrade succeeded.

I made a mistake in writing... s/CentOS/RHEL/

(In reply to Satoshi Tajima from comment #25)
> But in my case, unintentionally, /var/run wasn't a symbolic link,
> so many processes failed to handle the socket.
>
> After fixing /var/run to be a symlink, the upgrade succeeded.

If this is the case, it is not easy to fix inside the dbus package, and arguably the wrong place, as the filesystem package owns the /var/run symlink. The easiest thing at this point is to mention in the release notes that dbus upgrades require that /var/run is a symlink to /run (which is the default case in all versions of RHEL7, as far as I am aware).

Moving this bug to the filesystem package, in the light of comment 28.

While I see the reason behind "switch to filesystem", I don't see any way to fix it from the filesystem package, and it was not originally an issue in the filesystem package: it was a change in behaviour caused by the dbus rebase (and fixable through symlinks in the old locations, as mentioned in comment #4). The filesystem package installs /var/run as a symlink to ../run. I can imagine this being an issue on systems that were upgraded from RHEL 6 to RHEL 7, as /var/run was not a symlink there. However, the original issue is not in filesystem. Still, it doesn't really matter, unless we want to ship symlinks in the old locations with the new dbus package. I think this is something that has to be documented as a known issue anyway. Marking Documentation, as I don't plan to make any changes in the filesystem package and I can't fix it there.
The only fix would be to ship symlinks to the helper binaries in the old locations, but that wouldn't help with the existing issues. It would only help for new updates.

Anyway, switching back to dbus: a filesystem package update can't fix this issue, and the proper fix would be to ship symlinks in the old locations of the binaries to keep backward compatibility...

I logged yet another "dbus update breaks abrt-dbus" report as BZ#1650062. I think you have not noticed that case, because I selected "abrt" as the component. Thanks

Qa_ack+ for providing symlinks to original locations.

*** Bug 1550582 has been marked as a duplicate of this bug. *** |
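
The /var/run observation in this thread (dbus upgrades require that /var/run is a symlink to /run) can be checked before running the update. The following is a minimal sketch, not something taken from the dbus or filesystem packages: `check_symlink` is a hypothetical helper name, and it assumes a `readlink` that supports `-f` (as GNU coreutils does).

```shell
#!/bin/sh
# Sketch only: check_symlink is a hypothetical helper, not part of dbus or yum.
# It succeeds when $1 is a symbolic link that resolves to the same path as $2.
check_symlink() {
    cs_link=$1
    cs_target=$2
    if [ ! -L "$cs_link" ]; then
        echo "NOT A SYMLINK: $cs_link"
        return 1
    fi
    # readlink -f resolves both sides to absolute, symlink-free paths.
    if [ "$(readlink -f "$cs_link")" = "$(readlink -f "$cs_target")" ]; then
        echo "OK: $cs_link resolves to $cs_target"
    else
        echo "WRONG TARGET: $cs_link"
        return 1
    fi
}
```

One possible use is to gate the update on the check, for example `check_symlink /var/run /run && yum update -y dbus`, so that a host where /var/run is a real directory is left untouched until it has been repaired. This only covers the /var/run condition; it says nothing about the launch-helper path change discussed earlier.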