Bug 1269570
| Field | Value |
|---|---|
| Summary | Running 'virsh list' with libvirt-0.10.2-54.el6.x86_64 causes "Failed to connect socket to '/var/run/libvirt/libvirt-sock': Connection refused" |
| Product | Red Hat Enterprise Linux 6 |
| Component | libvirt |
| Version | 6.5 |
| Status | CLOSED NOTABUG |
| Severity | urgent |
| Priority | unspecified |
| Reporter | Robert McSwain <rmcswain> |
| Assignee | Jiri Denemark <jdenemar> |
| QA Contact | Virtualization Bugs <virt-bugs> |
| CC | adevolder, dyuan, fjin, jdenemar, mkletzan, rbalakri, rmcswain, xuzhang, yafu, yalzhang, zhwang |
| Target Milestone | rc |
| Target Release | --- |
| Keywords | Reopened |
| Hardware | x86_64 |
| OS | Linux |
| Doc Type | Bug Fix |
| Story Points | --- |
| Last Closed | 2016-10-25 11:59:24 UTC |
| Type | Bug |
| Regression | --- |
| Mount Type | --- |
| Documentation | --- |
| Category | --- |
| oVirt Team | --- |
| Cloudforms Team | --- |
| Bug Blocks | 1269194, 1359965 |
Description
Robert McSwain
2015-10-07 14:47:44 UTC
> [root@hq1-beprod-s1 yum.repos.d]# service libvirtd start
> Starting libvirtd daemon: [ OK ]
> [root@hq1-beprod-s1 yum.repos.d]# service libvirtd status
> libvirtd dead but pid file exists
This suggests that libvirtd failed early after starting. Could you please attach /var/log/libvirt/libvirtd.log so that we can see the error?
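For anyone wanting to confirm the "dead but pid file exists" state by hand, here is a minimal sketch of that check. The function name `pidfile_state` is made up for illustration, and the pid file path in the usage note is an assumption about the RHEL 6 layout, not something stated in this bug.

```shell
# Report whether the PID recorded in a pid file belongs to a live process.
# Run as root: for a process owned by another user, kill -0 fails with a
# permission error and the process would be misreported as dead.
pidfile_state() {
    pid_file=$1
    [ -r "$pid_file" ] || { echo "no pid file"; return; }
    pid=$(cat "$pid_file")
    # kill -0 sends no signal; it only checks that the process exists.
    if kill -0 "$pid" 2>/dev/null; then
        echo "running (pid $pid)"
    else
        echo "dead but pid file exists"
    fi
}
```

Usage would be something like `pidfile_state /var/run/libvirtd.pid` (path assumed). A "dead but pid file exists" answer right after a successful `service libvirtd start` is consistent with the daemon exiting early.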
BTW, I looked at the sosreport attached to the case (it should have been attached to this bz too) and the logs there are pretty strange and confusing. Partially, this is because they downgraded to an older running libvirt before capturing the sosreport, which makes it pretty useless. We need them to capture the sosreport after they have installed the -54 libvirt and confirmed that they can't connect to the daemon and that restarting the daemon doesn't help.

One more question. Does this happen *only* with 'virsh list' or with other commands too? Could you try doing 'virsh destroy domain_name_that_does_NOT_exist'?

Putting back the needinfo as we still need the debug logs; without them we don't have much to go on.

Since there are no debug logs available and this issue is not reproducible for us (or anyone else as far as I can tell), I'm closing this as INSUFFICIENT_DATA. Feel free to reopen this BZ (or rather create a new one) with the debug logs included. More info about how to enable and use them can be found here: http://wiki.libvirt.org/page/DebugLogs

Martin,
The customer reminded me that debug logs are not being written because, as soon as the affected version of libvirt is installed, the libvirt service does not start and hence no logs are written. What can we do to work around this? In his words:

Please note after I update to the version (-54) I am having problem and look for process there is no process started
> ps -C libvirtd
> PID TTY TIME CMD
This is after I revert back the packages (-29).
> ps -C libvirtd
> PID TTY TIME CMD
> 41137 ? 00:00:00 libvirtd

That doesn't feel right, because the first VIR_DEBUG() call comes way before the non-fatal error that is seen in the log, and there are plenty more in between, so there should be many lines in the logfile. If the debug logs cannot be gathered, I reckon someone should try debugging the daemon on-site to see where it fails or crashes. This looks more like a crash, so anything like abrt/ulimit/gdb that you can use should suffice.
Martin, I'm not sure what else to tell the customer, as we can't seem to get these logs from the affected version of libvirt. As soon as they install the affected version, libvirtd fails to start and no logs are generated. As logs are generated only when the daemon is started successfully, is there anything else you can think of, or any specific directions on how to instruct this customer to use gdb to get what would be most helpful here?

(In reply to Robert McSwain from comment #21)
Logs are generated even when the daemon cannot start, but let's say that's not possible (maybe because of some other bug). Let's try getting the logs a bit differently. In the meantime I'll try to come up with other ideas for how to move on with this. After the affected package is installed, stop the service (even if it has already crashed) and then, as root, run the daemon manually with the following command line:
LIBVIRT_DEBUG=1 LIBVIRT_LOG_OUTPUTS=stderr libvirtd

(In reply to Robert McSwain from comment #21)
As another idea, would it be possible for the customer to set up a system that experiences this error and provide some (at least limited) access to that system? I'm guessing not, but I had to ask :)

Another way would be running the daemon under strace, for example. That will, however, generate a lot of output and there's a very low chance it will show something we need.

You can also run it with gdb, as said in some previous comments. Just running "gdb libvirtd" as root and then typing "run" and pressing enter should show whether it crashes or not. If it does, use the command "t a a bt full" to get all the stack traces so we know where it crashed. However, it doesn't look like it's crashing, so that might not help either.

The last thing that I can think of right now is bisecting, so we get closer to the change that caused it. Doing git bisect and going commit by commit probably won't fly, so I'd suggest at least figuring out the exact package version that caused it. That means finding two packages with sequential release numbers (e.g. -51 and -52) where the older one works and the newer one does not.

I'll try to think of other options in case none of this is possible or helps, but for now I've got nothing else on my mind.

Created attachment 1154648
LIBVIRT_DEBUG=1 LIBVIRT_LOG_OUTPUTS=stderr libvirtd
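For reference, a minimal sketch of how output like the attachment above could be saved to a file. The two environment variables are copied verbatim from the earlier comment; the helper name `capture_debug` and the log path in the usage note are made up for illustration.

```shell
# Run a command in the foreground with the libvirt debug variables from the
# comment above set in its environment, saving stdout and stderr to a file.
capture_debug() {
    log_file=$1
    shift
    LIBVIRT_DEBUG=1 LIBVIRT_LOG_OUTPUTS=stderr "$@" >"$log_file" 2>&1
}
```

As root, after `service libvirtd stop`, this would be invoked as `capture_debug /tmp/libvirtd-debug.log libvirtd` (file name assumed). If the daemon starts normally it stays in the foreground until interrupted; if it fails early, the file should contain whatever it printed before exiting.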
This is great news, there is valuable information in that log. However, it doesn't look like a problem with libvirt; maybe with libvirt's requirements. I would see it as some kind of backward incompatibility of libdevmapper. Jirka, could you check that the builds are OK on our side and move this to the appropriate place? Thanks.

This is actually a bug in the device-mapper packages, which started to export the dm_task_get_info_with_deferred_remove symbol without bumping the so version (mainly because the symbol had already been there for some time, it just wasn't exported). That confused dependency tracking, since even an older library which does not export the symbol still satisfies the dependency. The bug should not show up on a fully updated system, though. So either don't install libvirt from 6.7 on a 6.5 system, or manually update both libvirt and device-mapper-libs, or just update the system completely.
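To check whether the installed libdevmapper actually exports the symbol in question, something along these lines should work. This is only a sketch: `has_symbol` is a made-up helper that scans `nm -D` style output on stdin, and the library path in the usage note is an assumption about where device-mapper-libs installs it on x86_64.

```shell
# Succeed if the symbol named in $1 appears in `nm -D` style output on stdin.
# The symbol name is the last field of each line and may carry an @VERSION
# suffix, which is stripped before comparing.
has_symbol() {
    awk -v sym="$1" '
        { name = $NF; sub(/@.*/, "", name); if (name == sym) found = 1 }
        END { exit !found }
    '
}
```

Usage (path assumed): `nm -D --defined-only /lib64/libdevmapper.so.1.02 | has_symbol dm_task_get_info_with_deferred_remove && echo exported || echo missing`. A "missing" result on a system with the -54 libvirt installed would match the mismatch described above.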