Red Hat Bugzilla – Bug 1306333
listAllDevices returns invalid parents in scsi_target devices
Last modified: 2016-07-12 09:54:51 EDT
Created attachment 1122826 [details]
output of mentioned script on affected machine
Description of problem:
The node device list returned by the Python API listAllDevices (virConnectListAllNodeDevices) sometimes contains devices whose parent does not exist. We have seen this with scsi_target devices.
Version-Release number of selected component (if applicable):
Specific machines, 100%
Steps to Reproduce:
1. Run the following script to display the XML of all devices:
import libvirt
c = libvirt.openReadOnly()
for dev in c.listAllDevices():
    print(dev.XMLDesc())
Some scsi_target devices will contain a parent (scsi_host) that is not found in the list.
Returned devices should be consistent.
This is a problem for RHEV, as we build a device tree in a database where having an existing parent is one of the database constraints. We mostly need to know whether this is a bug and the parents of these devices should exist somewhere in the tree, or whether such usage is wrong.
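A minimal sketch of the consistency check described above, in case it helps others confirm the symptom on their own hosts. It assumes the libvirt Python bindings are installed; the helper names find_missing_parents and list_host_devices are hypothetical and not part of libvirt or vdsm.

```python
def find_missing_parents(devices):
    """Return (name, parent) pairs whose parent is not in the device list.

    `devices` is an iterable of (name, parent) tuples; parent may be None
    for a device that legitimately has no parent.
    """
    devices = list(devices)
    names = {name for name, _ in devices}
    return sorted(
        (name, parent)
        for name, parent in devices
        if parent is not None and parent not in names
    )


def list_host_devices(uri=None):
    """Collect (name, parent) for every node device libvirt reports."""
    import libvirt  # imported lazily so the check above works without libvirt

    conn = libvirt.openReadOnly(uri)
    try:
        return [(dev.name(), dev.parent()) for dev in conn.listAllDevices()]
    finally:
        conn.close()
```

On an affected host, find_missing_parents(list_host_devices()) would be expected to list the scsi_target devices together with the scsi_host names they point at.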
Created attachment 1122827 [details]
Created attachment 1122829 [details]
Created attachment 1122830 [details]
rhev device tree built on the host
Also, restarting libvirt clears these devices and can be considered a workaround for any software that tries to build the device tree.
NOTE: If I 'massage' the python output a bit further, then there is no difference between the virsh and python output. The one thing virsh does that the libvirt-python doesn't do is sort the returned data on the <device> <name> field. Doing the same for the python output provided, I get the same list of data.
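A rough sketch of that "massaging" step, assuming each device is available as an XML string (for example from XMLDesc()). The exact ordering virsh applies is not stated here, so a plain string sort on the <name> element is an assumption; device_name and sort_by_name are hypothetical helper names.

```python
import xml.etree.ElementTree as ET


def device_name(xml_desc):
    """Extract the text of the <name> element directly under <device>."""
    return ET.fromstring(xml_desc).findtext("name", default="")


def sort_by_name(xml_descs):
    """Return the device XML strings ordered by their <name> field."""
    return sorted(xml_descs, key=device_name)
```

Sorting both outputs the same way makes a line-by-line diff between the virsh and Python results meaningful.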
What may be interesting is to see what 'virsh nodedev-list --tree' shows for the 'scsi_host14' through 'scsi_host23' at least with respect to the 'parent'.
You can also see what virsh has for XML by using virsh nodedev-dumpxml scsi_target14_0_0 (or whichever of those targets).
In the long run though, for 'udev' the parent is filled in by a call to udev_device_get_parent (in udevSetParent() in node_device_udev.c) and for 'hal' the parent is filled in by the attribute "info.parent" (in dev_create() in node_device_hal.c). So libvirt is only filling in the data that it gets from the subsystem. I'll focus on udev since that's what's newer...
The reason why restarting libvirt would appear to clear things is that the device tree is rebuilt. Over time though, as devices are created, changed, or removed, the libvirt node device driver is notified and will add/change/remove devices based on udev/hal event callbacks. So "something" is creating the device and libvirt is just recognizing it. The problem then would be if that same "something" isn't removing the device and thus it appears to stick around. That could mean one of two things: either "something" deletes the path but doesn't tell the udev/hal subsystem, or libvirt is missing the "remove" event. Debugging that would require setting up a debug libvirt environment (e.g. /etc/libvirt/libvirtd.conf changes to get debug messages from at least the node_device subsystem by setting 'log_level', 'log_filters', and 'log_outputs').
Focusing more on those problematic target14_0_0 through target23_0_0 devices. It seems those are related to the iSCSI subsystem (I use iSCSI on my host, so I see a similar <path> construct of "/sys/devices/platform/host*/session*"). However, in my case, the <parent>scsi_host# does exist (from the output of a virsh nodedev-dumpxml scsi_target#_0_0). In fact the "/sys/class/scsi_host/host*/device/session*" values also match (another way to find things is via /sys/class/scsi_host... tree).
In any case, for me things are working just fine. Of course I'm using a different OS (f23), upstream libvirt, and libiscsi (1.15.0-1). Perhaps there's something fixed in the libiscsi subsystem. I don't follow it that closely.
Still first things first - let's see in your environment if the infrastructure exists (e.g. the /sys/devices/platform/... or /sys/class/scsi_host/... paths) for those devices. For example from the python output:
Does that <path> exist? Does "/sys/devices/platform/host14/scsi_host" exist? And I assume if it does there's a single "host14" subdirectory? Next, does /sys/class/scsi_host/host14 exist? I would further assume an 'ls -al' would indicate that it's a link to the devices/platform/... tree.
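The questions above can be answered in one go with a small script. This is only a sketch: it assumes the iSCSI-style layout under /sys/devices/platform discussed here (hosts behind other buses live elsewhere in sysfs), and check_scsi_host_paths is a hypothetical helper name.

```python
import os


def check_scsi_host_paths(host_number, sys_root="/sys"):
    """Report which sysfs entries exist for hostN under the given root."""
    host = "host%d" % host_number
    platform_dir = os.path.join(sys_root, "devices", "platform", host)
    class_entry = os.path.join(sys_root, "class", "scsi_host", host)
    return {
        # /sys/devices/platform/hostN
        "platform_dir": os.path.isdir(platform_dir),
        # /sys/devices/platform/hostN/scsi_host/hostN
        "platform_scsi_host": os.path.isdir(
            os.path.join(platform_dir, "scsi_host", host)
        ),
        # /sys/class/scsi_host/hostN, normally a symlink into devices/...
        "class_entry": os.path.exists(class_entry),
        "class_is_link": os.path.islink(class_entry),
    }
```

For example, check_scsi_host_paths(14) returns a dict of booleans; if every value is False while libvirt still lists scsi_target14_0_0, a missed remove event becomes the more likely explanation.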
Now, if none of that exists, then setting up debugging to "see" if a remove event is missed/lost would be the next challenge.
not reproducible in qe's env either
per comment 6, set needinfo to Martin.
And could you pls provide a machine which can reproduce this issue in a private comment? thx
Can't really reproduce it again, possibly fixed in the process. Will update if I see some machine with reproducer.
I'm going to close as worksforme. If it shows up again, then feel free to reopen and please be sure to provide extra data as shown in comment 6.
This reproduces repeatedly on mburman's lynx15.qa.lab.tlv.redhat.com
Burman, can you supply the information requested in the (long) comment 6?
*** Bug 1334633 has been marked as a duplicate of this bug. ***
[root@lynx15 ~]# cat /sys/devices/platform/
alarmtimer/ dcdbas/ host83/ ipmi_bmc.008b.32/ pcspkr/ serial8250/
coretemp.0/ Fixed MDIO bus.0/ host85/ microcode/ power/ uevent
[root@lynx15 ~]# cat /sys/devices/platform/host83/session78/target83\:0\:0/
83:0:0:0/ 83:0:0:1/ 83:0:0:2/ 83:0:0:3/ 83:0:0:4/ 83:0:0:5/ power/ subsystem/ uevent
[root@lynx15 ~]# cat /sys/devices/platform/host85/scsi_host/host85/
[root@lynx15 ~]# cat /sys/class/scsi_host/host85/
Note, this is not my HW; please contact me in private or ncredy for further investigation, thanks.
Right now I'm in the middle of other work, so in order to "save" the state, can you provide more information here vis-a-vis the details of what was asked for in comment 6? What's shown helps a little, but it's not complete. You will have to extrapolate to the environment causing the problem. Specifically:
"What may be interesting is to see what 'virsh nodedev-list --tree' shows for the 'scsi_host14' through 'scsi_host23' at least with respect to the 'parent'.
You can also see what virsh has for XML by using virsh nodedev-dumpxml scsi_target14_0_0 (or whichever of those targets)."
(although in this case it seems host85 is the target and there's no state under host85, but yet it shows up in output... )
It would also be beneficial to provide details such as OS version, libvirt version... anything else configured on the system (iSCSI) that may be generating host devices? It's noted that it's reproducible, but not how it's reproducible. Is this a stress test or just normal usage? What seems to trigger things? I can debug libvirt problems, but using RHEV is not in my wheelhouse. So if you have a sequence that can make it so the device doesn't exist, something is done, and then the device exists but with no parent, then run libvirtd with debugging enabled, grab/save the log output and then provide that - it would allow me to dig into that data. Without that, it'll be hard to replicate the environment and conditions in order to have a chance at resolving.
I can't provide all these details; it's a production environment that is running all the time (infra) and it's not mine.
What we can do is talk online (tomorrow, for example); I will get access to this environment, and you as well. We will reproduce it (it's 100% reproducible) and you can debug it in real time and pick up all the relevant and desired info.
Is that something that can work for you? This is the time to investigate it, not to 'save' its state. It shouldn't take too long.
We can talk on chat or via e-mail, thanks.
This is reproduced on -->
Red Hat Enterprise Linux Server release 7.2 (Maipo)
- Attaching libvirtd log in debug mode, vdsm log and supervdsm log.
As well the engine and server logs from engine.
- The output of virsh -r nodedev-list --tree
I don't see a scsi_host## there that is missing its parent (but maybe I missed it, which is why I asked you to log in).
[root@lynx15 ~]# virsh -V
Virsh command line tool of libvirt 1.2.17
See web site at http://libvirt.org/
Compiled with support for:
Hypervisors: QEMU/KVM LXC ESX Test
Networking: Remote Network Bridging Interface netcf Nwfilter VirtualPort
Storage: Dir Disk Filesystem SCSI Multipath iSCSI LVM RBD Gluster
Miscellaneous: Daemon Nodedev SELinux Secrets Debug DTrace Readline Modular
[root@lynx15 ~]# yum list installed | grep iscsi
iscsi-initiator-utils.x86_64 184.108.40.2063-33.el7_2.1 @latest_rhel_z_stream
libiscsi.x86_64 1.9.0-6.el7 @rhel-7.2
Created attachment 1158768 [details]
libvirtd and vdsm logs
Created attachment 1158769 [details]
engine and virsh output
Finally got some time to look at this... Started generating a response and "poof" away went my editing window because of a mistakenly typed ctrl-w (dang browser software)...
Anyway, what I've deduced from the output shown is that I don't believe you've replicated the original problem. You'll note that the original problem lists "scsi_target14_0_0" through "scsi_target23_0_0" with <parent> fields that list 'scsi_host14' through 'scsi_host23'; however, those scsi_host##'s don't exist. That output is also shown in some nodedev-list output.
The output you've generated doesn't have that, but it does have some interesting tidbits.
Comparing the output in libvirtd.log and engine.log I found json type output in the engine.log at timestamp "2016-05-18 14:28:48,185 INFO" where if you edit the file and search on "scsi_host8" you'll see scsi_host87 and scsi_host89 listed; however, searching the libvirtd.log output around the same period shows a fetch of scsi_host's, but only for scsi_host0 through scsi_host5.
However, way back at the beginning of libvirtd.log there's remnants of a udev callback to 'remove' the 'scsi_host89' and 'scsi_host87' (at "2016-05-18 11:19:25.954" and "2016-05-18 11:19:25.896"). So as far as libvirt is concerned they were deleted, but that "[org.ovirt.engine.core.vdsbroker.HostDevListByCapsVDSCommand] (ajp-/127.0.0.1:8702-5) [4947bee4] FINISH, HostDevListByCapsVDSCommand" still listed them.
That hints to me that perhaps there's a "cache" being kept by something else that isn't updating the scsi_host list... Which is perhaps a different problem, but not a libvirt problem.
In order to reproduce the original scenario, the scsi_target## devices have to be listed in libvirt output with no corresponding similarly named scsi_host## device (e.g. scsi_target87 should have scsi_host87).
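As a sketch of that criterion, the following works on device names alone (for instance the lines of 'virsh nodedev-list' output). It assumes target names follow the scsi_target<host>_<bus>_<target> pattern seen in this bug; targets_without_host is a hypothetical helper name.

```python
import re

# Assumed naming pattern, based on names such as scsi_target87_0_0.
TARGET_RE = re.compile(r"^scsi_target(\d+)_\d+_\d+$")


def targets_without_host(names):
    """Return scsi_target names that have no matching scsi_host<N> name."""
    names = set(names)
    missing = []
    for name in sorted(names):
        match = TARGET_RE.match(name)
        if match and "scsi_host" + match.group(1) not in names:
            missing.append(name)
    return missing
```

An empty result means the original scenario has not been reproduced in that listing.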
I don't think we keep any cache in vdsm (right, Martin?); would it be possible that it is cached on the libvirt side instead?
I can see in the same libvirt log that although scsi_host89 was removed, scsi_target89_0_0 is still being reported. The code in vdsm that queries libvirt does not cache anything and always queries libvirt directly, so I'm assuming the wrong parent was indeed returned by libvirt for that scsi_target89_0_0 device.
Not sure I have an answer for scsi_target87_0_0 and scsi_target89_0_0 (although perhaps notable that 88 isn't there). In the long run, libvirt only reports what it gets from udev. If udev has the target there, but no host - that could either be a bug in udev or perhaps that those two LUNs still have outstanding I/O on them so UDEV won't delete them. I don't have that kind of knowledge of that area.
The libvirt nodedev driver reacts to add, change, and delete events from udevEventHandleCallback. So as soon as udev tells us, we handle it appropriately.
Addition and Deletion from libvirt lists is not cached in any overt way that I can see.
Still, getting a parent is not "guaranteed" to return something as the API indicates:
"the name of the device's parent, or NULL if an error occurred or when the device has no parent."
(In reply to John Ferlan from comment #22)
> Not sure I have an answer for scsi_target87_0_0 and scsi_target89_0_0
> (although perhaps notable that 88 isn't there). In the long run, libvirt
> only reports what it gets from udev. If udev has the target there, but no
> host - that could either be a bug in udev or perhaps that those two LUNs
> still have outstanding I/O on them so UDEV won't delete them. I don't have
> that kind of knowledge of that area.
yeah, I wouldn't be surprised if that's the reason
> The libvirt nodedev driver reacts to add, change, and delete events from
> udevEventHandleCallback. So as soon as udev tells us, we handle it
> appropriately.
> Addition and Deletion from libvirt lists is not cached in any overt way that
> I can see.
> Still, getting a parent is not "guaranteed" to return something as the API
> indicates:
> "the name of the device's parent, or NULL if an error occurred or when the
> device has no parent."
that's ok, but here we're talking of a parent which is not reported. Alright, we'll handle both cases internally, dropping the devices if they do not have a parent assuming there was some kind of an error or they are stuck/being removed
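A minimal sketch of what such a filter could look like; this is not the actual vdsm change. It assumes the devices are held as a name-to-parent mapping and that the tree root is the device named "computer"; drop_orphans and the roots parameter are hypothetical.

```python
def drop_orphans(devices, roots=("computer",)):
    """Return a copy of `devices` without entries lacking a usable parent.

    `devices` maps device name -> parent name (or None). A device is dropped
    when its parent is None or not present, unless it is a known root.
    Removal repeats until stable, so children of dropped devices go too.
    """
    kept = dict(devices)
    changed = True
    while changed:
        changed = False
        for name, parent in list(kept.items()):
            if name in roots:
                continue
            if parent is None or parent not in kept:
                del kept[name]
                changed = True
    return kept
```

Repeating the pass matters because a LUN device under a dropped scsi_target would otherwise violate the same database constraint one level down.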