Bug 586160
Summary: | Console hang results from echo to sys/bus/pci/drivers/pci-stub/new_id | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 5 | Reporter: | adam vinsh <adam.vinsh> |
Component: | kvm | Assignee: | Don Dutile (Red Hat) <ddutile> |
Status: | CLOSED NOTABUG | QA Contact: | Virtualization Bugs <virt-bugs> |
Severity: | high | Docs Contact: | |
Priority: | low | ||
Version: | 5.5 | CC: | adam.vinsh, adaora.onyia, andrew.james, bjorn.helgaas, bryan.stillwell, chrisw, chuck.morrison, jiayin.shao, li.zhang6, llim, maxwell.spangler, mike.miller, myron.stowe, scott.scriven, shengliang.lv, virt-maint, ykaul |
Target Milestone: | rc | ||
Target Release: | --- | ||
Hardware: | All | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2010-09-10 21:39:15 UTC | Type: | --- |
Bug Blocks: | 580948 |
Description
adam vinsh
2010-04-26 23:23:20 UTC
I set up a basic config on other machines to verify this. It appears that the hang occurs most often after the echo to "unbind" and then to "new_id". I also realize that the version of qemu-kvm doesn't matter here; this is kernel related. So I guess I mean that whatever sles11sp1 is doing works for this. Maybe that helps in debug.

Adam, does the following sequence work?: unbind; new_id

I believe adding a new_id to pci-stub and having the device unbound from its original driver will cause a probe of the driver; iow, the bind isn't necessary.

btw, the original description and c#1 seem to conflict as to the order in which you did the cmds. Did you do: new_id, unbind, bind; or unbind, new_id, <shell-hang>? Or, if you tried multiple sequences, pls list which were tried.

Hi Don,

Here are some details on this one:

"does the following sequence work?: unbind; new_id"

The echo to unbind causes a hang right away. The system doesn't appear to be doing anything, just sitting idle, but that console will only scroll blank lines on "enter". No ctrl-c or anything similar appears to have any impact.

"I believe adding a new_id to pci-stub & having it unbound from its original driver will cause a probe of the driver; iow, the bind isn't necessary."

If I do this, how does pci-stub know which card to bind to? In this system, 3 Smart Array cards share the address "103c 323a", but only the P812 is located in 0000:08:00.0. Or does that even matter? Maybe it just matters which PCI ID you pass to the guest to boot with, with VT-d?

Here is an example of what I mean by "share the address 103c 323a":

103c:323a-103c:3249 AM312A [HP PCIe SAS SA P812 1GB Flash Cache.] (cciss)
103c:323a-103c:3247 AM311A [HP PCIe SAS SA P411 256MB Ctlr] (cciss)
103c:323a-103c:3241 SA-P212 [HP PCIe SAS SA P212 Ctlr] (cciss)

"btw the original description & c#1 seem to conflict as to the order you did the cmds. did you do: new_id, unbind, bind; or unbind, new_id, <shell-hang> ?"

I always have done new_id, unbind, bind.
In C#1 I meant to say that the hang is after unbind and not new_id.

Adam,

Thanks for the detailed info... it explains possible issues.

So, some background on new_id, bind, unbind... The echo of "vid did" adds that 'vid did' to the PCI table that the driver can 'match on' for a given PCI device, and thus the driver's probe routine is called when such a device is scanned (or "bind"-ed) *if* the device doesn't already have a driver associated with it (have a driver ... er, um, driving it ... already). So, adding a vid-did pair to new_id should never cause a hang, because it just expands a table for possible device<->driver matching.

Now, the unbind for a given device will invoke the 'remove' entry point of the driver for that device. Given you are seeing console hangs at this point, I would surmise your driver is hanging during this function (stack), and you have a driver problem. Tracing the remove code flow in the driver should show where the hang is.

Bind will cause the driver to see if that device has a matching vid-did, and if so, invoke that driver's probe routine. For pci-stub, it does zippo to the device, but tags pci-stub as the in-use driver for that device (by invoking pci_register_driver()), so a PCI scan won't try to attach another (the original) driver to that device.

Last but not least, the vid-did-svid-sdid you show in c#3 does not appear to be an 'expected' use of vid-did-svid-sdid. A vid-did pair should uniquely identify a device; svid-sdid should be used to identify variations of a PCI device, like the size of buffer RAM provided, or (sub-)vendor unique tweaks (like SROM attached or not attached, if a device has such variances and can't be i-d'd through other registers). It appears from the listing above that you have 3 different devices that use the same driver, but have the same vid-did.
The PCI driver tables are designed to handle this case by assigning a unique did to each device and, in the case above, listing 3 vid-did pairs in an array of pci_device_id structs, registered by the driver with the PCI subsystem (via pci_register_driver()). Although it shouldn't be a problem in the above scenario (since you unbind a specific device via its BDF, and the bind uses the same one), it's not the typical use of vid-did-svid-sdid.

So, I'm guessing the cciss driver is not designed for hot-plug, or else it would fail on unplug (which invokes remove as well; possibly suspend before the remove too). Do you have a system where you can try hot-adding and hot-removing this cciss device, to see if it mimics this hang behavior (at unplug time)?

For Smart Array we key off the subsystem ID. The device ID identifies a family of Smart Array controllers. IOW, 103c323a covers the P410, P410i, P411, P212, and P812. The 103c3249 identifies this controller as a P812. Not sure if that helps at all.

The cciss driver is not designed for hot-plug.

(In reply to comment #5)
> For Smart Array we key off the subsystem ID. The device ID identifies a family
> of Smart Array controllers. IOW, 103c323a covers the P410, P410i, P411, P212,
> and P812. The 103c3249 identifies this controller as a P812. Not sure if that
> helps at all.
> The cciss driver is not designed for hot-plug.

And this last sentence is the arrow in the heart: a driver must be designed for hot-plug (support remove) in order to do device-assignment.

Closing this bz as "NOTABUG" wrt device-assignment, since it is a driver issue. Feel free to re-open if you have further data stating otherwise (e.g., proving it works in a hw hot-plug configuration).
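
For anyone landing on this bz later, the new_id, unbind, bind sequence discussed above can be sketched as a small shell function. This is only an illustration under assumptions, not the exact commands the reporter ran: the function name `stub_assign` and the `SYSFS_PCI` override variable are hypothetical, and the override exists only so the sequence can be dry-run against a scratch directory instead of the real /sys/bus/pci.

```shell
#!/bin/sh
# Sketch only: hand one PCI device over to pci-stub through sysfs.
# SYSFS_PCI is a hypothetical override for dry runs; it defaults to the real tree.
SYSFS_PCI="${SYSFS_PCI:-/sys/bus/pci}"

stub_assign() {
    ids="$1"    # "vendor device" pair in hex, e.g. "103c 323a"
    bdf="$2"    # full bus address of the one device to move, e.g. 0000:08:00.0
    stub="$SYSFS_PCI/drivers/pci-stub"
    dev="$SYSFS_PCI/devices/$bdf"

    # 1. Add the vid-did pair to pci-stub's match table.
    echo "$ids" > "$stub/new_id" || return 1

    # 2. Detach the current driver, if there is one. This invokes that
    #    driver's remove entry point, which is where the hang in this
    #    report showed up.
    if [ -e "$dev/driver/unbind" ]; then
        echo "$bdf" > "$dev/driver/unbind" || return 1
    fi

    # 3. Bind this specific device (selected by BDF) to pci-stub.
    echo "$bdf" > "$stub/bind" || return 1
}
```

On a real host this would be run as root, for example `stub_assign "103c 323a" 0000:08:00.0`. Because steps 2 and 3 address the device by BDF, other cards sharing the same vid-did stay with their original driver as long as they remain bound to it. Step 2 is only safe if the original driver implements remove properly, which per this thread cciss does not.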