Bug 509220
Summary: i386 rhel4.8 kvm guests crashes in virtio during installation
Product: Red Hat Enterprise Linux 4
Reporter: Gurhan Ozen <gozen>
Component: kernel
Assignee: Chris Lalancette <clalance>
Status: CLOSED ERRATA
QA Contact: Virtualization Bugs <virt-bugs>
Severity: high
Priority: high
Version: 4.8
CC: clalance, dhoward, dyasny, jburke, jwm, mjenner, plyons, qzhang, rdassen, tao, tburke, virt-maint, ykaul
Target Milestone: rc
Keywords: ZStream
Hardware: All
OS: Linux
Doc Type: Bug Fix
Last Closed: 2011-02-16 15:49:09 UTC
Bug Blocks: 582911

Description
Gurhan Ozen
2009-07-01 19:35:48 UTC
Seems like it is a guest driver issue. Changing the component. I'll take a look at this tomorrow.

Chris Lalancette
OK, I took a look at this. First, I wasn't able to reproduce with an F-11 based host, so I'll try a RHEL-5 based one next. Gurhan, could you give me details about which piece(s) of hardware you were able to reproduce this on?

Looking at the stack trace, we hit this:

0x2a0 is in do_virtblk_request (drivers/block/virtio_blk.c:157).
152             struct request *req;
153             unsigned int issued = 0;
154
155             while ((req = elv_next_request(q)) != NULL) {
156                     vblk = req->rq_disk->private_data;
157                     BUG_ON(req->nr_phys_segments > ARRAY_SIZE(vblk->sg));
158
159                     /* If this request fails, stop queue and wait for something to
160                        finish to restart it. */
161                     if (!do_req(q, vblk, req)) {

Which means (I think) that we were handed more segments than the ring can handle. I'll have to look further into how we got into that situation.

Chris Lalancette
OK, I have a thought as to how this might happen. I think the problem might be in the size of our scatterlist vs. what the host told us. In the current RHEL-4 code, there is a hardcoded scatterlist size of (3+MAX_PHYS_SEGMENTS). However, I think it is possible for the host to tell us to use more than that, and if it does so, then we set up the block layer to give us whatever the host tells us, irrespective of our internally hardcoded size. If that happens, then we'll run into this BUG().

Luckily, upstream has moved to a dynamically allocated scatterlist. Assuming my above analysis is right, then this should fix the problem. I've done a backport of that patch to RHEL-4, and now I just need a place to test it. Hopefully Gurhan can provide me with the machines I need to do that.

Chris Lalancette
Created attachment 357314
Backport of linux-2.6 patch for dynamic scatterlist
This patch is what I have in mind. I've lightly tested it, and it seems to work (at least basically), but I'll still need to test it on the problem machine.
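
For readers following the analysis above, here is a minimal userspace sketch of the difference between a hardcoded and a dynamically sized scatterlist. It is not the attached patch. The struct layouts, the helper names (vblk_dynamic_alloc, fixed_would_overflow, dynamic_would_overflow), the value 128 for MAX_PHYS_SEGMENTS, and the two extra entries reserved in the dynamic case are all assumptions made for illustration; only the (3+MAX_PHYS_SEGMENTS) size and the BUG_ON condition come from this report.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for the kernel's struct scatterlist; only its size matters here. */
struct scatterlist {
    void *addr;
    unsigned int length;
};

/* Assumed value, for illustration only. */
#define MAX_PHYS_SEGMENTS 128

/* Hardcoded size described in the report: 3 + MAX_PHYS_SEGMENTS entries. */
#define FIXED_SG_ELEMS (3 + MAX_PHYS_SEGMENTS)

/* Fixed layout: the array size is decided at compile time. */
struct vblk_fixed {
    struct scatterlist sg[FIXED_SG_ELEMS];
};

/* Dynamic layout: the array is sized at probe time from the host's limit. */
struct vblk_dynamic {
    unsigned int sg_elems;
    struct scatterlist sg[];    /* flexible array member */
};

/* Same comparison as the BUG_ON at virtio_blk.c:157, returned instead of trapping. */
static bool fixed_would_overflow(unsigned int nr_phys_segments)
{
    return nr_phys_segments > FIXED_SG_ELEMS;
}

/*
 * Allocate the device state with room for whatever segment count the host
 * advertised. The "+ 2" (one entry each for a request header and a status
 * byte) is an assumption of this sketch.
 */
static struct vblk_dynamic *vblk_dynamic_alloc(unsigned int host_seg_max)
{
    unsigned int sg_elems = host_seg_max + 2;
    struct vblk_dynamic *vblk;

    vblk = malloc(sizeof(*vblk) + sg_elems * sizeof(vblk->sg[0]));
    if (vblk == NULL)
        return NULL;
    vblk->sg_elems = sg_elems;
    return vblk;
}

/* With a dynamic array the check is made against the allocated size. */
static bool dynamic_would_overflow(const struct vblk_dynamic *vblk,
                                   unsigned int nr_phys_segments)
{
    return nr_phys_segments + 2 > vblk->sg_elems;
}
```

With these assumed numbers, a host limit of 200 segments overflows the fixed array (200 > 131) but fits the dynamic one, because the allocation and the limit handed to the block layer come from the same value.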
Chris Lalancette
Hm, yeah, I'm wondering if Dor's last comment is why things are different. In any case, I've now tested the installer with the patch in place, and things look pretty good; the install with that particular ks.cfg no longer fails. I'll get this patch queued up for 4.9.

Chris Lalancette
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.

Committed in 89.11.EL. RPMS are available at http://people.redhat.com/vgoyal/rhel4/

The test kernel has been tested by a customer, who verified that the issue is no longer reproducible; see https://enterprise.redhat.com/issue-tracker/?module=issues&action=view&tid=737433&gid=23498&view_type=lifoall#eid_6681283

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0263.html