Bug 847548
Summary: kernel BUG at include/linux/scatterlist.h:67 (vp_set / virtscsi_init / virtscsi_complete_free) kernel panics when virtio-scsi module is loaded

| Field | Value |
|---|---|
| Product | [Fedora] Fedora |
| Component | kernel |
| Version | rawhide |
| Status | CLOSED ERRATA |
| Severity | unspecified |
| Priority | unspecified |
| Hardware | Unspecified |
| OS | Unspecified |
| Reporter | Richard W.M. Jones <rjones> |
| Assignee | Kernel Maintainer List <kernel-maint> |
| QA Contact | Fedora Extras Quality Assurance <extras-qa> |
| CC | amit.shah, awilliam, berrange, cfergeau, dwmw2, gansalmon, itamar, jonathan, kernel-maint, knoel, madhu.chinakonda, pbonzini, rjones, scottt.tw, virt-maint |
| Whiteboard | RejectedNTH |
| Doc Type | Bug Fix |
| Type | Bug |
| Last Closed | 2012-08-24 00:12:16 UTC |

Description
Richard W.M. Jones
2012-08-12 21:40:40 UTC
The qemu command line we're using is:

/usr/bin/qemu-kvm \
    -global virtio-blk-pci.scsi=off \
    -nodefconfig \
    -nodefaults \
    -nographic \
    -device virtio-scsi-pci,id=scsi \
    -drive file=/tmp/libguestfs-test-tool-sda-D1C0Dp,format=raw,id=hd0,if=none \
    -device scsi-hd,drive=hd0 \
    -drive file=/var/tmp/.guestfs-1000/root.3411,snapshot=on,id=appliance,if=none,cache=unsafe \
    -device scsi-hd,drive=appliance \
    -machine accel=kvm:tcg \
    -m 500 \
    -no-reboot \
    -no-hpet \
    -device virtio-serial \
    -serial stdio \
    -device sga \
    -chardev socket,path=/tmp/libguestfspKdIKD/guestfsd.sock,id=channel0 \
    -device virtserialport,chardev=channel0,name=org.libguestfs.channel.0 \
    -kernel /var/tmp/.guestfs-1000/kernel.3411 \
    -initrd /var/tmp/.guestfs-1000/initrd.3411 \
    -append 'panic=1 console=ttyS0 udevtimeout=600 no_timer_check acpi=off printk.time=1 cgroup_disable=memory root=/dev/sdb selinux=0 guestfs_verbose=1 TERM=xterm '

Still getting this with guest kernel 3.6.0-0.rc6.1.fc18.x86_64 and qemu-1.2-0.3.20120806git3e430569.fc19.x86_64.

Note I can reproduce this using a regular guest as well as with libguestfs. Reassigning to the kernel, since changing back to kernel 3.5.0 makes the bug go away.

(In reply to comment #2)
> Still getting this with guest kernel 3.6.0-0.rc6.1.fc18.x86_64

I believe I meant to write kernel 3.6.0-0.rc1.git6.1.fc18.x86_64

In any case, I'll retest with the latest kernel from Koji.

Same problem occurs with 3.6.0-0.rc2.git0.2.fc18.x86_64. Stack trace is identical to above.

Created attachment 605707
0001-SCSI-virtio-scsi-Initialize-scatterlist-structure.patch
I'm trying out this patch.
Fixed :-) Posted on LKML.

(In reply to comment #7)
> https://lkml.org/lkml/2012/8/20/365

I'll get this in later today. Somewhat unfortunately, I doubt it will show up in the Alpha since we're in freeze and they're only taking blocker+NTH bugs.

(In reply to comment #8)
> (In reply to comment #7)
> > https://lkml.org/lkml/2012/8/20/365
>
> I'll get this in later today. Somewhat unfortunately, I doubt it will show
> up in the Alpha since we're in freeze and they're only taking blocker+NTH
> bugs.

So:

- In Rawhide, this prevents anyone from using virtio-scsi. That's
  serious, but hopefully you can get this into the Rawhide kernel
  so we should be OK.

- In Fedora 18, this *doesn't* affect anything because the
  BUG_ON is an integrity check which only kicks in when debugging
  is enabled (disabled in Fedora 18 kernels, I think). Although
  the warning happens because a structure isn't initialized, in fact
  this doesn't cause a problem -- I tested that.

(In reply to comment #9)
> (In reply to comment #8)
> > (In reply to comment #7)
> > > https://lkml.org/lkml/2012/8/20/365
> >
> > I'll get this in later today. Somewhat unfortunately, I doubt it will show
> > up in the Alpha since we're in freeze and they're only taking blocker+NTH
> > bugs.
>
> So:
>
> - In Rawhide, this prevents anyone from using virtio-scsi. That's
>   serious, but hopefully you can get this into the Rawhide kernel
>   so we should be OK.

We haven't been building kernels for rawhide/f19 (git master branch) explicitly because the f18 branch is identical thus far. We rely on inheritance to get the kernels into rawhide.

> - In Fedora 18, this *doesn't* affect anything because the
>   BUG_ON is an integrity check which only kicks in when debugging
>   is enabled (disabled in Fedora 18 kernels, I think). Although
>   the warning happens because a structure isn't initialized, in fact
>   this doesn't cause a problem -- I tested that.

Nope. Debugging is enabled in F18. We always ship Alpha with a debug kernel.

Anyway, the patch is committed to the f18/master branches now.

Hmm I wonder if this counts as a NTH bug ...

From: https://fedoraproject.org/wiki/QA:SOP_nth_bug_process

> In general, nice-to-have bugs are usually bugs for which
> an update is not an optimal solution,

Yes: guest will not even boot, so update is not possible.

> and for which the fix
> is reasonably small and testable (this consideration becomes
> progressively more important as a release nears, so bugs may
> be downgraded from nice-to-have status late in the release
> process if it transpires that the fix is complex and hard to test).

Yes: fix is a one-liner, and well understood / tested.

> Types of bugs which are typically likely to be accepted as
> nice-to-have bugs include:
>
> * bugs which constitute infringements of the desktop-related
>   Fedora_Release_Criteria as applied to
>   non-default desktops
> * bugs which result in a system being unable to
>   reach a graphical environment

Yes: F18 guest using virtio-scsi will not even boot unless this patch has been applied to the kernel.

> * significant installer bugs which do not meet the
>   criteria to be blocker bugs

I'll likely be building a kernel with this fix later today. If it gets accepted via NTH, I can use that build in the update instead of the one currently queued.

(In reply to comment #10)
> We haven't been building kernels for rawhide/f19 (git master branch)
> explicitly because the f18 branch is identical thus far. We rely on
> inheritance to get the kernels into rawhide.

There's a build of kernel-3.6.0-0.rc2.git1.2.fc18 which contains this fix (http://koji.fedoraproject.org/koji/buildinfo?buildID=349638).

However that build isn't included/inherited when I build against Rawhide. Instead I'm still getting the old broken 3.6.0-0.rc1.git6.1.fc18 (see: http://kojipkgs.fedoraproject.org//work/tasks/3038/4413038/root.log). I've waited over 12 hours and there have been multiple rawhide repo builds in that time.

Any idea what's going on?

(In reply to comment #13)
> (In reply to comment #10)
> > We haven't been building kernels for rawhide/f19 (git master branch)
> > explicitly because the f18 branch is identical thus far. We rely on
> > inheritance to get the kernels into rawhide.
>
> There's a build of kernel-3.6.0-0.rc2.git1.2.fc18 which contains
> this fix (http://koji.fedoraproject.org/koji/buildinfo?buildID=349638).
>
> However that build isn't included/inherited when I build against
> Rawhide. Instead I'm still getting the old broken 3.6.0-0.rc1.git6.1.fc18
> (see: http://kojipkgs.fedoraproject.org//work/tasks/3038/4413038/root.log).
> I've waited over 12 hours and there have been multiple rawhide repo builds
> in that time.
>
> Any idea what's going on?

Yeah, we're in Alpha freeze. The f19/rawhide koji tags inherit from the f18 koji tag. Builds done against f18 go into f18-updates-candidate and we have to file bodhi updates to get builds into the f18 tag from there. However, since we're in Alpha freeze, only blocker and NTH bugs are making it out of updates-testing and into the f18 tag. Since nothing new is going into the f18 tag, nothing is being inherited into rawhide.

It'll get there eventually. That whole reason is why I haven't closed this bug yet either.

kernel-3.6.0-0.rc2.git2.1.fc18, grub2-2.00-5.fc18, pesign-0.10-4.fc18 has been submitted as an update for Fedora 18.
https://admin.fedoraproject.org/updates/kernel-3.6.0-0.rc2.git2.1.fc18,grub2-2.00-5.fc18,pesign-0.10-4.fc18

Discussed at 2012-08-22 NTH review meeting. We agreed that on merit this bug doesn't quite rank NTH status, the impact is nasty but it's in a pretty obscure configuration and we're trying to be strict about kernel NTH bugs. We think it'd be acceptable in an Alpha to document this issue and have anyone who wants to use the Alpha in this specific configuration take care to install a kernel from updates.

However, in practice, the kernel build that fixes this is likely to make Alpha anyhow due to #849244 and #850003 being accepted as NTH. So don't worry about the process wankery. =)

kernel-3.6.0-0.rc2.git2.1.fc18, grub2-2.00-5.fc18, pesign-0.10-4.fc18 has been pushed to the Fedora 18 stable repository. If problems still persist, please make note of it in this bug report.