Bug 645322

Summary: Message "BUG: kvm_destroy_phys_mem: invalid parameters (slot=-1)" is emitted when booting a guest with a VF/PF attached
Product: Red Hat Enterprise Linux 5
Reporter: juzhang <juzhang>
Component: kvm
Assignee: Alex Williamson <alex.williamson>
Status: CLOSED WONTFIX
QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium
Priority: low
Version: 5.6
CC: michen, mkenneth, virt-maint, ykaul
Target Milestone: rc
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Last Closed: 2011-06-03 15:26:34 UTC
Bug Blocks: 580946

Description juzhang 2010-10-21 10:18:29 UTC
Description of problem:
The message "BUG: kvm_destroy_phys_mem: invalid parameters (slot=-1)" is emitted when a guest is booted with a VF attached. The guest boots successfully and the VF obtains an IP address.

Version-Release number of selected component (if applicable):
#uname -r
2.6.18-227.el5

#lspci | grep 82576
03:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
03:00.1 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)

#rpm -qa | grep kvm
etherboot-roms-kvm-5.4.4-13.el5
kvm-tools-83-205.el5
etherboot-zroms-kvm-5.4.4-13.el5
kvm-qemu-img-83-205.el5
kvm-debuginfo-83-205.el5
kvm-83-205.el5
kmod-kvm-83-205.el5


How reproducible:


Steps to Reproduce:
1. Set up SR-IOV: pass max_vfs=7 when loading igb with modprobe
2. Unbind one of the VFs from its host kernel driver and bind it to pci-stub
#lspci -n | grep 03:10.0
03:10.0 0200: 8086:10ca (rev 01)

#echo "8086 10ca" >/sys/bus/pci/drivers/pci-stub/new_id
#echo 0000:03:10.0 >/sys/bus/pci/devices/0000\:03\:10.0/driver/unbind
#echo 0000:03:10.0 >/sys/bus/pci/drivers/pci-stub/bind
3. Boot the guest with the VF NIC assigned
#/usr/libexec/qemu-kvm -m 4096 -smp 4 -name rhel5.6test -uuid e793b1be-108b-9691-7ce7-98d0b2602abb -monitor stdio -boot c -drive file=/root/zhangjunyi/rhel5.5_64.raw,if=virtio,boot=on,format=raw,cache=none,werror=stop -net nic,macaddr=54:52:00:37:d0:0b,vlan=0,model=virtio -net tap,vlan=0,script=/etc/qemu-ifup  -vnc :10 -k en-us  -pcidevice host=03:10.0,id=sriov1
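
The rebinding in step 2 can be wrapped in a small helper. This is only a sketch of the three echo commands above, not a tested tool: the function name rebind_to_pci_stub and the SYSFS_PCI override are hypothetical, and it assumes the device is currently bound to a host driver, as in this report.

```shell
# Sketch: rebind one PCI device to pci-stub, mirroring step 2 above.
# rebind_to_pci_stub and SYSFS_PCI are illustrative names, not existing tools.
# SYSFS_PCI defaults to the real sysfs tree; override it only for dry runs.
rebind_to_pci_stub() {
    bdf=$1    # full PCI address, e.g. 0000:03:10.0
    ids=$2    # "vendor device" pair as shown by lspci -n, e.g. "8086 10ca"
    root=${SYSFS_PCI:-/sys/bus/pci}

    # Register the vendor/device ID pair with pci-stub.
    echo "$ids" > "$root/drivers/pci-stub/new_id" || return 1

    # Detach the device from its current host driver, if one is bound.
    if [ -e "$root/devices/$bdf/driver/unbind" ]; then
        echo "$bdf" > "$root/devices/$bdf/driver/unbind" || return 1
    fi

    # Attach the device to pci-stub.
    echo "$bdf" > "$root/drivers/pci-stub/bind" || return 1
}
```

On the host from this report it would be called as root with: rebind_to_pci_stub 0000:03:10.0 "8086 10ca"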
  
Actual results:
(qemu) BUG: kvm_destroy_phys_mem: invalid parameters (slot=-1)
BUG: kvm_destroy_phys_mem: invalid parameters (slot=-1)


Expected results:
The message "BUG: kvm_destroy_phys_mem: invalid parameters (slot=-1)" is not emitted, and the VF works correctly.

Additional info:
Please note:
1. The message "BUG: kvm_destroy_phys_mem: invalid parameters (slot=-1)" is emitted when booting the guest with a VF attached. However, the guest boots successfully and the VF obtains an IP address.

2. I also tested with a PF and hit the same issue.
The guest boots successfully and the PF obtains an IP address.

Comment 1 juzhang 2010-10-21 10:46:36 UTC
Host dmesg when booting the guest with a PF attached
1. Before booting the guest, clear dmesg
#dmesg -c
#dmesg

2. dmesg while booting the guest with the PF attached
#dmesg
device tap0 entered promiscuous mode
breth0: topology change detected, propagating
breth0: port 2(tap0) entering forwarding state
PCI: Enabling device 0000:03:00.1 (0400 -> 0403)
ACPI: PCI Interrupt 0000:03:00.1[B] -> GSI 40 (level, low) -> IRQ 210
PM: Writing back config space on device 0000:03:00.1 at offset f (was 200, writing 207)
PM: Writing back config space on device 0000:03:00.1 at offset c (was 0, writing 400000)
PM: Writing back config space on device 0000:03:00.1 at offset 7 (was 0, writing e3244000)
PM: Writing back config space on device 0000:03:00.1 at offset 6 (was 1, writing b021)
PM: Writing back config space on device 0000:03:00.1 at offset 5 (was 0, writing e3800000)
PM: Writing back config space on device 0000:03:00.1 at offset 4 (was 0, writing e3220000)
PM: Writing back config space on device 0000:03:00.1 at offset 1 (was 100000, writing 100400)
assign device: host bdf = 3:0:1
printk: 7 messages suppressed.
kvm: 6492: cpu0 unimplemented perfctr wrmsr: 0x186 data 0x130079
kvm: 6492: cpu0 unimplemented perfctr wrmsr: 0xc1 data 0xffd74eae
kvm: 6492: cpu0 unimplemented perfctr wrmsr: 0x186 data 0x530079
kvm: 6492: cpu1 unimplemented perfctr wrmsr: 0x186 data 0x130079
kvm: 6492: cpu1 unimplemented perfctr wrmsr: 0xc1 data 0xffd74eae
kvm: 6492: cpu1 unimplemented perfctr wrmsr: 0x186 data 0x530079
kvm: 6492: cpu2 unimplemented perfctr wrmsr: 0x186 data 0x130079
kvm: 6492: cpu2 unimplemented perfctr wrmsr: 0xc1 data 0xffd74eae
kvm: 6492: cpu2 unimplemented perfctr wrmsr: 0x186 data 0x530079
kvm: 6492: cpu3 unimplemented perfctr wrmsr: 0x186 data 0x130079

Comment 3 Alex Williamson 2010-11-18 17:18:17 UTC
While annoying, I believe this error message is harmless.  When a PCI BAR is mapped, device assignment tries to destroy the previous mapping so that it can set up the new mapping.  However, the core PCI code has already set the previous mapping to unassigned, which essentially does the same thing.  By the time device assignment tries to destroy the mapping, it has already been cleared, hence the error messages.  This upstream patch seems to resolve it by checking mappings before blindly trying to remove them:

http://git.kernel.org/?p=virt/kvm/qemu-kvm.git;a=commitdiff;h=f839a3ef7103dd745a1bd9290655df0d69c3f3b2
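
The difference between blind and checked removal can be shown with a toy model. This is not qemu-kvm code and does not reproduce the linked patch: the mappings list, the function names, and the sample address are all made up for illustration. Only the printed message is taken from this report.

```shell
# Toy model only: registered mappings are kept as a space-separated
# list of start addresses in the (hypothetical) variable "mappings".
mappings=""

# Succeeds if the given address is currently in the list.
is_mapped() {
    case " $mappings " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Rebuild the list without the given address.
unmap() {
    new=""
    for m in $mappings; do
        [ "$m" = "$1" ] || new="$new $m"
    done
    mappings="${new# }"
}

# Blind removal: complains when the mapping is already gone.
destroy_blind() {
    if ! is_mapped "$1"; then
        echo "BUG: kvm_destroy_phys_mem: invalid parameters (slot=-1)"
        return 1
    fi
    unmap "$1"
}

# Checked removal: silently skips a mapping that is already gone.
destroy_checked() {
    is_mapped "$1" || return 0
    unmap "$1"
}
```

In this model, calling destroy_blind twice on the same address prints the message the second time, while destroy_checked stays silent. That matches the description above: the first removal stands in for the core PCI code, the second for device assignment.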

Comment 6 RHEL Program Management 2011-01-11 20:53:23 UTC
This request was evaluated by Red Hat Product Management for
inclusion in the current release of Red Hat Enterprise Linux.
Because the affected component is not scheduled to be updated in the
current release, Red Hat is unfortunately unable to address this
request at this time. Red Hat invites you to ask your support
representative to propose this request, if appropriate and relevant,
in the next release of Red Hat Enterprise Linux.

Comment 7 RHEL Program Management 2011-01-11 22:54:36 UTC
This request was erroneously denied for the current release of
Red Hat Enterprise Linux.  The error has been fixed and this
request has been re-proposed for the current release.

Comment 9 Alex Williamson 2011-06-03 15:26:34 UTC
This is a benign error message when caused by device assignment.  It potentially serves a purpose for non-device assignment triggers, but we can't tell the difference at the place in the code where this occurs.  Fixing would require potentially intrusive changes.  Resolving as WONTFIX.