
Bug 680864

Summary: __packet_get_status unable to handle kernel paging request
Product: Red Hat Enterprise Linux 6
Reporter: Suqin Huang <shuang>
Component: kernel
Assignee: Red Hat Kernel Manager <kernel-mgr>
Status: CLOSED DUPLICATE
QA Contact: Red Hat Kernel QE team <kernel-qe>
Severity: high
Priority: high
Version: 6.0
CC: khong, mst, tburke
Target Milestone: rc
Keywords: TestBlocker
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Last Closed: 2011-03-03 08:07:53 UTC
Bug Depends On: 580951
Attachments: debug (flags: none)

Description Suqin Huang 2011-02-28 08:46:12 UTC
Description of problem:
The host crashes while doing a migration.

Version-Release number of selected component (if applicable):
2.6.32-71.18.1.el6.x86_64

How reproducible:
100% 

Steps to Reproduce:
1.cmd:
qemu-kvm -drive file='/usr/images/RHEL-Server-6.0-64-virtio.qcow2',index=0,if=none,id=drive-virtio-disk1,media=disk,cache=none,format=qcow2,aio=native -device virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk1,id=virtio-disk1 -device virtio-net-pci,netdev=idS61yuA,mac=9a:f1:48:07:df:b8,netdev=idS61yuA,id=ndev00idS61yuA,bus=pci.0,addr=0x3 -netdev tap,id=idS61yuA,vhost=on,script='/usr/scripts/qemu-ifup-switch',downscript='no' -m 2048 -smp 2,cores=1,threads=1,sockets=2 -cpu cpu64-rhel6,+sse2,+x2apic -vnc :1 -rtc base=utc,clock=host,driftfix=none -M rhel6.0.0 -boot order=cdn,once=c,menu=off   -usbdevice tablet -no-kvm-pit-reinjection -enable-kvm  -incoming tcp:0:5200

2.
3.
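
To see at a glance whether a given command line enables vhost, the -netdev argument from step 1 can be split into key/value pairs. The following is a minimal Python sketch added for illustration, not part of the original report; the helper name parse_netdev is hypothetical, and it only handles the simple comma-separated form used here (no escaped commas).

```python
import shlex

def parse_netdev(cmdline):
    """Return the options of the first -netdev argument as a dict.

    Hypothetical helper for illustration only; it assumes the simple
    'type,key=value,...' form and no escaped commas.
    """
    args = shlex.split(cmdline)
    for i, arg in enumerate(args[:-1]):
        if arg == "-netdev":
            backend, *pairs = args[i + 1].split(",")
            opts = {"type": backend}
            for pair in pairs:
                key, _, value = pair.partition("=")
                opts[key] = value
            return opts
    return {}

# Shortened copy of the command from step 1 (only the network options kept).
cmd = ("qemu-kvm -device virtio-net-pci,netdev=idS61yuA,mac=9a:f1:48:07:df:b8 "
       "-netdev tap,id=idS61yuA,vhost=on,"
       "script='/usr/scripts/qemu-ifup-switch',downscript='no' "
       "-incoming tcp:0:5200")

opts = parse_netdev(cmd)
print(opts["type"], opts.get("vhost", "off"))
```

For the command in this report the tap backend is used with vhost=on, which is the configuration the later comments focus on.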
  
Actual results:


Expected results:


Additional info:
1. host
processor	: 3
vendor_id	: AuthenticAMD
cpu family	: 16
model		: 2
model name	: AMD Phenom(tm) 9600B Quad-Core Processor
stepping	: 3
cpu MHz		: 1150.000
flags		: fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nonstop_tsc extd_apicid pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs npt lbrv svm_lock
bogomips	: 4587.44
TLB size	: 1024 4K pages
clflush size	: 64
cache_alignment	: 64
address sizes	: 48 bits physical, 48 bits virtual
power management: ts ttp tm stc 100mhzsteps hwpstate

2. can not reproduce in rhel6.1 host
2.6.32-118.el6.x86_64

3.
crash info:

crash: invalid kernel virtual address: 7180  type: "possible"
WARNING: cannot read cpu_possible_map
crash: seek error: kernel virtual address: ffffffff8208e980  type: "xtime"

BUG: unable to handle kernel paging request at 0000000000001000
IP: [<ffffffff814a024a>] __packet_get_status+0x3a/0x40
PGD 21252b067 PUD 2147bd067
CE: hpet increasing min_delta_ns to 15000 nsec
PMD 0
Oops: 0000 [#1] SMP
last sysfs file: /sys/devices/virtual/net/t0-122919-IluH/flags
CPU 0
Modules linked in: nls_utf8 vhost_net macvtap macvlan tun nfs lockd fscache nfs_acl auth_rpcgss sunrpc cpufreq_ondemand powernow_k8 freq_table bridge stp llc ipv6 dm_mirror dm_region_hash dm_log kvm_amd kvm tpm_infineon wmi serio_raw edac_core edac_mce_amd snd_hda_codec_analog snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd soundcore snd_page_alloc i2c_piix4 sg tg3 shpchp ext4 mbcache jbd2 sr_mod cdrom sd_mod crc_t10dif ahci radeon ttm drm_kms_helper drm i2c_algo_bit i2c_core dm_mod [last unloaded: scsi_wait_scan]

Modules linked in: nls_utf8 vhost_net macvtap macvlan tun nfs lockd fscache nfs_acl auth_rpcgss sunrpc cpufreq_ondemand powernow_k8 freq_table bridge stp llc ipv6 dm_mirror dm_region_hash dm_log kvm_amd kvm tpm_infineon wmi serio_raw edac_core edac_mce_amd snd_hda_codec_analog snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd soundcore snd_page_alloc i2c_piix4 sg tg3 shpchp ext4 mbcache jbd2 sr_mod cdrom sd_mod crc_t10dif ahci radeon ttm drm_kms_helper drm i2c_algo_bit i2c_core dm_mod [last unloaded: scsi_wait_scan]
Pid: 30156, comm: tcpdump Not tainted 2.6.32-71.18.1.el6.x86_64 #1 HP Compaq dc5850 Microtower
RIP: 0010:[<ffffffff814a024a>]  [<ffffffff814a024a>] __packet_get_status+0x3a/0x40
RSP: 0018:ffff880214febaa8  EFLAGS: 00010213
RAX: 0000780000001000 RBX: 0000000000001000 RCX: ffff880214c924c0
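
The oops header above packs several facts into two lines: the faulting address, and the symbol, offset, and function size at the instruction pointer. The Python sketch below is offered only as a reading aid; the function name decode_oops and the regular expressions are assumptions made for this example, not kernel or crash tooling.

```python
import re

# The two header lines copied from the oops above.
OOPS = """\
BUG: unable to handle kernel paging request at 0000000000001000
IP: [<ffffffff814a024a>] __packet_get_status+0x3a/0x40
"""

def decode_oops(text):
    """Extract the fault address and the IP symbol details from an oops header."""
    fault = re.search(r"paging request at ([0-9a-f]+)", text)
    ip = re.search(
        r"IP: \[<([0-9a-f]+)>\] (\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)", text)
    if not fault or not ip:
        raise ValueError("not a recognised oops header")
    addr = int(ip.group(1), 16)
    offset = int(ip.group(3), 16)
    return {
        "fault_address": int(fault.group(1), 16),
        "symbol": ip.group(2),
        "offset": offset,                 # bytes into the function
        "size": int(ip.group(4), 16),     # total size of the function
        "symbol_start": addr - offset,    # address where the function begins
    }

info = decode_oops(OOPS)
print(info["symbol"], info["offset"], info["size"])
print(hex(info["fault_address"]), hex(info["symbol_start"]))
```

The offset 0x3a is 58 in decimal, which matches the `__packet_get_status+58` that crash prints for the exception RIP in the vmcore, and the fault address 0x1000 is the same value that appears in RBX.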

Comment 3 Suqin Huang 2011-02-28 09:29:37 UTC
from the result we tested before, it works in 2.6.32-71.12.1.el6.x86_64

Comment 4 Suqin Huang 2011-03-01 06:11:37 UTC
Created attachment 481528 [details]
debug

Comment 5 Dor Laor 2011-03-01 12:19:29 UTC
(In reply to comment #3)
> from the result we tested before, it works in 2.6.32-71.12.1.el6.x86_64

Do you mean it is a regression?

Comment 6 Dor Laor 2011-03-01 12:21:38 UTC
Will it happen w/o vhost loaded?

Comment 10 Suqin Huang 2011-03-02 10:13:20 UTC
(In reply to comment #5)
> (In reply to comment #3)
> > from the result we tested before, it works in 2.6.32-71.12.1.el6.x86_64
> 
> Do you mean it is a regression?

According to our earlier acceptance test results, it works on 2.6.32-71.12.1.el6.x86_64, but that kernel has since been deleted, so I cannot test it any more. The issue also reproduces on 2.6.32-71.14.1.el6.x86_64.

I am testing with vhost and will try to get a complete log.

Will report the result soon.

Comment 11 Suqin Huang 2011-03-02 11:06:00 UTC
Can reproduce with vhost=on.
1. cmd:
qemu-kvm -drive file='/usr/images/RHEL-Server-6.0-64-virtio.qcow2',index=0,if=none,id=drive-virtio-disk1,media=disk,cache=none,format=qcow2,aio=native -device virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk1,id=virtio-disk1 -device virtio-net-pci,netdev=idvx5Ue1,mac=9a:f1:48:07:aa:1f,id=ndev00idvx5Ue1,bus=pci.0,addr=0x3 -netdev tap,id=idvx5Ue1,vhost=on,script='/usr/scripts/qemu-ifup-switch',downscript='no' -m 2048 -smp 2,cores=1,threads=1,sockets=2 -cpu cpu64-rhel6,+sse2,+x2apic -vnc :1 -rtc base=utc,clock=host,driftfix=none -M rhel6.0.0 -boot order=cdn,once=c,menu=off   -usbdevice tablet -no-kvm-pit-reinjection -enable-kvm  -incoming tcp:0:5200


2. vmcore:

PID: 9495   TASK: ffff88020e9f54e0  CPU: 1   COMMAND: "tcpdump"
 #0 [ffff880215949790] machine_kexec at ffffffff8103697b
 #1 [ffff8802159497f0] crash_kexec at ffffffff810b9078
 #2 [ffff8802159498c0] oops_end at ffffffff814cc900
 #3 [ffff8802159498f0] no_context at ffffffff8104652b
 #4 [ffff880215949940] __bad_area_nosemaphore at ffffffff810467b5
 #5 [ffff880215949990] bad_area_nosemaphore at ffffffff81046883
 #6 [ffff8802159499a0] do_page_fault at ffffffff814ce388
 #7 [ffff8802159499f0] page_fault at ffffffff814cbc75
    [exception RIP: __packet_get_status+58]
    RIP: ffffffff814a024a  RSP: ffff880215949aa8  RFLAGS: 00010213
    RAX: 0000780000001000  RBX: 0000000000001000  RCX: ffff8802141aed80
    RDX: 0000000000000000  RSI: 0000000000001000  RDI: 0000000000001000
    RBP: ffff880215949ab8   R8: ffff880215948000   R9: 0000000000000000
    R10: 0000000000000000  R11: 0000000000000001  R12: 0000000000001000
    R13: ffff8802155d7cc4  R14: ffff88021472aec0  R15: 0000000000000000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #8 [ffff880215949ac0] packet_lookup_frame at ffffffff814a0288
 #9 [ffff880215949ae0] packet_poll at ffffffff814a0d0c
#10 [ffff880215949b10] sock_poll at ffffffff813fb5ca
#11 [ffff880215949b20] do_sys_poll at ffffffff8118274b
#12 [ffff880215949f40] sys_poll at ffffffff81182bcc
#13 [ffff880215949f80] system_call_fastpath at ffffffff81013172
    RIP: 00007fad0e30cdf8  RSP: 00007fff3a7d2d50  RFLAGS: 00010286
    RAX: 0000000000000007  RBX: ffffffff81013172  RCX: ffffffffffffffff
    RDX: 00000000000003e8  RSI: 0000000000000001  RDI: 00007fff3a7d3830
    RBP: 00000000000003e8   R8: 0000000000000000   R9: 0000000000000001
    R10: 0000000000000000  R11: 0000000000000246  R12: 0000000000000000
    R13: 0000000000451980  R14: 00007fff3a7d3830  R15: 0000000001322360
    ORIG_RAX: 0000000000000007  CS: 0033  SS: 002b
crash>
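
The numbered frames in the backtrace above read from the innermost frame (#0) outwards, so the system call path is easier to follow when reversed. The Python sketch below does that for the frames from the page fault down to the system call entry. It is illustrative only: the name call_chain and the regular expression are assumptions for this example, and the faulting function __packet_get_status does not appear in the result because crash lists it as the exception RIP rather than as a numbered frame.

```python
import re

# Frames #7 to #13 copied from the backtrace above.
BT = """\
 #7 [ffff8802159499f0] page_fault at ffffffff814cbc75
 #8 [ffff880215949ac0] packet_lookup_frame at ffffffff814a0288
 #9 [ffff880215949ae0] packet_poll at ffffffff814a0d0c
#10 [ffff880215949b10] sock_poll at ffffffff813fb5ca
#11 [ffff880215949b20] do_sys_poll at ffffffff8118274b
#12 [ffff880215949f40] sys_poll at ffffffff81182bcc
#13 [ffff880215949f80] system_call_fastpath at ffffffff81013172
"""

# One numbered frame: "#N [stack address] function at text address".
FRAME = re.compile(r"^\s*#(\d+) \[[0-9a-f]+\] (\w+) at [0-9a-f]+$", re.M)

def call_chain(bt_text):
    """Return function names ordered from the outermost caller inwards."""
    frames = [(int(num), name) for num, name in FRAME.findall(bt_text)]
    return [name for _, name in sorted(frames, reverse=True)]

chain = call_chain(BT)
print(" -> ".join(chain))
```

Read this way, tcpdump was in poll() on a packet socket, and the fault was taken beneath packet_poll and packet_lookup_frame.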

Comment 12 Michael S. Tsirkin 2011-03-02 11:24:31 UTC
I am confused, sorry.
Which host kernel does have a problem?
Which host kernel does not?

You list one qemu command. Since this is during

You say:
>2. can not reproduce in rhel6.1 host
>2.6.32-118.el6.x86_64

so in which host does it reproduce?
2.6.32-71.18.1.el6.x86_64?

Comment 13 Michael S. Tsirkin 2011-03-02 11:24:59 UTC
Also, does it or does it not reproduce without vhost=on?

Comment 14 Suqin Huang 2011-03-03 04:37:58 UTC
(In reply to comment #12)
> I am confused, sorry.
> Which host kernel does have a problem?
> Which host kernel does not?
> 
> You list one qemu command. Since this is during
> 
> You say:
> >2. can not reproduce in rhel6.1 host
> >2.6.32-118.el6.x86_64
> 
> so in which host does it reproduce?
> 2.6.32-71.18.1.el6.x86_64?

It reproduces on 2.6.32-71.18.1.el6.x86_64.
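
Since the per-kernel results are spread over several comments, the Python sketch below collects them in one place and orders them by release. It is a simplified illustration: release_key is a hypothetical helper that compares only the numeric fields, not a full RPM version comparison, and the entry for 71.12.1 rests on the earlier acceptance test results mentioned in comment 3, which could not be re-run.

```python
import re

# Results as stated in this report; True means the crash reproduced.
RESULTS = {
    "2.6.32-71.12.1.el6.x86_64": False,  # earlier acceptance testing, not re-run
    "2.6.32-71.14.1.el6.x86_64": True,
    "2.6.32-71.18.1.el6.x86_64": True,
    "2.6.32-118.el6.x86_64": False,      # RHEL 6.1 host
}

def release_key(kernel):
    """Sort key from the numeric fields, e.g. (2, 6, 32, 71, 18, 1)."""
    base = kernel.split(".el6")[0]
    return tuple(int(part) for part in re.split(r"[.-]", base))

ordered = sorted(RESULTS, key=release_key)
for kernel in ordered:
    status = "crash" if RESULTS[kernel] else "no crash"
    print(f"{kernel}: {status}")
```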

Comment 15 Michael S. Tsirkin 2011-03-03 07:27:13 UTC
So it's a duplicate of
https://bugzilla.redhat.com/show_bug.cgi?id=623915
?
Does it happen without vhost=on or not?

Comment 16 Suqin Huang 2011-03-03 07:44:39 UTC
(In reply to comment #15)
> So it's a duplicate of
> https://bugzilla.redhat.com/show_bug.cgi?id=623915

It blocks RHEL6.0Z migration testing. Can you clone it to RHEL6.0Z, or change this one to RHEL6.0Z?
> ?
> Does it happen without vhost=on or not?

Repeated 10 times without vhost=on; cannot reproduce.

Comment 17 Michael S. Tsirkin 2011-03-03 08:07:53 UTC
So this is definitely a duplicate of 623915.
Marking it as such.

*** This bug has been marked as a duplicate of bug 623915 ***

Comment 18 Michael S. Tsirkin 2011-03-03 08:11:49 UTC
Re comment 16: please do not enable vhost in 6.0 at all.