| Summary: | Beaker can't detect kernel crash sometimes | | |
|---|---|---|---|
| Product: | [Retired] Beaker | Reporter: | Jianwen Ji <jiji> |
| Component: | beah | Assignee: | beaker-dev-list |
| Status: | CLOSED NOTABUG | QA Contact: | tools-bugs <tools-bugs> |
| Severity: | high | Docs Contact: | |
| Priority: | high | | |
| Version: | 22 | CC: | dcallagh, mjia, rjoost, zhchen |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2016-04-06 03:49:16 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
|
Comment 1
Roman Joost
2016-03-31 23:41:04 UTC
(In reply to Roman Joost from comment #1)
> Dear Ji,
>
> thank you for your bug report.
>
> Beaker detects Kernel panics and sets the corresponding recipe status to
> Aborted with the result of PANIC. Looking closely into the console log of
> your job, it looks like the kernel was unable to raise a Panic:
>
> RIP [<ffffffff81469f6d>] skb_under_panic+0x5d/0x70
> RSP <ffff880832457818>
>
> and then reboots.
>
> There is nothing Beaker can detect here, since the logs don't show any
> information of a Kernel panic.
>
> Thus I think this is not a bug and I'd like to close this as CLOSED NOTABUG.

Can we make beaker support it? Here is the following kernel crash message:

<2>kernel BUG at net/core/skbuff.c:152!
<4>invalid opcode: 0000 [#1] SMP
<4>last sysfs file: /sys/devices/system/cpu/online
<4>CPU 31
<4>Modules linked in: cpufreq_ondemand freq_table pcc_cpufreq ipv6 power_meter acpi_ipmi ipmi_si ipmi_msghandler microcode iTCO_wdt iTCO_vendor_support hpilo hpwdt bnx2x libcrc32c mdio igb dca i2c_algo_bit i2c_core ptp pps_core serio_raw sg lpc_ich mfd_core shpchp ext4 jbd2 mbcache sd_mod crc_t10dif sr_mod cdrom hpsa pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
<4>
<4>Pid: 4522, comm: client Not tainted 2.6.32-636.el6.x86_64 #1 HP ProLiant DL380p Gen8
<4>RIP: 0010:[<ffffffff81469f9d>] [<ffffffff81469f9d>] skb_under_panic+0x5d/0x70
<4>RSP: 0018:ffff88082792b7f8 EFLAGS: 00010296
<4>RAX: 0000000000000083 RBX: ffff8810315141c0 RCX: 00000000000012fe
<4>RDX: 0000000000000000 RSI: 0000000000000046 RDI: 0000000000000246
<4>RBP: ffff88082792b818 R08: 0000000000018cb7 R09: 00000000fffffffb
<4>R10: 0000000000000001 R11: 0000000000000008 R12: ffff8810315141c0
<4>R13: 000000000000000e R14: 0000000000000000 R15: ffff88102ff3c4d8
<4>FS: 00007f45fbdac700(0000) GS:ffff88085c560000(0000) knlGS:0000000000000000
<4>CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
<4>CR2: 00007f45fc14fdf0 CR3: 0000000831b71000 CR4: 00000000001407e0
<4>DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
<4>DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
<4>Process client (pid: 4522, threadinfo ffff880827928000, task ffff88082fef6ab0)
<4>Stack:
<4> 0000000000000048 0000000000000080 ffff8808335db020 ffffffff814a8ef6
<4><d> ffff88082792b838 ffffffff8146ad90 0000000a2792b898 ffff88102ff3c480
<4><d> ffff88082792b888 ffffffffa0302e14 ffff88082792b8f8 ffff88102ff3c4d0
<4>Call Trace:
<4> [<ffffffff814a8ef6>] ? nf_hook_slow+0x76/0x120
<4> [<ffffffff8146ad90>] skb_push+0x40/0x50
<4> [<ffffffffa0302e14>] ip6_output_finish+0x84/0x120 [ipv6]
<4> [<ffffffffa0305fbb>] ip6_output2+0x2bb/0x2d0 [ipv6]
<4> [<ffffffffa030605c>] ip6_output+0x8c/0x150 [ipv6]
<4> [<ffffffffa0305775>] ip6_local_out+0x25/0x30 [ipv6]
<4> [<ffffffffa0305aca>] ip6_push_pending_frames+0x34a/0x580 [ipv6]
<4> [<ffffffffa031ba0f>] udp_v6_push_pending_frames+0x16f/0x400 [ipv6]
<4> [<ffffffffa031c8a2>] udpv6_sendmsg+0x942/0xd70 [ipv6]
<4> [<ffffffff81242501>] ? avc_has_perm+0x71/0x90
<4> [<ffffffff814e625a>] inet_sendmsg+0x4a/0xb0
<4> [<ffffffff812437af>] ? selinux_socket_sendmsg+0x1f/0x30
<4> [<ffffffff81463553>] sock_sendmsg+0x123/0x150
<4> [<ffffffff81064c5e>] ? account_entity_enqueue+0x7e/0x90
<4> [<ffffffff810a6aa0>] ? autoremove_wake_function+0x0/0x40
<4> [<ffffffff81074f5b>] ? enqueue_task_fair+0xfb/0x100
<4> [<ffffffff8106b80c>] ? try_to_wake_up+0x24c/0x3e0
<4> [<ffffffff814633a4>] ? move_addr_to_kernel+0x64/0x70
<4> [<ffffffff81464d36>] __sys_sendmsg+0x406/0x420
<4> [<ffffffff810ac10f>] ? up_read+0x1f/0x30
<4> [<ffffffff81052204>] ? __do_page_fault+0x1f4/0x500
<4> [<ffffffff8113edf0>] ? __free_pages+0x60/0xa0
<4> [<ffffffff8113ee79>] ? free_pages+0x49/0x50
<4> [<ffffffff810ee45e>] ? __audit_syscall_exit+0x25e/0x290
<4> [<ffffffff81464f59>] sys_sendmsg+0x49/0x90
<4> [<ffffffff8100b0d2>] system_call_fastpath+0x16/0x1b
<4>Code: 8b 57 68 48 89 44 24 10 8b 87 cc 00 00 00 48 89 44 24 08 8b bf c8 00 00 00 31 c0 48 89 3c 24 48 c7 c7 78 a2 82 81 e8 c6 cf 0d 00 <0f> 0b eb fe 66 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 55 48
<1>RIP [<ffffffff81469f9d>] skb_under_panic+0x5d/0x70
<4> RSP <ffff88082792b7f8>

Dear Ji,

the crash message you've posted is not a kernel crash, but just a kernel BUG. Beaker can recognize this as well but it doesn't mean that it will halt the recipe. The actual crash would halt the recipe as I explained above. There is nothing more to support here, since it is already supported.

Specifically, the kernel "BUG" traces will be captured by rhts-report-result (report_result function) when your task reports any result. It will show up as a failure in the dmesg check. But in this particular case, the kernel BUG was printed and then the machine immediately rebooted -- so the dmesg checks never had a chance to finish because it crashed before rhts-report-result was even reached.

Hi Roman & Dan,

Got it. Many thanks for your comments!

*** Bug 1337381 has been marked as a duplicate of this bug. ***
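For readers following the distinction drawn above between a kernel panic and a kernel BUG, here is a rough Python sketch of how a console log could be sorted into those two cases. The two regular expressions and the function name `classify_console_log` are assumptions made for illustration; they are not taken from Beaker's code or configuration, and the patterns Beaker actually uses may differ.

```python
import re

# Hypothetical patterns, for illustration only; not Beaker's real configuration.
PANIC_PATTERN = re.compile(r"Kernel panic")
BUG_PATTERN = re.compile(r"kernel BUG at \S+:\d+!")


def classify_console_log(log_text):
    """Return 'panic', 'bug', or 'clean' for a console log string."""
    if PANIC_PATTERN.search(log_text):
        return "panic"
    if BUG_PATTERN.search(log_text):
        return "bug"
    return "clean"


# A few lines from the log posted in this bug.
sample = (
    "<2>kernel BUG at net/core/skbuff.c:152!\n"
    "<4>invalid opcode: 0000 [#1] SMP\n"
    "<1>RIP [<ffffffff81469f9d>] skb_under_panic+0x5d/0x70\n"
)

print(classify_console_log(sample))
```

The function name `skb_under_panic` contains the word "panic", but the log has no "Kernel panic" line, so under these assumed patterns the excerpt is classified as a BUG trace, not a panic.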