Bug 1538906
| Summary: | several beaker jobs on aarch64 can not report panic | ||
|---|---|---|---|
| Product: | [Retired] Beaker | Reporter: | Li Shuang <shuali> |
| Component: | reports | Assignee: | Roman Joost <rjoost> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Matt Tyson 🤬 <mtyson> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 24 | CC: | achatter, dcallagh, jbastian, mjia, mtyson, rjoost |
| Target Milestone: | 25.0 | Keywords: | Patch |
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2018-03-19 04:17:57 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
|
Comment 1
Dan Callaghan
2018-01-26 14:00:19 UTC
Add jbastian to cc... Jeff you have probably seen more than your fair share of arm64 oops messages :-) so I was wondering if you have any opinion here.
Should we just add another pattern to Beaker's panic regex to match the "Internal error: Oops" string? I feel like we are fighting a bit of a losing battle here, if kernel developers keep using any arbitrary formatting and spelling for their oops messages, but maybe there is nothing better we can do.
I was curious what abrt does, since it also catches oops by reading kernel messages. It looks like it has this quite lengthy list of possible patterns it will match on:
https://github.com/abrt/abrt/blob/faf826e9b76f9a0de0c2b046080cf792f1232668/src/lib/kernel.c#L77
although nowhere can I see where it matches on the actual string "Oops" or anything like it. Maybe I am misreading that code.
Just from perusing the kernel source code, I see lots of arbitrary formatting on the ARM side:
$ find . -type f | xargs grep Oops
...
./arm/kernel/traps.c: str = "Oops - BUG";
./arm/kernel/traps.c: arm_notify_die("Oops - undefined instruction", regs, &info, 0, 6);
./arm/kernel/traps.c: die("Oops - bad mode", regs, 0);
./arm/kernel/traps.c: arm_notify_die("Oops - bad syscall", regs, &info, n, 0);
./arm/kernel/traps.c: arm_notify_die("Oops - bad syscall(2)", regs, &info, no, 0);
./arm/kernel/traps.c: panic("Oops failed to kill thread");
./arm/mm/alignment.c: * Oops, we didn't handle the instruction.
./arm/mm/fault.c: * Oops. The kernel tried to access some page that wasn't present.
./arm/mm/fault.c: die("Oops", regs, fsr);
./arm64/kernel/traps.c: die("Oops - bad mode", regs, 0);
./arm64/kernel/traps.c: die("Oops - BUG", regs, 0);
./arm64/mm/fault.c: die("Oops", regs, esr);
./arm64/mm/fault.c: arm64_notify_die("Oops - SP/PC alignment exception", regs, &info, esr);
...
The x86 Oops messages are wrapped in a single function to give consistent formatting:
./x86/mm/fault.c: if (__die("Oops", regs, error_code))
The __die() function adds the colon:
int __die(const char *str, struct pt_regs *regs, long err)
{
...
printk(KERN_DEFAULT
"%s: %04lx [#%d]%s%s%s%s\n", str, err & 0xffff, ++die_counter,
...
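To make the effect of that format string concrete, here is a minimal Python sketch of the prefix it produces. The function name `format_die_header` is hypothetical, and only the `%s: %04lx [#%d]` part is reproduced, since the remaining `%s` arguments are elided in the excerpt above.

```python
def format_die_header(label, err, die_counter):
    """Mimic the leading "%s: %04lx [#%d]" part of the printk format.

    Hypothetical helper for illustration only; the trailing %s fields
    from the kernel excerpt are not reproduced here.
    """
    return '%s: %04x [#%d]' % (label, err & 0xffff, die_counter)


print(format_die_header('Oops', 0x2, 1))  # prints "Oops: 0002 [#1]"
```

So on x86 the string "Oops" is always followed directly by a colon, which is the consistency the ARM call sites lack.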
I suppose a safe regex would be to search for the string Oops surrounded by word boundaries, e.g., \bOops\b
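The word-boundary idea can be checked quickly against the message strings quoted in this bug. This is only a sketch in Python: the names `OOPS_PATTERN`, `SAMPLES` and `matches_oops` are made up for illustration, and this is not Beaker's actual panic detection code.

```python
import re

# Hypothetical pattern name; \b requires "Oops" to stand as a whole word.
OOPS_PATTERN = re.compile(r'\bOops\b')

# Strings taken from the kernel sources and the patch test quoted in this bug,
# plus one made-up negative example.
SAMPLES = [
    'Internal error: Oops - SP/PC alignment exception: 8a000000 [#1] SMP',
    'Oops - BUG',
    'Oops - bad mode',
    'Oopsie',  # made-up: no word boundary after "Oops", so no match
]


def matches_oops(line):
    """Return True if the line contains "Oops" as a whole word."""
    return OOPS_PATTERN.search(line) is not None


for line in SAMPLES:
    print(matches_oops(line), line)
```

Note that the match is case-sensitive and does not care what surrounds the word, so it covers both the "Oops:" and "Oops - ..." spellings.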
At the risk of having missed the mark with this fix, or of clashing with Dan's progress, I put up a patch: https://gerrit.beaker-project.org/c/5989/
Injecting the following string into the dmesg log results in the expected failure during a Beaker run:
echo 'Internal error: Oops - SP/PC alignment exception: 8a000000 [#1] SMP' > /dev/kmsg
Beaker 25.0 has been released. Release notes are available upstream: https://beaker-project.org/docs/whats-new/release-25.html
This new pattern is too broad: it now triggers if someone puts "Oops" into their test case name (bug 1572880). We need to find a narrower one...
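The false positive is easy to reproduce with a bare whole-word match on "Oops". The Python sketch below also tries one possible narrowing, which additionally requires the `[#N]` die counter seen in the kernel messages quoted in this bug. Everything here is an assumption for illustration: the pattern names, the test case name, and the narrower pattern itself, which is not the fix adopted in bug 1572880.

```python
import re

# Broad whole-word match on "Oops" (assumed here to be what over-triggers).
BROAD = re.compile(r'\bOops\b')

# Hypothetical narrower pattern: "Oops" followed later on the same line by a
# die counter such as "[#1]". An assumption, not Beaker's actual regex.
NARROW = re.compile(r'\bOops\b.*\[#\d+\]')

kernel_line = 'Internal error: Oops - SP/PC alignment exception: 8a000000 [#1] SMP'
test_name_line = 'Running test /examples/Oops/sanity'  # made-up test case name

for name, pattern in (('broad', BROAD), ('narrow', NARROW)):
    print(name,
          bool(pattern.search(kernel_line)),
          bool(pattern.search(test_name_line)))
```

A pattern like this would still miss any oops message printed without a die counter, so it trades one kind of error for another and would need checking against real console logs.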