Bug 1665248
| Summary: | Kernel panic in msgr-worker while running with kernel 3.10.0-957.1.3.el7.x86_64 | ||
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Ceph Storage | Reporter: | rom |
| Component: | RBD | Assignee: | Ilya Dryomov <idryomov> |
| Status: | CLOSED DUPLICATE | QA Contact: | ceph-qe-bugs <ceph-qe-bugs> |
| Severity: | urgent | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 4.0 | CC: | ceph-eng-bugs, jdillama |
| Target Milestone: | rc | ||
| Target Release: | 4.0 | ||
| Hardware: | x86_64 | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2019-02-12 20:56:18 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Description
rom
2019-01-10 19:29:24 UTC
This has been fixed in https://bugzilla.redhat.com/show_bug.cgi?id=1647460. I'll try to expedite the backport to 7.6.z.

Nice!!! I don't have access to that bug. How can I see its content? Also, any idea what kernel I have to go back to in the meantime? I cannot keep my system up and running with such frequent panics.

Any 7.5 (i.e. 862) kernel.

Hmm. Will I be able to run a 7.5 kernel on a 7.6 distro? Or should I revert the distro as well? Can you elaborate on the root cause? Is there any way I can just avoid it? Stop using rbd?

This bug is specific to the co-location scenario. If you avoid co-locating the kernel client on the OSD nodes (i.e. map your rbd devices on a separate node), you shouldn't see it. The root cause is an unfortunate interaction between one of the new kernel hardening asserts in 7.6 and the way loopback works. Nothing is actually wrong, but under certain circumstances the new assert gets triggered when the OSD attempts to receive packets from the kernel client over loopback.

I'm not sure about a 7.5 kernel on a 7.6 distro. Reverting the distro is obviously more reliable, but if you can avoid co-location for now, you shouldn't have to.

Got it. Any ETA for the fix?

Nothing definitive, watch for "libceph: fall back to sendmsg for slab pages" in 7.6 advisories.

BTW another workaround would be to temporarily switch to ext4 on top of your rbd devices instead of xfs. Again, nothing is actually wrong, but it just so happens that one of the conditions needed to trigger this assert should never be true with ext4, so if your images are ephemeral or can be easily recreated with ext4, you can keep on co-locating.

*** This bug has been marked as a duplicate of bug 1647460 ***
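For anyone triaging a node against the conditions described in this thread (7.6 kernel, kernel rbd client co-located with an OSD, xfs on the rbd device), here is a rough Python sketch. It is only an illustration: the function names are made up, the build number 957 is taken from the kernel in the summary, and the check cannot tell whether a given 957 kernel already carries the "libceph: fall back to sendmsg for slab pages" fix. Whether an OSD runs on the node is passed in by the caller rather than detected.

```python
import re


def el7_build(release):
    """Return the RHEL 7 kernel build number (e.g. 957), or None if not an el7 kernel."""
    m = re.match(r"3\.10\.0-(\d+)(?:\.[\d.]+)?\.el7", release)
    return int(m.group(1)) if m else None


def xfs_on_rbd(mounts_text):
    """List (device, mountpoint) pairs for xfs filesystems on /dev/rbd* devices.

    mounts_text is expected to be in /proc/mounts format:
    device mountpoint fstype options dump pass
    """
    hits = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0].startswith("/dev/rbd") and fields[2] == "xfs":
            hits.append((fields[0], fields[1]))
    return hits


def possibly_affected(release, mounts_text, osd_on_this_node):
    """True if all three conditions from the thread hold.

    Assumption: build 957 stands in for "7.6 kernel"; a patched 957.x.y
    kernel would still be flagged, so treat a True result as a hint only.
    """
    return (
        osd_on_this_node
        and el7_build(release) == 957
        and bool(xfs_on_rbd(mounts_text))
    )
```

On a live node one could pass `platform.release()` as `release` and the contents of `/proc/mounts` as `mounts_text`. A 7.5 (862) kernel, an ext4 filesystem on the rbd device, or mapping the device on a non-OSD node each makes `possibly_affected` return False, matching the workarounds mentioned above.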