Bug 1812261
| Summary: | iptables-restore is segfaulting multiple times during an e2e run on multiple nodes | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Clayton Coleman <ccoleman> |
| Component: | Networking | Assignee: | Aniket Bhat <anbhat> |
| Networking sub component: | openshift-sdn | QA Contact: | zhaozhanqi <zzhao> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | urgent | | |
| Priority: | high | CC: | anbhat, bbennett, dcbw, iptables-maint-list, miabbott, nagrawal, pprinett, todoleza, wking |
| Version: | 4.4 | Keywords: | Reopened |
| Target Milestone: | --- | | |
| Target Release: | 4.4.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2020-05-04 11:45:45 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | | | |
Description
Clayton Coleman
2020-03-10 21:04:19 UTC
Workers journal in https://gcsweb-ci.svc.ci.openshift.org/gcs/origin-ci-test/pr-logs/pull/24654/pull-ci-openshift-origin-master-e2e-gcp/6227/artifacts/e2e-gcp/nodes/ should have it (looking at aws runs now)

On aws https://gcsweb-ci.svc.ci.openshift.org/gcs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-aws-4.5/217/artifacts/e2e-aws/nodes/

$ grep "segfault" ~/Downloads/workers-journal-10
Mar 10 19:45:50.161776 ip-10-0-138-153 kernel: iptables-save[16510]: segfault at 80 ip 00007f24fd2e42c4 sp 00007ffc11488818 error 4 in libnftnl.so.11.2.0[7f24fd2d5000+2c000]
Mar 10 19:54:03.107491 ip-10-0-138-153 kernel: iptables-save[84918]: segfault at 80 ip 00007fd47025f2c4 sp 00007ffccb82f928 error 4 in libnftnl.so.11.2.0[7fd470250000+2c000]
Mar 10 19:51:17.970909 ip-10-0-138-186 kernel: iptables[66915]: segfault at 80 ip 00007f9c48fbf2c4 sp 00007ffc101e7478 error 4 in libnftnl.so.11.2.0[7f9c48fb0000+2c000]
Mar 10 19:51:18.817755 ip-10-0-138-186 kernel: iptables-save[67093]: segfault at 80 ip 00007f6f33b502c4 sp 00007ffc39808508 error 4 in libnftnl.so.11.2.0[7f6f33b41000+2c000]
Mar 10 19:51:19.626851 ip-10-0-138-186 kernel: iptables-restor[67290]: segfault at 18 ip 00007f43c4e3b49b sp 00007fff78708e10 error 4 in libnftnl.so.11.2.0[7f43c4e25000+2c000]
Mar 10 19:51:22.600391 ip-10-0-138-186 kernel: iptables-save[67918]: segfault at 80 ip 00007f1382f512c4 sp 00007ffd18d08ce8 error 4 in libnftnl.so.11.2.0[7f1382f42000+2c000]
Mar 10 19:51:23.237457 ip-10-0-138-186 kernel: iptables-restor[67953]: segfault at ffffffe0 ip 00007faf5f469d07 sp 00007fff05b4cad8 error 4 in libc-2.28.so[7faf5f30c000+1b9000]
Mar 10 20:01:44.407400 ip-10-0-138-186 kernel: iptables-save[201004]: segfault at 80 ip 00007f1dbcd7a2c4 sp 00007ffe106d85c8 error 4 in libnftnl.so.11.2.0[7f1dbcd6b000+2c000]
Mar 10 20:03:23.744734 ip-10-0-138-186 kernel: iptables-restor[224900]: segfault at 99 ip 00007fbb8aebe49b sp 00007ffd1ed46210 error 4 in libnftnl.so.11.2.0[7fbb8aea8000+2c000]
Mar 10 19:55:42.073071 ip-10-0-150-43 kernel: iptables-save[127375]: segfault at 80 ip 00007f9b899e42c4 sp 00007ffdb5cb8ab8 error 4 in libnftnl.so.11.2.0[7f9b899d5000+2c000]
Mar 10 19:57:00.115999 ip-10-0-150-43 kernel: iptables-restor[137883]: segfault at 50492060 ip 00007fab88704d07 sp 00007ffda600c138 error 4 in libc-2.28.so[7fab885a7000+1b9000]
Mar 10 19:58:32.519735 ip-10-0-150-43 kernel: iptables-save[156848]: segfault at 80 ip 00007f0e312892c4 sp 00007ffc5fbec928 error 4 in libnftnl.so.11.2.0[7f0e3127a000+2c000]
Mar 10 19:58:32.552517 ip-10-0-150-43 kernel: iptables-restor[156843]: segfault at 0 ip 00007f76a7dbeebd sp 00007ffcd653b710 error 4 in libnftnl.so.11.2.0[7f76a7db2000+2c000]
Mar 10 20:06:31.309840 ip-10-0-150-43 kernel: iptables-save[253307]: segfault at 80 ip 00007fc83b4d32c4 sp 00007ffc57aa1978 error 4 in libnftnl.so.11.2.0[7fc83b4c4000+2c000]
Mar 10 20:06:40.633635 ip-10-0-150-43 kernel: iptables-save[255938]: segfault at 80 ip 00007ff0611e62c4 sp 00007ffcc9fe8938 error 4 in libnftnl.so.11.2.0[7ff0611d7000+2c000]
Mar 10 20:06:45.897857 ip-10-0-150-43 kernel: iptables-save[258365]: segfault at 80 ip 00007f70822a82c4 sp 00007ffdb5fb40b8 error 4 in libnftnl.so.11.2.0[7f7082299000+2c000]
Mar 10 20:06:53.613746 ip-10-0-150-43 kernel: iptables-save[260152]: segfault at 80 ip 00007fcd78eef2c4 sp 00007ffc61839f18 error 4 in libnftnl.so.11.2.0[7fcd78ee0000+2c000]

Happening in 4.4, release blocker:

https://gcsweb-ci.svc.ci.openshift.org/gcs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-aws-4.4/1908/artifacts/e2e-aws/nodes/

$ grep "segfault" ~/Downloads/workers-journal-12
Mar 10 20:51:43.912181 ip-10-0-134-209 kernel: iptables-save[294911]: segfault at 80 ip 00007f227d2d72c4 sp 00007ffc111c62c8 error 4 in libnftnl.so.11.2.0[7f227d2c8000+2c000]
Mar 10 20:40:00.021182 ip-10-0-134-31 kernel: iptables-restor[144305]: segfault at 7f2066ffddb0 ip 00007f2032141394 sp 00007ffdcbb6d568 error 4 in libc-2.28.so[7f2031fe3000+1b9000]
Mar 10 20:41:00.000713 ip-10-0-134-31 kernel: iptables[155669]: segfault at 80 ip 00007f2262e922c4 sp 00007ffff81591f8 error 4 in libnftnl.so.11.2.0[7f2262e83000+2c000]
Mar 10 20:33:44.692800 ip-10-0-156-27 kernel: iptables-save[77075]: segfault at 80 ip 00007fea9494b2c4 sp 00007ffe473c4848 error 4 in libnftnl.so.11.2.0[7fea9493c000+2c000]
Mar 10 20:38:34.544858 ip-10-0-156-27 kernel: iptables-save[133984]: segfault at 80 ip 00007f0ba2dd72c4 sp 00007fffa1ed21d8 error 4 in libnftnl.so.11.2.0[7f0ba2dc8000+2c000]
Mar 10 20:43:44.706331 ip-10-0-156-27 kernel: iptables-restor[190893]: segfault at 0 ip 00007fbc49429cd5 sp 00007ffe34c19898 error 4 in libc-2.28.so[7fbc492cc000+1b9000]

*** Bug 1811342 has been marked as a duplicate of this bug. ***

Clayton, are we able to get either:
1) coredumps
2) 'iptables-save' output on the node?

I know we don't run the networking bits of must-gather by default, but this would be a great time to have that info :(

Also what specific RPM version of iptables is installed on whatever version of RHCOS is running on the node.

Nevermind, Phil is all over it in bug 1807811

*** This bug has been marked as a duplicate of bug 1807811 ***

Created attachment 1669423 [details]
coredump from ci run
Taken from a 4.5 master run.
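
When triaging kernel segfault lines like the ones quoted in the description, it can help to subtract the library's load base from the instruction pointer, so that crashes at the same code location line up regardless of address randomization. The following is a minimal sketch of that idea, not tooling used in this bug; the names `parse_segfaults`, `summarize`, and `SAMPLE` are hypothetical, and the regular expression only assumes the line layout seen in the journal excerpts above.

```python
import re
from collections import Counter

# Matches the kernel segfault lines as they appear in the journal excerpts:
#   kernel: <comm>[<pid>]: segfault at <addr> ip <ip> sp <sp> error <n> in <lib>[<base>+<size>]
SEGFAULT_RE = re.compile(
    r"kernel:\s+(?P<comm>[\w.-]+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>\d+) "
    r"in (?P<lib>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def parse_segfaults(text):
    """Yield (command, library, offset) for each segfault line in text.

    offset is the instruction pointer minus the mapping base reported
    by the kernel, i.e. the crash location relative to that mapping.
    """
    for m in SEGFAULT_RE.finditer(text):
        offset = int(m["ip"], 16) - int(m["base"], 16)
        yield m["comm"], m["lib"], offset


def summarize(text):
    """Count crashes per (command, library, hex offset)."""
    return Counter(
        (comm, lib, hex(offset)) for comm, lib, offset in parse_segfaults(text)
    )


# Three lines copied from the description, used here only as sample input.
SAMPLE = """\
Mar 10 19:45:50.161776 ip-10-0-138-153 kernel: iptables-save[16510]: segfault at 80 ip 00007f24fd2e42c4 sp 00007ffc11488818 error 4 in libnftnl.so.11.2.0[7f24fd2d5000+2c000]
Mar 10 19:51:19.626851 ip-10-0-138-186 kernel: iptables-restor[67290]: segfault at 18 ip 00007f43c4e3b49b sp 00007fff78708e10 error 4 in libnftnl.so.11.2.0[7f43c4e25000+2c000]
Mar 10 19:51:23.237457 ip-10-0-138-186 kernel: iptables-restor[67953]: segfault at ffffffe0 ip 00007faf5f469d07 sp 00007fff05b4cad8 error 4 in libc-2.28.so[7faf5f30c000+1b9000]
"""

for key, count in summarize(SAMPLE).items():
    print(count, *key)
```

Run against the full journal, a summary like this would show whether the "segfault at 80" entries in libnftnl.so.11.2.0 share one offset (they do in the lines above: 0xf2c4), which is a hint that they come from a single faulting code path. The offset is relative to the mapping the kernel reports, so resolving it to a symbol still needs the matching binary and debuginfo or a coredump.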
*** Bug 1813214 has been marked as a duplicate of this bug. ***

`iptables-1.8.4-10.el8` with the fix landed in the following RHCOS versions:

45.81.202003180406-0
44.81.202003172130-0
43.81.202003172053.0

These should have been picked up by the respective CI/nightly release payloads, as well. Let me know if the issue persists.

Checking on the promotion, this is the current, promoted machine-os-content for 4.4:

$ oc image info --output json registry.svc.ci.openshift.org/ocp/4.4:machine-os-content | jq -r .config.config.Labels.version
44.81.202003180730-0

so we should be good to go there. Not sure which nightly that went out with, but it's at least in:

$ oc image info --output json $(oc adm release info --image-for=machine-os-content registry.svc.ci.openshift.org/ocp/release:4.4.0-0.nightly-2020-03-18-102708) | jq -r .config.config.Labels.version
44.81.202003180730-0

Did not find this issue on 4.4.0-0.nightly-2020-03-22-214549 and 4.4.0-0.nightly-2020-03-23-010639 with rhcos image 44.81.202003192230-0

Verified this bug.

*** Bug 1814334 has been marked as a duplicate of this bug. ***

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0581
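
The check done by hand above (comparing the machine-os-content version label against the first RHCOS build carrying the fixed iptables) can be sketched as a small helper. This is an illustration under stated assumptions, not an official tool: the version layout `<stream>.<rhel>.<12-digit timestamp><- or .><revision>` is inferred only from the strings quoted in this bug, and `FIXED_BUILDS`, `parse_version`, and `has_fix` are hypothetical names.

```python
import re

# First RHCOS builds reported in this bug to contain iptables-1.8.4-10.el8.
FIXED_BUILDS = [
    "45.81.202003180406-0",
    "44.81.202003172130-0",
    "43.81.202003172053.0",
]

# Assumed layout, inferred from the version strings quoted in this bug only.
VERSION_RE = re.compile(r"^(\d+)\.(\d+)\.(\d{12})[-.](\d+)$")


def parse_version(version):
    """Split a version string into ((stream, rhel), (timestamp, revision))."""
    m = VERSION_RE.match(version.strip())
    if m is None:
        raise ValueError(f"unrecognized RHCOS version: {version!r}")
    stream, rhel, stamp, rev = m.groups()
    # The timestamp is fixed-width, so comparing it as a string is safe.
    return (int(stream), int(rhel)), (stamp, int(rev))


def has_fix(version):
    """Return True/False if the stream is known, None if it is not listed."""
    stream, build = parse_version(version)
    for fixed in FIXED_BUILDS:
        fixed_stream, fixed_build = parse_version(fixed)
        if fixed_stream == stream:
            return build >= fixed_build
    return None


# Versions taken from the comments above.
for v in ("44.81.202003180730-0", "44.81.202003192230-0"):
    print(v, has_fix(v))
```

The input would be the string printed by the `jq -r .config.config.Labels.version` commands above. Returning `None` for an unlisted stream is a deliberate choice: the bug only names fixed builds for 4.3, 4.4, and 4.5, so the sketch does not guess about anything else.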