Bug 1954855

Summary: northd crashes with OCP on upgrade
Product: Red Hat Enterprise Linux Fast Datapath
Reporter: Tim Rozet <trozet>
Component: OVN
Assignee: OVN Team <ovnteam>
Status: CLOSED DUPLICATE
QA Contact: Jianlin Shi <jishi>
Severity: high
Priority: medium
Version: RHEL 8.0
CC: ctrautma, dcbw
Last Closed: 2022-02-16 21:53:55 UTC
Type: Bug
Bug Blocks: 1953613, 1956358    

Description Tim Rozet 2021-04-28 22:16:54 UTC
Description of problem:
On multiple master nodes I see northd cores. In dmesg:
[327088.861386] traps: ovn-northd[1074285] general protection fault ip:7f34897f1d21 sp:7ffeea050a80 error:0 in libc-2.28.so[7f34897d0000+1b9000]
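
The trap line gives both the faulting instruction pointer and the base and size of the libc mapping, so the offset of the fault inside libc can be worked out by subtraction. The following is a minimal sketch of that arithmetic, not part of the original report: the regular expression and the helper name `fault_offset` are illustrative assumptions, and the result is only a file-relative offset if the mapping shown starts at file offset 0.

```python
import re

# Trap line copied verbatim from the dmesg output above.
TRAP_LINE = (
    "[327088.861386] traps: ovn-northd[1074285] general protection fault "
    "ip:7f34897f1d21 sp:7ffeea050a80 error:0 in libc-2.28.so[7f34897d0000+1b9000]"
)

# Hypothetical pattern for this one line shape: ip:<hex> ... in <lib>[<base>+<size>]
TRAP_RE = re.compile(
    r"ip:(?P<ip>[0-9a-f]+) .* in (?P<lib>[^\s\[]+)"
    r"\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def fault_offset(line):
    """Return (library name, offset of the faulting ip inside its mapping)."""
    match = TRAP_RE.search(line)
    if match is None:
        raise ValueError("line does not look like a trap message")
    ip = int(match["ip"], 16)
    base = int(match["base"], 16)
    size = int(match["size"], 16)
    offset = ip - base
    # Sanity check: the ip must fall inside the mapping the kernel reported.
    if not 0 <= offset < size:
        raise ValueError("ip lies outside the reported mapping")
    return match["lib"], offset


lib, offset = fault_offset(TRAP_LINE)
print(f"{lib}+{offset:#x}")  # prints "libc-2.28.so+0x21d21"
```

With debuginfo for the exact libc build installed, an offset like this could be looked up with a tool such as addr2line; that step is not shown here because it depends on the matching glibc package.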

Version-Release number of selected component (if applicable):
ovn2.13-20.12.0-24.el8fdp.x86_64

The OCP upgrade is from 4.7.3 to 4.7.6, both of which use the same version of OVN.

Will attach core.

Comment 2 Dan Williams 2021-04-29 00:43:39 UTC
Note that there is no relevant difference between 4.7.3 and 4.7.6 in terms of RPMs. The only things that changed were NetworkManager, crio, and a few others.

Comment 3 Dan Williams 2021-04-29 01:31:27 UTC
Core was generated by `ovn-northd --no-chdir -vconsole:info -vfile:off --ovnnb-db ssl:172.21.8.40:9641'.
Program terminated with signal 11, Segmentation fault.
#0  0x00007f73b8c11d21 in __gconv_lookup_cache () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install glibc-2.17-324.el7_9.x86_64 libcap-ng-0.7.5-4.el7.x86_64 zlib-1.2.7-19.el7_9.x86_64
(gdb) bt
#0  0x00007f73b8c11d21 in __gconv_lookup_cache () from /lib64/libc.so.6
#1  0x0000000000000000 in ?? ()
(gdb)

Comment 4 Dan Williams 2021-04-29 01:43:46 UTC
core.ovn-northd.0.f78fa98b65584355b04b61fdb11da53e.4029.1619103625000000 (same as above)
---
Core was generated by `ovn-northd --no-chdir -vconsole:info -vfile:off --ovnnb-db ssl:172.21.8.40:9641'.
Program terminated with signal 11, Segmentation fault.
#0  0x00007f73b8c11d21 in __gconv_lookup_cache () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install glibc-2.17-324.el7_9.x86_64 libcap-ng-0.7.5-4.el7.x86_64 zlib-1.2.7-19.el7_9.x86_64
(gdb) bt
#0  0x00007f73b8c11d21 in __gconv_lookup_cache () from /lib64/libc.so.6
#1  0x0000000000000000 in ?? ()
(gdb)


core.ovn-northd.0.8e76dd4e0f6b41e5acc7184ebb6282db.415827.1619172149000000
---
Core was generated by `ovn-northd --no-chdir -vconsole:info -vfile:off --ovnnb-db ssl:172.21.8.40:9641'.
Program terminated with signal 11, Segmentation fault.
#0  0x00007fc063c55d21 in __gconv_lookup_cache () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install glibc-2.17-324.el7_9.x86_64 libcap-ng-0.7.5-4.el7.x86_64 zlib-1.2.7-19.el7_9.x86_64
(gdb) bt
#0  0x00007fc063c55d21 in __gconv_lookup_cache () from /lib64/libc.so.6
#1  0x0000000000000000 in ?? ()
(gdb) 


core.ovn-northd.0.ece4d0223aa04befa987e38ecb7b9869.1063219.1619104182000000
---
Core was generated by `ovn-northd --no-chdir -vconsole:info -vfile:off --ovnnb-db ssl:172.21.8.40:9641'.
Program terminated with signal 11, Segmentation fault.
#0  0x00007f000035ed21 in __gconv_lookup_cache () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install glibc-2.17-324.el7_9.x86_64 libcap-ng-0.7.5-4.el7.x86_64 zlib-1.2.7-19.el7_9.x86_64
(gdb) bt
#0  0x00007f000035ed21 in __gconv_lookup_cache () from /lib64/libc.so.6
#1  0x0000000000000000 in ?? ()
(gdb)
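
The frame #0 addresses differ between cores, but shared libraries are mapped on page boundaries, so the low 12 bits of an address survive address-space randomization. A quick way to compare the cores is to mask those bits. This is a sketch assuming 4 KiB pages; the dictionary and the helper name `page_offsets` are illustrative, and the addresses are copied from the dmesg line in the description and the three backtraces above.

```python
PAGE_MASK = 0xFFF  # assumption: 4 KiB pages, so the low 12 bits are the page offset

# Faulting addresses as reported in this bug.
FAULT_ADDRESSES = {
    "dmesg trap (description)": 0x7F34897F1D21,
    "core.ovn-northd.0.f78fa98b65584355b04b61fdb11da53e.4029.1619103625000000": 0x00007F73B8C11D21,
    "core.ovn-northd.0.8e76dd4e0f6b41e5acc7184ebb6282db.415827.1619172149000000": 0x00007FC063C55D21,
    "core.ovn-northd.0.ece4d0223aa04befa987e38ecb7b9869.1063219.1619104182000000": 0x00007F000035ED21,
}


def page_offsets(addresses):
    """Map each label to the offset of its address within a page."""
    return {label: addr & PAGE_MASK for label, addr in addresses.items()}


offsets = page_offsets(FAULT_ADDRESSES)
for label, off in offsets.items():
    print(f"{off:#05x}  {label}")

# True when every crash shares the same page offset.
print(len(set(offsets.values())) == 1)
```

All four addresses share the page offset 0xd21. That is consistent with every crash hitting the same instruction, but it is not proof on its own, since unrelated instructions can share a page offset.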

Comment 5 Dan Williams 2022-02-16 21:53:55 UTC

*** This bug has been marked as a duplicate of bug 1957030 ***