Bug 1957030

Summary: core dumps for ovsdb-server and northd seen in OCP 4.7->4.8 upgrade

Product: Red Hat Enterprise Linux 8
Reporter: Tim Rozet <trozet>
Component: glibc
Assignee: DJ Delorie <dj>
Status: CLOSED WONTFIX
QA Contact: qe-baseos-tools-bugs
Severity: high
Docs Contact:
Priority: medium
Version: 8.0
CC: aconole, ashankar, bbennett, codonell, ctrautma, dcbw, dceara, dj, fche, fleitner, fweimer, huirwang, i.maximets, jhsiao, jluhrsen, mmichels, mnewsome, mpatel, pfrankli, ralongi, rbrattai, sipoyare, vpickard
Target Milestone: rc
Keywords: Triaged
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2022-02-04 00:10:07 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1953613, 1956358, 1957031

Comment 1 Dan Williams 2021-05-05 03:59:58 UTC
Yet again, I get:

Core was generated by `ovn-northd --no-chdir -vconsole:info -vfile:off --ovnnb-db ssl:10.0.172.29:9641'.
Program terminated with signal 11, Segmentation fault.
#0  0x00007fa41b781d21 in __gconv_lookup_cache () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install glibc-2.17-324.el7_9.x86_64 libcap-ng-0.7.5-4.el7.x86_64 zlib-1.2.7-19.el7_9.x86_64
(gdb) t a a bt

Thread 1 (LWP 1):
#0  0x00007fa41b781d21 in __gconv_lookup_cache () from /lib64/libc.so.6
#1  0x0000000000000000 in ?? ()

Somebody should really try Dumitru's instructions in https://bugzilla.redhat.com/show_bug.cgi?id=1943413#c36 with one of the cores and make sure I'm doing it right.

Comment 2 Dan Williams 2021-05-05 14:16:17 UTC
ovsdb-server crash:

Core was generated by `ovsdb-server -vconsole:info -vfile:off --log-file=/var/log/ovn/ovsdb-server-nb.'.
Program terminated with signal 11, Segmentation fault.
#0  0x00007f158c800d21 in __gconv_lookup_cache () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install glibc-2.17-324.el7_9.x86_64 libcap-ng-0.7.5-4.el7.x86_64 zlib-1.2.7-19.el7_9.x86_64
(gdb) bt
#0  0x00007f158c800d21 in __gconv_lookup_cache () from /lib64/libc.so.6
#1  0x0000000000000000 in ?? ()
(gdb) t a a bt

Thread 2 (LWP 1):
#0  json_destroy (json=0x5590009c9960) at ../lib/json.c:354
#1  json_destroy_array (array=<optimized out>, array=<optimized out>) at ../lib/json.c:403
#2  json_destroy.part.10 () at ../lib/json.c:361
#3  0x0000558ffdc56b55 in json_destroy (json=<optimized out>) at ../lib/json.c:354
#4  json_destroy_array (array=<optimized out>, array=<optimized out>) at ../lib/json.c:403
#5  json_destroy.part.10 () at ../lib/json.c:361
#6  0x0000558ffdc56aac in json_destroy (json=<optimized out>) at ../lib/json.c:390
#7  json_destroy_object (object=0x559000b81b30) at ../lib/json.c:390
#8  json_destroy.part.10 () at ../lib/json.c:357
#9  0x0000558ffdc56aac in json_destroy (json=<optimized out>) at ../lib/json.c:390
#10 json_destroy_object (object=0x559000a2ea80) at ../lib/json.c:390
#11 json_destroy.part.10 () at ../lib/json.c:357
#12 0x0000558ffdc56aac in json_destroy (json=<optimized out>) at ../lib/json.c:390
#13 json_destroy_object (object=0x5590008c7fe0) at ../lib/json.c:390
#14 json_destroy.part.10 () at ../lib/json.c:357
#15 0x0000558ffdc56b55 in json_destroy (json=<optimized out>) at ../lib/json.c:354
#16 json_destroy_array (array=<optimized out>, array=<optimized out>) at ../lib/json.c:403
#17 json_destroy.part.10 () at ../lib/json.c:361
#18 0x0000558ffdc57f25 in json_destroy (json=<optimized out>) at ../lib/json.c:354
#19 0x0000558ffdc50136 in raft_entry_uninit (e=0x7f158ec860e0) at ../ovsdb/raft-private.c:294
#20 raft_entry_uninit (e=0x7f158ec860e0) at ../ovsdb/raft-private.c:291
#21 0x0000558ffdc4bc85 in raft_close (raft=0x558fffc92d70) at ../ovsdb/raft.c:1313
#22 0x0000558ffdc3a15e in ovsdb_storage_close (storage=0x558fffc966a0) at ../ovsdb/storage.c:138
#23 0x0000558ffdc3363f in ovsdb_destroy (db=0x558fffc966e0) at ../ovsdb/ovsdb.c:457
#24 0x0000558ffdc2cd8e in close_db (db=0x558fffc92fa0, comment=<optimized out>, config=<optimized out>) at ../ovsdb/ovsdb-server.c:540
#25 0x0000558ffdc2ab7f in main () at ../ovsdb/ovsdb-server.c:481
#26 0x00007f158c802803 in _nl_find_locale () from /lib64/libc.so.6
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

Thread 1 (LWP 214):
#0  0x00007f158c800d21 in __gconv_lookup_cache () from /lib64/libc.so.6
#1  0x0000000000000000 in ?? ()

Comment 3 Dan Williams 2021-05-05 14:17:35 UTC
I think these are the same as we've seen before in the upgrade tests: a crash on shutdown, which should be harmless but should still be root-caused.

Comment 5 Dumitru Ceara 2021-05-10 21:57:30 UTC
All cores seem to be in abort(), which gets called because the call to
raise(SIGTERM) failed.  This happens in fatal_signal_run(), after
signal hooks have been called, when re-raising the original signal
after the signal handler has been reset to default.  The process
should have ended anyway even without the abort() (and the SIGSEGV in
the libc abort() implementation), so these cores are benign.

It is, however, a bit annoying that raise() fails and that abort()
segfaults, although I'm not sure how to investigate why they do.

sh-4.4# gdb /usr/sbin/ovsdb-server /tmp/core.ovsdb-server.0.365c47fc613d4813ba7913b1410aa1fa.3169.1620153057000000
[...]
Core was generated by `ovsdb-server -vconsole:info -vfile:off --log-file=/var/log/ovn/ovsdb-server-sb.'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007fa509febd21 in abort () from /lib64/libc.so.6
[Current thread is 1 (Thread 0x7fa503fff700 (LWP 177))]
Missing separate debuginfos, use: yum debuginfo-install glibc-2.28-127.el8_3.2.x86_64 libcap-ng-0.7.9-5.el8.x86_64 libevent-2.1.8-5.el8.x86_64 libnl3-3.5.0-1.el8.x86_64 numactl-libs-2.0.12-11.el8.x86_64 openssl-libs-1.1.1g-15.el8_3.x86_64 python3-libs-3.6.8-31.el8.x86_64 zlib-1.2.11-16.2.el8_3.x86_64
(gdb) bt
#0  0x00007fa509febd21 in abort () from /lib64/libc.so.6
#1  0x0000564f8faa37dc in fatal_signal_run () at ../lib/fatal-signal.c:263
#2  0x0000564f8fab38b0 in poll_block () at ../lib/poll-loop.c:388
#3  0x0000564f8faab4da in ovsrcu_postpone_thread (arg=<optimized out>) at ../lib/ovs-rcu.c:367
#4  0x0000564f8faac6c3 in ovsthread_wrapper (aux_=<optimized out>) at ../lib/ovs-thread.c:383
#5  0x00007fa50ac5415a in start_thread () from /lib64/libpthread.so.0
#6  0x00007fa50a0c6f73 in clone () from /lib64/libc.so.6
(gdb) frame 1
#1  0x0000564f8faa37dc in fatal_signal_run () at ../lib/fatal-signal.c:263
263             OVS_NOT_REACHED();
(gdb) list
258              * termination status reflects that we were killed by this signal */
259             signal(sig_nr, SIG_DFL);
260             raise(sig_nr);
261
262             ovs_mutex_unlock(&mutex);
263             OVS_NOT_REACHED();
264         }
265     }
266
267     void
(gdb) p sig_nr
$1 = 15 <<< SIGTERM

sh-4.4# gdb /usr/sbin/ovsdb-server /tmp/core.ovsdb-server.0.365c47fc613d4813ba7913b1410aa1fa.2275.1620153057000000
[...]
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007f158c800d21 in abort () from /lib64/libc.so.6
[Current thread is 1 (Thread 0x7f158af7e700 (LWP 214))]
Missing separate debuginfos, use: yum debuginfo-install glibc-2.28-127.el8_3.2.x86_64 libcap-ng-0.7.9-5.el8.x86_64 libevent-2.1.8-5.el8.x86_64 libnl3-3.5.0-1.el8.x86_64 numactl-libs-2.0.12-11.el8.x86_64 openssl-libs-1.1.1g-15.el8_3.x86_64 python3-libs-3.6.8-31.el8.x86_64 zlib-1.2.11-16.2.el8_3.x86_64
(gdb) bt
#0  0x00007f158c800d21 in abort () from /lib64/libc.so.6
#1  0x0000558ffdc557dc in fatal_signal_run () at ../lib/fatal-signal.c:263
#2  0x0000558ffdc658b0 in poll_block () at ../lib/poll-loop.c:388
#3  0x0000558ffdc5d4da in ovsrcu_postpone_thread (arg=<optimized out>) at ../lib/ovs-rcu.c:367
#4  0x0000558ffdc5e6c3 in ovsthread_wrapper (aux_=<optimized out>) at ../lib/ovs-thread.c:383
#5  0x00007f158d46915a in start_thread () from /lib64/libpthread.so.0
#6  0x00007f158c8dbf73 in clone () from /lib64/libc.so.6
(gdb) frame 1
#1  0x0000558ffdc557dc in fatal_signal_run () at ../lib/fatal-signal.c:263
263             OVS_NOT_REACHED();
(gdb) list
258              * termination status reflects that we were killed by this signal */
259             signal(sig_nr, SIG_DFL);
260             raise(sig_nr);
261
262             ovs_mutex_unlock(&mutex);
263             OVS_NOT_REACHED();
264         }
265     }
266
267     void
(gdb) p sig_nr
$1 = 15 <<< SIGTERM

sh-4.4# gdb /usr/sbin/ovsdb-server /tmp/core.ovsdb-server.0.e52709c99e5e48ed992eddb4bfba82bb.3258.1620153325000000
[...]
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007fcee65e5d21 in abort () from /lib64/libc.so.6
[Current thread is 1 (Thread 0x7fcee4d63700 (LWP 175))]
Missing separate debuginfos, use: yum debuginfo-install glibc-2.28-127.el8_3.2.x86_64 libcap-ng-0.7.9-5.el8.x86_64 libevent-2.1.8-5.el8.x86_64 libnl3-3.5.0-1.el8.x86_64 numactl-libs-2.0.12-11.el8.x86_64 openssl-libs-1.1.1g-15.el8_3.x86_64 python3-libs-3.6.8-31.el8.x86_64 zlib-1.2.11-16.2.el8_3.x86_64
(gdb) bt
#0  0x00007fcee65e5d21 in abort () from /lib64/libc.so.6
#1  0x0000558cf0e017dc in fatal_signal_run () at ../lib/fatal-signal.c:263
#2  0x0000558cf0e118b0 in poll_block () at ../lib/poll-loop.c:388
#3  0x0000558cf0e094da in ovsrcu_postpone_thread (arg=<optimized out>) at ../lib/ovs-rcu.c:367
#4  0x0000558cf0e0a6c3 in ovsthread_wrapper (aux_=<optimized out>) at ../lib/ovs-thread.c:383
#5  0x00007fcee724e15a in start_thread () from /lib64/libpthread.so.0
#6  0x00007fcee66c0f73 in clone () from /lib64/libc.so.6
(gdb) frame 1
#1  0x0000558cf0e017dc in fatal_signal_run () at ../lib/fatal-signal.c:263
263             OVS_NOT_REACHED();
(gdb) list sig_nr
Function "sig_nr" not defined.
(gdb) p sig_nr
$1 = 15

sh-4.4# gdb /usr/sbin/ovsdb-server /tmp/core.ovsdb-server.0.e52709c99e5e48ed992eddb4bfba82bb.2276.1620153325000000
[...]
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007ffa88b58d21 in abort () from /lib64/libc.so.6
[Current thread is 1 (Thread 0x7ffa872d6700 (LWP 206))]
Missing separate debuginfos, use: yum debuginfo-install glibc-2.28-127.el8_3.2.x86_64 libcap-ng-0.7.9-5.el8.x86_64 libevent-2.1.8-5.el8.x86_64 libnl3-3.5.0-1.el8.x86_64 numactl-libs-2.0.12-11.el8.x86_64 openssl-libs-1.1.1g-15.el8_3.x86_64 python3-libs-3.6.8-31.el8.x86_64 zlib-1.2.11-16.2.el8_3.x86_64
(gdb) bt
#0  0x00007ffa88b58d21 in abort () from /lib64/libc.so.6
#1  0x00005640b4e117dc in fatal_signal_run () at ../lib/fatal-signal.c:263
#2  0x00005640b4e218b0 in poll_block () at ../lib/poll-loop.c:388
#3  0x00005640b4e194da in ovsrcu_postpone_thread (arg=<optimized out>) at ../lib/ovs-rcu.c:367
#4  0x00005640b4e1a6c3 in ovsthread_wrapper (aux_=<optimized out>) at ../lib/ovs-thread.c:383
#5  0x00007ffa897c115a in start_thread () from /lib64/libpthread.so.0
#6  0x00007ffa88c33f73 in clone () from /lib64/libc.so.6
(gdb) frame 1
#1  0x00005640b4e117dc in fatal_signal_run () at ../lib/fatal-signal.c:263
263             OVS_NOT_REACHED();
(gdb) p sig_nr
$1 = 15

sh-4.4# gdb /usr/bin/ovn-northd /tmp/core.ovn-northd.0.365c47fc613d4813ba7913b1410aa1fa.2117.1620153057000000
[...]
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007fbc67b94d21 in abort () from /lib64/libc.so.6
Missing separate debuginfos, use: yum debuginfo-install glibc-2.28-127.el8_3.2.x86_64 libcap-ng-0.7.9-5.el8.x86_64 libevent-2.1.8-5.el8.x86_64 openssl-libs-1.1.1g-15.el8_3.x86_64 python3-libs-3.6.8-31.el8.x86_64 zlib-1.2.11-16.2.el8_3.x86_64
(gdb) bt
#0  0x00007fbc67b94d21 in abort () from /lib64/libc.so.6
#1  0x0000557334fe9e9c in fatal_signal_run () at lib/fatal-signal.c:313
#2  0x000055733502f110 in poll_block () at lib/poll-loop.c:388
#3  0x0000557334f73cd0 in main (argc=<optimized out>, argv=<optimized out>) at northd/ovn-northd.c:14267
(gdb) frame 1
#1  0x0000557334fe9e9c in fatal_signal_run () at lib/fatal-signal.c:313
313             OVS_NOT_REACHED();
(gdb) p sig_nr
$1 = 15

sh-4.4# gdb /usr/bin/ovn-northd /tmp/core.ovn-northd.0.e52709c99e5e48ed992eddb4bfba82bb.2099.1620153325000000
[...]
Core was generated by `ovn-northd --no-chdir -vconsole:info -vfile:off --ovnnb-db ssl:10.0.172.29:9641'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007f4f8f4d7d21 in abort () from /lib64/libc.so.6
Missing separate debuginfos, use: yum debuginfo-install glibc-2.28-127.el8_3.2.x86_64 libcap-ng-0.7.9-5.el8.x86_64 libevent-2.1.8-5.el8.x86_64 openssl-libs-1.1.1g-15.el8_3.x86_64 python3-libs-3.6.8-31.el8.x86_64 zlib-1.2.11-16.2.el8_3.x86_64
(gdb) bt
#0  0x00007f4f8f4d7d21 in abort () from /lib64/libc.so.6
#1  0x000055c4ef2d1e9c in fatal_signal_run () at lib/fatal-signal.c:313
#2  0x000055c4ef317110 in poll_block () at lib/poll-loop.c:388
#3  0x000055c4ef25bcd0 in main (argc=<optimized out>, argv=<optimized out>) at northd/ovn-northd.c:14267
(gdb) frame 1
#1  0x000055c4ef2d1e9c in fatal_signal_run () at lib/fatal-signal.c:313
313             OVS_NOT_REACHED();
(gdb) p sig_nr
$1 = 15

sh-4.4# gdb /usr/bin/ovn-northd /tmp/core.ovn-northd.0.719ca258ebb449c584d4d30ec1a6a317.2225.1620153188000000
[...]
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007fa41b781d21 in abort () from /lib64/libc.so.6
Missing separate debuginfos, use: yum debuginfo-install glibc-2.28-127.el8_3.2.x86_64 libcap-ng-0.7.9-5.el8.x86_64 libevent-2.1.8-5.el8.x86_64 openssl-libs-1.1.1g-15.el8_3.x86_64 python3-libs-3.6.8-31.el8.x86_64 zlib-1.2.11-16.2.el8_3.x86_64
(gdb) bt
#0  0x00007fa41b781d21 in abort () from /lib64/libc.so.6
#1  0x00005640e3067e9c in fatal_signal_run () at lib/fatal-signal.c:313
#2  0x00005640e30ad110 in poll_block () at lib/poll-loop.c:388
#3  0x00005640e2ff1cd0 in main (argc=<optimized out>, argv=<optimized out>) at northd/ovn-northd.c:14267
(gdb) frame 1
#1  0x00005640e3067e9c in fatal_signal_run () at lib/fatal-signal.c:313
313             OVS_NOT_REACHED();
(gdb) p sig_nr
$1 = 15

Comment 6 Dumitru Ceara 2021-05-10 22:05:00 UTC
With glibc debuginfo:

(gdb) bt
#0  __GI_abort () at abort.c:107
#1  0x00005640e3067e9c in fatal_signal_run () at lib/fatal-signal.c:313
#2  0x00005640e30ad110 in poll_block () at lib/poll-loop.c:388
#3  0x00005640e2ff1cd0 in main (argc=<optimized out>, argv=<optimized out>) at northd/ovn-northd.c:14267
(gdb) list
102
103       /* Now try to abort using the system specific command.  */
104       if (stage == 4)
105         {
106           ++stage;
107           ABORT_INSTRUCTION;
108         }
109
110       /* If we can't signal ourselves and the abort instruction failed, exit.  */
111       if (stage == 5)

From upstream glibc/abort.c code:

 /* Try again.  */
  if (stage == 3)
    {
      ++stage;
      raise (SIGABRT);
    }

  /* Now try to abort using the system specific command.  */
  if (stage == 4)
    {
      ++stage;
      ABORT_INSTRUCTION;
    }

I presume the "raise (SIGABRT)" call failed here too, so we ended up
calling ABORT_INSTRUCTION.

Comment 7 Ilya Maximets 2021-05-12 11:08:12 UTC
So, it's a problem in glibc or the kernel.  There is nothing we can do from
the OVS/OVN side.  The only thing we can do is print a warning with the
error code indicating that raise() failed, to improve the visibility a bit,
but the process will crash later inside glibc while executing abort()
anyway.  And this might not be useful either, because all the must-gather
logs start right *after* the crashes, so we don't have the logs printed by
the crashed process.

Comment 8 Dan Williams 2021-05-12 17:54:43 UTC
(In reply to Ilya Maximets from comment #7)
> So, it's a problem in glibc or the kernel.  There is nothing we can do from
> the OVS/OVN side.  The only thing we can do is print a warning with the
> error code indicating that raise() failed, to improve the visibility a bit,
> but the process will crash later inside glibc while executing abort()
> anyway.  And this might not be useful either, because all the must-gather
> logs start right *after* the crashes, so we don't have the logs printed by
> the crashed process.

Or cleanly handle SIGTERM so that it breaks the main loop, cleans up, and exits like the "stop" appctl command does. Would that be an acceptable patch?

Comment 9 Ilya Maximets 2021-05-13 11:29:10 UTC
(In reply to Dan Williams from comment #8)
> (In reply to Ilya Maximets from comment #7)
> > So, it's a problem in glibc or the kernel.  There is nothing we can do from
> > the OVS/OVN side.  The only thing we can do is print a warning with the
> > error code indicating that raise() failed, to improve the visibility a bit,
> > but the process will crash later inside glibc while executing abort()
> > anyway.  And this might not be useful either, because all the must-gather
> > logs start right *after* the crashes, so we don't have the logs printed by
> > the crashed process.
> 
> Or cleanly handle SIGTERM so that it breaks mainloop, cleans up, and exits
> like the "stop" appctl command does. Would that be an acceptable patch?

Well, to correctly indicate the reason why the process was terminated, OVS
will need to re-raise the signal in the end anyway.  And it will fail.
Exiting with exit(0) doesn't sound like a good thing to do.

I think it might be OK to log a failure of raise() with VLOG_FATAL(), which
will print an error and call exit(EXIT_FAILURE).  That sounds better than a
coredump.

What do you think?

I'm not sure how this will affect restartability of the daemon.  E.g.,
systemd will restart the daemon on SIGSEGV or SIGABRT if the restart policy
is configured to any of 'on-abort', 'on-abnormal' or 'on-failure', but
EXIT_FAILURE will trigger a restart only with 'on-failure'.
OTOH, SIGTERM is considered a "clean" signal and systemd restarts on SIGTERM
only if the restart policy is 'on-success'.  So, in this scenario there was
no intention to have the daemon restarted at all, and it's probably an OK
solution.
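
A hypothetical version of that change can be sketched as below. This is not the actual OVS patch: VLOG_FATAL() is approximated with fprintf() to keep the fragment self-contained, and the signal disposition is assumed to have been reset to SIG_DFL already, as it is in fatal_signal_run():

```c
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sketch of the proposed fix: if re-raising the fatal
 * signal fails (the failure mode seen in this bug), log the errno and
 * exit with a failure status instead of falling through into abort().
 * In the real code the disposition has already been reset to SIG_DFL
 * before this point. */
static void
checked_reraise(int sig_nr)
{
    if (raise(sig_nr) != 0) {
        fprintf(stderr, "raise(%d) failed (%s)\n", sig_nr, strerror(errno));
        exit(EXIT_FAILURE);
    }
    /* With SIG_DFL installed, a successful raise() of a terminating
     * signal such as SIGTERM never returns here. */
}
```

With this, a broken raise() would produce a log line and an EXIT_FAILURE status (restartable under 'on-failure'), rather than the SIGSEGV coredump inside abort() seen in the attached cores.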

Comment 10 jamo luhrsen 2021-08-17 16:46:44 UTC
*** Bug 1992281 has been marked as a duplicate of this bug. ***

Comment 11 jamo luhrsen 2021-08-17 16:52:51 UTC
@i.maximets, @dcbw,

do we think any resolution will come of this, such that we won't have core dumps in ovsdb-server and ovn-northd
during an upgrade? I have a PR [0] that will fail our CI jobs when core files are found. I think there have
been other examples in the past where the core dumps found [1] were more serious than what we see here. So
enabling a failure when they are found seems beneficial, but it's not very helpful if what we have in
this bz is never resolved.

[0] https://github.com/openshift/release/pull/20566
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1943413

Comment 12 Ilya Maximets 2021-08-17 18:04:01 UTC
(In reply to jamo luhrsen from comment #11)
> do we think any resolution will come of this such that we won't have core
> dumps in ovsdb-server and ovn-northd during an upgrade?

As per the discussion in OVS upstream, the process has to re-raise the signal
that was originally intercepted, and this will not be changed.  The problem
of a broken raise() is in glibc or lower.  If it's required to not have
coredumps, someone should debug glibc to find out why it fails.

I suggest moving this BZ to the glibc component (if it exists).  The most that
can be done from the OVS/OVN side is to print the errno when raise() fails,
but OCP doesn't store logs of old containers during an upgrade, AFAICT, so
this doesn't make a lot of sense and will not help with the coredumps anyway.

Comment 14 jamo luhrsen 2021-08-19 18:16:40 UTC
(In reply to Ilya Maximets from comment #12)
> 
> I suggest moving this BZ to the glibc component (if it exists).  The most that
> can be done from the OVS/OVN side is to print the errno when raise() fails,
> but OCP doesn't store logs of old containers during an upgrade, AFAICT, so
> this doesn't make a lot of sense and will not help with the coredumps anyway.

ok, I just moved it to the Red Hat Enterprise Linux 8 product, glibc component.

I'm not sure what to use for Target Milestone or Release, so I just took the defaults of "beta" and "---".

Comment 15 Florian Weimer 2021-08-20 05:24:41 UTC
There are various possible explanations for this behavior. Clearly the signal is not being delivered.

For the first signal, this could be because signals are blocked. Do you use pthread_sigmask or sigprocmask to block signals? (Or the older sigblock function?)

However, this does not explain why abort cannot send the SIGABRT signal. One explanation would be a seccomp filter that blocks kill or sigaction. Do you use any of those?

The remaining explanation is that the signal is sent, but not to the right thread. In Red Hat Enterprise Linux 8, that is unlikely because raise uses getpid and gettid to determine the target, not cached values from the TCB.

Still it would be interesting to see the output from a coredump of the TCB:

(gdb) print *(struct pthread *)$fs_base

Is this bug something that you can reproduce at will in a test environment? Thanks.

Comment 16 jamo luhrsen 2021-08-23 21:21:45 UTC
> Still it would be interesting to see the output from a coredump of the TCB:
> 
> (gdb) print *(struct pthread *)$fs_base
> 

I will try to get this info for you.

> Is this bug something that you can reproduce at will in a test environment? Thanks.

yes, it happens every time. You can bring up a cluster-bot 4.8 cluster, upgrade it to
4.9, and then see the core dumps on the master nodes. It also happens every time in our
4.8->4.9 upgrade CI jobs.

Comment 17 Ilya Maximets 2021-08-23 21:57:21 UTC
(In reply to Florian Weimer from comment #15)
> For the first signal, this could be because signals are block. Do you use
> pthread_sigmask or sigprocmask to block signals? (Or the older sigblock
> function?)

The application uses signals in a very simple way:

1. for the signals SIGTERM, SIGINT, SIGHUP, SIGALRM, SIGSEGV:

     signal(sig_nr, fatal_signal_handler);

2. fatal_signal_handler() stores the signal number in a global variable.

3. main() notices that the signal is set and calls:

     signal(sig_nr, SIG_DFL);
     raise(sig_nr);

There are no other manipulations of signals in the code.
As for whether signals can be masked/blocked from outside the process, I'm
not sure.
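
The three steps above can be condensed into a minimal self-contained sketch (simplified from lib/fatal-signal.c; hooks, locking, and error handling are omitted):

```c
#include <signal.h>

/* Step 2: the handler only records which signal arrived; storing to a
 * volatile sig_atomic_t is async-signal-safe. */
static volatile sig_atomic_t stored_sig_nr;

static void
fatal_signal_handler(int sig_nr)
{
    stored_sig_nr = sig_nr;
}

/* Step 1: register the handler for a fatal signal. */
static void
fatal_signal_init(int sig_nr)
{
    signal(sig_nr, fatal_signal_handler);
}

/* Step 3: called from the main loop once the flag is seen.  Restoring
 * SIG_DFL and re-raising makes the exit status reflect the original
 * signal; the raise() here is the call that fails in this bug. */
static void
fatal_signal_run(void)
{
    int sig_nr = stored_sig_nr;

    if (sig_nr) {
        signal(sig_nr, SIG_DFL);
        raise(sig_nr);          /* normally does not return */
    }
}
```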

> 
> However, this does not explain why abort cannot send the SIGABRT signal. One
> explanation would be a seccomp filter that blocks kill or sigaction. Do you
> use any of those?

The application itself doesn't use either.  Again, whether these could be
set from the outside, I'm not sure.

The context is an OpenShift upgrade.  The process in question is running
inside a container on CoreOS.  CoreOS is getting updated, and the container
image too.  IIUC, CoreOS is updated first (I'm not sure exactly how this
works) and after that the container gets restarted from the new image.
At this point the old container receives a command to terminate and sends
SIGTERM to the process inside.  The process crashes due to the failure to
raise a signal.

Comment 18 Florian Weimer 2021-09-02 08:16:59 UTC
(In reply to jamo luhrsen from comment #16)
> > Still it would be interesting to see the output from a coredump of the TCB:
> > 
> > (gdb) print *(struct pthread *)$fs_base
> > 
> 
> I will try to get this info for you.

Thanks.

Some additional data worth collecting is the output of these commands before and after the update:

  ls -liH /usr/lib64/libc.so.6
  sha256sum /usr/lib64/libc.so.6

It would also be interesting to run an upgrade procedure while the relevant processing is running under strace, e.g. via:

  strace -p $PID -f -s 500 -v -o /var/tmp/ovsdb-server.$PID

> > Is this bug something that you can reproduce at will in a test environment? Thanks.
> 
> yes, it happens every time. You can bring up a cluster-bot 4.8 cluster and
> upgrade it to
> 4.9 then you can see the core dumps on the master nodes. It also happens
> every time in our
> 4.8->4.9 upgrade CI jobs.

As far as I know, we (the glibc team) do not have access to cluster-bot or cloud resources for OCP testing.

The components we maintain are used by an extremely broad set of software, so we generally have to rely on subject matter experts to set up reproduction environments and provide us with access to them.

Comment 23 jamo luhrsen 2021-09-29 20:51:10 UTC
(In reply to Florian Weimer from comment #18)
> (In reply to jamo luhrsen from comment #16)
> > > Still it would be interesting to see the output from a coredump of the TCB:
> > > 
> > > (gdb) print *(struct pthread *)$fs_base
> > > 
> > 
> > I will try to get this info for you.

I don't really know what I'm doing here, but that command verbatim does not work
when I first open the debugger:

[root@ci-ln-rsiqxxt-f76d1-7vxqb-master-1 tmp]# gdb /usr/bin/ovn-northd core.ovn-northd.0.604670c3c89c448ab5f87896fb7d60db.4117.1632862394000000
GNU gdb (GDB) Red Hat Enterprise Linux 8.2-15.el8
<snip>
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/bin/ovn-northd...Reading symbols from /usr/lib/debug/usr/bin/ovn-northd-21.09.0-13.el8fdp.x86_64.debug...done.
done.
[New LWP 1]
[New LWP 10]
[New LWP 9]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `ovn-northd --no-chdir -vconsole:info -vfile:off --ovnnb-db ssl:10.0.0.3:9641,ss'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007f6646c139bf in raise () from /lib64/libpthread.so.0
[Current thread is 1 (Thread 0x7f66479c2e00 (LWP 1))]
Missing separate debuginfos, use: yum debuginfo-install glibc-2.28-151.el8.x86_64 libcap-ng-0.7.9-5.el8.x86_64 libevent-2.1.8-5.el8.x86_64 openssl-libs-1.1.1g-15.el8_3.x86_64 python3-libs-3.6.8-37.el8.x86_64 unbound-libs-1.7.3-15.el8.x86_64 zlib-1.2.11-17.el8.x86_64
(gdb) print *(struct pthread *)$fs_base
No struct type named pthread.
(gdb) 

Can you help me with the right command? I have hardly any experience with gdb.


> 
> Thanks.
> 
> Some additional data worth collecting is the output of these commands before
> and after the update:
> 
>   ls -liH /usr/lib64/libc.so.6
>   sha256sum /usr/lib64/libc.so.6

I'm using openshift 4.9 now (core dumps still happen) and here is this output:

[root@ci-ln-rsiqxxt-f76d1-7vxqb-master-1 tmp]# ls -liH /usr/lib64/libc.so.6
142606723 -rwxr-xr-x. 1 root root 3167976 Mar  5  2021 /usr/lib64/libc.so.6
[root@ci-ln-rsiqxxt-f76d1-7vxqb-master-1 tmp]# sha256sum /usr/lib64/libc.so.6
5c09418382175429ac7da80007df08c718c3042a167de4e830f2e69c36719ecc  /usr/lib64/libc.so.6
[root@ci-ln-rsiqxxt-f76d1-7vxqb-master-1 tmp]# 

AFAICT the dump is happening with the initial version of northd and not with the version
we are upgrading to. If you still want the output from the version we are upgrading
to, I can get that as well.


> 
> It would also be interesting to run an upgrade procedure while the relevant
> processing is running under strace, e.g. via:
> 
>   strace -p $PID -f -s 500 -v -o /var/tmp/ovsdb-server.$PID

I'm not sure I can do this. At least, I don't know how at this point. I don't have
a manual way of recreating this; it's just a by-product of a full openshift cluster
upgrade.

> 
> > > Is this bug something that you can reproduce at will in a test environment? Thanks.
> > 
> > yes, it happens every time. You can bring up a cluster-bot 4.8 cluster and
> > upgrade it to
> > 4.9 then you can see the core dumps on the master nodes. It also happens
> > every time in our
> > 4.8->4.9 upgrade CI jobs.
> 
> As far as I know, we (the glibc team) do not have access to cluster-bot or
> cloud resources for OCP testing.
> 
> The components we maintain are used by an extremely broad set of software,
> so we generally have to rely on subject matter experts to set up
> reproduction environments and provide us with access to them.

totally understand. I can keep trying things in our environment based on your
requests and report back with the info.

Comment 24 Florian Weimer 2021-10-01 08:47:17 UTC
(In reply to jamo luhrsen from comment #23)
> [root@ci-ln-rsiqxxt-f76d1-7vxqb-master-1 tmp]# gdb /usr/bin/ovn-northd
> core.ovn-northd.0.604670c3c89c448ab5f87896fb7d60db.4117.1632862394000000
> GNU gdb (GDB) Red Hat Enterprise Linux 8.2-15.el8
> <snip>
> Type "apropos word" to search for commands related to "word"...
> Reading symbols from /usr/bin/ovn-northd...Reading symbols from
> /usr/lib/debug/usr/bin/ovn-northd-21.09.0-13.el8fdp.x86_64.debug...done.
> done.
> [New LWP 1]
> [New LWP 10]
> [New LWP 9]
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib64/libthread_db.so.1".
> Core was generated by `ovn-northd --no-chdir -vconsole:info -vfile:off
> --ovnnb-db ssl:10.0.0.3:9641,ss'.
> Program terminated with signal SIGSEGV, Segmentation fault.
> #0  0x00007f6646c139bf in raise () from /lib64/libpthread.so.0
> [Current thread is 1 (Thread 0x7f66479c2e00 (LWP 1))]
> Missing separate debuginfos, use: yum debuginfo-install
> glibc-2.28-151.el8.x86_64 libcap-ng-0.7.9-5.el8.x86_64
> libevent-2.1.8-5.el8.x86_64 openssl-libs-1.1.1g-15.el8_3.x86_64
> python3-libs-3.6.8-37.el8.x86_64 unbound-libs-1.7.3-15.el8.x86_64
> zlib-1.2.11-17.el8.x86_64
> (gdb) print *(struct pthread *)$fs_base
> No struct type named pthread.
> (gdb) 
> 
> Can you help me with the right command. I have hardly any experience with
> gdb.

The command is correct, but you need to install the right version of glibc-debuginfo. The package version must match exactly the version that was used to generate the coredump. Usually GDB prints a warning if there is a mismatch. I do wonder now if the impossible backtrace in comment 6 is the result of such a mismatch.

This core doesn't seem to be in the tarball from comment 0.

I can have a look at the coredumps myself, but I will need a pointer to the ovn-northd binary and its debuginfo (a pointer to an RPM build would be fine).

> > Some additional data worth collecting is the output of these commands before
> > and after the update:
> > 
> >   ls -liH /usr/lib64/libc.so.6
> >   sha256sum /usr/lib64/libc.so.6
> 
> I'm using openshift 4.9 now (core dumps still happen) and here is this
> output:
> 
> [root@ci-ln-rsiqxxt-f76d1-7vxqb-master-1 tmp]# ls -liH /usr/lib64/libc.so.6
> 142606723 -rwxr-xr-x. 1 root root 3167976 Mar  5  2021 /usr/lib64/libc.so.6
> [root@ci-ln-rsiqxxt-f76d1-7vxqb-master-1 tmp]# sha256sum /usr/lib64/libc.so.6
> 5c09418382175429ac7da80007df08c718c3042a167de4e830f2e69c36719ecc 
> /usr/lib64/libc.so.6
> [root@ci-ln-rsiqxxt-f76d1-7vxqb-master-1 tmp]# 
> 
> AFAICT the dump is happening with the initial version of northd and not with
> the version
> we are upgrading to. If you still want the output from the version we are
> upgrading
> to, I can get that as well.

I'm interested in the file system contents/command output *after* the core happens.

But if the impossible backtrace is the result of a core/debuginfo mismatch, this is a red herring and won't get us closer to the real cause.

Comment 25 Carlos O'Donell 2021-10-01 14:22:35 UTC
Just following up here, the next steps would be:

- A copy of the ovn-northd binary.
- A copy of a coredump that matches the ovn-northd binary.
- The exact release of glibc that the system had.

From there we can double-check that everything lines up and help get a backtrace that we can use to move forward.

Florian's out for a little while, and I'm stepping in to continue supporting this from the glibc side.

Comment 26 jamo luhrsen 2021-10-01 19:03:22 UTC
Thanks for taking a look @codonell

Let me sort of start from scratch using a recent core dump we have. I'll be a little verbose, and I'm sure some of this is meaningless for you.

This job:
  https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.10-upgrade-from-stable-4.9-e2e-gcp-ovn-upgrade/1443661033020329984
used openshift release 4.9.0-fc.0  and upgraded to release 4.10.0-0.ci-2021-09-29-221425

during the upgrade we got these core dumps:
  https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.10-upgrade-from-stable-4.9-e2e-gcp-ovn-upgrade/1443661033020329984/artifacts/e2e-gcp-ovn-upgrade/gather-core-dump/artifacts/
The tar file there has 5 core dumps, two of which are for ovn-northd. The logs below use the dump in this file:
  ./registry-ci-openshift-org-ocp-4-10-2021-09-29-221425-sha256-864d8efffae65397e29d7bc8ee658a86d05ec8858133697cbace741a667439d6/node_core_dumps/ci-op-3fmqtvd8-82914-sbvs8-master-0_core_dump/core.ovn-northd.0.56eb4131a75c46d2a8ca6a201ea17970.3787.1633035371000000.lz4

to get the following logs, I started a cluster with release 4.9.0-fc.0 and exec'ed into an ovnkube-master pod. Here are the versions of
relevant things running on that system:

  [root@ci-ln-tcn8z4t-f76d1-rznzc-master-2 tmp]# rpm -qa | egrep 'ovn|glibc'
  ovn21.09-21.09.0-13.el8fdp.x86_64
  ovn21.09-host-21.09.0-13.el8fdp.x86_64
  glibc-common-2.28-151.el8.x86_64
  ovn21.09-central-21.09.0-13.el8fdp.x86_64
  glibc-minimal-langpack-2.28-151.el8.x86_64
  ovn21.09-vtep-21.09.0-13.el8fdp.x86_64
  glibc-2.28-151.el8.x86_64

  [root@ci-ln-tcn8z4t-f76d1-rznzc-master-2 tmp]# ovn-northd --version
  ovn-northd 21.06.90
  Open vSwitch Library 2.16.90


so, OVN is ovn21.09-21.09.0-13, glibc is glibc-2.28-151.el8, and ovn-northd reports version 21.06.90

next, installed the following rpms:

  [root@ci-ln-tcn8z4t-f76d1-rznzc-master-2 tmp]# ls -altr *rpm
  -rw-r--r--. 1 1000 hugetlbfs  2070988 Oct  1 16:48 ovn21.09-central-21.09.0-13.el8fdp.x86_64.rpm
  -rw-r--r--. 1 1000 hugetlbfs  3187512 Oct  1 16:48 ovn21.09-vtep-debuginfo-21.09.0-13.el8fdp.x86_64.rpm
  -rw-r--r--. 1 1000 hugetlbfs  4958280 Oct  1 16:48 ovn21.09-21.09.0-13.el8fdp.x86_64.rpm
  -rw-r--r--. 1 1000 hugetlbfs 16756168 Oct  1 16:48 ovn21.09-debuginfo-21.09.0-13.el8fdp.x86_64.rpm
  -rw-r--r--. 1 1000 hugetlbfs  3684112 Oct  1 16:48 ovn21.09-host-debuginfo-21.09.0-13.el8fdp.x86_64.rpm
  -rw-r--r--. 1 1000 hugetlbfs   929924 Oct  1 16:48 ovn21.09-vtep-21.09.0-13.el8fdp.x86_64.rpm
  -rw-r--r--. 1 1000 hugetlbfs  6964236 Oct  1 16:48 ovn21.09-central-debuginfo-21.09.0-13.el8fdp.x86_64.rpm
  -rw-r--r--. 1 1000 hugetlbfs  2073112 Oct  1 16:48 ovn21.09-debugsource-21.09.0-13.el8fdp.x86_64.rpm
  -rw-r--r--. 1 1000 hugetlbfs  1095180 Oct  1 16:48 ovn21.09-host-21.09.0-13.el8fdp.x86_64.rpm
  -rw-r--r--. 1 1000 hugetlbfs 14268976 Oct  1 16:55 glibc-debuginfo-2.28-151.el8.x86_64.rpm
  -rw-r--r--. 1 1000 hugetlbfs 11508944 Oct  1 16:55 glibc-debuginfo-common-2.28-151.el8.x86_64.rpm

  [root@ci-ln-tcn8z4t-f76d1-rznzc-master-2 tmp]# rpm -Uhv --force --nodeps *rpm
  Verifying...                          ################################# [100%]
  Preparing...                          ################################# [100%]
  Updating / installing...
     1:ovn21.09-debugsource-21.09.0-13.e################################# [  9%]
     2:ovn21.09-debuginfo-21.09.0-13.el8################################# [ 18%]
     3:ovn21.09-21.09.0-13.el8fdp       ################################# [ 27%]
     4:glibc-debuginfo-common-2.28-151.e################################# [ 36%]
     5:glibc-debuginfo-2.28-151.el8     ################################# [ 45%]
     6:ovn21.09-central-21.09.0-13.el8fd################################# [ 55%]
     7:ovn21.09-host-21.09.0-13.el8fdp  ################################# [ 64%]
     8:ovn21.09-vtep-21.09.0-13.el8fdp  ################################# [ 73%]
     9:ovn21.09-central-debuginfo-21.09.################################# [ 82%]
    10:ovn21.09-host-debuginfo-21.09.0-1################################# [ 91%]
    11:ovn21.09-vtep-debuginfo-21.09.0-1################################# [100%]

Installed gdb with 'yum install gdb -y'

And now with the proper glibc-debuginfo installed I do get some output from the 'print *(struct pthread *)$fs_base' cmd (thanks for the tip):

[root@ci-ln-tcn8z4t-f76d1-rznzc-master-2 tmp]# gdb /usr/bin/ovn-northd core.ovn-northd.0.56eb4131a75c46d2a8ca6a201ea17970.3787.1633035371000000
GNU gdb (GDB) Red Hat Enterprise Linux 8.2-15.el8
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/bin/ovn-northd...Reading symbols from /usr/lib/debug/usr/bin/ovn-northd-21.09.0-13.el8fdp.x86_64.debug...done.
done.
[New LWP 1]
[New LWP 12]
[New LWP 11]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `ovn-northd --no-chdir -vconsole:info -vfile:off --ovnnb-db ssl:10.0.0.2:9641,ss'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  __GI_abort () at abort.c:107
107	      ABORT_INSTRUCTION;
[Current thread is 1 (Thread 0x7fcf4ac23e00 (LWP 1))]
Missing separate debuginfos, use: yum debuginfo-install libcap-ng-0.7.9-5.el8.x86_64 libevent-2.1.8-5.el8.x86_64 openssl-libs-1.1.1g-15.el8_3.x86_64 python3-libs-3.6.8-37.el8.x86_64 unbound-libs-1.7.3-15.el8.x86_64 zlib-1.2.11-17.el8.x86_64
(gdb) print *(struct pthread *)$fs_base
$1 = {{header = {tcb = 0x7fcf4ac23e00, dtv = 0x7fcf4ac24750, self = 0x7fcf4ac23e00, multiple_threads = 1, gscope_flag = 0, sysinfo = 0, stack_guard = 10295380027830385664, pointer_guard = 2185615696620986620, vgetcpu_cache = {0, 0}, feature_1 = 0, 
      __glibc_unused1 = 0, __private_tm = {0x0, 0x0, 0x0, 0x0}, __private_ss = 0x0, ssp_base = 0, __glibc_unused2 = {{{i = {0, 0, 0, 0}}, {i = {0, 0, 0, 0}}, {i = {0, 0, 0, 0}}, {i = {0, 0, 0, 0}}}, {{i = {0, 0, 0, 0}}, {i = {0, 0, 0, 0}}, {i = {0, 0, 
              0, 0}}, {i = {0, 0, 0, 0}}}, {{i = {0, 0, 0, 0}}, {i = {0, 0, 0, 0}}, {i = {0, 0, 0, 0}}, {i = {0, 0, 0, 0}}}, {{i = {0, 0, 0, 0}}, {i = {0, 0, 0, 0}}, {i = {0, 0, 0, 0}}, {i = {0, 0, 0, 0}}}, {{i = {0, 0, 0, 0}}, {i = {0, 0, 0, 0}}, {
            i = {0, 0, 0, 0}}, {i = {0, 0, 0, 0}}}, {{i = {0, 0, 0, 0}}, {i = {0, 0, 0, 0}}, {i = {0, 0, 0, 0}}, {i = {0, 0, 0, 0}}}, {{i = {0, 0, 0, 0}}, {i = {0, 0, 0, 0}}, {i = {0, 0, 0, 0}}, {i = {0, 0, 0, 0}}}, {{i = {0, 0, 0, 0}}, {i = {0, 0, 0, 
              0}}, {i = {0, 0, 0, 0}}, {i = {0, 0, 0, 0}}}}, __padding = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}}, __padding = {0x7fcf4ac23e00, 0x7fcf4ac24750, 0x7fcf4ac23e00, 0x1, 0x0, 0x8ee089968a7b3c00, 0x1e54dde56c9ca0fc, 0x0 <repeats 17 times>}}, 
  list = {next = 0x7fcf4a081370 <__stack_user>, prev = 0x7fcf4a081370 <__stack_user>}, tid = 1, pid_ununsed = 0, robust_prev = 0x7fcf4ac240e0, robust_head = {list = 0x7fcf4ac240e0, futex_offset = -32, list_op_pending = 0x0}, cleanup = 0x0, 
  cleanup_jmp_buf = 0x7ffee6339a80, cancelhandling = 0, flags = 0, specific_1stblock = {{seq = 0, data = 0x0}, {seq = 0, data = 0x0}, {seq = 1, data = 0x0}, {seq = 1, data = 0x558b908c0e40}, {seq = 1, data = 0x558b908e0ff0}, {seq = 1, 
      data = 0x558b908e0d20}, {seq = 1, data = 0x558b908c0bf0}, {seq = 1, data = 0x558b908ea180}, {seq = 1, data = 0x558b908ea2f0}, {seq = 0, data = 0x0} <repeats 23 times>}, specific = {0x7fcf4ac24110, 0x0 <repeats 31 times>}, specific_used = true, 
  report_events = false, user_stack = true, stopped_start = false, parent_cancelhandling = 0, lock = 0, setxid_futex = 0, cpuclock_offset_ununsed = 0, joinid = 0x0, result = 0x0, schedparam = {sched_priority = 0}, schedpolicy = 0, start_routine = 0x0, 
  arg = 0x0, eventbuf = {eventmask = {event_bits = {0, 0}}, eventnum = TD_ALL_EVENTS, eventdata = 0x0}, nextevent = 0x0, exc = {exception_class = 0, exception_cleanup = 0x0, private_1 = 0, private_2 = 0}, stackblock = 0x0, 
  stackblock_size = 140732760562480, guardsize = 0, reported_guardsize = 0, tpp = 0x0, res = {retrans = 0, retry = 0, options = 0, nscount = 0, nsaddr_list = {{sin_family = 0, sin_port = 0, sin_addr = {s_addr = 0}, 
        sin_zero = "\000\000\000\000\000\000\000"}, {sin_family = 0, sin_port = 0, sin_addr = {s_addr = 0}, sin_zero = "\000\000\000\000\000\000\000"}, {sin_family = 0, sin_port = 0, sin_addr = {s_addr = 0}, sin_zero = "\000\000\000\000\000\000\000"}}, 
    id = 0, dnsrch = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, defdname = '\000' <repeats 255 times>, pfcode = 0, ndots = 0, nsort = 0, ipv6_unavail = 0, unused = 0, sort_list = {{addr = {s_addr = 0}, mask = 0}, {addr = {s_addr = 0}, mask = 0}, {addr = {
          s_addr = 0}, mask = 0}, {addr = {s_addr = 0}, mask = 0}, {addr = {s_addr = 0}, mask = 0}, {addr = {s_addr = 0}, mask = 0}, {addr = {s_addr = 0}, mask = 0}, {addr = {s_addr = 0}, mask = 0}, {addr = {s_addr = 0}, mask = 0}, {addr = {
          s_addr = 0}, mask = 0}}, __glibc_unused_qhook = 0x0, __glibc_unused_rhook = 0x0, res_h_errno = 0, _vcsock = 0, _flags = 0, _u = {pad = '\000' <repeats 51 times>, _ext = {nscount = 0, nsmap = {0, 0, 0}, nssocks = {0, 0, 0}, nscount6 = 0, 
        nsinit = 0, nsaddrs = {0x0, 0x0, 0x0}, __glibc_extension_index = 0}}}, c11 = false, end_padding = 0x7fcf4ac246f1 ""}
(gdb)


No idea if the below is useful, but grabbed the output just in case:

(gdb) frame 0
#0  __GI_abort () at abort.c:107
107	      ABORT_INSTRUCTION;
(gdb) list
102	
103	  /* Now try to abort using the system specific command.  */
104	  if (stage == 4)
105	    {
106	      ++stage;
107	      ABORT_INSTRUCTION;
108	    }
109	
110	  /* If we can't signal ourselves and the abort instruction failed, exit.  */
111	  if (stage == 5)
(gdb) frame 1
#1  0x0000558b8f4110cc in fatal_signal_run () at lib/fatal-signal.c:313
313	        OVS_NOT_REACHED();
(gdb) list
308	         * termination status reflects that we were killed by this signal */
309	        signal(sig_nr, SIG_DFL);
310	        raise(sig_nr);
311	
312	        ovs_mutex_unlock(&mutex);
313	        OVS_NOT_REACHED();
314	    }
315	}
316	
317	void
(gdb) frame 2
#2  0x0000558b8f4561b0 in poll_block () at lib/poll-loop.c:388
388	    fatal_signal_run();
(gdb) list
383	    loop->timeout_where = NULL;
384	    free(pollfds);
385	    free(wevents);
386	
387	    /* Handle any pending signals before doing anything else. */
388	    fatal_signal_run();
389	
390	    seq_woke();
391	}
392	^L
(gdb) frame 3
#3  0x0000558b8f3962bb in main (argc=<optimized out>, argv=<optimized out>) at northd/ovn-northd.c:15306
15306	        poll_block();
(gdb) list
15301	            VLOG_INFO("Resetting northbound database cluster state");
15302	            ovsdb_idl_reset_min_index(ovnnb_idl_loop.idl);
15303	            reset_ovnnb_idl_min_index = false;
15304	        }
15305	
15306	        poll_block();
15307	        if (should_service_stop()) {
15308	            exiting = true;
15309	        }
15310	    }
(gdb) frame 4
#0  0x0000000000000000 in ?? ()
(gdb) list
15311	
15312	
15313	    free(ovn_internal_version);
15314	    unixctl_server_destroy(unixctl);
15315	    ovsdb_idl_loop_destroy(&ovnnb_idl_loop);
15316	    ovsdb_idl_loop_destroy(&ovnsb_idl_loop);
15317	    service_stop();
15318	
15319	    exit(res);
15320	}
(gdb) frame 5
#0  0x0000000000000000 in ?? ()
(gdb) list
15321	
15322	static void
15323	ovn_northd_exit(struct unixctl_conn *conn, int argc OVS_UNUSED,
15324	                const char *argv[] OVS_UNUSED, void *exiting_)
15325	{
15326	    bool *exiting = exiting_;
15327	    *exiting = true;
15328	
15329	    unixctl_command_reply(conn, NULL);
15330	}
(gdb) frame 6
#0  0x0000000000000000 in ?? ()
(gdb) list
15331	
15332	static void
15333	ovn_northd_pause(struct unixctl_conn *conn, int argc OVS_UNUSED,
15334	                const char *argv[] OVS_UNUSED, void *state_)
15335	{
15336	    struct northd_state  *state = state_;
15337	    state->paused = true;
15338	
15339	    unixctl_command_reply(conn, NULL);
15340	}

Comment 27 jamo luhrsen 2021-10-01 19:04:44 UTC
also, in this case with the upgrade from 4.9 to 4.10, glibc is not changing at all, and ovn only moves to a newer build of the same 21.09 stream (-13 to -19). output
after the upgrade:

  [root@ci-ln-tcn8z4t-f76d1-rznzc-master-2 ~]# rpm -qa | egrep 'ovn|glibc'
  glibc-common-2.28-151.el8.x86_64
  ovn21.09-central-21.09.0-19.el8fdp.x86_64
  ovn21.09-vtep-21.09.0-19.el8fdp.x86_64
  glibc-minimal-langpack-2.28-151.el8.x86_64
  glibc-2.28-151.el8.x86_64
  ovn21.09-21.09.0-19.el8fdp.x86_64
  ovn21.09-host-21.09.0-19.el8fdp.x86_64

and libc.so.6 info:

  [root@ci-ln-tcn8z4t-f76d1-rznzc-master-2 ~]# ls -liH /usr/lib64/libc.so.6
  142606723 -rwxr-xr-x. 1 root root 3167976 Mar  5  2021 /usr/lib64/libc.so.6
  [root@ci-ln-tcn8z4t-f76d1-rznzc-master-2 ~]# sha256sum /usr/lib64/libc.so.6
  5c09418382175429ac7da80007df08c718c3042a167de4e830f2e69c36719ecc  /usr/lib64/libc.so.6
  [root@ci-ln-tcn8z4t-f76d1-rznzc-master-2 ~]#

Comment 29 jamo luhrsen 2021-10-01 19:07:56 UTC
(In reply to Carlos O'Donell from comment #25)
> Just following up here, the next steps would be:
> 
> - A copy of the ovn-northd binary.

it's now attached to this bug and relevant to the info provided in comment #26

> - A copy of a coredump that matches the ovn-northd binary.

you can find that in the tar ball from this link:
  https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.10-upgrade-from-stable-4.9-e2e-gcp-ovn-upgrade/1443661033020329984/artifacts/e2e-gcp-ovn-upgrade/gather-core-dump/artifacts/

> - Exactly release of glibc that the system had.

it's also in comment #26, but it's glibc-2.28-151.el8.x86_64

> 
> From there we can double check that everything lines up and help get a
> backtrace that we can use to move forward.
> 
> Florian's out for a little while, and I'm stepping in to continue supporting
> this from the glibc side.

thanks. let me know what else I can do to help.

Comment 32 DJ Delorie 2021-10-26 21:39:15 UTC
So I did a review of the abort() code, and was able to trivially invoke the HLT opcode using seccomp bpf filters.  If the filter does not allow tgkill, or fails to emulate a trapped tgkill, abort has no way to stop the program other than HLT.  It's also possible to block raise() with seccomp as it also uses the tgkill syscall.

It is also possible to trap SIGABRT, reset its handler on each signal, and longjmp out of the handler (if the handler returns, you abort anyway). However, you just loop then. If you return from the signal handler, abort() will remove any handlers you might have re-added and re-send the signal, and the process dies. Neither of these results in the HLT opcode.

So I don't think getting to the HLT opcode is a concern.  It just means that abort() needed to try really hard to abort.  The question remains, why was abort() called in the first place?  ;-)

Comment 33 Ilya Maximets 2021-10-27 12:11:02 UTC
(In reply to DJ Delorie from comment #32)
> So I did a review of the abort() code, and was able to trivially invoke the
> HLT opcode using seccomp bpf filters.  If the filter does not allow tgkill,
> or fails to emulate a trapped tgkill, abort has no way to stop the program
> other than HLT.  It's also possible to block raise() with seccomp as it also
> uses the tgkill syscall.
> 
> It is also possible to trap SIGABRT and reset it's handler on each signal,
> and longjump out of the handler (if it returns, you abort anyway).  However,
> you just loop then.  If you return from the signal handler, abort() will
> remove any handlers you might have re-added and re-send the signal, and the
> process dies.  Neither of these results in the HLT opcode.
> 
> So I don't think getting to the HLT opcode is a concern.  It just means that
> abort() needed to try really hard to abort.  The question remains, why was
> abort() called in the first place?  ;-)

Well, abort() was called because raise(SIGTERM) failed. From the application's
point of view that's a logical choice, because the application wanted to terminate
itself, and it tries to do that with the instruments it has.

The HLT opcode is a concern for OpenShift CI, because it generates a coredump that
fails the upgrade test.

I don't know if there are some seccomp restrictions applied during the upgrade
and if they can be removed. @

Comment 34 Ilya Maximets 2021-10-27 12:19:35 UTC
(sorry, didn't finish the previous reply)

@jluhrsen, do you know if there are some seccomp restrictions in place?

> It is also possible to trap SIGABRT and reset it's handler on each signal,
> and longjump out of the handler (if it returns, you abort anyway).  However,
> you just loop then.  If you return from the signal handler, abort() will
> remove any handlers you might have re-added and re-send the signal, and the
> process dies.  Neither of these results in the HLT opcode.

@dj, does this work even with seccomp restrictions?

Anyway, sounds like a nasty workaround. :)  I don't think this will be acceptable
upstream.

Comment 35 DJ Delorie 2021-10-27 18:16:53 UTC
raise() and abort() both use the tgkill syscall to deliver signals.  If it's blocked, neither will work.

I don't think the loops will "work" if tgkill is blocked, as they still depend on the signal request being sent to the kernel, and on the kernel deciding how to dispose of it.  I was just commenting on other ways one might THINK could get you to the HLT, but they don't.

Perhaps the application could use exit(0) instead of (or after) raise(SIGTERM) to exit?  Assuming seccomp is blocking tgkill, and needs to continue blocking (or emulating) it.

Comment 36 Ilya Maximets 2021-10-27 18:52:05 UTC
(In reply to DJ Delorie from comment #35)
> Perhaps the application could use exit(0) instead of (or after)
> raise(SIGTERM) to exit?  Assuming seccomp is blocking tgkill, and needs to
> continue blocking (or emulating) it.

We discussed this upstream previously, and the conclusion was: since the application
received a signal and the user wanted to terminate the process with a signal, the
application should be terminated by that signal and have a corresponding exit code.
So, exit(0) is not a good option here and will not be accepted upstream.

And at this point, I think, we still don't know whether there are seccomp restrictions
in place or it's something else.

It also seems a bit weird to prohibit an application from sending signals to itself.
If such restrictions are applied, it should be expected that applications will
not work correctly.

Comment 37 DJ Delorie 2021-10-27 19:50:26 UTC
I agree that answering the seccomp question is key.  Perhaps it's part of running in a container, we've had issues with clone3() adoption in container security recently.

Perhaps exit() could be called *after* raise() is attempted, but before the OVS_NOT_REACHED()?

Either way, is there still any concern about glibc's operation that I could investigate and/or help you understand?

Comment 38 Ilya Maximets 2021-10-27 20:13:04 UTC
(In reply to DJ Delorie from comment #37)
> I agree that answering the seccomp question is key.  Perhaps it's part of
> running in a container, we've had issues with clone3() adoption in container
> security recently.

The weird thing is that this only happens in the upgrade scenario and not
in the usual container termination case (raise of SIGTERM works just fine).

> Perhaps exit() could be called *after* raise() is attempted, but before the
> OVS_NOT_REACHED() ?

Upstream community doesn't want to do that.

> Either way, is there still any concern about glibc's operation that I could
> investigate and/or help you understand?

For now, it looks like we're still in the "unknown" state here.  We need
to find out if there are restrictions applied in the setup during the upgrade.
If so, then we probably don't need help from the glibc side, as everything
is clear in that case and we only need to figure out how to remove the restrictions
or work around the problem from the application side.
If it's not seccomp, then we'll need to investigate further, going deeper into the
kernel or glibc.

If you know how to identify that certain seccomp restrictions are applied, that
might be helpful for @jluhrsen (I don't know how to do that and
I don't know if @jluhrsen knows).
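One common way to check (a sketch; the ovn-northd process name is just the example target): the Seccomp field of /proc/&lt;pid&gt;/status reports the mode the kernel has applied to that process.

```shell
# Inspect the seccomp mode of a running process from the host.
# 0 = disabled, 1 = strict, 2 = filter (BPF) -- a value of 2 means
# some syscalls may be blocked or emulated for this process.
pid=$$   # replace with the target pid, e.g. $(pgrep -f ovn-northd)
awk '/^Seccomp:/ {print "seccomp mode:", $2}' "/proc/$pid/status"
```

Running this against the OVN pids right before the upgrade-time kill would tell us immediately whether a filter (mode 2) is in effect when the cores are produced.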

Comment 39 Carlos O'Donell 2021-10-28 13:49:48 UTC
(In reply to Ilya Maximets from comment #38)
> If you know how to identify that certain seccomp restrictions are applied,
> that
> might be helpful for @jluhrsen (I don't know how to do that and
> I don't know if @jluhrsen knows).

If Jamo is in the West coast timezone, this matches DJ's timezone, and we could schedule a joint debug session to get the strace data we need from the process during the upgrade.

Florian noted we needed the strace here:
https://bugzilla.redhat.com/show_bug.cgi?id=1957030#c18

Jamo noted he doesn't know how to do this.

We need to predict the crash, ssh into the host, and be able to strace -p to attach to it.

The next step is really to increase *observability* into what is going on.

Comment 40 Tim Rozet 2021-10-28 14:30:27 UTC
BTW Jamo I don't think you need to do an actual OCP upgrade; you can simply delete the pod. That will cause crio to send SIGTERM to the process, and northd or ovsdb-server will crash. I'll try verifying this on cluster bot in just a bit.

Comment 41 DJ Delorie 2021-10-28 17:46:37 UTC
Capturing an strace shouldn't be tricky; you need not enter the container to do so.  Florian gave the strace command to use.  Use "ps" to find the pid of the OVN process (the pid is different when you look at it from outside the container, but that's OK, as it's consistent as long as all the tools are outside the container).

Once you have the pid, run the strace command.  It will connect to the running program inside the container, and log all syscalls (to the given -o output file if you use that option, else the screen).

Once strace is running, use /bin/kill to send a TERM signal to the OVN process (using the same pid, of course).

strace will capture all the attempted syscalls, and exit when the OVN process exits.

Comment 42 jamo luhrsen 2021-10-28 17:56:47 UTC
yes, I am PST and can sync with anyone in real time to work on this.
I have to admit that I don't follow a lot of the lingo here, so I might
need some hand-holding.

But, if we think this will reproduce with just a kill on the node
running the process, I can try that now. Trying with a Kind setup
at the moment, and I have a cluster-bot cluster coming up as well.
If this works I can run the strace command no problem and post the
output here.

Comment 43 DJ Delorie 2021-10-28 17:57:59 UTC
... unless the upgrade process itself somehow further restricts syscalls, in which case, the OVN process will terminate "normally", but then we'll have learned something new at least ;-)

Comment 44 Ilya Maximets 2021-10-28 18:06:21 UTC
I'm a bit confused, because I thought that this issue happens only
during upgrade and not during the normal shutdown of containers.
Did I misunderstand, or was normal shutdown never actually tested?

Comment 45 DJ Delorie 2021-10-28 18:15:17 UTC
Heh, yes, that's the assumption.  It can still be easier to work our way towards debugging that by starting with things that are simpler to do and learn, and that help us validate our assumptions.  We have one of two situations:

1. the container is blocking syscalls, in which case any SIGTERM style exit should cause the problem, and we can easily trap and debug it.

2. the upgrade process is blocking syscalls, and we will see a normal tgkill-based exit in the simple trace

Either way we learn something by trying to capture a simple exit, and can focus further debugging by eliminating some possibilities.

(ok, the paranoid in me says "three situations": 3. Something else is happening.  We'll burn that bridge when we get there.)

Even so, once we can get an strace of a user-killed process, it's basically the same method to get an strace of an upgrade-killed process.  ps, strace, start upgrade...

Comment 46 jamo luhrsen 2021-10-28 18:35:44 UTC
ok, first try with an openshift 4.9 cluster from cluster bot and I did not see a core dump created. I captured
the strace of doing a kill on the ovsdb-server process. Note there are 3 ovsdb-server processes, so I
killed the one for our SB database, which matched what gdb saw in comment #5

sh-4.4# ps -elf | grep ovsdb-server
1 S openvsw+    1151       1  0  70 -10 - 17685 -      17:50 ?        00:00:01 ovsdb-server /etc/openvswitch/conf.db -vconsole:emer -vsyslog:err -vfile:info --remote=punix:/var/run/openvswitch/db.sock --private-key=db:Open_vSwitch,SSL,private_key --certificate=db:Open_vSwitch,SSL,certificate --bootstrap-ca-cert=db:Open_vSwitch,SSL,ca_cert --user openvswitch:hugetlbfs --no-chdir --log-file=/var/log/openvswitch/ovsdb-server.log --pidfile=/var/run/openvswitch/ovsdb-server.pid --detach
4 S root        5253    3959  0  80   0 - 50796 -      17:53 ?        00:00:07 ovsdb-server -vconsole:info -vfile:off --log-file=/var/log/ovn/ovsdb-server-nb.log --remote=punix:/var/run/ovn/ovnnb_db.sock --pidfile=/var/run/ovn/ovnnb_db.pid --unixctl=/var/run/ovn/ovnnb_db.ctl --remote=db:OVN_Northbound,NB_Global,connections --private-key=/ovn-cert/tls.key --certificate=/ovn-cert/tls.crt --ca-cert=/ovn-ca/ca-bundle.crt --ssl-protocols=db:OVN_Northbound,SSL,ssl_protocols --ssl-ciphers=db:OVN_Northbound,SSL,ssl_ciphers /etc/ovn/ovnnb_db.db
4 S root        6771    6653  0  80   0 - 52764 -      17:53 ?        00:00:04 ovsdb-server -vconsole:info -vfile:off --log-file=/var/log/ovn/ovsdb-server-sb.log --remote=punix:/var/run/ovn/ovnsb_db.sock --pidfile=/var/run/ovn/ovnsb_db.pid --unixctl=/var/run/ovn/ovnsb_db.ctl --remote=db:OVN_Southbound,SB_Global,connections --private-key=/ovn-cert/tls.key --certificate=/ovn-cert/tls.crt --ca-cert=/ovn-ca/ca-bundle.crt --ssl-protocols=db:OVN_Southbound,SSL,ssl_protocols --ssl-ciphers=db:OVN_Southbound,SSL,ssl_ciphers /etc/ovn/ovnsb_db.db
0 S root       96160   92855  0  80   0 -  2302 -      18:26 ?        00:00:00 grep ovsdb-server
sh-4.4# PID=6771   
sh-4.4# strace -p $PID -f -s 500 -v -o /var/tmp/ovsdb-server.$PID &
[1] 100418
sh-4.4# strace: Process 6771 attached with 3 threads

sh-4.4# 
sh-4.4# ll /var/lib/systemd/coredump/
total 0
drwxr-xr-x. 2 root root  6 Oct 28 17:49 .
drwxr-xr-x. 5 root root 70 Oct 28 17:49 ..
sh-4.4#
sh-4.4# coredumpctl 
No coredumps found.
sh-4.4# 


attached is the file created from the strace command.

next I will issue an strace on the process(es) and issue an upgrade and we can see what happens.

Comment 48 DJ Delorie 2021-10-28 20:02:40 UTC
6992  tgkill(81, 97, SIGTERM)           = 0
6992  rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
6992  --- SIGTERM {si_signo=SIGTERM, si_code=SI_TKILL, si_pid=81, si_uid=0} ---
6991  <... futex resumed>)              = ?
6771  <... futex resumed>)              = ?
6992  +++ killed by SIGTERM +++
6991  +++ killed by SIGTERM +++
6771  +++ killed by SIGTERM +++

This is normal signal processing.

Comment 49 Tim Rozet 2021-10-28 21:11:37 UTC
I'm able to get crashes from northd and ovsdb-server from just killing the pod:
[root@ip-10-0-197-3 /]# chroot /host
sh-4.4# coredumpctl
TIME                            PID   UID   GID SIG COREFILE  EXE
Thu 2021-10-28 20:29:08 UTC    3487     0     0  11 present   /usr/sbin/ovsdb-server


oc delete pod -n openshift-ovn-kubernetes ovnkube-master-pcrpc --force --grace-period=0

Interestingly though...I cannot get it to happen again after I have killed the pods the first time :) I didn't grab the strace while I was doing this. I'll launch a new cluster and try it.

Comment 50 Mark Michelson 2021-10-29 13:58:27 UTC
*** Bug 1900938 has been marked as a duplicate of this bug. ***

Comment 51 jamo luhrsen 2021-10-29 17:31:26 UTC
some interesting things I've just now noticed.

1) I saw what @trozet saw where a pod delete would trigger the ovsdb-server core dump (as well as an ovn-northd dump), but only
   on the first pod deletion.
     - Wondering why a manual SIGTERM doesn't trigger it?
     - Why does it only happen the first time?

2) I was not able to trigger the ovsdb-server core dump on every node (there are 3 of them), but we knew this based on the
   dumps our jobs collect at the end. There is some specific environment that is allowing this to happen.

3) I no longer see ovsdb-server core dumps in openshift 4.10. It's still happening in 4.9 though.

I had an strace running on the wrong ovsdb-server process (there are two of them) when I was reproducing. I'll spin up a new
cluster and be more careful and will attach the strace shortly.

Comment 52 jamo luhrsen 2021-10-29 18:21:23 UTC
> I had an strace running on the wrong ovsdb-server process (there are two of
> them) when I was reproducing. I'll spin up a new
> cluster and be more careful and will attach the strace shortly.

I cannot get a coredump to happen when I attach with strace. Even northd, which
will dump repeatedly when deleting the pod, will not crash when I attach with
strace. Not sure what that means...

Comment 53 Carlos O'Donell 2021-11-05 13:54:49 UTC
(In reply to jamo luhrsen from comment #52)
> > I had an strace running on the wrong ovsdb-server process (there are two of
> > them) when I was reproducing. I'll spin up a new
> > cluster and be more careful and will attach the strace shortly.
> 
> I cannot get a coredump to happen when I attach with strace. even northd
> which
> will dump repeatedly when deleting the pod will not crash when I attach with
> strace. Not sure what that means...

DJ, Jamo,

The next step here is to use systemtap to log tgkill, sigaction, ... anything signal related to get a picture of what is going on with the process.

DJ,

Could you work with Jamo to get a systemtap script that can log these events with enough information that we can see what's going on?

Comment 54 Tim Rozet 2021-11-05 14:40:01 UTC
"I cannot get a coredump to happen when I attach with strace. even northd which
will dump repeatedly when deleting the pod will not crash when I attach with
strace. Not sure what that means..."

I also experienced the same thing. There are 2 ovsdb-server processes and 1 northd per node. Reliably I could cause the core to happen on at least 1 of those processes on *initial* termination when I brought up a cluster. After attaching strace to all 3 processes, per each of the 3 nodes (9 processes total) I could not reproduce the core. Something about strace attachment is bypassing the core dump. Or we are incredibly unlucky :)

Comment 55 DJ Delorie 2021-11-06 01:28:26 UTC
I have two alternatives.  First, a simple stap script:

function foo() { printf("%d: saw %s\n", pid(), ppfunc()); }
probe syscall.tgkill { foo(); }
probe syscall.sigaction { foo(); }
probe syscall.prctl { foo(); }
probe process.end { printf("%d: end\n", pid()); }

Yup, that's the whole script.  Call it "tgkill.stap" for the sample below.  Then you can run:

$ sudo stap tgkill.stap

It will tell you about every process that calls tgkill, sigaction, prctl; and every process exit (although I think you see more of these than you should).  Run it (it takes a while to run the first time as it needs to compile the script), then delete the pod, and if the pod's process successfully calls tgkill, you'll see "saw tgkill" from the stap script, with the pid of the pod's process.  If it's blocked, you'll see nothing.

The second alternative comes from the systemtap examples, and is "strace, but systemtap":

https://sourceware.org/systemtap/examples/#process/strace.stp

Comment 56 jamo luhrsen 2021-11-10 00:50:39 UTC
It took me a bit to figure out how to get this stap test working, but now that I think I have it
working, I'm not seeing any output for the processes that we get core dumps from. The script
produces a lot of output, though. For example, running it for the duration of a pod deletion
until it's restarted fully, I get 29k lines of output. But none of the lines are from the PID(s)
we are interested in. Maybe I've got something set up wrong?

what I'm doing (for posterity, in case I need to remember this):

- create a debug pod
  -- oc debug --image docker.io/centos:centos8 node/ci-ln-f743lmt-72292-l68l8-master-0
     sh-4.4# uname -a
     Linux ci-ln-f743lmt-72292-l68l8-master-1 4.18.0-305.25.1.el8_4.x86_64 #1 SMP Mon Oct 18 14:34:11 EDT 2021 x86_64 x86_64 x86_64 GNU/Linux


- install the magic stap things
  -- dnf install systemtap systemtap-runtime
  -- grab kernel rpms needed from https://brewweb.engineering.redhat.com/brew/buildinfo?buildID=1766284
     -- kernel-4.18.0-305.25.1.el8_4.x86_64.rpm
     -- kernel-debuginfo-common-x86_64-4.18.0-305.25.1.el8_4.x86_64.rpm
     -- kernel-devel-4.18.0-305.25.1.el8_4.x86_64.rpm
     -- kernel-debuginfo-4.18.0-305.25.1.el8_4.x86_64.rpm
  -- copy rpms to debug pod
     -- oc cp -ndefault <each rpm> ci-ln-f743lmt-72292-l68l8-master-1-debug:/tmp/
  -- dnf install <each rpm>
  -- stap-prep
  -- create the tgkill.stap script in /tmp
 
I tried running 'stap /tmp/tgkill.stap', piping that through an egrep of all the PIDs
we see coredumps for (ovsdb-server, northd), and redirecting to a file, but it was empty.
I also ran the script without any egrep, and the resulting output did not have any
lines matching the northd dump that was created.

# shows 29k lines of output
sh-4.4# cat /tmp/tgkill_output.txt  | wc -l
29128

# checking on host what the new PID that we have a core is (297275)
sh-4.4# chroot /host
sh-4.4# coredumpctl 
TIME                            PID   UID   GID SIG COREFILE  EXE
Tue 2021-11-09 23:35:06 UTC    3692     0     0  11 present   /usr/bin/ovn-northd
Tue 2021-11-09 23:35:06 UTC    3806     0     0  11 present   /usr/sbin/ovsdb-server
Tue 2021-11-09 23:35:06 UTC    5399     0     0  11 present   /usr/sbin/ovsdb-server
Tue 2021-11-09 23:45:39 UTC  271658     0     0  11 present   /usr/bin/ovn-northd
Wed 2021-11-10 00:36:08 UTC  297275     0     0  11 present   /usr/bin/ovn-northd
sh-4.4# exit
exit

# that PID is not here:
sh-4.4# egrep 297275 /tmp/tgkill_output.txt


@dj, let me know if you see anything wrong with my steps above. I'll be
trying your other idea of "strace, but systemtap" next.

Comment 57 jamo luhrsen 2021-11-10 22:26:26 UTC
I'm not sure I'm getting the right info from the strace.stp script:
  https://sourceware.org/systemtap/examples/process/strace.stp

steps to get it working (was hitting bug #1940761):

- debug pod
  -- oc debug --image docker.io/centos:centos8 node/ip-10-0-249-171.ec2.internal

- copy kernel and systemtap rpms from https://brewweb.engineering.redhat.com/
    -- kernel-4.18.0-305.25.1.el8_4.x86_64.rpm
    -- kernel-debuginfo-common-x86_64-4.18.0-305.25.1.el8_4.x86_64.rpm
    -- kernel-debuginfo-4.18.0-305.25.1.el8_4.x86_64.rpm  
    -- kernel-devel-4.18.0-305.25.1.el8_4.x86_64.rpm
    -- systemtap-4.5-1.el8.x86_64.rpm
    -- systemtap-client-4.5-1.el8.x86_64.rpm
    -- systemtap-devel-4.5-1.el8.x86_64.rpm
    -- systemtap-runtime-4.5-1.el8.x86_64.rpm

  -- for i in $(ls kernel*); do oc cp -ndefault $i ip-10-0-249-171ec2internal-debug:/tmp/; done
  -- for i in $(ls system*); do oc cp -ndefault $i ip-10-0-249-171ec2internal-debug:/tmp/; done

- install on debug pod with "dnf install /tmp/*rpm"

- run script on northd process id:


sh-4.4# /tmp/strace.stp -x 2415
Thu Jan  1 00:00:00 1970.000000 restart_syscall() = -516
Thu Jan  1 00:00:00 1970.000000 urcu1[2444] restart_syscall() = 1
Wed Nov 10 22:16:05 2021.559796 urcu1[2444] getrusage(RUSAGE_THREAD, 0x7f3924000d58) = 0
Wed Nov 10 22:16:05 2021.559829 urcu1[2444] write(2, "2021-11-10T22:16:05.559Z|00001|fatal_signal(u"..., 104) = 104
Wed Nov 10 22:16:05 2021.559959 urcu1[2444] socket(PF_LOCAL, SOCK_DGRAM|SOCK_CLOEXEC, 0) = 15
Wed Nov 10 22:16:05 2021.559981 urcu1[2444] connect(15, {AF_UNIX, "/dev/log"}, 110) = -2 (ENOENT)
Wed Nov 10 22:16:05 2021.559995 urcu1[2444] close(15) = 0
Wed Nov 10 22:16:05 2021.560008 urcu1[2444] rt_sigaction(SIGSEGV, {SIG_DFL}, 0x7f3931a7b9a0, 8) = 0
Wed Nov 10 22:16:05 2021.560044 urcu1[2444] rt_sigprocmask(SIG_BLOCK, [SIGHUP|SIGINT|SIGQUIT|SIGILL|SIGTRAP|SIGABRT|SIGBUS|SIGFPE|SIGKILL|SIGUSR1|SIGSEGV|SIGPIPE|SIGUSR2|SIGALRM|SIGTERM|SIGCHLD|SIGCONT|SIGSTOP|SIGTSTP|SIGTTIN|SIGTTOU|SIGURG|SIGXCPU|SIGXFSZ|SIGVTALRM|SIGPROF|SIGWINCH|SIGIO/SIGPOLL|SIGPWR|SIGSYS], 0x7f3931a7ba80, 8) = 0
Wed Nov 10 22:16:05 2021.560051 urcu1[2444] getpid() = 1
Wed Nov 10 22:16:05 2021.560070 urcu1[2444] gettid() = 10
Wed Nov 10 22:16:05 2021.560077 urcu1[2444] tgkill(1, 10, SIGSEGV) = 0
Wed Nov 10 22:16:05 2021.560085 urcu1[2444] rt_sigprocmask(SIG_SETMASK, [EMPTY], 0x0, 8) = 0
Wed Nov 10 22:16:02 2021.412143 restart_syscall() = -516
Wed Nov 10 22:16:05 2021.557095 write(5, "\0", 1) = 1
Wed Nov 10 22:16:05 2021.557129 poll(0x556a89082580, 4, 2147483647) = 1
Wed Nov 10 22:16:05 2021.557140 getrusage(RUSAGE_THREAD, 0x556a88ad7198) = 0
Wed Nov 10 22:16:05 2021.557181 write(2, "2021-11-10T22:16:05.557Z|00035|fatal_signal|W"..., 89) = 89
Wed Nov 10 22:16:05 2021.557436 socket(PF_LOCAL, SOCK_DGRAM|SOCK_CLOEXEC, 0) = 15
Wed Nov 10 22:16:05 2021.557462 connect(15, {AF_UNIX, "/dev/log"}, 110) = -2 (ENOENT)
Wed Nov 10 22:16:05 2021.557475 close(15) = 0
Wed Nov 10 22:16:05 2021.557493 unlink("/var/run/ovn/ovn-northd.1.ctl") = 0
Wed Nov 10 22:16:05 2021.557521 unlink("/var/run/ovn/ovn-northd.pid") = 0
Wed Nov 10 22:16:05 2021.557536 rt_sigaction(SIGTERM, {SIG_DFL}, 0x7ffc0c9c7f00, 8) = 0
Wed Nov 10 22:16:05 2021.557553 rt_sigprocmask(SIG_BLOCK, [SIGHUP|SIGINT|SIGQUIT|SIGILL|SIGTRAP|SIGABRT|SIGBUS|SIGFPE|SIGKILL|SIGUSR1|SIGSEGV|SIGPIPE|SIGUSR2|SIGALRM|SIGTERM|SIGCHLD|SIGCONT|SIGSTOP|SIGTSTP|SIGTTIN|SIGTTOU|SIGURG|SIGXCPU|SIGXFSZ|SIGVTALRM|SIGPROF|SIGWINCH|SIGIO/SIGPOLL|SIGPWR|SIGSYS], 0x7ffc0c9c7fe0, 8) = 0
Wed Nov 10 22:16:05 2021.557561 getpid() = 1
Wed Nov 10 22:16:05 2021.557568 gettid() = 1
Wed Nov 10 22:16:05 2021.557575 tgkill(1, 1, SIGTERM) = 0
Wed Nov 10 22:16:05 2021.557582 rt_sigprocmask(SIG_SETMASK, [EMPTY], 0x0, 8) = 0
Wed Nov 10 22:16:05 2021.557588 rt_sigprocmask(SIG_UNBLOCK, [SIGABRT], 0x0, 8) = 0
Wed Nov 10 22:16:05 2021.557599 rt_sigprocmask(SIG_BLOCK, [SIGHUP|SIGINT|SIGQUIT|SIGILL|SIGTRAP|SIGABRT|SIGBUS|SIGFPE|SIGKILL|SIGUSR1|SIGSEGV|SIGPIPE|SIGUSR2|SIGALRM|SIGTERM|SIGCHLD|SIGCONT|SIGSTOP|SIGTSTP|SIGTTIN|SIGTTOU|SIGURG|SIGXCPU|SIGXFSZ|SIGVTALRM|SIGPROF|SIGWINCH|SIGIO/SIGPOLL|SIGPWR|SIGSYS], 0x7ffc0c9c7eb0, 8) = 0
Wed Nov 10 22:16:05 2021.557604 getpid() = 1
Wed Nov 10 22:16:05 2021.557607 gettid() = 1
Wed Nov 10 22:16:05 2021.557611 tgkill(1, 1, SIGABRT) = 0
Wed Nov 10 22:16:05 2021.557616 rt_sigprocmask(SIG_SETMASK, [EMPTY], 0x0, 8) = 0
Wed Nov 10 22:16:05 2021.557621 rt_sigaction(SIGABRT, {SIG_DFL}, 0x0, 8) = 0
Wed Nov 10 22:16:05 2021.557632 rt_sigprocmask(SIG_BLOCK, [SIGHUP|SIGINT|SIGQUIT|SIGILL|SIGTRAP|SIGABRT|SIGBUS|SIGFPE|SIGKILL|SIGUSR1|SIGSEGV|SIGPIPE|SIGUSR2|SIGALRM|SIGTERM|SIGCHLD|SIGCONT|SIGSTOP|SIGTSTP|SIGTTIN|SIGTTOU|SIGURG|SIGXCPU|SIGXFSZ|SIGVTALRM|SIGPROF|SIGWINCH|SIGIO/SIGPOLL|SIGPWR|SIGSYS], 0x7ffc0c9c7eb0, 8) = 0
Wed Nov 10 22:16:05 2021.557637 getpid() = 1
Wed Nov 10 22:16:05 2021.557640 gettid() = 1
Wed Nov 10 22:16:05 2021.557644 tgkill(1, 1, SIGABRT) = 0
Wed Nov 10 22:16:05 2021.557648 rt_sigprocmask(SIG_SETMASK, [EMPTY], 0x0, 8) = 0
Wed Nov 10 22:16:05 2021.557656 rt_sigaction(SIGSEGV, {SIG_DFL}, 0x7ffc0c9c70f0, 8) = 0
Wed Nov 10 22:16:05 2021.557668 rt_sigprocmask(SIG_BLOCK, [SIGHUP|SIGINT|SIGQUIT|SIGILL|SIGTRAP|SIGABRT|SIGBUS|SIGFPE|SIGKILL|SIGUSR1|SIGSEGV|SIGPIPE|SIGUSR2|SIGALRM|SIGTERM|SIGCHLD|SIGCONT|SIGSTOP|SIGTSTP|SIGTTIN|SIGTTOU|SIGURG|SIGXCPU|SIGXFSZ|SIGVTALRM|SIGPROF|SIGWINCH|SIGIO/SIGPOLL|SIGPWR|SIGSYS], 0x7ffc0c9c71d0, 8) = 0
Wed Nov 10 22:16:05 2021.557673 getpid() = 1
Wed Nov 10 22:16:05 2021.557676 gettid() = 1
Wed Nov 10 22:16:05 2021.557680 tgkill(1, 1, SIGSEGV) = 0
Wed Nov 10 22:16:05 2021.557685 rt_sigprocmask(SIG_SETMASK, [SIGSEGV], 0x0, 8) = 0
Wed Nov 10 22:16:05 2021.557690 write(5, "\0", 1) = 1
Thu Jan  1 00:00:00 1970.000000 stopwatch0[2443] restart_syscall() = 1
Wed Nov 10 22:16:05 2021.584209 stopwatch0[2443] getrusage(RUSAGE_THREAD, 0x7f392c000e48) = 0
Wed Nov 10 22:16:05 2021.584238 stopwatch0[2443] write(2, "2021-11-10T22:16:05.584Z|00001|fatal_signal(s"..., 109) = 109
Wed Nov 10 22:16:05 2021.584366 stopwatch0[2443] socket(PF_LOCAL, SOCK_DGRAM|SOCK_CLOEXEC, 0) = 15
Wed Nov 10 22:16:05 2021.584385 stopwatch0[2443] connect(15, {AF_UNIX, "/dev/log"}, 110) = -2 (ENOENT)
Wed Nov 10 22:16:05 2021.584398 stopwatch0[2443] close(15) = 0
Wed Nov 10 22:16:05 2021.584411 stopwatch0[2443] rt_sigaction(SIGSEGV, {SIG_DFL}, 0x7f393227c960, 8) = 0
Wed Nov 10 22:16:05 2021.584426 stopwatch0[2443] rt_sigprocmask(SIG_BLOCK, [SIGHUP|SIGINT|SIGQUIT|SIGILL|SIGTRAP|SIGABRT|SIGBUS|SIGFPE|SIGKILL|SIGUSR1|SIGSEGV|SIGPIPE|SIGUSR2|SIGALRM|SIGTERM|SIGCHLD|SIGCONT|SIGSTOP|SIGTSTP|SIGTTIN|SIGTTOU|SIGURG|SIGXCPU|SIGXFSZ|SIGVTALRM|SIGPROF|SIGWINCH|SIGIO/SIGPOLL|SIGPWR|SIGSYS], 0x7f393227ca40, 8) = 0
Wed Nov 10 22:16:05 2021.584434 stopwatch0[2443] getpid() = 1
Wed Nov 10 22:16:05 2021.584438 stopwatch0[2443] gettid() = 9
Wed Nov 10 22:16:05 2021.584445 stopwatch0[2443] tgkill(1, 9, SIGSEGV) = 0
Wed Nov 10 22:16:05 2021.584452 stopwatch0[2443] rt_sigprocmask(SIG_SETMASK, [EMPTY], 0x0, 8) = 0
ERROR: probe overhead (517092662 cycles) exceeded threshold (500000000 cycles) in last 1000000000 cycles
WARNING: Number of errors: 1, skipped probes: 1441159
WARNING: /usr/bin/staprun exited with status: 1
Pass 5: run failed.  [man error::pass5]
sh-4.4# 



What worries me is the ERROR at the end. I get that after some period of time just letting the
script run without any interaction. BUT, the above output was grabbed as soon as I
started the script and issued a pod delete, which created a coredump on that process id.

sh-4.4# chroot /host
sh-4.4# coredumpctl 
TIME                            PID   UID   GID SIG COREFILE  EXE
Wed 2021-11-10 22:16:06 UTC    2057     0     0  11 present   /usr/sbin/ovsdb-server
Wed 2021-11-10 22:16:06 UTC    2415     0     0  11 present   /usr/bin/ovn-northd
sh-4.4# 


Just to show strace.stp with its ERROR while running idle:

sh-4.4# /tmp/strace.stp -x 194096
Thu Jan  1 00:00:00 1970.000000 stopwatch0[194121] poll() = -516
Wed Nov 10 22:25:36 2021.994618 restart_syscall() = -516
Wed Nov 10 22:25:59 2021.373191 stopwatch0[194121] poll(0x7fc740000b60, 2, 2147483647) = 1
Wed Nov 10 22:25:59 2021.373422 stopwatch0[194121] getrusage(RUSAGE_THREAD, 0x7fc740000ca8) = 0
Wed Nov 10 22:25:59 2021.373429 stopwatch0[194121] read(7, 0x7fc74792bb90, 16) = 1
Wed Nov 10 22:25:59 2021.373440 stopwatch0[194121] poll(0x7fc740000b60, 2, 2147483647) = 1
Wed Nov 10 22:25:59 2021.373445 stopwatch0[194121] getrusage(RUSAGE_THREAD, 0x7fc740000ca8) = 0
Wed Nov 10 22:25:59 2021.373449 stopwatch0[194121] read(7, 0x7fc74792bb90, 16) = 1
Wed Nov 10 22:25:59 2021.373558 poll(0x55737c9a1c50, 4, 2147483647) = 1
Wed Nov 10 22:25:59 2021.374047 getrusage(RUSAGE_THREAD, 0x55737c91e338) = 0
Wed Nov 10 22:25:59 2021.374077 write(8, "\0", 1) = 1
Wed Nov 10 22:25:59 2021.374090 stat("/ovn-cert/tls.key", 0x7ffce9e88f20) = 0
Wed Nov 10 22:25:59 2021.374121 stat("/ovn-ca/ca-bundle.crt", 0x7ffce9e88e80) = 0
Wed Nov 10 22:25:59 2021.374132 read(9, 0x55737c923983, 5) = -11 (EAGAIN)
Wed Nov 10 22:25:59 2021.374142 read(10, 0x55737c97de13, 5) = 5
Wed Nov 10 22:25:59 2021.374147 read(10, 0x55737c97de18, 55) = 55
Wed Nov 10 22:25:59 2021.374167 accept(6, 0x7ffce9e88ec0, 0x7ffce9e88ebc) = -11 (EAGAIN)
Thu Jan  1 00:00:00 1970.000000 accept() = -11 (EAGAIN)
Wed Nov 10 22:25:59 2021.374185 write(8, "\0", 1) = 1
Wed Nov 10 22:25:59 2021.374199 getrusage(RUSAGE_THREAD, 0x55737c91e338) = 0
Wed Nov 10 22:25:59 2021.374203 write(8, "\0", 1) = 1
Wed Nov 10 22:25:59 2021.374208 stat("/ovn-cert/tls.key", 0x7ffce9e88f20) = 0
Wed Nov 10 22:25:59 2021.374214 stat("/ovn-ca/ca-bundle.crt", 0x7ffce9e88e80) = 0
Wed Nov 10 22:25:59 2021.374242 read(9, 0x55737c923983, 5) = -11 (EAGAIN)
Wed Nov 10 22:25:59 2021.374247 read(10, 0x55737c97de13, 5) = -11 (EAGAIN)
Wed Nov 10 22:25:59 2021.374253 accept(6, 0x7ffce9e88ec0, 0x7ffce9e88ebc) = -11 (EAGAIN)
Thu Jan  1 00:00:00 1970.000000 accept() = -11 (EAGAIN)
Wed Nov 10 22:25:59 2021.374261 write(8, "\0", 1) = 1
Wed Nov 10 22:25:36 2021.994656 restart_syscall() = 1
Wed Nov 10 22:25:59 2021.373106 getrusage(RUSAGE_THREAD, 0x55737c91e338) = 0
Wed Nov 10 22:25:59 2021.373124 write(8, "\0", 1) = 1
Wed Nov 10 22:25:36 2021.994608 stopwatch0[194121] restart_syscall() = 1
Wed Nov 10 22:25:59 2021.373144 stopwatch0[194121] getrusage(RUSAGE_THREAD, 0x7fc740000ca8) = 0
Wed Nov 10 22:25:59 2021.373151 stopwatch0[194121] read(7, 0x7fc74792bb90, 16) = 1
Wed Nov 10 22:25:59 2021.373198 stat("/ovn-cert/tls.key", 0x7ffce9e88f20) = 0
Wed Nov 10 22:25:59 2021.373211 stat("/ovn-ca/ca-bundle.crt", 0x7ffce9e88e80) = 0
Wed Nov 10 22:25:59 2021.373230 getrusage(RUSAGE_SELF, 0x7ffce9e88f90) = 0
Wed Nov 10 22:25:59 2021.373242 read(9, 0x55737c923983, 5) = 5
Wed Nov 10 22:25:59 2021.373248 read(9, 0x55737c923988, 158) = 158
Wed Nov 10 22:25:59 2021.373289 read(9, 0x55737c923983, 5) = 5
Wed Nov 10 22:25:59 2021.373294 read(9, 0x55737c923988, 243) = 243
Wed Nov 10 22:25:59 2021.373326 read(9, 0x55737c923983, 5) = -11 (EAGAIN)
Wed Nov 10 22:25:59 2021.373358 read(10, 0x55737c97de13, 5) = -11 (EAGAIN)
Wed Nov 10 22:25:59 2021.373368 accept(6, 0x7ffce9e88ec0, 0x7ffce9e88ebc) = -11 (EAGAIN)
Thu Jan  1 00:00:00 1970.000000 accept() = -11 (EAGAIN)
Wed Nov 10 22:25:59 2021.373385 write(8, "\0", 1) = 1
Wed Nov 10 22:25:59 2021.373411 poll(0x55737d8d7230, 4, 0) = 0
Wed Nov 10 22:25:59 2021.373417 getrusage(RUSAGE_THREAD, 0x55737c91e338) = 0
Wed Nov 10 22:25:59 2021.373435 write(8, "\0", 1) = 1
Wed Nov 10 22:25:59 2021.373443 stat("/ovn-cert/tls.key", 0x7ffce9e88f20) = 0
Wed Nov 10 22:25:59 2021.373477 stat("/ovn-ca/ca-bundle.crt", 0x7ffce9e88e80) = 0
Wed Nov 10 22:25:59 2021.373483 read(9, 0x55737c923983, 5) = -11 (EAGAIN)
Wed Nov 10 22:25:59 2021.373499 write(10, "\027\003\003\0:}\240N\220k\336&\364^v\333I_"..., 63) = 63
Wed Nov 10 22:25:59 2021.373511 read(10, 0x55737c97de13, 5) = -11 (EAGAIN)
Wed Nov 10 22:25:59 2021.373518 accept(6, 0x7ffce9e88ec0, 0x7ffce9e88ebc) = -11 (EAGAIN)
Thu Jan  1 00:00:00 1970.000000 accept() = -11 (EAGAIN)
Wed Nov 10 22:25:59 2021.373535 write(8, "\0", 1) = 1
Wed Nov 10 22:25:59 2021.373453 stopwatch0[194121] poll(0x7fc740000b60, 2, 2147483647) = 1
Wed Nov 10 22:25:59 2021.373554 stopwatch0[194121] getrusage(RUSAGE_THREAD, 0x7fc740000ca8) = 0
Wed Nov 10 22:25:59 2021.373561 stopwatch0[194121] read(7, 0x7fc74792bb90, 16) = 1
Wed Nov 10 22:25:59 2021.374094 stopwatch0[194121] getrusage(RUSAGE_THREAD, 0x7fc740000ca8) = 0
Wed Nov 10 22:25:59 2021.374103 stopwatch0[194121] read(7, 0x7fc74792bb90, 16) = 1
Wed Nov 10 22:25:59 2021.374113 stopwatch0[194121] poll(0x7fc740000b60, 2, 2147483647) = 1
Wed Nov 10 22:25:59 2021.374220 stopwatch0[194121] getrusage(RUSAGE_THREAD, 0x7fc740000ca8) = 0
Wed Nov 10 22:25:59 2021.374225 stopwatch0[194121] read(7, 0x7fc74792bb90, 16) = 2
Wed Nov 10 22:25:59 2021.374232 stopwatch0[194121] poll(0x7fc740000b60, 2, 2147483647) = 1
Wed Nov 10 22:25:59 2021.374275 stopwatch0[194121] getrusage(RUSAGE_THREAD, 0x7fc740000ca8) = 0
Wed Nov 10 22:25:59 2021.374279 stopwatch0[194121] read(7, 0x7fc74792bb90, 16) = 1
Wed Nov 10 22:25:59 2021.395852 getrusage(RUSAGE_THREAD, 0x55737c91e338) = 0
Wed Nov 10 22:25:59 2021.395866 write(8, "\0", 1) = 1
Wed Nov 10 22:25:59 2021.395883 stat("/ovn-cert/tls.key", 0x7ffce9e88f20) = 0
Wed Nov 10 22:25:59 2021.395906 stat("/ovn-ca/ca-bundle.crt", 0x7ffce9e88e80) = 0
Wed Nov 10 22:25:59 2021.395922 read(9, 0x55737c923983, 5) = -11 (EAGAIN)
Wed Nov 10 22:25:59 2021.395933 read(10, 0x55737c97de13, 5) = 5
Wed Nov 10 22:25:59 2021.395938 read(10, 0x55737c97de18, 158) = 158
Wed Nov 10 22:25:59 2021.395968 read(10, 0x55737c97de13, 5) = 5
Wed Nov 10 22:25:59 2021.395972 read(10, 0x55737c97de18, 243) = 243
Wed Nov 10 22:25:59 2021.395985 read(10, 0x55737c97de13, 5) = -11 (EAGAIN)
Wed Nov 10 22:25:59 2021.396012 accept(6, 0x7ffce9e88ec0, 0x7ffce9e88ebc) = -11 (EAGAIN)
Thu Jan  1 00:00:00 1970.000000 accept() = -11 (EAGAIN)
Wed Nov 10 22:25:59 2021.396034 write(8, "\0", 1) = 1
Wed Nov 10 22:25:59 2021.395906 stopwatch0[194121] poll(0x7fc740000b60, 2, 2147483647) = 1
Wed Nov 10 22:25:59 2021.396049 stopwatch0[194121] getrusage(RUSAGE_THREAD, 0x7fc740000ca8) = 0
Wed Nov 10 22:25:59 2021.396082 stopwatch0[194121] read(7, 0x7fc74792bb90, 16) = 1
Wed Nov 10 22:25:59 2021.374285 stopwatch0[194121] poll(0x7fc740000b60, 2, 2147483647) = 1
Wed Nov 10 22:25:59 2021.395889 stopwatch0[194121] getrusage(RUSAGE_THREAD, 0x7fc740000ca8) = 0
Wed Nov 10 22:25:59 2021.395896 stopwatch0[194121] read(7, 0x7fc74792bb90, 16) = 1
ERROR: probe overhead (682059752 cycles) exceeded threshold (500000000 cycles) in last 1000000000 cycles
WARNING: Number of errors: 1, skipped probes: 1334408
WARNING: /usr/bin/staprun exited with status: 1
Pass 5: run failed.  [man error::pass5]
sh-4.4#

Comment 58 jamo luhrsen 2021-11-10 22:27:01 UTC
Please let me know what else I can do to help here. I'm out of tasks to complete :)

Comment 59 DJ Delorie 2021-11-11 01:47:51 UTC
AFAIK the PIDs as seen from within a container (i.e. what getpid() returns) do NOT match the PIDs as seen from outside the container (i.e. what ps shows).  The "main" program in a container (the one you exec from outside the container) will think its PID is 1, but a "ps" from outside the container will give it some other PID.

My interpretation of this trace...

> "2021-11-10T22:16:05.559Z|00001|fatal_signal(u"..., 104) = 104
> Wed Nov 10 22:16:05 2021.559959 urcu1[2444] socket(PF_LOCAL,
> SOCK_DGRAM|SOCK_CLOEXEC, 0) = 15
> Wed Nov 10 22:16:05 2021.559981 urcu1[2444] connect(15, {AF_UNIX, "/dev/log"},
> 110) = -2 (ENOENT)
> Wed Nov 10 22:16:05 2021.559995 urcu1[2444] close(15) = 0
> Wed Nov 10 22:16:05 2021.560008 urcu1[2444] rt_sigaction(SIGSEGV, {SIG_DFL},
> 0x7f3931a7b9a0, 8) = 0
> Wed Nov 10 22:16:05 2021.560044 urcu1[2444] rt_sigprocmask(SIG_BLOCK,
> [SIGHUP|SIGINT|SIGQUIT|SIGILL|SIGTRAP|SIGABRT|SIGBUS|SIGFPE|SIGKILL|SIGUSR1|SIGSEGV|SIGPIPE|SIGUSR2|SIGALRM|SIGTERM|SIGCHLD|SIGCONT|SIGSTOP|SIGTSTP|SIGTTIN|SIGTTOU|SIGURG|SIGXCPU|SIGXFSZ|SIGVTALRM|SIGPROF|SIGWINCH|SIGIO/SIGPOLL|SIGPWR|SIGSYS],
> 0x7f3931a7ba80, 8) = 0
> Wed Nov 10 22:16:05 2021.560051 urcu1[2444] getpid() = 1
> Wed Nov 10 22:16:05 2021.560070 urcu1[2444] gettid() = 10
> Wed Nov 10 22:16:05 2021.560077 urcu1[2444] tgkill(1, 10, SIGSEGV) = 0
> Wed Nov 10 22:16:05 2021.560085 urcu1[2444] rt_sigprocmask(SIG_SETMASK, [EMPTY], 0x0, 8) = 0

Looks like some process named urcu1 (pid 2444?) sends SIGSEGV to thread
group 1, thread 10 (itself).


> Wed Nov 10 22:16:02 2021.412143 restart_syscall() = -516
> Wed Nov 10 22:16:05 2021.557095 write(5, "\0", 1) = 1
> Wed Nov 10 22:16:05 2021.557129 poll(0x556a89082580, 4, 2147483647) = 1
> Wed Nov 10 22:16:05 2021.557140 getrusage(RUSAGE_THREAD, 0x556a88ad7198) = 0
> Wed Nov 10 22:16:05 2021.557181 write(2,
> "2021-11-10T22:16:05.557Z|00035|fatal_signal|W"..., 89) = 89
> Wed Nov 10 22:16:05 2021.557436 socket(PF_LOCAL, SOCK_DGRAM|SOCK_CLOEXEC, 0) = 15
> Wed Nov 10 22:16:05 2021.557462 connect(15, {AF_UNIX, "/dev/log"}, 110) = -2 (ENOENT)
> Wed Nov 10 22:16:05 2021.557475 close(15) = 0

Some logging happens.

> Wed Nov 10 22:16:05 2021.557493 unlink("/var/run/ovn/ovn-northd.1.ctl") = 0
> Wed Nov 10 22:16:05 2021.557521 unlink("/var/run/ovn/ovn-northd.pid") = 0

Last bit of clean up.

Here's where it gets interesting, I think.  We try to die with SIGTERM:

> Wed Nov 10 22:16:05 2021.557536 rt_sigaction(SIGTERM, {SIG_DFL}, 0x7ffc0c9c7f00, 8) = 0

This is raise(SIGTERM):

> Wed Nov 10 22:16:05 2021.557553 rt_sigprocmask(SIG_BLOCK,
> [SIGHUP|SIGINT|SIGQUIT|SIGILL|SIGTRAP|SIGABRT|SIGBUS|SIGFPE|SIGKILL|SIGUSR1|SIGSEGV|SIGPIPE|SIGUSR2|SIGALRM|SIGTERM|SIGCHLD|SIGCONT|SIGSTOP|SIGTSTP|SIGTTIN|SIGTTOU|SIGURG|SIGXCPU|SIGXFSZ|SIGVTALRM|SIGPROF|SIGWINCH|SIGIO/SIGPOLL|SIGPWR|SIGSYS],
> 0x7ffc0c9c7fe0, 8) = 0
> Wed Nov 10 22:16:05 2021.557561 getpid() = 1
> Wed Nov 10 22:16:05 2021.557568 gettid() = 1
> Wed Nov 10 22:16:05 2021.557575 tgkill(1, 1, SIGTERM) = 0
> Wed Nov 10 22:16:05 2021.557582 rt_sigprocmask(SIG_SETMASK, [EMPTY], 0x0, 8) = 0

And now it's unblocked, so should be sent, but it's not, because we keep going...

> Wed Nov 10 22:16:05 2021.557588 rt_sigprocmask(SIG_UNBLOCK, [SIGABRT], 0x0, 8) = 0

We force SIGABRT to be unblocked, and...

> Wed Nov 10 22:16:05 2021.557599 rt_sigprocmask(SIG_BLOCK,
> [SIGHUP|SIGINT|SIGQUIT|SIGILL|SIGTRAP|SIGABRT|SIGBUS|SIGFPE|SIGKILL|SIGUSR1|SIGSEGV|SIGPIPE|SIGUSR2|SIGALRM|SIGTERM|SIGCHLD|SIGCONT|SIGSTOP|SIGTSTP|SIGTTIN|SIGTTOU|SIGURG|SIGXCPU|SIGXFSZ|SIGVTALRM|SIGPROF|SIGWINCH|SIGIO/SIGPOLL|SIGPWR|SIGSYS],
> 0x7ffc0c9c7eb0, 8) = 0
> Wed Nov 10 22:16:05 2021.557604 getpid() = 1
> Wed Nov 10 22:16:05 2021.557607 gettid() = 1
> Wed Nov 10 22:16:05 2021.557611 tgkill(1, 1, SIGABRT) = 0
> Wed Nov 10 22:16:05 2021.557616 rt_sigprocmask(SIG_SETMASK, [EMPTY], 0x0, 8) = 0

Try to send it.

> Wed Nov 10 22:16:05 2021.557621 rt_sigaction(SIGABRT, {SIG_DFL}, 0x0, 8) = 0

Now we've forced the SIGABRT handler to be its default, and...

> Wed Nov 10 22:16:05 2021.557632 rt_sigprocmask(SIG_BLOCK,
> [SIGHUP|SIGINT|SIGQUIT|SIGILL|SIGTRAP|SIGABRT|SIGBUS|SIGFPE|SIGKILL|SIGUSR1|SIGSEGV|SIGPIPE|SIGUSR2|SIGALRM|SIGTERM|SIGCHLD|SIGCONT|SIGSTOP|SIGTSTP|SIGTTIN|SIGTTOU|SIGURG|SIGXCPU|SIGXFSZ|SIGVTALRM|SIGPROF|SIGWINCH|SIGIO/SIGPOLL|SIGPWR|SIGSYS],
> 0x7ffc0c9c7eb0, 8) = 0
> Wed Nov 10 22:16:05 2021.557637 getpid() = 1
> Wed Nov 10 22:16:05 2021.557640 gettid() = 1
> Wed Nov 10 22:16:05 2021.557644 tgkill(1, 1, SIGABRT) = 0
> Wed Nov 10 22:16:05 2021.557648 rt_sigprocmask(SIG_SETMASK, [EMPTY], 0x0, 8) = 0

Try to send it again.

On our next attempt, we should use the HLT opcode and segfault...

> Wed Nov 10 22:16:05 2021.557656 rt_sigaction(SIGSEGV, {SIG_DFL}, 0x7ffc0c9c70f0, 8) = 0
> Wed Nov 10 22:16:05 2021.557668 rt_sigprocmask(SIG_BLOCK,
> [SIGHUP|SIGINT|SIGQUIT|SIGILL|SIGTRAP|SIGABRT|SIGBUS|SIGFPE|SIGKILL|SIGUSR1|SIGSEGV|SIGPIPE|SIGUSR2|SIGALRM|SIGTERM|SIGCHLD|SIGCONT|SIGSTOP|SIGTSTP|SIGTTIN|SIGTTOU|SIGURG|SIGXCPU|SIGXFSZ|SIGVTALRM|SIGPROF|SIGWINCH|SIGIO/SIGPOLL|SIGPWR|SIGSYS],
> 0x7ffc0c9c71d0, 8) = 0
> Wed Nov 10 22:16:05 2021.557673 getpid() = 1
> Wed Nov 10 22:16:05 2021.557676 gettid() = 1
> Wed Nov 10 22:16:05 2021.557680 tgkill(1, 1, SIGSEGV) = 0
> Wed Nov 10 22:16:05 2021.557685 rt_sigprocmask(SIG_SETMASK, [SIGSEGV], 0x0, 8) = 0

That was us calling raise(SIGSEGV)

The rest seems unrelated, I'm not sure what's going on here, I can only
hope it's your software finally realizing it's time to exit ;-)

> Wed Nov 10 22:16:05 2021.557690 write(5, "\0", 1) = 1
> Thu Jan  1 00:00:00 1970.000000 stopwatch0[2443] restart_syscall() = 1
> Wed Nov 10 22:16:05 2021.584209 stopwatch0[2443] getrusage(RUSAGE_THREAD, 0x7f392c000e48) = 0
> Wed Nov 10 22:16:05 2021.584238 stopwatch0[2443] write(2, "2021-11-10T22:16:05.584Z|00001|fatal_signal(s"..., 109) = 109
> Wed Nov 10 22:16:05 2021.584366 stopwatch0[2443] socket(PF_LOCAL, SOCK_DGRAM|SOCK_CLOEXEC, 0) = 15
> Wed Nov 10 22:16:05 2021.584385 stopwatch0[2443] connect(15, {AF_UNIX, "/dev/log"}, 110) = -2 (ENOENT)
> Wed Nov 10 22:16:05 2021.584398 stopwatch0[2443] close(15) = 0
> Wed Nov 10 22:16:05 2021.584411 stopwatch0[2443] rt_sigaction(SIGSEGV, {SIG_DFL}, 0x7f393227c960, 8) = 0

So the trace shows that you're sending yourself the signal, and the tgkill() syscall is "succeeding", but... nothing happens.  Something kernel-side is blocking it; otherwise it would return an error.  We still don't know what seccomp's involvement here is, and I can't think of anything else kernel-side that would block these.

Comment 60 jamo luhrsen 2021-11-16 19:05:22 UTC
> We still don't know what seccomp's involvement here is

Can someone help me figure this out? I've never worked with anything at this level
before. Here are some random data points that may or may not be helpful. Apologies if
this is meaningless:

# on the compute host
sh-4.4# podman info | grep sec
      libseccomp: 2.5.1
  security:
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json

# the pod running northd and ovsdb-server
❯ oc get pod -nopenshift-ovn-kubernetes ovnkube-master-mm9tc -o yaml | yq -r .securityContext
null

# from the container running northd
[root@ci-ln-cczgbr2-72292-tw8gm-master-0 ~]# ps -e | grep northd
      1 ?        00:00:00 ovn-northd
[root@ci-ln-cczgbr2-72292-tw8gm-master-0 ~]# grep Seccomp /proc/1/status
Seccomp:	0


Somewhere, a post told me to look for SECCOMP configs in /boot/config-$(uname -r), but that
file doesn't exist on these cluster systems.

There are some seccomp-related files on the host that look like logging files, but they are empty
(even after a core dump):

sh-4.4# ll /proc/sys/kernel/seccomp
total 0
dr-xr-xr-x. 1 root root 0 Nov 16 17:27 ..
dr-xr-xr-x. 1 root root 0 Nov 16 19:00 .
-rw-r--r--. 1 root root 0 Nov 16 19:00 actions_logged
-r--r--r--. 1 root root 0 Nov 16 19:00 actions_avail

Here are the contents of that file referenced in the podman info output:

sh-4.4# cat /usr/share/containers/seccomp.json
{
	"defaultAction": "SCMP_ACT_ERRNO",
	"archMap": [
		{
			"architecture": "SCMP_ARCH_X86_64",
			"subArchitectures": [
				"SCMP_ARCH_X86",
				"SCMP_ARCH_X32"
			]
		},
		{
			"architecture": "SCMP_ARCH_AARCH64",
			"subArchitectures": [
				"SCMP_ARCH_ARM"
			]
		},
		{
			"architecture": "SCMP_ARCH_MIPS64",
			"subArchitectures": [
				"SCMP_ARCH_MIPS",
				"SCMP_ARCH_MIPS64N32"
			]
		},
		{
			"architecture": "SCMP_ARCH_MIPS64N32",
			"subArchitectures": [
				"SCMP_ARCH_MIPS",
				"SCMP_ARCH_MIPS64"
			]
		},
		{
			"architecture": "SCMP_ARCH_MIPSEL64",
			"subArchitectures": [
				"SCMP_ARCH_MIPSEL",
				"SCMP_ARCH_MIPSEL64N32"
			]
		},
		{
			"architecture": "SCMP_ARCH_MIPSEL64N32",
			"subArchitectures": [
				"SCMP_ARCH_MIPSEL",
				"SCMP_ARCH_MIPSEL64"
			]
		},
		{
			"architecture": "SCMP_ARCH_S390X",
			"subArchitectures": [
				"SCMP_ARCH_S390"
			]
		}
	],
	"syscalls": [
		{
			"names": [
				"_llseek",
				"_newselect",
				"accept",
				"accept4",
				"access",
				"adjtimex",
				"alarm",
				"bind",
				"brk",
				"capget",
				"capset",
				"chdir",
				"chmod",
				"chown",
				"chown32",
				"clock_adjtime",
				"clock_adjtime64",
				"clock_getres",
				"clock_getres_time64",
				"clock_gettime",
				"clock_gettime64",
				"clock_nanosleep",
				"clock_nanosleep_time64",
				"clone",
				"close",
				"close_range",
				"connect",
				"copy_file_range",
				"creat",
				"dup",
				"dup2",
				"dup3",
				"epoll_create",
				"epoll_create1",
				"epoll_ctl",
				"epoll_ctl_old",
				"epoll_pwait",
				"epoll_pwait2",
				"epoll_wait",
				"epoll_wait_old",
				"eventfd",
				"eventfd2",
				"execve",
				"execveat",
				"exit",
				"exit_group",
				"faccessat",
				"faccessat2",
				"fadvise64",
				"fadvise64_64",
				"fallocate",
				"fanotify_mark",
				"fchdir",
				"fchmod",
				"fchmodat",
				"fchown",
				"fchown32",
				"fchownat",
				"fcntl",
				"fcntl64",
				"fdatasync",
				"fgetxattr",
				"flistxattr",
				"flock",
				"fork",
				"fremovexattr",
				"fsconfig",
				"fsetxattr",
				"fsmount",
				"fsopen",
				"fspick",
				"fstat",
				"fstat64",
				"fstatat64",
				"fstatfs",
				"fstatfs64",
				"fsync",
				"ftruncate",
				"ftruncate64",
				"futex",
				"futimesat",
				"get_robust_list",
				"get_thread_area",
				"getcpu",
				"getcwd",
				"getdents",
				"getdents64",
				"getegid",
				"getegid32",
				"geteuid",
				"geteuid32",
				"getgid",
				"getgid32",
				"getgroups",
				"getgroups32",
				"getitimer",
				"getpeername",
				"getpgid",
				"getpgrp",
				"getpid",
				"getppid",
				"getpriority",
				"getrandom",
				"getresgid",
				"getresgid32",
				"getresuid",
				"getresuid32",
				"getrlimit",
				"getrusage",
				"getsid",
				"getsockname",
				"getsockopt",
				"gettid",
				"gettimeofday",
				"getuid",
				"getuid32",
				"getxattr",
				"inotify_add_watch",
				"inotify_init",
				"inotify_init1",
				"inotify_rm_watch",
				"io_cancel",
				"io_destroy",
				"io_getevents",
				"io_setup",
				"io_submit",
				"ioctl",
				"ioprio_get",
				"ioprio_set",
				"ipc",
				"keyctl",
				"kill",
				"lchown",
				"lchown32",
				"lgetxattr",
				"link",
				"linkat",
				"listen",
				"listxattr",
				"llistxattr",
				"lremovexattr",
				"lseek",
				"lsetxattr",
				"lstat",
				"lstat64",
				"madvise",
				"memfd_create",
				"mincore",
				"mkdir",
				"mkdirat",
				"mknod",
				"mknodat",
				"mlock",
				"mlock2",
				"mlockall",
				"mmap",
				"mmap2",
				"mount",
				"move_mount",
				"mprotect",
				"mq_getsetattr",
				"mq_notify",
				"mq_open",
				"mq_timedreceive",
				"mq_timedsend",
				"mq_unlink",
				"mremap",
				"msgctl",
				"msgget",
				"msgrcv",
				"msgsnd",
				"msync",
				"munlock",
				"munlockall",
				"munmap",
				"name_to_handle_at",
				"nanosleep",
				"newfstatat",
				"open",
				"openat",
				"openat2",
				"open_tree",
				"pause",
				"pidfd_getfd",
				"pidfd_open",
				"pidfd_send_signal",
				"pipe",
				"pipe2",
				"pivot_root",
				"poll",
				"ppoll",
				"ppoll_time64",
				"prctl",
				"pread64",
				"preadv",
				"preadv2",
				"prlimit64",
				"pselect6",
				"pselect6_time64",
				"pwrite64",
				"pwritev",
				"pwritev2",
				"read",
				"readahead",
				"readlink",
				"readlinkat",
				"readv",
				"reboot",
				"recv",
				"recvfrom",
				"recvmmsg",
				"recvmsg",
				"remap_file_pages",
				"removexattr",
				"rename",
				"renameat",
				"renameat2",
				"restart_syscall",
				"rmdir",
				"rt_sigaction",
				"rt_sigpending",
				"rt_sigprocmask",
				"rt_sigqueueinfo",
				"rt_sigreturn",
				"rt_sigsuspend",
				"rt_sigtimedwait",
				"rt_tgsigqueueinfo",
				"sched_get_priority_max",
				"sched_get_priority_min",
				"sched_getaffinity",
				"sched_getattr",
				"sched_getparam",
				"sched_getscheduler",
				"sched_rr_get_interval",
				"sched_setaffinity",
				"sched_setattr",
				"sched_setparam",
				"sched_setscheduler",
				"sched_yield",
				"seccomp",
				"select",
				"semctl",
				"semget",
				"semop",
				"semtimedop",
				"send",
				"sendfile",
				"sendfile64",
				"sendmmsg",
				"sendmsg",
				"sendto",
				"setns",
				"set_robust_list",
				"set_thread_area",
				"set_tid_address",
				"setfsgid",
				"setfsgid32",
				"setfsuid",
				"setfsuid32",
				"setgid",
				"setgid32",
				"setgroups",
				"setgroups32",
				"setitimer",
				"setpgid",
				"setpriority",
				"setregid",
				"setregid32",
				"setresgid",
				"setresgid32",
				"setresuid",
				"setresuid32",
				"setreuid",
				"setreuid32",
				"setrlimit",
				"setsid",
				"setsockopt",
				"setuid",
				"setuid32",
				"setxattr",
				"shmat",
				"shmctl",
				"shmdt",
				"shmget",
				"shutdown",
				"sigaltstack",
				"signalfd",
				"signalfd4",
				"sigreturn",
				"socket",
				"socketcall",
				"socketpair",
				"splice",
				"stat",
				"stat64",
				"statfs",
				"statfs64",
				"statx",
				"symlink",
				"symlinkat",
				"sync",
				"sync_file_range",
				"syncfs",
				"sysinfo",
				"syslog",
				"tee",
				"tgkill",
				"time",
				"timer_create",
				"timer_delete",
				"timer_getoverrun",
				"timer_gettime",
				"timer_gettime64",
				"timer_settime",
				"timerfd_create",
				"timerfd_gettime",
				"timerfd_gettime64",
				"timerfd_settime",
				"timerfd_settime64",
				"times",
				"tkill",
				"truncate",
				"truncate64",
				"ugetrlimit",
				"umask",
				"umount",
				"umount2",
				"uname",
				"unlink",
				"unlinkat",
				"unshare",
				"utime",
				"utimensat",
				"utimensat_time64",
				"utimes",
				"vfork",
				"wait4",
				"waitid",
				"waitpid",
				"write",
				"writev"
			],
			"action": "SCMP_ACT_ALLOW",
			"args": [],
			"comment": "",
			"includes": {},
			"excludes": {}
		},
		{
			"names": [
				"personality"
			],
			"action": "SCMP_ACT_ALLOW",
			"args": [
				{
					"index": 0,
					"value": 0,
					"valueTwo": 0,
					"op": "SCMP_CMP_EQ"
				}
			],
			"comment": "",
			"includes": {},
			"excludes": {}
		},
		{
			"names": [
				"personality"
			],
			"action": "SCMP_ACT_ALLOW",
			"args": [
				{
					"index": 0,
					"value": 8,
					"valueTwo": 0,
					"op": "SCMP_CMP_EQ"
				}
			],
			"comment": "",
			"includes": {},
			"excludes": {}
		},
		{
			"names": [
				"personality"
			],
			"action": "SCMP_ACT_ALLOW",
			"args": [
				{
					"index": 0,
					"value": 131072,
					"valueTwo": 0,
					"op": "SCMP_CMP_EQ"
				}
			],
			"comment": "",
			"includes": {},
			"excludes": {}
		},
		{
			"names": [
				"personality"
			],
			"action": "SCMP_ACT_ALLOW",
			"args": [
				{
					"index": 0,
					"value": 131080,
					"valueTwo": 0,
					"op": "SCMP_CMP_EQ"
				}
			],
			"comment": "",
			"includes": {},
			"excludes": {}
		},
		{
			"names": [
				"personality"
			],
			"action": "SCMP_ACT_ALLOW",
			"args": [
				{
					"index": 0,
					"value": 4294967295,
					"valueTwo": 0,
					"op": "SCMP_CMP_EQ"
				}
			],
			"comment": "",
			"includes": {},
			"excludes": {}
		},
		{
			"names": [
				"sync_file_range2"
			],
			"action": "SCMP_ACT_ALLOW",
			"args": [],
			"comment": "",
			"includes": {
				"arches": [
					"ppc64le"
				]
			},
			"excludes": {}
		},
		{
			"names": [
				"arm_fadvise64_64",
				"arm_sync_file_range",
				"sync_file_range2",
				"breakpoint",
				"cacheflush",
				"set_tls"
			],
			"action": "SCMP_ACT_ALLOW",
			"args": [],
			"comment": "",
			"includes": {
				"arches": [
					"arm",
					"arm64"
				]
			},
			"excludes": {}
		},
		{
			"names": [
				"arch_prctl"
			],
			"action": "SCMP_ACT_ALLOW",
			"args": [],
			"comment": "",
			"includes": {
				"arches": [
					"amd64",
					"x32"
				]
			},
			"excludes": {}
		},
		{
			"names": [
				"modify_ldt"
			],
			"action": "SCMP_ACT_ALLOW",
			"args": [],
			"comment": "",
			"includes": {
				"arches": [
					"amd64",
					"x32",
					"x86"
				]
			},
			"excludes": {}
		},
		{
			"names": [
				"s390_pci_mmio_read",
				"s390_pci_mmio_write",
				"s390_runtime_instr"
			],
			"action": "SCMP_ACT_ALLOW",
			"args": [],
			"comment": "",
			"includes": {
				"arches": [
					"s390",
					"s390x"
				]
			},
			"excludes": {}
		},
		{
			"names": [
				"open_by_handle_at"
			],
			"action": "SCMP_ACT_ALLOW",
			"args": [],
			"comment": "",
			"includes": {
				"caps": [
					"CAP_DAC_READ_SEARCH"
				]
			},
			"excludes": {}
		},
		{
			"names": [
				"bpf",
				"fanotify_init",
				"lookup_dcookie",
				"perf_event_open",
				"quotactl",
				"setdomainname",
				"sethostname",
				"setns"
			],
			"action": "SCMP_ACT_ALLOW",
			"args": [],
			"comment": "",
			"includes": {
				"caps": [
					"CAP_SYS_ADMIN"
				]
			},
			"excludes": {}
		},
		{
			"names": [
				"chroot"
			],
			"action": "SCMP_ACT_ALLOW",
			"args": [],
			"comment": "",
			"includes": {
				"caps": [
					"CAP_SYS_CHROOT"
				]
			},
			"excludes": {}
		},
		{
			"names": [
				"delete_module",
				"init_module",
				"finit_module",
				"query_module"
			],
			"action": "SCMP_ACT_ALLOW",
			"args": [],
			"comment": "",
			"includes": {
				"caps": [
					"CAP_SYS_MODULE"
				]
			},
			"excludes": {}
		},
		{
			"names": [
				"get_mempolicy",
				"mbind",
				"set_mempolicy"
			],
			"action": "SCMP_ACT_ALLOW",
			"args": [],
			"comment": "",
			"includes": {
				"caps": [
					"CAP_SYS_NICE"
				]
			},
			"excludes": {}
		},
		{
			"names": [
				"acct"
			],
			"action": "SCMP_ACT_ALLOW",
			"args": [],
			"comment": "",
			"includes": {
				"caps": [
					"CAP_SYS_PACCT"
				]
			},
			"excludes": {}
		},
		{
			"names": [
				"kcmp",
				"process_madvise",
				"process_vm_readv",
				"process_vm_writev",
				"ptrace"
			],
			"action": "SCMP_ACT_ALLOW",
			"args": [],
			"comment": "",
			"includes": {
				"caps": [
					"CAP_SYS_PTRACE"
				]
			},
			"excludes": {}
		},
		{
			"names": [
				"iopl",
				"ioperm"
			],
			"action": "SCMP_ACT_ALLOW",
			"args": [],
			"comment": "",
			"includes": {
				"caps": [
					"CAP_SYS_RAWIO"
				]
			},
			"excludes": {}
		},
		{
			"names": [
				"settimeofday",
				"stime",
				"clock_settime",
				"clock_settime64"
			],
			"action": "SCMP_ACT_ALLOW",
			"args": [],
			"comment": "",
			"includes": {
				"caps": [
					"CAP_SYS_TIME"
				]
			},
			"excludes": {}
		},
		{
			"names": [
				"vhangup"
			],
			"action": "SCMP_ACT_ALLOW",
			"args": [],
			"comment": "",
			"includes": {
				"caps": [
					"CAP_SYS_TTY_CONFIG"
				]
			},
			"excludes": {}
		},
		{
			"names": [
				"socket"
			],
			"action": "SCMP_ACT_ERRNO",
			"args": [
				{
					"index": 0,
					"value": 16,
					"valueTwo": 0,
					"op": "SCMP_CMP_EQ"
				},
				{
					"index": 2,
					"value": 9,
					"valueTwo": 0,
					"op": "SCMP_CMP_EQ"
				}
			],
			"comment": "",
			"includes": {},
			"excludes": {
				"caps": [
					"CAP_AUDIT_WRITE"
				]
			},
			"errnoRet": 22
		},
		{
			"names": [
				"socket"
			],
			"action": "SCMP_ACT_ALLOW",
			"args": [
				{
					"index": 2,
					"value": 9,
					"valueTwo": 0,
					"op": "SCMP_CMP_NE"
				}
			],
			"comment": "",
			"includes": {},
			"excludes": {
				"caps": [
					"CAP_AUDIT_WRITE"
				]
			}
		},
		{
			"names": [
				"socket"
			],
			"action": "SCMP_ACT_ALLOW",
			"args": [
				{
					"index": 0,
					"value": 16,
					"valueTwo": 0,
					"op": "SCMP_CMP_NE"
				}
			],
			"comment": "",
			"includes": {},
			"excludes": {
				"caps": [
					"CAP_AUDIT_WRITE"
				]
			}
		},
		{
			"names": [
				"socket"
			],
			"action": "SCMP_ACT_ALLOW",
			"args": [
				{
					"index": 2,
					"value": 9,
					"valueTwo": 0,
					"op": "SCMP_CMP_NE"
				}
			],
			"comment": "",
			"includes": {},
			"excludes": {
				"caps": [
					"CAP_AUDIT_WRITE"
				]
			}
		},
		{
			"names": [
				"socket"
			],
			"action": "SCMP_ACT_ALLOW",
			"args": null,
			"comment": "",
			"includes": {
				"caps": [
					"CAP_AUDIT_WRITE"
				]
			},
			"excludes": {}
		}
	]
}
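
For reference, the rules above evaluate mechanically: a syscall matches a rule when every entry in `args` compares true against the corresponding syscall argument, and the first matching rule's action wins. The sketch below is a hypothetical evaluator (the names `SOCKET_RULES` and `evaluate` are invented for illustration; this is not libseccomp or runc code) showing why `socket(AF_NETLINK=16, ..., NETLINK_AUDIT=9)` gets errno 22 (EINVAL) under this profile when the process lacks CAP_AUDIT_WRITE:

```python
# Hypothetical illustration of how the profile's "socket" rules evaluate.
# Rule data is transcribed (deduplicated) from the JSON above; the helper
# names are invented for this sketch and are not part of any real tooling.

OPS = {
    "SCMP_CMP_EQ": lambda arg, val: arg == val,
    "SCMP_CMP_NE": lambda arg, val: arg != val,
}

# The profile's "socket" rules as applied to a process *without*
# CAP_AUDIT_WRITE, in the order they appear above.
SOCKET_RULES = [
    # AF_NETLINK (16) + NETLINK_AUDIT (9) -> return errno 22 (EINVAL)
    {"action": ("ERRNO", 22),
     "args": [(0, 16, "SCMP_CMP_EQ"), (2, 9, "SCMP_CMP_EQ")]},
    # protocol != 9 -> allow
    {"action": ("ALLOW", None), "args": [(2, 9, "SCMP_CMP_NE")]},
    # domain != 16 -> allow
    {"action": ("ALLOW", None), "args": [(0, 16, "SCMP_CMP_NE")]},
]

def evaluate(syscall_args):
    """Return the first matching rule's action for a socket() call."""
    for rule in SOCKET_RULES:
        if all(OPS[op](syscall_args[idx], val)
               for idx, val, op in rule["args"]):
            return rule["action"]
    return ("DEFAULT", None)  # fell through to the profile's default action

# socket(AF_NETLINK, SOCK_RAW, NETLINK_AUDIT) is rejected with EINVAL:
print(evaluate((16, 3, 9)))   # ('ERRNO', 22)
# An ordinary TCP socket, socket(AF_INET=2, SOCK_STREAM=1, 0), is allowed:
print(evaluate((2, 1, 0)))    # ('ALLOW', None)
```

This is why an audit-netlink socket fails cleanly with EINVAL rather than killing the process: the ERRNO rule matches before either ALLOW rule can.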

Comment 61 jamo luhrsen 2021-12-01 19:14:14 UTC
anything else I can do here to help?

Comment 62 DJ Delorie 2021-12-01 19:56:13 UTC
I think it's time to get the seccomp or kernel teams involved.  Our groups have exhausted our usual debug options.  I wish I could have been more help, but the glibc bits seem to be doing what they're supposed to be doing, and once we pass off to the kernel...

Comment 63 Mrunal Patel 2021-12-03 19:32:16 UTC
We use CRI-O to run the pods (not podman), and seccomp isn't enabled by default in OpenShift yet. That explains
why you see Seccomp disabled for the northd process inside a pod.
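
The "Seccomp disabled" observation can be confirmed directly from the `Seccomp:` field of `/proc/<pid>/status` on Linux (0 = disabled, 1 = strict, 2 = filter). A minimal sketch of that check (the helper name `seccomp_mode` is made up for illustration):

```python
def seccomp_mode(pid="self"):
    """Read the Seccomp field from /proc/<pid>/status (Linux only).

    Returns 0 (disabled), 1 (strict), or 2 (filter); returns None if
    /proc is unavailable or the field is absent (very old kernels)."""
    try:
        with open(f"/proc/{pid}/status") as f:
            for line in f:
                if line.startswith("Seccomp:"):
                    return int(line.split()[1])
    except FileNotFoundError:
        pass  # not Linux, or /proc not mounted
    return None

# For the northd process discussed above, a result of 0 would confirm
# that no seccomp filter was installed for it.
print(seccomp_mode())
```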

Comment 65 jamo luhrsen 2021-12-11 15:36:23 UTC
@bbennett, do you know someone from the kernel side that can help us out with this one?

Comment 68 Carlos O'Donell 2022-01-28 14:42:01 UTC
What do we think our next steps are here in this bug?

This doesn't seem to be a glibc issue, and is instead a seccomp/kernel issue related to the specific design of the application and how the sandboxing technology is applied.

Is there any way we can help here?

Comment 69 jamo luhrsen 2022-01-28 17:44:42 UTC
(In reply to Carlos O'Donell from comment #68)
> What do we think our next steps are here in this bug?
> 
> This doesn't seem to be a glibc issue, and is instead a seccomp/kernel issue
> related to specific design of the application and how the sandboxing
> technology is applied.
> 
> Is there any way we can help here?

I sort of put this on the back-burner waiting to see if @aconole had any time/ideas about where
to go next. In the meantime, @ffernand has a PR [0] that stops our ovsdb and northd processes
in a way that doesn't produce the core dumps. When that lands and is backported to 4.9 [1], we should end
up with CI jobs that do not get core dumps. That doesn't fix whatever the root problem is, but maybe this
is such an odd corner case that it's not worth spending any more time on? I do have another PR [2] I want to
get in eventually that will fail our CI jobs (at least for OVN jobs) when core dumps are found.

[0] https://github.com/openshift/cluster-network-operator/pull/1281
[1] https://github.com/openshift/cluster-network-operator/pull/1279
[2] https://github.com/openshift/release/pull/20566

Comment 70 Carlos O'Donell 2022-01-31 17:17:32 UTC
(In reply to jamo luhrsen from comment #69)
> (In reply to Carlos O'Donell from comment #68)
> > What do we think our next steps are here in this bug?
> > 
> > This doesn't seem to be a glibc issue, and is instead a seccomp/kernel issue
> > related to specific design of the application and how the sandboxing
> > technology is applied.
> > 
> > Is there any way we can help here?
> 
> I sort of put this on the back-burner waiting to see if @aconole
> had any time/ideas about where
> to go next. In the meantime, @ffernand has a PR [0] that stops
> our ovsdb and northd processes
> in a way that doesn't produce the core dumps. When that lands and it's
> backport to 4.9 [1], we should end
> up with CI jobs that do not get core dumps. That doesn't fix whatever the
> root problem is, but maybe this
> is such an odd corner-case that it's not worth wasting any more time on? I
> do have another PR I want to
> get in eventually that will fail our CI jobs (at least for OVN jobs) when
> core dumps are found.
> 
> [0] https://github.com/openshift/cluster-network-operator/pull/1281
> [1] https://github.com/openshift/cluster-network-operator/pull/1279
> [2] https://github.com/openshift/release/pull/20566

Any objection to marking this bug CLOSED/NOTABUG for now?

We can always come back to review the application, sandbox, kernel, glibc behaviour if you need further help.

Comment 71 jamo luhrsen 2022-02-04 00:10:07 UTC
> 
> Any objection to marking this bug CLOSED/NOTABUG for now?

Is CLOSED/WONTFIX better? That's what I used; feel free to change it if it doesn't make sense.

> We can always come back to review the application, sandbox, kernel, glibc
> behaviour if you need further help.

You can see that we no longer get core dumps in our 4.10->4.11 upgrade jobs [0] (both releases
have the changes that stop the ovsdb/northd processes in a way that does not trigger the core dump). We
still see them in 4.9->4.10 [1], as expected, because that change has not landed in 4.9 yet.

[0] https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.11-upgrade-from-stable-4.10-e2e-gcp-ovn-upgrade/1489294599682789376/artifacts/e2e-gcp-ovn-upgrade/gather-core-dump/artifacts/
[1] https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.10-upgrade-from-stable-4.9-e2e-gcp-ovn-upgrade/1489299314738991104/artifacts/e2e-gcp-ovn-upgrade/gather-core-dump/artifacts/

Comment 72 Dan Williams 2022-02-16 21:53:55 UTC
*** Bug 1954855 has been marked as a duplicate of this bug. ***

Comment 73 Red Hat Bugzilla 2023-11-16 04:25:14 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days