Description of problem:
The rngd service crashes soon after starting on my Power9 system in a VM running Rawhide. I could not reproduce it with F-32 running rngd from Rawhide (rebuilt on F-32). It looks like rngd enters an endless loop and then crashes.

dmesg:
[ 354.226891] rngd[12270]: segfault (11) at 7fffcbedfff8 nip 7fff97b11208 lr 7fff97af0710 code 1 in libc-2.31.9000.so[7fff97a70000+210000]
[ 354.228055] rngd[12270]: code: 4e800020 60420000 4bff9b79 60000000 4bffffb0 00000000 01000000 00000280
[ 354.228645] rngd[12270]: code: 60000000 60420000 3c4c0018 38425e00 <fba1ffe8> f821ffb1 7cbd2b79 4182019c

coredumpctl gdb output:
[root@fedora-ppc ~]# coredumpctl gdb
PID: 12270 (rngd)
UID: 0 (root)
GID: 0 (root)
Signal: 11 (SEGV)
Timestamp: Mon 2020-07-27 15:18:37 CEST (14min ago)
Command Line: /sbin/rngd -f
Executable: /usr/sbin/rngd
Control Group: /system.slice/rngd.service
Unit: rngd.service
Slice: system.slice
Boot ID: ac29af7aaf334c77ac3d0e9112b31388
Machine ID: b969fce095794e53904aaa52f273e756
Hostname: fedora-ppc
Storage: /var/lib/systemd/coredump/core.rngd.0.ac29af7aaf334c77ac3d0e9112b31388.12270.1595855917000000000000.zst
Message: Process 12270 (rngd) of user 0 dumped core.
Stack trace of thread 12270: #0 0x00007fff97b11208 _IO_default_xsputn (libc.so.6 + 0xa1208) #1 0x00007fff97af0710 __vfprintf_internal (libc.so.6 + 0x80710) #2 0x00007fff97b08d3c __vsnprintf_internal (libc.so.6 + 0x98d3c) #3 0x00007fff97bc68d8 __snprintf_chk@@GLIBC_2.17 (libc.so.6 + 0x1568d8) #4 0x000000010669cf20 refill_rand.part.0 (rngd + 0xcf20) #5 0x000000010669d198 init_openssl (rngd + 0xd198) #6 0x000000010669cdf4 refill_rand.part.0 (rngd + 0xcdf4) #7 0x000000010669d198 init_openssl (rngd + 0xd198) #8 0x000000010669cdf4 refill_rand.part.0 (rngd + 0xcdf4) #9 0x000000010669d198 init_openssl (rngd + 0xd198) #10 0x000000010669cdf4 refill_rand.part.0 (rngd + 0xcdf4) #11 0x000000010669d198 init_openssl (rngd + 0xd198) #12 0x000000010669cdf4 refill_rand.part.0 (rngd + 0xcdf4) #13 0x000000010669d198 init_openssl (rngd + 0xd198) #14 0x000000010669cdf4 refill_rand.part.0 (rngd + 0xcdf4) #15 0x000000010669d198 init_openssl (rngd + 0xd198) #16 0x000000010669cdf4 refill_rand.part.0 (rngd + 0xcdf4) #17 0x000000010669d198 init_openssl (rngd + 0xd198) #18 0x000000010669cdf4 refill_rand.part.0 (rngd + 0xcdf4) #19 0x000000010669d198 init_openssl (rngd + 0xd198) #20 0x000000010669cdf4 refill_rand.part.0 (rngd + 0xcdf4) #21 0x000000010669d198 init_openssl (rngd + 0xd198) #22 0x000000010669cdf4 refill_rand.part.0 (rngd + 0xcdf4) #23 0x000000010669d198 init_openssl (rngd + 0xd198) #24 0x000000010669cdf4 refill_rand.part.0 (rngd + 0xcdf4) #25 0x000000010669d198 init_openssl (rngd + 0xd198) #26 0x000000010669cdf4 refill_rand.part.0 (rngd + 0xcdf4) #27 0x000000010669d198 init_openssl (rngd + 0xd198) #28 0x000000010669cdf4 refill_rand.part.0 (rngd + 0xcdf4) #29 0x000000010669d198 init_openssl (rngd + 0xd198) #30 0x000000010669cdf4 refill_rand.part.0 (rngd + 0xcdf4) #31 0x000000010669d198 init_openssl (rngd + 0xd198) #32 0x000000010669cdf4 refill_rand.part.0 (rngd + 0xcdf4) #33 0x000000010669d198 init_openssl (rngd + 0xd198) #34 0x000000010669cdf4 refill_rand.part.0 (rngd + 0xcdf4) #35 
0x000000010669d198 init_openssl (rngd + 0xd198) #36 0x000000010669cdf4 refill_rand.part.0 (rngd + 0xcdf4) #37 0x000000010669d198 init_openssl (rngd + 0xd198) #38 0x000000010669cdf4 refill_rand.part.0 (rngd + 0xcdf4) #39 0x000000010669d198 init_openssl (rngd + 0xd198) #40 0x000000010669cdf4 refill_rand.part.0 (rngd + 0xcdf4) #41 0x000000010669d198 init_openssl (rngd + 0xd198) #42 0x000000010669cdf4 refill_rand.part.0 (rngd + 0xcdf4) #43 0x000000010669d198 init_openssl (rngd + 0xd198) #44 0x000000010669cdf4 refill_rand.part.0 (rngd + 0xcdf4) #45 0x000000010669d198 init_openssl (rngd + 0xd198) #46 0x000000010669cdf4 refill_rand.part.0 (rngd + 0xcdf4) #47 0x000000010669d198 init_openssl (rngd + 0xd198) #48 0x000000010669cdf4 refill_rand.part.0 (rngd + 0xcdf4) #49 0x000000010669d198 init_openssl (rngd + 0xd198) #50 0x000000010669cdf4 refill_rand.part.0 (rngd + 0xcdf4) #51 0x000000010669d198 init_openssl (rngd + 0xd198) #52 0x000000010669cdf4 refill_rand.part.0 (rngd + 0xcdf4) #53 0x000000010669d198 init_openssl (rngd + 0xd198) #54 0x000000010669cdf4 refill_rand.part.0 (rngd + 0xcdf4) #55 0x000000010669d198 init_openssl (rngd + 0xd198) #56 0x000000010669cdf4 refill_rand.part.0 (rngd + 0xcdf4) #57 0x000000010669d198 init_openssl (rngd + 0xd198) #58 0x000000010669cdf4 refill_rand.part.0 (rngd + 0xcdf4) #59 0x000000010669d198 init_openssl (rngd + 0xd198) #60 0x000000010669cdf4 refill_rand.part.0 (rngd + 0xcdf4) #61 0x000000010669d198 init_openssl (rngd + 0xd198) #62 0x000000010669cdf4 refill_rand.part.0 (rngd + 0xcdf4) #63 0x000000010669d198 init_openssl (rngd + 0xd198) Stack trace of thread 12274: #0 0x00007fff982116a4 jent_lfsr_time (libjitterentropy.so.2 + 0x16a4) #1 0x00007fff982119c0 jent_measure_jitter (libjitterentropy.so.2 + 0x19c0) #2 0x00007fff98211a48 jent_gen_entropy (libjitterentropy.so.2 + 0x1a48) #3 0x00007fff98211b28 jent_read_entropy (libjitterentropy.so.2 + 0x1b28) #4 0x000000010669db38 thread_entropy_task (rngd + 0xdb38) #5 0x00007fff97ca9324 start_thread 
(libpthread.so.0 + 0x9324) #6 0x00007fff97bb2ad4 __clone (libc.so.6 + 0x142ad4) GNU gdb (GDB) Fedora 9.2-2.fc33 Copyright (C) 2020 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html> This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details. This GDB was configured as "ppc64le-redhat-linux-gnu". Type "show configuration" for configuration details. For bug reporting instructions, please see: <http://www.gnu.org/software/gdb/bugs/>. Find the GDB manual and other documentation resources online at: <http://www.gnu.org/software/gdb/documentation/>. For help, type "help". Type "apropos word" to search for commands related to "word"... Reading symbols from /usr/sbin/rngd... Reading symbols from /usr/lib/debug/usr/sbin/rngd-6.10-3.fc33.ppc64le.debug... [New LWP 12270] [New LWP 12274] [New LWP 12275] [New LWP 12273] [New LWP 12272] [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib64/libthread_db.so.1". Core was generated by `/sbin/rngd -f '. Program terminated with signal SIGSEGV, Segmentation fault. 
#0 0x00007fff97b11208 in _IO_default_xsputn () from /lib64/libc.so.6 [Current thread is 1 (Thread 0x7fff96eb0020 (LWP 12270))] (gdb) where #0 0x00007fff97b11208 in _IO_default_xsputn () from /lib64/libc.so.6 #1 0x00007fff97af0710 in __vfprintf_internal () from /lib64/libc.so.6 #2 0x00007fff97b08d3c in __vsnprintf_internal () from /lib64/libc.so.6 #3 0x00007fff97bc68d8 in __snprintf_chk@@GLIBC_2.17 () from /lib64/libc.so.6 #4 0x000000010669cf20 in snprintf (__fmt=0x1066a42e0 "[%-6s]: ", __n=0, __s=0x0) at /usr/include/bits/stdio2.h:67 #5 refill_rand (ent_src=ent_src@entry=0x1066c0180 <entropy_sources+288>) at rngd_darn.c:171 #6 0x000000010669d198 in refill_rand (ent_src=0x1066c0180 <entropy_sources+288>) at rngd_darn.c:167 #7 init_openssl (ent_src=0x1066c0180 <entropy_sources+288>) at rngd_darn.c:103 #8 init_openssl (ent_src=0x1066c0180 <entropy_sources+288>) at rngd_darn.c:73 #9 0x000000010669cdf4 in refill_rand (ent_src=ent_src@entry=0x1066c0180 <entropy_sources+288>) at rngd_darn.c:172 #10 0x000000010669d198 in refill_rand (ent_src=0x1066c0180 <entropy_sources+288>) at rngd_darn.c:167 #11 init_openssl (ent_src=0x1066c0180 <entropy_sources+288>) at rngd_darn.c:103 #12 init_openssl (ent_src=0x1066c0180 <entropy_sources+288>) at rngd_darn.c:73 #13 0x000000010669cdf4 in refill_rand (ent_src=ent_src@entry=0x1066c0180 <entropy_sources+288>) at rngd_darn.c:172 #14 0x000000010669d198 in refill_rand (ent_src=0x1066c0180 <entropy_sources+288>) at rngd_darn.c:167 #15 init_openssl (ent_src=0x1066c0180 <entropy_sources+288>) at rngd_darn.c:103 #16 init_openssl (ent_src=0x1066c0180 <entropy_sources+288>) at rngd_darn.c:73 #17 0x000000010669cdf4 in refill_rand (ent_src=ent_src@entry=0x1066c0180 <entropy_sources+288>) at rngd_darn.c:172 #18 0x000000010669d198 in refill_rand (ent_src=0x1066c0180 <entropy_sources+288>) at rngd_darn.c:167 #19 init_openssl (ent_src=0x1066c0180 <entropy_sources+288>) at rngd_darn.c:103 #20 init_openssl (ent_src=0x1066c0180 <entropy_sources+288>) at 
rngd_darn.c:73 #21 0x000000010669cdf4 in refill_rand (ent_src=ent_src@entry=0x1066c0180 <entropy_sources+288>) at rngd_darn.c:172 #22 0x000000010669d198 in refill_rand (ent_src=0x1066c0180 <entropy_sources+288>) at rngd_darn.c:167 #23 init_openssl (ent_src=0x1066c0180 <entropy_sources+288>) at rngd_darn.c:103 #24 init_openssl (ent_src=0x1066c0180 <entropy_sources+288>) at rngd_darn.c:73 #25 0x000000010669cdf4 in refill_rand (ent_src=ent_src@entry=0x1066c0180 <entropy_sources+288>) at rngd_darn.c:172 #26 0x000000010669d198 in refill_rand (ent_src=0x1066c0180 <entropy_sources+288>) at rngd_darn.c:167 #27 init_openssl (ent_src=0x1066c0180 <entropy_sources+288>) at rngd_darn.c:103 #28 init_openssl (ent_src=0x1066c0180 <entropy_sources+288>) at rngd_darn.c:73 #29 0x000000010669cdf4 in refill_rand (ent_src=ent_src@entry=0x1066c0180 <entropy_sources+288>) at rngd_darn.c:172 #30 0x000000010669d198 in refill_rand (ent_src=0x1066c0180 <entropy_sources+288>) at rngd_darn.c:167 #31 init_openssl (ent_src=0x1066c0180 <entropy_sources+288>) at rngd_darn.c:103 #32 init_openssl (ent_src=0x1066c0180 <entropy_sources+288>) at rngd_darn.c:73 #33 0x000000010669cdf4 in refill_rand (ent_src=ent_src@entry=0x1066c0180 <entropy_sources+288>) at rngd_darn.c:172 #34 0x000000010669d198 in refill_rand (ent_src=0x1066c0180 <entropy_sources+288>) at rngd_darn.c:167 #35 init_openssl (ent_src=0x1066c0180 <entropy_sources+288>) at rngd_darn.c:103 #36 init_openssl (ent_src=0x1066c0180 <entropy_sources+288>) at rngd_darn.c:73 #37 0x000000010669cdf4 in refill_rand (ent_src=ent_src@entry=0x1066c0180 <entropy_sources+288>) at rngd_darn.c:172 #38 0x000000010669d198 in refill_rand (ent_src=0x1066c0180 <entropy_sources+288>) at rngd_darn.c:167 #39 init_openssl (ent_src=0x1066c0180 <entropy_sources+288>) at rngd_darn.c:103 #40 init_openssl (ent_src=0x1066c0180 <entropy_sources+288>) at rngd_darn.c:73 #41 0x000000010669cdf4 in refill_rand (ent_src=ent_src@entry=0x1066c0180 <entropy_sources+288>) at 
rngd_darn.c:172 #42 0x000000010669d198 in refill_rand (ent_src=0x1066c0180 <entropy_sources+288>) at rngd_darn.c:167 #43 init_openssl (ent_src=0x1066c0180 <entropy_sources+288>) at rngd_darn.c:103 #44 init_openssl (ent_src=0x1066c0180 <entropy_sources+288>) at rngd_darn.c:73 #45 0x000000010669cdf4 in refill_rand (ent_src=ent_src@entry=0x1066c0180 <entropy_sources+288>) at rngd_darn.c:172 #46 0x000000010669d198 in refill_rand (ent_src=0x1066c0180 <entropy_sources+288>) at rngd_darn.c:167 #47 init_openssl (ent_src=0x1066c0180 <entropy_sources+288>) at rngd_darn.c:103 #48 init_openssl (ent_src=0x1066c0180 <entropy_sources+288>) at rngd_darn.c:73 #49 0x000000010669cdf4 in refill_rand (ent_src=ent_src@entry=0x1066c0180 <entropy_sources+288>) at rngd_darn.c:172 #50 0x000000010669d198 in refill_rand (ent_src=0x1066c0180 <entropy_sources+288>) at rngd_darn.c:167 #51 init_openssl (ent_src=0x1066c0180 <entropy_sources+288>) at rngd_darn.c:103 #52 init_openssl (ent_src=0x1066c0180 <entropy_sources+288>) at rngd_darn.c:73 #53 0x000000010669cdf4 in refill_rand (ent_src=ent_src@entry=0x1066c0180 <entropy_sources+288>) at rngd_darn.c:172 systemctl status rngd: ● rngd.service - Hardware RNG Entropy Gatherer Daemon Loaded: loaded (/usr/lib/systemd/system/rngd.service; enabled; vendor preset: enabled) Active: failed (Result: core-dump) since Mon 2020-07-27 15:18:38 CEST; 20min ago Process: 12270 ExecStart=/sbin/rngd -f (code=dumped, signal=SEGV) Main PID: 12270 (code=dumped, signal=SEGV) CPU: 11.630s Jul 27 15:18:34 fedora-ppc systemd[1]: Started Hardware RNG Entropy Gatherer Daemon. 
Jul 27 15:18:34 fedora-ppc rngd[12270]: Initializing available sources
Jul 27 15:18:34 fedora-ppc rngd[12270]: [hwrng ]: Initialization Failed
Jul 27 15:18:34 fedora-ppc rngd[12270]: [darn ]: Enabling power DARN rng support
Jul 27 15:18:34 fedora-ppc rngd[12270]: [darn ]: Initialized
Jul 27 15:18:34 fedora-ppc rngd[12270]: [jitter]: Initializing AES buffer
Jul 27 15:18:37 fedora-ppc rngd[12270]: [jitter]: Enabling JITTER rng support
Jul 27 15:18:37 fedora-ppc rngd[12270]: [jitter]: Initialized
Jul 27 15:18:37 fedora-ppc rngd[12270]: [pkcs11]: PKCS11 Engine /usr/lib64/opensc-pkcs11.so Error: No such file or directory
Jul 27 15:18:37 fedora-ppc rngd[12270]: [pkcs11]: Initialization Failed
Jul 27 15:18:37 fedora-ppc rngd[12270]: [rtlsdr]: Initialization Failed
Jul 27 15:18:38 fedora-ppc systemd[1]: rngd.service: Main process exited, code=dumped, status=11/SEGV
Jul 27 15:18:38 fedora-ppc systemd[1]: rngd.service: Failed with result 'core-dump'.
Jul 27 15:18:38 fedora-ppc systemd[1]: rngd.service: Consumed 11.630s CPU time.

Version-Release number of selected component (if applicable):
rng-tools-6.10-3.fc33

How reproducible:
100%

Steps to Reproduce:
1. systemctl start rngd

Actual results:
segfault

Expected results:
no segfault

Additional info:
N/A
yup, looks like a loop between init_openssl and refill_rand. I'll set up a test build to fix it asap.
Created attachment 1702669
patch to ensure we never loop forever in darn

https://koji.fedoraproject.org/koji/taskinfo?taskID=48019171

Making a scratch build for you of rng-tools with the above patch. Please confirm that it resolves your issue. Thanks!
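For readers following the backtrace, the failure mode is mutual recursion: one function needs data and calls the other, which calls back, until the stack is exhausted. The sketch below is not the attached patch and does not use the real rngd code; `sketch_source`, `sketch_refill` and `sketch_init` are hypothetical names chosen for illustration. It only shows one common way to break such a cycle, assuming a per-source "initializing" flag is acceptable: the refill path reports failure instead of re-entering init.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-source state; not the real rngd struct. */
struct sketch_source {
    bool initializing;   /* true while sketch_init() is on the call stack */
    size_t buffered;     /* bytes currently available in the buffer */
    int init_calls;      /* how many times init was entered (demo only) */
};

static int sketch_init(struct sketch_source *src);

/* Refill the buffer; falls back to (re)initialization when it is empty. */
static int sketch_refill(struct sketch_source *src)
{
    if (src->buffered > 0)
        return 0;

    /* Guard: if init is already running, fail instead of re-entering it.
     * Without this check, refill -> init -> refill would recurse forever. */
    if (src->initializing)
        return -1;

    return sketch_init(src);
}

/* Initialization needs buffered data, so it asks for a refill. */
static int sketch_init(struct sketch_source *src)
{
    int rc;

    src->init_calls++;
    src->initializing = true;
    rc = sketch_refill(src);
    src->initializing = false;
    return rc;
}
```

With an empty buffer, `sketch_init()` returns -1 after a single pass rather than recursing, so the caller sees a failed initialization instead of a stack overflow.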
So the crash is gone, but now it says "initialization failed" for "darn", which doesn't look right.

Jul 28 17:34:48 fedora-ppc systemd[1]: Started Hardware RNG Entropy Gatherer Daemon.
Jul 28 17:34:48 fedora-ppc rngd[1763]: Initializing available sources
Jul 28 17:34:48 fedora-ppc rngd[1763]: [hwrng ]: Initialization Failed
Jul 28 17:34:48 fedora-ppc rngd[1763]: [darn ]: Initialization Failed
Jul 28 17:34:48 fedora-ppc rngd[1763]: [jitter]: Initializing AES buffer
Jul 28 17:34:48 fedora-ppc rngd[1763]: [jitter]: Unable to obtain AES key, disabling AES in JITTER source
Jul 28 17:34:48 fedora-ppc rngd[1763]: [jitter]: Enabling JITTER rng support
Jul 28 17:34:48 fedora-ppc rngd[1763]: [jitter]: Initialized
Jul 28 17:34:48 fedora-ppc rngd[1763]: [pkcs11]: PKCS11 Engine /usr/lib64/opensc-pkcs11.so Error: No such file or directory
Jul 28 17:34:48 fedora-ppc rngd[1763]: [pkcs11]: Initialization Failed
Jul 28 17:34:48 fedora-ppc rngd[1763]: [rtlsdr]: Initialization Failed
Jul 28 17:34:48 fedora-ppc rngd[1763]: Entropy Generation is slow, consider tuning/adding sources
Jul 28 17:34:48 fedora-ppc rngd[1763]: Entropy Generation is slow, consider tuning/adding sources
Jul 28 17:34:48 fedora-ppc rngd[1763]: Entropy Generation is slow, consider tuning/adding sources
Jul 28 17:34:48 fedora-ppc rngd[1763]: Entropy Generation is slow, consider tuning/adding sources
Jul 28 17:34:48 fedora-ppc rngd[1763]: Entropy Generation is slow, consider tuning/adding sources
Jul 28 17:34:48 fedora-ppc rngd[1763]: Entropy Generation is slow, consider tuning/adding sources
Jul 28 17:34:48 fedora-ppc rngd[1763]: Entropy Generation is slow, consider tuning/adding sources
Jul 28 17:34:48 fedora-ppc rngd[1763]: Entropy Generation is slow, consider tuning/adding sources
Jul 28 17:34:48 fedora-ppc rngd[1763]: Entropy Generation is slow, consider tuning/adding sources
Jul 28 17:34:49 fedora-ppc rngd[1763]: Entropy Generation is slow, consider tuning/adding sources
Jul 28 17:34:49 fedora-ppc rngd[1763]: Entropy Generation is slow, consider tuning/adding sources
Jul 28 17:34:49 fedora-ppc rngd[1763]: Entropy Generation is slow, consider tuning/adding sources
Jul 28 17:34:49 fedora-ppc rngd[1763]: Entropy Generation is slow, consider tuning/adding sources
hmm, what system are you running this on, mind if I take a look at it?
(In reply to Neil Horman from comment #4)
> hmm, what system are you running this on, mind if I take a look at it?

It's sitting on my desk :-) But it's a Rawhide VM running on an F-32 host on a Talos Power9 system. I will try to reproduce that on a Boston P9 system in the morning. DARN is initialized OK in an F-32 VM with rng-tools-6.10-3.fc32.ppc64le (locally rebuilt for F-32).
So no, 6.10-3 crashes in the F-32 VM too, but it took 20 minutes from start to the crash (same backtrace as in Rawhide). I guess I can try it locally on bare metal.
I'll see if I can grab a p9 system in beaker as well
Note to self: Got it reproduced. Looks like a problem with the AES mangling. Will look deeper in the AM.
https://github.com/nhorman/rng-tools/commit/0d6a4c1eb830c8a0c619e18378eba1761d2fa7a2 got it fixed upstream, will backport shortly
This bug appears to have been reported against 'rawhide' during the Fedora 33 development cycle. Changing version to 33.
And it seems Neil forgot to backport the fix. Vladis, could you take care of that, please? Or will you let me do it? Another option is to rebase rng-tools to 6.11 for F>=33. rng-tools in F-32 is OK.
(In reply to Dan Horák from comment #13)
> And it seems Neil forgot to backport the fix. Vladis, could you take care of
> that, please? Or will you let me do it? Another option is to rebase
> rng-tools to 6.11 for F>=33. rng-tools in F-32 is OK.

Hello, Dan,

It is a bit of a weird situation here. While I maintain rng-tools in RHEL, I do not maintain it in Fedora. Even if I wanted to make a fix, I do not have any permissions for the https://src.fedoraproject.org/rpms/rng-tools/ repo. Neil is formally still the package maintainer in Fedora and this bz should be for him. If, as you mention, you can do the fix, please feel free to go ahead and fix it.
Thanks for the info, Vladis. I hadn't noticed that it was Troy who merged your fix in November. I have opened https://src.fedoraproject.org/rpms/rng-tools/pull-request/6 with a rebase to the latest 6.11 and also sent Neil an email. I'll give it some time, then I will commit it.
Hello, Dan,

Thank you for making this rebase PR. It was merged by Neil, but unfortunately it fails the Zuul and CI tests. I cannot tell whether it is a package issue or a build infra issue.
It might also be the CI test that is outdated :-) You should be able to run the test manually and review the results.
https://docs.fedoraproject.org/en-US/ci/tests/#_executing has some docs, unfortunately they are unusable for my ppc64le system
rng-tools-6.12-1 is in Rawhide, F34 and F33, and has been submitted for stable by bodhi for F32 as of today. Closing as resolved.