Bug 1650704 (rhel8-rngd-no-auto-start)
| Summary: | rngd is started early in the boot process and includes jitter entropy | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 8 | Reporter: | bugzilla |
| Component: | rng-tools | Assignee: | Neil Horman <nhorman> |
| Status: | CLOSED ERRATA | QA Contact: | Vilém Maršík <vmarsik> |
| Severity: | high | Priority: | high |
| Version: | 8.1 | CC: | bhu, bugzilla, dgilbert, hwkernel-mgr, jreznik, kernel-qe-hw, mthacker, nhorman, nmavrogi, tmraz, vmarsik |
| Target Milestone: | rc | Target Release: | 8.0 |
| Hardware: | All | OS: | Linux |
| : | 1652468 | Type: | Bug |
| Last Closed: | 2020-04-28 16:06:42 UTC | | |
| Bug Depends On: | 1652468 | Bug Blocks: | 1525054 |

Description
bugzilla
2018-11-16 22:01:40 UTC
Why do you think that After=network.target sshd-keygen.target is the cause of this behavior? The network is needed to start the service, and the server host keys need to be generated too. Please provide the complete journal log, which could give more hints about why the service was not started or what went wrong earlier.

Created attachment 1507152 [details]
journalctl -x
NetworkManager seems to be up and down all over the place. Tried removing it but then have no network at all (as network-scripts are deprecated).

just noticed i don't have to actually login to the console, just entering something to the username prompt (not password) allows me to ssh

You started booting at 09:30:40. sshd is listening in less than 30 seconds:
Nov 19 09:31:04 vbrhel8 sshd[708]: Server listening on 192.168.0.110 port 22.
and after 5 more seconds, it accepts your password:
Nov 19 09:31:09 vbrhel8 sshd[925]: Accepted password for simon from 192.168.0.5 port 54494 ssh2
I do not see any issue with the attached log. Can you point out the event in the log that represents you logging into the console?

ok, uploaded a new journal as i didn't note down the timings. also left a good few mins to ensure it had booted fully, findings are:
appears to have finished booting:
Mon 19 Nov 11:54:30 GMT 2018
2 minutes later.....
ssh: connect to host 192.168.0.110 port 22: Connection refused
Mon 19 Nov 11:56:24 GMT 2018
now i login to console:
Mon 19 Nov 11:57:03 GMT 2018
from journalctl at that time:
Nov 19 11:57:07 vbrhel8 systemd[1]: Started OpenSSH server daemon.
ssh now works.....
Mon 19 Nov 11:57:12 GMT 2018
seems way too coincidental that ssh works instantly as soon as i login to the console, but i can't see anything in the logs really.

Created attachment 1507193 [details]
new journal
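
One way to line up the wall-clock timings reported above with the journal is to pull out only the lines that mark the end of boot, the sshd listener coming up, and the kernel's random-number-generator initialization. The following is a minimal sketch, not something taken from this report: the function name boot_timeline is hypothetical, and the match patterns assume the message wording that appears in the journal excerpts quoted in this bug.

```shell
#!/bin/sh
# Hypothetical helper (not part of any package): reads journal text on stdin
# and keeps only the lines relevant to the "sshd starts late" timeline.
# The patterns assume the message wording quoted in this bug report.
boot_timeline() {
    grep -E 'kernel: random: crng init done|sshd\[[0-9]+\]: Server listening|systemd\[1\]: Startup finished'
}

# Typical use on the affected machine (journalctl -b limits output to the
# current boot):
#   journalctl -b | boot_timeline
```

If the "Server listening" line carries the same timestamp as "crng init done" and both come well after "Startup finished", that suggests sshd was waiting on the kernel random pool rather than on the network.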

seems any input to the console starts ssh - ctrl-alt-del works! is it waiting for getty or looking for a keyboard or something weird i wonder (rng looking for entropy from keypresses?)
Nov 19 12:19:24 vbrhel8 kernel: random: crng init done
Nov 19 12:19:24 vbrhel8 kernel: random: 7 urandom warning(s) missed due to ratelimiting

The boot is really finished this late:
Nov 19 11:55:02 vbrhel8 systemd[1]: Startup finished in 773ms (kernel) + 2.557s (initrd) + 1min 33.578s (userspace) = 1min 36.908s.
The sshd service is ready two minutes later:
Nov 19 11:57:07 vbrhel8 sshd[893]: Server listening on 192.168.0.110 port 22.
but no earlier than we see this message from the kernel, which is probably the cause of the trouble:
Nov 19 11:57:07 vbrhel8 kernel: random: crng init done
Does this happen only on the first boot (which requires sshd to generate the private keys)? The journal could show you more logs if you omit the -x, which might come in handy, but I think you are right that the kernel probably does not have enough entropy at boot time. I will ask around if we have somebody with better insight into this.

There is not much that can be done about an uninitialized CRNG on the openssh side. I am reassigning this to rng-tools to ensure that rngd is started early in the boot process.

there seems to be loads of fedora 28/29 bugs around kernel entropy issues e.g. https://bugzilla.redhat.com/show_bug.cgi?id=1572916

it's not just first boot, journalctl without -x didn't present anything more interesting, this seems to be the order of events:
Nov 19 13:42:39 vbrhel8 kernel: random: crng init done
Nov 19 13:42:39 vbrhel8 kernel: random: 7 urandom warning(s) missed due to ratelimiting
Nov 19 13:42:39 vbrhel8 sshd[713]: Server listening on 192.168.0.110 port 22.
Nov 19 13:42:39 vbrhel8 systemd[1]: Started OpenSSH server daemon.
Nov 19 13:42:39 vbrhel8 systemd[1]: Reached target Multi-User System.
.....blah....
Nov 19 13:42:40 vbrhel8 login[734]: ROOT LOGIN ON tty1
Nov 19 13:42:44 vbrhel8 sshd[925]: Accepted password for simon from 192.168.0.5 port 56374 ssh2

noticed rng-tools isn't installed, so installed it, upon reboot i get the same problem but noticed this in the logs.....
[root@vbrhel8 ~]# journalctl -u rngd
-- Logs begin at Mon 2018-11-19 13:54:23 GMT, end at Mon 2018-11-19 13:54:53 GMT. --
Nov 19 13:54:29 vbrhel8 systemd[1]: Started Hardware RNG Entropy Gatherer Daemon.
Nov 19 13:54:29 vbrhel8 rngd[671]: Failed to init entropy source 0: Hardware RNG Device
Nov 19 13:54:29 vbrhel8 rngd[671]: Failed to init entropy source 1: TPM RNG Device
Nov 19 13:54:29 vbrhel8 rngd[671]: Failed to init entropy source 2: Intel RDRAND Instruction RNG
Nov 19 13:54:29 vbrhel8 rngd[671]: can't open any entropy source
Nov 19 13:54:29 vbrhel8 rngd[671]: Maybe RNG device modules are not loaded
Nov 19 13:54:29 vbrhel8 systemd[1]: rngd.service: Main process exited, code=exited, status=1/FAILURE
Nov 19 13:54:29 vbrhel8 systemd[1]: rngd.service: Failed with result 'exit-code'.

Maybe the jitter entropy source is also not enabled in rngd by default? That would be needed as well.

This actually seems like it should be an ssh problem, no? It would be sufficient and more correct for sshd to alter its service file to add a:
After=rngd.service
directive to its unit file. Reassigning.

dang it appears that RHEL 8 missed the update and is a version behind. We should be on at least version 6.3, which has jitterentropy support. I can update to that level, but I need to get acks on this bug to do so. If you can help me get the pm and qe acks, I'll do the update.

QA can test that rngd is started before sshd and supports jitterentropy. I doubt that we could reproduce this hang on any x86_64, as stated in the description. Is this what is needed here?

https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=19228822
Here's an updated build for the rng-tools package.
It:
1) Is updated to the latest rng-tools upstream, and has jitterentropy support
2) Modifies the rngd service to start as part of the sysinit.target (read: it starts early)
Can you install it and confirm that it solves this issue for you?

i can't get to that site and dnf is only showing 6.2-1 available. but i did a minimal install and rng-tools isn't even part of that, so surely that needs fixing too or is there another way to get entropy without it?

In response to comment 23 - that was meant for Nikon, as it's internal only, but I've exported the packages here:
http://people.redhat.com/nhorman/rpms/rng-tools.tbz2
so that you can test them as well. FWIW, it allows an aarch64 system to boot fine for me.

One note: this fix doesn't include the ability to install rngd by default. That will have to be the subject of another bug against the comps component.

If there is general consensus on the viability of this solution, I'll go ahead and check it in.

yes, installing in an x86_64 vm allows me to ssh in without typing into the console thanks:
[root@vbrhel8 ~]# journalctl|grep -i entropy
Nov 20 23:00:26 vbrhel8 systemd[1]: Started Hardware RNG Entropy Gatherer Daemon.
Nov 20 23:00:26 vbrhel8 rngd[669]: Failed to init entropy source hwrng
Nov 20 23:00:26 vbrhel8 rngd[669]: Failed to init entropy source rdrand
Nov 20 23:00:32 vbrhel8 rngd[669]: Initalizing entropy source jitter
but what about the minimal install which doesn't include rng-tools at all - can it be added as part of @core so it's always installed?

Yes, we also need rngd added to the core comps group.

Tomas, if you could open a bz requesting that rng-tools be added to the default install in comps, I'll finish getting this checked in. Thanks!

Looks fixed to me. On 8.0 Beta 1, I could not log in for more than 2 minutes after the system booted.
The boot took about 20s, but sshd showed as having been running about 3 minutes less than the system uptime:
[root@apm-mustang-b0-01 ~]# cat /etc/redhat-release
Red Hat Enterprise Linux release 8.0 Beta (Ootpa)
[root@apm-mustang-b0-01 ~]# uptime
06:25:07 up 6 min, 1 user, load average: 0.00, 0.08, 0.05
[root@apm-mustang-b0-01 ~]# systemctl status sshd
(...)
Active: active (running) since Fri 2019-03-22 06:22:27 EDT; 2min 44s ago
# systemctl status rngd
Unit rngd.service could not be found.
On latest RC, I could log in immediately after boot finished, and rngd was running:
# cat /etc/redhat-release
Red Hat Enterprise Linux release 8.0 (Ootpa)
# systemctl status sshd
(...)
Active: active (running) since Fri 2019-03-22 11:08:18 EDT; 3h 59min left
# systemctl status rngd
(...)
Active: active (running) since Fri 2019-03-22 11:08:16 EDT; 3h 59min left
Just not sure why sshd/rngd claim to have been running for 4h, when I installed the machine less than 1h ago. Will open a new bug report for that if I can reproduce it on another machine. Considering this bug verified.

this seems to have reared its head again in 8.1 beta - sshd doesn't start in a vm until the mouse has been moved about, despite rngd.service with jitter running
rng-tools-6.6-2.el8

just seen the problem in centos 8.0.1905 too - seems to have the same package as rhel 8.1beta:
rng-tools-6.6-2.el8.x86_64
also this is in a qemu-kvm vm not virtualbox

I've figured it out - it's triggered by setting ListenAddress to a specific IPv4 address in /etc/ssh/sshd_config
this causes the delay:
ListenAddress 192.168.1.2
this works fine:
ListenAddress 0.0.0.0
so the rngd fix was working, but this was causing a similar delay i guess.

Could not reproduce on 8.1 beta, but at least latest 8.2 did not show any delay of ssh login after reboot. Neil did not object. Closing the bug as Verified.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.
For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2020:1762
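
For anyone still seeing a delayed sshd start after installing the fixed rng-tools, the ListenAddress observation reported earlier in this thread is quick to check. The following is a minimal sketch under stated assumptions: the function name specific_listen_addresses is hypothetical, it only does a plain text scan of the file (it does not ask sshd for its effective configuration), and it treats only 0.0.0.0 and :: as wildcard addresses.

```shell
#!/bin/sh
# Hypothetical helper: print every ListenAddress value in an sshd_config-style
# file that is bound to a specific address rather than a wildcard.
# Commented-out lines are skipped because their first field starts with '#'.
specific_listen_addresses() {
    config_file="${1:-/etc/ssh/sshd_config}"
    awk 'tolower($1) == "listenaddress" && $2 != "0.0.0.0" && $2 != "::" { print $2 }' "$config_file"
}

# Typical use:
#   specific_listen_addresses /etc/ssh/sshd_config
```

No output means no specific ListenAddress is configured. Any address printed is a candidate for the delay described in the thread, which the reporter worked around by using ListenAddress 0.0.0.0.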