Bug 1912059 - Freeradius no longer starts after 60 days due to expired certificate
Summary: Freeradius no longer starts after 60 days due to expired certificate
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: freeradius
Version: 8.3
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: 8.0
Assignee: Alex Scheel
QA Contact: Filip Dvorak
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-01-03 04:42 UTC by redhat
Modified: 2021-07-28 08:14 UTC (History)
CC List: 7 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-01-05 18:59:14 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Attachments
Issue Reproduction (26.35 KB, text/plain)
2021-01-05 22:59 UTC, redhat

Description redhat 2021-01-03 04:42:38 UTC
Description of problem:
After 60 days from the first start of radiusd.service, it will refuse to start due to a certificate expiration error:

systemd[1]: Starting FreeRADIUS high performance RADIUS server....
sh[1046556]: C = FR, ST = Radius, L = Somewhere, O = Example Inc., emailAddress = admin, CN = Example Certificate Authority
sh[1046556]: error 10 at 1 depth lookup: certificate has expired
sh[1046556]: C = FR, ST = Radius, O = Example Inc., CN = Example Server Certificate, emailAddress = admin
sh[1046556]: error 10 at 0 depth lookup: certificate has expired
sh[1046556]: error server.pem: verification failed
sh[1046556]: make: *** [Makefile:107: server.vrfy] Error 2
systemd[1]: radiusd.service: Control process exited, code=exited status=2
systemd[1]: radiusd.service: Failed with result 'exit-code'.
systemd[1]: Failed to start FreeRADIUS high performance RADIUS server..


This error is related to the EAP-TLS CA certificate located at /etc/raddb/certs/server.pem.


Due to a recent change in the radiusd service file (Bug #1672285), the bootstrap script is now called at every service start. After 60 days the certificate expires, and the bootstrap script dies when run. This prevents radiusd from starting.


The /etc/raddb/certs/bootstrap file explicitly states that bootstrap should not be run twice:

#  This is a wrapper script to create default certificates when the
#  server first starts in debugging mode.  Once the certificates have been
#  created, this file should be deleted.

If certificate generation needs to be done at service start, the script should be set up to only run once. The file should not be removed, as it's not marked as a config file in the freeradius rpm template and will simply be reinstalled when freeradius is updated. (Maybe it could just regenerate the certificate, but the bootstrap script doesn't appear to have this ability.)

Generated certificate expiry:

openssl x509 -in server.pem -noout -text | grep -e{After,Before}
            Not Before: Oct 25 13:40:58 2020 GMT
            Not After : Dec 24 13:40:58 2020 GMT
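
As a quick check of the lifetime implied by the dates above, the following sketch computes the number of days between the two timestamps. It assumes GNU date (for the -d option); validity_days is a helper name introduced here for illustration and is not part of freeradius or openssl.

```shell
#!/bin/sh
# validity_days is a hypothetical helper (not part of any package).
# Assumes GNU date, which accepts the timestamp format printed by openssl.
# $1: "Not Before" timestamp, $2: "Not After" timestamp.
validity_days() {
    not_before=$(date -u -d "$1" +%s) || return 1
    not_after=$(date -u -d "$2" +%s) || return 1
    # Whole days between the two timestamps.
    echo $(( (not_after - not_before) / 86400 ))
}

# Example with the dates from the output above:
# validity_days 'Oct 25 13:40:58 2020 GMT' 'Dec 24 13:40:58 2020 GMT'
```

For the dates shown above this prints 60, which matches the 60-day figure in the summary.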


Version-Release number of selected component (if applicable):
freeradius-3.0.20-3.module_el8.3.0+476+0982bc20.x86_64

Steps to Reproduce:
1. Start the radiusd service.
2. Wait 60 days.
3. Attempt to restart the radiusd service.

Expected Results:
Freeradius starts.

Actual Results:
Freeradius dies with above certificate error.

Additional info:
The radiusd.service override noted in Bug #1672285 can be used as a workaround once you've hit the 60-day expiration (there is no obvious way to regenerate the cert) until this is fixed upstream.

Comment 1 Alex Scheel 2021-01-04 15:28:15 UTC
Generally, you'd want to drop in your own CA infrastructure rather than running off of the self-signed bootstraps. As you pointed out, upstream really wasn't intending for users to run off of the bootstrap script for a longer period of time... I can patch it to work, but I guess the question is: why use $RANDOM_PACKAGE's CA infra, rather than maintaining your own?

If you file a customer case, I can see if I can get this backported, but otherwise it is looking like it'll have to wait for a later release.

Comment 2 redhat 2021-01-04 21:54:06 UTC
(In reply to Alex Scheel from comment #1)
> Generally, you'd want to drop in your own CA infrastructure rather than
> running off of the self-signed bootstraps. As you pointed out, upstream
> really wasn't intending for users to run off of the bootstrap script for a
> longer period of time... I can patch it to work, but I guess the question
> is: why use $RANDOM_PACKAGE's CA infra, rather than maintaining your own?

The issue I've described here happens with a stock install of freeradius. Personally I don't use the CA that's generated by the bootstrap script, or use the EAP module. However the script is always called by the radiusd service even on a fresh install.

I managed to replicate this on a fresh install of freeradius on CentOS 8.3 with the following:

# dnf install freeradius
# systemctl start radiusd
# systemctl stop radiusd
# cd /etc/raddb/certs/
# openssl x509 -in server.pem -noout -text | grep -e{Before,After}
            Not Before: Jan  4 21:26:48 2021 GMT
            Not After : Mar  5 21:26:48 2021 GMT
# date -s 'Mar  6 21:26:48 2021 GMT'
# systemctl restart radiusd
Job for radiusd.service failed because the control process exited with error code.
See "systemctl status radiusd.service" and "journalctl -xe" for details.
# journalctl -u radiusd | tail -n 10
Mar 07 07:56:51 hostname systemd[1]: Starting FreeRADIUS high performance RADIUS server....
Mar 07 07:56:51 hostname sh[961667]: C = FR, ST = Radius, L = Somewhere, O = Example Inc., emailAddress = admin, CN = Example Certificate Authority
Mar 07 07:56:51 hostname sh[961667]: error 10 at 1 depth lookup: certificate has expired
Mar 07 07:56:51 hostname sh[961667]: C = FR, ST = Radius, O = Example Inc., CN = Example Server Certificate, emailAddress = admin
Mar 07 07:56:51 hostname sh[961667]: error 10 at 0 depth lookup: certificate has expired
Mar 07 07:56:51 hostname sh[961667]: error server.pem: verification failed
Mar 07 07:56:51 hostname sh[961667]: make: *** [Makefile:107: server.vrfy] Error 2
Mar 07 07:56:51 hostname systemd[1]: radiusd.service: Control process exited, code=exited status=2
Mar 07 07:56:51 hostname systemd[1]: radiusd.service: Failed with result 'exit-code'.
Mar 07 07:56:51 hostname systemd[1]: Failed to start FreeRADIUS high performance RADIUS server..
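
The verification failure can probably also be reproduced without changing the system clock. The sketch below assumes that the installed openssl verify accepts the -attime option (which takes a Unix timestamp) and that GNU date is available. verify_at is a hypothetical helper, and the OPENSSL variable simply mirrors the one used in the certs Makefile.

```shell
#!/bin/sh
# Hypothetical helper: verify a certificate as if the current time were "$3".
# Assumes GNU date and an openssl whose verify command accepts -attime.
OPENSSL=${OPENSSL:-openssl}

# $1: CA file, $2: certificate to verify, $3: point in time to verify at.
verify_at() {
    ca=$1; cert=$2; when=$3
    # Convert the human-readable time to seconds since the epoch.
    ts=$(date -u -d "$when" +%s) || return 1
    "$OPENSSL" verify -CAfile "$ca" -attime "$ts" "$cert"
}

# Example (run in /etc/raddb/certs):
# verify_at ca.pem server.pem 'Mar  6 21:26:48 2021 GMT'
```

A non-zero exit status for a time after "Not After" would correspond to the "certificate has expired" error in the journal output above.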


> 
> If you file a customer case, I can see if I can get this backported, but
> otherwise it is looking like it'll have to wait for a later release.

As stated above, this happens on a fresh install of freeradius on CentOS 8.3 (and presumably RHEL 8.3?).

Comment 3 Alex Scheel 2021-01-05 15:05:08 UTC
> I managed to replicate this on a fresh install of freeradius on CentOS 8.3 with the following:

Right, but you can do the same with e.g. apache-httpd. If you set up something that requires an x509 infrastructure and _ignore the x509 infrastructure requirements_, you end up with, well, a broken application. And unless the bootstrap script knows, beyond a doubt, that the certs that are in the directory are the ones generated by us, you don't want to go overwriting random certificates and keys.

IMO, the recommended solution should be to disable EAP-TLS/... unless you're sure you're using them and using them with a proper x509 infrastructure. Just like you would with httpd: you wouldn't leave the default https server enabled without giving it a proper certificate and configuring it to show real content.


> As stated above, this happens on a fresh install of freeradius on CentOS 8.3 (and presumably RHEL 8.3?).

From my point of view, a customer case would be required to get this backported to 8.3 and shipped in any y-stream earlier than 8.5.

Comment 4 redhat 2021-01-05 16:33:43 UTC
Sorry you've missed the bug here. The bootstrap script will crash if the test certificate is not valid, failing radiusd.service before it even starts freeradius. The test cert is something created by the bootstrap script, and this issue will occur whether you use this test cert or not. Looking at this more closely, it may only be repeatable if make is installed.

The bootstrap script calls "make all" (if make is installed) to generate the CA and cert for testing purposes, which eventually calls the verify function on line 108:

.PHONY: server.vrfy
server.vrfy: ca.pem
        @$(OPENSSL) verify $(PARTIAL) -CAfile ca.pem server.pem

If you try to restart radiusd after the test cert (server.pem) has expired, this verify function will fail. This causes bootstrap to fail, and the radiusd.service to fail before it has even started freeradius. (As a test I removed this verify line, and the issue no longer occurs).

Bug #1672285 introduced this issue by calling the bootstrap script multiple times. The logic in the bootstrap or make file needs to be modified to do nothing if the test cert exists.

P.S:

Looking at the code more closely, it seems you only hit this bug if you have make installed (in which case bootstrap calls "make all"). The fallback code in bootstrap is a little more sensible: it only creates and verifies server.pem when that file doesn't already exist.

if [ ! -e server.pem ]; then
  openssl pkcs12 -in server.p12 -out server.pem -passin pass:`grep output_password server.cnf | sed 's/.*=//;s/^ *//'` -passout pass:`grep output_password server.cnf | sed 's/.*=//;s/^ *//'` || exit 1
  openssl verify -CAfile ca.pem server.pem || exit 1
  chmod g+r server.pem
fi
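
One way to express the "do nothing if the test cert exists" behaviour for the make path as well is a guard in front of the generation step. This is only a sketch of the idea, not the upstream or packaged fix; bootstrap_once and its calling convention are hypothetical.

```shell
#!/bin/sh
# Hypothetical run-once guard; not the actual bootstrap script.
# $1: certificate directory, remaining args: command that generates the certs.
bootstrap_once() {
    certdir=$1; shift
    if [ -e "$certdir/server.pem" ]; then
        # Certificates already exist: leave them alone, even if expired,
        # so that an expired test cert cannot block service start.
        echo "server.pem exists, skipping bootstrap"
        return 0
    fi
    # First run: generate the certificates inside the cert directory.
    (cd "$certdir" && "$@")
}

# Example: bootstrap_once /etc/raddb/certs make all
```

The trade-off of such a guard is that an expired test certificate is never regenerated unless server.pem is removed by hand.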

Comment 5 Alex Scheel 2021-01-05 18:59:14 UTC
> Sorry you've missed the bug here. 

What you call a bug, I really think is just misconfiguration. You shouldn't leave bootstrap enabled for any period of time in production, if you aren't using said functionality. That means following the steps in the BZ you already linked to, to disable it. You'll either be disabling it because you've provided your own X509 infrastructure, or you'll be disabling it because you're not using EAP-TLS/EAP-TTLS/PEAP methods. The other option is hooking your own X509 infrastructure into the bootstrap/... scripts, but that'll be overwritten on upgrade.


The issue with the solution in bootstrap without make, is that certificates won't get renewed. You'd still have to go in and delete a file, otherwise the certificates won't regenerate.



My 2c., but this really isn't a bug -- this is misconfiguration.

Comment 6 William Brown 2021-01-05 21:56:21 UTC
Hey there, just to help clarify here. I think that there are two issues being reported here:

* A missing Requires on make for the bootstrap to run
* That even if the test cert (server.pem) is NOT used, its legitimacy is checked, causing the server to fail.

As a mock example you can imagine I create freeradius with a different CA/server cert, so I reconfigure the eap module with:

    certificate_file = ${certdir}/mycert.pem

So even though I would be using my own x509 certs/ca that are legitimate and valid, because the bootstrap script is always run and checks the test cert in server.pem, it will cause the server to fail despite it NOT being used.

So I don't think this is a misconfiguration in this case. I think that the original report is correct and that bootstrap should "only run once", especially given it has the capability to prevent a server starting.

Comment 7 Alex Scheel 2021-01-05 22:50:57 UTC
> * A missing Requires on make for the bootstrap to run

Make is NOT required for the bootstrap script to run, but we still do add it as a Requires in the RPM spec. QE liked the behavior with make better than without make (earlier tickets discussed those behavior differences).

It runs with and without make just fine. The behavior is admittedly a little different between the two implementations, but this is due to how Make processes rules.


> * That even if the test ca (server.pem) is NOT used, its legitimacy is checked causing the server to fail.

Systemd unit files are considered configuration. We provide a mechanism for overriding this configuration in a manner that persists across package updates (systemd unit overrides). This is the preferred mechanism for disabling the bootstrap script when it is no longer required. See my comment 36 on https://bugzilla.redhat.com/show_bug.cgi?id=1672285#c36.
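
A drop-in override of the kind described here might look like the sketch below. The exact contents recommended in bug 1672285 comment 36 are not reproduced in this report, so this is an assumption: it clears every ExecStartPre= line from the packaged unit (an empty assignment resets the list), so any other pre-start steps shown by "systemctl cat radiusd" would need to be re-added in the drop-in. write_override and the file name no-bootstrap.conf are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper that writes a systemd drop-in clearing all
# ExecStartPre= steps of radiusd.service, including the bootstrap call.
# $1: drop-in directory (defaults to the standard location for this unit).
write_override() {
    dir=${1:-/etc/systemd/system/radiusd.service.d}
    mkdir -p "$dir" || return 1
    cat > "$dir/no-bootstrap.conf" <<'EOF'
[Service]
# An empty assignment resets the ExecStartPre= list from the packaged unit.
# Re-add any pre-start commands other than bootstrap below this line.
ExecStartPre=
EOF
}

# Example (as root): write_override && systemctl daemon-reload
```

Because the drop-in lives under /etc/systemd/system, it is not overwritten when the freeradius package is updated.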

Comment 8 redhat 2021-01-05 22:59:52 UTC
Created attachment 1744729 [details]
Issue Reproduction

Log of issue reproduction on freshly installed VM.

Comment 9 redhat 2021-01-05 23:03:00 UTC
I've attached a reproduction of this issue on a freshly installed VM. Based on this, I'm going to assume every user is going to hit this issue. If this is expected, do you require *every* single user to manually disable the bootstrap script after 60 days to make freeradius work?

Comment 10 Geoffrey D. Bennett 2021-01-06 04:55:10 UTC
> What you call a bug, I really think is just misconfiguration.

I hit this bug too. That the default configuration stops working after 60 days seems like a misconfiguration out-of-the-box to me.

The bootstrap script itself even says that it should be deleted once the certificates have been created:

#  This is a wrapper script to create default certificates when the
#  server first starts in debugging mode.  Once the certificates have been
#  created, this file should be deleted.

If you still think it's a good idea to run bootstrap every time the server starts, what about changing:

ExecStartPre=/bin/sh /etc/raddb/certs/bootstrap

to:

ExecStartPre=-/bin/sh /etc/raddb/certs/bootstrap

so that failure of the script doesn't prevent the server from starting?

Comment 11 Alex Scheel 2021-01-06 14:38:20 UTC
Respectfully, I believe both of you have missed the point. 


> That the default configuration stops working after 60 days seems like a misconfiguration out-of-the-box to me. 

The default configuration includes TLS authentication methods as examples for the administrator. These examples are meant to be viewed, modified, and the certificates replaced or the methods disabled and the certificates removed.

This is no different than installing apache-httpd with mod_ssl+sscg, having it automatically generate certificates the first time it is run, and needing to configure and maintain it thereafter, in order to keep the certificates working and non-expired.
 
This is no different than installing ipa-server and running ipa-server-install, and having to do system administration tasks to ensure it remains operational long-term, especially after ~20 years when the root CA certificate expires and needs to be manually re-issued and provisioned on all machines in the organization.
 
This is no different than installing postgres, enabling TLS, and needing to do maintenance (cert rotation) to ensure continued availability of the system.
 
Many servers fail to start if the certificates they are pointed at are expired or are otherwise invalid. They expect you to disable TLS configuration if you do not need TLS for some reason. And if you need it, they expect you to slot in your own infrastructure and maintain it.

 
The push towards ACME has resulted in a lot of x509 deployment scenarios being fully automated. 90 days is becoming industry-standard lifetime for server-side TLS certificates. Major browser vendors have capped the limit at 365 days now. Dropping a systemd unit file should be an easy addition to the existing setup, alongside configuring a real X509 deployment.


I will also point out this bug existed for _years_ before this and nobody cared. Certificates were generated once at package installation time. The bootstrap script was not continually run. This resulted in certificates expiring and (if radiusd even started) clients would be unable to connect if they actually used these certificates in the default EAP-TLS/EAP-TTLS/PEAP configurations. The new behavior makes the failure noisy and forces administrators to correctly configure the service they support. This recent change was done to comply with RHEL and Fedora packaging policy, moving certificate generation from package build/install time to service start.



The change, if one should be made, isn't "fix certificate generation with one-time bootstrap scripts" as proposed above -- it is to disable bootstrapping entirely and remove TLS as offered methods in the default configuration. That change will likely not be made in RHEL 8.

Comment 12 Geoffrey D. Bennett 2021-01-21 17:11:52 UTC
I get your point that if you need TLS then you need to maintain it, but I disagree that if you don't need TLS that you should be expected to disable it. The freeradius RPMs have not worked this way in the past; the recent update was a breaking change and caused working systems (in the default configuration with TLS enabled but not being used) to completely fail at next restart when there was no need for them to fail at all.

Your examples with Apache and PostgreSQL are off the mark (I am not familiar with IPA). Neither of them will fail to start if they are configured with expired certificates. Similarly, FreeRADIUS itself starts fine with an expired certificate.

> I will also point out this bug existed for _years_ before this and nobody cared.

By "this bug" you are meaning this bug #1912059? This bug is new with the most-recent freeradius update; it is certainly not years old. People care now because their systems stopped working with the latest update.

> This recent change was done to comply with RHEL and Fedora packaging policy, moving certificate generation from package build/install time to service start.

The packaging guidelines at https://docs.fedoraproject.org/en-US/packaging-guidelines/Initial_Service_Setup/#_common_guidelines say that the bootstrap script should be run from an "-init.service" service unit and should be conditional on the configuration file(s) not already existing. There is no suggestion in the packaging guidelines that the bootstrap process should be run on every service start, and doing so as we see here causes problems.

Please make it so the bootstrap script runs only once as per the packaging guidelines. That will fix our issue.
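
A run-once setup along the lines of the packaging guideline could be sketched as a separate oneshot unit that is skipped once the certificate exists. Everything here is an assumption for illustration: the unit name radiusd-init.service, the choice of server.pem as the marker file, and the write_init_unit helper are not taken from the freeradius package.

```shell
#!/bin/sh
# Hypothetical helper that writes a one-time init unit for certificate
# generation. The unit name and marker file are illustrative assumptions.
# $1: directory to write the unit into (defaults to /etc/systemd/system).
write_init_unit() {
    dir=${1:-/etc/systemd/system}
    mkdir -p "$dir" || return 1
    cat > "$dir/radiusd-init.service" <<'EOF'
[Unit]
Description=One-time FreeRADIUS test certificate generation (sketch)
# Skip this unit entirely once the test certificate exists.
ConditionPathExists=!/etc/raddb/certs/server.pem
Before=radiusd.service

[Service]
Type=oneshot
ExecStart=/bin/sh /etc/raddb/certs/bootstrap
EOF
}

# Example (as root): write_init_unit && systemctl daemon-reload
```

For this to take effect, radiusd.service would also need to pull the init unit in (for example with Wants= and After=) instead of calling bootstrap from ExecStartPre=.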

