Bug 2056126
| Summary: | [RFE] Extend time to warn of upcoming certificate expiration | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Virtualization Manager | Reporter: | Greg Scott <gscott> |
| Component: | ovirt-engine | Assignee: | Martin Perina <mperina> |
| Status: | CLOSED ERRATA | QA Contact: | Pavol Brilla <pbrilla> |
| Severity: | high | Docs Contact: | |
| Priority: | high | ||
| Version: | 4.4.10 | CC: | didi, emarcus, gdeolive, gveitmic, mavital, michal.skrivanek, mkalinin, mperina, mtessun, mwest, pelauter |
| Target Milestone: | ovirt-4.5.0 | Keywords: | FutureFeature, ZStream |
| Target Release: | 4.5.0 | ||
| Hardware: | All | ||
| OS: | All | ||
| Whiteboard: | |||
| Fixed In Version: | ovirt-engine-4.5.0 | Doc Type: | Release Note |
| Doc Text: |
With this release, the Red Hat Virtualization Manager 4.4 SP1 certificate expiration check will warn of upcoming certificate expiration earlier:
1. If a certificate is about to expire in the upcoming 120 days, a WARNING event is raised in the audit log.
2. If a certificate is about to expire in the upcoming 30 days, an ALERT event is raised in the audit log.
The check covers internal RHV certificates (for example, the certificate for RHVM <-> hypervisor communication), but it does not cover custom certificates configured for HTTPS access to RHVM according to [https://access.redhat.com/documentation/en-us/red_hat_virtualization/4.4/html-single/administration_guide/index#Replacing_the_Manager_CA_Certificate](https://access.redhat.com/documentation/en-us/red_hat_virtualization/4.4/html-single/administration_guide/index#Replacing_the_Manager_CA_Certificate)
|
Story Points: | --- |
| Clone Of: | Environment: | ||
| Last Closed: | 2022-05-26 16:23:53 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | Infra | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
|
Description
Greg Scott
2022-02-18 22:05:02 UTC
RHV Manager already contains quite sophisticated ways to check for certificate expiration.
RHVM performs certificate checks every day (the interval can be configured using the
engine-config option CertificationValidityCheckTimeInHours) and it checks
not only host certificates, but also the engine certificate and the engine
CA certificate. This check produces the following records in the ovirt-engine audit
log:
1. If the certificate has already expired, one of the following audit log
ALERTs is created, depending on the type of certificate:
- Host ${VdsName} certification has expired at ${ExpirationDate}. Please renew the host's certification.
- Engine's certification has expired at ${ExpirationDate}. Please renew the engine's certification.
- Engine's CA certification has expired at ${ExpirationDate}.
2. If the certificate is going to expire in less than 7 days, one of the following audit log ALERTs is created, depending on the type of certificate:
- Host ${VdsName} certification is about to expire at ${ExpirationDate}. Please renew the host's certification.
- Engine's certification is about to expire at ${ExpirationDate}. Please renew the engine's certification.
- Engine's CA certification is about to expire at ${ExpirationDate}.
3. If the certificate is going to expire in less than 30 days, one of the following audit log WARNINGs is created, depending on the type of certificate:
- Host ${VdsName} certification is about to expire at ${ExpirationDate}. Please renew the host's certification.
- Engine's certification is about to expire at ${ExpirationDate}. Please renew the engine's certification.
- Engine's CA certification is about to expire at ${ExpirationDate}.
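The three rules above can be sketched as a small classification function. This is only an illustration of the thresholds described in this report, not the ovirt-engine implementation; the function name `classify_certificate` and its parameters are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def classify_certificate(not_after, now, warning_days=30, alert_days=7):
    """Return the audit log severity for a certificate, or None.

    Hypothetical helper mirroring the rules listed above:
    expired                          -> "ALERT"
    expires in < alert_days days     -> "ALERT"
    expires in < warning_days days   -> "WARNING"
    otherwise                        -> None (no audit log record)
    """
    remaining = not_after - now
    if remaining <= timedelta(0):
        return "ALERT"  # certificate has already expired
    if remaining < timedelta(days=alert_days):
        return "ALERT"  # about to expire, inside the alert window
    if remaining < timedelta(days=warning_days):
        return "WARNING"  # about to expire, inside the warning window
    return None


now = datetime(2022, 2, 18, tzinfo=timezone.utc)
in_20_days = now + timedelta(days=20)
in_60_days = now + timedelta(days=60)

# Thresholds described in this comment: 30-day WARNING, 7-day ALERT.
print(classify_certificate(in_20_days, now))
print(classify_certificate(in_60_days, now))

# Thresholds from the Doc Text of this bug: 120-day WARNING, 30-day ALERT.
print(classify_certificate(in_60_days, now, warning_days=120, alert_days=30))
```

With the 30/7 defaults, a certificate expiring in 60 days produces no record at all; with the 120/30 thresholds from the Doc Text, the same certificate already raises a WARNING.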
So from my point of view there are enough warnings about upcoming certificate expiration, so now let's take a look at certificate renewal:
1. There's an Enroll Certificate action available from both the UI and the REST API, which renews host certificates that are going to expire
2. During each Host upgrade action (again available from both the UI and the REST API, and also part of the cluster_upgrade Ansible role) we check whether the certificate is going to expire soon or has already expired, and if so, the certificates are renewed during the host upgrade
3. We check engine certificate validity during each engine-setup execution, and if the certificate is going to expire soon or has already expired, we renew that certificate. The same check along with renewal also happens for the engine CA certificate
For certificate renewal the host needs to be in Maintenance, because loading the renewed certificate requires a restart of services (VDSM, libvirt, ...), so it cannot be performed when the host is Up.
Now regarding the UI: after logging in to webadmin you are redirected to the Dashboard, and on the right you can see a box called Events. There you have three links: Alerts, Errors and all Events. So it is very easy to click on Alerts, which will redirect you to Events where only Alerts are shown.
Regarding email notification, you can easily set the notifier to send, for example, Alert events via SMTP to a defined email address:
https://access.redhat.com/documentation/en-us/red_hat_virtualization/4.4/html/administration_guide/chap-event_notifications
So from my personal point of view I believe that we have enough options for administrators to be alerted that hypervisor certificates are going to expire soon.
RHV does a great job checking for certificates. But RHV needs to improve on notifying about them. First up is email notifications - many major customers prohibit rogue SMTP servers, and so email notification cannot happen with these customers. But even if it could, we don't document that 13 month certificate lifetime anywhere, and so why would anyone set up any notification for it?
And that leads to events and warnings. With every login, there are a number of events and warnings, and the plain truth is, many admins don't check them. Until one day, 13 months after the last upgrade, when the entire RHV environment dies for no apparent reason, and we consume a support team for hours or days while a whole customer company shuts down. Try telling the CEO of a major organization how an outage like this is their fault because their admin didn't recognize that a pesky security warning they've seen for the past few weeks, buried with lots of other pesky warnings, shut them down.
This should be easy to address. Ideally, the whole certificate renewal process should be automated. But if automated renewals are not feasible, then at least, instead of blaming the victim, why not warn them? When, say, half a certificate's lifetime is gone, take new admin portal logins to a screen with a warning that says the certificates will expire on [date], this will break your RHV environment, and here is what you must do to avoid all that bad stuff. Click here to continue onto the admin portal. One simple warning like that could save lots of grief.
let's renew certificates automatically that are bound to expire 4 months in advance.
let's alert a month before expiration
that should be doable by 4.5
nack on any other changes, too late for that
Engine/Host/CA certs are giving warnings less than 120 days before expiration in event log and alerts in period shorter than 30 days.
Verified on Software Version:4.5.0.5-0.7.el8ev
note that bug 2079890 in ovirt-engine-4.5.0.8 is changing the warning (and renew) interval to 365 days
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.
For information on the advisory (Moderate: RHV Manager (ovirt-engine) [ovirt-4.5.0] security update), and where to find the updated files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2022:4711