Bug 1204670

Summary: Firefox uses 100% CPU for 5 minutes with self-signed certificate, before connecting
Product: [Fedora] Fedora
Component: firefox
Version: 27
Status: CLOSED EOL
Severity: unspecified
Priority: unspecified
Hardware: Unspecified
OS: Unspecified
Keywords: Reopened
Reporter: Marius Vollmer <mvollmer>
Assignee: Martin Stransky <stransky>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: anton4linux, dperpeet, gecko-bugs-nobody, jscotka, mvollmer, notting, rrelyea, stefw, stransky
Doc Type: Bug Fix
Type: Bug
Last Closed: 2018-11-30 23:22:57 UTC

Attachments:
- Example certificate
- Screenshot of all relevant self-signed certificates in 'Certificate Manager'

Description Marius Vollmer 2015-03-23 10:38:33 UTC
Description of problem:

Opening Cockpit (https://<IP>:9090 on a Fedora 21 or 22 box) in Firefox causes Firefox to use 100% CPU for a long time (many minutes) before it shows the expected warning about self-signed certificates.

After granting the exception, it again uses 100% CPU for about the same time before proceeding.  By that time, Cockpit has timed out, but reloading works and one can then use Cockpit normally.

After restarting Firefox, it again uses 100% CPU for a long time on the first connection to Cockpit.

Version-Release number of selected component (if applicable):
firefox-36.0-1.fc21.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Install "Fedora Server 22" in a VM.
2. Open the Cockpit port on it:
   # firewall-cmd --add-service cockpit
3. Open Cockpit in Firefox:
   https://<IP>:9090

Actual results:

Firefox hangs as described above, and sometimes even crashes.

Expected results:

Firefox shows the usual dialogs about self-signed certs and loads Cockpit without delay.

Additional info:

Cockpit creates its own self signed certificate like so:

openssl req -x509 -days 36500 -newkey rsa:2048 -keyout $key_file  -keyform PEM -nodes -out $out_file -outform PEM -subj /CN=localhost.localdomain

Comment 1 Marius Vollmer 2015-03-23 10:39:04 UTC
Epiphany connects without delay, after warning about the certificate.

Comment 2 Stef Walter 2015-03-23 11:00:31 UTC
This is not Cockpit specific. If I take the same certificate and put it on my local apache httpd 2.4.10 instance with mod_ssl ... I get identical behavior.

Comment 3 Stef Walter 2015-03-23 11:04:00 UTC
Created attachment 1005307 [details]
Example certificate

This is an example certificate that causes the problem. Here is how it is configured with httpd and mod_ssl:

LoadModule ssl_module modules/mod_ssl.so

Listen 443 https
<VirtualHost _default_:443>
DocumentRoot /var/www/html
SSLEngine on
SSLCertificateFile /etc/cockpit/ws-certs.d/~self-signed.cert
SSLCertificateKeyFile /etc/cockpit/ws-certs.d/~self-signed.cert
</VirtualHost>

Comment 4 Stef Walter 2015-03-23 11:06:49 UTC
After Firefox connects, clicking 'View Certificate' makes firefox hang solid ... it seems like it hangs forever. It doesn't respond to clicks, nor am I able to relaunch Firefox until I kill Firefox completely.

Comment 6 Stef Walter 2015-03-23 11:18:47 UTC
firefox-36.0-1.fc22.x86_64

Also seen on F21

Comment 7 Stef Walter 2015-03-23 11:33:18 UTC
Firefox falls over because of the "CN=localhost.localdomain" in the certificate subject. This generates a certificate that causes Firefox problems:

$ openssl req -x509 -days 36500 -newkey rsa:2048 -keyout $key_file  -keyform PEM -nodes -out $out_file -outform PEM -subj /CN=localhost.localdomain

Whereas this works fine:

$ openssl req -x509 -days 36500 -newkey rsa:2048 -keyout $key_file  -keyform PEM -nodes -out $out_file -outform PEM -subj /CN=localhost

Cockpit workaround for our self-signed certificate generation: 

https://github.com/cockpit-project/cockpit/pull/1984

Comment 8 Stef Walter 2015-03-23 13:27:47 UTC
Work around here for Cockpit: https://admin.fedoraproject.org/updates/cockpit-0.45-1.fc22

Comment 14 Bill Nottingham 2015-03-24 17:36:23 UTC
It's not just 'CN=localhost.localdomain' - I can reproduce the hang on viewing the cert with a self-signed cert with a different CN here.

Comment 15 Bill Nottingham 2015-03-24 19:45:32 UTC
Stef - do you happen to have another self-signed localhost.localdomain cert installed?  Where I saw this hang was when I had multiple self-signed certs with the same info (but different keys, as they were generated at different times.)

Comment 16 Stef Walter 2015-03-24 20:29:03 UTC
Yes, this browser has seen many certificates with CN=localhost.localdomain ... all self-signed certificates generated by various cockpit instances.

Comment 17 Marius Vollmer 2015-03-25 07:57:23 UTC
(In reply to Bill Nottingham from comment #15)
> Where I saw this hang was when I had multiple self-signed certs
> with the same info (but different keys, as they were generated at different
> times.)

Hmm, does that mean that changing the subject to /CN=localhost is not a permanent fix?

It might be that changing the subject helps initially, but as soon as a second certificate with the same subject shows up, Firefox will hang again.  True?

How would one create self-signed certs for machines without a domain name?  No subject, or a random subject?

Comment 18 Stef Walter 2015-03-25 09:09:05 UTC
> Hmm, does that mean that changing the subject to /CN=localhost is not a permanent fix?

I guess so. Firefox is really the thing that needs to be fixed.

Comment 19 Kai Engert (:kaie) (inactive account) 2015-03-25 09:41:08 UTC
Firefox/NSS has a requirement that {issuer-name,serial-number} must be unique.

If you have added many overrides for certificates, and you have a certificate with a duplicate {issuer-name,serial-number}, then firefox will report sec_error_reused_issuer_and_serial. However, firefox shouldn't hang.

You didn't report that error. Also, the serial number on the example you've provided appears to be random.

I guess you aren't affected by the duplicate issuer/serial issue.

Comment 22 Kai Engert (:kaie) (inactive account) 2015-03-25 09:50:37 UTC
Side note:

Although unrelated to this bug, your server certificate is really out of spec. Please, at least add a basic constraint extension which says "this isn't a CA".

Comment 23 Stef Walter 2015-03-25 09:57:41 UTC
> Although unrelated to this bug, your server certificate is really out of spec. Please, at least add a basic constraint extension which says "this isn't a CA".

I guess OpenSSL shouldn't generate certificates that are out of spec? The certificate generation command is derived from:

$ rpm -qf /etc/ssl/certs/make-dummy-cert 
openssl-1.0.1k-2.fc22.x86_64

You probably want to patch 'openssl req' or its make-dummy-cert to fix that.

My reading of the relevant RFC notes that this is optional, and is treated as cA = FALSE when not present.

https://tools.ietf.org/html/rfc5280#section-4.2.1.9

Comment 24 Kai Engert (:kaie) (inactive account) 2015-03-25 10:04:46 UTC
If your browser profile really has seen many certificates with the same subject name, and if you had added many permanent overrides: If you connect to the server and the cert uses the same subject name, then I wonder: Maybe firefox searches your cert database for a trusted certificate with that name?

That would explain why:
- the bug cannot be seen with a fresh profile
- I cannot see the bug with my old profile
- you can fix it by using a different cert for the new name
  (because you don't have many other certs with the same name)

If this theory were right, you could try to:
- open firefox certificate manager
- open the server tab
- delete all entries for localhost.localdomain
- restart firefox and try again

Does that fix it?

If it fixed it, it would mean that firefox didn't detect that it's a self signed cert right away and that searching for a potential issuer cert won't help.

Comment 25 Kai Engert (:kaie) (inactive account) 2015-03-25 10:05:59 UTC
(In reply to Stef Walter from comment #23)
> 
> I guess OpenSSL shouldn't generate certificates that are out of spec? 

It's out of spec, because you use it as a server cert.


> My reading of the relevant RFC notes that this is optional, and is treated
> as cA = FALSE when not present.

The basic constraint extension is present in your cert, and it says ca=true
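
One way to check this for a given certificate file is to look for CA:TRUE in the text dump printed by openssl x509. A small sketch; the helper name is_ca_cert is made up for illustration, and it assumes the string CA:TRUE only shows up in the Basic Constraints section of the dump.

```shell
# Succeeds (exit 0) if the PEM certificate in $1 carries a Basic
# Constraints extension that says CA:TRUE.
is_ca_cert() {
    openssl x509 -in "$1" -noout -text | grep -q 'CA:TRUE'
}

# Example:
#   is_ca_cert '/etc/cockpit/ws-certs.d/~self-signed.cert' && echo "marked as a CA"
```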

Comment 26 Stef Walter 2015-03-25 10:21:31 UTC
Created attachment 1006266 [details]
Screenshot of all relevant self-signed certificates in 'Certificate Manager'

Comment 27 Stef Walter 2015-03-25 10:25:27 UTC
> The basic constraint extension is present in your cert, and it says ca=true

Aha. Good catch.

> Does that fix it?
> 
> If it fixed it, it would mean that firefox didn't detect that it's a self signed cert right away and that searching for a potential issuer cert won't help.

No I don't see it fixed. I removed all the entries in the screenshot above, and still see the same issue.

Comment 28 Kai Engert (:kaie) (inactive account) 2015-03-25 10:35:16 UTC
Ok, thanks for trying. And there weren't that many entries.

Right now I'm out of ideas. I wish I could reproduce.

I've tried to install cockpit on my f21 machine, but I can connect to localhost:9090 without problems.

Comment 29 Stef Walter 2015-03-25 10:38:31 UTC
> The basic constraint extension is present in your cert, and it says ca=true

Posted a pull request to fix this: https://github.com/cockpit-project/cockpit/pull/2018

> Right now I'm out of ideas. I wish I could reproduce.

Do you want my profile?

Comment 30 Kai Engert (:kaie) (inactive account) 2015-03-25 10:53:50 UTC
(In reply to Stef Walter from comment #29)
> > Right now I'm out of ideas. I wish I could reproduce.
> 
> Do you want my profile?

Maybe files cert8.db / cert9.db and cert_override.txt would be sufficient for the first step.

Comment 31 Marius Vollmer 2015-03-25 11:42:16 UTC
(In reply to Stef Walter from comment #18)
> > Hmm, does that mean that changing the subject to /CN=localhost is not a permanent fix?
> 
> I guess so. Firefox is really the thing that needs to be fixed.

Yes, sure, sorry.  What I meant is that /CN=localhost might not be an effective workaround because it fundamentally doesn't change anything.  Changing the subject just resets things enough to work for a while, until they break again when Firefox has gotten into the wedged state.

Comment 32 Marius Vollmer 2015-03-25 11:48:04 UTC
(In reply to Kai Engert (:kaie) from comment #28)

> I've tried to install cockpit on my f21 machine, but I can connect to
> localhost:9090 without problems.

Try deleting /etc/cockpit/ws-certs.d/~self-signed.cert followed by "systemctl restart cockpit".

If this bug is really caused by Firefox accumulating multiple certificates/exceptions, then you might be able to trigger it by regenerating the Cockpit cert.

Comment 33 Kai Engert (:kaie) (inactive account) 2015-03-25 11:57:31 UTC
Thanks, I can reproduce now.

I confirm the time to connect takes longer with each additional permanent override.

Comment 34 Kai Engert (:kaie) (inactive account) 2015-03-25 12:11:28 UTC
Firefox stores the cert of each permanent override. And it makes a mistake, which is probably responsible for the effect you're seeing.

It uses the same "nickname" for each new certificate, although the certificates are different.

I think Firefox should use a unique nickname each time it imports a new certificate.

The fact that there are different certs with identical nicknames might confuse NSS. Either NSS shouldn't get confused by that, or NSS should refuse to import them with the conflicting nickname.

I'll debug more and file an upstream bug.

For the time being, here is a workaround for you:

Please ensure you quit all Firefox windows before you do the following:

cd into the Firefox profile directory, $HOME/.mozilla/firefox/*.default (or the directory of the profile you're using).

ls -l cert*

Do you see cert8.db?
====================
Then run
  certutil -d dbm:. -L
You'll see many localhost.localdomain entries.

Then run
  certutil -d dbm:. -D -n localhost.localdomain

multiple times; it will delete one cert with each execution.

You can run the -L command again to check how many you have left.

When you have deleted all of them, try firefox again, and it should be quick again.


Do you see cert9.db?
====================
In the above command, instead of dbm:. use sql:.
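
The repeated deletion can be wrapped in a loop. A minimal sketch, assuming the nickname is exactly localhost.localdomain and that certutil -L -n exits non-zero once no certificate with that nickname is left; the function name delete_all_by_nickname is made up for illustration. Quit Firefox first, and consider backing up the profile directory.

```shell
# Delete every certificate stored under one nickname from an NSS database.
# $1: database spec, e.g. "dbm:." for cert8.db or "sql:." for cert9.db
# $2: nickname, e.g. "localhost.localdomain"
delete_all_by_nickname() {
    db=$1
    nick=$2
    deleted=0
    # -L -n succeeds only while a cert with this nickname still exists.
    while certutil -d "$db" -L -n "$nick" >/dev/null 2>&1; do
        # -D removes one matching cert per call; stop if it fails so the
        # loop cannot spin forever.
        certutil -d "$db" -D -n "$nick" || return 1
        deleted=$((deleted + 1))
    done
    echo "deleted $deleted certificate(s) named $nick"
}

# Example (run inside the profile directory, with Firefox closed):
#   delete_all_by_nickname sql:. localhost.localdomain
```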

Comment 35 Kai Engert (:kaie) (inactive account) 2015-03-25 12:33:26 UTC
I found that Firefox/PSM code deliberately reuses the same nickname, because the certificate has the same subject name. It might not be the cause of the issue.

Using the NSS utility tstclnt to connect to the server, while using the NSS database with multiple entries (that hangs firefox) works fine and quick.

Using the NSS utility certutil to validate the certificate is also quick.

I suspect that the issue is with the mozilla::pkix certificate validation code. The fact that you're using a CA certificate with a CAID parameter might contribute to the confusion of the mozilla::pkix code. But it shouldn't be confused. It sounds like it's looping. Anyway, it shouldn't behave like that.

It will be interesting to see if you can still reproduce this issue after you've fixed the server cert to not be a CA any more.

7 permanent overrides, meaning 7 stored certs, was sufficient to trigger a long delay already.

Comment 36 Marius Vollmer 2015-03-25 13:26:24 UTC
(In reply to Kai Engert (:kaie) from comment #33)
> Thanks, I can reproduce now.
> 
> I confirm the time to connect takes longer with each additional permanent
> override.

Awesome, thanks a lot for figuring this out!

This means that this issue is far less severe than we thought it would be.

Comment 37 Kai Engert (:kaie) (inactive account) 2015-03-25 14:04:55 UTC
I indeed see a lot of looping in mozilla::pkix

I see recursive attempts to find issuer certs, and I'll prepare a test case for upstream to debug and get this fixed.

Once you have a build of cockpit that uses non-ca server cert, we could test if that solves the issue already. Could you try to add 7 overrides, each time with a different cert, and check if it's still slow?

Comment 38 Bob Relyea 2015-03-25 15:57:53 UTC
Kai, is mozilla::pkix trying to find issuers of a self-signed certificate? We should report that back to mozilla. I think that they should short circuit self-signed certs. Can we reproduce the problem with the NSS PKIX validator?

bob

Comment 39 Bob Relyea 2015-03-25 16:13:28 UTC
Oh, I bet it is. I bet it's trying to build all possible chains. Each new cert doubles the number of possible chains, so by the time you get to 7 you need to check 2^6 chains.
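
Taking that estimate at face value (each stored cert doubles the number of candidate chains), the growth can be tabulated with plain shell arithmetic. This only illustrates the 2^(n-1) figure; it is not a measurement of what mozilla::pkix actually does, and the function name chain_estimate is made up for illustration.

```shell
# Estimated number of candidate chains for n stored certs that share a
# subject, using the 2^(n-1) figure from the estimate above.
chain_estimate() {
    echo $((1 << ($1 - 1)))
}

for n in 1 2 3 4 5 6 7; do
    echo "$n certs: $(chain_estimate "$n") chains"
done
```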

The problem is threefold: 1) Cockpit is using self-signed certs, and the pathological case is with self-signed certs. 2) Cockpit is marking its self-signed certs as CAs (therefore you can make chains out of them). 3) Mozilla should short-circuit self-signed certs in chain building (don't build chains past a self-signed cert).

Cockpit should be able to make this issue go away with any of the following. Issue new certificates from a common root; this has better security characteristics. Self-signed certs are a fine hack to get things going, but if you are trying to actually deploy a product, you need to be able to get roots from a common infrastructure. I think the IPA guys can help. They have spent a fair amount of time figuring out how to distribute and manage new root certificates for their infrastructure.

In the meantime, don't issue roots with CA=True in them. If the roots had CA=False, mozilla::pkix wouldn't try to build chains with them.

You can also fix this by varying the DN for your cert. You don't need to make a unique CN; changing any component of the DN would work. For production, I would actually recommend not putting the hostname in the DN, but putting it in the Subject Alternative Name extension. The latter is the preferred way anyway, but that's not what is causing the issue.
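
A sketch of what such a certificate could look like when generated with openssl req and a small config file: basicConstraints is set to CA:FALSE and the hostname goes into subjectAltName. The helper names, the one-year lifetime, and the choice to still keep a CN are assumptions made for illustration; this is not the command Cockpit ended up using.

```shell
# Write a minimal openssl config for a non-CA server cert.
# $1: path of the config file to create, $2: hostname for CN and SAN
write_server_cert_config() {
    cat > "$1" <<EOF
[req]
distinguished_name = dn
x509_extensions = v3_server
prompt = no

[dn]
CN = $2

[v3_server]
basicConstraints = critical, CA:FALSE
subjectAltName = DNS:$2
EOF
}

# Generate a key and a self-signed cert using that config.
# $1: key file, $2: cert file, $3: hostname
make_server_cert() {
    cfg=$(mktemp) || return 1
    write_server_cert_config "$cfg" "$3"
    openssl req -x509 -days 365 -newkey rsa:2048 -nodes \
        -keyout "$1" -out "$2" -config "$cfg"
    status=$?
    rm -f "$cfg"
    return $status
}

# Example:
#   make_server_cert server.key server.cert localhost
```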

bob

Comment 40 Kai Engert (:kaie) (inactive account) 2015-03-25 19:02:07 UTC
I've reported the issue upstream and also provided 8 test services to make it easy for people to reproduce the issue.
https://bugzilla.mozilla.org/show_bug.cgi?id=1147544

Comment 41 Stef Walter 2015-03-25 20:26:02 UTC
> Cockpit should be able to make this issue go away with any of
> the following. Issue new certificates from a common root; this has
> better security characteristics. Self-signed certs are a fine hack
> to get things going, but if you are trying to actually deploy a
> product, you need to be able to get roots from a common
> infrastructure. I think the IPA guys can help. They have spent a
> fair amount of time figuring out how to distribute and manage new
> root certificates for their infrastructure.

Yes, we have that on our roadmap: Make it easy to use domain (as in FreeIPA) issued certificates with the various hosts.

However, because Cockpit heavily targets the "out of the box" use case, self-signed certificates will be a staple of Cockpit usage for the foreseeable future.

This is analogous to how most people use SSH keys verified by fingerprint, rather than SSH certificates signed by a common authority.

> In the meantime, don't issue roots with CA=True in them. If the
> roots had CA=False, mozilla::pkix wouldn't try to build chains with them.

Done. Thanks for confirming this.

(In reply to Kai Engert (:kaie) from comment #40)
> I've reported the issue upstream and also provided 8 test services to make
> it easy for people to reproduce the issue.
> https://bugzilla.mozilla.org/show_bug.cgi?id=1147544

Thanks Kai!

Comment 42 Bob Relyea 2015-03-25 21:41:13 UTC
> However because Cockpit heavily targets the "out of the box" use case,
> self-signed certificates will be a staple of Cockpit usage for the foreseeable
> future.

So any instructions that include 'override this security warning' aren't 'out of the box ready'. I'm sure we can come up with something that will make this unnecessary, and still have reasonable 'out of the box' access.

> This is analogous to how most people use SSH keys verified by fingerprint,
> rather than SSH certificates signed by a common authority.

As soon as you make the browser your engine, you need to respect the infrastructure. The browser security overrides are likely to get more and more difficult, we should not depend on them for any shipping product. It's considered unfriendly and poor security within the browser community. Doing so makes us look like we don't know what we are doing.

bob

Comment 43 Stef Walter 2015-03-26 04:08:59 UTC
(In reply to Bob Relyea from comment #42)
> > This is analogous to how most people use SSH keys verified by fingerprint,
> > rather than SSH certificates signed by a common authority.
> 
> As soon as you make the browser your engine, you need to respect the
> infrastructure. The browser security overrides are likely to get more and
> more difficult, we should not depend on them for any shipping product. It's
> considered unfriendly and poor security within the browser community. Doing
> so makes us look like we don't know what we are doing.

Cockpit is not a product. The product is "Fedora Server", the product is RHEL, the product is  ... Cockpit is the admin interface. 

So yes, if we're at the point where Fedora or RHEL can trivially acquire a signed web server certificate out of the box upon installation ... then that would be great. However, it needs to meet Cockpit's ideals and criteria if it's going to replace self-signed certificates. In particular the following:

 * Cockpit requires no configuration or infrastructure

http://stef.thewalter.net/ideals-of-cockpit.html

Comment 44 Martin Stransky 2015-06-03 11:49:14 UTC
Kai, is there anything needed on the Firefox/PSM side, or is this just an NSS issue? Do we need a local (Fedora/RHEL) fix for it, or can it be moved upstream?

Comment 45 Fedora End Of Life 2015-11-04 10:51:04 UTC
This message is a reminder that Fedora 21 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 21. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora  'version'
of '21'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 21 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora 
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 46 Fedora End Of Life 2015-12-02 10:25:02 UTC
Fedora 21 changed to end-of-life (EOL) status on 2015-12-01. Fedora 21 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

Comment 47 Martin Stransky 2016-02-29 14:22:30 UTC
Bill, why is this one reopened and what fix do you expect here?

Comment 48 Bill Nottingham 2016-02-29 15:21:38 UTC
... it's reopened because it's still an open issue. See upstream bug at https://bugzilla.mozilla.org/show_bug.cgi?id=1056341

Comment 49 Fedora End Of Life 2016-11-24 11:36:00 UTC
This message is a reminder that Fedora 23 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 23. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora  'version'
of '23'.

Comment 50 Fedora End Of Life 2017-11-16 18:38:20 UTC
This message is a reminder that Fedora 25 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 25. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora  'version'
of '25'.

Comment 51 Ben Cotton 2018-11-27 13:51:58 UTC
This message is a reminder that Fedora 27 is nearing its end of life.
On 2018-Nov-30  Fedora will stop maintaining and issuing updates for
Fedora 27. It is Fedora's policy to close all bug reports from releases
that are no longer maintained. At that time this bug will be closed as
EOL if it remains open with a Fedora  'version' of '27'.

Comment 52 Ben Cotton 2018-11-30 23:22:57 UTC
Fedora 27 changed to end-of-life (EOL) status on 2018-11-30. Fedora 27 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.
