Bug 1512825 - add mux pod failed for Serial number 02 has already been issued
Summary: add mux pod failed for Serial number 02 has already been issued
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: apiserver-auth
Version: 3.7.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 3.9.0
Assignee: Mo
QA Contact: Junqi Zhao
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2017-11-14 08:40 UTC by Anping Li
Modified: 2021-09-09 12:49 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
When creating a series of certificates with the command line "oc adm ... ca create-server-cert --signer-serial=/path/to/serial_file", newlines in the serial_file were not handled correctly, which sometimes caused the serial numbers in the serial_file to stop increasing monotonically. The problem was fixed so that SerialFileGenerator correctly reads and writes serial files that end in a newline.
Clone Of:
Environment:
Last Closed: 2018-03-28 14:12:08 UTC
Target Upstream Version:
Embargoed:


Attachments
/etc/origin/logging (27.14 KB, application/x-gzip)
2017-11-14 08:52 UTC, Anping Li
Deploy logging (355.55 KB, application/x-gzip)
2017-11-14 08:54 UTC, Anping Li


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2018:0489 0 None None None 2018-03-28 14:12:37 UTC

Description Anping Li 2017-11-14 08:40:00 UTC
Description of problem:

Adding the mux pod failed with "Serial number 02 has already been issued". Checking the certificate files in /etc/origin/logging on the first master, we found that the number in ca.serial.txt is lower than the highest serial number in ca.db. It seems the playbook wrongly overwrites ca.serial.txt.

[root@openshift-181 logging]# cat ca.serial.txt
02
[root@openshift-181 logging]# cat ca.db
V    191114073852Z        02    unknown    /O=Logging/OU=OpenShift/CN=system.logging.fluentd
V    191114073853Z        03    unknown    /O=Logging/OU=OpenShift/CN=system.logging.kibana
V    191114073855Z        04    unknown    /O=Logging/OU=OpenShift/CN=system.logging.curator
V    191114073856Z        05    unknown    /O=Logging/OU=OpenShift/CN=system.admin
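
The mismatch above can be checked mechanically. The following is a minimal sketch, not part of openshift-ansible or openssl: the helper names `issued_serials` and `serial_conflicts` are hypothetical, and it assumes that ca.db follows the layout shown in this report (status, expiry, an optional revocation date, hex serial, file name, subject) and that ca.serial.txt holds the next serial to be issued as a hex string.

```python
def issued_serials(db_text):
    """Return the set of serial numbers recorded in an openssl-style ca.db."""
    serials = set()
    for line in db_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        # Revoked entries ("R") carry an extra revocation-date field
        # before the serial; the valid ("V") entries shown here do not.
        index = 3 if fields[0] == "R" else 2
        serials.add(int(fields[index], 16))
    return serials


def serial_conflicts(serial_text, db_text):
    """True if the next serial in ca.serial.txt was already issued."""
    next_serial = int(serial_text.strip(), 16)
    return next_serial in issued_serials(db_text)


# Sample data copied from the first two ca.db entries in this report.
ca_db = """\
V    191114073852Z        02    unknown    /O=Logging/OU=OpenShift/CN=system.logging.fluentd
V    191114073853Z        03    unknown    /O=Logging/OU=OpenShift/CN=system.logging.kibana
"""
print(serial_conflicts("02\n", ca_db))  # 02 is already in ca.db
print(serial_conflicts("06\n", ca_db))  # 06 has not been issued yet
```

A check like this, run before signing, would flag the reset serial file before openssl rejects the request.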



By the way, this issue does not occur with a fresh installation.


Version-Release number of selected component (if applicable):
openshift-ansible-3.7.7-1.git.0.3e1b62b.el7.noarch

How reproducible:
always


Steps to Reproduce:
1) deploy logging without mux pod
openshift_logging_install_logging=true
openshift_logging_image_version=v3.7
openshift_logging_namespace=logging
openshift_logging_es_cluster_size=1
openshift_logging_es_nodeselector={'logging-node': '1'}
2) add mux variables in inventory file
openshift_logging_use_mux=true
openshift_logging_mux_client_mode=maximal
3) deploy mux pods using openshift-logging.yml

Actual results:
task path: /usr/share/ansible/openshift-ansible/roles/openshift_logging/tasks/generate_pems.yaml:31
Using module file /usr/lib/python2.7/site-packages/ansible/modules/commands/command.py
    "changed": true, 
    "cmd": [
        "openssl", 
        "ca", 
        "-in", 
        "/etc/origin/logging/system.logging.mux.csr", 
        "-notext", 
        "-out", 
        "/etc/origin/logging/system.logging.mux.crt", 
        "-config", 
        "/etc/origin/logging/signing.conf", 
        "-extensions", 
        "v3_req", 
        "-batch", 
        "-extensions", 
        "server_ext"
    ], 
    "delta": "0:00:00.008351", 
    "end": "2017-11-14 03:13:01.203513", 
    "failed": true, 
    "invocation": {
        "module_args": {
            "_raw_params": "openssl ca -in /etc/origin/logging/system.logging.mux.csr -notext -out /etc/origin/logging/system.logging.mux.crt -config /etc/origin/logging/signing.conf -extensions v3_req -batch -extensions server_ext", 
            "_uses_shell": false, 
            "chdir": null, 
            "creates": null, 
            "executable": null, 
            "removes": null, 
            "warn": true
        }
    }, 
    "rc": 1, 
    "start": "2017-11-14 03:13:01.195162", 
    "stderr": "Using configuration from /etc/origin/logging/signing.conf\nCheck that the request matches the signature\nSignature ok\nERROR:Serial number 02 has already been issued,\n      check the database/serial_file for corruption\nThe matching entry has the following details\nType          :Valid\nExpires on    :191114073852Z\nSerial Number :02\nFile name     :unknown\nSubject Name  :/O=Logging/OU=OpenShift/CN=system.logging.fluentd", 
    "stderr_lines": [
        "Using configuration from /etc/origin/logging/signing.conf", 
        "Check that the request matches the signature", 
        "Signature ok", 
        "ERROR:Serial number 02 has already been issued,", 
        "      check the database/serial_file for corruption", 
        "The matching entry has the following details", 
        "Type          :Valid", 
        "Expires on    :191114073852Z", 
        "Serial Number :02", 
        "File name     :unknown", 
        "Subject Name  :/O=Logging/OU=OpenShift/CN=system.logging.fluentd"
    ], 
    "stdout": "", 
    "stdout_lines": []

Expected results:
The mux pod can be added without error.


Additional Info:
The workaround is to remove /etc/origin/logging and redeploy all certs.

Comment 1 Anping Li 2017-11-14 08:52:44 UTC
Created attachment 1351835
/etc/origin/logging

Comment 2 Anping Li 2017-11-14 08:54:10 UTC
Created attachment 1351836
Deploy logging

Comment 3 Noriko Hosoi 2017-12-08 19:29:36 UTC
I could reproduce the problem without ansible.

1) deployed common logging without mux.

==> /etc/origin/logging/ca.serial.txt <==
09

2) sudo oc adm --config=/etc/origin/master/admin.kubeconfig ca create-server-cert --key=/etc/origin/logging/mux.key --cert=/etc/origin/logging/mux.crt --hostnames='logging-mux, mux.ec2-52-70-1-212.compute-1.amazonaws.com' --signer-cert=/etc/origin/logging/ca.crt --signer-key=/etc/origin/logging/ca.key --signer-serial=/etc/origin/logging/ca.serial.txt

==> /etc/origin/logging/ca.serial.txt <==
02

Note that the mux cert creation did not fail for me because the existing serial IDs were all equal to or greater than 04, so there was no conflict with 02. (That also puzzles me, though: why did it start with 04?)
  V	191208180736Z		04	unknown	/O=Logging/OU=OpenShift /CN=system.logging.fluentd
  V	191208180737Z		05	unknown	/O=Logging/OU=OpenShift/CN=system.logging.kibana
  V	191208180737Z		06	unknown	/O=Logging/OU=OpenShift/CN=system.logging.curator
  V	191208180738Z		07	unknown	/O=Logging/OU=OpenShift/CN=system.admin
  V	191208180739Z		08	unknown	/O=Logging/OU=OpenShift/CN=system.logging.es

Could someone who is familiar with the origin crypto code take a look?
"./origin/pkg/cmd/server/crypto/crypto.go"

Comment 4 Noriko Hosoi 2017-12-12 23:04:33 UTC
Hello Command Line Interface expert, 

Could you please help us with running "oc adm ca create-server-cert" with the --signer-serial file?  The number in the serial file is unexpectedly reset/lowered.  Thanks.

Comment 5 Mo 2017-12-14 04:54:53 UTC
I do not see anything incorrect in the commands run.

We ignore an error here https://github.com/openshift/origin/blob/master/pkg/cmd/server/crypto/crypto.go#L282 that could be the reason the file has 02 written to it (but the file seems to have valid starting data, so I do not know why it would matter).

David, you wrote this code; do you see anything off?

Comment 6 Mo 2017-12-14 05:12:07 UTC
Actually it seems like we reset that file:

https://github.com/openshift/origin/blob/master/pkg/cmd/server/crypto/crypto.go#L388

You can try running the command with --overwrite=false which will prevent that file from being reset as long as all files needed for the command to run are present and valid.

Comment 7 Noriko Hosoi 2017-12-14 19:32:12 UTC
Unfortunately, adding "--overwrite=false" did not help.

Here is the ca.serial.txt and ca.db before running "oc adm ca create-server-cert":
$ cat ca.serial.txt
09
$ ls ca.db
[origin@ip-172-18-8-158 logging]$ cat ca.db
V    191214183824Z        04    unknown    /O=Logging/OU=OpenShift/CN=system.logging.fluentd
V    191214183825Z        05    unknown    /O=Logging/OU=OpenShift/CN=system.logging.kibana
V    191214183825Z        06    unknown    /O=Logging/OU=OpenShift/CN=system.logging.curator
V    191214183826Z        07    unknown    /O=Logging/OU=OpenShift/CN=system.admin
V    191214183827Z        08    unknown    /O=Logging/OU=OpenShift/CN=system.logging.es

Then, I ran this command line:
$ sudo oc adm --config=/etc/origin/master/admin.kubeconfig ca create-server-cert --key=/etc/origin/logging/mux.key --cert=/etc/origin/logging/mux.crt --hostnames='logging-mux, mux.ec2-52-70-1-212.compute-1.amazonaws.com' --signer-cert=/etc/origin/logging/ca.crt --signer-key=/etc/origin/logging/ca.key --signer-serial=/etc/origin/logging/ca.serial.txt --overwrite=false

Then ca.serial.txt was reset as originally reported.
$ cat ca.serial.txt
02

Another question: the line you mentioned in comment #6 is in MakeCA.
https://github.com/openshift/origin/blob/master/pkg/cmd/server/crypto/crypto.go#L388

The command line we are having this issue with creates a server cert using an existing CA key/cert.

Comment 8 Simo Sorce 2018-01-03 17:46:17 UTC
We'll have to reproduce and analyze locally.
David confirms he sees nothing wrong.

Comment 9 openshift-github-bot 2018-01-22 15:59:57 UTC
Commits pushed to master at https://github.com/openshift/openshift-ansible

https://github.com/openshift/openshift-ansible/commit/ac23e6e362d8758032c1dd573d0ff6a958445df5
Bug 1512825 - add mux pod failed for Serial number 02 has already been issued

According to mkhan, to run the "oc adm ca create-server-cert" command
line with --signer-serial option, the following changes need to be made.
1. adding --overwrite=false
2. <ca.serial.txt> should contain only [0-9A-F]*.
   (no trailing newlines are allowed for now)

This patch solves 1.
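
Point 2 can be illustrated with a short sketch. This is an illustration under stated assumptions, not the origin code: `parse_serial_strict` is a hypothetical helper that accepts only the upper-case hex content described above, so a serial file that ends in a newline is rejected. Python's built-in `int()` tolerates surrounding whitespace, so the strictness is emulated here with a regular expression.

```python
import re

# Only upper-case hex digits, nothing else; an empty file is also
# treated as invalid here (an assumption of this sketch).
STRICT_SERIAL = re.compile(r"[0-9A-F]+")


def parse_serial_strict(text):
    """Parse serial-file content, rejecting anything but bare hex digits."""
    if STRICT_SERIAL.fullmatch(text) is None:
        raise ValueError("unexpected serial file content: %r" % text)
    return int(text, 16)


print(parse_serial_strict("09"))  # parses as 9
try:
    parse_serial_strict("09\n")  # same value, but with a trailing newline
except ValueError as err:
    print("rejected:", err)
```

If a caller ignored this error and fell back to a default starting serial, the counter would restart from a low value, which would be consistent with the reset seen in this report.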

https://github.com/openshift/openshift-ansible/commit/4d93123e9626657e55ce03cb8a0288a6ba5e3f2e
Merge pull request #6798 from nhosoi/bz1512825

Automatic merge from submit-queue.

Bug 1512825 - add mux pod failed for Serial number 02 has already been issued

According to mkhan, to run the "oc adm ca create-server-cert" command
line with --signer-serial option, the following changes need to be made.
1. adding --overwrite=false
2. <ca.serial.txt> should contain only [0-9A-F]*.
   (no trailing newlines are allowed for now)

This patch solves 1.

Comment 11 Noriko Hosoi 2018-02-02 18:22:54 UTC
(In reply to Mo from comment #10)
> https://github.com/openshift/origin/pull/18405

Thank you for the PR, @Mo.

From our point of view, we are setting --overwrite=false ourselves, and there is almost no chance of running into a race condition when generating certs. As long as the latter half of your PR 18405 is merged, it will solve our problem.  Thanks!

Comment 16 Mo 2018-02-14 19:17:45 UTC
(In reply to openshift-github-bot from comment #9)
> Commits pushed to master at https://github.com/openshift/openshift-ansible
> 
> https://github.com/openshift/openshift-ansible/commit/
> ac23e6e362d8758032c1dd573d0ff6a958445df5
> Bug 1512825 - add mux pod failed for Serial number 02 has already been issued
> 
> According to mkhan, to run the "oc adm ca create-server-cert"
> command
> line with --signer-serial option, the following changes need to be made.
> 1. adding --overwrite=false
> 2. <ca.serial.txt> should contain only [0-9A-F]*.
>    (no trailing newlines are allowed for now)
> 
> This patch solves 1.



So my initial understanding of the bug was incorrect.  Specifically, no change is required to the overwrite flag as it deals with the certs being generated and does not change how the serial file is interacted with.

Based on the discussion in origin, we decided to not change the default for overwrite as that would be a backwards incompatible change in some scenarios.  With this in mind, I have opened https://github.com/openshift/openshift-ansible/pull/7155 to revert the change opened to address point 1 above.

The only bug in the command was how it read/wrote to the serial file (point 2 above).  https://github.com/openshift/origin/pull/18405 handles that case across all commands that interact with the serial file.

Comment 17 openshift-github-bot 2018-02-14 23:49:29 UTC
Commits pushed to master at https://github.com/openshift/openshift-ansible

https://github.com/openshift/openshift-ansible/commit/3db1b63ab35cac6821c8795f7a84ef51407fbb6f
Revert "Bug 1512825 - add mux pod failed for Serial number 02 has already been issued"

This reverts commit ac23e6e362d8758032c1dd573d0ff6a958445df5.

That commit introduced a backwards incompatible change to how the
commands run.  This undoes that.  The original change was not
required to prevent overwriting of the serial file.

Bug 1512825

https://github.com/openshift/openshift-ansible/commit/5160996a7bca9053720cd6230537f7acacaa2134
Merge pull request #7155 from enj/enj/i/revert_overwrite_certs/1512825

Automatic merge from submit-queue.

Revert "Bug 1512825 - add mux pod failed for Serial number 02 has already been issued"

@sdodson @nhosoi This reverts #6798 as we are not changing the default in origin https://github.com/openshift/origin/pull/18405.

This reverts commit ac23e6e362d8758032c1dd573d0ff6a958445df5.

That commit introduced a backwards incompatible change to how the commands run.  This undoes that.  The original change was not required to prevent overwriting of the serial file.

Bug 1512825

Comment 18 openshift-github-bot 2018-02-21 01:29:20 UTC
Commits pushed to master at https://github.com/openshift/origin

https://github.com/openshift/origin/commit/266aa46bd0a5728d3a8831a6321cd1c3fd8b2360
Correctly handle newlines in SerialFileGenerator

This change makes it so that SerialFileGenerator correctly reads and
writes serial files that end in a newline.  This allows it to
interoperate with other tools that may interact with the serial file
such as openssl.  Tests were added to assert that the incrementing
logic works and that the serial file always ends with a newline.

Bug 1512825

Signed-off-by: Monis Khan <mkhan>
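
The behaviour described in this commit message can be sketched as follows. This is not the Go SerialFileGenerator from origin: `next_serial` is a hypothetical Python helper, the read-increment-write semantics are an assumption of the sketch, and the two-digit minimum width simply mirrors the 02 and 09 values seen in this report.

```python
import os
import tempfile


def next_serial(path):
    """Read a hex serial, increment it, write it back with a newline."""
    with open(path) as handle:
        # strip() tolerates the trailing newline that other tools write
        current = int(handle.read().strip(), 16)
    following = current + 1
    with open(path, "w") as handle:
        # upper-case hex, at least two digits, always newline-terminated
        handle.write("%02X\n" % following)
    return following


with tempfile.TemporaryDirectory() as workdir:
    serial_file = os.path.join(workdir, "ca.serial.txt")
    with open(serial_file, "w") as handle:
        handle.write("09\n")
    print(next_serial(serial_file))  # 9 + 1
    with open(serial_file) as handle:
        print(repr(handle.read()))  # file content still ends in a newline
```

A real implementation would also need to guard against concurrent writers, which this sketch ignores.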

https://github.com/openshift/origin/commit/7642cc86bc4243c475670c7d4bb07e6093410bcf
Merge pull request #18405 from enj/enj/i/ca_serial_number/1512825

Automatic merge from submit-queue (batch tested with PRs 18404, 18405).

Correctly handle newlines in serial files

Correctly handle newlines in SerialFileGenerator

This change makes it so that `SerialFileGenerator` correctly reads and writes serial files that end in a newline.  This allows it to interoperate with other tools that may interact with the serial file such as openssl.  Tests were added to assert that the incrementing logic works and that the serial file always ends with a newline.

[Bug 1512825](https://bugzilla.redhat.com/show_bug.cgi?id=1512825)

Signed-off-by: Monis Khan <mkhan>

---

/kind bug
/assign @simo5 @deads2k
@openshift/sig-security

Comment 21 Junqi Zhao 2018-03-08 04:11:56 UTC
Deployed logging with mux; the installation steps no longer throw the error "ERROR:Serial number 02 has already been issued".

Scenarios:
1. Refresh install logging with mux
2. Re-install logging with mux again
3. Install logging without mux, delete logging, and then install logging with mux

PLAY RECAP *********************************************************************************
host-8-248-214.host.centralci.eng.rdu2.redhat.com : ok=326  changed=75   unreachable=0    failed=0   
host-8-248-79.host.centralci.eng.rdu2.redhat.com : ok=0    changed=0    unreachable=0    failed=0   
localhost                  : ok=11   changed=0    unreachable=0    failed=0   


INSTALLER STATUS *********************************************************************************
Initialization             : Complete (0:00:16)
Logging Install            : Complete (0:05:21)


env:
# openshift version
openshift v3.9.3
kubernetes v1.9.1+a0ce1bc657
etcd 3.2.16

# rpm -qa | grep openshift-ansible
openshift-ansible-playbooks-3.9.3-1.git.0.e166207.el7.noarch
openshift-ansible-docs-3.9.3-1.git.0.e166207.el7.noarch
openshift-ansible-roles-3.9.3-1.git.0.e166207.el7.noarch
openshift-ansible-3.9.3-1.git.0.e166207.el7.noarch

Comment 26 errata-xmlrpc 2018-03-28 14:12:08 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0489

Comment 27 Noriko Hosoi 2018-07-19 15:32:44 UTC
*** Bug 1599518 has been marked as a duplicate of this bug. ***

