Description of problem:
Adding the mux pod failed with the error "Serial number 02 has already been issued". Checking the certificate files in /etc/origin/logging on the first master, we found that the number in ca.serial.txt is lower than the highest serial number recorded in ca.db. It seems the playbook wrongly overwrites ca.serial.txt.

[root@openshift-181 logging]# cat ca.serial.txt
02
[root@openshift-181 logging]# cat ca.db
V 191114073852Z 02 unknown /O=Logging/OU=OpenShift/CN=system.logging.fluentd
V 191114073853Z 03 unknown /O=Logging/OU=OpenShift/CN=system.logging.kibana
V 191114073855Z 04 unknown /O=Logging/OU=OpenShift/CN=system.logging.curator
V 191114073856Z 05 unknown /O=Logging/OU=OpenShift/CN=system.admin

By the way, there is no such issue with a fresh installation.

Version-Release number of selected component (if applicable):
openshift-ansible-3.7.7-1.git.0.3e1b62b.el7.noarch

How reproducible:
Always

Steps to Reproduce:
1) Deploy logging without the mux pod:
openshift_logging_install_logging=true
openshift_logging_image_version=v3.7
openshift_logging_namespace=logging
openshift_logging_es_cluster_size=1
openshift_logging_es_nodeselector={'logging-node': '1'}
2) Add the mux variables to the inventory file:
openshift_logging_use_mux=true
openshift_logging_mux_client_mode=maximal
3) Deploy the mux pods using openshift-logging.yml.

Actual results:
task path: /usr/share/ansible/openshift-ansible/roles/openshift_logging/tasks/generate_pems.yaml:31
Using module file /usr/lib/python2.7/site-packages/ansible/modules/commands/command.py
    "changed": true,
    "cmd": [
        "openssl",
        "ca",
        "-in",
        "/etc/origin/logging/system.logging.mux.csr",
        "-notext",
        "-out",
        "/etc/origin/logging/system.logging.mux.crt",
        "-config",
        "/etc/origin/logging/signing.conf",
        "-extensions",
        "v3_req",
        "-batch",
        "-extensions",
        "server_ext"
    ],
    "delta": "0:00:00.008351",
    "end": "2017-11-14 03:13:01.203513",
    "failed": true,
    "invocation": {
        "module_args": {
            "_raw_params": "openssl ca -in /etc/origin/logging/system.logging.mux.csr -notext -out /etc/origin/logging/system.logging.mux.crt -config /etc/origin/logging/signing.conf -extensions v3_req -batch -extensions server_ext",
            "_uses_shell": false,
            "chdir": null,
            "creates": null,
            "executable": null,
            "removes": null,
            "warn": true
        }
    },
    "rc": 1,
    "start": "2017-11-14 03:13:01.195162",
    "stderr": "Using configuration from /etc/origin/logging/signing.conf\nCheck that the request matches the signature\nSignature ok\nERROR:Serial number 02 has already been issued,\n check the database/serial_file for corruption\nThe matching entry has the following details\nType :Valid\nExpires on :191114073852Z\nSerial Number :02\nFile name :unknown\nSubject Name :/O=Logging/OU=OpenShift/CN=system.logging.fluentd",
    "stderr_lines": [
        "Using configuration from /etc/origin/logging/signing.conf",
        "Check that the request matches the signature",
        "Signature ok",
        "ERROR:Serial number 02 has already been issued,",
        " check the database/serial_file for corruption",
        "The matching entry has the following details",
        "Type :Valid",
        "Expires on :191114073852Z",
        "Serial Number :02",
        "File name :unknown",
        "Subject Name :/O=Logging/OU=OpenShift/CN=system.logging.fluentd"
    ],
    "stdout": "",
    "stdout_lines": []

Expected results:
The mux pod can be added without error.

Additional info:
The workaround is to remove /etc/origin/logging and redeploy all certificates.
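The mismatch described above can be checked mechanically. The following is a minimal sketch, not part of the playbook: it assumes ca.db uses the whitespace-separated layout shown in this report (status, expiry, serial, file name, subject), that only valid ("V") entries are present, and that ca.serial.txt holds the next serial number openssl will try to issue. The function names are hypothetical.

```python
# Sample data copied from the report above.
REPORT_CA_DB = """\
V 191114073852Z 02 unknown /O=Logging/OU=OpenShift/CN=system.logging.fluentd
V 191114073853Z 03 unknown /O=Logging/OU=OpenShift/CN=system.logging.kibana
V 191114073855Z 04 unknown /O=Logging/OU=OpenShift/CN=system.logging.curator
V 191114073856Z 05 unknown /O=Logging/OU=OpenShift/CN=system.admin
"""


def issued_serials(ca_db_text):
    """Collect the hex serial numbers of valid ("V") entries as integers."""
    serials = set()
    for line in ca_db_text.splitlines():
        fields = line.split()
        # Assumption: for "V" lines the serial is the third field.
        # Revoked entries carry an extra date field and are skipped here.
        if len(fields) >= 5 and fields[0] == "V":
            serials.add(int(fields[2], 16))
    return serials


def serial_collides(ca_db_text, serial_file_text):
    """Return True if the next serial has already been issued."""
    next_serial = int(serial_file_text.strip(), 16)
    return next_serial in issued_serials(ca_db_text)


if __name__ == "__main__":
    # With the files from this report, the next serial (02) is already taken.
    print(serial_collides(REPORT_CA_DB, "02\n"))
```

Running a check like this before signing a new certificate would flag the reset serial file before openssl rejects the request.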
Created attachment 1351835: /etc/origin/logging
Created attachment 1351836: Deploy logging
I could reproduce the problem without Ansible.

1) Deployed common logging without mux.
==> /etc/origin/logging/ca.serial.txt <==
09

2) Ran:
sudo oc adm --config=/etc/origin/master/admin.kubeconfig ca create-server-cert --key=/etc/origin/logging/mux.key --cert=/etc/origin/logging/mux.crt --hostnames='logging-mux, mux.ec2-52-70-1-212.compute-1.amazonaws.com' --signer-cert=/etc/origin/logging/ca.crt --signer-key=/etc/origin/logging/ca.key --signer-serial=/etc/origin/logging/ca.serial.txt
==> /etc/origin/logging/ca.serial.txt <==
02

Note that the mux cert creation did not fail for me because the serial IDs already issued were all equal to or greater than 04, so there was no conflict with 02. (That also puzzles me, though: why did it start with 04?)

V 191208180736Z 04 unknown /O=Logging/OU=OpenShift/CN=system.logging.fluentd
V 191208180737Z 05 unknown /O=Logging/OU=OpenShift/CN=system.logging.kibana
V 191208180737Z 06 unknown /O=Logging/OU=OpenShift/CN=system.logging.curator
V 191208180738Z 07 unknown /O=Logging/OU=OpenShift/CN=system.admin
V 191208180739Z 08 unknown /O=Logging/OU=OpenShift/CN=system.logging.es

Could someone who is familiar with the origin crypto code take a look?
"./origin/pkg/cmd/server/crypto/crypto.go"
Hello Command Line Interface expert,

Could you please help us with running "oc adm ca create-server-cert" with a --signer-serial file? The number in the serial file is unexpectedly reset/lowered.

Thanks.
I do not see anything incorrect in the commands run. We ignore an error here https://github.com/openshift/origin/blob/master/pkg/cmd/server/crypto/crypto.go#L282 that could be the reason the file has 02 written to it (but the file seems to have valid starting data, so I do not know why it would matter). David, you wrote this code; do you see anything off?
Actually it seems like we reset that file: https://github.com/openshift/origin/blob/master/pkg/cmd/server/crypto/crypto.go#L388 You can try running the command with --overwrite=false which will prevent that file from being reset as long as all files needed for the command to run are present and valid.
Unfortunately, adding "--overwrite=false" did not help.

Here are ca.serial.txt and ca.db before running "oc adm ca create-server-cert":

$ cat ca.serial.txt
09
$ ls ca.db
[origin@ip-172-18-8-158 logging]$ cat ca.db
V 191214183824Z 04 unknown /O=Logging/OU=OpenShift/CN=system.logging.fluentd
V 191214183825Z 05 unknown /O=Logging/OU=OpenShift/CN=system.logging.kibana
V 191214183825Z 06 unknown /O=Logging/OU=OpenShift/CN=system.logging.curator
V 191214183826Z 07 unknown /O=Logging/OU=OpenShift/CN=system.admin
V 191214183827Z 08 unknown /O=Logging/OU=OpenShift/CN=system.logging.es

Then I ran this command line:

$ sudo oc adm --config=/etc/origin/master/admin.kubeconfig ca create-server-cert --key=/etc/origin/logging/mux.key --cert=/etc/origin/logging/mux.crt --hostnames='logging-mux, mux.ec2-52-70-1-212.compute-1.amazonaws.com' --signer-cert=/etc/origin/logging/ca.crt --signer-key=/etc/origin/logging/ca.key --signer-serial=/etc/origin/logging/ca.serial.txt --overwrite=false

Then ca.serial.txt was reset, as originally reported.

$ cat ca.serial.txt
02

Another question: the line you mentioned in #c6 is in MakeCA.
https://github.com/openshift/origin/blob/master/pkg/cmd/server/crypto/crypto.go#L388
The command line we are having this issue with creates a server cert with an existing CA key/cert.
We'll have to reproduce and analyze locally. David confirms he sees nothing wrong.
Commits pushed to master at https://github.com/openshift/openshift-ansible https://github.com/openshift/openshift-ansible/commit/ac23e6e362d8758032c1dd573d0ff6a958445df5 Bug 1512825 - add mux pod failed for Serial number 02 has already been issued According to mkhan, to run the "oc adm ca create-server-cert" command line with --signer-serial option, the following changes need to be made. 1. adding --overwrite=false 2. <ca.serial.txt> should contain only [0-9A-F]*. (no trailing newlines are allowed for now) This patch solves 1. https://github.com/openshift/openshift-ansible/commit/4d93123e9626657e55ce03cb8a0288a6ba5e3f2e Merge pull request #6798 from nhosoi/bz1512825 Automatic merge from submit-queue. Bug 1512825 - add mux pod failed for Serial number 02 has already been issued According to mkhan, to run the "oc adm ca create-server-cert" command line with --signer-serial option, the following changes need to be made. 1. adding --overwrite=false 2. <ca.serial.txt> should contain only [0-9A-F]*. (no trailing newlines are allowed for now) This patch solves 1.
https://github.com/openshift/origin/pull/18405
(In reply to Mo from comment #10)
> https://github.com/openshift/origin/pull/18405

Thank you for the PR, @Mo. From our point of view, we are setting --overwrite=false ourselves, and there is almost no chance of running into a race condition when generating certs. So as long as the latter half of your PR 18405 is merged, it would solve our problem. Thanks!
(In reply to openshift-github-bot from comment #9)
> Commits pushed to master at https://github.com/openshift/openshift-ansible
>
> https://github.com/openshift/openshift-ansible/commit/ac23e6e362d8758032c1dd573d0ff6a958445df5
> Bug 1512825 - add mux pod failed for Serial number 02 has already been issued
>
> According to mkhan, to run the "oc adm ca create-server-cert" command
> line with --signer-serial option, the following changes need to be made.
> 1. adding --overwrite=false
> 2. <ca.serial.txt> should contain only [0-9A-F]*.
> (no trailing newlines are allowed for now)
>
> This patch solves 1.

So my initial understanding of the bug was incorrect. Specifically, no change is required to the overwrite flag, as it deals with the certs being generated and does not change how the serial file is interacted with. Based on the discussion in origin, we decided not to change the default for overwrite, as that would be a backwards incompatible change in some scenarios.

With this in mind, I have opened https://github.com/openshift/openshift-ansible/pull/7155 to revert the change opened to address point 1 above.

The only bug in the command was how it read from and wrote to the serial file (point 2 above). https://github.com/openshift/origin/pull/18405 handles that case across all commands that interact with the serial file.
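To illustrate how a trailing newline could lead to a lowered serial, here is a minimal sketch of one plausible reading of this thread: a strict hex parse rejects "09\n", and if that error is ignored (as mentioned earlier for crypto.go), the counter silently starts over. This is not the origin code; the function names are hypothetical, the strict parser is a stand-in for a parser that rejects any non-hex character, and the exact value the real command ends up writing may differ.

```python
import re

# Hypothetical strict format: hex digits only, nothing else (no newline).
_HEX_ONLY = re.compile(r"[0-9A-Fa-f]+")


def parse_serial_strict(contents):
    """Parse a serial, rejecting anything that is not purely hex digits."""
    if _HEX_ONLY.fullmatch(contents) is None:
        raise ValueError("not a bare hex serial: %r" % contents)
    return int(contents, 16)


def next_serial_swallowing_errors(contents):
    """Compute the next serial while ignoring parse errors (the suspected flaw)."""
    try:
        current = parse_serial_strict(contents)
    except ValueError:
        current = 0  # error swallowed: the counter silently restarts
    return current + 1


if __name__ == "__main__":
    print(next_serial_swallowing_errors("09"))    # bare hex parses normally
    print(next_serial_swallowing_errors("09\n"))  # newline makes the parse fail
```

Under these assumptions a file written by openssl, which ends in a newline, would be treated as if it contained no usable serial at all.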
Commits pushed to master at https://github.com/openshift/openshift-ansible https://github.com/openshift/openshift-ansible/commit/3db1b63ab35cac6821c8795f7a84ef51407fbb6f Revert "Bug 1512825 - add mux pod failed for Serial number 02 has already been issued" This reverts commit ac23e6e362d8758032c1dd573d0ff6a958445df5. That commit introduced a backwards incompatible change to how the commands run. This undoes that. The original change was not required to prevent overwriting of the serial file. Bug 1512825 https://github.com/openshift/openshift-ansible/commit/5160996a7bca9053720cd6230537f7acacaa2134 Merge pull request #7155 from enj/enj/i/revert_overwrite_certs/1512825 Automatic merge from submit-queue. Revert "Bug 1512825 - add mux pod failed for Serial number 02 has already been issued" @sdodson @nhosoi This reverts #6798 as we are not changing the default in origin https://github.com/openshift/origin/pull/18405. This reverts commit ac23e6e362d8758032c1dd573d0ff6a958445df5. That commit introduced a backwards incompatible change to how the commands run. This undoes that. The original change was not required to prevent overwriting of the serial file. Bug 1512825
Commits pushed to master at https://github.com/openshift/origin https://github.com/openshift/origin/commit/266aa46bd0a5728d3a8831a6321cd1c3fd8b2360 Correctly handle newlines in SerialFileGenerator This change makes it so that SerialFileGenerator correctly reads and writes serial files that end in a newline. This allows it to interoperate with other tools that may interact with the serial file such as openssl. Tests were added to assert that the incrementing logic works and that the serial file always ends with a newline. Bug 1512825 Signed-off-by: Monis Khan <mkhan> https://github.com/openshift/origin/commit/7642cc86bc4243c475670c7d4bb07e6093410bcf Merge pull request #18405 from enj/enj/i/ca_serial_number/1512825 Automatic merge from submit-queue (batch tested with PRs 18404, 18405). Correctly handle newlines in serial files Correctly handle newlines in SerialFileGenerator This change makes it so that `SerialFileGenerator` correctly reads and writes serial files that end in a newline. This allows it to interoperate with other tools that may interact with the serial file such as openssl. Tests were added to assert that the incrementing logic works and that the serial file always ends with a newline. [Bug 1512825](https://bugzilla.redhat.com/show_bug.cgi?id=1512825) Signed-off-by: Monis Khan <mkhan> --- /kind bug /assign @simo5 @deads2k @openshift/sig-security
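The behavior described in the commit message (read a serial file that ends in a newline, increment, and always write a trailing newline) can be sketched as follows. This is an illustrative Python sketch, not the Go SerialFileGenerator; the function name is hypothetical, and it assumes the openssl convention seen in this bug, where the file holds the next unused serial as uppercase hex.

```python
def take_serial(path):
    """Return the serial stored in the file and advance the file by one.

    Surrounding whitespace (including a trailing newline) is tolerated on
    read; anything that is still not valid hex raises ValueError instead of
    being silently ignored.
    """
    with open(path) as f:
        current = int(f.read().strip(), 16)
    with open(path, "w") as f:
        # Always end with a newline so openssl and similar tools can read it.
        f.write("%02X\n" % (current + 1))
    return current
```

With a file containing "09\n", the first call returns 9 and leaves "0A\n" in the file, so the sequence continues from the existing value regardless of which tool wrote the file last.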
Deployed logging with mux; the installation steps no longer throw the error "ERROR:Serial number 02 has already been issued".

Scenarios:
1. Fresh install of logging with mux
2. Re-install logging with mux again
3. Install logging without mux, delete logging, and then install logging with mux

PLAY RECAP *********************************************************************************
host-8-248-214.host.centralci.eng.rdu2.redhat.com : ok=326 changed=75 unreachable=0 failed=0
host-8-248-79.host.centralci.eng.rdu2.redhat.com : ok=0 changed=0 unreachable=0 failed=0
localhost : ok=11 changed=0 unreachable=0 failed=0

INSTALLER STATUS *********************************************************************************
Initialization : Complete (0:00:16)
Logging Install : Complete (0:05:21)

env:
# openshift version
openshift v3.9.3
kubernetes v1.9.1+a0ce1bc657
etcd 3.2.16

# rpm -qa | grep openshift-ansible
openshift-ansible-playbooks-3.9.3-1.git.0.e166207.el7.noarch
openshift-ansible-docs-3.9.3-1.git.0.e166207.el7.noarch
openshift-ansible-roles-3.9.3-1.git.0.e166207.el7.noarch
openshift-ansible-3.9.3-1.git.0.e166207.el7.noarch
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:0489
*** Bug 1599518 has been marked as a duplicate of this bug. ***