Bug 2094752 - rhcd service fails to start when configured with stage environment
Summary: rhcd service fails to start when configured with stage environment
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: rhc
Version: 8.6
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Link Dupont
QA Contact: jaudet
URL:
Whiteboard:
Depends On:
Blocks: 2095598 2095599
Reported: 2022-06-08 09:26 UTC by Jameer Pathan
Modified: 2022-06-17 16:55 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-06-17 16:55:20 UTC
Type: Bug
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker RHELPLAN-124628 0 None None None 2022-06-08 09:41:20 UTC
Red Hat Issue Tracker SAT-9993 0 None None None 2022-06-08 09:30:20 UTC

Description Jameer Pathan 2022-06-08 09:26:09 UTC
Description of problem:
rhcd service fails to start when configured with stage environment

Version-Release number of selected component (if applicable):
- rhel 8.6
- rhc-0.2.1-7.el8.x86_64

How reproducible:
- Always

Steps to Reproduce:

--- With stage environment, rhel 8.6 ---

Register the RHEL 8.6 host with stage RHSM:
***
[root@dhcp-3-227 ~]# subscription-manager register --user=username --password=password
Registering to: subscription.rhsm.stage.redhat.com:443/subscription
The system has been registered with ID: 2935c4c1-0abc-4004-908a-21cf30e21ae6
The registered system name is: dhcp-3-227.example.com
***

Update /etc/rhc/config.toml:
***
[root@dhcp-3-227 ~]# cat /etc/rhc/config.toml 
# rhc global configuration settings

broker = ["wss://connect.cloud.stage.redhat.com:443"]
data-host = "cert.cloud.stage.redhat.com"
cert-file = "/etc/pki/consumer/cert.pem"
key-file = "/etc/pki/consumer/key.pem"
log-level = "error"
***

Run "rhc connect":
***
[root@dhcp-3-227 ~]# rhc connect                       ** notice exit status 1 **
Connecting dhcp-3-227.example.com to Red Hat.
This might take a few seconds.

● This system is already connected to Red Hat Subscription Management
exit status 1
***

Logs from systemctl start rhcd:
***
[root@dhcp-3-227 ~]# systemctl start rhcd
[root@dhcp-3-227 ~]# systemctl status rhcd
● rhcd.service - Red Hat connector daemon
   Loaded: loaded (/usr/lib/systemd/system/rhcd.service; disabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Wed 2022-06-08 03:03:59 EDT; 7s ago
     Docs: https://github.com/redhatinsights/yggdrasil
  Process: 2736 ExecStart=/usr/sbin/rhcd (code=exited, status=1/FAILURE)
 Main PID: 2736 (code=exited, status=1/FAILURE)

Jun 08 03:03:57 dhcp-3-227.example.com systemd[1]: Started Red Hat connector daemon.
Jun 08 03:03:59 dhcp-3-227.example.com rhcd[2736]: cannot connect to broker: network Error : websocket: close 1006 (abnormal closure): unexpected EOF
Jun 08 03:03:59 dhcp-3-227.example.com systemd[1]: rhcd.service: Main process exited, code=exited, status=1/FAILURE
Jun 08 03:03:59 dhcp-3-227.example.com systemd[1]: rhcd.service: Failed with result 'exit-code'.

[root@dhcp-3-227 ~]# journalctl --unit=rhcd
-- Logs begin at Wed 2022-06-08 02:49:22 EDT, end at Wed 2022-06-08 03:03:59 EDT. --
Jun 08 03:03:57 dhcp-3-227.example.com systemd[1]: Started Red Hat connector daemon.
Jun 08 03:03:59 dhcp-3-227.example.com rhcd[2736]: cannot connect to broker: network Error : websocket: close 1006 (abnormal closure): unexpected EOF
Jun 08 03:03:59 dhcp-3-227.example.com systemd[1]: rhcd.service: Main process exited, code=exited, status=1/FAILURE
Jun 08 03:03:59 dhcp-3-227.example.com systemd[1]: rhcd.service: Failed with result 'exit-code'.
***

--- With production environment, rhel 8.6 ---

[root@dhcp-3-130 ~]# subscription-manager register --user=username --password=password
Registering to: subscription.rhsm.redhat.com:443/subscription
The system has been registered with ID: 46e8d792-0bf6-4109-93b8-d770004de0f2
The registered system name is: dhcp-3-130.example.com

[root@dhcp-3-130 ~]# cat /etc/rhc/config.toml 
# rhc global configuration settings

broker = ["wss://connect.cloud.redhat.com:443"]
cert-file = "/etc/pki/consumer/cert.pem"
key-file = "/etc/pki/consumer/key.pem"
log-level = "error"

[root@dhcp-3-130 ~]# rhc connect
Connecting dhcp-3-130.example.com to Red Hat.
This might take a few seconds.

● This system is already connected to Red Hat Subscription Management
● Connected to Red Hat Insights
● Activated the Red Hat connector daemon

Manage your Red Hat connector systems: https://red.ht/connector

[root@dhcp-3-130 ~]# systemctl start rhcd
[root@dhcp-3-130 ~]# journalctl --unit=rhcd
-- Logs begin at Wed 2022-06-08 03:00:02 EDT, end at Wed 2022-06-08 03:08:38 EDT. --
Jun 08 03:08:19 dhcp-3-130.example.com systemd[1]: Started Red Hat connector daemon.

Actual results:
- rhcd service fails to start when configured with stage environment

Expected results:
- rhcd service works when configured with stage environment

Comment 2 Link Dupont 2022-06-08 10:52:41 UTC
Can you set the 'log-level' to "debug", start rhcd.service again, and attach the output of `journalctl --unit=rhcd.service`?

Comment 3 Jameer Pathan 2022-06-08 11:42:49 UTC
Jun 08 05:23:28 dhcp-3-227.example.com systemd[1]: Started Red Hat connector daemon.
Jun 08 05:23:29 dhcp-3-227.example.com rhcd[3147]: cannot connect to broker: network Error : websocket: close 1006 (abnormal closure): unexpected EOF
Jun 08 05:23:29 dhcp-3-227.example.com systemd[1]: rhcd.service: Main process exited, code=exited, status=1/FAILURE
Jun 08 05:23:29 dhcp-3-227.example.com systemd[1]: rhcd.service: Failed with result 'exit-code'.
Jun 08 07:31:39 dhcp-3-227.example.com systemd[1]: Started Red Hat connector daemon.
Jun 08 07:31:39 dhcp-3-227.example.com rhcd[3309]: [rhcd] 2022/06/08 07:31:39 /builddir/build/BUILD/rhc/yggdrasil-0.2.1/cmd/yggd/main.go:160: starting rhcd version 0.2.1
Jun 08 07:31:39 dhcp-3-227.example.com rhcd[3309]: [rhcd] 2022/06/08 07:31:39 /builddir/build/BUILD/rhc/yggdrasil-0.2.1/cmd/yggd/main.go:209: listening on socket: @yggd-dispatcher-iEEDtI
Jun 08 07:31:41 dhcp-3-227.example.com rhcd[3309]: cannot connect to broker: network Error : websocket: close 1006 (abnormal closure): unexpected EOF
Jun 08 07:31:41 dhcp-3-227.example.com systemd[1]: rhcd.service: Main process exited, code=exited, status=1/FAILURE
Jun 08 07:31:41 dhcp-3-227.example.com systemd[1]: rhcd.service: Failed with result 'exit-code'.

Comment 4 jaudet 2022-06-08 14:07:59 UTC
I provisioned a RHEL 8.6 VM and executed the following:

subscription-manager register --username insights-qa --serverurl https://subscription.rhsm.stage.redhat.com:443
subscription-manager attach --pool=8a82d2b77fc27a39018029d31d115815
vi /etc/rhc/config.toml  # set broker and data-host
rhc connect  # exit 1

The `rhc connect` command failed, as originally reported. I then provisioned a second RHEL 8.6 VM and executed the following:

subscription-manager register --username insights-qa
vi /etc/rhc/config.toml  # set broker and data-host
rhc connect  # exit 0

The `rhc connect` command succeeded. Perhaps this indicates there is some difficulty with the certificate provided by stage RHSM?

Comment 6 Rodrigo Antunes 2022-06-08 15:01:08 UTC
It seems that the certificate generated when you subscribe the host to STAGE RHSM has changed.

The subject now contains not only the CN but also O = org_id.

New cert (openssl x509 -in /etc/pki/consumer/cert.pem -text --noout):
Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number: 1626572696888250470 (0x1692bf3ff75f9066)
        Signature Algorithm: sha256WithRSAEncryption
        Issuer: C = US, ST = North Carolina, O = "Red Hat, Inc.", OU = Red Hat Network, CN = Red Hat Candlepin Authority, emailAddress = ca-support
        Validity
            Not Before: Jun  8 10:34:34 2022 GMT
            Not After : Jun  8 11:34:34 2023 GMT
        Subject: O = 11789772, CN = 6011c284-905d-47f2-a450-2a0bafd71e3c

Old cert (openssl x509 -in /root/old-cert.pem -text --noout):
Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number: 9215938700743403232 (0x7fe5976b9195e6e0)
        Signature Algorithm: sha256WithRSAEncryption
        Issuer: C = US, ST = North Carolina, O = "Red Hat, Inc.", OU = Red Hat Network, CN = Red Hat Candlepin Authority, emailAddress = ca-support
        Validity
            Not Before: Jul  2 13:06:48 2021 GMT
            Not After : Jul  2 14:06:48 2022 GMT
        Subject: CN = 0272bb50-57e8-4476-981a-ac25fb969abd

When I try to start rhcd using the new cert, I get:
[root@iqe-vm-rhc-e2e-sbbylrgbnp ~]# rhcd --broker wss://connect.cloud.stage.redhat.com:443 --cert-file /etc/pki/consumer/cert.pem --key-file /etc/pki/consumer/key.pem --log-level debug
[rhcd] 2022/06/08 10:00:25 /builddir/build/BUILD/rhc/yggdrasil-0.2.1/cmd/yggd/main.go:160: starting rhcd version 0.2.1
[rhcd] 2022/06/08 10:00:25 /builddir/build/BUILD/rhc/yggdrasil-0.2.1/cmd/yggd/main.go:209: listening on socket: @yggd-dispatcher-oHfBIE
cannot connect to broker: network Error : websocket: close 1006 (abnormal closure): unexpected EOF

Using the old cert I don't have any errors:
[root@iqe-vm-rhc-e2e-sbbylrgbnp ~]# rhcd --broker wss://connect.cloud.stage.redhat.com:443 --cert-file /root/old-cert.pem --key-file /root/old-key.pem --log-level debug
[rhcd] 2022/06/08 10:00:52 /builddir/build/BUILD/rhc/yggdrasil-0.2.1/cmd/yggd/main.go:160: starting rhcd version 0.2.1
[rhcd] 2022/06/08 10:00:52 /builddir/build/BUILD/rhc/yggdrasil-0.2.1/cmd/yggd/main.go:209: listening on socket: @yggd-dispatcher-VOtOqf
[rhcd] 2022/06/08 10:00:53 /builddir/build/BUILD/rhc/yggdrasil-0.2.1/cmd/yggd/main.go:337: starting worker: rhc-package-manager-worker
[rhcd] 2022/06/08 10:00:53 /builddir/build/BUILD/rhc/yggdrasil-0.2.1/cmd/yggd/main.go:337: starting worker: rhc-worker-playbook.worker
[rhcd] 2022/06/08 10:00:53 /builddir/build/BUILD/rhc/yggdrasil-0.2.1/cmd/yggd/main.go:357: cannot start watching '/etc/rhc/tags.toml': lstat /etc/rhc/tags.toml: no such file or directory
[rhcd] 2022/06/08 10:00:53 /builddir/build/BUILD/rhc/yggdrasil-0.2.1/cmd/yggd/exec.go:54: started process: 18303
[rhcd] 2022/06/08 10:00:53 /builddir/build/BUILD/rhc/yggdrasil-0.2.1/cmd/yggd/exec.go:92: watching process: 18303
[rhcd] 2022/06/08 10:00:53 /builddir/build/BUILD/rhc/yggdrasil-0.2.1/cmd/yggd/exec.go:54: started process: 18307
[rhcd] 2022/06/08 10:00:53 /builddir/build/BUILD/rhc/yggdrasil-0.2.1/cmd/yggd/exec.go:92: watching process: 18307
[rhcd] 2022/06/08 10:00:53 /builddir/build/BUILD/rhc/yggdrasil-0.2.1/cmd/yggd/grpc.go:69: worker registered: {pid:18303 handler:package-manager addr:@ygg-package-manager-RmNFAr features:map[] detachedContent:false}
[rhcd] 2022/06/08 10:00:53 /builddir/build/BUILD/rhc/yggdrasil-0.2.1/cmd/yggd/mqtt.go:131: published message 1d6d806e-30b4-4028-a7a1-f16c73ab2e20 to topic redhat/insights/0272bb50-57e8-4476-981a-ac25fb969abd/control/out
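
To make the difference between the two subjects concrete, here is a minimal Python sketch that splits an openssl-style subject string into its attributes and derives a control topic from the CN. The helper names (`parse_subject`, `control_topic`) are hypothetical, and the topic layout `redhat/insights/<CN>/control/<direction>` is only inferred from the `published message` log line above. This is an illustration, not how rhcd itself is implemented.

```python
def parse_subject(subject: str) -> dict:
    """Split 'O = 11789772, CN = abc' into {'O': '11789772', 'CN': 'abc'}.

    Naive parser: assumes no commas or '=' inside attribute values, which
    holds for the two consumer certificate subjects shown in this comment.
    """
    attrs = {}
    for part in subject.split(","):
        key, _, value = part.partition("=")
        attrs[key.strip()] = value.strip()
    return attrs


def control_topic(subject: str, direction: str = "out") -> str:
    """Build the (assumed) control topic for the client named by the CN."""
    return f"redhat/insights/{parse_subject(subject)['CN']}/control/{direction}"


# Subjects copied from the two certificates above.
old_subject = "CN = 0272bb50-57e8-4476-981a-ac25fb969abd"
new_subject = "O = 11789772, CN = 6011c284-905d-47f2-a450-2a0bafd71e3c"

print(parse_subject(new_subject))
print(control_topic(old_subject))
```

In this sketch the CN is recovered from both subject forms; whether the broker accepts a subject that also carries O is a separate question.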

Comment 7 Link Dupont 2022-06-08 15:13:19 UTC
I'm able to reproduce this as well. Will start looking into a root cause.

Comment 8 Link Dupont 2022-06-08 15:44:37 UTC
Same error when I connect directly to the broker using `mqttcli/sub`:

[root@rhel-8-dev ~]# sub -broker wss://connect.cloud.stage.redhat.com:443 -cert-file /etc/pki/consumer/cert.pem -key-file /etc/pki/consumer/key.pem -topic redhat/insights/8924d620-3d96-48fd-b0ee-d730a92fc07f/control/in
2022/06/08 11:41:45 connect failed: network Error : websocket: close 1006 (abnormal closure): unexpected EOF
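
For readers unfamiliar with the error text: close code 1006 is reserved by RFC 6455 and is never sent on the wire. It is reported locally when the connection drops without a close frame, so it says that the connection ended abruptly, not why. A small Python sketch (the helper `close_code` is hypothetical) that pulls the code out of a log line and maps it to its RFC 6455 meaning:

```python
import re

# Close codes defined in RFC 6455, section 7.4.1.
CLOSE_CODES = {
    1000: "normal closure",
    1001: "going away",
    1002: "protocol error",
    1003: "unsupported data",
    1005: "no status received",
    1006: "abnormal closure",
    1007: "invalid frame payload data",
    1008: "policy violation",
    1009: "message too big",
    1010: "mandatory extension",
    1011: "internal server error",
    1015: "TLS handshake",
}


def close_code(log_line: str):
    """Return the WebSocket close code found in a log line, or None."""
    match = re.search(r"websocket: close (\d{4})", log_line)
    return int(match.group(1)) if match else None


# Error text copied from the `sub` output above.
line = ("connect failed: network Error : websocket: close 1006 "
        "(abnormal closure): unexpected EOF")
code = close_code(line)
print(code, CLOSE_CODES.get(code, "unknown"))
```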

Comment 9 Link Dupont 2022-06-08 15:54:20 UTC
Parsing the certificate subject isn't the problem here. I stepped through the parseCertCN function[1] and it returns the CommonName from the subject correctly. And because it happens to mqttcli[2] as well (which doesn't parse the contents of the certificate at all), I strongly doubt this is a client-side issue. I wonder if there is some problem with TLS handshaking with the broker.

1: https://github.com/RedHatInsights/yggdrasil/blob/main/cmd/yggd/util.go#L31
2: https://git.sr.ht/~spc/mqttcli/tree/main/item/mqtt.go#L39

Comment 10 Link Dupont 2022-06-08 16:28:51 UTC
JWT authentication is unaffected; this only affects mTLS authentication (which rhcd uses).

[link@thelio rhc-demo]$ yggctl generate control-message --type command '{"command":"ping"}' | pub -config ldupont-teNiem6C.cfg -topic redhat/insights/8924d620-3d96-48fd-b0ee-d730a92fc07f/control/in
2022/06/08 12:26:46 connected: wss://connect.cloud.stage.redhat.com:443
2022/06/08 12:26:46 published: [redhat/insights/8924d620-3d96-48fd-b0ee-d730a92fc07f/control/in] [123 34 116 121 112 101 34 58 34 99 111 109 109 97 110 100 34 44 34 109 101 115 115 97 103 101 95 105 100 34 58 34 101 54 50 52 102 50 53 98 45 101 52 101 57 45 52 49 102 50 45 98 102 53 48 45 53 102 52 51 99 99 48 52 53 97 100 54 34 44 34 114 101 115 112 111 110 115 101 95 116 111 34 58 34 34 44 34 118 101 114 115 105 111 110 34 58 49 44 34 115 101 110 116 34 58 34 50 48 50 50 45 48 54 45 48 56 84 49 50 58 50 54 58 52 53 46 52 52 50 50 54 52 56 57 55 45 48 52 58 48 48 34 44 34 99 111 110 116 101 110 116 34 58 123 34 99 111 109 109 97 110 100 34 58 34 112 105 110 103 34 44 34 97 114 103 117 109 101 110 116 115 34 58 110 117 108 108 125 125 10]

[link@thelio rhc-demo]$ sub -config ldupont-Izah3bu7.cfg -topic redhat/insights/8924d620-3d96-48fd-b0ee-d730a92fc07f/control/in
2022/06/08 12:28:18 connected: wss://connect.cloud.stage.redhat.com:443
2022/06/08 12:28:18 subscribed: redhat/insights/8924d620-3d96-48fd-b0ee-d730a92fc07f/control/in
2022/06/08 12:28:20 [redhat/insights/8924d620-3d96-48fd-b0ee-d730a92fc07f/control/in] {"type":"command","message_id":"94facd7c-ac17-4f8a-9e68-cccec1430117","response_to":"","version":1,"sent":"2022-06-08T12:28:20.234849158-04:00","content":{"command":"ping","arguments":null}}
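
The `published:` line earlier in this comment prints the payload as decimal byte values, which makes it hard to compare with the JSON shown by `sub`. A minimal Python sketch (the function name `decode_payload` is hypothetical) that turns that byte list back into the control message:

```python
import json

# Decimal byte values copied from the `published:` line of the `pub` output.
raw = (
    "123 34 116 121 112 101 34 58 34 99 111 109 109 97 110 100 34 44 "
    "34 109 101 115 115 97 103 101 95 105 100 34 58 "
    "34 101 54 50 52 102 50 53 98 45 101 52 101 57 45 52 49 102 50 45 "
    "98 102 53 48 45 53 102 52 51 99 99 48 52 53 97 100 54 34 44 "
    "34 114 101 115 112 111 110 115 101 95 116 111 34 58 34 34 44 "
    "34 118 101 114 115 105 111 110 34 58 49 44 "
    "34 115 101 110 116 34 58 34 50 48 50 50 45 48 54 45 48 56 84 49 50 "
    "58 50 54 58 52 53 46 52 52 50 50 54 52 56 57 55 45 48 52 58 48 48 34 44 "
    "34 99 111 110 116 101 110 116 34 58 123 "
    "34 99 111 109 109 97 110 100 34 58 34 112 105 110 103 34 44 "
    "34 97 114 103 117 109 101 110 116 115 34 58 110 117 108 108 125 125 10"
)


def decode_payload(byte_list: str) -> dict:
    """Convert space-separated decimal byte values into a parsed JSON object."""
    data = bytes(int(value) for value in byte_list.split())
    return json.loads(data.decode("utf-8"))


message = decode_payload(raw)
print(message["type"], message["content"])
```

The decoded message has the same shape as the one received by `sub`: a `command` control message whose content is the `ping` command.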

Comment 11 Eric Helms 2022-06-09 13:26:40 UTC
A note from our own testing of these workflows that may not be as obvious in the description, there are no errors thrown if you:

 1) Register to production subscription.rhsm.redhat.com
 2) Install rhc
 3) Configure rhc to point at the stage environment (keeping RHSM pointed at production and using production certificates)
 4) Start rhcd


So the connection to the broker in stage appears to work if the box is registered to production RHSM rather than stage RHSM.

Comment 12 Link Dupont 2022-06-09 13:43:04 UTC
(In reply to Eric Helms from comment #11)
> A note from our own testing of these workflows that may not be as obvious in
> the description, there are no errors thrown if you:
> 
>  1) Register to production subscription.rhsm.redhat.com
>  2) Install rhc
>  3) Configure rhc to point at the stage environment (keeping RHSM pointed at
> production and using production certificates)
>  4) Start rhcd
> 
> 
> So the connection to the broker in stage appears to work if the box is
> registered to production RHSM rather than stage RHSM.

The service might start successfully, but did you test whether or not you can publish messages to the host? I think the broker will silently accept authenticated connections from clients, but if they do not have the right ACLs to the topic(s) the client subscribed to, no messages will be received on those topics. We do have some machinery in place in cloud-connector to tell hosts to disconnect if they don't have the correct credentials, so it does seem odd that a production certificate would work correctly when used to authenticate to the stage broker.

Comment 19 Link Dupont 2022-06-17 16:55:20 UTC
Resolving as NOTABUG; this was not actually a bug in rhc.

