Bug 2136916 - The inst.sshd boot option no longer works
Summary: The inst.sshd boot option no longer works
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: systemd
Version: rawhide
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Assignee: Michal Sekletar
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Duplicates: 2141140
Depends On:
Blocks:
Reported: 2022-10-21 20:38 UTC by Martin Kolman
Modified: 2022-12-09 20:28 UTC (History)

Fixed In Version: systemd-252.3-594.fc38
Clone Of:
Environment:
Last Closed: 2022-12-08 22:38:45 UTC
Type: Bug
Embargoed:


Attachments
package list for the 20221007 image where inst.sshd works (26.22 KB, text/plain)
2022-10-21 20:39 UTC, Martin Kolman
package list for the 20221008 image where inst.sshd no longer works (26.23 KB, text/plain)
2022-10-21 20:40 UTC, Martin Kolman
journal dump from the 20221007 image (651.15 KB, text/plain)
2022-10-24 13:03 UTC, Martin Kolman
journal dump from the 20221008 image (878.81 KB, text/plain)
2022-10-24 13:07 UTC, Martin Kolman
journal dump from the 20221008 image but with inst.sshd boot option being used (761.21 KB, text/plain)
2022-10-24 13:20 UTC, Martin Kolman
Rawhide systemctl output (1.99 KB, text/plain)
2022-11-07 22:05 UTC, Jiri Konecny
Fedora 26 systemctl output (1.33 KB, text/plain)
2022-11-07 22:06 UTC, Jiri Konecny


Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker FC-637 0 None None None 2022-10-24 13:47:37 UTC

Internal Links: 2133187

Description Martin Kolman 2022-10-21 20:38:19 UTC
Description of problem:
Looks like something changed in Rawhide and it's no longer possible to SSH into the installer boot iso after passing the inst.sshd boot option.

Version-Release number of selected component (if applicable):
I had an approximate idea when this started happening, so I went over the Rawhide boot.iso images released in that time period and tracked it down to the 20221008 compose. That is the first image I'm not able to SSH into; 20221007 works fine.

I've dumped the lorax package list from /root on both images and got this diff:

275,278c275,278
< kernel-6.1.0-0.rc0.20221006git833477fce7a1.4.fc38.x86_64
< kernel-core-6.1.0-0.rc0.20221006git833477fce7a1.4.fc38.x86_64
< kernel-modules-6.1.0-0.rc0.20221006git833477fce7a1.4.fc38.x86_64
< kernel-modules-extra-6.1.0-0.rc0.20221006git833477fce7a1.4.fc38.x86_64
---
> kernel-6.1.0-0.rc0.20221007git4c86114194e6.5.fc38.x86_64
> kernel-core-6.1.0-0.rc0.20221007git4c86114194e6.5.fc38.x86_64
> kernel-modules-6.1.0-0.rc0.20221007git4c86114194e6.5.fc38.x86_64
> kernel-modules-extra-6.1.0-0.rc0.20221007git4c86114194e6.5.fc38.x86_64
392c392
< libgpg-error-1.45-2.fc37.x86_64
---
> libgpg-error-1.46-1.fc38.x86_64
406c406
< libksba-1.6.1-1.fc38.x86_64
---
> libksba-1.6.2-1.fc38.x86_64
566,568c566,568
< openssh-9.0p1-5.fc38.x86_64
< openssh-clients-9.0p1-5.fc38.x86_64
< openssh-server-9.0p1-5.fc38.x86_64
---
> openssh-9.0p1-6.fc38.x86_64
> openssh-clients-9.0p1-6.fc38.x86_64
> openssh-server-9.0p1-6.fc38.x86_64
712,716c712,716
< systemd-251.5-607.fc38.x86_64
< systemd-libs-251.5-607.fc38.x86_64
< systemd-pam-251.5-607.fc38.x86_64
< systemd-resolved-251.5-607.fc38.x86_64
< systemd-udev-251.5-607.fc38.x86_64
---
> systemd-252~rc1-608.fc38.x86_64
> systemd-libs-252~rc1-608.fc38.x86_64
> systemd-pam-252~rc1-608.fc38.x86_64
> systemd-resolved-252~rc1-608.fc38.x86_64
> systemd-udev-252~rc1-608.fc38.x86_64
747,748c747,748
< wireplumber-0.4.11-4.fc38.x86_64
< wireplumber-libs-0.4.11-4.fc38.x86_64
---
> wireplumber-0.4.12-1.fc38.x86_64
> wireplumber-libs-0.4.12-1.fc38.x86_64
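
The same comparison can be done by package name rather than by line position, which makes it easier to see which packages actually changed between two composes. This is only a rough sketch: it assumes each line has the usual name-version-release.arch shape, and the function names as well as the "unchanged-pkg" entry are made up for illustration (they do not come from the attached package lists).

```python
def split_nvra(pkg):
    """Split 'name-version-release.arch' into its four parts."""
    nvr, arch = pkg.rsplit(".", 1)
    name, version, release = nvr.rsplit("-", 2)
    return name, version, release, arch


def changed_packages(old_list, new_list):
    """Map package name -> (old entry, new entry) for entries that differ."""
    old = {split_nvra(p)[0]: p for p in old_list}
    new = {split_nvra(p)[0]: p for p in new_list}
    changes = {}
    for name in sorted(old.keys() | new.keys()):
        if old.get(name) != new.get(name):
            changes[name] = (old.get(name), new.get(name))
    return changes


# A few entries taken from the diff above, plus one hypothetical
# unchanged package to show that identical entries are skipped.
old_pkgs = [
    "libksba-1.6.1-1.fc38.x86_64",
    "openssh-server-9.0p1-5.fc38.x86_64",
    "systemd-251.5-607.fc38.x86_64",
    "unchanged-pkg-1.0-1.fc38.noarch",  # hypothetical
]
new_pkgs = [
    "libksba-1.6.2-1.fc38.x86_64",
    "openssh-server-9.0p1-6.fc38.x86_64",
    "systemd-252~rc1-608.fc38.x86_64",
    "unchanged-pkg-1.0-1.fc38.noarch",  # hypothetical
]

for name, (before, after) in changed_packages(old_pkgs, new_pkgs).items():
    print(f"{name}: {before} -> {after}")
```

Keying on the package name keeps a version bump as a single "changed" entry instead of one removed line plus one added line.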


As you can see, there is no anaconda update, so it's most likely not caused by Anaconda code changes. Still filing against the anaconda component for now, as this could still be related to how we run sshd, unless we find otherwise.

There is an openssh update, but looking at its changelog:
https://koji.fedoraproject.org/koji/buildinfo?buildID=2072342

It does not look that incriminating:

* Wed Oct 05 2022 Anthony Rabbito <hello> - 9.0p1-6
- Add a socket unit to ssh-agent user unit (rhbz#2125576)

* Thu Sep 29 2022 Dmitry Belyavskiy <dbelyavs> - 9.0p1-5
- RSAMinSize => RequiredRSASize

There is also a systemd update:

https://koji.fedoraproject.org/koji/buildinfo?buildID=2075107

* Sun Oct 09 2022 Zbigniew Jędrzejewski-Szmek <zbyszek.pl>
- Fix indentation in %sysusers_create_compat macro (rhbz#2132835)

* Sun Oct 09 2022 Zbigniew Jędrzejewski-Szmek <zbyszek.pl>
- Correctly move systemd-measure to systemd-udev subpackage

* Fri Oct 07 2022 Zbigniew Jędrzejewski-Szmek <zbyszek.pl>
- Version 252-rc1 (for details see
  https://raw.githubusercontent.com/systemd/systemd/v252-rc1/NEWS)

* Sat Oct 01 2022 Zbigniew Jędrzejewski-Szmek <zbyszek.pl>
- Fix permissions on %ghost files (rhbz#2122889)

* Sat Oct 01 2022 Zbigniew Jędrzejewski-Szmek <zbyszek.pl>
- Version 251.5 (rhbz#2129343, rhbz#2121106, rhbz#2130188)

Also nothing obvious as far as I can tell. :P



How reproducible:
Always.

Steps to Reproduce:
1. boot the image
2. switch to TTY2 or second tab in tmux running on TTY1 to get into root shell
3. ssh to localhost or remotely


Actual results:
SSH connection fails with error:


$ ssh root@10.43.136.123 -vvv
OpenSSH_8.8p1, OpenSSL 3.0.5 5 Jul 2022
debug1: Reading configuration data /home/mkolman/.ssh/config
debug1: Reading configuration data /etc/ssh/ssh_config
debug3: /etc/ssh/ssh_config line 55: Including file /etc/ssh/ssh_config.d/50-redhat.conf depth 0
debug1: Reading configuration data /etc/ssh/ssh_config.d/50-redhat.conf
debug2: checking match for 'final all' host 10.43.136.123 originally 10.43.136.123
debug3: /etc/ssh/ssh_config.d/50-redhat.conf line 3: not matched 'final'
debug2: match not found
debug3: /etc/ssh/ssh_config.d/50-redhat.conf line 5: Including file /etc/crypto-policies/back-ends/openssh.config depth 1 (parse only)
debug1: Reading configuration data /etc/crypto-policies/back-ends/openssh.config
debug3: gss kex names ok: [gss-curve25519-sha256-,gss-nistp256-sha256-,gss-group14-sha256-,gss-group16-sha512-,gss-gex-sha1-,gss-group14-sha1-,gss-group1-sha1-]
debug3: kex names ok: [curve25519-sha256,curve25519-sha256,ecdh-sha2-nistp256,ecdh-sha2-nistp384,ecdh-sha2-nistp521,diffie-hellman-group-exchange-sha256,diffie-hellman-group14-sha256,diffie-hellman-group16-sha512,diffie-hellman-group18-sha512,diffie-hellman-group-exchange-sha1,diffie-hellman-group14-sha1,diffie-hellman-group1-sha1]
debug1: configuration requests final Match pass
debug2: resolve_canonicalize: hostname 10.43.136.123 is address
debug1: re-parsing configuration
debug1: Reading configuration data /home/mkolman/.ssh/config
debug1: Reading configuration data /etc/ssh/ssh_config
debug3: /etc/ssh/ssh_config line 55: Including file /etc/ssh/ssh_config.d/50-redhat.conf depth 0
debug1: Reading configuration data /etc/ssh/ssh_config.d/50-redhat.conf
debug2: checking match for 'final all' host 10.43.136.123 originally 10.43.136.123
debug3: /etc/ssh/ssh_config.d/50-redhat.conf line 3: matched 'final'
debug2: match found
debug3: /etc/ssh/ssh_config.d/50-redhat.conf line 5: Including file /etc/crypto-policies/back-ends/openssh.config depth 1
debug1: Reading configuration data /etc/crypto-policies/back-ends/openssh.config
debug3: gss kex names ok: [gss-curve25519-sha256-,gss-nistp256-sha256-,gss-group14-sha256-,gss-group16-sha512-,gss-gex-sha1-,gss-group14-sha1-,gss-group1-sha1-]
debug3: kex names ok: [curve25519-sha256,curve25519-sha256,ecdh-sha2-nistp256,ecdh-sha2-nistp384,ecdh-sha2-nistp521,diffie-hellman-group-exchange-sha256,diffie-hellman-group14-sha256,diffie-hellman-group16-sha512,diffie-hellman-group18-sha512,diffie-hellman-group-exchange-sha1,diffie-hellman-group14-sha1,diffie-hellman-group1-sha1]
debug3: expanded UserKnownHostsFile '~/.ssh/known_hosts' -> '/home/mkolman/.ssh/known_hosts'
debug3: expanded UserKnownHostsFile '~/.ssh/known_hosts2' -> '/home/mkolman/.ssh/known_hosts2'
debug3: ssh_connect_direct: entering
debug1: Connecting to 10.43.136.123 [10.43.136.123] port 22.
debug3: set_sock_tos: set socket 3 IP_TOS 0x48
debug1: Connection established.
debug1: identity file /home/mkolman/.ssh/id_rsa type 0
debug1: identity file /home/mkolman/.ssh/id_rsa-cert type -1
debug1: identity file /home/mkolman/.ssh/id_dsa type -1
debug1: identity file /home/mkolman/.ssh/id_dsa-cert type -1
debug1: identity file /home/mkolman/.ssh/id_ecdsa type -1
debug1: identity file /home/mkolman/.ssh/id_ecdsa-cert type -1
debug1: identity file /home/mkolman/.ssh/id_ecdsa_sk type -1
debug1: identity file /home/mkolman/.ssh/id_ecdsa_sk-cert type -1
debug1: identity file /home/mkolman/.ssh/id_ed25519 type -1
debug1: identity file /home/mkolman/.ssh/id_ed25519-cert type -1
debug1: identity file /home/mkolman/.ssh/id_ed25519_sk type -1
debug1: identity file /home/mkolman/.ssh/id_ed25519_sk-cert type -1
debug1: identity file /home/mkolman/.ssh/id_xmss type -1
debug1: identity file /home/mkolman/.ssh/id_xmss-cert type -1
debug1: Local version string SSH-2.0-OpenSSH_8.8
debug1: kex_exchange_identification: banner line 0: /etc/ssh/sshd_config: No such file or directory
kex_exchange_identification: read: Connection reset by peer
Connection reset by 10.43.136.123 port 22


The most interesting line here seems to be: 

debug1: kex_exchange_identification: banner line 0: /etc/ssh/sshd_config: No such file or directory
kex_exchange_identification: read: Connection reset by peer

Indeed, we don't have this exact file on the image, but that has not changed: the 20221007 image does not have it either and it still works. We do have /etc/ssh/sshd_config.d and /etc/ssh/sshd_config.anaconda, which we use when starting sshd from a systemd unit:

https://github.com/rhinstaller/anaconda/blob/361206c051fd469a63d7dd73b362c6c09bf0720d/data/systemd/anaconda-sshd.service#L14


[Unit]
Description=OpenSSH server daemon
Before=anaconda.target
After=syslog.target network.target sshd-keygen.target
Wants=sshd-keygen.target
ConditionKernelCommandLine=|inst.sshd
ConditionKernelCommandLine=!inst.sshd=0
ConditionPathIsDirectory=|/sys/hypervisor/s390

[Service]
EnvironmentFile=-/etc/crypto-policies/back-ends/opensshserver.config
EnvironmentFile=-/etc/sysconfig/sshd
ExecStartPre=/usr/sbin/handle-sshpw
ExecStart=/usr/sbin/sshd -D $OPTIONS $CRYPTO_POLICY -f /etc/ssh/sshd_config.anaconda
ExecReload=/bin/kill -HUP $MAINPID
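
As a reading aid for the Condition* lines in the unit above: conditions prefixed with | are triggering conditions (at least one of them must hold), while the remaining ones must all hold. Below is a minimal Python sketch of that logic. It assumes the documented systemd behaviour for ConditionKernelCommandLine= (a bare word also matches when it appears as the left-hand side of an assignment). The function names and the s390_dir_exists parameter are hypothetical and are not part of Anaconda or systemd.

```python
def cmdline_has(cmdline, arg):
    """Mimic ConditionKernelCommandLine= matching for one argument."""
    words = cmdline.split()
    if "=" in arg:
        # An assignment must appear exactly as written.
        return arg in words
    # A bare word matches itself or the left-hand side of an assignment.
    return any(word.split("=", 1)[0] == arg for word in words)


def unit_would_start(cmdline, s390_dir_exists=False):
    """Evaluate the three conditions of anaconda-sshd.service."""
    triggering = [
        cmdline_has(cmdline, "inst.sshd"),   # ConditionKernelCommandLine=|inst.sshd
        s390_dir_exists,                     # ConditionPathIsDirectory=|/sys/hypervisor/s390
    ]
    regular = [
        not cmdline_has(cmdline, "inst.sshd=0"),  # ConditionKernelCommandLine=!inst.sshd=0
    ]
    return all(regular) and any(triggering)


for cmdline in ("quiet", "quiet inst.sshd", "quiet inst.sshd=0"):
    print(cmdline, "->", unit_would_start(cmdline))
```

Note that inst.sshd=0 satisfies the triggering condition (the bare word matches the assignment) and is then rejected by the negated condition, which is why both lines are needed in the unit.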


Expected results:
It is possible to SSH in if inst.sshd is passed.

Additional info:
This was first detected in Cockpit CI during Rawhide boot iso image refresh used to test the Anaconda Web UI:

https://github.com/cockpit-project/bots/pull/3966

The cockpit infra gates all the images & this image is gated by Anaconda Web UI tests, which among other things SSH into the boot iso, which now fails. Interestingly no other CI test has detected this so far in the 13 days since the regression has been in place, which is something that could be improved.

Comment 1 Martin Kolman 2022-10-21 20:39:21 UTC
Created attachment 1919497 [details]
package list for the 20221007 image where inst.sshd works

Comment 2 Martin Kolman 2022-10-21 20:40:12 UTC
Created attachment 1919498 [details]
package list for the 20221008 image where inst.sshd no longer works

Comment 3 Martin Kolman 2022-10-24 13:03:49 UTC
Created attachment 1919987 [details]
journal dump from the 20221007 image

This is a journal dump from the 20221007 image, with openssh-9.0p1-5. Note that there is *no* "Listening on sshd" line in it.

Comment 4 Martin Kolman 2022-10-24 13:04:34 UTC
(In reply to Martin Kolman from comment #3)
> Created attachment 1919987 [details]
> journal dump from the 20221007 image
> 
> This is a journal dump from the 20221007 image, with openssh-9.0p1-5. Note
> that there is *no* "Listening on sshd" line in it.

Slight correction - should have been "Listening on sshd.socket".

Comment 5 Martin Kolman 2022-10-24 13:07:36 UTC
Created attachment 1919988 [details]
journal dump from the 20221008 image

This is a journal dump from the 20221008 image with openssh-9.0p1-6 - note that there *is* now the "Listening on sshd.socket" line, that was not present on the 20221007 image.

Comment 6 Martin Kolman 2022-10-24 13:20:59 UTC
Created attachment 1919989 [details]
journal dump from the 20221008 image but with inst.sshd boot option being used

This is a journal dump from the 20221008 image (which has openssh-9.0p1-6), but this time the inst.sshd option has been set. Looking at the contents of the log file, we can see that the sshd.socket unit is active:


"Oct 24 11:46:06 fedora systemd[1]: Listening on sshd.socket - OpenSSH Server Socket."


Also, as inst.sshd has been passed, the anaconda-sshd.service should be started:


Oct 24 11:46:07 fedora systemd[1]: Starting anaconda-sshd.service - OpenSSH server daemon...


But then:


subj=system_u:system_r:kernel_t:s0 msg='unit=anaconda-sshd comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Oct 24 11:46:09 dhcp113.anaconda.englab.brq.redhat.com systemd[1]: Started anaconda-sshd.service - OpenSSH server daemon.
Oct 24 11:46:09 dhcp113.anaconda.englab.brq.redhat.com kernel: kauditd_printk_skb: 49 callbacks suppressed
Oct 24 11:46:09 dhcp113.anaconda.englab.brq.redhat.com kernel: audit: type=1130 audit(1666611969.745:219): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:kernel_t:s0 msg='unit=anaconda-sshd comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Oct 24 11:46:09 dhcp113.anaconda.englab.brq.redhat.com systemd[1]: anaconda-direct.service - the anaconda installation program was skipped because no trigger condition checks were met.
Oct 24 11:46:09 dhcp113.anaconda.englab.brq.redhat.com systemd[1]: Reached target anaconda.target - Anaconda System Services.
Oct 24 11:46:09 dhcp113.anaconda.englab.brq.redhat.com systemd[1]: Starting anaconda.service - Anaconda...
Oct 24 11:46:09 dhcp113.anaconda.englab.brq.redhat.com sshd[1842]: error: Bind to port 22 on 0.0.0.0 failed: Address already in use.
Oct 24 11:46:09 dhcp113.anaconda.englab.brq.redhat.com systemd[1]: anaconda-sshd.service: Main process exited, code=exited, status=255/EXCEPTION
Oct 24 11:46:09 dhcp113.anaconda.englab.brq.redhat.com sshd[1842]: error: Bind to port 22 on :: failed: Address already in use.
Oct 24 11:46:09 dhcp113.anaconda.englab.brq.redhat.com systemd[1]: anaconda-sshd.service: Failed with result 'exit-code'.
Oct 24 11:46:09 dhcp113.anaconda.englab.brq.redhat.com sshd[1842]: fatal: Cannot bind any address.
Oct 24 11:46:09 dhcp113.anaconda.englab.brq.redhat.com audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:kernel_t:s0 msg='unit=anaconda-sshd comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=failed'
Oct 24 11:46:09 dhcp113.anaconda.englab.brq.redhat.com systemd[1]: anaconda-sshd.service: Consumed 2.089s CPU time.
Oct 24 11:46:09 dhcp113.anaconda.englab.brq.redhat.com kernel: audit: type=1131 audit(1666611969.821:220): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:kernel_t:s0 msg='unit=anaconda-sshd comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=failed'


So it looks like the SSH port (22) is already occupied by the sshd.socket service and "our" sshd service can't be started as a result.

This did not happen on older <=20221007 images with openssh <=openssh-9.0p1-6 - so anaconda-sshd.service was able to bind to port 22 successfully.
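
The "Address already in use" failure in the log is the generic result of two listeners competing for the same address and port. A minimal sketch that reproduces the same error class in Python, using the loopback interface and a kernel-assigned port instead of port 22 (the function name is hypothetical; nothing here touches sshd or systemd):

```python
import errno
import socket


def second_bind_fails():
    """Bind and listen on one socket, then try to bind a second one to the same port."""
    first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        first.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
        first.listen(1)
        port = first.getsockname()[1]
        try:
            second.bind(("127.0.0.1", port))
        except OSError as exc:
            # Same condition sshd reported as "Address already in use".
            return exc.errno == errno.EADDRINUSE
        return False
    finally:
        first.close()
        second.close()


print("second bind refused:", second_bind_fails())
```

In the installer case the first listener is the socket held by systemd for sshd.socket, and the second one is the sshd process started by anaconda-sshd.service.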

Comment 7 Martin Kolman 2022-10-24 13:30:50 UTC
(In reply to Martin Kolman from comment #6)
> Created attachment 1919989 [details]
> journal dump from the 20221008 image but with inst.sshd boot option being
> used
> 
> This is journal dump from the 20221008 image (which has openssh-9.0p1-6) but
> this time the inst.sshd option has been set. Looking at the contents of the
> log file, we can see that the sshd.socket service is active:
> 
> 
> "Oct 24 11:46:06 fedora systemd[1]: Listening on sshd.socket - OpenSSH
> Server Socket."
> 
> 
> Also, as inst.sshd has been passed, the anaconda-sshd.service should be
> started:
> 
> 
> Oct 24 11:46:07 fedora systemd[1]: Starting anaconda-sshd.service - OpenSSH
> server daemon...
> 
> 
> But then:
> 
> 
> subj=system_u:system_r:kernel_t:s0 msg='unit=anaconda-sshd comm="systemd"
> exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
> Oct 24 11:46:09 dhcp113.anaconda.englab.brq.redhat.com systemd[1]: Started
> anaconda-sshd.service - OpenSSH server daemon.
> Oct 24 11:46:09 dhcp113.anaconda.englab.brq.redhat.com kernel:
> kauditd_printk_skb: 49 callbacks suppressed
> Oct 24 11:46:09 dhcp113.anaconda.englab.brq.redhat.com kernel: audit:
> type=1130 audit(1666611969.745:219): pid=1 uid=0 auid=4294967295
> ses=4294967295 subj=system_u:system_r:kernel_t:s0 msg='unit=anaconda-sshd
> comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=?
> res=success'
> Oct 24 11:46:09 dhcp113.anaconda.englab.brq.redhat.com systemd[1]:
> anaconda-direct.service - the anaconda installation program was skipped
> because no trigger condition checks were met.
> Oct 24 11:46:09 dhcp113.anaconda.englab.brq.redhat.com systemd[1]: Reached
> target anaconda.target - Anaconda System Services.
> Oct 24 11:46:09 dhcp113.anaconda.englab.brq.redhat.com systemd[1]: Starting
> anaconda.service - Anaconda...
> Oct 24 11:46:09 dhcp113.anaconda.englab.brq.redhat.com sshd[1842]: error:
> Bind to port 22 on 0.0.0.0 failed: Address already in use.
> Oct 24 11:46:09 dhcp113.anaconda.englab.brq.redhat.com systemd[1]:
> anaconda-sshd.service: Main process exited, code=exited, status=255/EXCEPTION
> Oct 24 11:46:09 dhcp113.anaconda.englab.brq.redhat.com sshd[1842]: error:
> Bind to port 22 on :: failed: Address already in use.
> Oct 24 11:46:09 dhcp113.anaconda.englab.brq.redhat.com systemd[1]:
> anaconda-sshd.service: Failed with result 'exit-code'.
> Oct 24 11:46:09 dhcp113.anaconda.englab.brq.redhat.com sshd[1842]: fatal:
> Cannot bind any address.
> Oct 24 11:46:09 dhcp113.anaconda.englab.brq.redhat.com audit[1]:
> SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295
> subj=system_u:system_r:kernel_t:s0 msg='unit=anaconda-sshd comm="systemd"
> exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=failed'
> Oct 24 11:46:09 dhcp113.anaconda.englab.brq.redhat.com systemd[1]:
> anaconda-sshd.service: Consumed 2.089s CPU time.
> Oct 24 11:46:09 dhcp113.anaconda.englab.brq.redhat.com kernel: audit:
> type=1131 audit(1666611969.821:220): pid=1 uid=0 auid=4294967295
> ses=4294967295 subj=system_u:system_r:kernel_t:s0 msg='unit=anaconda-sshd
> comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=?
> res=failed'
> 
> 
> So it looks like the SSH port (22) is already occupied by the sshd.socket
> service and "our" sshd service can't be started as a result.
> 
> This did not happen on older <=20221007 images with openssh
> <=openssh-9.0p1-6 - so anaconda-sshd.service was able to bind to port 22
> successfully.

One more typo. :P Should have been <=openssh-9.0p1-5 - the last "known good" openssh version in this regard.

Comment 8 Martin Kolman 2022-10-24 13:43:16 UTC
First - thanks a lot to Vendula Poncova for noticing port 22 is being occupied by sshd.socket - that sent me in the right direction! :)

So based on the logs attached to comments 3-7 it seems very likely this regression has been caused by a change in openssh-9.0p1-6. From the changelog:

* Wed Oct 05 2022 Anthony Rabbito <hello> - 9.0p1-6
- Add a socket unit to ssh-agent user unit (rhbz#2125576)

Looking into distgit, looks like these are the actual changes:

https://src.fedoraproject.org/rpms/openssh/c/9417892cb75727f72757dec758d16d115356da68?branch=rawhide
https://src.fedoraproject.org/rpms/openssh/c/11b8701db94897460eb06709597e25a1e8ea14f6?branch=rawhide
https://src.fedoraproject.org/rpms/openssh/c/499c2eb7ecdc6c790450411a945c8c35b0e74301?branch=rawhide

Even though neither directly touches sshd.socket, the ssh-agent.service and ssh-agent.socket changes must have somehow activated sshd.socket:


diff --git a/ssh-agent.service b/ssh-agent.service
index c215022..50a9ea1 100644
--- a/ssh-agent.service
+++ b/ssh-agent.service
@@ -5,6 +5,7 @@
 ConditionEnvironment=!SSH_AGENT_PID
 Description=OpenSSH key agent
 Documentation=man:ssh-agent(1) man:ssh-add(1) man:ssh(1)
+Requires=ssh-agent.socket
 
 [Service]
 Environment=SSH_AUTH_SOCK=%t/ssh-agent.socket
@@ -12,3 +13,6 @@ ExecStart=/usr/bin/ssh-agent -a $SSH_AUTH_SOCK
 PassEnvironment=SSH_AGENT_PID
 SuccessExitStatus=2
 Type=forking
+
+[Install]
+Also=ssh-agent.socket
diff --git a/ssh-agent.socket b/ssh-agent.socket
new file mode 100644
index 0000000..d589cbc
--- /dev/null
+++ b/ssh-agent.socket
@@ -0,0 +1,14 @@
+[Unit]
+Description=OpenSSH key agent
+Documentation=man:ssh-agent(1) man:ssh-add(1) man:ssh(1)
+
+[Socket]
+ListenStream=%t/ssh-agent.socket
+Service=ssh-agent.service
+Priority=6
+Backlog=5
+SocketMode=0600
+DirectoryMode=0700
+
+[Install]
+WantedBy=sockets.target


So as this has been clearly caused by a recent change in openssh, switching the component to openssh. 

Still, please let us know if you need more information about the installation environment or want to test a prospective fix - thanks! :)

Comment 9 Martin Kolman 2022-10-25 16:21:15 UTC
Possible workaround:

0. make sure inst.sshd is passed as boot option
1. stop the sshd socket unit:

systemctl stop sshd.socket

2. start anaconda sshd service:

systemctl start anaconda-sshd

Comment 10 Martin Kolman 2022-11-07 13:58:23 UTC
Any updates on this? This is still blocking the CI image refresh for our Web UI tests (making it more and more likely that things break with the outdated image), as well as making any runtime installation issues hard to debug.

A speedy fix or a revert of the patch causing this issue would be appreciated!

Comment 11 Alexander Sosedkin 2022-11-07 14:43:40 UTC
Hi, a concerned passerby here.
Since we've ended up with two competitors for port 22: which one do we want to win?
https://github.com/rhinstaller/anaconda/blob/master/data/systemd/anaconda-sshd.service
or
https://src.fedoraproject.org/rpms/openssh/blob/rawhide/f/sshd.socket ?

Do I guess correctly that we can't leave sshd.socket on in the installation image
since the user must opt into having SSH open to the public?
In that case, can installation images just... not ship sshd.socket?

needinfo'ing jkonecny at random^W^Was the latest non-comment committer to anaconda-sshd.service
in hopes that he can shed more light on why it is like it is.

Comment 12 Jiri Konecny 2022-11-07 16:03:32 UTC
Hi Alexander,

honestly, we shouldn't be fighting over the socket. Anaconda is not doing anything special; we are just using sshd anyway. We need to find a way to work with the new solution so that inst.sshd still works as expected. For that we need input from the openssh maintainers.

IMHO I would expect that our service would change and use their socket, or maybe we should really just remove it from the installation environment, because we don't want to have it auto-enabled anyway?

Comment 13 Martin Kolman 2022-11-07 16:38:35 UTC
For the record, bug 2125576 being addressed is what triggered the clash with anaconda-sshd.service.

Comment 14 Dmitry Belyavskiy 2022-11-07 17:44:49 UTC
Dear colleagues,

Unfortunately I don't understand the whole picture here so before changing this place I'd like to get some shared opinion.

Comment 15 Brian Lane 2022-11-07 19:15:51 UTC
It looks to me like this can be fixed by just disabling sshd.socket on the boot.iso -- PR is here https://github.com/weldr/lorax/pull/1283

Comment 16 Jiri Konecny 2022-11-07 20:38:44 UTC
Hi Brian, thanks for coming up with the fix, but I would like to first find out if this service should really be enabled by default.

The sshd.socket unit seems to enable SSH by default on Fedora, which I don't think is intentional.

Dmitry, could you please confirm my assumption above? And if I'm correct, then we need to resolve this issue and not enable sshd.socket by default (in general I see that as a security hole).

Comment 17 Brian Lane 2022-11-07 21:51:00 UTC
(In reply to Jiri Konecny from comment #16)
> Hi Brian, thanks for coming up with the fix, but I would like to first find
> out if this service should really be enabled by default.

That's a separate issue from the title of this bug. Although I now see this bug isn't assigned to anaconda, could have sworn it was when I looked at it earlier. Seems safer to me to disable it in lorax, no matter what.

> 
> The sshd.socket unit seems to enable SSH by default on Fedora, which I don't
> think is intentional.
> 
> Dmitry, could you please confirm my assumption above? And if I'm correct, then
> we need to resolve this issue and not enable sshd.socket by default (in
> general I see that as a security hole).

As far as the boot.iso is concerned, yes it would be, if it worked. Luckily there is no config file and connections fail.

Comment 18 Jiri Konecny 2022-11-07 22:04:08 UTC
I did some testing about the default state of the sshd.socket:

In the installed Rawhide system:
sshd.socket is disabled

Installation environment Fedora 36:
sshd.socket is disabled

Installation environment Fedora Rawhide:
sshd.socket is enabled (which takes port 22 before the anaconda-sshd.service that we use when inst.sshd is passed)



We don't want sshd.socket enabled in the installation environment, and there is a regression in its default enabled state. I also wonder why sshd.socket is enabled in the installation environment but not in the installed system (maybe Anaconda prevents that, or Lorax is causing it?).


Also, more info about inst.sshd: it's just a simple service file:
https://github.com/rhinstaller/anaconda/blob/master/data/systemd/anaconda-sshd.service


So to make this work, all other sshd services/sockets have to be disabled. I'll attach logs from my investigation for comparison.

Comment 19 Jiri Konecny 2022-11-07 22:05:30 UTC
Created attachment 1922894 [details]
Rawhide systemctl output

Comment 20 Jiri Konecny 2022-11-07 22:06:13 UTC
Created attachment 1922895 [details]
Fedora 26 systemctl output

Comment 21 Jiri Konecny 2022-11-07 22:07:33 UTC
Sorry the last attachment is of course Fedora 36. The description has a typo.

Comment 22 Jiri Konecny 2022-11-07 22:10:07 UTC
(In reply to Brian Lane from comment #17)
> (In reply to Jiri Konecny from comment #16)
> > Hi Brian, thanks for coming up with the fix, but I would like to first find
> > out if this service should really be enabled by default.
> 
> That's a separate issue from the title of this bug. Although I now see this
> bug isn't assigned to anaconda, could have sworn it was when I looked at it
> earlier. Seems safer to me to disable it in lorax, no matter what.

It is safer, but we need to make this work in Image Builder as well, so a Lorax fix would be just half of it.

Another reason is that there could be other environments besides ours that face the same issue, so it should be fixed appropriately.

Comment 23 Jakub Jelen 2022-11-08 10:01:35 UTC
The sshd.socket unit should not be enabled by default, but an openssh update cannot enable it by itself anyway. The enabled services and sockets are listed in https://src.fedoraproject.org/rpms/fedora-release/blob/rawhide/f/90-default.preset and only sshd.service is there (as it should be), so if something enables sshd.socket for you, I would look into systemd. The change between the -5 and -6 releases of openssh could not have introduced anything related to that (unless systemd is terribly broken and enabling user services would also enable an unrelated system socket. This would also go against all the packaging guidelines). So I think the openssh change is a red herring.
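
To illustrate why the preset files should leave sshd.socket disabled, here is a rough Python sketch of how preset rules are resolved: the first matching rule wins, and if nothing matches, the documented systemd default is to enable the unit. The rule list is an assumption condensed for illustration (one enable line for sshd.service as mentioned above, followed by a catch-all "disable *"); it is not a copy of the real preset files, and preset_action is a hypothetical helper.

```python
from fnmatch import fnmatchcase


def preset_action(unit, rules):
    """Return 'enable' or 'disable' for a unit, first matching rule wins."""
    for action, pattern in rules:
        if fnmatchcase(unit, pattern):
            return action
    # systemd's documented default when no rule matches.
    return "enable"


# Assumed, heavily condensed rule set (illustration only).
rules = [
    ("enable", "sshd.service"),
    ("disable", "*"),
]

for unit in ("sshd.service", "sshd.socket"):
    print(unit, "->", preset_action(unit, rules))

# With no rules at all, every unit falls through to the default.
print("sshd.socket with no rules ->", preset_action("sshd.socket", []))
```

With rules like these in effect, sshd.socket can only end up enabled if the preset rules are not applied the way they are written.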

Comment 24 Anthony Rabbito 2022-11-09 02:40:59 UTC
I'm still having a hard time understanding how this change https://src.fedoraproject.org/rpms/openssh/pull-request/35#request_diff caused this RHBZ. It does not touch sshd.socket and the change is for the opposite effect of sshd.

Regardless, I mentioned to Jiri on the side that it could be worth using a systemd drop-in/override for sshd.service, taking the directives from https://github.com/rhinstaller/anaconda/blob/master/data/systemd/anaconda-sshd.service and just using the sshd unit with anaconda overrides. Either way, I think it's worth finding out how sshd magically started being enabled by default in the first place.

Comment 25 Jiri Konecny 2022-11-09 08:42:38 UTC
We contacted the systemd team to help us with debugging, and based on their debugging it seems that this is a bug in systemd.

This is a minimal reproducer on an installed system:

dnf -y --installroot=/var/lib/machines/f install passwd dnf fedora-release vim-minimal systemd systemd-networkd openssh-server

output:
```
Running scriptlet: openssh-server-9.0p1-8.fc38.x86_64
Created symlink /etc/systemd/system/multi-user.target.wants/sshd.service → /usr/lib/systemd/system/sshd.service.
Created symlink /etc/systemd/system/sockets.target.wants/sshd.socket → /usr/lib/systemd/system/sshd.socket.
```
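
The "Created symlink" lines are the thing to look for: a unit is enabled in an install root when a symlink to it exists in a *.wants directory under etc/systemd/system. A small Python sketch for listing those symlinks in a given root follows. The wanted_units and demo names are hypothetical, and demo() builds a throwaway directory tree that mimics the two symlinks from the output above rather than inspecting a real install root.

```python
import os
import tempfile
from pathlib import Path


def wanted_units(root):
    """List '<target>.wants/<unit>' symlinks under <root>/etc/systemd/system."""
    base = Path(root) / "etc/systemd/system"
    found = []
    for wants_dir in sorted(base.iterdir()):
        if wants_dir.name.endswith(".wants") and wants_dir.is_dir():
            for link in sorted(wants_dir.iterdir()):
                if link.is_symlink():
                    found.append(f"{wants_dir.name}/{link.name}")
    return found


def demo():
    """Recreate the two symlinks from the scriptlet output in a temp tree."""
    with tempfile.TemporaryDirectory() as root:
        for target, unit in (
            ("multi-user.target.wants", "sshd.service"),
            ("sockets.target.wants", "sshd.socket"),
        ):
            wants = Path(root) / "etc/systemd/system" / target
            wants.mkdir(parents=True)
            # The link target does not need to exist for this check.
            os.symlink(f"/usr/lib/systemd/system/{unit}", wants / unit)
        return wanted_units(root)


print(demo())
```

Pointing wanted_units() at a real install root such as the one used in the dnf command would show whether sockets.target.wants/sshd.socket was created there.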

Thanks a lot everyone for your help!

Meanwhile, as a hotfix, we would like to merge the lorax PR. That would also make the installation environment more robust in the future.

Comment 26 Michal Sekletar 2022-11-30 10:14:00 UTC
The underlying root cause seems to be the absence of a mounted /proc in the environment where systemctl preset is called on the unit file. As a result, all non-static unit files (i.e. those that have an [Install] section) are enabled by the preset operation. This is clearly wrong and a major bug.

Here is a minimal reproducer:

$ rpm -q systemd
systemd-252.2-591.fc38.x86_64

$ mount --bind / /mnt/root 

$ chroot /mnt/root

$ (chroot) systemctl list-unit-files | grep debug-shell.service
debug-shell.service                        disabled enabled

As seen in the above output, the preset state (3rd column) of debug-shell.service is enabled, which is clearly wrong. The preset state is reported correctly after mounting procfs.

$ (chroot) mount -t proc proc /proc
$ (chroot) systemctl list-unit-files | grep debug-shell.service
debug-shell.service                        disabled disabled
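
To compare the two runs across all units rather than a single grep, the outputs of systemctl list-unit-files can be captured and diffed on the preset column. This is a minimal Python sketch, assuming the three-column layout shown above (unit, state, preset); the function names are hypothetical and the sample strings contain only the two lines quoted in this comment.

```python
def preset_column(output):
    """Map unit name -> preset state for lines with exactly three columns."""
    presets = {}
    for line in output.splitlines():
        fields = line.split()
        if len(fields) == 3:  # skips the header, blank lines and the summary line
            unit, _state, preset = fields
            presets[unit] = preset
    return presets


def preset_flips(without_proc, with_proc):
    """Units whose preset column differs between the two captures."""
    before = preset_column(without_proc)
    after = preset_column(with_proc)
    return sorted(unit for unit in before if before[unit] != after.get(unit))


# The two lines from the reproducer above.
no_proc = "debug-shell.service                        disabled enabled"
with_proc = "debug-shell.service                        disabled disabled"

print(preset_flips(no_proc, with_proc))
```

Run against full captures, the result would list every unit whose reported preset depends on whether /proc is mounted.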

Comment 27 Zbigniew Jędrzejewski-Szmek 2022-12-02 16:46:07 UTC
https://github.com/systemd/systemd/pull/25581

Comment 28 Fedora Update System 2022-12-08 22:29:58 UTC
FEDORA-2022-7f1889cc8c has been submitted as an update to Fedora 38. https://bodhi.fedoraproject.org/updates/FEDORA-2022-7f1889cc8c

Comment 29 Fedora Update System 2022-12-08 22:38:45 UTC
FEDORA-2022-7f1889cc8c has been pushed to the Fedora 38 stable repository.
If problem still persists, please make note of it in this bug report.

Comment 30 Adam Williamson 2022-12-09 20:28:27 UTC
*** Bug 2141140 has been marked as a duplicate of this bug. ***

