Description of problem:
With the F5 router running under the hostnetwork scc, create an edge route. The router logs then print:
Error copying certificate openshift_route_default_secured-edge-route-https-cert to F5 BIG-IP. Output from scp command: unknown user 1000010000

Version-Release number of selected component (if applicable):
oc v3.2.0.6
kubernetes v1.2.0-36-g4a3f9c5
F5 router images id: 4f888f02bb09

How reproducible:
always

Steps to Reproduce:
1. Given that OpenShift and the F5 server are running
2. Create the F5 router with scc hostnetwork
3. Create an edge route with oc create -f https://raw.githubusercontent.com/openshift-qe/v3-testfiles/master/routing/edge/route_edge.json
4. Check the router logs

Actual results:
[root@ip-10-3-90-123 ~]# oc logs router-1-x3jcb
W0323 04:23:54.988062 1 f5.go:243] Strict certificate verification is *DISABLED*
I0323 04:23:55.478979 1 router.go:161] Router is including routes in all namespaces
E0323 04:24:52.135020 1 f5.go:1535] Error copying certificate openshift_route_default_secured-edge-route-https-cert to F5 BIG-IP. Output from scp command: unknown user 1000010000 Error: exit status 255
E0323 04:24:52.183018 1 f5.go:1528] Error deleting tempfile for certificate openshift_route_default_secured-edge-route-https-cert from F5 BIG-IP. Output from ssh command: No user exists for uid 1000010000 Error: exit status 255

Expected results:
This error should not appear and the route should work.

Additional info:
If the privileged scc is used in step 2, the error message is:
E0323 03:56:44.886367 1 f5.go:1535] Error copying certificate openshift_route_default_secured-edge-route-https-cert to F5 BIG-IP. Output from scp command: Warning: Permanently added '10.3.88.53' (RSA) to the list of known hosts. Permission denied (publickey,keyboard-interactive,hostbased). lost connection Error: exit status 1
E0323 03:56:51.341102 1 f5.go:1528] Error deleting tempfile for certificate openshift_route_default_secured-edge-route-https-cert from F5 BIG-IP. Output from ssh command: Warning: Permanently added '10.3.88.53' (RSA) to the list of known hosts. Permission denied (publickey,keyboard-interactive,hostbased). Error: exit status 255
E0323 03:56:51.341219 1 controller.go:85] exit status 1
Also, passthrough routes cannot be synced to the F5 server's policies. Since this blocks all F5 TLS testing, I am raising the severity to high.
This has to do with the changes to how the router user is now added to the hostnetwork scc, which means that the router user/uid inside the container has restrictive access/capabilities. @zhaozhanqi, remove the router user from the hostnetwork SCC and add it to the privileged SCC. That should make it work.
# oadm policy remove-scc-from-user hostnetwork -z router
# oadm policy add-scc-to-user privileged -z router
@Ram Ranganathan I also tried privileged, you can refer to the 'Additional info' in the bug description.
@zhaozhanqi, my bad - was late and I didn't notice the additional info section. But in any case, the privileged scc allows scp to proceed (not the user id issue) - it looks to be a credentials issue here (permission denied: invalid username/password?).
@Ram Ranganathan Yes, but I can scp the file to the F5 server manually using the ca key (--external-host-private-key=).
I checked your environment, the router.pem inside the container is not the same as ~/.ssh/id_rsa on the host. Did you start the router with the correct path to the external host key?
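One way to repeat this check is to copy the key out of the router container and compare it with the host key. The following is only a sketch: the helper name `keys_match` is made up for illustration, and the path to router.pem inside the container is a placeholder, since this thread does not state it.

```shell
# Sketch: succeed only if two key files are byte-for-byte identical.
# keys_match is a hypothetical helper name, not part of any OpenShift tool.
keys_match() {
    cmp -s "$1" "$2"
}

# Hypothetical usage; the in-container path below is a placeholder:
#   oc exec router-1-x3jcb -- cat /path/in/container/router.pem > /tmp/router.pem
#   if keys_match /tmp/router.pem ~/.ssh/id_rsa; then echo "same key"; else echo "keys differ"; fi
```

If the files differ, the router was most likely started with a different key than the one that works for a manual scp.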
Hi, this is still an issue when using scc/hostnetwork for the router service account, and hostnetwork is the default. See this message:
"error: router could not be created; service account "router" is not allowed to access the host network on nodes, grant access with oadm policy add-scc-to-user hostnetwork -z router"

The following are the router logs when using scc/hostnetwork:
[root@ip-10-3-90-123 ~]# oc logs router-1-hg56m
W0412 22:24:49.482868 1 f5.go:243] Strict certificate verification is *DISABLED*
I0412 22:24:50.019568 1 router.go:161] Router is including routes in all namespaces
E0412 22:24:52.814037 1 f5.go:1535] Error copying certificate openshift_route_default_secured-edge-route-https-cert to F5 BIG-IP. Output from scp command: unknown user 1000010000 Error: exit status 255
E0412 22:24:52.852055 1 f5.go:1528] Error deleting tempfile for certificate openshift_route_default_secured-edge-route-https-cert from F5 BIG-IP. Output from ssh command: No user exists for uid 1000010000 Error: exit status 255
E0412 22:24:52.852140 1 controller.go:85] exit status 255
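The "unknown user" and "No user exists for uid" messages are what scp and ssh print when the uid they run as has no entry in the passwd database. A rough way to confirm this from a shell inside the router container is sketched below; it assumes `id` and `getent` are available in the image, and the function name is made up for illustration.

```shell
# Sketch: report whether a uid resolves to a passwd entry.
# uid_has_passwd_entry is a hypothetical helper name.
uid_has_passwd_entry() {
    getent passwd "$1" >/dev/null 2>&1
}

# Check the uid given as the first argument, or the current uid if none is given.
check_uid="${1:-$(id -u)}"
if uid_has_passwd_entry "$check_uid"; then
    echo "uid $check_uid has a passwd entry"
else
    echo "uid $check_uid has no passwd entry; ssh/scp will refuse to run as this uid"
fi
```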
This error happens when the secret is stale. See comment#13 (https://bugzilla.redhat.com/show_bug.cgi?id=1320490#c13). Closing this bug. Re-open if the error is seen even when the keys are correct.
Please refer to comment 12. This still does not work with the hostnetwork scc.
(In reply to zhaozhanqi from comment #18)
> Please refer to comment 12. This still does not work with the hostnetwork scc.

Typo: that should be comment 16.
@zhaozhanqi / @rchopra, the main issue I see here is that you cannot run scp with the generated uid (for example, 1000020000). By default that is the preallocated user id that the /usr/bin/openshift-router process runs under inside the container, because of the permissions of the hostnetwork scc (runAsUser === MustRunAsRange). For scp to work, that would need to be runAsUser === RunAsAny. Create an scc which has that set (and add the router service user to that scc) and it will work. I think using the privileged scc will also work, though that's a "wee" bit more perms than is needed. Since it's late, just updating the docs would be a better bet here. You probably need to get the right magic "scc"/oadm policy incantations before that!
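As a rough illustration of that suggestion, an scc with runAsUser set to RunAsAny could be created along these lines. This is an untested sketch: the scc name `router-runasany` and the helper `render_router_scc` are made up, and every field other than runAsUser is an assumption modeled on the hostnetwork scc. Compare it against `oc get scc hostnetwork -o yaml` on your cluster before using it.

```shell
# Hypothetical scc name; pick any name not already in use.
SCC_NAME="${SCC_NAME:-router-runasany}"

# Print an scc manifest that keeps host network access but sets runAsUser
# to RunAsAny. All other fields are assumptions modeled on the hostnetwork scc.
render_router_scc() {
    cat <<EOF
kind: SecurityContextConstraints
apiVersion: v1
metadata:
  name: ${SCC_NAME}
allowHostNetwork: true
allowHostPorts: true
allowPrivilegedContainer: false
runAsUser:
  type: RunAsAny
seLinuxContext:
  type: MustRunAs
fsGroup:
  type: MustRunAs
supplementalGroups:
  type: MustRunAs
EOF
}

# Possible usage (not run here):
#   render_router_scc | oc create -f -
#   oadm policy remove-scc-from-user hostnetwork -z router
#   oadm policy add-scc-to-user "$SCC_NAME" -z router
```

The router pod would need to be redeployed afterwards so that it is admitted under the new scc.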
For now, to cover this bug, the documentation changes are proposed in this PR: https://github.com/openshift/openshift-docs/pull/2660
I also believe that https://bugzilla.redhat.com/show_bug.cgi?id=1369513 is needed, as the customer pointed out that this was also an issue.
Since https://bugzilla.redhat.com/show_bug.cgi?id=1320490#c21 pointed out that a fix was applied, I am closing this bug as current release.