Bug 1120966
| Field | Value |
|---|---|
| Summary | oo-admin-* utils not installed on broker |
| Product | OKD |
| Reporter | Nicholas Schuetz <nick> |
| Component | Installer |
| Assignee | N. Harrison Ripps <hripps> |
| Status | CLOSED CURRENTRELEASE |
| QA Contact | |
| Severity | medium |
| Docs Contact | |
| Priority | unspecified |
| Version | 2.x |
| CC | mmccomas, nick |
| Target Milestone | --- |
| Target Release | --- |
| Hardware | x86_64 |
| OS | Linux |
| Whiteboard | |
| Fixed In Version | |
| Doc Type | Bug Fix |
| Doc Text | |
| Story Points | --- |
| Clone Of | |
| Environment | |
| Last Closed | 2014-07-18 18:40:05 UTC |
| Type | Bug |
| Regression | --- |
| Mount Type | --- |
| Documentation | --- |
| CRM | |
| Verified Versions | |
| Category | --- |
| oVirt Team | --- |
| RHEL 7.3 requirements from Atomic Host | |
| Cloudforms Team | --- |
| Target Upstream Version | |
| Embargoed | |
| Attachments | |

Description
Nicholas Schuetz
2014-07-18 04:25:43 UTC
Looking at it further, the openshift-origin repo isn't even enabled on the broker like it is on the node. I copied the openshift-origin.repo file to the broker and then installed the openshift-origin-broker-util package via yum. This gave me the oo-admin tools on the broker so I could manually create my districts.

Can you please provide:
1. The ~/.openshift/oo-install-cfg.yml file from the host where you ran oo-install
2. The /tmp/openshift-deploy.log file from your broker host

It sounds like there was a problem during the puppet deployment; this will help me confirm it.

Created attachment 919071: oo-install-cfg.yml

Created attachment 919072: openshift-deploy.log

Okay, interesting. The output log only contains one line:

"Could not run: Could not find file /tmp/oo_install_configure_broker.nicknach.net.pp"

So the puppet config script was never copied to the broker. From the oo-install run, do you still have any of the STDOUT content? I would expect to see some sort of error related to the scp attempt.

Preflight check: verifying system and resource availability.
Checking broker.nicknach.net:
* SSH connection succeeded
* Target host is running Red Hat Enterprise Linux
* Located getenforce
* SELinux is running in enforcing mode
* Located yum
* Could not find optional channel through RHN.
* Found enabled optional repo through RHSM.
* puppet RPM is installed.
* openssh-clients RPM is installed.
* The 'bind' package is not installed on this host.
The 'bind' RPM is required, but not installed on broker.nicknach.net.
Do you want me to try to install it for you? (y/n/q) y
Checking availability of 'bind' RPM... available.
Attempting to install... success!
Checking node01.nicknach.net:
* SSH connection succeeded
* Target host is running Red Hat Enterprise Linux
* Located getenforce
* SELinux is running in enforcing mode
* Located yum
* Could not find optional channel through RHN.
* Found enabled optional repo through RHSM.
* puppet RPM is installed.
* openssh-clients RPM is installed.
Deploying workflow 'origin_deploy'.
Preparing to install OpenShift Origin on the following hosts:
* broker.nicknach.net (Broker, DBServer, MsgServer, NameServer)
* node01.nicknach.net (Node)
Generating template for 'broker.nicknach.net'
* Checking for apps.nicknach.net DNS key... not found; attempting to generate.
* Key generation successful.
* Checking for nicknach.net DNS key... not found; attempting to generate.
* Key generation successful.
* BIND DNS enabled.
* Created template /tmp/oo_install_configure_broker.nicknach.net.pp
* Copying Puppet script to host... success. Removing local copy.
Generating template for 'node01.nicknach.net'
* Created template /tmp/oo_install_configure_node01.nicknach.net.pp
* Copying Puppet script to host... success. Removing local copy.
node01.nicknach.net: Running Puppet deployment for host
broker.nicknach.net: Running Puppet deployment for host
node01.nicknach.net: Puppet module removal failed. This is expected if the module was not installed.
node01.nicknach.net: Attempting Puppet module installation (try #1)
broker.nicknach.net: Puppet module removal failed. This is expected if the module was not installed.
broker.nicknach.net: Attempting Puppet module installation (try #1)
node01.nicknach.net: Puppet module installation succeeded.
node01.nicknach.net: Cleaning yum repos.
node01.nicknach.net: Running the Puppet deployment. This step may take up to an hour.
broker.nicknach.net: Puppet module installation succeeded.
broker.nicknach.net: Cleaning yum repos.
broker.nicknach.net: Running the Puppet deployment. This step may take up to an hour.
broker.nicknach.net: Puppet deployment completed.
broker.nicknach.net: Cleaning up temporary files.
broker.nicknach.net: Clean up of /tmp/#hostfile} failed; please remove this file manually.
node01.nicknach.net: Puppet deployment completed.
node01.nicknach.net: Cleaning up temporary files.
Host deployments completed succesfully.
Restarting services in dependency order.
broker.nicknach.net: service named restart succeeded.
broker.nicknach.net: service mongod restart failed:
node01.nicknach.net: service ruby193-mcollective stop succeeded.
broker.nicknach.net: service activemq restart failed:
node01.nicknach.net: service ruby193-mcollective start succeeded.
broker.nicknach.net: service openshift-broker restart failed:
broker.nicknach.net: service openshift-console restart failed:
Now performing post-installation tasks.
Failed to create district 'Default'.
You will need to run the following manually on a Broker to create the district:
oo-admin-ctl-district -c create -n Default -p small
Then you will need to run the add-node command for each associated node:
oo-admin-ctl-district -c add-node -n Default -i <node_hostname>
Attempting to register available cartridge types with Broker(s).
Could not register cartridge types with Broker(s).
Log into any Broker and attempt to register the carts with this command:
oo-admin-ctl-cartridge -c import-node --activate
The following user / password combinations were created during the configuration:
Web console: demo / bUhawVfOgYs1JwqEchbyg
MCollective: mcollective / VTBOmhjAqYKOjdlczrmg
MongoDB Admin: admin / iB0ByQ9vQCYFOVMiTAEMXw
MongoDB User: openshift / hgONaAeAOCQFF5ZAzcOg
Be sure to record these somewhere for future use.
Deployment successful. Exiting installer.
All tasks completed.
oo-install exited; removing temporary assets.

The two interesting lines from above come from the section that begins with:

Generating template for 'broker.nicknach.net'

Specifically:

* Created template /tmp/oo_install_configure_broker.nicknach.net.pp
* Copying Puppet script to host... success. Removing local copy.

The scp command that is run looks like this:

scp -q /tmp/<puppet_file_name> <target_host_user>@<target_host_name>:/tmp/<puppet_file_name>

And if this command returns exit code 0, we assume that the copy was successful.

Two thoughts: either /tmp/ got cleared out between the file copy and the puppet run, or the copy was not successful but returned a 0 exit code anyway. If you run oo-install with a '-d' flag, you will see a ton of debug output, mostly related to SSH channel info. If you run it and then search for "Copying Puppet script to host..." in the output, you will see the scp command being run with the '-v' flag for verbose logging. If there is some sort of error, this is the place where you would see it.

* BIND DNS enabled.
* Created template /tmp/oo_install_configure_broker.nicknach.net.pp
Executing: program /usr/bin/ssh host broker.nicknach.net, user root, command scp -v -t /tmp/oo_install_configure_broker.nicknach.net.pp
OpenSSH_5.3p1, OpenSSL 1.0.0-fips 29 Mar 2010
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: Applying options for *
debug1: Executing proxy command: exec /usr/bin/sss_ssh_knownhostsproxy -p 22 broker.nicknach.net
debug1: permanently_set_uid: 0/0
debug1: identity file /root/.ssh/identity type -1
debug1: identity file /root/.ssh/identity-cert type -1
debug1: identity file /root/.ssh/id_rsa type -1
debug1: identity file /root/.ssh/id_rsa-cert type -1
debug1: identity file /root/.ssh/id_dsa type -1
debug1: identity file /root/.ssh/id_dsa-cert type -1
debug1: permanently_drop_suid: 0
debug1: Remote protocol version 2.0, remote software version OpenSSH_5.3
debug1: match: OpenSSH_5.3 pat OpenSSH*
debug1: Enabling compatibility mode for protocol 2.0
debug1: Local version string SSH-2.0-OpenSSH_5.3
debug1: SSH2_MSG_KEXINIT sent
debug1: SSH2_MSG_KEXINIT received
debug1: kex: server->client aes128-ctr hmac-md5 none
debug1: kex: client->server aes128-ctr hmac-md5 none
debug1: SSH2_MSG_KEX_DH_GEX_REQUEST(1024<1024<8192) sent
debug1: expecting SSH2_MSG_KEX_DH_GEX_GROUP
debug1: SSH2_MSG_KEX_DH_GEX_INIT sent
debug1: expecting SSH2_MSG_KEX_DH_GEX_REPLY
debug1: Host 'broker.nicknach.net' is known and matches the RSA host key.
debug1: Found key in /var/lib/sss/pubconf/known_hosts:1
debug1: ssh_rsa_verify: signature correct
debug1: SSH2_MSG_NEWKEYS sent
debug1: expecting SSH2_MSG_NEWKEYS
debug1: SSH2_MSG_NEWKEYS received
debug1: SSH2_MSG_SERVICE_REQUEST sent
debug1: SSH2_MSG_SERVICE_ACCEPT received
debug1: Authentications that can continue: publickey,gssapi-keyex,gssapi-with-mic,password
debug1: Next authentication method: gssapi-keyex
debug1: No valid Key exchange context
debug1: Next authentication method: gssapi-with-mic
debug1: Unspecified GSS failure. Minor code may provide more information
Credentials cache file '/tmp/krb5cc_0' not found
debug1: Unspecified GSS failure. Minor code may provide more information
Credentials cache file '/tmp/krb5cc_0' not found
debug1: Unspecified GSS failure. Minor code may provide more information
debug1: Unspecified GSS failure. Minor code may provide more information
Credentials cache file '/tmp/krb5cc_0' not found
debug1: Next authentication method: publickey
debug1: Trying private key: /root/.ssh/identity
debug1: Trying private key: /root/.ssh/id_rsa
debug1: read PEM private key done: type RSA
debug1: Authentication succeeded (publickey).
debug1: channel 0: new [client-session]
debug1: Requesting no-more-sessions
debug1: Entering interactive session.
debug1: Sending environment.
debug1: Sending env XMODIFIERS = @im=ibus
debug1: Sending env LANG = en_US.UTF-8
debug1: Sending command: scp -v -t /tmp/oo_install_configure_broker.nicknach.net.pp
Sending file modes: C0644 1786 oo_install_configure_broker.nicknach.net.pp
Sink: C0644 1786 oo_install_configure_broker.nicknach.net.pp
debug1: client_input_channel_req: channel 0 rtype exit-status reply 0
debug1: channel 0: free: client-session, nchannels 1
debug1: fd 0 clearing O_NONBLOCK
debug1: fd 1 clearing O_NONBLOCK
Transferred: sent 4104, received 2176 bytes, in 0.1 seconds
Bytes per second: sent 40717.2, received 21588.8
debug1: Exit status 0
* Copying Puppet script to host... success. Removing local copy.
Generating template for 'node01.nicknach.net'

no failed events in the audit log either...

# cat /var/log/audit/audit.log |grep failed
[root@broker ~]#

Okay, that seems legit. The /tmp/oo_install_configure_broker.nicknach.net.pp file should be present on your broker from the time that the scp is performed until after the puppet module is run.
Can you please confirm that, after that success, the file was present on the broker at /tmp/oo_install_configure_broker.nicknach.net.pp?

If present on the broker host during the period between the scp command and the puppet module deployment completing: then your original bug was possibly the result of a transient SCP error.
If not present: then something about your broker host is unusual.

The broker portion is very quick (probably because it's erroring out), so the pp file exists only fleetingly. This is an up-to-date version of vanilla RHEL 6.5 with the optional repo added.

I think I know what the problem is. Are you running oo-install -on- the broker host?

Yes! I was able to complete the install properly when not running oo-install from the broker node. Interesting that it couldn't provision itself; that *used* to work. If there is a technical reason for this, maybe there should be a warning or exit or both.

Okay. Because your broker's ssh_host name is not 'localhost', oo-install doesn't know that the puppet config file that it generates at localhost:/tmp/<puppet_file_name> is literally the same file as broker.nicknach.net:/tmp/<puppet_file_name>. SCPing a file over itself is not a problem, but oo-install cleans up the oo-install host's copy of that file as soon as the scp is completed. In your case, this means deleting the file from where it needed to be. If I defer the 'local' delete until the 'remote' host operations are done, this situation will not impact the installation.

As an aside, you might wonder why I am not doing a sanity check like "If I run `hostname` and it is identical to this oo-install target system's host name, I must be on localhost for that system." The reason is that the hostname that oo-install knows about is not necessarily a target host's actual -current hostname-. It can still find that host as long as the right SSH alias has been set up.

Anyhow, I will update once I have patched this issue.

Okay, I've deployed a fix for this.
The PR is here: https://github.com/openshift/openshift-extras/pull/412

And the code has been pushed to install.openshift.com. Thanks for helping me track this down. Please re-open this bug if the problem isn't solved.
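
The ordering problem discussed in this thread (the installer host's copy of the manifest being deleted right after the scp, which removes the only copy when the installer runs on the broker itself) can be illustrated with a minimal Python sketch. This is not the oo-install code and not the content of the linked PR; `copy_to_host`, `run_deployment`, `deploy`, and `demo` are hypothetical stand-ins that only model the file lifecycle on a local filesystem, under the assumption that the "local" and "remote" paths resolve to the same file.

```python
import os
import shutil
import tempfile


def copy_to_host(local_path, remote_path):
    """Hypothetical stand-in for scp: copy unless both paths are the same file."""
    if os.path.abspath(local_path) != os.path.abspath(remote_path):
        shutil.copyfile(local_path, remote_path)


def run_deployment(remote_path):
    """Hypothetical stand-in for the Puppet run: succeeds only if the manifest exists."""
    return os.path.exists(remote_path)


def deploy(local_path, remote_path, defer_cleanup):
    """Copy the manifest, run the deployment, and remove the local copy.

    defer_cleanup=False removes the local copy right after the copy step,
    which is the ordering described in this bug. defer_cleanup=True waits
    until the deployment step has finished.
    """
    copy_to_host(local_path, remote_path)
    if not defer_cleanup and os.path.exists(local_path):
        os.remove(local_path)
    try:
        return run_deployment(remote_path)
    finally:
        if defer_cleanup and os.path.exists(local_path):
            os.remove(local_path)


def demo(defer_cleanup):
    """Simulate running the installer on the broker: source and target are one file."""
    workdir = tempfile.mkdtemp()
    try:
        manifest = os.path.join(workdir, "configure_broker.pp")
        with open(manifest, "w") as handle:
            handle.write("# hypothetical manifest\n")
        return deploy(manifest, manifest, defer_cleanup)
    finally:
        shutil.rmtree(workdir)
```

In this model, `demo(defer_cleanup=False)` returns `False` because the manifest is gone before the deployment step looks for it, while `demo(defer_cleanup=True)` returns `True`. When the two paths differ, either ordering works, which matches the observation that the install completed when oo-install was run from a different host.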