Bug 1368296

Summary: quick install failed with installer.cfg.yml in another directory
Product: OpenShift Container Platform
Reporter: liujia <jiajliu>
Component: Installer
Assignee: Tim Bielawa <tbielawa>
Status: CLOSED ERRATA
QA Contact: Johnny Liu <jialiu>
Severity: medium
Priority: medium
Version: 3.3.0
CC: aos-bugs, jiajliu, jokerman, mmccomas, tbielawa
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Fixed In Version: ---
Doc Type: Bug Fix
Doc Text:
Cause: The quick installer did not verify file system paths read from a configuration file.
Consequence: The quick installer would attempt to read a file which does not exist, throw a stack trace, and abort the installation.
Fix: A file system path read from a configuration file is now verified to exist.
Result: The quick installer no longer crashes.
Story Points: ---
Last Closed: 2016-09-27 09:45:07 UTC
Type: Bug

Description liujia 2016-08-19 02:40:25 UTC
Description of problem:
Quick install failed when using -c to specify a config file that was not located in ~/.config/openshift/.

Gathering information from hosts...
ERROR! Unexpected Exception: [Errno 2] No such file or directory: '/root/.config/openshift/.ansible/callback_facts.yaml'
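
For context, the failure can be reduced to a minimal sketch (an illustration, not the installer's actual code; the function name is made up): opening the cached facts file without first checking that it exists raises exactly this errno 2 IOError.

import yaml

def load_cached_facts(path):
    # Opening the file unconditionally raises IOError with errno 2
    # ("No such file or directory") when the file or its parent
    # directory is missing -- the traceback quoted above.
    with open(path) as f:
        return yaml.safe_load(f)

load_cached_facts('/root/.config/openshift/.ansible/callback_facts.yaml')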


Version-Release number of selected component (if applicable):
atomic-openshift-utils-3.3.12-1.git.0.b26c8c2.el7.noarch

How reproducible:
always

Steps to Reproduce:
1. Put an installer.cfg.yml in /root.
2. Run "atomic-openshift-installer -c /root/installer.cfg.yml install".


Actual results:

Gathering information from hosts...
ERROR! Unexpected Exception: [Errno 2] No such file or directory: '/root/.config/openshift/.ansible/callback_facts.yaml'
There was a problem fetching the required information. Please see /tmp/ansible.log for details.


Expected results:
The installer should start the Ansible playbook and complete the installation successfully.

Additional info:
installer.cfg.yml
...
deployment:
  ansible_ssh_user: root
  hosts:
  - connect_to: openshift-190.lab.eng.nay.redhat.com
    hostname: 192.168.0.189
    ip: 192.168.0.189
    node_labels: '{''region'': ''infra''}'
    public_hostname: openshift-190.lab.eng.nay.redhat.com
    public_ip: 10.66.147.190
    roles:
    - master
    - etcd
    - node
    - storage
  - connect_to: openshift-191.lab.eng.nay.redhat.com
    hostname: 192.168.0.80
    ip: 192.168.0.80
    node_labels: '{''region'': ''infra''}'
    public_hostname: openshift-191.lab.eng.nay.redhat.com
    public_ip: 10.66.147.191
    roles:
    - master
    - etcd
    - node
  - connect_to: openshift-194.lab.eng.nay.redhat.com
    hostname: 192.168.0.15
    ip: 192.168.0.15
    public_hostname: openshift-194.lab.eng.nay.redhat.com
    public_ip: 10.66.147.194
    roles:
    - master
    - etcd
    - node
  - connect_to: openshift-195.lab.eng.nay.redhat.com
    hostname: 192.168.0.65
    ip: 192.168.0.65
    public_hostname: openshift-195.lab.eng.nay.redhat.com
    public_ip: 10.66.147.195
    roles:
    - node
  - connect_to: openshift-197.lab.eng.nay.redhat.com
    hostname: 192.168.0.16
    ip: 192.168.0.16
    public_hostname: openshift-197.lab.eng.nay.redhat.com
    public_ip: 10.66.147.197
    roles:
    - master_lb

...

Comment 1 Tim Bielawa 2016-08-19 15:34:59 UTC
I hit a roadblock initially trying to reproduce this issue. I found I had to ensure that the `~/.config/openshift/.ansible/` directory was cleared away to reproduce it. I cleared the parent directory away as well.


> # rm -fR ~/.config/openshift/
> # atomic-openshift-installer -c /root/installer.cfg.yml install
> ... set some config options ...
>
> Gathering information from hosts...
> ERROR! Unexpected Exception: [Errno 2] No such file or directory:
>     '/root/.config/openshift/.ansible/callback_facts.yaml'
> There was a problem fetching the required information.
> Please see /tmp/ansible.log for details.

Working on figuring out the exact nature of this bug now.

Comment 2 Tim Bielawa 2016-08-19 19:10:17 UTC
You ran the installer using the config file called

> /root/installer.cfg.yml

In that file what is the value of this key?

> ansible_callback_facts_yaml

Your report says that the file 

> /root/.config/openshift/.ansible/callback_facts.yaml

does not exist. Is that the same file as the "ansible_callback_facts_yaml" key?
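
(One way to check what that key points to is to load the config with PyYAML, which the installer already depends on -- a sketch, with the path taken from this report:)

import yaml

with open('/root/installer.cfg.yml') as f:
    cfg = yaml.safe_load(f)

# Prints the cached-facts path the installer will try to read;
# presumably the installer falls back to a default under
# ~/.config/openshift/ when the key is absent.
print(cfg.get('ansible_callback_facts_yaml', '<not set>'))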

Comment 3 Tim Bielawa 2016-08-22 18:54:01 UTC
Jia,

I've opened https://github.com/openshift/openshift-ansible/pull/2341 with a patch that I believe fixes this issue.

Please try the patch out using these commands:

> cd `mktemp -d`
> wget -r -l1 --no-parent -A.rpm --cut-dirs=5 --no-host-directories http://file.rdu.redhat.com/~tbielawa/BZ1368296/
> sudo yum localinstall *.rpm
> atomic-openshift-installer -d -c /root/installer.cfg.yml install

Please include the generated debug log with your report:

> /tmp/installer.txt

Comment 5 Tim Bielawa 2016-08-24 20:34:51 UTC
The merged patch fixes the issue as I was able to reproduce it. The installer now continues instead of aborting when the cached facts file doesn't exist or is otherwise unreadable, and then recollects the missing system facts.
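
Conceptually, the fixed behavior can be pictured with a short sketch (a simplification, not the merged code; collect_facts is a hypothetical stand-in for re-running Ansible fact collection):

import os
import yaml

def collect_facts():
    # Hypothetical stand-in for re-running Ansible fact collection.
    return {}

def load_or_recollect_facts(path):
    # Verify the cached facts file before reading it; if it is missing
    # or unreadable, fall through and recollect the facts instead of
    # aborting with a stack trace.
    if os.path.isfile(path) and os.access(path, os.R_OK):
        try:
            with open(path) as f:
                return yaml.safe_load(f)
        except (IOError, yaml.YAMLError):
            pass
    return collect_facts()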

Comment 7 liujia 2016-08-25 07:40:51 UTC
(In reply to Tim Bielawa from comment #3)
> Jia,
> 
> I've opened https://github.com/openshift/openshift-ansible/pull/2341 with a
> patch that I believe fixes this issue.
> 
> Please try the patch out using these commands:
> 
> > cd `mktemp -d`
> > wget -r -l1 --no-parent -A.rpm --cut-dirs=5 --no-host-directories http://file.rdu.redhat.com/~tbielawa/BZ1368296/
> > sudo yum localinstall *.rpm
> > atomic-openshift-installer -d -c /root/installer.cfg.yml install
> 
> Please include the generated debug log with your report:
> 
> > /tmp/installer.txt

Sorry for the late reply; I just saw your comments on this bug. I will verify it with the latest errata version right now.

Many thanks for the detailed info about the bug.

Comment 8 liujia 2016-08-25 10:11:27 UTC
Version:
atomic-openshift-utils-3.3.15-1.git.0.a9fd72e.el7.noarch

Verify Steps:
1. Put an installer.cfg.yml in /root:

ansible_callback_facts_yaml: /root/.config/openshift/.ansible/callback_facts.yaml
ansible_config: /usr/share/atomic-openshift-utils/ansible.cfg
ansible_inventory_path: /root/.config/openshift/hosts
ansible_log_path: /tmp/ansible.log
deployment:
  ansible_ssh_user: root
  hosts:
  ...
  ...
  master_routingconfig_subdomain: ''
  proxy_exclude_hosts: ''
  proxy_http: ''
  proxy_https: ''
  roles:
    etcd: {}
    master: {}
    master_lb: {}
    node: {}
    storage: {}
variant: openshift-enterprise
variant_version: '3.3'
version: v2

2. Run "atomic-openshift-installer -u -c /root/installer.cfg.yml install".

Verify Result:
1. Although the callback_facts.yaml file configured in the installer config does not exist in /root/.config/openshift/.ansible/, the installer continues to collect info and creates the file in the same directory as installer.cfg.yml, which is /root in this case.
2. After gathering information, it tells the user where the generated files were written:
Wrote atomic-openshift-installer config: /root/installer.cfg.yml
Wrote Ansible inventory: /root/hosts

Then it continues the installation with the Ansible playbook.

By the way, I checked two more cases against the fix:
(1) If there is a corrupt file in /root/.config/openshift/.ansible/, the installer also recollects the info and tells the user to delete the bad callback_facts.yaml.
(2) If there is a valid file in /root/.config/openshift/.ansible/, the installer updates the file with the newly collected info and then creates hosts in /root (the same directory as installer.cfg.yml).

I think it is now fixed properly.
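
For illustration, the observed fallback can be sketched as follows (assumed logic, not the installer's actual code): generated files land next to the config file passed with -c.

import os

def derive_output_paths(config_path):
    # Assumed illustration of the behavior verified above: the inventory
    # and the recreated callback facts file are written beside the
    # config file given with -c.
    base = os.path.dirname(os.path.abspath(config_path))
    return {'inventory': os.path.join(base, 'hosts'),
            'callback_facts': os.path.join(base, 'callback_facts.yaml')}

print(derive_output_paths('/root/installer.cfg.yml'))
# -> {'inventory': '/root/hosts', 'callback_facts': '/root/callback_facts.yaml'}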

Comment 10 errata-xmlrpc 2016-09-27 09:45:07 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2016:1933