Bug 1777020 - pre-provisioned nodes need missing lvm2 package
Summary: pre-provisioned nodes need missing lvm2 package
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: tripleo-ansible
Version: 16.0 (Train)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: beta
Target Release: 16.0 (Train on RHEL 8.1)
Assignee: Alex Schultz
QA Contact: Sasha Smolyak
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-11-26 18:49 UTC by David Rosenfeld
Modified: 2020-02-06 14:43 UTC

Fixed In Version: tripleo-ansible-0.4.1-0.20191129121711.6156789.el8ost
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-02-06 14:42:58 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
OpenStack gerrit 696416 0 'None' 'MERGED' 'Add LVM2 package install to bootstrap' 2019-12-07 02:24:20 UTC
Red Hat Product Errata RHEA-2020:0283 0 None None None 2020-02-06 14:43:58 UTC

Internal Links: 1777336

Description David Rosenfeld 2019-11-26 18:49:50 UTC
Description of problem: 
Attempted split-stack deployment tests and saw the following error:

"fatal: [ceph-0]: FAILED! => changed=false ",
        "  - --privileged",
        "  - --ipc=host",
        "  - --ulimit",
        "  - nofile=1024:4096",
        "  - /run/lock/lvm:/run/lock/lvm:z",
        "  - /var/run/udev/:/var/run/udev/:z",
        "  - /dev:/dev",
        "  - /run/lvm/:/run/lvm/",
        "  - --entrypoint=ceph-volume",
        "  - lvm",
        "  - batch",
        "  - --bluestore",
        "  - --yes",
        "  - --prepare",
        "  - --report",
        "  - --format=json",
        "  stderr: 'Error: error checking path \"/run/lock/lvm\": stat /run/lock/lvm: no such file or directory'",
        "fatal: [ceph-2]: FAILED! => changed=false ",
        "fatal: [ceph-1]: FAILED! => changed=false ",
        "NO MORE HOSTS LEFT *************************************************************",
        "PLAY RECAP *********************************************************************",
        "ceph-0                     : ok=89   changed=4    unreachable=0    failed=1    skipped=198  rescued=0    ignored=0   ",
        "ceph-1                     : ok=87   changed=4    unreachable=0    failed=1    skipped=197  rescued=0    ignored=0   ",
        "ceph-2                     : ok=87   changed=4    unreachable=0    failed=1    skipped=197  rescued=0    ignored=0   ",
        "compute-0                  : ok=45   changed=3    unreachable=0    failed=0    skipped=145  rescued=0    ignored=0   ",
        "compute-1                  : ok=34   changed=2    unreachable=0    failed=0    skipped=115  rescued=0    ignored=0   ",
        "controller-0               : ok=166  changed=21   unreachable=0    failed=0    skipped=287  rescued=0    ignored=0   ",
        "controller-1               : ok=153  changed=19   unreachable=0    failed=0    skipped=277  rescued=0    ignored=0   ",
        "controller-2               : ok=153  changed=19   unreachable=0    failed=0    skipped=279  rescued=0    ignored=0   ",


Version-Release number of selected component (if applicable): RHOS_TRUNK-16.0-RHEL-8-20191122.n.2


How reproducible: Every time a split-stack Jenkins job is executed

 
Steps to Reproduce:
1. Execute any split stack job in Jenkins e.g. DFG-df-splitstack-16-virsh-3cont_2comp_3ceph-skip-deploy-identifier-scaleup

Actual results:
Job aborts with error in description

Expected results:
Job completes successfully

Additional info:

Comment 2 John Fulton 2019-11-26 21:37:16 UTC
This is happening because the lvm2 package is missing from your Ceph storage nodes [1], and Ceph requires it in order to create bluestore OSDs with ceph-volume [2].

I'd file this as an overcloud image bug (missing needed package), but because you're using split-stack, the person doing the deployment is responsible for installing the needed packages [3].

That said, the documentation [3] doesn't explicitly say you need to install the lvm2 package. For that reason, I think you need to fix your job by installing that package, and we need a docbug to tell the user to install the package.


[1] 
[fultonj@runcible bz1777020]$ ls ceph-0/etc/lvm
ls: cannot access 'ceph-0/etc/lvm': No such file or directory
[fultonj@runcible bz1777020]$ grep -i lvm ceph-0/var/log/rpm.list 
[fultonj@runcible bz1777020]$ 

[2] 
https://github.com/ceph/ceph-ansible/blob/v4.0.5/roles/ceph-config/tasks/main.yml#L18
ceph-ansible-4.0.5-1.el8cp.noarch
ceph_docker_image: ceph/rhceph-4.0-rhel8
ceph_docker_image_tag: latest

[3]
https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/15/html-single/director_installation_and_usage/index#registering-the-operating-system-for-pre-provisioned-nodes
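
The manual check in [1] can be sketched as a small shell function. This is only an illustration: check_lvm2_in_logs is a hypothetical helper name, and it assumes the collected logs use the layout shown in [1] (etc/lvm and var/log/rpm.list under a per-node directory) with one name-version-release entry per line in rpm.list.

```shell
#!/bin/sh
# Sketch only: check a collected per-node log directory for signs of lvm2.
# Assumed layout (from [1]): <node_dir>/etc/lvm and <node_dir>/var/log/rpm.list.
# check_lvm2_in_logs is a hypothetical helper name, not part of any tool.
check_lvm2_in_logs() {
    node_dir=$1
    status=0

    # The lvm2 package normally provides /etc/lvm, so its absence is a hint.
    if [ ! -d "$node_dir/etc/lvm" ]; then
        echo "$node_dir: etc/lvm not found"
        status=1
    fi

    # Assumption: rpm.list has one "name-version-release" entry per line.
    if ! grep -q '^lvm2-[0-9]' "$node_dir/var/log/rpm.list" 2>/dev/null; then
        echo "$node_dir: lvm2 not in var/log/rpm.list"
        status=1
    fi

    return "$status"
}
```

Run from the directory that holds the collected logs, `check_lvm2_in_logs ceph-0` should report both conditions for the node shown in [1], and the function returns non-zero whenever either check fails.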

Comment 3 John Fulton 2019-11-26 21:49:33 UTC
Please update step 7.3 "Registering the operating system for pre-provisioned nodes" in the director_installation_and_usage document [1] to have an additional step which requires the user to install the lvm2 package. 

Here's an example where I've moved step 6 to step 7 and inserted a new step 6. 
"""
5. Enable the required Red Hat Enterprise Linux repositories.

    For x86_64 systems, run:

    [root@controller-0 ~]# sudo subscription-manager repos --enable=rhel-8-for-x86_64-baseos-rpms --enable=rhel-8-for-x86_64-appstream-rpms --enable=rhel-8-for-x86_64-highavailability-rpms --enable=ansible-2.8-for-rhel-8-x86_64-rpms --enable=openstack-15-for-rhel-8-x86_64-rpms --enable=rhceph-4-osd-for-rhel-8-x86_64-rpms --enable=rhceph-4-mon-for-rhel-8-x86_64-rpms --enable=rhceph-4-tools-for-rhel-8-x86_64-rpms --enable=advanced-virt-for-rhel-8-x86_64-rpms --enable=fast-datapath-for-rhel-8-x86_64-rpms


6. Install packages required by Ceph (optional)

If you're going to use Ceph in the overcloud, then run the following command to install the necessary packages:

[root@controller-0 ~]# sudo yum install -y lvm2


7. Update your system to ensure you have the latest base system packages:

[root@controller-0 ~]# sudo yum update -y
[root@controller-0 ~]# sudo reboot
"""

[1] https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/15/html-single/director_installation_and_usage/index#registering-the-operating-system-for-pre-provisioned-nodes
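
As a manual workaround on a pre-provisioned node, the proposed step 6 could also be wrapped so that it only installs the package when rpm does not already report it. This is a sketch under assumptions: ensure_package and the RPM and YUM override variables are hypothetical names introduced here so the logic can be dry-run, and they are not part of the documented procedure.

```shell
#!/bin/sh
# Sketch only: install a package only when rpm does not already report it.
# ensure_package, RPM and YUM are hypothetical names for illustration.
# RPM and YUM can be overridden (for example YUM="echo yum") for a dry run.
ensure_package() {
    pkg=$1
    # "rpm -q" exits non-zero when the package is not installed.
    if ${RPM:-rpm} -q "$pkg" >/dev/null 2>&1; then
        echo "$pkg already installed"
    else
        ${YUM:-sudo yum} install -y "$pkg"
    fi
}
```

On a Ceph storage node this would be called as `ensure_package lvm2`. Setting `YUM="echo yum"` first prints the install command instead of running it.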

Comment 4 John Fulton 2019-11-27 12:56:23 UTC
We will also add a validation to make solving this issue easier to understand in the field. Tracked in bug 1777336.

Comment 5 Alex Schultz 2019-11-27 17:36:29 UTC
So there is a tripleo-bootstrap role that should handle this pre-req install.

Comment 6 John Fulton 2019-11-27 17:44:43 UTC
(In reply to John Fulton from comment #2)
> I'd file this as an overcloud image bug (missing needed package) but because
> you're using split-stack the person doing the deployment is responsible for
> installing the needed packages [3].

I was wrong about the above. I thought the person doing the deployment was responsible for installing the packages, but they're only responsible for enabling the repositories [3]; tripleo-bootstrap is what installs them.


[3] https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/15/html-single/director_installation_and_usage/index#registering-the-operating-system-for-pre-provisioned-nodes

Comment 7 John Fulton 2019-12-02 14:06:24 UTC
Since this is a DF bz and the patch has merged, I'm changing the DFG label here to DF.

Comment 15 errata-xmlrpc 2020-02-06 14:42:58 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:0283

