Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 2052039

Summary: OSP16.2 controller deploys fail when using LVM
Product: Red Hat OpenStack
Reporter: David Rosenfeld <drosenfe>
Component: openstack-tripleo-common
Assignee: Adriano Petrich <apetrich>
Status: CLOSED DUPLICATE
QA Contact: David Rosenfeld <drosenfe>
Severity: high
Docs Contact:
Priority: unspecified
Version: 16.2 (Train)
CC: jslagle, mburns, slinaber, tkajinam
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2022-02-09 03:09:19 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description David Rosenfeld 2022-02-08 14:59:46 UTC
Description of problem: In OSP 16.2, one job is timing out with the following in the logs:

2022-02-03 06:06:12.751887 | 525400ca-ee63-8987-97fb-000000004f13 |     TIMING | Debug output for task: Run container-puppet tasks (bootstrap tasks) for step 2 | compute-1 | 0:10:59.169064 | 0.06s
2022-02-03 06:06:12.815229 | 525400ca-ee63-8987-97fb-000000004f15 |       TASK | Generate container puppet configs for step 2
2022-02-03 06:06:12.880497 | 525400ca-ee63-8987-97fb-000000004f3e |    CHANGED | Wait for container-puppet tasks (bootstrap tasks) for step 2 to finish | compute-2
2022-02-03 06:06:12.881242 | 525400ca-ee63-8987-97fb-000000004f3e |     TIMING | Wait for container-puppet tasks (bootstrap tasks) for step 2 to finish | compute-2 | 0:10:59.298402 | 3.61s
2022-02-03 06:06:12.881850 | 525400ca-ee63-8987-97fb-000000004f15 |    SKIPPED | Generate container puppet configs for step 2 | compute-1
2022-02-03 06:06:12.882126 | 525400ca-ee63-8987-97fb-000000004f15 |     TIMING | Generate container puppet configs for step 2 | compute-1 | 0:10:59.299291 | 0.07s
2022-02-03 06:06:12.940826 | 525400ca-ee63-8987-97fb-000000004f16 |       TASK | Manage Puppet containers (bootstrap tasks) for step 2 with tripleo-ansible
2022-02-03 06:06:12.959347 | 525400ca-ee63-8987-97fb-000000004f3f |       TASK | Debug output for task: Run container-puppet tasks (bootstrap tasks) for step 2
2022-02-03 06:06:13.019435 | 525400ca-ee63-8987-97fb-000000004f16 |    SKIPPED | Manage Puppet containers (bootstrap tasks) for step 2 with tripleo-ansible | compute-1
2022-02-03 06:06:13.020041 | 525400ca-ee63-8987-97fb-000000004f16 |     TIMING | Manage Puppet containers (bootstrap tasks) for step 2 with tripleo-ansible | compute-1 | 0:10:59.437210 | 0.08s
2022-02-03 06:06:13.020715 | 525400ca-ee63-8987-97fb-000000004f3f |         OK | Debug output for task: Run container-puppet tasks (bootstrap tasks) for step 2 | compute-2 | result={
    "bootstrap_tasks_outputs.stdout_lines | default([]) | union(bootstrap_tasks_outputs.stderr_lines | default([]))": [
        "2022-02-03 06:06:09,610 INFO: 23519 -- Running container-puppet",
        "2022-02-03 06:06:09,610 INFO: 23519 -- Service compilation completed."
    ],
    "changed": false,
    "failed_when_result": false
}
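
Since the job is reported as timing out, it can help to see which tasks consumed the most time. The following is a minimal sketch, not part of any TripleO tooling, that parses the TIMING lines in the pipe-delimited format shown above. The names TimingEntry, parse_timing_line, and slowest_tasks are hypothetical, and the seven-field layout is an assumption inferred only from this excerpt (it would misparse a task name that itself contains a "|").

```python
from typing import Iterable, List, NamedTuple, Optional


class TimingEntry(NamedTuple):
    task: str
    host: str
    elapsed: str        # cumulative run time as printed, e.g. "0:10:59.169064"
    duration_s: float   # time spent in this task, in seconds


def parse_timing_line(line: str) -> Optional[TimingEntry]:
    """Return a TimingEntry for a TIMING line, or None for any other line."""
    fields = [f.strip() for f in line.split("|")]
    # Assumed layout: timestamp | task id | TIMING | task | host | elapsed | duration
    if len(fields) != 7 or fields[2] != "TIMING":
        return None
    duration = fields[6]
    if not duration.endswith("s"):
        return None
    try:
        return TimingEntry(fields[3], fields[4], fields[5], float(duration[:-1]))
    except ValueError:
        return None


def slowest_tasks(lines: Iterable[str], top: int = 5) -> List[TimingEntry]:
    """Return the `top` TIMING entries with the longest per-task duration."""
    entries = [e for e in map(parse_timing_line, lines) if e is not None]
    return sorted(entries, key=lambda e: e.duration_s, reverse=True)[:top]
```

Feeding the lines of overcloud_install.log to slowest_tasks() lists the longest-running tasks first. In the excerpt above, every task finishes within a few seconds, so the timeout itself is not explained by these lines.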

The only difference between this job and the other deployments tested, all of which succeeded, is that this job uses the LVM storage backend.

Logs for the test run:

https://rhos-ci-jenkins.lab.eng.tlv2.redhat.com/view/DFG/view/df/view/controller_replacement/job/DFG-df-controller_replacement-16.2-virthost-3cont_3comp-yes_UC_SSL-yes_OC_SSL-lvm-ipv4-geneve-replace_controller-corrupt_disk-same_node-RHELOSP-40165/59/

Version-Release number of selected component (if applicable): RHOS-16.2-RHEL-8-20220201.n.1


How reproducible: Every time


Steps to Reproduce:
1. Execute Jenkins job: https://rhos-ci-jenkins.lab.eng.tlv2.redhat.com/view/DFG/view/df/view/controller_replacement/job/DFG-df-controller_replacement-16.2-virthost-3cont_3comp-yes_UC_SSL-yes_OC_SSL-lvm-ipv4-geneve-replace_controller-corrupt_disk-same_node-RHELOSP-40165/

Actual results: Overcloud controllers failed during deployment with the error shown above.


Expected results: Overcloud deployment succeeds.


Additional info:

Comment 1 James Slagle 2022-02-08 22:32:37 UTC
From http://rhos-ci-logs.lab.eng.tlv2.redhat.com/logs/rcj/DFG-df-controller_replacement-16.2-virthost-3cont_3comp-yes_UC_SSL-yes_OC_SSL-lvm-ipv4-geneve-replace_controller-corrupt_disk-same_node-RHELOSP-40165/59/undercloud-0/home/stack/overcloud_install.log.gz

the error is from controller-2:

"Error: Evaluation Error: Error while evaluating a Resource Statement, Evaluation Error: Error while evaluating a Function Call, Duplicate declaration: Package[fence-agents-ipmilan] is already declared at (file: /etc/puppet/modules/pacemaker/manifests/install.pp, line: 40); cannot redeclare (file: /etc/puppet/modules/pacemaker/manifests/stonith/fence_ipmilan.pp, line: 216) (file: /etc/puppet/modules/pacemaker/manifests/stonith/fence_ipmilan.pp, line: 216, column: 5) (file: /etc/puppet/modules/tripleo/manifests/fencing.pp, line: 132) on node controller-2.redhat.local"],

Is this job different because it has fencing configured? Probably need DFG:PIDONE to look at this.
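
The Puppet error quoted above names the duplicated resource and both declaration sites. As a minimal sketch for pulling those details out of a log line, the following uses a regular expression written against this one message only. The names DUPLICATE_RE and parse_duplicate_declaration are hypothetical, and other Puppet versions may word the message differently.

```python
import re

# Pattern derived from the single error message quoted in this bug;
# it is an assumption that other occurrences follow the same wording.
DUPLICATE_RE = re.compile(
    r"Duplicate declaration: (?P<resource>[\w:]+\[[^\]]+\]) is already declared at "
    r"\(file: (?P<first_file>[^,]+), line: (?P<first_line>\d+)\); "
    r"cannot redeclare \(file: (?P<second_file>[^,]+), line: (?P<second_line>\d+)\)"
)


def parse_duplicate_declaration(message):
    """Return the resource and both declaration sites, or None if no match."""
    match = DUPLICATE_RE.search(message)
    if match is None:
        return None
    return {
        "resource": match.group("resource"),
        "first": (match.group("first_file"), int(match.group("first_line"))),
        "second": (match.group("second_file"), int(match.group("second_line"))),
    }
```

Applied to the message above, this reports that Package[fence-agents-ipmilan] is declared in pacemaker/manifests/install.pp (line 40) and again in pacemaker/manifests/stonith/fence_ipmilan.pp (line 216).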

Comment 2 Takashi Kajinami 2022-02-09 03:09:19 UTC
Closing this because I believe it is a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=2050780 .
Please try the version including the fix.

*** This bug has been marked as a duplicate of bug 2050780 ***