Bug 1381629 - rhel-osp-director: minor update fails with: Error: Duplicate declaration: Class[Ceph::Profile::Params] is already declared; cannot redeclare
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: rhosp-director
Version: 8.0 (Liberty)
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: async
Target Release: 8.0 (Liberty)
Assignee: Angus Thomas
QA Contact: Omri Hochman
URL:
Whiteboard:
Depends On:
Blocks: 1305654
Reported: 2016-10-04 15:21 UTC by Alexander Chuzhoy
Modified: 2016-12-29 16:56 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-11-09 20:52:27 UTC
Target Upstream Version:


Attachments
versionlock list (29.51 KB, text/plain)
2016-10-04 15:32 UTC, Alexander Chuzhoy
/var/lib/heat-config/heat-config-puppet/ab53c213-94cc-4fd8-bf7f-258cf45b0bae.pp (81.89 KB, text/plain)
2016-10-05 17:03 UTC, Alexander Chuzhoy
hieradata dir from controller-2 (12.27 KB, application/x-gzip)
2016-10-05 17:08 UTC, Alexander Chuzhoy
/usr/share/openstack-puppet/modules/ from controller (1.91 MB, application/x-gzip)
2016-10-05 17:47 UTC, Alexander Chuzhoy
/usr/share/openstack-puppet/modules/ from undercloud (2.87 MB, application/x-gzip)
2016-10-05 17:49 UTC, Alexander Chuzhoy

Description Alexander Chuzhoy 2016-10-04 15:21:54 UTC
rhel-osp-director: minor update fails with: Error: Duplicate declaration: Class[Ceph::Profile::Params] is already declared; cannot redeclare


Environment:
openstack-puppet-modules-7.1.3-1.el7ost.noarch
openstack-tripleo-heat-templates-0.8.14-18.el7ost.noarch
instack-undercloud-2.2.7-7.el7ost.noarch


Steps to reproduce:
1. Deploy OSP 8 (versionlock enabled; the list of version-locked RPMs is attached) with this command:
openstack overcloud deploy --log-file ~/pilot/overcloud_deployment.log -t 400 --stack overcloud \
--templates ~/pilot/templates/overcloud \
-e ~/pilot/templates/overcloud/environments/network-isolation.yaml \
-e ~/pilot/templates/network-environment.yaml \
-e ~/pilot/templates/overcloud/environments/storage-environment.yaml \
-e ~/pilot/templates/dell-environment.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/puppet-pacemaker.yaml \
--control-flavor control --compute-flavor compute --ceph-storage-flavor ceph-storage \
--swift-storage-flavor swift-storage --block-storage-flavor block-storage \
--neutron-public-interface bond1 --neutron-network-type vlan --neutron-disable-tunneling \
--os-auth-url http://192.168.120.101:5000/v2.0 --os-project-name admin --os-user-id admin \
--os-password e4d873423b80b29cb347741ca7318f00fc03fe32 \
--control-scale 3 --compute-scale 3 --ceph-storage-scale 3 \
--ntp-server 0.centos.pool.ntp.org \
--neutron-network-vlan-ranges physint:201:220,physext \
--neutron-bridge-mappings physint:br-tenant,physext:br-ex



2. Update the undercloud.
3. Re-patch /usr/share/openstack-tripleo-heat-templates/environments/puppet-pacemaker.yaml with the same changes that were applied during deployment.
4. Connect all overcloud nodes to the CDN and run the update command:
yes ""|openstack overcloud update stack overcloud -i \
--templates ~/pilot/templates/overcloud \
-e /usr/share/openstack-tripleo-heat-templates/overcloud-resource-registry-puppet.yaml \
-e ~/pilot/templates/overcloud/environments/network-isolation.yaml \
-e ~/pilot/templates/network-environment.yaml \
-e ~/pilot/templates/overcloud/environments/storage-environment.yaml \
-e ~/pilot/templates/dell-environment.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/puppet-pacemaker.yaml



Result:
Update fails.
[stack@director ~]$ heat deployment-show f2da2bb7-8b51-49aa-8e49-cbb6966ea754
{
  "status": "FAILED",
  "server_id": "d0cd22dd-4a74-4329-99cb-ec4aabe42c47",
  "config_id": "ab53c213-94cc-4fd8-bf7f-258cf45b0bae",
  "output_values": {
    "deploy_stdout": "",
    "deploy_stderr": "Could not retrieve fact='apache_version', resolution='<anonymous>': undefined method `[]' for nil:NilClass\nCould not retrieve fact='apache_version', resolution='<anonymous>': undefined method `[]' for nil:NilClass\n\u001b[1;31mWarning: Scope(Class[Mongodb::Server]): Replset specified, but no replset_members or replset_config provided.\u001b[0m\n\u001b[1;31mError: Duplicate declaration: Class[Ceph::Profile::Params] is already declared; cannot redeclare at /var/lib/heat-config/heat-config-puppet/ab53c213-94cc-4fd8-bf7f-258cf45b0bae.pp:576 on node overcloud-controller-2.localdomain\u001b[0m\n\u001b[1;31mError: Duplicate declaration: Class[Ceph::Profile::Params] is already declared; cannot redeclare at /var/lib/heat-config/heat-config-puppet/ab53c213-94cc-4fd8-bf7f-258cf45b0bae.pp:576 on node overcloud-controller-2.localdomain\u001b[0m\n",
    "deploy_status_code": 1
  },
  "creation_time": "2016-10-03T22:01:02",
  "updated_time": "2016-10-04T14:42:40",
  "input_values": {},
  "action": "UPDATE",
  "status_reason": "deploy_status_code : Deployment exited with non-zero status code: 1",
  


Expected result:
Successful update.

Comment 1 Alexander Chuzhoy 2016-10-04 15:32:08 UTC
Created attachment 1207247
versionlock list

Comment 3 Emilien Macchi 2016-10-04 19:43:29 UTC
My first thoughts, looking at the code:

  if $enable_ceph {
    $mon_initial_members = downcase(hiera('ceph_mon_initial_members'))
    if str2bool(hiera('ceph_ipv6', false)) {
      $mon_host = hiera('ceph_mon_host_v6')
    } else {
      $mon_host = hiera('ceph_mon_host')
    }
    class { '::ceph::profile::params': ## DUPLICATED #1
      mon_initial_members => $mon_initial_members,
      mon_host            => $mon_host,
    }
    include ::ceph::conf
    include ::ceph::profile::mon
  }

(...)
  
  if str2bool(hiera('enable_external_ceph', false)) {
    if str2bool(hiera('ceph_ipv6', false)) {
      $mon_host = hiera('ceph_mon_host_v6')
    } else { 
      $mon_host = hiera('ceph_mon_host')
    }
    class { '::ceph::profile::params': ## DUPLICATED #2
      mon_host            => $mon_host,
    }
    include ::ceph::conf
    include ::ceph::profile::client
  } 


It looks like both enable_ceph and enable_external_ceph are set to true, which leads to this duplicated resource.
I'm investigating the given environment to see if enable_external_ceph is actually true.

Comment 4 Alexander Chuzhoy 2016-10-04 20:29:28 UTC
Before updating the undercloud, I ran:
diff -Nar ~/pilot/templates/overcloud/ /usr/share/openstack-tripleo-heat-templates/|grep ^diff

diff -Nar /home/stack/pilot/templates/overcloud/puppet/hieradata/ceph.yaml /usr/share/openstack-tripleo-heat-templates/puppet/hieradata/ceph.yaml
diff -Nar /home/stack/pilot/templates/overcloud/puppet/manifests/overcloud_controller_pacemaker.pp /usr/share/openstack-tripleo-heat-templates/puppet/manifests/overcloud_controller_pacemaker.pp


Running diff on the two overcloud_controller_pacemaker.pp files, I get:
557a558
>     include ::ceph::profile::rgw
999c1000,1001
<     enabled        => $non_pcmk_start,
---
>     # enabled        => $non_pcmk_start,
>     enabled        => false,
1875a1878,1884
>   if $ceph::profile::params::enable_rgw
>   {
>     exec { 'create_radosgw_keyring':
>       command => "/usr/bin/ceph auth get-or-create client.radosgw.gateway mon 'allow rwx' osd 'allow rwx' -o /etc/ceph/ceph.client.radosgw.gateway.keyring" ,
>       creates => "/etc/ceph/ceph.client.radosgw.gateway.keyring" ,
>     }
>   }


After running yum update on the undercloud, I manually patch /usr/share/openstack-tripleo-heat-templates/puppet/manifests/overcloud_controller_pacemaker.pp with this diff.

As for /home/stack/pilot/templates/overcloud/puppet/hieradata/ceph.yaml:
/usr/share/openstack-tripleo-heat-templates/puppet/hieradata/ceph.yaml doesn't change when we run yum update on the undercloud, so I simply used the same copy of the file in the update command.
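
Re-applying a local change by hand after every yum update is easy to get wrong. A minimal sketch, assuming an unmodified and a locally patched copy of the same file are both available: record the local change as a unified diff before the update so that it can be reviewed and applied exactly once afterwards. The helper name save_local_patch is hypothetical and not part of any Red Hat tooling.

```shell
# Hypothetical helper: record the difference between an unmodified file and
# its locally patched copy as a unified diff.
# $1 = unmodified file, $2 = locally patched file, $3 = patch file to write
save_local_patch() {
  status=0
  # diff exits with 1 when the files differ, which is the expected case here;
  # only an exit status of 2 or more signals a real problem.
  diff -u "$1" "$2" > "$3" || status=$?
  [ "$status" -le 1 ]
}
```

Run it before the update with the unmodified file first and the patched file second. After the update, running patch with --dry-run against the refreshed file reports whether the saved change still applies cleanly without modifying anything.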

Comment 5 Sofer Athlan-Guyot 2016-10-05 15:19:45 UTC
Hi,

to further debug this it would be helpful to have:

 - /var/lib/heat-config/heat-config-puppet/ab53c213-94cc-4fd8-bf7f-258cf45b0bae.pp from overcloud-controller-2.localdomain;
 - the entire /etc/puppet/hieradata/ directory from the same host (small)
 - the /usr/share/openstack-puppet/modules/ directory could be useful as well (small when compressed)

It could be that the class is still declared in hieradata, or that,
as Emilien says, both parameters are set (enable_external_ceph and
ceph_mon_initial_members). With those files we should be able to find
the root cause.

Comment 6 arkady kanevsky 2016-10-05 15:30:47 UTC
Can we have a patch for it so we can drop it into JS-6.0?

Comment 7 Alexander Chuzhoy 2016-10-05 17:03:31 UTC
Created attachment 1207650
/var/lib/heat-config/heat-config-puppet/ab53c213-94cc-4fd8-bf7f-258cf45b0bae.pp

Comment 8 Alexander Chuzhoy 2016-10-05 17:08:42 UTC
Created attachment 1207652
hieradata dir from controller-2

Comment 9 Alexander Chuzhoy 2016-10-05 17:47:34 UTC
Created attachment 1207660
/usr/share/openstack-puppet/modules/ from controller

Comment 10 Alexander Chuzhoy 2016-10-05 17:49:31 UTC
Created attachment 1207661
/usr/share/openstack-puppet/modules/ from undercloud

Comment 11 Alexander Chuzhoy 2016-10-05 17:51:10 UTC
Hi Sofer.
The requested data is added.
Do you need anything else?
Thanks.

Comment 12 Giulio Fidente 2016-10-06 18:22:37 UTC
As per comment #3 from Emilien, I think they are deploying with settings for both managed *and* external ceph at the same time.

Could you collect from a controller node the output from:

$ sudo hiera enable_external_ceph

This is not supposed to return 'true' unless they are deploying with external Ceph.

Comment 13 Alexander Chuzhoy 2016-10-06 18:25:34 UTC
[heat-admin@overcloud-controller-0 ~]$ sudo hiera enable_external_ceph
nil

Comment 14 Sofer Athlan-Guyot 2016-10-10 10:31:15 UTC
Hi,

thanks Alexander for the information provided. I could reproduce
the error locally. The problem comes from the local modification of the
template described in
https://bugzilla.redhat.com/show_bug.cgi?id=1381629#c4 , most notably
the

   
    557a558
    >     include ::ceph::profile::rgw

This inclusion causes Puppet to automatically include
ceph::profile::params, hence the duplicate declaration error:


    Debug: importing '/home/chem/Src/redhat/openstack-puppet-modules/ceph/manifests/profile/rgw.pp' in environment production
    Debug: Automatically imported ceph::profile::rgw from ceph/profile/rgw into production
    Warning: ModuleLoader: module 'ceph' has unresolved dependencies - it will only see those that are resolved. Use 'puppet module list --tree' to see information about modules
       (file & line not available)
    Debug: importing '/home/chem/Src/redhat/openstack-puppet-modules/ceph/manifests/profile/base.pp' in environment production
    Debug: Automatically imported ceph::profile::base from ceph/profile/base into production
    Debug: importing '/home/chem/Src/redhat/openstack-puppet-modules/ceph/manifests/profile/params.pp' in environment production
    Debug: Automatically imported ceph::profile::params from ceph/profile/params into production


I don't know exactly why the modification was made, but it cannot
work. You have to remove it from your list of local modifications
for the minor update to get past this error.
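
The ordering problem can be illustrated without the Ceph modules. A minimal sketch, assuming hypothetical classes demo::params and demo::rgw that stand in for ceph::profile::params and ceph::profile::rgw: including the wrapper class first pulls the params class into the catalog, so the later resource-like declaration of the same class is rejected. The helper name write_repro_manifest is also made up for illustration.

```shell
# Hypothetical reproduction helper: writes a tiny Puppet manifest to the path
# given as $1. The demo:: classes are stand-ins, not real module classes.
write_repro_manifest() {
  cat > "$1" <<'EOF'
# Stand-in for ceph::profile::params.
class demo::params ($mon_host = undef) { }

# Stand-in for ceph::profile::rgw: it pulls in the params class implicitly.
class demo::rgw {
  include ::demo::params
}

# Evaluated first: demo::params enters the catalog with default parameters.
include ::demo::rgw

# Evaluated second: a resource-like declaration of a class that is already
# in the catalog, which Puppet rejects as a duplicate declaration.
class { '::demo::params':
  mon_host => '192.0.2.10',
}
EOF
}
```

Writing the manifest to a scratch file and running puppet apply --noop on it should end with a "Duplicate declaration" error for Class[Demo::Params]. Swapping the last two statements should let it compile, because include tolerates a class that is already declared while a resource-like declaration does not.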

Comment 15 Alexander Chuzhoy 2016-10-12 17:25:15 UTC
I believe this is not a bug and occurred as a result of wrongly patching /usr/share/openstack-tripleo-heat-templates/environments/puppet-pacemaker.yaml after running yum update.

Comment 16 Mike Orazi 2016-10-12 17:36:36 UTC
Is this potentially related to how rgw is handled in this environment?

Comment 17 Sofer Athlan-Guyot 2016-10-13 10:21:13 UTC
Hi, 

Alexander, Mike, you are both right. This is not a bug, and it's related to the inclusion of the RADOS gateway in the manifest (include ::ceph::profile::rgw). This local modification is blocking the update.

Comment 18 arkady kanevsky 2016-10-13 12:29:45 UTC
So what is the recommendation?
Replace the offending line with ???
Fix the handling of the duplicate ceph definition?

Update the update and upgrade docs: remove the offending lines, run the update and upgrade, then update/upgrade rgw and other libraries manually or with scripts, and then bring the lines back?

OSP10 will have new bits for RGW, but nobody has tried an update or upgrade with them.

Comment 19 Sean Merrow 2016-11-04 17:57:12 UTC
Sofer: Are you able to clearly articulate our recommendation to Dell and the answers to Arkady's questions in comment 18?

Comment 20 Randy Perryman 2016-11-04 18:08:25 UTC
This is not a bug and was caused by an incorrect patch. The correct procedure has been documented and verified by myself and Sasha.

Comment 21 arkady kanevsky 2016-11-09 20:29:27 UTC
Randy,
can we close this BZ?

Comment 22 Randy Perryman 2016-11-09 20:52:27 UTC
The error turned out to be a duplication of the line
"include ::ceph::profile::rgw" in this file. Once this was fixed, the error went away. This bug can be closed.

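A patch that was applied more than once can be spotted by counting how often a given include line appears in the manifest. A minimal sketch, assuming the include statement sits on a line of its own as in the diff from comment 4; the helper name count_includes is hypothetical.

```shell
# Hypothetical helper: print how many lines of a manifest consist of
# "include <class>". $1 = manifest path, $2 = class name.
count_includes() {
  # grep -c prints the number of matching lines; it exits non-zero when the
  # count is 0, so "|| true" keeps the helper usable under "set -e".
  grep -cE "^[[:space:]]*include[[:space:]]+$2[[:space:]]*\$" "$1" || true
}
```

On the undercloud, calling count_includes with /usr/share/openstack-tripleo-heat-templates/puppet/manifests/overcloud_controller_pacemaker.pp and ::ceph::profile::rgw as arguments and getting a result above 1 would point at a patch that was applied more than once.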
