Bug 1128874

Summary: Rubygem-Staypuft: HA-neutron deployment gets paused with errors installing the third controller (2 have passed). Execution of '/usr/bin/systemctl start mariadb' returned 1: Job for mariadb.service failed
Product: Red Hat OpenStack
Component: rubygem-staypuft
Reporter: Alexander Chuzhoy <sasha>
Assignee: Crag Wolfe <cwolfe>
QA Contact: Leonid Natapov <lnatapov>
Status: CLOSED ERRATA
Severity: high
Priority: high
Version: unspecified
CC: mburns, morazi, sclewis, yeylon
Target Milestone: ga
Target Release: Installer
Keywords: TestOnly
Hardware: x86_64
OS: Linux
Doc Type: Bug Fix
Type: Bug
Last Closed: 2014-08-21 18:08:50 UTC
Attachments:
/var/log/messages file from the controller that had the puppet errors.
/var/log/mariadb/mariadb.log file from the controller that had the puppet errors.
/var/log/messages file from another controller that managed to complete the deployment on its own.
/var/log/messages file from the third controller that managed to complete the deployment on its own.

Description Alexander Chuzhoy 2014-08-11 17:21:02 UTC
Rubygem-Staypuft:  HA-neutron deployment gets paused with errors installing the third controller (2 have passed). Execution of '/usr/bin/systemctl start mariadb' returned 1: Job for mariadb.service failed


Environment:
rhel-osp-installer-0.1.9-1.el6ost.noarch
openstack-foreman-installer-2.0.18-1.el6ost.noarch
ruby193-rubygem-foreman_openstack_simplify-0.0.6-8.el6ost.noarch
openstack-puppet-modules-2014.1-19.9.el6ost.noarch

Steps to reproduce:
1. Install rhel-osp-installer
2. Configure/run an HA-neutron deployment (3 controllers + 2 computes)
3. In Advanced params, set the private_network to 192.168.0.0 (the address of the PXE network).

Result:
The deployment gets paused with an error while installing the third controller; the other two controllers have completed successfully.
Note: Running the puppet agent manually on that controller finished with no errors, and resuming the deployment afterwards resulted in a successful deployment.
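(For reference, the manual workaround amounts to something like the following; the exact invocation is an assumption, not quoted from the report.)

# On the affected controller, re-run the puppet agent in the foreground
# and watch for errors:
puppet agent --test --verbose
# If the run completes cleanly, resume the paused deployment from the
# Staypuft UI.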

Expected result:
The deployment should complete successfully with no intervention.

Comment 1 Alexander Chuzhoy 2014-08-11 17:23:41 UTC
Created attachment 925843 [details]
/var/log/messages file from the controller that had the puppet errors.

Comment 2 Alexander Chuzhoy 2014-08-11 17:24:55 UTC
Created attachment 925844 [details]
/var/log/mariadb/mariadb.log file from the controller that had the puppet errors.

Comment 3 Alexander Chuzhoy 2014-08-11 17:36:03 UTC
The yaml for the failed controller:
---
classes:
  quickstack::openstack_common: 
  quickstack::pacemaker::cinder:
    backend_eqlx: 'false'
    backend_eqlx_name:
    - eqlx_backend
    backend_glusterfs: false
    backend_glusterfs_name: glusterfs_backend
    backend_iscsi: 'false'
    backend_iscsi_name: iscsi_backend
    backend_nfs: 'true'
    backend_nfs_name: nfs_backend
    backend_rbd: 'false'
    backend_rbd_name: rbd_backend
    db_name: cinder
    db_ssl: false
    db_ssl_ca: ''
    db_user: cinder
    debug: false
    enabled: true
    eqlx_chap_login:
    - ''
    eqlx_chap_password:
    - ''
    eqlx_group_name:
    - group-0
    eqlx_pool:
    - default
    eqlx_use_chap:
    - 'false'
    glusterfs_shares: []
    log_facility: LOG_USER
    multiple_backends: false
    nfs_mount_options: nosharecache
    nfs_shares:
    - 192.168.0.1:/cinder
    qpid_heartbeat: '60'
    rbd_ceph_conf: /etc/ceph/ceph.conf
    rbd_flatten_volume_from_snapshot: 'false'
    rbd_max_clone_depth: '5'
    rbd_pool: volumes
    rbd_secret_uuid: 7fa6d270-5742-4bc1-ac68-0b17dd003040
    rbd_user: volumes
    rpc_backend: cinder.openstack.common.rpc.impl_kombu
    san_ip:
    - ''
    san_login:
    - grpadmin
    san_password:
    - ''
    san_thin_provision:
    - 'false'
    use_syslog: false
    verbose: 'true'
    volume: true
  quickstack::pacemaker::common:
    fence_ipmilan_address: ''
    fence_ipmilan_expose_lanplus: 'true'
    fence_ipmilan_hostlist: ''
    fence_ipmilan_host_to_address: []
    fence_ipmilan_interval: 60s
    fence_ipmilan_lanplus_options: ''
    fence_ipmilan_password: ''
    fence_ipmilan_username: ''
    fence_xvm_clu_iface: eth2
    fence_xvm_clu_network: ''
    fence_xvm_key_file_password: ''
    fence_xvm_manage_key_file: 'false'
    fencing_type: disabled
    pacemaker_cluster_members: 192.168.0.8 192.168.0.10 192.168.0.11
    pacemaker_cluster_name: openstack
  quickstack::pacemaker::galera:
    galera_monitor_password: monitor_pass
    galera_monitor_username: monitor_user
    mysql_root_password: 1ee3f43f37d5f76848595099681f68d9
    wsrep_cluster_members:
    - 192.168.0.8
    - 192.168.0.10
    - 192.168.0.11
    wsrep_cluster_name: galera_cluster
    wsrep_ssl: false
    wsrep_ssl_cert: /etc/pki/galera/galera.crt
    wsrep_ssl_key: /etc/pki/galera/galera.key
    wsrep_sst_method: rsync
    wsrep_sst_password: sst_pass
    wsrep_sst_username: sst_user
  quickstack::pacemaker::glance:
    backend: file
    db_name: glance
    db_ssl: false
    db_ssl_ca: ''
    db_user: glance
    debug: false
    enabled: true
    filesystem_store_datadir: /var/lib/glance/images/
    log_facility: LOG_USER
    pcmk_fs_device: 192.168.0.1:/cinder
    pcmk_fs_dir: /var/lib/glance/images
    pcmk_fs_manage: 'true'
    pcmk_fs_options: nosharecache,context=\"system_u:object_r:glance_var_lib_t:s0\"
    pcmk_fs_type: nfs
    pcmk_swift_is_local: true
    rbd_store_pool: images
    rbd_store_user: images
    sql_idle_timeout: '3600'
    swift_store_auth_address: http://127.0.0.1:5000/v2.0/
    swift_store_key: ''
    swift_store_user: ''
    use_syslog: false
    verbose: 'true'
  quickstack::pacemaker::heat:
    db_name: heat
    db_ssl: false
    db_ssl_ca: ''
    db_user: heat
    debug: false
    enabled: true
    log_facility: LOG_USER
    qpid_heartbeat: '60'
    use_syslog: false
    verbose: 'true'
  quickstack::pacemaker::horizon:
    horizon_ca: /etc/ipa/ca.crt
    horizon_cert: /etc/pki/tls/certs/PUB_HOST-horizon.crt
    horizon_key: /etc/pki/tls/private/PUB_HOST-horizon.key
    keystone_default_role: _member_
    memcached_port: '11211'
    secret_key: 0702bc7e913a64258c190aa291077837
    verbose: 'true'
  quickstack::pacemaker::keystone:
    admin_email: admin
    admin_password: 0164bce77dcb4ff7524e9cf43d9915d8
    admin_tenant: admin
    admin_token: 884b4fe2d325d2941f5961435d0568d2
    ceilometer: 'true'
    cinder: 'true'
    db_name: keystone
    db_ssl: 'false'
    db_ssl_ca: ''
    db_type: mysql
    db_user: keystone
    debug: 'false'
    enabled: 'true'
    glance: 'true'
    heat: 'true'
    heat_cfn: 'false'
    idle_timeout: '200'
    keystonerc: 'true'
    log_facility: LOG_USER
    nova: 'true'
    public_protocol: http
    region: RegionOne
    swift: 'false'
    token_driver: keystone.token.backends.sql.Token
    token_format: PKI
    use_syslog: 'false'
    verbose: 'true'
  quickstack::pacemaker::load_balancer: 
  quickstack::pacemaker::memcached: 
  quickstack::pacemaker::neutron:
    core_plugin: neutron.plugins.ml2.plugin.Ml2Plugin
    enabled: true
    enable_tunneling: 'true'
    external_network_bridge: br-ex
    ml2_flat_networks:
    - ! '*'
    ml2_mechanism_drivers:
    - openvswitch
    - l2population
    ml2_network_vlan_ranges:
    - physnet-external
    ml2_security_group: 'True'
    ml2_tenant_network_types:
    - vxlan
    ml2_tunnel_id_ranges:
    - 10:1000
    ml2_type_drivers:
    - local
    - flat
    - vlan
    - gre
    - vxlan
    ml2_vxlan_group: 224.0.0.1
    ovs_bridge_mappings:
    - physnet-external:br-ex
    ovs_bridge_uplinks:
    - br-ex:ens8
    ovs_tunnel_iface: ens7
    ovs_tunnel_network: ''
    ovs_tunnel_types:
    - vxlan
    - gre
    ovs_vlan_ranges:
    - physnet-external
    ovs_vxlan_udp_port: '4789'
    tenant_network_type: vlan
    tunnel_id_ranges: 1:1000
    verbose: 'true'
  quickstack::pacemaker::nova:
    auto_assign_floating_ip: 'true'
    db_name: nova
    db_user: nova
    default_floating_pool: nova
    force_dhcp_release: 'false'
    image_service: nova.image.glance.GlanceImageService
    memcached_port: '11211'
    multi_host: 'true'
    neutron_metadata_proxy_secret: 8d89120a015781f3fa793cbe49c0b28b
    qpid_heartbeat: '60'
    rpc_backend: nova.openstack.common.rpc.impl_kombu
    scheduler_host_subset_size: '30'
    verbose: 'true'
  quickstack::pacemaker::params:
    amqp_group: amqp
    amqp_password: 27cec767f1845e46996143779c4e3cb9
    amqp_port: '5672'
    amqp_provider: rabbitmq
    amqp_username: openstack
    amqp_vip: 192.168.0.99
    ceilometer_admin_vip: 192.168.0.88
    ceilometer_group: ceilometer
    ceilometer_private_vip: 192.168.0.88
    ceilometer_public_vip: 192.168.0.88
    ceilometer_user_password: 47c3f996059b9896aec51456c3c9646e
    cinder_admin_vip: 192.168.0.89
    cinder_db_password: b2509f3bb880a3ce4bf54010c2b98b4e
    cinder_group: cinder
    cinder_private_vip: 192.168.0.89
    cinder_public_vip: 192.168.0.89
    cinder_user_password: 6654863e83a6ce9bd01b75a4f4ed3cb5
    cluster_control_ip: 192.168.0.8
    db_group: db
    db_vip: 192.168.0.90
    glance_admin_vip: 192.168.0.91
    glance_db_password: 7de2b3ba48b3bf9a287030c5602eb900
    glance_group: glance
    glance_private_vip: 192.168.0.91
    glance_public_vip: 192.168.0.91
    glance_user_password: b7054bc8bf248aa06383c79d1e7ec6ee
    heat_admin_vip: 192.168.0.92
    heat_auth_encryption_key: a32c809f960cea9b3e76309060532493
    heat_cfn_admin_vip: 192.168.0.93
    heat_cfn_enabled: 'true'
    heat_cfn_group: heat_cfn
    heat_cfn_private_vip: 192.168.0.93
    heat_cfn_public_vip: 192.168.0.93
    heat_cfn_user_password: 0579cc2f023286f5b39aacb8b034b1dc
    heat_cloudwatch_enabled: 'true'
    heat_db_password: 19e847f1c8a2f8b4fcc9ec399a9fe073
    heat_group: heat
    heat_private_vip: 192.168.0.92
    heat_public_vip: 192.168.0.92
    heat_user_password: 6667b9a8ae5d0ae58f6d1ff99f7d43b8
    horizon_admin_vip: 192.168.0.94
    horizon_group: horizon
    horizon_private_vip: 192.168.0.94
    horizon_public_vip: 192.168.0.94
    include_amqp: 'true'
    include_cinder: 'true'
    include_glance: 'true'
    include_heat: 'true'
    include_horizon: 'true'
    include_keystone: 'true'
    include_mysql: 'true'
    include_neutron: 'true'
    include_nova: 'true'
    include_swift: 'false'
    keystone_admin_vip: 192.168.0.95
    keystone_db_password: 7ea9ddade46825e50f2e6401fd8a5163
    keystone_group: keystone
    keystone_private_vip: 192.168.0.95
    keystone_public_vip: 192.168.0.95
    keystone_user_password: 660d8620c76a70cdc104148f06463bcf
    lb_backend_server_addrs:
    - 192.168.0.8
    - 192.168.0.10
    - 192.168.0.11
    lb_backend_server_names:
    - maca25400702876.example.com
    - maca25400702875.example.com
    - maca25400702877.example.com
    loadbalancer_group: loadbalancer
    loadbalancer_vip: 192.168.0.96
    neutron: 'true'
    neutron_admin_vip: 192.168.0.98
    neutron_db_password: f56c8fb8254a63cb10b15b013590dfe8
    neutron_group: neutron
    neutron_metadata_proxy_secret: 8d89120a015781f3fa793cbe49c0b28b
    neutron_private_vip: 192.168.0.98
    neutron_public_vip: 192.168.0.98
    neutron_user_password: db83b3dbe314fb4d5e8a75ee1ada4955
    nova_admin_vip: 192.168.0.97
    nova_db_password: 554bdcba009e0978157ec4da8f06b13d
    nova_group: nova
    nova_private_vip: 192.168.0.97
    nova_public_vip: 192.168.0.97
    nova_user_password: a614ab98718f42fb9cfde6b676d5bc19
    private_iface: ''
    private_ip: 10.8.29.147
    private_network: 192.168.0.0
    swift_group: swift
    swift_public_vip: 192.168.0.100
    swift_user_password: ''
  quickstack::pacemaker::qpid:
    backend_port: '15672'
    config_file: /etc/qpidd.conf
    connection_backlog: '65535'
    haproxy_timeout: 120s
    log_to_file: UNSET
    manage_service: true
    max_connections: '65535'
    package_ensure: present
    package_name: qpid-cpp-server
    realm: QPID
    service_enable: true
    service_ensure: running
    service_name: qpidd
    worker_threads: '17'
  quickstack::pacemaker::swift:
    memcached_port: '11211'
    swift_internal_vip: 192.168.0.100
    swift_shared_secret: faffe511acf79d72b6438920b8790863
    swift_storage_device: ''
    swift_storage_ips: []
parameters:
  puppetmaster: staypuft.example.com
  domainname: Default domain used for provisioning
  hostgroup: base_RedHat_7/HA-neutron/HA Controller
  root_pw: $5$fm$EO4A.ybSB/ofUaZWkzNePd38XRpUQNXls8y1feWvIy3
  puppet_ca: staypuft.example.com
  foreman_env: production
  owner_name: Admin User
  owner_email: root
  ntp-server: clock.redhat.com
  ui::cinder::driver_backend: nfs
  ui::cinder::eqlx_group_name: group-0
  ui::cinder::eqlx_pool: default
  ui::cinder::nfs_uri: 192.168.0.1:/cinder
  ui::cinder::rbd_secret_uuid: 7fa6d270-5742-4bc1-ac68-0b17dd003040
  ui::cinder::san_login: grpadmin
  ui::deployment::amqp_provider: rabbitmq
  ui::deployment::layout_name: High Availability Controllers / Compute
  ui::deployment::networking: neutron
  ui::deployment::platform: rhel7
  ui::glance::driver_backend: nfs
  ui::glance::nfs_network_path: 192.168.0.1:/cinder
  ui::neutron::compute_tenant_interface: ens7
  ui::neutron::external_interface_name: ens8
  ui::neutron::networker_tenant_interface: ens7
  ui::neutron::network_segmentation: vxlan
  ui::neutron::use_external_interface: 'true'
  ui::nova::network_manager: FlatDHCPManager
  ui::passwords::admin: 0164bce77dcb4ff7524e9cf43d9915d8
  ui::passwords::amqp: 27cec767f1845e46996143779c4e3cb9
  ui::passwords::amqp_nssdb: bb1755ade38caa4dd9a072787488f88d
  ui::passwords::ceilometer_metering_secret: a55ba849858684bd3c21b423c549ff92
  ui::passwords::ceilometer_user: 47c3f996059b9896aec51456c3c9646e
  ui::passwords::cinder_db: b2509f3bb880a3ce4bf54010c2b98b4e
  ui::passwords::cinder_user: 6654863e83a6ce9bd01b75a4f4ed3cb5
  ui::passwords::glance_db: 7de2b3ba48b3bf9a287030c5602eb900
  ui::passwords::glance_user: b7054bc8bf248aa06383c79d1e7ec6ee
  ui::passwords::heat_auth_encrypt_key: a32c809f960cea9b3e76309060532493
  ui::passwords::heat_cfn_user: 0579cc2f023286f5b39aacb8b034b1dc
  ui::passwords::heat_db: 19e847f1c8a2f8b4fcc9ec399a9fe073
  ui::passwords::heat_user: 6667b9a8ae5d0ae58f6d1ff99f7d43b8
  ui::passwords::horizon_secret_key: 0702bc7e913a64258c190aa291077837
  ui::passwords::keystone_admin_token: 884b4fe2d325d2941f5961435d0568d2
  ui::passwords::keystone_db: 7ea9ddade46825e50f2e6401fd8a5163
  ui::passwords::keystone_user: 660d8620c76a70cdc104148f06463bcf
  ui::passwords::mode: random
  ui::passwords::mysql_root: 1ee3f43f37d5f76848595099681f68d9
  ui::passwords::neutron_db: f56c8fb8254a63cb10b15b013590dfe8
  ui::passwords::neutron_metadata_proxy_secret: 8d89120a015781f3fa793cbe49c0b28b
  ui::passwords::neutron_user: db83b3dbe314fb4d5e8a75ee1ada4955
  ui::passwords::nova_db: 554bdcba009e0978157ec4da8f06b13d
  ui::passwords::nova_user: a614ab98718f42fb9cfde6b676d5bc19
  ui::passwords::swift_admin: 3a7d0dcd78ba161dfbabca82cc77b636
  ui::passwords::swift_shared_secret: faffe511acf79d72b6438920b8790863
  ui::passwords::swift_user: b0535d312085ab6d60fff6bc843b9bd7
environment: production

Comment 4 Crag Wolfe 2014-08-11 18:11:51 UTC
Trying to piece together what happened here. The first puppet error message is for heat:

Aug 11 16:39:25 maca25400702876 start-puppet-agent: Error: unable to find group or resource: openstack-heat-api-cfn
Aug 11 16:39:25 maca25400702876 puppet-agent[3364]: (/Stage[main]/Quickstack::Pacemaker::Heat/Quickstack::Pacemaker::Resource::Service[openstack-heat-api-cfn]/Pacemaker::Resource::Service[openstack-heat-api-cfn]/Pacemaker::Resource::Systemd[openstack-heat-api-cfn]/Pcmk_resource[openstack-heat-api-cfn]/ensure) created
Aug 11 16:39:26 maca25400702876 puppet-agent[3364]: (/Stage[main]/Quickstack::Pacemaker::Heat/Quickstack::Pacemaker::Constraint::Base[heat-api-cfn-constr]/Pacemaker::Constraint::Base[heat-api-cfn-constr]/Exec[Creating order constraint heat-api-cfn-constr]/returns) Error: Resource 'openstack-heat-api-cfn-clone' does not exist

Where did you see "Execution of '/usr/bin/systemctl start mariadb' returned 1: Job for mariadb.service failed"? It is not in /var/log/messages. Is there a timestamp associated with the message?
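(A quick way to locate the message and its timestamp in the attached logs, assuming the standard syslog layout -- illustrative only:)

grep -n 'mariadb' /var/log/messages
grep -n 'returned 1' /var/log/messages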

Comment 5 Alexander Chuzhoy 2014-08-11 18:15:43 UTC
I saw this message in the puppet report for the controller where I had to manually re-run puppet.


Could not start Service[galera]: Execution of '/usr/bin/systemctl start mariadb' returned 1: Job for mariadb.service failed. See 'systemctl status mariadb.service' and 'journalctl -xn' for details. Wrapped exception: Execution of '/usr/bin/systemctl start mariadb' returned 1: Job for mariadb.service failed. See 'systemctl status mariadb.service' and 'journalctl -xn' for details.
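(The follow-up diagnostics suggested by that error would be roughly the following; these are the generic commands named in the message, not output captured for this bug.)

systemctl status mariadb.service -l
journalctl -xn
tail -n 100 /var/log/mariadb/mariadb.log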

Comment 6 Alexander Chuzhoy 2014-08-11 18:28:40 UTC
Created attachment 925862 [details]
/var/log/messages file from another controller that managed to complete the deployment on its own.

Comment 7 Alexander Chuzhoy 2014-08-11 18:30:26 UTC
Created attachment 925864 [details]
/var/log/messages file from the third controller that managed to complete the deployment on its own.

Comment 8 Crag Wolfe 2014-08-11 18:50:29 UTC
The first puppet error in attachment 925862 [details] is:

Aug 11 16:32:40 maca25400702876 Filesystem(fs-varlibglanceimages)[21759]: ERROR: Couldn't mount filesystem 192.168.0.1:/cinder on /var/lib/glance/images

This implies the glance-related parameter, pcmk_fs_options, is incorrectly set to 192.168.0.1:/cinder. This error also occurs in the other attachment.
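(A hand test of that mount, using the device and options from the YAML above, would look roughly like this -- an illustrative check, not something that was run for this report:)

mkdir -p /var/lib/glance/images
mount -t nfs -o nosharecache,context="system_u:object_r:glance_var_lib_t:s0" \
      192.168.0.1:/cinder /var/lib/glance/images
# A failure here should print the underlying NFS/SELinux error directly.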

I'm thinking the galera error reported in this BZ when re-running puppet by hand may be due to https://bugzilla.redhat.com/show_bug.cgi?id=1123312, so we should wait for a fix there to see whether it addresses the issue. However, I'm concerned about the other errors (cinder and heat) that prevented any of the puppet runs from completing successfully.

Comment 10 Leonid Natapov 2014-08-20 12:30:06 UTC
openstack-foreman-installer-2.0.21-1.el6ost

No galera errors; mariadb started successfully on all nodes.
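(For reference, a plausible spot-check on each controller -- not necessarily the exact commands QA ran; the monitor credentials come from the YAML above:)

systemctl is-active mariadb
mysql -u monitor_user -pmonitor_pass -e "SHOW STATUS LIKE 'wsrep_cluster_size';"
# Expect "active" and a cluster size of 3.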

Comment 11 errata-xmlrpc 2014-08-21 18:08:50 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2014-1090.html