Bug 1316283
Summary: | If there is a /var/lib/neutron/ha_confs/<router-id>.pid then l3 agent fails to spawn a keepalived process for that router | |||
---|---|---|---|---|
Product: | Red Hat OpenStack | Reporter: | Jeremy <jmelvin> | |
Component: | openstack-neutron | Assignee: | Assaf Muller <amuller> | |
Status: | CLOSED ERRATA | QA Contact: | Alexander Stafeyev <astafeye> | |
Severity: | medium | Docs Contact: | ||
Priority: | medium | |||
Version: | 7.0 (Kilo) | CC: | amuller, bperkins, chrisw, jmelvin, jruzicka, majopela, nchandek, nyechiel, srevivo, tfreger | |
Target Milestone: | async | Keywords: | Reopened, ZStream | |
Target Release: | 7.0 (Kilo) | |||
Hardware: | All | |||
OS: | All | |||
Whiteboard: | ||||
Fixed In Version: | openstack-neutron-2015.1.2-13.el7ost | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | ||
Clone Of: | ||||
: | 1545481 (view as bug list) | Environment: | ||
Last Closed: | 2018-02-14 15:39:57 UTC | Type: | Bug | |
Bug Blocks: | 1545481 |
Description
Jeremy
2016-03-09 21:31:35 UTC
Assigned to Miguel for triaging. Miguel, could this be a bug with ProcessManager?

It's odd. Looking at the 'active' method implementation in the Kilo code, it checks (as it does on master) that the resource's UUID is present in the cmdline of the pid in question, and the UUID for keepalived_manager is the router_id, so the scenario in this RHBZ "should" not be happening.

It could be. I would debug around here [1]. As you said, that bug may have been prevented by that logic: we look for the UUID of the process in the process cmdline (/proc/<pid>/cmdline), and otherwise active will return False, triggering the respawn in the ProcessMonitor. Please note that the logic for spawning keepalived and tracking the vrrp child is in [2]. I'm sending the bug to @hmlnarik as suggested on ping.

[1] https://github.com/openstack/neutron/blob/2768da320d7fb1630f2ffa32ec6485b279ba37e8/neutron/agent/linux/external_process.py#L134
[2] https://github.com/openstack/neutron/blob/stable/kilo/neutron/agent/linux/keepalived.py#L332

Bad bot! Bad! Go away.

This bug was accidentally moved from POST to MODIFIED via an error in automation; please see mmccune with any questions.

SandboxB:[root@ttbossbxbmgmt0001 neutron1596344-keepalive]# yum localupdate openstack-neutron-2015.1.2-13.el7ost.noarch.rpm openstack-neutron-common-2015.1.2-13.el7ost.noarch.rpm openstack-neutron-ml2-2015.1.2-13.el7ost.noarch.rpm python-neutron-2015.1.2-13.el7ost.noarch.rpm
Loaded plugins: langpacks, priorities, product-id, rhnplugin, search-disabled-repos, subscription-manager
This system is not registered with RHN Classic or Red Hat Satellite.
You can use rhn_register to register.
Red Hat Satellite or RHN Classic support will be disabled.
Examining openstack-neutron-2015.1.2-13.el7ost.noarch.rpm: openstack-neutron-2015.1.2-13.el7ost.noarch
Marking openstack-neutron-2015.1.2-13.el7ost.noarch.rpm as an update to openstack-neutron-2015.1.2-11.el7ost.noarch
Examining openstack-neutron-common-2015.1.2-13.el7ost.noarch.rpm: openstack-neutron-common-2015.1.2-13.el7ost.noarch
Marking openstack-neutron-common-2015.1.2-13.el7ost.noarch.rpm as an update to openstack-neutron-common-2015.1.2-11.el7ost.noarch
Examining openstack-neutron-ml2-2015.1.2-13.el7ost.noarch.rpm: openstack-neutron-ml2-2015.1.2-13.el7ost.noarch
Marking openstack-neutron-ml2-2015.1.2-13.el7ost.noarch.rpm as an update to openstack-neutron-ml2-2015.1.2-11.el7ost.noarch
Examining python-neutron-2015.1.2-13.el7ost.noarch.rpm: python-neutron-2015.1.2-13.el7ost.noarch
Marking python-neutron-2015.1.2-13.el7ost.noarch.rpm as an update to python-neutron-2015.1.2-11.el7ost.noarch
Resolving Dependencies
--> Running transaction check
---> Package openstack-neutron.noarch 0:2015.1.2-11.el7ost will be updated
---> Package openstack-neutron.noarch 0:2015.1.2-13.el7ost will be an update
---> Package openstack-neutron-common.noarch 0:2015.1.2-11.el7ost will be updated
--> Processing Dependency: openstack-neutron-common = 2015.1.2-11.el7ost for package: openstack-neutron-openvswitch-2015.1.2-11.el7ost.noarch
datadog                                                      | 951 B  00:00:00
enterprise-cloud_ECE_Custom_Packages_ECE_Custom_Packages     | 2.1 kB 00:00:00
rhel-7-server-openstack-7.0-director-rpms                    | 2.1 kB 00:00:00
rhel-7-server-openstack-7.0-optools-rpms                     | 2.1 kB 00:00:00
rhel-7-server-openstack-7.0-rpms                             | 2.1 kB 00:00:00
rhel-7-server-rh-common-rpms                                 | 2.1 kB 00:00:00
rhel-7-server-rhceph-1.3-tools-rpms                          | 2.1 kB 00:00:00
rhel-7-server-rpms                                           | 2.1 kB 00:00:00
rhel-7-server-satellite-tools-6.1-rpms                       | 2.1 kB 00:00:00
rhel-ha-for-rhel-7-server-rpms                               | 2.1 kB 00:00:00
treasure-data                                                | 2.9 kB 00:00:00
---> Package openstack-neutron-common.noarch 0:2015.1.2-13.el7ost will be an update
---> Package openstack-neutron-ml2.noarch 0:2015.1.2-11.el7ost will be updated
---> Package openstack-neutron-ml2.noarch 0:2015.1.2-13.el7ost will be an update
---> Package python-neutron.noarch 0:2015.1.2-11.el7ost will be updated
---> Package python-neutron.noarch 0:2015.1.2-13.el7ost will be an update
--> Finished Dependency Resolution
Error: Package: openstack-neutron-openvswitch-2015.1.2-11.el7ost.noarch (@/openstack-neutron-openvswitch-2015.1.2-11.el7ost.noarch)
           Requires: openstack-neutron-common = 2015.1.2-11.el7ost
           Removing: openstack-neutron-common-2015.1.2-11.el7ost.noarch (@/openstack-neutron-common-2015.1.2-11.el7ost.noarch)
               openstack-neutron-common = 2015.1.2-11.el7ost
           Updated By: openstack-neutron-common-2015.1.2-13.el7ost.noarch (/openstack-neutron-common-2015.1.2-13.el7ost.noarch)
               openstack-neutron-common = 2015.1.2-13.el7ost
           Available: openstack-neutron-common-2015.1.0-12.el7ost.noarch (rhel-7-server-openstack-7.0-rpms)
               openstack-neutron-common = 2015.1.0-12.el7ost
           Available: openstack-neutron-common-2015.1.0-16.el7ost.noarch (rhel-7-server-openstack-7.0-rpms)
               openstack-neutron-common = 2015.1.0-16.el7ost
           Available: openstack-neutron-common-2015.1.1-6.el7ost.noarch (rhel-7-server-openstack-7.0-rpms)
               openstack-neutron-common = 2015.1.1-6.el7ost
           Available: openstack-neutron-common-2015.1.1-7.el7ost.noarch (rhel-7-server-openstack-7.0-rpms)
               openstack-neutron-common = 2015.1.1-7.el7ost
           Available: openstack-neutron-common-2015.1.2-3.el7ost.noarch (rhel-7-server-openstack-7.0-rpms)
               openstack-neutron-common = 2015.1.2-3.el7ost
           Available: openstack-neutron-common-2015.1.2-6.el7ost.noarch (rhel-7-server-openstack-7.0-rpms)
               openstack-neutron-common = 2015.1.2-6.el7ost
           Available: openstack-neutron-common-2015.1.2-7.el7ost.noarch (rhel-7-server-openstack-7.0-rpms)
               openstack-neutron-common = 2015.1.2-7.el7ost
           Available: openstack-neutron-common-2015.1.2-9.el7ost.noarch (rhel-7-server-openstack-7.0-rpms)
               openstack-neutron-common = 2015.1.2-9.el7ost
**********************************************************************
yum can be configured to try to resolve such errors by temporarily enabling
disabled repos and searching for missing dependencies.
To enable this functionality please set 'notify_only=0' in /etc/yum/pluginconf.d/search-disabled-repos.conf
**********************************************************************
Error: Package: openstack-neutron-openvswitch-2015.1.2-11.el7ost.noarch (@/openstack-neutron-openvswitch-2015.1.2-11.el7ost.noarch)
           Requires: openstack-neutron-common = 2015.1.2-11.el7ost
           Removing: openstack-neutron-common-2015.1.2-11.el7ost.noarch (@/openstack-neutron-common-2015.1.2-11.el7ost.noarch)
               openstack-neutron-common = 2015.1.2-11.el7ost
           Updated By: openstack-neutron-common-2015.1.2-13.el7ost.noarch (/openstack-neutron-common-2015.1.2-13.el7ost.noarch)
               openstack-neutron-common = 2015.1.2-13.el7ost
           Available: openstack-neutron-common-2015.1.0-12.el7ost.noarch (rhel-7-server-openstack-7.0-rpms)
               openstack-neutron-common = 2015.1.0-12.el7ost
           Available: openstack-neutron-common-2015.1.0-16.el7ost.noarch (rhel-7-server-openstack-7.0-rpms)
               openstack-neutron-common = 2015.1.0-16.el7ost
           Available: openstack-neutron-common-2015.1.1-6.el7ost.noarch (rhel-7-server-openstack-7.0-rpms)
               openstack-neutron-common = 2015.1.1-6.el7ost
           Available: openstack-neutron-common-2015.1.1-7.el7ost.noarch (rhel-7-server-openstack-7.0-rpms)
               openstack-neutron-common = 2015.1.1-7.el7ost
           Available: openstack-neutron-common-2015.1.2-3.el7ost.noarch (rhel-7-server-openstack-7.0-rpms)
               openstack-neutron-common = 2015.1.2-3.el7ost
           Available: openstack-neutron-common-2015.1.2-6.el7ost.noarch (rhel-7-server-openstack-7.0-rpms)
               openstack-neutron-common = 2015.1.2-6.el7ost
           Available: openstack-neutron-common-2015.1.2-7.el7ost.noarch (rhel-7-server-openstack-7.0-rpms)
               openstack-neutron-common = 2015.1.2-7.el7ost
           Available: openstack-neutron-common-2015.1.2-9.el7ost.noarch (rhel-7-server-openstack-7.0-rpms)
               openstack-neutron-common = 2015.1.2-9.el7ost
 You could try using --skip-broken to work around the problem
 You could try running: rpm -Va --nofiles --nodigest
SandboxB:[root@ttbossbxbmgmt0001 neutron1596344-keepalive]#
SandboxB:[root@ttbossbxbmgmt0001 neutron1596344-keepalive]# yum repolist
Loaded plugins: langpacks, priorities, product-id, rhnplugin, search-disabled-repos, subscription-manager
This system is not registered with RHN Classic or Red Hat Satellite.
You can use rhn_register to register.
Red Hat Satellite or RHN Classic support will be disabled.
repo id                                                      repo name                                                          status
datadog                                                      Datadog, Inc.                                                          53
!enterprise-cloud_ECE_Custom_Packages_ECE_Custom_Packages    ECE Custom Packages                                                     5
!rhel-7-server-openstack-7.0-director-rpms/7Server/x86_64    Red Hat Enterprise Linux OpenStack Platform 7.0 director for RH        78
!rhel-7-server-openstack-7.0-optools-rpms/7Server/x86_64     Red Hat Enterprise Linux OpenStack Platform 7.0 Operational Too        91
!rhel-7-server-openstack-7.0-rpms/7Server/x86_64             Red Hat Enterprise Linux OpenStack Platform 7.0 for RHEL 7 (RPM     1,094
!rhel-7-server-rh-common-rpms/7Server/x86_64                 Red Hat Enterprise Linux 7 Server - RH Common (RPMs)                  168
!rhel-7-server-rhceph-1.3-tools-rpms/7Server/x86_64          Red Hat Ceph Storage Tools 1.3 for Red Hat Enterprise Linux 7 S        71
!rhel-7-server-rpms/7Server/x86_64                           Red Hat Enterprise Linux 7 Server (RPMs)                         20,259+2
!rhel-7-server-satellite-tools-6.1-rpms/x86_64               Red Hat Satellite Tools 6.1 (for RHEL 7 Server) (RPMs)                 83
!rhel-ha-for-rhel-7-server-rpms/7Server/x86_64               Red Hat Enterprise Linux High Availability (for RHEL 7 Server)        190
treasure-data/7Server/x86_64                                 Ye Ole Rpm Repo                                                         9
repolist: 22,101
SandboxB:[root@ttbossbxbmgmt0001 neutron1596344-keepalive]#

Verified.
[root@overcloud-controller-0 ~]# rpm -qa | grep openstack-neutron-2015
openstack-neutron-2015.1.2-13.el7ost.noarch

Saw the following in l3-agent.log:

2016-04-21 13:15:44.778 22415 ERROR neutron.agent.linux.external_process [-] default-service for router with uuid cf2abf13-4ad2-4f0e-9f5e-9bc8539d990d not found. The process should not have died
2016-04-21 13:15:44.779 22415 ERROR neutron.agent.linux.external_process [-] respawning keepalived for uuid cf2abf13-4ad2-4f0e-9f5e-9bc8539d990d

The pid was changed. The pid written manually into the pid file (a keystone pid):

[root@overcloud-controller-0 ~]# cat /var/lib/neutron/ha_confs/cf2abf13-4ad2-4f0e-9f5e-9bc8539d990d.pid
8708

After the log showed the error, the pid changed:

[root@overcloud-controller-0 ~]# cat /var/lib/neutron/ha_confs/cf2abf13-4ad2-4f0e-9f5e-9bc8539d990d.pid
26096

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-1062.html
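
For reference, the check discussed in the triage comments (the pid recorded in the pid file counts as active only if the router UUID appears in /proc/<pid>/cmdline) can be sketched roughly as below. This is a minimal illustration and not the neutron ProcessManager code: the function names, the proc_root parameter, and the fake command lines used in demo() are assumptions, chosen so that the example runs without a real keepalived or a real /proc entry.

```python
import os
import tempfile


def read_pid(pid_file):
    """Return the integer pid stored in pid_file, or None if it is unreadable."""
    try:
        with open(pid_file) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def is_active(pid_file, uuid, proc_root="/proc"):
    """True only if the recorded pid exists and its cmdline mentions uuid.

    proc_root is a hypothetical parameter added here so the check can be
    exercised against a fake tree instead of the real /proc.
    """
    pid = read_pid(pid_file)
    if pid is None:
        return False
    cmdline_path = os.path.join(proc_root, str(pid), "cmdline")
    try:
        # cmdline is NUL-separated and may not be valid text, so read bytes.
        with open(cmdline_path, "rb") as f:
            return uuid.encode() in f.read()
    except OSError:
        return False


def demo():
    """Build a fake /proc tree and compare a stale pid with a matching one."""
    uuid = "cf2abf13-4ad2-4f0e-9f5e-9bc8539d990d"
    root = tempfile.mkdtemp()
    proc = os.path.join(root, "proc")
    # Illustrative command lines only: 8708 stands in for an unrelated
    # service, 26096 for a keepalived started for this router.
    fake = {
        8708: b"keystone-all\0",
        26096: b"keepalived\0-f\0/ha_confs/" + uuid.encode() + b"/keepalived.conf\0",
    }
    for pid, cmdline in fake.items():
        os.makedirs(os.path.join(proc, str(pid)))
        with open(os.path.join(proc, str(pid), "cmdline"), "wb") as f:
            f.write(cmdline)
    pid_file = os.path.join(root, uuid + ".pid")
    results = []
    for pid in (8708, 26096):
        with open(pid_file, "w") as f:
            f.write(str(pid))
        results.append(is_active(pid_file, uuid, proc_root=proc))
    return results
```

With this sketch, demo() returns [False, True]: a pid file pointing at an unrelated process is treated as inactive, which is the condition that should lead the monitor to respawn keepalived, as in the verification log above.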