Bug 2005849 - haproxy cannot connect to mysql (NOSRV)
Summary: haproxy cannot connect to mysql (NOSRV)
Keywords:
Status: CLOSED DUPLICATE of bug 2000088
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: python-PyMySQL
Version: 16.1 (Train)
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: OSP Team
QA Contact: nlevinki
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-09-20 10:33 UTC by Eduardo Olivares
Modified: 2021-11-30 07:51 UTC
CC List: 5 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-11-30 07:51:31 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Issue Tracker | OSP-9729 | 0 | None | None | None | 2021-11-15 12:47:54 UTC

Description Eduardo Olivares 2021-09-20 10:33:18 UTC
Description of problem:
The following OSP 16.1 update job fails:
https://rhos-ci-jenkins.lab.eng.tlv2.redhat.com/job/DFG-network-networking-ovn-update-16.1_director-rhel-virthost-3cont_2comp_2net-ipv4-geneve-composable/75/

The job updates OSP from 16.1 z6-async-rhbz1999919 (http://download.eng.bos.redhat.com/rcm-guest/puddles/OpenStack/16.1-RHEL-8/z6-async-rhbz1999919/) to 16.1 z7-spin4 (http://download.eng.bos.redhat.com/rcm-guest/puddles/OpenStack/16.1-RHEL-8/z7-spin4/).

The update itself completes successfully, and the overcloud nodes are then rebooted. After the reboot, the following command fails at 14:59:10 (see [1]):
$ source ~/overcloudrc && (openstack flavor delete 200 || true) && openstack flavor create --id 200 --ram 2048 --disk 10 --vcpus 2 guest_image
Gateway Timeout (HTTP 504)
Gateway Timeout (HTTP 504)


According to the logs, the overcloud reboot finished successfully at 14:59:02 (see [2]).


Apparently, haproxy cannot connect to mysql (no backend server is available, NOSRV) from 14:54:08 until the job ends at 15:17:55 (see [3]).
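The outage window can be read out of haproxy.log mechanically. The following is a minimal sketch, not the method used in this report: it assumes the mysql proxy is named "mysql" in haproxy.cfg and that the log uses HAProxy's standard TCP log format, where an unavailable backend shows up as backend/<NOSRV>. The sample lines are invented for illustration and are not taken from the job's log.

```python
import re
from datetime import datetime

# HAProxy prints the accept time in brackets, e.g. [17/Sep/2021:14:54:08.101]
ACCEPT_TS = re.compile(r"\[(\d{2}/\w{3}/\d{4}:\d{2}:\d{2}:\d{2})\.\d{3}\]")


def nosrv_window(lines, backend="mysql"):
    """Return (first, last, count) for lines where `backend` had no server.

    `backend` is an assumed proxy name; adjust it to match haproxy.cfg.
    Returns None when no matching line is found.
    """
    marker = f"{backend}/<NOSRV>"
    stamps = []
    for line in lines:
        if marker not in line:
            continue
        match = ACCEPT_TS.search(line)
        if match:
            stamps.append(datetime.strptime(match.group(1), "%d/%b/%Y:%H:%M:%S"))
    if not stamps:
        return None
    return min(stamps), max(stamps), len(stamps)


# Hypothetical sample lines, modeled on the TCP log format (not real job output).
SAMPLE = [
    "Sep 17 14:54:08 controller-0 haproxy[13]: 172.17.1.10:40112 "
    "[17/Sep/2021:14:54:08.101] mysql mysql/<NOSRV> -1/-1/0 0 SC 1/1/0/0/0 0/0",
    "Sep 17 14:55:00 controller-0 haproxy[13]: 172.17.1.10:40200 "
    "[17/Sep/2021:14:55:00.000] other other/srv1 1/0/25 120 -- 2/1/0/0/0 0/0",
    "Sep 17 15:17:55 controller-0 haproxy[13]: 172.17.1.10:41877 "
    "[17/Sep/2021:15:17:55.640] mysql mysql/<NOSRV> -1/-1/0 0 SC 1/1/0/0/0 0/0",
]

window = nosrv_window(SAMPLE)
if window:
    first, last, count = window
    print(f"{count} NOSRV entries between {first.time()} and {last.time()}")
```

To run it against the real file, decompress haproxy.log.gz from [3] and pass its lines to nosrv_window, for example via gzip.open(path, "rt").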


According to the pacemaker logs from controller-0 (see [4]), the galera-bundle resource was up and running at 14:57:
Sep 17 14:57:24 controller-0 pacemaker-schedulerd[2985] (pe__print_bundle)  info:  Container bundle set: galera-bundle [cluster.common.tag/rhosp16-openstack-mariadb:pcmklatest]
Sep 17 14:57:24 controller-0 pacemaker-schedulerd[2985] (common_print)  info:    galera-bundle-0    (ocf::heartbeat:galera):    Master controller-0
Sep 17 14:57:24 controller-0 pacemaker-schedulerd[2985] (common_print)  info:    galera-bundle-1    (ocf::heartbeat:galera):    Master controller-1
Sep 17 14:57:24 controller-0 pacemaker-schedulerd[2985] (common_print)  info:    galera-bundle-2    (ocf::heartbeat:galera):    Master controller-2
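To check the galera-bundle state across a longer pacemaker.log, the replica lines can be parsed rather than read by eye. The sketch below is only an illustration built around the four lines quoted above; the regular expression is an assumption derived from those lines and may not cover every layout pacemaker-schedulerd can print.

```python
import re

# Matches replica status lines such as:
#   ... (common_print)  info:    galera-bundle-0    (ocf::heartbeat:galera):    Master controller-0
REPLICA = re.compile(
    r"\(common_print\)\s+info:\s+(?P<replica>galera-bundle-\d+)\s+"
    r"\(ocf::heartbeat:galera\):\s+(?P<role>\S+)(?:\s+(?P<node>\S+))?"
)


def galera_roles(lines):
    """Map each galera-bundle replica to its (role, node) pair.

    Later lines overwrite earlier ones, so the result reflects the last
    state printed in the input. `node` is None if the line names no node.
    """
    roles = {}
    for line in lines:
        match = REPLICA.search(line)
        if match:
            roles[match.group("replica")] = (match.group("role"), match.group("node"))
    return roles


# The excerpt quoted in this report.
LOG = """\
Sep 17 14:57:24 controller-0 pacemaker-schedulerd[2985] (pe__print_bundle)  info:  Container bundle set: galera-bundle [cluster.common.tag/rhosp16-openstack-mariadb:pcmklatest]
Sep 17 14:57:24 controller-0 pacemaker-schedulerd[2985] (common_print)  info:    galera-bundle-0    (ocf::heartbeat:galera):    Master controller-0
Sep 17 14:57:24 controller-0 pacemaker-schedulerd[2985] (common_print)  info:    galera-bundle-1    (ocf::heartbeat:galera):    Master controller-1
Sep 17 14:57:24 controller-0 pacemaker-schedulerd[2985] (common_print)  info:    galera-bundle-2    (ocf::heartbeat:galera):    Master controller-2
"""

roles = galera_roles(LOG.splitlines())
all_master = len(roles) == 3 and all(role == "Master" for role, _ in roles.values())
print(roles)
print("all replicas Master:", all_master)
```

For this excerpt all three replicas are reported as Master, which matches the observation that galera was up at 14:57 while haproxy still could not reach mysql.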



[1] https://rhos-ci-jenkins.lab.eng.tlv2.redhat.com/job/DFG-network-networking-ovn-update-16.1_director-rhel-virthost-3cont_2comp_2net-ipv4-geneve-composable/75/consoleFull
[2] http://rhos-ci-logs.lab.eng.tlv2.redhat.com/logs/rcj/DFG-network-networking-ovn-update-16.1_director-rhel-virthost-3cont_2comp_2net-ipv4-geneve-composable/75/console_logs/ir-tripleo-overcloud-reboot.log
[3] http://rhos-ci-logs.lab.eng.tlv2.redhat.com/logs/rcj/DFG-network-networking-ovn-update-16.1_director-rhel-virthost-3cont_2comp_2net-ipv4-geneve-composable/75/controller-0/var/log/containers/haproxy/haproxy.log.gz
[4] http://rhos-ci-logs.lab.eng.tlv2.redhat.com/logs/rcj/DFG-network-networking-ovn-update-16.1_director-rhel-virthost-3cont_2comp_2net-ipv4-geneve-composable/75/controller-0/var/log/pacemaker/pacemaker.log.gz


Version-Release number of selected component (if applicable):
z7-spin4

How reproducible:
Only tested once

Steps to Reproduce:
1. Run the OVN OSP 16.1 update job. The failure should occur during the overcloud reboot stage.

