Bug 1805984 - [OSP16] IPv6 Overcloud deployment failed with Spine-Leaf network on chronyc waitsync
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-common
Version: 16.0 (Train)
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: urgent
Target Milestone: ---
Target Release: ---
Assignee: Adriano Petrich
QA Contact: Alexander Chuzhoy
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-02-21 19:11 UTC by Yuri Obshansky
Modified: 2020-05-06 16:22 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-05-06 16:22:41 UTC
Target Upstream Version:
Embargoed:


Attachments
ansible.log (25.02 KB, application/gzip)
2020-02-21 19:11 UTC, Yuri Obshansky

Description Yuri Obshansky 2020-02-21 19:11:04 UTC
Created attachment 1664832
ansible.log

Description of problem:
The OSP 16 DCN deployment with IPv6 and a Spine-Leaf network topology failed on the chronyc waitsync task:
2020-02-21 18:49:19,355 p=885 u=mistral |  fatal: [central-controller0-0]: FAILED! => {"changed": true, "cmd": ["chronyc", "waitsync", "20"], "delta": "0:03:10.191617", "end": "2020-02-21 13:49:19.283560", "msg": "non-zero return code", "rc": 1, "start": "2020-02-21 13:46:09.091943", "stderr": "", "stderr_lines": [], "stdout": "try: 1, refid: 00000000, correction: 0.000000000, skew: 0.000\ntry: 2, refid: 00000000, correction: 0.000000000, skew: 0.000\ntry: 3, refid: 00000000, correction: 0.000000000, skew: 0.000\ntry: 4, refid: 00000000, correction: 0.000000000, skew: 0.000\ntry: 5, refid: 00000000, correction: 0.000000000, skew: 0.000\ntry: 6, refid: 00000000, correction: 0.000000000, skew: 0.000\ntry: 7, refid: 00000000, correction: 0.000000001, skew: 0.000\ntry: 8, refid: 00000000, correction: 0.000000001, skew: 0.000\ntry: 9, refid: 00000000, correction: 0.000000001, skew: 0.000\ntry: 10, refid: 00000000, correction: 0.000000001, skew: 0.000\ntry: 11, refid: 00000000, correction: 0.000000001, skew: 0.000\ntry: 12, refid: 00000000, correction: 0.000000001, skew: 0.000\ntry: 13, refid: 00000000, correction: 0.000000001, skew: 0.000\ntry: 14, refid: 00000000, correction: 0.000000001, skew: 0.000\ntry: 15, refid: 00000000, correction: 0.000000001, skew: 0.000\ntry: 16, refid: 00000000, correction: 0.000000001, skew: 0.000\ntry: 17, refid: 00000000, correction: 0.000000002, skew: 0.000\ntry: 18, refid: 00000000, correction: 0.000000002, skew: 0.000\ntry: 19, refid: 00000000, correction: 0.000000002, skew: 0.000\ntry: 20, refid: 00000000, correction: 0.000000002, skew: 0.000", "stdout_lines": ["try: 1, refid: 00000000, correction: 0.000000000, skew: 0.000", "try: 2, refid: 00000000, correction: 0.000000000, skew: 0.000", "try: 3, refid: 00000000, correction: 0.000000000, skew: 0.000", "try: 4, refid: 00000000, correction: 0.000000000, skew: 0.000", "try: 5, refid: 00000000, correction: 0.000000000, skew: 0.000", "try: 6, refid: 00000000, correction: 0.000000000, skew: 0.000", "try: 7, refid: 00000000, correction: 0.000000001, skew: 0.000", "try: 8, refid: 00000000, correction: 0.000000001, skew: 0.000", "try: 9, refid: 00000000, correction: 0.000000001, skew: 0.000", "try: 10, refid: 00000000, correction: 0.000000001, skew: 0.000", "try: 11, refid: 00000000, correction: 0.000000001, skew: 0.000", "try: 12, refid: 00000000, correction: 0.000000001, skew: 0.000", "try: 13, refid: 00000000, correction: 0.000000001, skew: 0.000", "try: 14, refid: 00000000, correction: 0.000000001, skew: 0.000", "try: 15, refid: 00000000, correction: 0.000000001, skew: 0.000", "try: 16, refid: 00000000, correction: 0.000000001, skew: 0.000", "try: 17, refid: 00000000, correction: 0.000000002, skew: 0.000", "try: 18, refid: 00000000, correction: 0.000000002, skew: 0.000", "try: 19, refid: 00000000, correction: 0.000000002, skew: 0.000", "try: 20, refid: 00000000, correction: 0.000000002, skew: 0.000"]}
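
As a reading aid for the output above: every try reports refid 00000000, which is what chronyc prints while no time source has been selected, so chronyd never synchronized within the 20 tries. The delta of about 3 minutes 10 seconds is consistent with 20 tries at chronyc's default 10-second interval. The following is a minimal Python sketch, not part of the deployment tooling, that checks this condition on captured waitsync lines. The function name never_synced is hypothetical.

```python
import re

# Matches one progress line printed by `chronyc waitsync`, e.g.
# "try: 1, refid: 00000000, correction: 0.000000000, skew: 0.000"
TRY_LINE = re.compile(
    r"try:\s*(?P<n>\d+),\s*refid:\s*(?P<refid>[0-9A-Fa-f]{8}),"
    r"\s*correction:\s*(?P<corr>-?[\d.]+),\s*skew:\s*(?P<skew>-?[\d.]+)"
)


def never_synced(stdout_lines):
    """Return True if every parsed try shows refid 00000000 (no source selected)."""
    refids = []
    for line in stdout_lines:
        match = TRY_LINE.search(line)
        if match:
            refids.append(match.group("refid"))
    # An empty capture is treated as "unknown", not as "never synced".
    return bool(refids) and all(refid == "00000000" for refid in refids)


if __name__ == "__main__":
    # Two lines copied from the failure output in this report.
    sample = [
        "try: 1, refid: 00000000, correction: 0.000000000, skew: 0.000",
        "try: 20, refid: 00000000, correction: 0.000000002, skew: 0.000",
    ]
    print(never_synced(sample))
```

Feeding it the "stdout_lines" list from the Ansible failure would show whether any source was ever selected during the wait.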


Version-Release number of selected component (if applicable):
RHOS_TRUNK-16.0-RHEL-8-20200220.n.0

How reproducible:
Jenkins job
https://rhos-ci-staging-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/DFG/view/edge/job/DFG-edge-deployment-16-rhel-virthost-ipv6-3cont-2comp-2leafs-x-2comp-default/2/console

Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 1 Alex Schultz 2020-02-21 21:44:39 UTC
This points to a misconfiguration: the NTP server is unavailable.

Comment 2 Yuri Obshansky 2020-02-21 22:01:59 UTC
(In reply to Alex Schultz from comment #1)
> This points to a misconfiguration: the NTP server is unavailable.

I think it is configured correctly.

[stack@site-undercloud-0 ~]$ cat overcloud_deploy.sh
#!/bin/bash

openstack overcloud deploy \
--timeout 240 \
--templates /usr/share/openstack-tripleo-heat-templates \
--stack central \
--libvirt-type kvm \
--ntp-server clock1.rdu2.redhat.com \
............

[stack@site-undercloud-0 ~]$ cat undercloud.conf
[DEFAULT]
local_interface = eth0
local_ip = 192.168.24.1/24
undercloud_public_host = 192.168.24.2
undercloud_admin_host = 192.168.24.3
undercloud_ntp_servers=clock.redhat.com,clock2.redhat.com
undercloud_nameservers=10.11.5.19
..............

Yuri

Comment 3 Alex Schultz 2020-02-21 22:04:11 UTC
No, it's not. You should use more than one NTP server for the overcloud deployment.

Comment 4 Yuri Obshansky 2020-02-21 22:11:55 UTC
(In reply to Alex Schultz from comment #3)
> No, it's not. You should use more than one NTP server for the overcloud deployment.

Well, I have successfully deployed the IPv4 overcloud with only --ntp-server clock1.rdu2.redhat.com.
Of course, I can rerun the IPv6 overcloud deployment with --ntp-server clock1.rdu2.redhat.com,clock2.redhat.com.
I will update with the results.

Comment 5 Alex Schultz 2020-02-21 22:30:36 UTC
Just because it worked once doesn't mean it'll always work. We've had issues with the internal NTP servers, and it is recommended that you use more than one.

Comment 6 Yuri Obshansky 2020-02-21 22:53:59 UTC
(In reply to Alex Schultz from comment #5)
> Just because it worked once doesn't mean it'll always work. We've had issues
> with the internal NTP servers, and it is recommended that you use more than one.

It always works perfectly, except with IPv6.

It failed again with --ntp-server clock1.rdu2.redhat.com,clock2.redhat.com:
TASK [Ensure system is NTP time synced] ****************************************
Friday 21 February 2020  22:39:26 +0000 (0:00:00.907)       0:03:23.919 ******* 
fatal: [central-controller0-0]: FAILED! => {"changed": true, "cmd": ["chronyc", "waitsync", "20"], "delta": "0:03:10.198521", "end": "2020-02-21 17:42:36.896896", "msg": "non-zero return code", "rc": 1, "start": "2020-02-21 17:39:26.698375", "stderr": "", "stderr_lines": [], "stdout": "try: 1, refid: 00000000, correction: 0.000000000, skew: 0.000\ntry: 2, refid: 00000000, correction: 0.000000000, skew: 0.000\ntry: 3, refid: 00000000, correction: 0.000000000, skew: 0.000\ntry: 4, refid: 00000000, correction: 0.000000000, skew: 0.000\ntry: 5, refid: 00000000, correction: 0.000000001, skew: 0.000\ntry: 6, refid: 00000000, correction: 0.000000001, skew: 0.000\ntry: 7, refid: 00000000, correction: 0.000000001, skew: 0.000\ntry: 8, refid: 00000000, correction: 0.000000001, skew: 0.000\ntry: 9, refid: 00000000, correction: 0.000000001, skew: 0.000\ntry: 10, refid: 00000000, correction: 0.000000001, skew: 0.000\ntry: 11, refid: 00000000, correction: 0.000000001, skew: 0.000\ntry: 12, refid: 00000000, correction: 0.000000002, skew: 0.000\ntry: 13, refid: 00000000, correction: 0.000000002, skew: 0.000\ntry: 14, refid: 00000000, correction: 0.000000002, skew: 0.000\ntry: 15, refid: 00000000, correction: 0.000000002, skew: 0.000\ntry: 16, refid: 00000000, correction: 0.000000002, skew: 0.000\ntry: 17, refid: 00000000, correction: 0.000000002, skew: 0.000\ntry: 18, refid: 00000000, correction: 0.000000002, skew: 0.000\ntry: 19, refid: 00000000, correction: 0.000000002, skew: 0.000\ntry: 20, refid: 00000000, correction: 0.000000003, skew: 0.000", "stdout_lines": ["try: 1, refid: 00000000, correction: 0.000000000, skew: 0.000", "try: 2, refid: 00000000, correction: 0.000000000, skew: 0.000", "try: 3, refid: 00000000, correction: 0.000000000, skew: 0.000", "try: 4, refid: 00000000, correction: 0.000000000, skew: 0.000", "try: 5, refid: 00000000, correction: 0.000000001, skew: 0.000", "try: 6, refid: 00000000, correction: 0.000000001, skew: 0.000", "try: 7, refid: 00000000, 
correction: 0.000000001, skew: 0.000", "try: 8, refid: 00000000, correction: 0.000000001, skew: 0.000", "try: 9, refid: 00000000, correction: 0.000000001, skew: 0.000", "try: 10, refid: 00000000, correction: 0.000000001, skew: 0.000", "try: 11, refid: 00000000, correction: 0.000000001, skew: 0.000", "try: 12, refid: 00000000, correction: 0.000000002, skew: 0.000", "try: 13, refid: 00000000, correction: 0.000000002, skew: 0.000", "try: 14, refid: 00000000, correction: 0.000000002, skew: 0.000", "try: 15, refid: 00000000, correction: 0.000000002, skew: 0.000", "try: 16, refid: 00000000, correction: 0.000000002, skew: 0.000", "try: 17, refid: 00000000, correction: 0.000000002, skew: 0.000", "try: 18, refid: 00000000, correction: 0.000000002, skew: 0.000", "try: 19, refid: 00000000, correction: 0.000000002, skew: 0.000", "try: 20, refid: 00000000, correction: 0.000000003, skew: 0.000"]}

Something is wrong with IPv6. I cannot identify the root cause of the issue.

I have a running environment where you can get more information.
Ping me on IRC.

Yuri

Comment 7 Alex Schultz 2020-02-21 22:57:13 UTC
Right, because those aren't IPv6-capable systems. You'd have to use an IPv6 NTP server. This is an environmental/configuration issue.
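
One way to check this claim is to see whether the configured NTP hostnames resolve to any IPv6 address at all. The following is a minimal Python sketch under that assumption. The function name ipv6_addresses and the injectable resolver parameter are hypothetical, and a successful lookup only shows that an AAAA record exists, not that the server answers NTP over IPv6 from the overcloud nodes.

```python
import socket


def ipv6_addresses(host, resolver=socket.getaddrinfo):
    """Return the sorted IPv6 addresses that `host` resolves to (empty if none)."""
    try:
        # Port 123/UDP is NTP; AF_INET6 restricts the lookup to IPv6 results.
        infos = resolver(host, 123, socket.AF_INET6, socket.SOCK_DGRAM)
    except socket.gaierror:
        # No IPv6 address could be resolved for this name.
        return []
    # Each result is (family, type, proto, canonname, sockaddr);
    # for AF_INET6 the sockaddr starts with the address string.
    return sorted({info[4][0] for info in infos})
```

Calling ipv6_addresses("clock1.rdu2.redhat.com") from an overcloud node and getting an empty list would be consistent with the explanation in this comment. A non-empty list would still need a follow-up check such as chronyc sources on the node.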

Comment 8 Yuri Obshansky 2020-02-21 23:30:54 UTC
I don't agree with you.
According to the documentation, it should be an IPv4 address:
https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.0/html-single/ipv6_networking_for_the_overcloud/index
$ openstack overcloud deploy --templates \
  -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation-v6.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/network-environment-v6.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/net-single-nic-with-vlans-v6.yaml \
  -e /home/stack/templates/network-environment.yaml \
  --ntp-server pool.ntp.org \
  [ADDITIONAL OPTIONS]

In OSP 13 it works OK:
(undercloud) [stack@site-undercloud-0 ~]$ ntpstat
synchronised to NTP server (10.11.160.238) at stratum 2
   time correct to within 109 ms
   polling server every 1024 s

chrony-3.4-1.el7.x86_64

In OSP 16 it does not work (chrony-3.5-1.el8.x86_64).

If you think it is not related to Openstack, let's escalate to another project.

Thank you
Yuri

Comment 11 Harald Jensås 2020-03-12 22:33:41 UTC
It looks like the network configuration is not correct:
  {\"ip_netmask\": \"169.254.169.254/32\", \"next_hop\": \"\"}
                                                          ^^ next_hop for the metadata route is empty.

We no longer use metadata over the network, see https://review.opendev.org/665373. So you should drop that route from your nic-config file.




2020-02-21 18:44:41,440 p=885 u=mistral |  ok: [central-controller0-0] => {
    "NetworkConfig_result.stderr_lines": [
        "+ '[' -n '{\"network_config\": [{\"addresses\": [{\"ip_netmask\": \"192.168.24.42/24\"}], \"dns_servers\": [\"10.11.5.19\", \"10.5.30.160\"], \"name\": \"nic1\", \"routes\": [{\"destination\": \"192.168.34.0/24\", \"nexthop\": \"192.168.24.254\"}, {\"destination\": \"192.168.44.0/24\", \"nexthop\": \"192.168.24.254\"}, {\"ip_netmask\": \"169.254.169.254/32\", \"next_hop\": \"\"}], \"type\": \"interface\", \"use_dhcp\": false}, {\"members\": [{\"mtu\": 1500, \"name\": \"nic2\", \"primary\": true, \"type\": \"interface\"}, {\"addresses\": [{\"ip_netmask\": \"fd00:fd00:fd00:3000::16e/64\"}], \"mtu\": 1500, \"routes\": [{\"destination\": \"fd00:fd00:fd00:3004::/64\", \"nexthop\": \"fd00:fd00:fd00:3000::1\"}, {\"destination\": \"fd00:fd00:fd00:3008::/64\", \"nexthop\": \"fd00:fd00:fd00:3000::1\"}], \"type\": \"vlan\", \"vlan_id\": 1183}, {\"addresses\": [{\"ip_netmask\": \"fd00:fd00:fd00:4000::71/64\"}], \"mtu\": 1500, \"routes\": [{\"destination\": \"fd00:fd00:fd00:4004::/64\", \"nexthop\": \"fd00:fd00:fd00:4000::1\"}, {\"destination\": \"fd00:fd00:fd00:4008::/64\", \"nexthop\": \"fd00:fd00:fd00:4000::1\"}], \"type\": \"vlan\", \"vlan_id\": 1188}, {\"addresses\": [{\"ip_netmask\": \"fd00:fd00:fd00:2000::3f9/64\"}], \"mtu\": 1500, \"routes\": [{\"destination\": \"fd00:fd00:fd00:2004::/64\", \"nexthop\": \"fd00:fd00:fd00:2000::1\"}, {\"destination\": \"fd00:fd00:fd00:2008::/64\", \"nexthop\": \"fd00:fd00:fd00:2000::1\"}], \"type\": \"vlan\", \"vlan_id\": 1185}, {\"addresses\": [{\"ip_netmask\": \"172.19.1.164/24\"}], \"mtu\": 1500, \"routes\": [{\"destination\": \"172.19.2.0/24\", \"nexthop\": \"172.19.1.254\"}, {\"destination\": \"172.19.3.0/24\", \"nexthop\": \"172.19.1.254\"}], \"type\": \"vlan\", \"vlan_id\": 1189}], \"mtu\": 1500, \"name\": \"br-isolated\", \"type\": \"ovs_bridge\", \"use_dhcp\": false}, {\"addresses\": [{\"ip_netmask\": \"2620:52:0:13b8:5054:ff:fe3e:89/64\"}], \"members\": [{\"name\": \"nic3\", \"primary\": true, \"type\": \"interface\"}], 
\"name\": \"br-ex\", \"routes\": [{\"ip_netmask\": \"::/0\", \"next_hop\": \"2620:52:0:13b8::fe\"}], \"type\": \"ovs_bridge\", \"use_dhcp\": false}]}' ']'",
        "+ '[' -z '' ']'",
        "+ trap configure_safe_defaults EXIT",
        "++ date +%Y-%m-%dT%H:%M:%S",
        "+ DATETIME=2020-02-21T13:44:17",
        "+ '[' -f /etc/os-net-config/config.json ']'",
        "+ mkdir -p /etc/os-net-config",
        "+ echo '{\"network_config\": [{\"addresses\": [{\"ip_netmask\": \"192.168.24.42/24\"}], \"dns_servers\": [\"10.11.5.19\", \"10.5.30.160\"], \"name\": \"nic1\", \"routes\": [{\"destination\": \"192.168.34.0/24\", \"nexthop\": \"192.168.24.254\"}, {\"destination\": \"192.168.44.0/24\", \"nexthop\": \"192.168.24.254\"}, {\"ip_netmask\": \"169.254.169.254/32\", \"next_hop\": \"\"}], \"type\": \"interface\", \"use_dhcp\": false}, {\"members\": [{\"mtu\": 1500, \"name\": \"nic2\", \"primary\": true, \"type\": \"interface\"}, {\"addresses\": [{\"ip_netmask\": \"fd00:fd00:fd00:3000::16e/64\"}], \"mtu\": 1500, \"routes\": [{\"destination\": \"fd00:fd00:fd00:3004::/64\", \"nexthop\": \"fd00:fd00:fd00:3000::1\"}, {\"destination\": \"fd00:fd00:fd00:3008::/64\", \"nexthop\": \"fd00:fd00:fd00:3000::1\"}], \"type\": \"vlan\", \"vlan_id\": 1183}, {\"addresses\": [{\"ip_netmask\": \"fd00:fd00:fd00:4000::71/64\"}], \"mtu\": 1500, \"routes\": [{\"destination\": \"fd00:fd00:fd00:4004::/64\", \"nexthop\": \"fd00:fd00:fd00:4000::1\"}, {\"destination\": \"fd00:fd00:fd00:4008::/64\", \"nexthop\": \"fd00:fd00:fd00:4000::1\"}], \"type\": \"vlan\", \"vlan_id\": 1188}, {\"addresses\": [{\"ip_netmask\": \"fd00:fd00:fd00:2000::3f9/64\"}], \"mtu\": 1500, \"routes\": [{\"destination\": \"fd00:fd00:fd00:2004::/64\", \"nexthop\": \"fd00:fd00:fd00:2000::1\"}, {\"destination\": \"fd00:fd00:fd00:2008::/64\", \"nexthop\": \"fd00:fd00:fd00:2000::1\"}], \"type\": \"vlan\", \"vlan_id\": 1185}, {\"addresses\": [{\"ip_netmask\": \"172.19.1.164/24\"}], \"mtu\": 1500, \"routes\": [{\"destination\": \"172.19.2.0/24\", \"nexthop\": \"172.19.1.254\"}, {\"destination\": \"172.19.3.0/24\", \"nexthop\": \"172.19.1.254\"}], \"type\": \"vlan\", \"vlan_id\": 1189}], \"mtu\": 1500, \"name\": \"br-isolated\", \"type\": \"ovs_bridge\", \"use_dhcp\": false}, {\"addresses\": [{\"ip_netmask\": \"2620:52:0:13b8:5054:ff:fe3e:89/64\"}], \"members\": [{\"name\": \"nic3\", \"primary\": true, \"type\": \"interface\"}], 
\"name\": \"br-ex\", \"routes\": [{\"ip_netmask\": \"::/0\", \"next_hop\": \"2620:52:0:13b8::fe\"}], \"type\": \"ovs_bridge\", \"use_dhcp\": false}]}'",
        "++ type -t network_config_hook",
        "+ '[' '' = function ']'",
        "+ sed -i 's/: \"bridge_name/: \"br-ex/' /etc/os-net-config/config.json",
        "+ sed -i s/interface_name/nic1/ /etc/os-net-config/config.json",
        "+ set +e",
        "+ os-net-config -c /etc/os-net-config/config.json -v --detailed-exit-codes",
        "[2020/02/21 01:44:17 PM] [INFO] Using config file at: /etc/os-net-config/config.json",
        "[2020/02/21 01:44:17 PM] [INFO] Ifcfg net config provider created.",
        "[2020/02/21 01:44:17 PM] [INFO] Not using any mapping file.",
        "[2020/02/21 01:44:18 PM] [WARNING] Config file failed schema validation at network_config/0:",
        "    {'addresses': [{'ip_netmask': '192.168.24.42/24'}], 'dns_servers': ['10.11.5.19', '10.5.30.160'], 'name': 'nic1', 'routes': [{'destination': '192.168.34.0/24', 'nexthop': '192.168.24.254'}, {'destination': '192.168.44.0/24', 'nexthop': '192.168.24.254'}, {'ip_netmask': '169.254.169.254/32', 'next_hop': ''}], 'type': 'interface', 'use_dhcp': False, 'nic_mapping': None, 'persist_mapping': False} is not valid under any of the given schemas",
        "  Sub-schemas tested and not matching:",
        "  - items/oneOf/interface/routes/items/oneOf: {'ip_netmask': '169.254.169.254/32', 'next_hop': ''} is not valid under any of the given schemas",
        "  -- items/oneOf/interface/routes/items/oneOf/0/next_hop/oneOf: '' is not valid under any of the given schemas",
        "  --- items/oneOf/interface/routes/items/oneOf/0/next_hop/oneOf/ip_address_string/oneOf: '' is not valid under any of the given schemas",
        "  ---- items/oneOf/interface/routes/items/oneOf/0/next_hop/oneOf/ip_address_string/oneOf/ipv4_address_string/pattern: '' does not match '^((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?).){3}(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$'",
        "  ---- items/oneOf/interface/routes/items/oneOf/0/next_hop/oneOf/ip_address_string/oneOf/ipv6_address_string/pattern: '' does not match '^(([0-9a-fA-F]{1,4}:){7,7}[0-9a-fA-F]{1,4}|([0-9a-fA-F]{1,4}:){1,7}:|([0-9a-fA-F]{1,4}:){1,6}:[0-9a-fA-F]{1,4}|([0-9a-fA-F]{1,4}:){1,5}(:[0-9a-fA-F]{1,4}){1,2}|([0-9a-fA-F]{1,4}:){1,4}(:[0-9a-fA-F]{1,4}){1,3}|([0-9a-fA-F]{1,4}:){1,3}(:[0-9a-fA-F]{1,4}){1,4}|([0-9a-fA-F]{1,4}:){1,2}(:[0-9a-fA-F]{1,4}){1,5}|[0-9a-fA-F]{1,4}:((:[0-9a-fA-F]{1,4}){1,6})|:((:[0-9a-fA-F]{1,4}){1,7}|:))$'",
        "  --- items/oneOf/interface/routes/items/oneOf/0/next_hop/oneOf/param/oneOf: '' is not valid under any of the given schemas",
        "  ---- items/oneOf/interface/routes/items/oneOf/0/next_hop/oneOf/param/oneOf/0/type: '' is not of type 'object'",
        "  ---- items/oneOf/interface/routes/items/oneOf/0/next_hop/oneOf/param/oneOf/1/type: '' is not of type 'object'",
        "  -- items/oneOf/interface/routes/items/oneOf/1/additionalProperties: Additional properties are not allowed ('next_hop', 'ip_netmask' were unexpected)",
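
The schema warning above comes from a single route whose next hop is an empty string. The following is a minimal Python sketch, not part of os-net-config, for spotting such routes in a rendered config before deployment. The function name routes_with_empty_next_hop is hypothetical, and the sample is trimmed from the config shown in the log.

```python
import json


def routes_with_empty_next_hop(config):
    """Yield (entry name, route) pairs whose next hop is missing or empty."""
    def walk(entries):
        for entry in entries:
            for route in entry.get("routes", []):
                # The logged config uses both spellings: "next_hop" and "nexthop".
                hop = route.get("next_hop", route.get("nexthop"))
                if not hop:
                    yield entry.get("name", entry.get("type", "?")), route
            # Bridges carry nested interfaces and VLANs under "members".
            yield from walk(entry.get("members", []))

    yield from walk(config.get("network_config", []))


if __name__ == "__main__":
    sample = json.loads("""
    {"network_config": [{"name": "nic1", "type": "interface", "routes": [
        {"destination": "192.168.34.0/24", "nexthop": "192.168.24.254"},
        {"ip_netmask": "169.254.169.254/32", "next_hop": ""}]}]}
    """)
    for name, route in routes_with_empty_next_hop(sample):
        print(name, route)
```

On a deployed node the same function could be pointed at the JSON loaded from /etc/os-net-config/config.json, the path used in the log above.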

