Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1677607

Summary: Deployment of OSP14 with skydive environment failed
Product: Red Hat OpenStack
Reporter: Roman Safronov <rsafrono>
Component: Skydive
Assignee: safchain
Status: CLOSED DUPLICATE
QA Contact: safchain
Severity: medium
Docs Contact:
Priority: unspecified
Version: 14.0 (Rocky)
CC: mkaliyam, nplanel, safchain, sbaubeau
Target Milestone: ---
Keywords: Triaged, ZStream
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2019-04-12 06:44:15 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Roman Safronov 2019-02-15 10:57:41 UTC
Description of problem:
Deployment of OSP14 with skydive environment failed. 
Link to CI build https://rhos-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/OSPD-Customized-Deployment-virt/8887/

Version-Release number of selected component (if applicable):
Puddle: 14.0-RHEL-7/2019-01-17.2


How reproducible:
Always. I reran the overcloud_deploy.sh script with the skydive environment and got the same result.

Steps to Reproduce:
1. Run the deployment, passing /usr/share/openstack-tripleo-heat-templates/environments/services/skydive-environment.yaml as an additional environment file to 'openstack overcloud deploy' (as described in https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/14/html/use_skydive_for_openstack_network_analysis/install-skydive).

Actual results:
The overcloud deployment fails. From overcloud_install.log:


"TASK [skydive_analyzer : Install skydive analyzer package] *********************", "Wednesday 13 February 2019  22:07:52 -0500 (0:00:00.068)       0:01:51.242 **** ", "skipping: [controller-1]", "", "Overcloud configuration failed.
TASK [skydive_analyzer : Start service] ****************************************", "Wednesday 13 February 2019  22:07:52 -0500 (0:00:00.066)       0:01:51.308 **** ", "skipping: [controller-1]", "", "TASK [skydive_analyzer : include_role] *****************************************", "Wednesday 13 February 2019  22:07:52 -0500 (0:00:00.073)       0:01:51.382 **** ", "skipping: [controller-1]", "", "TASK [skydive_analyzer : include_role] *****************************************", "Wednesday 13 February 2019  22:07:52 -0500 (0:00:00.067)       0:01:51.449 **** ", "", "TASK [skydive_common : Ensure config file permissions] *************************", "Wednesday 13 February 2019  22:07:52 -0500 (0:00:00.131)       0:01:51.581 **** ", "changed: [controller-1]", "", "TASK [skydive_analyzer : Check API status, retrieve token from login page] *****", "Wednesday 13 February 2019  22:07:52 -0500 (0:00:00.298)       0:01:51.879 **** ", "FAILED - RETRYING: Check API status, retrieve token from login page (10 retries left).", "FAILED - RETRYING: Check API status, retrieve token from login page (9 retries left).", "FAILED - RETRYING: Check API status, retrieve token from login page (8 retries left).", "FAILED - RETRYING: Check API status, retrieve token from login page (7 retries left).", "FAILED - RETRYING: Check API status, retrieve token from login page (6 retries left).", "FAILED - RETRYING: Check API status, retrieve token from login page (5 retries left).", "FAILED - RETRYING: Check API status, retrieve token from login page (4 retries left).", "FAILED - RETRYING: Check API status, retrieve token from login page (3 retries left).", "FAILED - RETRYING: Check API status, retrieve token from login page (2 retries left).", "FAILED - RETRYING: Check API status, retrieve token from login page (1 retries left).", "fatal: [controller-1]: FAILED! 
=> {\"attempts\": 10, \"changed\": false, \"content\": \"\", \"msg\": \"Status code was -1 and not [200]: Request failed: <urlopen error [Errno 111] Connection refused>\", \"redirected\": false, \"status\": -1, \"url\": \"http://192.168.24.8:8082/login\"}", "", "PLAY RECAP *********************************************************************", "compute-0                  : ok=23   changed=14   unreachable=0    failed=0   ", "compute-1                  : ok=23   changed=14   unreachable=0    failed=0   ", "controller-0               : ok=49   changed=26   unreachable=0    failed=1   ", "controller-1               : ok=61   changed=35   unreachable=0    failed=1   ", "controller-2               : ok=49   changed=26   unreachable=0    failed=1   ", "localhost                  : ok=1    changed=0    unreachable=0    failed=0   ", "", "Wednesday 13 February 2019  22:11:17 -0500 (0:03:24.334)       0:05:16.213 **** ", "=============================================================================== "]}
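The failing task polls the analyzer's /login URL and gives up after 10 attempts with "Connection refused". To repeat that check by hand, something like the following sketch may help. It is not the Ansible role's implementation: the function name `wait_for_login_page`, the `delay` value, and the `opener`/`sleep` hooks are assumptions made for illustration.

```python
import time
import urllib.error
import urllib.request


def wait_for_login_page(url, attempts=10, delay=5.0,
                        opener=urllib.request.urlopen, sleep=time.sleep):
    """Poll url until it answers HTTP 200.

    Returns True on the first 200 response, False once every attempt failed.
    The retry count mirrors the 10 attempts in the log; the delay between
    attempts is an assumed value, not taken from the role.
    """
    for attempt in range(1, attempts + 1):
        try:
            with opener(url, timeout=5) as response:
                if response.status == 200:
                    return True
        except (urllib.error.URLError, OSError) as exc:
            # "Connection refused" surfaces here as a URLError.
            print(f"attempt {attempt}/{attempts} failed: {exc}")
        if attempt < attempts:
            sleep(delay)
    return False
```

For example, `wait_for_login_page("http://192.168.24.8:8082/login")` run from a node on the control plane network should return False for as long as the analyzer is not listening on port 8082.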


NO MORE HOSTS LEFT *************************************************************

PLAY RECAP *********************************************************************
compute-0                  : ok=167  changed=71   unreachable=0    failed=0   
compute-1                  : ok=168  changed=71   unreachable=0    failed=0   
controller-0               : ok=216  changed=95   unreachable=0    failed=0   
controller-1               : ok=210  changed=94   unreachable=0    failed=0   
controller-2               : ok=210  changed=94   unreachable=0    failed=0   
undercloud                 : ok=16   changed=12   unreachable=0    failed=1   

Wednesday 13 February 2019  22:11:18 -0500 (0:05:19.147)       0:36:52.397 **** 
=============================================================================== 
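When comparing several CI runs, the PLAY RECAP rows can be scanned for hosts that report failures. The sketch below assumes the recap format shown above; `failed_hosts` is a hypothetical helper name, not part of any TripleO tooling.

```python
import re

# Matches Ansible PLAY RECAP rows such as
# "controller-1 : ok=61 changed=35 unreachable=0 failed=1"
RECAP_ROW = re.compile(
    r"^(?P<host>\S+)\s*:\s*ok=(?P<ok>\d+)\s+changed=(?P<changed>\d+)\s+"
    r"unreachable=(?P<unreachable>\d+)\s+failed=(?P<failed>\d+)"
)


def failed_hosts(recap_text):
    """Return hosts whose recap row has failed > 0 or unreachable > 0."""
    hosts = []
    for line in recap_text.splitlines():
        match = RECAP_ROW.match(line.strip())
        if match and (int(match["failed"]) or int(match["unreachable"])):
            hosts.append(match["host"])
    return hosts
```

Applied to the outer recap above, this reports only `undercloud`, because the controller failures are recorded in the nested config-download run rather than in the outer play.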

The agent containers are running on all nodes (as expected).
The analyzer container is running on only 1 controller node.

Note: I tried to deploy an HA environment with 3 controllers and 2 compute nodes.


Expected results:
Overcloud deployment succeeds


Additional info:
The OVN networking team plans to use Skydive in CI for automated testing, so this is effectively a blocker for us.

Comment 1 safchain 2019-02-15 16:20:41 UTC
Can you provide the skydive-analyzer log? It would help us understand what happened.

Comment 2 Roman Safronov 2019-02-16 14:18:35 UTC
I noticed that the skydive-analyzer container restarts continuously:


[heat-admin@controller-2 ~]$ sudo docker logs skydive-analyzer-controller-2
2019-02-16T14:14:48.509Z	INFO	analyzer/analyzer.go:47 glob..func1	controller-2: Skydive Analyzer 0.20.1 starting...
2019-02-16 14:14:48.510755 I | embed: listening for peers on http://127.0.0.1:12380
2019-02-16 14:14:48.510806 I | embed: listening for peers on http://172.31.0.2:12380
2019-02-16 14:14:48.510847 I | embed: listening for peers on http://[::1]:12380
2019-02-16 14:14:48.510901 I | embed: listening for client requests on 127.0.0.1:12379
2019-02-16 14:14:48.510939 I | embed: listening for client requests on 172.31.0.2:12379
2019-02-16 14:14:48.510977 I | embed: listening for client requests on [::1]:12379
2019-02-16 14:14:48.514786 I | etcdserver: name = controller-2
2019-02-16 14:14:48.514805 I | etcdserver: data dir = /var/lib/skydive/etcd
2019-02-16 14:14:48.514811 I | etcdserver: member dir = /var/lib/skydive/etcd/member
2019-02-16 14:14:48.514816 I | etcdserver: heartbeat = 100ms
2019-02-16 14:14:48.514820 I | etcdserver: election = 1000ms
2019-02-16 14:14:48.514824 I | etcdserver: snapshot count = 100000
2019-02-16 14:14:48.514836 I | etcdserver: advertise client URLs = http://127.0.0.1:12379,http://172.31.0.2:12379,http://[::1]:12379
2019-02-16 14:14:48.571888 I | etcdserver: restarting member 1f54d8ea95d424d9 in cluster 11190c3d8223db46 at commit index 3
2019-02-16 14:14:48.571962 I | raft: 1f54d8ea95d424d9 became follower at term 72877
2019-02-16 14:14:48.571988 I | raft: newRaft 1f54d8ea95d424d9 [peers: [], term: 72877, commit: 3, applied: 0, lastindex: 3, lastterm: 1]
2019-02-16 14:14:48.575418 W | auth: simple token is not cryptographically signed
2019-02-16 14:14:48.577209 I | etcdserver: starting server... [version: 3.2.15+git, cluster version: to_be_decided]
2019-02-16 14:14:48.577817 I | etcdserver/membership: added member b20636ba3b9b6e2 [http://192.168.24.21:12380] to cluster 11190c3d8223db46
2019-02-16 14:14:48.577854 I | rafthttp: starting peer b20636ba3b9b6e2...
2019-02-16 14:14:48.577891 I | rafthttp: started HTTP pipelining with peer b20636ba3b9b6e2
2019-02-16 14:14:48.579388 I | rafthttp: started streaming with peer b20636ba3b9b6e2 (writer)
2019-02-16 14:14:48.581590 I | rafthttp: started peer b20636ba3b9b6e2
2019-02-16 14:14:48.581684 I | rafthttp: added peer b20636ba3b9b6e2
2019-02-16 14:14:48.583149 I | rafthttp: started streaming with peer b20636ba3b9b6e2 (stream MsgApp v2 reader)
2019-02-16 14:14:48.583182 I | rafthttp: started streaming with peer b20636ba3b9b6e2 (stream Message reader)
2019-02-16 14:14:48.583237 I | rafthttp: started streaming with peer b20636ba3b9b6e2 (writer)
2019-02-16 14:14:48.583550 I | etcdserver/membership: added member 1f54d8ea95d424d9 [http://192.168.24.15:12380] to cluster 11190c3d8223db46
2019-02-16 14:14:48.583714 I | etcdserver/membership: added member bb1dbff13bcb9fae [http://192.168.24.12:12380] to cluster 11190c3d8223db46
2019-02-16 14:14:48.583741 I | rafthttp: starting peer bb1dbff13bcb9fae...
2019-02-16 14:14:48.583774 I | rafthttp: started HTTP pipelining with peer bb1dbff13bcb9fae
2019-02-16 14:14:48.586027 I | rafthttp: started streaming with peer bb1dbff13bcb9fae (writer)
2019-02-16 14:14:48.587157 I | rafthttp: started streaming with peer bb1dbff13bcb9fae (writer)
2019-02-16 14:14:48.589083 I | rafthttp: started peer bb1dbff13bcb9fae
2019-02-16 14:14:48.589125 I | rafthttp: added peer bb1dbff13bcb9fae
2019-02-16 14:14:48.589166 I | rafthttp: started streaming with peer bb1dbff13bcb9fae (stream MsgApp v2 reader)
2019-02-16 14:14:48.589369 I | rafthttp: started streaming with peer bb1dbff13bcb9fae (stream Message reader)
2019-02-16 14:14:48.872442 I | raft: 1f54d8ea95d424d9 is starting a new election at term 72877
2019-02-16 14:14:48.872544 I | raft: 1f54d8ea95d424d9 became candidate at term 72878
2019-02-16 14:14:48.872561 I | raft: 1f54d8ea95d424d9 received MsgVoteResp from 1f54d8ea95d424d9 at term 72878
2019-02-16 14:14:48.872572 I | raft: 1f54d8ea95d424d9 [logterm: 1, index: 3] sent MsgVote request to b20636ba3b9b6e2 at term 72878
2019-02-16 14:14:48.872603 I | raft: 1f54d8ea95d424d9 [logterm: 1, index: 3] sent MsgVote request to bb1dbff13bcb9fae at term 72878
2019-02-16 14:14:49.872447 I | raft: 1f54d8ea95d424d9 is starting a new election at term 72878
2019-02-16 14:14:49.872488 I | raft: 1f54d8ea95d424d9 became candidate at term 72879
2019-02-16 14:14:49.872498 I | raft: 1f54d8ea95d424d9 received MsgVoteResp from 1f54d8ea95d424d9 at term 72879
2019-02-16 14:14:49.872520 I | raft: 1f54d8ea95d424d9 [logterm: 1, index: 3] sent MsgVote request to b20636ba3b9b6e2 at term 72879
2019-02-16 14:14:49.872530 I | raft: 1f54d8ea95d424d9 [logterm: 1, index: 3] sent MsgVote request to bb1dbff13bcb9fae at term 72879
2019-02-16 14:14:51.672477 I | raft: 1f54d8ea95d424d9 is starting a new election at term 72879
2019-02-16 14:14:51.672519 I | raft: 1f54d8ea95d424d9 became candidate at term 72880
2019-02-16 14:14:51.672534 I | raft: 1f54d8ea95d424d9 received MsgVoteResp from 1f54d8ea95d424d9 at term 72880
2019-02-16 14:14:51.672547 I | raft: 1f54d8ea95d424d9 [logterm: 1, index: 3] sent MsgVote request to bb1dbff13bcb9fae at term 72880
2019-02-16 14:14:51.672555 I | raft: 1f54d8ea95d424d9 [logterm: 1, index: 3] sent MsgVote request to b20636ba3b9b6e2 at term 72880
2019-02-16 14:14:53.275480 I | raft: 1f54d8ea95d424d9 is starting a new election at term 72880
2019-02-16 14:14:53.275526 I | raft: 1f54d8ea95d424d9 became candidate at term 72881
2019-02-16 14:14:53.275541 I | raft: 1f54d8ea95d424d9 received MsgVoteResp from 1f54d8ea95d424d9 at term 72881
2019-02-16 14:14:53.275551 I | raft: 1f54d8ea95d424d9 [logterm: 1, index: 3] sent MsgVote request to b20636ba3b9b6e2 at term 72881
2019-02-16 14:14:53.275560 I | raft: 1f54d8ea95d424d9 [logterm: 1, index: 3] sent MsgVote request to bb1dbff13bcb9fae at term 72881
2019-02-16 14:14:53.583403 W | rafthttp: health check for peer b20636ba3b9b6e2 could not connect: <nil>
2019-02-16 14:14:53.589346 W | rafthttp: health check for peer bb1dbff13bcb9fae could not connect: <nil>
2019-02-16 14:14:54.872425 I | raft: 1f54d8ea95d424d9 is starting a new election at term 72881
2019-02-16 14:14:54.872486 I | raft: 1f54d8ea95d424d9 became candidate at term 72882
2019-02-16 14:14:54.872520 I | raft: 1f54d8ea95d424d9 received MsgVoteResp from 1f54d8ea95d424d9 at term 72882
2019-02-16 14:14:54.872548 I | raft: 1f54d8ea95d424d9 [logterm: 1, index: 3] sent MsgVote request to b20636ba3b9b6e2 at term 72882
2019-02-16 14:14:54.872568 I | raft: 1f54d8ea95d424d9 [logterm: 1, index: 3] sent MsgVote request to bb1dbff13bcb9fae at term 72882
2019-02-16 14:14:55.577655 E | etcdserver: publish error: etcdserver: request timed out
2019-02-16 14:14:56.072446 I | raft: 1f54d8ea95d424d9 is starting a new election at term 72882
2019-02-16 14:14:56.072489 I | raft: 1f54d8ea95d424d9 became candidate at term 72883
2019-02-16 14:14:56.072504 I | raft: 1f54d8ea95d424d9 received MsgVoteResp from 1f54d8ea95d424d9 at term 72883
2019-02-16 14:14:56.072524 I | raft: 1f54d8ea95d424d9 [logterm: 1, index: 3] sent MsgVote request to b20636ba3b9b6e2 at term 72883
2019-02-16 14:14:56.072546 I | raft: 1f54d8ea95d424d9 [logterm: 1, index: 3] sent MsgVote request to bb1dbff13bcb9fae at term 72883
2019-02-16 14:14:57.672456 I | raft: 1f54d8ea95d424d9 is starting a new election at term 72883
2019-02-16 14:14:57.672504 I | raft: 1f54d8ea95d424d9 became candidate at term 72884
2019-02-16 14:14:57.672515 I | raft: 1f54d8ea95d424d9 received MsgVoteResp from 1f54d8ea95d424d9 at term 72884
2019-02-16 14:14:57.672526 I | raft: 1f54d8ea95d424d9 [logterm: 1, index: 3] sent MsgVote request to b20636ba3b9b6e2 at term 72884
2019-02-16 14:14:57.672537 I | raft: 1f54d8ea95d424d9 [logterm: 1, index: 3] sent MsgVote request to bb1dbff13bcb9fae at term 72884
2019-02-16 14:14:58.583773 W | rafthttp: health check for peer b20636ba3b9b6e2 could not connect: dial tcp 192.168.24.21:12380: i/o timeout
2019-02-16 14:14:58.589607 W | rafthttp: health check for peer bb1dbff13bcb9fae could not connect: dial tcp 192.168.24.12:12380: i/o timeout
2019-02-16 14:14:59.072499 I | raft: 1f54d8ea95d424d9 is starting a new election at term 72884
2019-02-16 14:14:59.072549 I | raft: 1f54d8ea95d424d9 became candidate at term 72885
2019-02-16 14:14:59.072562 I | raft: 1f54d8ea95d424d9 received MsgVoteResp from 1f54d8ea95d424d9 at term 72885
2019-02-16 14:14:59.072573 I | raft: 1f54d8ea95d424d9 [logterm: 1, index: 3] sent MsgVote request to b20636ba3b9b6e2 at term 72885
2019-02-16 14:14:59.072606 I | raft: 1f54d8ea95d424d9 [logterm: 1, index: 3] sent MsgVote request to bb1dbff13bcb9fae at term 72885
2019-02-16 14:15:00.272436 I | raft: 1f54d8ea95d424d9 is starting a new election at term 72885
2019-02-16 14:15:00.272495 I | raft: 1f54d8ea95d424d9 became candidate at term 72886
2019-02-16 14:15:00.272511 I | raft: 1f54d8ea95d424d9 received MsgVoteResp from 1f54d8ea95d424d9 at term 72886
2019-02-16 14:15:00.272522 I | raft: 1f54d8ea95d424d9 [logterm: 1, index: 3] sent MsgVote request to b20636ba3b9b6e2 at term 72886
2019-02-16 14:15:00.272531 I | raft: 1f54d8ea95d424d9 [logterm: 1, index: 3] sent MsgVote request to bb1dbff13bcb9fae at term 72886
2019-02-16 14:15:02.072437 I | raft: 1f54d8ea95d424d9 is starting a new election at term 72886
2019-02-16 14:15:02.072491 I | raft: 1f54d8ea95d424d9 became candidate at term 72887
2019-02-16 14:15:02.072507 I | raft: 1f54d8ea95d424d9 received MsgVoteResp from 1f54d8ea95d424d9 at term 72887
2019-02-16 14:15:02.072530 I | raft: 1f54d8ea95d424d9 [logterm: 1, index: 3] sent MsgVote request to bb1dbff13bcb9fae at term 72887
2019-02-16 14:15:02.072551 I | raft: 1f54d8ea95d424d9 [logterm: 1, index: 3] sent MsgVote request to b20636ba3b9b6e2 at term 72887
2019-02-16 14:15:02.577934 E | etcdserver: publish error: etcdserver: request timed out
2019-02-16 14:15:03.584059 W | rafthttp: health check for peer b20636ba3b9b6e2 could not connect: dial tcp 192.168.24.21:12380: i/o timeout
2019-02-16 14:15:03.589775 W | rafthttp: health check for peer bb1dbff13bcb9fae could not connect: dial tcp 192.168.24.12:12380: i/o timeout
2019-02-16 14:15:03.972454 I | raft: 1f54d8ea95d424d9 is starting a new election at term 72887
2019-02-16 14:15:03.972494 I | raft: 1f54d8ea95d424d9 became candidate at term 72888
2019-02-16 14:15:03.972516 I | raft: 1f54d8ea95d424d9 received MsgVoteResp from 1f54d8ea95d424d9 at term 72888
2019-02-16 14:15:03.972529 I | raft: 1f54d8ea95d424d9 [logterm: 1, index: 3] sent MsgVote request to b20636ba3b9b6e2 at term 728
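The log above shows the member starting one election after another, receiving a vote only from itself, and timing out on health checks to both peers. A rough way to summarize such a log is sketched below. The regular expressions are based only on the line formats visible here, and `summarize_etcd_log` is a hypothetical helper name.

```python
import re
from collections import Counter

ELECTION = re.compile(r"raft: (\w+) is starting a new election at term (\d+)")
VOTE_RESP = re.compile(r"raft: (\w+) received MsgVoteResp from (\w+) at term (\d+)")
PEER_DOWN = re.compile(r"rafthttp: health check for peer (\w+) could not connect: (.+)$")


def summarize_etcd_log(lines):
    """Count elections, vote responders, and failed peer health checks."""
    terms = []
    responders = set()
    peer_failures = Counter()
    for line in lines:
        match = ELECTION.search(line)
        if match:
            terms.append(int(match.group(2)))
            continue
        match = VOTE_RESP.search(line)
        if match:
            responders.add(match.group(2))
            continue
        match = PEER_DOWN.search(line)
        if match:
            peer_failures[match.group(1)] += 1
    return {
        "elections": len(terms),
        "first_term": terms[0] if terms else None,
        "last_term": terms[-1] if terms else None,
        "vote_responders": sorted(responders),
        "peer_failures": dict(peer_failures),
    }
```

If `vote_responders` contains only the local member ID while `peer_failures` lists every other member, the node cannot reach its etcd peers, which is consistent with the election loop seen here.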

Comment 3 safchain 2019-02-19 13:00:30 UTC
I suspect a connection issue with the embedded etcd cluster. I am working on figuring out why.

Comment 4 safchain 2019-02-20 14:15:59 UTC
The etcd cluster connection issue is confirmed. A fix has been developed and will be backported.

Comment 5 safchain 2019-02-21 15:31:46 UTC
The following parameters could help:

~~~
parameter_defaults:
  SkydiveVars:
    analyzers:
       skydive_analyzer_docker_extra_env: "--net=host"
  ControllerExtraConfig:
    tripleo::firewall::firewall_rules:
      '600 allow skydive etcd':
        dport:
          - 12379
          - 12380
~~~
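After redeploying with these parameters, it may be worth confirming that the embedded etcd ports (12379 and 12380) are reachable between controllers. The following is a minimal sketch using plain TCP connects; `port_open` and `check_peers` are hypothetical helper names, and the peer addresses must be supplied by the caller.

```python
import socket

# Client and peer ports of the embedded etcd, as listed in the firewall rule.
ETCD_PORTS = (12379, 12380)


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_peers(peers, ports=ETCD_PORTS, timeout=3.0):
    """Map each (host, port) pair to whether it accepted a connection."""
    return {
        (host, port): port_open(host, port, timeout)
        for host in peers
        for port in ports
    }
```

For example, running `check_peers(["192.168.24.21", "192.168.24.12"])` on controller-2 would test the two peer addresses from the earlier analyzer log. A False entry points at a firewall or container networking problem rather than at etcd itself.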

Comment 6 Roman Safronov 2019-02-21 23:16:58 UTC
I tried to redeploy with the specified parameters, but the deployment still fails:

https://rhos-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/OSPD-Customized-Deployment-virt/8953/

e] *********************", "Thursday 21 February 2019  17:35:08 -0500 (0:00:00.073)       0:01:51.826 ***** ", "skipping: [controller-2]", "", "TASK [skydOvercloud configuration failed.
ive_analyzer : Start service] ****************************************", "Thursday 21 February 2019  17:35:08 -0500 (0:00:00.074)       0:01:51.901 ***** ", "skipping: [controller-2]", "", "TASK [skydive_analyzer : include_role] *****************************************", "Thursday 21 February 2019  17:35:08 -0500 (0:00:00.064)       0:01:51.966 ***** ", "skipping: [controller-2]", "", "TASK [skydive_analyzer : include_role] *****************************************", "Thursday 21 February 2019  17:35:08 -0500 (0:00:00.068)       0:01:52.034 ***** ", "", "TASK [skydive_common : Ensure config file permissions] *************************", "Thursday 21 February 2019  17:35:08 -0500 (0:00:00.130)       0:01:52.165 ***** ", "changed: [controller-2]", "", "TASK [skydive_analyzer : Check API status, retrieve token from login page] *****", "Thursday 21 February 2019  17:35:08 -0500 (0:00:00.300)       0:01:52.465 ***** ", "FAILED - RETRYING: Check API status, retrieve token from login page (10 retries left).", "FAILED - RETRYING: Check API status, retrieve token from login page (9 retries left).", "FAILED - RETRYING: Check API status, retrieve token from login page (8 retries left).", "FAILED - RETRYING: Check API status, retrieve token from login page (7 retries left).", "FAILED - RETRYING: Check API status, retrieve token from login page (6 retries left).", "FAILED - RETRYING: Check API status, retrieve token from login page (5 retries left).", "FAILED - RETRYING: Check API status, retrieve token from login page (4 retries left).", "FAILED - RETRYING: Check API status, retrieve token from login page (3 retries left).", "FAILED - RETRYING: Check API status, retrieve token from login page (2 retries left).", "FAILED - RETRYING: Check API status, retrieve token from login page (1 retries left).", "fatal: [controller-2]: FAILED! 
=> {\"attempts\": 10, \"changed\": false, \"content\": \"\", \"msg\": \"Status code was -1 and not [200]: Request failed: <urlopen error [Errno 111] Connection refused>\", \"redirected\": false, \"status\": -1, \"url\": \"http://192.168.24.9:8082/login\"}", "", "PLAY RECAP *********************************************************************", "compute-0                  : ok=23   changed=14   unreachable=0    failed=0   ", "compute-1                  : ok=23   changed=14   unreachable=0    failed=0   ", "controller-0               : ok=49   changed=26   unreachable=0    failed=1   ", "controller-1               : ok=49   changed=26   unreachable=0    failed=1   ", "controller-2               : ok=61   changed=35   unreachable=0    failed=1   ", "localhost                  : ok=1    changed=0    unreachable=0    failed=0   ", "", "Thursday 21 February 2019  17:38:33 -0500 (0:03:24.533)       0:05:16.999 ***** ", "=============================================================================== "]}


The analyzer container still restarts periodically:


[heat-admin@controller-2 ~]$ sudo docker logs skydive-analyzer-controller-2
2019-02-21T22:59:57.895Z	INFO	analyzer/analyzer.go:47 glob..func1	controller-2: Skydive Analyzer 0.20.1 starting...
2019-02-21 22:59:57.898460 I | embed: listening for peers on http://127.0.0.1:12380
2019-02-21 22:59:57.898525 I | embed: listening for peers on http://192.168.24.9:12380
2019-02-21 22:59:57.898581 I | embed: listening for peers on http://10.0.0.115:12380
2019-02-21 22:59:57.898632 I | embed: listening for peers on http://172.17.2.11:12380
2019-02-21 22:59:57.898677 I | embed: listening for peers on http://172.17.1.18:12380
2019-02-21 22:59:57.898724 I | embed: listening for peers on http://172.17.1.26:12380
2019-02-21 22:59:57.898770 I | embed: listening for peers on http://172.17.3.16:12380
2019-02-21 22:59:57.898817 I | embed: listening for peers on http://172.17.4.12:12380
2019-02-21 22:59:57.898860 I | embed: listening for peers on http://172.17.4.28:12380
2019-02-21 22:59:57.898904 I | embed: listening for peers on http://172.31.0.1:12380
2019-02-21 22:59:57.898960 I | embed: listening for peers on http://[::1]:12380
2019-02-21 22:59:57.899075 I | embed: listening for client requests on 127.0.0.1:12379
2019-02-21 22:59:57.899150 I | embed: listening for client requests on 192.168.24.9:12379
2019-02-21 22:59:57.899198 I | embed: listening for client requests on 10.0.0.115:12379
2019-02-21 22:59:57.899244 I | embed: listening for client requests on 172.17.2.11:12379
2019-02-21 22:59:57.899288 I | embed: listening for client requests on 172.17.1.18:12379
2019-02-21 22:59:57.899334 I | embed: listening for client requests on 172.17.1.26:12379
2019-02-21 22:59:57.899379 I | embed: listening for client requests on 172.17.3.16:12379
2019-02-21 22:59:57.899423 I | embed: listening for client requests on 172.17.4.12:12379
2019-02-21 22:59:57.899470 I | embed: listening for client requests on 172.17.4.28:12379
2019-02-21 22:59:57.899521 I | embed: listening for client requests on 172.31.0.1:12379
2019-02-21 22:59:57.899583 I | embed: listening for client requests on [::1]:12379
2019-02-21 22:59:57.904677 I | etcdserver: name = controller-2
2019-02-21 22:59:57.904694 I | etcdserver: data dir = /var/lib/skydive/etcd
2019-02-21 22:59:57.904700 I | etcdserver: member dir = /var/lib/skydive/etcd/member
2019-02-21 22:59:57.904705 I | etcdserver: heartbeat = 100ms
2019-02-21 22:59:57.904709 I | etcdserver: election = 1000ms
2019-02-21 22:59:57.904713 I | etcdserver: snapshot count = 100000
2019-02-21 22:59:57.904734 I | etcdserver: advertise client URLs = http://127.0.0.1:12379,http://192.168.24.9:12379,http://10.0.0.115:12379,http://172.17.2.11:12379,http://172.17.1.18:12379,http://172.17.1.26:12379,http://172.17.3.16:12379,http://172.17.4.12:12379,http://172.17.4.28:12379,http://172.31.0.1:12379,http://[::1]:12379
2019-02-21 22:59:57.906133 I | etcdserver: restarting member bf52047d414054f0 in cluster 2852ccc7617537a at commit index 3
2019-02-21 22:59:57.906173 I | raft: bf52047d414054f0 became follower at term 750
2019-02-21 22:59:57.906191 I | raft: newRaft bf52047d414054f0 [peers: [], term: 750, commit: 3, applied: 0, lastindex: 3, lastterm: 1]
2019-02-21 22:59:57.908342 W | auth: simple token is not cryptographically signed
2019-02-21 22:59:57.910640 I | etcdserver: starting server... [version: 3.2.15+git, cluster version: to_be_decided]
2019-02-21 22:59:57.911617 I | etcdserver/membership: added member 4630e5a2f88b421e [http://192.168.24.7:12380] to cluster 2852ccc7617537a
2019-02-21 22:59:57.911647 I | rafthttp: starting peer 4630e5a2f88b421e...
2019-02-21 22:59:57.911667 I | rafthttp: started HTTP pipelining with peer 4630e5a2f88b421e
2019-02-21 22:59:57.916443 I | rafthttp: started streaming with peer 4630e5a2f88b421e (writer)
2019-02-21 22:59:57.916528 I | rafthttp: started streaming with peer 4630e5a2f88b421e (writer)
2019-02-21 22:59:57.919949 I | rafthttp: started peer 4630e5a2f88b421e
2019-02-21 22:59:57.920016 I | rafthttp: added peer 4630e5a2f88b421e
2019-02-21 22:59:57.920150 I | etcdserver/membership: added member bf52047d414054f0 [http://192.168.24.9:12380] to cluster 2852ccc7617537a
2019-02-21 22:59:57.920241 I | etcdserver/membership: added member d3b1ec02dd41f466 [http://192.168.24.19:12380] to cluster 2852ccc7617537a
2019-02-21 22:59:57.920265 I | rafthttp: starting peer d3b1ec02dd41f466...
2019-02-21 22:59:57.920285 I | rafthttp: started HTTP pipelining with peer d3b1ec02dd41f466
2019-02-21 22:59:57.920949 I | rafthttp: started peer d3b1ec02dd41f466
2019-02-21 22:59:57.920977 I | rafthttp: added peer d3b1ec02dd41f466
2019-02-21 22:59:57.921049 I | rafthttp: started streaming with peer 4630e5a2f88b421e (stream MsgApp v2 reader)
2019-02-21 22:59:57.921204 I | rafthttp: started streaming with peer d3b1ec02dd41f466 (writer)
2019-02-21 22:59:57.921236 I | rafthttp: started streaming with peer 4630e5a2f88b421e (stream Message reader)
2019-02-21 22:59:57.921285 I | rafthttp: started streaming with peer d3b1ec02dd41f466 (stream MsgApp v2 reader)
2019-02-21 22:59:57.921380 I | rafthttp: started streaming with peer d3b1ec02dd41f466 (stream Message reader)
2019-02-21 22:59:57.921444 I | rafthttp: started streaming with peer d3b1ec02dd41f466 (writer)
2019-02-21 22:59:58.906645 I | raft: bf52047d414054f0 is starting a new election at term 750
2019-02-21 22:59:58.906741 I | raft: bf52047d414054f0 became candidate at term 751
2019-02-21 22:59:58.906763 I | raft: bf52047d414054f0 received MsgVoteResp from bf52047d414054f0 at term 751
2019-02-21 22:59:58.906774 I | raft: bf52047d414054f0 [logterm: 1, index: 3] sent MsgVote request to d3b1ec02dd41f466 at term 751
2019-02-21 22:59:58.906787 I | raft: bf52047d414054f0 [logterm: 1, index: 3] sent MsgVote request to 4630e5a2f88b421e at term 751
2019-02-21 23:00:00.006619 I | raft: bf52047d414054f0 is starting a new election at term 751
2019-02-21 23:00:00.006674 I | raft: bf52047d414054f0 became candidate at term 752
2019-02-21 23:00:00.006693 I | raft: bf52047d414054f0 received MsgVoteResp from bf52047d414054f0 at term 752
2019-02-21 23:00:00.006727 I | raft: bf52047d414054f0 [logterm: 1, index: 3] sent MsgVote request to 4630e5a2f88b421e at term 752
2019-02-21 23:00:00.006766 I | raft: bf52047d414054f0 [logterm: 1, index: 3] sent MsgVote request to d3b1ec02dd41f466 at term 752
2019-02-21 23:00:01.306610 I | raft: bf52047d414054f0 is starting a new election at term 752



[heat-admin@controller-2 ~]$ ps ax | grep sky
 136513 ?        Ssl    0:00 /usr/bin/docker-current run --rm --privileged --net=host --pid=host --user 0:0 -v /var/run/openvswitch/db.sock:/var/run/openvswitch/db.sock -v /var/run/docker.sock:/var/run/docker.sock -v /var/run/netns:/host/run:ro,shared -v /etc/skydive/skydive.yml:/etc/skydive.yml -e SKYDIVE_NETNS_RUN_PATH=/host/run -p 8081:8081 --name=skydive-agent-controller-2 192.168.24.1:8787/rhosp14/openstack-skydive-agent:2019-01-16.1 /usr/bin/skydive agent --conf /etc/skydive.yml
 136545 ?        Ssl    0:10 /usr/bin/skydive agent --conf /etc/skydive.yml
 351518 ?        Ssl    0:00 /usr/bin/docker-current run --rm --net=host --name=skydive-analyzer-controller-2 --user 1002:1004 -v /etc/skydive/skydive.yml:/etc/skydive.yml -v /var/lib/skydive:/var/lib/skydive -p 8082:8082 -p 8082:8082/udp -p 12379:12379 -p 12380:12380 192.168.24.1:8787/rhosp14/openstack-skydive-analyzer:2019-01-16.1 /usr/bin/skydive analyzer --conf /etc/skydive.yml
 351547 ?        Ssl    0:00 /usr/bin/skydive analyzer --conf /etc/skydive.yml

Comment 7 safchain 2019-04-12 06:44:15 UTC

*** This bug has been marked as a duplicate of bug 1676915 ***

Comment 8 safchain 2019-04-12 06:50:14 UTC
The version skydive-0.20.3-1.el7ost, which will be part of the Z-stream, should fix these issues.