Bug 1518689

Summary: [RFE] otopi should log on failure list of network connections
Product: [oVirt] otopi Reporter: Yedidyah Bar David <didi>
Component: Plugins.General Assignee: Yedidyah Bar David <didi>
Status: CLOSED CURRENTRELEASE QA Contact: David Necpal <dnecpal>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: master CC: bugs, didi, pstehlik
Target Milestone: ovirt-4.2.0 Keywords: FutureFeature
Target Release: --- Flags: rule-engine: ovirt-4.2?
pstehlik: testing_plan_complete-
rule-engine: planning_ack?
rule-engine: devel_ack+
pstehlik: testing_ack+
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Enhancement
Doc Text:
otopi can log a machine's network connections after a failure. To enable this, install the otopi-debug-plugins package. This helps debug service start failures caused by "Address already in use" errors.
Story Points: ---
Clone Of: 1518545 Environment:
Last Closed: 2017-12-20 10:51:43 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Integration RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Yedidyah Bar David 2017-11-29 13:22:14 UTC
+++ This bug was initially created as a clone of Bug #1518545 +++

Description of problem:

We recently had several different cases of engine-setup or host-deploy failing to start services due to "Address already in use". It would be helpful if otopi logged, e.g., the output of 'ss -anp' on failure.

This was already merged in the master branch, as a new debug plugin, always active if 'otopi-debug-plugins' is installed. Opening current bug for consideration for 4.1 (otopi 1.6 branch).
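
To illustrate the idea of dumping the connection list when something fails, here is a minimal sketch. It is not the code of the otopi debug plugin; the function names log_network_connections and run_with_connection_dump are hypothetical, and it simply assumes that 'ss' is available on the PATH. If the command cannot be run, the sketch logs that fact and carries on, so that diagnostics never hide the original failure.

```python
import logging
import subprocess
import sys


def log_network_connections(logger, command=("ss", "-anp")):
    """Run `command` and write its output to `logger` at DEBUG level.

    Returns the captured output, or None if the command could not be run.
    Hypothetical helper for illustration, not part of otopi.
    """
    try:
        result = subprocess.run(
            list(command),
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
            check=False,
        )
    except OSError as exc:
        # The tool may be missing or not executable; report and move on.
        logger.debug("could not run %s: %s", command[0], exc)
        return None
    logger.debug("network connections:\n%s", result.stdout)
    return result.stdout


def run_with_connection_dump(step, logger):
    """Call `step()`; if it raises, log the connections and re-raise."""
    try:
        return step()
    except Exception:
        log_network_connections(logger)
        raise
```

The command is a parameter so that the same helper can be pointed at a different tool, or at a stub when testing.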

Comment 2 David Necpal 2017-12-13 14:48:38 UTC
Can you please provide verification steps?

Comment 3 Yedidyah Bar David 2017-12-13 15:00:10 UTC
(In reply to David Necpal from comment #2)
> Can you please provide verification steps?

1. Install otopi-debug-plugins
2. Install and run any otopi-based tool, e.g. engine-setup or otopi itself
3. Make this tool fail in the middle - e.g. press ^C
4. Search the generated log file for 'tcp connections'

You can automate step 3 using 'force_fail' from the debug-plugins package, e.g.:

# OTOPI_FORCE_FAIL_STAGE=STAGE_SETUP otopi
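
Step 4 can also be scripted. The following is a rough sketch, assuming only that the dump is introduced by a log line containing the text 'tcp connections'. The function name find_tcp_connection_dumps is hypothetical, and the path of the log file has to be supplied by the caller.

```python
def find_tcp_connection_dumps(log_path, marker="tcp connections"):
    """Return the 1-based line numbers in `log_path` that contain `marker`.

    Hypothetical helper for illustration, not part of otopi.
    """
    hits = []
    # errors="replace" keeps the scan going if the log has odd bytes in it.
    with open(log_path, encoding="utf-8", errors="replace") as log_file:
        for line_number, line in enumerate(log_file, start=1):
            if marker in line:
                hits.append(line_number)
    return hits
```

An empty result means the log contains no connection dump, which would make the verification fail.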

Comment 4 David Necpal 2017-12-13 16:33:17 UTC
Verified based on suggested verification steps from comment #3

ovirt-engine-4.2.1-0.0.master.20171212133551.git0c8ab5a.el7.centos.noarch
otopi-1.7.6-0.0.master.20171204131110.gitd5016f6.el7.centos.noarch
otopi-debug-plugins-1.7.6-0.0.master.20171204131110.gitd5016f6.el7.centos.noarch


2017-12-13 17:22:00,483+0100 DEBUG otopi.plugins.otopi.debug.debug_failure.debug_failure debug_failure._notification:100 tcp connections:
id uid local foreign state pid exe
0: 995 0.0.0.0:2222 0.0.0.0:0 LISTEN 6761 /usr/sbin/sshd
1: 0 0.0.0.0:111 0.0.0.0:0 LISTEN 1 /usr/lib/systemd/systemd
2: 0 0.0.0.0:6641 0.0.0.0:0 LISTEN 21301 /usr/sbin/ovsdb-server
3: 0 0.0.0.0:6642 0.0.0.0:0 LISTEN 21309 /usr/sbin/ovsdb-server
4: 0 0.0.0.0:54323 0.0.0.0:0 LISTEN 6721 /usr/bin/python2.7
5: 108 0.0.0.0:6100 0.0.0.0:0 LISTEN 6780 /usr/bin/python2.7
6: 0 0.0.0.0:22 0.0.0.0:0 LISTEN 17013 /usr/sbin/sshd
7: 26 0.0.0.0:5432 0.0.0.0:0 LISTEN 2145 /opt/rh/rh-postgresql95/root/usr/bin/postgres
8: 0 127.0.0.1:25 0.0.0.0:0 LISTEN 2426 /usr/libexec/postfix/master
9: 0 0.0.0.0:35357 0.0.0.0:0 LISTEN 21317 /usr/bin/python2.7
10: 0 0.0.0.0:9696 0.0.0.0:0 LISTEN 21317 /usr/bin/python2.7
11: 26 127.0.0.1:5432 127.0.0.1:50758 ESTABLISHED 6955 /opt/rh/rh-postgresql95/root/usr/bin/postgres
12: 0 10.37.138.254:22 10.34.131.130:45594 ESTABLISHED 14088 /usr/sbin/sshd
13: 0 10.37.138.254:52798 216.176.179.218:80 LAST_ACK UnknownPID UnknownEXE
14: 0 10.37.138.254:51096 152.19.134.199:443 ESTABLISHED UnknownPID UnknownEXE
15: 0 10.37.138.254:55658 193.84.206.135:80 LAST_ACK UnknownPID UnknownEXE
16: 26 127.0.0.1:5432 127.0.0.1:50742 ESTABLISHED 6729 /opt/rh/rh-postgresql95/root/usr/bin/postgres
17: 0 10.37.138.254:35374 147.32.127.196:80 ESTABLISHED UnknownPID UnknownEXE
18: 0 10.37.138.254:39014 152.19.134.199:80 LAST_ACK UnknownPID UnknownEXE
19: 26 127.0.0.1:5432 127.0.0.1:50756 ESTABLISHED 6950 /opt/rh/rh-postgresql95/root/usr/bin/postgres
20: 26 127.0.0.1:5432 127.0.0.1:50760 ESTABLISHED 6956 /opt/rh/rh-postgresql95/root/usr/bin/postgres
21: 0 127.0.0.1:5432 127.0.0.1:50728 TIME_WAIT UnknownPID UnknownEXE
22: 0 10.37.138.254:53508 212.69.166.138:80 CLOSE_WAIT UnknownPID UnknownEXE
23: 26 127.0.0.1:5432 127.0.0.1:50750 ESTABLISHED 6749 /opt/rh/rh-postgresql95/root/usr/bin/postgres
24: 0 10.37.138.254:52104 199.38.241.10:80 CLOSE_WAIT UnknownPID UnknownEXE
25: 26 127.0.0.1:5432 127.0.0.1:50740 ESTABLISHED 6727 /opt/rh/rh-postgresql95/root/usr/bin/postgres
26: 26 127.0.0.1:5432 127.0.0.1:50762 ESTABLISHED 6960 /opt/rh/rh-postgresql95/root/usr/bin/postgres
27: 0 10.37.138.254:40128 66.109.26.212:80 LAST_ACK UnknownPID UnknownEXE
28: 26 127.0.0.1:5432 127.0.0.1:50746 ESTABLISHED 6733 /opt/rh/rh-postgresql95/root/usr/bin/postgres
29: 26 127.0.0.1:5432 127.0.0.1:50744 ESTABLISHED 6732 /opt/rh/rh-postgresql95/root/usr/bin/postgres
30: 26 127.0.0.1:5432 127.0.0.1:50748 ESTABLISHED 6747 /opt/rh/rh-postgresql95/root/usr/bin/postgres
31: 0 10.37.138.254:40238 152.19.134.199:80 CLOSE_WAIT UnknownPID UnknownEXE
32: 26 127.0.0.1:5432 127.0.0.1:50766 ESTABLISHED 6988 /opt/rh/rh-postgresql95/root/usr/bin/postgres
33: 0 10.37.138.254:56882 193.84.206.135:80 CLOSE_WAIT UnknownPID UnknownEXE
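
When chasing an "Address already in use" error, a dump like the one above can be filtered for the process that is listening on the port in question. The sketch below assumes the row layout seen in this log (an index followed by a colon, then uid, local, foreign, state, pid and exe, separated by whitespace) and that the exe path contains no spaces. The names Connection, parse_connection_rows and listeners_on_port are hypothetical.

```python
from collections import namedtuple

# One row of the dump, without the leading "N:" index column.
Connection = namedtuple("Connection", "uid local foreign state pid exe")


def parse_connection_rows(lines):
    """Turn dump lines into Connection records, skipping everything else."""
    connections = []
    for line in lines:
        fields = line.split()
        # Data rows have 7 fields and start with an index such as "7:".
        # The header row starts with "id", so it is skipped here.
        if len(fields) != 7:
            continue
        index = fields[0]
        if not (index.endswith(":") and index[:-1].isdigit()):
            continue
        connections.append(Connection(*fields[1:]))
    return connections


def listeners_on_port(connections, port):
    """Return the connections in LISTEN state whose local port is `port`."""
    suffix = ":{}".format(port)
    return [
        conn
        for conn in connections
        if conn.state == "LISTEN" and conn.local.endswith(suffix)
    ]
```

All fields are kept as strings, because pid and exe may hold placeholders such as UnknownPID and UnknownEXE.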

Comment 5 Yedidyah Bar David 2017-12-14 07:20:33 UTC
Not sure what's the question, looks good to me.

Comment 6 Sandro Bonazzola 2017-12-20 10:51:43 UTC
This bugzilla is included in oVirt 4.2.0 release, published on Dec 20th 2017.

Since the problem described in this bug report should be resolved in the oVirt 4.2.0 release, published on Dec 20th 2017, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.