Bug 1335426 - Cockpit port (9090) is getting closed on HE hosts
Summary: Cockpit port (9090) is getting closed on HE hosts
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-hosted-engine-setup
Classification: oVirt
Component: General
Version: 1.3.6.1
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: high
Target Milestone: ovirt-4.0.1
Target Release: 2.0.1
Assignee: Yedidyah Bar David
QA Contact: Nikolai Sednev
URL:
Whiteboard:
Depends On: 1354199
Blocks: 1306965 1338732
Reported: 2016-05-12 08:36 UTC by wanghui
Modified: 2017-05-11 09:31 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-08-04 13:32:36 UTC
oVirt Team: Integration
Embargoed:
rbarry: needinfo-
rule-engine: ovirt-4.0.z+
rule-engine: blocker+
rule-engine: planning_ack+
fdeutsch: devel_ack+
mavital: testing_ack+


Attachments
sosreport (5.88 MB, application/x-xz)
2016-05-12 08:36 UTC, wanghui
can not connect page in rhev-hypervisor7-ng-4.0-20160607.1 (18.07 KB, image/png)
2016-06-08 08:14 UTC, wanghui
ps -aux (21.55 KB, text/plain)
2016-06-23 08:53 UTC, Ying Cui
VM-created.png (86.42 KB, image/png)
2016-06-23 08:56 UTC, Ying Cui
sosreport (6.01 MB, application/x-xz)
2016-06-23 09:06 UTC, Ying Cui
var_log (297.25 KB, application/x-bzip)
2016-06-23 09:07 UTC, Ying Cui


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 59882 0 master MERGED packaging: setup: Add Cockpit to firewall 2020-06-03 05:28:13 UTC
oVirt gerrit 60263 0 ovirt-hosted-engine-setup-2.0 MERGED packaging: setup: Add Cockpit to firewall 2020-06-03 05:28:13 UTC

Description wanghui 2016-05-12 08:36:51 UTC
Created attachment 1156502 [details]
sosreport

Description of problem:
If we switch to another page during HE setup after the VM has been created, Cockpit reports that it cannot connect on some pages, such as logs, storage, networking, subscriptions, accounts, diagnostic report and terminal.

Version-Release number of selected component (if applicable):
rhev-hypervisor7-ng-20160506.0.el7
imgbased-0.6-0.1.el7ev.noarch
ovirt-hosted-engine-setup-1.3.6.1-1.el7ev.noarch
ovirt-host-deploy-1.4.1-1.el7ev.noarch
ovirt-hosted-engine-ha-1.3.5.5-1.el7ev.noarch
cockpit-ovirt-0.5.1-0.0.ovirt40.el7ev.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Install rhev-hypervisor7-ng-3.6-20160506.0.el7
2. Start Hosted Engine setup and let it run until after the VM has been created
3. Switch to the Dashboard page
4. Check the logs, storage, networking, subscriptions, accounts, diagnostic report and terminal pages

Actual results:
1. It reports "Unable to connect".

Expected results:
1. It should show the correct content.

Additional info:

Comment 1 Red Hat Bugzilla Rules Engine 2016-05-12 08:52:41 UTC
Bug tickets must have version flags set prior to targeting them to a release. Please ask maintainer to set the correct version flags and only then set the target milestone.

Comment 2 Yaniv Lavi 2016-05-26 05:21:53 UTC
Flags? target milestone? acks?

Comment 3 wanghui 2016-06-02 07:05:03 UTC
Test version:
rhev-hypervisor7-ng-4.0-20160527.0
cockpit-ovirt-dashboard-0.10.1-0.0.1.el7ev.noarch

Test steps:
1. Install rhev-hypervisor7-ng-4.0-20160527.0
2. Enter cockpit
3. Enter Hosted Engine page to start setup and run after vm was created
4. Change to Dashboard page
5. Check logs, storage, networking, subscriptions, accounts, diagnostic report, terminal page

Test result:
1. Still shows Unable to connect.

So this issue is not fixed in cockpit-ovirt-dashboard-0.10.1-0.0.1.el7ev.noarch. Change the status to assigned.

Comment 4 Red Hat Bugzilla Rules Engine 2016-06-02 07:05:09 UTC
Target release should be placed once a package build is known to fix an issue. Since this bug is not modified, the target version has been reset. Please use target milestone to plan a fix for an oVirt release.

Comment 5 Ryan Barry 2016-06-02 11:50:44 UTC
(In reply to wanghui from comment #3)
> Test version:
> rhev-hypervisor7-ng-4.0-20160527.0
> cockpit-ovirt-dashboard-0.10.1-0.0.1.el7ev.noarch
> 
> Test steps:
> 1. Install rhev-hypervisor7-ng-4.0-20160527.0
> 2. Enter cockpit
> 3. Enter Hosted Engine page to start setup and run after vm was created
> 4. Change to Dashboard page
> 5. Check logs, storage, networking, subscriptions, accounts, diagnostic
> report, terminal page
> 
> Test result:
> 1. Still shows Unable to connect.
> 
> So this issue is not fixed in
> cockpit-ovirt-dashboard-0.10.1-0.0.1.el7ev.noarch. Change the status to
> assigned.

I changed this to ON_QA because I'm not able to reproduce this bug at all.

These steps are also somewhat unclear to me.

Do you mean that you enter the hosted engine setup page, deploy the engine, then enter the page again and start setup (after deployment)?

Then switch to the oVirt dashboard page, or the cockpit dashboard page?

Can you please get a screenshot of what cockpit looks like when you're unable to connect?

Comment 6 wanghui 2016-06-08 08:12:30 UTC
(In reply to Ryan Barry from comment #5)
> (In reply to wanghui from comment #3)
> > Test version:
> > rhev-hypervisor7-ng-4.0-20160527.0
> > cockpit-ovirt-dashboard-0.10.1-0.0.1.el7ev.noarch
> > 
> > Test steps:
> > 1. Install rhev-hypervisor7-ng-4.0-20160527.0
> > 2. Enter cockpit
> > 3. Enter Hosted Engine page to start setup and run after vm was created
> > 4. Change to Dashboard page
> > 5. Check logs, storage, networking, subscriptions, accounts, diagnostic
> > report, terminal page
> > 
> > Test result:
> > 1. Still shows Unable to connect.
> > 
> > So this issue is not fixed in
> > cockpit-ovirt-dashboard-0.10.1-0.0.1.el7ev.noarch. Change the status to
> > assigned.
> 
> I changed this to ON_QA because I'm not able to reproduce this bug at all.
> 
> These steps are also somewhat unclear to me.
> 
> Do you mean that you enter the hosted engine setup page, deploy the engine,
> then enter the page again and start setup (after deployment)?

Actually the deploy engine is not finished. Just need to wait for VM starts and then switch to other page.
> 
> Then switch to the oVirt dashboard page, or the cockpit dashboard page?

Should switch to cockpit dashboard page and then switch to oVirt dashboard page.
> 
> Can you please get a screenshot of what cockpit looks like when you're
> unable to connect?

Comment 7 wanghui 2016-06-08 08:14:10 UTC
Created attachment 1165859 [details]
can not connect page in rhev-hypervisor7-ng-4.0-20160607.1

Comment 8 Fabian Deutsch 2016-06-08 08:38:06 UTC
Hui, can you provide the sosreport from the host?
I wonder if iptables might have closed 9090.

What appliance were you using for setup?

Comment 9 Ryan Barry 2016-06-08 13:25:39 UTC
(In reply to wanghui from comment #6)
> Actually the deploy engine is not finished. Just need to wait for VM starts
> and then switch to other page.
> 
> Should switch to cockpit dashboard page and then switch to oVirt dashboard
> page.

These are the steps I took, but I didn't wait for the VM to start. I'll try that today.

> Hui, can you provide the sosreport from the host?
> I wonder if iptables might have closed 9090.

It definitely looks this way, though I'd hope cockpit is still related/established.

I'm guessing this is the time of the close, though:
May 12 16:02:29 cshao.redhat.com cockpit-ws[12614]: WebSocket from 10.66.65.8 for root closed
May 12 16:02:44 cshao.redhat.com cockpit-ws[12614]: root: timed out
May 12 16:02:44 cshao.redhat.com cockpit-session[12620]: pam_unix(cockpit:session): session closed for user root



Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
    pkts      bytes target     prot opt in     out     source               destination         
    4107  2080764 ACCEPT     all  --  lo     *       0.0.0.0/0            0.0.0.0/0           
       0        0 ACCEPT     icmp --  *      *       0.0.0.0/0            0.0.0.0/0            icmptype 255
   12495 57380948 ACCEPT     all  --  *      *       0.0.0.0/0            0.0.0.0/0            state RELATED,ESTABLISHED
       2      120 ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            state NEW tcp dpt:22
       0        0 ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            state NEW tcp dpt:5900
       0        0 ACCEPT     udp  --  *      *       0.0.0.0/0            0.0.0.0/0            state NEW udp dpt:5900
       0        0 ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            state NEW tcp dpt:5901
       0        0 ACCEPT     udp  --  *      *       0.0.0.0/0            0.0.0.0/0            state NEW udp dpt:5901
     332    49192 REJECT     all  --  *      *       0.0.0.0/0            0.0.0.0/0            reject-with icmp-host-proh
(In reply to Fabian Deutsch from comment #8)

Comment 10 Fabian Deutsch 2016-06-09 11:00:21 UTC
Port 9090 is closed.

This sounds like you are using a 3.6 Engine.

Hui, please let me know what version of Engine you are using.

Comment 11 Fabian Deutsch 2016-06-14 18:56:47 UTC
Ping?

Comment 12 wanghui 2016-06-15 06:11:26 UTC
(In reply to Fabian Deutsch from comment #10)
> Port 9090 is closed.
> 
> This sounds like you are using a 3.6 Engine.
> 
> Hui, please let me know what version of Engine you are using.

Fabian, as stated in my report, I found this issue on rhev-hypervisor7-ng-3.6-20160506.0.el7, so yes, I am using Engine 3.6. But I also reproduced it with Engine 4.0, as the screenshot shows.

And I think port 9090 is not closed, because some of the pages can still be accessed, just not all of them.

Comment 13 Fabian Deutsch 2016-06-15 09:35:48 UTC
It can be that some pages can be accessed, because they were accessed before.

Can you try the following flow in a clean environment:

1. Use RHEV-M 4.0
2. Install NGN 4.0
3. Enter Cockpit of NGN
4. Leave cockpit again / Close browser window
5. Add NGN to Engine
6. Enter Cockpit

After 6 you should be able to switch pages in cockpit

Comment 14 wanghui 2016-06-21 06:42:05 UTC
(In reply to Fabian Deutsch from comment #13)
> It can be that some pages can be accessed, because they were accessed before.
> 
> Can you try the following flow in a clean environment:
> 
> 1. Use RHEV-M 4.0
> 2. Install NGN 4.0
> 3. Enter Cockpit of NGN
> 4. Leave cockpit again / Close browser window
> 5. Add NGN to Engine
> 6. Enter Cockpit
> 
> After 6 you should be able to switch pages in cockpit

Yes, I can.

Comment 15 Fabian Deutsch 2016-06-21 15:30:11 UTC
Moving this bug to ON_QA according to comment 14.

I do not see any further gap.

Comment 16 wanghui 2016-06-23 05:19:06 UTC
Hi Fabian,

The test scenario in comment#13 is not the same as my original issue. Do you mean this issue can be verified with comment#13?

Comment 17 Ying Cui 2016-06-23 08:52:00 UTC
Checking original bug description, we can not verify this bug according to comment 13.

And this bug is serious to cockpit UI + HE + NGN.

The exact steps:
1. Installed RHEV-H-NG.
2. Access cockpit by browser.
3. Navigate to Hosted Engine UI.
4. Start Hosted-engine deployment.
5. During HE deploying, you need to monitor the UI, _AFTER_ vm is created, see VM-created.png
6. Navigate to other UI. e.g: dashboard - logs
7. Again navigate to other UI. e.g: oVirt - Hosted-Engine. 
8. Re-access the cockpit by browser

Results:
In steps 6, 7 and 8, "Unable to connect" is displayed in the UI. This issue makes the whole Cockpit UI inaccessible.

Additional checking:
In short, we tried everything, but access to Cockpit always failed; we had to _reinstall_ NGN.

More actions:
# ps -aux  
^^ you can see the attachment

# systemctl status cockpit
● cockpit.service - Cockpit Web Service
   Loaded: loaded (/usr/lib/systemd/system/cockpit.service; static; vendor preset: disabled)
   Active: inactive (dead) since Thu 2016-06-23 16:21:08 CST; 21min ago
     Docs: man:cockpit-ws(8)
 Main PID: 19104 (code=exited, status=0/SUCCESS)

Jun 23 14:07:18 dhcp-11-17.nay.redhat.com cockpit-session[19116]: pam_succeed_if(cockpit:auth): requirement "uid >= 1000" not met by user "root"
Jun 23 14:07:27 dhcp-11-17.nay.redhat.com cockpit-session[19119]: pam_ssh_add: Failed adding some keys
Jun 23 14:07:27 dhcp-11-17.nay.redhat.com cockpit-ws[19104]: logged in user: root
Jun 23 14:07:29 dhcp-11-17.nay.redhat.com cockpit-ws[19104]: New connection from 10.72.7.15 for root
Jun 23 15:38:31 dhcp-11-17.nay.redhat.com cockpit-ws[19104]: New connection from 10.72.7.15 for root
Jun 23 15:41:33 dhcp-11-17.nay.redhat.com cockpit-ws[19104]: WebSocket from 10.72.7.15 for root closed
Jun 23 15:42:19 dhcp-11-17.nay.redhat.com cockpit-ws[19104]: New connection from 10.72.7.15 for root
Jun 23 15:44:12 dhcp-11-17.nay.redhat.com cockpit-ws[19104]: WebSocket from 10.72.7.15 for root closed
Jun 23 16:19:38 localhost.localdomain cockpit-ws[19104]: Logging out user root from 10.72.7.15
Jun 23 16:19:38 localhost.localdomain cockpit-ws[19104]: WebSocket from 10.72.7.15 for root closed

# systemctl restart cockpit
# systemctl status cockpit
● cockpit.service - Cockpit Web Service
   Loaded: loaded (/usr/lib/systemd/system/cockpit.service; static; vendor preset: disabled)
   Active: active (running) since Thu 2016-06-23 16:43:43 CST; 1s ago
     Docs: man:cockpit-ws(8)
  Process: 23282 ExecStartPre=/usr/sbin/remotectl certificate --ensure --user=root --group=cockpit-ws --selinux-type=etc_t (code=exited, status=0/SUCCESS)
 Main PID: 23287 (cockpit-ws)
   CGroup: /system.slice/cockpit.service
           └─23287 /usr/libexec/cockpit-ws

Jun 23 16:43:43 localhost.localdomain systemd[1]: Starting Cockpit Web Service...
Jun 23 16:43:43 localhost.localdomain systemd[1]: Started Cockpit Web Service.
Jun 23 16:43:43 localhost.localdomain cockpit-ws[23287]: Using certificate: /etc/cockpit/ws-certs.d/0-self-signed.cert

^^^ Restarting the cockpit service with the steps above does not help; Cockpit still cannot be accessed.

Then I rebooted the RHEV-H-NG host. After the OS booted, Cockpit still could not be accessed.
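
The symptom above (cockpit.service active, UI still unreachable) can be narrowed down with a plain TCP probe from the client machine. The following is a minimal sketch, not something used in this bug: the function name `probe_tcp_port` and its result labels are made up for illustration. It assumes that a firewall REJECT rule with an ICMP "host prohibited" reply surfaces on the client as a "no route to host" style error, while a port with no listener and no firewall rule returns "connection refused".

```python
import errno
import socket


def probe_tcp_port(host, port, timeout=3.0):
    """Classify a TCP port as 'open', 'refused', 'rejected' or 'no-answer'.

    The labels are illustrative:
      open      - a service accepted the connection
      refused   - nothing is listening (TCP RST came back)
      rejected  - an ICMP unreachable/prohibited reply came back,
                  which is typical for a firewall REJECT rule
      no-answer - no reply within the timeout (e.g. a DROP rule)
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect((host, port))
        return "open"
    except ConnectionRefusedError:
        return "refused"
    except socket.timeout:
        return "no-answer"
    except OSError as exc:
        if exc.errno in (errno.EHOSTUNREACH, errno.ENETUNREACH):
            return "rejected"
        raise
    finally:
        sock.close()
```

Calling `probe_tcp_port("<host address>", 9090)` from the machine running the browser gives a first hint whether to look at the service or at the firewall on the host.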

Comment 18 Ying Cui 2016-06-23 08:53:29 UTC
Created attachment 1171375 [details]
ps -aux

Comment 19 Ying Cui 2016-06-23 08:56:07 UTC
Created attachment 1171377 [details]
VM-created.png

Comment 20 Ying Cui 2016-06-23 09:06:15 UTC
Created attachment 1171379 [details]
sosreport

Comment 21 Ying Cui 2016-06-23 09:07:21 UTC
Created attachment 1171380 [details]
var_log

Comment 22 Ying Cui 2016-06-23 09:09:05 UTC
I will send my environment info to you by email, and I will keep the environment for one working day in case you need it for debugging.

Comment 23 Fabian Deutsch 2016-06-23 10:16:02 UTC
After the HE deployment port 9090 is not open anymore:

[root@dhcp-11-17 ~]# iptables -L -n -x -v
Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
    pkts      bytes target     prot opt in     out     source               destination         
      64     6492 ACCEPT     all  --  lo     *       0.0.0.0/0            0.0.0.0/0           
       0        0 ACCEPT     icmp --  *      *       0.0.0.0/0            0.0.0.0/0            icmptype 255
    3679   263133 ACCEPT     all  --  *      *       0.0.0.0/0            0.0.0.0/0            state RELATED,ESTABLISHED
       2      120 ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            state NEW tcp dpt:22
       0        0 ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            state NEW tcp dpt:5900
       0        0 ACCEPT     udp  --  *      *       0.0.0.0/0            0.0.0.0/0            state NEW udp dpt:5900
       0        0 ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            state NEW tcp dpt:5901
       0        0 ACCEPT     udp  --  *      *       0.0.0.0/0            0.0.0.0/0            state NEW udp dpt:5901
    3224   400726 REJECT     all  --  *      *       0.0.0.0/0            0.0.0.0/0            reject-with icmp-host-prohibited

Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
    pkts      bytes target     prot opt in     out     source               destination         

Chain OUTPUT (policy ACCEPT 1505 packets, 6754093 bytes)
    pkts      bytes target     prot opt in     out     source               destination         
[root@dhcp-11-17 ~]# iptables -L -n -x -v | grep 9090
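
The manual check above (list the INPUT chain, grep for 9090) can also be scripted. This is a rough sketch under stated assumptions: `port_accepted` is a hypothetical helper, it only understands the column layout of `iptables -L -n -x -v` shown in this comment, and it deliberately ignores interface-specific and RELATED,ESTABLISHED rules because they do not admit new remote connections.

```python
def port_accepted(listing, port, proto="tcp"):
    """Return True if the INPUT chain accepts new connections to `port`.

    `listing` is the text printed by `iptables -L -n -x -v`.
    Simplification: only explicit 'dpt:<port>' ACCEPT rules and a
    catch-all REJECT/DROP rule are considered.
    """
    policy = None
    in_input = False
    for line in listing.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "Chain":
            in_input = len(fields) > 1 and fields[1] == "INPUT"
            if in_input and len(fields) > 3:
                policy = fields[3]  # e.g. "(policy ACCEPT ..." -> ACCEPT
            continue
        if not in_input or fields[0] == "pkts" or len(fields) < 9:
            continue
        # Columns: pkts bytes target prot opt in out source destination [matches]
        target, rule_proto, matches = fields[2], fields[3], fields[9:]
        if target == "ACCEPT" and rule_proto == proto and f"dpt:{port}" in matches:
            return True
        catch_all = (
            target in ("REJECT", "DROP")
            and rule_proto == "all"
            and fields[5] == "*"
            and fields[7] == "0.0.0.0/0"
            and fields[8] == "0.0.0.0/0"
            and "state" not in matches
        )
        if catch_all:
            return False
    # No explicit rule decided: fall back to the chain policy.
    return policy == "ACCEPT"
```

Fed with the listing from this comment, the helper reports port 22 as accepted and port 9090 as not accepted, because the catch-all REJECT rule is reached first.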

Comment 24 Fabian Deutsch 2016-06-23 10:17:21 UTC
Simone, do you know what component is taking care of the iptables in the HE host case?

We had bug 1314781 to take care that the cockpit port is opened when a host is getting added from Engine.

Comment 25 Fabian Deutsch 2016-06-23 10:18:17 UTC
Raising priority because it will block remote access to the host after HE deployment.

Comment 26 Fabian Deutsch 2016-06-23 10:21:27 UTC
The appliance image used for testing was rhevm-appliance-20160619.0-2.x86_64.rhevm.ova (4.0)

Comment 27 Moran Goldboim 2016-06-28 09:43:22 UTC
Moving to Integration - setting iptables on HE.

Comment 28 Yedidyah Bar David 2016-06-28 13:27:15 UTC
Not sure what's the exact question here.

Tried a bit to understand what happens with hosted-engine in node-ng and so far failed...

1. hosted-engine --deploy asks the user whether to configure iptables.

2. If the answer is 'yes', the code in hosted-engine-setup configures iptables, opening a few ports and closing the rest. For an example about how to add rules there from the answer file, see bug 1288979 comment 8.

3. At a later point, when the host is added to the engine, the engine runs host-deploy on it, and if answer above was 'yes', also configures iptables on the host. I understand this part was handled by bug 1314781.

Comment 29 Yedidyah Bar David 2016-06-28 13:34:05 UTC
Ryan, please tell me if anything else is needed/missing.

Comment 30 Ryan Barry 2016-06-28 14:02:07 UTC
As often iterated, we'd like to avoid setting anything in answerfiles if possible. It's something that can be done, but it isn't ideal, especially since installing through cockpit is a supported primary flow. hosted-engine-setup should "just work" with this use case, which is the purpose of this bug.

To that end, what about adding port 9090 to @CUSTOM_RULES@ for hosted-engine-setup? Similar to the gluster bug, engine already supports this case, so hosted-engine-setup should as well.
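
To make the proposal concrete, here is a small sketch of what an extra rule for Cockpit could look like in iptables-save syntax. It is only an illustration: `accept_rule` is a hypothetical helper, and the thread does not show the actual template text that hosted-engine-setup substitutes for @CUSTOM_RULES@, so the exact form the merged patch uses may differ. The rule mirrors the existing "state NEW tcp dpt:22" entries in the listings above.

```python
def accept_rule(port, proto="tcp", chain="INPUT"):
    """Build an iptables-save style rule accepting new connections to `port`.

    Hypothetical helper for illustration; not taken from hosted-engine-setup.
    """
    if proto not in ("tcp", "udp"):
        raise ValueError("proto must be 'tcp' or 'udp'")
    if not 1 <= port <= 65535:
        raise ValueError("port must be in 1..65535")
    return (
        f"-A {chain} -p {proto} -m state --state NEW "
        f"-m {proto} --dport {port} -j ACCEPT"
    )


# Cockpit listens on TCP 9090 (see the bug summary).
print(accept_rule(9090))
```

Such a rule has to appear before the final catch-all REJECT rule to have any effect.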

Comment 32 Ying Cui 2016-07-04 12:58:43 UTC
Ryan, we need to move this bug to the correct component; it does not belong to the Node component. Also, from the QE point of view, please consider it a beta blocker.

Comment 33 Nikolai Sednev 2016-07-21 11:21:47 UTC
Works for me on these components:
sanlock-3.2.4-2.el7_2.x86_64
ovirt-hosted-engine-ha-2.0.1-1.el7ev.noarch
ovirt-imageio-daemon-0.3.0-0.el7ev.noarch
ovirt-host-deploy-1.5.1-1.el7ev.noarch
ovirt-engine-sdk-python-3.6.7.0-1.el7ev.noarch
qemu-kvm-rhev-2.3.0-31.el7_2.16.x86_64
mom-0.5.5-1.el7ev.noarch
ovirt-setup-lib-1.0.2-1.el7ev.noarch
ovirt-vmconsole-host-1.0.4-1.el7ev.noarch
libvirt-client-1.2.17-13.el7_2.5.x86_64
vdsm-4.18.6-1.el7ev.x86_64
ovirt-hosted-engine-setup-2.0.1-1.el7ev.noarch
ovirt-imageio-common-0.3.0-0.el7ev.noarch
ovirt-vmconsole-1.0.4-1.el7ev.noarch
Linux version 3.10.0-327.22.2.el7.x86_64 (mockbuild.eng.bos.redhat.com) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-4) (GCC) ) #1 SMP Thu Jun 9 10:09:10 EDT 2016
Linux 3.10.0-327.22.2.el7.x86_64 #1 SMP Thu Jun 9 10:09:10 EDT 2016 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux release 7.2.

Tested from Chrome 
Version 50.0.2661.86 (64-bit)

