Bug 1425372

Summary: Connect Admin Console failed after upgrade from NGN-3.6 to 4.1 node
Product: [oVirt] ovirt-node
Reporter: Huijuan Zhao <huzhao>
Component: Installation & Update
Assignee: Douglas Schilling Landgraf <dougsland>
Status: CLOSED CURRENTRELEASE
QA Contact: Huijuan Zhao <huzhao>
Severity: high
Docs Contact:
Priority: unspecified
Version: 4.1
CC: bugs, cshao, dguo, dougsland, huzhao, jiawu, leiwang, qiyuan, rbarry, sbonazzo, weiwang, yaniwang, ycui, yzhao
Target Milestone: ovirt-4.1.1
Keywords: Reopened, TestBlocker
Target Release: ---
Flags: rule-engine: ovirt-4.1?, sbonazzo: blocker?, ycui: planning_ack?, sbonazzo: devel_ack+, ycui: testing_ack+
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-03-25 14:35:50 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Node
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1421098
Attachments:
Description | Flags
All logs from host | none
screenshot of connect admin console failed | none
All logs from host | none
comment 8: Sosreport and all logs in /var/log and /tmp | none

Description Huijuan Zhao 2017-02-21 09:47:56 UTC
Description of problem:
After upgrading a node from NGN-3.6 to 4.1 and logging in to the 4.1 node, the Admin Console at https://$IP:9090/ cannot be reached; the connection fails with "Unable to connect".


Version-Release number of selected component (if applicable):
From:
RHVH-3.6-20170217.5-RHVH-x86_64-dvd1.iso
To:
redhat-virtualization-host-4.1-20170208.0


How reproducible:
100%

Steps to Reproduce:
1. Install RHVH 3.6
2. Set up local repos in RHVH 3.6, then upgrade to RHVH 4.1
   # yum update
3. Reboot, log in to RHVH 4.1, and connect to the Admin Console:
   https://$IP:9090/


Actual results:
After step 3, connecting to the Admin Console at https://$IP:9090/ fails with "Unable to connect".
Please refer to the attachment for detailed info.
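
A command-line check can help separate a browser problem from a service that is not listening at all. A minimal sketch, assuming `curl` is available on the client; `check_console` is a hypothetical helper name, and `--insecure` is used on the assumption that the console presents a self-signed certificate:

```shell
#!/bin/sh
# Report whether anything answers HTTPS on port 9090 of the given host.
# $1: host name or IP address of the node
check_console() {
    if curl --insecure --silent --output /dev/null --max-time 10 "https://$1:9090/"; then
        echo "reachable"
    else
        # In the else branch, $? still holds curl's exit code.
        echo "unreachable (curl exit code $?)"
    fi
}
```

Called as `check_console "$IP"`. A curl exit code of 7 means the connection could not be established, which would match the "Unable to connect" seen in the browser.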


Expected results:
After step 3, connecting to the Admin Console should succeed.


Additional info:

Comment 1 Huijuan Zhao 2017-02-21 09:49:26 UTC
Created attachment 1256038 [details]
screenshot of connect admin console failed

Comment 2 Huijuan Zhao 2017-02-21 09:50:56 UTC
Created attachment 1256039 [details]
All logs from host

Comment 3 Sandro Bonazzola 2017-02-21 10:06:10 UTC
Can you please add sos report?

Comment 4 Huijuan Zhao 2017-02-21 10:30:04 UTC
(In reply to Sandro Bonazzola from comment #3)
> Can you please add sos report?

I added it into the attachment "All logs from host":
/var/log/sosreport-dell-per730-35.lab.eng.pek2.redhat.com-20170221062651.tar.xz

Comment 5 cshao 2017-02-22 10:45:12 UTC
Adding the keyword "TestBlocker" because this blocks all cockpit testing.

Comment 6 Douglas Schilling Landgraf 2017-02-24 19:11:47 UTC
Hi Huijuan Zhao,

(In reply to Huijuan Zhao from comment #0)
> Description of problem:
> Upgrade from NGN-3.6 to 4.1 node, login 4.1 node, access Admin Console
> failed: https://$IP:9090/, it reports "Unable to connect".
> 
> 
> Version-Release number of selected component (if applicable):
> From:
> RHVH-3.6-20170217.5-RHVH-x86_64-dvd1.iso
> To:
> redhat-virtualization-host-4.1-20170208.0
> 
> 
> How reproducible:
> 100%
> 
> Steps to Reproduce:
> 1. Install RHVH 3.6
> 2. Setup local repos in RHVH 3.6, then upgrade to RHVH 4.1
>    # yum update
> 3. Reboot and login RHVH 4.1, connect Admin Console:
>    https://$IP:9090/
> 
> 
> Actual results:
> After step3, connect Admin Console failed: https://$IP:9090/, it reports
> "Unable to connect".
> Please refer to attachment for detailed info.
> 
> 


Looks like a cockpit failure:

# systemctl start cockpit.service
Job for cockpit.service failed because the control process exited with error code. See "systemctl status cockpit.service" and "journalctl -xe" for details.

# systemctl status cockpit.service -l
● cockpit.service - Cockpit Web Service
   Loaded: loaded (/usr/lib/systemd/system/cockpit.service; static; vendor preset: disabled)
   Active: failed (Result: exit-code) since Fri 2017-02-24 19:09:08 CST; 10s ago
     Docs: man:cockpit-ws(8)
  Process: 21538 ExecStartPre=/usr/sbin/remotectl certificate --ensure --user=root --group=cockpit-ws --selinux-type=etc_t (code=exited, status=217/USER)

Feb 24 19:09:08 localhost.localdomain systemd[1]: Starting Cockpit Web Service...
Feb 24 19:09:08 localhost.localdomain systemd[21538]: Failed at step USER spawning /usr/sbin/remotectl: No such process
Feb 24 19:09:08 localhost.localdomain systemd[1]: cockpit.service: control process exited, code=exited status=217
Feb 24 19:09:08 localhost.localdomain systemd[1]: Failed to start Cockpit Web Service.
Feb 24 19:09:08 localhost.localdomain systemd[1]: Unit cockpit.service entered failed state.
Feb 24 19:09:08 localhost.localdomain systemd[1]: cockpit.service failed.
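
The `status=217/USER` in the output above is the exit code systemd uses when it fails to determine or change the user or group credentials of the process it spawns. A minimal sketch for narrowing this down, assuming the names worth checking are the ones visible in the unit's command line (`root` and `cockpit-ws`). The helper name `missing_accounts` is hypothetical, and this report does not confirm that a missing account is the actual cause:

```shell
#!/bin/sh
# Print every user or group name that cannot be resolved on this host.
# Each argument has the form DATABASE:NAME, where DATABASE is passwd or group.
missing_accounts() {
    for entry in "$@"; do
        db=${entry%%:*}     # database to query: passwd or group
        name=${entry#*:}    # account name to look up
        if ! getent "$db" "$name" >/dev/null 2>&1; then
            printf '%s %s\n' "$db" "$name"
        fi
    done
}
```

For example, `missing_accounts passwd:root group:cockpit-ws` prints nothing when both entries resolve, and prints `group cockpit-ws` if that group is absent after the upgrade.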


> Expected results:
> After step3, should connect Admin Console successful.
> 
> 
> Additional info:

Could you please try with an updated rhvh 4.1 image? I could not reproduce the
report with the 4.1-20170222.0 image.

1) Install redhat-virtualization-host-3.6-20170216.0.x86_64.squashfs

2) Make the upgrade rpm available on an external server:

  # mkdir -p /var/www/html/upgrade
  # cd /var/www/html/upgrade 
  # wget http://server/redhat-virtualization-host-image-update-4.1-20170222.0.el7_3.noarch.rpm
  # createrepo .


3) On the installed rhvh 3.6 host, add a repo file pointing at that server:

  # vi /etc/yum.repos.d/local.repo 
  [local]
  name = local
  baseurl = http://192.168.122.1/upgrade
  enabled = 1
  gpgcheck = 0

4) Update and Reboot

  # yum update 
  # reboot

5) You should now have access to cockpit on the upgraded image:
   IP_ADDRESS:9090
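
The repo file from step 3 can also be written without an interactive editor. A minimal sketch, assuming the same repository id `local` and the same settings as in the steps above; the function name `write_local_repo` and its arguments are hypothetical:

```shell
#!/bin/sh
# Write a yum repository definition like the one shown in step 3.
# $1: path of the .repo file to create
# $2: base URL of the repository that holds the update rpm
write_local_repo() {
    cat > "$1" <<EOF
[local]
name = local
baseurl = $2
enabled = 1
gpgcheck = 0
EOF
}
```

Called as `write_local_repo /etc/yum.repos.d/local.repo http://192.168.122.1/upgrade`. Note that `gpgcheck = 0` turns off package signature verification, so this is only suitable for a test setup like this one.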

Comment 7 Huijuan Zhao 2017-02-27 10:02:10 UTC
(In reply to Douglas Schilling Landgraf from comment #6)
> 
> Could you please try with an updated rhvh 4.1 image? I didn't reproduce the 
> report with 4.1-20170222.0 image.
> 
> 1) Install redhat-virtualization-host-3.6-20170216.0.x86_64.squashfs
> 
> 2) Make available the rpm for upgrade, in external server:
> 
>   # mkdir -p /var/www/html/upgrade
>   # cd /var/www/html/upgrade 
>   # wget http://server/redhat-virtualization-host-image-update-4.1-20170222.0.el7_3.noarch.rpm
>   # createrepo .
> 
> 
> 3) In the installed rhvh 3.6, update:
> 
>   # vi /etc/yum.repos.d/local.repo 
>   [local]
>   name = local
>   baseurl = http://192.168.122.1/upgrade
>   enabled = 1
>   gpgcheck = 0
> 
> 4) Update and Reboot
> 
>   # yum update 
>   # reboot
> 
> 5) You must have access to the cockpit in the upgraded image:
>    IP_ADDRESS:9090

I tried the rhvh-4.1-20170222.0 image following the steps above, and the issue no longer occurs. Cockpit is accessible after the update to rhvh 4.1.

Test version:
From:
redhat-virtualization-host-3.6-20170216.0.x86_64.squashfs
To:
redhat-virtualization-host-4.1-20170222.0
imgbased-0.9.13-0.1.el7ev.noarch

Comment 8 Huijuan Zhao 2017-03-24 07:05:29 UTC
Encountered this issue again with a new build, so reopening this bug.

Test version:
From:
redhat-virtualization-host-3.6-20170307.0
To:
redhat-virtualization-host-4.1-20170323.0
cockpit-ws-126-1.el7.x86_64


Test steps:
1. Install RHVH 3.6 (redhat-virtualization-host-3.6-20170307.0), add it to engine 3.6 (3.6 cluster), and add NFS storage to the host on the engine side.
2. Set up local repos in RHVH 3.6, then upgrade to RHVH 4.1 (redhat-virtualization-host-4.1-20170323.0)
   # yum update
3. Reboot, log in to RHVH 4.1, check the cockpit status, and connect to the Admin Console:
   https://$IP:9090/



Actual results:
After step 3, cockpit.service is not active, and connecting to the Admin Console at https://$IP:9090/ fails with "Unable to connect".

# systemctl status cockpit.service
● cockpit.service - Cockpit Web Service
   Loaded: loaded (/usr/lib/systemd/system/cockpit.service; static; vendor preset: disabled)
   Active: failed (Result: start-limit) since Fri 2017-03-24 06:47:31 GMT; 8min ago
     Docs: man:cockpit-ws(8)
  Process: 25692 ExecStartPre=/usr/sbin/remotectl certificate --ensure --user=root --group=cockpit-ws --selinux-type=etc_t (code=exited, status=217/USER)

Mar 24 06:47:31 dhcp-10-16.nay.redhat.com systemd[1]: cockpit.service failed.
Mar 24 06:47:31 dhcp-10-16.nay.redhat.com systemd[1]: start request repeated too quickly for cockpit.service
Mar 24 06:47:31 dhcp-10-16.nay.redhat.com systemd[1]: Failed to start Cockpit Web Service.
Mar 24 06:47:31 dhcp-10-16.nay.redhat.com systemd[1]: cockpit.service failed.
Mar 24 06:47:34 dhcp-10-16.nay.redhat.com systemd[1]: start request repeated too quickly for cockpit.service
Mar 24 06:47:34 dhcp-10-16.nay.redhat.com systemd[1]: Failed to start Cockpit Web Service.
Mar 24 06:47:34 dhcp-10-16.nay.redhat.com systemd[1]: cockpit.service failed.
Mar 24 06:47:34 dhcp-10-16.nay.redhat.com systemd[1]: start request repeated too quickly for cockpit.service
Mar 24 06:47:34 dhcp-10-16.nay.redhat.com systemd[1]: Failed to start Cockpit Web Service.
Mar 24 06:47:34 dhcp-10-16.nay.redhat.com systemd[1]: cockpit.service failed.

# systemctl start cockpit.service
Job for cockpit.service failed because the control process exited with error code. See "systemctl status cockpit.service" and "journalctl -xe" for details.
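
In the status output above, `Result: start-limit` indicates that systemd refused further starts because the unit was started too often in a short interval, so the `217/USER` failure on the `ExecStartPre` line remains the one to investigate. A small sketch that pulls the result out of saved `systemctl status` text; the function name `unit_result` is hypothetical:

```shell
#!/bin/sh
# Read `systemctl status` output on stdin and print the value of "Result:"
# from the "Active:" line, for example "exit-code" or "start-limit".
unit_result() {
    sed -n 's/^ *Active: .*(Result: \([^)]*\)).*/\1/p'
}
```

Typical use would be `systemctl status cockpit.service | unit_result`. Once the underlying cause is fixed, `systemctl reset-failed cockpit.service` is the usual way to clear the failed state before trying another start.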



Expected results:
After step 3, cockpit.service should be active, and connecting to the Admin Console should succeed.


Please refer to the attachment for detailed logs.

Comment 9 Huijuan Zhao 2017-03-24 07:10:25 UTC
Created attachment 1265975 [details]
comment 8: Sosreport and all logs in /var/log and /tmp

Comment 10 Douglas Schilling Landgraf 2017-03-24 21:25:39 UTC
Hi Huijuan Zhao,

(In reply to Huijuan Zhao from comment #9)
> Created attachment 1265975 [details]
> comment 8: Sosreport and all logs in /var/log and /tmp

Please always open a new bug report, as we are talking about different versions of RHVH.

Comment 11 Huijuan Zhao 2017-03-25 14:35:50 UTC
(In reply to Douglas Schilling Landgraf from comment #10)
> Hi Huijuan Zhao,

> Please always open a new bug report, as we are talking about different
> versions of RHVH.

Douglas, got it. I have reported a new Bug 1435887 to track the issue in comment 8, so I am closing this bug now.