Bug 1364037 - uid/gid drift - Breaks Cockpit and HE
Summary: uid/gid drift - Breaks Cockpit and HE
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-node
Classification: oVirt
Component: Installation & Update
Version: 4.0
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ovirt-4.0.2
Target Release: ---
Assignee: Fabian Deutsch
QA Contact: Huijuan Zhao
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2016-08-04 11:10 UTC by Huijuan Zhao
Modified: 2016-08-17 14:44 UTC (History)
CC List: 13 users

Fixed In Version: imgbased-0.8.4-1.el7
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-08-17 14:44:43 UTC
oVirt Team: Node
Embargoed:
rule-engine: ovirt-4.0.z+
rule-engine: blocker+
rule-engine: planning_ack+
fdeutsch: devel_ack+
ycui: testing_ack+


Attachments
screenshot of cockpit login page (96.10 KB, image/png)
2016-08-04 11:10 UTC, Huijuan Zhao
no flags
All logs (6.04 MB, application/x-gzip)
2016-08-04 11:12 UTC, Huijuan Zhao
no flags
kickstart file (830 bytes, text/plain)
2016-08-12 03:59 UTC, Huijuan Zhao
no flags


Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 1364034 0 high CLOSED Hosted Engine always show "Not running" status in cockpit after deploy it. 2021-02-22 00:41:40 UTC
oVirt gerrit 62047 0 master MERGED utils: Fix IDMap to work 2021-01-04 10:10:45 UTC
oVirt gerrit 62048 0 master MERGED osupdater: Add important note 2021-01-04 10:10:45 UTC
oVirt gerrit 62049 0 master MERGED Revert "imgbase: Drop journal support" 2021-01-04 10:10:45 UTC
oVirt gerrit 62050 0 master MERGED cli: Add journal logging 2021-01-04 10:10:45 UTC
oVirt gerrit 62076 0 master MERGED osupdater: Use factory etc is for dirft detection 2021-01-04 10:10:45 UTC
oVirt gerrit 62079 0 master MERGED osupdater: Reset permissions if they look incorrect 2021-01-04 10:10:45 UTC
oVirt gerrit 62224 0 master MERGED osupdater: also setugids on update 2021-01-04 10:10:46 UTC

Internal Links: 1364034

Description Huijuan Zhao 2016-08-04 11:10:54 UTC
Created attachment 1187432 [details]
screenshot of cockpit login page

Description of problem:
After upgrading RHVH and booting into the new build, accessing cockpit via "ip:9090" shows a login page without the account input box. Instead there is an error:
Authentication failed: no results

Version-Release number of selected component (if applicable):
redhat-virtualization-host-4.0-20160803.3
imgbased-0.7.4-0.1.el7ev.noarch
cockpit-0.114-2.el7.x86_64
cockpit-ovirt-dashboard-0.10.6-1.3.4.el7ev.noarch
redhat-virtualization-host-image-update-placeholder-4.0-0.26.el7.noarch

How reproducible:
80%

Steps to Reproduce:
1. Install redhat-virtualization-host-4.0-20160727.1
2. Log in to RHVH and set up local repos
3. Log in to cockpit via "ip:9090"
4. Upgrade RHVH:
   # yum update
5. Reboot and log in to the new build redhat-virtualization-host-4.0-20160803.3
6. Log in to cockpit via "ip:9090"


Actual results:
1. After step 3, cockpit displays normally and login succeeds.
2. After step 6, the cockpit login page displays an error and does not show the account input box.


Expected results:
2. After step 6, the cockpit login page should display normally and login should succeed.


Additional info:

Comment 1 Huijuan Zhao 2016-08-04 11:12:30 UTC
Created attachment 1187433 [details]
All logs

Comment 2 Ying Cui 2016-08-04 12:03:33 UTC
We consider this a blocker; it needs more attention.

Comment 3 Fabian Deutsch 2016-08-04 13:11:14 UTC
From the journal:

# systemctl status cockpit
● cockpit.service - Cockpit Web Service
   Loaded: loaded (/usr/lib/systemd/system/cockpit.service; static; vendor preset: disabled)
   Active: inactive (dead) since Thu 2016-08-04 09:05:18 EDT; 57s ago
     Docs: man:cockpit-ws(8)
  Process: 7904 ExecStart=/usr/libexec/cockpit-ws (code=exited, status=0/SUCCESS)
  Process: 7901 ExecStartPre=/usr/sbin/remotectl certificate --ensure --user=root --group=cockpit-ws --selinux-type=etc_t (code=exited, status=0/SUCCESS)
 Main PID: 7904 (code=exited, status=0/SUCCESS)

Aug 04 09:03:48 dhcp-10-16.nay.redhat.com systemd[1]: Starting Cockpit Web Service...
Aug 04 09:03:48 dhcp-10-16.nay.redhat.com systemd[1]: Started Cockpit Web Service.
Aug 04 09:03:48 dhcp-10-16.nay.redhat.com cockpit-ws[7904]: Using certificate: /etc/cockpit/ws-certs.d/0-self-signed.cert
Aug 04 09:03:51 dhcp-10-16.nay.redhat.com cockpit-ws[7904]: couldn't parse /usr/libexec/cockpit-session auth output: JSON data was empty
Aug 04 09:04:21 dhcp-10-16.nay.redhat.com cockpit-ws[7904]: couldn't parse /usr/libexec/cockpit-session auth output: JSON data was empty
Aug 04 09:05:02 dhcp-10-16.nay.redhat.com cockpit-ws[7904]: couldn't parse /usr/libexec/cockpit-session auth output: JSON data was empty


Stef, have you seen this before?
Could it be a regression from the last update? The update was:
-cockpit-ovirt-dashboard-0.10.6-1.3.3.el7ev.noarch
+cockpit-ovirt-dashboard-0.10.6-1.3.4.el7ev.noarch
-dbus-1.6.12-13.el7.x86_64
+dbus-1.6.12-14.el7_2.x86_64
-dbus-libs-1.6.12-13.el7.x86_64
+dbus-libs-1.6.12-14.el7_2.x86_64

There are no denials

Comment 4 Stef Walter 2016-08-04 14:53:19 UTC
I'm (mostly on) PTO. To diagnose this, please provide a downloadable image in which the problem reproduces. Alternatively, provide complete step-by-step instructions, from install, channel, and repo all the way through to the bug.

Comment 5 Fabian Deutsch 2016-08-05 11:34:50 UTC
Thanks Stef - This is on RHVH where we just replace one image with the other.

Raising the urgency, because cockpit can no longer be accessed after updates.

Comment 6 Red Hat Bugzilla Rules Engine 2016-08-05 11:34:56 UTC
This bug report has Keywords: Regression or TestBlocker.
Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.

Comment 7 Fabian Deutsch 2016-08-05 11:50:08 UTC
Manually running cockpit-ws works:

$ systemctl stop cockpit.socket
$ systemctl stop cockpit.service
$ /usr/libexec/cockpit-ws

Now access and authentication to HOST:9090 works.

Going back to using the service results in this bug again.

The systemd differences between the working and non-working setup are:
-systemd-219-19.el7_2.11.x86_64
-systemd-libs-219-19.el7_2.11.x86_64
-systemd-python-219-19.el7_2.11.x86_64
-systemd-sysv-219-19.el7_2.11.x86_64
+systemd-219-19.el7_2.12.x86_64
+systemd-libs-219-19.el7_2.12.x86_64
+systemd-python-219-19.el7_2.12.x86_64
+systemd-sysv-219-19.el7_2.12.x86_64

Comment 8 Peter 2016-08-05 13:56:27 UTC
On the test system I was given access to, the setuid flags and group owner of cockpit-session were incorrect.

Removing cockpit-ws, and reinstalling the cockpit-ws-0.114-2.el7.x86_64.rpm fixed things. I'm not sure how those permissions got changed.
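
The check described in this comment (mode bits and group owner of cockpit-session) could be sketched in Python roughly as follows. This is only an illustration and not part of cockpit or imgbased: the function names are hypothetical, and the expected values (mode 4750, group cockpit-ws) are assumptions read off the `-rwsr-x---. 1 root cockpit-ws` listing of a good image elsewhere in this bug, not taken from the RPM metadata.

```python
import grp
import os
import stat


def ownership_problems(st_mode, group_name,
                       expected_perms=0o4750, expected_group="cockpit-ws"):
    """Compare a file's mode bits and group name against expected values.

    The defaults are assumptions based on the listing of a good image
    (-rwsr-x---, group cockpit-ws); they are not read from the RPM.
    """
    problems = []
    # S_IMODE keeps the permission bits including setuid/setgid/sticky
    perms = stat.S_IMODE(st_mode)
    if perms != expected_perms:
        problems.append("mode is %o, expected %o" % (perms, expected_perms))
    if group_name != expected_group:
        problems.append("group is %s, expected %s" % (group_name, expected_group))
    return problems


def check_path(path):
    """Stat a real file and run the comparison above on it."""
    st = os.stat(path)
    try:
        group_name = grp.getgrgid(st.st_gid).gr_name
    except KeyError:
        # The gid has no entry in the group database on this system
        group_name = str(st.st_gid)
    return ownership_problems(st.st_mode, group_name)
```

On a host in the broken state, `check_path("/usr/libexec/cockpit-session")` would be expected to return a non-empty list; an empty list means mode and group match the assumed values.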

Comment 9 Huijuan Zhao 2016-08-08 02:03:58 UTC
(In reply to Fabian Deutsch from comment #5)
> Thanks Stef - This is on RHVH where we just replace one image with the other.
> 
> Raising the urgency, because cockpit can not be accessed after updates
> anymore.

Thanks Fabian and Stef, please contact me if you need more detailed info.

Comment 10 Fabian Deutsch 2016-08-08 07:42:42 UTC
Okay, it looks like a uid-drift issue:

# ls -shal {/,l,l2}/usr/libexec/cockpit-session 
28K -rwsr-x---. 1 root input      28K 15. Jul 11:13 l2/usr/libexec/cockpit-session
28K -rwsr-x---. 1 root cockpit-ws 28K 15. Jul 11:13 l/usr/libexec/cockpit-session
28K -rwsr-x---. 1 root input      28K 15. Jul 11:13 //usr/libexec/cockpit-session

(l is the prev image, l2 the current).

The group ownership differs between the images, so the uid/gid drift fix failed.
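
To illustrate what drift means here: the same numeric id can map to different names in the `/etc/group` of two images, so a file from one image shows up under the wrong group when resolved against the other. The Python sketch below detects such mismatches and translates an id by name. It is not imgbased's actual IDMap code; the function names and all numeric ids in the sample data are made up for the example.

```python
def parse_ids(text):
    """Map name -> numeric id from passwd/group-style text (name:x:id:...)."""
    ids = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        # Field 2 is the uid in /etc/passwd and the gid in /etc/group
        if len(fields) >= 3 and fields[2].isdigit():
            ids[fields[0]] = int(fields[2])
    return ids


def find_drift(old_text, new_text):
    """Return {name: (old_id, new_id)} for names whose id changed."""
    old, new = parse_ids(old_text), parse_ids(new_text)
    return {name: (old[name], new[name])
            for name in old
            if name in new and old[name] != new[name]}


def translate_id(file_id, image_ids, host_ids):
    """Return the id the host uses for the name that owns file_id in the image."""
    for name, image_id in image_ids.items():
        if image_id == file_id:
            return host_ids.get(name, file_id)
    return file_id  # unknown id: leave unchanged


# Hypothetical sample data; the real gids on the test hosts are not known.
OLD_GROUP = "root:x:0:\ninput:x:997:\ncockpit-ws:x:996:\n"
NEW_GROUP = "root:x:0:\ninput:x:996:\ncockpit-ws:x:997:\n"

drift = find_drift(OLD_GROUP, NEW_GROUP)
```

With this sample data, a file built with gid 997 (cockpit-ws in the new image) would be listed as group input when resolved against the old group file, which resembles the listing above. `translate_id(997, parse_ids(NEW_GROUP), parse_ids(OLD_GROUP))` gives the gid the old group file uses for cockpit-ws.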

Comment 11 Fabian Deutsch 2016-08-09 22:08:31 UTC
Please try to update to redhat-virtualization-host-image-update-4.0-20160810.0

Comment 12 Huijuan Zhao 2016-08-10 03:42:30 UTC
(In reply to Fabian Deutsch from comment #11)
> Please try to update to
> redhat-virtualization-host-image-update-4.0-20160810.0

Hi Fabian, there is still no boot entry for the new imgbased, so I cannot check cockpit on it, and there were some failure messages during the update.

I kept the RHVH env and sent you an email with the env info.

Comment 13 Huijuan Zhao 2016-08-11 02:40:14 UTC
I still encounter this issue, same as comment 0, with build redhat-virtualization-host-image-update-4.0-20160810.1.el7_2.noarch.rpm

Comment 14 Huijuan Zhao 2016-08-12 03:04:57 UTC
I still encounter this issue on redhat-virtualization-host-image-update-4.0-20160811.0, same as comment 0.

How reproducible:
80%

Test version:
redhat-virtualization-host-4.0-20160811.0
imgbased-0.8.4-1.el7ev.noarch
cockpit-ws-0.114-2.el7.x86_64
cockpit-ovirt-dashboard-0.10.6-1.3.6.el7ev.noarch
redhat-virtualization-host-image-update-placeholder-4.0-1.el7.noarch

So I have to change the status to ASSIGNED.

ENV: 10.66.148.10, pw:redhat, you can access host if needed.

Comment 15 Ryan Barry 2016-08-12 03:30:47 UTC
Thanks for testing.

I don't need a test environment, but I am curious whether the environment used for testing differed each time (or whether there is a reliable method to reproduce). I tested this 6 times today and never reproduced it after the patch.

Was a clean installation performed each time, then upgraded? Or was the upgrade rolled back?

Can you provide the results of "rpm --verify cockpit-ws"?

Comment 16 Huijuan Zhao 2016-08-12 03:52:38 UTC
1. In the host after update:
# rpm --verify cockpit-ws
......G..    /usr/libexec/cockpit-session

2. Yes, we used different environments for testing, each a clean install from a kickstart file (attached).
After the update, rolling back to the old imgbase succeeds, and cockpit is accessible in the old imgbase.

3. However, the issue does not occur when updating from redhat-virtualization-host-4.0-20160810.1 to redhat-virtualization-host-4.0-20160811.0.
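
For readers unfamiliar with the `rpm --verify` output in item 1: each line starts with a nine-character attribute string in the order `SM5DLUGTP`, where a letter marks a failed check and a dot a passed one. The small Python sketch below decodes such a line. It is a simplified illustration with hypothetical names; it assumes paths without spaces and ignores the optional file-type marker (such as `c` for config files).

```python
RPM_VERIFY_FLAGS = {
    "S": "file size",
    "M": "mode (permissions and file type)",
    "5": "digest",
    "D": "device major/minor number",
    "L": "symlink path",
    "U": "user ownership",
    "G": "group ownership",
    "T": "modification time",
    "P": "capabilities",
}


def parse_verify_line(line):
    """Return (path, [failed checks]) for one line of `rpm --verify` output."""
    parts = line.split()
    flags, path = parts[0], parts[-1]
    if flags == "missing":
        return path, ["missing"]
    # Dots (passed) and question marks (test not performed) are skipped
    failed = [RPM_VERIFY_FLAGS[c] for c in flags if c in RPM_VERIFY_FLAGS]
    return path, failed


result = parse_verify_line("......G..    /usr/libexec/cockpit-session")
```

For the line reported above, only the group ownership check fails, which is consistent with the group listing in comment 10.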

Comment 17 Huijuan Zhao 2016-08-12 03:59:13 UTC
Created attachment 1190257 [details]
kickstart file

Comment 18 Ryan Barry 2016-08-12 04:11:08 UTC
Sorry, I'm unclear on #3.

This is not reproducible from 20160810.1 when upgraded to 20160811.0? Only from 20160803?

Comment 19 Huijuan Zhao 2016-08-12 04:26:08 UTC
Yes, this is not reproducible when upgrading from 20160810.1 to 20160811.0.
But it does reproduce when upgrading from redhat-virtualization-host-4.0-20160727.1 to 20160811.0.

I did not try from 20160803; let me try it now.

Comment 20 Ryan Barry 2016-08-12 04:33:40 UTC
Don't worry about it -- the uid/gid drifted after 20160727; I was just curious which image you upgraded from. I tested from 20160803, but I'll try 20160727 tomorrow.

Comment 21 Huijuan Zhao 2016-08-12 05:12:12 UTC
OK, thank you Ryan for your testing, good night.

Yes, you are right, it cannot be reproduced from 20160803.

To summarize our testing:
1. Can reproduce from 20160727
2. Cannot reproduce from 20160803 or 20160810.1

Comment 22 Fabian Deutsch 2016-08-12 11:30:21 UTC
There was a problem with the spec; it is fixed now:

# rpm -e redhat-virtualization-host-image-update-4.0-20160811.0.el7_2.noarch
# rpm -ivh redhat-virtualization-host-image-update-4.0-20160811.0.el7_2.noarch.rpm
Preparing...                          ################################# [100%]
Updating / installing...

   1:redhat-virtualization-host-image-################################# [100%]

# mount /dev/mapper/r4b_dell--pet105--02-rhvh--4.0--0.20160811.0+1 l
# chroot l
# ls -shalr /usr/libexec/cockpit-*
268K -rwxr-xr-x. 1 root root       265K 15. Jul 11:13 /usr/libexec/cockpit-ws
304K -rwxr-xr-x. 1 root root       303K 15. Jul 11:13 /usr/libexec/cockpit-stub
 28K -rwsr-x---. 1 root cockpit-ws  28K 15. Jul 11:13 /usr/libexec/cockpit-session
 24K -rwsr-xr-x. 1 root root        24K 15. Jul 11:13 /usr/libexec/cockpit-polkit


The group of cockpit-session (cockpit-ws) is now correct.

Comment 23 Huijuan Zhao 2016-08-15 07:43:09 UTC
This issue is fixed in redhat-virtualization-host-4.0-20160812.0. Detailed testing info is below:

Test version:
redhat-virtualization-host-4.0-20160812.0
imgbased-0.8.4-1.el7ev.noarch
cockpit-ws-0.114-2.el7.x86_64
cockpit-ovirt-dashboard-0.10.6-1.3.6.el7ev.noarch
redhat-virtualization-host-image-update-placeholder-4.0-1.el7.noarch


Test Steps:
1. Install redhat-virtualization-host-4.0-20160727.1
2. Log in to RHVH and set up local repos
3. Log in to cockpit via "ip:9090"
4. Upgrade RHVH:
   # yum update
5. Reboot and log in to the new build redhat-virtualization-host-4.0-20160812.0
6. Log in to cockpit via "ip:9090"


Test results:
After step 6, the cockpit login page displays normally and login succeeds.

I will change the status to VERIFIED when the status changes to ON_QA

Comment 24 Ying Cui 2016-08-15 12:18:38 UTC
VERIFIED according to comment 23

