Bug 1251151

Summary: Can not login RHEV-H after automatic installation on physical machine: Authentication token manipulation error
Product: Red Hat Enterprise Virtualization Manager Reporter: Ying Cui <ycui>
Component: ovirt-nodeAssignee: Douglas Schilling Landgraf <dougsland>
Status: CLOSED ERRATA QA Contact: Ying Cui <ycui>
Severity: urgent Docs Contact:
Priority: urgent    
Version: 3.5.4CC: cshao, cwu, fdeutsch, gklein, huiwa, leiwang, lsurette, mburman, mgoldboi, rbarry, tlitovsk, ycui, ykaul
Target Milestone: ovirt-3.6.1Keywords: AutomationBlocker, TestBlocker, ZStream
Target Release: 3.6.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: ovirt-node-3.3.0-0.14.20151013git5f84da0.el7 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1269840 (view as bug list) Environment:
Last Closed: 2016-03-09 14:34:33 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Node RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1199035, 1254510, 1269840    
Attachments:
Description Flags
sosreport
none
varlog
none
all_info
none
varlog_ovirt-node-3.3.0-0.13.20151008git03eefb5
none
sosreport_ovirt-node-3.3.0-0.13.20151008git03eefb5 none

Description Ying Cui 2015-08-06 14:54:51 UTC
Description of problem:
Can not login RHEV-H after automatic installation.

Boot media: 
  USB - encounter this issue.
  CDROM - encounter this issue.
  We did not test it over PXE, because we are blocked by ticket https://engineering.redhat.com/rt/Ticket/Display.html?id=364821

arguments: storage_init= storage_vol= BOOTIF= firstboot adminpw=

-------
Please login as 'admin' to configure the node
localhost login: admin

Password: 
You are required to change your password immediately (root enforced)
Changing password for admin.
(current) UNIX password: 

Authentication token manipulation error
-------

Version-Release number of selected component (if applicable):
rhev-hypervisor-7-7.1-20150805.0
rhev-hypervisor-6-6.7-20150804.0


How reproducible:
100% on USB boot and CDROM boot on _physical_ machine.
Can _not_ reproduce it on VM.

Steps to Reproduce:
1. Auto install RHEV-H with below parameters:
BOOTIF= storage_init=/dev/sda adminpw= storage_vol= firstboot
2. Login the hypervisor.
3. Change password as a must.

Actual results:
1. Can not login RHEV-H, can not change password for admin

Expected results:
Can login RHEVH after automatic installation with BOOTIF= storage_init= adminpw= storage_vol= firstboot

Comment 1 Ying Cui 2015-08-06 14:57:30 UTC
Created attachment 1060006 [details]
sosreport

Comment 2 Ying Cui 2015-08-06 14:58:18 UTC
Created attachment 1060008 [details]
varlog

Comment 5 Ryan Barry 2015-08-06 16:42:28 UTC
I'm not able to reproduce this on a local system (tested CDROM installs on a R300 and a 9020), and I was able to immediately log into QE's system and change the password as normal.

Testing USB now, but this looks ok to me so far.

I'll keep trying to reproduce.

Ying -

If you get a chance, can you log in as root and provide the last couple logs from dmesg and the journal when this happens?

Also, output from attaching strace to login during a failed attempt may be very long, but potentially helpful.

Comment 6 Ryan Barry 2015-08-06 17:58:09 UTC
I'm also not able to reproduce this with USB installs

Comment 7 Ying Cui 2015-08-07 06:41:33 UTC
Ryan, for dmesg log and secure log, you can check above the attachment 1060008 [details].

<snip>
[    0.000000] Command line: initrd=initrd0.img root=live:CDLABEL=rhev-hypervisor7-7.1-20150805.0 rootfstype=auto ro rd.live.image rd.live.check crashkernel=256M rd_NO_MULTIPATH rootflags=ro elevator=deadline install quiet max_loop=256 rhgb rd.luks=0 rd.md=0 rd.dm=0 storage_init=/dev/sda storage_vol=:::::: adminpw=OKr05SbCu3D3g firstboot BOOTIF=em1 BOOT_IMAGE=vmlinuz0
</snip>


# cat /var/log/secure 

Aug  7 03:51:35 dhcp-10-118 usermod[1810]: change user 'admin' password
Aug  7 03:51:35 dhcp-10-118 chage[1816]: changed password expiry for admin
Aug  7 03:51:47 dhcp-10-118 sshd[1935]: Server listening on 0.0.0.0 port 22.
Aug  7 03:51:47 dhcp-10-118 sshd[1935]: Server listening on :: port 22.
Aug  7 03:51:35 dhcp-10-118 usermod[1810]: change user 'admin' password
Aug  7 03:51:35 dhcp-10-118 chage[1816]: changed password expiry for admin
Aug  7 03:51:47 dhcp-10-118 sshd[1935]: Server listening on 0.0.0.0 port 22.
Aug  7 03:51:47 dhcp-10-118 sshd[1935]: Server listening on :: port 22.
Aug  7 03:57:23 localhost sshd[2419]: Server listening on 0.0.0.0 port 22.
Aug  7 03:57:23 localhost sshd[2419]: Server listening on :: port 22.
Aug  7 03:57:59 localhost login: pam_unix(login:account): expired password for user admin (root enforced)
Aug  7 03:58:01 localhost unix_chkpwd[16708]: password check failed for user (admin)
Aug  7 03:58:01 localhost login: pam_unix(login:chauthtok): authentication failure; logname=LOGIN uid=0 euid=0 tty=tty1 ruser= rhost=  user=admin
Aug  7 03:58:04 localhost login: Authentication token manipulation error
Aug  7 04:06:07 localhost sshd[2444]: Server listening on 0.0.0.0 port 22.
Aug  7 04:06:07 localhost sshd[2444]: Server listening on :: port 22.
Aug  7 04:06:34 localhost login: pam_unix(login:account): expired password for user admin (root enforced)
Aug  7 04:06:37 localhost unix_chkpwd[16595]: password check failed for user (admin)
Aug  7 04:06:37 localhost login: pam_unix(login:chauthtok): authentication failure; logname=LOGIN uid=0 euid=0 tty=tty1 ruser= rhost=  user=admin
Aug  7 04:06:39 localhost login: Authentication token manipulation error
Aug  7 06:14:37 localhost sshd[2446]: Server listening on 0.0.0.0 port 22.
Aug  7 06:14:37 localhost sshd[2446]: Server listening on :: port 22.
Aug  7 06:15:22 localhost login: pam_unix(login:account): expired password for user admin (root enforced)
Aug  7 06:15:26 localhost unix_chkpwd[16605]: password check failed for user (admin)
Aug  7 06:15:26 localhost login: pam_unix(login:chauthtok): authentication failure; logname=LOGIN uid=0 euid=0 tty=tty1 ruser= rhost=  user=admin
Aug  7 06:15:27 localhost login: Authentication token manipulation error

# cat /etc/pam.d/password-auth
#%PAM-1.0
# This file is auto-generated.
# User changes will be destroyed the next time authconfig is run.
auth        required      pam_env.so
auth        sufficient    pam_unix.so try_first_pass nullok
auth        required      pam_deny.so

account     required      pam_unix.so

password    requisite     pam_pwquality.so try_first_pass local_users_only retry=3 authtok_type=
password    sufficient    pam_unix.so try_first_pass use_authtok nullok sha512 shadow
password    required      pam_deny.so

session     optional      pam_keyinit.so revoke
session     required      pam_limits.so
-session     optional      pam_systemd.so
session     [success=1 default=ignore] pam_succeed_if.so service in crond quiet use_uid
session     required      pam_unix.so

# cat /etc/pam.d/system-auth
#%PAM-1.0
# This file is auto-generated.
# User changes will be destroyed the next time authconfig is run.
auth        required      pam_env.so
auth        sufficient    pam_unix.so try_first_pass nullok
auth        required      pam_deny.so

account     required      pam_unix.so

password    requisite     pam_pwquality.so try_first_pass local_users_only retry=3 authtok_type=
password    sufficient    pam_unix.so try_first_pass use_authtok nullok sha512 shadow
password    required      pam_deny.so

session     optional      pam_keyinit.so revoke
session     required      pam_limits.so
-session     optional      pam_systemd.so
session     [success=1 default=ignore] pam_succeed_if.so service in crond quiet use_uid
session     required      pam_unix.so

# getent passwd admin
admin:x:1001:1001::/home/admin:/usr/libexec/ovirt-admin-shell

# getent shadow admin
admin:OKr05SbCu3D3g:0:0:99999:7:::

# ls -Z /etc/passwd
-rw-r--r--. root root system_u:object_r:passwd_file_t:s0 /etc/passwd

# ls -Z /etc/shadow
----------. root root system_u:object_r:shadow_t:s0    /etc/shadow

Comment 8 Ying Cui 2015-08-07 07:32:05 UTC
Created attachment 1060252 [details]
all_info

including fresh logs:
journalctl  sosreport-localhost-20150807070827.tar.xz  varlog.tar.bz2

Comment 9 Ying Cui 2015-08-07 10:42:00 UTC
I lower the severity to medium. Because we can reproduce this issue on _previous_ RHEV-H as well rhev-hypervisor7-7.1-20150603.0.iso and rhevh-6.6-20150512.0.el6ev.iso. And devel side can not reproduce it.

Comment 12 Ying Cui 2015-09-18 09:42:57 UTC
Moved this severity to Urgent again, because QE encountered this issue lots of times during automatic installation on physical machine. It is affecting our test now.

Comment 14 Fabian Deutsch 2015-09-30 11:05:16 UTC
Can you please check with enforcing=0 during installation?

Comment 15 Douglas Schilling Landgraf 2015-10-01 00:29:31 UTC
Hi Ying,

Is it possible to autoinstall adding rootpw as well in the kernel args params and provide the machine via ssh for us to debug? 

Thanks!

Comment 16 Fabian Deutsch 2015-10-02 14:57:50 UTC
*** Bug 1246833 has been marked as a duplicate of this bug. ***

Comment 17 Anatoly Litovsky 2015-10-06 10:45:59 UTC
Hi 
I can see this too with the auto install scripts.
Ping me on IRC and I can provide machines with the error .

Comment 18 Anatoly Litovsky 2015-10-06 11:29:38 UTC
Neither enforcing=0 or rootpw helps

Comment 19 Fabian Deutsch 2015-10-06 12:29:05 UTC
Can you attach logs from the failed run?

Comment 20 Fabian Deutsch 2015-10-08 09:14:38 UTC
A summary on the observations: ntpd.service is using the PrivateTmp service directive, which creates a slave subtree of / with private /tmp and /var/tmp.
Somehow, the exact reason is still unknown, the umounts of the bind mounts in the root ns are not correctly propagated to the ntpd mount ns - thus the bind mounts are kept there.
Because the bind mounts still exist in th entpd mount ns, the files can not be removed (because they still serve as targets for the bind mounts in the ntpd mounts ns).

A workaround is to stop ntpd or (this is the fix for now) launch ntpd without PrivateTmp.

Comment 22 Fabian Deutsch 2015-10-09 05:58:26 UTC
*** Bug 1242366 has been marked as a duplicate of this bug. ***

Comment 23 Fabian Deutsch 2015-10-09 09:51:36 UTC
*** Bug 1270203 has been marked as a duplicate of this bug. ***

Comment 24 Ying Cui 2015-10-12 04:02:26 UTC
There two new patches 47167 and 47168 are review in Progress, not merged into Fixed In Version: ovirt-node-3.3.0-0.11.20151008git33a0533.el7ev.

And tested on build rhev-hypervisor7-7.2-20151009.0(ovirt-node-3.3.0-0.13.20151008git03eefb5.el7ev.noarch)
Still can reproduce this error the same as original bug description.

So it is still affecting our automated installation testing, about 60+ test cases.

Comment 25 Ying Cui 2015-10-12 07:22:40 UTC
After confirmed with Fabian, patches 47167 and 47168 are for 3.5.z.

So all patches are already included into rhev-hypervisor7-7.2-20151009.0(ovirt-node-3.3.0-0.13.20151008git03eefb5.el7ev.noarch).

Then I have to assign this bug, new fresh log will be provided later.

Comment 26 Ying Cui 2015-10-12 08:42:56 UTC
Created attachment 1081907 [details]
varlog_ovirt-node-3.3.0-0.13.20151008git03eefb5

Comment 27 Ying Cui 2015-10-12 08:44:27 UTC
Created attachment 1081908 [details]
sosreport_ovirt-node-3.3.0-0.13.20151008git03eefb5

Comment 41 errata-xmlrpc 2016-03-09 14:34:33 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0378.html