Bug 1560951

Summary: RHEL7.5 - Hadoop Datanode service throws exception with Kerberos security enabled
Product: Red Hat Enterprise Linux 7 Reporter: Yussuf Shaikh <yussuf>
Component: krb5 Assignee: Robbie Harwood <rharwood>
Status: CLOSED CURRENTRELEASE QA Contact: BaseOS QE Security Team <qe-baseos-security>
Severity: urgent Docs Contact:
Priority: high    
Version: 7.5 CC: bugproxy, dpal, fnovak, hannsj_uhl, jjarvis, jkachuck, pkis, rharwood, yussuf
Target Milestone: rc Keywords: Patch
Target Release: 7.6   
Hardware: ppc64le   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-04-26 17:38:26 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Bug Depends On:    
Bug Blocks: 1507957, 1513404    
Attachments:
Description Flags
datanode log from the server none

Description Yussuf Shaikh 2018-03-27 10:44:46 UTC
Created attachment 1413666
datanode log from the server

Description of problem:
The Hadoop Datanode service fails with the error shown in the attached hadoop-hdfs-datanode.log.

The following errors were seen in the Kerberos log:
Mar 27 14:48:17 pts00433-vm38.persistent.co.in krb5kdc[8737](info): TGS_REQ (1 etypes {16}) 10.77.67.132: PROCESS_TGS: authtime 0,  dn/pts00433-vm38.persistent.co.in@EXAMPLE.COM for nn/pts00433-vm38.persistent.co.in@EXAMPLE.COM, Ticket expired
Mar 27 14:48:55 pts00433-vm38.persistent.co.in krb5kdc[8737](info): TGS_REQ (4 etypes {18 17 16 23}) 10.77.67.132: PROCESS_TGS: authtime 0,  nn/pts00433-vm38.persistent.co.in@EXAMPLE.COM for nn/pts00433-vm38.persistent.co.in@EXAMPLE.COM, Ticket expired
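
For triage, the TGS_REQ entries above can be split into fields with a small script. This is only a sketch: the regular expression is modelled on the two log lines quoted in this report (an assumption; other krb5kdc log formats may differ), and the names `KDC_TGS_RE` and `parse_tgs_line` are hypothetical helpers, not part of krb5.

```python
import re

# Pattern modelled on the two krb5kdc lines quoted in this report.
# Assumption: other KDC log entries may not follow this exact layout.
KDC_TGS_RE = re.compile(
    r"TGS_REQ \(\d+ etypes \{(?P<etypes>[\d ]+)\}\) (?P<addr>\S+): "
    r"(?P<stage>\w+): authtime (?P<authtime>\d+),\s+"
    r"(?P<client>\S+) for (?P<server>\S+), (?P<status>.+)$"
)


def parse_tgs_line(line):
    """Return a dict of fields from one TGS_REQ log line, or None if no match."""
    match = KDC_TGS_RE.search(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    # Convert the numeric fields so they can be filtered or compared.
    record["etypes"] = [int(e) for e in record["etypes"].split()]
    record["authtime"] = int(record["authtime"])
    return record


if __name__ == "__main__":
    sample = (
        "Mar 27 14:48:17 pts00433-vm38.persistent.co.in krb5kdc[8737](info): "
        "TGS_REQ (1 etypes {16}) 10.77.67.132: PROCESS_TGS: authtime 0,  "
        "dn/pts00433-vm38.persistent.co.in@EXAMPLE.COM for "
        "nn/pts00433-vm38.persistent.co.in@EXAMPLE.COM, Ticket expired"
    )
    print(parse_tgs_line(sample))
```

Filtering the parsed records on `status == "Ticket expired"` gives a quick count of how often each client principal hits the failure.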

# krb5-config --version
Kerberos 5 release 1.15.1
# uname -a
Linux pts00433-vm38.persistent.co.in 3.10.0-830.el7.ppc64le #1 SMP Mon Jan 15 12:26:57 EST 2018 ppc64le ppc64le ppc64le GNU/Linux
# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 7.5 Beta (Maipo)

Version-Release number of selected component (if applicable):
# yum list installed | grep krb
krb5-devel.ppc64le                 1.15.1-18.el7       installed
krb5-libs.ppc64le                  1.15.1-18.el7       @anaconda/7.5
krb5-pkinit.ppc64le                1.15.1-18.el7       installed
krb5-server.ppc64le                1.15.1-18.el7       installed
krb5-workstation.ppc64le           1.15.1-18.el7       installed

How reproducible:
The errors are logged multiple times in the Datanode service log and also in the logs of other services, e.g. Ambari Infra Solr and Namenode.

Steps to Reproduce:
1. Install HDP 2.6.4 with Ambari 2.6.1.
2. Enable the Kerberos security configuration in Ambari.
3. Start the Datanode service.
4. Check the Datanode log for the exception.

Actual results:
The following error appears in the service log:
2018-03-27 14:46:44,739 WARN  ipc.Client (Client.java:run(711)) - Couldn't setup connection for dn/pts00433-vm38.persistent.co.in@EXAMPLE.COM to pts00433-vm38.persistent.co.in/10.77.67.132:8020
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Ticket expired (32) - PROCESS_TGS)]
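
When scanning service logs for this failure, the Kerberos detail inside the SaslException message can be pulled out as shown below. This is a sketch only: the pattern is derived from the single exception line quoted above, and `GSS_DETAIL_RE` and `parse_gss_failure` are hypothetical names. Code 32 is the Kerberos "ticket expired" error (KRB_AP_ERR_TKT_EXPIRED), which matches the message text in the log.

```python
import re

# Pattern derived from the exception text quoted in this report.
# Assumption: other GSSException messages may be formatted differently.
GSS_DETAIL_RE = re.compile(
    r"Mechanism level: (?P<message>.+?) \((?P<code>\d+)\)(?: - (?P<stage>\w+))?"
)


def parse_gss_failure(text):
    """Extract the message, numeric code and optional stage, or return None."""
    match = GSS_DETAIL_RE.search(text)
    if match is None:
        return None
    return {
        "message": match.group("message"),
        "code": int(match.group("code")),
        "stage": match.group("stage"),  # None when no stage suffix is present
    }


if __name__ == "__main__":
    sample = (
        "javax.security.sasl.SaslException: GSS initiate failed [Caused by "
        "GSSException: No valid credentials provided (Mechanism level: "
        "Ticket expired (32) - PROCESS_TGS)]"
    )
    print(parse_gss_failure(sample))
```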

Expected results:
No Kerberos exception is raised for the Datanode or any other HDP service.

Additional info:

Comment 2 Yussuf Shaikh 2018-03-27 14:21:07 UTC
The errors are not seen when Ambari on RHEL 7.5 is pointed to a Kerberos server running on RHEL 7.3.
Note: the difference is the krb5 build version on each platform, i.e. 1.15.1-8.el7 on RHEL 7.3 and 1.15.1-18.el7 on RHEL 7.5.

Comment 3 Yussuf Shaikh 2018-03-29 12:15:25 UTC
With krb5 versions 1.15.1 and 1.15.2 compiled from source on Power RHEL 7.5, we could not reproduce the issue. These are maintenance releases from the community. We were not able to find the exact source for krb5 build versions 1.15.1-8 or 1.15.1-18.

Comment 4 Robbie Harwood 2018-03-29 15:43:36 UTC
Hi, I'm aware of this issue and plan to fix it with rhel-7.5 GA: krb5-1.15.1-19.

If you want test packages until then: https://rharwood.fedorapeople.org/packaging/krb5-1.15.1-19.el7/

Comment 5 Hanns-Joachim Uhl 2018-03-29 16:07:19 UTC
(In reply to Robbie Harwood from comment #4)
> Hi, I'm aware of this issue and planned to fix it with rhel-7.5 GA -
> krb5-1.15.1-19.
> 
Hello Red Hat / Robbie or Joe,
With RHEL 7.5 being closed, is there already a 7.5.z z-stream Bugzilla
open for this issue?
If yes, can you please authorize us for that Bugzilla?
Please advise.
Thanks in advance for your support.

Comment 6 Joseph Kachuck 2018-03-29 19:43:27 UTC
Hello,
We cannot request a Z-stream until we have a fix that has been approved for RHEL 7.6.

Robbie,
Would you be able to confirm whether there is a RHEL 7.5 BZ for comment 4?

Thank You
Joe Kachuck

Comment 7 Robbie Harwood 2018-03-29 21:01:30 UTC
As per comment 4: the fix will be released as a Z-stream (0day) for 7.5. If you need packages before then, they have been provided.

Comment 8 Yussuf Shaikh 2018-03-30 11:29:20 UTC
The error is not occurring with krb5-1.15.1-19.el7.
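
To check whether a host already runs a build at or above the one reported fixed in this bug, the version-release strings from `yum list installed` can be compared as in the sketch below. Assumptions: the parser handles only the simple `version-release.dist` form seen in this report and is not a full RPM version comparison, and `parse_krb5_build` and `at_least_fixed_build` are hypothetical names. Note that older builds are not necessarily affected; comment 2 reports no errors with 1.15.1-8.el7.

```python
import re

def parse_krb5_build(version_release):
    """Split e.g. '1.15.1-18.el7' into ((1, 15, 1), 18).

    Simplified on purpose: handles only numeric versions and a numeric
    release followed by an optional dist tag, not general RPM ordering.
    """
    match = re.fullmatch(r"(\d+(?:\.\d+)*)-(\d+)(?:\..*)?", version_release)
    if match is None:
        raise ValueError(f"unrecognised version-release: {version_release!r}")
    version = tuple(int(part) for part in match.group(1).split("."))
    return version, int(match.group(2))


# Build reported in this bug as no longer showing the error.
FIXED_BUILD = parse_krb5_build("1.15.1-19.el7")


def at_least_fixed_build(version_release):
    """True if the given build is the fixed build or newer."""
    return parse_krb5_build(version_release) >= FIXED_BUILD


if __name__ == "__main__":
    for build in ("1.15.1-18.el7", "1.15.1-19.el7"):
        print(build, at_least_fixed_build(build))
```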