Bug 1980757

Summary: "dns_tkey_gssnegotiate: TKEY is unacceptable" during ipa-client-install
Product: Red Hat Enterprise Linux 7 Reporter: Rakesh Kumar <rakkumar>
Component: bind Assignee: Petr Menšík <pemensik>
Status: CLOSED NEXTRELEASE QA Contact: rhel-cs-infra-services-qe <rhel-cs-infra-services-qe>
Severity: high Docs Contact:
Priority: unspecified    
Version: 7.9 CC: aegorenk, jorton, mescanfe
Target Milestone: rc Keywords: Triaged
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1980916 Environment:
Last Closed: 2022-01-24 11:47:36 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1772888    
Bug Blocks: 1980916    

Description Rakesh Kumar 2021-07-09 13:01:09 UTC
Description of problem: 
ipa-client-install fails to update DNS records with "dns_tkey_gssnegotiate: TKEY is unacceptable". The issue seems to trigger only when multiple clients are enrolled simultaneously, for example by using Puppet.


Version-Release number of selected component (if applicable):

Bind Version:
=====================
bind-export-libs-9.11.4-26.P2.el7_9.5.x86_64                
bind-libs-9.11.4-26.P2.el7_9.5.x86_64                      
bind-libs-lite-9.11.4-26.P2.el7_9.5.x86_64                 
bind-license-9.11.4-26.P2.el7_9.5.noarch                    
bind-utils-9.11.4-26.P2.el7_9.5.x86_64                     
rpcbind-0.2.0-49.el7.x86_64 
==============================

sssd : sssd-1.16.5-10.el7_9.7.x86_64
IPA Client: 
ipa-client-4.6.8-5.el7_9.5.x86_64
How reproducible:


Steps to Reproduce:
1.
==================================================
2021-05-13T12:42:47Z DEBUG stderr=Reply from SOA query:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id:  13812
;; flags: qr aa rd ra; QUESTION: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 0
;; QUESTION SECTION:
;test.idm.example.net. IN        SOA

;; AUTHORITY SECTION:
idm.example.net.     0       IN      SOA     test.idm.example.net. hostmaster.example.com. 11000000 3000 900 120000 3000

Found zone name: idm.example.net
The master is: test.idm.example.net
start_gssrequest
Found realm from ticket: example.net
send_gssrequest
recvmsg reply from GSS-TSIG query
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id:  45612
;; flags: qr ra; QUESTION: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;3111111111.sig-ops-oeb-test.idm.example.net. ANY TKEY

;; ANSWER SECTION:
3111111111.sig-ops-test.idm.example.net. 0 ANY TKEY gss-tsig. 0 0 3 BADNAME 0  0

dns_tkey_gssnegotiate: TKEY is unacceptable

2021-05-13T12:42:47Z DEBUG nsupdate failed: Command '/usr/bin/nsupdate -g /etc/ipa/.dns_update.txt' returned non-zero exit status 1
2021-05-13T12:42:47Z ERROR Failed to update DNS records. <<<<<<<<<<<<<<<<<<<<<<
2.
==============================================================
2021-05-13T12:42:47Z WARNING Hostname (test.idm.example.net) does not have A/AAAA record.
2021-05-13T12:42:47Z DEBUG IP check failed: cannot use loopback IP address 127.0.0.1
2021-05-13T12:42:47Z DEBUG IP check failed: cannot use loopback IP address ::1
2021-05-13T12:42:47Z DEBUG IP check successful: 10.x.x.x
2021-05-13T12:42:47Z DEBUG IP check failed: cannot use link-local IP address gggg::1gg:11gg:gg11:11gg%ggg1
2021-05-13T12:42:47Z DEBUG Searching for an interface of IP address: 10.x.x.x


dns_tkey_gssnegotiate: TKEY is unacceptable

2021-05-13T12:42:47Z DEBUG nsupdate failed: Command '/usr/bin/nsupdate -g /etc/ipa/.dns_update.txt' returned non-zero exit status 1
2021-05-13T12:42:47Z ERROR Failed to update DNS records. <<<<<<<<<<<<<<<<<<<


2021-05-13T12:42:47Z DEBUG DNS resolver: Query: test.idm.example.net IN A
2021-05-13T12:42:47Z DEBUG DNS resolver: No record.
2021-05-13T12:42:47Z DEBUG DNS resolver: Query: test.idm.example.net IN AAAA
2021-05-13T12:42:48Z DEBUG DNS resolver: No record.
2021-05-13T12:42:48Z DEBUG DNS resolver: Query: 111.x.x.x.in-addr.arpa. IN PTR
2021-05-13T12:42:48Z DEBUG DNS resolver: No record.
2021-05-13T12:42:48Z WARNING Missing A/AAAA record(s) for host test.idm.example.net: 10.x.x.x
====================================================================
3.

Actual results:


Expected results: IPA client installation should complete successfully.


Additional info:
========================

We currently use the following workaround:

*********************************************
1. Create a file /tmp/nsupdate.txt with content from the ipaclient-install.log file:
---
update delete  test.idm.example.net. IN A
show
send

update delete  test.idm.example.net. IN AAAA
show
send

update add  test.idm.example.net. 1200 IN A 10.x.x.x
show
send
---


*****************************************************

The workaround that we're using is not perfect:



It slows down instance provisioning process

It can mask issues with the IPA cluster - provisioning will take longer to fail if there are genuine issues

The workaround is still prone to failures (we only retry the DNS creation a couple of times).
===============================================
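
For illustration only, the retry part of such a workaround could look like the Python sketch below. It is not the reporter's actual script. The function name run_with_retries, the number of attempts and the delay are hypothetical; only the nsupdate -g invocation and the /tmp/nsupdate.txt path come from the text above.

```python
import subprocess
import time


def run_with_retries(command, attempts=3, delay=2.0,
                     runner=subprocess.run, sleep=time.sleep):
    """Run `command` until it exits with status 0 or attempts run out.

    Returns the 1-based number of the attempt that succeeded,
    or 0 if every attempt failed.
    """
    for attempt in range(1, attempts + 1):
        result = runner(command)
        if result.returncode == 0:
            return attempt
        if attempt < attempts:
            # Linear back-off so concurrent clients do not retry in lockstep.
            sleep(delay * attempt)
    return 0


# Assumed usage, mirroring the command seen in ipaclient-install.log:
#   run_with_retries(["/usr/bin/nsupdate", "-g", "/tmp/nsupdate.txt"])
```

A bounded retry like this keeps the drawbacks listed above: it adds delay to provisioning and can still fail when every attempt fails.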

Comment 4 Petr Menšík 2021-07-09 14:41:34 UTC
This bug is related to bug #1755643

Comment 5 Petr Menšík 2021-07-09 16:16:28 UTC
This error is emitted in just two places, even in the most recent development version. The first is when a key is being deleted and it does not exist [1]. The second is when a new name is requested and a key with that name already exists [2]. In that second path the key name can be created in two ways. The first takes the qname from the QUESTION section, if it is not the root name. The second is more interesting: the name is randomly generated from isc_nonce_buf(). A bit later the code verifies that such a key does not already exist [2]. If it does, the code does not try a new nonce but returns a failure. It seems to me it would be quite safe to regenerate the nonce name a few times until an unused name is generated. A simple loop with retries might help, without caring about the random qualities of the nonce.

Now the question is on which side the name is generated. The client obviously does not have this information until it receives an error from the server. It seems to me that retrying with another name would be a safe way to fix this issue. I am not sure why it was not a problem with more recent versions.

1. https://gitlab.isc.org/isc-projects/bind9/-/blob/main/lib/dns/tkey.c#L671
2. https://gitlab.isc.org/isc-projects/bind9/-/blob/main/lib/dns/tkey.c#L869
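
The "simple loop with retries" idea can be sketched as follows. This is Python pseudocode for the approach, not BIND's C code from tkey.c. The function pick_unused_key_name, the existing set that stands in for the server's key ring, the suffix and max_tries are all hypothetical.

```python
import secrets


def pick_unused_key_name(existing, suffix, max_tries=5,
                         rand32=lambda: secrets.randbits(32)):
    """Return a randomly named key that is not in `existing`.

    Instead of failing on the first collision, a new random label is
    drawn up to `max_tries` times. Returns None if every draw collides.
    """
    for _ in range(max_tries):
        name = f"{rand32()}.{suffix}"
        if name not in existing:
            return name
    return None


# Assumed usage: one name is already taken, so a fresh one is drawn.
taken = {"7.sig-test.example."}
print(pick_unused_key_name(taken, "sig-test.example."))
```

The retry limit keeps the loop bounded, so a badly behaving generator still ends in a failure rather than spinning forever.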

Comment 6 Petr Menšík 2021-07-09 17:23:17 UTC
The requested name is generated in the start_gssrequest function of nsupdate.c [1]. Recent upstream versions use isc_nonce_buf. In 9.11, isc_random_get is used to fetch the random value. It seems to me that relying on a better random generator should help. But I am not sure that just 32 bits of random value should make such a difference, when the only problem is that the random number is already taken on the server and another random number would fix the situation. Especially when the reason is clearly specified inside the TKEY reply. It seems to me nsupdate should just retry with another number when it starts the GSS session.

1. https://gitlab.isc.org/isc-projects/bind9/-/blob/main/bin/nsupdate/nsupdate.c#L2971
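
As a rough sanity check of the 32-bit question, the birthday approximation below estimates how likely it is that at least two of n key names collide. It assumes the random values are uniform and independent, which nothing in this bug confirms for isc_random_get on simultaneously provisioned clients. The function name and the sample sizes are illustrative.

```python
import math


def collision_probability(n_keys, bits=32):
    """Birthday approximation of P(at least two of n_keys values are equal).

    Uses 1 - exp(-n(n-1) / (2 * 2**bits)), valid for uniform,
    independent values.
    """
    space = 2.0 ** bits
    return -math.expm1(-n_keys * (n_keys - 1) / (2.0 * space))


for n in (10, 100, 1000, 10000):
    print(f"{n:>6} concurrent keys: {collision_probability(n):.3e}")
```

If collisions show up far more often than this estimate for the observed number of clients, that would point at correlated values between clients rather than at the 32-bit size itself.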

Comment 7 Petr Menšík 2021-07-09 20:24:45 UTC
The same problem is still present in the latest v9_11 branch of upstream, and also in the more recent version in RHEL8. It should not be present in the 9.16 version in RHEL9. Or, more precisely, it should be much less likely to happen. I think the real problem is still in the code; only the likelihood that it occurs is much lower with the entropy usage.

Comment 9 Petr Menšík 2021-07-10 11:46:47 UTC
Unfortunately, upstream is not willing to make any more changes to the 9.11 branch because it is close to EOL and this change is not security related. Because it happens only under specific conditions, I am not sure how much it needs to be fixed in RHEL7 as well. It does not seem to be a critical fix, and the problem has been present at least since the 9.11 rebase.