Description of problem:
In an environment with two MS Active Directory servers (Windows 2008 R2) working in failover and a RHEL5 client authenticating against them, when one of the AD servers goes down, changing a password through kpasswd/passwd fails with a (wrong) error.

Version-Release number of selected component (if applicable):
RHEL5 krb5-libs-1.6.1-36.el5_5.6

How reproducible:
always

Steps to Reproduce:
1. Have two MS AD servers (Windows 2008 R2) and one RHEL5 client authenticating against them, using the ldap service running on the ADs to get user/group account information
2. Turn off one of the ADs
3. Log in to the RHEL client machine as a user from ldap and change that user's password

Actual results:
$ passwd
Changing password for user wuser8.
Kerberos 5 Password:
New UNIX password:
Retype new UNIX password:
passwd: Authentication token manipulation error
$

$ tail -f /var/log/secure
...
Dec 1 17:52:13 webmail1 passwd: pam_krb5[30556]: password change failed for wuser8@: Requested protocol version not supported
Dec 1 17:52:13 webmail1 passwd: pam_krb5[30556]: pam_chauthtok (updating authtok) returning 20 (Authentication token manipulation error)

Expected results:
According to the data returned from the kpasswd service, it should return "Request is a replay" rather than "Requested protocol version not supported" (see the 'Additional info:' section).

Additional info:
From tcpdump we can see the following communication:
1. AS-REQ -> server
2. AS-REP -> client
3. AP-REQ -> server
4. no response from server
5. AP-REQ -> server
6. KRB5KRB_AP_ERR_REPEAT -> client

Currently, we still don't know why the MS AD/kpasswd service returns a KRB5KRB_AP_ERR_REPEAT error to the client instead of an AP-REP, even though the password gets changed correctly (there seems to be some issue in the MS AD kpasswd service during failover).

The kerberos library wrongly parses the error reply packet. The data in the packet are correct, but it is an error reply, not a correct reply to the AP-REQ (requesting a password change).
The related function contains logic that in some situations doesn't return an error and instead falls through to the following code, which leads to returning the wrong error - "Requested protocol version not supported".

First, let's look at the data we get from the server:

Breakpoint 3, krb5int_rd_chpw_rep (context=0xb56da20, auth_context=0xc0f3590, packet=0x7fffba587a00, result_code=0x7fffba5879cc, result_data=0x7fffba587af0) at chpw.c:92
92          if (packet->length < 4)
(gdb) x/114x packet->data
0xc0c6fd0: 0x7e 0x70 0x30 0x6e 0xa0 0x03 0x02 0x01
0xc0c6fd8: 0x05 0xa1 0x03 0x02 0x01 0x1e 0xa4 0x11
0xc0c6fe0: 0x18 0x0f 0x32 0x30 0x31 0x30 0x31 0x31
0xc0c6fe8: 0x32 0x35 0x31 0x35 0x33 0x37 0x30 0x38
0xc0c6ff0: 0x5a 0xa5 0x05 0x02 0x03 0x0c 0xb9 0xfe
0xc0c6ff8: 0xa6 0x03 0x02 0x01 0x22 0xa9 0x1e 0x1b
0xc0c7000: 0x1c 0x57 0x32 0x4b 0x38 0x52 0x32 0x2e
0xc0c7008: 0x47 0x53 0x53 0x4c 0x41 0x42 0x2e 0x50
0xc0c7010: 0x4e 0x51 0x2e 0x52 0x45 0x44 0x48 0x41
0xc0c7018: 0x54 0x2e 0x43 0x4f 0x4d 0xaa 0x1d 0x30
0xc0c7020: 0x1b 0xa0 0x03 0x02 0x01 0x02 0xa1 0x14
0xc0c7028: 0x30 0x12 0x1b 0x06 0x6b 0x61 0x64 0x6d
0xc0c7030: 0x69 0x6e 0x1b 0x08 0x63 0x68 0x61 0x6e
0xc0c7038: 0x67 0x65 0x70 0x77 0xac 0x04 0x04 0x02
0xc0c7040: 0x00 0x03
(gdb)

And the code path (note '<<<---'):

Breakpoint 3, krb5int_rd_chpw_rep (context=0x4b06a20, auth_context=0x568c590, packet=0x7fff1b66ce90, result_code=0x7fff1b66ce5c, result_data=0x7fff1b66cf80) at chpw.c:92
92          if (packet->length < 4)
(gdb) n
97          ptr = packet->data;
(gdb)
101         plen = (*ptr++ & 0xff);
(gdb)
102         plen = (plen<<8) | (*ptr++ & 0xff);
(gdb)
104         if (plen != packet->length)
(gdb)
111             if (krb5_is_krb_error(packet)) {                <<<--- 1)
(gdb)
113                 if ((ret = krb5_rd_error(context, packet, &krberror)))
(gdb)
116                 if (krberror->e_data.data == NULL) {        <<<--- 2)
(gdb)
131         vno = (*ptr++ & 0xff);
(gdb)
132         vno = (vno<<8) | (*ptr++ & 0xff);
(gdb)
134         if (vno != 1)
(gdb)
135             return(KRB5KDC_ERR_BAD_PVNO);
(gdb) p plen
$5 = 32368
(gdb) p packet->length
$6 = 114
(gdb)

In 1) it recognizes an error reply from the packet data (if you look at the data, they start with 0x7e):

1385 #define krb5_is_krb_error(dat)\
1386         ((dat) && (dat)->length && ((dat)->data[0] == 0x7e ||\
1387                                     (dat)->data[0] == 0x5e))

but in 2), when you look at the related code (again note '<<<---'):

krb5-1.6.1/src/lib/krb5/krb/chpw.c:krb5int_rd_chpw_rep():
=== <snip> ===
104     if (plen != packet->length)
105     {
106         /*
107          * MS KDCs *may* send back a KRB_ERROR. Although
108          * not 100% correct via RFC3244, it's something
109          * we can workaround here.
110          */
111         if (krb5_is_krb_error(packet)) {
112
113             if ((ret = krb5_rd_error(context, packet, &krberror)))       <<<--- 1)
114                 return(ret);
115
116             if (krberror->e_data.data == NULL) {                         <<<--- 2)
117                 ret = ERROR_TABLE_BASE_krb5 + (krb5_error_code) krberror->error;
118                 krb5_free_error(context, krberror);
119                 return (ret);
120             }
121         }
122         else
123         {
124             return(KRB5KRB_AP_ERR_MODIFIED);
125         }
126     }
127
128
129     /* verify version number */
130
131     vno = (*ptr++ & 0xff);                                               <<<--- 3)
132     vno = (vno<<8) | (*ptr++ & 0xff);
133
134     if (vno != 1)                                                        <<<--- 4)
135         return(KRB5KDC_ERR_BAD_PVNO);
=== </snip> ===

it simply continues if krberror->e_data.data is not NULL and goes on parsing the packet (3) as if it were a correct reply to a password change. It isn't, so it fails the version number check (4) with a value of 12398 (0x306e) instead of 1, leading to KRB5KDC_ERR_BAD_PVNO ("Requested protocol version not supported").
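To make the numbers above concrete, here is a small standalone sketch (not krb5 code) that reads the first four bytes of the dumped reply the same way chpw.c reads plen and vno. The helper names read_be16() and looks_like_krb_error() and the array reply_head are made up for this illustration; only the byte values and the first-byte test come from the dump and the macro quoted above.

```c
#include <assert.h>
#include <stddef.h>

/* First four bytes of the reply, copied from the gdb dump above. */
static const unsigned char reply_head[4] = { 0x7e, 0x70, 0x30, 0x6e };

/*
 * Hypothetical helper: read a 16-bit big-endian value, mirroring the
 * "x = (*ptr++ & 0xff); x = (x<<8) | (*ptr++ & 0xff);" pattern in chpw.c.
 */
static unsigned int read_be16(const unsigned char *p)
{
    assert(p != NULL);
    return ((unsigned int)(p[0] & 0xff) << 8) | (unsigned int)(p[1] & 0xff);
}

/*
 * Hypothetical helper: the same first-byte test as the
 * krb5_is_krb_error() macro, applied to a plain byte buffer.
 */
static int looks_like_krb_error(const unsigned char *data, size_t len)
{
    return data != NULL && len > 0 &&
           (data[0] == 0x7e || data[0] == 0x5e);
}
```

With these bytes, read_be16(reply_head) gives 32368 (0x7e70), the plen value printed by gdb, and read_be16(reply_head + 2) gives 12398 (0x306e), the bogus vno. Both are simply the ASN.1 header bytes of the KRB-ERROR read as if they were the length and version fields of a change-password reply.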
Actually, I don't know why that condition is there, as krberror->error and krberror->e_data.data don't seem to be related as far as parsing their values from the packet goes:

krb5-1.6.1/src/lib/krb5/asn.1/krb5_decode.c:decode_krb5_error():
=== <snip> ===
709     get_field((*rep)->error,6,asn1_decode_ui_4);                         <<<--- get krberror->error
710     if(tagnum == 7){ alloc_field((*rep)->client,krb5_principal_data); }
711     opt_field((*rep)->client,7,asn1_decode_realm);
712     opt_field((*rep)->client,8,asn1_decode_principal_name);
713     alloc_field((*rep)->server,krb5_principal_data);
714     get_field((*rep)->server,9,asn1_decode_realm);
715     get_field((*rep)->server,10,asn1_decode_principal_name);
716     opt_lenfield((*rep)->text.length,(*rep)->text.data,11,asn1_decode_generalstring);
717     opt_lenfield((*rep)->e_data.length,(*rep)->e_data.data,12,asn1_decode_charstring);   <<<--- get krberror->e_data.data
=== </snip> ===

Maybe they relate to each other in some situations, but after the condition there is no code that checks or does anything with krberror->e_data.data. Moreover, krberror->error contains the correct value, KRB5KRB_AP_ERR_REPEAT. I don't know what krberror->e_data.data should hold, but it is not NULL and points to 2 bytes with the value 0x0003.

I am also not very confident about the behaviour of krb5_rd_error(): why doesn't it return KRB5KRB_AP_ERR_REPEAT even though it sets krberror->error to it? The logic around the retval variable is a little bit tricky and may be easier for the krb5 package maintainer to understand.

The current version of upstream kerberos seems to have this fixed (note '<<<---'), but it still returns a different error (1) than the one sent in the packet (why?):

krb5-1.8.3/src/lib/krb5/krb/chpw.c:krb5int_rd_chpw_rep():
=== <snip> ===
103     if (plen != packet->length) {
104         /*
105          * MS KDCs *may* send back a KRB_ERROR. Although
106          * not 100% correct via RFC3244, it's something
107          * we can workaround here.
108          */
109         if (krb5_is_krb_error(packet)) {
110
111             if ((ret = krb5_rd_error(context, packet, &krberror)))
112                 return(ret);
113
114             if (krberror->e_data.data == NULL)
115                 ret = ERROR_TABLE_BASE_krb5 + (krb5_error_code) krberror->error;
116             else
117                 ret = KRB5KRB_AP_ERR_MODIFIED;                               <<<--- 1)
118             krb5_free_error(context, krberror);
119             return(ret);                                                     <<<---
120         } else {
121             return(KRB5KRB_AP_ERR_MODIFIED);
122         }
123     }
124
125
126     /* verify version number */
127
128     vno = (*ptr++ & 0xff);
129     vno = (vno<<8) | (*ptr++ & 0xff);
130
131     if (vno != 1)
132         return(KRB5KDC_ERR_BAD_PVNO);
=== </snip> ===
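The difference between the two listings can be reduced to a small decision table. The sketch below is a simplified model, not krb5 code: the enum, the struct and the three function names are invented for this illustration, and it assumes krb5_rd_error() itself succeeded. The third function only models what the 'Expected results' section asks for; it is an assumption, not a description of any shipped fix.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical outcomes for this sketch; these are not krb5 error codes. */
enum chpw_action {
    CHPW_PARSE_AS_REPLY,    /* go on to the version number check */
    CHPW_RETURN_KDC_ERROR,  /* ERROR_TABLE_BASE_krb5 + krberror->error */
    CHPW_RETURN_MODIFIED    /* KRB5KRB_AP_ERR_MODIFIED */
};

/* Simplified view of what the function has learned about the packet. */
struct chpw_reply_facts {
    int length_matches;     /* plen == packet->length */
    int is_krb_error;       /* krb5_is_krb_error(packet) */
    int has_e_data;         /* krberror->e_data.data != NULL */
};

/* Model of the krb5-1.6.1 listing: e_data present means "fall through". */
static enum chpw_action action_1_6_1(const struct chpw_reply_facts *f)
{
    assert(f != NULL);
    if (f->length_matches)
        return CHPW_PARSE_AS_REPLY;
    if (!f->is_krb_error)
        return CHPW_RETURN_MODIFIED;
    if (!f->has_e_data)
        return CHPW_RETURN_KDC_ERROR;
    return CHPW_PARSE_AS_REPLY;     /* the fall-through described above */
}

/* Model of the krb5-1.8.3 listing: e_data present means MODIFIED. */
static enum chpw_action action_1_8_3(const struct chpw_reply_facts *f)
{
    assert(f != NULL);
    if (f->length_matches)
        return CHPW_PARSE_AS_REPLY;
    if (!f->is_krb_error)
        return CHPW_RETURN_MODIFIED;
    return f->has_e_data ? CHPW_RETURN_MODIFIED : CHPW_RETURN_KDC_ERROR;
}

/* Assumed behaviour matching 'Expected results': ignore e_data entirely. */
static enum chpw_action action_expected(const struct chpw_reply_facts *f)
{
    assert(f != NULL);
    if (f->length_matches)
        return CHPW_PARSE_AS_REPLY;
    return f->is_krb_error ? CHPW_RETURN_KDC_ERROR : CHPW_RETURN_MODIFIED;
}
```

For the packet in this report (length mismatch, KRB-ERROR, e_data present) the three models give "parse as reply", "MODIFIED" and "KDC error" respectively, which corresponds to "Requested protocol version not supported" in 1.6.1 and to the "Request is a replay" result the report expects.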
I'm wondering if this is something that the patch for bug #427789 affected adversely. To check that, are you able to retry this with a krb5 package that's been rebuilt without patch #89 applied? Just reverting the patch would be a regression, so we couldn't do that, but at least we'd know whether or not it was a sequence number problem.
Hello, I am sorry for the delay, I must have overlooked the email with the needinfo flag. I have built the krb5 packages without the mentioned patch krb5-trunk-seqnum.patch (#89), but it ended with exactly the same error and exactly the same code path. I am happy to test anything else for you, just tell me. Best regards, -Martin
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2011-1031.html