Bug 1481655 - krb5-libs dns_canonicalize_hostname broken services tracker
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: krb5
Version: 26
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Robbie Harwood
QA Contact: Fedora Extras Quality Assurance
Duplicates: 1483628
Reported: 2017-08-15 11:23 UTC by Karel Volný
Modified: 2017-10-04 11:35 UTC (History)

Fixed In Version: krb5-1.15.1-22.fc28, krb5-1.15.1-22.fc27, krb5-1.15.1-22.fc26
Last Closed: 2017-08-20 18:28:48 UTC
Type: Bug

Description Karel Volný 2017-08-15 11:23:11 UTC
Description of problem:
after upgrading my system, several tools stopped working with kerberized services, while the same services were still accessible via different tools

downgrading back didn't help

investigating further, a colleague has found that the upgrade just irreversibly crippled my configfile

Version-Release number of selected component (if applicable):
krb5-libs-1.15.1-21.fc26

How reproducible:
always

Steps to Reproduce:
1. edit /etc/krb5.conf to your needs, back it up
2. upgrade from <krb5-libs-1.15.1-21 to krb5-libs-1.15.1-21
3. diff /etc/krb5.conf.backup /etc/krb5.conf

Actual results:
10a11
>  dns_canonicalize_hostname = false

Expected results:
no such difference; instead, /etc/krb5.conf.rpmnew is created so you can review and merge new options manually

Additional info:
obviously, the culprit is this spec snippet:

%triggerun libs -- krb5-libs < 1.15.1-20
if ! grep -q 'dns_canonicalize_hostname' /etc/krb5.conf ; then
    sed -i 's/\[libdefaults\]/\[libdefaults\]\n dns_canonicalize_hostname = false/' /etc/krb5.conf
fi
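
To see what this scriptlet does without touching a real system, the same check-and-insert logic can be run against a scratch file. This is a minimal sketch, assuming GNU sed (for `-i` and `\n` in the replacement); the function name `inject_setting` and the sample file contents are made up for illustration, while the grep and sed expressions are copied from the snippet above.

```shell
#!/bin/sh
# Same logic as the trigger, but applied to the file given as $1
# (the real scriptlet hardcodes /etc/krb5.conf).
inject_setting() {
    conf="$1"
    if ! grep -q 'dns_canonicalize_hostname' "$conf"; then
        # GNU sed: \n in the replacement starts a new line after [libdefaults]
        sed -i 's/\[libdefaults\]/\[libdefaults\]\n dns_canonicalize_hostname = false/' "$conf"
    fi
}

# Hypothetical sample config written to a scratch file.
sample=$(mktemp)
printf '[libdefaults]\n default_realm = EXAMPLE.COM\n' > "$sample"
cp "$sample" "$sample.backup"

inject_setting "$sample"

# Show what changed; diff exits non-zero when files differ, hence "|| true".
diff "$sample.backup" "$sample" || true
```

Because the edit is made by a scriptlet rather than by shipping a new file, rpm has no record of it, which would explain why downgrading the package leaves the added line in place.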

no bugs linked, no explanation why it needs to destroy existing configuration instead of just leaving the new default to new installations ...

this cost me ~3 hours + ~1 hour of the colleague to find out, another ~2-3 hours of me and another colleague working around it and doing my tasks using his account, and an email to the whole department to warn people when this hits stable - hundreds of people spending a minute reading it, tens of people spending two minutes fixing things after upgrade = one or two workdays lost ... bravo, just bravo ...

Comment 1 Robbie Harwood 2017-08-15 16:24:43 UTC
Your tone is not conducive to a good relationship with the maintainer.  I'm sorry this has cost you time, but please understand: we both care about this product, and both want it to work.

Further, we did not roll this change out wildly: it sat in rawhide for quite some time to see if it would break anything.

> investigating further, a colleague has found that the upgrade just irreversibly crippled my configfile

It's not at all irreversible.  You can literally edit the file to set it to true again.

> no such difference, /etc/krb5.conf.rpmnew exists so you can review and merge new options manually

After some testing, we have found that most users do not look at rpmnew for files that they do not already expect to change.  I'm glad you're aware of the mechanisms for doing so, but for security-related changes (like this one), and for changes that other programs depend on (like /etc/krb5.conf.d), we will not be using the rpmnew mechanism.

Further: you would have encountered this breakage eventually whenever you did a new installation.  Since it was going to be a problem for you anyway, better to find it sooner.

Can you provide information (private comment if it's internal stuff) on what broke so that we can actually work to resolve this instead of venting at each other?

Comment 2 Dr. David Alan Gilbert 2017-08-15 17:25:45 UTC
I hit this just now as well.
I couldn't use krb to any of our services (ssh, beaker etc).
klist still showed a valid token.

I agree this is a pretty nasty bug.

Comment 3 Robbie Harwood 2017-08-15 17:43:38 UTC
Thanks.  Is there anything besides beaker web login that you've seen broken?  Specific commands/workflows/etc. that would be helpful.  (I'll start talking to beaker folks about this.)

We are unlikely to roll back this change, so I'd like to know what's relying on canonicalization in order to get it fixed.

Comment 4 Dr. David Alan Gilbert 2017-08-15 18:01:52 UTC
(In reply to Robbie Harwood from comment #3)
> Thanks.  Is there anything besides beaker web login that you've seen broken?
> Specific commands/workflows/etc. that would be helpful.  (I'll start talking
> to beaker folks about this.)
> 
> We are unlikely to roll back this change, so I'd like to know what's relying
> on canonicalization in order to get it fixed.

Well the thing I first noticed was a kerberised ssh failing (to shell);
when that failed, after checking kinit I checked the beaker web UI just to find a 2nd kerberos thing to check and saw it was broken as well; so that's a 2 of 2 failure and I didn't look elsewhere.

Dave

Comment 5 Robbie Harwood 2017-08-15 18:10:39 UTC
For ssh commands, the target hostname needs to match that of the service principal.  If it stopped working here, that probably means you need to add the ".redhat.com" (or whatever the subdomains are).
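
As a rough illustration of "add the domain", a small helper can qualify short names before they are handed to ssh. This is only a sketch: `qualify_host` and `DEFAULT_DOMAIN` are hypothetical names, and `example.com` is a placeholder. It covers short names only; it does nothing for CNAME aliases that point at a differently named host.

```shell
#!/bin/sh
# Placeholder domain; override via the environment if needed.
DEFAULT_DOMAIN="${DEFAULT_DOMAIN:-example.com}"

# Append the default domain to a short hostname; names that already
# contain a dot are printed unchanged.
qualify_host() {
    case "$1" in
        *.*) printf '%s\n' "$1" ;;
        *)   printf '%s.%s\n' "$1" "$DEFAULT_DOMAIN" ;;
    esac
}

# Intended use (not executed here): ssh "$(qualify_host myserver)"
qualify_host myserver
qualify_host myserver.example.com
```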

(Restoring accidentally cleared needinfo.)

Comment 6 James 2017-08-15 19:05:50 UTC
This also broke kerberised NFS4 for me.

Comment 8 Robbie Harwood 2017-08-15 20:49:45 UTC
(If you reach this bug by having an issue with a load balancer, Simo has a good blog post on how to do this correctly https://ssimo.org/blog/id_019.html )

Comment 9 Bojan Smojver 2017-08-16 00:21:12 UTC
I understand that you may want to do this kind of thing on the new installations. But, messing with people's working configurations by sed-ing - I'm not so sure about that approach.

I totally didn't realize it wasn't me that put that in there (it was a while ago I messed with krb5.conf on my systems, so I could not remember). And, reverting to an older krb5 didn't fix the problem, which confused me even more.

It's pretty standard practice for people to use short names on internal networks and trust local DNS. So, forcing this for everyone on every upgrade may be a step too far.

Comment 10 James 2017-08-16 07:36:50 UTC
(In reply to Robbie Harwood from comment #8)
> (If you reach this bug by having an issue with a load balancer, Simo has a
> good blog post on how to do this correctly
> https://ssimo.org/blog/id_019.html )

I should mention that I was accessing my NFS server through a CNAME, so this probably applies there too.

Comment 12 Dr. David Alan Gilbert 2017-08-16 11:06:29 UTC
Is there a correct way to handle ssh ?  It's very common to CNAME ssh; if there's no way to do it now then it's not purely a matter of tracking the service.

Comment 14 Zuzana Svetlikova 2017-08-16 11:26:50 UTC
At least it could have been announced BEFORE it was built in rawhide.

Comment 15 Karel Volný 2017-08-16 11:51:56 UTC
(In reply to Robbie Harwood from comment #1)
> Your tone is not conducive to a good relationship with the maintainer.

thanks for the lesson, however let me remind you that *your style of maintaining the package* was not conducive to a good relationship with the users

it's as simple as that - action and reaction; you can hardly expect someone hit by a hammer to be thankful and all positive ... OTOH, while I acknowledge the bug description reflects my frustration, and e.g. Bojan (with whom I totally agree) has expressed things better, I fail to see where it is getting personal to deserve such a lesson (unlike the statement above, but again, it's not me who started with "your")

if that's some kind of cultural difference, I apologize for not grasping it

> I'm sorry this has cost you time, but please understand: we both care about
> this product, and both want it to work.

the point is not about my time, but the company resources - looks like our team is not the only one affected, and Red Hat is paying cold hard cash to people who now have to find out and fix their systems instead of working on their daily tasks

and obviously, it's not just my time, and not just Red Hat's resources

to be honest, I don't care about 'this product' krb5 and I want it to work just to the extent to be able to do my job, and this whole thing had negative impact on the product I do care about, causing another delay (albeit small, as I'm mostly waiting for Beaker) in my assigned RHSA release (and I highlight "S" for "Security")

> Further, we did not roll this change out wildly: it sat in rawhide for quite
> some time to see if it would break anything.

... and were aware it breaks things and still pushed this

> > investigating further, a colleague has found that the upgrade just irreversibly crippled my configfile
> 
> It's not at all irreversible. You can literally edit the file to set it to
> true again.

irreversibly from the package manager point of view - downgrading krb5 to previous version did not remove the offending line

> but for security-related changes (like this one)

so, what's the CVE #?

the respective changelog entry is:

* Wed Aug 02 2017 Robbie Harwood <rharwood> - 1.15.1-20
- Disable dns_canonicalize_hostname.  This may break some setups.

this commit: https://src.fedoraproject.org/rpms/krb5/c/ccd78d8ee908015ca558e7428c27151cb1af5579?branch=master

it doesn't give a clue why it is needed to diverge from upstream, and furthermore why it is needed so badly as to break existing configurations, and also to circumvent Fedora packaging guidelines

> Further: you would have encountered this breakage eventually whenever you
> did a new installation.  Since it was going to be a problem for you anyway,
> better to find it sooner.

on new install, things don't work out of the box, I have to configure the service, while on existing install I just expect the things to work, I do not review whole /etc daily (after each update) to check whether some rogue sed didn't change something important

that's quite a difference ...

> Can you provide information (private comment if it's internal stuff) on what
> broke so that we can actually work to resolve this instead of venting at
> each other?

bug #1481213

Comment 16 jabailey 2017-08-16 14:17:16 UTC
Not sure if this is related but FWIW I cannot visit JIRA in Google Chrome (nor in Chrome incognito mode) or Firefox. Clearing Chrome HSTS did not work. 

From Chrome I get:
Your connection is not private

Attackers might be trying to steal your information from projects.engineering.redhat.com (for example, passwords, messages, or credit cards). Learn more
NET::ERR_CERT_AUTHORITY_INVALID
 
projects.engineering.redhat.com normally uses encryption to protect your information. When Google Chrome tried to connect to projects.engineering.redhat.com this time, the website sent back unusual and incorrect credentials. This may happen when an attacker is trying to pretend to be projects.engineering.redhat.com, or a Wi-Fi sign-in screen has interrupted the connection. Your information is still secure because Google Chrome stopped the connection before any data was exchanged.

You cannot visit projects.engineering.redhat.com right now because the website uses HSTS. Network errors and attacks are usually temporary, so this page will probably work later.


From Firefox I get:

Your connection is not secure

The owner of projects.engineering.redhat.com has configured their website improperly. To protect your information from being stolen, Firefox has not connected to this website.

Comment 17 Robbie Harwood 2017-08-16 20:07:47 UTC
I don't need to put up with this.  I hope you're happy.

Comment 18 Fedora Update System 2017-08-16 21:08:59 UTC
krb5-1.15.1-22.fc26 has been submitted as an update to Fedora 26. https://bodhi.fedoraproject.org/updates/FEDORA-2017-e3bc695be0

Comment 19 Bojan Smojver 2017-08-16 22:24:50 UTC
(In reply to Fedora Update System from comment #18)
> krb5-1.15.1-22.fc26 has been submitted as an update to Fedora 26.
> https://bodhi.fedoraproject.org/updates/FEDORA-2017-e3bc695be0

Not quite sure why you also removed this from the default config going forward. I certainly didn't ask for that. I understand the need for tightening things up. I was only unhappy about existing working setups being broken.

Comment 21 Stefan Assmann 2017-08-17 11:19:25 UTC
With "dns_canonicalize_hostname = false" kerberos authentication via console (conserver-client) no longer works and I'm asked for my user password again. This is on f26 with krb5-workstation-1.15.1-21.fc26.x86_64. Switching the setting back to true and everything starts working again.
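
For anyone who wants to switch the setting back as described, a sed one-liner can rewrite the existing line. This is a minimal sketch, assuming GNU sed; the function name `enable_canonicalization` and the sample contents are illustrative. It is shown against a scratch copy, and should be tried on a copy before pointing it at /etc/krb5.conf.

```shell
#!/bin/sh
# Rewrite any existing dns_canonicalize_hostname line in file $1 to "true",
# keeping the original indentation and spacing around "=".
enable_canonicalization() {
    sed -i 's/^\([[:space:]]*dns_canonicalize_hostname[[:space:]]*=[[:space:]]*\).*$/\1true/' "$1"
}

# Scratch copy with the line the upgrade added (contents are hypothetical).
copy=$(mktemp)
printf '[libdefaults]\n dns_canonicalize_hostname = false\n default_realm = EXAMPLE.COM\n' > "$copy"

enable_canonicalization "$copy"
grep 'dns_canonicalize_hostname' "$copy"
```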

Comment 22 Stephen Herr 2017-08-17 15:49:02 UTC
So for the record I think this would break all applications that are served over hostnames other than the server's canonical hostname, right?

So for example I manage a server that deploys several kerberos-authenticated applications, each running on a VHOST with its own hostname that CNAMEs back to the server's canonical hostname. But there is only one server-side HTTP@ principal that apache uses to authenticate the incoming tickets, one for the server's canonical hostname. Without DNS resolution, that is guaranteed to always fail, right? Or is it expected that I would have many HTTP@ principals, one for each CNAME?

Comment 23 Stephen Herr 2017-08-17 16:38:55 UTC
(In reply to Stephen Herr from comment #22)
> Or is it expected that I would have many HTTP@
> principals, one for each CNAME?

Actually I don't even think that would work, because that would break everyone not on a krb5-1.15.1-21.fc26 client (or who had configured it to use the dns canonical hostname), right? So this change essentially is a non-backwards compatible client change that makes it impossible for a simple httpd server to serve multiple kerberos-authenticated applications to all clients?
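
The mismatch described here can be made concrete with a toy model. This is a sketch under stated assumptions, not how libkrb5 is implemented: the alias table is hardcoded instead of using DNS, all hostnames are hypothetical, the realm is omitted, and details such as reverse lookups are ignored.

```shell
#!/bin/sh
# Toy stand-in for DNS: two hypothetical CNAME aliases for one canonical host.
canonical_name() {
    case "$1" in
        app1.example.com|app2.example.com) echo "web01.example.com" ;;
        *) echo "$1" ;;
    esac
}

# Print the HTTP service principal a client would request.
# $1 = hostname the user typed, $2 = true|false (canonicalization setting)
requested_principal() {
    if [ "$2" = "true" ]; then
        echo "HTTP/$(canonical_name "$1")"
    else
        echo "HTTP/$1"
    fi
}

requested_principal app1.example.com true    # prints HTTP/web01.example.com
requested_principal app1.example.com false   # prints HTTP/app1.example.com
```

In this model a server holding only the key for the canonical name can satisfy the first request but not the second, and a server holding only per-alias keys has the opposite problem, which is the compatibility gap the comment points at.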

Comment 24 Luis Claudio R. Goncalves 2017-08-18 23:14:02 UTC
'rhpkg clone' fails with the following error message:

  Authentication failed.
  fatal: Could not read from remote repository.

  Please make sure you have the correct access rights
  and the repository exists.

  Please ensure you are in devel LDAP group. Run `ssh shell.devel.redhat.com groups $USER` to check.
  Could not execute clone: Command '['git', 'clone', '-b', 'rhel-7.5', 'ssh://<user>@pkgs.devel.redhat.com/rpms/kernel-rt', '--origin', 'origin']' returned non-zero exit status 128

Setting "dns_canonicalize_hostname = true" mitigated the problem.

Comment 25 Fedora Update System 2017-08-19 18:53:14 UTC
krb5-1.15.1-22.fc26 has been pushed to the Fedora 26 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2017-e3bc695be0

Comment 26 Trever Adams 2017-08-19 21:26:34 UTC
I am seeing this with Samba 4 (self compiled so I have AD/DC functionality) and Dovecot/Postfix where users are looked up via LDAP that is gssapi/kerberos authenticated. hosts = DOMAIN/REALM so that if there is more than one server they will respond. It also is difficult to autoconfigure using puppet to find a given LDAP server (but that isn't the main reason it is done).

It took me a while to figure out the problem as one of the setups configured this way works, the other doesn't. Two different domains. Configurations are nearly identical (different domain/host names, variations in users and such, but dovecot/postfix is identical other than the domain/host names, both S4 servers have the same krb5.conf except for the realm). One works, one doesn't.

If I change the one to canonicalize the hostname, both suddenly work.

Comment 27 Trever Adams 2017-08-19 21:35:16 UTC
On the affected domain, it also broke clients logging into a windows machine (probably the same problem with LDAP).

Comment 28 Fedora Update System 2017-08-20 18:28:48 UTC
krb5-1.15.1-22.fc26 has been pushed to the Fedora 26 stable repository. If problems still persist, please make note of it in this bug report.

Comment 29 Nathaniel McCallum 2017-08-21 14:19:14 UTC
*** Bug 1483628 has been marked as a duplicate of this bug. ***

Comment 30 John Obaterspok 2017-08-25 04:30:52 UTC
[root@host ~]# cat /etc/krb5.conf | grep canon
 dns_canonicalize_hostname = false

[root@host ~]# rpm -qa krb5\*
krb5-libs-1.15.1-22.fc26.x86_64
krb5-workstation-1.15.1-22.fc26.x86_64
krb5-devel-1.15.1-22.fc26.x86_64
krb5-pkinit-1.15.1-22.fc26.x86_64
krb5-server-1.15.1-22.fc26.x86_64


The update didn't revert the change for me. Wasn't it supposed to remove the above line?

Comment 31 Robbie Harwood 2017-08-25 16:26:29 UTC
(In reply to John Obaterspok from comment #30)
> [root@host ~]# cat /etc/krb5.conf | grep canon
>  dns_canonicalize_hostname = false
> 
> [root@host ~]# rpm -qa krb5\*
> krb5-libs-1.15.1-22.fc26.x86_64
> krb5-workstation-1.15.1-22.fc26.x86_64
> krb5-devel-1.15.1-22.fc26.x86_64
> krb5-pkinit-1.15.1-22.fc26.x86_64
> krb5-server-1.15.1-22.fc26.x86_64
> 
> 
> The update didn't revert the change for me. Wasn't it supposed to remove the
> above line?

No.  As per requests above, it uses rpmnew files to push a configuration change.  If you were not expecting to update config files in this way: neither are most people, which is part of why I didn't do the original change in that way.
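
Since the fix relies on the rpmnew mechanism, it may help to check whether any such files are waiting to be merged. The helper below is a small sketch; `list_rpmnew` is a hypothetical name, and it simply pairs each `*.rpmnew` file with the live file it sits next to. rpm typically writes these when a config file marked noreplace has been modified locally.

```shell
#!/bin/sh
# List every *.rpmnew file under a directory (default /etc) together with
# the live file it would replace, so the two can be compared by hand.
list_rpmnew() {
    find "${1:-/etc}" -name '*.rpmnew' 2>/dev/null | while read -r new; do
        printf '%s -> %s\n' "$new" "${new%.rpmnew}"
    done
}

list_rpmnew /etc
# To compare one pair by hand: diff -u /etc/krb5.conf /etc/krb5.conf.rpmnew
```

If /etc/krb5.conf.rpmnew exists and no longer contains the dns_canonicalize_hostname line, deleting that line from the live file (or setting it to true) is left to the administrator.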

