846080 – NIS user causes gdm autologin to fail, user isn't valid yet?



Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Linux 6
Classification:	Red Hat
Component:	ypbind
Sub Component:
Version:	6.2
Hardware:	i686
OS:	Linux
Priority:	medium
Severity:	medium
Target Milestone:	rc
Target Release:	---
Assignee:	Honza Horak
QA Contact:	Jakub Prokes
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	947782 1159825

Reported:	2012-08-06 18:50 UTC by Rick Berge
Modified:	2015-07-22 06:44 UTC
CC List:	6 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:	Cause: The ypbind binding is not checked properly when the timeout set in the service script is too short. Consequence: It is not possible to start gdm with autologin when using NIS. Fix: Use the configured timeout value and check whether binding was successful even if the timeout is set to 0. Result: gdm with autologin works properly when using NIS.
Clone Of:
Environment:
Last Closed:	2015-07-22 06:44:37 UTC
Target Upstream Version:
Embargoed:

Attachments
Increase timer for checking rpc availability of ypbind (473 bytes, patch) 2012-08-07 21:57 UTC, Rick Berge
el5's way of checking rpc availability of ypbind (985 bytes, patch) 2012-08-07 22:08 UTC, Rick Berge
use configured value and check even if it is zero (728 bytes, patch) 2012-08-09 16:36 UTC, Honza Horak

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHBA-2015:1332	0	normal	SHIPPED_LIVE	ypbind bug fix and enhancement update	2015-07-20 17:52:51 UTC

Description Rick Berge 2012-08-06 18:50:24 UTC

I set up autologin using an NIS user on one of our lab machines.  X intermittently stopped coming up.

Saw the symptoms of Bug 629328. The X spinner cursor shows, but nothing else starts, and "/var/log/gdm/:0-slave.log" ends with 26 pairs of the following lines (same pid in all lines)

gdm-simple-slave[1869]: GLib-GObject-WARNING: invalid (NULL) pointer instance
gdm-simple-slave[1869]: GLib-GObject-CRITICAL: g_signal_handlers_disconnect_matched: assertion `G_TYPE_CHECK_INSTANCE (instance)' failed

During debugging, I added an init.d script that runs well after ypbind finishes (about 6 seconds later), and I noticed that commands involving the user were failing.  For some reason, NIS isn't starting fast enough for ids to be valid?  The implications are pretty severe: NIS ids can't take part in autologin or any init.d scripts without some arbitrary sleeps involved.

Setting NIS domain: domain is '....'  [  OK  ]
Starting NIS service: [  OK  ]
Binding NIS service: .....[  OK  ]
Enabling Bluetooth devices:
Starting sshd: [  OK  ]
Starting xinetd: [  OK  ]
Starting ntpd: [  OK  ]
Starting postgresql service: [  OK  ]
chown: invalid user: `lab'
Starting logjunk: [FAILED]

Reproducible: Sometimes

Steps to Reproduce:
1. Set up an NIS client machine.
2. Verify that NIS user 'lab' can log in normally to GNOME.
3. Put this in /etc/gdm/custom.conf:
    [daemon]
    AutomaticLogin=lab
    AutomaticLoginEnable=true
4. Reboot.
It may fail more reliably if I hit escape during boot, for the non-quiet mode where individual services log their startup.
Actual Results:
Sometimes X doesn't start at all.  Other times it will finally start after a delay of 30-60s.

Expected Results:
Autologin works with an NIS user, and NIS users are available immediately after the NIS service script finishes.

gdm-2.30.4-32.el6.i686
ypbind-1.20.4-29.el6.i686
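
As a stopgap for init.d scripts that depend on an NIS user, polling until the user resolves is less fragile than an arbitrary sleep. The following is a minimal sketch, not part of ypbind or gdm: `wait_for_user` is a hypothetical helper name, the 45-second default is only an assumption borrowed from the NISTIMEOUT default mentioned in this report, and `id -u` is used as the lookup because it goes through the same name-service resolution that `chown` relies on.

```shell
# wait_for_user NAME [TIMEOUT_SECONDS]
# Hypothetical helper: poll until NAME resolves through the name service,
# or give up after roughly TIMEOUT_SECONDS (assumed default: 45).
wait_for_user() {
    wfu_user=$1
    wfu_timeout=${2:-45}
    wfu_waited=0
    # id -u fails while the user cannot be resolved (e.g. NIS not yet bound).
    until id -u "$wfu_user" > /dev/null 2>&1; do
        if [ "$wfu_waited" -ge "$wfu_timeout" ]; then
            return 1
        fi
        sleep 1
        wfu_waited=$((wfu_waited + 1))
    done
    return 0
}
```

An init script could then guard the failing command with something like `wait_for_user lab 45 && chown lab ...`. This only works around the symptom; the binding check in the ypbind service script is still the right place for the fix.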

Comment 2 Rick Berge 2012-08-07 21:54:38 UTC

For a few seconds after the service should be valid, ypwhich reports:
    ypwhich: Can't communicate with ypbind


Looking at the ypbind service script,
    echo -n $"Binding NIS service: "
    # the following fixes problems with the init scripts continuing
    # even when we are really not bound yet to a server, and then things
    # that need NIS fail.
    timeout=10
    firsttime=1
    SECONDS=0
    while [ $SECONDS -lt $timeout ]; do
        if /usr/sbin/rpcinfo -p | LC_ALL=C fgrep -q ypbind
        then
            if [ $firsttime -eq 1 ]; then
                # reset timeout
                timeout=$NISTIMEOUT
                firsttime=0
            fi
            /usr/bin/ypwhich > /dev/null 2>&1
            retval=$?
            if [ $retval -eq 0 ]; then
                break;
            fi
        fi
        sleep 2
        echo -n "."
    done

This loops for only timeout=10 seconds unless rpcinfo finds ypbind first.  For whatever reason, it needs 18 seconds on this system/network.  Note that the timeout can be reset to NISTIMEOUT, which defaults to 45, but only if rpcinfo finds ypbind.

I'll attach two patches folks can play with.  The first alternative just bumps the timer.  The second alternative brings back the control loop from ypbind-1.19-11.el5 (but I didn't check why it was replaced.)

Comment 3 Rick Berge 2012-08-07 21:57:57 UTC

Created attachment 602864
Increase timer for checking rpc availability of ypbind

Option 1, increased the timer in the service

Comment 4 Rick Berge 2012-08-07 22:08:41 UTC

Created attachment 602870
el5's way of checking rpc availability of ypbind

Option 2, use the simpler loop logic from el5's ypbind

Comment 5 Honza Horak 2012-08-09 16:02:50 UTC

Thanks for reporting. Do I understand correctly that increasing the timer as you suggest in comment #3 (not using $NISTIMEOUT) fixes your issue?

Comment 6 Rick Berge 2012-08-09 16:06:20 UTC

Yes.  Either patch will fix things.  I think the second patch is better, since it makes things use only a single timer value, which is already configurable.

Comment 7 Honza Horak 2012-08-09 16:36:40 UTC

Created attachment 603304
use configured value and check even if it is zero

Then it seems you hit the same issue as mentioned in Fedora:
https://bugzilla.redhat.com/show_bug.cgi?id=624688#c23

If we used the solution from comment #4, rpcinfo wouldn't be executed at all if NISTIMEOUT=0, which is not correct. We still want to check whether binding was successful in that case, so I'd propose something like the attached patch. Can you check if it works for you?
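
The attached patch is not reproduced in this report, so the following is only a sketch of the behaviour described above: a single configured timeout, with the binding check run at least once even when that timeout is 0. The names `bind_check`, `wait_for_binding`, and the `BIND_CHECK` override are hypothetical and exist only so the loop can be exercised without a NIS server. The 45-second fallback is an assumption based on the default mentioned in comment 2.

```shell
# Default check: ypbind is registered with rpcbind AND ypwhich can reach a server.
bind_check() {
    /usr/sbin/rpcinfo -p 2> /dev/null | LC_ALL=C grep -F -q ypbind &&
        /usr/bin/ypwhich > /dev/null 2>&1
}

# wait_for_binding
# Returns 0 once the check succeeds, 1 if NISTIMEOUT seconds pass first.
wait_for_binding() {
    wfb_timeout=${NISTIMEOUT:-45}   # NISTIMEOUT=0 is kept as 0, not replaced
    wfb_waited=0
    while :; do
        # Check before comparing against the timeout, so the check
        # runs once even when the timeout is 0.
        if ${BIND_CHECK:-bind_check}; then
            return 0
        fi
        if [ "$wfb_waited" -ge "$wfb_timeout" ]; then
            return 1
        fi
        sleep 2
        wfb_waited=$((wfb_waited + 2))
    done
}
```

With this shape there is no separate hard-coded 10-second window, so a slow network is covered by raising NISTIMEOUT alone.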

Comment 8 Rick Berge 2012-08-09 20:23:54 UTC

Works for me.  I'm all for using the same code as the other branch.  Thanks.

Comment 21 Honza Horak 2015-02-19 14:41:37 UTC

Sorry, the reproducer above is not correct; this one should work better:
1. configure ypbind
2. #> service rpcbind stop
3. #> service ypbind restart

Actual results:
service ypbind restart takes 10s

Expected results:
service ypbind restart takes 45s (or whatever is defined in $NISTIMEOUT, either in /etc/sysconfig/network or in /etc/sysconfig/ypbind).
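
To compare the actual and expected durations, the restart can be timed. A small sketch follows; `time_cmd` is a hypothetical helper name, and it reports whole seconds only.

```shell
# time_cmd COMMAND [ARGS...]
# Hypothetical helper: run COMMAND quietly and print the elapsed whole seconds.
time_cmd() {
    tc_start=$(date +%s)
    # Ignore the exit status: with rpcbind stopped, the restart is expected to fail.
    "$@" > /dev/null 2>&1 || :
    tc_end=$(date +%s)
    echo $((tc_end - tc_start))
}
```

For the reproducer above this would be run as root, after `service rpcbind stop`, as `time_cmd service ypbind restart`.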

Comment 25 errata-xmlrpc 2015-07-22 06:44:37 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-1332.html
