Bug 444992
Summary: | gdm reads $HOME and memorizes read data before pam_mount mounts $HOME
---|---
Product: | [Fedora] Fedora
Component: | gdm
Version: | 9
Status: | CLOSED WONTFIX
Severity: | low
Priority: | medium
Hardware: | All
OS: | Linux
Reporter: | Hans Ulrich Niedermann <rhbugs>
Assignee: | jmccann
QA Contact: | Fedora Extras Quality Assurance <extras-qa>
CC: | cschalle, dwalsh, jengelh, rstrode, sh1
Doc Type: | Bug Fix
Last Closed: | 2009-07-14 14:14:39 UTC
Description
Hans Ulrich Niedermann
2008-05-02 16:27:57 UTC
Of those things you list, gdm only reads the face. And it can handle missing faces just fine. If your session state is messed up, that's not gdm's fault.

OK, I could use some help from someone familiar with the plethora of interacting processes in a Gnome desktop here, then. In the output of "ps faux", I can see these processes potentially related to the problem:

gnome-session (spawned by root process gdm-session-worker)
gconfd-2
gnome-settings-daemon
bonobo-activation-server
dbus-daemon
dbus-launch
/usr/libexec/gvfsd
/usr/libexec/gnome-vfs-daemon
/usr/libexec/notification-daemon

Except for gnome-session, none of these processes is shown as a child of another process. Which of these might see an unmounted $HOME when $HOME is mounted via pam_mount by gdm?

Can you attach your pam config so we can see the ordering involved here? pam_mount is synchronous, correct?

Created attachment 304796 [details]
/etc/pam.d/gdm
Created attachment 304797 [details]
/etc/pam.d/system-auth
I have just run a little test: I put a "sleep 30" into /usr/sbin/gdm, attached strace to the gdm process and all its children, and logged in as the user. I have only analyzed the 800MB, 3E6-line strace log file up to the point where the mount(2) call on the user's $HOME happens. That analysis yields the following.

Chronologically, according to the strace output:

* pid=16589 gdm-simple-greeter reads ~/.face ~/.face.icon ~/.gnome/gdm
* pid=16712 gdm-session-worker reads ~/.dmrc
* pid=17161 /usr/libexec/gconfd-2 reads/writes ~/.orbitrc ~/.gconf.path ~/.gconf ~/.gconf/saved_state
* pid=17148 gnome-keyring-daemon reads ~/.orbitrc
* pid=17175 (child of pid=17174, which is a child of pid=17148 gnome-keyring-daemon) reads/writes ~/.gnome2 ~/.gnome2/keyring ~/.gnome2/keyring/<RANDOMSTRING>
* pid=17253 mount -orw,noatime,acl,... ... $HOME mounts ~, finally! pid=17253 is a child of pid=17177 mount.crypt (from the pam_mount package), which is a child of pid=17176, a non-exec child of pid=16712 gdm-session-worker

Process tree (who-is-forked-by-whom, gathered from the strace log, not from 'ps faux' output):

+ 16589 gdm-simple-greeter
+ 16712 gdm-session-worker
  +-+ 17148 gnome-keyring-daemon
  | +-+ 17160 <non-exec child>
  | | `-+ 17161 gconfd-2
  | +-+ 17174 <non-exec child>
  |   `-+ 17175 <non-exec child>
  `-+ 17176 <non-exec child>
    `-+ 17177 mount.crypt
      `-+ 17253 mount -o... ... $HOME

If I now assume that
a) forking never creates a lower PID, and
b) the output of "strace -o logfile -f -s512 -p$PID" is guaranteed to be in the actual sequence in which the events happened,
this would mean that gdm-session-worker is responsible for starting gnome-keyring-daemon, and gnome-keyring-daemon is responsible for starting gconfd-2. gconfd-2 is started at the wrong point in time, before $HOME is mounted, so I'd blame gdm-session-worker for starting gnome-keyring-daemon too early.

That's it so far from this side of the problem.
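The manual log analysis described above could be partly automated. The following is only a minimal sketch, assuming the log was written by "strace -f -o logfile" so that each line starts with the PID followed by the syscall. The function name, the set of syscalls treated as path accesses, and the sample log lines (including the device name) are made up for illustration and are not taken from the real 800MB log.

```python
import re

# One line of "strace -f -o logfile" output: "<pid> <syscall>(<args...>"
LINE_RE = re.compile(r'^(\d+)\s+(\w+)\((.*)$')
QUOTED_RE = re.compile(r'"([^"]*)"')

# Assumption: these syscalls are enough to spot early accesses to $HOME.
PATH_SYSCALLS = {"open", "openat", "stat", "lstat", "access",
                 "mkdir", "rename", "unlink"}


def accesses_before_mount(lines, home):
    """Return (pid, syscall, path) tuples for accesses under `home`
    that were logged before the first mount(2) call targeting `home`."""
    home = home.rstrip("/")
    early = []
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # signals, "<... resumed>" lines, exit notices
        pid, syscall, args = int(match.group(1)), match.group(2), match.group(3)
        quoted = QUOTED_RE.findall(args)
        if syscall == "mount":
            # mount(source, target, fstype, flags, data): target is 2nd string
            if len(quoted) >= 2 and quoted[1].rstrip("/") == home:
                break  # everything after this point sees the mounted $HOME
            continue
        if syscall not in PATH_SYSCALLS:
            continue
        for path in quoted:
            if path == home or path.startswith(home + "/"):
                early.append((pid, syscall, path))
                break
    return early


# Hypothetical sample lines, loosely modelled on the PIDs listed above.
SAMPLE_LOG = [
    '16589 open("/home/user/.face", O_RDONLY) = 3',
    '17161 open("/home/user/.gconf.path", O_RDONLY) = -1 ENOENT (No such file or directory)',
    '17161 open("/etc/passwd", O_RDONLY) = 4',
    '17253 mount("/dev/mapper/_dev_sda3", "/home/user", "ext3", MS_NOATIME, NULL) = 0',
    '17300 open("/home/user/.bashrc", O_RDONLY) = 3',
]

for pid, syscall, path in accesses_before_mount(SAMPLE_LOG, "/home/user"):
    print(pid, syscall, path)
```

Like the analysis above, this relies on the log lines being in the order in which the events actually happened.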
Analyzing the pam_mount code to confirm that it is synchronous is next on my list, unless the above analysis shows someone a different place to look.

Changing version to '9' as part of upcoming Fedora 9 GA. More information and reason for this action is here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping

This is not related to gdm as it affects console login as well. At a console, you may get this error:

No directory /home/user! Logging in with home = /.

The login process jumps the gun (so to speak). I got this from mine at a console:

pam_mount(rdconf2.c:209) checking sanity of volume record (/dev/sda3)
pam_mount(pam_mount.c:535) about to perform mount operations
pam_mount(mount.c:409) information for mount:
pam_mount(mount.c:410) ----------------------
pam_mount(mount.c:411) (defined by globalconf)
pam_mount(mount.c:412) user: user
pam_mount(mount.c:413) server:
pam_mount(mount.c:414) volume: /dev/sda3
pam_mount(mount.c:415) mountpoint: /home
pam_mount(mount.c:416) options:
pam_mount(mount.c:417) fs_key_cipher:
pam_mount(mount.c:418) fs_key_path:
pam_mount(mount.c:419) use_fstab: 0
pam_mount(mount.c:420) ----------------------
pam_mount(mount.c:182) realpath of volume /home is /home
pam_mount(mount.c:186) checking to see if /dev/mapper/_dev_sda3 is already mounted at /home
pam_mount(mount.c:873) checking for encrypted filesystem key configuration
pam_mount(mount.c:899) about to start building mount command
pam_mount(misc.c:323) could not fill %(before=-o OPTIONS)
pam_mount(misc.c:285) command: mount [-t] [crypt] [/dev/sda3] [/home]
pam_mount(misc.c:56) set_myuid<pre>: (uid=0, euid=0, gid=0, egid=0)
pam_mount(misc.c:56) set_myuid<post>: (uid=0, euid=0, gid=0, egid=0)
key slot 0 unlocked.
pam_mount(mount.c:104) mount errors:
pam_mount(mount.c:107) Command successful.
pam_mount(mount.c:933) waiting for mount
Filesystem            Type        1K-blocks     Used Available Use% Mounted on
/dev/sda2             ext3         20161204 12720632   6416432  67% /
proc                  proc                0        0         0    - /proc
sysfs                 sysfs               0        0         0    - /sys
devpts                devpts              0        0         0    - /dev/pts
/dev/sda1             ext3           194442    18429    165974  10% /boot
tmpfs                 tmpfs         1902180        0   1902180   0% /dev/shm
none                  binfmt_misc         0        0         0    - /proc/sys/fs/binfmt_misc
sunrpc                rpc_pipefs          0        0         0    - /var/lib/nfs/rpc_pipefs
fusectl               fusectl             0        0         0    - /sys/fs/fuse/connections
/dev/mapper/_dev_sda3 ext3        133475588 77705688  48989696  62% /home
pam_mount(pam_mount.c:134) clean system authtok (0)
pam_mount(misc.c:285) command: pmvarrun [-u] [user] [-o] [1]
pam_mount(misc.c:56) set_myuid<pre>: (uid=0, euid=0, gid=0, egid=0)
pam_mount(misc.c:56) set_myuid<post>: (uid=0, euid=0, gid=0, egid=0)
pam_mount(pam_mount.c:425) pmvarrun says login count is 2
pam_mount(pam_mount.c:548) done opening session (ret=0)
Last login: Mon Jun 2 17:54:53 on tty1
No directory /home/user! Logging in with home = /.
[user@localhost /]$ ls -l /home
drwx------ 45 user user 4096 2006-06-02 17:58 user
drwx------ 2 root root 16384 2008-05-28 20:08 lost+found
[user@localhost /]$

You can see that it *appears* to have mounted successfully; however, the login process is unable to read the user's home directory. A moment later, the data IS available.

I have never seen it happen on console logins so far. However, comment #8 means that the next steps are analyzing pam_mount behaviour from the code, and then possibly verifying that there is no race condition between the kernel returning from mount(2) and the mounted filesystem actually showing up in userspace processes' namespaces.

I experience this problem every time on graphical logins, but never on console logins. Downgrading to FC8 gdm solves the problem. Perhaps the severity should be increased from "low" due to the very high potential for mysterious corruption of the user's gconf database.

So in attachment 304796 [details]
session optional pam_gnome_keyring.so auto_start
session include system-auth
we see pam_gnome_keyring.so is getting started before the system-auth stack is
running, and the system-auth stack is what runs pam_mount.
Maybe we need to swap the two in the file.
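To make the ordering question concrete, here is a minimal sketch that reports which of two session entries comes first in a PAM service file. It only looks at the literal order of "session" lines in one file and does not expand included stacks. The function names are hypothetical helpers for illustration, not part of PAM or pam_mount.

```python
import re

# "session <control> <module-or-include-target> [args...]"
# The control field may be a bracketed list such as [success=1 default=ignore].
SESSION_RE = re.compile(r'^-?session\s+(\[[^\]]*\]|\S+)\s+(\S+)')


def session_order(pam_text):
    """Return the module (or include target) of every session line, in file order."""
    order = []
    for raw in pam_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        match = SESSION_RE.match(line)
        if match:
            order.append(match.group(2))
    return order


def runs_before(pam_text, first, second):
    """True if both entries exist and `first` is listed before `second`."""
    order = session_order(pam_text)
    if first not in order or second not in order:
        return False
    return order.index(first) < order.index(second)


# The two lines quoted from attachment 304796 above.
GDM_SESSION_LINES = """\
session optional pam_gnome_keyring.so auto_start
session include system-auth
"""

print(runs_before(GDM_SESSION_LINES, "pam_gnome_keyring.so", "system-auth"))
```

With the two lines swapped, the same call returns False. Whether pam_gnome_keyring.so still runs at all in that position is a separate question this sketch does not answer.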
(note the priority field in bugzilla is ignored, it's not used for triage)
Dan, I have vague memories of the ordering of this being important for SELinux reasons. Do you see any problems with us swapping the two lines so pam_gnome_keyring runs after the session stack, not before?

When I swap these two lines, Gnome comes up properly after mounting all my files as intended. I am now having some issues with pulseaudio not starting up at the same time or something similar, but I am not yet certain that this is directly related.

Oh, I bet if we put it after system-auth it won't get run at all. Will need to play with it.

>Comment #2 From Hans Ulrich Niedermann
>Which ones of these might see an unmounted $HOME when $HOME is mounted via pam_mount by gdm?

Any of the processes started before the mount operation, and any children of said processes if they have not reloaded their current working directory. (I.e. when they fork/exec, they keep using the (then-hidden) directory they had, even if it is going to be overmounted, unless they reissue chdir.) Additionally, you may run into bug #449646.

You can assume in good faith that any program that is not running by the time you enter your password is started by GDM after the mount operation, and should only see the new directory. While GDM sees the hidden home, I do not think it actually propagates it in the aforementioned fashion, since I had success with GDM and pam_mount before.

What can also go wrong is if there is already a, say, gconfd-2 instance running from a previous login session or something like that. In other words, gconfd-2 is running, you log out, you log in again, and then start gconfd-2 again. Some programs try to be overly smart (firefox does the same) and let processes linger around a little longer on the assumption that the user might just start the program again. Because the process is already running, they can just create a new window, and hence speed up startup; but this is an optimization that sometimes interferes.
It's like when you try to start firefox and it says "firefox is already running...".

Well, I have determined that if the pam_gnome_keyring stuff comes after system-auth (and thus might not be run at all according to comment #14) the pam_mount based mounting of $HOME works correctly. I don't know enough about gnome-keyring-daemon to tell who is starting the "/usr/bin/gnome-keyring-daemon -d --login" process I see in my process list, though, or whether gnome-keyring-daemon works as intended.

The pam_mount(8) manpage provides hints on how to deal with stacked execution, i.e. "run pam_mount after keyring, in a way that keyring still acts as 'sufficient' even if a module comes after it", but that might cut deep into the include files, etc. Just put pam_mount above it, and it should be fine.

The following works for me on Fedora 10, without pam_keyring being installed:

auth required pam_env.so
auth required pam_mount.so
auth sufficient pam_unix.so nullok try_first_pass
auth requisite pam_succeed_if.so uid >= 500 quiet
auth required pam_deny.so

account required pam_unix.so
account sufficient pam_localuser.so
account sufficient pam_succeed_if.so uid < 500 quiet
account required pam_permit.so

password requisite pam_cracklib.so try_first_pass retry=3
password sufficient pam_unix.so sha512 shadow nullok try_first_pass use_authtok
password required pam_deny.so

session required pam_selinux.so close
session optional pam_mount.so
session required pam_selinux.so open multiple
session optional pam_keyinit.so revoke
session required pam_limits.so
session [success=1 default=ignore] pam_succeed_if.so service in crond quiet use_uid
session required pam_unix.so

The only issue left is that when pam_mount's mounting of a user's $HOME triggers a filesystem check, there is no UI whatsoever indicating that the box is not hung but doing something reasonable. But that is not the subject of this bug.

TBH, all this PAM stacking and sufficient and stuff appears a little underdocumented.
Therefore, I do not really understand it (and thus cannot really follow any faint hints in the pam_mount man page) and am quite reluctant to rewrite all of /etc/pam.d/ manually or anything radical like that. Anyway... pam_keyring appears not to be needed for anything, so if I can leave it off my system, I can live with the current state of things.

(In reply to comment #12)
> Dan, I have vague memories of the ordering of this being important for selinux
> reasons. do you see any problems with us swapping the two lines so
> pam_gnome_keyring runs after the session stack, not before?

Sorry for accidentally removing this needinfo. I can, however, attest that the pam_selinux stuff in comment #18 is both necessary for the mounting to work and had to be manually added by me along with the pam_mount line:

(In reply to comment #18)
> session required pam_selinux.so close
> session optional pam_mount.so
> session required pam_selinux.so open multiple

Reinstating Dan's needinfo.

Any process that needs to be run as if the user executed it needs to run after pam_selinux.so open. Any utility that needs to run as root should run between the close and the open. So mount and pam_console_apply are two things that should happen in between. Apps executed after pam_selinux.so open will run under the user context, so if the user is confined the apps will only be allowed to do what the user is allowed to do. If they execute between the close and the open, they will be able to transition from the login program context to an appropriate context for the domain:

sshd_t -> mount_exec_t -> mount_t

as opposed to

sshd_t -> mount_exec_t -> xguest_t

>The only issue left is that when pam_mount's mounting of a user's $HOME
>triggers a filesystem check, there is no UI whatsoever indicating that
>the box is not hung but doing something reasonable
I do not see a way to relay this information through PAM to the user. As for fsck, I highly recommend xfs over ext3, and fsck times are just part of the story.
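The pam_selinux ordering rule described earlier in this thread (root-level helpers such as mount belong between "pam_selinux.so close" and "pam_selinux.so open") can be checked mechanically. The following is a minimal sketch under the assumption that all three session lines live in the same file; the function name is a hypothetical helper, and the sample is the three-line excerpt quoted from comment #18.

```python
import re

# "session <control> <module> [args...]"; control may be bracketed.
SESSION_RE = re.compile(r'^-?session\s+(?:\[[^\]]*\]|\S+)\s+(\S+)\s*(.*)$')


def mount_between_selinux(pam_text):
    """True if the pam_mount.so session line sits after 'pam_selinux.so close'
    and before 'pam_selinux.so open' in the given PAM file text."""
    position = {}
    for index, raw in enumerate(pam_text.splitlines()):
        line = raw.split("#", 1)[0].strip()  # drop comments
        match = SESSION_RE.match(line)
        if not match:
            continue
        module, args = match.group(1), match.group(2).split()
        if module == "pam_selinux.so" and "close" in args:
            position.setdefault("close", index)
        elif module == "pam_selinux.so" and "open" in args:
            position.setdefault("open", index)
        elif module == "pam_mount.so":
            position.setdefault("mount", index)
    if not {"close", "mount", "open"} <= position.keys():
        return False  # one of the three lines is missing from this file
    return position["close"] < position["mount"] < position["open"]


# Excerpt quoted from comment #18.
COMMENT_18_SESSION = """\
session required pam_selinux.so close
session optional pam_mount.so
session required pam_selinux.so open multiple
"""

print(mount_between_selinux(COMMENT_18_SESSION))
```

A stack that lists pam_mount.so after the "open" line makes the function return False, which corresponds to the case where mount would run under the user context.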
This message is a reminder that Fedora 9 is nearing its end of life. Approximately 30 (thirty) days from now Fedora will stop maintaining and issuing updates for Fedora 9. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '9'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 9's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 9 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora please change the 'version' of this bug to the applicable version. If you are unable to change the version, please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete. The process we are following is described here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Fedora 9 changed to end-of-life (EOL) status on 2009-07-10. Fedora 9 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. Thank you for reporting this bug and we are sorry it could not be fixed.