https://bugzilla.redhat.com/show_bug.cgi?id=1754395 https://bugzilla.redhat.com/show_bug.cgi?id=1897717 https://github.com/systemd/systemd/issues/4446 https://github.com/sddm/sddm/issues/1335 I've been researching a black screen bug for about a month now and ultimately find that this problem has been going on in one form or another for years. Here it is in a nutshell, from one of the comments: > Cause of this bug is simple - systemd don't import ALL enviroment variables if any variable name is not POSIX compilant > (it must conform to the regex [0-9A-Z_], see: https://github.com/systemd/systemd/blob/a859abf062cef1511e4879c4ee39c6036ebeaec8/src/basic/env-util.c#L19). > > You can test it with this command: `env %=1 systemctl --user import-environment`, it will fail. > > Now there is 2 known packages exporting those variables: environment-modules and Lmod. > > Solution is simple - skip ONLY invalid variables while importing others. > > I've created bug in the systemd, but still without answer - https://github.com/systemd/systemd/issues/14878. So, what we basically have is two projects lmod and environment-modules that create apparently non-compliant env variables that systemd won't recognize. This in turn has caused and is continuing to cause a cascading trail of havoc. The correct way to fix this is either lmod and environment-modules change to a compliant naming scheme or systemd accept the names - or just skip the ones it doesn't want to recognize. It appears the root cause of the problem has been identified, yet as problems occur instead of fixing the root cause, each impact has been handled as a one-off with it's own circumvention. As time goes on, other issues will continue to surface until the root cause is fixed.
Hmmm, as I understand it, the reported problem is that if the environment contains non-posix-compliant variables (like "%"), systemctl import-environment will fail to import _anything_. If that's the case, I can't reproduce. If use "env %=1 AAAA=2", % isn't imported, but AAAA is.
Matt, please read: https://github.com/systemd/systemd/issues/14878 That is the basic issue explained in more detail, and it appears we are at loggerheads at coming to an agreement on how to solve the issue. I'm definitely not the SME on this, but as I've already mentioned, it makes absolutely no sense to build a workaround into every application which is affected when the root cause should be addressed. I don't have Lmod on my system, but I do have environment-modules and if I try to remove it, it wants to take 155 other applications with it, so that doesn't appear to be an option either.
Gerald, it looks like a fix was merged in that ticket? In https://github.com/systemd/systemd/pull/17188#issuecomment-708518052, Lennart says "Summary: terrible usecase, but will merge this anyway." Like I said, I can't reproduce the minimal reproducer you've given, unless I misunderstand the problem. I have systemd-246.7-2.fc33 on my system. Can you also try the minimal case and test if you're having the KDE problems still?
Hey Matt, thanks for your help with this. I had missed https://github.com/systemd/systemd/pull/17188#issuecomment-708518052 and from the description that "should" fix the issue. I am also running systemd-246.7-2 but unfortuately, unless I am reading this incorrectly, the fix was incorporated into systemd-247: https://github.com/systemd/systemd/pull/17188#issuecomment-708518052 and 247 according to koji is only in rawhide - and yes, I'm still having the issue. I'll let kde upstream know and also see if I can figure out how to get 247 installed to test. Hopefully, this will fix quite a few problems and negate the need for hacky workarounds.
Let me clarify: - please ignore https://github.com/systemd/systemd/pull/17188. It was (mostly) reverted. - https://github.com/keszybz/systemd/commit/a4ccce22d9552dc74b6916cc5ec57f2a0b686b4f makes systemctl ignore envvars that systemd doesn't like. This means that tools which call 'systemctl import-environment' will not fail. Other tools which call the DBus api directly need to do similar filtering. - Lmod is not doing anything wrong. It is export bash functions as variables, which is a bit wonky, but allowed by POSIX. It actually makes sense for that use case. So... if there's some particular tool which calls the Dbus api directly and tries to push unacceptable variables, it needs to be patched not to do that. *Ideally* it would be patched to just push *only specific variables*, because most variables only make sense for that program, for example $OLDPWD, $PWD, $SHLVL, $TERM, $SHELL. Only variables that make sense *globally* for the user session should be pushed to the systemd environment block. AFAIC, the issue has been resolved on the systemd side. The only thing that I plan to change (maybe) is https://github.com/systemd/systemd/issues/4446. But it is not directly related to this issue.
Thanks very much(In reply to Zbigniew Jędrzejewski-Szmek from comment #5) > Let me clarify: > - please ignore https://github.com/systemd/systemd/pull/17188. It was > (mostly) reverted. > - > https://github.com/keszybz/systemd/commit/ > a4ccce22d9552dc74b6916cc5ec57f2a0b686b4f makes systemctl > ignore envvars that systemd doesn't like. This means that tools which call > 'systemctl import-environment' > will not fail. > Other tools which call the DBus api directly need to do similar filtering. > - Lmod is not doing anything wrong. It is export bash functions as > variables, which is a bit wonky, but allowed by POSIX. > It actually makes sense for that use case. > > So... if there's some particular tool which calls the Dbus api directly and > tries to push unacceptable variables, > it needs to be patched not to do that. *Ideally* it would be patched to just > push *only specific variables*, > because most variables only make sense for that program, for example > $OLDPWD, $PWD, $SHLVL, $TERM, $SHELL. > Only variables that make sense *globally* for the user session should be > pushed to the systemd environment block. > > AFAIC, the issue has been resolved on the systemd side. The only thing that > I plan to change (maybe) > is https://github.com/systemd/systemd/issues/4446. But it is not directly > related to this issue. Yeah, I just tested with 247 and the problem was still there, and then I found where it had been backed out. Need to wait for a response from upstream kde. Thank you very much for the clarification and quick response!
FWIW, I can confirm removing environment-modules fixes the problem, at least for me.
(In reply to Göran Uddeborg from comment #7) > FWIW, I can confirm removing environment-modules fixes the problem, at least > for me. How did you remove it? Isn't it required? When I issue dnf remove environment-modules, DNF also wants to remove 155 dependent packages. I guess I don't understand why this problem has been going on for so long, and why it appears so difficult to fix. In the meantime, the bug reports just continue to come in.
(In reply to Gerald Cox from comment #8) > (In reply to Göran Uddeborg from comment #7) > > FWIW, I can confirm removing environment-modules fixes the problem, at least > > for me. > > How did you remove it? Isn't it required? When I issue dnf remove > environment-modules, > DNF also wants to remove 155 dependent packages. > > I guess I don't understand why this problem has been going on for so long, > and why it > appears so difficult to fix. In the meantime, the bug reports just continue > to come in. I removed temporarily with rpm -e environment-modules --nodeps and I can also confirm it fixes the issue with 1897717, but that's not really a solution for me since I have other programs that need environment-modules. Clearly the issue is with the BASH_FUNC env variables - which circle us back to square one again.
If there is a new way to push shell function to the user environment, could you please give some hints on how this should be done. I would be happy to update the environment-modules package to help fix this issue.
(In reply to Gerald Cox from comment #8) > (In reply to Göran Uddeborg from comment #7) > > FWIW, I can confirm removing environment-modules fixes the problem, at least > > for me. > > How did you remove it? Isn't it required? When I issue dnf remove > environment-modules, > DNF also wants to remove 155 dependent packages. That wasn't a problem for me, but I guess this depends on what you have installed. In my case DNF only removed three packages: environment-modules itself, scl-utils since it requires modulecmd from environment-modules, and vim-filesystem since it wasn't needed any more. That was all.
(In reply to Xavier Delaruelle from comment #10) > If there is a new way to push shell function to the user environment, could > you please give some hints on how this should be done. I would be happy to > update the environment-modules package to help fix this issue. I assume you mean bash functions, since for example zsh doesn't have the equivalent functionality of inheriting shell functions through the environment afaik. Currently such functions cannot be pushed through the systemd environment block. They need to be (and are) created individually in various shells using .bash_profile and equivalent mechanisms for other shells. I think this is the correct solution, for two reasons. Firstly, such functions are shell-specific, and we support multiple shells, and we don't want to clutter the environment with an implementation for every installed shell, esp. that some shells don't support inheriting functions through the environment anyway. Secondly, this kind of setup is not appropriate for the global settings, which are used for non-shell environments too. (So for example, even if we wanted some environment module settings to apply to all processes of a given user, for systemd services which are invoked directly without a shell being involved at any point, exported bash functions wouldn't have any effect anyway.) It's entirely possible that environment modules are already DTRT, and don't need to change anything. As discussed above, the problem became *visible* when the environment block including exported bash functions was pushed into systemd and systemd refused it. Such variables need to be filtered out by the pusher. tl;dr: Define shell-specific setup through shell-specific profile scripts. Only push variables which make sense globally to the systemd environment block. > The only thing that I plan to change (maybe) is https://github.com/systemd/systemd/issues/4446. > But it is not directly related to this issue. https://github.com/systemd/systemd/pull/18129
(In reply to Zbigniew Jędrzejewski-Szmek from comment #12) > It's entirely possible that environment modules are already DTRT, and don't > need to > change anything. As discussed above, the problem became *visible* when the > environment > block including exported bash functions was pushed into systemd and systemd > refused it. > Such variables need to be filtered out by the pusher. Maybe to clarify a little bit, either environment-modules, Lmod or scl-utils defines: * a bash shell function in a .sh file under /etc/profile.d * a csh shell alias in a .csh file under /etc/profile.d The goal is to get such shell function (or shell alias for the csh shell family) defined when a bash/sh or csh/tcsh shell is started whatever the way this shell is started.
> Maybe to clarify a little bit, either environment-modules, Lmod or scl-utils defines: ... and exports (export -f funcname).
(In reply to Zbigniew Jędrzejewski-Szmek from comment #12) > tl;dr: Define shell-specific setup through shell-specific profile scripts. > Only push variables which make sense globally to the systemd > environment block. I really appreciate you taking the time to respond and clarify. But for some reason, people aren't getting that memo or they are finding it extremely difficult to accomplish. This problem in one form or another has been going on for quite a while as evidenced by all the tickets that reference it in one way or another and as time goes by the collateral damage keeps accumulating. People aren't connecting the dots. Hopefully, this report will hammer home the root cause.
> This problem in one form or another has been > going on for quite a while as evidenced by all the tickets that reference > it in one way or another and as time goes by the collateral damage keeps > accumulating. That is to a large extent my fault. I wanted to fix it in one way, and it took a long time because I had a hard time convincing other upstream maintainers to the approach. And then it turned out that my approach is infeasible, though for different reasons than those raised in the discussion. If you start counting the time from systemd-246.7 which had the patch for systemctl import-invironment, it's not that long.
(In reply to Gerald Cox from comment #15) > (In reply to Zbigniew Jędrzejewski-Szmek from comment #12) > > > tl;dr: Define shell-specific setup through shell-specific profile scripts. > > Only push variables which make sense globally to the systemd > > environment block. > > I really appreciate you taking the time to respond and clarify. > > But for some reason, people aren't getting that memo or they are finding it > extremely difficult to accomplish. This problem in one form or another has > been > going on for quite a while as evidenced by all the tickets that reference > it in one way or another and as time goes by the collateral damage keeps > accumulating. People aren't connecting the dots. > > Hopefully, this report will hammer home the root cause. To dig a bit further, the memo says "Define shell-specific setup through shell-specific profile scripts.", but systemd goes through the shell-specific profile scripts. Is there a reference doc about how systemd is doing the user environment initialization on Fedora? I would be glad to help here on the environment-modules side.
I pushed a pull request to beef up systemd's docs a bit: https://github.com/systemd/systemd/pull/18137. It also deprecates a blanket 'systemctl import-environment' command and changes systemctl to emit a warning when that is used. It is fairly easy to describe how the environment is constructed by systemd, but the interaction with other parts of the stack is complex and not documented well. I don't think the systemd man pages are appropriate for this kind of documentation — we don't have the expertise and they would become stale at some point. We could probably put a document under systemd.io if no better place is found. > To dig a bit further, the memo says "Define shell-specific setup through shell-specific profile scripts.", > but systemd goes through the shell-specific profile scripts. Systemd itself doesn't use the shell *at all*. This also means that it doesn't care about profile scripts. Instead, various desktop environments execute in an environment where the shell profile functions have been executed, and when they call 'import-environment' or the dbus-api equivalent to propagate session-specific variables, various variables are pushed to the manager environment block. I don't think this makes sense, because we end up with too many wrong things in the global environment. What I think should happen is that the desktop environment should export a few variables it sets itself (emphasize *itself* here, if the desktop environment is assigning a variable, it knows whether it belongs in the global session block): DESKTOP_SESSION, DISPLAY, GDMSESSION, GDM_LANG, GNOME_SETUP_DISPLAY?, SESSION_MANAGER?, WAYLAND_DISPLAY, XAUTHORITY, XDG_SESSION_*, or equivalent for other environments. Whenever a shell process is created for the user, it should execute shell profile functions and create set the shell-specific variables there. I'm not an expert in bash profile initialization, but it seems that the right thing happens already: when I clean up my systemd user environment block by removing e.g. all MODULE* variables, and start a shell process as a service, I get them back: $ systemd-run --user -t bash (nested) $ set|grep ^MODULE MODULEPATH=/etc/scl/modulefiles:/usr/share/Modules/modulefiles:/etc/modulefiles:/usr/share/modulefiles MODULEPATH_modshare=/usr/share/modulefiles:1:/usr/share/Modules/modulefiles:1:/etc/modulefiles:1 MODULESHOME=/usr/share/Modules MODULES_CMD=/usr/share/Modules/libexec/modulecmd.tcl MODULES_RUN_QUARANTINE='LD_LIBRARY_PATH LD_PRELOAD' When I start a gnome-terminal, the shell in the gnome terminal also has the MODULE* variables. So it seems that the environment-modules stuff is properly initialized without pushing anything through the manager environment block. I don't think there's anything to fix in environment modules. It's gnome-desktop/kde that might need some tweaking.
(In reply to Zbigniew Jędrzejewski-Szmek from comment #18) > Systemd itself doesn't use the shell *at all*. This also means that it > doesn't care about profile > scripts. Instead, various desktop environments execute in an environment > where the shell profile > functions have been executed, and when they call 'import-environment' or the > dbus-api equivalent > to propagate session-specific variables, various variables are pushed to the > manager environment > block. I don't think this makes sense, because we end up with too many wrong > things in the global > environment. Thanks a lot for the clarification and doc updates. It better understand it now. I can confirm that default bash shell configuration for user accounts correctly executes shell profile scripts (~/.bash_profile and ~/.bashrc evaluates /etc/bashrc which in turn evaluates /etc/profile.d/*.sh scripts).
This bug will be discussed at the Fedora Prioritized Bugs meeting on 2021-01-13 at 1600 UTC in #fedora-meeting https://apps.fedoraproject.org/calendar/base/#m9788
This is accepted as a Fedora Prioritized Bug: https://meetbot.fedoraproject.org/fedora-meeting/2021-01-13/fedora_prioritized_bugs_and_issues.2021-01-13-16.01.log.html#l-67
Latest update from KDE: FYI, a patch merged into Plasma 5.21 that filters env names as well the env values. I'm hoping this fixes our issue. Great news David! Can you point me toward the patch so I can advise the Fedora maintainers and see if they can backport so we can test it out? Also, I had opened https://bugzilla.redhat.com/show_bug.cgi?id=1912046 in Fedora to see if we could get the systemd and DE folks all on the same page to avoid this type of issue going forward. Part of the systemd response to that bug was: What I think should happen is that the desktop environment should export a few variables it sets itself (emphasize *itself* here, if the desktop environment is assigning a variable, it knows whether it belongs in the global session block): DESKTOP_SESSION, DISPLAY, GDMSESSION, GDM_LANG, GNOME_SETUP_DISPLAY?, SESSION_MANAGER?, WAYLAND_DISPLAY, XAUTHORITY, XDG_SESSION_*, or equivalent for other environments. Whenever a shell process is created for the user, it should execute shell profile functions and create set the shell-specific variables there. Is this the approach KDE is taking going forward or do you have a better idea or path? You can either respond in the fedora bug or here and I'll pass the word along. Fedora has made it a prioritized bug to attempt to ensure we have everyone's concurrence (as much as possible). Thanks again for your quick attention and action. It is much appreciated!
The patch referenced in comment #22 was backported by Rex (Thanks again for the quick work on getting it applied!) and looks like it circumvents the current known issues. However, after reading the comments for the patch, here: https://invent.kde.org/plasma/plasma-workspace/-/merge_requests/569 it doesn't appear to address the root cause. As I understand it, if there is another variable created that somehow isn't covered by the patch filters, we're back to square one and stuff again starts breaking. Unless anyone has a better idea the solution by Zbigniew seems very logical: > The long-term solution is to only pass specific variables. DISPLAY, WAYLAND_DISPLAY, and so on. > The rule should be "The display manager sets a variable, > it may forward it to the environment block if appropriate. No other variables Wouldn't it be easier to just pass what is needed, then to try to filter out something that is unknown?
In today's Prioritized Bugs meeting, we agreed that since our two main desktop environments have addressed this upstream and zbyszek has said there are no addtional systemd changes to make, this bug can be considered closed. So I am closing it.