Bug 1004435

Summary: RFE: PrivateNetwork=yes should become a NOP if the kernel doesn't support network namespaces
Product: [Fedora] Fedora
Reporter: Alessandro Suardi <alessandro.suardi>
Component: systemd
Assignee: Ray Strode [halfline] <rstrode>
Status: CLOSED UPSTREAM
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Docs Contact:
Priority: unspecified
Version: rawhide
CC: cschalle, jkysela, johannbg, lnykryn, lpoetter, msekleta, plautrba, rstrode, systemd-maint, vpavlin, zbyszek
Target Milestone: ---
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-11-16 18:23:37 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments (description: flags):
3.11.0-rc7 kernel .config (ok under F17, bad under F19): none
lspci -v: none
alsactl GDB backtrace: none
alsa-info.sh from 3.11.0 non-working: none
A test patch for alsactl: none
A test patch for alsactl: none
.config file where alsactl doesn't core, GNOME starts, box reboots. yay: none

Description Alessandro Suardi 2013-09-04 15:51:03 UTC
Description of problem:

startx in F19 does not produce a working gnome-session when running Torvalds kernel 3.11.0-rc7 (or 3.11.0 final); the "something has gone wrong" screen with a log out button appears instead. Disabling GNOME extensions doesn't fix the issue.

startx works with and without GNOME extensions under Fedora kernels.

The same 3.11.0-rc7 kernel, built from an identical .config under F17, yields a working gnome-session in Fedora 17 on the same computer (I am dual-booting F17 and F19 on the same box).

I don't know whether this is related to gnome-session, gnome-shell, some GNOME app, startx, X, or anything else, so I am filing this against gnome-session initially.


Version-Release number of selected component (if applicable):


How reproducible:
100%

Steps to Reproduce:
1. log in at console tty (runlevel 3)
2. startx

Actual results:
GNOME failure with log out button is displayed

Expected results:
functional GNOME desktop is displayed

Additional info:

Comment 1 Alessandro Suardi 2013-09-04 23:33:35 UTC
Created attachment 793905 [details]
3.11.0-rc7 kernel .config (ok under F17, bad under F19)

Comment 2 Alessandro Suardi 2013-09-04 23:46:35 UTC
Forgot to mention - h/w is a Dell Latitude E6420. Also attaching lspci -v output.

Comment 3 Alessandro Suardi 2013-09-04 23:47:54 UTC
Created attachment 793907 [details]
lspci -v

Comment 4 Alessandro Suardi 2013-09-16 10:07:59 UTC
Created attachment 798201 [details]
alsactl GDB backtrace

Comment 5 Alessandro Suardi 2013-09-16 10:09:31 UTC
The chain of failures appears to be:
 - session doesn't start, because
   - pulseaudio doesn't start, because
     - alsactl service dumps core upon startup, because
       - diff'ing dmesg from working to non-working seems to point to kernel audio detection issues (present in F19 kernel, absent in custom)

So I grabbed the Fedora kernel .config, trimmed away several known-useless entries, and built a 3.11.0 kernel that successfully starts my GNOME session.

I will leave this bug open while I work out what the difference is between my custom .config that didn't work (but worked on F17) and the F19 one, in case someone else stumbles into this bug.

Switching to alsa-utils because of alsactl core dump, and attaching GDB output obtained after installing debuginfo RPMs of alsactl, plus alsa-info.sh from the problem 3.11.0 kernel, in case someone already has clues.

Comment 6 Alessandro Suardi 2013-09-16 10:14:15 UTC
Created attachment 798204 [details]
alsa-info.sh from 3.11.0 non-working

Comment 7 Jaroslav Kysela 2013-09-16 13:21:25 UTC
Could you reproduce the core dump with '/usr/sbin/alsactl -s -n 19 -c -E ALSA_CONFIG_PATH=/etc/alsa/alsactl.conf --init' command on the command line as root?

Comment 8 Alessandro Suardi 2013-09-16 14:59:24 UTC
I copy/pasted your alsactl command, but it results in a usage error:

[root@xbox ~]# /usr/sbin/alsactl -s -n 19 -c -E ALSA_CONFIG_PATH=/etc/alsa/alsactl.conf --init
/usr/sbin/alsactl: option '--initfile' requires an argument



The core dump happens when invoking the systemctl service command as root (the GDB backtrace I attached was indeed obtained that way) like this:

[root@xbox ~]# ulimit -c
0
[root@xbox ~]# ulimit -c unlimited
[root@xbox ~]# /usr/sbin/alsactl -s -n 19 -c -E ALSA_CONFIG_PATH=/etc/alsa/alsactl.conf --initfile=/lib/alsa/init/00main rdaemon
Found hardware: "HDA-Intel" "Intel CougarPoint HDMI" "HDA:111d76e7,10280493,00100102 HDA:14f12c06,14f1000f,00100000 HDA:80862805,80860101,00100000" "0x1028" "0x0493"
Hardware is initialized using a generic method
alsactl[3181]: segfault at 1 ip 00000037bd248a7d sp 00007fffd06b9c70 error 4 in libc-2.17.so[37bd200000+1b6000]
Segmentation fault (core dumped)
[root@xbox ~]# gdb /usr/sbin/alsactl core.
core.2891  core.3181
[root@xbox ~]# gdb /usr/sbin/alsactl core.3181
GNU gdb (GDB) Fedora (7.6-34.fc19)
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/sbin/alsactl...Reading symbols from /usr/lib/debug/usr/sbin/alsactl.debug...done.
done.
[New LWP 3181]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/sbin/alsactl -s -n 19 -c -E ALSA_CONFIG_PATH=/etc/alsa/alsactl.conf --init'.
Program terminated with signal 11, Segmentation fault.
#0  0x00000037bd248a7d in _IO_vfprintf_internal (s=s@entry=0x7fffd06ba260, format=<optimized out>, format@entry=0x412860 "failed to obtain info for control #%d (%s)", ap=ap@entry=0x7fffd06ba818)
    at vfprintf.c:1634
1634              process_arg (((struct printf_spec *) NULL));
(gdb) bt
#0  0x00000037bd248a7d in _IO_vfprintf_internal (s=s@entry=0x7fffd06ba260, format=<optimized out>, format@entry=0x412860 "failed to obtain info for control #%d (%s)", ap=ap@entry=0x7fffd06ba818)
    at vfprintf.c:1634
#1  0x00000037bd30ae95 in ___vsnprintf_chk (s=0x7fffd06ba403 "failed to obtain info for control #", maxlen=<optimized out>, flags=flags@entry=1, slen=slen@entry=18446744073709551615,
    format=format@entry=0x412860 "failed to obtain info for control #%d (%s)", args=args@entry=0x7fffd06ba818) at vsnprintf_chk.c:63
#2  0x000000000040ad9a in vsnprintf (__ap=0x7fffd06ba818, __fmt=0x412860 "failed to obtain info for control #%d (%s)", __n=<optimized out>, __s=<optimized out>) at /usr/include/bits/stdio2.h:77
#3  syslog_ (fcn=<optimized out>, line=<optimized out>, fmt=fmt@entry=0x412860 "failed to obtain info for control #%d (%s)", ap=ap@entry=0x7fffd06ba818, prio=3) at utils.c:112
#4  0x000000000040b2b7 in cerror_ (fcn=fcn@entry=0x412708 <__FUNCTION__.9591> "set_control", line=line@entry=1325, cond=<optimized out>,
    fmt=fmt@entry=0x412860 "failed to obtain info for control #%d (%s)") at utils.c:154
#5  0x00000000004098d1 in set_control (handle=0xc09e40, control=0xbfefa0, maxnumid=maxnumid@entry=0x7fffd06bb18c, doit=doit@entry=1) at state.c:1325
#6  0x0000000000409f06 in set_controls (card=<optimized out>, top=0xbf8630, doit=doit@entry=1) at state.c:1512
#7  0x000000000040a74f in load_state (file=file@entry=0x411839 "/var/lib/alsa/asound.state", initfile=initfile@entry=0x7fffd06bc8b2 "/lib/alsa/init/00main", cardname=cardname@entry=0x0,
    do_init=do_init@entry=1) at state.c:1746
#8  0x0000000000405c06 in main (argc=<optimized out>, argv=<optimized out>) at alsactl.c:354
(gdb) quit
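
For anyone scripting the reproduction above: the `ulimit -c unlimited` step can also be done from a wrapper program. The following is a minimal Python sketch, not part of the original session. The helper name `enable_core_dumps` is made up for illustration, and unlike the root shell above it only raises the soft limit as far as the hard limit already allows.

```python
import resource


def enable_core_dumps():
    """Raise the soft core-file size limit to the current hard limit.

    Similar in spirit to `ulimit -c unlimited`, but it never tries to
    exceed the hard limit, so it also works for unprivileged processes.
    Returns the (soft, hard) pair in effect afterwards.
    """
    _soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_CORE)


if __name__ == "__main__":
    print(enable_core_dumps())
```

Passing the function as `preexec_fn` to `subprocess.Popen` would apply the limit in the child process only, which is probably what you want when launching alsactl from a test script.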


The copy/paste above is from 3.11.1 compiled with .config from 3.11.0-rc6 (just wanted to make sure the -rc7 .config wasn't somehow corrupted), which also yielded a completely working kernel under F17.


I have to say that I'm pretty puzzled: my initial cut of a working 3.11.0 from a reduced F19 kernel .config is good enough to start GNOME and let me file this additional information, but it also fails to reboot the machine, hanging completely after printing "Machine restart". It's as if F19 is making building custom kernels exceedingly painful, even for those like me who have done so since 1.3.60, that is, for more than 15 years.


In the meantime, thanks for your prompt feedback - I'm back to hunting for the magic combination of kernel .config options needed to boot, start Gnome and reboot the laptop under F19 userspace :/

Comment 9 Jaroslav Kysela 2013-09-16 15:18:14 UTC
Created attachment 798314 [details]
A test patch for alsactl

A test patch for alsactl..

Comment 10 Jaroslav Kysela 2013-09-16 15:20:59 UTC
Could you apply the test patch from comment 9 and report back the results (printed values)?

You can obtain the source for the alsa-utils package from ftp://ftp.alsa-project.org/pub/utils/alsa-utils-1.0.27.2.tar.bz2 .

Comment 11 Alessandro Suardi 2013-09-16 17:26:21 UTC
Sure thing - here it is:

[root@xbox alsactl]# cd /download/linux/f19/alsa-utils-1.0.27.2/alsactl/
[root@xbox alsactl]# ulimit -c unlimited
[root@xbox alsactl]# ./alsactl -s -n 19 -c -E ALSA_CONFIG_PATH=/etc/alsa/alsactl.conf --initfile=/lib/alsa/init/00main rdaemon  2>&1
numid = 16
stderr = 0x37bd37b088
strerr = 'No such file or directory'
Found hardware: "HDA-Intel" "Intel CougarPoint HDMI" "HDA:111d76e7,10280493,00100102 HDA:14f12c06,14f1000f,00100000 HDA:80862805,80860101,00100000" "0x1028" "0x0493"
Hardware is initialized using a generic method
numid = 16
stderr = 0x37bd37b088
strerr = 'No such file or directory'
traps: alsactl[2835] general protection ip:37bd248e29 sp:7fff201085c0 error:0 in libc-2.17.so[37bd200000+1b6000]
Segmentation fault (core dumped)

Comment 12 Jaroslav Kysela 2013-09-17 09:50:58 UTC
Created attachment 798698 [details]
A test patch for alsactl

Please also provide the output with this patch.

Comment 13 Alessandro Suardi 2013-09-17 13:25:10 UTC
This one is with 3.11.0 rebuilt without the only SND option which differed between my .config and F19 (CONFIG_SND_SUPPORT_OLD_API was set to "y" in my kernel; in this specific boot it is set to "n")...

Fails nonetheless, in this way:

[root@xbox alsactl]# ./alsactl -s -n 19 -c -E ALSA_CONFIG_PATH=/etc/alsa/alsactl.conf --initfile=/lib/alsa/init/00main rdaemon
numid = 16
stderr = 0x37bd37b088
strerr = 'No such file or directory'
Found hardware: "HDA-Intel" "Intel CougarPoint HDMI" "HDA:111d76e7,10280493,00100102 HDA:14f12c06,14f1000f,00100000 HDA:80862805,80860101,00100000" "0x1028" "0x0493"
Hardware is initialized using a generic method
numid = 16
stderr = 0x37bd37b088
strerr = 'No such file or directory'
len = 27, size = 997
traps: alsactl[2860] general protection ip:37bd248e29 sp:7fffd41a9880 error:0 in libc-2.17.so[37bd200000+1b6000]
Segmentation fault (core dumped)


Thanks !

Comment 14 Alessandro Suardi 2013-09-19 21:39:36 UTC
Okay - further work (now on 3.12.1-rc1) shows the issue appears to be elsewhere...

With these diffs I can fix the alsactl core dump, but still my gnome-session doesn't start...

diff .config-3.12.1-rc1-working .config-3.12.1-rc1-try2 
2030d2029
< CONFIG_SND_JACK=y
2042c2041
< # CONFIG_SND_SUPPORT_OLD_API is not set
---
> CONFIG_SND_SUPPORT_OLD_API=y
2056d2054
< CONFIG_SND_AC97_CODEC=m
2105,2108c2103,2105
< CONFIG_SND_HDA_INPUT_BEEP=y
< CONFIG_SND_HDA_INPUT_BEEP_MODE=1
< CONFIG_SND_HDA_INPUT_JACK=y
< CONFIG_SND_HDA_PATCH_LOADER=y
---
> # CONFIG_SND_HDA_INPUT_BEEP is not set
> # CONFIG_SND_HDA_INPUT_JACK is not set
> # CONFIG_SND_HDA_PATCH_LOADER is not set
2128,2129c2125,2126
< CONFIG_SND_INTEL8X0=m
< CONFIG_SND_INTEL8X0M=m
---
> # CONFIG_SND_INTEL8X0 is not set
> # CONFIG_SND_INTEL8X0M is not set
2152d2148
< CONFIG_AC97_BUS=m
2628c2624
< CONFIG_EXT4_FS=y
---
> CONFIG_EXT4_FS=m
2635c2631
< CONFIG_JBD2=y
---
> CONFIG_JBD2=m
3129c3125
< CONFIG_CRYPTO_CRC32C=y
---
> CONFIG_CRYPTO_CRC32C=m
3227c3223
< CONFIG_CRC16=y
---
> CONFIG_CRC16=m

Maybe CONFIG_SND_HDA_INPUT_BEEP or CONFIG_SND_HDA_INPUT_JACK or CONFIG_SND_HDA_PATCH_LOADER make a difference now? I certainly never had a kernel with those built in under F17, and gnome-session always worked.
Ditto for the added CONFIG_SND _modules_, which aren't even loaded according to 'lsmod' when I get to a kernel where alsactl doesn't bomb.


The weird part is that at some point I ended up with an unbootable kernel where EXT4 was selected as a module (as it has been for ages) and apparently wasn't properly loaded - I was dropped at the dracut# prompt where I was able to see that /dev/sda was detected... so tried the simple mount /dev/sdaN /somewhere for my rootfs, and lo and behold - "unrecognized filesystem type ext4", or something like that. CONFIG_EXT4_FS=y instead of =m appears to have made my kernel bootable...

Other than conjecturing that something is broken in the module loading area, my only option now is to iterate through kernels built from the F19 .config and cut out supposedly unneeded stuff until I cut out the part that unexpectedly makes all hell break loose.
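
When iterating over many trimmed configs like this, a plain `diff` is sensitive to line order and to options that merely moved. Below is a minimal sketch of an option-level comparison, offered as an illustration rather than an existing tool. The function names are hypothetical, and it assumes that an option missing from a file is equivalent to "is not set".

```python
import re

# "CONFIG_FOO=y", "CONFIG_FOO=m", "CONFIG_FOO=1", ...
_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
# "# CONFIG_FOO is not set"
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_kconfig(text):
    """Map option name -> value; unset options are recorded as "n"."""
    opts = {}
    for line in text.splitlines():
        line = line.strip()
        m = _SET.match(line)
        if m:
            opts[m.group(1)] = m.group(2)
            continue
        m = _UNSET.match(line)
        if m:
            opts[m.group(1)] = "n"
    return opts


def diff_kconfig(old_text, new_text):
    """Return {option: (old_value, new_value)} for options that differ.

    Assumption: an option absent from a file is treated as "n".
    """
    old, new = parse_kconfig(old_text), parse_kconfig(new_text)
    changes = {}
    for name in sorted(old.keys() | new.keys()):
        a, b = old.get(name, "n"), new.get(name, "n")
        if a != b:
            changes[name] = (a, b)
    return changes


if __name__ == "__main__":
    # A few lines taken from the diff in this comment.
    working = "CONFIG_SND_JACK=y\n# CONFIG_SND_SUPPORT_OLD_API is not set\nCONFIG_EXT4_FS=y\n"
    try2 = "CONFIG_SND_SUPPORT_OLD_API=y\nCONFIG_EXT4_FS=m\n"
    for name, (a, b) in diff_kconfig(working, try2).items():
        print(f"{name}: {a} -> {b}")
```

Feeding it the full contents of two .config files gives a sorted list of changed options that is easier to bisect than raw diff hunks.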

Comment 15 Alessandro Suardi 2013-09-23 14:14:20 UTC
Switching this to rtkit, as I finally found the issue, which is basically nothing other than a duplicate of bug 907576.

rtkit-daemon doesn't start because of missing NET_NS support in kernel, and from there it's downhill.

While I understand that the "real" problem is that NET_NS stopped being autoselected at some point in Kconfig development, can we pretty please have rtkit-daemon report this loudly (instead of failing silently and, in the process, making the graphical desktop entirely unusable)?
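
For reference, a running kernel can be probed for network namespace support without consulting its .config. The sketch below is a heuristic and not what rtkit or systemd actually does: it relies on the `/proc/<pid>/ns/net` entry, which Linux exposes only when CONFIG_NET_NS is enabled. The function name and the `proc_root` parameter are hypothetical; the parameter exists only to make the check testable.

```python
import os


def kernel_has_netns(proc_root="/proc"):
    """Heuristic: network namespaces are available if <proc>/self/ns/net exists.

    The entry is present only on kernels built with CONFIG_NET_NS.
    os.path.lexists is used so that the check does not try to follow
    the namespace symlink.
    """
    return os.path.lexists(os.path.join(proc_root, "self", "ns", "net"))


if __name__ == "__main__":
    if kernel_has_netns():
        print("network namespaces available")
    else:
        print("kernel lacks CONFIG_NET_NS; PrivateNetwork=yes units will fail")
```

A service wrapper could run such a check at startup and log an explicit message, which is the kind of loud failure requested above.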


Offtopic, at this point -
I am rebuilding my kernel one final time for the ALSA core dump, since the current 3.12.0-rc1 works fine without CONFIG_SND_INTEL8X0/CONFIG_SND_INTEL8X0M and now I'd love to verify that it still works without HDA_INPUT_BEEP, HDA_INPUT_JACK, or HDA_PATCH_LOADER...

Comment 16 Alessandro Suardi 2013-09-23 14:48:00 UTC
The alsactl core dump is cured by compiling in CONFIG_SND_HDA_INPUT_BEEP:

[asuardi@xbox src]$ ls -l .config-3.12.0-rc1-f19-1[67]*
-rw-r--r--. 1 asuardi asuardi 86798 Sep 23 16:29 .config-3.12.0-rc1-f19-16-without_SND_HDA_CODECs_and_JACK_and_PATCH_LOADER_and_BEEP-alsacore
-rw-r--r--. 1 asuardi asuardi 86820 Sep 23 16:39 .config-3.12.0-rc1-f19-17-without_SND_HDA_CODECs_and_JACK_and_PATCH_LOADER-working

[asuardi@xbox src]$ diff .config-3.12.0-rc1-f19-1[67]*
2171c2171,2172
< # CONFIG_SND_HDA_INPUT_BEEP is not set
---
> CONFIG_SND_HDA_INPUT_BEEP=y
> CONFIG_SND_HDA_INPUT_BEEP_MODE=1

Attaching my trimmed-so-far .config file for reference. I leave it to Jaroslav to decide whether the alsactl core dump should be pursued as a separate bug; I am satisfied with the current status.


I should also remark that configuring in NET_NS also yields a kernel that _does_ reboot after printing "Machine restart" - instead of hanging there, requiring holding down the power button to shut down.

Comment 17 Alessandro Suardi 2013-09-23 14:51:58 UTC
Created attachment 801687 [details]
.config file where alsactl doesn't core, GNOME starts, box reboots. yay

Comment 18 Lennart Poettering 2013-09-26 20:17:19 UTC
Well, PrivateNetwork=yes is a systemd feature, not an rtkit feature.

It's probably a good idea to ignore PrivateNetwork=yes if network namespaces are not available. Reassigning.

Comment 19 Zbigniew Jędrzejewski-Szmek 2013-10-03 14:40:39 UTC
Hm, I'm pretty sure it should be the other way around: if network ns is not available, fail to start. People rely on PrivateNetwork for containment of units, and we should not break those expectations.
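
To make the two positions in comments 18 and 19 concrete, here is a purely illustrative sketch of the decision being debated. It is not systemd code; the `Policy` enum, the function name, and the returned strings are all invented for this example.

```python
from enum import Enum


class Policy(Enum):
    IGNORE = "ignore"  # comment 18: treat the setting as a no-op
    FAIL = "fail"      # comment 19: refuse to start the unit


def resolve_private_network(requested, netns_available, policy):
    """Decide what happens to a unit that may ask for PrivateNetwork=yes."""
    if not requested:
        return "start-shared-network"
    if netns_available:
        return "start-private-network"
    # PrivateNetwork=yes was requested but the kernel cannot provide it.
    if policy is Policy.IGNORE:
        return "start-shared-network"  # unit runs, but without containment
    return "refuse-to-start"           # containment guarantee is preserved


if __name__ == "__main__":
    for policy in Policy:
        print(policy.name, resolve_private_network(True, False, policy))
```

The only branch where the two policies differ is the last one: IGNORE keeps the desktop working on kernels without NET_NS, while FAIL keeps the isolation promise that unit authors may be relying on.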

Comment 20 Fedora End Of Life 2015-01-09 19:43:41 UTC
This message is a notice that Fedora 19 is now at end of life. Fedora 
has stopped maintaining and issuing updates for Fedora 19. It is 
Fedora's policy to close all bug reports from releases that are no 
longer maintained. Approximately 4 (four) weeks from now this bug will
be closed as EOL if it remains open with a Fedora 'version' of '19'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 19 reached end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora 
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 21 Zbigniew Jędrzejewski-Szmek 2015-11-16 18:23:37 UTC
I think that we cannot ignore PrivateNetwork=yes because people could rely on it for security. This is more of a discussion for the systemd-devel mailing list or the upstream bug tracker at https://github.com/systemd/systemd/issues.