Bug 806107

Summary: Fedora 17 Alpha can't start xenstored, so neither xend nor xl works, as both depend on that userspace process.
Product: [Fedora] Fedora
Reporter: Konrad Rzeszutek Wilk <ketuzsezr>
Component: xen
Assignee: Michael Young <m.a.young>
Status: CLOSED CURRENTRELEASE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Priority: unspecified
Version: rawhide
CC: gansalmon, greno, itamar, jforbes, jonathan, kernel-maint, kraxel, lersek, madhu.chinakonda, m.a.young, mike, virt-maint, xen-maint
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: xen-4.1.2-20.fc17
Doc Type: Bug Fix
Last Closed: 2012-08-07 21:58:51 UTC

Description Konrad Rzeszutek Wilk 2012-03-22 21:53:25 UTC
Description of problem:
xenstored is the first component that Xend needs in order to work properly. To function,
it needs to interact over /proc/xen/xsd_port.

But somehow neither xsd_port nor xsd_kva is there.

Version-Release number of selected component (if applicable):


How reproducible:

Install Fedora 17

Steps to Reproduce:
1. 
2.
3.
  
Actual results:

dr-xr-xr-x 134 root root 0 Mar 22 21:43 ..
-r--r--r--   1 root root 0 Mar 22 21:43 capabilities
-rw-------   1 root root 0 Mar 22 21:43 privcmd
-rw-------   1 root root 0 Mar 22 21:43 xenbus


Expected results:
x   2 root root 0 Mar 22 21:43 .
dr-xr-xr-x 134 root root 0 Mar 22 21:43 ..
-r--r--r--   1 root root 0 Mar 22 21:43 capabilities
-rw-------   1 root root 0 Mar 22 21:43 privcmd
-rw-------   1 root root 0 Mar 22 21:43 xenbus
-rw-------   1 root root 0 Mar 22 21:43 xsd_kva
-rw-------   1 root root 0 Mar 22 21:43 xsd_port


Additional info:

This is 3.3.0-1.fc17.i686.PAE
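
The comparison between the actual and expected listings can be sketched as a small check. This is only an illustration: the helper name missing_xen_entries and the EXPECTED tuple are made up for this sketch, and the expected names are simply copied from the "Expected results" listing in this report.

```python
import os

# Entry names copied from the "Expected results" listing in this report.
EXPECTED = ("capabilities", "privcmd", "xenbus", "xsd_kva", "xsd_port")


def missing_xen_entries(present, expected=EXPECTED):
    """Return the expected entries that are absent from `present`, in order."""
    present = set(present)
    return [name for name in expected if name not in present]


if __name__ == "__main__":
    proc_xen = "/proc/xen"  # the directory discussed in this report
    names = os.listdir(proc_xen) if os.path.isdir(proc_xen) else []
    print(missing_xen_entries(names))
```

Fed the three names from the "Actual results" listing, the helper reports xsd_kva and xsd_port as missing.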

Comment 1 Konrad Rzeszutek Wilk 2012-03-22 21:56:53 UTC
Doing the magic 'setenforce 0' and then starting xenstored makes it work. I still can't see the /proc/xen/xsd_port ioctl file though.

Comment 2 Laszlo Ersek 2012-03-22 22:09:08 UTC
Upstream discussion with patch:
https://lkml.org/lkml/2012/3/22/303

Changing component to "kernel".

Comment 3 Dave Jones 2012-03-23 15:08:04 UTC
I'm confused, bfcfaa77bdf0f775263e906015982a608df01c76 wasn't in 3.3
This shouldn't affect F17.

Comment 4 Laszlo Ersek 2012-03-23 16:08:56 UTC
Sorry, I may have been confused by the coinciding symptoms.

Konrad, what do you think? Thanks.

Comment 5 Laszlo Ersek 2012-03-23 16:14:18 UTC
Note: if you open the discussion in comment 2, it says:

"v3.3 with just that git commit"

and the kernel config file Konrad posted begins as:

#
# Automatically generated file; DO NOT EDIT.
# Linux/x86_64 3.3.0 Kernel Configuration
#

Thus what may have happened is: 3.3 was taken, upstream bfcfaa77 applied on it, the RPM rebuilt and used, and when it broke, the BZ was reported against 3.3 (not "3.3 + bfcfaa77"). In other words, this may be NOTABUG.

Konrad, please confirm. Thanks!

Comment 6 Michael Young 2012-03-23 20:41:08 UTC
I can confirm the missing files really are missing with a regular F17 kernel running dom0.

I was also playing with my customised kernel on F16 (built to test a fix for Bug 804347), booting as a dom0 but without any xen processes, and I found that by mounting and unmounting /proc/xen I could get either 3, 4 or 5 files in /proc/xen, so it might be an intermittent bug.

Comment 7 Konrad Rzeszutek Wilk 2012-03-23 20:47:20 UTC
(In reply to comment #5)
> Note: if you open the discussion in comment 2, it says:
> 
> "v3.3 with just that git commit"
> 
> and the kernel config file Konrad posted begins as:
> 
> #
> # Automatically generated file; DO NOT EDIT.
> # Linux/x86_64 3.3.0 Kernel Configuration
> #
> 
> Thus what may have happened is: 3.3 was taken, upstream bfcfaa77 applied on it,
> the RPM rebuilt and used, and when it broke, the BZ was reported against 3.3
> (not "3.3 + bfcfaa77"). In other words, this may be NOTABUG.
> 
> Konrad, please confirm. Thanks!

Correct. It was the post-3.3 kernel (also known as 3.4-rc0).

Comment 8 Konrad Rzeszutek Wilk 2012-03-23 20:48:52 UTC
Let me change the title back.

Comment 9 Konrad Rzeszutek Wilk 2012-03-23 20:50:13 UTC
So to either clarify or muddy the waters - the issue I am seeing is against 3.3.0-1.fc17.i686.PAE. Which is a 3.3 kernel (right?), not a 3.4-rc0 type?

Comment 10 Josh Boyer 2012-03-26 20:53:15 UTC
(In reply to comment #9)
> So to either clarify or muddy the waters - the issue I am seeing is against
> 3.3.0-1.fc17.i686.PAE. Which is a 3.3 kernel (right?), not a 3.4-rc0 type?

Correct.  The 3.4-rc0 type kernels will have 3.4.0-0.rcX.gitX.0.fc18 naming, and aren't in F17.

So this bug is really against rawhide from what I can tell.  What I can't tell is what the bug is really asking...
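
The naming rule described above can be sketched as a pattern match. This is a rough illustration only: the function name is_rc_snapshot is invented here, the pattern is derived solely from the "3.4.0-0.rcX.gitX.0.fc18" form quoted in this comment (reading each X as a number), and real rawhide release strings may differ.

```python
import re

# Matches the "<version>-0.rcX.gitX.<n>.fc<NN>" form quoted in this comment,
# reading each X as a decimal number. Any trailing arch suffix is ignored.
RC_SNAPSHOT = re.compile(r"^\d+\.\d+\.\d+-0\.rc\d+\.git\d+\.\d+\.fc\d+")


def is_rc_snapshot(release):
    """Return True if `release` looks like an -rc/git snapshot kernel name."""
    return RC_SNAPSHOT.match(release) is not None
```

With this pattern, the kernel from the report, 3.3.0-1.fc17.i686.PAE, is not classified as a snapshot build.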

Comment 11 Josh Boyer 2012-03-26 20:56:13 UTC
I'm guessing this is fixed by upstream commit:

commit f132c5be05e407a99cf582347a2ae0120acd3ad7
Author: Al Viro <viro.org.uk>
Date:   Thu Mar 22 21:59:52 2012 +0000

    Fix full_name_hash() behaviour when length is a multiple of 8
    
    We want it to match what hash_name() is doing, which means extra
    multiply by 9 in this case...
    
    Reported-and-Tested-by: Konrad Rzeszutek Wilk <konrad.wilk>
    Signed-off-by: Al Viro <viro.org.uk>
    Signed-off-by: Linus Torvalds <torvalds>

which means it should get rolled into rawhide with the next kernel build.
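
To illustrate the kind of mismatch the commit message describes, here is a toy model. It is not the kernel's code: both functions are invented for this sketch, and they only mimic a word-at-a-time hash where one variant applies the multiply by 9 after every full 8-byte word and the other omits it when the name ends exactly on a word boundary.

```python
WORD = 8                 # bytes per machine word on a 64-bit system
MASK64 = (1 << 64) - 1   # emulate 64-bit unsigned wraparound


def _words(name):
    """Split `name` into full 8-byte little-endian words plus a shorter tail."""
    full = len(name) - len(name) % WORD
    words = [int.from_bytes(name[i:i + WORD], "little") for i in range(0, full, WORD)]
    return words, name[full:]


def hash_every_word(name):
    """Toy reference: multiply by 9 after every full word."""
    words, tail = _words(name)
    h = 0
    for w in words:
        h = ((h + w) * 9) & MASK64
    return (h + int.from_bytes(tail, "little")) & MASK64


def hash_skip_last(name):
    """Toy variant: omit the multiply when the final full word ends the name."""
    words, tail = _words(name)
    h = 0
    for i, w in enumerate(words):
        h = (h + w) & MASK64
        if tail or i + 1 < len(words):
            h = (h * 9) & MASK64
    return (h + int.from_bytes(tail, "little")) & MASK64
```

In this toy model the two functions agree for names such as b"xsd_kva" (7 bytes) and b"capabilities" (12 bytes) but disagree for b"xsd_port" (8 bytes). The names are used only as convenient inputs; nothing here establishes that this is what happened on the reporter's kernel.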

Comment 12 Josh Boyer 2012-03-26 21:04:21 UTC
OK, except as Dave pointed out, neither of those commits is in F17 3.3.0-1.fc17.

Konrad, if xenstored works on a stock Fedora 3.3.0-1.fc17 kernel with 'setenforce 0', is there something in the audit log before that?  Also, you said both of the files were needed for it to work, yet with 'setenforce 0' you see only one file and xenstored still "works".

Confused.

Comment 13 Michael Young 2012-03-31 15:47:10 UTC
The selinux problems could be a separate issue - I have just discovered that the permissions on the /var/lib/xenstored mount changed between F16 and F17 which was enough to stop guests booting with selinux in enforcing mode. It is fixed in the upcoming version of xen so it is probably worth retesting with the new version, though I can't see how this would relate to the missing files in /proc/xen .

Comment 14 Konrad Rzeszutek Wilk 2012-04-03 20:42:26 UTC
I think I am not seeing the files in /proc/xen because I am sudo-ing?

Either way, more than happy to test any new version - should I use xen-4.1.2-14.fc17 ?

Comment 15 Konrad Rzeszutek Wilk 2012-04-03 21:24:10 UTC
So with that version I saw a file with a missing permission:

[root@tst013 konrad]# systemctl status xendomains.service
xendomains.service - Xendomains - start and stop guests on boot and shutdown
	  Loaded: loaded (/usr/lib/systemd/system/xendomains.service; enabled)
	  Active: failed (Result: exit-code) since Tue, 03 Apr 2012 17:15:49 -0400; 44s ago
	 Process: 1572 ExecStart=/usr/libexec/xendomains start (code=exited, status=203/EXEC)
	 Process: 1570 ExecStartPre=/bin/grep -q control_d /proc/xen/capabilities (code=exited, status=0/SUCCESS)
	  CGroup: name=systemd:/system/xendomains.service

Apr 03 17:15:49 tst013.dumpdata.com (ndomains)[1572]: Failed at step EXEC spawning /usr/libexec/xendomains: Permission denied

[root@tst013 konrad]#  /usr/libexec/xendomains 
bash: /usr/libexec/xendomains: Permission denied
[root@tst013 konrad]# chmod 755 /usr/libexec/xendomains 
[root@tst013 konrad]# systemctl start xendomains.service
[root@tst013 konrad]# xm list
Error: Unable to connect to xend: No such file or directory. Is xend running?
[root@tst013 konrad]# systemctl status xendomains.service
xendomains.service - Xendomains - start and stop guests on boot and shutdown
	  Loaded: loaded (/usr/lib/systemd/system/xendomains.service; enabled)
	  Active: active (exited) since Tue, 03 Apr 2012 17:17:26 -0400; 30s ago
	 Process: 1619 ExecStart=/usr/libexec/xendomains start (code=exited, status=0/SUCCESS)
	 Process: 1617 ExecStartPre=/bin/grep -q control_d /proc/xen/capabilities (code=exited, status=0/SUCCESS)
	  CGroup: name=systemd:/system/xendomains.service

and that the "xenconsoled.service" isn't started by default.

But I can't actually find anywhere the service that would start xend?
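
The "Permission denied" on /usr/libexec/xendomains above went away after chmod 755, which suggests the execute bits were missing. A minimal sketch of checking for that condition follows; the helper name is_executable_mode is invented for this example, and the file's mode before the chmod is not shown in this report.

```python
import stat


def is_executable_mode(mode):
    """True if any execute bit (user, group, or other) is set in `mode`."""
    return bool(mode & (stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH))
```

It could be applied as is_executable_mode(os.stat("/usr/libexec/xendomains").st_mode). On Linux even root needs at least one execute bit set to run a file, which is consistent with the failure at the root prompt.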

Comment 16 Konrad Rzeszutek Wilk 2012-04-03 21:25:52 UTC
In F17 I see:
[root@tst013 konrad]# systemctl --all | grep xen
proc-xen.mount            loaded active   mounted       Mount /proc/xen files
var-lib-xenstored.mount   loaded active   mounted       mount xenstore file system
xenconsoled.service       loaded active   running       Xenconsoled - handles logging from guest consoles and hypervisor
xendomains.service        loaded active   exited        Xendomains - start and stop guests on boot and shutdown
xenstored.service         loaded active   running       Xenstored - daemon managing xenstore file system

While in F16 there was also the xend.service?
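
The check being done by eye on the listing above can be sketched as follows. The function names and the SAMPLE constant are made up for this illustration; SAMPLE is copied from the output pasted in this comment, and the parsing assumes the unit name is the first whitespace-separated column.

```python
# Output copied from the "systemctl --all | grep xen" listing in this comment.
SAMPLE = """\
proc-xen.mount            loaded active   mounted       Mount /proc/xen files
var-lib-xenstored.mount   loaded active   mounted       mount xenstore file system
xenconsoled.service       loaded active   running       Xenconsoled - handles logging from guest consoles and hypervisor
xendomains.service        loaded active   exited        Xendomains - start and stop guests on boot and shutdown
xenstored.service         loaded active   running       Xenstored - daemon managing xenstore file system
"""


def listed_units(systemctl_output):
    """Return the first column (the unit name) of each non-empty line."""
    return [line.split()[0] for line in systemctl_output.splitlines() if line.strip()]


def missing_units(systemctl_output, wanted):
    """Return the units from `wanted` that do not appear in the listing."""
    listed = set(listed_units(systemctl_output))
    return [unit for unit in wanted if unit not in listed]
```

For the listing above, missing_units(SAMPLE, ["xenstored.service", "xend.service"]) reports only xend.service as absent. Absence from this listing does not by itself say whether the unit file is installed.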

Comment 17 Michael Young 2012-04-03 23:08:46 UTC
(In reply to comment #14)
> I think I am not seeing the files in /proc/xen because I am sudo-ing?
> 
> Either way, more than happy to test any new version - should I use
> xen-4.1.2-14.fc17 ?

Yes, xen-4.1.2-14.fc17 fixes the permissions on /var/lib/xenstored though there may be other selinux issues. xend can be started in the normal systemd way, ie. systemctl start xend.service (systemctl enable xend.service will start it on boot). I believe service xend start still works as well.

Comment 18 Konrad Rzeszutek Wilk 2012-04-04 17:37:25 UTC
Ah, hadn't realized that it was _not_ enabled by default. So this is what I get:
systemctl enable xend.service
ln -s '/usr/lib/systemd/system/xend.service' '/etc/systemd/system/multi-user.target.wants/xend.service'
[root@tst013 konrad]# systemctl start xend.service
Job failed. See system journal and 'systemctl status' for details.
[root@tst013 konrad]# systemctl status
Too few arguments.
[root@tst013 konrad]# systemctl status xend.service
xend.service - Xend - interface between hypervisor and some applications
	  Loaded: loaded (/usr/lib/systemd/system/xend.service; enabled)
	  Active: failed (Result: exit-code) since Wed, 04 Apr 2012 13:29:28 -0400; 10s ago
	 Process: 1552 ExecStart=/usr/sbin/xend (code=exited, status=1/FAILURE)
	 Process: 1550 ExecStartPre=/bin/grep -q control_d /proc/xen/capabilities (code=exited, status=0/SUCCESS)
	  CGroup: name=systemd:/system/xend.service

Apr 04 13:29:28 tst013.dumpdata.com xend[1552]: Traceback (most recent call last):
Apr 04 13:29:28 tst013.dumpdata.com xend[1552]: File "/usr/lib/python2.7/site.py", line 563, in <module>
Apr 04 13:29:28 tst013.dumpdata.com xend[1552]: main()



And the graphical tool tells me:
SELinux is preventing xend from read access on the file /etc/passwd.

which is odd. Why would Xend want to touch /etc/passwd?

Apr  4 13:29:19 tst013 systemd[1]: Reloading.
Apr  4 13:29:28 tst013 xend[1552]: Traceback (most recent call last):
Apr  4 13:29:28 tst013 xend[1552]: File "/usr/lib/python2.7/site.py", line 563, in <module>
Apr  4 13:29:28 tst013 xend[1552]: main()
Apr  4 13:29:28 tst013 xend[1552]: File "/usr/lib/python2.7/site.py", line 545, in main
Apr  4 13:29:28 tst013 xend[1552]: known_paths = addusersitepackages(known_paths)
Apr  4 13:29:28 tst013 xend[1552]: File "/usr/lib/python2.7/site.py", line 278, in addusersitepackages
Apr  4 13:29:28 tst013 xend[1552]: user_site = getusersitepackages()
Apr  4 13:29:28 tst013 xend[1552]: File "/usr/lib/python2.7/site.py", line 253, in getusersitepackages
Apr  4 13:29:28 tst013 xend[1552]: user_base = getuserbase() # this will also set USER_BASE
Apr  4 13:29:28 tst013 xend[1552]: File "/usr/lib/python2.7/site.py", line 243, in getuserbase
Apr  4 13:29:28 tst013 xend[1552]: USER_BASE = get_config_var('userbase')
Apr  4 13:29:28 tst013 xend[1552]: File "/usr/lib/python2.7/sysconfig.py", line 520, in get_config_var
Apr  4 13:29:28 tst013 xend[1552]: return get_config_vars().get(name)
Apr  4 13:29:28 tst013 xend[1552]: File "/usr/lib/python2.7/sysconfig.py", line 424, in get_config_vars
Apr  4 13:29:28 tst013 xend[1552]: _CONFIG_VARS['userbase'] = _getuserbase()
Apr  4 13:29:28 tst013 xend[1552]: File "/usr/lib/python2.7/sysconfig.py", line 182, in _getuserbase
Apr  4 13:29:28 tst013 xend[1552]: return env_base if env_base else joinuser("~", ".local")
Apr  4 13:29:28 tst013 xend[1552]: File "/usr/lib/python2.7/sysconfig.py", line 169, in joinuser
Apr  4 13:29:28 tst013 xend[1552]: return os.path.expanduser(os.path.join(*args))
Apr  4 13:29:28 tst013 xend[1552]: File "/usr/lib/python2.7/posixpath.py", line 260, in expanduser
Apr  4 13:29:28 tst013 xend[1552]: userhome = pwd.getpwuid(os.getuid()).pw_dir
Apr  4 13:29:28 tst013 xend[1552]: KeyError: 'getpwuid(): uid not found: 0'
Apr  4 13:29:28 tst013 systemd[1]: xend.service: control process exited, code=exited status=1
Apr  4 13:29:28 tst013 systemd[1]: Unit xend.service entered failed state.

which is all basic python related.

Comment 19 Michael Young 2012-04-10 22:06:22 UTC
There is code in xend to run as a different user, but I suspect it would take a lot of work to actually use it. I suspect bits of this code are attempting to do uid related operations and thus trying to look at the passwd file.

To complicate things, if libvirtd is started when xend isn't running, you get a xend status process left lying around which systemd doesn't know about, and that gets in the way if you try to start xend.

Comment 20 Gerry Reno 2012-06-07 02:20:41 UTC
Related:  http://www.fedoraforum.org/forum/showthread.php?t=280766

Does this mean Xen is non-functional in F17  =  showstopper.

.

Comment 21 Josh Boyer 2012-06-07 13:12:45 UTC
(In reply to comment #20)
> Related:  http://www.fedoraforum.org/forum/showthread.php?t=280766
> 
> Does this mean Xen is non-functional in F17  =  showstopper.
> 
> .

F17 already shipped, so obviously not a showstopper.

Konrad, Michael, now that F17 is on the 3.4.0 (really 3.4.1) kernel, are you still seeing the issues?  It seems that the kernel wasn't really the culprit anyway, and that a newer xen package was needed?

Comment 22 Michael Young 2012-06-09 18:20:34 UTC
I haven't noticed any missing files in /proc/xen recently, but I haven't been looking for them so I might have missed them. I haven't had any problems using xen.

On the xend front I worked out why it wasn't starting via systemd when selinux was enabled - if python doesn't have the HOME environment variable set, it tries to work out what it should be by looking in /etc/passwd before it starts parsing any of the code. This should be fixed in xen-4.1.2-19.fc17/18 .
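
The HOME behaviour described here can be demonstrated with a short sketch. The traceback in comment 18 shows site.py expanding "~/.local" via os.path.expanduser(), which falls back to pwd.getpwuid() when HOME is unset; with HOME set, the lookup in /etc/passwd is not needed. The helper name expand_user_base is invented for this example, and how the xen package actually sets HOME is not shown in this report.

```python
import os
import posixpath


def expand_user_base(home):
    """Expand '~/.local' with HOME temporarily forced to `home`."""
    saved = os.environ.get("HOME")
    os.environ["HOME"] = home
    try:
        # With HOME present, expanduser() uses it directly instead of
        # consulting the password database.
        return posixpath.expanduser(posixpath.join("~", ".local"))
    finally:
        # Restore the caller's environment exactly as it was.
        if saved is None:
            del os.environ["HOME"]
        else:
            os.environ["HOME"] = saved
```

Calling expand_user_base("/root") yields /root/.local without reading /etc/passwd, which is the file SELinux was denying xend access to.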

Comment 23 W. Michael Petullo 2012-06-11 03:29:19 UTC
I just tested xen-4.1.2-19.fc17 and it does seem to fix this problem. I was able to run my Xen/Dom0 stack with SELinux enforcing the targeted policy.

Comment 24 Josh Boyer 2012-06-11 12:21:36 UTC
Moving to xen.  I think this can be closed now.

Comment 25 Gerry Reno 2012-06-11 20:18:11 UTC
Now not seeing any Xen package listed for F17:

# yum list available xen
Error: No matching Packages to list

Comment 26 Michael Young 2012-08-07 21:58:51 UTC
This should be fixed in xen-4.1.2-20.fc17 or later (xen-4.1.2-19.fc17 was obsoleted before it was released as an update).

(In reply to comment #25)
> Now not seeing any Xen package listed for F17:
> 
> # yum list available xen
> Error: No matching Packages to list

I assume that was just a repository glitch. It worked for me when I tried it just now.