Bug 883832 - Cannot start VMs after upgrade from 6.3 to libvirt-0.10.2-10
Summary: Cannot start VMs after upgrade from 6.3 to libvirt-0.10.2-10
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: libvirt
Version: 6.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Peter Krempa
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks: 888457
Reported: 2012-12-05 12:18 UTC by Christophe Fergeau
Modified: 2013-02-21 07:28 UTC (History)
CC List: 9 users

Fixed In Version: libvirt-0.10.2-12.el6
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-02-21 07:28:07 UTC
Target Upstream Version:
Embargoed:


Attachments
debug log (190.59 KB, application/octet-stream)
2012-12-06 09:16 UTC, Luwen Su


Links
System: Red Hat Product Errata
ID: RHSA-2013:0276
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Moderate: libvirt security, bug fix, and enhancement update
Last Updated: 2013-02-20 21:18:26 UTC

Description Christophe Fergeau 2012-12-05 12:18:01 UTC
After upgrading my RHEL6 box to 6.4, 'virsh start vmname' now fails with "error: internal error Failed to get user record for name '107': No such file or directory". The VM I'm trying to start was created with RHEL 6.3, or possibly an even older version.
Looking into this further, it boils down to virGetUserIDByName/virGetGroupIDByName not handling getpwnam_r/getgrnam_r returning ENOENT when the user/group cannot be found.
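The tail of the reported message, "No such file or directory", is the standard strerror() text for ENOENT, which fits the diagnosis that the lookup returned ENOENT. The minimal sketch below rebuilds a message of the same shape from a return code. The helper name `format_lookup_error` is hypothetical, and the format string is copied from the error output in this report; how libvirt assembles the message internally is not shown here.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of libvirt): build a message shaped like
 * the one in this report from a getpwnam_r()-style return code.
 * Returns whatever snprintf() returns. */
static int format_lookup_error(int rc, const char *name,
                               char *buf, size_t buflen)
{
    assert(name != NULL && buf != NULL && buflen > 0);

    /* strerror() maps the numeric code to its text, e.g. ENOENT. */
    return snprintf(buf, buflen,
                    "Failed to get user record for name '%s': %s",
                    name, strerror(rc));
}
```

Calling it with `ENOENT` and the name `107` gives the same text as the failure above, so the message by itself does not say whether a file was missing or the account was simply not found.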

The backtrace leading there is:
    #0 parseIds (label=0x7fb88800f4c0 "107:107", uidPtr=0x7fb8b0c1c8ac, gidPtr=0x7fb8b0c1c8a8)
    at security/security_dac.c:107
    #1 0x00007fb8b812e470 in virSecurityDACParseIds (def=0x7fb89000ea00, priv=0x7fb8a4077408, uidPtr=0x7fb8b0c1c8fc,
    gidPtr=0x7fb8b0c1c8f8) at security/security_dac.c:134
    #2 virSecurityDACGetIds (def=0x7fb89000ea00, priv=0x7fb8a4077408, uidPtr=0x7fb8b0c1c8fc, gidPtr=0x7fb8b0c1c8f8)
    at security/security_dac.c:158
    #3 0x00007fb8b812ed82 in virSecurityDACSetProcessLabel (mgr=<value optimized out>, def=0x7fb89000ea00)
    at security/security_dac.c:859
    #4 0x00007fb8b812d6f3 in virSecurityStackSetProcessLabel (mgr=<value optimized out>, vm=0x7fb89000ea00)
    at security/security_stack.c:354
    #5 0x00000000004a5f41 in qemuProcessHook (data=<value optimized out>) at qemu/qemu_process.c:2712
    #6 0x00007fb8b7fdebf3 in virCommandHook (data=0x7fb888003720) at util/command.c:2068
    #7 0x00007fb8b7fe12c6 in virExecWithHook (argv=0x7fb888003a60, envp=0x7fb8880030e0, keepfd=<value optimized out>,
    keepfd_size=<value optimized out>, retpid=<value optimized out>, infd=31, outfd=0x7fb8b0c1d3bc,
    errfd=0x7fb8b0c1d3bc, flags=7, data=0x7fb888003720, pidfile=0x7fb888002b40 "/var/run/libvirt/qemu/win7.pid",
    capabilities=0, hook=0x7fb8b7fdeb90 <virCommandHook>) at util/command.c:615
    #8 0x00007fb8b7fe1b3f in virCommandRunAsync (cmd=0x7fb888003720, pid=0x0) at util/command.c:2212
    #9 0x00007fb8b7fe1f79 in virCommandRun (cmd=0x7fb888003720, exitstatus=0x0) at util/command.c:1998
    #10 0x00000000004a7a8c in qemuProcessStart (conn=0x7fb884002090, driver=0x7fb8a4006720, vm=0x7fb8a400cdc0,
    migrateFrom=0x0, stdin_fd=<value optimized out>, stdin_path=0x0, snapshot=0x0,
    vmop=VIR_NETDEV_VPORT_PROFILE_OP_CREATE, flags=1) at qemu/qemu_process.c:3728
    #11 0x000000000046a39e in qemuDomainObjStart (conn=0x7fb884002090, driver=0x7fb8a4006720, vm=0x7fb8a400cdc0,
    flags=<value optimized out>) at qemu/qemu_driver.c:5604
    #12 0x000000000046a992 in qemuDomainStartWithFlags (dom=0x7fb88800e9e0, flags=0) at qemu/qemu_driver.c:5661
    #13 0x00007fb8b80892b0 in virDomainCreate (domain=0x7fb88800e9e0) at libvirt.c:8296
    #14 0x000000000043f7d2 in remoteDispatchDomainCreate (server=<value optimized out>, client=<value optimized out>,
    msg=<value optimized out>, rerr=0x7fb8b0c1db80, args=<value optimized out>, ret=<value optimized out>)
    at remote_dispatch.h:1066
    #15 remoteDispatchDomainCreateHelper (server=<value optimized out>, client=<value optimized out>,
    msg=<value optimized out>, rerr=0x7fb8b0c1db80, args=<value optimized out>, ret=<value optimized out>)
    at remote_dispatch.h:1044
    #16 0x00007fb8b80d4142 in virNetServerProgramDispatchCall (prog=0x164e000, server=0x1642cd0, client=0x164d180,
    msg=0x164e810) at rpc/virnetserverprogram.c:431
    #17 virNetServerProgramDispatch (prog=0x164e000, server=0x1642cd0, client=0x164d180, msg=0x164e810)
    at rpc/virnetserverprogram.c:304
    #18 0x00007fb8b80d298e in virNetServerProcessMsg (srv=<value optimized out>, client=0x164d180,
    prog=<value optimized out>, msg=0x164e810) at rpc/virnetserver.c:170
    #19 0x00007fb8b80d302c in virNetServerHandleJob (jobOpaque=<value optimized out>, opaque=0x1642cd0)
    at rpc/virnetserver.c:191
    #20 0x00007fb8b7ff9e0c in virThreadPoolWorker (opaque=<value optimized out>) at util/threadpool.c:144
    #21 0x00007fb8b7ff96f9 in virThreadHelper (data=<value optimized out>) at util/threads-pthread.c:161
    #22 0x00007fb8b7926851 in start_thread (arg=0x7fb8b0c1e700) at pthread_create.c:301
    #23 0x00007fb8b726d90d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115
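
Frame #0 shows parseIds being handed the label "107:107". As a rough illustration only (this is not libvirt's code, and `parse_numeric_id` and `parse_id_label` are hypothetical names), the sketch below splits such a label at the colon and parses each half as a decimal number. That is the numeric fallback that would be expected to apply when no account is literally named "107". Name lookup is deliberately left out.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: parse a non-empty, all-digit decimal string.
 * Returns 0 on success, -1 if the string is not a valid ID. */
static int parse_numeric_id(const char *s, unsigned int *out)
{
    char *end = NULL;
    unsigned long val;

    if (s == NULL || *s < '0' || *s > '9')
        return -1;              /* reject empty strings, signs, spaces */

    errno = 0;
    val = strtoul(s, &end, 10);
    if (errno != 0 || *end != '\0' || val > UINT_MAX)
        return -1;              /* overflow or trailing garbage */

    *out = (unsigned int)val;
    return 0;
}

/* Hypothetical helper: split "uid:gid" and parse both halves.
 * Returns 0 on success, -1 on malformed input. */
static int parse_id_label(const char *label, unsigned int *uid,
                          unsigned int *gid)
{
    const char *sep;
    char owner[32];
    size_t len;

    assert(uid != NULL && gid != NULL);

    if (label == NULL || (sep = strchr(label, ':')) == NULL)
        return -1;

    len = (size_t)(sep - label);
    if (len == 0 || len >= sizeof(owner))
        return -1;

    /* Copy the part before ':' so it can be NUL-terminated. */
    memcpy(owner, label, len);
    owner[len] = '\0';

    if (parse_numeric_id(owner, uid) < 0 ||
        parse_numeric_id(sep + 1, gid) < 0)
        return -1;
    return 0;
}
```

With the label from the backtrace, both halves parse to 107. A label such as "qemu:107" is rejected by this sketch because it does no name lookup at all.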

I've sent potential patches for this issue at https://www.redhat.com/archives/libvir-list/2012-December/msg00207.html

Comment 2 Luwen Su 2012-12-06 09:06:07 UTC
Hi Christophe,
I'm from libvirt QE. Could you please provide your nss version and the contents of nsswitch.conf? Based on another bug, Bug 883547, I can reproduce the issue, but I'm not sure whether it is the same one as yours. Thanks.

My steps:
Packages:
libvirt-0.9.3-2.el6.x86_64.rpm
libvirt-0.10.2-10.el6.x86_64
nss-3.13.1-7.el6_2.x86_64

1. Install libvirt-0.9.3-2.el6.x86_64.rpm, define and start a guest, then destroy it
# virsh list --all
 Id    Name                           State
----------------------------------------------------
 -     test                           shut off

2. Add an unavailable service to nsswitch.conf
# cat nsswitch.conf | grep winbind
passwd:     files winbind
shadow:     files winbind
group:      files winbind

3. Upgrade libvirt to libvirt-0.10.2-10.el6.x86_64, then start the guest
# virsh start test
error: Failed to start domain test
error: internal error Failed to get user record for name '107': No such file or directory

Additional info:
I tested with the latest nss-3.14.0.0-9.el6, both with and without changing the config file, using the upgrade paths below; no error occurred.
0.9.10-21  to  0.10.2-10
0.9.4-23   to  0.10.2-10

Comment 3 Luwen Su 2012-12-06 09:16:56 UTC
Created attachment 658589 [details]
debug log

Comment 4 Christophe Fergeau 2012-12-06 09:28:24 UTC
nss is nss-3.14.0.0-9.el6.x86_64 and nsswitch.conf contains:
# /etc/nsswitch.conf
#
# An example Name Service Switch config file. This file should be
# sorted with the most-used services at the beginning.
#
# The entry '[NOTFOUND=return]' means that the search for an
# entry should stop if the search in the previous entry turned
# up nothing. Note that if the search failed due to some other reason
# (like no NIS server responding) then the search continues with the
# next entry.
#
# Valid entries include:
#
#       nisplus                 Use NIS+ (NIS version 3)
#       nis                     Use NIS (NIS version 2), also called YP
#       dns                     Use DNS (Domain Name Service)
#       files                   Use the local files
#       db                      Use the local database (.db) files
#       compat                  Use NIS on compat mode
#       hesiod                  Use Hesiod for user lookups
#       [NOTFOUND=return]       Stop searching if not found so far
#

# To use db, put the "db" in front of "files" for entries you want to be
# looked up first in the databases
#
# Example:
#passwd:    db files nisplus nis
#shadow:    db files nisplus nis
#group:     db files nisplus nis

passwd:     files winbind
shadow:     files winbind
group:      files winbind

#hosts:     db files nisplus nis dns
hosts:      files dns

# Example - obey only what nisplus tells us...
#services:   nisplus [NOTFOUND=return] files
#networks:   nisplus [NOTFOUND=return] files
#protocols:  nisplus [NOTFOUND=return] files
#rpc:        nisplus [NOTFOUND=return] files
#ethers:     nisplus [NOTFOUND=return] files
#netmasks:   nisplus [NOTFOUND=return] files

bootparams: nisplus [NOTFOUND=return] files

ethers:     files
netmasks:   files
networks:   files
protocols:  files
rpc:        files
services:   files

netgroup:   files

publickey:  nisplus

automount:  files
aliases:    files nisplus


I didn't edit it myself, so it probably is the default one.

Comment 5 Luwen Su 2012-12-06 10:02:31 UTC
Thanks Christophe, now I can reproduce it 100% of the time and work around it.

I think I see the cause:
If winbind is listed in the config file, the winbind service needs to be running.
And in my tests this does not depend on the libvirt version.

BTW, on my box,

   passwd:     files
   shadow:     files
   group:      files

   is the default config.

For now, I have found two workarounds:
1.
  Remove winbind from nsswitch.conf, then restart the libvirtd service.
2.
  Just start the winbind service, as shown below.

# rpm -q nss libvirt
nss-3.14.0.0-9.el6.x86_64
libvirt-0.10.2-10.el6.x86_64

# cat /etc/nsswitch.conf | grep winbind
passwd:     files winbind
shadow:     files winbind
group:      files winbind

# service winbind status
winbindd is stopped
# virsh start test
error: Failed to start domain test
error: internal error Failed to get user record for name '107': No such file or directory

# service winbind start
Starting Winbind services:                                 [  OK  ]
# virsh start test
Domain test started

Comment 6 Luwen Su 2012-12-06 10:21:47 UTC
(In reply to comment #5)

Ah, sorry, I made a mistake.
At least with libvirt-0.9.10-21.el6.x86_64, the guest starts normally without error.

# rpm -q libvirt nss
libvirt-0.9.10-21.el6.x86_64
nss-3.14.0.0-9.el6.x86_64

# cat /etc/nsswitch.conf | grep winbind
passwd:     files winbind
shadow:     files winbind
group:      files winbind

# service winbind status
winbindd is stopped

# virsh list --all
 Id Name                 State
----------------------------------
  - test                 shut off

# virsh start test
Domain test started

Comment 7 Luwen Su 2012-12-06 10:41:28 UTC
The issue first appeared in libvirt-0.10.2-4; libvirt-0.10.2-2 is OK.

Comment 8 Peter Krempa 2012-12-11 12:45:24 UTC
The issue should be fixed upstream with:

commit a33f4eae83ecc6fb6e33006650c7f81e16584bd0
Author: Christophe Fergeau <cfergeau>
Date:   Wed Dec 5 11:21:10 2012 +0100

    util: Don't fail virGetGroupIDByName when group not found
    
    virGetGroupIDByName is documented as returning 1 if the groupname
    cannot be found. getgrnam_r is documented as returning:
    « 0 or ENOENT or ESRCH or EBADF or EPERM or ...  The given name
    or gid was not found. »
     and that:
    « The formulation given above under "RETURN VALUE" is from POSIX.1-2001.
    It  does  not  call  "not  found"  an error, hence does not specify what
    value errno might have in this situation.  But that makes it impossible to
    recognize errors.  One might argue that according to POSIX errno should be
    left unchanged if an entry is not found.  Experiments on various UNIX-like
    systems shows that lots of different values occur in this situation: 0,
    ENOENT, EBADF, ESRCH, EWOULDBLOCK, EPERM and probably others. »
    
    virGetGroupIDByName returns an error when the return value of getgrnam_r
    is non-0. However on my RHEL system, getgrnam_r returns ENOENT when the
    requested user cannot be found, which then causes virGetGroupID not
    to behave as documented (it returns an error instead of falling back
    to parsing the passed-in value as an gid).
    
    This commit makes virGetGroupIDByName only report an error when errno
    is set to one of the values in the posix description of getgrnam_r
    (which are the same as the ones described in the manpage on my system).

commit 6c6c03dc0e66db400beacc4453efa6e10ec08260
Author: Christophe Fergeau <cfergeau>
Date:   Wed Dec 5 11:21:10 2012 +0100

    util: Don't fail virGetUserIDByName when user not found
    
    virGetUserIDByName is documented as returning 1 if the username
    cannot be found. getpwnam_r is documented as returning:
    « 0 or ENOENT or ESRCH or EBADF or EPERM or ...  The given name
    or uid was not found. »
     and that:
    « The formulation given above under "RETURN VALUE" is from POSIX.1-2001.
    It  does  not  call  "not  found"  an error, hence does not specify what
    value errno might have in this situation.  But that makes it impossible to
    recognize errors.  One might argue that according to POSIX errno should be
    left unchanged if an entry is not found.  Experiments on various UNIX-like
    systems shows that lots of different values occur in this situation: 0,
    ENOENT, EBADF, ESRCH, EWOULDBLOCK, EPERM and probably others. »
    
    virGetUserIDByName returns an error when the return value of getpwnam_r
    is non-0. However on my RHEL system, getpwnam_r returns ENOENT when the
    requested user cannot be found, which then causes virGetUserID not
    to behave as documented (it returns an error instead of falling back
    to parsing the passed-in value as an uid).
    
    This commit makes virGetUserIDByName only report an error when errno
    is set to one of the values in the posix description of getpwnam_r
    (which are the same as the ones described in the manpage on my system).
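
The approach both commit messages describe can be sketched roughly as follows for the user case. This is not the code from the commits: `lookup_uid_by_name` is a hypothetical name, and the set of "real" error codes (EINTR, EIO, EMFILE, ENFILE, ENOMEM, with ERANGE handled by retrying) is assumed from the ERRORS section of the getpwnam_r man page. Any other return value, including ENOENT, is treated as "not found" so the caller can fall back to parsing the name as a number. The group case would presumably look the same with getgrnam_r.

```c
#include <assert.h>
#include <errno.h>
#include <pwd.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not libvirt's implementation.
 * Returns 0 and stores the UID if NAME exists, 1 if there is no such
 * user, and -1 if the lookup itself failed. */
static int lookup_uid_by_name(const char *name, uid_t *uid)
{
    struct passwd pwbuf;
    struct passwd *pw = NULL;
    long hint = sysconf(_SC_GETPW_R_SIZE_MAX);
    size_t len = hint > 0 ? (size_t)hint : 1024;
    char *buf = malloc(len);
    int rc;
    int ret = -1;

    assert(name != NULL && uid != NULL);
    if (buf == NULL)
        return -1;

    /* ERANGE means the scratch buffer was too small: grow and retry,
     * giving up once the buffer exceeds 1 MiB. */
    while ((rc = getpwnam_r(name, &pwbuf, buf, len, &pw)) == ERANGE) {
        char *tmp;

        if (len > 1024 * 1024)
            goto cleanup;
        tmp = realloc(buf, len * 2);
        if (tmp == NULL)
            goto cleanup;
        buf = tmp;
        len *= 2;
    }

    /* Only these codes are treated as genuine failures. Anything else
     * (0, ENOENT, ESRCH, ...) falls through to the check below. */
    if (rc == EINTR || rc == EIO || rc == EMFILE ||
        rc == ENFILE || rc == ENOMEM)
        goto cleanup;

    if (pw == NULL) {
        ret = 1;    /* no such user: caller may parse NAME as a number */
        goto cleanup;
    }

    *uid = pw->pw_uid;
    ret = 0;

cleanup:
    free(buf);
    return ret;
}
```

A caller resolving the "107" from the failing label would see 1 rather than -1 on a system with no account of that name, and could then treat the string as a numeric UID.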

Comment 11 Luwen Su 2012-12-13 05:45:21 UTC
Tested with libvirt-0.10.2-12.el6.x86_64 using the steps from comment 6.
The guest starts normally and no error appears in libvirtd.log.
Setting to Verified.

Comment 12 errata-xmlrpc 2013-02-21 07:28:07 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2013-0276.html

