Bug 883832
| Summary: | Cannot start VMs after upgrade from 6.3 to libvirt-0.10.2-10 | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 6 | Reporter: | Christophe Fergeau <cfergeau> |
| Component: | libvirt | Assignee: | Peter Krempa <pkrempa> |
| Status: | CLOSED ERRATA | QA Contact: | Virtualization Bugs <virt-bugs> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 6.4 | CC: | acathrow, dallan, dyasny, dyuan, lsu, mjenner, mzhan, pkrempa, rwu |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | libvirt-0.10.2-12.el6 | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2013-02-21 07:28:07 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 888457 | | |
| Attachments: | | | |

Description
Christophe Fergeau
2012-12-05 12:18:01 UTC
Hi Christophe, I'm from libvirt QE. Could you please provide your nss version and the contents of your nsswitch.conf? Based on Bug 883547 I can reproduce the issue, but I'm not sure whether it is the same one as yours. Thanks.

My steps:

Packages:

    libvirt-0.9.3-2.el6.x86_64.rpm
    libvirt-0.10.2-10.el6.x86_64
    nss-3.13.1-7.el6_2.x86_64

1. Install libvirt-0.9.3-2.el6.x86_64.rpm, define and start a guest, then destroy it.

    # virsh list --all
     Id    Name    State
    ----------------------------------------------------
     -     test    shut off

2. Add an unavailable service to nsswitch.conf.

    # cat nsswitch.conf | grep winbind
    passwd: files winbind
    shadow: files winbind
    group: files winbind

3. Upgrade libvirt to libvirt-0.10.2-10.el6.x86_64, then start the guest.

    # virsh start test
    error: Failed to start domain test
    error: internal error Failed to get user record for name '107': No such file or directory

Additional info: I tested with the latest nss-3.14.0.0-9.el6, both with and without the config file change, using the upgrade paths below. No error occurred.

    0.9.10-21 to 0.10.2-10
    0.9.4-23 to 0.10.2-10

Created attachment 658589: debug log

nss is nss-3.14.0.0-9.el6.x86_64 and nsswitch.conf contains

    # /etc/nsswitch.conf
    #
    # An example Name Service Switch config file. This file should be
    # sorted with the most-used services at the beginning.
    #
    # The entry '[NOTFOUND=return]' means that the search for an
    # entry should stop if the search in the previous entry turned
    # up nothing. Note that if the search failed due to some other reason
    # (like no NIS server responding) then the search continues with the
    # next entry.
    #
    # Valid entries include:
    #
    # nisplus Use NIS+ (NIS version 3)
    # nis Use NIS (NIS version 2), also called YP
    # dns Use DNS (Domain Name Service)
    # files Use the local files
    # db Use the local database (.db) files
    # compat Use NIS on compat mode
    # hesiod Use Hesiod for user lookups
    # [NOTFOUND=return] Stop searching if not found so far
    #
    # To use db, put the "db" in front of "files" for entries you want to be
    # looked up first in the databases
    #
    # Example:
    #passwd: db files nisplus nis
    #shadow: db files nisplus nis
    #group: db files nisplus nis

    passwd: files winbind
    shadow: files winbind
    group: files winbind

    #hosts: db files nisplus nis dns
    hosts: files dns

    # Example - obey only what nisplus tells us...
    #services: nisplus [NOTFOUND=return] files
    #networks: nisplus [NOTFOUND=return] files
    #protocols: nisplus [NOTFOUND=return] files
    #rpc: nisplus [NOTFOUND=return] files
    #ethers: nisplus [NOTFOUND=return] files
    #netmasks: nisplus [NOTFOUND=return] files

    bootparams: nisplus [NOTFOUND=return] files

    ethers: files
    netmasks: files
    networks: files
    protocols: files
    rpc: files
    services: files

    netgroup: files

    publickey: nisplus

    automount: files
    aliases: files nisplus

I didn't edit it myself, so it probably is the default one.

Thanks Christophe, now I can reproduce it 100% of the time and work around it.

I think I have found the cause: if winbind is listed in the config file, the winbind service needs to be running. In my tests this is not related to the libvirt version.

BTW, on my box the default config is:

    passwd: files
    shadow: files
    group: files

For now, I have found two workarounds:

1. Remove winbind from nsswitch.conf, then restart the libvirtd service.
2. Just start the winbind service, as shown below.

    # rpm -q nss libvirt
    nss-3.14.0.0-9.el6.x86_64
    libvirt-0.10.2-10.el6.x86_64

    # cat /etc/nsswitch.conf | grep winbind
    passwd: files winbind
    shadow: files winbind
    group: files winbind

    # service winbind status
    winbindd is stopped
    # virsh start test
    error: Failed to start domain test
    error: internal error Failed to get user record for name '107': No such file or directory

    # service winbind start
    Starting Winbind services: [ OK ]
    # virsh start test
    Domain test started

(In reply to comment #5)
> Thanks Christophe, now I can reproduce it 100% of the time and work around it.
>
> I think I have found the cause: if winbind is listed in the config file,
> the winbind service needs to be running.
> In my tests this is not related to the libvirt version.
>
> BTW, on my box the default config is:
>
> passwd: files
> shadow: files
> group: files
>
> For now, I have found two workarounds:
> 1. Remove winbind from nsswitch.conf, then restart the libvirtd service.
> 2. Just start the winbind service, as shown below.
>
> # rpm -q nss libvirt
> nss-3.14.0.0-9.el6.x86_64
> libvirt-0.10.2-10.el6.x86_64
>
> # cat /etc/nsswitch.conf | grep winbind
> passwd: files winbind
> shadow: files winbind
> group: files winbind
>
> # service winbind status
> winbindd is stopped
> # virsh start test
> error: Failed to start domain test
> error: internal error Failed to get user record for name '107': No such file
> or directory
>
> # service winbind start
> Starting Winbind services: [ OK ]
> # virsh start test
> Domain test started

Ah, sorry, I made a mistake. At least with libvirt-0.9.10-21.el6.x86_64, the guest starts normally without error even though winbind is stopped:

    # rpm -q libvirt nss
    libvirt-0.9.10-21.el6.x86_64
    nss-3.14.0.0-9.el6.x86_64

    # cat /etc/nsswitch.conf | grep winbind
    passwd: files winbind
    shadow: files winbind
    group: files winbind

    # service winbind status
    winbindd is stopped

    # virsh list --all
     Id    Name    State
    ----------------------------------
     -     test    shut off

    # virsh start test
    Domain test started

The issue first appeared in libvirt-0.10.2-4; libvirt-0.10.2-2 is OK.

The issue should be fixed upstream with:

commit a33f4eae83ecc6fb6e33006650c7f81e16584bd0
Author: Christophe Fergeau <cfergeau>
Date: Wed Dec 5 11:21:10 2012 +0100

    util: Don't fail virGetGroupIDByName when group not found

    virGetGroupIDByName is documented as returning 1 if the groupname
    cannot be found. getgrnam_r is documented as returning:
    « 0 or ENOENT or ESRCH or EBADF or EPERM or ...
      The given name or gid was not found. »
    and that:
    « The formulation given above under "RETURN VALUE" is from POSIX.1-2001.
      It does not call "not found" an error, hence does not specify what
      value errno might have in this situation. But that makes it impossible
      to recognize errors. One might argue that according to POSIX errno
      should be left unchanged if an entry is not found. Experiments on
      various UNIX-like systems shows that lots of different values occur in
      this situation: 0, ENOENT, EBADF, ESRCH, EWOULDBLOCK, EPERM and
      probably others. »

    virGetGroupIDByName returns an error when the return value of
    getgrnam_r is non-0. However on my RHEL system, getgrnam_r returns
    ENOENT when the requested user cannot be found, which then causes
    virGetGroupID not to behave as documented (it returns an error
    instead of falling back to parsing the passed-in value as an gid).

    This commit makes virGetGroupIDByName only report an error when errno
    is set to one of the values in the posix description of getgrnam_r
    (which are the same as the ones described in the manpage on my system).

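As a rough illustration of the approach described in the commit message above (this is a sketch, not the actual libvirt patch), the code below treats a non-zero getgrnam_r return value as a real error only when it is one of the errno values listed under ERRORS in the Linux getgrnam(3) man page, and reports everything else as "not found". The names `lookup_status`, `classify_group_lookup` and `lookup_gid_by_name` are hypothetical, and the exact set of errno values checked by the real fix is an assumption here.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <grp.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical status codes mirroring a 0 / 1 / -1 contract:
 * found, not found, hard error. */
enum lookup_status { LOOKUP_FOUND = 0, LOOKUP_NOT_FOUND = 1, LOOKUP_ERROR = -1 };

/* Classify the outcome of getgrnam_r().
 * rc is its return value; found is non-zero when *result was non-NULL. */
enum lookup_status classify_group_lookup(int rc, int found)
{
    if (rc == 0)
        return found ? LOOKUP_FOUND : LOOKUP_NOT_FOUND;

    switch (rc) {
    /* Assumed list: the documented error values from getgrnam(3). */
    case EINTR:
    case EIO:
    case EMFILE:
    case ENFILE:
    case ENOMEM:
    case ERANGE:
        return LOOKUP_ERROR;
    default:
        /* ENOENT, ESRCH, EBADF, EPERM, ...: treat as "no such group". */
        return LOOKUP_NOT_FOUND;
    }
}

/* Look up a group by name; *gid is written only when the group is found. */
enum lookup_status lookup_gid_by_name(const char *name, gid_t *gid)
{
    struct group grbuf;
    struct group *result = NULL;
    long hint = sysconf(_SC_GETGR_R_SIZE_MAX);
    size_t buflen = hint > 0 ? (size_t)hint : 1024;
    char *buf = malloc(buflen);
    enum lookup_status status;
    int rc;

    if (buf == NULL)
        return LOOKUP_ERROR;

    rc = getgrnam_r(name, &grbuf, buf, buflen, &result);
    status = classify_group_lookup(rc, result != NULL);
    if (status == LOOKUP_FOUND)
        *gid = result->gr_gid;

    free(buf);
    return status;
}
```

A caller following the documented contract would fall back to parsing the name as a numeric gid when the status is `LOOKUP_NOT_FOUND`, and fail only on `LOOKUP_ERROR`. A production version would also retry with a larger buffer on ERANGE rather than reporting it as an error.
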
commit 6c6c03dc0e66db400beacc4453efa6e10ec08260
Author: Christophe Fergeau <cfergeau>
Date: Wed Dec 5 11:21:10 2012 +0100

    util: Don't fail virGetUserIDByName when user not found

    virGetUserIDByName is documented as returning 1 if the username
    cannot be found. getpwnam_r is documented as returning:
    « 0 or ENOENT or ESRCH or EBADF or EPERM or ...
      The given name or uid was not found. »
    and that:
    « The formulation given above under "RETURN VALUE" is from POSIX.1-2001.
      It does not call "not found" an error, hence does not specify what
      value errno might have in this situation. But that makes it impossible
      to recognize errors. One might argue that according to POSIX errno
      should be left unchanged if an entry is not found. Experiments on
      various UNIX-like systems shows that lots of different values occur in
      this situation: 0, ENOENT, EBADF, ESRCH, EWOULDBLOCK, EPERM and
      probably others. »

    virGetUserIDByName returns an error when the return value of
    getpwnam_r is non-0. However on my RHEL system, getpwnam_r returns
    ENOENT when the requested user cannot be found, which then causes
    virGetUserID not to behave as documented (it returns an error
    instead of falling back to parsing the passed-in value as an uid).

    This commit makes virGetUserIDByName only report an error when errno
    is set to one of the values in the posix description of getpwnam_r
    (which are the same as the ones described in the manpage on my system).

Tested with libvirt-0.10.2-12.el6.x86_64 using the steps from comment 6. The guest starts normally and no error appears in libvirtd.log. Set to Verified.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2013-0276.html