Bug 1206884 - HE deployment fails due to libvirtError: internal error client socket is closed
Summary: HE deployment fails due to libvirtError: internal error client socket is closed
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: vdsm
Version: 3.5.1
Hardware: x86_64
OS: Linux
high
high
Target Milestone: ovirt-3.6.0-rc
: 3.6.0
Assignee: Sandro Bonazzola
QA Contact: Nikolai Sednev
URL:
Whiteboard:
: 1231614 1244740 (view as bug list)
Depends On:
Blocks: 1059435 1198639 1235591 1250199
TreeView+ depends on / blocked
 
Reported: 2015-03-29 10:04 UTC by Nikolai Sednev
Modified: 2016-03-09 19:34 UTC (History)
21 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 1235591 (view as bug list)
Environment:
Last Closed: 2016-03-09 19:34:00 UTC
oVirt Team: Integration
Target Upstream Version:
istein: needinfo+


Attachments (Terms of Use)
answers-20150323062421.conf (2.54 KB, text/plain)
2015-03-29 10:06 UTC, Nikolai Sednev
no flags Details
alma03 logs (13.29 KB, application/x-gzip)
2015-04-02 14:36 UTC, Nikolai Sednev
no flags Details
sosreportalma03 (6.21 MB, application/x-xz)
2015-04-02 14:38 UTC, Nikolai Sednev
no flags Details
answers-20150402142253.conf (2.41 KB, text/plain)
2015-04-02 14:38 UTC, Nikolai Sednev
no flags Details
Config and logs from a failed attempt (7.58 MB, application/x-bzip)
2015-06-19 10:18 UTC, Fabian Deutsch
no flags Details


Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2016:0362 normal SHIPPED_LIVE vdsm 3.6.0 bug fix and enhancement update 2016-03-09 23:49:32 UTC
oVirt gerrit 43947 None MERGED Calling vdsm-tool configure after touching its conf files 2020-10-19 10:50:39 UTC

Description Nikolai Sednev 2015-03-29 10:04:33 UTC
Description of problem:
Now only configuration of virtual bridge fails during first iteration of HE deployment:  
[ INFO  ] Configuring the management bridge
[ ERROR ] Failed to execute stage 'Misc configuration': Failed to setup networks {'rhevm': {'nic': 'p1p1', 'bootproto': 'dhcp', 'blockingdhcp': True}}. Error code: "16" message: "Unexpected exception"
[ INFO  ] Stage: Clean up


and succeeds on second deployment iteration:

[ INFO  ] Starting vdsmd
[ INFO  ] Waiting for VDSM hardware info
[ INFO  ] Waiting for VDSM hardware info
[ INFO  ] Configuring the management bridge
[ INFO  ] Creating Storage Domain
[ INFO  ] Creating Storage Pool
[ INFO  ] Connecting Storage Pool


At the end on latest vt14.1 rhevm-3.5.1-0.2.el6ev.noarch  both hosts seen normally now.

During addition of second host to environment, it also suffered same issue with rhevm bridge and caused failure for the first time I ran hosted-engine --deploy on it, but succeeded on second run. 
[ INFO  ] Stage: Closing up                                                                                                                                
[ ERROR ] Failed to execute stage 'Closing up': <urlopen error [Errno 111] Connection refused>                                                             
[ INFO  ] Stage: Clean up         

Please see latest answers-20150323062421.conf attachedfrom second host.    


Components used:

rhevm-guest-agent-common-1.0.10-2.el6ev.noarch
rhevm-3.5.1-0.2.el6ev.noarch

sanlock-2.8-1.el6.x86_64
mom-0.4.1-4.el6ev.noarch
vdsm-4.16.8.1-7.el6ev.x86_64
ovirt-hosted-engine-ha-1.2.5-1.el6ev.noarch
libvirt-0.10.2-46.el6_6.3.x86_64
qemu-kvm-rhev-0.12.1.2-2.446.el6.x86_64
ovirt-hosted-engine-setup-1.2.2-1.el6ev.noarch

Version-Release number of selected component (if applicable):


How reproducible:
100%

Steps to Reproduce:
1.Using Foreman, reprovision the host to RHEVH
2.Start deployment of hosted engine on RHEVH
3.Get the network error.


Actual results:
Deployment fails with an error "[ ERROR ] Failed to execute stage 'Misc configuration': Failed to setup networks {'rhevm': {'nic': 'p1p1', 'bootproto': 'dhcp', 'blockingdhcp': True}}. Error code: "16" message: "Unexpected exception"

Expected results:
Deployment have to pass without errors.

Additional info:
If the same deployment reran again, then deployment succeeds.

Comment 1 Nikolai Sednev 2015-03-29 10:06:00 UTC
Created attachment 1007953 [details]
answers-20150323062421.conf

Comment 2 Fabian Deutsch 2015-04-02 13:05:59 UTC
Nikolai, could you please also provide the sosreport from the host side?

We need to find out what the exception is.

Comment 3 Nikolai Sednev 2015-04-02 14:35:37 UTC
[root@alma03 /]# hosted-engine --deploy
[ INFO  ] Stage: Initializing          
[ INFO  ] Generating a temporary VNC password.
[ INFO  ] Stage: Environment setup            
          Continuing will configure this host for serving as hypervisor and create a VM where you have to install oVirt Engine afterwards.                                                                                                                        
          Are you sure you want to continue? (Yes, No)[Yes]:                                                                     
          Configuration files: []                                                                                                
          Log file: /var/log/ovirt-hosted-engine-setup/ovirt-hosted-engine-setup-20150402142028-1h4utr.log                       
          Version: otopi-1.3.0 (otopi-1.3.0-2.el6ev)                                                                             
          It has been detected that this program is executed through an SSH connection without using screen.                     
          Continuing with the installation may lead to broken installation if the network connection fails.                      
          It is highly recommended to abort the installation and run it inside a screen session using command "screen".          
          Do you want to continue anyway? (Yes, No)[No]: yes                                                                     
[ INFO  ] Hardware supports virtualization                                                                                       
[ INFO  ] Stage: Environment packages setup                                                                                      
[ INFO  ] Stage: Programs detection                                                                                              
[ INFO  ] Stage: Environment setup                                                                                               
[ INFO  ] Generating libvirt-spice certificates                                                                                  
[ INFO  ] Stage: Environment customization                                                                                       
                                                                                                                                 
          --== STORAGE CONFIGURATION ==--                                                                                        
                                                                                                                                 
          During customization use CTRL-D to abort.                                                                              
          Please specify the storage you would like to use (iscsi, nfs3, nfs4)[nfs3]:                                            
          Please specify the full shared storage connection path to use (example: host:/path): 10.35.160.108:/RHEV/nsednev_HE_3_5
[ INFO  ] Installing on first host                                                                                               
          Please provide storage domain name. [hosted_storage]:                                                                  
          Local storage datacenter name is an internal name and currently will not be shown in engine's admin UI.                
          Please enter local datacenter name [hosted_datacenter]:                                                                
                                                                                                                                 
          --== SYSTEM CONFIGURATION ==--                                                                                         
                                                                                                                                 
                                                                                                                                 
          --== NETWORK CONFIGURATION ==--                                                                                        
                                                                                                                                 
          Please indicate a nic to set rhevm bridge on: (em1, em2, p1p1, p1p2) [em1]: p1p1                                       
          iptables was detected on your computer, do you wish setup to configure it? (Yes, No)[Yes]:                             
          Please indicate a pingable gateway IP address [10.35.117.254]:                                                         
                                                                                                                                 
          --== VM CONFIGURATION ==--                                                                                             
                                                                                                                                 
          Please specify the device to boot the VM from (cdrom, disk, pxe) [cdrom]: pxe                                          
          Please specify an alias for the Hosted Engine image [hosted_engine]:                                                   
          The following CPU types are supported by this host:                                                                    
                 - model_SandyBridge: Intel SandyBridge Family                                                                   
                 - model_Westmere: Intel Westmere Family                                                                         
                 - model_Nehalem: Intel Nehalem Family                                                                           
                 - model_Penryn: Intel Penryn Family                                                                             
                 - model_Conroe: Intel Conroe Family                                                                             
          Please specify the CPU type to be used by the VM [model_SandyBridge]:                                                  
          Please specify the number of virtual CPUs for the VM [Defaults to minimum requirement: 2]:                             
          Please specify the disk size of the VM in GB [Defaults to minimum requirement: 25]:                                    
          You may specify a unicast MAC address for the VM or accept a randomly generated default [00:16:3e:23:1a:fa]: 00:16:3E:7B:BB:BB                                                                                                                          
          Please specify the memory size of the VM in MB [Defaults to minimum requirement: 4096]:                                
          Please specify the console type you would like to use to connect to the VM (vnc, spice) [vnc]:                         
                                                                                                                                 
          --== HOSTED ENGINE CONFIGURATION ==--                                                                                  
                                                                                                                                 
          Enter the name which will be used to identify this host inside the Administrator Portal [hosted_engine_1]:             
          Enter 'admin@internal' user password that will be used for accessing the Administrator Portal:                         
          Confirm 'admin@internal' user password:                                                                                
          Please provide the FQDN for the engine you would like to use.                                                          
          This needs to match the FQDN that you will use for the engine installation within the VM.                              
          Note: This will be the FQDN of the VM you are now going to create,                                                     
          it should not point to the base host or to any other existing machine.                                                 
          Engine FQDN: nsednev-he-2.qa.lab.tlv.redhat.com                                                                        
          Please provide the name of the SMTP server through which we will send notifications [localhost]:                       
          Please provide the TCP port number of the SMTP server [25]:                                                            
          Please provide the email address from which notifications will be sent [root@localhost]:                               
          Please provide a comma-separated list of email addresses which will get notifications [root@localhost]:                
[ INFO  ] Stage: Setup validation                                                                                                
                                                                                                                                 
          --== CONFIGURATION PREVIEW ==--                                                                                        
                                                                                                                                 
          Bridge interface                   : p1p1
          Engine FQDN                        : nsednev-he-2.qa.lab.tlv.redhat.com
          Bridge name                        : rhevm
          SSH daemon port                    : 22
          Firewall manager                   : iptables
          Gateway address                    : 10.35.117.254
          Host name for web application      : hosted_engine_1
          Host ID                            : 1
          Image alias                        : hosted_engine
          Image size GB                      : 25
          Storage connection                 : 10.35.160.108:/RHEV/nsednev_HE_3_5
          Console type                       : vnc
          Memory size MB                     : 4096
          MAC address                        : 00:16:3E:7B:BB:BB
          Boot type                          : pxe
          Number of CPUs                     : 2
          CPU Type                           : model_SandyBridge

          Please confirm installation settings (Yes, No)[Yes]:
[ INFO  ] Stage: Transaction setup
[ INFO  ] Stage: Misc configuration
[ INFO  ] Stage: Package installation
[ INFO  ] Stage: Misc configuration
[ INFO  ] Configuring libvirt
[ INFO  ] Configuring VDSM
[ INFO  ] Starting vdsmd
[ INFO  ] Waiting for VDSM hardware info
[ INFO  ] Waiting for VDSM hardware info
[ INFO  ] Configuring the management bridge
[ ERROR ] Failed to execute stage 'Misc configuration': Failed to setup networks {'rhevm': {'nic': 'p1p1', 'bootproto': 'dhcp', 'blockingdhcp': True}}. Error code: "16" message: "Unexpected exception"
[ INFO  ] Stage: Clean up
[ INFO  ] Generating answer file '/var/lib/ovirt-hosted-engine-setup/answers/answers-20150402142253.conf'
[ INFO  ] Stage: Pre-termination
[ INFO  ] Stage: Termination




Logs attached including the sosreport.

Comment 4 Nikolai Sednev 2015-04-02 14:36:38 UTC
Created attachment 1010219 [details]
alma03 logs

Comment 5 Nikolai Sednev 2015-04-02 14:38:14 UTC
Created attachment 1010220 [details]
sosreportalma03

Comment 6 Nikolai Sednev 2015-04-02 14:38:40 UTC
Created attachment 1010221 [details]
answers-20150402142253.conf

Comment 7 Sandro Bonazzola 2015-05-08 15:04:53 UTC
(In reply to Nikolai Sednev from comment #5)
> Created attachment 1010220 [details]
> sosreportalma03

I see in VDSM logs:
 Thread-16::DEBUG::2015-04-02 14:22:53,586::BindingXMLRPC::1133::vds::(wrapper) client [127.0.0.1]::call setupNetworks with ({'rhevm': {'nic': 'p1p1', 'bootproto': 'dhcp', 'blockingdhcp': True}}, {}, {'connectivityCheck': False}) {}

 Thread-16::ERROR::2015-04-02 14:22:53,597::BindingXMLRPC::1143::vds::(wrapper) libvirt error
Traceback (most recent call last):
  File "/usr/share/vdsm/rpc/BindingXMLRPC.py", line 1136, in wrapper
  File "/usr/share/vdsm/rpc/BindingXMLRPC.py", line 554, in setupNetworks
  File "/usr/share/vdsm/API.py", line 1394, in setupNetworks
  File "/usr/share/vdsm/supervdsm.py", line 50, in __call__
  File "/usr/share/vdsm/supervdsm.py", line 48, in <lambda>
  File "<string>", line 2, in setupNetworks
  File "/usr/lib64/python2.6/multiprocessing/managers.py", line 740, in _callmethod
libvirtError: internal error client socket is closed

With a corresponding supervdsm log:
 MainProcess|Thread-16::DEBUG::2015-04-02 14:22:53,587::supervdsmServer::101::SuperVdsm.ServerCallback::(wrapper) call setupNetworks with ({'rhevm': {'nic': 'p1p1', 'bootproto': 'dhcp', 'blockingdhcp': True}}, {}, {'connectivityCheck': False}) {}
 MainProcess|Thread-16::WARNING::2015-04-02 14:22:53,588::libvirtconnection::135::root::(wrapper) connection to libvirt broken. ecode: 1 edom: 7
 MainProcess|Thread-16::CRITICAL::2015-04-02 14:22:53,588::libvirtconnection::137::root::(wrapper) taking calling process down.
 MainThread::DEBUG::2015-04-02 14:22:53,588::supervdsmServer::453::SuperVdsm.Server::(main) Terminated normally
 MainProcess|Thread-16::DEBUG::2015-04-02 14:22:53,588::libvirtconnection::143::root::(wrapper) Unknown libvirterror: ecode: 1 edom: 7 level: 2 message: internal error client socket is closed
 MainProcess|Thread-16::ERROR::2015-04-02 14:22:53,589::supervdsmServer::105::SuperVdsm.ServerCallback::(wrapper) Error in setupNetworks
 Traceback (most recent call last):
  File "/usr/share/vdsm/supervdsmServer", line 103, in wrapper
    res = func(*args, **kwargs)
  File "/usr/share/vdsm/supervdsmServer", line 223, in setupNetworks
    return setupNetworks(networks, bondings, **options)
  File "/usr/share/vdsm/network/api.py", line 612, in setupNetworks
  File "/usr/lib/python2.6/site-packages/vdsm/netinfo.py", line 112, in networks
  File "/usr/lib/python2.6/site-packages/vdsm/libvirtconnection.py", line 129, in wrapper
  File "/usr/lib64/python2.6/site-packages/libvirt.py", line 3405, in getLibVersion
 libvirtError: internal error client socket is closed

and a libvirt log:
 2015-04-02 14:22:26.347+0000: 17198: info : libvirt version: 0.10.2, package: 46.el6_6.3 (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>, 2015-01-23-10:39:40, x86-026.build.eng.bos.redhat.com)
 2015-04-02 14:22:26.347+0000: 17198: error : virNWFilterSnoopLeaseFileRefresh:1903 : open("/var/run/libvirt/network/nwfilter.ltmp"): No such file or directory
 2015-04-02 14:22:26.398+0000: 17198: error : virExecWithHook:426 : Cannot find 'pm-is-supported' in path: No such file or directory
 2015-04-02 14:22:26.398+0000: 17198: warning : qemuCapsInit:885 : Failed to get host power management capabilities
 2015-04-02 14:22:26.809+0000: 17198: error : virExecWithHook:426 : Cannot find 'pm-is-supported' in path: No such file or directory
 2015-04-02 14:22:26.809+0000: 17198: warning : lxcCapsInit:77 : Failed to get host power management capabilities
 2015-04-02 14:22:51.820+0000: 17191: error : virExecWithHook:426 : Cannot find 'pm-is-supported' in path: No such file or directory
 2015-04-02 14:22:51.820+0000: 17191: warning : qemuCapsInit:885 : Failed to get host power management capabilities

looks like an issue on vdsm side.

Comment 8 Sandro Bonazzola 2015-05-08 15:05:43 UTC
moving to vdsm / network for review.

Comment 9 Dan Kenigsberg 2015-05-12 12:22:00 UTC
 MainProcess|Thread-16::CRITICAL::2015-04-02 14:22:53,588::libvirtconnection::137::root::(wrapper) taking calling process down.

means that libvirtd died after vdsm connected to it. That's not a problem of vdsm proper.

Can you tell what took libvirtd down? the copied 'pm-is-supported' errors reminds me of an issue fix in a newer libvirt.

Comment 10 Sandro Bonazzola 2015-05-12 13:01:21 UTC
(In reply to Dan Kenigsberg from comment #9)
> Can you tell what took libvirtd down? the copied 'pm-is-supported' errors
> reminds me of an issue fix in a newer libvirt.

All the info I have are the ones in the attached sosreport provided by Nikolai.
I can't tell what took libvirtd down after supervdsmd start.
If you can provide the version of libvirt that had the issue we can check in the sosreport if the installed version is affected.

Comment 11 Dan Kenigsberg 2015-05-18 08:35:51 UTC
Yaniv, can you tell why libvirtd died?

Comment 12 Yaniv Bronhaim 2015-05-20 09:40:49 UTC
Hi
I would change log level to debug and see what libvirtd.log says - you need to modify libvirtd.conf for that. can you do that and attach the output?

In addition I would add someone from libvirt to take a look as well

Comment 13 Dan Kenigsberg 2015-05-20 09:47:58 UTC
BTW, due to bug 1222154 comment 13 - please make sure that you run

  service libvirtd stop
  chkconfig libvirtd off

before any el6 tests of ovirt-3.5's vdsm.

Comment 14 Yaniv Bronhaim 2015-05-25 12:38:26 UTC
any news regarding the debug level logs? is it possible to produce them?

Comment 15 Nikolai Sednev 2015-05-26 14:31:56 UTC
I was unable to reproduce this bug, might be closed as not reproduced.

Comment 16 Nikolai Sednev 2015-05-26 14:36:09 UTC
(In reply to Nikolai Sednev from comment #15)
> I was unable to reproduce this bug, might be closed as not reproduced.

I've failed to get this reproduced on RHELs, on RHEVHs we don't have any image with HE setup yet, for better reproduction, such image is required.

Comment 17 Yaniv Lavi 2015-05-27 11:28:20 UTC
Once you have a image please reopen if this reproduces.

Comment 30 Fabian Deutsch 2015-06-17 09:54:25 UTC
I can reproduce this issue with the el6 build from comment 29 and the following network configuration:

# tail -n 40 /etc/sysconfig/network-scripts/ifcfg-*
==> /etc/sysconfig/network-scripts/ifcfg-bond0 <==
BONDING_OPTS="mode=4 miimon=100"
BOOTPROTO="dhcp"
DEVICE="bond0"
IPV6INIT="no"
IPV6_AUTOCONF="no"
NM_CONTROLLED="no"
ONBOOT="yes"
PEERNTP="yes"
TYPE="Bond"

==> /etc/sysconfig/network-scripts/ifcfg-eth0 <==
DEVICE="eth0"
HWADDR="52:54:00:2c:3c:8e"
MASTER="bond0"
NM_CONTROLLED="no"
ONBOOT="yes"
SLAVE="yes"

==> /etc/sysconfig/network-scripts/ifcfg-eth1 <==
DEVICE="eth1"
HWADDR="52:54:00:c0:7a:da"
MASTER="bond0"
NM_CONTROLLED="no"
ONBOOT="yes"
SLAVE="yes"

==> /etc/sysconfig/network-scripts/ifcfg-lo <==
DEVICE="lo"
IPADDR="127.0.0.1"
NETMASK="255.0.0.0"
ONBOOT="yes"

Comment 31 Fabian Deutsch 2015-06-17 09:57:47 UTC
Changing mode=4 to mode=balance-rr in the bonding opts prevents this error, but leads to bug 1230638.

Comment 32 Fabian Deutsch 2015-06-17 10:03:36 UTC
It's probably a race, because I also encountered it with a plain nic (eth1), running the setup a second time without any changes solved the issue.

Comment 33 Sandro Bonazzola 2015-06-17 10:10:37 UTC
*** Bug 1231614 has been marked as a duplicate of this bug. ***

Comment 34 Sandro Bonazzola 2015-06-17 10:40:15 UTC
first try, can't reproduce with
 rhev-hypervisor6-6.6-20150609.0.el6ev

Comment 35 Sandro Bonazzola 2015-06-17 12:18:31 UTC
(In reply to Sandro Bonazzola from comment #34)
> first try, can't reproduce with
>  rhev-hypervisor6-6.6-20150609.0.el6ev

it's 6.7 not 6.6. BTW, second run worked as well on my network and host so I guess it may be really some race condition depending on the power or the load of the system / network.

Comment 36 Fabian Deutsch 2015-06-19 10:18:31 UTC
Created attachment 1040844 [details]
Config and logs from a failed attempt

Comment 37 Sandro Bonazzola 2015-06-19 11:22:20 UTC
According to logs attached in comment #36, supervdsmd received a setupNetworks call:

MainThread::DEBUG::2015-06-19 10:00:41,416::vdsm-restore-net-config::77::root::(unified_restoration) Calling setupNetworks with networks ({}) and bond ({}).

ended with:
MainThread::INFO::2015-06-19 10:04:42,886::vdsm-restore-net-config::145::root::(restore) networks already restored. doing nothing.

Then supervdsmd has confirmed being up and running:

MainProcess|Thread-13::DEBUG::2015-06-19 10:04:45,431::supervdsmServer::102::SuperVdsm.ServerCallback::(wrapper) call getHardwareInfo with () {}
MainProcess|Thread-13::DEBUG::2015-06-19 10:04:45,431::supervdsmServer::109::SuperVdsm.ServerCallback::(wrapper) return getHardwareInfo with {'systemProductName': 'Standard PC (i440FX + PIIX, 1996)', 'systemUUID': 'D2F9C074-81CD-4057-FAA3-17B5980EEE1A', 'systemVersion': 'pc-i440fx-2.1', 'systemManufacturer': 'QEMU'}

before receiving the call to setupNetworks:

MainProcess|Thread-15::DEBUG::2015-06-19 10:04:46,327::supervdsmServer::102::SuperVdsm.ServerCallback::(wrapper) call setupNetworks with ({'rhevm': {'nic': 'eth1', 'netmask': '', 'ipaddr': ''}}, {}, {'connectivityCheck': False}) {}
MainProcess|Thread-15::WARNING::2015-06-19 10:04:46,328::libvirtconnection::135::root::(wrapper) connection to libvirt broken. ecode: 1 edom: 7
MainProcess|Thread-15::CRITICAL::2015-06-19 10:04:46,328::libvirtconnection::137::root::(wrapper) taking calling process down.
MainProcess|Thread-15::DEBUG::2015-06-19 10:04:46,328::libvirtconnection::143::root::(wrapper) Unknown libvirterror: ecode: 1 edom: 7 level: 2 message: internal error client socket is closed
MainProcess|Thread-15::ERROR::2015-06-19 10:04:46,328::supervdsmServer::106::SuperVdsm.ServerCallback::(wrapper) Error in setupNetworks
Traceback (most recent call last):
  File "/usr/share/vdsm/supervdsmServer", line 104, in wrapper
    res = func(*args, **kwargs)
  File "/usr/share/vdsm/supervdsmServer", line 224, in setupNetworks
    return setupNetworks(networks, bondings, **options)
  File "/usr/share/vdsm/network/api.py", line 624, in setupNetworks
  File "/usr/lib/python2.6/site-packages/vdsm/netinfo.py", line 112, in networks
  File "/usr/lib/python2.6/site-packages/vdsm/libvirtconnection.py", line 129, in wrapper
  File "/usr/lib64/python2.6/site-packages/libvirt.py", line 3405, in getLibVersion
libvirtError: internal error client socket is closed


libvirt was already up and running in the past 20 seconds:
2015-06-19 10:04:25.265+0000: 16409: info : libvirt version: 0.10.2, package: 54.el6 (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>, 2015-04-28-09:24:16, x86-033.build.eng.bos.redhat.com)
2015-06-19 10:04:25.265+0000: 16409: error : virExecWithHook:426 : Cannot find 'pm-is-supported' in path: No such file or directory
2015-06-19 10:04:25.265+0000: 16409: warning : qemuCapsInit:885 : Failed to get host power management capabilities
2015-06-19 10:04:25.599+0000: 16409: error : virExecWithHook:426 : Cannot find 'pm-is-supported' in path: No such file or directory
2015-06-19 10:04:25.599+0000: 16409: warning : lxcCapsInit:77 : Failed to get host power management capabilities
2015-06-19 10:04:44.256+0000: 16399: error : virExecWithHook:426 : Cannot find 'pm-is-supported' in path: No such file or directory
2015-06-19 10:04:44.256+0000: 16399: warning : qemuCapsInit:885 : Failed to get host power management capabilities


In vdsm.log I see:
clientIFinit::DEBUG::2015-06-19 10:04:44,244::libvirtconnection::150::root::(get) trying to connect libvirt

just 2 seconds before the call to setupNetworks

But I think the connection has been established since there is also:
Thread-14::DEBUG::2015-06-19 10:04:45,923::caps::276::root::(getLiveMergeSupport) libvirt is missing 'VIR_DOMAIN_BLOCK_COMMIT_RELATIVE': live merge disabled

just one second before the call to setupNetworks

I don't see anything related to restarts during the process or anything related to hosted-engine calls to vdsm other than the call to setupNetworks.
It doesn't seems something related to delays or race conditions.
Looks like a bug in the way vdsm handles the connection with libvirt.

Comment 38 Yaniv Bronhaim 2015-06-23 09:27:29 UTC
Fabian and I couldn't reproduce it easily . It happened few times, but we tried to get the libvirtd debug logs (add log_filters="1:libvirt 1:util 1:qemu"
 and log_outputs="1:file:/var/log/libvirt/libvirtd.log" lines to libvirtd.conf) - 

Then we need to restart libvirtd service and it might helped to avoid bug ... but we're not sure exactly. Having libvirt debug log is necessary to understand why libvirt raises an exception - this exception caused this regression and must be related to hosted-engine network configurations as it happened just after setupNetwork call. 

I suggest to leave this bug for verification in next 3.5.z release - This is a complex of network configuration issue, libvirt unknown error which relates to hosted-engine installation over node.. (quite a lot involve in it)

Comment 39 Fabian Deutsch 2015-06-23 09:36:10 UTC
I tried about 15 times to rewproduce it with plain nics and bonds.
With the congfigured libvirtd logging and without.
All testing was done on the el6 0609 build.

We should document it as a known bug for now.

Andrew, can we add this as a known issue to the 3.5.4 notes?

Comment 40 Oved Ourfali 2015-06-23 10:56:48 UTC
Moving it to integration, and to MODIFIED, based on Comment 39.
If it re-occurs, and seems to be an infra issue, then we'll re-examine.

Comment 41 Andrew Dahms 2015-06-23 11:01:01 UTC
Hi Fabian,

Yes, we'll make sure this one is added to the release notes for you.

Thank you for the notice!

Kind regards,

Andrew

Comment 42 Sandro Bonazzola 2015-06-25 09:32:14 UTC
Moving to ON_QA since this has been marked as TestOnly.
QE: please wait for the build to be available before trying to test this.

Comment 45 Fabian Deutsch 2015-07-23 10:56:37 UTC
*** Bug 1244740 has been marked as a duplicate of this bug. ***

Comment 46 Nikolai Sednev 2015-11-05 18:53:04 UTC
Not reproduced on Red Hat Enterprise Virtualization Hypervisor release 7.2 (20151104.0.el7ev)  
ovirt-vmconsole-host-1.0.0-1.el7ev.noarch                                                                                                         
ovirt-node-branding-rhev-3.6.0-0.20.20151103git3d3779a.el7ev.noarch                                                                               
ovirt-node-lib-3.6.0-0.20.20151103git3d3779a.el7ev.noarch                                                                                         
ovirt-node-3.6.0-0.20.20151103git3d3779a.el7ev.noarch                                                                                             
ovirt-node-plugin-snmp-logic-3.6.0-0.20.20151103git3d3779a.el7ev.noarch                                                                           
ovirt-hosted-engine-setup-1.3.0-1.el7ev.noarch
ovirt-node-plugin-vdsm-0.6.1-2.el7ev.noarch
ovirt-setup-lib-1.0.0-1.el7ev.noarch
ovirt-vmconsole-1.0.0-1.el7ev.noarch
ovirt-node-lib-config-3.6.0-0.20.20151103git3d3779a.el7ev.noarch
ovirt-node-selinux-3.6.0-0.20.20151103git3d3779a.el7ev.noarch
ovirt-node-plugin-cim-logic-3.6.0-0.20.20151103git3d3779a.el7ev.noarch
ovirt-hosted-engine-ha-1.3.2.1-1.el7ev.noarch
ovirt-node-plugin-hosted-engine-0.3.0-2.el7ev.noarch
ovirt-node-plugin-snmp-3.6.0-0.20.20151103git3d3779a.el7ev.noarch
ovirt-node-plugin-rhn-3.6.0-0.20.20151103git3d3779a.el7ev.noarch
sanlock-3.2.4-1.el7.x86_64
ovirt-node-lib-legacy-3.6.0-0.20.20151103git3d3779a.el7ev.noarch
libvirt-1.2.17-13.el7.x86_64
qemu-kvm-rhev-2.3.0-31.el7.x86_64
vdsm-4.17.10.1-0.el7ev.noarch
ovirt-host-deploy-offline-1.4.0-1.el7ev.x86_64
ovirt-node-plugin-cim-3.6.0-0.20.20151103git3d3779a.el7ev.noarch
ovirt-host-deploy-1.4.0-1.el7ev.noarch

Comment 47 Lucy Bopf 2016-02-19 10:20:17 UTC
Moving 'requires_doc_text' to '-', as this bug is verified and no longer needs to be documented as a known issue.

Comment 49 errata-xmlrpc 2016-03-09 19:34:00 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0362.html


Note You need to log in before you can comment on or make changes to this bug.