Bug 984416

Summary: GetHardwareInfoVDS fails with XmlRpcException
Product: Red Hat Enterprise Virtualization Manager
Reporter: Ohad Basan <obasan>
Component: ovirt-engine
Assignee: Yaniv Bronhaim <ybronhei>
Status: CLOSED DUPLICATE
Severity: unspecified
Priority: unspecified
Version: 3.3.0
CC: acathrow, bazulay, eedri, iheim, lpeer, obasan, Rhev-m-bugs, yeylon
Target Milestone: ---
Target Release: 3.3.0
Keywords: Triaged
Hardware: Unspecified
OS: Unspecified
Whiteboard: infra
Doc Type: Bug Fix
Type: Bug
Last Closed: 2013-07-23 12:27:59 UTC
oVirt Team: Infra

Description Ohad Basan 2013-07-15 08:06:19 UTC
Description of problem:

Executing command GetHardwareInfoVDS fails with an exception
2013-07-14 14:29:44,562 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetHardwareInfoVDSCommand] (DefaultQuartzScheduler_Worker-1) Command GetHardwareInfoVDS execution failed. Exception: VDSNetworkException: org.apache.xmlrpc.XmlRpcException: <type 'exceptions.TypeError'>:cannot marshal None unless allow_none is enabled


2013-07-14 14:29:44,568 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler_Worker-1) Correlation ID: null, Call Stack: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSNetworkException: org.apache.xmlrpc.XmlRpcException: <type 'exceptions.TypeError'>:cannot marshal None unless allow_none is enabled
        at org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerCommand.executeVDSCommand(VdsBrokerCommand.java:102)
        at org.ovirt.engine.core.vdsbroker.VDSCommandBase.executeCommand(VDSCommandBase.java:56)
        at org.ovirt.engine.core.dal.VdcCommandBase.execute(VdcCommandBase.java:28)
        at org.ovirt.engine.core.vdsbroker.ResourceManager.runVdsCommand(ResourceManager.java:359)
        at org.ovirt.engine.core.vdsbroker.VdsManager.refreshCapabilities(VdsManager.java:548)
        at org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo.refreshVdsRunTimeInfo(VdsUpdateRunTimeInfo.java:528)
        at org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo.Refresh(VdsUpdateRunTimeInfo.java:379)
        at org.ovirt.engine.core.vdsbroker.VdsManager.OnTimer(VdsManager.java:237)
        at sun.reflect.GeneratedMethodAccessor51.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.ovirt.engine.core.utils.timer.JobWrapper.execute(JobWrapper.java:60)
        at org.quartz.core.JobRunShell.run(JobRunShell.java:213)
        at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:557)
Caused by: org.apache.xmlrpc.XmlRpcException: <type 'exceptions.TypeError'>:cannot marshal None unless allow_none is enabled
        at org.apache.xmlrpc.client.XmlRpcStreamTransport.readResponse(XmlRpcStreamTransport.java:197)
        at org.apache.xmlrpc.client.XmlRpcStreamTransport.sendRequest(XmlRpcStreamTransport.java:156)
        at org.apache.xmlrpc.client.XmlRpcHttpTransport.sendRequest(XmlRpcHttpTransport.java:143)
        at org.apache.xmlrpc.client.XmlRpcClientWorker.execute(XmlRpcClientWorker.java:56)
        at org.apache.xmlrpc.client.XmlRpcClient.execute(XmlRpcClient.java:167)
        at org.apache.xmlrpc.client.XmlRpcClient.execute(XmlRpcClient.java:137)
        at org.apache.xmlrpc.client.XmlRpcClient.execute(XmlRpcClient.java:126)
        at org.apache.xmlrpc.client.util.ClientFactory$1.invoke(ClientFactory.java:140)
        at com.sun.proxy.$Proxy100.getVdsHardwareInfo(Unknown Source)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.ovirt.engine.core.vdsbroker.xmlrpc.XmlRpcUtils$AsyncProxy$InternalCallable.call(XmlRpcUtils.java:225)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
        at java.util.concurrent.FutureTask.run(FutureTask.java:166)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
        at java.util.concurrent.FutureTask.run(FutureTask.java:166)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:724)
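
For reference, the "cannot marshal None unless allow_none is enabled" text is the standard error raised by Python's XML-RPC marshaller when it is asked to serialize a None value; it indicates the VDSM side tried to return None in the response. The snippet below is only a minimal illustration of that behaviour, assuming Python 2's xmlrpclib (in Python 3 the equivalent module is xmlrpc.client); it is not VDSM code.

    # Minimal illustration of the marshalling error quoted above (assumed
    # Python 2 / xmlrpclib environment; not actual VDSM code).
    import xmlrpclib

    try:
        # Serializing a None return value fails by default ...
        xmlrpclib.dumps((None,), methodresponse=True)
    except TypeError as e:
        print(e)  # cannot marshal None unless allow_none is enabled

    # ... and only succeeds when allow_none is explicitly enabled.
    print(xmlrpclib.dumps((None,), methodresponse=True, allow_none=True))
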

Here is the REST query that was being sent:

2013-07-14 14:29:27,344 - MainThread - plmanagement.matrix-test-composer - INFO - Test parameters: True, name='restvm0', description='Test VM', cluster='RestCluster1'
2013-07-14 14:29:27,344 - MainThread - plmanagement.matrix-test-composer - INFO - Running command: addVm(True, name='restvm0', description='Test VM', cluster='RestCluster1')
2013-07-14 14:29:28,410 - MainThread - vms - DEBUG - CREATE api content is --  collection:vms element:<vm>
    <name>restvm0</name>
    <description>Test VM</description>
    <os>
        <boot dev="hd"/>
    </os>
    <cluster href="/api/clusters/34ab0542-90cb-4492-ba51-65e8cad14908" id="34ab0542-90cb-4492-ba51-65e8cad14908">
        <name>RestCluster1</name>
        <link href="/api/clusters/34ab0542-90cb-4492-ba51-65e8cad14908/networks" rel="networks"/>
        <link href="/api/clusters/34ab0542-90cb-4492-ba51-65e8cad14908/permissions" rel="permissions"/>
        <link href="/api/clusters/34ab0542-90cb-4492-ba51-65e8cad14908/glustervolumes" rel="glustervolumes"/>
        <cpu id="Intel Conroe Family"/>
        <data_center href="/api/datacenters/9de2c9d8-81a0-468d-baad-d7adfa8bd781" id="9de2c9d8-81a0-468d-baad-d7adfa8bd781"/>
        <memory_policy>
            <overcommit percent="200"/>
            <transparent_hugepages>
                <enabled>true</enabled>
            </transparent_hugepages>
        </memory_policy>
        <scheduling_policy/>
        <version major="3" minor="2"/>
        <error_handling>
            <on_error>migrate</on_error>
        </error_handling>
        <virt_service>true</virt_service>
        <gluster_service>false</gluster_service>
        <threads_as_cores>false</threads_as_cores>
        <tunnel_migration>false</tunnel_migration>
        <trusted_service>false</trusted_service>
    </cluster>
    <template id="00000000-0000-0000-0000-000000000000"/>
</vm>
 
2013-07-14 14:29:29,942 - MainThread - vms - INFO - New entity was added successfully
2013-07-14 14:29:29,942 - MainThread - vms - ERROR - Element '<ovirtsdk.infrastructure.brokers.VM object at 0x33dc350>' has no attribute 'watchdogs'

Comment 3 Barak 2013-07-21 10:15:15 UTC
Ohad,

Does it happen on all the hosts, or only on a specific one?
This is a basic flow (getVdsCapabilities), so if it fails nothing should work.

Comment 4 Ohad Basan 2013-07-21 10:21:53 UTC
I don't know if it happens on all the hosts,
but it's certainly not limited to a specific one.
I see it on multiple machines.

Comment 6 Barak 2013-07-21 15:15:40 UTC
Is this a duplicate of Bug 984267?

Comment 7 Yaniv Bronhaim 2013-07-22 13:07:47 UTC
Explanation of the fix: after a supervdsm crash or restart, vdsm is unable to reach supervdsm on its next call. A RuntimeError is raised to vdsm, and vdsm then tries to reconnect to supervdsm; if supervdsm is back up properly, the reconnection is established and the next call between vdsm and supervdsm succeeds.

In this bug's case, setupNetwork fails due to a restart of supervdsm (which should be fixed separately as part of BZ 984267), and getVdsHardwareInfo failed as well, because no reconnect took place and the connection stayed down. The only way to recover from that was to restart vdsmd.

With the fix, the next call succeeds once the reconnection is established.
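
For illustration only, the sketch below shows the reconnect-on-RuntimeError pattern described above. The names (SuperVdsmProxy, the 'connect' callable) are hypothetical placeholders, not the actual vdsm implementation.

    class SuperVdsmProxy(object):
        """Sketch: reconnect to supervdsm after it crashes or restarts."""

        def __init__(self, connect):
            # 'connect' is a callable returning a live supervdsm proxy
            # object; it stands in for vdsm's real connection setup
            # (hypothetical placeholder).
            self._connect = connect
            self._proxy = None

        def call(self, method, *args, **kwargs):
            if self._proxy is None:
                self._proxy = self._connect()
            try:
                return getattr(self._proxy, method)(*args, **kwargs)
            except RuntimeError:
                # supervdsm went away: drop the stale connection so the
                # next call reconnects, instead of leaving the channel
                # broken until vdsmd is restarted.
                self._proxy = None
                raise
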

Comment 8 Yaniv Bronhaim 2013-07-23 12:27:59 UTC

*** This bug has been marked as a duplicate of bug 980493 ***