Bug 1004066 - Host: Exit message: internal error No more available PCI addresses
Summary: Host: Exit message: internal error No more available PCI addresses
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 3.2.0
Hardware: x86_64
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 3.3.0
Assignee: Omer Frenkel
QA Contact: Ilanit Stein
URL:
Whiteboard: virt
Duplicates: 1024833
Depends On:
Blocks: 1015134
 
Reported: 2013-09-03 20:29 UTC by baiesi
Modified: 2018-12-04 15:48 UTC
CC List: 17 users

Fixed In Version: is18
Doc Type: Bug Fix
Doc Text:
Previously, existing sound devices were not found when restoring the configuration of a stateless virtual machine, so a new sound device was added each time the machine was started. This eventually prevented those machines from starting. With this update, the code that searches for sound devices has been corrected so that existing sound devices are discovered when restoring stateless virtual machines and no new sound devices are created. Stateless virtual machines now have only a single sound device and no longer experience issues after being started multiple times.
Clone Of:
Clones: 1015134
Environment:
Last Closed: 2014-01-21 17:36:48 UTC
oVirt Team: ---
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Knowledge Base (Solution) 532923 0 None None None Never
Red Hat Product Errata RHSA-2014:0038 0 normal SHIPPED_LIVE Important: Red Hat Enterprise Virtualization Manager 3.3.0 update 2014-01-21 22:03:06 UTC
oVirt gerrit 19561 0 None None None Never
oVirt gerrit 19677 0 None None None Never

Description baiesi 2013-09-03 20:29:57 UTC
Summary:
Host: Exit message: internal error No more available PCI addresses

Description of problem:
I first noticed VM migration failing from the Admin Portal with "Migration failed due to Error: Fatal error during migration". I then correlated this with an issue on the destination Host indicating "Exit message: internal error No more available PCI addresses". All VMs associated with the Host are in a frozen state in the UI: some show an hour-glass, some a migrating status, and some a powering-up status that never changes. Since this issue occurred, the Host system's memory usage has been slowly climbing and is now at 82.06%.

The test environment is currently in this condition and will remain in this state for a brief period of time in case developers wish to get access to it. I have not yet tried to recover the Host by putting it into maintenance mode. If the developers have no need for the current system state, I will try this and see if it recovers.

Current state:
-Unable to migrate VMs to this Host
-Unable to shut down the Host's VMs: a dialog indicates "Cannot shut-down VM. VM is not running."
-Unable to suspend the Host's VMs: a dialog indicates "Cannot hibernate VM. VM is not up."
-Unable to run the Host's VMs: a dialog indicates "Cannot run VM. VM is running."
-Unable to cancel migration of the Host's VMs: a dialog indicates "Cannot cancel migration for non migrating VM"
-The Admin Portal UI shows the VMs for the Host in a frozen state, as indicated above.

Version-Release number of selected component:
Host Info
OS       : RHEL6Server - 6.4.0.4.el6
Kernel   : 2.6.32 - 358.14.1.el6.x86_64
KVM Ver  : 0.12.1.2 - 2.355.el6_4.5
Libvirt  : libvirt-0.10.2-18.el6_4.9
vdsm     : vdsm-4.10.2-23.0.el6ev
spice    : 0.12.0 - 12.el6_4.2

How reproducible: Undetermined, since this was the first run
Steps to Reproduce:
1. Run system test load against the system for an extended period of time

Actual results:
Failed migrations, with the Host-generated event: "Exit message: internal error No more available PCI addresses."

Expected results:
Continued system operation and functionality

Additional info:
I have been running a 30-day test using RHEV-M 3.2.
Type            : System / Longevity
Target Duration : 30 days
Current Duration: 26 days / Run 1

System Test Env:
-Red Hat Enterprise Virtualization Manager Version: 3.2.1-0.39.el6ev
-Qty 1 RHEL 6.4 RHEV-M Server: high-end Dell PowerEdge R710, dual 8-core, 32 GB RAM, rhevm-3.2.1-0.39.el6ev.noarch
-Qty 4 RHEL 6.4 Hosts: all high-end Dell PowerEdge R710, dual 8-core, 16 GB RAM
-Qty 1 RHEL 6.4 IPA Directory Server
-Qty 3 RHEL 6.4 Load Client machines to drive user-simulated load

VM(s)
Total: 34 VMs created

Storage
-ISCSI Total 500G
-Name Type Storage Format Cross Data-Center-Status FreeSpace
-ISCIMainStorage Data (Master) iSCSI  V3 Active 263 GB

Data collection / monitoring:
All systems are being monitored for uptime, memory, swap, CPU, network I/O, disk I/O, and disk space during the test run (except for the IPA Server and Clients).

System Test Load:
1. VM_Crud client: a Python multi-threaded client that uses the SDK to cycle through a CRUD flow of VMs over a tester-defined period of time to drive load against the system (10 threads)

2. VM_Migration client: a Python multi-threaded client that uses the SDK to cycle through migrating running VMs from host to host in the test environment over a tester-defined period of time (2 threads); see the sketch after this list

3. VM_Cycling client: a Python multi-threaded client that uses the SDK to cycle through random run, suspend, and stop operations on existing VMs in the test environment over a tester-defined period of time (10 threads)

4. UserPortal client: a Python multi-threaded client that uses Selenium to drive the User Portal. The client cycles through unique users to run, stop, or start a remote-viewer console for existing VMs in the test environment over a tester-defined period of time (10 threads)
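
To give a feel for what these load clients do, here is a minimal sketch of a migration-load worker in the spirit of the VM_Migration client. The original clients used the RHEV 3.2-era Python SDK; this sketch uses the later ovirt-engine-sdk4 purely for illustration, and the URL and credentials are placeholders rather than values from this environment.

import random
import threading
import time

import ovirtsdk4 as sdk

def migration_worker(stop_event):
    # Placeholder connection details; adjust for a real environment.
    connection = sdk.Connection(
        url="https://rhevm.example.com/ovirt-engine/api",
        username="admin@internal",
        password="password",
        insecure=True,  # lab setup; prefer ca_file in production
    )
    vms_service = connection.system_service().vms_service()
    try:
        while not stop_event.is_set():
            running = vms_service.list(search="status=up")
            if running:
                vm = random.choice(running)
                # Let the engine choose the destination host.
                vms_service.vm_service(vm.id).migrate()
            time.sleep(60)
    finally:
        connection.close()

stop = threading.Event()
workers = [threading.Thread(target=migration_worker, args=(stop,)) for _ in range(2)]
for w in workers:
    w.start()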

Let me know if there are any additional logs or system access required.

Thanks
Bruce

Comment 2 baiesi 2013-09-09 15:02:30 UTC
Let me know if anyone needs access to the test env in its current state. I've already sent the information out to a few people who have requested access. The systems will be available until Wednesday 11:00 EST.

Comment 3 Barak 2013-09-09 20:22:45 UTC
Below is the create command sent to the destination host:


{'custom': {},
     'keyboardLayout':'en-us',
     'kvmEnable': 'true',
     'pitReinjection': 'false',
     'acpiEnable': 'true',
     'emulatedMachine': 'rhel6.4.0',
     'cpuType': 'Westmere',
     'vmId': '27f78875-8e50-4dc1-9c24-e519bd8683ce',
     'devices':
	[{'device': 'qxl',
	  'specParams': {'vram': '65536'},
	  'type': 'video',
	  'deviceId': 'c4b8cb2c-a27b-46c6-a08d-ee96175a60fe'},
	 {'index': '2',
	  'iface': 'ide',
	  'bootOrder': '2',
	  'specParams': {'path': ''},
	  'readonly': 'true',
	  'deviceId': '0113864d-9dfe-4b14-8cc8-d83a85db7f98',
	  'path': '',
	  'device': 'cdrom',
	  'shared': 'false',
	  'type': 'disk'},
	 {'index': 0,
	  'iface': 'virtio',
	  'format': 'cow',
	  'bootOrder': '1',
	  'poolID': '5849b030-626e-47cb-ad90-3ce782d831b3',
	  'volumeID': 'e8fe1b6e-b62c-487c-bf04-b9a6168131fb',
	  'imageID': '28e91ef5-a15b-421d-adda-da5a2294d1ae',
	  'specParams': {},
	  'readonly': 'false',
	  'domainID': '156a4b8c-f139-46c5-9e0b-fceaaaeaff4f',
	  'optional': 'false',
	  'deviceId': '28e91ef5-a15b-421d-adda-da5a2294d1ae',
	  'address': {'bus': '0x00', ' slot': '0x05', ' domain': '0x0000', ' type': 'pci', ' function': '0x0'},
	  'device': 'disk',
	  'shared': 'false',
	  'propagateErrors': 'off',
	  'type': 'disk'},
	 {'nicModel': 'e1000',
	  'macAddr': '00:1a:4a:10:86:31',
	  'linkActive': 'true',
	  'network': 'rhevm',
	  'filter': 'vdsm-no-mac-spoofing',
	  'specParams': {},
	  'deviceId': '051fbde9-71ab-49c1-bdf6-a22a8e24355e',
	  'address': {'bus': '0x00', ' slot': '0x03', ' domain': '0x0000', ' type': 'pci', ' function': '0x0'},
	  'device': 'bridge',
	  'type': 'interface'},
	 {'device': 'ich6',
	  'specParams': {},
	  'type': 'sound',
	  'deviceId': '3524a274-3439-475b-93c5-d420cafb1be3'},
	 {'device': 'ich6',
	  'specParams': {},
	  'type': 'sound',
	  'deviceId': 'c0dbb199-3648-478f-b995-45180b3c1cba'},
	 {'device': 'ich6', 'specParams': {},
	  'type': 'sound',
	  'deviceId': 'a8a760f2-6b05-4390-b4f3-0b4c7eb1d0dc'},
	 {'device': 'ich6',
	  'specParams': {},
	  'type': 'sound',
	  'deviceId': '4db9217e-6c80-4638-974b-863b3b6dae1a'},
	 {'device': 'ich6',
	  'specParams': {},
	  'type': 'sound',
	  'deviceId': '625b6716-8855-419f-9952-f01a542aec7f'},
	 {'device': 'ich6',
	  'specParams': {},
	  'type': 'sound',
	  'deviceId': '4adbc5d0-f5b5-4ac1-9df9-327561b7f4ca'},
	 {'device': 'ich6',
	  'specParams': {},
	  'type': 'sound',
	  'deviceId': '579f84f5-ae32-49d7-9977-18d57ab90a41'},
	 {'device': 'ich6',
	  'specParams': {},
	  'type': 'sound',
	  'deviceId': '17d51030-b43d-43fa-a413-e06c8b2d62bb'},
	 {'device': 'ich6',
	  'specParams': {},
	  'type': 'sound',
	  'deviceId': '2d28d304-32eb-423d-baee-91f218eb43f2'},
	 {'device': 'ich6',
	  'specParams': {},
	  'type': 'sound',
	  'deviceId': 'ff4c9f6a-3e7e-4a25-b0d7-776f86e5a3e4'},
	 {'device': 'ich6',
	  'specParams': {},
	  'type': 'sound',
	  'deviceId': 'f05b4df4-52b9-48d4-8a7d-f32304df9da8'},
	 {'device': 'ich6',
	  'specParams': {},
	  'type': 'sound',
	  'deviceId': 'c2bf70af-48f7-4058-a611-d95dead221c7'},
	 {'device': 'ich6',
	  'specParams': {},
	  'type': 'sound',
	  'deviceId': '85600e66-d6e9-4bea-ac97-e9fd358ff33b'},
	 {'device': 'ich6',
	  'specParams': {},
	  'type': 'sound',
	  'deviceId': '1fa4a40b-d847-4b81-a572-9150135aa511'},
	 {'device': 'ich6',
	  'specParams': {},
	  'type': 'sound',
	  'deviceId': '326bf1da-0a50-4da6-ad45-f4794ab6feea'},
	 {'device': 'ich6',
	  'specParams': {},
	  'type': 'sound',
	  'deviceId': '824d1bf0-7fff-439b-8f7a-4c0989342a49'},
	 {'device': 'ich6',
	  'specParams': {},
	  'type': 'sound',
	  'deviceId': '4d789bd1-6d3f-4599-8299-68467c4d591d'},
	 {'device': 'ich6',
	  'specParams': {},
	  'type': 'sound',
	  'deviceId': 'eb829e9a-ef2f-4401-a1f0-49d10483c097'},
	 {'device': 'ich6',
	  'specParams': {},
	  'type': 'sound',
	  'deviceId': '48fe2b91-46e4-461c-ae0c-bb96cb4b2676'},
	 {'device':
	  'ich6', 'specParams': {},
	  'type': 'sound',
	  'deviceId': 'b5e910dd-f7b9-497b-a518-35f750c8e1c0'},
	 {'device': 'ich6',
	  'specParams': {},
	  'type': 'sound',
	  'deviceId': '5cc0325a-03cd-408d-8e88-c5af29f108d8'},
	 {'device': 'ich6',
	  'specParams': {},
	  'type': 'sound',
	  'deviceId': 'ea36e407-91f6-43ea-8f41-1eebff6e4661'},
	 {'device': 'ich6',
	  'specParams': {},
	  'type': 'sound',
	  'deviceId': '0212b3b5-5170-4877-87cb-40c1b290b619'}, 
	 {'device': 'ich6', 
	  'specParams': {}, 
	  'type': 'sound', 
	  'deviceId': '11c16b25-7851-4b3d-8866-cc4414721ee3'}, 
	 {'device': 'ich6', 
	  'specParams': {}, 
	  'type': 'sound', 
	  'deviceId': '5494e581-8823-4607-bfa0-e1419173567d'}, 
	 {'device': 'ich6', 
	  'specParams': {}, 
	  'type': 'sound', 
	  'deviceId': '7c74505d-6181-4e5b-bdfe-c03319bf468f'}, 
	 {'device': 'memballoon', 
	  'specParams': {'model': 'virtio'}, 
	  'type': 'balloon', 
	  'deviceId': 'b2c0ae77-dace-4539-bade-f2c776061cef'}], 
     'smp': '1', 
     'vmType': 'kvm', 
     'timeOffset': '0', 
     'memSize': 1024, 
     'spiceSslCipherSuite': 'DEFAULT', 
     'smpCoresPerSocket': '1', 
     'spiceSecureChannels': 'smain,sinputs,scursor,splayback,srecord,sdisplay,susbredir,ssmartcard', 
     'smartcardEnable': 'false', 
     'vmName': 'p_rhel6x64-29', 
     'display': 'qxl', 
     'transparentHugePages': 'true', 
     'nice': '0'}

If I counted correctly, it has a total of 31 devices;
the odd thing is that 26 of them are ich6 sound cards.

IIRC qemu has 32 PCI slots?
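
For reference, the device mix can be tallied directly from the create command above. A quick sketch, assuming the pasted dict has been loaded into a Python variable named vm_create (a hypothetical name):

from collections import Counter

# vm_create is the create-command dict pasted in comment 3.
counts = Counter(dev['device'] for dev in vm_create['devices'])
print(counts)                         # expected: 26 x 'ich6', plus qxl, cdrom, disk, bridge, memballoon
print(sum(counts.values()), 'devices total')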

Comment 4 Barak 2013-09-09 20:30:49 UTC
Can we please get the log of the creation of this VM on the source host?
'vmName': 'p_rhel6x64-29'

Comment 5 Barak 2013-09-09 20:42:08 UTC
Can we also get the DB backup (from the log collector)?

Comment 6 Michal Skrivanek 2013-09-13 07:12:38 UTC
Also, I suppose the VM was edited at some point, wasn't it? Or is it a freshly created VM in webadmin?

Comment 7 Robert McSwain 2013-09-13 15:37:42 UTC
I am providing a RHEV Database from my customer who is experiencing this same issue.

Comment 10 Omer Frenkel 2013-09-16 08:48:46 UTC
Please attach engine and vdsm logs.

Comment 14 Omer Frenkel 2013-09-25 06:39:56 UTC
I was able to reproduce this. Steps, on 3.2:
1. Create a stateless desktop VM.
2. Start it and, once it is up, stop it.
3. Observe the vm_device table in the DB and see an additional sound device,
or on the next run observe the creation XML in vdsm.log and see the multiple sound devices (see the query sketch below).
4. Repeat step 2 until the VM can no longer be started and fails with the 'No more PCI addresses' error. (How quickly this happens depends on the number of other devices in the VM: monitors, USB, disks, etc.)
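
To make step 3 concrete, here is a minimal sketch of how the sound-device accumulation could be spotted in the engine database. Assumptions: the engine DB is named 'engine', is reachable locally, and the vm_device table exposes vm_id and type columns as the step above suggests; the credentials are placeholders.

import psycopg2

conn = psycopg2.connect(dbname='engine', user='engine', host='localhost',
                        password='engine_db_password')  # placeholder credentials
cur = conn.cursor()
cur.execute(
    "SELECT vm_id, count(*) AS sound_devices "
    "FROM vm_device WHERE type = 'sound' "
    "GROUP BY vm_id ORDER BY sound_devices DESC LIMIT 20"
)
for vm_id, sound_devices in cur.fetchall():
    print(vm_id, sound_devices)
cur.close()
conn.close()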

On 3.3 this is reproducible only if the cluster compatibility level is 3.0.

The problem is that when restoring the VM configuration (as it is stateless), the code that looks for existing sound devices is wrong.
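
The gist of the fix, illustrated as a hedged Python sketch (the real change is in the ovirt-engine code that restores stateless VM configuration; the function and field names below are hypothetical): the restore path must recognize a sound device the VM already has and reuse it, rather than concluding that none exists and adding another one on every start/stop cycle.

def restore_devices(existing_devices, snapshot_devices):
    """Merge snapshot devices back into the VM, reusing devices that already exist."""
    restored = list(snapshot_devices)
    has_sound = (
        any(d['type'] == 'sound' for d in snapshot_devices)
        or any(d['type'] == 'sound' for d in existing_devices)
    )
    # Buggy behavior: the lookup for existing sound devices never matched, so a
    # new ich6 device was appended on every stateless start/stop cycle.
    # Corrected behavior: only add a sound device when neither the snapshot nor
    # the current configuration already contains one.
    if not has_sound:
        restored.append({'device': 'ich6', 'type': 'sound', 'specParams': {}})
    return restored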

Comment 16 Michal Skrivanek 2013-10-03 12:09:57 UTC
merged to ovirt-engine-3.3

Comment 18 Ilanit Stein 2013-10-13 10:18:37 UTC
Verified on is18. 

Followed the reproduction flow in comment 14, on a 3.0 cluster.

In the vdsm log, the device list in the XML did not increase after repeatedly stopping and starting the VM.
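
A rough way to double-check this on the host, assuming (as in this setup) the create command is logged as a single line in /var/log/vdsm/vdsm.log: count the sound entries per logged call and confirm the number stays at one across repeated starts.

# Sketch only: counts sound devices in each logged create call.
with open('/var/log/vdsm/vdsm.log') as log:
    for line in log:
        if "'device': 'ich6'" in line:
            print(line.count("'type': 'sound'"), 'sound devices in this call')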

Comment 19 Michal Skrivanek 2013-11-11 14:54:47 UTC
*** Bug 1024833 has been marked as a duplicate of this bug. ***

Comment 20 Charlie 2013-11-28 00:13:35 UTC
This bug is currently attached to errata RHEA-2013:15231. If this change is not to be documented in the text for this errata, please either remove it from the errata, set the requires_doc_text flag to minus (-), or leave a "Doc Text" value of "--no tech note required" if you do not have permission to alter the flag.

Otherwise to aid in the development of relevant and accurate release documentation, please fill out the "Doc Text" field above with these four (4) pieces of information:

* Cause: What actions or circumstances cause this bug to present.
* Consequence: What happens when the bug presents.
* Fix: What was done to fix the bug.
* Result: What now happens when the actions or circumstances above occur. (NB: this is not the same as 'the bug doesn't present anymore')

Once filled out, please set the "Doc Type" field to the appropriate value for the type of change made and submit your edits to the bug.

For further details on the Cause, Consequence, Fix, Result format please refer to:

https://bugzilla.redhat.com/page.cgi?id=fields.html#cf_release_notes 

Thanks in advance.

Comment 21 errata-xmlrpc 2014-01-21 17:36:48 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2014-0038.html

