Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1147751

Summary: [scale] when hosts are too many ovirt-engine failed to refresh states
Product: [Retired] oVirt
Reporter: 马立克 <like.ma>
Component: ovirt-engine-core
Assignee: Liran Zelkha <lzelkha>
Status: CLOSED CURRENTRELEASE
QA Contact: Pavel Stehlik <pstehlik>
Severity: high
Docs Contact:
Priority: unspecified
Version: unspecified
CC: bazulay, bugs, ecohen, gklein, iheim, lsurette, oourfali, rbalakri, s.kieske, yeylon
Target Milestone: ---
Target Release: 3.3.4
Hardware: Unspecified
OS: Unspecified
Whiteboard: infra
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-10-11 06:15:10 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments (description / flags):
ovirt engine log (none)
vdsm log (none)

Description 马立克 2014-09-30 02:55:27 UTC
Description of problem:
In our environment we have 100 hosts, split across 5 datacenters with 20 hosts each. When all the hosts are in the UP state, ovirt-engine fails
to refresh entity states. For example, if I create a disk, the state of the new disk becomes Locked. That is normal, but even after a long time (several hours, one day, two days) it is still Locked. Only after I restart the ovirt-engine service does the state of the new disk change to OK. The same thing happens with VMs: if I start a VM, its state becomes WaitForLaunch, and it stays in that state until I restart the ovirt-engine service.

I think there is a bug in how oVirt manages this many hosts, because if I put about 50 hosts into maintenance, the environment works well again: the Locked disk soon changes to OK, and the WaitForLaunch VM soon changes to Up.
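The stuck-state symptom described above can be detected without manually watching the UI. Below is a minimal, hypothetical sketch: `get_disk_status` is a stand-in for whatever status query is available (e.g. a REST call to the engine), not a real oVirt SDK function. It polls the disk until it leaves the Locked state or a timeout expires, which is one way to flag that the engine's monitoring has stalled:

```python
import time


def wait_for_unlock(get_disk_status, disk_id, timeout=300, interval=5):
    """Poll a disk's status until it leaves 'Locked' or the timeout expires.

    get_disk_status: callable taking a disk id and returning its current
    status string (a stand-in for a real engine query).
    Returns the final status, or raises TimeoutError if the disk is still
    Locked after `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = get_disk_status(disk_id)
        if status != "Locked":
            return status
        time.sleep(interval)
    raise TimeoutError(f"disk {disk_id} still Locked after {timeout}s")
```

A disk that never leaves Locked within a generous timeout, as reported here, suggests the engine's refresh loop has stopped processing updates rather than a genuinely slow operation.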

Version-Release number of selected component (if applicable):
3.2.1-1

How reproducible:
always

Expected results:


Additional info:
The disk size is less than 300 GB.
The storage type is iSCSI.

Comment 1 Sven Kieske 2014-09-30 06:49:23 UTC
Could you attach engine and vdsm logs?
Also this version is pretty ancient (afaik not supported anymore).
Did you try to upgrade, to see if the problem persists?

Comment 2 马立克 2014-09-30 07:02:40 UTC
(In reply to Sven Kieske from comment #1)
> Could you attach engine and vdsm logs?
> Also this version is pretty ancient (afaik not supported anymore).
> Did you try to upgrade, to see if the problem persists?

For certain reasons I cannot upgrade. Do you have any statistics about how many hosts can be supported in a 3.2.1 environment (not theoretical but practical)?

Comment 3 马立克 2014-09-30 07:03:29 UTC
Created attachment 942611 [details]
ovirt engine log

Comment 4 马立克 2014-09-30 07:04:14 UTC
Created attachment 942612 [details]
vdsm log

Comment 5 Sven Kieske 2014-09-30 07:25:36 UTC
(In reply to 马立克 from comment #2)
> For certain reasons I cannot upgrade. Do you have any statistics about how
> many hosts can be supported in a 3.2.1 environment (not theoretical but practical)?

No, sorry, I have no practical statistics, but the docs state
that 200 hosts per cluster are supported (at least since 3.3, though I believe
this was already true for 3.2). I have never seen such a huge cluster myself,
though.

However, I know from testing that 3.3 is in all regards far more stable
than 3.2.

Can you share why you can't upgrade?
The upgrade process itself is very well documented, though there might be some
rough edges in the 3.2 to 3.3 upgrade (3.3 to 3.4 is much better).

Comment 6 马立克 2014-10-08 05:09:23 UTC
(In reply to Sven Kieske from comment #5)
> (In reply to 马立克 from comment #2)
> > For certain reasons I cannot upgrade. Do you have any statistics about how
> > many hosts can be supported in a 3.2.1 environment (not theoretical but practical)?
> 
> No, sorry, I have no practical statistics, but the docs state
> that 200 hosts per cluster are supported (at least since 3.3, though I believe
> this was already true for 3.2). I have never seen such a huge cluster myself,
> though.
> 
> However, I know from testing that 3.3 is in all regards far more stable
> than 3.2.
> 
> Can you share why you can't upgrade?
> The upgrade process itself is very well documented, though there might be some
> rough edges in the 3.2 to 3.3 upgrade (3.3 to 3.4 is much better).

It's not a technical reason. Due to some tiresome policy constraints in my organisation, I can't upgrade, but I'll try to get permission to upgrade.

Comment 7 Oved Ourfali 2014-10-11 06:15:10 UTC
It does work well for 3.3 and above. Unfortunately, we don't have these statistics for 3.2. Please do upgrade when possible, preferably to 3.4. Closing this one.