Bug 1317468 - [RFE]Email notification when the number of LVs in SD are reaching/more than 300
Summary: [RFE]Email notification when the number of LVs in SD are reaching/more than 300
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: RFEs
Version: 3.6.0
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ovirt-4.0.0-rc
Target Release: 4.0.0
Assignee: Idan Shaby
QA Contact: Natalie Gavrielov
URL:
Whiteboard:
Depends On: 1326578
Blocks: 1275182
Reported: 2016-03-14 10:25 UTC by Yaniv Lavi
Modified: 2016-08-15 01:52 UTC
CC List: 19 users

Fixed In Version: ovirt 4.0.0 alpha1
Doc Type: Enhancement
Doc Text:
Previously, when the number of logical volumes in a storage domain reached the recommended maximum, this was logged and a message was shown in the events pane. Now, a user can register with the event notifier and receive an email when the number of logical volumes in a storage domain reaches the recommended maximum.
Clone Of: 1275182
Environment:
Last Closed: 2016-08-01 12:24:30 UTC
oVirt Team: Storage
Embargoed:
rule-engine: ovirt-4.0.0+
acanan: testing_plan_complete+
ylavi: planning_ack+
rule-engine: devel_ack+
acanan: testing_ack+


Attachments
Screenshot from 2016-04-12 17:48:44.png (45.09 KB, image/png)
2016-04-12 14:51 UTC, Nikolai Sednev
sosreport from the engine (6.19 MB, application/x-xz)
2016-04-12 14:54 UTC, Nikolai Sednev


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 53583 0 None None None 2016-03-14 10:25:41 UTC
oVirt gerrit 58639 0 master MERGED engine: prevent notification flood for number of LVs 2016-06-06 08:11:43 UTC
oVirt gerrit 58652 0 None None None 2016-06-06 11:27:43 UTC

Description Yaniv Lavi 2016-03-14 10:25:42 UTC
We have the warning in the events tab:
"The number of LVs on the domain XXX exceeded 300, you are approaching the limit where performance may degrade."

But many users do not access the admin portal frequently, so they are not notified about this.
We would like to add an email notification for this as well.

Comment 1 Nikolai Sednev 2016-04-12 14:50:15 UTC
Failed to set the "Users"->"Manage Events"->"Storage Management Events:" -> "Storage Domain's number of LVs exceeded threshold", got:
"
Operation Canceled
Error while executing action: A Request to the Server failed with the following Status Code: 500"

Sosreport from host has been attached.

Comment 2 Nikolai Sednev 2016-04-12 14:51:19 UTC
Created attachment 1146504 [details]
Screenshot from 2016-04-12 17:48:44.png

Comment 3 Nikolai Sednev 2016-04-12 14:54:37 UTC
Created attachment 1146505 [details]
sosreport from the engine

Comment 4 Nikolai Sednev 2016-04-12 14:58:12 UTC
Engine's components:
CentOS Linux release 7.2.1511 (Core) 
Linux 3.10.0-327.13.1.el7.x86_64 #1 SMP Thu Mar 31 16:04:38 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
ovirt-engine-4.0.0-0.0.master.20160404161620.git4ffd5a4.el7.centos.noarch
ovirt-host-deploy-1.5.0-0.0.master.20160329063025.gitdf57fe1.el7.centos.noarch
ovirt-host-deploy-java-1.5.0-0.0.master.20160329063025.gitdf57fe1.el7.centos.noarch
javassist-3.16.1-10.el7.noarch
javapackages-tools-3.4.1-11.el7.noarch
java-1.7.0-openjdk-devel-1.7.0.99-2.6.5.0.el7_2.x86_64
java-1.8.0-openjdk-1.8.0.77-0.b03.el7_2.x86_64
java-1.7.0-openjdk-headless-1.7.0.99-2.6.5.0.el7_2.x86_64
java-1.8.0-openjdk-headless-1.8.0.77-0.b03.el7_2.x86_64
javamail-1.4.6-8.el7.noarch
java-1.7.0-openjdk-1.7.0.99-2.6.5.0.el7_2.x86_64

These lines from server.log might also be related to the issue:

Caused by: java.lang.NoClassDefFoundError: javax/mail/internet/AddressException
        at java.lang.Class.forName0(Native Method) [rt.jar:1.8.0_77]
        at java.lang.Class.forName(Class.java:264) [rt.jar:1.8.0_77]
        at org.ovirt.engine.core.bll.CommandsFactory.loadClass(CommandsFactory.java:193) [bll.jar:]
        at org.ovirt.engine.core.bll.CommandsFactory.getCommandClass(CommandsFactory.java:178) [bll.jar:]
        at org.ovirt.engine.core.bll.CommandsFactory.getCommandClass(CommandsFactory.java:161) [bll.jar:]
        at org.ovirt.engine.core.bll.CommandsFactory.createCommand(CommandsFactory.java:86) [bll.jar:]
        at org.ovirt.engine.core.bll.CommandsFactory.createCommand(CommandsFactory.java:79) [bll.jar:]
        at org.ovirt.engine.core.bll.MultipleActionsRunner.initCommandsAndReturnValues(MultipleActionsRunner.java:80) [bll.jar:]
        at org.ovirt.engine.core.bll.MultipleActionsRunner.execute(MultipleActionsRunner.java:61) [bll.jar:]
        at org.ovirt.engine.core.bll.Backend.runMultipleActionsImpl(Backend.java:613) [bll.jar:]
        at org.ovirt.engine.core.bll.Backend.runMultipleActions(Backend.java:583) [bll.jar:]
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) [rt.jar:1.8.0_77]
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) [rt.jar:1.8.0_77]
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.8.0_77]
        at java.lang.reflect.Method.invoke(Method.java:498) [rt.jar:1.8.0_77]
        at org.jboss.as.ee.component.ManagedReferenceMethodInterceptor.processInvocation(ManagedReferenceMethodInterceptor.java:52)
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:340)
        at org.jboss.invocation.InterceptorContext$Invocation.proceed(InterceptorContext.java:437)
        at org.jboss.as.weld.ejb.Jsr299BindingsInterceptor.delegateInterception(Jsr299BindingsInterceptor.java:70) [wildfly-weld-10.0.0.Final.jar:10.0.0.Final]
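
The trace above ends in a NoClassDefFoundError for javax/mail/internet/AddressException, meaning the class could not be resolved when the command class was loaded. As a rough illustration only, the sketch below shows how one could probe whether a class is visible to a given class loader. ClassAvailability and isLoadable are hypothetical names, not part of ovirt-engine, and the result only reflects the class loader the helper itself runs under, which may differ from the engine's module class loader.

```java
// Hypothetical diagnostic helper; not part of ovirt-engine.
final class ClassAvailability {

    /** Returns true if the named class can be located by this helper's class loader. */
    static boolean isLoadable(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers.
            Class.forName(className, false, ClassAvailability.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // NoClassDefFoundError is a LinkageError, so it is covered here as well.
            return false;
        }
    }

    public static void main(String[] args) {
        String name = args.length > 0 ? args[0] : "javax.mail.internet.AddressException";
        System.out.println(name + " loadable: " + isLoadable(name));
    }
}
```

Run standalone, this reports on the JVM's own class path, so it is only a first check and does not by itself say anything about the application server's module configuration.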

Comment 5 Aharon Canan 2016-04-12 15:03:45 UTC
Did you check other events? Do all of them fail, or only the LVs event?

If all of them fail, then we need to open a bug for it and block this one on the new bug;

otherwise we need to reopen the RFE, as the basic flow is not working.

Comment 6 Nikolai Sednev 2016-04-13 06:49:21 UTC
(In reply to Aharon Canan from comment #5)
> Did you check other events? Do all of them fail, or only the LVs event?
> 
> If all of them fail, then we need to open a bug for it and block this one
> on the new bug;
> 
> otherwise we need to reopen the RFE, as the basic flow is not working.

It fails each time you select all or any of the selection boxes. It is not related to this feature specifically; it looks like a frontend Java bug.
I think a separate bug is the best option for this, and this RFE will have to be set as dependent on the newly opened bug.

Comment 7 Nikolai Sednev 2016-04-13 07:09:37 UTC
Opened a separate bug and blocking this one with it:
https://bugzilla.redhat.com/show_bug.cgi?id=1326578

Comment 8 Natalie Gavrielov 2016-05-24 13:54:04 UTC
Yaniv,

After the number of LVs in a storage domain reaches 300 and I get an email notification, should I expect to get the same notification for 301, 302, 303, etc. (over and over again)?

Thanks

Comment 9 Nikolai Sednev 2016-05-25 06:46:18 UTC
Please bear in mind that if a customer creates each of his disks (LVs) separately inside a specially created SD ((one LV per SD) * 300), then he'll hit bug 1339245.

Comment 10 Allon Mureinik 2016-05-25 07:07:15 UTC
(In reply to Nikolai Sednev from comment #9)
> Please bear in mind that if a customer creates each of his disks (LVs)
> separately inside a specially created SD ((one LV per SD) * 300), then
> he'll hit bug 1339245.
No, he won't.
This alert is for LVs PER STORAGE DOMAIN.

Comment 11 Idan Shaby 2016-05-26 06:11:36 UTC
(In reply to Natalie Gavrielov from comment #8)
> Yaniv,
> 
> After the number of LVs in a storage domain reaches 300 and I get an email
> notification, should I expect to get the same notification for 301, 302,
> 303, etc. (over and over again)?
> 
> Thanks

Yes, you should get it each time you create a disk and the total number of disks on the domain is >= AlertOnNumberOfLVs in the db (300 by default).

Comment 12 Natalie Gavrielov 2016-05-26 17:04:57 UTC
(In reply to Idan Shaby from comment #11)
> (In reply to Natalie Gavrielov from comment #8)
> > Yaniv,
> > 
> > After the number of LVs in a storage domain reaches 300 and I get an email
> > notification, should I expect to get the same notification for 301, 302,
> > 303, etc. (over and over again)?
> > 
> > Thanks
> 
> Yes, you should get it each time you create a disk and the total number of
> disks on the domain is >= AlertOnNumberOfLVs in the db (300 by default).

Let's say a user runs a script that creates 500 disks (LVs); he'll get 200 emails.
This is kind of spamming the user, isn't it?

Comment 13 Idan Shaby 2016-05-29 05:16:41 UTC
He'll get 201 emails. It's not a bug, it's by design.
But let the PMs/managers decide.
Yaniv/Allon?
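
To make the 200 versus 201 count concrete, here is a minimal sketch of the check described in comment 11. It assumes the storage domain starts empty, that each disk is one LV, and that the check runs after every disk creation. LvAlertCount and its methods are hypothetical names used for illustration and are not taken from the engine code.

```java
// Hypothetical sketch; the names below are illustrative, not ovirt-engine code.
final class LvAlertCount {

    /** Default value of the AlertOnNumberOfLVs setting quoted in comment 11. */
    static final int DEFAULT_THRESHOLD = 300;

    /** Alert when the LV count is at or above the threshold. */
    static boolean shouldAlert(int lvCount, int threshold) {
        return lvCount >= threshold;
    }

    /** Counts alerts raised while creating disks one by one on an initially empty domain. */
    static int alertsWhileCreating(int disksToCreate, int threshold) {
        int alerts = 0;
        for (int lvCount = 1; lvCount <= disksToCreate; lvCount++) {
            if (shouldAlert(lvCount, threshold)) {
                alerts++;
            }
        }
        return alerts;
    }

    public static void main(String[] args) {
        // Counts 300 through 500 inclusive all trigger the alert.
        System.out.println(alertsWhileCreating(500, DEFAULT_THRESHOLD)); // prints 201
    }
}
```

Because the comparison is >= rather than >, the 300th disk already triggers an alert, which is where the extra email comes from.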

Comment 14 Allon Mureinik 2016-05-29 11:52:21 UTC
Won't the flood suppression mechanism prevent you from getting too many emails at once?

Comment 15 Idan Shaby 2016-05-30 07:32:49 UTC
I am not sure.
Moti, can you give us your two cents about the email notification mechanism?

Comment 16 Moti Asayag 2016-05-30 08:19:11 UTC
(In reply to Allon Mureinik from comment #14)
> Won't the flood suppression mechanism prevent you from getting too many
> emails at once?

Yes, but the definition of this enum value in AuditLogType doesn't contain any flood info, so the events keep coming.

NUMBER_OF_LVS_ON_STORAGE_DOMAIN_EXCEEDED_THRESHOLD(1008, AuditLogSeverity.WARNING),

You should decide on a reasonable threshold and set it for this enum (see AuditLogTimeInterval).

To avoid losing events for other storage domains, you should use AuditLog.setCustomEventId() and set the storage domain id as the event identifier, so that flood prevention acts per storage domain.

(In reply to Idan Shaby from comment #15)
> I am not sure.
> Moti, can you give us your two cents about the email notification mechanism?

Aggregating notifications to the user by the event-type makes sense. But this specific bug can be fixed without changing the infra for that.
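
The per-storage-domain flood prevention suggested above can be pictured with a small stand-alone sketch. This is not the engine's implementation and does not use AuditLogType, AuditLogTimeInterval or AuditLog.setCustomEventId(); FloodGuard and shouldEmit are hypothetical names, and the one-hour interval in the usage note is an arbitrary assumption. The idea shown is only that repeats are suppressed per (event type, storage domain id) pair within a time window, so one noisy domain does not hide events from another.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of per-storage-domain flood prevention; not the engine's implementation.
final class FloodGuard {

    private final long intervalMillis;

    // Last time an event was let through, keyed by "eventType:storageDomainId".
    private final Map<String, Long> lastEmitted = new HashMap<>();

    FloodGuard(long intervalMillis) {
        this.intervalMillis = intervalMillis;
    }

    /** Returns true if the event should be emitted, false if it is suppressed as a repeat. */
    synchronized boolean shouldEmit(String eventType, String storageDomainId, long nowMillis) {
        String key = eventType + ":" + storageDomainId;
        Long previous = lastEmitted.get(key);
        if (previous != null && nowMillis - previous < intervalMillis) {
            return false; // Same event for the same domain, still inside the quiet window.
        }
        lastEmitted.put(key, nowMillis);
        return true;
    }
}
```

With `new FloodGuard(3_600_000L)`, a second alert for the same domain one second after the first is suppressed, while an alert for a different domain at that same moment still goes through.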

Comment 17 Natalie Gavrielov 2016-06-01 14:59:36 UTC
Allon/Idan,

Is there an open issue on flooding with emails?

Comment 18 Idan Shaby 2016-06-02 05:54:12 UTC
Not that I know of.
Allon, do we want to pull this BZ back from QE and send a patch to prevent a mail flood?
From what Moti explained here, it should be pretty simple.

Comment 19 Allon Mureinik 2016-06-02 09:06:03 UTC
If it's under an hour's work, just do it.
If not, let's leave the decision to PM.

Comment 20 Yaniv Lavi 2016-06-02 13:56:01 UTC
(In reply to Idan Shaby from comment #18)
> Not that I know of.
> Allon, do we want to pull this BZ back from QE and send a patch to prevent a
> mail flood?
> From what Moti explained here, it should be pretty simple.

We should not flood the user with email. Once every few hours should be good.

Comment 21 Red Hat Bugzilla Rules Engine 2016-06-05 05:14:38 UTC
Target release should be placed once a package build is known to fix an issue. Since this bug is not modified, the target version has been reset. Please use target milestone to plan a fix for an oVirt release.

Comment 22 Allon Mureinik 2016-06-07 08:03:28 UTC
Moving target release to RC to include the latest fix.

Comment 23 Natalie Gavrielov 2016-07-05 17:36:30 UTC
Verified, rhevm-4.0.0.5-0.1.el7ev.noarch
One issue found (in one of the earlier builds):
https://bugzilla.redhat.com/show_bug.cgi?id=1340209

