Bug 973534 - Notification service cannot recover from connectivity loss
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine-notification-service
Version: 3.3.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: 3.3.0
Assignee: Mooli Tayer
QA Contact: Ilanit Stein
URL:
Whiteboard: infra
Depends On:
Blocks: 1019461
Reported: 2013-06-12 07:48 UTC by Mooli Tayer
Modified: 2016-02-10 19:24 UTC

Fixed In Version: is2
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:
oVirt Team: Infra
Target Upstream Version:
Embargoed:



Description Mooli Tayer 2013-06-12 07:48:28 UTC
Description of problem:

If DB connectivity is lost and later restored, the notification service daemon does not reconnect unless it is restarted.

How reproducible:

Every time.

Steps to Reproduce:

1. First, to make reproduction quicker, edit the configuration file:
share/ovirt-engine/conf/notifier.conf.defaults
INTERVAL_IN_SECONDS=10
(Note: to run the notification service you will also need to configure the MAIL_SERVER property.)

2. Run the notification service:
share/ovirt-engine/services/ovirt-engine-notifier.py start

and watch the log:
var/log/ovirt-engine/notifier/notifier.log

3. Stop the DB:
sudo service postgresql stop
Wait for the first exception to appear in the log.

4. Start the DB again:
sudo service postgresql start

Actual results:

The notification service does not regain connectivity unless it is restarted, and a NullPointerException is written to the log on every iteration:

Failed to run the service: [null]
java.lang.NullPointerException
        at org.ovirt.engine.core.tools.common.db.StandaloneDataSource.checkConnection(StandaloneDataSource.java:112)
        at org.ovirt.engine.core.tools.common.db.StandaloneDataSource.getConnection(StandaloneDataSource.java:130)
        at org.ovirt.engine.core.notifier.NotificationService.processEvents(NotificationService.java:220)
        at org.ovirt.engine.core.notifier.NotificationService.run(NotificationService.java:103)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:722)

Expected results:

The notification service resumes normal operation once DB connectivity is restored, without needing a restart.
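
For illustration only, the expected behaviour could be sketched roughly as follows. This is not the actual StandaloneDataSource code or the fix that was shipped; ReconnectingHolder, opener and isValid are hypothetical names introduced here. The idea is that the connection check tolerates a missing (null) handle and retries opening on the next call instead of dereferencing it, which is where the NullPointerException in the trace above appears to come from.

```java
import java.util.function.Predicate;
import java.util.function.Supplier;

/**
 * Hypothetical sketch: lazily opens a resource (for example a DB connection)
 * and reopens it when it is missing or no longer valid.
 */
final class ReconnectingHolder<T> {
    private final Supplier<T> opener;    // assumed to throw while the backend is down
    private final Predicate<T> isValid;  // with JDBC this could wrap Connection.isValid(timeout)
    private T current;                   // stays null until an open attempt succeeds

    ReconnectingHolder(Supplier<T> opener, Predicate<T> isValid) {
        this.opener = opener;
        this.isValid = isValid;
    }

    synchronized T get() {
        // Null-safe check: a failed earlier attempt leaves current == null,
        // so it must not be dereferenced before being tested.
        if (current == null || !isValid.test(current)) {
            current = null;          // drop the stale handle first
            current = opener.get();  // propagates the failure if still down
        }
        return current;
    }
}
```

With something along these lines, each iteration of the service would call get(): while the DB is down the call fails and the iteration is skipped, and the first call after the DB comes back opens a fresh connection.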

Additional info:

Comment 1 Ilanit Stein 2013-07-17 07:44:34 UTC
Verified on is5.

Comment 2 Itamar Heim 2014-01-21 22:26:43 UTC
Closing - RHEV 3.3 Released

Comment 3 Itamar Heim 2014-01-21 22:30:00 UTC
Closing - RHEV 3.3 Released

