Bug 973534 - Notification service cannot recover from connectivity loss
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine-notification-service
Version: 3.3.0
Hardware: Unspecified Unspecified
Priority: unspecified Severity: unspecified
Target Milestone: ---
Target Release: 3.3.0
Assigned To: Mooli Tayer
QA Contact: Ilanit Stein
Whiteboard: infra
Keywords:
Depends On:
Blocks: 1019461
Reported: 2013-06-12 03:48 EDT by Mooli Tayer
Modified: 2016-02-10 14:24 EST

See Also:
Fixed In Version: is2
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Infra
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments: None
Description Mooli Tayer 2013-06-12 03:48:28 EDT
Description of problem:

If DB connectivity is lost at some point and later regained, the notification service daemon will not regain connectivity unless restarted.

How reproducible:

Every time.

Steps to Reproduce:

1.) First, to make reproduction quicker, edit the conf file:
share/ovirt-engine/conf/notifier.conf.defaults
INTERVAL_IN_SECONDS=10
(Note: to run the notification service you will also need to configure the MAIL_SERVER property.)
 
2.) Run the notification service:
share/ovirt-engine/services/ovirt-engine-notifier.py start

and watch the log:
var/log/ovirt-engine/notifier/notifier.log

3.) Disable DB:
sudo service postgresql stop
Wait for the first exception to appear in the log.

4.) Enable DB:
sudo service postgresql start

Actual results:

The notification service does not regain connectivity (unless restarted),
and a NullPointerException is written to the log once every iteration:

Failed to run the service: [null]
java.lang.NullPointerException
        at org.ovirt.engine.core.tools.common.db.StandaloneDataSource.checkConnection(StandaloneDataSource.java:112)
        at org.ovirt.engine.core.tools.common.db.StandaloneDataSource.getConnection(StandaloneDataSource.java:130)
        at org.ovirt.engine.core.notifier.NotificationService.processEvents(NotificationService.java:220)
        at org.ovirt.engine.core.notifier.NotificationService.run(NotificationService.java:103)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:722)

Expected results:

The notification service resumes normal operation once DB connectivity is restored.
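
For illustration only, here is a minimal sketch of the recovery behaviour expected above. It is not the StandaloneDataSource code and not the fix that was shipped; neither is shown in this report. It assumes the NPE comes from reusing a cached connection reference that became null or stale during the outage. The class ReconnectingSource and its opener and isUsable parameters are hypothetical names introduced for this example.

```java
import java.util.concurrent.Callable;
import java.util.function.Predicate;

/**
 * Hypothetical helper (not part of oVirt): caches one connection-like
 * resource and reopens it whenever it is missing or no longer usable.
 */
final class ReconnectingSource<C> {
    // Opens a fresh connection; expected to throw while the DB is down.
    private final Callable<C> opener;
    // Health check; for JDBC this could wrap Connection.isValid(timeoutSeconds).
    private final Predicate<C> isUsable;
    // Null until the first successful open, and again after a failed reopen.
    private C cached;

    ReconnectingSource(Callable<C> opener, Predicate<C> isUsable) {
        this.opener = opener;
        this.isUsable = isUsable;
    }

    /** Returns a usable connection, reopening it if needed. */
    synchronized C get() throws Exception {
        if (cached == null || !isUsable.test(cached)) {
            // Drop the stale handle first, so a failed reopen leaves null
            // behind and the next call retries instead of reusing it.
            cached = null;
            cached = opener.call();
        }
        return cached;
    }
}
```

With a helper like this, an iteration that runs while the DB is down still fails, but the first iteration after the DB comes back opens a new connection instead of failing again, so no restart is needed.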

Additional info:
Comment 1 Ilanit Stein 2013-07-17 03:44:34 EDT
Verified on is5.
Comment 2 Itamar Heim 2014-01-21 17:26:43 EST
Closing - RHEV 3.3 Released
Comment 3 Itamar Heim 2014-01-21 17:30:00 EST
Closing - RHEV 3.3 Released
