Description of problem
======================

When delivery of an email notification fails, tendrl-notifier doesn't report the problem anywhere, so it's not possible to find out that something went wrong (even though the notifier has an opportunity to notice and report the problem).

Version-Release
===============

tendrl-notifier-1.5.4-2.el7rhgs.noarch

All RHGSWA components:

# rpm -qa | grep tendrl | sort
tendrl-ansible-1.5.4-1.el7rhgs.noarch
tendrl-api-1.5.4-2.el7rhgs.noarch
tendrl-api-httpd-1.5.4-2.el7rhgs.noarch
tendrl-commons-1.5.4-2.el7rhgs.noarch
tendrl-grafana-plugins-1.5.4-3.el7rhgs.noarch
tendrl-grafana-selinux-1.5.3-2.el7rhgs.noarch
tendrl-monitoring-integration-1.5.4-3.el7rhgs.noarch
tendrl-node-agent-1.5.4-2.el7rhgs.noarch
tendrl-notifier-1.5.4-2.el7rhgs.noarch
tendrl-selinux-1.5.3-2.el7rhgs.noarch
tendrl-ui-1.5.4-2.el7rhgs.noarch

How reproducible
================

100 %

Steps to Reproduce
==================

1. Install RHGSWA using tendrl-ansible.
2. Import any gluster trusted storage pool with a volume.
3. Enable email alerting (following the documentation) and configure tendrl notifier to send emails via an smtp server you operate (see step 5 to understand why).
4. Perform any action that tendrl alerting will send an event about, e.g. stop a volume or shut down a storage machine, and check that you have received the email messages (to validate the setup).
5. Stop the smtp server tendrl notifier is talking to.
6. Perform any action that tendrl alerting will send an event about, e.g. stop a volume or shut down a storage machine.
7. See logs of tendrl notifier.

Note: one can use the qe playbook for step 3, see:

* https://github.com/usmqe/usmqe-setup/blob/e7d174ab4c970ee535954a9efdc2e4db245f18cf/test_setup.smtp.yml (version I was using when I reported this BZ)
* https://github.com/usmqe/usmqe-setup/blob/master/test_setup.smtp.yml (latest version)

Actual results
==============

Email notifications couldn't be sent, because the smtp server tendrl uses is not running:

```
[root@usm1-client ~]# systemctl status postfix
● postfix.service - Postfix Mail Transport Agent
   Loaded: loaded (/usr/lib/systemd/system/postfix.service; enabled; vendor preset: disabled)
   Active: inactive (dead) since Wed 2017-11-15 08:40:32 EST; 13min ago
  Process: 25826 ExecStop=/usr/sbin/postfix stop (code=exited, status=0/SUCCESS)
  Process: 25439 ExecStart=/usr/sbin/postfix start (code=exited, status=0/SUCCESS)
  Process: 25436 ExecStartPre=/usr/libexec/postfix/chroot-update (code=exited, status=0/SUCCESS)
  Process: 25433 ExecStartPre=/usr/libexec/postfix/aliasesdb (code=exited, status=0/SUCCESS)
 Main PID: 25511 (code=killed, signal=TERM)
```

Trying to connect to the smtp server from the tendrl server machine:

```
[root@usm1-server ~]# grep email_smtp_server /etc/tendrl/notifier/email.conf.yaml
email_smtp_server: usm1-client.example.com
[root@usm1-server ~]# telnet usm1-client.example.com 25
Trying 10.37.169.25...
telnet: connect to address 10.37.169.25: Connection refused
```

This is expected, because it was me who stopped the smtp server. I list this example to demonstrate the particular failure I tested with.

But there is no error reported about this in tendrl-notifier logs:

```
[root@usm1-server ~]# journalctl -u tendrl-notifier -fe
```

The last log line I see contains no info about the problem:

```
Nov 15 03:11:46 mbukatov-usm1-server.usmqe.lab.eng.brq.redhat.com tendrl-notifier[21602]: ramsRegistering atom namespace.tendrl.objects.Cluster.atoms.ValidImportClusterParamsFinding flows in namespace.tendrl.objects.Cluster.flowsRegistering object namespace.tendrl.objects.AlertFinding atoms in namespace.tendrl.objects.Alert.atomsFinding flows in namespace.tendrl.objects.Alert.flowsRegistering object namespace.tendrl.objects.ClusterAlertFinding atoms in namespace.tendrl.objects.ClusterAlert.atomsFinding flows in namespace.tendrl.objects.ClusterAlert.flowsRegistering object namespace.tendrl.objects.ClusterAlertCountersFinding atoms in namespace.tendrl.objects.ClusterAlertCounters.atomsFinding flows in namespace.tendrl.objects.ClusterAlertCounters.flowsRegistering object namespace.tendrl.objects.ClusterNodeContextFinding atoms in namespace.tendrl.objects.ClusterNodeContext.atomsFinding flows in namespace.tendrl.objects.ClusterNodeContext.flowsRegistering object namespace.tendrl.objects.ClusterTendrlContextFinding atoms in namespace.tendrl.objects.ClusterTendrlContext.atomsFinding flows in namespace.tendrl.objects.ClusterTendrlContext.flowsRegistering object namespace.tendrl.objects.CpuFinding atoms in namespace.tendrl.objects.Cpu.atomsFinding flows in namespace.tendrl.objects.Cpu.flowsRegistering object namespace.tendrl.objects.DefinitionFinding atoms in namespace.tendrl.objects.Definition.atomsFinding flows in namespace.tendrl.objects.Definition.flowsRegistering object namespace.tendrl.objects.DetectedClusterFinding atoms in namespace.tendrl.objects.DetectedCluster.atomsFinding flows in namespace.tendrl.objects.DetectedCluster.flowsRegistering object namespace.tendrl.objects.DiskFinding atoms in namespace.tendrl.objects.Disk.atomsFinding flows in namespace.tendrl.objects.Disk.flowsRegistering object namespace.tendrl.objects.JobFinding atoms in namespace.tendrl.objects.Job.atomsFinding flows in namespace.tendrl.objects.Job.flowsRegistering object namespace.tendrl.objects.MemoryFinding atoms in namespace.tendrl.objects.Memory.atomsFinding flows in namespace.tendrl.objects.Memory.flowsRegistering object na
```

Nor do I see an error about this reported anywhere else.

Expected results
================

Tendrl logs the problem with enough details, so that the admin will know what has happened. At least tendrl-notifier logs should provide the error:

```
[root@usm1-server ~]# journalctl -u tendrl-notifier -fe
```

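
For illustration only, here is a minimal Python sketch of the kind of error handling that would make a delivery failure visible in the logs. It is not taken from the tendrl-notifier source: the function name `send_email_notification`, the logger name `notifier_sketch` and the wording of the log message are all hypothetical, and it assumes plain unauthenticated SMTP as in the test setup above.

```python
import logging
import smtplib
from email.message import EmailMessage

# Hypothetical logger name; the real notifier would use its own logging setup.
logger = logging.getLogger("notifier_sketch")


def send_email_notification(smtp_host, smtp_port, sender, recipient,
                            subject, body, timeout=10):
    """Try to deliver one message; log the error and return False on failure."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    try:
        # smtplib.SMTP connects in the constructor, so a stopped server
        # ("Connection refused") raises OSError right here.
        with smtplib.SMTP(smtp_host, smtp_port, timeout=timeout) as smtp:
            smtp.send_message(msg)
    except (smtplib.SMTPException, OSError) as exc:
        # Report which server and recipient were involved and why it failed,
        # instead of silently dropping the notification.
        logger.error(
            "Failed to deliver email notification to %s via %s:%s: %r",
            recipient, smtp_host, smtp_port, exc)
        return False
    return True
```

With something along these lines, stopping postfix as in step 5 would leave an error entry naming the smtp server and the underlying connection error, which is the level of detail an admin needs.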
Additional Information
======================

Besides this particular reporting problem, the notifier seems to operate nominally. When I enable snmpv3 trap messages to be sent, it sends them successfully while ignoring that it was not possible to deliver the email ones.

Development Management has reviewed and declined this request. You may appeal this decision by reopening this request.