Bug 1394791

Summary: cronjob to update bugs.cloud.gluster.org gets killed
Product: [Community] GlusterFS
Reporter: Niels de Vos <ndevos>
Component: project-infrastructure
Assignee: bugs <bugs>
Status: CLOSED CURRENTRELEASE
QA Contact:
Severity: high
Docs Contact:
Priority: medium
Version: mainline
CC: bugs, gluster-infra, misc, nigelb
Target Milestone: ---
Keywords: Triaged
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Last Closed: 2016-11-15 12:17:31 UTC
Type: Bug

Description Niels de Vos 2016-11-14 13:14:10 UTC
Description of problem:
The dataset on http://bugs.cloud.gluster.org should be updated every night by a cronjob. Something started killing the cronjob a couple of days ago, and the data currently on the website is from 9 November.

I also get emails like this now:

Date: Mon Nov 14 04:00:56 CET 2016
From: "(Cron Daemon)" <ndevos.gluster.org>
To: ndevos.gluster.org
Subject: Cron <ndevos@bugs> /srv/bugs.cloud.gluster.org/html/run-report.sh

/srv/bugs.cloud.gluster.org/html/run-report.sh: line 17: 11750 Killed                  python gluster-bugs.py

Comment 1 Niels de Vos 2016-11-15 08:38:58 UTC
This also prevents me from getting the weekly "Bugs with incorrect status" email. Its content is also generated by a cronjob and sent to me for forwarding.

Comment 2 Nigel Babu 2016-11-15 11:11:26 UTC
Could you make run-report.sh use a PID file so that extra instances of the report exit immediately? It looks like there were plenty of instances of the report running on the machine, which is why it got killed.

Comment 3 Niels de Vos 2016-11-15 12:12:11 UTC
What do you mean by "PID file"? One of the scripts is run nightly, the other weekly. Both are expected to finish long before the next execution starts. Do you know why the scripts were not finishing? Maybe we can address that somehow.

Comment 4 Nigel Babu 2016-11-15 12:17:31 UTC
I mean, literally that.

Write the PID of your current script into a file at the start and clean up at the end. Before you write the PID file, check whether a stale one exists. If it does, check whether that process is still running; if it is, don't run a second time.

I don't know why the scripts were not finishing. The only useful thing to do is make sure it doesn't happen again.

The machine has been restarted and you should be able to run the cron jobs now.
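
For illustration, the PID-file check described in this comment could look roughly like the sketch below. It is written in Python only because the report itself is started as `python gluster-bugs.py`; the same logic could equally live in run-report.sh. The path `/tmp/run-report.pid` and the function names are hypothetical and do not come from the actual scripts.

```python
import os

# Hypothetical location; the thread does not say where a PID file would live.
PID_FILE = "/tmp/run-report.pid"


def pid_is_running(pid):
    """Return True if a process with this PID currently exists."""
    if pid <= 0:
        return False  # 0 and negative values address process groups, not a PID
    try:
        os.kill(pid, 0)  # signal 0 performs the existence check without signalling
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def acquire_pid_file(path=PID_FILE):
    """Write our PID to `path`; return False if a live instance already holds it."""
    try:
        with open(path) as f:
            old_pid = int(f.read().strip())
    except (FileNotFoundError, ValueError):
        old_pid = None  # no PID file, or unreadable contents: treat as stale
    if old_pid is not None and old_pid != os.getpid() and pid_is_running(old_pid):
        return False  # an earlier run is still alive, so do not start another
    with open(path, "w") as f:
        f.write(str(os.getpid()))
    return True


def release_pid_file(path=PID_FILE):
    """Remove the PID file at the end of the run."""
    try:
        os.remove(path)
    except FileNotFoundError:
        pass
```

A caller would exit early when `acquire_pid_file()` returns False and call `release_pid_file()` in a `finally` block once the report is done. If the process is killed outright, as in the cron email above, the cleanup never runs, which is why the stale-file check is needed. The check and the write are not atomic, and PIDs can be reused, so this is only a guard against the common case.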

Comment 5 Michael S. 2016-11-15 15:20:52 UTC
There is no swap on the machine. I guess this is likely an issue.
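
A quick way to confirm the missing swap on a Linux host is to read the `SwapTotal` line from `/proc/meminfo`. The sketch below is illustrative only, with hypothetical function names; the thread does not establish that lack of swap was what actually got the script killed.

```python
def swap_total_kb(meminfo_text):
    """Return the SwapTotal value in kB from /proc/meminfo-style text, or None."""
    for line in meminfo_text.splitlines():
        if line.startswith("SwapTotal:"):
            # The line has the form "SwapTotal:   <number> kB"
            return int(line.split()[1])
    return None


def read_swap_total_kb(path="/proc/meminfo"):
    """Read SwapTotal from the running system (Linux only)."""
    with open(path) as f:
        return swap_total_kb(f.read())
```

A result of 0 from `read_swap_total_kb()` means no swap is configured, so several overlapping report runs would have only physical memory to share.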