Bug 1857722

Summary: [scale] ovirt_root partition becomes full due to "artifacts" folder
Product: [oVirt] ovirt-engine Reporter: David Vaanunu <dvaanunu>
Component: General Assignee: Martin Necas <mnecas>
Status: CLOSED CURRENTRELEASE QA Contact: David Vaanunu <dvaanunu>
Severity: high Docs Contact:
Priority: high    
Version: 4.4.1.8 CC: bugs, michal.skrivanek, mnecas, mperina
Target Milestone: ovirt-4.4.3 Keywords: Performance
Target Release: --- Flags: pm-rhel: ovirt-4.4+
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: ansible-runner-service-1.0.6, ovirt-engine-4.4.3.5 Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2020-11-11 06:41:28 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Infra RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description David Vaanunu 2020-07-16 12:20:07 UTC
Description of problem:

The engine "/" partition reaches 100% usage.
"/usr/share/ovirt-engine/ansible-runner-service-project/artifacts" folder size is 8.2GB.

Scale team env:
~500 VMs
~530 Hosts
~80 SDs
10 DCs & Cluster


Version-Release number of selected component (if applicable):
rhv 4.4.1-11

How reproducible:


Steps to Reproduce:
1. Create many files in "/usr/share/ovirt-engine/ansible-runner-service-project/artifacts"
2. run: "df -h" --> verify "/" disk usage increased
3.
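
The check in step 2 can also be scripted. The following is only a minimal sketch: the function names are hypothetical and are not part of ovirt-engine or ansible-runner-service. It sums the size of a folder and reports how full the containing filesystem is. The percentage is computed as used / total, so it can differ slightly from the "Use%" column printed by "df -h".

```python
import os
import shutil


def directory_size_bytes(path):
    """Sum the sizes of all regular files below path (symlinks are skipped)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            if os.path.islink(file_path):
                continue
            try:
                total += os.path.getsize(file_path)
            except OSError:
                # The file may have been removed while we were walking the tree.
                continue
    return total


def usage_percent(path):
    """Return the used percentage of the filesystem that contains path."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total
```

For example, directory_size_bytes("/usr/share/ovirt-engine/ansible-runner-service-project/artifacts") would give the folder size in bytes, and usage_percent("/") the fill level of the root filesystem.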

Actual results:

Many files exist, increasing disk usage

Expected results:

Files should be deleted periodically (based on age or % disk usage)

Additional info:

Comment 1 Martin Necas 2020-07-20 14:53:27 UTC
Currently, we have a config file (/etc/ansible-runner-service/config.yaml) where users can specify the removal frequency and the age of the files; see the example [1].
The default is once per 30 days; maybe we should make it more frequent.
What do you think, Martin P.?

[1] https://github.com/ansible/ansible-runner-service/blob/master/config.yaml#L19-L24
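
To illustrate what an age-based removal of this kind does, here is a minimal sketch. It is not the ansible-runner-service implementation: the function name and its parameters (artifacts_dir, max_age_days, now) are hypothetical, and treating a value of 0 or less as "cleanup disabled" is an assumption of the sketch.

```python
import os
import shutil
import time


def remove_old_artifacts(artifacts_dir, max_age_days, now=None):
    """Delete sub-directories whose mtime is older than max_age_days.

    Returns the sorted names of the removed directories. A max_age_days
    of 0 or less disables removal (an assumption of this sketch).
    """
    if max_age_days <= 0:
        return []
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day

    # Collect the entries first so the directory is not modified mid-scan.
    with os.scandir(artifacts_dir) as it:
        entries = list(it)

    removed = []
    for entry in entries:
        if not entry.is_dir(follow_symlinks=False):
            continue
        if entry.stat(follow_symlinks=False).st_mtime < cutoff:
            shutil.rmtree(entry.path, ignore_errors=True)
            removed.append(entry.name)
    return sorted(removed)
```

With a 30-day age, a call such as remove_old_artifacts("/usr/share/ovirt-engine/ansible-runner-service-project/artifacts", 30) would delete only run directories last modified more than 30 days ago.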

Comment 2 Martin Perina 2020-07-20 15:05:45 UTC
(In reply to Martin Necas from comment #1)
> Currently, we have a config file (/etc/ansible-runner-service/config.yaml)
> where users can specify the removal frequency and the age of the files; see
> the example [1].
> The default is once per 30 days; maybe we should make it more frequent.
> What do you think, Martin P.?
> 
> [1]
> https://github.com/ansible/ansible-runner-service/blob/master/config.yaml#L19-L24

Let's decrease it to 7 days by default

Comment 4 David Vaanunu 2020-09-06 09:42:51 UTC
Tested on rhv 4.4.2-4

The folder is not cleared. It still contains old files (more than a month old).


The environment was upgraded to rhv-4.4.2-4 on Aug 23 2020.

Apache was restarted ~10 days ago.
● httpd.service - The Apache HTTP Server
   Loaded: loaded (/usr/lib/systemd/system/httpd.service; enabled; vendor preset: disabled)
   Active: active (running) since Wed 2020-08-26 10:01:44 EDT; 1 weeks 3 days ago


[root@rhev-red-01 artifacts]# date
Sun Sep  6 05:38:53 EDT 2020
[root@rhev-red-01 artifacts]# pwd
/usr/share/ovirt-engine/ansible-runner-service-project/artifacts
[root@rhev-red-01 artifacts]# ls -ltr | wc -l
8673
[root@rhev-red-01 artifacts]# ls -ltr | head
total 0
drwxr-xr-x. 5 ovirt ovirt 106 Jul 28 15:30 621b29e8-d108-11ea-8198-02e6ef0f2614
drwxr-xr-x. 5 ovirt ovirt 106 Jul 28 15:40 25fff78e-d10a-11ea-b274-02e6ef0f2614
drwxr-xr-x. 5 ovirt ovirt 106 Jul 28 16:02 477aa398-d10d-11ea-8865-02e6ef0f2614
drwxr-xr-x. 5 ovirt ovirt 106 Jul 29 13:13 c7e8e768-d1be-11ea-8865-02e6ef0f2614
drwxr-xr-x. 5 ovirt ovirt 106 Jul 29 13:17 cd7f1300-d1be-11ea-8865-02e6ef0f2614
drwxr-xr-x. 5 ovirt ovirt 106 Jul 29 13:25 88132f66-d1c0-11ea-8cbe-02e6ef0f2614
drwxr-xr-x. 5 ovirt ovirt 106 Jul 29 13:28 8d811f8a-d1c0-11ea-8865-02e6ef0f2614
drwxr-xr-x. 5 ovirt ovirt 106 Jul 29 13:37 9dc0626a-d1c1-11ea-8cbe-02e6ef0f2614
drwxr-xr-x. 5 ovirt ovirt 106 Jul 29 13:37 9ef506d6-d1c1-11ea-8cbe-02e6ef0f2614

Folder Size:

3.8G	.
[root@rhev-red-01 artifacts]#

Comment 5 Martin Necas 2020-09-09 17:45:43 UTC
Found the issue: the removal of artifacts was not being called in wsgi.
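
As a generic illustration of this class of problem: a periodic job only runs if the entry point that the server actually loads starts it. The sketch below shows one way a background task could be started from module-level code. It is not the actual fix; start_periodic_task and its parameters are hypothetical names.

```python
import logging
import threading


def start_periodic_task(task, interval_seconds):
    """Run task on a daemon thread, once immediately and then every interval.

    Returns a threading.Event; setting it stops the loop after the
    current wait finishes.
    """
    stop = threading.Event()

    def loop():
        while True:
            try:
                task()
            except Exception:
                # Keep the loop alive even if one run fails.
                logging.exception("periodic task failed")
            # wait() returns True as soon as stop is set.
            if stop.wait(interval_seconds):
                break

    thread = threading.Thread(target=loop, daemon=True)
    thread.start()
    return stop
```

If a call like this lives only in a standalone startup path, a deployment that imports the application through a different entry point never starts the thread, which matches the symptom of artifacts never being removed.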

Comment 6 David Vaanunu 2020-10-07 06:51:22 UTC
verified version:

redhat-release-8.3-1.0.
rhv-release-4.4.3-7-001
ovirt-engine-4.4.3.5-0.5
ansible-runner-service-1.0.6-2


Engine: files were deleted from /usr/share/ovirt-engine/ansible-runner-service-project/artifacts
run: "df -h" --> "/" partition usage decreased

Comment 7 Sandro Bonazzola 2020-11-11 06:41:28 UTC
This bugzilla is included in oVirt 4.4.3 release, published on November 10th 2020.

Since the problem described in this bug report should be resolved in oVirt 4.4.3 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.