Bug 1541412 - Ansible deployment should clean up files in /var once finished
Summary: Ansible deployment should clean up files in /var once finished
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-hosted-engine-setup
Classification: oVirt
Component: General
Version: ---
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: urgent
Target Milestone: ovirt-4.2.2
Target Release: 2.2.10
Assignee: Simone Tiraboschi
QA Contact: Nikolai Sednev
URL:
Whiteboard:
Depends On:
Blocks: 1458709
 
Reported: 2018-02-02 14:05 UTC by Yihui Zhao
Modified: 2018-03-29 11:09 UTC
CC List: 12 users

Fixed In Version: ovirt-hosted-engine-setup-2.2.10
Clone Of:
Environment:
Last Closed: 2018-03-29 11:09:35 UTC
oVirt Team: Integration
Embargoed:
rule-engine: ovirt-4.2+
rule-engine: exception+
sbonazzo: devel_ack+
rule-engine: testing_ack+


Attachments
issue (225.96 KB, image/png), 2018-02-02 14:05 UTC, Yihui Zhao
/var/log/* (479.60 KB, application/x-bzip), 2018-02-02 14:17 UTC, Yihui Zhao


Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 1526752 0 unspecified CLOSED HE setup: Don't use /var/tmp/localvm but a temp dir. 2021-02-22 00:41:40 UTC
oVirt gerrit 87094 0 master MERGED ansible: use block/rescue to clean up the local VM 2018-09-03 09:35:49 UTC
oVirt gerrit 87285 0 ovirt-hosted-engine-setup-2.2 MERGED ansible: use block/rescue to clean up the local VM 2018-02-07 17:57:13 UTC

Internal Links: 1526752
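
The two oVirt gerrit patches above address this by wrapping the local-VM provisioning tasks in an Ansible block/rescue, so the temporary directory under /var/tmp is removed even if a task in the block fails. A minimal sketch of the pattern, assuming placeholder paths and variable names (LOCAL_VM_DIR, APPLIANCE_OVA) rather than the exact code from the merged changes:

# Sketch of the block/rescue cleanup pattern; the task names match the ones
# seen in the deployment logs, but paths and variables are placeholders.
- hosts: localhost
  connection: local
  vars:
    LOCAL_VM_DIR: /var/tmp/localvm_example    # the real setup uses a randomized directory name
    APPLIANCE_OVA: /path/to/rhvm-appliance.ova
  tasks:
    - block:
        - name: Create local vm dir
          file:
            path: "{{ LOCAL_VM_DIR }}"
            state: directory
            owner: vdsm
            group: kvm
            mode: "0775"
        - name: Extract appliance to local vm dir
          unarchive:
            src: "{{ APPLIANCE_OVA }}"
            dest: "{{ LOCAL_VM_DIR }}"
            remote_src: yes
      rescue:
        # Runs only when a task in the block above fails: remove the
        # partially populated directory instead of leaving it behind.
        - name: Remove local vm dir
          file:
            path: "{{ LOCAL_VM_DIR }}"
            state: absent
        - name: Notify the user about a failure
          fail:
            msg: >-
              The system may not be provisioned according to the playbook
              results: please check the logs for the issue, fix accordingly
              or re-deploy from scratch.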

Description Yihui Zhao 2018-02-02 14:05:39 UTC
Created attachment 1390166 [details]
issue

Description of problem: 
There is no space left on /var for the HE ansible deployment after running the setup about four times.

From the CLI:
"""
[ INFO  ] TASK [Extract appliance to local vm dir]
[ ERROR ] fatal: [localhost]: FAILED! => {"changed": false, "dest": "/var/tmp/localvmiMKbq4", "extract_results": {"cmd": ["/usr/bin/gtar", "--extract", "-C", "/var/tmp/localvmiMKbq4", "-z", "--show-transformed-names", "--sparse", "-f", "/root/.ansible/tmp/ansible-tmp-1517574118.41-79777751702578/source"], "err": "/usr/bin/gtar: images/d73f231b-d1ec-43bd-bf4f-622cd3d4f1f5/55387735-bcd9-4eb6-89dc-d5a576b151ed: Wrote only 512 of 10240 bytes\n/usr/bin/gtar: Exiting with failure status due to previous errors\n", "out": "", "rc": 2}, "gid": 36, "group": "kvm", "handler": "TgzArchive", "mode": "0775", "msg": "failed to unpack /root/.ansible/tmp/ansible-tmp-1517574118.41-79777751702578/source to /var/tmp/localvmiMKbq4", "owner": "vdsm", "secontext": "unconfined_u:object_r:user_tmp_t:s0", "size": 4096, "src": "/root/.ansible/tmp/ansible-tmp-1517574118.41-79777751702578/source", "state": "directory", "uid": 36}
[ ERROR ] Failed to execute stage 'Closing up': Failed executing ansible-playbook
[ INFO  ] Stage: Clean up
[ INFO  ] Cleaning temporary resources
[ INFO  ] TASK [Gathering Facts]
[ INFO  ] ok: [localhost]
[ INFO  ] TASK [Remove local vm dir]
[ INFO  ] ok: [localhost]
[ INFO  ] Generating answer file '/var/lib/ovirt-hosted-engine-setup/answers/answers-20180202072316.conf'
[ ERROR ] Failed to execute stage 'Clean up': [Errno 28] No space left on device
[ INFO  ] Stage: Pre-termination
[ INFO  ] Stage: Termination
[ ERROR ] Hosted Engine deployment failed: please check the logs for the issue, fix accordingly or re-deploy from scratch.
          Log file is located at /var/log/ovirt-hosted-engine-setup/ovirt-hosted-engine-setup-20180202072040-rmwhj6.log
"""

And /var is 100% full:

[root@dell-per515-02 ~]# df -h
Filesystem                                                         Size  Used Avail Use% Mounted on
/dev/mapper/rhvh_bootp--73--75--130-rhvh--4.2.1.2--0.20180201.0+1  2.1T  4.6G  2.0T   1% /
devtmpfs                                                            16G     0   16G   0% /dev
tmpfs                                                               16G  4.0K   16G   1% /dev/shm
tmpfs                                                               16G   17M   16G   1% /run
tmpfs                                                               16G     0   16G   0% /sys/fs/cgroup
/dev/mapper/rhvh_bootp--73--75--130-home                           976M  2.6M  907M   1% /home
/dev/sda2                                                          976M  202M  707M  23% /boot
/dev/mapper/rhvh_bootp--73--75--130-tmp                            976M  4.1M  905M   1% /tmp
/dev/mapper/rhvh_bootp--73--75--130-var                             15G   15G     0 100% /var
/dev/mapper/rhvh_bootp--73--75--130-var_log                        7.8G  109M  7.3G   2% /var/log
/dev/mapper/rhvh_bootp--73--75--130-var_log_audit                  2.0G  9.4M  1.8G   1% /var/log/audit
/dev/mapper/rhvh_bootp--73--75--130-var_crash                      9.8G   37M  9.2G   1% /var/crash
10.66.148.11:/home/yzhao/nfs3                                      237G  133G   92G  60% /rhev/data-center/mnt/10.66.148.11:_home_yzhao_nfs3
tmpfs                                                              3.2G     0  3.2G   0% /run/user/0



Version-Release number of selected component (if applicable): 
cockpit-ws-157-1.el7.x86_64
cockpit-bridge-157-1.el7.x86_64
cockpit-storaged-157-1.el7.noarch
cockpit-dashboard-157-1.el7.x86_64
cockpit-157-1.el7.x86_64
cockpit-ovirt-dashboard-0.11.9-0.1.el7ev.noarch
cockpit-system-157-1.el7.noarch
ovirt-hosted-engine-setup-2.2.9-1.el7ev.noarch
ovirt-hosted-engine-ha-2.2.4-1.el7ev.noarch
rhvh-4.2.1.2-0.20180201.0+1
rhvm-appliance-4.2-20180125.0.el7.noarch


How reproducible: 
Run the ansible-based HE deployment about four times.


Steps to Reproduce: 
1. Clean install the latest RHVH 4.2.1 with ks (rhvh-4.2.1.2-0.20180201.0+1)
2. Deploy HE via the CLI using the ansible-based deployment; it fails for some reason
3. Redeploy HE with the ansible-based deployment about four times
4. Check the /var partition

Actual results: 
Same as the description: each failed attempt leaves its local VM files under /var/tmp, /var fills up to 100%, and the deployment fails.


Expected results: 
Redeploying HE succeeds; the deployment cleans up its temporary files in /var once finished.


Additional info:

Comment 1 Yihui Zhao 2018-02-02 14:17:33 UTC
Created attachment 1390170 [details]
/var/log/*

Comment 2 Yihui Zhao 2018-02-02 14:40:35 UTC
VM data from each deployment attempt is left in /var/tmp/:

[root@hp-dl385pg8-11 tmp]# du -h /var/tmp/
3.2G	/var/tmp/localvmamUArL/images/d73f231b-d1ec-43bd-bf4f-622cd3d4f1f5
3.2G	/var/tmp/localvmamUArL/images
8.0K	/var/tmp/localvmamUArL/master/vms/f84ed495-554f-4d03-a4a6-ca2b54435f38
12K	/var/tmp/localvmamUArL/master/vms
16K	/var/tmp/localvmamUArL/master
3.2G	/var/tmp/localvmamUArL
3.2G	/var/tmp/localvmOMA24F/images/d73f231b-d1ec-43bd-bf4f-622cd3d4f1f5
3.2G	/var/tmp/localvmOMA24F/images
8.0K	/var/tmp/localvmOMA24F/master/vms/f84ed495-554f-4d03-a4a6-ca2b54435f38
12K	/var/tmp/localvmOMA24F/master/vms
16K	/var/tmp/localvmOMA24F/master
3.2G	/var/tmp/localvmOMA24F
3.1G	/var/tmp/localvm3qda9t/images/d73f231b-d1ec-43bd-bf4f-622cd3d4f1f5
3.1G	/var/tmp/localvm3qda9t/images
8.0K	/var/tmp/localvm3qda9t/master/vms/f84ed495-554f-4d03-a4a6-ca2b54435f38
12K	/var/tmp/localvm3qda9t/master/vms
16K	/var/tmp/localvm3qda9t/master
3.1G	/var/tmp/localvm3qda9t
4.0K	/var/tmp/systemd-private-22f922f5ce70430ab5df7b87146c09b7-chronyd.service-oZPn20/tmp
8.0K	/var/tmp/systemd-private-22f922f5ce70430ab5df7b87146c09b7-chronyd.service-oZPn20
4.0K	/var/tmp/abrt
3.2G	/var/tmp/localvmc6biHf/images/d73f231b-d1ec-43bd-bf4f-622cd3d4f1f5
3.2G	/var/tmp/localvmc6biHf/images
8.0K	/var/tmp/localvmc6biHf/master/vms/f84ed495-554f-4d03-a4a6-ca2b54435f38
12K	/var/tmp/localvmc6biHf/master/vms
16K	/var/tmp/localvmc6biHf/master
3.2G	/var/tmp/localvmc6biHf
2.2G	/var/tmp/localvmSsPUt5/images/d73f231b-d1ec-43bd-bf4f-622cd3d4f1f5
2.2G	/var/tmp/localvmSsPUt5/images
8.0K	/var/tmp/localvmSsPUt5/master/vms/f84ed495-554f-4d03-a4a6-ca2b54435f38
12K	/var/tmp/localvmSsPUt5/master/vms
16K	/var/tmp/localvmSsPUt5/master
2.2G	/var/tmp/localvmSsPUt5
15G	/var/tmp/
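
Each failed attempt left roughly 2-3 GB in its own localvm* directory, so four or five attempts are enough to exhaust the 15 GB /var volume. Until the fixed packages are installed, the leftover directories can be removed between attempts; a hedged Ansible sketch (the localvm* glob and the /var/tmp path come from the listing above, the rest is illustrative and not the shipped cleanup code):

- hosts: localhost
  connection: local
  tasks:
    - name: Find leftover local vm dirs
      # Run this only while no deployment is in progress, otherwise the
      # directory of the active attempt would be removed as well.
      find:
        paths: /var/tmp
        patterns: "localvm*"
        file_type: directory
      register: leftover_dirs

    - name: Remove leftover local vm dirs
      file:
        path: "{{ item.path }}"
        state: absent
      with_items: "{{ leftover_dirs.files }}"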

Comment 3 Yihui Zhao 2018-03-16 06:52:25 UTC
Hit this issue again.

From the cockpit: 
[ INFO ] TASK [Extract appliance to local vm dir]
[ ERROR ] fatal: [localhost]: FAILED! => {"changed": false, "dest": "/var/tmp/localvmNNzO7q", "extract_results": {"cmd": ["/bin/gtar", "--extract", "-C", "/var/tmp/localvmNNzO7q", "-z", "--show-transformed-names", "--sparse", "-f", "/root/.ansible/tmp/ansible-tmp-1521182717.81-74224666147249/source"], "err": "/bin/gtar: images/44b5a089-6ce7-4a23-b4d1-b9dab3fe2687/5ef59632-5a32-4285-95d9-54c59673e2f3: Wrote only 512 of 10240 bytes\n/bin/gtar: Exiting with failure status due to previous errors\n", "out": "", "rc": 2}, "gid": 36, "group": "kvm", "handler": "TgzArchive", "mode": "0775", "msg": "failed to unpack /root/.ansible/tmp/ansible-tmp-1521182717.81-74224666147249/source to /var/tmp/localvmNNzO7q", "owner": "vdsm", "secontext": "unconfined_u:object_r:user_tmp_t:s0", "size": 4096, "src": "/root/.ansible/tmp/ansible-tmp-1521182717.81-74224666147249/source", "state": "directory", "uid": 36}
[ INFO ] TASK [Remove local vm dir]
[ INFO ] TASK [Notify the user about a failure]
[ ERROR ] fatal: [localhost]: FAILED! => {"changed": false, "msg": "The system may not be provisioned according to the playbook results: please check the logs for the issue, fix accordingly or re-deploy from scratch.\n"}


[root@ibm-x3650m5-05 tmp]# df -h
Filesystem                                                       Size  Used Avail Use% Mounted on
/dev/mapper/rhvh_ibm--x3650m5--05-rhvh--4.2.1.4--0.20180305.0+1  774G  4.7G  730G   1% /
devtmpfs                                                          16G     0   16G   0% /dev
tmpfs                                                             16G  204K   16G   1% /dev/shm
tmpfs                                                             16G   18M   16G   1% /run
tmpfs                                                             16G     0   16G   0% /sys/fs/cgroup
/dev/mapper/rhvh_ibm--x3650m5--05-var                             15G   13G  1.6G  90% /var
/dev/sda1                                                        976M  209M  701M  23% /boot
/dev/mapper/rhvh_ibm--x3650m5--05-tmp                            976M  3.3M  906M   1% /tmp
/dev/mapper/rhvh_ibm--x3650m5--05-var_crash                      9.8G   37M  9.2G   1% /var/crash
/dev/mapper/rhvh_ibm--x3650m5--05-var_log                        7.8G   57M  7.3G   1% /var/log
/dev/mapper/rhvh_ibm--x3650m5--05-home                           976M  2.6M  907M   1% /home
/dev/mapper/rhvh_ibm--x3650m5--05-var_log_audit                  2.0G  7.5M  1.8G   1% /var/log/audit
tmpfs                                                            3.1G     0  3.1G   0% /run/user/0
10.66.148.11:/home/yzhao1/nfs5                                   237G   88G  137G  39% /rhev/data-center/mnt/10.66.148.11:_home_yzhao1_nfs5



[root@ibm-x3650m5-05 ~]# ll /var/tmp/
total 28
drwxr-xr-x. 2 abrt abrt 4096 Mar 16 11:29 abrt
drwxrwxr-x. 4 vdsm kvm  4096 Mar 16 12:12 localvm5FZSH7
drwxrwxr-x. 4 vdsm kvm  4096 Mar 16 11:46 localvma3A2QG
drwxrwxr-x. 4 vdsm kvm  4096 Mar 16 14:38 localvmgoqiaw
drwxrwxr-x. 4 vdsm kvm  4096 Mar 16 14:29 localvmzeL1ug
-rw-r--r--. 1 root root    0 Mar  5 16:30 sssd_is_running
drwx------. 3 root root 4096 Mar 16 11:29 systemd-private-ddaef151e49e4cf0814729251cde71f5-chronyd.service-iZ3Yyq
drwx------. 3 root root 4096 Mar 16 11:43 systemd-private-ddaef151e49e4cf0814729251cde71f5-systemd-timedated.service-5xU3Zu

Test version:

cockpit-dashboard-160-3.el7.x86_64
cockpit-system-160-3.el7.noarch
cockpit-ovirt-dashboard-0.11.17-1.el7ev.noarch
cockpit-bridge-160-3.el7.x86_64
cockpit-ws-160-3.el7.x86_64
cockpit-storaged-160-3.el7.noarch
cockpit-160-3.el7.x86_64
ovirt-hosted-engine-ha-2.2.7-1.el7ev.noarch
ovirt-hosted-engine-setup-2.2.13-1.el7ev.noarch

vdsm-http-4.20.22-1.el7ev.noarch
vdsm-hook-ethtool-options-4.20.22-1.el7ev.noarch
vdsm-network-4.20.22-1.el7ev.x86_64
vdsm-api-4.20.22-1.el7ev.noarch
vdsm-python-4.20.22-1.el7ev.noarch
vdsm-hook-vmfex-dev-4.20.22-1.el7ev.noarch
vdsm-hook-vhostmd-4.20.22-1.el7ev.noarch
vdsm-yajsonrpc-4.20.22-1.el7ev.noarch
vdsm-client-4.20.22-1.el7ev.noarch
vdsm-4.20.22-1.el7ev.x86_64
vdsm-gluster-4.20.22-1.el7ev.noarch
vdsm-hook-vfio-mdev-4.20.22-1.el7ev.noarch
vdsm-common-4.20.22-1.el7ev.noarch
vdsm-hook-openstacknet-4.20.22-1.el7ev.noarch
vdsm-jsonrpc-4.20.22-1.el7ev.noarch
vdsm-hook-fcoe-4.20.22-1.el7ev.noarch

rhvm-appliance-4.2-20180202.0.el7.noarch
OS tree: rhvh-4.2.1.4-0.20180305.0+1

Comment 4 Nikolai Sednev 2018-03-21 14:37:19 UTC
After numerous re-deployments I see that /var/tmp/localvm5V6Avl/ is empty:
alma03 ~]# ll   /var/tmp/localvm5V6Avl/
total 0
4.0K drwxrwxr-x.  2 vdsm kvm  4.0K Mar 21 16:31 localvm5V6Avl

Works for me on these components on host:
ovirt-hosted-engine-ha-2.2.7-1.el7ev.noarch
ovirt-hosted-engine-setup-2.2.13-1.el7ev.noarch
rhvm-appliance-4.2-20180202.0.el7.noarch
Linux 3.10.0-861.el7.x86_64 #1 SMP Wed Mar 14 10:21:01 EDT 2018 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux Server release 7.5 (Maipo)

Moving to verified.

Comment 5 Sandro Bonazzola 2018-03-29 11:09:35 UTC
This bugzilla is included in the oVirt 4.2.2 release, published on March 28th 2018.

Since the problem described in this bug report should be resolved in the oVirt 4.2.2 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.

