Bug 1371634 - ioprocess keeps a file open on shared storage after touching or truncating a file
Summary: ioprocess keeps a file open on shared storage after touching or truncating a file
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ioprocess
Version: 3.6.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ovirt-3.6.9
Target Release: ---
Assignee: Nir Soffer
QA Contact: Kevin Alon Goldblatt
URL:
Whiteboard:
Depends On: 1339777 1373491
Blocks: 1339780 1370564
 
Reported: 2016-08-30 16:59 UTC by Nir Soffer
Modified: 2019-12-16 06:36 UTC (History)

Fixed In Version: ioprocess-0.15.2-1
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1339777
Environment:
Last Closed: 2016-09-21 18:08:08 UTC
oVirt Team: Storage
Target Upstream Version:
Embargoed:


Attachments
vdsm server and engine logs (1.18 MB, application/x-gzip)
2016-09-06 12:53 UTC, Kevin Alon Goldblatt


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2016:1927 0 normal SHIPPED_LIVE ioprocess bug fix update for RHV 3.6 2016-09-21 22:01:30 UTC
oVirt gerrit 62953 0 None None None 2016-08-30 16:59:39 UTC

Description Nir Soffer 2016-08-30 16:59:40 UTC
+++ This bug was initially created as a clone of Bug #1339777 +++

Description of problem:

The first time IOProcess.truncate() or IOProcess.touch() is called, ioprocess
keeps the file open and never closes it.

Typically, when used with vdsm, ioprocess keeps the __DIRECT_IO_TEST__
file open on shared storage, since the first thing vdsm does is touch
this file.

This is an example output from lsof:

$ lsof -p 23805
COMMAND     PID USER   FD   TYPE DEVICE SIZE/OFF     NODE NAME
...
ioprocess 23805 vdsm    0w   REG   0,39        0 13369358 /rhev/data-center/mnt/10.35.0.179:_home_storage__domains_domain2/__DIRECT_IO_TEST__ (10.35.0.179:/home/storage_domains/domain2)
...

After the host is put into maintenance, ioprocess still keeps the same file open:

COMMAND     PID USER   FD   TYPE DEVICE SIZE/OFF     NODE NAME
...
ioprocess 23805 vdsm    0w   REG   0,39        0 13369358 /__DIRECT_IO_TEST__
...

Keeping open files on shared storage in maintenance mode may cause trouble
on some shared file systems.

Version-Release number of selected component (if applicable):
0.5.0

The defect was introduced in this commit:

commit 7dec019602137186908fdad624e1cc7d1faf4001
Author:     Yeela Kaplan <ykaplan>
AuthorDate: Sun May 18 17:28:51 2014 +0300
Commit:     Yeela Kaplan <ykaplan>
CommitDate: Sun May 25 19:21:40 2014 +0300

    Add missing functionality to exported functions
    
    touch
    truncatefile

How reproducible:
Always

Steps to Reproduce:
1. Invoke truncate or touch
2. Check open files using lsof -p <ioprocess pid>

Actual results:
The file remains open forever.

Expected results:
The truncated or touched file is closed after the operation.
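
The expected behaviour can be illustrated with a minimal Python sketch. This
is not the ioprocess code; it only shows the general pattern of closing the
descriptor once a touch or truncate is done. The function names, the default
mode and the use of try/finally are assumptions made for illustration.

```python
import os


def touch(path, mode=0o644):
    """Create the file if needed and update its timestamps."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, mode)
    try:
        os.utime(path, None)  # set atime and mtime to the current time
    finally:
        # Without this close the descriptor would stay open for the
        # lifetime of the process, which is the behaviour reported here.
        os.close(fd)


def truncate(path, size, mode=0o644):
    """Create the file if needed and set its length to size bytes."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, mode)
    try:
        os.ftruncate(fd, size)
    finally:
        os.close(fd)
```

With this pattern the descriptor is released even if the operation itself
fails, so nothing on the mount stays open after the call returns.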

Workaround:

If the host is in maintenance mode, killing ioprocess will safely close the 
open file.

--- Additional comment from Nir Soffer on 2016-05-25 16:42:48 EDT ---

A fix is available for testing here:
http://koji.fedoraproject.org/koji/taskinfo?taskID=14249832

--- Additional comment from Yaniv Kaul on 2016-06-15 02:18:06 EDT ---

Should the bug move to POST?

--- Additional comment from Fedora Update System on 2016-06-16 04:48:56 EDT ---

ioprocess-0.16.1-1.fc24 has been submitted as an update to Fedora 24. https://bodhi.fedoraproject.org/updates/FEDORA-2016-dfdd83c234

--- Additional comment from Fedora Update System on 2016-06-16 05:07:51 EDT ---

ioprocess-0.16.1-1.fc23 has been submitted as an update to Fedora 23. https://bodhi.fedoraproject.org/updates/FEDORA-2016-f2bcab1a73

--- Additional comment from Fedora Update System on 2016-06-16 11:55:02 EDT ---

ioprocess-0.16.1-1.fc24 has been pushed to the Fedora 24 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-dfdd83c234

--- Additional comment from Fedora Update System on 2016-06-17 14:55:00 EDT ---

ioprocess-0.16.1-1.fc23 has been pushed to the Fedora 23 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-f2bcab1a73

--- Additional comment from Fedora Update System on 2016-07-05 01:00:28 EDT ---

ioprocess-0.16.1-1.fc24 has been pushed to the Fedora 24 stable repository. If problems still persist, please make note of it in this bug report.

--- Additional comment from Fedora Update System on 2016-07-05 04:25:48 EDT ---

ioprocess-0.16.1-1.fc23 has been pushed to the Fedora 23 stable repository. If problems still persist, please make note of it in this bug report.

--- Additional comment from Nir Soffer on 2016-08-29 11:22:53 EDT ---

The request from comment 2 was addressed ages ago.

Comment 3 Nir Soffer 2016-08-31 13:56:55 UTC
How to test:

1. Set up a system with one NFS storage domain and one host
2. Activate the SPM
3. Wait about one minute after the host was activated, until one ioprocess
   child process is running. You can use "ps auxf" on the host
4. Check the files opened by this ioprocess using "lsof -p <ioprocess pid>"
   If you find more than one ioprocess child process, check all of them.
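
The check in step 4 can be partly automated. The following is a minimal
sketch, assuming the nine-column lsof layout shown in this report and that
shared storage is mounted under /rhev/data-center/mnt/. The helper name
shared_storage_files and the SHARED_PREFIX constant are hypothetical.

```python
SHARED_PREFIX = "/rhev/data-center/mnt/"  # mount root seen in this report


def shared_storage_files(lsof_output, prefix=SHARED_PREFIX):
    """Return (fd, name) pairs for lsof rows whose NAME starts with prefix."""
    found = []
    for line in lsof_output.splitlines():
        # Columns: COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
        # NAME may contain spaces, so split at most 8 times.
        fields = line.split(None, 8)
        if len(fields) < 9 or fields[0] == "COMMAND":
            continue  # header, "..." filler, or a row with missing columns
        fd, name = fields[3], fields[8]
        if name.startswith(prefix):
            found.append((fd, name))
    return found
```

Passing it the text printed by "lsof -p <ioprocess pid>" should give an empty
list once the fix is installed. A prefix match would not catch the
maintenance-mode case shown in the description, where lsof reports the name
as /__DIRECT_IO_TEST__, so that output still needs to be read by eye.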

Expected results before this fix (ioprocess-0.15.0):
ioprocess keeps open the file /rhev/data-center/mnt/server:_path/__DIRECT_IO_TEST__

Expected results with this fix (ioprocess-0.15.2):
No open file on shared storage.

Comment 6 Kevin Alon Goldblatt 2016-09-06 12:53:49 UTC
Created attachment 1198257
vdsm server and engine logs

Added logs

Comment 7 Nir Soffer 2016-09-06 13:17:27 UTC
You can test the build mentioned in comment 4, or wait until the package is
released.

Comment 8 Kevin Alon Goldblatt 2016-09-14 15:00:48 UTC
Tested with the following code:
----------------------------------------
rhevm-4.0.4.2-0.1.el7ev.noarch
vdsm-4.18.13-1.el7ev.x86_64

Tested with the following scenario:

Steps to Reproduce:
1. Set up a system with one NFS storage domain and one host
2. Activate the SPM
3. Wait about one minute after the host was activated, until one ioprocess
   child process is running. You can use "ps auxf" on the host
4. Check the files opened by this ioprocess using "lsof -p <ioprocess pid>"
   If you find more than one ioprocess child process, check all of them.

Expected results before this fix (ioprocess-0.15.0):
ioprocess keeps open the file /rhev/data-center/mnt/server:_path/__DIRECT_IO_TEST__

Results after the fix:
lsof -p 10158
COMMAND     PID USER   FD   TYPE DEVICE SIZE/OFF    NODE NAME
ioprocess 10158 vdsm  cwd    DIR    8,3     4096       2 /
ioprocess 10158 vdsm  rtd    DIR    8,3     4096       2 /
ioprocess 10158 vdsm  txt    REG    8,3    42192 1819196 /usr/libexec/ioprocess
ioprocess 10158 vdsm  mem    REG    8,3  2112384 1819202 /usr/lib64/libc-2.17.so
ioprocess 10158 vdsm  mem    REG    8,3   142304 1819228 /usr/lib64/libpthread-2.17.so
ioprocess 10158 vdsm  mem    REG    8,3    40600 1820785 /usr/lib64/libyajl.so.2.0.4
ioprocess 10158 vdsm  mem    REG    8,3     6928 1819494 /usr/lib64/libgthread-2.0.so.0.4200.2
ioprocess 10158 vdsm  mem    REG    8,3  1287904 1819488 /usr/lib64/libglib-2.0.so.0.4200.2
ioprocess 10158 vdsm  mem    REG    8,3   164440 1819195 /usr/lib64/ld-2.17.so
ioprocess 10158 vdsm  mem    REG    8,3    26254 1845006 /usr/lib64/gconv/gconv-modules.cache
ioprocess 10158 vdsm    1w  FIFO    0,8      0t0   97430 pipe
ioprocess 10158 vdsm    2w  FIFO    0,8      0t0   97431 pipe
ioprocess 10158 vdsm   53w  FIFO    0,8      0t0   97427 pipe
ioprocess 10158 vdsm   54r  FIFO    0,8      0t0   97428 pipe




Actual results:
No open file on shared storage.


Expected results:



Moving to VERIFIED!

Comment 10 errata-xmlrpc 2016-09-21 18:08:08 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-1927.html

