Bug 875678 - remoteFileHandler errors go to stderr which is lost. log them somewhere.
Summary: remoteFileHandler errors go to stderr which is lost. log them somewhere.
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: oVirt
Classification: Retired
Component: vdsm
Version: 3.1 GA
Hardware: x86_64
OS: Linux
unspecified
unspecified
Target Milestone: ---
: 3.4.0
Assignee: Yaniv Bronhaim
QA Contact: Haim
URL:
Whiteboard: storage
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2012-11-12 11:01 UTC by Zhou Zheng Sheng
Modified: 2016-02-10 20:33 UTC (History)
11 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-10-14 03:24:17 UTC
oVirt Team: Storage
Embargoed:
abonas: needinfo+


Attachments (Terms of Use)
log of vdsmd when vdsclient is invoked to connect a localfs storage server (2.54 KB, text/x-log)
2012-11-12 11:01 UTC, Zhou Zheng Sheng
no flags Details


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 9182 0 None None None Never
oVirt gerrit 9193 0 None None None Never

Description Zhou Zheng Sheng 2012-11-12 11:01:06 UTC
Created attachment 643403 [details]
log of vdsmd when vdsclient is invoked to connect a localfs storage server

Description of problem:
Connect to localfs storage server failed.


Version-Release number of selected component (if applicable):
Since git commit 12699beb "pep8 now also check visual aligment", reviewed at http://gerrit.ovirt.org/9038


How reproducible:
Always


Steps to Reproduce:
1. Clone the vdsm git repo and checkout the latest master
2. Build and install rpms, start vdsmd.service
3. mkdir /teststorage; chown -R vdsm:kvm /teststorage; chmod -R 775 /teststorage
4. vdsClient -s localhost connectStorageServer 4 '' 'connection=/teststorage,id=1'

  
Actual results:
Connect to storage server failed, and /var/log/vdsm/vdsm.log logs a timeout exception immediately after the connect storage server command issued from the vdsClient.


Expected results:
Storage server gets connected, the /teststorage is mounted at /rhev/data-center/mnt/_teststorage


Additional info:
Actually, the exception is not a timeout, but a failure attempt to run a function in remoteFileHandler in a new process which is requested by RemoteFileHandlerPool.callCrabRPCFunction().

When RemoteFileHandlerPool initiates a new instance of PoolHandler, the PoolHandler.__init__() attempts to BetterPopen() a new Python process to run remoteFileHandler.py. To set the corrected PYTHONPATH for the new process, it copies existing os.environ['PYTHONPATH'] and prepends "../:" to it.

Setting PYTHONPATH in the above way is good for unit test, but not for vdsmd service. When vdsmd.service is running as a daemon, the current working dir is "/", and PYTHONPATH is not set actually when vdsm is started. So the existing PYTHONPATH is empty and "../" refers to "/". When new process of remoteFileHandler.py is started this way, any attempt to import modules under /usr/share/vdsm will fail. Before commit 12699beb, nobody import modules under that dir in remoteFileHandler, and commit 12699beb add a import of logUtils.py in misc.py, and misc.py is imported by remoteFileHandler.py. So it triggers the bug. remoteFileHandler.py imports misc.py successfully because they are in the same sub-directory, and fails to import logUtils.py because it's in the upper directory.

Furthermore, the log is not properly configured for new process of remoteFileHandler.py, making us hard to track what happens when calling functions remotely in the new process.

Comment 1 Zhou Zheng Sheng 2012-11-12 11:16:25 UTC
I try to submit two patches for this bug. One for fix the PYTHONPATH problem, another for setting up logging correctly. http://gerrit.ovirt.org/#/q/status:open+project:vdsm+branch:master+topic:fixImport,n,z

Comment 2 Zhou Zheng Sheng 2012-11-13 03:20:06 UTC
Federico Simoncelli submit a better fix at http://gerrit.ovirt.org/#/c/9193/ . I abandon the similar one from me, but keep the logging fix patch.

Comment 3 Zhou Zheng Sheng 2012-11-14 01:27:08 UTC
Federico Simoncelli's patch is merged, the bug is fixed, but the logging bug remains.

Comment 4 Alissa 2013-03-04 12:05:42 UTC
Hi Zhou Zheng Sheng, did you submit the logging patch? What is its status?

Comment 5 Zhou Zheng Sheng 2013-03-05 04:03:13 UTC
Hi Alissa, I have submit the logging patch http://gerrit.ovirt.org/#/c/9182/ . There is no feedback from other reviewers yet.

Comment 6 Alissa 2013-03-06 12:32:26 UTC
(In reply to comment #5)
> Hi Alissa, I have submit the logging patch http://gerrit.ovirt.org/#/c/9182/
> . There is no feedback from other reviewers yet.

Hi,

I added a tracker to your patch in gerrit to this bug for a better followup. 
I also saw that reviewers commented in the past days, so there is some good progress there in the review.

Comment 7 Saggi Mizrahi 2013-10-13 08:18:05 UTC
I did http://gerrit.ovirt.org/14626 which covers errors in remoteFileHandler.py itslef.

Since all operations being run are small and throw exceptions so being able to write to the log doesn't seem to be an issue to me.

I also don't trust python's logging system and it will probably add more complexity than needed to the handlers making them take more memory which is problematic as it is.

I suggest this bug be closed with WONTFIX

Comment 8 Yaniv Bronhaim 2013-10-13 13:34:22 UTC
As Saggi commented, the PoolHandler prints the stdout and stderr of the subprocess to the logger, what should be enough to handle the issue mentioned here. Doesn't it ?

Comment 9 Zhou Zheng Sheng 2013-10-14 03:24:17 UTC
OK. I agree.


Note You need to log in before you can comment on or make changes to this bug.