Bug 711120

Summary: connectStoragePool blocks forever due to several sudo processes left in defunct state
Product: Red Hat Enterprise Linux 6
Reporter: Haim <hateya>
Component: vdsm
Assignee: Erez Shinan <erez>
Status: CLOSED INSUFFICIENT_DATA
QA Contact: yeylon <yeylon>
Severity: medium
Docs Contact:
Priority: unspecified
Version: 6.2
CC: abaron, bazulay, danken, iheim, mgoldboi, smizrahi, srevivo, yeylon, ykaul
Target Milestone: rc
Target Release: 6.2
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2011-07-14 09:15:38 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:

Description Haim 2011-06-06 15:45:14 UTC
Description of problem:

case: connectStoragePool does not return on the host, and host recovery fails
cause: several vdsm sudo sub-processes are left in defunct state, which blocks the work of other threads


root      8102  0.0  0.0  51048  1904 ?        S<   16:00   0:00 /usr/bin/sudo -n /usr/bin/python /usr/share/vdsm/supervdsmServer.pyc 3e57466d-f015-4686-bea5-d91b291b1296 8063
root     28573  0.0  0.0      0     0 ?        Z<   17:03   0:00 [sudo] <defunct>
root     29226  0.0  0.0      0     0 ?        Z<   17:04   0:00 [sudo] <defunct>
root     29675  0.0  0.0      0     0 ?        Z<   17:04   0:00 [sudo] <defunct>
root     29796  0.0  0.0      0     0 ?        Z<   17:04   0:00 [sudo] <defunct>
root     31861  0.0  0.0      0     0 ?        Z<   17:04   0:00 [sudo] <defunct>
root     32113  0.0  0.0      0     0 ?        Z<   17:04   0:00 [sudo] <defunct>
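
For context, a process shows up as <defunct> (state Z) when it has exited but its parent has not yet collected its exit status with wait(). The following is a minimal Python sketch of that mechanism, not vdsm code. The helper names child_state and demo_zombie are hypothetical, and it assumes Linux with /proc mounted.

```python
import os
import time


def child_state(pid):
    """Return the one-letter process state from /proc/<pid>/stat (Linux only)."""
    with open("/proc/%d/stat" % pid) as f:
        data = f.read()
    # The state field is the first token after the parenthesised command name.
    return data.rsplit(")", 1)[1].split()[0]


def demo_zombie():
    """Fork a child that exits at once, observe it as a zombie, then reap it."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child: exit immediately without cleanup

    # Parent: poll until the child has exited but has not been reaped yet.
    deadline = time.time() + 5
    state = child_state(pid)
    while state != "Z" and time.time() < deadline:
        time.sleep(0.01)
        state = child_state(pid)

    # Reaping the child with waitpid() removes the <defunct> entry.
    reaped_pid, _status = os.waitpid(pid, 0)
    still_listed = os.path.exists("/proc/%d" % pid)
    return state, reaped_pid == pid, still_listed


if __name__ == "__main__":
    print(demo_zombie())
```

If the parent never reaches the waitpid() call, for example because the thread responsible for it is itself blocked, the entries stay in the process table as in the listing above. Whether that is what happened here is not established by this report.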

reproducibility: not easy to reproduce
repro steps:

1) VMs are running
2) log out of the iSCSI session
3) restart VDSM

see attached logs

not sure if it's a regression.

Comment 3 Erez Shinan 2011-06-29 08:22:07 UTC
It's likely that recent patches have already fixed this bug. Please try to reproduce it with the latest vdsm (>77).

Comment 4 Dan Kenigsberg 2011-07-14 09:15:38 UTC
After two weeks with no answer or reproduction, closing.