Bug 1141422

Summary: [RFE] Show vdsm thread name in system monitoring tools
Product: [oVirt] vdsm Reporter: Nir Soffer <nsoffer>
Component: RFEs Assignee: Francesco Romani <fromani>
Status: CLOSED CURRENTRELEASE QA Contact: Jiri Belka <jbelka>
Severity: medium Docs Contact:
Priority: low    
Version: --- CC: amureini, bazulay, bugs, fromani, gklein, melewis, mgoldboi, oourfali, rbalakri, srevivo, tjelinek, ykaul
Target Milestone: ovirt-4.1.0-alpha Keywords: CodeChange, FutureFeature
Target Release: 4.19.2 Flags: oourfali: ovirt-4.1?
pstehlik: testing_plan_complete-
rule-engine: planning_ack?
tjelinek: devel_ack+
pstehlik: testing_ack+
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Enhancement
Doc Text:
With this update, VDSM thread names are shown in system monitoring tools. This makes it easier to track the resource usage of individual threads.
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-02-01 14:58:27 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Infra RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1427725    

Description Nir Soffer 2014-09-13 07:47:04 UTC
Description of problem:

When viewing threads in system monitoring tools such as top, htop or ps, all of the (hundreds of) vdsm threads show the same name, so it is very hard to tell which threads are consuming the most CPU time.

Solution:

Use the pthread_setname_np API to set the thread name. This function should be available through ctypes.
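
The proposal above can be sketched roughly as follows. This is a minimal illustration, assuming Linux with glibc, where pthread_setname_np takes a thread handle plus a name of at most 15 bytes, and Python 3. The helper names (set_current_thread_name, get_current_thread_name, run_named) and the sample thread names are hypothetical and are not taken from vdsm.

```python
import ctypes
import threading

# Linux thread names hold at most 15 bytes plus a terminating NUL.
MAX_NAME_BYTES = 15

# Look up symbols in the running process; libc/libpthread are already loaded.
_libc = ctypes.CDLL(None, use_errno=True)
_libc.pthread_self.argtypes = []
_libc.pthread_self.restype = ctypes.c_ulong
_libc.pthread_setname_np.argtypes = [ctypes.c_ulong, ctypes.c_char_p]
_libc.pthread_setname_np.restype = ctypes.c_int
_libc.pthread_getname_np.argtypes = [
    ctypes.c_ulong, ctypes.c_char_p, ctypes.c_size_t]
_libc.pthread_getname_np.restype = ctypes.c_int


def set_current_thread_name(name):
    """Set the system-visible name of the calling thread (hypothetical helper)."""
    # Truncate ourselves: the C call fails with ERANGE on longer names.
    data = name.encode("ascii", "replace")[:MAX_NAME_BYTES]
    err = _libc.pthread_setname_np(_libc.pthread_self(), data)
    if err != 0:
        raise OSError(err, "pthread_setname_np failed")


def get_current_thread_name():
    """Read back the system-visible name of the calling thread."""
    buf = ctypes.create_string_buffer(MAX_NAME_BYTES + 1)
    err = _libc.pthread_getname_np(_libc.pthread_self(), buf, len(buf))
    if err != 0:
        raise OSError(err, "pthread_getname_np failed")
    return buf.value.decode("ascii")


def run_named(name):
    """Start a worker thread, name it, and return the name the system reports."""
    seen = []

    def worker():
        set_current_thread_name(name)
        seen.append(get_current_thread_name())

    t = threading.Thread(target=worker, name=name)
    t.start()
    t.join()
    return seen[0]
```

The name is set from inside the worker because the sketch only ever renames the calling thread; renaming the main thread would also change the process name shown by ps.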

Comment 2 Francesco Romani 2015-05-21 10:10:52 UTC
patches posted and under review.

Comment 3 Allon Mureinik 2015-06-03 13:04:01 UTC
Francesco, the patch linked in the External Trackers is merged.
Are we pending anything else, or could this be moved to MODIFIED?

Comment 4 Francesco Romani 2015-06-03 13:11:46 UTC
The patch http://gerrit.ovirt.org/41052 added the infrastructure to make the names visible. We still lack a sensible way to translate Python thread names, which can be arbitrarily long, into system thread names, which are limited to 15 ASCII characters. We have not found a good way to do so yet.
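
To make the 15-character constraint concrete, here is a small sketch of the simplest possible policy: plain truncation after forcing the name to ASCII. It is only an illustration of the limit, not the translation scheme vdsm settled on. The function name fit_thread_name and the sample names are hypothetical.

```python
# System thread names are limited to 15 ASCII characters.
MAX_SYSTEM_NAME_LEN = 15


def fit_thread_name(python_name):
    """Return a system-safe version of a Python thread name (hypothetical helper).

    Non-ASCII characters are replaced with '?', then the result is cut
    to the first 15 characters.
    """
    ascii_name = python_name.encode("ascii", "replace").decode("ascii")
    return ascii_name[:MAX_SYSTEM_NAME_LEN]


if __name__ == "__main__":
    # Short names survive intact; long names lose their tail.
    for name in ("jsonrpc/0", "example-long-thread-name"):
        print("%-26s -> %r" % (name, fit_thread_name(name)))
```

Truncation keeps the prefix, so names that differ only near the end become indistinguishable; this is why short, prefix-unique names are preferable.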

Comment 5 Oved Ourfali 2015-06-04 06:53:25 UTC
(In reply to Francesco Romani from comment #4)
> The patch http://gerrit.ovirt.org/41052 added the infrastructure to make the
> names visible. We miss a sensible way to translate python names, which are
> arbitrarily long, in system thread names, which must be long at most 15
> ASCII characters. We didn't find a good way to do so yet.

I guess this means we should move this to 4.0, since the proposed patch doesn't complete this RFE. Right?

Comment 6 Francesco Romani 2015-06-04 06:55:44 UTC
(In reply to Oved Ourfali from comment #5)
> (In reply to Francesco Romani from comment #4)
> > The patch http://gerrit.ovirt.org/41052 added the infrastructure to make the
> > names visible. We miss a sensible way to translate python names, which are
> > arbitrarily long, in system thread names, which must be long at most 15
> > ASCII characters. We didn't find a good way to do so yet.
> 
> I guess it means we should move this for 4.0? As the proposed patch doesn't
> complete this RFE. Right?

I think yes, as no simple solution emerged yet.

Comment 7 Oved Ourfali 2016-01-20 08:05:31 UTC
Francesco - would you be interested in owning that in 4.0?

Comment 8 Francesco Romani 2016-01-25 16:26:38 UTC
(In reply to Oved Ourfali from comment #7)
> Francesco - would you be interested in owning that in 4.0?

I am. We are not that far from this stage after the last patches landed in master (that was weeks ago; not much progress lately).

Comment 9 Mike McCune 2016-03-28 22:37:22 UTC
This bug was accidentally moved from POST to MODIFIED via an error in automation. Please see mmccune with any questions.

Comment 10 Nir Soffer 2016-08-14 12:34:04 UTC
Not all patches are merged yet; moving back to POST.

Comment 11 Nir Soffer 2016-08-14 12:39:35 UTC
Threads using the vdsm.concurrent.thread helper now show the (truncated) Python
thread name.

To complete this feature:
- Convert old code using threading.Thread directly to use vdsm.concurrent.thread
- Modify all thread names to use short names that are not truncated, or that
  remain usable when truncated.

Part of the work is handled in this topic:
https://gerrit.ovirt.org/#/q/topic:thread-cleanup+is:open

Comment 12 Francesco Romani 2016-11-03 11:53:21 UTC
(In reply to Nir Soffer from comment #11)
> Threads using vdsm.concurrent.thread helper are showing (truncated) python
> thread
> name now.
> 
> To complete this feature:
> - Convert old code using threading.Thread directly to use
> vdsm.concurrent.thread
> - Modify all thread names to use short names that are not truncated, or
> usable
>   when truncated.
> 
> Part of the work is handled in this topic:
> https://gerrit.ovirt.org/#/q/topic:thread-cleanup+is:open

All patches have now been merged, so I'm moving this to MODIFIED. Should we discover that we forgot some threads, we'll file a new bug.

Caveat: some threads may come from external packages (e.g. python-ioprocess); we should either fix those packages or skip them.

To test this change, just keep "htop" running while Vdsm is working, to see the thread names. Any other tool which shows thread names besides "htop" is fine as well.
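
For a scripted check instead of an interactive tool, the per-thread names can also be read from /proc on Linux. The following is a minimal sketch, assuming the standard /proc/<pid>/task/<tid>/comm layout; the function name thread_names is hypothetical, and the script inspects its own process only as a stand-in for the Vdsm PID.

```python
import os


def thread_names(pid):
    """Return {tid: name} for every thread of the given process (Linux only)."""
    names = {}
    task_dir = "/proc/%d/task" % pid
    for tid in os.listdir(task_dir):
        try:
            with open(os.path.join(task_dir, tid, "comm")) as f:
                names[int(tid)] = f.read().rstrip("\n")
        except (IOError, OSError):
            # The thread exited between listdir() and open(); skip it.
            continue
    return names


if __name__ == "__main__":
    # Replace os.getpid() with the Vdsm main PID to inspect Vdsm itself.
    for tid, name in sorted(thread_names(os.getpid()).items()):
        print("%6d %s" % (tid, name))
```

Reading another user's process may require root privileges, as with the tools mentioned above.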

Comment 13 Nir Soffer 2016-11-06 11:07:19 UTC
With the additional patches, all threads are named.

ioprocess thread names are handled in bug 1392214.

Some thread names are truncated; I suggest we open another bug for this.

Comment 14 Sandro Bonazzola 2016-12-12 14:02:11 UTC
The fix for this issue should be included in oVirt 4.1.0 beta 1 released on December 1st. If not included please move back to modified.

Comment 15 Jiri Belka 2016-12-16 09:53:47 UTC
ok, vdsm-4.18.999-1173.git28e001a.el7.centos.x86_64

[root@dell-r210ii-13 ~]# systemctl status vdsmd | grep PID
 Main PID: 15413 (vdsm)

[root@dell-r210ii-13 ~]# ps axH -q 15413 -o 'pid tid comm args'
  PID   TID COMMAND         COMMAND
15413 15413 vdsm            /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15450 libvirt/events  /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15451 tasks/0         /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15452 tasks/1         /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15453 tasks/2         /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15454 tasks/3         /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15455 tasks/4         /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15456 tasks/5         /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15457 tasks/6         /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15458 tasks/7         /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15459 tasks/8         /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15460 tasks/9         /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15464 check/loop      /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15466 vdsm.Scheduler  /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15474 vmchannels      /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15491 JsonRpc (StompR /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15492 jsonrpc/0       /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15493 jsonrpc/1       /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15494 jsonrpc/2       /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15495 jsonrpc/3       /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15496 jsonrpc/4       /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15497 jsonrpc/5       /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15498 jsonrpc/6       /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15499 jsonrpc/7       /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15500 JsonRpcServer   /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15501 BindingXMLRPC   /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15502 Reactor thread  /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15503 periodic/0      /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15504 periodic/1      /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15505 periodic/2      /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15506 periodic/3      /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15960 ioprocess/15959 /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15966 monitor/53e447f /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15972 mailbox-hsm/0   /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15973 mailbox-hsm/1   /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15974 mailbox-hsm/2   /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15975 mailbox-hsm/3   /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15976 mailbox-hsm/4   /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15983 mailbox-hsm     /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15997 mailbox-spm/0   /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15998 mailbox-spm/1   /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15999 mailbox-spm/2   /usr/bin/python2 /usr/share/vdsm/vdsm
15413 16000 mailbox-spm/3   /usr/bin/python2 /usr/share/vdsm/vdsm
15413 16001 mailbox-spm/4   /usr/bin/python2 /usr/share/vdsm/vdsm
15413 16003 mailbox-spm     /usr/bin/python2 /usr/share/vdsm/vdsm
15413 17365 ioprocess/17364 /usr/bin/python2 /usr/share/vdsm/vdsm
15413 17436 monitor/0c78b4d /usr/bin/python2 /usr/share/vdsm/vdsm
15413 19033 ioprocess/19032 /usr/bin/python2 /usr/share/vdsm/vdsm
15413 19035 ioprocess/19034 /usr/bin/python2 /usr/share/vdsm/vdsm

[root@dell-r210ii-13 ~]# rpm -q vdsm
vdsm-4.18.999-1173.git28e001a.el7.centos.x86_64