Bug 1141422 - [RFE] Show vdsm thread name in system monitoring tools
Summary: [RFE] Show vdsm thread name in system monitoring tools
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: vdsm
Classification: oVirt
Component: RFEs
Version: ---
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: medium
Target Milestone: ovirt-4.1.0-alpha
Target Release: 4.19.2
Assignee: Francesco Romani
QA Contact: Jiri Belka
URL:
Whiteboard:
Depends On:
Blocks: 1427725
Reported: 2014-09-13 07:47 UTC by Nir Soffer
Modified: 2017-03-01 01:24 UTC

Fixed In Version:
Doc Type: Enhancement
Doc Text:
With this update, the VDSM thread name is now included in the system monitoring tools. This makes it easier to track the resource usages of the threads.
Clone Of:
Environment:
Last Closed: 2017-02-01 14:58:27 UTC
oVirt Team: Infra
Embargoed:
oourfali: ovirt-4.1?
pstehlik: testing_plan_complete-
rule-engine: planning_ack?
tjelinek: devel_ack+
pstehlik: testing_ack+


Links
System | ID | Private | Priority | Status | Summary | Last Updated
oVirt gerrit | 41052 | 0 | master | MERGED | lib: pthread: allow to set system thread names | Never
oVirt gerrit | 43819 | 0 | master | MERGED | periodic: Use thread names that can be used as system thread names | Never
oVirt gerrit | 55074 | 0 | master | MERGED | lib: set system name for threads | 2016-08-14 12:30:46 UTC
oVirt gerrit | 65501 | 0 | master | NEW | lib: shorten name of libvirt event thread | 2016-10-26 09:07:22 UTC
oVirt gerrit | 65502 | 0 | master | NEW | clientIF: rename recovery thread | 2016-10-25 17:04:21 UTC
oVirt gerrit | 65503 | 0 | master | POST | migration: use system thread names | 2016-11-03 10:33:58 UTC
oVirt gerrit | 66105 | 0 | master | MERGED | health: Name health monitor thread | 2016-11-09 06:05:22 UTC
oVirt gerrit | 66106 | 0 | master | MERGED | netlink: Name netlink monitor thread | 2016-11-09 06:06:17 UTC
oVirt gerrit | 66107 | 0 | master | MERGED | misc: Name itmap threads | 2016-11-09 06:06:52 UTC
oVirt gerrit | 66108 | 0 | master | MERGED | v2v: Name v2v import threads | 2016-11-08 21:12:33 UTC
oVirt gerrit | 66109 | 0 | master | MERGED | threadPool: Name threadpool worker threads | 2016-11-17 17:53:50 UTC
oVirt gerrit | 66110 | 0 | master | MERGED | vm: Name vm creation and live merge cleanup thread | 2016-11-09 06:07:43 UTC
oVirt gerrit | 66112 | 0 | master | MERGED | concurrent: Name tmap threads | 2016-11-09 06:07:33 UTC

Description Nir Soffer 2014-09-13 07:47:04 UTC
Description of problem:

When viewing threads in system monitoring tools such as top, htop, or ps, all of the vdsm threads (hundreds of them) show the same name, so it is very hard to tell which threads are consuming the most CPU time.

Solution:

Use the pthread_setname_np API to set the thread name. This function should be available through ctypes.
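
As a rough illustration of the proposal, the following sketch calls pthread_setname_np through ctypes. It is not the code from the vdsm patches. It assumes Linux with glibc (or a compatible libc), where pthread_t is an unsigned long; the helper name set_current_thread_name is hypothetical.

```python
import ctypes
import os

# Assumption: Linux, where the pthread symbols are reachable through the
# libraries already loaded into the Python process (dlopen(NULL)).
_libc = ctypes.CDLL(None)
_libc.pthread_self.restype = ctypes.c_ulong
_libc.pthread_self.argtypes = []
_libc.pthread_setname_np.restype = ctypes.c_int
_libc.pthread_setname_np.argtypes = [ctypes.c_ulong, ctypes.c_char_p]


def set_current_thread_name(name):
    """Set the system-visible name of the calling thread.

    Hypothetical helper for illustration only. The name must fit in
    15 ASCII bytes; longer names are rejected by the C library.
    """
    raw = name.encode("ascii")
    # pthread_setname_np returns the error number directly (0 on success)
    # rather than setting errno.
    err = _libc.pthread_setname_np(_libc.pthread_self(), raw)
    if err != 0:
        raise OSError(err, os.strerror(err))
```

Calling set_current_thread_name("periodic/0") from inside a thread should make that name visible in the comm column of ps or htop. A name longer than 15 bytes is expected to fail with ERANGE, so callers have to shorten names before passing them in.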

Comment 2 Francesco Romani 2015-05-21 10:10:52 UTC
patches posted and under review.

Comment 3 Allon Mureinik 2015-06-03 13:04:01 UTC
Francesco, the patch linked in the External Trackers is merged.
Are we pending anything else, or could this be moved to MODIFIED?

Comment 4 Francesco Romani 2015-06-03 13:11:46 UTC
The patch http://gerrit.ovirt.org/41052 added the infrastructure to make the names visible. We still lack a sensible way to translate Python thread names, which can be arbitrarily long, into system thread names, which are limited to 15 ASCII characters. We have not found a good way to do so yet.

Comment 5 Oved Ourfali 2015-06-04 06:53:25 UTC
(In reply to Francesco Romani from comment #4)
> The patch http://gerrit.ovirt.org/41052 added the infrastructure to make the
> names visible. We miss a sensible way to translate python names, which are
> arbitrarily long, in system thread names, which must be long at most 15
> ASCII characters. We didn't find a good way to do so yet.

I guess this means we should move this to 4.0, since the proposed patch doesn't complete this RFE. Right?

Comment 6 Francesco Romani 2015-06-04 06:55:44 UTC
(In reply to Oved Ourfali from comment #5)
> (In reply to Francesco Romani from comment #4)
> > The patch http://gerrit.ovirt.org/41052 added the infrastructure to make the
> > names visible. We miss a sensible way to translate python names, which are
> > arbitrarily long, in system thread names, which must be long at most 15
> > ASCII characters. We didn't find a good way to do so yet.
> 
> I guess it means we should move this for 4.0? As the proposed patch doesn't
> complete this RFE. Right?

I think so, as no simple solution has emerged yet.

Comment 7 Oved Ourfali 2016-01-20 08:05:31 UTC
Francesco - would you be interested in owning that in 4.0?

Comment 8 Francesco Romani 2016-01-25 16:26:38 UTC
(In reply to Oved Ourfali from comment #7)
> Francesco - would you be interested in owning that in 4.0?

I am. We are not far from that stage now that the last patches have landed in master (that was weeks ago; there has not been much progress lately).

Comment 9 Mike McCune 2016-03-28 22:37:22 UTC
This bug was accidentally moved from POST to MODIFIED via an error in automation; please contact mmccune with any questions.

Comment 10 Nir Soffer 2016-08-14 12:34:04 UTC
Not all patches are merged yet; moving back to POST.

Comment 11 Nir Soffer 2016-08-14 12:39:35 UTC
Threads created with the vdsm.concurrent.thread helper now show the (truncated)
Python thread name.

To complete this feature:
- Convert old code using threading.Thread directly to use vdsm.concurrent.thread
- Modify all thread names to use short names that are not truncated, or that
  remain usable when truncated.

Part of the work is handled in this topic:
https://gerrit.ovirt.org/#/q/topic:thread-cleanup+is:open
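
The pattern described above, a single helper that creates the thread and also sets its system name, could look roughly like the sketch below. This is not the actual vdsm.concurrent.thread implementation; start_named_thread is a hypothetical name, and the code assumes Linux, where pthread_t is an unsigned long and system thread names hold at most 15 bytes.

```python
import ctypes
import threading

# Assumption: Linux; the pthread symbols are resolved from the libraries
# already loaded into the Python process.
_libc = ctypes.CDLL(None)
_libc.pthread_self.restype = ctypes.c_ulong
_libc.pthread_self.argtypes = []
_libc.pthread_setname_np.restype = ctypes.c_int
_libc.pthread_setname_np.argtypes = [ctypes.c_ulong, ctypes.c_char_p]

MAX_SYSTEM_NAME = 15  # bytes, excluding the trailing NUL


def start_named_thread(func, name, args=()):
    """Start a daemon thread whose system name follows its Python name.

    Hypothetical helper: the Python name is kept in full, while the
    system name is the same string truncated to 15 ASCII bytes.
    """
    def run():
        raw = name.encode("ascii", "replace")[:MAX_SYSTEM_NAME]
        # The name must be set from inside the new thread. A failure is
        # ignored here because naming is only a monitoring aid.
        _libc.pthread_setname_np(_libc.pthread_self(), raw)
        func(*args)

    t = threading.Thread(target=run, name=name)
    t.daemon = True
    t.start()
    return t
```

With plain truncation, only the first 15 characters distinguish threads, which is why short names with the distinguishing part up front (a prefix followed by an index, for example) survive better than long descriptive ones.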

Comment 12 Francesco Romani 2016-11-03 11:53:21 UTC
(In reply to Nir Soffer from comment #11)
> Threads using vdsm.concurrent.thread helper are showing (truncated) python
> thread
> name now.
> 
> To complete this feature:
> - Convert old code using threading.Thread directly to use
> vdsm.concurrent.thread
> - Modify all thread names to use short names that are not truncated, or
> usable
>   when truncated.
> 
> Part of the work is handled in this topic:
> https://gerrit.ovirt.org/#/q/topic:thread-cleanup+is:open

All patches have now been merged, so I'm moving this to MODIFIED. If we discover that we missed some threads, we'll file a new bug.

Caveat: some threads may come from external packages (e.g. python-ioprocess); we should either fix those packages or skip them.

To test this change, just keep "htop" running while Vdsm is working, to see the thread names. Any other tool which shows thread names besides "htop" is fine as well.
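
For a scripted check instead of watching htop, the thread names can also be read from /proc. The sketch below assumes a Linux /proc filesystem; thread_names is a hypothetical helper, not part of Vdsm.

```python
import os


def thread_names(pid):
    """Return a {tid: name} mapping for all threads of the given process.

    Reads /proc/<pid>/task/<tid>/comm, the same per-thread name that
    tools such as ps and htop display.
    """
    names = {}
    task_dir = "/proc/%d/task" % pid
    for tid in os.listdir(task_dir):
        try:
            with open(os.path.join(task_dir, tid, "comm")) as f:
                names[int(tid)] = f.read().rstrip("\n")
        except OSError:
            # The thread exited between listing and reading; skip it.
            continue
    return names
```

Passing the Vdsm main PID (as reported by systemctl status vdsmd) and printing the sorted items gives one line per thread. Any thread that still shows the process name instead of a distinct name is a candidate for a follow-up fix.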

Comment 13 Nir Soffer 2016-11-06 11:07:19 UTC
With the additional patches, all threads are named.

ioprocess thread names are handled in bug 1392214.

Some thread names are truncated; I suggest we open another bug for this.

Comment 14 Sandro Bonazzola 2016-12-12 14:02:11 UTC
The fix for this issue should be included in oVirt 4.1.0 beta 1, released on December 1st. If it is not included, please move back to MODIFIED.

Comment 15 Jiri Belka 2016-12-16 09:53:47 UTC
ok, vdsm-4.18.999-1173.git28e001a.el7.centos.x86_64

[root@dell-r210ii-13 ~]# systemctl status vdsmd | grep PID
 Main PID: 15413 (vdsm)

[root@dell-r210ii-13 ~]# ps axH -q 15413 -o 'pid tid comm args'
  PID   TID COMMAND         COMMAND
15413 15413 vdsm            /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15450 libvirt/events  /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15451 tasks/0         /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15452 tasks/1         /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15453 tasks/2         /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15454 tasks/3         /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15455 tasks/4         /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15456 tasks/5         /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15457 tasks/6         /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15458 tasks/7         /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15459 tasks/8         /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15460 tasks/9         /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15464 check/loop      /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15466 vdsm.Scheduler  /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15474 vmchannels      /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15491 JsonRpc (StompR /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15492 jsonrpc/0       /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15493 jsonrpc/1       /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15494 jsonrpc/2       /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15495 jsonrpc/3       /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15496 jsonrpc/4       /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15497 jsonrpc/5       /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15498 jsonrpc/6       /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15499 jsonrpc/7       /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15500 JsonRpcServer   /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15501 BindingXMLRPC   /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15502 Reactor thread  /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15503 periodic/0      /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15504 periodic/1      /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15505 periodic/2      /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15506 periodic/3      /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15960 ioprocess/15959 /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15966 monitor/53e447f /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15972 mailbox-hsm/0   /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15973 mailbox-hsm/1   /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15974 mailbox-hsm/2   /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15975 mailbox-hsm/3   /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15976 mailbox-hsm/4   /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15983 mailbox-hsm     /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15997 mailbox-spm/0   /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15998 mailbox-spm/1   /usr/bin/python2 /usr/share/vdsm/vdsm
15413 15999 mailbox-spm/2   /usr/bin/python2 /usr/share/vdsm/vdsm
15413 16000 mailbox-spm/3   /usr/bin/python2 /usr/share/vdsm/vdsm
15413 16001 mailbox-spm/4   /usr/bin/python2 /usr/share/vdsm/vdsm
15413 16003 mailbox-spm     /usr/bin/python2 /usr/share/vdsm/vdsm
15413 17365 ioprocess/17364 /usr/bin/python2 /usr/share/vdsm/vdsm
15413 17436 monitor/0c78b4d /usr/bin/python2 /usr/share/vdsm/vdsm
15413 19033 ioprocess/19032 /usr/bin/python2 /usr/share/vdsm/vdsm
15413 19035 ioprocess/19034 /usr/bin/python2 /usr/share/vdsm/vdsm

[root@dell-r210ii-13 ~]# rpm -q vdsm
vdsm-4.18.999-1173.git28e001a.el7.centos.x86_64

