Bug 1351189

Summary:	teamd is killed too early during shutdown
Product:	Red Hat Enterprise Linux 7	Reporter:	Marcel Kolaja <mkolaja>
Component:	libteam	Assignee:	Hangbin Liu <haliu>
Status:	CLOSED WONTFIX	QA Contact:	Network QE <network-qe>
Severity:	medium	Docs Contact:
Priority:	high
Version:	7.1	CC:	brubisch, byodlows, david.fields, dcbw, dconsoli, hsowa, jiri, kcleveng, kzhang, lxin, mleitner, network-qe, pgervase, pm-rhel, rkhan, rmanes, sbradley, tbskyd, tgunders
Target Milestone:	rc	Keywords:	ZStream
Target Release:	---
Hardware:	All
OS:	Linux
Whiteboard:
Fixed In Version:		Doc Type:	If docs needed, set a value
Doc Text:		Story Points:	---
Clone Of:	1264175	Environment:
Last Closed:	2016-08-17 14:27:02 UTC	Type:	---
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:
Bug Depends On:	1264175, 1354382
Bug Blocks:

Description Marcel Kolaja 2016-06-29 12:19:14 UTC

This bug has been copied from bug #1264175 and has been proposed
to be backported to 7.2 z-stream (EUS).

Comment 30 Xin Long 2016-07-28 07:40:32 UTC

I will try to explain the patch in c#20. Also CCing Jiri to see whether it can be accepted in libteam upstream.

-----
Prior to these fixes, teamd was managed by systemd, but teamd.service did not declare any dependencies. As a result, teamd could be stopped before nfs was stopped, and nfs would then block.

After we added 'Before=network-pre.target', teamd's stop was delayed until network had stopped completely, because systemd stops units in the reverse of their start order. The shutdown order then seemed to be nfs -> network -> teamd, and the nfs block issue was gone.
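
For illustration, the same ordering could be expressed as a systemd drop-in instead of editing the unit file. This is only a sketch: the drop-in approach, the file name 10-ordering.conf, the helper name and the DROPIN_DIR scratch directory are assumptions, not part of the patch in c#20.

```shell
#!/bin/sh
# Sketch only. DROPIN_DIR is a hypothetical scratch directory so this can be
# tried safely; a real drop-in would live under /etc/systemd/system/<unit>.d/.
DROPIN_DIR="${DROPIN_DIR:-./teamd.service.d}"

# Write a drop-in that orders teamd before network-pre.target.
# On shutdown systemd reverses the order, so teamd is stopped after it.
write_ordering_dropin() {
    mkdir -p "$DROPIN_DIR" || return 1
    cat > "$DROPIN_DIR/10-ordering.conf" <<'EOF'
[Unit]
Before=network-pre.target
EOF
}
```

After installing such a drop-in in the real location, "systemctl daemon-reload" would be needed for systemd to pick it up.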

Unfortunately, teamd was also stopped in network's ifdown-Team script through "systemctl stop teamd.service". This means our change made network try to stop a service (teamd) whose stop had to wait for network itself to stop, and then the network block issue showed up.

Now we start teamd with '/usr/bin/teamd -d' in network's ifup-Team script and stop it with '/usr/bin/teamd -k'. This means teamd is no longer managed by systemd but is managed by network completely, which fixes all the issues mentioned in this BZ.
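
A rough sketch of what starting and stopping teamd directly could look like is below. It is not the actual ifup-Team/ifdown-Team code: the helper names, the RUN dry-run switch, the device name and the config path are hypothetical, and the use of -t (team device) and -f (config file) alongside -d and -k is an assumption about how the scripts select the instance.

```shell
#!/bin/sh
# Sketch only, not the real initscripts code.
TEAMD_BIN="${TEAMD_BIN:-/usr/bin/teamd}"
# Dry run by default: commands are printed, not executed.
# Set RUN to an empty string to actually run them.
RUN="${RUN-echo}"

# Start one teamd instance as a daemon (-d) for a given team device.
team_start() {    # $1 = device name, $2 = config file (both hypothetical)
    $RUN "$TEAMD_BIN" -d -t "$1" -f "$2"
}

# Ask the running teamd instance for that device to exit (-k).
team_stop() {     # $1 = device name
    $RUN "$TEAMD_BIN" -k -t "$1"
}
```

In dry-run mode, "team_start team0 /etc/teamd/team0.conf" only prints the command line it would run, which makes it easy to check before wiring anything into the network scripts.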

IMO, it's better to let teamd NOT work as a systemd service, but be ONLY managed by network and be a part of the network service.
1) The shutdown block issue here can be fixed, :D
2) There is one teamd daemon per team device, which is easier to handle in network. Besides, even if we make it work as a systemd service, network still has to start/stop it. It can never be independent of network.

Comment 31 Hangbin Liu 2016-07-28 08:54:45 UTC

(In reply to Xin Long from comment #30)
> IMO, it's better to let teamd NOT work as a systemd service, but be ONLY
> managed by network and be a part of the network service.
> 1) The shutdown block issue here can be fixed, :D

Agreed, that would make team easier to handle.

Comment 32 Jiri Pirko 2016-08-08 07:57:59 UTC

I think you guys need to discuss this with systemd people.

Comment 33 Marcelo Ricardo Leitner 2016-08-08 16:27:45 UTC

(In reply to Jiri Pirko from comment #32)
> I think you guys need to discuss this with systemd people.

We are also doing that, in upstream and internally. We are actually shooting everywhere as systemd folks couldn't help us much so far.

The latest idea (still in the works) we have is to still use systemd but ignore dependencies when stopping teamd. So we can still use systemd and most of its features, but also make it work as we need on stop.
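
One way this idea might be expressed is with systemctl's --job-mode=ignore-dependencies option, which queues the stop job without honoring the unit's dependencies. Whether the work in progress uses this option is an assumption; the helper name and the RUN dry-run switch below are hypothetical.

```shell
#!/bin/sh
# Sketch only; the use of --job-mode here is an assumption about the idea.
# Dry run by default: the command is printed, not executed.
RUN="${RUN-echo}"

# Stop a unit while ignoring its dependencies, so the stop job does not
# wait for the network service that is itself issuing the stop.
team_stop_ignoring_deps() {    # $1 = unit name, e.g. teamd.service
    $RUN systemctl --job-mode=ignore-dependencies stop "$1"
}
```

The systemctl documentation describes this job mode as intended mainly for debugging and rescue, so it would need care if used in a regular shutdown path.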

Comment 34 Xin Long 2016-08-14 08:37:36 UTC

upstream fix: https://github.com/jpirko/libteam/commit/4a9e1fac5d69e6abae0451c579b02f16d960e694

Comment 35 Marcelo Ricardo Leitner 2016-08-17 14:27:02 UTC

Not fixing this in z-stream, as the fix is too intrusive for it.
https://bugzilla.redhat.com/show_bug.cgi?id=1354382#c5