Bug 2176768

Summary: logfile rotation for sssd_kcm not working properly, sssd_kcm never receives a 'kill -HUP'
Product: Red Hat Enterprise Linux 8 Reporter: Christian Horn <chorn>
Component: sssd Assignee: Alejandro López <allopez>
Status: NEW --- QA Contact: Jakub Vavra <jvavra>
Severity: low Docs Contact:
Priority: medium    
Version: 8.3 CC: aboscatt, atikhono, pbrezina, shane.seymour, stanislav.moravec
Target Milestone: rc Keywords: Triaged
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Christian Horn 2023-03-09 08:38:57 UTC
Description of problem:
logfile rotation for sssd_kcm not working properly, sssd_kcm never receives a 'kill -HUP'

Version-Release number of selected component (if applicable):
Affects all sssd/sssd_kcm versions

How reproducible:
always

Steps to Reproduce:
1. dnf -y install sssd sssd_kcm
2. initiate a log rotation via logrotate

Actual results:
File /var/log/sssd/sssd_kcm.log is moved to
/var/log/sssd/sssd_kcm.log.1, but sssd_kcm keeps the file handle on that
file open instead of starting to write to /var/log/sssd/sssd_kcm.log.

Expected results:
sssd_kcm should drop all file handles to /var/log/sssd/sssd_kcm.log.1 and start
writing to /var/log/sssd/sssd_kcm.log

Additional info:
- As part of package sssd, we deploy /etc/logrotate.d/sssd
- that config file leads to 
  a) /var/log/sssd/*.log being moved to $file.1
  b) and then does kill -HUP `cat /var/run/sssd.pid  2>/dev/null`
That is fine for all children of sssd, but as sssd_kcm is a separate
process, it is not notified and never drops the file handle.
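
The behaviour described above follows from how open file descriptors work: a descriptor refers to the file itself, not to its name, so it keeps pointing at the same file after a rename. A minimal sketch to illustrate this, assuming a POSIX system; `write_after_rename` and the paths passed to it are hypothetical names for this demonstration and are not part of sssd:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical demo helper (not sssd code).
 * Opens `path`, renames it to `rotated` (as logrotate would), then writes
 * `msg` through the descriptor opened before the rename.
 * Returns the resulting size of `rotated`, or -1 on any error. */
long write_after_rename(const char *path, const char *rotated, const char *msg)
{
    struct stat st;
    long size = -1;
    size_t len;

    assert(path != NULL && rotated != NULL && msg != NULL);
    len = strlen(msg);

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC | O_APPEND, 0600);
    if (fd < 0) {
        return -1;
    }

    /* The write goes to the renamed file because fd still refers to it. */
    if (rename(path, rotated) == 0
        && write(fd, msg, len) == (ssize_t)len
        && stat(rotated, &st) == 0) {
        size = (long)st.st_size;
    }

    close(fd);
    return size;
}
```

After the call, the data is in the rotated file and no file exists under the original name until the process reopens it.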

Comment 1 Alexey Tikhonov 2023-03-09 14:42:46 UTC
Probably the reason is https://github.com/SSSD/sssd/blob/712377ea5c5b70a71e4fdb1b1c40ad059157778c/src/util/util.c#L669 :
```
const char * const * get_known_services(void)
{
    static const char *svc[] = {"nss", "pam", "sudo", "autofs",
                                "ssh", "pac", "ifp", NULL };

    return svc;
}
```
  --  `get_known_services()` doesn't list `kcm` so `monitor` refuses to register it:
https://github.com/SSSD/sssd/blob/712377ea5c5b70a71e4fdb1b1c40ad059157778c/src/monitor/monitor.c#L211
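
To illustrate the hypothesis only: a lookup against a NULL-terminated list like the one above would reject "kcm". The sketch below is a standalone assumption; `is_known_service` is a hypothetical helper and is not the check that monitor.c actually performs:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Same entries as the list quoted above; "kcm" is absent. */
static const char *const known_services[] = {"nss", "pam", "sudo", "autofs",
                                             "ssh", "pac", "ifp", NULL};

/* Hypothetical helper (not an SSSD function): returns true if `name`
 * matches one of the entries in the NULL-terminated list. */
bool is_known_service(const char *name)
{
    assert(name != NULL);

    for (size_t i = 0; known_services[i] != NULL; i++) {
        if (strcmp(known_services[i], name) == 0) {
            return true;
        }
    }
    return false;
}
```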


 Christian, is there "Invalid service ..." line in sssd.log?

Comment 2 Christian Horn 2023-03-10 00:14:33 UTC
(In reply to Alexey Tikhonov from comment #1)
> ```
> const char * const * get_known_services(void)
> {
>     static const char *svc[] = {"nss", "pam", "sudo", "autofs",
>                                 "ssh", "pac", "ifp", NULL };
> 
>     return svc;
> }
> ```

Might be worth a test build, to verify whether sssd then starts
as parent of sssd_kcm.


>   --  `get_known_services()` doesn't list `kcm` so `monitor` refuses to
> register it:
> https://github.com/SSSD/sssd/blob/712377ea5c5b70a71e4fdb1b1c40ad059157778c/
> src/monitor/monitor.c#L211
> 
> 
>  Christian, is there "Invalid service ..." line in sssd.log?

Not seeing that in the customer's logs, and also not in a KVM guest
with rhel8.6GA and sssd_kcm running.
  DEBUG(SSSDBG_FATAL_FAILURE, "Invalid service %s\n", svc_name);
should log it without increased debug level, I think.

Comment 3 Christian Horn 2023-03-10 05:19:17 UTC
Tried to use rpmbuild to rebuild the sssd packages on both rhel8 and rhel9.
On both, package cifs-utils-devel is a build dependency, which only exists on
rhel7.  After adding the rhel7 cifs-utils-devel, both rebuilds fail in the
"testing" phase of the build.

Comment 4 Alexey Tikhonov 2023-03-10 09:17:50 UTC
(In reply to Christian Horn from comment #2)
> (In reply to Alexey Tikhonov from comment #1)
> > ```
> > const char * const * get_known_services(void)
> > {
> >     static const char *svc[] = {"nss", "pam", "sudo", "autofs",
> >                                 "ssh", "pac", "ifp", NULL };
> > 
> >     return svc;
> > }
> > ```
> 
> Might be worth a test build, to verify whether sssd then starts
> as parent of sssd_kcm.

Not sure what you mean by "sssd then starts as parent of sssd_kcm."

If you use sssd_kcm as a socket-activated service (the default), then changing this list in the source code won't affect this.
But the idea is that 'monitor' (the main 'sssd' process) should also register socket-activated services (and thus be able to communicate with them).

> >   --  `get_known_services()` doesn't list `kcm` so `monitor` refuses to
> > register it:
> > https://github.com/SSSD/sssd/blob/712377ea5c5b70a71e4fdb1b1c40ad059157778c/
> > src/monitor/monitor.c#L211
> > 
> > 
> >  Christian, is there "Invalid service ..." line in sssd.log?
> 
> Not seeing that on customers logs, and also not in a KVM-guest
> with rhel8.6GA and sssd_kcm running.
>   DEBUG(SSSDBG_FATAL_FAILURE, "Invalid service %s\n", svc_name);
> should log it without increased debug level, I think.

Then it's more complicated. Probably sssd_kcm doesn't register with monitor, I need to check.

Comment 5 Alexey Tikhonov 2023-03-10 09:27:05 UTC
(In reply to Alexey Tikhonov from comment #4)
> 
> Then it's more complicated. Probably sssd_kcm doesn't register with monitor,
> I need to check.

Indeed:
```
$ grep -rn sss_monitor_service_init *
providers/data_provider_be.c:704:    ret = sss_monitor_service_init(be_ctx, be_ctx->ev, be_ctx->sbus_name,
responder/pam/pamsrv.c:410:    ret = sss_monitor_service_init(rctx, rctx->ev, SSS_BUS_PAM,
responder/autofs/autofssrv.c:154:    ret = sss_monitor_service_init(rctx, rctx->ev, SSS_BUS_AUTOFS,
responder/ssh/sshsrv.c:136:    ret = sss_monitor_service_init(rctx, rctx->ev, SSS_BUS_SSH,
responder/sudo/sudosrv.c:112:    ret = sss_monitor_service_init(rctx, rctx->ev, SSS_BUS_SUDO,
responder/pac/pacsrv.c:146:    ret = sss_monitor_service_init(rctx, rctx->ev, SSS_BUS_PAC,
responder/ifp/ifpsrv.c:280:    ret = sss_monitor_service_init(rctx, rctx->ev, SSS_BUS_IFP,
responder/nss/nsssrv.c:631:    ret = sss_monitor_service_init(rctx, rctx->ev, SSS_BUS_NSS,
```

But there might be a reason why it doesn't.

Comment 6 Alexey Tikhonov 2023-03-10 09:54:53 UTC
(In reply to Alexey Tikhonov from comment #5)
> 
> But there might be a reason why it doesn't.

Right, this is on purpose: 'monitor' (the main `sssd` process) and the KCM service (`sssd_kcm`) are totally independent.
On a default RHEL9 install (and RHEL8 starting with 8.8), `sssd_kcm` runs (as a socket-activated service) and `sssd` doesn't (if no domains are configured explicitly).
So I guess we need to add its own `/etc/logrotate.d/sssd_kcm` (or list it in `/etc/logrotate.d/sssd`?) and make it handle HUP on its own (if it doesn't yet).
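
A minimal sketch of what "handle HUP on its own" could look like in a standalone POSIX process: the signal handler only sets a flag, and the main loop reopens the log by name. All names here (`install_sighup_handler`, `reopen_log_if_requested`) are hypothetical; sssd's actual event loop and logging code are not shown in this report and may be structured quite differently:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Set by the handler, consumed by the main loop. */
static volatile sig_atomic_t reopen_requested = 0;

static void on_sighup(int signo)
{
    (void)signo;
    reopen_requested = 1; /* async-signal-safe: only sets a flag */
}

/* Hypothetical helper: installs the SIGHUP handler, returns 0 on success. */
int install_sighup_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sighup;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGHUP, &sa, NULL);
}

/* Hypothetical helper, meant to be called from the main loop.
 * If a HUP arrived, opens `path` again by name and closes the old
 * descriptor (which may still point at the rotated file).
 * Returns the descriptor to use from now on. */
int reopen_log_if_requested(int fd, const char *path)
{
    assert(path != NULL);

    if (!reopen_requested) {
        return fd;
    }
    reopen_requested = 0;

    int new_fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0600);
    if (new_fd < 0) {
        return fd; /* keep logging to the old file if reopening fails */
    }
    if (fd >= 0) {
        close(fd);
    }
    return new_fd;
}
```

With something along these lines in place, a logrotate postrotate step would also need a way to find the sssd_kcm process to signal; how to do that is left open here.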

Comment 7 Christian Horn 2023-03-10 09:57:23 UTC
(In reply to Alexey Tikhonov from comment #4)
> (In reply to Christian Horn from comment #2)
> > (In reply to Alexey Tikhonov from comment #1)
> > > ```
> > > const char * const * get_known_services(void)
> > > {
> > >     static const char *svc[] = {"nss", "pam", "sudo", "autofs",
> > >                                 "ssh", "pac", "ifp", NULL };
> > > 
> > >     return svc;
> > > }
> > > ```
> > 
> > Might be worth a test build, to verify whether sssd then starts
> > as parent of sssd_kcm.
> 
> Not sure what you mean with "sssd then starts as parent of sssd_kcm."

[root@rhel8u6a sa]# ps axf|grep [s]ssd
    752 ?        Ss     0:00 /usr/sbin/sssd -i --logger=files
    770 ?        S      0:00  \_ /usr/libexec/sssd/sssd_be --domain implicit_files --uid 0 --gid 0 --logger=files
    798 ?        S      0:01  \_ /usr/libexec/sssd/sssd_nss --uid 0 --gid 0 --logger=files
 878497 ?        Ss     0:00 /usr/libexec/sssd/sssd_kcm --uid 0 --gid 0 --logger=files
[root@rhel8u6a sa]# 

sssd_kcm is a separate process; that is what I meant.
It is not receiving a 'kill -HUP' at the moment.

> If you use sssd_kcm as a socket-activated service (the default), then changing
> this list in the source code won't affect this.

Ok.