Bug 1689184

Summary: Squid RA unable to start due to PID confusion
Product: Red Hat Enterprise Linux 8
Reporter: Patrik Hagara <phagara>
Component: resource-agents
Assignee: Oyvind Albrigtsen <oalbrigt>
Status: CLOSED ERRATA
QA Contact: cluster-qe <cluster-qe>
Severity: urgent
Priority: urgent
Version: 8.0
CC: agk, aherr, cfeist, cluster-maint, fdinitto, gewasiuk, mjuricek, toneata, uwe.knop, wchadwic
Target Milestone: rc
Keywords: Regression, TestBlocker, ZStream
Target Release: 8.0
Flags: pm-rhel: mirror+
Hardware: Unspecified
OS: Unspecified
Fixed In Version: resource-agents-4.1.1-24.el8
Clone Of:
: 1710058
Last Closed: 2019-11-05 20:34:25 UTC
Type: Bug
Bug Blocks: 1710058

Description Patrik Hagara 2019-03-15 11:17:03 UTC
Description of problem:
The ocf:heartbeat:Squid resource agent is unable to start in RHEL-8, apparently due to changes in Squid's process forking behavior. The are_pids_sane() function in the resource agent requires that the Squid process whose PID is stored in the PID file is also the one that binds the listening TCP port. This does not hold on RHEL-8, where the port is bound by a child of the main process.
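
For illustration only, a relaxed form of that check could accept the listener when it is either the PID-file process itself or a direct child of it. The sketch below is not the code from the resource agent or from the eventual fix: the function names is_pid, parent_pid_of and listener_belongs_to are hypothetical, and the parent lookup assumes a Linux /proc filesystem.

```shell
# Hypothetical helpers, not taken from the Squid resource agent.

# Succeeds only for a non-empty string of decimal digits.
is_pid() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;
        *) return 0 ;;
    esac
}

# Prints the parent PID of process $1, read from /proc/<pid>/status
# (assumes Linux). Fails if the process does not exist.
parent_pid_of() {
    local key value
    while read -r key value; do
        if [ "$key" = "PPid:" ]; then
            echo "$value"
            return 0
        fi
    done 2>/dev/null < "/proc/$1/status"
    return 1
}

# $1 = PID read from the PID file, $2 = PID that binds the listening port.
# Succeeds when both are the same process (the case the strict check
# expects) or when the listener is a direct child of the PID-file
# process (the case shown in this report).
listener_belongs_to() {
    local main_pid="$1" listener_pid="$2" parent
    is_pid "$main_pid" || return 1
    is_pid "$listener_pid" || return 1
    [ "$main_pid" = "$listener_pid" ] && return 0
    parent=$(parent_pid_of "$listener_pid") || return 1
    [ "$parent" = "$main_pid" ]
}
```

With the PIDs from the output below, listener_belongs_to 12454 12457 would succeed on the affected host because 12457 is a direct child of 12454, while a strict equality test fails.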

Version-Release number of selected component (if applicable):
resource-agents-4.1.1-17.el8.x86_64                                                                                                                                                                                

How reproducible:
always

Steps to Reproduce:
1. First, apply a small debug-print patch to the RA to show the non-matching PIDs (the first PID is the main process, the second is the one that binds the proxy TCP port):
> [root@virt-014 ~]# diff Squid-orig /usr/lib/ocf/resource.d/heartbeat/Squid                                                                                                                                         
> 218a219                                                                                                                                                                                                            
> >               echo "${SQUID_PIDS[1]} != ${SQUID_PIDS[2]}"

2. pcs resource create squid-proxy ocf:heartbeat:Squid squid_conf=/etc/squid/squid.conf squid_exe=/usr/sbin/squid squid_port=3128 squid_pidfile=/var/run/squid.pid op monitor interval=30s

Actual results:
> [root@virt-014 ~]# pcs resource debug-start squid-proxy                                                                                                                                                            
> Operation start for squid-proxy (ocf:heartbeat:Squid) returned: 'ok' (0)
>  >  stdout: 12454 != 12457
>  >  stderr: ocf-exit-reason:squid:Pid unmatch
> [root@virt-014 ~]# pcs status                                                                                                                                                                                      
>  [...]
>  squid-proxy    (ocf::heartbeat:Squid): Stopped
> 
> Failed Resource Actions:
> * squid-proxy_start_0 on virt-014 'unknown error' (1): call=29, status=Timed Out, exitreason='squid:Pid unmatch',
>     last-rc-change='Thu Mar 14 18:35:34 2019', queued=0ms, exec=60005ms
> * squid-proxy_start_0 on virt-025 'unknown error' (1): call=29, status=Timed Out, exitreason='squid:Pid unmatch',
>     last-rc-change='Thu Mar 14 18:36:47 2019', queued=0ms, exec=60014ms
> [root@virt-014 ~]# ps faux
> [...]
> root     12454  0.0  0.4 111680  8468 ?        Ss   11:47   0:00 /usr/sbin/squid -f /etc/squid/squid.conf
> squid    12457  0.0  1.1 116836 21080 ?        S    11:47   0:00  \_ (squid-1) --kid squid-1 -f /etc/squid/squid.conf                                                                                              
> squid    12470  0.0  0.1  24308  2024 ?        S    11:47   0:00      \_ (logfile-daemon) /var/log/squid/access.log   

Expected results:
resource starts

Additional info:

Comment 1 Oyvind Albrigtsen 2019-03-18 15:07:27 UTC
https://github.com/ClusterLabs/resource-agents/pull/1305

Comment 2 Patrik Hagara 2019-03-19 09:20:35 UTC
qa-ack+, reproducer in bug description (sans RA script modification)

Comment 8 Oyvind Albrigtsen 2019-05-29 14:24:06 UTC
Patch to avoid running "pgrep -P" without a PID: https://github.com/ClusterLabs/resource-agents/pull/1345
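
As a rough sketch of the kind of guard described here (not the code from the linked pull request), a wrapper can refuse to call pgrep when the parent PID is empty or non-numeric. The function name child_pids_of is hypothetical.

```shell
# Hypothetical wrapper, not taken from the resource agent.
# Lists the direct children of PID $1, but only calls "pgrep -P"
# when $1 is a non-empty string of decimal digits.
child_pids_of() {
    local parent_pid="$1"
    case "$parent_pid" in
        ''|*[!0-9]*)
            # No usable PID (for example, the PID file was empty or
            # missing): report failure instead of running pgrep.
            return 1
            ;;
    esac
    pgrep -P "$parent_pid"
}
```

A caller could then treat a non-zero return as "no child processes found", for example: if kids=$(child_pids_of "$pid"); then echo "$kids"; fi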

Comment 12 errata-xmlrpc 2019-11-05 20:34:25 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:3307