Bug 2144754

Summary: FailingStreak is not reset to 0 when the container starts again.
Product: Red Hat Enterprise Linux 8
Reporter: Arya Rajendran <arajendr>
Component: podman
Assignee: Jindrich Novy <jnovy>
Status: CLOSED ERRATA
QA Contact: Alex Jia <ajia>
Severity: urgent
Docs Contact:
Priority: unspecified
Version: ---
CC: ajia, bbaude, dornelas, dwalsh, jligon, jnovy, lsm5, mheon, pthomas, rjo, tsweeney, umohnani, vrothber, ypu
Target Milestone: rc
Keywords: Triaged, ZStream
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: podman-4.3.1-2.el8
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 2149774 2149775
Environment:
Last Closed: 2023-05-16 08:22:23 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 2149774
Bug Blocks: 2149775
Deadline: 2023-02-21   

Description Arya Rajendran 2022-11-22 08:30:46 UTC
Description of problem:

When a container is unhealthy, the podman healthcheck mechanism runs the healthcheck_cmd command number_of_retries times and then performs the action specified in health-on-failure.

If the container is stopped and then started again, podman runs the healthcheck only once when "FailingStreak" is already equal to or larger than number_of_retries, because "FailingStreak" is not reset to 0 on start.
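
The behavior described above can be illustrated with a small shell model. This is only a sketch and not podman's implementation: the function run_failing_checks and the variables retries, failing_streak and checks_run are hypothetical names chosen for illustration. It shows why a streak carried across a restart triggers the on-failure action after a single failed check.

```shell
#!/bin/sh
# Toy model of the reported behavior, NOT podman's actual code.
# All names below are hypothetical.
retries=5          # plays the role of --health-retries 5
failing_streak=0   # plays the role of FailingStreak in the inspect output

# Simulate healthchecks that always fail. Each failure increments the
# streak; the loop stops once the streak reaches $retries, which is the
# point where the on-failure action would fire.
# Result: checks_run holds the number of checks performed.
run_failing_checks() {
    checks_run=0
    while :; do
        checks_run=$((checks_run + 1))
        failing_streak=$((failing_streak + 1))
        if [ "$failing_streak" -ge "$retries" ]; then
            break
        fi
    done
}

# First start: the streak begins at 0.
run_failing_checks
first_start=$checks_run

# Restart as reported in this bug: the old streak is kept.
run_failing_checks
restart_without_reset=$checks_run

# Restart as expected: the streak is reset to 0 first.
failing_streak=0
run_failing_checks
restart_with_reset=$checks_run

echo "first start:           $first_start check(s) before on-failure action"
echo "restart without reset: $restart_without_reset check(s) before on-failure action"
echo "restart with reset:    $restart_with_reset check(s) before on-failure action"
```

In this model the restart without a reset allows only one check, which matches the single retry and the FailingStreak of 6 seen in the reproduction steps.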

Version-Release number of selected component (if applicable):

RHEL 8.7
podman-4.2.0-4

How reproducible:

Always

Steps to Reproduce:
1) Create a container from any RHEL-based image with a failing healthcheck

# podman create -t --restart "on-failure" --health-cmd "/usr/bin/sleep 2; /usr/bin/false;"  --health-interval 30s --health-on-failure restart --health-retries 5 --health-timeout 20s --name testing_on_podman_2 5218 /bin/bash

2) Start the container and wait for FailingStreak to reach or exceed health-retries (in this case 5)

# podman inspect testing_on_podman_2 --format {{.State.Health}}
{unhealthy 5 [{2022-11-21T13:14:56.263661799Z 2022-11-21T13:14:58.644395856Z 1 } {2022-11-21T13:15:29.210324494Z 2022-11-21T13:15:31.617566043Z 1 } {2022-11-21T13:16:02.472515722Z 2022-11-21T13:16:04.918606044Z 1 } {2022-11-21T13:16:35.690635907Z 2022-11-21T13:16:38.126636751Z 1 } {2022-11-21T13:17:08.734469375Z 2022-11-21T13:17:11.183019457Z 1 }]}

3) Stop the container and start it again.

4) Podman runs the healthcheck only once, then the container is stopped

# podman start testing_on_podman_2
testing_on_podman_2
# podman inspect testing_on_podman_2 --format {{.State.Health}}
{unhealthy 6 [{2022-11-21T13:15:29.210324494Z 2022-11-21T13:15:31.617566043Z 1 } {2022-11-21T13:16:02.472515722Z 2022-11-21T13:16:04.918606044Z 1 } {2022-11-21T13:16:35.690635907Z 2022-11-21T13:16:38.126636751Z 1 } {2022-11-21T13:17:08.734469375Z 2022-11-21T13:17:11.183019457Z 1 } {2022-11-21T13:17:38.165949695Z 2022-11-21T13:17:40.709216292Z 1 }]}
# podman inspect testing_on_podman_2 --format {{.State.Status}}
exited
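
To compare the streak before and after the restart without reading the full log, the second field of the {{.State.Health}} output can be extracted. The helper below is a hypothetical sketch, not a podman command; it assumes the default rendering shown above ("{status streak [log...]}"), and the sample strings are abbreviated copies of the outputs in steps 2 and 4 with the log trimmed to the last entry.

```shell
#!/bin/sh
# health_streak is a hypothetical helper, not part of podman.
# It prints the FailingStreak, i.e. the second whitespace-separated field
# of the default `podman inspect --format {{.State.Health}}` rendering.
health_streak() {
    printf '%s\n' "$1" | awk '{ print $2 }'
}

# Abbreviated samples from this report (log trimmed to the last entry).
before='{unhealthy 5 [{2022-11-21T13:17:08.734469375Z 2022-11-21T13:17:11.183019457Z 1 }]}'
after='{unhealthy 6 [{2022-11-21T13:17:38.165949695Z 2022-11-21T13:17:40.709216292Z 1 }]}'

streak_before=$(health_streak "$before")
streak_after=$(health_streak "$after")

# A streak that keeps growing across a stop/start means it was not reset.
if [ "$streak_after" -gt "$streak_before" ]; then
    verdict="FailingStreak carried over the restart ($streak_before -> $streak_after)"
else
    verdict="FailingStreak was reset on restart"
fi
echo "$verdict"
```

On a live system the two strings would come from running the inspect command from steps 2 and 4 before stopping and after restarting the container.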

Comment 21 Alex Jia 2022-12-19 02:55:20 UTC
The bug has been verified on podman-4.3.1-2.module+el8.8.0+17574+f7825c4b.x86_64.

[root@ibm-x3650m4-01-vm-11 podman]# rpm -q podman
podman-4.3.1-2.module+el8.8.0+17574+f7825c4b.x86_64

[root@ibm-x3650m4-01-vm-11 podman]# bats -f "podman healthcheck - restart cleans up old state" test/system/220-healthcheck.bats
220-healthcheck.bats
 ✓ podman healthcheck - restart cleans up old state

1 test, 0 failures

Comment 24 Alex Jia 2023-01-10 10:15:22 UTC
This bug has been verified on podman-4.3.1-2.module+el8.8.0+17695+8a9c0c1b.x86_64.

[root@kvm-01-guest25 ~]# cat /etc/redhat-release
Red Hat Enterprise Linux release 8.8 Beta (Ootpa)

[root@kvm-01-guest25 ~]# rpm -q podman runc systemd kernel
podman-4.3.1-2.module+el8.8.0+17695+8a9c0c1b.x86_64
runc-1.1.4-1.module+el8.8.0+17695+8a9c0c1b.x86_64
systemd-239-69.el8.x86_64
kernel-4.18.0-447.el8.x86_64

[root@kvm-01-guest25 podman]# git branch
  main
* v4.3.1-rhel

[root@kvm-01-guest25 podman]# bats test/system/220-healthcheck.bats
220-healthcheck.bats
 ✓ podman healthcheck
 ✓ podman healthcheck - restart cleans up old state
 ✓ podman healthcheck --health-on-failure
   setup(): removing stray image localhost/healthcheck_i:latest
   setup(): removing stray image eda2a32a6681
   setup(): removing stray image <none>:<none>
   setup(): removing stray image e1a9b6ee6031
   setup(): removing stray image <none>:<none>
   setup(): removing stray image f69722d2c31e
   setup(): removing stray image <none>:<none>
   setup(): removing stray image a222846d5fc0

3 tests, 0 failures

Comment 28 errata-xmlrpc 2023-05-16 08:22:23 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: container-tools:rhel8 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:2758