Bug 2144754

Summary:	FailingStreak is not reset to 0 when the container starts again.
Product:	Red Hat Enterprise Linux 8	Reporter:	Arya Rajendran <arajendr>
Component:	podman	Assignee:	Jindrich Novy <jnovy>
Status:	CLOSED ERRATA	QA Contact:	Alex Jia <ajia>
Severity:	urgent	Docs Contact:
Priority:	unspecified
Version:	---	CC:	ajia, bbaude, dornelas, dwalsh, jligon, jnovy, lsm5, mheon, pthomas, rjo, tsweeney, umohnani, vrothber, ypu
Target Milestone:	rc	Keywords:	Triaged, ZStream
Target Release:	---
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:	podman-4.3.1-2.el8	Doc Type:	If docs needed, set a value
Doc Text:		Story Points:	---
Clone Of:
Clones:	2149774 2149775		Environment:
Last Closed:	2023-05-16 08:22:23 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:
Bug Depends On:	2149774
Bug Blocks:	2149775
Deadline:	2023-02-21

Description Arya Rajendran 2022-11-22 08:30:46 UTC

Description of problem:

When a container is unhealthy, podman's healthcheck mechanism runs the healthcheck_cmd command number_of_retries times and then performs the action specified in health-on-failure.

If the container is stopped and then started again, podman runs the health check only once when "FailingStreak" is already equal to or larger than number_of_retries, because FailingStreak is not reset to 0 on start.
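
The effect of a carried-over streak can be illustrated with a toy model. This is only a sketch of the counting behavior described above, not podman's implementation; the function name checks_until_action and its variables are hypothetical, and it assumes every health check fails and that the action fires once the streak reaches the retry count.

```shell
# Toy model only: counts how many failing health checks run before the
# health-on-failure action fires. All names here are hypothetical.
checks_until_action() {
    streak=$1      # FailingStreak value when the container starts
    retries=$2     # value passed to --health-retries
    checks=0
    while :; do
        checks=$((checks + 1))   # one failing health check runs
        streak=$((streak + 1))   # each failure extends the streak
        if [ "$streak" -ge "$retries" ]; then
            break                # threshold reached, action fires
        fi
    done
    echo "$checks"
}

checks_until_action 0 5   # streak reset on start: prints 5
checks_until_action 5 5   # streak carried over from the previous run: prints 1
```

With a streak of 5 left over from the previous run, the first failure after the restart already meets the threshold, which matches the single attempt reported here.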

Version-Release number of selected component (if applicable):

RHEL 8.7
podman-4.2.0-4

How reproducible:

Always

Steps to Reproduce:
1) Create a container from any RHEL-based image with a failing healthcheck

# podman create -t --restart "on-failure" --health-cmd "/usr/bin/sleep 2; /usr/bin/false;"  --health-interval 30s --health-on-failure restart --health-retries 5 --health-timeout 20s --name testing_on_podman_2 5218 /bin/bash

2) Start the container and wait for FailingStreak to reach or exceed health-retries (in this case 5)

# podman inspect testing_on_podman_2 --format {{.State.Health}}
{unhealthy 5 [{2022-11-21T13:14:56.263661799Z 2022-11-21T13:14:58.644395856Z 1 } {2022-11-21T13:15:29.210324494Z 2022-11-21T13:15:31.617566043Z 1 } {2022-11-21T13:16:02.472515722Z 2022-11-21T13:16:04.918606044Z 1 } {2022-11-21T13:16:35.690635907Z 2022-11-21T13:16:38.126636751Z 1 } {2022-11-21T13:17:08.734469375Z 2022-11-21T13:17:11.183019457Z 1 }]}

3) Stop the container and start it again.

4) Podman runs the health check only once, and then the container is stopped

# podman start testing_on_podman_2
testing_on_podman_2
# podman inspect testing_on_podman_2 --format {{.State.Health}}
{unhealthy 6 [{2022-11-21T13:15:29.210324494Z 2022-11-21T13:15:31.617566043Z 1 } {2022-11-21T13:16:02.472515722Z 2022-11-21T13:16:04.918606044Z 1 } {2022-11-21T13:16:35.690635907Z 2022-11-21T13:16:38.126636751Z 1 } {2022-11-21T13:17:08.734469375Z 2022-11-21T13:17:11.183019457Z 1 } {2022-11-21T13:17:38.165949695Z 2022-11-21T13:17:40.709216292Z 1 }]}
# podman inspect testing_on_podman_2 --format {{.State.Status}}
exited
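
To compare the streak before and after the restart without reading the full log, the inspect output can be reduced to its first two fields. The helper below is a hypothetical sketch, not part of podman: it assumes the output layout shown in the steps above ("{<status> <streak> [...]}"), which may differ between podman versions.

```shell
# Hypothetical helper: reads the output of
#   podman inspect <name> --format '{{.State.Health}}'
# on stdin and prints the status and FailingStreak fields.
health_summary() {
    # The output starts with "{<status> <streak> [", so strip the leading
    # brace and keep the first two whitespace-separated fields.
    sed -e 's/^{//' | awk 'NR == 1 { print "status=" $1, "failing_streak=" $2 }'
}

# Sample input shortened from the report; prints:
# status=unhealthy failing_streak=6
echo '{unhealthy 6 [{2022-11-21T13:17:38.165949695Z 2022-11-21T13:17:40.709216292Z 1 }]}' | health_summary
```

On a live system the helper would be fed from the inspect command quoted in its comment, once after step 2 and again after step 4.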

Comment 21 Alex Jia 2022-12-19 02:55:20 UTC

The bug has been verified on podman-4.3.1-2.module+el8.8.0+17574+f7825c4b.x86_64.

[root@ibm-x3650m4-01-vm-11 podman]# rpm -q podman
podman-4.3.1-2.module+el8.8.0+17574+f7825c4b.x86_64

[root@ibm-x3650m4-01-vm-11 podman]# bats -f "podman healthcheck - restart cleans up old state" test/system/220-healthcheck.bats
220-healthcheck.bats
 ✓ podman healthcheck - restart cleans up old state

1 test, 0 failures

Comment 24 Alex Jia 2023-01-10 10:15:22 UTC

This bug has been verified on podman-4.3.1-2.module+el8.8.0+17695+8a9c0c1b.x86_64.

[root@kvm-01-guest25 ~]# cat /etc/redhat-release
Red Hat Enterprise Linux release 8.8 Beta (Ootpa)

[root@kvm-01-guest25 ~]# rpm -q podman runc systemd kernel
podman-4.3.1-2.module+el8.8.0+17695+8a9c0c1b.x86_64
runc-1.1.4-1.module+el8.8.0+17695+8a9c0c1b.x86_64
systemd-239-69.el8.x86_64
kernel-4.18.0-447.el8.x86_64

[root@kvm-01-guest25 podman]# git branch
  main
* v4.3.1-rhel

[root@kvm-01-guest25 podman]# bats test/system/220-healthcheck.bats
220-healthcheck.bats
 ✓ podman healthcheck
 ✓ podman healthcheck - restart cleans up old state
 ✓ podman healthcheck --health-on-failure
   setup(): removing stray image localhost/healthcheck_i:latest
   setup(): removing stray image eda2a32a6681
   setup(): removing stray image <none>:<none>
   setup(): removing stray image e1a9b6ee6031
   setup(): removing stray image <none>:<none>
   setup(): removing stray image f69722d2c31e
   setup(): removing stray image <none>:<none>
   setup(): removing stray image a222846d5fc0

3 tests, 0 failures

Comment 28 errata-xmlrpc 2023-05-16 08:22:23 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: container-tools:rhel8 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:2758