Bug 2011171
| Summary: | diskmaker-manager constantly redeployed by LSO when creating LV's | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | ryoji noma <rnoma> |
| Component: | Storage | Assignee: | Jonathan Dobson <jdobson> |
| Storage sub component: | Local Storage Operator | QA Contact: | Chao Yang <chaoyang> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | high | | |
| Priority: | high | CC: | aos-bugs, mori |
| Version: | 4.8 | | |
| Target Milestone: | --- | | |
| Target Release: | 4.10.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | No Doc Update |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | | |
| : | 2016556 2016557 | Environment: | |
| Last Closed: | 2022-03-10 16:17:19 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 2016556 | | |
Description
ryoji noma
2021-10-06 08:00:15 UTC
Thank you Manabu and Ryoji, this info was very helpful. I'm able to reproduce the issue; still working on the RCA.
0) Initial setup
Installed LSO and created a series of loopback devices for testing on each worker node as follows:
[jon@dagobah 2011171]$ oc debug node/ip-10-0-138-245.us-east-2.compute.internal
...
sh-4.4# for i in `seq 1 4`; do dd if=/dev/zero of=/tmp/loopback${i}.img bs=1 count=0 seek=1G; losetup -fP /tmp/loopback${i}.img; done
...
sh-4.4# losetup -a
/dev/loop1: [0042]:4466232 (/tmp/loopback2.img)
/dev/loop2: [0042]:4466840 (/tmp/loopback3.img)
/dev/loop0: [0042]:4466837 (/tmp/loopback1.img)
/dev/loop3: [0042]:4466842 (/tmp/loopback4.img)
1) Create single LV on single worker node
[jon@dagobah 2011171]$ cat lv-loop0.yaml
apiVersion: "local.storage.openshift.io/v1"
kind: "LocalVolume"
metadata:
  name: "lv-loop0"
  namespace: "openshift-local-storage"
spec:
  nodeSelector:
    nodeSelectorTerms:
    - matchExpressions:
      - key: kubernetes.io/hostname
        operator: In
        values:
        - ip-10-0-138-245
  storageClassDevices:
  - storageClassName: "lv-loop0"
    volumeMode: Filesystem
    fsType: xfs
    devicePaths:
    - /dev/loop0
[jon@dagobah 2011171]$ oc apply -f lv-loop0.yaml
localvolume.local.storage.openshift.io/lv-loop0 created
This triggered multiple updates to the daemonset (see "daemonset changed" entries):
[jon@dagobah 2011171]$ oc logs -n openshift-local-storage pod/local-storage-operator-77d6b45fc9-t8879
...
I1014 22:28:05.087390 1 reconcile.go:34] Reconciling LocalVolume
I1014 22:28:05.087454 1 api_updater.go:74] Updating localvolume openshift-local-storage/lv-loop0
I1014 22:28:05.097273 1 reconcile.go:34] Reconciling LocalVolume
{"level":"info","ts":1634250485.0975707,"logger":"localvolumesetdaemon-controller","msg":"provisioner configmap changed","Request.Namespace":"openshift-local-storage","Request.Name":""}
E1014 22:28:05.108356 1 reconcile.go:96] failed to fetch diskmaker daemonset DaemonSet.apps "diskmaker-manager" not found
E1014 22:28:05.114731 1 reconcile.go:134] error syncing condition: LocalVolume.local.storage.openshift.io "lv-loop0" is invalid: status.generations: Required value
{"level":"info","ts":1634250485.1165073,"logger":"localvolumesetdaemon-controller","msg":"daemonset changed","Request.Namespace":"openshift-local-storage","Request.Name":"","daemonset.Name":"diskmaker-manager","op.Result":"created"}
I1014 22:28:05.117012 1 reconcile.go:34] Reconciling LocalVolume
{"level":"info","ts":1634250485.126506,"logger":"localvolumesetdaemon-controller","msg":"daemonset changed","Request.Namespace":"openshift-local-storage","Request.Name":"","daemonset.Name":"diskmaker-manager","op.Result":"updated"}
{"level":"info","ts":1634250485.1347363,"logger":"localvolumesetdaemon-controller","msg":"daemonset changed","Request.Namespace":"openshift-local-storage","Request.Name":"","daemonset.Name":"diskmaker-manager","op.Result":"updated"}
I1014 22:28:05.135488 1 reconcile.go:34] Reconciling LocalVolume
{"level":"info","ts":1634250485.1427386,"logger":"localvolumesetdaemon-controller","msg":"daemonset changed","Request.Namespace":"openshift-local-storage","Request.Name":"","daemonset.Name":"diskmaker-manager","op.Result":"updated"}
I1014 22:28:05.165650 1 reconcile.go:34] Reconciling LocalVolume
I1014 22:28:05.176266 1 reconcile.go:34] Reconciling LocalVolume
{"level":"error","ts":1634250485.17673,"logger":"controller-runtime.controller","msg":"Reconciler error","controller":"localvolumesetdaemon-controller","request":"openshift-local-storage/","error":"Operation cannot be fulfilled on daemonsets.apps \"diskmaker-manager\": the object has been modified; please apply your changes to the latest version and try again","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\t/go/src/github.com/openshift/local-storage-operator/vendor/github.com/go-logr/zapr/zapr.go:128\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/src/github.com/openshift/local-storage-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:258\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/src/github.com/openshift/local-storage-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:232\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker\n\t/go/src/github.com/openshift/local-storage-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:211\nk8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\t/go/src/github.com/openshift/local-storage-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:152\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/go/src/github.com/openshift/local-storage-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:153\nk8s.io/apimachinery/pkg/util/wait.Until\n\t/go/src/github.com/openshift/local-storage-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88"}
{"level":"info","ts":1634250486.185989,"logger":"localvolumesetdaemon-controller","msg":"daemonset changed","Request.Namespace":"openshift-local-storage","Request.Name":"","daemonset.Name":"diskmaker-manager","op.Result":"updated"}
I1014 22:28:16.900963 1 reconcile.go:34] Reconciling LocalVolume
{"level":"info","ts":1634250496.9091973,"logger":"localvolumesetdaemon-controller","msg":"daemonset changed","Request.Namespace":"openshift-local-storage","Request.Name":"","daemonset.Name":"diskmaker-manager","op.Result":"updated"}
I1014 22:28:19.483278 1 reconcile.go:34] Reconciling LocalVolume
I1014 22:28:19.493026 1 reconcile.go:34] Reconciling LocalVolume
I1014 22:28:19.510662 1 reconcile.go:34] Reconciling LocalVolume
I1014 22:28:19.518589 1 reconcile.go:34] Reconciling LocalVolume
However, the generation number was still 1 after those updates:
[jon@dagobah 2011171]$ oc get daemonset.apps/diskmaker-manager -n openshift-local-storage -o yaml | head
apiVersion: apps/v1
kind: DaemonSet
metadata:
  annotations:
    deprecated.daemonset.template.generation: "1"
    local.storage.openshift.io/configMapDataHash: 17032e345ceabe091d608b58ed093b2c5d1c2f0107dcfe34e7888d8ffca1c53c
  creationTimestamp: "2021-10-14T22:28:05Z"
  generation: 1
  labels:
    app: diskmaker-manager
This seems like a potential issue... if the generation number is not changing, then why isn't CreateOrUpdate() returning OperationResultNone?
https://github.com/openshift/local-storage-operator/blob/81d182a4cabe04d2dc7f42d7b6167bd0d86900e2/vendor/sigs.k8s.io/controller-runtime/pkg/controller/controllerutil/controllerutil.go#L211-L224
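For reference, here is a minimal sketch of the decision CreateOrUpdate makes, modeled on the vendored controller-runtime code linked above (an assumption on my part: the create path and error handling are omitted, and a plain map stands in for the DaemonSet object). The point is that an "updated" result should only be possible when the mutate callback produces an object that differs from what is already stored:

```go
package main

import (
	"fmt"
	"reflect"
)

// operationResult mirrors the idea behind controllerutil.OperationResult.
type operationResult string

const (
	resultNone    operationResult = "unchanged"
	resultUpdated operationResult = "updated"
)

// createOrUpdate applies mutate to a copy of the existing object and only
// reports an update when the mutated copy differs from the original.
func createOrUpdate(existing map[string]string, mutate func(map[string]string)) operationResult {
	desired := make(map[string]string, len(existing))
	for k, v := range existing {
		desired[k] = v
	}
	mutate(desired)
	if reflect.DeepEqual(existing, desired) {
		return resultNone // no API call would be issued
	}
	return resultUpdated // an Update call would be issued here
}

func main() {
	spec := map[string]string{"image": "diskmaker:v4.8"}
	// Mutate writes the same value: no update should be reported.
	fmt.Println(createOrUpdate(spec, func(m map[string]string) { m["image"] = "diskmaker:v4.8" }))
	// Mutate changes the value: an update is reported.
	fmt.Println(createOrUpdate(spec, func(m map[string]string) { m["image"] = "diskmaker:v4.9" }))
}
```

So if CreateOrUpdate keeps reporting "updated" while the stored object looks unchanged, the mutate function itself must be producing something that differs from the stored object on every call.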
2) Create 4 LV's on each of 2 worker nodes (8 LV's total)
[jon@dagobah 2011171]$ cat reproduce-loopback.sh
#!/bin/bash
devprefix=/dev/loop
volmode=fs
declare -a workers=(
  "ip-10-0-138-245"
  "ip-10-0-163-198"
)
case ${volmode} in
  fs)
    volumeMode=Filesystem
    ;;
  block)
    volumeMode=Block
    ;;
esac
echo "* volmode: ${volmode}"
for (( w=0; w<${#workers[@]}; w++ )); do
  worker=${workers[$w]}
  echo "$w" "$worker"
  for (( d=0; d<4; d++ )); do
    devpath="${devprefix}$d"
    echo "$d" "${devpath}"
    cat << END | oc create -f -
apiVersion: "local.storage.openshift.io/v1"
kind: "LocalVolume"
metadata:
  name: "local-${volmode}-w${w}-${d}"
  namespace: "openshift-local-storage"
spec:
  nodeSelector:
    nodeSelectorTerms:
    - matchExpressions:
      - key: kubernetes.io/hostname
        operator: In
        values:
        - ${worker}
  storageClassDevices:
  - storageClassName: "local-${volmode}-w${w}-${d}"
    volumeMode: ${volumeMode}
    devicePaths:
    - ${devpath}
END
    sleep 1
  done
done
[jon@dagobah 2011171]$ ./reproduce-loopback.sh
* volmode: fs
0 ip-10-0-138-245
0 /dev/loop0
localvolume.local.storage.openshift.io/local-fs-w0-0 created
1 /dev/loop1
localvolume.local.storage.openshift.io/local-fs-w0-1 created
2 /dev/loop2
localvolume.local.storage.openshift.io/local-fs-w0-2 created
3 /dev/loop3
localvolume.local.storage.openshift.io/local-fs-w0-3 created
1 ip-10-0-163-198
0 /dev/loop0
localvolume.local.storage.openshift.io/local-fs-w1-0 created
1 /dev/loop1
localvolume.local.storage.openshift.io/local-fs-w1-1 created
2 /dev/loop2
localvolume.local.storage.openshift.io/local-fs-w1-2 created
3 /dev/loop3
localvolume.local.storage.openshift.io/local-fs-w1-3 created
Right away we can see the behavior described in the bug report.
[jon@dagobah 2011171]$ oc get pods -n openshift-local-storage
NAME READY STATUS RESTARTS AGE
diskmaker-manager-qqhmz 0/1 Terminating 0 10s
diskmaker-manager-xhs4h 0/1 Terminating 0 6s
local-storage-operator-77d6b45fc9-t8879 1/1 Running 0 87m
[jon@dagobah 2011171]$ oc get pods -n openshift-local-storage
NAME READY STATUS RESTARTS AGE
diskmaker-manager-qqhmz 0/1 Terminating 0 15s
diskmaker-manager-xhs4h 0/1 Terminating 0 11s
local-storage-operator-77d6b45fc9-t8879 1/1 Running 0 87m
[jon@dagobah 2011171]$ oc get pods -n openshift-local-storage
NAME READY STATUS RESTARTS AGE
diskmaker-manager-5l52x 0/1 ContainerCreating 0 1s
diskmaker-manager-plzvk 0/1 Pending 0 0s
local-storage-operator-77d6b45fc9-t8879 1/1 Running 0 88m
[jon@dagobah 2011171]$ oc get pods -n openshift-local-storage
NAME READY STATUS RESTARTS AGE
diskmaker-manager-5l52x 0/1 Terminating 0 3s
diskmaker-manager-xw8ql 0/1 ContainerCreating 0 0s
local-storage-operator-77d6b45fc9-t8879 1/1 Running 0 88m
While that was happening, I captured the yaml for diskmaker-manager and ran a diff.
[jon@dagobah 2011171]$ oc get daemonset.apps/diskmaker-manager -n openshift-local-storage -o yaml > diskmaker1.yaml
[jon@dagobah 2011171]$ oc get daemonset.apps/diskmaker-manager -n openshift-local-storage -o yaml > diskmaker2.yaml
[jon@dagobah 2011171]$ oc get daemonset.apps/diskmaker-manager -n openshift-local-storage -o yaml > diskmaker3.yaml
[jon@dagobah 2011171]$ oc get daemonset.apps/diskmaker-manager -n openshift-local-storage -o yaml > diskmaker4.yaml
[jon@dagobah 2011171]$ diff diskmaker1.yaml diskmaker4.yaml
5c5
< deprecated.daemonset.template.generation: "310"
---
> deprecated.daemonset.template.generation: "339"
8c8
< generation: 310
---
> generation: 339
62c62
< resourceVersion: "240948"
---
> resourceVersion: "241199"
198c198
< observedGeneration: 310
---
> observedGeneration: 339
At first glance it looks like only the generation number and resourceVersion are being updated, but multiple updates happened between those two captures.
What's really happening is that the ownerReferences and nodeAffinity lists are being reordered:
[jon@dagobah 2011171]$ diff diskmaker2.yaml diskmaker3.yaml
5c5
< deprecated.daemonset.template.generation: "324"
---
> deprecated.daemonset.template.generation: "329"
8c8
< generation: 324
---
> generation: 329
18,23d17
< name: local-fs-w0-3
< uid: 4178c3bf-33d9-4122-aac4-053be382362c
< - apiVersion: local.storage.openshift.io/v1
< blockOwnerDeletion: false
< controller: false
< kind: LocalVolume
62c56,62
< resourceVersion: "241069"
---
> - apiVersion: local.storage.openshift.io/v1
> blockOwnerDeletion: false
> controller: false
> kind: LocalVolume
> name: local-fs-w0-3
> uid: 4178c3bf-33d9-4122-aac4-053be382362c
> resourceVersion: "241122"
85c85
< - ip-10-0-138-245
---
> - ip-10-0-163-198
105c105
< - ip-10-0-163-198
---
> - ip-10-0-138-245
198,199c198
< observedGeneration: 321
< updatedNumberScheduled: 2
---
> observedGeneration: 329
[jon@dagobah 2011171]$ diff diskmaker3.yaml diskmaker4.yaml
5c5
< deprecated.daemonset.template.generation: "329"
---
> deprecated.daemonset.template.generation: "339"
8c8
< generation: 329
---
> generation: 339
17a18,23
> name: local-fs-w0-3
> uid: 4178c3bf-33d9-4122-aac4-053be382362c
> - apiVersion: local.storage.openshift.io/v1
> blockOwnerDeletion: false
> controller: false
> kind: LocalVolume
56,62c62
< - apiVersion: local.storage.openshift.io/v1
< blockOwnerDeletion: false
< controller: false
< kind: LocalVolume
< name: local-fs-w0-3
< uid: 4178c3bf-33d9-4122-aac4-053be382362c
< resourceVersion: "241122"
---
> resourceVersion: "241199"
85c85
< - ip-10-0-163-198
---
> - ip-10-0-138-245
105c105
< - ip-10-0-138-245
---
> - ip-10-0-163-198
198c198
< observedGeneration: 329
---
> observedGeneration: 339
configMapDataHash is not getting updated like I had theorized earlier, but I captured the configmap during this experiment too:
[jon@dagobah 2011171]$ oc get configmap -n openshift-local-storage local-provisioner -o yaml > configmap1.yaml
[jon@dagobah 2011171]$ oc get configmap -n openshift-local-storage local-provisioner -o yaml > configmap2.yaml
[jon@dagobah 2011171]$ oc get configmap -n openshift-local-storage local-provisioner -o yaml > configmap3.yaml
[jon@dagobah 2011171]$ oc get configmap -n openshift-local-storage local-provisioner -o yaml > configmap4.yaml
And here, again, it's the ownerReferences being reordered and updated.
[jon@dagobah 2011171]$ diff configmap2.yaml configmap3.yaml
68,69c68,69
< name: local-fs-w0-2
< uid: 7e645f08-b7c1-4e1c-a3ce-8d9e089dbdc2
---
> name: local-fs-w1-1
> uid: 5dce825e-3eeb-4402-835a-03e37a01ecba
74,75c74,75
< name: local-fs-w0-3
< uid: 4178c3bf-33d9-4122-aac4-053be382362c
---
> name: local-fs-w1-2
> uid: 9bb2cfae-3f78-46ee-bfc3-87c6a01d3173
80,81c80,81
< name: local-fs-w1-0
< uid: 2c2ee787-1938-43b2-8845-a02244819e4e
---
> name: local-fs-w1-3
> uid: 03ab4601-dda4-4b32-ac81-fdc79dc8885b
86,87c86,87
< name: local-fs-w1-1
< uid: 5dce825e-3eeb-4402-835a-03e37a01ecba
---
> name: local-fs-w0-0
> uid: d22d47e8-c03a-4b0e-8790-decbf979966b
92,93c92,93
< name: local-fs-w1-2
< uid: 9bb2cfae-3f78-46ee-bfc3-87c6a01d3173
---
> name: local-fs-w0-1
> uid: 8dd04e56-caa4-4374-9101-b7b890b368a8
98,99c98,99
< name: local-fs-w1-3
< uid: 03ab4601-dda4-4b32-ac81-fdc79dc8885b
---
> name: local-fs-w0-2
> uid: 7e645f08-b7c1-4e1c-a3ce-8d9e089dbdc2
104,105c104,105
< name: local-fs-w0-0
< uid: d22d47e8-c03a-4b0e-8790-decbf979966b
---
> name: local-fs-w0-3
> uid: 4178c3bf-33d9-4122-aac4-053be382362c
110,112c110,112
< name: local-fs-w0-1
< uid: 8dd04e56-caa4-4374-9101-b7b890b368a8
< resourceVersion: "249667"
---
> name: local-fs-w1-0
> uid: 2c2ee787-1938-43b2-8845-a02244819e4e
> resourceVersion: "249740"
[jon@dagobah 2011171]$ diff configmap3.yaml configmap4.yaml
68,69c68,69
< name: local-fs-w1-1
< uid: 5dce825e-3eeb-4402-835a-03e37a01ecba
---
> name: local-fs-w0-2
> uid: 7e645f08-b7c1-4e1c-a3ce-8d9e089dbdc2
74,75c74,75
< name: local-fs-w1-2
< uid: 9bb2cfae-3f78-46ee-bfc3-87c6a01d3173
---
> name: local-fs-w0-3
> uid: 4178c3bf-33d9-4122-aac4-053be382362c
80,81c80,81
< name: local-fs-w1-3
< uid: 03ab4601-dda4-4b32-ac81-fdc79dc8885b
---
> name: local-fs-w1-0
> uid: 2c2ee787-1938-43b2-8845-a02244819e4e
86,87c86,87
< name: local-fs-w0-0
< uid: d22d47e8-c03a-4b0e-8790-decbf979966b
---
> name: local-fs-w1-1
> uid: 5dce825e-3eeb-4402-835a-03e37a01ecba
92,93c92,93
< name: local-fs-w0-1
< uid: 8dd04e56-caa4-4374-9101-b7b890b368a8
---
> name: local-fs-w1-2
> uid: 9bb2cfae-3f78-46ee-bfc3-87c6a01d3173
98,99c98,99
< name: local-fs-w0-2
< uid: 7e645f08-b7c1-4e1c-a3ce-8d9e089dbdc2
---
> name: local-fs-w1-3
> uid: 03ab4601-dda4-4b32-ac81-fdc79dc8885b
104,105c104,105
< name: local-fs-w0-3
< uid: 4178c3bf-33d9-4122-aac4-053be382362c
---
> name: local-fs-w0-0
> uid: d22d47e8-c03a-4b0e-8790-decbf979966b
110,112c110,112
< name: local-fs-w1-0
< uid: 2c2ee787-1938-43b2-8845-a02244819e4e
< resourceVersion: "249740"
---
> name: local-fs-w0-1
> uid: 8dd04e56-caa4-4374-9101-b7b890b368a8
> resourceVersion: "249836"
So right now the things I want to understand are:
A) When creating a single LV, why do we get so many messages about the daemonset being updated, even though the generation number is only 1? I would expect CreateOrUpdate() to return OperationResultNone in that case and not print a log message.
B) Why is the list ordering for ownerReferences and nodeAffinity constantly changing, causing updates to both the daemonset and the configmap? This seems to happen only with 2+ worker nodes, and it is what's causing the diskmaker-manager pods to be recreated over and over.
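For context on question B: before deciding to update, the vendored code linked earlier compares the old and mutated objects with a deep-equality check, and slices compare element by element, so the same ownerReferences in a different order count as a change. An illustrative snippet (stand-in types, not the operator's code):

```go
package main

import (
	"fmt"
	"reflect"
)

// ownerRef is a stand-in for metav1.OwnerReference, enough to show the point.
type ownerRef struct {
	Kind string
	Name string
}

func main() {
	a := []ownerRef{{"LocalVolume", "local-fs-w0-0"}, {"LocalVolume", "local-fs-w1-0"}}
	b := []ownerRef{{"LocalVolume", "local-fs-w1-0"}, {"LocalVolume", "local-fs-w0-0"}}
	// Same entries, different order: a deep-equality check treats the slices as
	// different, so an update would be issued even though nothing meaningful changed.
	fmt.Println(reflect.DeepEqual(a, b)) // false
}
```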
Ok, RCA is understood now. I spent quite a bit of time tracing this with delve, and there were a few issues to fix.

1) We create the list of ownerRefs and node selector terms here in extractLVInfo:
https://github.com/openshift/local-storage-operator/blob/81d182a4cabe04d2dc7f42d7b6167bd0d86900e2/pkg/controller/nodedaemon/aggregate.go#L88
But the order of the lists generated by that function is different every time. This explains my previous observations where the list ordering for the owner refs and node selectors was constantly changing. The fix for this is to sort the slice of lvs before looping through them, similar to what we already do in extractLVSetInfo, so that the lists will be created in a predictable order (a minimal sketch of this sorting approach follows after item 4 below):
https://github.com/openshift/local-storage-operator/blob/81d182a4cabe04d2dc7f42d7b6167bd0d86900e2/pkg/controller/nodedaemon/aggregate.go#L56-L61

2) Every time LSO updated the daemonset, it was attempting to clear TerminationMessagePath and TerminationMessagePolicy:
(dlv) print existing.Spec.Template.Spec.Containers[0]
k8s.io/api/core/v1.Container {
...
TerminationMessagePath: "/dev/termination-log",
TerminationMessagePolicy: "File",
...
(dlv) print obj.Spec.Template.Spec.Containers[0]
k8s.io/api/core/v1.Container {
...
TerminationMessagePath: "",
TerminationMessagePolicy: "",
...
Looks like those fields were set by prometheus-operator. Maybe LSO is even "fighting" prometheus-operator to set those fields? Every time LSO updates the daemonset, I see the same thing. Those 2 fields are set on the existing object, but since they are not in MutateAggregatedSpec(), LSO tries to clear them on every update. Adding them to the definition in MutateAggregatedSpec() solves this particular issue.
https://github.com/openshift/local-storage-operator/blob/81d182a4cabe04d2dc7f42d7b6167bd0d86900e2/pkg/controller/nodedaemon/daemonsets.go#L159

3) Every time LSO updated the daemonset, it was also attempting to clear the APIVersion field from the field selector for each environment variable, i.e.:
Name: "MY_NODE_NAME",
Value: "",
ValueFrom: &v1.EnvVarSource{
    FieldRef: &v1.ObjectFieldSelector{
-       APIVersion: "v1",
+       APIVersion: "",
        FieldPath: "spec.nodeName",
    },
This again is because that field is not defined in MutateAggregatedSpec(), and it can be fixed by setting it to v1 for each entry here:
https://github.com/openshift/local-storage-operator/blob/81d182a4cabe04d2dc7f42d7b6167bd0d86900e2/pkg/controller/nodedaemon/daemonsets.go#L233-L256

4) There is a Watch on DaemonSets that I don't think is necessary:
https://github.com/openshift/local-storage-operator/blob/81d182a4cabe04d2dc7f42d7b6167bd0d86900e2/pkg/controller/nodedaemon/add.go#L70-L73
We already watch for changes on the daemonset at line 54, and LV's on line 65. I think this one can be removed; it's triggering an extra reconcile when one should not be needed. It was already removed from the master branch at some point:
https://github.com/openshift/local-storage-operator/blob/a38697051f4567c7f630c897d11b202206113832/controllers/nodedaemon/reconcile.go#L207-L212
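As promised above, a minimal sketch of the sorting idea from issue 1. The types and helper names here are made up for illustration and this is not the actual extractLVInfo code; the point is simply that sorting the LocalVolume slice by name before building the derived lists makes repeated reconciles produce identical output:

```go
package main

import (
	"fmt"
	"sort"
)

// localVolume is a stand-in for the LocalVolume CR; only the fields needed
// for this sketch are included.
type localVolume struct {
	Name     string
	Hostname string
}

// extractInfo builds owner-reference names and node hostnames in a
// deterministic order by sorting the input slice by name first.
func extractInfo(lvs []localVolume) (ownerNames, hostnames []string) {
	// Sorting up front is the fix: without it, the generated lists follow
	// whatever order the list call happened to return, which can change
	// from one reconcile to the next and makes the object look "changed".
	sort.Slice(lvs, func(i, j int) bool { return lvs[i].Name < lvs[j].Name })
	seen := map[string]bool{}
	for _, lv := range lvs {
		ownerNames = append(ownerNames, lv.Name)
		if !seen[lv.Hostname] {
			seen[lv.Hostname] = true
			hostnames = append(hostnames, lv.Hostname)
		}
	}
	return ownerNames, hostnames
}

func main() {
	lvs := []localVolume{
		{Name: "local-fs-w1-0", Hostname: "ip-10-0-163-198"},
		{Name: "local-fs-w0-0", Hostname: "ip-10-0-138-245"},
	}
	owners, hosts := extractInfo(lvs)
	fmt.Println(owners) // [local-fs-w0-0 local-fs-w1-0] on every run
	fmt.Println(hosts)  // [ip-10-0-138-245 ip-10-0-163-198]
}
```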
After fixing those 4 issues in my local test VM, I see the configmap and daemonset only getting updated once when creating a new LV:
[vagrant@fedora33 deploy]$ k get localvolumes
NAME       AGE
lv-loop0   19h
lv-loop1   19h
lv-loop2   19h
[vagrant@fedora33 deploy]$ k apply -f lv-loop3.yaml
localvolume.local.storage.openshift.io/lv-loop3 created
[vagrant@fedora33 deploy]$ k logs local-storage-operator-5d6fc74cdd-5xml8 -f
...
I1021 19:16:59.423579 1 reconcile.go:34] Reconciling LocalVolume
I1021 19:16:59.423628 1 api_updater.go:74] Updating localvolume default/lv-loop3
I1021 19:16:59.450197 1 reconcile.go:34] Reconciling LocalVolume
{"level":"info","ts":1634843819.4503026,"logger":"localvolumesetdaemon-controller","msg":"provisioner configmap changed","Request.Namespace":"default","Request.Name":""}
{"level":"info","ts":1634843819.4561226,"logger":"localvolumesetdaemon-controller","msg":"daemonset changed","Request.Namespace":"default","Request.Name":"","daemonset.Name":"diskmaker-manager","op.Result":"updated"}
I1021 19:16:59.468948 1 reconcile.go:34] Reconciling LocalVolume
I1021 19:16:59.476097 1 reconcile.go:34] Reconciling LocalVolume
I1021 19:16:59.481306 1 reconcile.go:34] Reconciling LocalVolume
I1021 19:16:59.484802 1 reconcile.go:34] Reconciling LocalVolume

My testing so far was based on the release-4.8 branch, so next steps:
- port fix over to 4.10 and 4.9 branches
- test fix on 4.10, 4.9, 4.8 OCP clusters
- file PR's for 4.10, 4.9, 4.8

RE https://bugzilla.redhat.com/show_bug.cgi?id=2011171#c11
Issues #1 and #2 apply to all 3 versions (4.8, 4.9, 4.10). Issues #3 and #4 only apply to 4.8 (already fixed on newer versions). 4.9 and 4.10 also have a kube-rbac-proxy sidecar that needs to be updated to avoid extra daemonset updates. The following fields were undefined and getting cleared on every update attempt:
containers:
- name: kube-rbac-proxy
  imagePullPolicy: IfNotPresent
  terminationMessagePath: /dev/termination-log
  terminationMessagePolicy: File
volumes:
- name: metrics-serving-cert
  secret:
    defaultMode: 420
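As an illustration of the kind of change this implies (a hedged sketch, not the actual MutateAggregatedSpec code; the container is abbreviated and combines the fields from issues #2/#3 with the kube-rbac-proxy excerpt above), the desired object has to spell out the fields the API server defaults, otherwise every reconcile looks like an attempt to clear them and triggers another rollout:

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

func int32Ptr(i int32) *int32 { return &i }

// desiredSidecar pins fields that the API server would otherwise default:
// if they are left zero-valued in the desired spec, every update tries to
// remove them from the live object.
func desiredSidecar() corev1.Container {
	return corev1.Container{
		Name:                     "kube-rbac-proxy",
		ImagePullPolicy:          corev1.PullIfNotPresent,
		TerminationMessagePath:   "/dev/termination-log",
		TerminationMessagePolicy: corev1.TerminationMessageReadFile,
		Env: []corev1.EnvVar{{
			Name: "MY_NODE_NAME",
			ValueFrom: &corev1.EnvVarSource{
				FieldRef: &corev1.ObjectFieldSelector{
					APIVersion: "v1", // defaulted by the API server if unset
					FieldPath:  "spec.nodeName",
				},
			},
		}},
	}
}

// desiredMetricsCertVolume pins the secret volume's file mode; 420 decimal
// (0644 octal) is the API server default.
func desiredMetricsCertVolume() corev1.Volume {
	return corev1.Volume{
		Name: "metrics-serving-cert",
		VolumeSource: corev1.VolumeSource{
			Secret: &corev1.SecretVolumeSource{
				// secretName is not shown in the excerpt above, so it is left out here too
				DefaultMode: int32Ptr(420),
			},
		},
	}
}

func main() {
	c, v := desiredSidecar(), desiredMetricsCertVolume()
	fmt.Println(c.Name, *v.VolumeSource.Secret.DefaultMode)
}
```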
./reproduce-loopback.sh
* volmode: fs
0 ip-10-0-187-149.us-east-2.compute.internal
0 /dev/loop0
localvolume.local.storage.openshift.io/local-fs-w0-0 created
1 /dev/loop1
localvolume.local.storage.openshift.io/local-fs-w0-1 created
2 /dev/loop2
localvolume.local.storage.openshift.io/local-fs-w0-2 created
3 /dev/loop3
localvolume.local.storage.openshift.io/local-fs-w0-3 created
1 ip-10-0-142-92.us-east-2.compute.internal
0 /dev/loop0
localvolume.local.storage.openshift.io/local-fs-w1-0 created
1 /dev/loop1
localvolume.local.storage.openshift.io/local-fs-w1-1 created
2 /dev/loop2
localvolume.local.storage.openshift.io/local-fs-w1-2 created
3 /dev/loop3
localvolume.local.storage.openshift.io/local-fs-w1-3 created

oc get pods
NAME READY STATUS RESTARTS AGE
diskmaker-discovery-jwh9m 2/2 Running 0 76m
diskmaker-discovery-mzh29 2/2 Running 0 76m
diskmaker-discovery-vhgs2 2/2 Running 0 76m
diskmaker-manager-4c5zm 2/2 Running 0 15s
diskmaker-manager-jphkp 2/2 Running 0 16s
diskmaker-manager-zkpcv 2/2 Running 0 15s
local-storage-operator-9855f445-pts2w 1/1 Running 0 130m

Passed with local-storage-operator.4.10.0-202110270816

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2022:0056