Created attachment 1702569
nbdb sbdb and northd logs

Description of problem:
Many transaction errors were seen during a bare-metal cluster upgrade from 4.5.2 to 4.5.3.

Excerpt from northd logs on master-1
------------------------------------
|WARN|transaction error: {"details":"Transaction causes multiple rows in \"Port_Binding\" table to have identical values (openshift-insights_insights-operator-6bd7f47568-j2l5t ) for index on column \"logical_port\". First row, with UUID 9b34e9f1-dc8e-490f-9ec4-6d0eae081337, existed in the database before this transaction and was not modified by the transaction. Second row, with UUID 7e81ba01-fe35-4e8d-b819-648aac99e5d4, was inserted by this transaction.","error":"constraint violation"}

Excerpt from sbdb logs on master-2
----------------------------------
2020-07-27T18:16:02Z|00624|ovsdb_server|ERR|syntax error: transaction deletes row 03055103-180b-4163-9739-4e7080534f63 that does not exist
2020-07-27T18:16:02Z|00625|ovsdb_server|INFO|referential integrity violation: Table Port_Binding column datapath row afdff1d1-b901-4d9b-960e-5d70c8bc9731 references nonexistent row 00000000-0000-0000-0000-000000000000 in table Datapath_Binding.

Master-1 is the NB leader.

Version-Release number of selected component (if applicable):
4.5.3

How reproducible:
Rare

Steps to Reproduce:
1. Various tests were performed and the cluster was upgraded in several steps over time: 4.4.x -> 4.5.0.rc.7 -> 4.5.1-rc.0 -> 4.5.2 -> 4.5.3

Actual results:
Upgrade failed with various operator errors

Expected results:

Additional info:

$ oc get nodes
NAME                 STATUS     ROLES               AGE   VERSION
openshift-master-0   Ready      master              18d   v1.18.3+b74c5ed
openshift-master-1   Ready      master              18d   v1.18.3+b74c5ed
openshift-master-2   Ready      master              18d   v1.18.3+b74c5ed
openshift-worker-0   NotReady   worker,worker-cnf   16d   v1.18.3+b74c5ed
openshift-worker-1   Ready      worker,worker-cnf   16d   v1.18.3+b74c5ed
openshift-worker-2   Ready      worker,worker-lb    18d   v1.18.3+b74c5ed
openshift-worker-3   Ready      worker,worker-lb    18d   v1.18.3+b74c5ed

$ oc get pods -n openshift-ovn-kubernetes
NAME                   READY   STATUS      RESTARTS   AGE
ovnkube-master-72wdp   4/4     Running     0          7h21m
ovnkube-master-7grjf   4/4     Running     0          4h53m
ovnkube-master-hdwt7   4/4     Running     0          8h
ovnkube-node-5ccnc     2/2     Running     0          8h
ovnkube-node-7d79g     1/2     NotReady    0          8h
ovnkube-node-7r9qk     2/2     Running     0          8h
ovnkube-node-c7hkn     2/2     Running     0          8h
ovnkube-node-pxnbl     2/2     Running     0          7h25m
ovnkube-node-sw5jc     2/2     Running     0          8h
ovnkube-node-wmr2k     2/2     Running     4          4h56m
ovs-node-5kds4         0/1     Completed   0          8h
ovs-node-75phm         1/1     Running     0          8h
ovs-node-79lc5         1/1     Running     0          8h
ovs-node-c7778         1/1     Running     0          8h
ovs-node-n4lm9         1/1     Running     1          4h55m
ovs-node-pcwfg         1/1     Running     0          8h
ovs-node-tx6lh         1/1     Running     0          8h

$ oc get co
NAME                                       VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication                             4.5.3     True        True          False      17d
cloud-credential                           4.5.3     True        False         False      18d
cluster-autoscaler                         4.5.3     True        False         False      18d
config-operator                            4.5.3     True        False         False      18d
console                                    4.5.3     False       False         False      5h3m
csi-snapshot-controller                    4.5.3     True        False         False      11d
dns                                        4.5.2     True        True          True       18d
etcd                                       4.5.3     True        False         False      18d
image-registry                             4.5.3     True        False         False      3d8h
ingress                                    4.5.3     True        False         False      6d7h
insights                                   4.5.3     True        False         False      18d
kube-apiserver                             4.5.3     True        False         False      18d
kube-controller-manager                    4.5.3     True        False         True       18d
kube-scheduler                             4.5.3     True        False         False      18d
kube-storage-version-migrator              4.5.3     True        False         False      3d8h
machine-api                                4.5.3     True        False         False      18d
machine-approver                           4.5.3     True        False         False      18d
machine-config                             4.5.2     False       False         True       11m
marketplace                                4.5.3     True        False         False      8h
monitoring                                 4.5.3     False       True          True       14m
network                                    4.5.3     True        True          True       18d
node-tuning                                4.5.3     True        False         False      3d7h
openshift-apiserver                        4.5.3     True        False         True       5h9m
openshift-controller-manager               4.5.3     True        False         False      3d6h
openshift-samples                          4.5.3     True        False         False      3d7h
operator-lifecycle-manager                 4.5.3     True        False         False      18d
operator-lifecycle-manager-catalog         4.5.3     True        False         False      18d
operator-lifecycle-manager-packageserver   4.5.3     True        False         False      3d6h
service-ca                                 4.5.3     True        False         False      18d
storage                                    4.5.3     True        False         False      3d7h
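
For triage, the constraint-violation warnings in the northd log can be tallied per duplicated value to see which logical ports are affected and how often. The following is only a minimal sketch: the function names (`parse_constraint_violation`, `count_duplicates`) are hypothetical, and the regular expressions assume every warning has the same wording as the single excerpt quoted in this report.

```python
import json
import re
from collections import Counter

# JSON payload that follows "transaction error:" in a northd log line.
# Assumption: the payload is the last {...} object on the line.
PAYLOAD_RE = re.compile(r"transaction error:\s*(\{.*\})")

# Table, duplicated value and indexed column inside the "details" text.
# Assumption: the wording matches the excerpt quoted in this report.
DETAILS_RE = re.compile(
    r'rows in "(?P<table>[^"]+)" table to have identical values '
    r'\((?P<value>[^)]*)\) for index on column "(?P<column>[^"]+)"'
)

UUID_RE = re.compile(r"UUID ([0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})")


def parse_constraint_violation(line):
    """Return a dict describing the violation, or None if the line does not match."""
    match = PAYLOAD_RE.search(line)
    if not match:
        return None
    try:
        payload = json.loads(match.group(1))
    except json.JSONDecodeError:
        return None
    if payload.get("error") != "constraint violation":
        return None
    details = payload.get("details", "")
    found = DETAILS_RE.search(details)
    if not found:
        return None
    return {
        "table": found.group("table"),
        "column": found.group("column"),
        # The logged value carries a trailing space before the ")".
        "value": found.group("value").strip(),
        # Existing row first, newly inserted row second, as in the message.
        "uuids": UUID_RE.findall(details),
    }


def count_duplicates(lines):
    """Count violations per (table, column, value) across log lines."""
    counts = Counter()
    for line in lines:
        info = parse_constraint_violation(line)
        if info:
            counts[(info["table"], info["column"], info["value"])] += 1
    return counts
```

Passing the lines of a saved northd log to `count_duplicates` gives one counter entry per duplicated `logical_port` value. Lines that do not match the assumed format are skipped without being counted.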
AFAIK, NB/SB dbs should all be the same on all masters (since they are synchronized).
(In reply to Anurag saxena from comment #6)
> AFAIK, NB/SB dbs should all be the same on all masters (since they are
> synchronized).

From the logs it seems like the DBs are not synchronized. It would be useful to see if that's the case. Next time you hit the issue could you please save all DBs?

Thanks,
Dumitru
*** This bug has been marked as a duplicate of bug 1860522 ***