Created attachment 1804913 [details] assisted service log Description of problem: Error raised "unable to unmarshal connectivity report for host ID 45ad02d2-04be-4988-bc67-341e1623f56e:unexpected end of JSON input" in Assisted Service log on Staging time="2021-07-23T14:04:25Z" level=info msg="Received step reply <connectivity-check-702d4ba6> from cluster <a7265699-b0ca-4492-aea3-a91026615f07> host <45ad02d2-04be-4988-bc67-341e1623f56e> exit-code <0> stderr <> stdout <{\"remote_hosts\":[{\"host_id\":\"61f1700f-34d9-4175-9640-d9bf26ab4f62\",\"l2_connectivity\":[{\"outgoing_ip_address\":\"192.168.123.101\",\"outgoing_nic\":\"enp0s3\",\"remote_ip_address\":\"192.168.123.107\",\"remote_mac\":\"52:54:00:cd:97:79\",\"successful\":true},{\"outgoing_nic\":\"enp0s4\",\"remote_ip_address\":\"192.168.123.107\"}],\"l3_connectivity\":[{\"average_rtt_ms\":0.503,\"outgoing_nic\":\"enp0s3\",\"remote_ip_address\":\"192.168.123.107\",\"successful\":true}]},{\"host_id\":\"6228947a-53e6-4ff1-86bf-7a6a4ff61ebf\",\"l2_connectivity\":[{\"outgoing_ip_address\":\"192.168.123.101\",\"outgoing_nic\":\"enp0s3\",\"remote_ip_address\":\"192.168.123.131\",\"remote_mac\":\"52:54:00:97:e3:78\",\"successful\":true},{\"outgoing_nic\":\"enp0s4\",\"remote_ip_address\":\"192.168.123.131\"}],\"l3_connectivity\":[{\"average_rtt_ms\":0.532,\"outgoing_nic\":\"enp0s3\",\"remote_ip_address\":\"192.168.123.131\",\"successful\":true}]},{\"host_id\":\"a8d251a9-bd5d-4b8f-93c3-3add40e89d9b\",\"l2_connectivity\":[{\"outgoing_ip_address\":\"192.168.123.101\",\"outgoing_nic\":\"enp0s3\",\"remote_ip_address\":\"192.168.123.105\",\"remote_mac\":\"52:54:00:1d:ae:72\",\"successful\":true},{\"outgoing_nic\":\"enp0s4\",\"remote_ip_address\":\"192.168.123.105\"}],\"l3_connectivity\":[{\"average_rtt_ms\":0.496,\"outgoing_nic\":\"enp0s3\",\"remote_ip_address\":\"192.168.123.105\",\"successful\":true}]},{\"host_id\":\"f8add83f-06f4-4822-8756-d0aae5c2c6ee\",\"l2_connectivity\":[{\"outgoing_ip_address\":\"192.168.123.101\",\"outgoing_nic\":\"enp0s3\",\"remote_ip_address\":\"192.168.123.121\",\"remote_mac\":\"52:54:00:71:c3:e5\",\"successful\":true},{\"outgoing_nic\":\"enp0s4\",\"remote_ip_address\":\"192.168.123.121\"}],\"l3_connectivity\":[{\"average_rtt_ms\":0.469,\"outgoing_nic\":\"enp0s3\",\"remote_ip_address\":\"192.168.123.121\",\"successful\":true}]}]}>" func=github.com/openshift/assisted-service/internal/bminventory.logReplyReceived file="/go/src/github.com/openshift/origin/internal/bminventory/inventory.go:3264" cluster_id=a7265699-b0ca-4492-aea3-a91026615f07 go-id=120635 host_id=45ad02d2-04be-4988-bc67-341e1623f56e pkg=Inventory request_id=0c7d606b-97fc-4520-b1f4-95227414643a time="2021-07-23T14:04:25Z" level=error msg="unable to unmarshal connectivity report for host ID 45ad02d2-04be-4988-bc67-341e1623f56e:unexpected end of JSON input" cluster_id=a7265699-b0ca-4492-aea3-a91026615f07 go-id=120635 host_id=45ad02d2-04be-4988-bc67-341e1623f56e request_id=0c7d606b-97fc-4520-b1f4-95227414643a In host agent.log Jul 23 14:04:25 worker-0-1 next_step_runne[2099]: time="23-07-2021 14:04:25" level=info msg="Sending step <connectivity-check-702d4ba6> reply output <{\"remote_hosts\":[{\"host_id\":\"61f1700f-34d9-4175-9640-d9bf26ab4f62\",\"l2_connectivity\":[{\"outgoing_ip_address\":\"192.168.123.101\",\"outgoing_nic\":\"enp0s3\",\"remote_ip_address\":\"192.168.123.107\",\"remote_mac\":\"52:54:00:cd:97:79\",\"successful\":true},{\"outgoing_nic\":\"enp0s4\",\"remote_ip_address\":\"192.168.123.107\"}],\"l3_connectivity\":[{\"average_rtt_ms\":0.503,\"outgoing_nic\":\"enp0s3\",\"remote_ip_address\":\"192.168.123.107\",\"successful\":true}]},{\"host_id\":\"6228947a-53e6-4ff1-86bf-7a6a4ff61ebf\",\"l2_connectivity\":[{\"outgoing_ip_address\":\"192.168.123.101\",\"outgoing_nic\":\"enp0s3\",\"remote_ip_address\":\"192.168.123.131\",\"remote_mac\":\"52:54:00:97:e3:78\",\"successful\":true},{\"outgoing_nic\":\"enp0s4\",\"remote_ip_address\":\"192.168.123.131\"}],\"l3_connectivity\":[{\"average_rtt_ms\":0.532,\"outgoing_nic\":\"enp0s3\",\"remote_ip_address\":\"192.168.123.131\",\"successful\":true}]},{\"host_id\":\"a8d251a9-bd5d-4b8f-93c3-3add40e89d9b\",\"l2_connectivity\":[{\"outgoing_ip_address\":\"192.168.123.101\",\"outgoing_nic\":\"enp0s3\",\"remote_ip_address\":\"192.168.123.105\",\"remote_mac\":\"52:54:00:1d:ae:72\",\"successful\":true},{\"outgoing_nic\":\"enp0s4\",\"remote_ip_address\":\"192.168.123.105\"}],\"l3_connectivity\":[{\"average_rtt_ms\":0.496,\"outgoing_nic\":\"enp0s3\",\"remote_ip_address\":\"192.168.123.105\",\"successful\":true}]},{\"host_id\":\"f8add83f-06f4-4822-8756-d0aae5c2c6ee\",\"l2_connectivity\":[{\"outgoing_ip_address\":\"192.168.123.101\",\"outgoing_nic\":\"enp0s3\",\"remote_ip_address\":\"192.168.123.121\",\"remote_mac\":\"52:54:00:71:c3:e5\",\"successful\":true},{\"outgoing_nic\":\"enp0s4\",\"remote_ip_address\":\"192.168.123.121\"}],\"l3_connectivity\":[{\"average_rtt_ms\":0.469,\"outgoing_nic\":\"enp0s3\",\"remote_ip_address\":\"192.168.123.121\",\"successful\":true}]}]}> error <> exit-code <0>" file="step_processor.go:64" request_id=d9121988-1647-49d9-a9c0-30fcc978c084 Version-Release number of selected component (if applicable): Staging v1.0.23.2 How reproducible: Steps to Reproduce: 1. Deploy clluster 2. Look at Assisted Service Pod log https://console-openshift-console.apps.app-sre-stage-0.k3s7.p1.openshiftapps.com/api/kubernetes/api/v1/namespaces/assisted-installer-stage/pods/assisted-service-74b5747c4d-5rjjb/log?container=assisted-service 3. Actual results: Expected results: Additional info:
Cluster https://qaprodauth.cloud.redhat.com/openshift/assisted-installer/clusters/a7265699-b0ca-4492-aea3-a91026615f07
@jgil looking at the code i found something strange, in captureConnectivityReportMetrics https://github.com/openshift/assisted-service/blob/master/internal/host/host.go#L1154-L1174 I see that `report` (the latest replay from the host) is not used is that intentinal or just a bug? I'm not sure how it's related to the issue, maybe because the existing string in the DB is empty so we don't manage to unmarshal it?
@mfilanov good point. It should be the report parameter being used, not the one coming from the host (which is the old one). I'll raise a PR to fix it. Thanks for the analysis! /Jordi
Verified on Staging v1.0.24.1
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:3759