
Bug 1368997

Summary: LinkState unable to initialize, invalid values in 'body'
Product: Red Hat Satellite
Component: Infrastructure
Version: 6.2.0
Status: CLOSED WONTFIX
Severity: medium
Priority: high
Reporter: Jan Hutař <jhutar>
Assignee: satellite6-bugs <satellite6-bugs>
QA Contact: Katello QA List <katello-qa-list>
CC: adprice, bbuckingham, bkearney, cduryee, jcallaha, psuriset, tross
Keywords: Performance, Triaged
Target Milestone: Unspecified
Target Release: Unused
Hardware: Unspecified
OS: Unspecified
Type: Bug
Last Closed: 2016-12-01 15:31:19 UTC

Description Jan Hutař 2016-08-22 10:00:12 UTC
Description of problem:
Lots of tracebacks in `journalctl -f`:

qdrouterd[...]: Exception: Protocol field has wrong data type: 'peers' type=<type 'dict'> expected=<type 'list'>


Version-Release number of selected component (if applicable):
Satellite 6.2 @ RHEL 7.2


How reproducible:
rarely, but persistent when it happens


Steps to Reproduce:
1. Unknown. We have a Satellite with 2 capsules and 10k clients running katello-agent. We scheduled an errata apply on all of them and noticed this while investigating the failures.


Actual results:
Aug 22 05:49:06 ip-10-1-1-1.us-west-2.compute.internal qdrouterd[20283]: Mon Aug 22 05:49:06 2016 ROUTER (error) Control message error: opcode=LSU body={'ls': {'ls_seq': 9L, 'peers': {'ip-10-1-1-1.us-west-2.compute.internal': 1L}, 'id': 'ip-10-1-1-4.us-west-2.compute.internal', 'area': '0'}, 'ls_seq': 9L, 'area': '0', 'id': 'ip-10-1-1-4.us-west-2.compute.internal', 'instance': 1471730162L}
Aug 22 05:49:06 ip-10-1-1-1.us-west-2.compute.internal qdrouterd[20283]: Traceback (most recent call last):
Aug 22 05:49:06 ip-10-1-1-1.us-west-2.compute.internal qdrouterd[20283]: File "/usr/lib/qpid-dispatch/python/qpid_dispatch_internal/router/engine.py", line 143, in handleControlMessage
Aug 22 05:49:06 ip-10-1-1-1.us-west-2.compute.internal qdrouterd[20283]: msg = MessageLSU(body)
Aug 22 05:49:06 ip-10-1-1-1.us-west-2.compute.internal qdrouterd[20283]: File "/usr/lib/qpid-dispatch/python/qpid_dispatch_internal/router/data.py", line 176, in __init__
Aug 22 05:49:06 ip-10-1-1-1.us-west-2.compute.internal qdrouterd[20283]: self.ls = LinkState(getMandatory(body, 'ls', dict))
Aug 22 05:49:06 ip-10-1-1-1.us-west-2.compute.internal qdrouterd[20283]: File "/usr/lib/qpid-dispatch/python/qpid_dispatch_internal/router/data.py", line 55, in __init__
Aug 22 05:49:06 ip-10-1-1-1.us-west-2.compute.internal qdrouterd[20283]: self.peers = getMandatory(body, 'peers', list)
Aug 22 05:49:06 ip-10-1-1-1.us-west-2.compute.internal qdrouterd[20283]: File "/usr/lib/qpid-dispatch/python/qpid_dispatch_internal/router/data.py", line 27, in getMandatory
Aug 22 05:49:06 ip-10-1-1-1.us-west-2.compute.internal qdrouterd[20283]: raise Exception("Protocol field has wrong data type: '%s' type=%r expected=%r" % (key, value.__class__, cls))
Aug 22 05:49:06 ip-10-1-1-1.us-west-2.compute.internal qdrouterd[20283]: Exception: Protocol field has wrong data type: 'peers' type=<type 'dict'> expected=<type 'list'>


Expected results:
No traceback


Additional info:
I have modified .../data.py to show me actual data as well, and it showed this:

Exception: Protocol field has wrong data type: 'peers' type=<type 'dict'> expected=<type 'list'> value={'ip-10-1-1-1.us-west-2.compute.internal': 1L}
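
For reference, the check that raises here can be sketched in isolation. This is a minimal reconstruction that assumes only what the traceback shows (the getMandatory(body, key, cls) call shape and the "wrong data type" message). The function body, the "missing field" message and the check_peers helper are assumptions made for illustration, not the shipped data.py. The Python 2 long literals (9L, 1L) are written as plain ints so the sketch also runs on Python 3.

```python
def get_mandatory(data, key, cls=None):
    """Return data[key]; raise if the key is absent or the value has the wrong type."""
    if key not in data:
        # Hypothetical message; the real wording for this case is not in the log.
        raise Exception("Mandatory protocol field missing: '%s'" % key)
    value = data[key]
    if cls is not None and not isinstance(value, cls):
        # Same format string as the one visible in the traceback.
        raise Exception(
            "Protocol field has wrong data type: '%s' type=%r expected=%r"
            % (key, value.__class__, cls)
        )
    return value


# The 'ls' section of the LSU body from the log above.
ls = {
    "ls_seq": 9,
    "peers": {"ip-10-1-1-1.us-west-2.compute.internal": 1},
    "id": "ip-10-1-1-4.us-west-2.compute.internal",
    "area": "0",
}


def check_peers(link_state, expected):
    """Hypothetical helper: report whether 'peers' passes the type check."""
    try:
        get_mandatory(link_state, "peers", expected)
        return "ok"
    except Exception as exc:
        return str(exc)


print(check_peers(ls, list))  # receiver that expects a list: fails as in the log
print(check_peers(ls, dict))  # receiver that expects a dict: passes
```

The same body is accepted or rejected depending only on which type the receiving side expects for 'peers'.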

Comment 2 Chris Duryee 2016-09-12 14:50:13 UTC
This appears to be a dispatch router bug, not related to Satellite's use of dispatch router.

The only info I found on this bug is from #0 and a couple of pastebin examples. Here are all of the "body" fields seen when this occurs:

 body={'ls': {'ls_seq': 9L, 'peers': {'ip-10-1-1-1.us-west-2.compute.internal': 1L}, 'id': 'ip-10-1-1-4.us-west-2.compute.internal', 'area': '0'}, 'ls_seq': 9L, 'area': '0', 'id': 'ip-10-1-1-4.us-west-2.compute.internal', 'instance': 1471730162L}

body={'ls': {'ls_seq': 2L, 'peers': ['p3-dev-capsule.example.com', 'centos7-capsule-p42-nightly.example.com'], 'id': 'dev-p42.example.com', 'area': '0'}, 'ls_seq': 2L, 'area': '0', 'id': 'dev-p42.example.com', 'instance': 1472591533L}

body={'ls': {'ls_seq': 1L, 'peers': {'repo.xxxxxxxxxx.com': 1L}, 'id': 'xxxxxx.xxxxxx.com', 'area': '0'}, 'ls_seq': 1L, 'area': '0', 'id': 'xxxxxx.xxxx.com', 'instance': 1470666984L}

Note that in examples 1 and 3, 'peers' is incorrectly set to a dict instead of a list. However, in example 2, it's set correctly, but for some reason getMandatory() is checking if it's a dict:

Aug 31 10:16:21 centos7-capsule-p42-nightly qdrouterd: Exception: Protocol field has wrong data type: 'peers' type=<type 'list'> expected=<type 'dict'>


I took a look at the dispatch router code, but I'm not familiar enough with it to find the bug. Could there be a problem in *qd_field_to_py()? I am not sure if that would explain the issue in example 2, however.

Comment 3 Ted Ross 2016-10-03 15:34:20 UTC
This issue occurs because the network contains a mix of dispatch 0.4 and 0.6.x routers. Configurable link-cost was added as a feature in Qpid Dispatch Router 0.6.0, which introduced a protocol incompatibility between 0.6.0 and earlier versions.

The bug is that Dispatch Router does not gracefully handle this condition.

The workaround is to ensure that the deployment uses the same major-version packages for qpid-dispatch-router.
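
As an illustration of what handling this condition gracefully could look like on the receiving side, the sketch below normalizes 'peers' before any type check. It is purely hypothetical: normalize_peers and DEFAULT_COST are names invented for this example, not code from Qpid Dispatch Router. It also assumes that the dict values in the logged bodies are per-peer link costs. It only helps a receiver that wants the dict form; an older router would still reject a dict.

```python
DEFAULT_COST = 1  # assumed default link cost; the dict-form bodies in this bug all carry 1


def normalize_peers(peers, default_cost=DEFAULT_COST):
    """Return 'peers' as a {peer_id: cost} dict, accepting either wire format."""
    if isinstance(peers, dict):
        # Already in the dict form; copy so the caller's body is not mutated.
        return dict(peers)
    if isinstance(peers, (list, tuple)):
        # List form carries no cost, so fall back to the assumed default.
        return {peer: default_cost for peer in peers}
    raise TypeError("unsupported 'peers' type: %r" % (type(peers),))


# List form, as in the second body quoted in comment 2.
old_style = ["p3-dev-capsule.example.com", "centos7-capsule-p42-nightly.example.com"]
# Dict form, as in the first body.
new_style = {"ip-10-1-1-1.us-west-2.compute.internal": 1}

print(normalize_peers(old_style))
print(normalize_peers(new_style))
```

This does not replace the workaround above; keeping all routers on matching package versions avoids the mismatch entirely.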

Comment 4 Bryan Kearney 2016-12-01 15:31:19 UTC
Closing this out based on comment 3