Description of problem:
qdrouterd throws exception on the capsule/satellite server periodically and requires a restart in order to function.

Sep 29 09:00:29 gbl15219 qdrouterd: *** Error in `/usr/sbin/qdrouterd': double free or corruption (!prev): 0x00007efcac6d9650 ***
Sep 29 09:00:29 gbl15219 qdrouterd: ======= Backtrace: =========
Sep 29 09:00:29 gbl15219 qdrouterd: /lib64/libc.so.6(+0x7d053)[0x7efcbf57b053]
Sep 29 09:00:29 gbl15219 qdrouterd: /lib64/libqpid-proton.so.2(pn_class_decref+0x56)[0x7efcc02c7806]
Sep 29 09:00:29 gbl15219 qdrouterd: /lib64/libqpid-proton.so.2(+0x27580)[0x7efcc02d5580]
Sep 29 09:00:29 gbl15219 qdrouterd: /lib64/libqpid-proton.so.2(pn_class_decref+0x38)[0x7efcc02c77e8]
Sep 29 09:00:29 gbl15219 qdrouterd: /lib64/libqpid-proton.so.2(pn_collector_pop+0x22)[0x7efcc02d5722]
Sep 29 09:00:29 gbl15219 qdrouterd: /lib64/libqpid-dispatch.so.0(+0x16f00)[0x7efcc0514f00]
Sep 29 09:00:29 gbl15219 qdrouterd: /lib64/libqpid-dispatch.so.0(+0x29b9c)[0x7efcc0527b9c]
Sep 29 09:00:29 gbl15219 qdrouterd: /lib64/libqpid-dispatch.so.0(qd_server_run+0xb0)[0x7efcc05287b0]
Sep 29 09:00:29 gbl15219 qdrouterd: /usr/sbin/qdrouterd[0x401cd8]
Sep 29 09:00:29 gbl15219 qdrouterd: /usr/sbin/qdrouterd(main+0x140)[0x401950]
Sep 29 09:00:29 gbl15219 qdrouterd: /lib64/libc.so.6(__libc_start_main+0xf5)[0x7efcbf51fb15]
Sep 29 09:00:29 gbl15219 qdrouterd: /usr/sbin/qdrouterd[0x4019a1]
Sep 29 09:00:29 gbl15219 qdrouterd: ======= Memory map: ========
Sep 29 09:00:29 gbl15219 qdrouterd: 00400000-00403000 r-xp 00000000 fd:00 10740512 /usr/sbin/qdrouterd
Sep 29 09:00:29 gbl15219 qdrouterd: 00602000-00603000 r--p 00002000 fd:00 10740512 /usr/sbin/qdrouterd
Sep 29 09:00:29 gbl15219 qdrouterd: 00603000-00604000 rw-p 00003000 fd:00 10740512 /usr/sbin/qdrouterd
Sep 29 09:00:29 gbl15219 qdrouterd: 02264000-036b6000 rw-p 00000000 00:00 0 [heap]

This then causes disconnects on the client:

[root@cutekittens~]# tail -f /var/log/messages
Oct 17 10:39:56 cutekittens adclient[7856]: INFO <fd:29 PAMVerifyPassword > audit User 'd3228273' authenticated based on Kerberos exchange to AD
Oct 17 10:39:56 cutekittens adclient[7856]: INFO <fd:30 PAMIsUserAllowedAccess2 > audit User 'd3228273' is authorized
Oct 17 10:39:56 cutekittens sshd[2003]: Accepted keyboard-interactive/pam for 43228273 from 10.199.226.102 port 51189 ssh2
Oct 17 10:40:04 cutekittens goferd: [INFO][worker-0] gofer.messaging.adapter.connect:28 - connecting: proton+amqps://capsulefqdn:5647
Oct 17 10:40:04 cutekittens goferd: [INFO][worker-0] gofer.messaging.adapter.proton.connection:87 - open: URL: amqps://capsulefqdn:5647|SSL: ca: /etc/rhsm/ca/katello-default-ca.pem|key: None|certificate: /etc/pki/consumer/bundle.pem|host-validation: None
Oct 17 10:40:04 cutekittens goferd: [INFO][worker-0] root:490 - connecting to capsulefqdn:5647...
Oct 17 10:40:04 cutekittens goferd: [INFO][worker-0] root:532 - Disconnected
Oct 17 10:40:04 cutekittens goferd: [ERROR][worker-0] gofer.messaging.adapter.connect:33 - connect: proton+amqps://capsulefqdn:5647, failed: Connection amqps://capsulefqdn:5647 disconnected
Oct 17 10:40:04 cutekittens goferd: [INFO][worker-0] gofer.messaging.adapter.connect:35 - retry in 106 seconds
Oct 17 10:40:05 cutekittens dzdo[2049]: INFO dz.dzdo Starting dzdo

This means the customer has to create a cron job to periodically restart the goferd agent.

Version-Release number of selected component (if applicable):
6.2.1

How reproducible:
100% reproducible happening on all capsules

Steps to Reproduce:
1. Register machines to capsules, do some usual stuff.
2. qdrouterd stops working, breaks, needs to be restarted.

Actual results:
qdrouterd is unstable, causes goferd to break.

Expected results:
qdrouterd + goferd should work and be stable.

Additional info:
Pulp upstream does not use qdrouterd, so we aren't much help with these bugs. I'm adding Justin Ross who can advise from the messaging team's perspective.
*** This bug has been marked as a duplicate of bug 1366232 ***