Bug 1205343

Summary: connections to rabbitmq-server are hanging
Product: [Fedora] Fedora
Component: rabbitmq-server
Reporter: Lars Kellogg-Stedman <lars>
Assignee: John Eckersberg <jeckersb>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Status: CLOSED NOTABUG
Severity: unspecified
Priority: unspecified
CC: aortega, derekh, erlang, fpercoco, hubert.plociniczak, jeckersb, lemenkov, rjones, s, yeylon
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Type: Bug
Last Closed: 2015-03-25 15:56:10 UTC

Attachments:

Description	Flags
Output of "journalctl -b" on the system on which packstack is failing	none
Log from cinder --debug	none

Description Lars Kellogg-Stedman 2015-03-24 18:01:26 UTC

Description of problem:

On Fedora 21 with openstack-packstack-2014.2-0.18.dev1462.gbb05296.fc22.noarch, a multi-node install fails with:

    ERROR : Error appeared during Puppet run: 192.168.122.43_cinder.pp
    Error: Command exceeded timeout
    You will find full trace in log /var/tmp/packstack/20150324-172116-Kf3VcW/manifests/192.168.122.43_cinder.pp.log

And /var/tmp/packstack/20150324-172116-Kf3VcW/manifests/192.168.122.43_cinder.pp.log contains:

    Error: Command exceeded timeout
    Wrapped exception:
    execution expired
    Error: /Stage[main]/Main/Cinder::Type[iscsi]/Exec[cinder type-create iscsi]/returns: change from notrun to 0 failed: Command exceeded timeout
    Notice: /Stage[main]/Main/Cinder::Type[iscsi]/Cinder::Type_set[lvm]/Exec[cinder type-key iscsi set volume_backend_name=lvm]: Dependency Exec[cinder type-create iscsi] has failures: true
    Warning: /Stage[main]/Main/Cinder::Type[iscsi]/Cinder::Type_set[lvm]/Exec[cinder type-key iscsi set volume_backend_name=lvm]: Skipping because of failed dependencies

Even after `packstack` has exited with an error, the `cinder` command
is still running:

    # date
    Tue Mar 24 17:56:11 UTC 2015
    # ps -fe | grep volume.backend
    root     13142     1  0 17:49 ?        00:00:00 /usr/bin/python /usr/bin/cinder type-key iscsi set volume_backend_name=lvm

Running `strace` and `lsof` on the stuck process shows that it is blocked on
`recvfrom()` on fd 3, which appears to be a connection to the cinder
API:

    COMMAND   PID USER   FD   TYPE DEVICE  SIZE/OFF   NODE NAME
    cinder  13142 root    3u  IPv4 117407       0t0    TCP localhost.localdomain.localdomain:46048->localhost.localdomain.localdomain:8776 (ESTABLISHED)
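
The blocked `recvfrom()` described above is what a client looks like when the peer accepts the TCP connection but never sends a response. A minimal sketch of that failure mode in Python, assuming a local stand-in server; `recv_with_timeout` and `start_silent_server` are hypothetical helpers written for illustration and are not part of the cinder client:

```python
import socket
import threading


def recv_with_timeout(host, port, timeout):
    """Connect and wait for data, giving up after `timeout` seconds.

    Without a timeout, recv() on a silent peer blocks indefinitely,
    which matches the stuck process shown by strace and lsof.
    """
    with socket.create_connection((host, port), timeout=timeout) as conn:
        try:
            data = conn.recv(4096)
        except socket.timeout:
            return "timed out"
        return "received" if data else "closed"


def start_silent_server():
    """Hypothetical stand-in for a service that accepts but never answers."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    srv.listen(1)
    held = []  # keep accepted sockets referenced so they stay open

    def accept_and_hold():
        try:
            conn, _ = srv.accept()
        except OSError:
            return  # listening socket was closed before a client arrived
        held.append(conn)  # hold the connection open and send nothing

    threading.Thread(target=accept_and_hold, daemon=True).start()
    return srv, held
```

Calling `recv_with_timeout("127.0.0.1", port, 0.5)` against the silent server returns `"timed out"` rather than hanging, which is the behavior a client-side socket timeout would give in place of an indefinite block.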

There are no errors in the Cinder log, and the only warnings are
deprecation warnings.

Comment 1 Lars Kellogg-Stedman 2015-03-24 18:02:12 UTC

Created attachment 1005925
Output of "journalctl -b" on the system on which packstack is failing

Comment 2 Lars Kellogg-Stedman 2015-03-24 18:05:15 UTC

Also of note:

SELinux on this system was in permissive mode (so this issue is not caused by a bad SELinux configuration).

Comment 3 Lars Kellogg-Stedman 2015-03-24 18:11:35 UTC

Created attachment 1005929
Log from cinder --debug

This is the output of "cinder-api --debug ..." while manually running the "cinder type-key" command from another terminal.

Comment 4 Lars Kellogg-Stedman 2015-03-24 19:58:48 UTC

Retargeting this to rabbitmq-server.  Updating rabbitmq-server on the F21 host to version rabbitmq-server-3.5.0-2.fc22.noarch has completely resolved the symptoms.
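
One way to check whether a broker accepts connections but does not answer is to send the AMQP 0-9-1 protocol header and wait, with a timeout, for the first byte of the reply. A minimal sketch, assuming the default AMQP port 5672 and a plain (non-TLS) listener; `probe_amqp` is a hypothetical helper written for illustration, not part of RabbitMQ or packstack:

```python
import socket

# AMQP 0-9-1 protocol header that a client sends first on a new connection.
AMQP_HEADER = b"AMQP\x00\x00\x09\x01"


def probe_amqp(host, port=5672, timeout=5.0):
    """Classify how a broker responds to the AMQP protocol header.

    Returns "ok", "timeout", "unreachable", or "unexpected reply".
    A connect timeout is also reported as "timeout".
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.sendall(AMQP_HEADER)
            reply = conn.recv(1)
    except socket.timeout:
        # Must be caught before OSError, which it subclasses.
        return "timeout"
    except OSError:
        return "unreachable"
    # A healthy broker answers with a method frame (frame type octet 1)
    # carrying Connection.Start.
    return "ok" if reply == b"\x01" else "unexpected reply"
```

A result of `"timeout"` would be consistent with the hang seen here: the TCP connection is established, but the broker never starts the handshake.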

Comment 5 Flavio Percoco 2015-03-25 11:56:19 UTC

Lars, it'd be useful to have more than just the cinder-api logs. Is it possible to have the logs from the cinder-scheduler and cinder-volume nodes too?

Config files of these nodes and the rabbit nodes would be useful too.

Comment 6 John Eckersberg 2015-03-25 15:48:55 UTC

I haven't been able to reproduce this yet; I'm going to give it a few more tries.

Comment 7 Lars Kellogg-Stedman 2015-03-25 15:56:10 UTC

After experiencing this reliably through several cycles, I can no longer reproduce this behavior.  I am throwing in the towel.