Bug 1205343 - connections to rabbitmq-server are hanging
Summary: connections to rabbitmq-server are hanging
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Fedora
Classification: Fedora
Component: rabbitmq-server
Version: 21
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: John Eckersberg
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2015-03-24 18:01 UTC by Lars Kellogg-Stedman
Modified: 2015-03-25 15:56 UTC

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2015-03-25 15:56:10 UTC
Type: Bug
Embargoed:


Attachments
Output of "journalctl -b" on the system on which packstack is failing (850.23 KB, text/plain)
2015-03-24 18:02 UTC, Lars Kellogg-Stedman
Log from cinder --debug (13.06 KB, text/plain)
2015-03-24 18:11 UTC, Lars Kellogg-Stedman

Description Lars Kellogg-Stedman 2015-03-24 18:01:26 UTC
Description of problem:

On Fedora 21 with openstack-packstack-2014.2-0.18.dev1462.gbb05296.fc22.noarch, a multi-node install fails with:

    ERROR : Error appeared during Puppet run: 192.168.122.43_cinder.pp
    Error: Command exceeded timeout
    You will find full trace in log /var/tmp/packstack/20150324-172116-Kf3VcW/manifests/192.168.122.43_cinder.pp.log

And /var/tmp/packstack/20150324-172116-Kf3VcW/manifests/192.168.122.43_cinder.pp.log contains:

    Error: Command exceeded timeout
    Wrapped exception:
    execution expired
    Error: /Stage[main]/Main/Cinder::Type[iscsi]/Exec[cinder type-create iscsi]/returns: change from notrun to 0 failed: Command exceeded timeout
    Notice: /Stage[main]/Main/Cinder::Type[iscsi]/Cinder::Type_set[lvm]/Exec[cinder type-key iscsi set volume_backend_name=lvm]: Dependency Exec[cinder type-create iscsi] has failures: true
    Warning: /Stage[main]/Main/Cinder::Type[iscsi]/Cinder::Type_set[lvm]/Exec[cinder type-key iscsi set volume_backend_name=lvm]: Skipping because of failed dependencies
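
To reproduce the hang by hand without leaving a stuck process behind, the failing command can be run under a hard time limit. This is only a minimal sketch, assuming GNU coreutils `timeout` is installed; the helper name `run_bounded` and the 60-second limit are illustrative and are not something packstack or Puppet provides.

```shell
# run_bounded LIMIT_SECONDS COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND under coreutils `timeout` and reports
# when the limit was hit. `timeout` exits with status 124 in that case.
run_bounded() {
    local limit=$1
    shift
    local rc=0
    timeout "$limit" "$@" || rc=$?
    if [ "$rc" -eq 124 ]; then
        echo "timed out after ${limit}s: $*" >&2
    fi
    return "$rc"
}

# Example (assumes the cinder client and its credentials are set up):
#   run_bounded 60 cinder type-create iscsi
```

A return status of 124 would then indicate the same kind of hang that Puppet reports as "Command exceeded timeout".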

Even after `packstack` has exited with an error, the `cinder` command
is still running:

    # date
    Tue Mar 24 17:56:11 UTC 2015
    # ps -fe | grep volume.backend
    root     13142     1  0 17:49 ?        00:00:00 /usr/bin/python /usr/bin/cinder type-key iscsi set volume_backend_name=lvm

Running `strace` and `lsof` on the stuck process shows that it is blocked on
`recvfrom()` on fd 3, which appears to be a connection to the cinder
API:

    COMMAND   PID USER   FD   TYPE DEVICE  SIZE/OFF   NODE NAME
    cinder  13142 root    3u  IPv4 117407       0t0    TCP localhost.localdomain.localdomain:46048->localhost.localdomain.localdomain:8776 (ESTABLISHED)
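
The inspection described above can be sketched roughly as follows. The function names `inspect_pid` and `remote_endpoint` are hypothetical, the 5-second limit is arbitrary, and both `strace` and `lsof` normally need root to look at another user's process. This is an assumption about how the check might be scripted, not the exact commands that were run.

```shell
# inspect_pid PID
# Hypothetical helper: briefly attach strace to see which network syscall
# the process is sitting in, then list its open internet sockets.
inspect_pid() {
    local pid=$1
    # strace keeps running until interrupted, so bound it with `timeout`.
    timeout 5 strace -p "$pid" -e trace=network
    # -a ANDs the selections: only internet sockets (-i) of this PID (-p).
    lsof -a -p "$pid" -i
}

# remote_endpoint
# Reads lsof output on stdin and prints the remote side of each TCP
# connection, i.e. the part after "->" in the NAME column.
remote_endpoint() {
    awk '$8 == "TCP" && $9 ~ /->/ { split($9, ends, "->"); print ends[2] }'
}

# Example (PID taken from the ps output in this report):
#   inspect_pid 13142
#   lsof -a -p 13142 -i | remote_endpoint
```

Applied to the `lsof` line above, `remote_endpoint` prints the endpoint on port 8776, the port the stuck client is connected to.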

There are no errors in the Cinder log, and the only warnings are
deprecation warnings.

Comment 1 Lars Kellogg-Stedman 2015-03-24 18:02:12 UTC
Created attachment 1005925
Output of "journalctl -b" on the system on which packstack is failing

Comment 2 Lars Kellogg-Stedman 2015-03-24 18:05:15 UTC
Also of note:

SELinux on this system was in permissive mode, so this issue is not caused by a bad SELinux configuration.

Comment 3 Lars Kellogg-Stedman 2015-03-24 18:11:35 UTC
Created attachment 1005929
Log from cinder --debug

This is the output of "cinder-api --debug ..." while manually running the "cinder type-key" command from another terminal.

Comment 4 Lars Kellogg-Stedman 2015-03-24 19:58:48 UTC
Retargeting this to rabbitmq-server.  Updating rabbitmq-server on the F21 host to version rabbitmq-server-3.5.0-2.fc22.noarch has completely resolved the symptoms.
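
For anyone comparing hosts, a quick way to check whether the installed rabbitmq-server is at least the version mentioned here might look like the sketch below. It assumes GNU `sort` with `-V` (version sort); the helper name `version_at_least` is hypothetical, and the 3.5.0 threshold reflects only the observation in this comment, not a confirmed fix.

```shell
# version_at_least HAVE WANT
# Hypothetical helper: succeeds when HAVE >= WANT under GNU sort's
# version ordering (so 3.10.1 counts as newer than 3.5.0).
version_at_least() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Example (queries the RPM database for the installed version):
#   have=$(rpm -q --queryformat '%{VERSION}' rabbitmq-server)
#   if version_at_least "$have" 3.5.0; then
#       echo "rabbitmq-server $have is 3.5.0 or newer"
#   fi
```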

Comment 5 Flavio Percoco 2015-03-25 11:56:19 UTC
Lars, it'd be useful to have more than just the cinder-api logs. Is it possible to have the logs from the cinder-scheduler and cinder-volume nodes too?

Config files of these nodes and the rabbit nodes would be useful too.

Comment 6 John Eckersberg 2015-03-25 15:48:55 UTC
I haven't been able to reproduce this yet; I'm going to give it a few more tries.

Comment 7 Lars Kellogg-Stedman 2015-03-25 15:56:10 UTC
After experiencing this reliably through several cycles, I can no longer reproduce this behavior.  I am throwing in the towel.

