Description of problem:

Prior to version v1.0.11, ceph-installer could start too many celery workers. This led to race conditions in which multiple tasks could run at the same time. One way this manifests is when an application submits an "/api/osd/install" task and then an "/api/osd/configure" task immediately afterwards: the "/api/osd/install" task runs in Worker-1 while the "/api/osd/configure" task runs in Worker-2, and Worker-2's task errors because Worker-1 has not yet finished installing the ceph-osd packages.

One workaround would be for client applications (USM) to *always* check a task's status before submitting tasks that depend on it. I'm not sure whether USM always does this, so it is safer to simply restrict the number of workers.

Version-Release number of selected component (if applicable):
ceph-installer-1.0.10-1.el7scon

How reproducible:
always

Steps to Reproduce:
1. Start with a RHEL system with multiple processors (i.e. /proc/cpuinfo shows multiple processors)
2. sudo yum install ceph-installer
3. sudo systemctl status ceph-installer-celery

Actual results:
systemd shows that more than two celery PIDs are running, for example:

   CGroup: /system.slice/ceph-installer-celery.service
           ├─10088 /usr/bin/python /usr/bin/celery -A async worker --loglevel...
           ├─10180 /usr/bin/python /usr/bin/celery -A async worker --loglevel...
           └─10184 /usr/bin/python /usr/bin/celery -A async worker --loglevel...

Expected results:
systemd should always show only two celery PIDs:

   CGroup: /system.slice/ceph-installer-celery.service
           ├─15317 /usr/bin/python /usr/bin/celery -A async worker --loglevel...
           └─15334 /usr/bin/python /usr/bin/celery -A async worker --loglevel...

Additional info:
This is fixed upstream in v1.0.11:
http://docs.ceph.com/ceph-installer/docs/changelog.html#v1-0-11-2016-05-18
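The client-side workaround mentioned above (always check a task's status before submitting a dependent task) can be sketched roughly as below. This is only an illustration, not ceph-installer or USM code: the status values ("pending"/"started"/"completed"/"failed"), the JSON field name, and the injectable `fetch` hook are assumptions made for the sketch.

```python
import json
import time
import urllib.request


def wait_for_task(status_url, poll=2.0, timeout=600.0, fetch=None):
    """Poll a task-status URL until the task leaves the in-progress
    states, then return the final status string.

    `fetch` is injectable for testing; by default it performs an HTTP
    GET and parses the JSON body (field names here are assumptions).
    """
    if fetch is None:
        def fetch(url):
            with urllib.request.urlopen(url) as resp:
                return json.loads(resp.read().decode("utf-8"))

    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        # Assumed response shape: {"status": "pending" | "started" | ...}
        status = fetch(status_url).get("status")
        if status not in ("pending", "started"):
            return status  # e.g. "completed" or "failed"
        time.sleep(poll)
    raise TimeoutError("task did not finish in time: %s" % status_url)
```

A caller would then submit "/api/osd/install", wait for its status URL to report completion, and only afterwards submit "/api/osd/configure". Restricting the worker count server-side (the actual v1.0.11 fix) removes the race even when clients do not do this.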
Have not seen this in any smoke test for a while now -> VERIFIED
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2016:1754