Bug 1439223 - Directory-backed OSDs stop running and need to be started manually during OSP11 HCI + SRIOV deployment
Keywords:
Status: CLOSED DUPLICATE of bug 1442265
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: ceph
Version: 11.0 (Ocata)
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: John Fulton
QA Contact: Ziv Greenberg
URL:
Whiteboard:
Depends On: 1442265
Blocks:
Reported: 2017-04-05 12:56 UTC by Ziv Greenberg
Modified: 2017-04-17 13:19 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-04-17 13:19:18 UTC
Target Upstream Version:


Attachments
deployment_yamls (10.16 KB, application/zip)
2017-04-05 12:56 UTC, Ziv Greenberg
OSPd first-boot Heat env file which applies workaround with each deploy (928 bytes, text/plain)
2017-04-13 20:36 UTC, John Fulton

Description Ziv Greenberg 2017-04-05 12:56:33 UTC
Created attachment 1268950 [details]
deployment_yamls

Description of problem:

Booting an SR-IOV instance fails with the following error: "There are not enough hosts available"

Version-Release number of selected component (if applicable):

RHOS11 HCI - 1 controller, 2 computes. Each compute has only one disk, which is shared by the OS and the OSD.

How reproducible:

always

Steps to Reproduce:

1. Deploy RHOS11 with the attached YAML files.



Actual results:

for id in $(openstack port create --network sriov_420 --vnic-type direct sriov_420  | awk '/ id/ {print $4}'); do openstack server create --flavor 7 --image rhel_7.3 --nic port-id=$id vm1; done


openstack server show vm1
                                                                                                                                                       

| fault                               | {u'message': u'No valid host was found. There are not enough hosts available.', u'code': 500, u'details': u'  File "/usr/lib/python2.7/site-                                |
|                                     | packages/nova/conductor/manager.py", line 866, in schedule_and_build_instances\n    request_specs[0].to_legacy_filter_properties_dict())\n  File "/usr/lib/python2.7/site-  |
|                                     | packages/nova/conductor/manager.py", line 597, in _schedule_instances\n    hosts = self.scheduler_client.select_destinations(context, spec_obj)\n  File "/usr/lib/python2.7 |
|                                     | /site-packages/nova/scheduler/utils.py", line 371, in wrapped\n    return func(*args, **kwargs)\n  File "/usr/lib/python2.7/site-                                           |
|                                     | packages/nova/scheduler/client/__init__.py", line 51, in select_destinations\n    return self.queryclient.select_destinations(context, spec_obj)\n  File                    |
|                                     | "/usr/lib/python2.7/site-packages/nova/scheduler/client/__init__.py", line 37, in __run_method\n    return getattr(self.instance, __name)(*args, **kwargs)\n  File          |
|                                     | "/usr/lib/python2.7/site-packages/nova/scheduler/client/query.py", line 32, in select_destinations\n    return self.scheduler_rpcapi.select_destinations(context,           |
|                                     | spec_obj)\n  File "/usr/lib/python2.7/site-packages/nova/scheduler/rpcapi.py", line 129, in select_destinations\n    return cctxt.call(ctxt, \'select_destinations\',       |
|                                     | **msg_args)\n  File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/client.py", line 169, in call\n    retry=self.retry)\n  File "/usr/lib/python2.7/site-             |
|                                     | packages/oslo_messaging/transport.py", line 97, in _send\n    timeout=timeout, retry=retry)\n  File "/usr/lib/python2.7/site-                                               |
|                                     | packages/oslo_messaging/_drivers/amqpdriver.py", line 505, in send\n    retry=retry)\n  File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line |
|                                     | 496, in _send\n    raise result\n', u'created': u'2017-04-05T12:13:16Z'} 

                                                                                                   
openstack hypervisor list
+----+-----------------------+-----------------+-------------+-------+
| ID | Hypervisor Hostname   | Hypervisor Type | Host IP     | State |
+----+-----------------------+-----------------+-------------+-------+
|  1 | compute-1.localdomain | QEMU            | 10.35.74.10 | up    |
|  2 | compute-0.localdomain | QEMU            | 10.35.74.13 | up    |
+----+-----------------------+-----------------+-------------+-------+

Expected results:
The instance should boot successfully.

Additional info:
The same setup is fully functional with an RHOS10 HCI deployment.

Comment 2 John Fulton 2017-04-05 13:19:05 UTC
Please use the same deploy and same process but with a non-HCI compute and let me know if that fails.

Comment 3 Ziv Greenberg 2017-04-05 14:08:53 UTC
I already deployed RHOS11 with SR-IOV (non-HCI), and it worked as expected.

Comment 4 Sylvain Bauza 2017-04-06 09:41:36 UTC
When looking at the environment, I found that a Nova service wasn't deployed, namely NovaPlacement in TripleO.

That service is new and became mandatory with OSP11; it helps the Nova scheduler identify which nodes are suitable targets for the instance.

Since I wasn't able to reproduce the boot failure, I dug into the logs and queried the Keystone DB, and found that the scheduler was returning 0 hosts because the Placement endpoint wasn't deployed.

Reading the custom roles given in the attachment, I don't see the corresponding service, OS::TripleO::Services::NovaPlacement.

Could you please amend your custom roles to include that one too, and let us know whether it fixes the problem?
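
A quick way to confirm whether a roles file lists a given service is to grep for it as an uncommented list item. The following is only a minimal sketch: the helper name has_service is hypothetical (it is not part of TripleO), and the file name in the usage comment is an assumption about where the custom roles file lives.

```shell
#!/bin/sh
# has_service FILE SERVICE
# Exits 0 if FILE lists SERVICE as an uncommented YAML list item,
# non-zero otherwise. Illustrative helper only; not part of TripleO.
has_service() {
    grep -Eq "^[[:space:]]*-[[:space:]]*$2[[:space:]]*(#.*)?\$" "$1"
}

# Example usage (the file name is an assumption):
# has_service custom-roles.yaml OS::TripleO::Services::NovaPlacement \
#     || echo "NovaPlacement is missing from custom-roles.yaml"
```

The check is per file, not per role, so it only answers whether the service appears anywhere in the roles file.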

Comment 5 John Fulton 2017-04-06 12:26:23 UTC
I think I understand how this happened. The sample templates provided with the RA were for Newton/RH-OSP10 only. They were copied verbatim into this Ocata/RH-OSP11 deployment. I added a warning to the sample templates that ship with the RA.

 https://github.com/RHsyseng/hci/commit/77a58388c62d061d00422189af28d78f67cf97bf

Comment 6 Ziv Greenberg 2017-04-06 15:17:36 UTC
I have added the additional service to the custom roles and redeployed the overcloud.

I'm trying to boot an instance, but its status never changes; it remains in BUILD:

[heat-admin@controller-0 ~]$ openstack server list
+--------------------------------------+------+--------+----------+------------+
| ID                                   | Name | Status | Networks | Image Name |
+--------------------------------------+------+--------+----------+------------+
| 3aa80ab5-01f9-4ea0-9145-c0b6e4a5fbc0 | vm1  | BUILD  |          | rhel_7.3   |
+--------------------------------------+------+--------+----------+------------+


[heat-admin@controller-0 ~]$ openstack server show vm1
+-------------------------------------+----------------------------------------------------------+
| Field                               | Value                                                    |
+-------------------------------------+----------------------------------------------------------+
| OS-DCF:diskConfig                   | MANUAL                                                   |
| OS-EXT-AZ:availability_zone         | nova                                                     |
| OS-EXT-SRV-ATTR:host                | compute-1.localdomain                                    |
| OS-EXT-SRV-ATTR:hypervisor_hostname | compute-1.localdomain                                    |
| OS-EXT-SRV-ATTR:instance_name       | instance-00000001                                        |
| OS-EXT-STS:power_state              | NOSTATE                                                  |
| OS-EXT-STS:task_state               | spawning                                                 |
| OS-EXT-STS:vm_state                 | building                                                 |
| OS-SRV-USG:launched_at              | None                                                     |
| OS-SRV-USG:terminated_at            | None                                                     |
| accessIPv4                          |                                                          |
| accessIPv6                          |                                                          |
| addresses                           |                                                          |
| config_drive                        |                                                          |
| created                             | 2017-04-06T14:59:49Z                                     |
| flavor                              | m1.medium.huge_pages_cpu_pinning_numa (7)                |
| hostId                              | e8ada95a051b2b1a1d8e5fe1798f380a76d17330a6366637712b2a71 |
| id                                  | 3aa80ab5-01f9-4ea0-9145-c0b6e4a5fbc0                     |
| image                               | rhel_7.3 (6321ec7c-7059-4fc7-b8d6-3cf307dbe33f)          |
| key_name                            | None                                                     |
| name                                | vm1                                                      |
| progress                            | 0                                                        |
| project_id                          | 4b607da83f2e4fb29cbe4ae11315c1ac                         |
| properties                          |                                                          |
| status                              | BUILD                                                    |
| updated                             | 2017-04-06T14:59:51Z                                     |
| user_id                             | 49055159aa3346f891e506cec3314574                         |
| volumes_attached                    |                                                          |
+-------------------------------------+----------------------------------------------------------+

Comment 7 John Fulton 2017-04-06 15:23:10 UTC
Ziv,

Would you mind sharing the output of the following from your undercloud? 

diff -u /usr/share/openstack-tripleo-heat-templates/roles_data.yaml  ~/custom-templates/custom-roles.yaml

  John

Comment 8 Ziv Greenberg 2017-04-06 15:53:13 UTC
Please:

[stack@titan03 single-nic-vlans]$ diff -u /usr/share/openstack-tripleo-heat-templates/roles_data.yaml  custom-roles.yaml
--- /usr/share/openstack-tripleo-heat-templates/roles_data.yaml	2017-03-30 21:27:27.000000000 +0300
+++ custom-roles.yaml	2017-04-06 13:55:24.266274043 +0300
@@ -14,44 +14,30 @@
 # defaults to '%stackname%-{{role.name.lower()}}-%index%'
 # sets the default for {{role.name}}HostnameFormat parameter in overcloud.yaml
 #
-# disable_constraints: (boolean) optional, whether to disable Nova and Glance
-# constraints for each role specified in the templates.
-#
-# disable_upgrade_deployment: (boolean) optional, whether to run the
-# ansible upgrade steps for all services that are deployed on the role. If set
-# to True, the operator will drive the upgrade for this role's nodes.
-#
-# upgrade_batch_size: (number): batch size for upgrades where tasks are
-# specified by services to run in batches vs all nodes at once.
-# This defaults to 1, but larger batches may be specified here.
-#
 # ServicesDefault: (list) optional default list of services to be deployed
 # on the role, defaults to an empty list. Sets the default for the
 # {{role.name}}Services parameter in overcloud.yaml
 
-- name: Controller # the 'primary' role goes first
+- name: Controller
   CountDefault: 1
   ServicesDefault:
     - OS::TripleO::Services::CACerts
-    - OS::TripleO::Services::CephMds
     - OS::TripleO::Services::CephMon
     - OS::TripleO::Services::CephExternal
-    - OS::TripleO::Services::CephRbdMirror
     - OS::TripleO::Services::CephRgw
     - OS::TripleO::Services::CinderApi
     - OS::TripleO::Services::CinderBackup
     - OS::TripleO::Services::CinderScheduler
     - OS::TripleO::Services::CinderVolume
-    - OS::TripleO::Services::Congress
     - OS::TripleO::Services::Kernel
     - OS::TripleO::Services::Keystone
     - OS::TripleO::Services::GlanceApi
+    - OS::TripleO::Services::GlanceRegistry
     - OS::TripleO::Services::HeatApi
     - OS::TripleO::Services::HeatApiCfn
     - OS::TripleO::Services::HeatApiCloudwatch
     - OS::TripleO::Services::HeatEngine
     - OS::TripleO::Services::MySQL
-    - OS::TripleO::Services::MySQLClient
     - OS::TripleO::Services::NeutronDhcpAgent
     - OS::TripleO::Services::NeutronL3Agent
     - OS::TripleO::Services::NeutronMetadataAgent
@@ -72,13 +58,11 @@
     - OS::TripleO::Services::NovaScheduler
     - OS::TripleO::Services::NovaConsoleauth
     - OS::TripleO::Services::NovaVncProxy
-    - OS::TripleO::Services::Ec2Api
     - OS::TripleO::Services::Ntp
     - OS::TripleO::Services::SwiftProxy
     - OS::TripleO::Services::SwiftStorage
     - OS::TripleO::Services::SwiftRingBuilder
     - OS::TripleO::Services::Snmp
-    - OS::TripleO::Services::Sshd
     - OS::TripleO::Services::Timezone
     - OS::TripleO::Services::CeilometerApi
     - OS::TripleO::Services::CeilometerCollector
@@ -110,33 +94,18 @@
     - OS::TripleO::Services::OpenDaylightOvs
     - OS::TripleO::Services::SensuClient
     - OS::TripleO::Services::FluentdClient
-    - OS::TripleO::Services::Collectd
-    - OS::TripleO::Services::BarbicanApi
-    - OS::TripleO::Services::PankoApi
-    - OS::TripleO::Services::Tacker
-    - OS::TripleO::Services::Zaqar
-    - OS::TripleO::Services::OVNDBs
-    - OS::TripleO::Services::NeutronML2FujitsuCfab
-    - OS::TripleO::Services::NeutronML2FujitsuFossw
-    - OS::TripleO::Services::CinderHPELeftHandISCSI
-    - OS::TripleO::Services::Etcd
-    - OS::TripleO::Services::AuditD
-    - OS::TripleO::Services::OctaviaApi
-    - OS::TripleO::Services::OctaviaHealthManager
-    - OS::TripleO::Services::OctaviaHousekeeping
-    - OS::TripleO::Services::OctaviaWorker
 
 - name: Compute
   CountDefault: 1
-  disable_upgrade_deployment: True
+  HostnameFormatDefault: '%stackname%-compute-%index%'
   ServicesDefault:
+    - OS::TripleO::Services::CephOSD
     - OS::TripleO::Services::CACerts
     - OS::TripleO::Services::CephClient
     - OS::TripleO::Services::CephExternal
     - OS::TripleO::Services::Timezone
     - OS::TripleO::Services::Ntp
     - OS::TripleO::Services::Snmp
-    - OS::TripleO::Services::Sshd
     - OS::TripleO::Services::NovaCompute
     - OS::TripleO::Services::NovaLibvirt
     - OS::TripleO::Services::Kernel
@@ -151,55 +120,3 @@
     - OS::TripleO::Services::OpenDaylightOvs
     - OS::TripleO::Services::SensuClient
     - OS::TripleO::Services::FluentdClient
-    - OS::TripleO::Services::AuditD
-    - OS::TripleO::Services::Collectd
-
-- name: BlockStorage
-  ServicesDefault:
-    - OS::TripleO::Services::CACerts
-    - OS::TripleO::Services::BlockStorageCinderVolume
-    - OS::TripleO::Services::Kernel
-    - OS::TripleO::Services::Ntp
-    - OS::TripleO::Services::Timezone
-    - OS::TripleO::Services::Snmp
-    - OS::TripleO::Services::Sshd
-    - OS::TripleO::Services::TripleoPackages
-    - OS::TripleO::Services::TripleoFirewall
-    - OS::TripleO::Services::SensuClient
-    - OS::TripleO::Services::FluentdClient
-    - OS::TripleO::Services::AuditD
-    - OS::TripleO::Services::Collectd
-
-- name: ObjectStorage
-  disable_upgrade_deployment: True
-  ServicesDefault:
-    - OS::TripleO::Services::CACerts
-    - OS::TripleO::Services::Kernel
-    - OS::TripleO::Services::Ntp
-    - OS::TripleO::Services::SwiftStorage
-    - OS::TripleO::Services::SwiftRingBuilder
-    - OS::TripleO::Services::Snmp
-    - OS::TripleO::Services::Sshd
-    - OS::TripleO::Services::Timezone
-    - OS::TripleO::Services::TripleoPackages
-    - OS::TripleO::Services::TripleoFirewall
-    - OS::TripleO::Services::SensuClient
-    - OS::TripleO::Services::FluentdClient
-    - OS::TripleO::Services::AuditD
-    - OS::TripleO::Services::Collectd
-
-- name: CephStorage
-  ServicesDefault:
-    - OS::TripleO::Services::CACerts
-    - OS::TripleO::Services::CephOSD
-    - OS::TripleO::Services::Kernel
-    - OS::TripleO::Services::Ntp
-    - OS::TripleO::Services::Snmp
-    - OS::TripleO::Services::Sshd
-    - OS::TripleO::Services::Timezone
-    - OS::TripleO::Services::TripleoPackages
-    - OS::TripleO::Services::TripleoFirewall
-    - OS::TripleO::Services::SensuClient
-    - OS::TripleO::Services::FluentdClient
-    - OS::TripleO::Services::AuditD
-    - OS::TripleO::Services::Collectd

Comment 9 John Fulton 2017-04-06 18:13:52 UTC
Ziv,

The diff shows that your controller was deployed without a bunch of services. I think modifying the file provided with the OSP10 RA to bring it up to OSP11 standards is error-prone. Let's try making a new one instead (if necessary).

Option 1: 

I see that your default Compute role has the OSD service added. If all of your computes are going to be hyperconverged, then you could skip composing a role and just deploy with the following environment file:

/usr/share/openstack-tripleo-heat-templates/environments/hyperconverged-ceph.yaml

That environment file adds the OSD service to the Compute role, and it has passed CI. By passing it with -e and not passing "-r custom-roles.yaml", we can rule out any complexity introduced by custom roles.

Option 2: 

If you do need composed roles can you please try the following: 

cp /usr/share/openstack-tripleo-heat-templates/roles_data.yaml ~/custom-templates/custom-roles.yaml

vi ~/custom-templates/custom-roles.yaml 

Then just copy/paste the following into the Compute role? 

 "- OS::TripleO::Services::CephOSD"

and then add whatever else you need role customizations for (what else might that be?). 

Please then deploy again and if you have problems and used a custom-roles.yaml file, then please send me the new diff. 

Thanks,
  John
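
To see at a glance which services a custom roles file has lost relative to the stock roles_data.yaml, the service names can be compared as sets. This is a rough sketch under stated assumptions: the function names list_services and dropped_services are hypothetical, the comparison ignores which role a service belongs to, and it relies on grep -o, which GNU and BSD grep provide.

```shell
#!/bin/sh
# list_services FILE
# Print the unique OS::TripleO::Services::* names on uncommented lines.
list_services() {
    grep -v '^[[:space:]]*#' "$1" \
        | grep -Eo 'OS::TripleO::Services::[A-Za-z0-9]+' \
        | sort -u
}

# dropped_services DEFAULT CUSTOM
# Print the service names that appear in DEFAULT but nowhere in CUSTOM.
dropped_services() {
    def=$(mktemp) && cus=$(mktemp) || return 1
    list_services "$1" > "$def"
    list_services "$2" > "$cus"
    comm -23 "$def" "$cus"      # lines unique to the first (default) list
    rm -f "$def" "$cus"
}

# Example usage (the custom file path is an assumption):
# dropped_services /usr/share/openstack-tripleo-heat-templates/roles_data.yaml \
#     ~/custom-templates/custom-roles.yaml
```

Because roles are flattened, a service moved from one role to another (for example CephOSD moved from CephStorage to Compute) does not show up as dropped; the unified diff remains the authoritative view.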

Comment 10 Ziv Greenberg 2017-04-08 19:26:45 UTC
Hi John,


I followed the second option. Unfortunately, I got exactly the same result as in my previous report: the instance is stuck in BUILD/spawning status:

[heat-admin@controller-0 ~]$ openstack server list
+--------------------------------------+------+--------+----------+------------+
| ID                                   | Name | Status | Networks | Image Name |
+--------------------------------------+------+--------+----------+------------+
| 95a020c0-1cd5-4c4d-83a8-956732148241 | vm2  | BUILD  |          | rhel_7.3   |
| eb25e1a6-7617-4d6b-8323-cd08a995f0d1 | vm1  | BUILD  |          | rhel_7.3   |
+--------------------------------------+------+--------+----------+------------+


[stack@titan03 single-nic-vlans]$  diff -u /usr/share/openstack-tripleo-heat-templates/roles_data.yaml  custom-roles.yaml
--- /usr/share/openstack-tripleo-heat-templates/roles_data.yaml	2017-03-30 21:27:27.000000000 +0300
+++ custom-roles.yaml	2017-04-08 20:33:26.855630549 +0300
@@ -130,6 +130,7 @@
   CountDefault: 1
   disable_upgrade_deployment: True
   ServicesDefault:
+    - OS::TripleO::Services::CephOSD
     - OS::TripleO::Services::CACerts
     - OS::TripleO::Services::CephClient
     - OS::TripleO::Services::CephExternal
@@ -152,54 +153,4 @@
     - OS::TripleO::Services::SensuClient
     - OS::TripleO::Services::FluentdClient
     - OS::TripleO::Services::AuditD
-    - OS::TripleO::Services::Collectd
-
-- name: BlockStorage
-  ServicesDefault:
-    - OS::TripleO::Services::CACerts
-    - OS::TripleO::Services::BlockStorageCinderVolume
-    - OS::TripleO::Services::Kernel
-    - OS::TripleO::Services::Ntp
-    - OS::TripleO::Services::Timezone
-    - OS::TripleO::Services::Snmp
-    - OS::TripleO::Services::Sshd
-    - OS::TripleO::Services::TripleoPackages
-    - OS::TripleO::Services::TripleoFirewall
-    - OS::TripleO::Services::SensuClient
-    - OS::TripleO::Services::FluentdClient
-    - OS::TripleO::Services::AuditD
-    - OS::TripleO::Services::Collectd
-
-- name: ObjectStorage
-  disable_upgrade_deployment: True
-  ServicesDefault:
-    - OS::TripleO::Services::CACerts
-    - OS::TripleO::Services::Kernel
-    - OS::TripleO::Services::Ntp
-    - OS::TripleO::Services::SwiftStorage
-    - OS::TripleO::Services::SwiftRingBuilder
-    - OS::TripleO::Services::Snmp
-    - OS::TripleO::Services::Sshd
-    - OS::TripleO::Services::Timezone
-    - OS::TripleO::Services::TripleoPackages
-    - OS::TripleO::Services::TripleoFirewall
-    - OS::TripleO::Services::SensuClient
-    - OS::TripleO::Services::FluentdClient
-    - OS::TripleO::Services::AuditD
-    - OS::TripleO::Services::Collectd
-
-- name: CephStorage
-  ServicesDefault:
-    - OS::TripleO::Services::CACerts
-    - OS::TripleO::Services::CephOSD
-    - OS::TripleO::Services::Kernel
-    - OS::TripleO::Services::Ntp
-    - OS::TripleO::Services::Snmp
-    - OS::TripleO::Services::Sshd
-    - OS::TripleO::Services::Timezone
-    - OS::TripleO::Services::TripleoPackages
-    - OS::TripleO::Services::TripleoFirewall
-    - OS::TripleO::Services::SensuClient
-    - OS::TripleO::Services::FluentdClient
-    - OS::TripleO::Services::AuditD
     - OS::TripleO::Services::Collectd



In addition, I have tried to create a volume from an image (openstack volume create --image rhel_7.3 --size 20 rhel_7.3_vol).

It is stuck on:
[heat-admin@controller-0 ~]$ openstack volume list
+--------------------------------------+--------------+----------+------+-------------+
| ID                                   | Display Name | Status   | Size | Attached to |
+--------------------------------------+--------------+----------+------+-------------+
| de576167-1d47-4446-97be-cba8cb190914 | rhel_7.3_vol | creating |   20 |             |
+--------------------------------------+--------------+----------+------+-------------+


Thank you,
Ziv

Comment 11 Ziv Greenberg 2017-04-08 20:01:45 UTC
Hi,

I have also noticed that, after a while, the "openstack server list" command hangs with no output.

Thank you,
Ziv

Comment 12 John Fulton 2017-04-10 22:57:03 UTC
I have reason to think Ceph's problem on this deployment is that the OSDs didn't start, and that they didn't start because their systemd unit files are missing. However, I see the files were created by puppet-ceph as they should be. What removed the unit files for these OSDs? Such an action may not be logged. However, testing with a build of OSP11 that passed phase 2 shows that they were created in my env and that they survived a reboot.

Details: 

This deployment had both of its OSDs down: 

[root@controller-0 ~]# ceph -s
    cluster 06ff858a-1889-11e7-9907-1c98ec173964
     health HEALTH_WARN
            288 pgs degraded
            288 pgs stale
            288 pgs stuck unclean
            288 pgs undersized
            recovery 162/243 objects degraded (66.667%)
            2/2 in osds are down
     monmap e1: 1 mons at {controller-0=10.10.128.12:6789/0}
            election epoch 3, quorum 0 controller-0
     osdmap e22: 2 osds: 0 up, 2 in
            flags sortbitwise,require_jewel_osds
      pgmap v321: 288 pgs, 8 pools, 536 MB data, 81 objects
            18329 MB used, 820 GB / 838 GB avail
            162/243 objects degraded (66.667%)
                 288 stale+active+undersized+degraded
[root@controller-0 ~]# 
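
The osdmap line in that output carries the total, up, and in counts, so the number of OSDs that are not up can be pulled out with a one-line awk filter. This is a sketch only: the function name osds_down is hypothetical, and it assumes the Jewel-era line format shown above ("osdmap eN: T osds: U up, I in"), which other Ceph releases may not use.

```shell
#!/bin/sh
# osds_down
# Read `ceph -s` text on stdin and print how many OSDs are not up.
# Exits with status 2 if no osdmap line is found.
# Assumes the line format "osdmap eN: T osds: U up, I in".
osds_down() {
    awk '$1 == "osdmap" { print $3 - $5; found = 1 }
         END { if (!found) exit 2 }'
}

# Example usage, on a node with a working ceph client:
# ceph -s | osds_down
```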

As per the logs both OSD nodes were configured correctly by puppet-ceph to start on boot: 

[root@compute-0 ~]# journalctl | grep "Created symlink from /run/systemd/system/ceph-osd.target.wants/ceph-osd"
...
Apr 08 18:00:23 compute-0.localdomain os-collect-config[1917]: Notice: /Stage[main]/Ceph::Osds/Ceph::Osd[/srv/data]/Exec[ceph-osd-activate-/srv/data]/returns: Created symlink from /run/systemd/system/ceph-osd.target.wants/ceph-osd@1.service to /usr/lib/systemd/system/ceph-osd@.service.
...
[root@compute-1 log]# tail -n 13985 messages-20170409 | head -1 
Apr  8 14:00:20 compute-1 os-collect-config: #033[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/srv/data]/Exec[ceph-osd-activate-/srv/data]/returns: Created symlink from /run/systemd/system/ceph-osd.target.wants/ceph-osd@0.service to /usr/lib/systemd/system/ceph-osd@.service.#033[0m
[root@compute-1 log]#

However, neither node has the systemd unit file needed to start those services in place:

[root@compute-0 ~]#  ls /run/systemd/system/ceph-osd.target.wants/
ls: cannot access /run/systemd/system/ceph-osd.target.wants/: No such file or directory
[root@compute-0 ~]# 

[root@compute-0 ~]# systemctl list-unit-files | grep ceph 
ceph-create-keys@.service                     static  
ceph-disk@.service                            static  
ceph-mds@.service                             disabled
ceph-mon@.service                             disabled
ceph-osd@.service                             disabled
ceph-radosgw@.service                         disabled
ceph-rbd-mirror@.service                      disabled
ceph-mds.target                               enabled 
ceph-mon.target                               enabled 
ceph-osd.target                               enabled 
ceph-radosgw.target                           enabled 
ceph-rbd-mirror.target                        disabled
ceph.target                                   enabled 
[root@compute-0 ~]# 


[root@compute-1 log]# ls /run/systemd/system/ceph-osd.target.wants/
ls: cannot access /run/systemd/system/ceph-osd.target.wants/: No such file or directory
[root@compute-1 log]# ls /run/systemd/system/
session-52.scope  session-52.scope.d  session-53.scope  session-53.scope.d
[root@compute-1 log]#

I deployed my own HCI node using passed phase_2 and it got its unit file:

[root@overcloud-osd-compute-0 log]# ls -l "/run/systemd/system/ceph-osd.target.wants/ceph-osd@1.service"
lrwxrwxrwx. 1 root root 41 Apr 10 17:14 /run/systemd/system/ceph-osd.target.wants/ceph-osd@1.service -> /usr/lib/systemd/system/ceph-osd@.service
[root@overcloud-osd-compute-0 log]# ls -l "/usr/lib/systemd/system/ceph-osd@.service"
-rw-r--r--. 1 root root 726 Apr 10 17:31 /usr/lib/systemd/system/ceph-osd@.service
[root@overcloud-osd-compute-0 log]# 

I didn't reboot my system though (Ziv did, with a pre script).

Does mine survive a reboot? 

[root@overcloud-osd-compute-0 log]# init 6 
Connection to 192.168.1.21 closed by remote host.
Connection to 192.168.1.21 closed.
[stack@hci-director ~]$ 

Yes...

[root@overcloud-osd-compute-0 ~]# ls /run/systemd/system/ceph-osd.target.wants/
ceph-osd@10.service  ceph-osd@16.service  ceph-osd@1.service   ceph-osd@25.service  ceph-osd@31.service  ceph-osd@4.service
ceph-osd@13.service  ceph-osd@19.service  ceph-osd@22.service  ceph-osd@28.service  ceph-osd@34.service  ceph-osd@7.service
[root@overcloud-osd-compute-0 ~]# uptime
 20:27:36 up 1 min,  1 user,  load average: 2.94, 1.55, 0.59
[root@overcloud-osd-compute-0 ~]# 

So my OSDs came up fine and survived a reboot.

[root@overcloud-osd-compute-0 ~]# ceph -s
    cluster eb2bb192-b1c9-11e6-9205-525400330666
     health HEALTH_OK
     monmap e1: 3 mons at {overcloud-controller-0=172.16.1.200:6789/0,overcloud-controller-1=172.16.1.201:6789/0,overcloud-controller-2=172.16.1.202:6789/0}
            election epoch 6, quorum 0,1,2 overcloud-controller-0,overcloud-controller-1,overcloud-controller-2
     osdmap e247: 36 osds: 36 up, 36 in
            flags sortbitwise,require_jewel_osds
      pgmap v7044: 1856 pgs, 8 pools, 11024 MB data, 2185 objects
            34723 MB used, 40187 GB / 40221 GB avail
                1856 active+clean
[root@overcloud-osd-compute-0 ~]# 


On Ziv's systems, if I simply start the OSD, it starts...

[root@compute-0 log]# systemctl status ceph-osd@1
● ceph-osd@1.service - Ceph object storage daemon
   Loaded: loaded (/usr/lib/systemd/system/ceph-osd@.service; disabled; vendor preset: disabled)
   Active: inactive (dead)

Apr 10 22:21:09 compute-0.localdomain systemd[1]: [/usr/lib/systemd/system/ceph-osd@.service:18] Unknown lvalue 'TasksMax' in section 'Service'
Apr 10 22:23:47 compute-0.localdomain systemd[1]: [/usr/lib/systemd/system/ceph-osd@.service:18] Unknown lvalue 'TasksMax' in section 'Service'
Apr 10 22:23:55 compute-0.localdomain systemd[1]: [/usr/lib/systemd/system/ceph-osd@.service:18] Unknown lvalue 'TasksMax' in section 'Service'
Apr 10 22:34:07 compute-0.localdomain systemd[1]: [/usr/lib/systemd/system/ceph-osd@.service:18] Unknown lvalue 'TasksMax' in section 'Service'
[root@compute-0 log]#

[root@compute-0 log]# systemctl stop ceph-osd@1
[root@compute-0 log]# 

[root@compute-0 log]# systemctl start ceph-osd@1
[root@compute-0 log]# systemctl status ceph-osd@1
● ceph-osd@1.service - Ceph object storage daemon
   Loaded: loaded (/usr/lib/systemd/system/ceph-osd@.service; disabled; vendor preset: disabled)
   Active: active (running) since Mon 2017-04-10 22:34:49 UTC; 9s ago
  Process: 221105 ExecStartPre=/usr/lib/ceph/ceph-osd-prestart.sh --cluster ${CLUSTER} --id %i (code=exited, status=0/SUCCESS)
 Main PID: 221155 (ceph-osd)
   CGroup: /system.slice/system-ceph\x2dosd.slice/ceph-osd@1.service
           └─221155 /usr/bin/ceph-osd -f --cluster ceph --id 1 --setuser ceph --setgroup ceph

Apr 10 22:34:48 compute-0.localdomain systemd[1]: Starting Ceph object storage daemon...
Apr 10 22:34:49 compute-0.localdomain ceph-osd-prestart.sh[221105]: create-or-move updated item name 'osd.1' weight 0.4093 at location {host=compute-0,root=default} to crush map
Apr 10 22:34:49 compute-0.localdomain systemd[1]: Started Ceph object storage daemon.
Apr 10 22:34:49 compute-0.localdomain ceph-osd[221155]: starting osd.1 at :/0 osd_data /var/lib/ceph/osd/ceph-1 /var/lib/ceph/osd/ceph-1/journal
Apr 10 22:34:49 compute-0.localdomain ceph-osd[221155]: 2017-04-10 22:34:49.894410 7fb6deaa7800 -1 journal FileJournal::_open: disabling aio for non-block journal.  Use journal_force_aio to force...of aio anyway
Apr 10 22:34:50 compute-0.localdomain ceph-osd[221155]: 2017-04-10 22:34:50.144006 7fb6deaa7800 -1 osd.1 20 log_to_monitors {default=true}
Hint: Some lines were ellipsized, use -l to show in full.
[root@compute-0 log]# 

[root@compute-1 ceph]# ceph -s
    cluster 06ff858a-1889-11e7-9907-1c98ec173964
     health HEALTH_WARN
            288 pgs degraded
            288 pgs stuck unclean
            288 pgs undersized
            recovery 185/447 objects degraded (41.387%)
     monmap e1: 1 mons at {controller-0=10.10.128.12:6789/0}
            election epoch 3, quorum 0 controller-0
     osdmap e28: 2 osds: 2 up, 2 in
            flags sortbitwise,require_jewel_osds
      pgmap v798: 288 pgs, 8 pools, 537 MB data, 149 objects
            18668 MB used, 820 GB / 838 GB avail
            185/447 objects degraded (41.387%)
                 288 active+undersized+degraded
[root@compute-1 ceph]# 

and both OSDs are up...

However, the PGs are undersized with only 2 OSD nodes (not a bug; this is an environmental hardware limitation), and since both OSDs have been down for a while, Ceph will try to recover if it can, as per the following...

http://docs.ceph.com/docs/master/rados/troubleshooting/troubleshooting-pg/#pgs-inconsistent

Comment 17 John Fulton 2017-04-12 21:50:35 UTC
Worked with Ziv to reproduce the problem; the VM is in a state where it's spawning but stuck there:

[heat-admin@controller-0 ~]$ nova list
/usr/lib/python2.7/site-packages/novaclient/client.py:278: UserWarning: The 'tenant_id' argument is
e releases. As 'project_id' is provided, the 'tenant_id' argument will be ignored.
  warnings.warn(msg)
+--------------------------------------+------+--------+------------+-------------+----------+
| ID                                   | Name | Status | Task State | Power State | Networks |
+--------------------------------------+------+--------+------------+-------------+----------+
| 11e54d2d-f560-4e68-9182-b2a949b5954a | vm1  | BUILD  | spawning   | NOSTATE     |          |
| ac0feb42-227f-459e-88f2-a2be570880a8 | vm1  | BUILD  | spawning   | NOSTATE     |          |
+--------------------------------------+------+--------+------------+-------------+----------+
[heat-admin@controller-0 ~]$

I then observed that the OSDs were down. All I had to do was start the OSDs [1] and then the VMs came right up: 

[heat-admin@controller-0 ~]$ nova list
/usr/lib/python2.7/site-packages/novaclient/client.py:278: UserWarning: The 'tenant_id' argument is
e releases. As 'project_id' is provided, the 'tenant_id' argument will be ignored.
  warnings.warn(msg)
+--------------------------------------+------+--------+------------+-------------+----------------
| ID                                   | Name | Status | Task State | Power State | Networks       
+--------------------------------------+------+--------+------------+-------------+----------------
| 11e54d2d-f560-4e68-9182-b2a949b5954a | vm1  | ACTIVE | -          | Running     | sriov_420=10.35
| ac0feb42-227f-459e-88f2-a2be570880a8 | vm1  | ACTIVE | -          | Running     | sriov_420=10.35
+--------------------------------------+------+--------+------------+-------------+----------------
[heat-admin@controller-0 ~]$ 

As a workaround, if you want to test load for this combination, I suspect you can just start the OSDs manually before you create VMs.

However, they should not have been down to start with. 
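
The manual workaround can be sketched as a small shell function (an illustration only, not something that was run in this environment). It relies on the standard `systemctl is-active` and `systemctl start` calls; the function name and the SYSTEMCTL override variable are hypothetical, and the override exists only so the logic can be dry-run against a stub.

```shell
# Start each listed ceph-osd@<id> unit unless systemd already reports it active.
# SYSTEMCTL is a hypothetical override; it defaults to the real systemctl.
# The body runs in a subshell so its variables do not leak into the caller.
start_inactive_osds() (
    ctl="${SYSTEMCTL:-systemctl}"
    for id in "$@"; do
        if "$ctl" is-active --quiet "ceph-osd@${id}"; then
            echo "ceph-osd@${id} already active"
        else
            echo "starting ceph-osd@${id}"
            "$ctl" start "ceph-osd@${id}"
        fi
    done
)
```

On the nodes in this report that would be `start_inactive_osds 0` on compute-1 and `start_inactive_osds 1` on compute-0, run as root before creating the VMs.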

Other observations: 
- The unit files were created [2] but are now missing [3]. 
- The nodes are rebooted for SRIOV purposes by /home/stack/single-nic-vlans/first-boot.yaml

Next questions:
1. Did the OSDs crash or did they not start after the reboot? (I suspect the latter but will back it up with logs)
2. Why are the unit files missing? 


Footnotes:
[1] 

[root@compute-1 ~]# systemctl status ceph-osd@0
● ceph-osd@0.service - Ceph object storage daemon
   Loaded: loaded (/usr/lib/systemd/system/ceph-osd@.service; disabled; vendor preset: disabled)
   Active: inactive (dead)

Apr 12 20:58:41 compute-1.localdomain systemd[1]: [/usr/lib/systemd/system/ceph-osd@.service:18] Unknown lvalue 'TasksMax' in section 'Service'
Apr 12 21:02:24 compute-1.localdomain systemd[1]: [/usr/lib/systemd/system/ceph-osd@.service:18] Unknown lvalue 'TasksMax' in section 'Service'
[root@compute-1 ~]# systemctl start ceph-osd@0
[root@compute-1 ~]# systemctl status ceph-osd@0
● ceph-osd@0.service - Ceph object storage daemon
   Loaded: loaded (/usr/lib/systemd/system/ceph-osd@.service; disabled; vendor preset: disabled)
   Active: active (running) since Wed 2017-04-12 21:02:29 UTC; 2s ago
  Process: 8157 ExecStartPre=/usr/lib/ceph/ceph-osd-prestart.sh --cluster ${CLUSTER} --id %i (code=exited, status=0/SUCCESS)
 Main PID: 8207 (ceph-osd)
   CGroup: /system.slice/system-ceph\x2dosd.slice/ceph-osd@0.service
           └─8207 /usr/bin/ceph-osd -f --cluster ceph --id 0 --setuser ceph --setgroup ceph

Apr 12 21:02:29 compute-1.localdomain systemd[1]: Starting Ceph object storage daemon...
Apr 12 21:02:29 compute-1.localdomain ceph-osd-prestart.sh[8157]: create-or-move updated item name 'osd.0' weight 0.4093 at location {host=compute-1,root=default} to crush map
Apr 12 21:02:29 compute-1.localdomain systemd[1]: Started Ceph object storage daemon.
Apr 12 21:02:29 compute-1.localdomain ceph-osd[8207]: starting osd.0 at :/0 osd_data /var/lib/ceph/osd/ceph-0 /var/lib/ceph/osd/ceph-0/journal
Apr 12 21:02:31 compute-1.localdomain ceph-osd[8207]: 2017-04-12 21:02:31.122614 7ff789f0e800 -1 journal FileJournal::_open: disabling aio for non-block journal.  Use journal_force_aio to force use of aio anyway
Apr 12 21:02:31 compute-1.localdomain ceph-osd[8207]: 2017-04-12 21:02:31.507154 7ff789f0e800 -1 osd.0 24 log_to_monitors {default=true}
Hint: Some lines were ellipsized, use -l to show in full.
[root@compute-1 ~]#

[root@compute-0 ~]# systemctl status ceph-osd@1
● ceph-osd@1.service - Ceph object storage daemon
   Loaded: loaded (/usr/lib/systemd/system/ceph-osd@.service; disabled; vendor preset: disabled)
   Active: inactive (dead)

Apr 12 21:02:58 compute-0.localdomain systemd[1]: [/usr/lib/systemd/system/ceph-osd@.service:18] Unknown lvalue 'TasksMax' in section 'Service'
[root@compute-0 ~]# systemctl start ceph-osd@1
[root@compute-0 ~]# systemctl status ceph-osd@1
● ceph-osd@1.service - Ceph object storage daemon
   Loaded: loaded (/usr/lib/systemd/system/ceph-osd@.service; disabled; vendor preset: disabled)
   Active: active (running) since Wed 2017-04-12 21:03:06 UTC; 8s ago
  Process: 8781 ExecStartPre=/usr/lib/ceph/ceph-osd-prestart.sh --cluster ${CLUSTER} --id %i (code=exited, status=0/SUCCESS)
 Main PID: 8831 (ceph-osd)
   CGroup: /system.slice/system-ceph\x2dosd.slice/ceph-osd@1.service
           └─8831 /usr/bin/ceph-osd -f --cluster ceph --id 1 --setuser ceph --setgroup ceph

Apr 12 21:03:06 compute-0.localdomain systemd[1]: Starting Ceph object storage daemon...
Apr 12 21:03:06 compute-0.localdomain ceph-osd-prestart.sh[8781]: create-or-move updated item name 'osd.1' weight 0.4093 at location {host=compute-0,root=default} to crush map
Apr 12 21:03:06 compute-0.localdomain systemd[1]: Started Ceph object storage daemon.
Apr 12 21:03:07 compute-0.localdomain ceph-osd[8831]: starting osd.1 at :/0 osd_data /var/lib/ceph/osd/ceph-1 /var/lib/ceph/osd/ceph-1/journal
Apr 12 21:03:08 compute-0.localdomain ceph-osd[8831]: 2017-04-12 21:03:08.403407 7f53c8949800 -1 journal FileJournal::_open: disabling aio for non-block journal.  Use journal_force_aio to force use of aio anyway
Apr 12 21:03:08 compute-0.localdomain ceph-osd[8831]: 2017-04-12 21:03:08.791003 7f53c8949800 -1 osd.1 24 log_to_monitors {default=true}
Hint: Some lines were ellipsized, use -l to show in full.
[root@compute-0 ~]# 

[2] 
[root@compute-0 ~]#  journalctl | grep "Created symlink from /run/systemd/system/ceph-osd.target.wants/ceph-osd"
Apr 12 19:16:03 compute-0.localdomain os-collect-config[1965]: a]/Exec[ceph-osd-prepare-/srv/data]/returns: + udevadm settle\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/srv/data]/Exec[ceph-osd-prepare-/srv/data]/returns: executed successfully\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/srv/data]/Exec[fcontext_/srv/data]/returns: executed successfully\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/srv/data]/Exec[ceph-osd-activate-/srv/data]/returns: + test -b /srv/data\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/srv/data]/Exec[ceph-osd-activate-/srv/data]/returns: + egrep -e '^/dev' -q -v\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/srv/data]/Exec[ceph-osd-activate-/srv/data]/returns: + echo /srv/data\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/srv/data]/Exec[ceph-osd-activate-/srv/data]/returns: + mkdir -p /srv/data\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/srv/data]/Exec[ceph-osd-activate-/srv/data]/returns: + getent passwd ceph\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/srv/data]/Exec[ceph-osd-activate-/srv/data]/returns: + chown -h ceph:ceph /srv/data\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/srv/data]/Exec[ceph-osd-activate-/srv/data]/returns: + test -b /srv/data\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/srv/data]/Exec[ceph-osd-activate-/srv/data]/returns: + ceph-disk activate /srv/data\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/srv/data]/Exec[ceph-osd-activate-/srv/data]/returns: got monmap epoch 1\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/srv/data]/Exec[ceph-osd-activate-/srv/data]/returns: added key for osd.1\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/srv/data]/Exec[ceph-osd-activate-/srv/data]/returns: Created symlink from /run/systemd/system/ceph-osd.target.wants/ceph-osd@1.service to 
/usr/lib/systemd/system/ceph-osd@.service.\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/srv/data]/Exec[ceph-osd-activate-/srv/data]/returns: + t
Apr 12 19:16:03 compute-0.localdomain os-collect-config[1965]: Notice: /Stage[main]/Ceph::Osds/Ceph::Osd[/srv/data]/Exec[ceph-osd-activate-/srv/data]/returns: Created symlink from /run/systemd/system/ceph-osd.target.wants/ceph-osd@1.service to /usr/lib/systemd/system/ceph-osd@.service.
[root@compute-0 ~]#

[root@compute-1 ~]#  journalctl | grep "Created symlink from /run/systemd/system/ceph-osd.target.wants/ceph-osd"
Apr 12 19:16:01 compute-1.localdomain os-collect-config[1921]: c[ceph-osd-activate-/srv/data]/returns: + mkdir -p /srv/data\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/srv/data]/Exec[ceph-osd-activate-/srv/data]/returns: + getent passwd ceph\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/srv/data]/Exec[ceph-osd-activate-/srv/data]/returns: + chown -h ceph:ceph /srv/data\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/srv/data]/Exec[ceph-osd-activate-/srv/data]/returns: + test -b /srv/data\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/srv/data]/Exec[ceph-osd-activate-/srv/data]/returns: + ceph-disk activate /srv/data\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/srv/data]/Exec[ceph-osd-activate-/srv/data]/returns: got monmap epoch 1\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/srv/data]/Exec[ceph-osd-activate-/srv/data]/returns: added key for osd.0\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/srv/data]/Exec[ceph-osd-activate-/srv/data]/returns: Created symlink from /run/systemd/system/ceph-osd.target.wants/ceph-osd@0.service to /usr/lib/systemd/system/ceph-osd@.service.\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/srv/data]/Exec[ceph-osd-activate-/srv/data]/returns: + test -f /usr/lib/udev/rules.d/95-ceph-osd.rules.disabled\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/srv/data]/Exec[ceph-osd-activate-/srv/data]/returns: executed successfully\u001b[0m\n\u001b[mNotice: /Stage[main]/Tripleo::Profile::Base::Kernel/Kmod::Load[ip_conntrack_proto_sctp]/Exec[modprobe ip_conntrack_proto_sctp]/returns: executed successfully\u001b[0m\n\u001b[mNotice: /Stage[main]/Neutron::Logging/Oslo::Log[neutron_config]/Neutron_config[DEFAULT/debug]/ensure: created\u001b[0m\n\u001b[mNotice: /Stage[main]/Neutron::Logging/Oslo::Log[neutron_config]/Neutron_config[DEFAULT/log_dir]/ensure: created\u001b[0m\n\u001b[mNotice: 
/Stage[main]/Neutron/Oslo::Messaging::Default[neutron_config]/Neutron_config[DEFAULT/control_exchange]/ensure: created\u001b[0m\n\u001b[mNotice: /Stag
Apr 12 19:16:01 compute-1.localdomain os-collect-config[1921]: Notice: /Stage[main]/Ceph::Osds/Ceph::Osd[/srv/data]/Exec[ceph-osd-activate-/srv/data]/returns: Created symlink from /run/systemd/system/ceph-osd.target.wants/ceph-osd@0.service to /usr/lib/systemd/system/ceph-osd@.service.
[root@compute-1 ~]# 

[3] 
[root@compute-0 ~]# stat /run/systemd/system/ceph-osd.target.wants/
stat: cannot stat ‘/run/systemd/system/ceph-osd.target.wants/’: No such file or directory
You have mail in /var/spool/mail/root
[root@compute-0 ~]# 

[root@compute-1 ~]# stat /run/systemd/system/ceph-osd.target.wants/ceph-osd@0.service 
stat: cannot stat ‘/run/systemd/system/ceph-osd.target.wants/ceph-osd@0.service’: No such file or directory
[root@compute-1 ~]# stat /run/systemd/system/ceph-osd.target.wants
stat: cannot stat ‘/run/systemd/system/ceph-osd.target.wants’: No such file or directory
[root@compute-1 ~]#

Comment 20 John Fulton 2017-04-13 20:36:36 UTC
Created attachment 1271547 [details]
OSPd first-boot Heat env file which applies workaround with each deploy

Comment 21 John Fulton 2017-04-13 20:40:12 UTC
This seems to be happening because OSP11 is shipping a newer version
of ceph-disk which uses --runtime [1]. That change isn't a problem
for block device OSDs but is for directory-backed OSDs.

How to reproduce:
- Deploy a directory-backed OSD 
- Observe on first boot, on an OSD node, that the OSDs are running
  but that their target is in /run, not /etc [2], which will cause
  them to not start on reboot
- Reboot the system and observe that the OSDs are not running
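
The /run versus /etc check in the steps above can be sketched as a shell function. This is a minimal illustration, not a tool shipped with ceph or OSP; the function name is hypothetical, and it only looks at the two ceph-osd.target.wants directories that appear in this report. The first argument is a filesystem root so the check can also be pointed at a scratch directory.

```shell
# Report where the boot-time symlink for ceph-osd@<id> lives.
# $1 = filesystem root to inspect ("/" on a real node), $2 = OSD id.
# Prints "persistent" (found under /etc), "runtime" (only under /run,
# so it is lost on reboot), or "none".
osd_enable_scope() (
    root="${1%/}"
    unit="ceph-osd@${2}.service"
    wants="systemd/system/ceph-osd.target.wants"
    if [ -L "${root}/etc/${wants}/${unit}" ]; then
        echo persistent
    elif [ -L "${root}/run/${wants}/${unit}" ]; then
        echo runtime
    else
        echo none
    fi
)
```

Given the listing in [2], `osd_enable_scope / 2` on that node should report runtime before the reboot and none afterwards.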

Workaround:
(Deploy an overcloud with a version of ceph-disk without --runtime)
- Use the attached predeploy script
- It downloads the version of ceph-disk [3] used before the change [1] 
- It installs it so that when ceph is configured and puppet calls ceph-disk
  the symlink to start the OSDs on boot is put into /etc not /run
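
A different approach, which is not what the attached script does and was not verified in this bug, would be to leave ceph-disk alone and persistently enable any OSD unit that is only runtime-enabled. The sketch below only prints the commands it would run. It assumes that plain `systemctl enable` (without --runtime) puts the symlink under /etc for this unit, which in turn assumes the unit's [Install] section targets ceph-osd.target; the function name is hypothetical. It can only help if it runs while the /run symlinks still exist, i.e. before the reboot.

```shell
# Print a `systemctl enable` command for every ceph-osd unit that has a
# symlink under /run but not under /etc. Nothing is executed.
# $1 = filesystem root to inspect ("/" on a real node).
persist_cmds_for_runtime_osds() (
    root="${1%/}"
    wants="systemd/system/ceph-osd.target.wants"
    for link in "${root}/run/${wants}"/ceph-osd@*.service; do
        [ -L "$link" ] || continue                            # glob matched nothing
        unit="${link##*/}"
        [ -L "${root}/etc/${wants}/${unit}" ] && continue     # already persistent
        echo "systemctl enable ${unit}"
    done
)
```

Review the output of `persist_cmds_for_runtime_osds /` first; if it looks right, it can be piped to sh as root.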

Impact:
This bug does not affect block-device-backed OSDs, which are what
should be used in production deployments, so prioritize accordingly.
I think you flushed out this bug for two reasons:
- Your dev env uses directory-backed OSDs (it should have disks)
- You reboot during deployment for SRIOV purposes

Next Step:
- Open bug against ceph-disk on which this bug will depend
- Perhaps they want to keep the change [1] but make it handle directory-backed OSDs?

[1] https://github.com/ceph/ceph/commit/539385b143feee3905dceaf7a8faaced42f2d3c6

[2] 
[root@overcloud-osd-compute-1 ~]# ls /run/systemd/system/ceph-osd.target.wants/
ceph-osd@2.service
[root@overcloud-osd-compute-1 ~]# ls /etc/systemd/system/ceph-osd.target.wants/
ls: cannot access /etc/systemd/system/ceph-osd.target.wants/: No such file or directory
[root@overcloud-osd-compute-1 ~]# 

[3] https://raw.githubusercontent.com/ceph/ceph/72f0b2aa1eb4b7b2a2222c2847d26f99400a8374/src/ceph-disk/ceph_disk/main.py

Comment 23 jomurphy 2017-04-17 13:19:18 UTC

*** This bug has been marked as a duplicate of bug 1442265 ***

