Bug 1936536 - Bootstrap apply-spec provision failed to deploy services
Summary: Bootstrap apply-spec provision failed to deploy services
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: Cephadm
Version: 5.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 5.0
Assignee: Adam King
QA Contact: Sunil Kumar Nagaraju
Docs Contact: Karen Norteman
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2021-03-08 17:34 UTC by Sunil Kumar Nagaraju
Modified: 2021-08-30 08:28 UTC
CC: 5 users

Fixed In Version: ceph-16.1.0-1323.el8cp
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-08-30 08:28:49 UTC
Embargoed:




Links
System ID Last Updated
Red Hat Issue Tracker RHCEPH-1166 2021-08-30 00:15:07 UTC
Red Hat Product Errata RHBA-2021:3294 2021-08-30 08:28:57 UTC

Comment 1 Adam King 2021-03-17 21:22:29 UTC
I was able to get this to work with a few modifications to your YAML spec. For the mds service, it needs a "service_id" rather than a service name, which I think is what was causing the error message "Error EINVAL: Cannot add Service: id required" in your output. I also removed the "myfs" label from that service's placement, since I didn't see any hosts with that label in the spec. Lastly, I changed the host names and the number of hosts to match the system I was using.
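
For illustration, here is roughly how the mds entry changes. The "before" block is only a guess at what the original spec contained (it isn't attached to this bug); the "after" block matches the corrected spec shown below:

# likely failing form (assumed): no service_id, and a placement label with no matching hosts
service_type: mds
placement:
  label: myfs
---
# corrected form: service_id set, count-based placement
service_type: mds
service_id: myfs
placement:
  count: 3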

[root@vm-00 ~]# cat spec.yaml
service_type: host
addr: vm-01
hostname: vm-01
labels:
- mon
- mgr
- osd
- rgw
- alertmanager
- node-exporter
- grafana
- prometheus
---
service_type: host
addr: vm-02
hostname: vm-02
labels:
- mon
- osd
- mgr
- rgw
- alertmanager
- node-exporter
---
service_type: mon
placement:
  hosts:
   - vm-00
   - vm-01
   - vm-02
---
service_type: mgr
service_id: mgr
unmanaged: true
placement:
  label: mgr
---
service_type: osd
service_id: all
placement:
  host_pattern: '*'
data_devices:
  all: true
encrypted: true
---
service_type: rgw
service_id: realm.zone
placement:
  hosts:
    - vm-01
    - vm-02
unmanaged: false
---
service_type: mds
service_id: myfs
placement:
  count: 3

[root@vm-00 ~]# ./cephadm --image docker.io/amk3798/ceph:latest bootstrap --mon-ip 192.168.122.213 --apply-spec spec.yaml
Verifying podman|docker is present...
Verifying lvm2 is present...
Verifying time synchronization is in place...
Unit chronyd.service is enabled and running
Repeating the final host check...
podman|docker (/usr/bin/podman) is present
systemctl is present
lvcreate is present
Unit chronyd.service is enabled and running
Host looks OK
Cluster fsid: 51cc0b18-8763-11eb-8fe3-52540095f015
Verifying IP 192.168.122.213 port 3300 ...
Verifying IP 192.168.122.213 port 6789 ...
Mon IP 192.168.122.213 is in CIDR network 192.168.122.0/24
- internal network (--cluster-network) has not been provided, OSD replication will default to the public_network
Pulling container image docker.io/amk3798/ceph:latest...
Extracting ceph user uid/gid from container image...
Creating initial keys...
Creating initial monmap...
Creating mon...
Waiting for mon to start...
Waiting for mon...
mon is available
Assimilating anything we can from ceph.conf...
Generating new minimal ceph.conf...
Restarting the monitor...
Setting mon public_network to 192.168.122.0/24
Wrote config to /etc/ceph/ceph.conf
Wrote keyring to /etc/ceph/ceph.client.admin.keyring
Creating mgr...
Verifying port 9283 ...
Waiting for mgr to start...
Waiting for mgr...
mgr not available, waiting (1/15)...
mgr not available, waiting (2/15)...
mgr not available, waiting (3/15)...
mgr is available
Enabling cephadm module...
Waiting for the mgr to restart...
Waiting for mgr epoch 5...
mgr epoch 5 is available
Setting orchestrator backend to cephadm...
Generating ssh key...
Wrote public SSH key to to /etc/ceph/ceph.pub
Adding key to root@localhost authorized_keys...
Adding host vm-00...
Deploying mon service with default placement...
Deploying mgr service with default placement...
Deploying crash service with default placement...
Enabling mgr prometheus module...
Deploying prometheus service with default placement...
Deploying grafana service with default placement...
Deploying node-exporter service with default placement...
Deploying alertmanager service with default placement...
Enabling the dashboard module...
Waiting for the mgr to restart...
Waiting for mgr epoch 13...
mgr epoch 13 is available
Generating a dashboard self-signed certificate...
Creating initial admin user...
Fetching dashboard port number...
Ceph Dashboard is now available at:

	     URL: https://vm-00:8443/
	    User: admin
	Password: 8ewchxi3jl

Applying spec.yaml to cluster
Adding ssh key to vm-01
The authenticity of host 'vm-01 (192.168.122.156)' can't be established.
ECDSA key fingerprint is SHA256:IQbeXhSPWL285921nNGdAr9hxG0mqJ+Gy1bT2iHL6+c.
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Adding ssh key to vm-02
The authenticity of host 'vm-02 (192.168.122.54)' can't be established.
ECDSA key fingerprint is SHA256:qMhD0IVmgJaX0qa5j0nGpNWLLpGuMWHjoQ/3UGF94ZE.
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Added host 'vm-01'
Added host 'vm-02'
Scheduled mon update...
Scheduled mgr update...
Scheduled osd.all update...
Scheduled rgw.realm.zone update...
Scheduled mds.myfs update...

You can access the Ceph CLI with:

	sudo ./cephadm shell --fsid 51cc0b18-8763-11eb-8fe3-52540095f015 -c /etc/ceph/ceph.conf -k /etc/ceph/ceph.client.admin.keyring

Please consider enabling telemetry to help improve Ceph:

	ceph telemetry on

For more information see:

	https://docs.ceph.com/docs/master/mgr/telemetry/

Bootstrap complete.

[root@vm-00 ~]# ./cephadm shell
Inferring fsid 51cc0b18-8763-11eb-8fe3-52540095f015
Inferring config /var/lib/ceph/51cc0b18-8763-11eb-8fe3-52540095f015/mon.vm-00/config
Using recent ceph image docker.io/amk3798/ceph@sha256:1cfe55ad3acbcb3c246df45ef3e0ff32bfe82980e07ef1b33e0203a220177539
WARNING: The same type, major and minor should not be used for multiple devices.
WARNING: The same type, major and minor should not be used for multiple devices.

[ceph: root@vm-00 /]# ceph orch host ls
HOST   ADDR   LABELS                                                         STATUS  
vm-00  vm-00                                                                         
vm-01  vm-01  alertmanager prometheus osd mgr rgw grafana node-exporter mon          
vm-02  vm-02  alertmanager osd mgr rgw node-exporter mon                             

[ceph: root@vm-00 /]# ceph orch ps
NAME                         HOST   STATUS         REFRESHED  AGE  VERSION                IMAGE NAME                                                                                      IMAGE ID      CONTAINER ID  
alertmanager.vm-00           vm-00  starting       -          -    <unknown>              <unknown>                                                                                       <unknown>     <unknown>     
crash.vm-00                  vm-00  running (5m)   10s ago    5m   17.0.0-1275-g5e197a21  docker.io/amk3798/ceph@sha256:1cfe55ad3acbcb3c246df45ef3e0ff32bfe82980e07ef1b33e0203a220177539  18ab1f16e4c7  f10e0c23567e  
crash.vm-01                  vm-01  running (94s)  12s ago    93s  17.0.0-1275-g5e197a21  docker.io/amk3798/ceph@sha256:1cfe55ad3acbcb3c246df45ef3e0ff32bfe82980e07ef1b33e0203a220177539  18ab1f16e4c7  51f61a4eb48d  
crash.vm-02                  vm-02  running (92s)  12s ago    91s  17.0.0-1275-g5e197a21  docker.io/amk3798/ceph@sha256:1cfe55ad3acbcb3c246df45ef3e0ff32bfe82980e07ef1b33e0203a220177539  18ab1f16e4c7  44615befdaee  
grafana.vm-00                vm-00  running (4m)   10s ago    4m   6.7.4                  docker.io/ceph/ceph-grafana:6.7.4                                                               80728b29ad3f  054dfa2bb2fe  
mds.myfs.vm-00.ehgdtt        vm-00  running (4m)   10s ago    4m   17.0.0-1275-g5e197a21  docker.io/amk3798/ceph@sha256:1cfe55ad3acbcb3c246df45ef3e0ff32bfe82980e07ef1b33e0203a220177539  18ab1f16e4c7  a27b9d6d4c6f  
mds.myfs.vm-01.euuqye        vm-01  running (22s)  12s ago    22s  17.0.0-1275-g5e197a21  docker.io/amk3798/ceph@sha256:1cfe55ad3acbcb3c246df45ef3e0ff32bfe82980e07ef1b33e0203a220177539  18ab1f16e4c7  1b83bcd73671  
mds.myfs.vm-02.nwvzwm        vm-02  running (21s)  12s ago    20s  17.0.0-1275-g5e197a21  docker.io/amk3798/ceph@sha256:1cfe55ad3acbcb3c246df45ef3e0ff32bfe82980e07ef1b33e0203a220177539  18ab1f16e4c7  1fe76c6424de  
mgr.vm-00.jnqato             vm-00  running (8m)   10s ago    8m   17.0.0-1275-g5e197a21  docker.io/amk3798/ceph:latest                                                                   18ab1f16e4c7  fcd7fd12888a  
mon.vm-00                    vm-00  running (8m)   10s ago    8m   17.0.0-1275-g5e197a21  docker.io/amk3798/ceph:latest                                                                   18ab1f16e4c7  b1c3e06d98be  
mon.vm-01                    vm-01  running (89s)  12s ago    89s  17.0.0-1275-g5e197a21  docker.io/amk3798/ceph@sha256:1cfe55ad3acbcb3c246df45ef3e0ff32bfe82980e07ef1b33e0203a220177539  18ab1f16e4c7  133f5ca828fd  
mon.vm-02                    vm-02  running (87s)  12s ago    87s  17.0.0-1275-g5e197a21  docker.io/amk3798/ceph@sha256:1cfe55ad3acbcb3c246df45ef3e0ff32bfe82980e07ef1b33e0203a220177539  18ab1f16e4c7  2c3d1574e152  
node-exporter.vm-00          vm-00  running (4m)   10s ago    4m   0.18.1                 docker.io/prom/node-exporter:v0.18.1                                                            e5a616e4b9cf  c42673935202  
node-exporter.vm-01          vm-01  running (81s)  12s ago    80s  0.18.1                 docker.io/prom/node-exporter:v0.18.1                                                            e5a616e4b9cf  a49c3c77efd7  
node-exporter.vm-02          vm-02  running (74s)  12s ago    74s  0.18.1                 docker.io/prom/node-exporter:v0.18.1                                                            e5a616e4b9cf  3e8566e3fa30  
osd.0                        vm-00  running (4m)   10s ago    4m   17.0.0-1275-g5e197a21  docker.io/amk3798/ceph@sha256:1cfe55ad3acbcb3c246df45ef3e0ff32bfe82980e07ef1b33e0203a220177539  18ab1f16e4c7  53f1a048a2fd  
osd.1                        vm-00  running (4m)   10s ago    4m   17.0.0-1275-g5e197a21  docker.io/amk3798/ceph@sha256:1cfe55ad3acbcb3c246df45ef3e0ff32bfe82980e07ef1b33e0203a220177539  18ab1f16e4c7  573c55dcde65  
osd.2                        vm-02  running (35s)  12s ago    35s  17.0.0-1275-g5e197a21  docker.io/amk3798/ceph@sha256:1cfe55ad3acbcb3c246df45ef3e0ff32bfe82980e07ef1b33e0203a220177539  18ab1f16e4c7  4423075503f8  
osd.3                        vm-01  running (32s)  12s ago    32s  17.0.0-1275-g5e197a21  docker.io/amk3798/ceph@sha256:1cfe55ad3acbcb3c246df45ef3e0ff32bfe82980e07ef1b33e0203a220177539  18ab1f16e4c7  b32174442e0c  
osd.4                        vm-02  running (32s)  12s ago    31s  17.0.0-1275-g5e197a21  docker.io/amk3798/ceph@sha256:1cfe55ad3acbcb3c246df45ef3e0ff32bfe82980e07ef1b33e0203a220177539  18ab1f16e4c7  a4686c19d9e1  
osd.5                        vm-01  running (28s)  12s ago    28s  17.0.0-1275-g5e197a21  docker.io/amk3798/ceph@sha256:1cfe55ad3acbcb3c246df45ef3e0ff32bfe82980e07ef1b33e0203a220177539  18ab1f16e4c7  fab03d59a026  
prometheus.vm-00             vm-00  running (4m)   10s ago    4m   2.18.1                 docker.io/prom/prometheus:v2.18.1                                                               de242295e225  b7f926632e0d  
rgw.realm.zone.vm-01.eofpnf  vm-01  running (26s)  12s ago    26s  17.0.0-1275-g5e197a21  docker.io/amk3798/ceph@sha256:1cfe55ad3acbcb3c246df45ef3e0ff32bfe82980e07ef1b33e0203a220177539  18ab1f16e4c7  2bd8444800a0  
rgw.realm.zone.vm-02.tyoilw  vm-02  running (24s)  12s ago    24s  17.0.0-1275-g5e197a21  docker.io/amk3798/ceph@sha256:1cfe55ad3acbcb3c246df45ef3e0ff32bfe82980e07ef1b33e0203a220177539  18ab1f16e4c7  a9d00284e1a7  

[ceph: root@vm-00 /]# ceph orch ls
NAME            RUNNING  REFRESHED  AGE  PLACEMENT          IMAGE NAME                                                                                      IMAGE ID      
alertmanager        1/1  6s ago     8m   count:1            docker.io/prom/alertmanager:v0.20.0                                                             0881eb8f169f  
crash               3/3  7s ago     8m   *                  docker.io/amk3798/ceph@sha256:1cfe55ad3acbcb3c246df45ef3e0ff32bfe82980e07ef1b33e0203a220177539  18ab1f16e4c7  
grafana             1/1  6s ago     8m   count:1            docker.io/ceph/ceph-grafana:6.7.4                                                               80728b29ad3f  
mds.myfs            3/3  7s ago     5m   count:3            docker.io/amk3798/ceph@sha256:1cfe55ad3acbcb3c246df45ef3e0ff32bfe82980e07ef1b33e0203a220177539  18ab1f16e4c7  
mgr                 1/2  6s ago     5m   <unmanaged>        docker.io/amk3798/ceph:latest                                                                   18ab1f16e4c7  
mon                 3/3  7s ago     5m   vm-00;vm-01;vm-02  mix                                                                                             18ab1f16e4c7  
node-exporter       3/3  7s ago     8m   *                  docker.io/prom/node-exporter:v0.18.1                                                            e5a616e4b9cf  
osd.all             6/6  7s ago     5m   *                  docker.io/amk3798/ceph@sha256:1cfe55ad3acbcb3c246df45ef3e0ff32bfe82980e07ef1b33e0203a220177539  18ab1f16e4c7  
prometheus          1/1  6s ago     8m   count:1            docker.io/prom/prometheus:v2.18.1                                                               de242295e225  
rgw.realm.zone      2/2  7s ago     53s  vm-01;vm-02        docker.io/amk3798/ceph@sha256:1cfe55ad3acbcb3c246df45ef3e0ff32bfe82980e07ef1b33e0203a220177539  18ab1f16e4c7  



NOTE: It will take a few minutes after the bootstrap completes for all of the daemons to be added, so don't be surprised if they aren't up yet during the first few minutes. Also, don't be surprised to see an error like "Failed to apply: Cannot place <ServiceSpec for service_name=mon> on vm-01, vm-02: Unknown hosts". This error message can appear when an apply runs shortly after hosts are added, as happens when using --apply-spec during bootstrap. It isn't actually a problem; the cluster just needs a bit of time to refresh itself and recognize that those hosts are present. The issue should resolve itself as long as the hosts listed in the error are included in the output of 'ceph orch host ls'.


Tell me if those modifications work for you or if you're still seeing issues. If it worked properly you should see lines like "Added host 'vm-01'" and "Scheduled mon update..." near the end of the bootstrap output right before the portion explaining how to access the Ceph CLI.

Comment 2 Adam King 2021-03-17 21:25:45 UTC
Also, one thing I forgot to add: if you set the mgr service to unmanaged with the "unmanaged: true" line, cephadm won't place the extra mgr daemons you want, because "unmanaged" tells cephadm not to touch that service (even if the user issues an apply command). I recommend taking that out as well, unless that was your intention.
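
As a rough sketch (not taken from the original report), the mgr section without the unmanaged flag would look like the block below, and it can be re-applied to the running cluster with 'ceph orch apply -i' (the file name is illustrative):

service_type: mgr
service_id: mgr
placement:
  label: mgr     # with unmanaged removed, cephadm will place mgr daemons on hosts labeled "mgr"

ceph orch apply -i spec.yaml   # re-apply after editing the spec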

Comment 21 errata-xmlrpc 2021-08-30 08:28:49 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat Ceph Storage 5.0 bug fix and enhancement), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:3294

