Bug 1909977

Summary: Bootstrap fails at dashboard admin user creation
Product: [Red Hat Storage] Red Hat Ceph Storage
Component: Cephadm
Version: 5.0
Target Release: 5.0
Hardware: Unspecified
OS: Linux
Status: CLOSED ERRATA
Severity: urgent
Priority: high
Keywords: Automation, AutomationBlocker, Regression, Reopened, TestBlocker
Reporter: Sunil Kumar Nagaraju <sunnagar>
Assignee: Juan Miguel Olmo <jolmomar>
QA Contact: Sunil Kumar Nagaraju <sunnagar>
Docs Contact: Karen Norteman <knortema>
CC: dsavinea, kdreyer, knortema, pnataraj, sewagner, skanta, tserlin, vereddy
Fixed In Version: ceph-16.1.0-486.el8cp
Doc Type: No Doc Update
Type: Bug
Last Closed: 2021-08-30 08:27:34 UTC

Comment 3 Preethi 2021-01-06 10:37:56 UTC
@Juan, the issue is seen with the latest alpha image as well:
registry-proxy.engineering.redhat.com/rh-osbs/rhceph:ceph-5.0-rhel-8-containers-candidate-10695-20210105032152


[root@magna106 ubuntu]# sudo cephadm rm-cluster --fsid 4e57a198-635d-4ca6-bdaa-4687d99cdc7a --force
[root@magna106 ubuntu]# 
[root@magna106 ubuntu]# 
[root@magna106 ubuntu]# sudo cephadm bootstrap --mon-ip 10.8.128.106 --registry-json cephadm.txt
Verifying podman|docker is present...
Verifying lvm2 is present...
Verifying time synchronization is in place...
Unit chronyd.service is enabled and running
Repeating the final host check...
podman|docker (/bin/podman) is present
systemctl is present
lvcreate is present
Unit chronyd.service is enabled and running
Host looks OK
Cluster fsid: 401136aa-5009-11eb-aa8d-002590fc1bf0
Verifying IP 10.8.128.106 port 3300 ...
Verifying IP 10.8.128.106 port 6789 ...
Mon IP 10.8.128.106 is in CIDR network 10.8.128.0/21
Pulling custom registry login info from cephadm.txt.
Logging into custom registry.
Pulling container image registry.redhat.io/rhceph-alpha/rhceph-5-rhel8:latest...
Extracting ceph user uid/gid from container image...
Creating initial keys...
Creating initial monmap...
Creating mon...
Waiting for mon to start...
Waiting for mon...
mon is available
Assimilating anything we can from ceph.conf...
Generating new minimal ceph.conf...
Restarting the monitor...
Setting mon public_network...
Creating mgr...
Verifying port 9283 ...
Wrote keyring to /etc/ceph/ceph.client.admin.keyring
Wrote config to /etc/ceph/ceph.conf
Waiting for mgr to start...
Waiting for mgr...
mgr not available, waiting (1/10)...
mgr not available, waiting (2/10)...
mgr not available, waiting (3/10)...
mgr is available
Enabling cephadm module...
Waiting for the mgr to restart...
Waiting for mgr epoch 5...
mgr epoch 5 is available
Setting orchestrator backend to cephadm...
Generating ssh key...
Wrote public SSH key to to /etc/ceph/ceph.pub
Adding key to root@localhost's authorized_keys...
Adding host magna106...
Deploying mon service with default placement...
Deploying mgr service with default placement...
Deploying crash service with default placement...
Enabling mgr prometheus module...
Deploying prometheus service with default placement...
Deploying grafana service with default placement...
Deploying node-exporter service with default placement...
Deploying alertmanager service with default placement...
Enabling the dashboard module...
Waiting for the mgr to restart...
Waiting for mgr epoch 13...
mgr epoch 13 is available
Generating a dashboard self-signed certificate...
Creating initial admin user...
Non-zero exit code 22 from /bin/podman run --rm --ipc=host --net=host --entrypoint /usr/bin/ceph -e CONTAINER_IMAGE=registry.redhat.io/rhceph-alpha/rhceph-5-rhel8:latest -e NODE_NAME=magna106 -v /var/log/ceph/401136aa-5009-11eb-aa8d-002590fc1bf0:/var/log/ceph:z -v /tmp/ceph-tmpasnuzd58:/etc/ceph/ceph.client.admin.keyring:z -v /tmp/ceph-tmprbv8s9yy:/etc/ceph/ceph.conf:z -v /tmp/ceph-tmp00kii13g:/tmp/dashboard.pw:z registry.redhat.io/rhceph-alpha/rhceph-5-rhel8:latest dashboard ac-user-create admin -i /tmp/dashboard.pw administrator --force-password --pwd-update-required
/usr/bin/ceph: stderr Error EINVAL: Traceback (most recent call last):
/usr/bin/ceph: stderr   File "/usr/share/ceph/mgr/mgr_module.py", line 1198, in _handle_command
/usr/bin/ceph: stderr     return CLICommand.COMMANDS[cmd['prefix']].call(self, cmd, inbuf)
/usr/bin/ceph: stderr   File "/usr/share/ceph/mgr/mgr_module.py", line 332, in call
/usr/bin/ceph: stderr     return self.func(mgr, **kwargs)
/usr/bin/ceph: stderr TypeError: ac_user_create_cmd() got an unexpected keyword argument 'inbuf'
/usr/bin/ceph: stderr 
Traceback (most recent call last):
  File "/sbin/cephadm", line 6931, in <module>
    r = args.func()
  File "/sbin/cephadm", line 1410, in _default_image
    return func()
  File "/sbin/cephadm", line 3320, in command_bootstrap
    cli(cmd, extra_mounts={pathify(tmp_password_file.name): '/tmp/dashboard.pw:z'})
  File "/sbin/cephadm", line 3066, in cli
    ).run(timeout=timeout)
  File "/sbin/cephadm", line 2707, in run
    self.run_cmd(), desc=self.entrypoint, timeout=timeout)
  File "/sbin/cephadm", line 1071, in call_throws
    raise RuntimeError('Failed command: %s' % ' '.join(command))
RuntimeError: Failed command: /bin/podman run --rm --ipc=host --net=host --entrypoint /usr/bin/ceph -e CONTAINER_IMAGE=registry.redhat.io/rhceph-alpha/rhceph-5-rhel8:latest -e NODE_NAME=magna106 -v /var/log/ceph/401136aa-5009-11eb-aa8d-002590fc1bf0:/var/log/ceph:z -v /tmp/ceph-tmpasnuzd58:/etc/ceph/ceph.client.admin.keyring:z -v /tmp/ceph-tmprbv8s9yy:/etc/ceph/ceph.conf:z -v /tmp/ceph-tmp00kii13g:/tmp/dashboard.pw:z registry.redhat.io/rhceph-alpha/rhceph-5-rhel8:latest dashboard ac-user-create admin -i /tmp/dashboard.pw administrator --force-password --pwd-update-required
[root@magna106 ubuntu]# sudo
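
Reading the traceback: the new cephadm binary passes the dashboard password via "-i", so the mgr receives it as an "inbuf" keyword argument, but the ac_user_create_cmd handler in this (older) container image has no such parameter. A minimal, hypothetical sketch of the mismatch in Python (simplified stand-ins, not the real mgr_module.py code):

    def call(handler, parsed_args, inbuf):
        # The generic dispatcher forwards the '-i' payload unconditionally.
        kwargs = dict(parsed_args)
        kwargs['inbuf'] = inbuf
        return handler(**kwargs)

    # Handler as in the older image; it predates password-via-'-i' support,
    # so its signature has no 'inbuf' parameter:
    def ac_user_create_cmd(username, rolename=None,
                           force_password=False, pwd_update_required=False):
        ...

    # call(ac_user_create_cmd,
    #      {'username': 'admin', 'rolename': 'administrator',
    #       'force_password': True, 'pwd_update_required': True},
    #      inbuf='<password file contents>')
    # -> TypeError: ac_user_create_cmd() got an unexpected keyword argument 'inbuf'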

Comment 4 Preethi 2021-01-06 11:09:59 UTC
(In reply to Preethi from comment #3)
> @Juan, the issue is seen with the latest alpha image as well:
> registry-proxy.engineering.redhat.com/rh-osbs/rhceph:ceph-5.0-rhel-8-containers-candidate-10695-20210105032152
> [...]

[root@magna106 ubuntu]# rpm -qa | grep cephadm
cephadm-16.0.0-8633.el8cp.noarch

Comment 5 Preethi 2021-01-06 12:59:00 UTC
@Juan, below are the observations with the latest alpha compose:

Cephadm Tool version: 16.0.0-8633.el8cp.noarch


[root@magna106 ubuntu]# sudo cephadm bootstrap --mon-ip 10.8.128.106 --registry-json cephadm.txt   --> FAILS

[root@magna106 ubuntu]# sudo cephadm --image registry-proxy.engineering.redhat.com/rh-osbs/rhceph:ceph-5.0-rhel-8-containers-candidate-10695-20210105032152 bootstrap --mon-ip 10.8.128.106 --registry-url registry.redhat.io --registry-username qa --registry-password MTQj5t3n5K86p3gH   --> PASSES
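
For context, a rough sketch in Python of why the two invocations above differ (a simplification, not the actual cephadm source; only the CEPHADM_IMAGE variable name is taken from upstream cephadm): with no --image flag, cephadm falls back to a default image reference baked into the binary, which here still resolves to the stale rhceph-alpha tag on registry.redhat.io.

    import os

    # Default baked into the downstream binary (stale at the time of this bug).
    DEFAULT_IMAGE = 'registry.redhat.io/rhceph-alpha/rhceph-5-rhel8:latest'

    def resolve_image(cli_image=None):
        # Precedence: explicit --image, then the CEPHADM_IMAGE environment
        # variable, then the built-in default.
        return cli_image or os.environ.get('CEPHADM_IMAGE') or DEFAULT_IMAGE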
 
Log paths:
http://magna002.ceph.redhat.com/pnataraj-2021-01-06_06:51:25-smoke:cephadm-master-distro-basic-clara/394464/teuthology.log - Image based 
http://magna002.ceph.redhat.com/pnataraj-2021-01-06_07:25:29-smoke:cephadm-master-distro-basic-clara/394467/teuthology.log - Registry based

FYI, the full output of the passing run:

[root@magna106 ubuntu]# sudo cephadm --image registry-proxy.engineering.redhat.com/rh-osbs/rhceph:ceph-5.0-rhel-8-containers-candidate-10695-20210105032152 bootstrap --mon-ip 10.8.128.106 --registry-url registry.redhat.io --registry-username qa --registry-password MTQj5t3n5K86p3gH
Verifying podman|docker is present...
Verifying lvm2 is present...
Verifying time synchronization is in place...
Unit chronyd.service is enabled and running
Repeating the final host check...
podman|docker (/bin/podman) is present
systemctl is present
lvcreate is present
Unit chronyd.service is enabled and running
Host looks OK
Cluster fsid: a8ff821c-501c-11eb-8862-002590fc1bf0
Verifying IP 10.8.128.106 port 3300 ...
Verifying IP 10.8.128.106 port 6789 ...
Mon IP 10.8.128.106 is in CIDR network 10.8.128.0/21
Logging into custom registry.
Pulling container image registry-proxy.engineering.redhat.com/rh-osbs/rhceph:ceph-5.0-rhel-8-containers-candidate-10695-20210105032152...
Extracting ceph user uid/gid from container image...
Creating initial keys...
Creating initial monmap...
Creating mon...
Waiting for mon to start...
Waiting for mon...
mon is available
Assimilating anything we can from ceph.conf...
Generating new minimal ceph.conf...
Restarting the monitor...
Setting mon public_network...
Creating mgr...
Verifying port 9283 ...
Wrote keyring to /etc/ceph/ceph.client.admin.keyring
Wrote config to /etc/ceph/ceph.conf
Waiting for mgr to start...
Waiting for mgr...
mgr not available, waiting (1/10)...
mgr not available, waiting (2/10)...
mgr not available, waiting (3/10)...
mgr is available
Enabling cephadm module...
Waiting for the mgr to restart...
Waiting for mgr epoch 5...
mgr epoch 5 is available
Setting orchestrator backend to cephadm...
Generating ssh key...
Wrote public SSH key to to /etc/ceph/ceph.pub
Adding key to root@localhost's authorized_keys...
Adding host magna106...
Deploying mon service with default placement...
Deploying mgr service with default placement...
Deploying crash service with default placement...
Enabling mgr prometheus module...
Deploying prometheus service with default placement...
Deploying grafana service with default placement...
Deploying node-exporter service with default placement...
Deploying alertmanager service with default placement...
Enabling the dashboard module...
Waiting for the mgr to restart...
Waiting for mgr epoch 13...
mgr epoch 13 is available
Generating a dashboard self-signed certificate...
Creating initial admin user...
Fetching dashboard port number...
Ceph Dashboard is now available at:

	     URL: https://magna106.ceph.redhat.com:8443/
	    User: admin
	Password: wk0n24re1i

You can access the Ceph CLI with:

	sudo /sbin/cephadm shell --fsid a8ff821c-501c-11eb-8862-002590fc1bf0 -c /etc/ceph/ceph.conf -k /etc/ceph/ceph.client.admin.keyring

Please consider enabling telemetry to help improve Ceph:

	ceph telemetry on

For more information see:

	https://docs.ceph.com/docs/master/mgr/telemetry/

Bootstrap complete.
[root@magna106 ubuntu]#

Comment 7 Veera Raghava Reddy 2021-01-06 19:29:51 UTC
Hi Dimitri,
This is a potential usability issue for customers:
A customer has cephadm deployed from a 5.x version and later performs a bootstrap using the latest container image. There should be a pre-requisite check to ensure compatibility and report the result to the user. If this happens during an upgrade, there is a probability of DU (data unavailability) if the upgrade cannot complete successfully.

This needs to be addressed. We also need builds to publish cephadm and the container images synchronously.

Comment 9 Veera Raghava Reddy 2021-01-06 20:32:28 UTC
Thanks, Dimitri, for sharing the info on the current practice of documenting versions. We can continue a similar approach for cephadm.

To avoid errors due to oversight, should we consider adding a version dependency check with an option to override?
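
As a sketch of what such a check could look like (hypothetical helper and flag names, not actual cephadm code): compare the host's cephadm version with the ceph version inside the target image before bootstrapping, and refuse to proceed on a mismatch unless the operator explicitly overrides.

    import subprocess

    def image_ceph_version(image):
        # 'ceph --version' prints e.g. 'ceph version 16.1.0-486.el8cp (...) ...'
        out = subprocess.check_output(
            ['podman', 'run', '--rm', '--entrypoint', '/usr/bin/ceph',
             image, '--version'], text=True)
        return out.split()[2]

    def check_compat(host_version, image, allow_mismatch=False):
        img_version = image_ceph_version(image)
        # Compare major.minor only; a stricter policy could compare builds.
        if host_version.split('.')[:2] != img_version.split('.')[:2]:
            if not allow_mismatch:
                raise SystemExit(
                    'cephadm %s does not match image ceph %s; '
                    'pass --allow-mismatch to override'
                    % (host_version, img_version))
            print('WARNING: proceeding with mismatched versions (%s vs %s)'
                  % (host_version, img_version))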

Comment 18 Preethi 2021-01-18 05:33:21 UTC
@Juan, the issue is still seen with the latest build; registry.redhat.io has not been updated with the latest container image. Below are the error snippet and log path for reference.

Logs: http://magna002.ceph.redhat.com/pnataraj-2021-01-17_13:03:20-smoke:cephadm-master-distro-basic-psi/395177/tasks/rhcephadm-1.log


2021-01-17T13:22:50.626 INFO:teuthology.orchestra.run.psi022.stderr:Running command: /bin/podman run --rm --ipc=host --net=host --entrypoint /usr/bin/ceph -e CONTAINER_IMAGE=registry.redhat.io/rhceph-alpha/rhceph-5-rhel8:latest -e NODE_NAME=psi022 -v /var/log/ceph/850f43b0-58f0-11eb-95d0-002590fc2776:/var/log/ceph:z -v /tmp/ceph-tmp0ffnuege:/etc/ceph/ceph.client.admin.keyring:z -v /tmp/ceph-tmpafw4tgje:/etc/ceph/ceph.conf:z registry.redhat.io/rhceph-alpha/rhceph-5-rhel8:latest dashboard create-self-signed-cert
2021-01-17T13:22:52.568 INFO:teuthology.orchestra.run.psi022.stderr:/usr/bin/ceph: stdout Self-signed certificate created
2021-01-17T13:22:52.739 INFO:teuthology.orchestra.run.psi022.stderr:Creating initial admin user...
2021-01-17T13:22:52.740 INFO:teuthology.orchestra.run.psi022.stderr:Running command: /bin/podman run --rm --ipc=host --net=host --entrypoint /usr/bin/ceph -e CONTAINER_IMAGE=registry.redhat.io/rhceph-alpha/rhceph-5-rhel8:latest -e NODE_NAME=psi022 -v /var/log/ceph/850f43b0-58f0-11eb-95d0-002590fc2776:/var/log/ceph:z -v /tmp/ceph-tmp0ffnuege:/etc/ceph/ceph.client.admin.keyring:z -v /tmp/ceph-tmpafw4tgje:/etc/ceph/ceph.conf:z -v /tmp/ceph-tmpcxo3enbh:/tmp/dashboard.pw:z registry.redhat.io/rhceph-alpha/rhceph-5-rhel8:latest dashboard ac-user-create admin -i /tmp/dashboard.pw administrator --force-password --pwd-update-required
2021-01-17T13:22:54.159 INFO:teuthology.orchestra.run.psi022.stderr:/usr/bin/ceph: stderr Error EINVAL: Traceback (most recent call last):
2021-01-17T13:22:54.160 INFO:teuthology.orchestra.run.psi022.stderr:/usr/bin/ceph: stderr   File "/usr/share/ceph/mgr/mgr_module.py", line 1198, in _handle_command
2021-01-17T13:22:54.160 INFO:teuthology.orchestra.run.psi022.stderr:/usr/bin/ceph: stderr     return CLICommand.COMMANDS[cmd['prefix']].call(self, cmd, inbuf)
2021-01-17T13:22:54.161 INFO:teuthology.orchestra.run.psi022.stderr:/usr/bin/ceph: stderr   File "/usr/share/ceph/mgr/mgr_module.py", line 332, in call
2021-01-17T13:22:54.161 INFO:teuthology.orchestra.run.psi022.stderr:/usr/bin/ceph: stderr     return self.func(mgr, **kwargs)
2021-01-17T13:22:54.162 INFO:teuthology.orchestra.run.psi022.stderr:/usr/bin/ceph: stderr TypeError: ac_user_create_cmd() got an unexpected keyword argument 'inbuf'
2021-01-17T13:22:54.163 INFO:teuthology.orchestra.run.psi022.stderr:/usr/bin/ceph: stderr

Comment 19 Juan Miguel Olmo 2021-02-01 11:54:43 UTC
Please retest with the latest compose (29 January or later).

Comment 20 Preethi 2021-02-02 05:37:09 UTC
@Juan, the issue is not seen with the custom image. However, we will check again once the latest alpha drop is pushed to the registry.

Comment 21 skanta 2021-02-17 08:10:22 UTC
Facing the same issue with compose http://download.eng.bos.redhat.com/rhel-8/composes/auto/ceph-5.0-rhel-8/RHCEPH-5.0-RHEL-8-20210216.ci.1/compose/Tools/x86_64/os/
and image registry-proxy.engineering.redhat.com/rh-osbs/rhceph:ceph-5.0-rhel-8-containers-candidate-89244-20201208030231

Comment 22 Juan Miguel Olmo 2021-03-03 09:44:19 UTC
@skanta:

It seems that you are using a "new" cephadm binary with an "old" ceph image.

Take a look at Dimitri's comments (https://bugzilla.redhat.com/show_bug.cgi?id=1909977#c16).
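
One quick way to spot that skew from the operator side (a hypothetical helper, not part of cephadm; the version strings follow the rpm -q and ceph --version formats seen earlier in this bug):

    import subprocess

    def report_versions(image):
        # Host-side package version, as in 'rpm -qa | grep cephadm' above.
        host = subprocess.check_output(['rpm', '-q', 'cephadm'], text=True).strip()
        # Version of ceph inside the image the binary is about to run.
        img = subprocess.check_output(
            ['podman', 'run', '--rm', '--entrypoint', '/usr/bin/ceph',
             image, '--version'], text=True).strip()
        print('host cephadm :', host)
        print('image ceph   :', img)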

Comment 28 errata-xmlrpc 2021-08-30 08:27:34 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Red Hat Ceph Storage 5.0 bug fix and enhancement), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:3294