Bug 1958236

Summary: cephadm: /etc/hosts not working
Product: [Red Hat Storage] Red Hat Ceph Storage
Component: Cephadm
Version: 5.0
Hardware: x86_64
OS: Linux
Status: CLOSED ERRATA
Severity: medium
Priority: unspecified
Target Milestone: ---
Target Release: 5.0
Reporter: Harish Munjulur <hmunjulu>
Assignee: Daniel Pivonka <dpivonka>
QA Contact: Gopi <gpatta>
Docs Contact: Karen Norteman <knortema>
CC: ceph-eng-bugs, ceph-qe-bugs, gpatta, gsitlani, idryomov, jolmomar, pcuzner, sangadi, sewagner, sweil, tserlin, vereddy
Flags: hmunjulu: needinfo-, gpatta: needinfo+
Fixed In Version: ceph-16.2.0-77.el8cp
Doc Type: If docs needed, set a value
Last Closed: 2021-08-30 08:30:29 UTC
Type: Bug
Bug Blocks: 1962508, 1963849

Description Harish Munjulur 2021-05-07 13:54:22 UTC
Description of problem: Gateway create fails with error
/iscsi-target...-igw/gateways> create ceph-hmunjulu-1620387488453-node1-mon-mgr-installer-node-export 10.0.209.50
Adding gateway, sync'ing 0 disk(s) and 0 client(s)
REST API failure, code : 403
Unable to refresh local config over API - sync aborted, restart rbd-target-api on ceph-hmunjulu-1620387488453-node1-mon-mgr-installer-node-export to sync


Version-Release number of selected component (if applicable):
ceph version 16.2.0-31.el8cp (4bfd76f34a0e145d704f8ecbecbed6e33bc06842) pacific (stable)


How reproducible: 3/3


Steps to Reproduce:
1. created an iscsi pool and enabled the rbd application
2. [ceph: root@ceph-hmunjulu-1620387488453-node1-mon-mgr-installer-node-export ~]# ceph orch apply iscsi --pool iscsi --api_user admin --api_password admin --placement="ceph-hmunjulu-1620387488453-node2-mon-mds-node-exporter-alertma ceph-hmunjulu-1620387488453-node1-mon-mgr-installer-node-export" 
Scheduled iscsi.iscsi update...

3. [root@ceph-hmunjulu-1620387488453-node1-mon-mgr-installer-node-export ~]# podman exec -it 96d714bf5eab sh
sh-4.4# gwcli

/iscsi-targets> ls
o- iscsi-targets ................................................................................. [DiscoveryAuth: None, Targets: 1]
  o- iqn.2003-01.com.redhat.iscsi-gw:ceph-igw ............................................................ [Auth: None, Gateways: 1]
    o- disks ............................................................................................................ [Disks: 0]
    o- gateways .............................................................................................. [Up: 0/1, Portals: 1]
    | o- ceph-hmunjulu-1620387488453-node1-mon-mgr-installer-node-export .............................. [10.0.209.50 (UNAUTHORIZED)]
    o- host-groups .................................................................................................... [Groups : 0]
    o- hosts ......................................................................................... [Auth: ACL_ENABLED, Hosts: 0]


Actual results:
/iscsi-target...-igw/gateways> create ceph-hmunjulu-1620387488453-node1-mon-mgr-installer-node-export 10.0.209.50
Adding gateway, sync'ing 0 disk(s) and 0 client(s)
REST API failure, code : 403
Unable to refresh local config over API - sync aborted, restart rbd-target-api on ceph-hmunjulu-1620387488453-node1-mon-mgr-installer-node-export to sync



Expected results: gateway should be added successfully


cluster details:
gateways:
ceph-hmunjulu-1620387488453-node1-mon-mgr-installer-node-export  10.0.209.50 
ceph-hmunjulu-1620387488453-node2-mon-mds-node-exporter-alertma  10.0.208.174


[ceph: root@ceph-hmunjulu-1620387488453-node1-mon-mgr-installer-node-export ~]# rados -p iscsi ls 
gateway.conf
[ceph: root@ceph-hmunjulu-1620387488453-node1-mon-mgr-installer-node-export ~]# rados -p iscsi get gateway.conf -
{
    "created": "2021/05/07 12:58:59",
    "discovery_auth": {
        "mutual_password": "",
        "mutual_password_encryption_enabled": false,
        "mutual_username": "",
        "password": "",
        "password_encryption_enabled": false,
        "username": ""
    },
    "disks": {},
    "epoch": 4,
    "gateways": {
        "ceph-hmunjulu-1620387488453-node1-mon-mgr-installer-node-export": {
            "active_luns": 0,
            "created": "2021/05/07 13:20:31",
            "updated": "2021/05/07 13:20:31"
        }
    },
    "targets": {
        "iqn.2003-01.com.redhat.iscsi-gw:ceph-igw": {
            "acl_enabled": true,
            "auth": {
                "mutual_password": "",
                "mutual_password_encryption_enabled": false,
                "mutual_username": "",
                "password": "",
                "password_encryption_enabled": false,
                "username": ""
            },
            "clients": {},
            "controls": {},
            "created": "2021/05/07 13:02:17",
            "disks": {},
            "groups": {},
            "ip_list": [
                "10.0.209.50"
            ],
            "portals": {
                "ceph-hmunjulu-1620387488453-node1-mon-mgr-installer-node-export": {
                    "gateway_ip_list": [
                        "10.0.209.50"
                    ],
                    "inactive_portal_ips": [],
                    "portal_ip_addresses": [
                        "10.0.209.50"
                    ],
                    "tpgs": 1
                }
            },
            "updated": "2021/05/07 13:20:31"
        }
    },
    "updated": "2021/05/07 13:20:31",
    "version": 11

[root@ceph-hmunjulu-1620387488453-node1-mon-mgr-installer-node-export ~]# ss -tnlp | grep rbd
LISTEN    0         128                      *:5000                   *:*        users:(("rbd-target-api",pid=56712,fd=33))
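
A quick way to exercise the rbd-target-api endpoint directly is a plain HTTP call against the API port (a hedged sketch; it assumes the API was deployed without TLS and uses the admin/admin credentials passed in step 2):

curl -s -o /dev/null -w '%{http_code}\n' --user admin:admin http://10.0.209.50:5000/api/config

A 200 means the credentials and the calling IP are accepted; a 403 reproduces the UNAUTHORIZED state that gwcli reports.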

Comment 4 Harish Munjulur 2021-05-11 05:59:08 UTC
Followed the steps in the document https://docs.ceph.com/en/latest/cephadm/iscsi/ on a new cluster; it still FAILS with the same error.

[ceph: root@ceph-hmunjulu-1620631783525-node1-mon-mgr-installer-node-export ~]# ceph orch apply -i iscsi.yaml
Scheduled iscsi.iscsi update...

iscsi.yaml file

service_type: iscsi
service_id: iscsi
placement:
  hosts:
  - ceph-hmunjulu-1620631783525-node1-mon-mgr-installer-node-export
  - ceph-hmunjulu-1620631783525-node2-mon-mds-node-exporter-alertma
spec:
  pool: iscsi_pool
  trusted_ip_list: "10.0.210.96, 10.0.210.151"

[root@ceph-hmunjulu-1620631783525-node1-mon-mgr-installer-node-export ~]# podman exec -it 183db4478c08 sh
sh-4.4# gwcli

1 gateway is inaccessible - updates will be disabled
/iscsi-target...-igw/gateways> ls
o- gateways .................................................................................................. [Up: 0/1, Portals: 1]
  o- ceph-hmunjulu-1620631783525-node1-mon-mgr-installer-node-export .................................. [10.0.210.96 (UNAUTHORIZED)]
/iscsi-target...-igw/gateways> exit

[root@ceph-hmunjulu-1620631783525-node1-mon-mgr-installer-node-export ~]# ss -tnlp | grep rbd
LISTEN    0         128                      *:5000                   *:*        users:(("rbd-target-api",pid=53334,fd=15))  

[root@ceph-hmunjulu-1620631783525-node2-mon-mds-node-exporter-alertma ~]# ss -tnlp | grep rbd
LISTEN    0         128                      *:5000                   *:*        users:(("rbd-target-api",pid=23020,fd=33))
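
The configuration the daemons actually received can be cross-checked against the spec by reading the rendered iscsi-gateway.cfg inside the container (a hedged sketch; the in-container path /etc/ceph/iscsi-gateway.cfg is an assumption based on where ceph-iscsi normally reads its config, and 183db4478c08 is the container from the podman exec above):

podman exec -it 183db4478c08 cat /etc/ceph/iscsi-gateway.cfg

The api_user, api_password and trusted_ip_list values in that file should line up with what the service spec requested.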

Comment 7 Paul Cuzner 2021-05-13 04:43:27 UTC
Harish, could you look at your firewall environment and confirm whether ports 5000 and 3260 were opened by the orchestrator? I suspect not.

Incidentally, +1 for using the yaml spec when defining the services

I did an install and found that although the daemons started and bound to their ports, they were inaccessible due to the firewall. It would be good to confirm.

After the install, run firewall-cmd --zone=public --list-all. Port 5000 is the API port; we need 5000 and 3260 to be exposed for the API and client connections to work.
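
If the ports turn out not to be open, a minimal sketch of opening them by hand (assuming the default public zone) would be:

firewall-cmd --zone=public --add-port=5000/tcp --add-port=3260/tcp --permanent
firewall-cmd --reload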

Comment 8 Paul Cuzner 2021-05-13 08:27:39 UTC
Logged in to the test machines.

There is no firewall, so the issue I described earlier doesn't apply here (but it's still an issue in general!).

The current issue is that the yaml didn't specify an admin user and password, so the conf had null values for these. Harish followed the RHCS 4 guide, where this was unnecessary, but it appears that for cephadm it is required (DOC BUG?)
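
For reference, a spec that carries the API credentials explicitly would look roughly like the following (a hedged sketch based on the spec from comment 4; the admin/admin values are only the ones used earlier in this report, not a recommendation):

service_type: iscsi
service_id: iscsi
placement:
  hosts:
  - ceph-hmunjulu-1620631783525-node1-mon-mgr-installer-node-export
  - ceph-hmunjulu-1620631783525-node2-mon-mds-node-exporter-alertma
spec:
  pool: iscsi_pool
  api_user: admin
  api_password: admin
  trusted_ip_list: "10.0.210.96, 10.0.210.151"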

Stopped the services on the gateways
deleted the gateway.conf object (useless anyway)
fixed iscsi-gateway.cfg on both nodes
restarted iscsi service

Created the target
added the local machine as the 1st gateway
--> Same issue 403 Unauthorized

looked at the caps for the client keyring and updated them to allow all mgr commands (see the sketch after these steps)

attempted removal of the service (orch rm), but the iscsi service got stuck in the deleting state; checking orch ps showed a last refresh timestamp of 2 days ago!
unloaded the cephadm module and reloaded it; cephadm then cleared the services
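
For reference, a hedged sketch of those two recovery steps (the client name and the mon/osd caps shown here are placeholders, not the daemon's real ones; ceph auth ls can be used to find the actual entity and its current caps before changing anything):

ceph auth ls | grep iscsi
# example entity and caps only - keep the daemon's existing mon/osd caps and add mgr access
ceph auth caps client.iscsi.iscsi.node1.example mon 'profile rbd' osd 'allow rwx' mgr 'allow *'
ceph mgr module disable cephadm
ceph mgr module enable cephadm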

Looking at this (in the iscsi.iscsi daemon log on the local host where the target add ran):

1996-01.com.redhat.iscsi-gw:ceph-igw - Adding the IP to the enabled tpg, allowing iSCSI logins
- - [13/May/2021 08:17:28] "PUT /api/_gateway/iqn.1996-01.com.redhat.iscsi-gw:ceph-igw/ceph-hmunjulu-1620631783525-node1-mon-mgr-installer-node-export HTTP/1.1" 200 -
eway update on localhost, successful
- - [13/May/2021 08:17:28] "PUT /api/gateway/iqn.1996-01.com.redhat.iscsi-gw:ceph-igw/ceph-hmunjulu-1620631783525-node1-mon-mgr-installer-node-export HTTP/1.1" 200 -
::f816:3eff:fe52:e384%eth0 - - [13/May/2021 08:17:28] "GET /api/config HTTP/1.1" 403 -

We can see that the add works, but the read-back appears to come in over IPv6, and since that address is not in the trusted list you get 403 Forbidden.

When you run the api/config call directly against the gateway's IPv4 address, it's fine.

So I think this is the root cause.
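
One way to confirm it is to hit the API from the gateway itself over IPv4 and then over IPv6 (a hedged sketch; the credentials are whatever is now in iscsi-gateway.cfg, and <node1-ipv6-address> is a placeholder for the host's IPv6 address):

curl -s -o /dev/null -w '%{http_code}\n' --user <api_user>:<api_password> http://10.0.210.96:5000/api/config
curl -s -o /dev/null -w '%{http_code}\n' --user <api_user>:<api_password> 'http://[<node1-ipv6-address>]:5000/api/config'

If the IPv4 call returns 200 and the IPv6 one returns 403, that matches the theory; an unverified mitigation would be to add the gateways' IPv6 addresses to trusted_ip_list so the read-back source address is trusted.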

will follow up tomorrow

Comment 14 Sebastian Wagner 2021-05-27 10:08:57 UTC
https://tracker.ceph.com/issues/49654

Comment 23 Gopi 2021-06-22 09:28:53 UTC
Working as expected.

Comment 25 errata-xmlrpc 2021-08-30 08:30:29 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat Ceph Storage 5.0 bug fix and enhancement), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:3294