Bug 1420537

Summary: OSD nodes missing in /cluster/FSID/server endpoint
Product: [Red Hat Storage] Red Hat Ceph Storage
Reporter: Christina Meno <gmeno>
Component: Calamari
Assignee: Boris Ranto <branto>
Calamari sub component: Back-end
QA Contact: Harish NV Rao <hnallurv>
Status: CLOSED ERRATA
Docs Contact: Erin Donnelly <edonnell>
Severity: medium
Priority: unspecified
CC: branto, ceph-eng-bugs, edonnell, gmeno, hnallurv, kdreyer, vimishra
Version: 2.2
Target Milestone: rc
Target Release: 2.2
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: calamari-server-1.5.5-1.el7cp, calamari_server_1.5.5-2redhat1xenial
Doc Type: Bug Fix
Doc Text:
.Calamari can now handle several Ceph daemons running on the same host
Previously, the Calamari API did not handle the situation when several Ceph daemons were running on a single host. As a consequence, Calamari failed to recognize OSDs running on the same host as Monitors. This bug has been fixed, and Calamari now handles several daemons running on the same host as expected.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-04-17 14:31:48 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1412948

Description Christina Meno 2017-02-08 22:26:56 UTC
Description of problem:
error: wildcard resolved to multiple address
2017-02-08 06:37:49,547 - ERROR - calamari Uncaught exception
Traceback (most recent call last):
  File "/opt/calamari/venv/lib/python2.7/site-packages/gevent/greenlet.py", line 534, in run
    result = self._run(*self.args, **self.kwargs)
  File "/opt/calamari/venv/lib/python2.7/site-packages/calamari_cthulhu-0.1-py2.7.egg/cthulhu/manager/cluster_monitor.py", line 251, in _run
    on_job=self.on_job_complete)
  File "/opt/calamari/venv/lib/python2.7/site-packages/calamari_common-0.1-py2.7.egg/calamari_common/remote/mon_remote.py", line 987, in listen
    ev.data['fun_args'])
  File "/opt/calamari/venv/lib/python2.7/site-packages/calamari_cthulhu-0.1-py2.7.egg/cthulhu/manager/cluster_monitor.py", line 238, in on_job_complete
    self.on_sync_object(fqdn, result)
  File "/opt/calamari/venv/lib/python2.7/site-packages/calamari_cthulhu-0.1-py2.7.egg/cthulhu/manager/cluster_monitor.py", line 344, in on_sync_object
    new_object = self.inject_sync_object(minion_id, data['type'], data['version'], sync_object)
  File "/opt/calamari/venv/lib/python2.7/site-packages/calamari_cthulhu-0.1-py2.7.egg/cthulhu/manager/cluster_monitor.py", line 324, in inject_sync_object
    self._servers.on_osd_map(data)
  File "/opt/calamari/venv/lib/python2.7/site-packages/calamari_cthulhu-0.1-py2.7.egg/cthulhu/manager/server_monitor.py", line 259, in on_osd_map
    hostname_to_osds = self.get_hostname_to_osds(osd_map)
  File "/opt/calamari/venv/lib/python2.7/site-packages/calamari_cthulhu-0.1-py2.7.egg/cthulhu/manager/server_monitor.py", line 207, in get_hostname_to_osds
    name_info = get_name_info('', osd['cluster_addr'])
  File "/opt/calamari/venv/lib/python2.7/site-packages/calamari_cthulhu-0.1-py2.7.egg/cthulhu/manager/server_monitor.py", line 186, in get_name_info
    hostname = socket.gethostbyaddr(osd_addr)[0]
  File "/opt/calamari/venv/lib/python2.7/site-packages/gevent/_socketcommon.py", line 280, in gethostbyaddr
    return get_hub().resolver.gethostbyaddr(ip_address)
  File "/opt/calamari/venv/lib/python2.7/site-packages/gevent/resolver_thread.py", line 67, in gethostbyaddr
    return self.pool.apply(_socket.gethostbyaddr, args, kwargs)
  File "/opt/calamari/venv/lib/python2.7/site-packages/gevent/pool.py", line 300, in apply
    return self.spawn(func, *args, **kwds).get()
  File "/opt/calamari/venv/lib/python2.7/site-packages/gevent/event.py", line 373, in get
    return self.get(block=False)
  File "/opt/calamari/venv/lib/python2.7/site-packages/gevent/event.py", line 363, in get
    return self._raise_exception()
  File "/opt/calamari/venv/lib/python2.7/site-packages/gevent/event.py", line 343, in _raise_exception
    reraise(*self.exc_info)
  File "/opt/calamari/venv/lib/python2.7/site-packages/gevent/threadpool.py", line 207, in _worker
    value = func(*args, **kwargs)
error: wildcard resolved to multiple address
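
For reference, this error can be reproduced in isolation: CPython treats an empty string passed to socket.gethostbyaddr() as the wildcard address, which on a dual-stack host resolves to both an IPv4 and an IPv6 address. A minimal, hypothetical illustration:

# Minimal, hypothetical reproduction of the error above: an empty address is
# treated as the wildcard, and on a dual-stack host the wildcard resolves to
# more than one address.
import socket

try:
    socket.gethostbyaddr('')
except socket.error as exc:
    print(exc)  # e.g. "wildcard resolved to multiple address"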



Comment 8 Christina Meno 2017-02-13 16:46:17 UTC
https://github.com/ceph/calamari/releases/tag/v1.5.1

Comment 15 Christina Meno 2017-02-17 06:09:50 UTC
Boris,

Are we able to get the OSD nodes from the server endpoint? This functionality is required for Storage Console to do cluster imports, and it did work in many cases in 2.1.

Would you please help me understand the state of this? Is the NotImplementedError reproducible? How? 

cheers,
G

Comment 16 Boris Ranto 2017-02-17 12:34:38 UTC
@Gregory: We are able to get the OSD nodes, but not on the first service start (a calamari restart is sufficient to make it all work). All subsequent starts work fine.

It is only ever reproducible on the first calamari service start, and only on some systems. That is what happened in this case once the hostname regression was fixed (this bugzilla currently has nothing to do with the hostname-resolving code).

I was looking into this further. My notes:

In the Manager class, we initialize a RequestCollection and a Ticker. The RequestCollection tick goes through all the (salt jid, minion id) pairs and tries to emit saltutil.running to ping the jobs via MonRemote.get_running. This fails because MonRemote.get_running is not implemented.

There is apparently more than one code path doing this, and the other one succeeds, which is why it works from the second start on.

This would probably work if we just commented out the lines that call the unimplemented function, but that looks like a hack to me. Alternatively, we could try to implement the signal. I think this should do it (the patch has not been tested yet due to the nature of the reproducer):

https://github.com/ceph/calamari/pull/509
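
For illustration only (this is not the content of the pull request above), a minimal sketch of the shape such a get_running implementation could take, assuming the remote keeps its own record of submitted jobs:

# Hypothetical sketch -- not the code from PR 509. It only illustrates the
# shape of the call the RequestCollection tick expects: given a set of
# minion/host IDs, report which job IDs are still running so stale requests
# can be reconciled or timed out.
class RemoteSketch(object):
    def __init__(self):
        # {minion_id: set of jids}, maintained as jobs are submitted/completed
        self._running_jobs = {}

    def get_running(self, minions):
        """Report the running job IDs for the requested minions."""
        report = {}
        for minion in minions:
            report[minion] = sorted(self._running_jobs.get(minion, ()))
        # The real implementation would emit this as an event for the
        # on_running_jobs handler rather than just returning it.
        return report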

Any better ideas/comments?


@Harish: Could you re-run the test with the old stable versions of calamari to confirm whether this is a regression or not?

@John: Unless this really is a regression (which I don't believe it is), I would not consider it a blocker. That being said, I have already proposed a fix (although it has not been tested), so the ETA on this bugzilla should hopefully not be far away.

Comment 17 Christina Meno 2017-02-17 17:49:11 UTC
Building the upstream PR for some testing now.

Comment 18 Harish NV Rao 2017-02-20 09:23:54 UTC
@Boris, as mentioned to you over IRC, this issue is not seen on Magna machines for the same calamari version. It is seen only on Beaker machines.

Comment 19 Christina Meno 2017-02-21 16:45:03 UTC
Ken, would you please make a build for https://github.com/ceph/calamari/releases/tag/v1.5.3?

Comment 22 Ken Dreyer (Red Hat) 2017-02-21 21:14:15 UTC
v1.5.3 is in the latest RHEL and Ubuntu composes delivered to QE.

Comment 23 Harish NV Rao 2017-02-22 11:54:43 UTC
The issue is not resolved in calamari-server-1.5.3-1.el7cp.x86_64.

Followed these steps:
1) Using ceph-ansible, brought up a single-MON, three-OSD cluster using machines from the Beaker lab.

2) Added a user 'tester' on the MON node.

3) As 'tester', issued a request to "/api/v2/cluster/<fs-id>/server" (a hypothetical way to script this request is sketched at the end of this comment). It returned the following:

[
    {
        "fqdn": "hp-ms-01-c33.moonshot1.lab.eng.rdu.redhat.com", 
        "hostname": "hp-ms-01-c33.moonshot1.lab.eng.rdu.redhat", 
        "services": [
            {
                "fsid": "9d6fd5a0-37f5-4ea2-807e-95c68a8ce252", 
                "type": "osd", 
                "id": "2", 
                "running": false
            }, 
            {
                "fsid": "9d6fd5a0-37f5-4ea2-807e-95c68a8ce252", 
                "type": "osd", 
                "id": "1", 
                "running": false
            }, 
            {
                "fsid": "9d6fd5a0-37f5-4ea2-807e-95c68a8ce252", 
                "type": "osd", 
                "id": "0", 
                "running": false
            }, 
            {
                "fsid": "9d6fd5a0-37f5-4ea2-807e-95c68a8ce252", 
                "type": "mon", 
                "id": "hp-ms-01-c33", 
                "running": true
            }
        ], 
        "frontend_addr": "10.12.27.36", 
        "backend_addr": "10.12.27.36", 
        "frontend_iface": null, 
        "backend_iface": null, 
        "managed": true, 
        "last_contact": "2017-02-22T11:39:20.000885+00:00", 
        "boot_time": "2017-02-18T17:37:34+00:00", 
        "ceph_version": "10.2.5-29.el7cp"
    }
]

4) The above output is incorrect because:
  a) it lists OSD details under the MON node,
  b) it still does not list all OSD nodes,
  c) the hostname is a truncated form of the FQDN instead of the short name,
  d) the frontend and backend addresses are incorrect.

I will be updating the setup details in the next update.

Note: As mentioned earlier, when this issue is hit, restarting the calamari service fixes the problem. Could we suggest that as the workaround for 2.2?
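
For reference, a hypothetical way to script the request from step 3, assuming the API accepts HTTP basic authentication (the host, user, password, and FSID below are placeholders):

# Hypothetical reproduction of step 3 with python-requests; placeholders
# throughout, and basic authentication is an assumption.
import requests

resp = requests.get(
    "https://CALAMARI_HOST/api/v2/cluster/<fs-id>/server",
    auth=("tester", "PASSWORD"),
    verify=False,  # many test deployments use a self-signed certificate
)
resp.raise_for_status()
for server in resp.json():
    services = ["%s.%s" % (s["type"], s["id"]) for s in server["services"]]
    print(server["fqdn"], services)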

Comment 25 Boris Ranto 2017-02-22 16:51:07 UTC
I examined the server, and it all looks like you are running the calamari daemon before the network setup is working properly (that is why a restart helps):

 - socket.getfqdn() returns null for the address, as was visible in the OSD listing
 - gethostbyaddr fails to get the hostname and throws an exception
 - the only resolvable address is the localhost one...

Calamari can't work properly if the underlying network is not ready yet.

All in all, this looks like a test (environment) problem, not a calamari issue. How/when do you run 'calamari-ctl initialize'? Is the networking working properly at that moment? Are the addresses resolvable at that moment?
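
As a quick way to answer those questions on a given node, a pre-flight check along these lines could be run before 'calamari-ctl initialize' (hypothetical snippet; the address is a placeholder):

# Confirm that reverse and forward DNS work for a node's address before
# initializing calamari. The address below is a placeholder.
import socket

addr = "192.0.2.10"
try:
    fqdn = socket.gethostbyaddr(addr)[0]   # reverse lookup
    forward = socket.gethostbyname(fqdn)   # forward lookup
    print("DNS looks ready: %s <-> %s (%s)" % (addr, fqdn, forward))
except socket.error as exc:
    print("DNS not ready for %s: %s" % (addr, exc))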

Comment 26 Christina Meno 2017-02-22 18:24:06 UTC
Boris,

I think a more suitable behavior would be for calamari to tolerate this condition. Please prepare a patch where we just don't add nodes that are unresolvable. IIRC this code is triggered by MON and OSD maps, so we will get more opportunities to add correct data.

I don't think restarting the service to avoid this is a workaround that we can live with.

Comment 27 Boris Ranto 2017-02-22 20:27:04 UTC
Gregory,

Thanks to how Python works, if we are not able to resolve an address once, we will not be able to resolve it until Python is restarted (the "bad" resolving code will already be in memory after the first attempt to resolve and won't get removed until the restart). To test that out, you can e.g. start Python with all DNS servers masked out, try to resolve -> fail, unmask the DNS servers -> still fails, restart Python -> resolves fine.

We kind of did that not-adding before, when we failed on the unimplemented function -- at least that is what Harish was originally complaining about. Currently, the data gets populated with what can be known, and the rest gets fixed on the next proper run, when resolving finally works.

Also, we are already trying to skip empty name_info with the condition

name_info != ('', '')

we could maybe change that to

name_info and len(name_info) == 2 and name_info[0] and name_info[1]

to avoid other forms of empty name_info (although I don't see a path that would lead to those). Anyway, it looks like the new addresses are now populated by the code that was not previously implemented -- by the emitted signal. Maybe we should revert that, or even mask that code altogether -- it never worked (it was never implemented), so if we didn't need it by now (all it caused was trouble), we might never miss it.
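
For illustration, such a guard could live in a small helper like the following (hypothetical code, not the actual server_monitor.py):

# Treat any name_info that is missing, malformed, or has empty members as
# unknown, so it never gets registered as a bogus server record.
def is_usable_name_info(name_info):
    return (
        name_info is not None
        and len(name_info) == 2
        and bool(name_info[0])
        and bool(name_info[1])
    )

assert not is_usable_name_info(('', ''))
assert not is_usable_name_info(None)
assert is_usable_name_info(('host.example.com', 'host'))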

Comment 28 Christina Meno 2017-02-22 22:25:20 UTC
Boris,

I was under the impression that Python just uses the OS name resolver and will eventually come up with the correct name once the TTL expires, but I'm willing to accept that this is still not something we can work with.

Here is what I propose: if we fail to resolve a name, then both the fqdn and hostname fields should contain the IP that we take as input. That should allow programmatic users of the API to do the right thing, and humans will just have to deal with it until we can resolve names.
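
For illustration, a minimal sketch of that fallback, assuming resolution goes through socket.gethostbyaddr (hypothetical code, not the actual calamari implementation):

# If reverse resolution fails, expose the IP itself in both the fqdn and
# hostname fields so programmatic API consumers still get a stable key.
import socket

def resolve_or_fall_back(addr):
    """Return (fqdn, hostname) for addr, falling back to the IP itself."""
    try:
        fqdn = socket.gethostbyaddr(addr)[0]
    except socket.error:
        return addr, addr  # unresolvable right now: use the raw IP
    return fqdn, fqdn.split('.')[0]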

What do you think about that?

cheers,
G

Comment 29 Boris Ranto 2017-02-23 14:20:17 UTC
I was thinking we actually do that already -- getfqdn(IP) returns the IP if it cannot find a record, and the hostname would then look like a subnet because we strip everything from the last dot onwards. The view, however, showed a null fqdn and no hostname (maybe it was None?). Therefore I suspect that either the records are modified in on_osd_map (the code actually looks like it modifies them) or the table got populated by a different mechanism (this seems just as likely).

It would really help if I could get a better reproducer because right now, until the actual package gets built, I can usually only guess at what is going on. For starters, it would be nice to test with a build with debugging turned on (do we maybe have support for that in ceph-ansible?).

This is not new behaviour; it fails in cases where the network was broken or lacking. I also suspect that this might end up requiring a larger code redesign to make it work (we would probably have to touch the on_*_map functions). As such, I don't think there is much sense in postponing the current release because of this, and I believe we should move this bugzilla to the next release.

I was also looking at other ways to do the resolving (there are two Python DNS libraries), but neither of them seemed to be of much help here.

Comment 30 Harish NV Rao 2017-02-23 14:48:49 UTC
Seeing the same issue on a newly added MON via ceph-ansible. Restarting the calamari service fixes it.

Doesn't ceph-ansible restart the calamari service after it installs the calamari-server packages, to avoid this issue?

Comment 31 Boris Ranto 2017-02-24 10:27:41 UTC
Re-targeting for 2.3.

@Harish: It is likely that even if ceph-ansible did restart the service it would not help, as this seems to be caused by missing DNS records at the time calamari is run (if you restart the service at roughly the same time you first started it, the records are still likely not sorted out). It is the manual restart, once they are sorted out, that helps.

Can you elaborate on the previous comment? Did you add a MON and only then deploy calamari, or was calamari running fine (after a previous restart) and deploying the MON node resulted in broken output? If the latter, did calamari get reinstalled in the meantime? If so, in what way? (I tried reproducing by reinstalling just the calamari packages, dumping its db, etc., but with no luck.)

Comment 32 Harish NV Rao 2017-02-24 11:37:40 UTC
(In reply to Boris Ranto from comment #31)
> Can you elaborate on the previous comment? Did you add a MON and only then
> deploy calamari, or was calamari running fine (after a previous restart)
> and deploying the MON node resulted in broken output? If the latter, did
> calamari get reinstalled in the meantime? If so, in what way? (I tried
> reproducing by reinstalling just the calamari packages, dumping its db,
> etc., but with no luck.)

@Boris, here are the steps I followed:

1) Installed the Ceph cluster [3 MONs, 3 OSDs] using ceph-ansible (with "calamari" set to "true" in the mons.yml file).

2) /api/v2/cluster/<fsid>/server did not work on any of the MON nodes.

3) Restarted the calamari service on all the MON nodes. The server API then worked from all MON nodes.

4) Added another MON node using ceph-ansible with "calamari" set to "true" in the mons.yml file.

5) /api/v2/cluster/<fsid>/server did not work on the newly added MON. It was working on all the other MON nodes.

6) Restarted the calamari service on the newly added MON. The server API then worked.

> @Harish: It is likely that even if ceph-ansible did restart the service it
> would not help, as this seems to be caused by missing DNS records at the
> time calamari is run (if you restart the service at roughly the same time
> you first started it, the records are still likely not sorted out). It is
> the manual restart, once they are sorted out, that helps.

How about suggesting that users restart the calamari service after the installation?

Comment 34 Boris Ranto 2017-02-24 12:03:57 UTC
We may want to add a Note saying something like:

If you are having trouble discovering all your nodes in Calamari, please consider restarting the calamari service, as the name resolution may have already failed (and the failed state been cached) when calamari first tried to discover the new nodes.

Comment 36 Boris Ranto 2017-03-01 19:30:51 UTC
I was finally able to reproduce and I believe I have a fix (well, at least it helped on my test cluster):

https://github.com/ceph/calamari/pull/511

It turned out that the null fqdn was not caused by us being unable to resolve through getfqdn/gethostbyaddr, but because (some) OSD services were not registered properly and so the Eventer was not able to resolve them -- the _get_fqdn function checks whether the service was registered and returns None (null) if it was not.

This was also related to the way ceph-ansible deploys Ceph and calamari: it first installs the MON nodes, then installs and deploys calamari, and only then adds the OSD nodes. Therefore, calamari will use a different discovery method that is racy because some osd services do not get registered.

Given all of this, it would be nice if this could still make it into the 2.2 release (otherwise, I would probably nominate it for an early z-stream).

The patchset also contains a proper fix for the wildcard traceback. Previously, if we received a (hostname, osd_addr) tuple like ('', ':/0'), we did not strip the CIDR notation before checking whether osd_addr was actually non-null, so we ran socket.gethostbyaddr(''), which always fails with the wildcard traceback.
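
For illustration, a minimal sketch of that ordering fix, assuming a Ceph address string of the form "IP:port/nonce" (hypothetical code, not the actual patchset):

# Strip the ":port/nonce" suffix from an address such as "10.0.0.1:6801/1234"
# *before* testing whether anything is left to resolve, so an unset address
# like ":/0" never reaches socket.gethostbyaddr('').
import socket

def resolve_osd_addr(osd_addr):
    addr = osd_addr.split(':')[0]  # simplified; IPv6 addresses need more care
    if not addr:
        return None  # unset address -- nothing to resolve
    try:
        return socket.gethostbyaddr(addr)[0]
    except socket.error:
        return None

assert resolve_osd_addr(':/0') is None  # previously this raised the wildcard error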

Comment 37 Christina Meno 2017-03-02 01:27:46 UTC
Boris,

I'm mostly OK with the patch, though I'd like to talk a little more about the "calamari will use a different discovery method that is racy because some osd services do not get registered" part.

Let's talk tomorrow morning to decide which release we should pursue with this fix.

Thank you

Comment 38 Ken Dreyer (Red Hat) 2017-03-02 19:49:01 UTC
Without more information here, re-targeting to 2.3.

Comment 49 errata-xmlrpc 2017-04-17 14:31:48 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:0978