Bug 1286566

Summary: Nova APIs used by fence_compute have changed
Product: Red Hat Enterprise Linux 7
Reporter: Andrew Beekhof <abeekhof>
Component: fence-agents
Assignee: Marek Grac <mgrac>
Status: CLOSED NOTABUG
QA Contact: cluster-qe <cluster-qe>
Severity: urgent
Priority: urgent
Version: 7.2
CC: abeekhof, cfeist, cluster-maint, eglynn, fdinitto, sbauza
Target Milestone: rc
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Last Closed: 2015-12-02 03:08:56 UTC
Type: Bug
Bug Blocks: 1185030

Description Andrew Beekhof 2015-11-30 09:20:53 UTC
Description of problem:

The fence_compute agent we shipped for OSP7 doesn't appear to work in OSP8.
Specifically, the plug list now contains UUIDs, which we need to map to and from host names.

Version-Release number of selected component (if applicable):

fence-agents-all-4.0.11-27.el7.x86_64

How reproducible:

100% 

Steps to Reproduce:
fence_compute -o list (plus normal arguments)

Actual results:

A list of UUIDs

Expected results:

A list of compute node host names


Additional info:

Provisional patch

[root@overcloud-controller-0 heat-admin]# diff -u /sbin/fence_compute /sbin/fence_compute.new
--- /sbin/fence_compute.new	2015-11-29 19:24:50.419784336 -0500
+++ /sbin/fence_compute	2015-11-29 22:42:02.836595219 -0500
@@ -65,12 +65,19 @@
 		}
 
 def _host_evacuate(host, on_shared_storage):
-	hypervisors = nova.hypervisors.search(host, servers=True)
-	response = []
-	for hyper in hypervisors:
-		if hasattr(hyper, 'servers'):
-			for server in hyper.servers:
-				response.append(_server_evacuate(server, on_shared_storage))
+        response = []
+        servers = nova.servers.list(search_opts={'name': host})
+        for server in servers:
+                logging.debug("Got %s" % server)
+                logging.debug("Got hypervisor name %s" % server._info['OS-EXT-SRV-ATTR:hypervisor_hostname'])
+                hypervisors = nova.hypervisors.search(server._info['OS-EXT-SRV-ATTR:hypervisor_hostname'], servers=True)
+                for hyper in hypervisors:
+                        logging.debug("Got %s" % hyper)
+                        if hasattr(hyper, 'servers'):
+                                logging.debug("Looking up servers")
+                                for instance in hyper.servers:
+                                        logging.debug("evacuating %s" % instance)
+                                        response.append(_server_evacuate(instance, on_shared_storage))
 
 def set_attrd_status(host, status, options):
 	logging.debug("Setting fencing status for %s to %s" % (host, status))
@@ -112,20 +119,32 @@
 	return
 
 def get_plugs_list(_, options):
-	result = {}
+        result = {}
+        unames = []
 
-	if nova:
-		hypervisors = nova.hypervisors.list()
-		for hypervisor in hypervisors:
-			longhost = hypervisor.hypervisor_hostname
-			if options["--action"] == "list" and options["--domain"] != "":
-				shorthost = longhost.replace("." + options["--domain"],
-                                                 "")
-				result[shorthost] = ("", None)
-			else:
-				result[longhost] = ("", None)
-	return result
+        nodes=Popen(['cibadmin', '-Q', '-o', 'nodes'], stdout=PIPE)
+        nodes.wait()
 
+        if nodes.stdout:
+                lines = nodes.stdout.readlines()
+                nodes.stdout.close()
+
+        for line in lines:
+                getNext=False
+                fields = line.split('"')
+                for field in fields:
+                        if getNext == True:
+                                unames.append(field)
+                                break
+                        elif field.strip() == "uname=":
+                                getNext=True
+
+        for uname in unames:
+                role=Popen(['crm_attribute', '-N', uname, '-n', 'osprole', '-d', 'unknown', '-q'], stdout=PIPE).communicate()[0].strip()
+                if role == "compute":
+                        result[uname] = ("", None)
+
+        return result
 
 def define_new_opts():
 	all_opt["endpoint-type"] = {
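
Note on the node lookup in the provisional patch: the new get_plugs_list() finds node names by splitting each line of the `cibadmin -Q -o nodes` output on double quotes. A minimal sketch of the same extraction with an XML parser follows. It assumes the command prints a `<nodes>` element whose `<node>` children carry a `uname` attribute, which is what the patch's `uname=` matching implies. The helper name `parse_unames` and the sample XML are illustrative only and are not part of fence_compute.

```python
import xml.etree.ElementTree as ET


def parse_unames(nodes_xml):
    """Return the uname attribute of every <node> element in the XML text.

    Hypothetical helper: the input is assumed to be the text printed by
    `cibadmin -Q -o nodes`.
    """
    root = ET.fromstring(nodes_xml)
    # iter("node") walks the whole tree, so nested attribute blocks are skipped
    # and only <node> elements are inspected.
    return [node.get("uname") for node in root.iter("node") if node.get("uname")]


# Illustrative sample only; real output depends on the cluster.
SAMPLE = """<nodes>
  <node id="1" uname="overcloud-controller-0"/>
  <node id="2" uname="overcloud-compute-0">
    <instance_attributes id="nodes-2">
      <nvpair id="nodes-2-osprole" name="osprole" value="compute"/>
    </instance_attributes>
  </node>
</nodes>"""

if __name__ == "__main__":
    print(parse_unames(SAMPLE))
```

Parsing the XML avoids depending on how attributes happen to be ordered or wrapped across lines, which the quote-splitting loop is sensitive to.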

Comment 1 Eoghan Glynn 2015-11-30 11:31:06 UTC
abeekhof: can you give a concise description of exactly what has changed in the Nova API?

Is it that the nova hypervisors API now returns host UUIDs, whereas previously it reported symbolic hostnames?

Comment 2 Sylvain Bauza 2015-11-30 11:33:21 UTC
I'm not sure I understand the problem.
The output of _server_evacuate() (i.e. the dict keyed by server_uuid) was introduced with https://github.com/ClusterLabs/fence-agents/commit/855c7f617e6afc840540439c359de970d3dc8cee

We haven't changed that since the beginning, so I don't see what the problem is here.

Could you please be more explicit and tell us what has regressed, so we can identify how to fix it?

Comment 3 Sylvain Bauza 2015-11-30 11:34:46 UTC
To be clear, I'm speaking of https://github.com/ClusterLabs/fence-agents/blob/f32efc776fd146aa9f386f2e635707763baf231f/fence/agents/compute/fence_compute.py#L63-L67 which is independent of what the Nova API provides.

Comment 4 Andrew Beekhof 2015-12-02 03:08:56 UTC
Eoghan, Sylvain, apologies, I thought I had closed this bug.

I misattributed the change in output to changes in nova.

In practice, the person who set up the environment had accidentally pointed the fencing configuration at nova on the undercloud instead of the overcloud (and I was slow to realise what was going on).

No bug here. Sorry for the noise.
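
For anyone who hits the same symptom: a quick sanity check on the output of `fence_compute -o list` can flag this kind of misconfiguration early. The sketch below is not part of fence_compute. It only tests whether every plug name parses as a UUID, on the assumption (taken from this report) that a plug list made up entirely of UUIDs means the agent is talking to the wrong nova endpoint. The function names are hypothetical.

```python
import uuid


def looks_like_uuid(name):
    """Return True if name parses as a UUID (hyphenated or plain 32-hex form)."""
    try:
        uuid.UUID(name)
    except (ValueError, AttributeError, TypeError):
        return False
    return True


def plugs_look_like_uuids(plugs):
    """Return True if the plug list is non-empty and every entry is a UUID.

    Hypothetical check: an all-UUID list is treated as a hint that the
    fencing configuration may point at the wrong nova endpoint.
    """
    plugs = list(plugs)
    return bool(plugs) and all(looks_like_uuid(p) for p in plugs)


if __name__ == "__main__":
    # Illustrative values only.
    print(plugs_look_like_uuids(["6a3f0c1e-8d52-4b7a-9c1d-2e4f5a6b7c8d"]))
    print(plugs_look_like_uuids(["overcloud-compute-0.localdomain"]))
```

Because `uuid.UUID` also accepts 32 hexadecimal characters without hyphens, a host name of exactly that shape would be misreported; treat the result as a hint, not proof.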