Bug 1387441 - [RFE] resource_tracker should correctly account for VMs using boot from volumes hosted on NFS backend
Summary: [RFE] resource_tracker should correctly account for VMs using boot from volume...
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-nova
Version: 9.0 (Mitaka)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Assignee: Eoghan Glynn
QA Contact: Prasanth Anbalagan
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2016-10-20 21:57 UTC by David Hill
Modified: 2022-03-13 14:10 UTC (History)
CC List: 10 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-04-10 15:27:59 UTC
Target Upstream Version:


Attachments


Links
System ID                                   Private  Priority  Status  Summary  Last Updated
Red Hat Issue Tracker OSP-13535             0        None      None    None     2022-03-13 14:10:51 UTC
Red Hat Knowledge Base (Solution) 4213941   0        None      None    None     2019-06-11 16:43:10 UTC

Description David Hill 2016-10-20 21:57:28 UTC
Description of problem:
resource_tracker doesn't correctly account for VMs that boot from volumes hosted on an NFS backend.

This is easy to reproduce since the resource_tracker doesn't validate where the root_gb and ephemeral_gb of a given instance actually live; commenting out the following 2 lines in resource_tracker.py makes it return proper values to nova-api.
 
        self.compute_node.memory_mb_used += sign * mem_usage
#        self.compute_node.local_gb_used += sign * usage.get('root_gb', 0)
#        self.compute_node.local_gb_used += sign * usage.get('ephemeral_gb', 0)

This hack is ugly because anyone NOT using boot_from_volume could easily fill up $instances_path if that path is different from $nfs_mount_point_base.


# Top-level directory for maintaining nova's state (string value)
#state_path=/var/lib/nova
state_path=/var/lib/nova


# Where instances are stored on disk (string value)
#instances_path=$state_path/instances

# Directory where the NFS volume is mounted on the compute node (string value)
#nfs_mount_point_base=$state_path/mnt
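
To make the accounting issue concrete, here is a minimal, hypothetical Python sketch (simplified names, not the real nova code path) of what the resource tracker effectively does: the flavor's root_gb and ephemeral_gb are charged against local_gb_used even when the root disk actually lives on a Cinder volume on the NFS backend.

# Hypothetical, simplified sketch; names and numbers are illustrative only.
def update_usage(compute_node, usage, sign=1):
    # root_gb/ephemeral_gb come straight from the flavor, whether or not
    # the disk actually lives on local storage.
    compute_node['local_gb_used'] += sign * usage.get('root_gb', 0)
    compute_node['local_gb_used'] += sign * usage.get('ephemeral_gb', 0)
    compute_node['memory_mb_used'] += sign * usage.get('memory_mb', 0)
    compute_node['vcpus_used'] += sign * usage.get('vcpus', 0)

node = {'local_gb': 100, 'local_gb_used': 0, 'memory_mb_used': 0, 'vcpus_used': 0}
# A boot-from-volume instance whose flavor still declares root_gb=80:
bfv = {'root_gb': 80, 'ephemeral_gb': 0, 'memory_mb': 2048, 'vcpus': 2}
update_usage(node, bfv)
print(node['local_gb_used'])  # 80 -> the host looks nearly full to the DiskFilter,
                              # even though nothing was written to local disk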


Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. Deploy an overcloud with a valid storage-environment.yaml using NFS backend
2. Deploy many VMs with boot_from_volume on the overcloud until / is reported as full but the NFS share still has plenty of free gigabytes (e.g. as sketched after this list)
3. The scheduler refuses to spawn new VMs
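
For illustration only, step 2 boils down to repeating something like the following (image, network and resource names are placeholders; exact client syntax varies by release, and --volume assumes a reasonably recent python-openstackclient):

  openstack volume create --image <image> --size 10 bfv-root-1
  openstack server create --flavor m1.small --volume bfv-root-1 --network <net> bfv-vm-1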

Actual results:
Scheduling fails even though the NFS share still has plenty of free space.

Expected results:
This shouldn't fail; boot-from-volume instances should not be counted against the compute node's local disk.

Additional info:

Comment 1 David Hill 2016-10-20 21:58:59 UTC
Ugly patch to fix this: 
--- resource_tracker.py.orig    2016-10-20 21:24:29.892850673 +0000
+++ resource_tracker.py 2016-10-20 21:47:35.282942549 +0000
@@ -714,8 +714,8 @@
         mem_usage += overhead['memory_mb']

         self.compute_node.memory_mb_used += sign * mem_usage
-        self.compute_node.local_gb_used += sign * usage.get('root_gb', 0)
-        self.compute_node.local_gb_used += sign * usage.get('ephemeral_gb', 0)
+#        self.compute_node.local_gb_used += sign * usage.get('root_gb', 0)
+#        self.compute_node.local_gb_used += sign * usage.get('ephemeral_gb', 0)
         self.compute_node.vcpus_used += sign * usage.get('vcpus', 0)

         # free ram and disk may be negative, depending on policy:

Comment 2 Sylvain Bauza 2017-01-06 15:41:55 UTC
That's unfortunately a very well-known problem that stems from an initial design issue.

Nova has counted resources per compute node since its very beginning. Unfortunately, once we accepted (for good reasons) shared storage for booting instances, that per-compute accounting became unreliable.

Fixing that is an upstream Nova priority, and we will hopefully see a solution implemented between Ocata and Pike (OSP 11 and OSP 12), but it involves a lot of design changes and a new REST API called the Placement API.

Consequently, it's hardly feasible to backport any of that to OSP 9. That said, there are a couple of known workarounds that help reduce the problem:

 - operators can create dedicated flavors for boot-from-volume instances with a root and ephemeral size of 0
 - or, if you only support BFV instances, just disable the DiskFilter

I realize this is not a perfect solution and that a real resolution is somewhat mid-term, but the above are the current workarounds for most operators.
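
For illustration only, hedged sketches of those two workarounds. The flavor name and sizes below are made up, and the filter list shown is the Mitaka-era [DEFAULT]/scheduler_default_filters with DiskFilter removed (newer releases use [filter_scheduler]/enabled_filters instead; trim the list to whatever your deployment actually enables):

# 1) A dedicated boot-from-volume flavor with no local disk:
#      openstack flavor create --vcpus 2 --ram 2048 --disk 0 --ephemeral 0 bfv.2c.2g

# 2) nova.conf on the scheduler nodes, DiskFilter removed from the filter list:
#scheduler_default_filters=RetryFilter,AvailabilityZoneFilter,RamFilter,DiskFilter,ComputeFilter,ComputeCapabilitiesFilter,ImagePropertiesFilter,ServerGroupAntiAffinityFilter,ServerGroupAffinityFilter
scheduler_default_filters=RetryFilter,AvailabilityZoneFilter,RamFilter,ComputeFilter,ComputeCapabilitiesFilter,ImagePropertiesFilter,ServerGroupAntiAffinityFilter,ServerGroupAffinityFilter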

An upstream bug partially describes the problem : https://bugs.launchpad.net/nova/+bug/1469179

Comment 3 awaugama 2017-08-30 17:51:48 UTC
WONTFIX/NOTABUG, therefore QE won't automate.

Comment 8 Matthew Booth 2019-04-10 15:27:59 UTC
We've revisited this, and the fundamental issue here is that cinder and nova are sharing a storage pool. Unfortunately this isn't something we can support. Cinder and nova must use separate storage. If they're both using NFS on the same array, they need to use separate exports from separate filesystems.
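
For illustration only, a hypothetical layout that satisfies this (server and export names are made up): nova's instances_path mounts one export, while cinder's NFS driver consumes a different export backed by a different filesystem.

# Mounted on the compute node for nova (instances_path):
#   nfs-server:/export/nova-instances   /var/lib/nova/instances   nfs   defaults 0 0
#
# cinder.conf NFS backend, pointing at a separate share list:
#   volume_driver = cinder.volume.drivers.nfs.NfsDriver
#   nfs_shares_config = /etc/cinder/nfs_shares
#
# /etc/cinder/nfs_shares then lists a different export:
#   nfs-server:/export/cinder-volumes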

