Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1500157

Summary: compute_capabilities_filter does not filter out soft deleted compute_nodes & ends up choosing soft deleted nodes with wrong profiles
Product: Red Hat OpenStack
Reporter: Jaison Raju <jraju>
Component: openstack-nova
Assignee: Sylvain Bauza <sbauza>
Status: CLOSED INSUFFICIENT_DATA
QA Contact: Joe H. Rahme <jhakimra>
Severity: medium
Docs Contact:
Priority: medium
Version: 10.0 (Newton)
CC: berrange, dasmith, eglynn, jraju, kchamart, lyarwood, mbooth, sbauza, sferdjao, sgordon, srevivo, vromanso
Target Milestone: async
Keywords: Triaged, ZStream
Target Release: 10.0 (Newton)
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-03-26 08:47:29 UTC
Type: Bug
Attachments: cmds / logs (flags: none)

Description Jaison Raju 2017-10-10 05:39:45 UTC
Description of problem:
On a director-based undercloud environment that has been redeployed multiple times, entries for the ironic nodes are already present in the nova.compute_nodes table.
When the ironic node profiles are changed, nova-compute detects this and records ironic node entries with the new profile, but during deployment the nova-scheduler does not apply any logic to skip the soft-deleted nodes, i.e. the rows matching:
nova.compute_nodes where deleted != '0'
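
For reference, a query along these lines (a sketch, assuming direct access to the nova database) lists the soft-deleted rows in question:

-- Hypothetical verification query: list soft-deleted compute_nodes rows
-- (deleted != 0 means the row was soft-deleted) together with their profile.
SELECT id, hypervisor_hostname, deleted, deleted_at, stats
FROM compute_nodes
WHERE deleted != 0\G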


Version-Release number of selected component (if applicable):
RHOS10 (but I think it should be reproducible in all versions)

How reproducible:
Always

Steps to Reproduce:
1.
2.
3.

Actual results:
The compute capabilities filter considers the profile in the 'stats' of the first entry in the DB that has a matching ironic node UUID.

Expected results:
The compute capabilities filter should skip all ironic node entries in compute_nodes that have deleted != '0'.

Additional info:
Similar logic was applied to show the latest hypervisor stats in
https://code.engineering.redhat.com/gerrit/#/c/108009/2/nova/db/sqlalchemy/api.py

Comment 1 Jaison Raju 2017-10-10 05:45:20 UTC
Created attachment 1336613 [details]
cmds / logs

Comment 2 Sylvain Bauza 2017-10-13 16:22:26 UTC
By design, filters are not responsible for verifying the liveness of the host they receive as a parameter (even the ComputeFilter verifies the *service* liveness, not the host itself).

Rather, when we build the list of nodes to verify, we call the DB and ask it (as of the Newton timeframe) to return a list of compute node records here:
https://github.com/openstack/nova/blob/e30b75097840019c38e0619e70924ddc9f9487a0/nova/scheduler/host_manager.py#L588

(Once we get that list of nodes, we later run each filter against each of them, one by one.)

If you look at the object internals, at how we build the SQL query for the compute_nodes records, you can see that it checks the deleted state and, by default, does not ask for soft-deleted entries.
https://github.com/openstack/nova/blob/stable/newton/nova/db/sqlalchemy/api.py#L589-L590

Since we build a context object that defaults to read_deleted='no', in theory we should only get back non-soft-deleted nodes.
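
Roughly speaking (a sketch of the effective filtering, not the literal query nova emits), read_deleted='no' amounts to adding a deleted = 0 predicate to the compute_nodes query:

-- Approximate effect of read_deleted='no' on the compute_nodes query:
SELECT * FROM compute_nodes
WHERE deleted = 0;
-- With read_deleted='yes' the predicate is dropped entirely, and with
-- read_deleted='only' it becomes deleted != 0.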

Now, looking at your attachment, I can see two records for the same hypervisor_hostname value:

*************************** 3. row ***************************
           created_at: 2017-02-17 06:01:29
           updated_at: 2017-09-24 01:41:48
           deleted_at: 2017-09-24 01:41:51
                   id: 3
           service_id: NULL
                vcpus: 0
            memory_mb: 0
             local_gb: 0
           vcpus_used: 2
       memory_mb_used: 8192
        local_gb_used: 80
      hypervisor_type: ironic
   hypervisor_version: 1
             cpu_info: 
 disk_available_least: -278
          free_ram_mb: -8192
         free_disk_gb: -80
     current_workload: 0
          running_vms: 3
  hypervisor_hostname: 561a3dea-aed9-472f-bc5c-eb55f2aab183
              deleted: 3
              host_ip: 10.65.176.41
  supported_instances: [["x86_64", "baremetal", "hvm"]]
            pci_stats: {"nova_object.version": "1.1", "nova_object.changes": ["objects"], "nova_object.name": "PciDevicePoolList", "nova_object.data": {"objects": []}, "nova_object.namespace": "nova"}
              metrics: []
      extra_resources: NULL
                stats: {"profile": "ceph-storage", "cpu_arch": "x86_64", "num_proj_1ce9d46a310d4500987afc96ded596e3": "3", "io_workload": "3", "num_instances": "3", "num_vm_building": "3", "num_task_None": "3", "boot_option": "local", "num_os_type_None": "3"}
        numa_topology: NULL
                 host: ibm-x3630m4-5.gsslab.pnq.redhat.com
 ram_allocation_ratio: 1
 cpu_allocation_ratio: 0
                 uuid: b8685471-e7eb-4ba0-98d7-44539828bfeb
disk_allocation_ratio: 0

*************************** 10. row ***************************
           created_at: 2017-09-24 01:44:53
           updated_at: 2017-10-02 17:38:02
           deleted_at: NULL
                   id: 10
           service_id: NULL
                vcpus: 12
            memory_mb: 16384
             local_gb: 278
           vcpus_used: 0
       memory_mb_used: 0
        local_gb_used: 0
      hypervisor_type: ironic
   hypervisor_version: 1
             cpu_info: 
 disk_available_least: 278
          free_ram_mb: 16384
         free_disk_gb: 278
     current_workload: 0
          running_vms: 0
  hypervisor_hostname: 561a3dea-aed9-472f-bc5c-eb55f2aab183
              deleted: 0
              host_ip: 10.65.176.41
  supported_instances: [["x86_64", "baremetal", "hvm"]]
            pci_stats: {"nova_object.version": "1.1", "nova_object.changes": ["objects"], "nova_object.name": "PciDevicePoolList", "nova_object.data": {"objects": []}, "nova_object.namespace": "nova"}
              metrics: []
      extra_resources: NULL
                stats: {"profile": "compute", "cpu_arch": "x86_64", "cpu_hugepages": "true", "cpu_txt": "true", "cpu_vt": "true", "boot_option": "local", "cpu_aes": "true", "cpu_hugepages_1g": "true"}
        numa_topology: NULL
                 host: ibm-x3630m4-5.gsslab.pnq.redhat.com
 ram_allocation_ratio: 1
 cpu_allocation_ratio: 0
                 uuid: c7829239-2100-40e9-830b-2efb88145b8a
disk_allocation_ratio: 0

That is fine, because we have a unique key on (host, hypervisor_hostname, deleted), which means the same node can appear two or more times as long as only one of the rows is active.
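
For illustration (the constraint name here is hypothetical; the real one in nova's schema may differ), the table enforces something like:

-- Sketch of the unique constraint on compute_nodes.
ALTER TABLE compute_nodes
  ADD CONSTRAINT uniq_host_hypervisor_hostname_deleted
  UNIQUE (host, hypervisor_hostname, deleted);
-- The soft-deleted row stores deleted = <id> (e.g. 3) while the live row
-- stores deleted = 0, so both rows satisfy the constraint at the same time.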

Accordingly, the HostState should be updated with the new compute node's resources, including the stats.

Could you please try something like this (a verification query is sketched after this list):
 - create a new Ironic node
 - verify that you can see it in the compute_nodes table
 - modify the Ironic node like you did
 - verify that this creates a separate compute_nodes entry (and soft-deletes the previous one)
 - check whether the scheduler filter still misbehaves
 - if so, try running the scheduler again and see whether that fixes it
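
A query along these lines (a sketch; substitute the actual ironic node UUID) should show whether the update soft-deleted the old row and created a new one:

-- Hypothetical check: compare the live and soft-deleted rows for one node.
SELECT id, deleted, deleted_at, stats
FROM compute_nodes
WHERE hypervisor_hostname = '<ironic-node-uuid>'\G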

Thanks,
-Sylvain

Comment 4 Jaison Raju 2017-10-18 18:30:00 UTC
I tested this via tripleo-quickstart on Ocata RDO, but couldn't reproduce the issue.
The DB entry was changed immediately once I ran the ironic node-update.
No new entries were created; the existing ones were updated.



MariaDB [nova]> select hypervisor_hostname,stats from compute_nodes\G;
*************************** 1. row ***************************
hypervisor_hostname: 4d640cad-270c-4baa-b218-b7b1ffc78023
              stats: {"profile": "compute", "cpu_arch": "x86_64", "boot_option": "local"}
*************************** 2. row ***************************
hypervisor_hostname: 22a2c202-6c18-468f-a2a7-043a9170dc2f
              stats: {"profile": "control", "cpu_arch": "x86_64", "boot_option": "local"}
2 rows in set (0.00 sec)

I will test this again on the original environment, which was RHOS11.
If I am not able to reproduce the same behavior using ironic node-update,
I think something during the repeated stack delete & create may have brought the
compute_nodes table into this state.

Comment 5 Jaison Raju 2017-10-19 05:05:57 UTC
I tested this again on RHOS11, where the issue was initially noticed.
(To work around it, I had deleted all entries in compute_nodes earlier.)

The behavior is the same as what I found in the tripleo-quickstart environment:
after the ironic node-update, the compute_nodes entry is updated with the new profile,
and no new entries are created.

It seems I was wrong about how these entries were created, and I still don't know how they were.
I will try to force this behavior by adding a similar entry before the actual entry, with deleted = $id (see the sketch below).
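
Something along these lines (a hypothetical sketch; the column list is trimmed for readability, and in practice every NOT NULL column of compute_nodes has to be supplied) could seed a stale, soft-deleted row with a different profile for the same ironic node UUID:

-- Hypothetical INSERT simulating a stale, soft-deleted compute_nodes row;
-- deleted is set to the row's own id, as nova does on soft delete, and the
-- profile differs from the live row for the same hypervisor_hostname.
INSERT INTO compute_nodes
  (created_at, deleted_at, id, hypervisor_type, hypervisor_version,
   hypervisor_hostname, host, deleted, stats, uuid)
VALUES
  ('2017-02-17 06:01:29', '2017-09-24 01:41:51', 3, 'ironic', 1,
   '561a3dea-aed9-472f-bc5c-eb55f2aab183',
   'ibm-x3630m4-5.gsslab.pnq.redhat.com', 3,
   '{"profile": "ceph-storage", "cpu_arch": "x86_64", "boot_option": "local"}',
   'b8685471-e7eb-4ba0-98d7-44539828bfeb');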

Comment 6 Sylvain Bauza 2017-10-20 11:57:37 UTC
Okay, let me know the outcome of the tests and whether you can reproduce it; if you can see the issue again, I suspect a problem with the Ironic virt driver providing the list of nodes.

Comment 7 Jaison Raju 2017-10-27 15:49:10 UTC
(In reply to Sylvain Bauza from comment #6)
> Okay, lemme know the outcomes of the tests and if you can reproduce that,
> but I suppose a problem with the Ironic virt driver providing the list of
> nodes if you can see again the issue.

I am facing some different issues while testing this.
I took all the INSERT statements needed to repopulate compute_nodes and deleted all the existing entries.
I then created a duplicate INSERT statement for one of the computes, with a much older id (soft-deleted) and with a different profile.

But while testing, the scheduler immediately fails in the retry itself.
I can't make out why.

I am also not sure how multiple entries get created during repeated redeployments, but I have noticed this multiple times.

Please suggest what we could do next.

Comment 8 Sylvain Bauza 2018-03-26 08:47:29 UTC
Honestly, given that we can't really reproduce the issue, I don't know how to help here.
The only possible way would be to look at the DB and see whether an Ironic node modification creates a new compute node record or just modifies the existing one, but given that it's not possible to verify that at the moment, I'm closing the bug now.

Please reopen it if you are able to reproduce the problem so we can discuss it further.