Bug 1787319 - [OVN] ovn-controller: crash due to use after free in I-P engine
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux Fast Datapath
Classification: Red Hat
Component: ovn2.11
Version: FDP 20.A
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Dumitru Ceara
QA Contact: Jianlin Shi
URL:
Whiteboard:
Depends On: 1787318
Blocks:
Reported: 2020-01-02 11:37 UTC by Dumitru Ceara
Modified: 2020-11-10 15:23 UTC

Fixed In Version: ovn2.11-2.11.1-33
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1787318
Environment:
Last Closed: 2020-11-10 15:23:08 UTC
Target Upstream Version:
Embargoed:



Description Dumitru Ceara 2020-01-02 11:37:37 UTC
+++ This bug was initially created as a clone of Bug #1787318 +++

Description of problem:
With the attached scaled configuration, ovn-controller might access freed memory and crash when logical switches are deleted.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Start ovn-northd and point it to the attached northbound db (ovnnb_db.db).
2. Start ovn-controller.
3. Start OVS and bind the logical_switch_ports locally:

for i in $(ovn-nbctl --bare --columns name find logical_switch_port type=\"\"); do
    vm=$(echo $i | cut -f 1 -d "-")
    ovs-vsctl add-port br-int $vm -- set interface $vm type=internal
    ovs-vsctl set interface $vm external_ids:iface-id=$i
done

4. Delete all logical switches:

for s in $(ovn-nbctl list logical_switch | grep -E "^name" | cut -f 2 -d ':' | cut -f 2 -d '"'); do
    ovn-nbctl ls-del $s
done
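
One way to tell whether ovn-controller survived step 4 is to check that the process recorded in its pidfile is still alive. The sketch below is only an illustration: the helper name pidfile_alive is hypothetical, and the pidfile path is an assumption that may differ per installation (override it through OVN_CONTROLLER_PIDFILE).

```shell
#!/bin/sh
# Return 0 if the process recorded in the given pidfile is still alive.
# Note: kill -0 also fails when the caller lacks permission to signal the
# process, so run this as the same user as the daemon (or as root).
pidfile_alive() {
    pidfile=$1
    [ -r "$pidfile" ] || return 1
    pid=$(cat "$pidfile")
    [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}

# Assumed pidfile location; adjust for your installation.
OVN_CONTROLLER_PIDFILE=${OVN_CONTROLLER_PIDFILE:-/var/run/openvswitch/ovn-controller.pid}

if pidfile_alive "$OVN_CONTROLLER_PIDFILE"; then
    echo "ovn-controller is still running"
else
    echo "ovn-controller is not running (possible crash)"
fi
```

Because the crash is not deterministic ("might crash"), a single passing check does not prove the absence of the bug; repeating the check after each batch of deletions gives a better signal.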

Actual results:
ovn-controller might crash:
Program received signal SIGSEGV, Segmentation fault.
0x00000000004b8f47 in hmap_first_with_hash (hmap=hmap@entry=0x91da08, hmap=hmap@entry=0x91da08, hash=2346380341)
    at ./include/openvswitch/hmap.h:328
328         return hmap_next_with_hash__(hmap->buckets[hash & hmap->mask], hash);


Expected results:
ovn-controller shouldn't use memory after it was freed.

Additional info:
Fixed upstream by commits:
2a4965c0e187db0c4218556ed9b06f988e88cb62: ovn-controller: Refactor I-P engine_run() tracking.
5ed53faecef12c09330ced445418c961cb1f8caf: ovn-controller: Add per node states to I-P engine.
2117ba0a91f36206d0f3665e8680c15f1f6fa0a0: ovn-controller: Add separate I-P engine node for processing ct-zones.
94cbc59dc0f1cb56e56d1551956efe5824561864: ovn-controller: Fix use of dangling pointers in I-P runtime_data.
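
To check whether a local clone of the upstream tree already contains these four commits, something like the following could be used. This is a sketch under assumptions: the helper name missing_commits and the OVN_REPO variable are hypothetical, and the check relies on git merge-base --is-ancestor. Backports that were cherry-picked onto a stable branch get different hashes, so a commit reported as missing does not by itself prove the fix is absent.

```shell
#!/bin/sh
# Print every commit from the argument list that is NOT an ancestor of the
# given ref in the given repository (unknown hashes are reported as well).
missing_commits() {
    repo=$1
    ref=$2
    shift 2
    for sha in "$@"; do
        if ! git -C "$repo" merge-base --is-ancestor "$sha" "$ref" 2>/dev/null; then
            echo "$sha"
        fi
    done
}

# Fix commits listed in this bug report.
FIX_COMMITS="2a4965c0e187db0c4218556ed9b06f988e88cb62
5ed53faecef12c09330ced445418c961cb1f8caf
2117ba0a91f36206d0f3665e8680c15f1f6fa0a0
94cbc59dc0f1cb56e56d1551956efe5824561864"

# OVN_REPO is a hypothetical variable pointing at a local clone; the check
# runs only when it is set.
if [ -n "${OVN_REPO:-}" ]; then
    # Intentionally unquoted so the list splits into one argument per hash.
    missing_commits "$OVN_REPO" HEAD $FIX_COMMITS
fi
```

An empty output means all four commits are reachable from HEAD in that clone.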

Comment 2 Jianlin Shi 2020-03-23 09:41:24 UTC
Hi Dumitru,

I failed to reproduce the issue on ovn2.11-2.11.1-24.el7fdp.x86_64 with steps in https://bugzilla.redhat.com/show_bug.cgi?id=1787318#c3.

Comment 3 Dumitru Ceara 2020-03-30 15:24:16 UTC
Hi Jianlin,

The crash was made more visible by commit [1], but that commit was squashed into the patches for ovn2.11-2.11.1-26, which also fix the crash.
The steps described in https://bugzilla.redhat.com/show_bug.cgi?id=1787318#c3 don't reproduce the issue because they exercise the code path added by [1].

I don't see a straightforward way of reproducing the issue without [1]. There are, in theory, code paths that would trigger the memory corruption, but I couldn't hit them.

Regards,
Dumitru

[1] https://github.com/ovn-org/ovn/commit/fc1e1640cd47f255c68488b0ec36052b0af58fd2#diff-452d44dee1f09b8a972c69ef7499a69c

Comment 4 Jianlin Shi 2020-03-31 03:49:34 UTC
set VERIFIED per comment 3

Comment 5 Dan Williams 2020-11-10 15:23:08 UTC
All these bugs have been verified and have shipped in FDP 20.G or earlier.

