Bug 864978 - cura-networking: incorrect locking
Summary: cura-networking: incorrect locking
Keywords:
Status: CLOSED UPSTREAM
Alias: None
Product: Fedora
Classification: Fedora
Component: cura-networking
Version: rawhide
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: ---
Assignee: Radek Novacek
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: 864846
TreeView+ depends on / blocked
 
Reported: 2012-10-10 14:37 UTC by Florian Weimer
Modified: 2016-12-01 00:31 UTC (History)
2 users (show)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2012-11-21 14:00:14 UTC
Type: Bug
Embargoed:


Attachments (Terms of Use)

Description Florian Weimer 2012-10-10 14:37:30 UTC
This code in src/network.c is incorrect:

Network *network_ref(const CMPIBroker *broker)
{
    if (_network == NULL) {
        _network = network_new(broker);
    }
    while (_network->loaded != (NETWORK_DEVICES_LOADED | NETWORK_SETTINGS_LOADED)) {
        // We need to wait for other thread to fetch network devices and settings
        // TODO: don't use polling
        usleep(10000);
    }
    MUTEX_LOCK(_network);
    _network->ref_count++;
    MUTEX_UNLOCK(_network);
    return _network;
}

You need to protect _network with a lock, and perform the initialization and reading under it.  If locking on every read is problematic, you could use the double-checked locking idiom, but I'm not sure if it's valid under the memory model implemented in GCC.  Using pthread_once could be an option as well.

In the network_unref function, the assignment to _network must be performed under a lock:

void network_unref(Network *network)
{
    if (network == NULL || _network == NULL) {
        network = _network = NULL;
        return;
    }

    MUTEX_LOCK(network);
    network->ref_count--;
    MUTEX_UNLOCK(network);

    if (network->ref_count <= 0) {
        network_free(network);
        network = _network = NULL;
    }
}

The value of network->ref_count must be stored in a local variable under the lock, reading it outside the lock can lead to parallel calls to network_free.  (It is safe to call network_free outside the lock when the reference count has reached zero because there is only one thread in which this can happen.  In fact, you cannot call network_free on a locked object.)

The following is just an example, there are several similar cases with the same problem.  In the network_device_removed_cb function in src/network_nm.c file, network->ports is accessed outside the lock:

void network_device_removed_cb(NMClient *nm_client, NMDevice *device, Network *network)
{
    fprintf(stderr, "Device: %s - removed\n", nm_device_get_iface(device));
    int i, len = ports_length(network->ports);
    for (i = 0; i < len; i++) {
        if (strcmp(ports_index(network->ports, i)->id, nm_device_get_iface(device)) == 0) {
            break;
        }
    }
    if (i < len) {
        MUTEX_LOCK(network);
        port_free(ports_pop(network->ports, i));
        MUTEX_UNLOCK(network);
    }
}

This read has to happen under the lock.

As a rule of thumb, it is best to assume that writes are invisible to reads unless both happen while the same lock is held.  In other words, you must lock while reading *and* writing.

Comment 1 Radek Novacek 2012-11-21 14:00:14 UTC
Thanks for the report. Locking has been fixed by explicit locking of each usage of the Network struct. Initialization of the Network struct is now done using pthead_once.

Upstream commit:
http://git.fedorahosted.org/cgit/openlmi-networking.git/commit/?id=c0328abc0977920e3fd0cbfa90566583a00ec72c


Note You need to log in before you can comment on or make changes to this bug.