Bug 1547070 - [DB] [DNS] - Updating the host's capabilities while running a VM may cause 'ERROR: duplicate key value violates unique constraint "name_server_pkey"'
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: BLL.Network
Version: 4.2.1.4
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: ovirt-4.2.2
Target Release: ---
Assignee: Alona Kaplan
QA Contact: Michael Burman
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-02-20 13:09 UTC by Michael Burman
Modified: 2018-04-05 09:39 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2018-04-05 09:39:11 UTC
oVirt Team: Network
Embargoed:
rule-engine: ovirt-4.2+
ylavi: exception+


Attachments
engine log (814.55 KB, application/x-gzip)
2018-02-20 13:09 UTC, Michael Burman


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 88820 0 master MERGED engine: Exclude DnsConfig from VdsDynamic add/update 2020-10-30 09:58:43 UTC
oVirt gerrit 88967 0 ovirt-engine-4.2 MERGED engine: Exclude DnsConfig from VdsDynamic add/update 2020-10-30 09:58:43 UTC

Description Michael Burman 2018-02-20 13:09:59 UTC
Created attachment 1398219 [details]
engine log

Description of problem:
[DB] [DNS] - Updating the host's capabilities while running a VM may cause 'ERROR: duplicate key value violates unique constraint "name_server_pkey"'

The issue is that the dns_configuration of the host is part of the vds_dynamic table.
If the table is updated outside of getCapabilities (for example, it is updated when running a VM) while getCaps (or another update of the vdsDynamic table) is performed at the same time, a race can occur.

- These are the two problematic lines:

removeNameServersByDnsResolverConfigurationId(entity.getId());
saveNameServersByDnsResolverConfigurationId(entity.getId(), entity.getNameServers());

- If a context switch happens during or after the removal in thread A, and another thread (thread B) updates the DNS configuration in the meantime, then the duplicate name server error occurs when execution returns to thread A:
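
The interleaving described above can be sketched in isolation. This is only a minimal illustration, not the engine code: the class NameServerRaceSketch, its in-memory set, and the remove/insert helpers are hypothetical stand-ins for the name_server table and the two DAO calls, and the sketch ignores database transactions and locking entirely. Latches force the order "A removes, B removes and saves, A saves" so that the failure is deterministic.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;

public class NameServerRaceSketch {
    // Hypothetical stand-in for the name_server table. Each entry mimics the
    // primary key (dns_resolver_configuration_id, address).
    static final Set<String> nameServer = ConcurrentHashMap.newKeySet();

    // Stand-in for removeNameServersByDnsResolverConfigurationId (assumption).
    static void remove(String configId) {
        nameServer.removeIf(key -> key.startsWith(configId + "|"));
    }

    // Stand-in for the insert; rejects a duplicate key like the DB constraint.
    static void insert(String configId, String address) {
        if (!nameServer.add(configId + "|" + address)) {
            throw new IllegalStateException(
                "duplicate key value violates unique constraint \"name_server_pkey\"");
        }
    }

    /** Forces the interleaving; returns the error seen by thread A, or null. */
    static String reproduce() throws InterruptedException {
        nameServer.clear();
        String id = "86405e21-b3fd-4968-aa66-287daa178a81";
        String address = "10.35.28.28";
        insert(id, address); // the row that exists before both updates

        CountDownLatch aRemoved = new CountDownLatch(1);
        CountDownLatch bSaved = new CountDownLatch(1);
        String[] errorInA = new String[1];

        Thread a = new Thread(() -> {
            remove(id);               // thread A: removal
            aRemoved.countDown();     // "context switch" to thread B
            try {
                bSaved.await();       // B runs its full remove + save here
                insert(id, address);  // thread A: save, now a duplicate
            } catch (IllegalStateException e) {
                errorInA[0] = e.getMessage();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        Thread b = new Thread(() -> {
            try {
                aRemoved.await();
                remove(id);
                insert(id, address);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                bSaved.countDown();
            }
        });

        a.start();
        b.start();
        a.join();
        b.join();
        return errorInA[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(reproduce());
    }
}
```

In this sketch thread B succeeds and thread A fails on its save, which matches the description: the remove and the save are not atomic with respect to another writer of the same configuration.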

Caused by: org.postgresql.util.PSQLException: ERROR: duplicate key value violates unique constraint "name_server_pkey"
  Detail: Key (dns_resolver_configuration_id, address)=(86405e21-b3fd-4968-aa66-287daa178a81, 10.35.28.28) already exists.
  Where: SQL statement "INSERT INTO
    name_server(
      address,
      position,
      dns_resolver_configuration_id)
    VALUES (
      v_address,
      v_position,
      v_dns_resolver_configuration_id)"
PL/pgSQL function insertnameserver(uuid,character varying,smallint) line 3 at SQL statement


Version-Release number of selected component (if applicable):
4.2.2-0.1.el7

How reproducible:
100%

Steps to Reproduce:
1. Refresh the host's capabilities while a start VM operation is in progress

Actual results:
The duplicate name server error is raised

Comment 1 Dan Kenigsberg 2018-03-13 12:40:06 UTC
This bug seems to have caused a failure to start a VM, raising severity

http://jenkins.ovirt.org/job/ovirt-4.2_change-queue-tester/1064/testReport/junit/(root)/004_basic_sanity/run_vms/

Comment 2 Yaniv Kaul 2018-03-15 14:04:13 UTC
Is this on track to 4.2.2? If not, please defer to 4.2.3.

Comment 3 Michael Burman 2018-03-29 09:03:02 UTC
- Network status is PASS (manual testing)
- No regression introduced (Network tier2 is PASS)
- Waiting for an ACK from the Virt team, since they saw this bug multiple times in their automation runs, and to make sure no regression was introduced there either.

Tested on - 4.2.2.5-0.1.el7 and vdsm-4.20.23-1.el7ev.x86_64 

Keeping ON_QA until the ACK from Virt team.

Comment 4 Israel Pinto 2018-04-01 12:18:02 UTC
In Virt we don't see it anymore.
Michael, you can verify it.

Israel

Comment 5 Michael Burman 2018-04-01 14:18:01 UTC
Thanks Israel, 

Verified on - 4.2.2.5-0.1.el7 and vdsm-4.20.23-1.el7ev.x86_64

Comment 6 Sandro Bonazzola 2018-04-05 09:39:11 UTC
This bugzilla is included in oVirt 4.2.2 release, published on March 28th 2018.

Since the problem described in this bug report should be resolved in the oVirt 4.2.2 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.

