Bug 1108708

Summary: Config parser: Properly check integer values
Product: Red Hat Enterprise Linux 7 Reporter: Jan Friesse <jfriesse>
Component: corosync    Assignee: Jan Friesse <jfriesse>
Status: CLOSED ERRATA QA Contact: Cluster QE <mspqa-list>
Severity: medium Docs Contact:
Priority: medium    
Version: 7.1    CC: ccaulfie, cluster-maint, jkortus
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: corosync-2.3.3-3.el7 Doc Type: Bug Fix
Doc Text:
Cause: The user sets an integer option in corosync.conf to a value outside the allowed boundaries (< 0 or, usually, > 2^32 - 1). Consequence: Corosync doesn't show an error message; the integer is parsed and overflows. Fix: Properly check integer boundaries. Result: Corosync shows an error during configuration file parsing (the error contains the allowed integer range).
Story Points: ---
Clone Of:
: 1167390 (view as bug list) Environment:
Last Closed: 2015-03-05 08:27:09 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1167390    
Attachments:
Description Flags
coroparse: More strict numbers parsing none

Description Jan Friesse 2014-06-12 12:58:54 UTC
Description of problem:
Corosync doesn't properly check integers in corosync.conf for their range. It also doesn't allow entering a 32-bit integer with the highest bit set on 32-bit systems.

Version-Release number of selected component (if applicable):
2.3.3

How reproducible:
100%

Steps to Reproduce and results:
* corosync.conf: token: -1 -> 4294967295 (no error shown)
* token: 4294967296 -> overflow -> 0 (no error shown)
* It's impossible to enter a (perfectly valid) nodeid > 2^31 - 1 on 32-bit systems

Expected results:
Corosync shows an error on an invalid integer value (boundaries are checked), ideally including the allowed range. It is possible to use integers > 2^31 - 1 on 32-bit systems (not subject to QA; we no longer support 32-bit systems).

Comment 1 Jan Friesse 2014-06-12 13:00:48 UTC
Created attachment 908110 [details]
coroparse: More strict numbers parsing

coroparse: More strict numbers parsing

The previous safe_atoi didn't check the range of input values, so if for
example the user used -1 as the token timeout, it was converted to
UINT32_MAX without letting the user know.

Another safe_atoi problem was its use of strtol. This works pretty well
on 64-bit systems, where a long integer is usually 64 bits long; sadly,
on 32-bit systems it is usually 32 bits. And because strtol returns a
signed integer, it was not possible to enter a 32-bit value with the
highest bit set.

The solution is to use strtoll, which is guaranteed to be at least
64 bits long, and to check the value range.

The error message now also contains information about the expected
value range.

Signed-off-by: Jan Friesse <jfriesse>
Reviewed-by: Christine Caulfield <ccaulfie>

Comment 2 Jan Friesse 2014-06-12 13:04:35 UTC
For QE: "Unit" test is csts config-strict-parse-numbers.sh

Comment 5 Jaroslav Kortus 2014-10-30 13:37:09 UTC
With the introduction of the dynamic token timeout, the introduced checks might not be enough. For instance, setting token: 4294966646 will prevent corosync from starting in a 3-node cluster (4294966646 + 650 results in UINT32_MAX + 1).

As the purpose of this bug was to eliminate these values from the config and let the user know they are off limits, I'm flipping this back to ASSIGNED for reconsideration. Users would have to be really evil to reach these corner cases, but since we already have a bug for this... :).

Comment 6 Jan Friesse 2014-10-30 16:30:54 UTC
Good catch. Sadly, I'm really unsure if it's possible to let the user know they are off limits in this weird case. During startup it's possible, but sadly during runtime it's not (we really do not want to crash corosync just because the user added another node).

So my proposed solution is to limit the maximum token timeout to UINT32_MAX to prevent overflow, and, for consistent behavior, not to show any extra error.

Opinions? Ideas?

Comment 7 Jaroslav Kortus 2014-10-30 17:03:00 UTC
Limiting the token timeout alone will not do it. We should also limit all the other variables that come into play (like the coefficient), or better said, the resulting product of them.

We could apply this logic on reload too, I guess; at least that worked for the join timeout vs. consensus ratio: if it was not valid on reload, the reload was refused.

Is it possible to do that or are there any use cases that I don't see right now?

Comment 8 Jan Friesse 2014-10-31 16:47:51 UTC
Oh, this is a misunderstanding. By token I mean the final token value. So yes, the coefficient comes into play.

About the join vs. consensus error: sadly that's not true (and it may be worth a BZ). The error is shown, but the reload is still processed and the values are used.

That's why I'm considering using the highest possible values for the token and related variables, maybe with a warning message for 7.1.

Opinions?

Comment 9 Jan Friesse 2014-11-24 16:04:14 UTC
Cloning this bug to 7.2 to solve the token timeout overflow and to add functionality to refuse the reload.

Comment 13 errata-xmlrpc 2015-03-05 08:27:09 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-0365.html