Bug 2171486
| Summary: | etcd: FTBFS in Fedora rawhide/f38 | ||
|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | Fedora Release Engineering <releng> |
| Component: | etcd | Assignee: | Jan Chaloupka <jchaloup> |
| Status: | NEW --- | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 39 | CC: | go-sig, gscrivan, jcajka, jchaloup, lacypret, lemenkov, zaitcev |
| Target Milestone: | --- | ||
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Bug Blocks: | 2117176 | ||
Description
Fedora Release Engineering
2023-02-20 11:47:37 UTC
The rawhide RPM etcd-3.5.5-3.fc38.x86_64 builds fine on F38 on x86_64,
just like in Koji. Failure only happens on other architectures.
The initial failure was on i686, and then on aarch64 (scratch build).
diff -u good bad:
-ok go.etcd.io/etcd/pkg/v3/idutil 0.001s
+ok go.etcd.io/etcd/pkg/v3/idutil 0.003s
go.etcd.io/etcd/pkg/v3/idutil
PASS
-ok go.etcd.io/etcd/pkg/v3/idutil 0.001s
-go.etcd.io/etcd/pkg/v3/ioutil
-PASS
-ok go.etcd.io/etcd/pkg/v3/ioutil 0.002s
+ok go.etcd.io/etcd/pkg/v3/idutil 0.003s
go.etcd.io/etcd/pkg/v3/ioutil
-PASS
-ok go.etcd.io/etcd/pkg/v3/ioutil 0.002s
-go.etcd.io/etcd/pkg/v3/netutil
-{"level":"info","msg":"resolved URL Host","url":"http://infra0.example.com:4001","host":"infra0.example.com:4001","resolved-addr":"10.0.1.10:4001"}
......................
+--- FAIL: TestPageWriterRandom (0.00s)
+ pagewriter_test.go:41: got 2385 bytes pending, expected less than 128 bytes
+FAIL
+exit status 1
+FAIL go.etcd.io/etcd/pkg/v3/ioutil 0.006s
+error: Bad exit status from /var/tmp/rpm-tmp.OCzuL0 (%check)
+RPM build errors:
+ Bad exit status from /var/tmp/rpm-tmp.OCzuL0 (%check)
OK, I see what's happening.
TestPageWriterRandom has two problems.
Problem 1: it uses rand.Intn(), but never calls rand.Seed().
As a result, it passes or fails depending on how the platform
sets up the module. This is why it passes on x86_64: the platform
happens to seed the rand module in such a way that the sequence
lets the buggy test pass.
If I add rand.Seed(time.Now().UnixNano()), the test begins to
fail on my laptop in the exact way it fails in Koji on aarch64.
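One way to make such a run reproducible is to draw the write sizes from a private generator with an explicit seed, so a failing sequence can be replayed. This is only a sketch: writeSizes and its parameters are hypothetical names, not part of etcd or its test.

```go
package main

import (
	"fmt"
	"math/rand"
)

// writeSizes is a hypothetical helper: it returns count pseudo-random
// write lengths in [0, max), drawn from a private generator so that the
// sequence depends only on the seed passed in.
func writeSizes(seed int64, count, max int) []int {
	rng := rand.New(rand.NewSource(seed))
	sizes := make([]int, count)
	for i := range sizes {
		sizes[i] = rng.Intn(max)
	}
	return sizes
}

func main() {
	// The same seed yields the same sequence on every run, so a failure
	// seen with a logged seed can be replayed.
	a := writeSizes(42, 5, 32*1024)
	b := writeSizes(42, 5, 32*1024)
	fmt.Println(a)
	fmt.Println(b)
}
```

A test built this way could pick the seed from the clock and log it, which keeps the randomness while still allowing a failing run to be repeated.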
Problem 2: The test is obviously incorrect, even though it
has existed unchanged since it was committed in 2016,
commit 2943bf908606ccbfaeda3bdf882a11a0138a0502.
The condition it tests can obviously fail. The buffer can
contain several pages (especially if they're this small).
If the very first write is longer than several pages, the check
if len(p)+pw.bufferedBytes <= pw.bufWatermarkBytes { }
triggers, and Write returns while retaining everything
written in the buffer. At that point, the as-tested
condition is obviously violated, as cw.writeBytes is zero.
I'm too lazy to construct a long scenario, but it obviously
can happen on the last write too (for example, if the
previous 4045 writes reset the buffer to zero by accident).
The tested condition needs to change to a valid one.
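The first-write scenario can be sketched with a toy model. This is not the etcd PageWriter: modelWriter is a hypothetical stand-in that keeps only the early-return check quoted above and replaces the page-aligned flush path with a plain "flush everything". The 8 KiB watermark is an assumed value, and the write size is borrowed from the failure message purely for illustration.

```go
package main

import "fmt"

// modelWriter is a hypothetical, simplified stand-in for the page writer.
// It tracks byte counts only and holds no data.
type modelWriter struct {
	watermark int // assumed threshold below which writes are only buffered
	buffered  int // bytes accepted but not yet passed on
	flushed   int // bytes passed on to the underlying writer
}

// Write models a write of n bytes.
func (w *modelWriter) Write(n int) {
	if n+w.buffered <= w.watermark {
		// No overflow: everything stays in the buffer.
		w.buffered += n
		return
	}
	// Simplification: flush the buffer and the new data in one step.
	// The real writer is page-aware here; that path is not modeled.
	w.flushed += w.buffered + n
	w.buffered = 0
}

// pending reports bytes accepted by Write but not yet flushed.
func (w *modelWriter) pending() int { return w.buffered }

func main() {
	const pageBytes = 128
	w := &modelWriter{watermark: 8 * 1024}
	w.Write(2385) // a single write spanning many 128-byte pages
	fmt.Println("pending:", w.pending(), "exceeds one page:", w.pending() > pageBytes)
}
```

In this model the pending byte count is bounded by the watermark rather than by the page size, which is the kind of bound a corrected check would need to respect. Whether the upstream fix uses exactly that bound is not claimed here.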
See issue https://github.com/etcd-io/etcd/issues/16255
commit https://github.com/etcd-io/etcd/commit/fddd1add52b33649a99d7f756404924138344a10
I think the fix is incomplete, because it does not address
the seeding of the rand module. However, it should be
sufficient to unblock Koji.
This bug appears to have been reported against 'rawhide' during the Fedora Linux 39 development cycle.
Changing version to 39.