Bug 2033016
Summary: systemd/1 - possible circular locking dependency detected on aarch64

Product: [Fedora] Fedora
Component: kernel
Version: rawhide
Hardware: aarch64
OS: Unspecified
Status: CLOSED RAWHIDE
Severity: unspecified
Priority: unspecified
Reporter: Jakub Čajka <jcajka>
Assignee: Kernel Maintainer List <kernel-maint>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: acaringi, adscvr, airlied, alciregi, bskeggs, dustymabe, fedoraproject, filbranden, flepied, hdegoede, jarodwilson, jeremy, jeremy.linton, jglisse, jonathan, josef, kernel-maint, lgoncalv, linville, lnykryn, masami256, mchehab, msekleta, pbrobinson, ptalbert, ryncsn, ssahani, s, steved, systemd-maint, yuwatana, zbyszek
Target Milestone: ---
Target Release: ---
Type: Bug
Regression: ---
Bug Blocks: 245418
Last Closed: 2022-02-21 16:56:17 UTC
Description (Jakub Čajka, 2021-12-15 17:31:23 UTC)
Given that this is happening while css_set_lock is held, and css_set_lock is used exclusively by cgroups, I'm guessing COSA is running things with cgroups, probably in containers?

This is running inside a Fedora CoreOS VM run via qemu; `cosa` is the tool that sets up and launches the VM. All this is to say that we should be able to reproduce without `cosa` in the chain. Jakub was also able to reproduce on an AWS instance and on bare metal. The failing test means that we're not uploading any of the images (i.e. I don't have a link for you to download), but if that would be useful I can do a build and post it up somewhere; just let me know which platform you prefer.

For reference: the failing test (the one that triggers the warning) is in this test file [1] and is the equivalent of running:

```
sudo podman info --format json
sudo podman run --net=none --rm --memory=128m --memory-swap=128m echo echo 1
sudo podman run --net=none --rm --memory-reservation=10m echo echo 1
sudo podman run --net=none --rm --cpu-shares=100 echo echo 1
sudo podman run --net=none --rm --cpu-period=1000 echo echo 1
sudo podman run --net=none --rm --cpuset-cpus=0 echo echo 1
sudo podman run --net=none --rm --cpuset-mems=0 echo echo 1
sudo podman run --net=none --rm --cpu-quota=1000 echo echo 1
sudo podman run --net=none --rm --blkio-weight=10 echo echo 1
sudo podman run --net=none --rm --memory=128m echo echo 1
sudo podman run --net=none --rm --shm-size=1m echo echo 1
```

[1] https://github.com/coreos/coreos-assembler/blob/1183b990f3cdfd223651c49c4bd93fbd26a178e4/mantle/kola/tests/podman/podman.go#L119-L122

But the locking is a kernel thing, no?

Yes. I assume it's an issue with the RC kernel.

Created attachment 1846632 [details]
reduced reproducer
I have managed to reproduce this in an aarch64 rawhide (5.16.0-0.rc5.20211214git5472f14a3742.36.fc36.aarch64) VM. A script based on FCOS's podman.base test case is attached (a hypothetical sketch of such a loop is below). It might take more than one run to trigger, and it seems to happen only once per boot. Note that nothing actually fails; the issue is only logged.
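The attached script itself isn't reproduced here. As a purely hypothetical sketch of what such a reduced loop might look like, built from the podman commands quoted in the description (the image name and iteration count are placeholders, not taken from the attachment):

```
#!/bin/bash
# Hypothetical reduced reproducer, modeled on the podman commands quoted in
# the description; the real attached script may differ. IMAGE and the loop
# count are placeholders. Requires a lockdep-enabled (CONFIG_PROVE_LOCKING)
# kernel, e.g. a rawhide debug build.
IMAGE=registry.fedoraproject.org/fedora-minimal

for i in $(seq 1 20); do
    sudo podman run --net=none --rm --memory=128m --memory-swap=128m "$IMAGE" echo 1
    sudo podman run --net=none --rm --cpu-shares=100 "$IMAGE" echo 1
    sudo podman run --net=none --rm --cpuset-cpus=0 "$IMAGE" echo 1
done

# Nothing fails visibly; the splat shows up only in the kernel log, and at
# most once per boot:
sudo journalctl -k | grep -A 20 "possible circular locking dependency"
```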
As a note, this doesn't at first glance appear to be aarch64 specific, although TBD. Obviously, this isn't going to be reproducible on any of the release kernels, as they won't have CONFIG_PROVE_LOCKING enabled. If it were causing wider problems I would expect them to appear as soft-lockup messages with similar call stacks. I'm still trying to duplicate it on the honeycomb.

I don't think it is aarch64 specific either; that just happened to be where we were seeing it in our CI (see https://github.com/coreos/fedora-coreos-tracker/issues/1049). I commented later in that issue that I saw a similar lock, this time on x86_64: https://github.com/coreos/fedora-coreos-tracker/issues/1049#issuecomment-998252332

So, I duplicated this, and it is probably legitimate, but it requires a cgroup migration with multiple threads to be running while an async signal is being sent to a task being migrated. In other words, I don't think the indicated tests can actually deadlock; I've got a bit of code which should, but doesn't yet. It's a bit tricky to intuit a solution, but it's likely a matter of ensuring that obj_cgroup_release() (see commit bf4f059954dcb221384b2f784677e19a13cd4bdb) is delayed. There is a frozen-task check in get_signal() which could be expanded (because AFAIK it doesn't avoid this lock dependency) to cover cgroup task migration as well as already-frozen tasks, but that isn't obvious or clean. I'm going to see if I can force a deadlock some more, and then post what I have to LKML.

Just a note: we are still able to reproduce this in our CI environment: https://github.com/coreos/fedora-coreos-tracker/issues/1049#issuecomment-1018540617

Well, I slimmed down the test program but never managed to create a standalone deadlocking program. OTOH, I think I have a fairly simple patch (~3 lines) that fixes it in a fairly clean way: basically I'm just replacing the spinlock/list_del/unlock sequence with a list_del_rcu. I've been running the podman test in a loop for a few hours now and it hasn't popped, although it would also go away for a while in the past.

Jeremy, since we're able to reproduce the issue pretty easily (at least we were last I checked), if you can provide me a kernel scratch build I can run it through FCOS CI and see if it resolves the issue.

Sure, which kernel version do you prefer? Although, huh, I'm still playing with different ways to fix it, and I've got some other issues I'm juggling, so this is a bit slow too. I should have most of tomorrow to start closing this out (and post a suggested fix to LKML as I've been promising). Let me do that, and roll you a scratch build (although I'm not 100% confident I can get the config right for rawhide in an official build; let's see).

The latest kernel version in rawhide (which I was able to reproduce this issue on yesterday about 50% of the time) is kernel-5.17.0-0.rc0.20220112gitdaadb3bd0e8d.63.fc36. You can apply a patch on top of the rawhide branch and do a scratch build (you can limit the build to x86_64/aarch64 unless you want to wait a long time for the armv7/s390x builds to complete); a sketch of that workflow is below. You've probably applied a patch to the kernel before and done a scratch build, but in case it's helpful, here's a recent example where I did: https://src.fedoraproject.org/rpms/kernel/pull-request/50

Yeah, I will see if the scratch build picks up the right config... The public posting/patch is here: https://lore.kernel.org/lkml/20220201205623.1325649-1-jeremy.linton@arm.com/T/#u. Pretty sure it fixed the problem here, but I'm not sure it doesn't break something else.
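For reference, a minimal sketch of the scratch-build workflow described above, assuming fedpkg is installed and configured with Fedora packager credentials (the patch-application step is elided; the pull request linked above shows a worked example):

```
# Sketch of a Fedora kernel scratch build, assuming a standard fedpkg setup.
fedpkg clone kernel && cd kernel
fedpkg switch-branch rawhide

# ...apply the candidate patch and hook it into the spec here (see the pull
# request linked above for a real example)...

# Limit the build to the architectures of interest so you don't wait on the
# slower builders.
fedpkg scratch-build --srpm --arches x86_64 aarch64
```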
So there is a scratch build here: https://koji.fedoraproject.org/koji/taskinfo?taskID=82256731

That scratch build seems to have solved the problem for me.

A modified version has been merged to -mm, and it seems like it's going to be merged to -stable soon as well.

Cool. Let me know when it hits a tag and we'll see if things make it down into a Fedora kernel so we can test.

This is in 5.17-rc4.

Is it safe to say it's in kernel-5.17.0-0.rc4.96.fc36 (https://bodhi.fedoraproject.org/updates/FEDORA-2022-e27e6736b8)?

Since the kernels in branched and rawhide are both rc4+, I'll mark this as closed; we'll follow up if we still see the issue.
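For anyone checking their own machines: since the fix landed in 5.17-rc4, comparing the running kernel version is enough (a trivial sketch; package name per standard Fedora):

```
# Any branched/rawhide kernel at rc4 or later should carry the fix; the rc
# number is embedded in the Fedora version string.
uname -r            # e.g. 5.17.0-0.rc4.96.fc36.aarch64
rpm -q kernel-core  # installed kernel packages
```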