Bug 2173754
| Summary: | openvswitch provokes kernel warnings | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 9 | Reporter: | Colin Walters <walters> |
| Component: | kernel | Assignee: | Michael Petlan <mpetlan> |
| kernel sub component: | Perf | QA Contact: | Qiao Zhao <qzhao> |
| Status: | ASSIGNED --- | Docs Contact: | |
| Severity: | high | | |
| Priority: | high | CC: | acme, bstinson, ctrautma, dcbw, fleitner, jwboyer, mpetlan, qzhao, ralongi, zsun |
| Version: | CentOS Stream | Keywords: | Triaged |
| Target Milestone: | rc | Flags: | zsun: needinfo? (walters) zsun: needinfo? (ralongi) mpetlan: needinfo? (walters) mpetlan: needinfo? (walters) |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
Colin Walters
2023-02-27 21:17:23 UTC
It seems that openvswitch2.17 also reproduces the same warnings, so this apparently isn't new behavior for openvswitch.

(In reply to Colin Walters from comment #1)
> It seems that openvswitch2.17 also reproduces the same warnings, so this
> apparently isn't new behavior for openvswitch.

OVS 2.17 on RHEL9, or OVS 2.17 on RHEL8?

> OVS 2.17 on RHEL9,

This.
Also, this is not a very recent issue, since it also reproduces with 5.14.0-252.el9.x86_64 from Feb 01 (with ovs 3.1).
But this is new in C9S/9.2; it doesn't reproduce with 5.14.0-70.13.1.el9_0.x86_64 (same userspace). The call trace in the bug seems different from the one reported upstream: https://github.com/openshift/os/issues/1179

Feb 27 15:05:36.752873 ovs-ctl[1333]: /etc/openvswitch/conf.db does not exist ... (warning).
Feb 27 15:05:36.783800 ovs-ctl[1333]: Creating empty database /etc/openvswitch/conf.db.
Feb 27 15:05:36.810769 ovs-ctl[1400]: 2023-02-27T20:05:36Z|00001|dns_resolve|WARN|Failed to read /etc/resolv.conf: No such file or directory
Feb 27 15:05:36.810863 ovsdb-server[1400]: ovs|00001|dns_resolve|WARN|Failed to read /etc/resolv.conf: No such file or directory
Feb 27 15:05:36.814311 kernel: Call Trace:
Feb 27 15:05:36.814334 kernel: <TASK>
Feb 27 15:05:36.814339 kernel: __perf_install_in_context+0x16c/0x220
Feb 27 15:05:36.814347 kernel: remote_function+0x44/0x50
Feb 27 15:05:36.814351 kernel: generic_exec_single+0x75/0xb0
Feb 27 15:05:36.814354 kernel: smp_call_function_single+0xeb/0x130
Feb 27 15:05:36.814357 kernel: ? sw_perf_event_destroy+0x60/0x60
Feb 27 15:05:36.814361 kernel: perf_install_in_context+0xcf/0x200
Feb 27 15:05:36.814367 kernel: ? ctx_resched+0xe0/0xe0
Feb 27 15:05:36.814370 kernel: __do_sys_perf_event_open+0xa99/0xe30
Feb 27 15:05:36.814374 kernel: do_syscall_64+0x59/0x90
Feb 27 15:05:36.814377 kernel: ? do_user_addr_fault+0x1d8/0x690
Feb 27 15:05:36.814381 kernel: ? exc_page_fault+0x62/0x150
Feb 27 15:05:36.814385 kernel: entry_SYSCALL_64_after_hwframe+0x63/0xcd
Feb 27 15:05:36.814388 kernel: RIP: 0033:0x7f5b66e3ee5d
Feb 27 15:05:36.814392 kernel: Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 93 af 1b 00 f7 d8 64 89 01 48
Feb 27 15:05:36.814397 kernel: RSP: 002b:00007ffe7f8fe6a8 EFLAGS: 00000246 ORIG_RAX: 000000000000012a
Feb 27 15:05:36.814401 kernel: RAX: ffffffffffffffda RBX: 00007ffe7f8fe860 RCX: 00007f5b66e3ee5d
Feb 27 15:05:36.814404 kernel: RDX: 00000000ffffffff RSI: 0000000000000000 RDI: 00007ffe7f8fe6b0
Feb 27 15:05:36.814408 kernel: RBP: 0000558b0ba7bb60 R08: 0000000000000000 R09: 0000008000000000
Feb 27 15:05:36.814412 kernel: R10: 00000000ffffffff R11: 0000000000000246 R12: 0000000000000000
Feb 27 15:05:36.814415 kernel: R13: 0000558b0ae8a260 R14: 0000558b0ba7ef60 R15: 0000000000000000
Feb 27 15:05:36.814419 kernel: </TASK>

That seems to have come from: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/blob/main/kernel/events/core.c#L2757

Skimming through MRs, I see https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/1540 AKA https://bugzilla.redhat.com/show_bug.cgi?id=2123229 which seems likely to be relevant here.
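Since the trace enters the kernel through perf_event_open() (__do_sys_perf_event_open -> perf_install_in_context above), any perf_event_open() call that attaches an event to an already running task should go through the same code path. A minimal sketch for poking at it, assuming the perf tool is installed and ovsdb-server is running (this is only an illustration, not necessarily the call that ovsdb-server itself makes):

# perf stat -e task-clock -p "$(pidof ovsdb-server)" -- sleep 1
# dmesg | grep -A 25 'Call Trace'

The first command opens a software counter on the existing ovsdb-server task for one second; the second checks whether the kernel logged a new warning afterwards.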
I previously ran some tests using openvswitch3.1-3.1.0-0.2.el9fdp.x86_64.rpm with RHEL-9.2.0-20230209.13 (kernel: 5.14.0-252.el9.x86_64) and did not detect any call trace. The tests don't start ovsdb-server.service individually, as it is started as part of openvswitch.service.

However, I just set up a test bed running openvswitch3.1-3.1.0-0.2.el9fdp.x86_64.rpm with RHEL-9.2.0-20230227.0 (kernel: 5.14.0-282.el9.x86_64), stopped openvswitch.service, then ran "systemctl start ovsdb-server". There were no issues reported via the console or dmesg. Details are below.
Is there a step that I am missing?
Thanks,
Rick
[root@netqe25 ~]# rpm -q kernel openvswitch3.1
kernel-5.14.0-282.el9.x86_64
openvswitch3.1-3.1.0-2.el9fdp.x86_64
[root@netqe25 ~]# systemctl stop openvswitch
[root@netqe25 ~]# systemctl status ovsdb-server
○ ovsdb-server.service - Open vSwitch Database Unit
Loaded: loaded (/usr/lib/systemd/system/ovsdb-server.service; static)
Active: inactive (dead) since Tue 2023-02-28 15:27:15 EST; 17s ago
Duration: 33min 3.657s
Process: 142249 ExecStartPre=/usr/bin/rm -f /run/openvswitch.useropts (code>
Process: 142250 ExecStartPre=/usr/bin/chown ${OVS_USER_ID} /run/openvswitch>
Process: 142251 ExecStartPre=/bin/sh -c /usr/bin/echo "OVS_USER_ID=${OVS_US>
Process: 142253 ExecStartPre=/bin/sh -c if [ "$${OVS_USER_ID/:*/}" != "root>
Process: 142255 ExecStart=/usr/share/openvswitch/scripts/ovs-ctl --no-ovs-v>
Process: 143404 ExecStop=/usr/share/openvswitch/scripts/ovs-ctl --no-ovs-vs>
Main PID: 142298 (code=exited, status=0/SUCCESS)
CPU: 202ms
Feb 28 14:54:11 netqe25.knqe.lab.eng.bos.redhat.com ovs-vsctl[142299]: ovs|0000>
Feb 28 14:54:11 netqe25.knqe.lab.eng.bos.redhat.com ovs-vsctl[142304]: ovs|0000>
Feb 28 14:54:11 netqe25.knqe.lab.eng.bos.redhat.com ovs-ctl[142255]: Configurin>
Feb 28 14:54:11 netqe25.knqe.lab.eng.bos.redhat.com ovs-ctl[142255]: Enabling r>
Feb 28 14:54:11 netqe25.knqe.lab.eng.bos.redhat.com systemd[1]: Started Open vS>
Feb 28 14:54:11 netqe25.knqe.lab.eng.bos.redhat.com ovs-vsctl[142312]: ovs|0000>
Feb 28 15:27:14 netqe25.knqe.lab.eng.bos.redhat.com systemd[1]: Stopping Open v>
Feb 28 15:27:14 netqe25.knqe.lab.eng.bos.redhat.com ovs-ctl[143404]: Exiting ov>
Feb 28 15:27:15 netqe25.knqe.lab.eng.bos.redhat.com systemd[1]: ovsdb-server.se>
Feb 28 15:27:15 netqe25.knqe.lab.eng.bos.redhat.com systemd[1]: Stopped Open vS>
[root@netqe25 ~]# dmesg
[ 836.688899] systemd-rc-local-generator[141502]: /etc/rc.d/rc.local is not marked executable, skipping.
[ 874.357212] No such timeout policy "ovs_test_tp"
[ 874.362389] Failed to associated timeout policy `ovs_test_tp'
[ 1119.158565] No such timeout policy "ovs_test_tp"
[ 1119.163737] Failed to associated timeout policy `ovs_test_tp'
[root@netqe25 ~]# systemctl start ovsdb-server
[root@netqe25 ~]# systemctl status ovsdb-server
● ovsdb-server.service - Open vSwitch Database Unit
Loaded: loaded (/usr/lib/systemd/system/ovsdb-server.service; static)
Active: active (running) since Tue 2023-02-28 15:27:54 EST; 4s ago
Process: 143451 ExecStartPre=/usr/bin/rm -f /run/openvswitch.useropts (code>
Process: 143452 ExecStartPre=/usr/bin/chown ${OVS_USER_ID} /run/openvswitch>
Process: 143453 ExecStartPre=/bin/sh -c /usr/bin/echo "OVS_USER_ID=${OVS_US>
Process: 143455 ExecStartPre=/bin/sh -c if [ "$${OVS_USER_ID/:*/}" != "root>
Process: 143457 ExecStart=/usr/share/openvswitch/scripts/ovs-ctl --no-ovs-v>
Main PID: 143501 (ovsdb-server)
Tasks: 1 (limit: 407974)
Memory: 3.2M
CPU: 106ms
CGroup: /system.slice/ovsdb-server.service
└─143501 ovsdb-server /etc/openvswitch/conf.db -vconsole:emer -vsy>
Feb 28 15:27:54 netqe25.knqe.lab.eng.bos.redhat.com systemd[1]: Starting Open v>
Feb 28 15:27:54 netqe25.knqe.lab.eng.bos.redhat.com ovs-ctl[143457]: Starting o>
Feb 28 15:27:54 netqe25.knqe.lab.eng.bos.redhat.com ovs-vsctl[143502]: ovs|0000>
Feb 28 15:27:54 netqe25.knqe.lab.eng.bos.redhat.com ovs-vsctl[143507]: ovs|0000>
Feb 28 15:27:54 netqe25.knqe.lab.eng.bos.redhat.com ovs-ctl[143457]: Configurin>
Feb 28 15:27:54 netqe25.knqe.lab.eng.bos.redhat.com ovs-ctl[143457]: Enabling r>
Feb 28 15:27:54 netqe25.knqe.lab.eng.bos.redhat.com systemd[1]: Started Open vS>
Feb 28 15:27:54 netqe25.knqe.lab.eng.bos.redhat.com ovs-vsctl[143513]: ovs|0000>
[root@netqe25 ~]# dmesg
[ 836.688899] systemd-rc-local-generator[141502]: /etc/rc.d/rc.local is not marked executable, skipping.
[ 874.357212] No such timeout policy "ovs_test_tp"
[ 874.362389] Failed to associated timeout policy `ovs_test_tp'
[ 1119.158565] No such timeout policy "ovs_test_tp"
[ 1119.163737] Failed to associated timeout policy `ovs_test_tp'
[root@netqe25 ~]#
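One extra check that might help rule out a missed one-shot warning (just a suggestion, not a step anyone in this bug has asked for): follow the kernel log in one terminal while restarting the unit from another, e.g.

# dmesg -w

and, in a second terminal:

# systemctl restart ovsdb-server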
I'd need someone familiar with openvswitch to help set up a reproduction environment - a machine in Beaker reserved for investigation, where I could try it out with different kernels, etc. Could you provide such a box? Thanks.

Hi,

The messages from comment#5 (copied below) indicate that there is no OVS configuration yet:

Feb 27 15:05:36.752873 ovs-ctl[1333]: /etc/openvswitch/conf.db does not exist ... (warning).
Feb 27 15:05:36.783800 ovs-ctl[1333]: Creating empty database /etc/openvswitch/conf.db.
[...]

So, all you need to do is install openvswitch3.1 and openvswitch-selinux-extra-policy from here:

http://download-node-02.eng.bos.redhat.com/rhel-9/nightly/updates/FDP/latest-FDP-9-RHEL-9/compose/Server/x86_64/os/Packages/

and enable/start the service using systemctl:

# systemctl enable openvswitch
# systemctl start openvswitch

The above will start the ovs-vswitchd and ovsdb-server daemons. The latter is the one reproducing the issue.

HTH,
fbl

I have tested it on a VM with kernel-5.14.0-339.el9 and I cannot reproduce the problem. I'll have a look at some older kernels and see whether the issue is caused by the kernel version (and thus already fixed) or by the hardware...

Yes, it seems that OVS uses perf in some way.

I have tested RHEL-9.2.0 with kernel-5.14.0-284.11.1.el9_2.x86_64 on bare metal hardware (Skylake CPU) with openvswitch3.1 and again, no failure. If you experience issues with any of the later kernels, please help me reproduce it. It might be reproducible only on a certain CPU model (the CPU vendor, family, model and stepping might be important to know).
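If someone does hit the warning, the CPU identification mentioned above can be captured with the standard tools (a suggestion only; these commands are not something anyone in this bug has run yet):

# lscpu | grep -E 'Vendor ID|CPU family|Model|Stepping'
# grep -E 'vendor_id|cpu family|model|stepping' /proc/cpuinfo | sort -u

Both report the vendor, family, model, model name and stepping of the installed CPUs.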