Bug 1352672

Summary: [RHELSA-7.3] 4.5.0-0.43.el7 WARNING: at net/core/dev.c:6518
Product: Red Hat Enterprise Linux 7 Reporter: PaulB <pbunyan>
Component: kernel-aarch64 Assignee: Mark Salter <msalter>
kernel-aarch64 sub component: Platform Enablement QA Contact: Jeff Bastian <jbastian>
Status: CLOSED CURRENTRELEASE Docs Contact:
Severity: high    
Priority: unspecified CC: bpeck, jbastian, jcm, jfeeney, mlangsdo, msalter, pbunyan, xzhou
Version: 7.3   
Target Milestone: rc   
Target Release: ---   
Hardware: aarch64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-12-12 16:54:11 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1274397    

Description PaulB 2016-07-04 16:29:41 UTC
Description of problem:
The following issue is reported by dmesg:
 WARNING: at net/core/dev.c:6518

Version-Release number of selected component (if applicable):
 distro: RHEL-7.3-20160629.n.0
 kernel: 4.5.0-0.43.el7

How reproducible:
 unknown

Steps to Reproduce:
1. Install ARM host with distro: RHEL-7.3-20160629.n.0
2. Install kernel-4.5.0-0.43.el7
3. dmesg

Actual results:
https://beaker.engineering.redhat.com/jobs/1388544
https://beaker.engineering.redhat.com/recipes/2836692#task42567198
https://beaker.engineering.redhat.com/recipes/2836692/tasks/42567198/results/210015351/logs/resultoutputfile.log
---<-snip->---
[    1.889930] realtek: module verification failed: signature and/or required key missing - tainting kernel
[    1.899886] xgene-enet APMC0D30:00: Unable to get ENET IRQ
[    1.905373] ------------[ cut here ]------------
[    1.909970] WARNING: at net/core/dev.c:6518
[    1.914133] Modules linked in: realtek(E)
---<-snip->---


Expected results:
 no dmesg warnings

Additional info:

Comment 4 Mark Salter 2016-07-19 15:43:17 UTC
Looks like the realtek module is getting loaded before the kernel key is loaded.

mustang:3> dmesg | grep key
[    0.527148] Initialise system trusted keyring
[    0.531080] NFS: Registering the id_resolver key type
[    0.531325] Key type big_key registered
[    0.538656] Asymmetric key parser 'x509' registered
[    2.112568] realtek: module verification failed: signature and/or required key missing - tainting kernel
[    2.504094] Loaded X.509 cert 'Build time autogenerated kernel key: 2e11297ac161ac1be32883d658a8350b9facf864'
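
The ordering above can be checked mechanically. The following is a minimal sketch, not part of the original report: it reads dmesg-style text on stdin and prints every module whose signature check failed before the first "Loaded X.509 cert" line. The helper name early_unsigned_modules is hypothetical, and the message strings are assumed to match the ones shown in this comment.

```shell
#!/bin/sh
# Sketch: list modules whose signature verification failed before the
# kernel's X.509 key was loaded. Reads dmesg-style text on stdin.
# early_unsigned_modules is a hypothetical helper name, not an existing tool.
early_unsigned_modules() {
    awk '
        /module verification failed/ {
            end = index($0, "] ")              # end of the "[ time]" prefix
            t = substr($0, 2, end - 2) + 0     # numeric timestamp
            name = substr($0, end + 2)
            sub(/:.*$/, "", name)              # keep only the module name
            if (!(name in fail_time))
                fail_time[name] = t
        }
        /Loaded X\.509 cert/ && !seen_key {
            end = index($0, "] ")
            key_time = substr($0, 2, end - 2) + 0
            seen_key = 1
        }
        END {
            # Report failures that happened before the key load, or all
            # failures if no key load was logged at all.
            for (name in fail_time)
                if (!seen_key || fail_time[name] < key_time)
                    print name
        }
    '
}
```

On a live system this would be run as `dmesg | early_unsigned_modules`. For the log in this comment it would be expected to report realtek, since 2.112568 precedes 2.504094.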

Comment 5 Mark Salter 2016-07-20 15:35:46 UTC
The realtek phy module is getting loaded early because we have the xgene net driver built in to the kernel. I think we have to build in the phy as well to avoid the signing message. This doesn't explain the panic in comment 3, which seems to be a separate issue.
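
One way to confirm the built-in versus modular split is to look at the kernel config file. The sketch below is an illustration only: config_state is a hypothetical helper, the /boot/config-$(uname -r) default path is an assumption about the installed system, and CONFIG_NET_XGENE and CONFIG_REALTEK_PHY are the upstream Kconfig symbol names, which are assumed (not confirmed in this bug) to apply to this kernel.

```shell
#!/bin/sh
# Sketch: report whether a Kconfig symbol is built in (y), modular (m),
# or not set, given a kernel config file.
# config_state is a hypothetical helper name.
config_state() {
    symbol=$1
    # Assumed default location of the running kernel's config.
    config=${2:-/boot/config-$(uname -r)}
    value=$(sed -n "s/^${symbol}=//p" "$config" | head -n 1)
    case $value in
        y)  echo "$symbol: built in" ;;
        m)  echo "$symbol: module" ;;
        "") echo "$symbol: not set" ;;
        *)  echo "$symbol: $value" ;;
    esac
}
```

For example, `config_state CONFIG_NET_XGENE` and `config_state CONFIG_REALTEK_PHY` would show the mismatch described above if the first reports "built in" and the second reports "module".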

Comment 6 Mark Langsdorf 2016-07-20 15:38:04 UTC
I agree that it makes sense that the phy and NIC both need to be monolithic or both need to be modular. But I thought we had a BZ focused on making the NIC modular, which seems like a better way of resolving this issue.

Comment 7 Mark Salter 2016-07-20 15:55:42 UTC
Getting the NIC modularized is the thing to do, definitely. Then the realtek problem goes away. There was bug 1264540, but it is already in VERIFIED. The xgene driver works as a module as far as loading goes, but unloading seems to be broken. So maybe 1264540 needs reopening and this one can block on it.

Comment 8 Jon Masters 2016-08-30 19:03:30 UTC
Can QE retest this with a -5 kernel?

Comment 9 PaulB 2016-09-01 12:56:48 UTC
JeffB,
I have not seen this issue in recent KT1 testing with apm-mustang-ev3
hosts.

RE: https://bugzilla.redhat.com/show_bug.cgi?id=1352672#c1
I always like to retest against the target BZ system to verify.
This is the host that was loaned to Yuying Ma <yuma>:
 apm-mustang-ev3-02.lab.eng.rdu.redhat.com


I have a couple of jobs in the queue for a couple of BZ checks.
I understand yuma is testing/verifying other BZs.
Once the system is released, my jobs are next in line.
I will update this BZ once the results are in.


Best,
-pbunyan

Comment 10 Mark Salter 2016-09-06 18:47:29 UTC
This should be fixed in kernel-4.5.0-5.el7 due to APM eth driver being modularized as of that kernel.

Comment 12 Jeff Bastian 2016-09-22 22:01:33 UTC
In addition to Paul's testing in comment 11, I loaded and unloaded the xgene_enet modules a number of times on the original system, although with newer firmware now, and did not see any warnings.  Marking as Verified.

[root@apm-mustang-ev3-02 ~]# uname -r
4.5.0-12.el7.aarch64

[root@apm-mustang-ev3-02 ~]# dmidecode -t0 | grep -A3 BIOS.Info
BIOS Information
	Vendor: AppliedMicro
	Version: 3.06.15
	Release Date: Aug 19 2016

[root@apm-mustang-ev3-02 ~]# lsmod | egrep 'xgene_enet|realtek'
xgene_enet             48782  0 
realtek                 4450  0 
mdio_xgene              7540  4 xgene_enet

[root@apm-mustang-ev3-02 ~]# rmmod xgene_enet ; rmmod mdio_xgene

[root@apm-mustang-ev3-02 ~]# lsmod | egrep 'xgene_enet|realtek'
realtek                 4450  0 

[root@apm-mustang-ev3-02 ~]# modprobe xgene_enet
[  316.559987] libphy: APM X-Gene MDIO bus: probed
[  316.633209] xgene-enet APMC0D05:00: clocks have been setup already
[  316.667675] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
...
[  320.706674] xgene-enet APMC0D30:00 eth1: Link is Up - 1Gbps/Full - flow control off
[  320.714399] IPv6: ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready

[root@apm-mustang-ev3-02 ~]# for i in {1..10} ; do rmmod xgene_enet ; rmmod mdio_xgene ; sleep 10 ; modprobe xgene_enet ; sleep 10 ; done
...
...

[root@apm-mustang-ev3-02 ~]# dmesg | grep -i warn
[root@apm-mustang-ev3-02 ~]# ip -4 a l
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
46: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
    inet 10.12.0.113/21 brd 10.12.7.255 scope global dynamic eth0
       valid_lft 85447sec preferred_lft 85447sec
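
The unload/reload check above could also be scripted so that it stops at the first cycle that produces a warning. This is only a sketch under stated assumptions: count_kernel_warnings and reload_and_check are hypothetical helper names, the grep pattern assumes warnings are logged as "WARNING: at ..." (as in this bug) or with extra fields between "WARNING:" and "at", and the loop must be run as root on the affected host.

```shell
#!/bin/sh
# Sketch: count kernel warning lines in dmesg-style text read from stdin.
# Both function names below are hypothetical helpers.
count_kernel_warnings() {
    # grep -c prints 0 and exits non-zero when nothing matches.
    grep -c 'WARNING:.* at ' || true
}

# Reload the xgene modules N times (default 10), checking dmesg each time.
# Needs root; mirrors the manual loop shown in this comment.
reload_and_check() {
    cycles=${1:-10}
    i=0
    while [ "$i" -lt "$cycles" ]; do
        rmmod xgene_enet && rmmod mdio_xgene
        sleep 10
        modprobe xgene_enet
        sleep 10
        n=$(dmesg | count_kernel_warnings)
        if [ "$n" -gt 0 ]; then
            echo "cycle $((i + 1)): $n warning(s) in dmesg" >&2
            return 1
        fi
        i=$((i + 1))
    done
    echo "no warnings after $cycles cycles"
}
```

Running `reload_and_check 10` would correspond to the ten-iteration loop above; unlike `grep -i warn`, the narrower pattern ignores unrelated lines that merely contain the word "warn".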

Comment 13 Jeff Bastian 2016-12-12 16:54:11 UTC
Closing since RHEL 7.3 has shipped.  (This BZ was not attached to Errata Tool so it did not get closed automatically.)