
Bug 1506230

Summary: nettop.stp does not collect receive data on kernel-3.10.0-514.el7 and higher
Product: Red Hat Enterprise Linux 7
Reporter: Lukas Herbolt <lherbolt>
Component: systemtap
Assignee: Frank Ch. Eigler <fche>
Status: CLOSED ERRATA
QA Contact: Martin Cermak <mcermak>
Severity: medium
Docs Contact: Vladimír Slávik <vslavik>
Priority: unspecified
Version: 7.4
CC: chorn, dsmith, fj-lsoft-kernel-it, fj-lsoft-rh-dump, jistone, lberk, mbenitez, mcermak, mjw, pasik, vslavik
Target Milestone: rc
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version: systemtap-3.2-3.el7
Doc Type: If docs needed, set a value
Doc Text: see https://bugzilla.redhat.com/show_bug.cgi?id=1473722
Story Points: ---
Clone Of:
: 1546179 (view as bug list)
Environment:
Last Closed: 2018-04-10 16:32:40 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1459581, 1522983
Attachments:
Description                        Flags
netif_receive_skb.stp              none
netif_receive_skb_internal.stp     none
systemtap fix for the issue        none

Description Lukas Herbolt 2017-10-25 13:06:36 UTC
Created attachment 1343237 [details]
netif_receive_skb.stp

Description of problem:
Running the example script nettop.stp from
/usr/share/systemtap/examples/network/nettop.stp does not collect any RECV_PK or
RECV_KB data.

# stap /usr/share/systemtap/examples/network/nettop.stp

 PID   UID DEV     XMIT_PK RECV_PK XMIT_KB RECV_KB COMMAND        
 2058     0 eth0      20026       0    1419       0 sshd           
    0     0 eth0       4068       0     263       0 swapper/0      
 2061     0 eth0        234       0      15       0 scp            


Version-Release number of selected component (if applicable):
This has been happening since kernel-3.10.0-514.el7.
Looking at the code, it seems to be caused by this commit:

RH commit: 988b8276bc4cf6cd653a9633ecdc4514e2ab7f44 
Upstream:  ae78dbfa40c629f79c72ab93525508ef49e798b6

The commit changes netif_receive_skb(), which now calls netif_receive_skb_internal().

The systemtap script uses probe 'netdev.receive' which is defined as:
probe netdev.receive =
        kernel.function("netif_receive_skb")

in /usr/share/systemtap/tapset/linux/networking.stp. 
I was running live crash on the latest kernel:

crash> sys | grep REL 
     RELEASE: 3.10.0-693.5.2.el7.x86_64
crash> dis netif_receive_skb | head -5
0xffffffff81587290 <netif_receive_skb>:	callq  0xffffffff816b6df0 <ftrace_regs_caller>
0xffffffff81587295 <netif_receive_skb+5>:	push   %rbp
0xffffffff81587296 <netif_receive_skb+6>:	mov    %rsp,%rbp
0xffffffff81587299 <netif_receive_skb+9>:	push   %r12
0xffffffff8158729b <netif_receive_skb+11>:	mov    %rdi,%r12

Although I can see that the stap module is bound there, I am getting no output.
The stap script is a very simple one and is attached as:
 - netif_receive_skb.stp

A slightly modified version of it, hooked to netif_receive_skb_internal,
works like a charm. That script is attached as well, as netif_receive_skb_internal.stp.
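
One way to compare the two entry points side by side is to count hits on both functions in a single script. The following is a minimal sketch and is not one of the attached scripts; the global name hits and the 5-second interval are assumptions made for illustration. The "?" suffix marks a probe point as optional, so the script should still compile on kernels that do not have netif_receive_skb_internal().

```systemtap
# Hypothetical comparison script (not one of the attachments).
# Counts how often each receive entry point is actually called.
global hits

probe kernel.function("netif_receive_skb") ?,
      kernel.function("netif_receive_skb_internal") ?
{
    # ppfunc() returns the name of the probed function
    hits[ppfunc()]++
}

# Print and reset the counters every 5 seconds
probe timer.s(5)
{
    printf("%-32s %8s\n", "FUNCTION", "CALLS")
    foreach (fn in hits)
        printf("%-32s %8d\n", fn, hits[fn])
    printf("\n")
    delete hits
}
```

Run as root while the host is receiving traffic (for example during an scp transfer). If the analysis in this report holds, the netif_receive_skb row would be expected to stay at or near zero on affected kernels while the netif_receive_skb_internal row grows.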

How reproducible:
Always. Just run stap /usr/share/systemtap/examples/network/nettop.stp on any
RHEL 7.3 or later kernel.


Actual results:
No data is collected in the RECV_PK or RECV_KB columns.


Expected results:
Data is collected in the RECV_PK and RECV_KB columns.

Additional info:
I am attaching a patch that fixes the issue from the systemtap point of view.
However, I am more interested in why the probe does not fire, since I can see in the disassembly that it is hooked.

Comment 2 Lukas Herbolt 2017-10-25 13:07:01 UTC
Created attachment 1343238 [details]
netif_receive_skb_internal.stp

Comment 3 Lukas Herbolt 2017-10-25 13:07:32 UTC
Created attachment 1343239 [details]
systemtap fix for the issue

Comment 4 David Smith 2017-10-25 20:50:07 UTC
(In reply to Lukas Herbolt from comment #0)
> Created attachment 1343237 [details]
> netif_receive_skb.stp
> 
> Description of problem:
> Running an example script nettop.stp from
> /usr/share/systemtap/examples/network/nettop.stp does not collect any
> RECV_PK or 
> RECV_KB.
> 
> # stap /usr/share/systemtap/examples/network/nettop.stp
> 
>  PID   UID DEV     XMIT_PK RECV_PK XMIT_KB RECV_KB COMMAND        
>  2058     0 eth0      20026       0    1419       0 sshd           
>     0     0 eth0       4068       0     263       0 swapper/0      
>  2061     0 eth0        234       0      15       0 scp            
> 
> 
> Version-Release number of selected component (if applicable):
> This is happening since: kernel-3.10.0-514.el7
> Actually looking into code it seems to be this commit:
> 
> RH commit: 988b8276bc4cf6cd653a9633ecdc4514e2ab7f44 
> Upstream:  ae78dbfa40c629f79c72ab93525508ef49e798b6
> 
> There is a change in the netif_receive_skb() which is now calling
> netif_receive_skb_internal()

Excellent debugging work here. Those commits do appear to be the problem.

... stuff deleted ...

> Additional info:
> Attaching patch which will actually fix the issue from systemtap point of
> view.
> But I am more interested why the probe does not work as I can see it is
> hooked in the disassembly.

The netdev.receive probe hooks netif_receive_skb(). Before those commits, all the traffic received by the kernel went through netif_receive_skb(). After those commits you referenced, that is no longer the case. So, even though that probe is placed at that function, the function is no longer called as much (probably only by drivers in a loadable module).

I checked in your patch upstream as commit 94b3978aa:

https://sourceware.org/git/gitweb.cgi?p=systemtap.git;a=commit;h=94b3978aa1d01f09b29dbc2d61e1a2bddec313df

This patch will need to be backported.
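
For readers who cannot update the tapset yet, the general approach can be sketched in a standalone script. This is an illustration of the technique under stated assumptions, not a copy of commit 94b3978aa: the alias name recv_path and the global rx_bytes are hypothetical. The "!" suffix marks the first probe point as optional and sufficient, so the older function is only probed when netif_receive_skb_internal() cannot be resolved.

```systemtap
# Hypothetical local alias (names are illustrative, not from the tapset).
# Prefer the internal function; fall back to the old entry point
# only on kernels where the internal one does not exist.
probe recv_path =
        kernel.function("netif_receive_skb_internal") !,
        kernel.function("netif_receive_skb")
{
    # Both functions take a struct sk_buff *skb argument
    length = $skb->len
}

global rx_bytes

probe recv_path
{
    rx_bytes <<< length
}

# Report packet and byte totals every 5 seconds
probe timer.s(5)
{
    printf("packets=%d bytes=%d\n", @count(rx_bytes), @sum(rx_bytes))
    delete rx_bytes
}
```

Probing only one of the two functions avoids double counting on kernels where netif_receive_skb() simply forwards to netif_receive_skb_internal().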

Comment 5 Frank Ch. Eigler 2017-11-29 02:00:19 UTC
*** Bug 1518462 has been marked as a duplicate of this bug. ***

Comment 13 errata-xmlrpc 2018-04-10 16:32:40 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:0906