Description of problem: The sctp assoc array appears to be missing a few remaining pieces of data. This bug is to track them and fix them up.
> Neil Horman wrote:
>
> >On Mon, Feb 25, 2008 at 04:16:15PM +0100, Jan Safranek wrote:
> >>I'm testing 2.6.18-81.el5.bz277111 kernel and I am a bit confused,
> >>mainly because missing documentation. At first, the header and content
> >>of /proc/net/sctp/assocs does not match. Header shows, that there are
> >>some parameters after remote addresses (HBINT INS OUTS MAXRT T1X T2X
> >>RTXC), but in the body I can see no parameter after last remote address.
> >>I can only guess the order of values of these parameters in the file body.
> >>
> >Thats odd, I tested this and was able to see all the appropriate values in
> >there. Are you using an official build, or your own kernel?
>
> My own, created from cvs co -r private-nhorman-bz277111-branch && make
> test-srpm && brew build --scratch. With one SCTP association (over
> loopback), cat /proc/net/sctp/assoc shows:
>
> ASSOC SOCK STY SST ST HBKT ASSOC-ID TX_QUEUE RX_QUEUE UID INODE
> LPORT RPORT LADDRS <-> RADDRS HBINT INS OUTS MAXRT T1X T2X RTXC
> ffff810003026000 ffff810003662080 0 7 4 777 1 63488 0
> 0 9287 54979 12345 30000 10 10 10 0 0 0
> *127.0.0.1 <-> *127.0.0.1 192.168.100.105
> ffff81000303c000 ffff810003663a00 0 10 4 3861 2 63488 0
> 0 9200 12345 54979 30000 10 10 10 0 0 0
> *127.0.0.1 192.168.100.105 <-> *127.0.0.1
>
> -> header does not match the content.
>

Crap, I think I see the problem. I have a ton of other stuff to do at the moment, but I'll get to fixing this in short order. Basically the new values are in there, they're just in there before the LADDRS <-> RADDRS stuff. The order is off, but the content is there. You should be able to make that switch in net-snmp until I get to fixing it.

>
> >>>sctpAssocRemAddrRtx is the same as sctpAssocRtxChunks
> >>I thought that sctp chunks are sent only to one remote address and
> >>resent to another one only if the previous addr. times out. -> each
> >>RemAddr can have different value of this counter. Or do we always send
> >>all chunks to all remote addresses? And isn't it possible that the
> >>remote host can add/remove addresses dynamically?
> >>
> >
> >Hmm, you're right about that. For some reason it didn't register with me
> >that this is a per-remote-address table and the numbers wouldn't be in
> >aggregate.
>
> Are you going to fix it? I would be very interested in /proc file format
> then (implementation can wait at the moment, RHEL 5.3 is far away).
>

I'd like to, but I don't have it on my schedule yet. Like you say, 5.3 is still a bit distant.
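For anyone who needs the interim workaround on the parsing side, here is a rough sketch of reading a row in the misordered layout, where the seven new values sit before LADDRS <-> RADDRS. The column names come from the header quoted above; the function name parse_misordered_row is made up for illustration, and the assumption that the seven values appear in header order (HBINT INS OUTS MAXRT T1X T2X RTXC) is only a guess that happens to fit the sample output.

```python
# Columns that precede the addresses in the header quoted in this comment.
FIXED = ["ASSOC", "SOCK", "STY", "SST", "ST", "HBKT", "ASSOC-ID",
         "TX_QUEUE", "RX_QUEUE", "UID", "INODE", "LPORT", "RPORT"]
# New columns; ASSUMPTION: the body prints them in header order,
# but before the address lists instead of after them.
EXTRA = ["HBINT", "INS", "OUTS", "MAXRT", "T1X", "T2X", "RTXC"]


def parse_misordered_row(line):
    """Parse one body row of the misordered /proc/net/sctp/assocs layout."""
    left, right = line.split("<->")
    tokens = left.split()
    n = len(FIXED) + len(EXTRA)
    row = dict(zip(FIXED + EXTRA, tokens[:n]))
    # Whatever is left before the arrow is the local address list.
    row["LADDRS"] = tokens[n:]
    row["RADDRS"] = right.split()
    return row


sample = ("ffff810003026000 ffff810003662080 0 7 4 777 1 63488 0 "
          "0 9287 54979 12345 30000 10 10 10 0 0 0 "
          "*127.0.0.1 <-> *127.0.0.1 192.168.100.105")
row = parse_misordered_row(sample)
print(row["HBINT"], row["LADDRS"], row["RADDRS"])
```

Splitting on "<->" first keeps the two address lists apart regardless of how many addresses each side has.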
Created attachment 297982 [details]
additional patch to fix header/data agreement

Hey, can you test out this patch please? It should fix the data in /proc/net/sctp/assoc so the layout agrees with the header. Thanks!
Oh, if it wasn't clear previously, my patch in comment #2 should be applied on top of my previous patch, not instead of it :) Thanks
The header now matches the file body, thanks. It would be nice if there were some separator between the last remote address and the HBINT value (just to prevent possible parse bugs when parsing the line from the back...). But that's just a "Christmas wish", not a requirement.

Still, I need these values separately for each remote address (* denotes missing items):

- assoc. id
- sctpAssocRemAddrHBActive (boolean; heartbeat check is activated or not on this remote addr.), it's always true in RHEL 5.3 and upstream kernel.
* sctpAssocRemAddrRTO (integer, current T3-rtx timer) - probably can be computed from /proc/sys/net/sctp/rto_initial and sctpAssocRemAddrRtx, but it would be nice to have it separately, just in case the RTO computation changes in kernel.
*? sctpAssocRemAddrMaxPathRtx (integer, max. number of DATA chunk retransmissions allowed to a remote IP address before it is considered inactive) - ABI breaker for RHEL 5.3, the value of the association's MAXRT can be used instead.
* sctpAssocRemAddrRtx (integer, nr. of DATA chunk retransmissions to this specific IP address)
- sctpAssocRemAddrStartTime - can be computed by net-snmp with some difficulties (ugly lookups in tables indexed by ip-address etc... but that's my problem).
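To illustrate the "parsing from the back" concern: with no separator, the only way to tell remote addresses from the trailing counters is to count seven tokens from the end of the line. A small sketch of that, assuming the corrected layout from the header (RADDRS followed by HBINT INS OUTS MAXRT T1X T2X RTXC); the function name split_remote_side is hypothetical and the example input is just the earlier sample rearranged by hand, not real output.

```python
# Counters that follow the remote addresses in the corrected layout.
TRAILING = ["HBINT", "INS", "OUTS", "MAXRT", "T1X", "T2X", "RTXC"]


def split_remote_side(right_of_arrow):
    """Split the text after '<->' into remote addresses and counters.

    The number of remote addresses varies per association, so the
    counters are located by counting tokens from the end of the line.
    """
    tokens = right_of_arrow.split()
    if len(tokens) <= len(TRAILING):
        raise ValueError("no remote address before the trailing counters")
    raddrs = tokens[:-len(TRAILING)]
    counters = dict(zip(TRAILING, map(int, tokens[-len(TRAILING):])))
    return raddrs, counters


# Hand-rearranged example, NOT real kernel output.
raddrs, counters = split_remote_side(
    " *127.0.0.1 192.168.100.105 30000 10 10 10 0 0 0")
print(raddrs, counters["HBINT"])
```

This breaks silently if a later kernel appends another column, which is the kind of parse bug an explicit separator would prevent.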
yes, I'm aware of those and am getting to them. Thanks!
pushed the formatting fix upstream. I'll post all the
sorry, finger check. Meant to say I'll post this and the rest of the fix internally during the 5.3 devel cycle
I've sent a patch upstream for review.
The patch has been accepted into Vlad's and DaveM's trees. I'll backport shortly.
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
Looking at the kernel-2_6_18-92_el5_bz435110 branch in CVS, the data for sctpAssocRemAddrTable (in /proc/net/sctp/remaddr) look OK. The only thing I don't understand is the START column: it's always zero, even if the association is in the established state. Anyway, as usual, I can poll remaddr periodically and reconstruct the start time manually in net-snmp (with ~15-30 sec precision).

What troubles me more is that some values disappeared from /proc/net/sctp/assocs. Here is the list of values I need. I think the previous version of the kernel *had* the values there... where did they go?

sctpAssocHeartBeatInterval - The current heartbeat interval (I am not sure, is it the HBKT column?)
sctpAssocInStreams - nr. of Inbound Streams according to the negotiation at association start up.
sctpAssocOutStreams - nr. of Outbound Streams according to the negotiation at association start up.
sctpAssocMaxRetr - The maximum number of data retransmissions in the association context.
sctpAssocT1expireds - This object reflects the number of times the T1 timer expires without having received the acknowledgement.
sctpAssocT2expireds - Every DATA chunk that was included in the SCTP packet that triggered the T3-rtx timer must be added to the value of this counter (this can probably be reconstructed from /proc/net/sctp/remaddr by summing the values in REM_ADDR_RTX).
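A rough sketch of the polling workaround for the start time: remember when each (ASSOC_ID, ADDR) pair was first seen in a poll and forget pairs that have gone away. The class and method names here are invented for illustration and are not net-snmp code; the precision is bounded by the poll interval, as noted above.

```python
import time


class StartTimeTracker:
    """Approximate per-remote-address start times from periodic polls."""

    def __init__(self, clock=time.time):
        self._clock = clock          # injectable so it can be tested
        self._first_seen = {}        # (assoc_id, addr) -> timestamp

    def update(self, keys):
        """Feed the (assoc_id, addr) pairs seen in the current poll."""
        now = self._clock()
        current = set(keys)
        # Drop pairs that disappeared since the previous poll.
        for gone in set(self._first_seen) - current:
            del self._first_seen[gone]
        # Record the first time each new pair shows up.
        for key in current:
            self._first_seen.setdefault(key, now)
        return dict(self._first_seen)


tracker = StartTimeTracker()
print(tracker.update([("2", "192.168.100.65")]))
```

A pair that disappears and comes back between two polls keeps its old timestamp, so this is only an approximation.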
The START column is always zero because we don't record start times yet, but we wanted a placeholder there so we didn't have to change the file format later. As for the remaining values, they're still there from what I can see. Is net-snmp not parsing the file properly?
I checked out your kernel-2_6_18-92_el5_bz435110 branch today. If I open one association, each end with two addresses, I can see the following in /proc/net/sctp/:

/proc/net/sctp/assoc:
ASSOC SOCK STY SST ST HBKT ASSOC-ID TX_QUEUE RX_QUEUE UID INODE LPORT RPORT LADDRS <-> RADDRS
ffff810003fc6000 ffff810003f1ba00 0 10 4 2314 2 116488 0 0 9716 12345 12346 *192.168.100.65 192.168.100.131 <-> *192.168.100.65 192.168.100.131
ffff810003038000 ffff810003f1a080 0 7 4 2569 1 63488 141089 0 9729 12346 12345 *192.168.100.65 192.168.100.131 <-> *192.168.100.65 192.168.100.131

/proc/net/sctp/eps:
ENDPT SOCK STY SST HBKT LPORT UID INODE LADDRS
ffff81000767cc00 ffff810003f1ba00 0 10 57 12345 0 9716 0.0.0.0

/proc/net/sctp/remaddr:
ADDR ASSOC_ID HB_ACT RTO MAX_PATH_RTX REM_ADDR_RTX START
192.168.100.65 2 1 3000 5 0 0
192.168.100.131 2 1 1000 5 0 0
192.168.100.65 1 1 1000 5 0 0
192.168.100.131 1 1 1000 5 0 0

And I cannot see the values I want from comment #12. You can check the kernel at http://brewweb.devel.redhat.com/brew/taskinfo?taskID=1391580
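For reference, a short sketch of how the remaddr table above could be read and the per-association retransmission total derived by summing REM_ADDR_RTX. The function names are made up for illustration, and whether that sum really matches the association-level counter is an assumption, not something confirmed in this bug.

```python
from collections import defaultdict

# The remaddr output pasted in this comment.
REMADDR_SAMPLE = """\
ADDR ASSOC_ID HB_ACT RTO MAX_PATH_RTX REM_ADDR_RTX START
192.168.100.65 2 1 3000 5 0 0
192.168.100.131 2 1 1000 5 0 0
192.168.100.65 1 1 1000 5 0 0
192.168.100.131 1 1 1000 5 0 0
"""


def parse_remaddr(text):
    """Turn the whitespace-separated table into a list of dicts."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = lines[0].split()
    return [dict(zip(header, ln.split())) for ln in lines[1:]]


def rtx_per_assoc(rows):
    """Sum REM_ADDR_RTX over all remote addresses of each association."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["ASSOC_ID"]] += int(row["REM_ADDR_RTX"])
    return dict(totals)


rows = parse_remaddr(REMADDR_SAMPLE)
print(rtx_per_assoc(rows))
```

On a live system the text would come from reading /proc/net/sctp/remaddr instead of the embedded sample.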
Dangit! The patch didn't include the association updates. I'll open a bug for it and pull them back in. For now just grab the patch from upstream and build your own kernel.
scratch my last comment, I did the stats you were looking for as part of bz 277111. Just looks like Don hasn't incorporated this into the kernel yet. You can get the patch from that bug and include it manually.
in kernel-2.6.18-99.el5

You can download this test kernel from http://people.redhat.com/dzickus/el5
This bug has been marked for inclusion in the Red Hat Enterprise Linux 5.3 Release Notes. To aid in the development of relevant and accurate release notes, please fill out the "Release Notes" field above with the following 4 pieces of information:

Cause: What actions or circumstances cause this bug to present.
Consequence: What happens when the bug presents.
Fix: What was done to fix the bug.
Result: What now happens when the actions or circumstances above occur. (NB: this is not the same as 'the bug doesn't present anymore')
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-0225.html