The SWsoft Virtuozzo/OpenVZ Linux Kernel Team has noticed that the output of "ip route list table local" may omit existing routes, which confuses scripts that parse it. This is a kernel issue, and it was fixed in mainline by the following patch:

http://linux.bkbits.net:8080/linux-2.6/gnupatch@42221b0brjyLy6saMQrizii8Jx5kxQ

# ChangeSet
#   2005/02/27 11:10:03-08:00 davem.net
#   [IPV4]: Fix lost routes in fn_hash netlink dumps.
#
#   Spotted by itkes.msu.ru, the fn_hash_dump_bucket() main
#   loop does not increment 'i' properly, and thus routes will not
#   be listed, when the test 'i < s_i' passes.
#
#   The bug was added when the code was converted over to
#   hlist_for_each_entry() by your's truly.
#
#   Signed-off-by: David S. Miller <davem>

Version-Release number of selected component (if applicable):
2.6.9-42.0.3.EL

Steps to Reproduce:
Add new entries to the local routing table until the issue occurs.

[root@svconsole ~]# uname -a
Linux svconsole.sw.ru 2.6.9-42.0.3.ELsmp #1 SMP Mon Sep 25 17:28:02 EDT 2006 i686 i686 i386 GNU/Linux
[root@svconsole ~]# ip route list table local | wc -l
7

VVS@ comment: add 100 new entries into local table and inspect the number of
VVS@ comment: entries

[root@svconsole ~]# for i in `seq 1 100` ; do ip r a local 10.1.240.$i dev eth0 proto kernel scope host src 10.1.240.$i ; done
[root@svconsole ~]# ip route list table local | wc -l
107

VVS@ comment: correct, as expected

[root@svconsole ~]# for i in `seq 1 100` ; do ip r a local 10.2.240.$i dev eth0 proto kernel scope host src 10.2.240.$i || echo "ERROR" ; done
[root@svconsole ~]# ip route list table local | wc -l
207

VVS@ comment: correct again

[root@svconsole ~]# for i in `seq 1 100` ; do ip r a local 10.3.240.$i dev eth0 proto kernel scope host src 10.3.240.$i || echo "ERROR" ; done
[root@svconsole ~]# ip route list table local | wc -l
304

VVS@ comment: wrong, should be 307 here!
see: http://marc2.theaimsgroup.com/?l=git-commits-head&m=110955280729636&w=4
The term "lost" is a bit misleading: none of the routes disappear or get lost. The kernel fails to create multiple netlink messages to dump routes if the total number of routes in a table exceeds what can be carried in a single netlink message. The netlink message size is limited by PAGE_SIZE, so results will differ across architectures.
That's not what's happening here. If you look at the patch, the issue is simply that the dumping code does not advance the iterator state properly.
I thought that's exactly what I wrote :-)
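For readers following the exchange above, here is a minimal user-space sketch of the failure mode described in the changelog: a dump that spans several messages resumes by skipping the entries already sent, and the skip path forgets to increment 'i'. This is not the kernel code. NUM_ROUTES, MSG_CAPACITY, dump_pass() and dump_all() are hypothetical names chosen for illustration, and a plain counter stands in for the hash bucket walk. Only the 'i < s_i' test is taken from the changelog.

```c
#include <stdio.h>

#define NUM_ROUTES   10  /* hypothetical number of entries in one bucket */
#define MSG_CAPACITY  4  /* hypothetical number of entries per message */

/*
 * One dump pass, modelled loosely on the loop described in the changelog.
 * *resume plays the role of the saved position (s_i); it is updated with
 * the position at which the next pass should continue.
 * Returns the number of entries emitted in this pass.
 */
static int dump_pass(int *resume, int fixed)
{
    int s_i = *resume;
    int i = 0;
    int emitted = 0;

    for (int entry = 0; entry < NUM_ROUTES; entry++) {
        if (i < s_i) {
            /* Entry was sent in an earlier pass: skip it. */
            if (fixed)
                i++;        /* fixed variant: still count the entry */
            continue;       /* buggy variant: 'i' never advances */
        }
        if (emitted == MSG_CAPACITY) {
            /* Message is full: remember where to resume. */
            *resume = i;
            return emitted;
        }
        emitted++;
        i++;
    }
    *resume = i;
    return emitted;
}

/* Repeat passes until one emits nothing; return the total emitted. */
int dump_all(int fixed)
{
    int resume = 0;
    int total = 0;
    int n;

    do {
        n = dump_pass(&resume, fixed);
        total += n;
    } while (n > 0);

    return total;
}
```

In this sketch dump_all(0) returns 4 and dump_all(1) returns 10. In the buggy variant the first pass fills one message and the second pass skips every entry because 'i' stays below 's_i', so the remaining entries are never listed even though they are still in the table.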
Patch looks good to me. Thanks!
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
Created attachment 149289: backport of the patch
This request was evaluated by Red Hat Kernel Team for inclusion in a Red Hat Enterprise Linux maintenance release, and has moved to bugzilla status POST.
Committed in stream U6 build 55.4. A test kernel with this patch is available from http://people.redhat.com/~jbaron/rhel4/
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2007-0791.html