Recently the Parallels Virtuozzo Containers team found a kernel bug that leads to a node freeze.

We traced the root of the problem to linux-2.6-net-allow-for-on-demand-emergency-route-cache-flushing.patch, added in 2.6.18-164.x kernels (I have checked 164.6.1.el5; it still contains the bug).

The call trace of the deadlocked process:

...
rt_intern_hash()
  spin_lock_bh(rt_hash_lock_addr(hash));
  rt_emergency_hash_rebuild()
    rt_secret_rebuild_oneshot()
      rt_cache_flush()
        rt_run_flush()
          for (i = rt_hash_mask; i >= 0; i--) {
              spin_lock_bh(rt_hash_lock_addr(i));

The loop in rt_run_flush() eventually reaches the hash chain lock that rt_intern_hash() already holds, so the process deadlocks on itself. Because bh is disabled, the whole node quickly freezes.

We believe this is a security issue, because it is a remote DoS.

Original raw call trace:

Route hash chain too long!
Adjust your secret_interval!
SysRq HELP : loglevel0-8 vsced_stAte reBoot Crashdump tErm Full debuG kIll thaw-filesystems(J) saK showMem Nice powerOff showPc unRaw Sync showTasks Unmount shoWcpus
SysRq: Show Regs
----------- IPI show regs -----------
Pid: 23107, comm: dcpumon, oopses: 0
EIP: 0060:[<0261abf6>] CPU: 3, VCPU: 9665:2
EIP is at _spin_lock_bh+0xf/0x18
EFLAGS: 00000282 Tainted: P (2.6.18-028stab066.8-ent #1 028stab066)
EAX: c00bc000 EBX: d9f25950 ECX: 00000088 EDX: c00bc000
ESI: 00000000 EDI: 00000000 EBP: 00001950 DS: 007b ES: 007b
CR0: 80050033 CR2: b7dfa03c CR3: 00799000 CR4: 000006f0
 [<025d0da4>] rt_run_flush+0x74/0x123
 [<025d1bd9>] rt_intern_hash+0x4a4/0x6f2
 [<025d26bf>] __ip_route_output_key+0x898/0x909
 [<025d2743>] ip_route_output_flow+0x13/0x24f
 [<025de3b9>] inet_csk_route_req+0xb8/0x1c7
 [<025edc9a>] tcp_v4_send_synack+0x17/0x169
 [<025f072d>] tcp_v4_conn_request+0x4cc/0x517
 [<0261ab57>] _read_lock_bh+0x8/0x17
 [<025e7b52>] tcp_rcv_state_process+0x5d/0xd66
 [<025eebb1>] tcp_v4_do_rcv+0x2c3/0x323
 [<025f1781>] tcp_v4_rcv+0x95e/0x9b1
 [<025d60e5>] ip_local_deliver+0x1a6/0x26d
 [<025d5efc>] ip_rcv+0x505/0x548
 [<025b8b7e>] netif_receive_skb+0x3b2/0x3f0
 [<025ba6f7>] process_backlog+0x7e/0xd1
 [<025ba8c6>] net_rx_action+0x9c/0x1a7
 [<024275c4>] __do_softirq+0xe0/0x193
 [<024276ad>] do_softirq+0x36/0x3a
 [<024277c8>] local_bh_enable+0x74/0x7f
 [<025baf37>] dev_queue_xmit+0x34f/0x380
 [<025bf6f4>] neigh_resolve_output+0x1a7/0x1d1
 [<025db407>] ip_output+0x299/0x2c2
 [<025daa33>] ip_queue_xmit+0x4cf/0x515
 [<025d26bf>] __ip_route_output_key+0x898/0x909
 [<025ef380>] tcp_v4_send_check+0x77/0xbd
 [<025e9da6>] tcp_transmit_skb+0x687/0x6b5
 [<025ec3e2>] tcp_connect+0x341/0x3c1
 [<025f0040>] tcp_v4_connect+0x708/0x89f
 [<024690a1>] kunmap_high+0x14/0x84
 [<025fad1d>] inet_stream_connect+0x7d/0x208
 [<02475c7a>] get_user_size+0x30/0x57
 [<025afa9e>] sys_connect+0x6d/0x98
 [<0246cfea>] follow_page_pte+0x173/0x195
 [<02475997>] __kmap_atomic_user_page+0xdb/0xf3
 [<024758a8>] __kunmap_atomic_user_page+0x10/0x24
 [<02475c2f>] rw_vm+0x10d/0x128
 [<025afbe6>] sys_socketcall+0x8e/0x1af
 [<02407400>] do_syscall_trace+0xab/0xb1
 =======================
----------- IPI show regs -----------

--
Best regards,
Konstantin Khorenko,
PVC/OpenVZ developer, Parallels
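The self-deadlock described in this report can be illustrated in user space. The following is only a rough sketch, not kernel code: it assumes POSIX threads, and the names init_chain_locks(), flush_all_chains(), intern_hash() and NUM_CHAINS are hypothetical stand-ins for the kernel's per-chain spinlocks, rt_run_flush(), rt_intern_hash() and rt_hash_mask + 1. It uses error-checking mutexes so that the second lock attempt returns EDEADLK instead of hanging, whereas spin_lock_bh() in the kernel is not recursive and would simply spin forever.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <pthread.h>

#define NUM_CHAINS 8 /* hypothetical stand-in for rt_hash_mask + 1 */

static pthread_mutex_t chain_lock[NUM_CHAINS];

/* Set up every per-chain lock as an error-checking mutex, so that a second
 * lock attempt by the owning thread reports EDEADLK rather than blocking. */
void init_chain_locks(void)
{
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    assert(rc == 0);
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    assert(rc == 0);
    for (int i = 0; i < NUM_CHAINS; i++) {
        rc = pthread_mutex_init(&chain_lock[i], &attr);
        assert(rc == 0);
    }
    pthread_mutexattr_destroy(&attr);
}

/* Analogue of rt_run_flush(): walk all chains from the top down and take
 * each chain lock in turn. Returns 0 on success; otherwise returns the
 * error code and stores the offending chain index in *failed_chain. */
int flush_all_chains(int *failed_chain)
{
    for (int i = NUM_CHAINS - 1; i >= 0; i--) {
        int rc = pthread_mutex_lock(&chain_lock[i]);
        if (rc != 0) {
            *failed_chain = i;
            return rc;
        }
        /* ... the entries on chain i would be released here ... */
        pthread_mutex_unlock(&chain_lock[i]);
    }
    *failed_chain = -1;
    return 0;
}

/* Analogue of rt_intern_hash(): take the lock for one chain and, if the
 * chain is considered too long, trigger a full flush while still holding
 * that lock. This is the problematic pattern from the report. */
int intern_hash(int hash, int chain_too_long, int *failed_chain)
{
    int rc = pthread_mutex_lock(&chain_lock[hash]);
    if (rc != 0) {
        *failed_chain = hash;
        return rc;
    }
    *failed_chain = -1;
    if (chain_too_long)
        rc = flush_all_chains(failed_chain); /* re-locks chain_lock[hash] */
    pthread_mutex_unlock(&chain_lock[hash]);
    return rc;
}
```

After init_chain_locks(), a call such as intern_hash(3, 1, &failed) returns EDEADLK with failed set to 3: the flush loop passes chains 7 down to 4 and then tries to take the lock that the caller still holds. With chain_too_long set to 0 the same call returns 0.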
Created attachment 377148 [details]
initial revision of a patch

This is what a fix could look like. I have not tested it yet because I have not been able to reproduce the issue myself.

73e42897e8e5619eacb787d2ce69be12f47cfc21
b6280b47a7a42970d098a3059f4ebe7e55e90d8d
Created attachment 378505 [details]
additional patch to fix uninitialized rp issue

This is a follow-up patch, based on the upstream commits

73e42897e8e5619eacb787d2ce69be12f47cfc21
b6280b47a7a42970d098a3059f4ebe7e55e90d8d

to address the uninitialized rp issues when rt caching is disabled.
This issue has been addressed in the following products:

  Red Hat Enterprise Linux 5

Via RHSA-2010:0046 https://rhn.redhat.com/errata/RHSA-2010-0046.html
This issue has been addressed in the following products:

  Red Hat Enterprise Virtualization for RHEL-5

Via RHSA-2010:0095 https://rhn.redhat.com/errata/RHSA-2010-0095.html