Description of problem:
Loading and then unloading the sit module corrupts the network device list, and a later read of /proc/<pid>/net/dev causes an oops. The sit module calls list_netdevice(), which adds its device to the network list that this proc file walks, but unloading the module does not remove the device and leaves a freed item on the list. Reading the proc file then accesses that freed item, with undefined results.

Version-Release number of selected component (if applicable):

How reproducible:
Always.

Steps to Reproduce:
1. modprobe sit
2. rmmod sit   <<-- bug happens here
3. cat /proc/$$/net/dev

The bug occurs when the sit module is removed, but it does not cause an oops until something reads the freed memory on the network list.

Actual results:
oops

Expected results:
no oops

Additional info:
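To make the failure mode in the description concrete, here is a minimal userspace sketch in C. It is a model under stated assumptions, not kernel code: every name in it (struct listed_dev, model_list_netdevice() and so on) is hypothetical, and a `freed` flag stands in for the real free so that the sketch never touches freed memory itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only: these names mimic, but are not, the kernel API. */
struct listed_dev {
    const char *name;
    bool freed;                 /* stands in for the real free of the device */
    struct listed_dev *next;
};

struct model_net {
    struct listed_dev *head;    /* models the list that /proc/<pid>/net/dev walks */
};

/* Add a live device at the head of the list. */
void model_list_netdevice(struct model_net *net, struct listed_dev *dev)
{
    assert(!dev->freed);
    dev->next = net->head;
    net->head = dev;
}

/* Remove a device from the list if it is present. */
void model_unlist_netdevice(struct model_net *net, struct listed_dev *dev)
{
    struct listed_dev **pp;

    for (pp = &net->head; *pp; pp = &(*pp)->next) {
        if (*pp == dev) {
            *pp = dev->next;
            dev->next = NULL;
            return;
        }
    }
}

/* Buggy unload: the device is released but its list entry stays behind. */
void model_unload_buggy(struct model_net *net, struct listed_dev *dev)
{
    (void)net;
    dev->freed = true;
}

/* Correct unload: unlist first, then release. */
void model_unload_fixed(struct model_net *net, struct listed_dev *dev)
{
    model_unlist_netdevice(net, dev);
    dev->freed = true;
}

/* Count the entries a reader of the list would hit after they were released. */
int model_count_stale(const struct model_net *net)
{
    const struct listed_dev *d;
    int stale = 0;

    for (d = net->head; d; d = d->next)
        if (d->freed)
            stale++;
    return stale;
}
```

In this model the buggy unload leaves one stale entry for any later reader to trip over, which corresponds to step 3 of the reproducer; the fixed unload leaves none.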
Created attachment 856687 [details]
Temporary work around for sit module unload bug

We know what the bug is, but coming up with a correct solution is not trivial, so this patch is a simple workaround. The bug happens when we unload the sit module. The workaround is not to let the user unload the sit module :-)

The only problem with this "fix" is that on shutdown the following warnings will occur (only if the sit module was ever loaded):

Stopping Permit User Sessions...
[ 53.866774] ------------[ cut here ]------------
[ 53.871522] WARNING: at /home/rostedt/work/git/linux-mrg.git/fs/proc/generic.c:562 remove_proc_entry+0x1a7/0x1b0()
[ 53.881977] remove_proc_entry: removing non-empty directory 'net/dev_snmp6', leaking at least 'sit0'
[ 53.891188] Modules linked in: sit ip_tunnel tunnel4 bnep lockd bluetooth ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 nf_conntrack_ipv4 nf_defrag_ipv4 ip6table_filter(-) xt_state ip6_tables nf_conntrack snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm snd_page_alloc snd_timer tpm_infineon hp_wmi snd rfkill coretemp tpm_tis sparse_keymap tpm lpc_ich microcode wmi tpm_bios i2c_i801 mfd_core serio_raw pcspkr soundcore uinput e1000e ptp crc32c_intel i915 i2c_algo_bit drm_kms_helper drm pps_core i2c_core video sunrpc
[ 53.942194] CPU: 0 PID: 117 Comm: kworker/u16:5 Not tainted 3.10.27-test-mrg-rt25+ #37
[ 53.950168] Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v02.05 05/07/2012
[ 53.959182] Workqueue: netns cleanup_net
[ 53.963158] ffffffff819f2fb0 ffff88011193dc48 ffffffff81631db1 ffff88011193dc88
[ 53.963159] ffffffff8104b970 ffff88011193dc98 ffff880115269f00 ffffffff81a64eba
[ 53.963161] ffffffff81a64eba ffff880115269fa1 0000000000000000 ffff88011193dce8
[ 53.985484] Call Trace:
[ 53.987956] [<ffffffff81631db1>] dump_stack+0x19/0x1b
[ 53.987959] [<ffffffff8104b970>] warn_slowpath_common+0x70/0xa0
[ 53.987961] [<ffffffff8104ba56>] warn_slowpath_fmt+0x46/0x50
[ 53.987963] [<ffffffff811fdc07>] remove_proc_entry+0x1a7/0x1b0
[ 53.987964] [<ffffffff811fdb5c>] ? remove_proc_entry+0xfc/0x1b0
[ 53.987967] [<ffffffff816080f1>] ipv6_proc_exit_net+0x31/0x50
[ 53.987969] [<ffffffff81522718>] ops_exit_list.isra.5+0x38/0x60
[ 53.987970] [<ffffffff81522f70>] cleanup_net+0x100/0x1a0
[ 53.987972] [<ffffffff8106b8eb>] process_one_work+0x18b/0x530
[ 53.987974] [<ffffffff8106cc5d>] worker_thread+0x11d/0x360
[ 53.987976] [<ffffffff8106cb40>] ? manage_workers.isra.26+0x2f0/0x2f0
[ 53.987978] [<ffffffff81073602>] kthread+0xb2/0xc0
[ 53.987982] [<ffffffff81630000>] ? writenote+0x44/0xba
[ 53.987984] [<ffffffff81073550>] ? flush_kthread_worker+0xb0/0xb0
[ 53.987986] [<ffffffff8163fb1c>] ret_from_fork+0x7c/0xb0
[ 53.987988] [<ffffffff81073550>] ? flush_kthread_worker+0xb0/0xb0
[ 53.987989] ---[ end trace 0000000000000002 ]---
[  OK  ] Stopped Permit User Sessions.
[  OK  ] Stopped target Remote File Systems.
[ 54.094103] ------------[ cut here ]------------
[ 54.098981] WARNING: at /home/rostedt/work/git/linux-mrg.git/fs/proc/proc_sysctl.c:1585 retire_sysctl_set+0x23/0x30()
[ 54.109659] Modules linked in: sit ip_tunnel tunnel4 bnep lockd bluetooth ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 nf_conntrack_ipv4 nf_defrag_ipv4 ip6table_filter(-) xt_state ip6_tables nf_conntrack snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm snd_page_alloc snd_timer tpm_infineon hp_wmi snd rfkill coretemp tpm_tis sparse_keymap tpm lpc_ich microcode wmi tpm_bios i2c_i801 mfd_core serio_raw pcspkr soundcore uinput e1000e ptp crc32c_intel i915 i2c_algo_bit drm_kms_helper drm pps_core i2c_core video sunrpc
[ 54.160663] CPU: 6 PID: 117 Comm: kworker/u16:5 Tainted: G W 3.10.27-test-mrg-rt25+ #37
[ 54.170566] Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v02.05 05/07/2012
[ 54.180897] Workqueue: netns cleanup_net
[ 54.186205] ffffffff819f3520 ffff88011193dcf8 ffffffff81631db1 ffff88011193dd38
[ 54.186207] ffffffff8104b970 ffff88011eb94aa0 ffffffff81cc5238 ffffffff81cc5240
[ 54.186208] ffff8801131e8000 ffff88011193ddb8 0000000000000000 ffff88011193dd48
[ 54.211564] Call Trace:
[ 54.215433] [<ffffffff81631db1>] dump_stack+0x19/0x1b
[ 54.215435] [<ffffffff8104b970>] warn_slowpath_common+0x70/0xa0
[ 54.215437] [<ffffffff8104b9ba>] warn_slowpath_null+0x1a/0x20
[ 54.215438] [<ffffffff81204793>] retire_sysctl_set+0x23/0x30
[ 54.215440] [<ffffffff8161b255>] sysctl_net_exit+0x15/0x20
[ 54.215442] [<ffffffff81522718>] ops_exit_list.isra.5+0x38/0x60
[ 54.215443] [<ffffffff81522f70>] cleanup_net+0x100/0x1a0
[ 54.215445] [<ffffffff8106b8eb>] process_one_work+0x18b/0x530
[ 54.215447] [<ffffffff8106cc5d>] worker_thread+0x11d/0x360
[ 54.215448] [<ffffffff8106cb40>] ? manage_workers.isra.26+0x2f0/0x2f0
[ 54.215450] [<ffffffff81073602>] kthread+0xb2/0xc0
[ 54.215452] [<ffffffff81630000>] ? writenote+0x44/0xba
[ 54.215454] [<ffffffff81073550>] ? flush_kthread_worker+0xb0/0xb0
[ 54.215456] [<ffffffff8163fb1c>] ret_from_fork+0x7c/0xb0
[ 54.215458] [<ffffffff81073550>] ? flush_kthread_worker+0xb0/0xb0
[ 54.215459] ---[ end trace 0000000000000003 ]---
Oh, I forgot to show what the original bug looks like. This happens after sit has been loaded and unloaded and then /proc/<pid>/net/dev has been read.

BUG: unable to handle kernel paging request at ffffffffa0b109b8
IP: [<ffffffff81488ae3>] dev_get_stats+0x23/0xa0
PGD 1a0d067 PUD 1a0e063 PMD 12072a067 PTE 0
Oops: 0000 [#1] PREEMPT SMP
Modules linked in: ts_kmp nf_conntrack_ipv4 nf_defrag_ipv4 nls_koi8_u nls_cp932 rpcsec_gss_krb5 auth_rpcgss oid_registry nfsv4 nfsv3 nfs_acl nfsv2 nfs lockd sunrpc nfnetlink_queue nfnetlink bluetooth rfkill ipv6 gpio_ich iTCO_wdt iTCO_vendor_support coretemp microcode serio_raw pcspkr sg ipmi_si ipmi_msghandler hpilo hpwdt lpc_ich tg3 ptp pps_core be2net be2iscsi iscsi_boot_sysfs libiscsi scsi_transport_iscsi shpchp i3000_edac edac_core ext4 jbd2 mbcache sd_mod crc_t10dif pata_acpi ata_generic ata_piix radeon ttm drm_kms_helper drm hwmon i2c_algo_bit i2c_core dm_mirror dm_region_hash dm_log dm_mod [last unloaded: rmd128]
CPU: 1 PID: 21863 Comm: cat Not tainted 3.10.27-rt25.18.vanilla.el6rt.x86_64 #1
Hardware name: HP ProLiant DL320 G5 , BIOS W04 06/10/2008
task: ffff88011fffdc40 ti: ffff88011eae6000 task.ti: ffff88011eae6000
RIP: 0010:[<ffffffff81488ae3>] [<ffffffff81488ae3>] dev_get_stats+0x23/0xa0
RSP: 0018:ffff88011eae7cc8 EFLAGS: 00010282
RAX: ffffffff81697980 RBX: ffff88011eae7d58 RCX: ffff880121974cc0
RDX: ffffffffa0b10940 RSI: ffff88011eae7d58 RDI: ffff88011fe47000
RBP: ffff88011eae7cd8 R08: 0000000000000098 R09: 0000000000000001
R10: 0000000000000000 R11: 0000000000000009 R12: ffff88011fe47000
R13: ffff88011fe47000 R14: 00000000000002b9 R15: ffff88011eae7e78
FS: 00007f1548622700(0000) GS:ffff880127a40000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffa0b109b8 CR3: 000000011febe000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Stack:
 ffff88012132f080 ffff88011fe47000 ffff88011eae7e28 ffffffff814a4018
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
Call Trace:
 [<ffffffff814a4018>] dev_seq_printf_stats+0x28/0x100
 [<ffffffff814a4104>] dev_seq_show+0x14/0x30
 [<ffffffff8119d141>] seq_read+0x281/0x410
 [<ffffffff811e0963>] proc_reg_read+0x43/0x70
 [<ffffffff81179981>] vfs_read+0xb1/0x130
 [<ffffffff81179e1f>] SyS_read+0x5f/0xa0
 [<ffffffff81568a59>] system_call_fastpath+0x16/0x1b
Code: 0f 1f 84 00 00 00 00 00 55 48 89 e5 48 83 ec 10 48 89 1c 24 4c 89 64 24 08 66 66 66 66 90 48 8b 97 90 01 00 00 49 89 fc 48 89 f3 <48> 83 7a 78 00 74 36 b9 17 00 00 00 31 c0 48 89 f7 f3 48 ab 4c
RIP [<ffffffff81488ae3>] dev_get_stats+0x23/0xa0
 RSP <ffff88011eae7cc8>
CR2: ffffffffa0b109b8
---[ end trace b85560ffb583b1c2 ]---
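A note on reading the oops above, with a small check of the arithmetic. The marked bytes in the Code: line, <48> 83 7a 78 00, decode to cmpq $0x0,0x78(%rdx), and the preceding 48 8b 97 90 01 00 00 decodes to mov 0x190(%rdi),%rdx. So RDX was loaded from the device structure in RDI and then dereferenced at offset 0x78. That the loaded pointer is dev->netdev_ops and the tested field is a stats callback is my reading of dev_get_stats(), not something the oops itself states. RDX lies in what is normally the module mapping area on x86_64, which fits a pointer into the unloaded sit module. The sketch below only verifies that RDX plus the displacement equals CR2; the helper names and the module base constant are assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the oops above. */
#define OOPS_RDX UINT64_C(0xffffffffa0b10940)
#define OOPS_CR2 UINT64_C(0xffffffffa0b109b8)

/* Displacement of the faulting instruction, decoded from "48 83 7a 78 00". */
#define FAULT_DISP UINT64_C(0x78)

/* Assumed start of the x86_64 module mapping area on kernels of this era. */
#define ASSUMED_MODULES_BASE UINT64_C(0xffffffffa0000000)

/* Effective address of a base-plus-displacement memory operand. */
uint64_t effective_address(uint64_t base, uint64_t disp)
{
    assert(base <= UINT64_MAX - disp);   /* no wrap-around in this model */
    return base + disp;
}

/* Nonzero if addr is at or above the assumed module mapping base. */
int at_or_above_module_base(uint64_t addr)
{
    return addr >= ASSUMED_MODULES_BASE;
}
```

With the values from the oops, effective_address(OOPS_RDX, FAULT_DISP) equals OOPS_CR2, so the fault is the read through RDX and not some other access in dev_get_stats().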
Created attachment 856805 [details]
Unregister sit devices

I did a bit of reading of the change logs between 3.10 and 3.13, as well as of what we backported for the sit driver.

Commit 20300db1bd1b9f45477c7af15b358deb87742bd1
    sit: allow to use rtnl ops on fb tunnel
    [ Upstream commit 205983c43700ac3a81e7625273a3fa83cd2759b5 ]

has the following text:

    Now that it is done, we must remove the explicit call to
    unregister_netdevice_queue(), because the fallback tunnel is added to the
    queue in sit_destroy_tunnels() when checking rtnl_link_ops of all
    netdevices (this is valid since commit 5e6700b3bf98 ("sit: add support of
    x-netns")).

The thing is, we did not have commit 5e6700b3bf98 ("sit: add support of x-netns"). I took a look at that commit, and it does quite a bit more than what I wanted to backport. It has dependencies on other changes, like the introduction of skb_scrub_packet(), which means that backporting any of this extra stuff would very likely open up more bugs unless we backported everything from 3.13.

I read the code that was changed, as well as the comment about what was done in commit 5e6700b3bf98, and was able to narrow down the specific changes that were made. That commit did not follow the normal "one change per patch" rule but made several changes. I pulled out the unregister code that was added to sit_destroy_tunnels() in that commit and added it to our kernel. This change was unrelated to the other changes in that commit.

I booted and traced the kernel with this update and it looks like everything is being done correctly:

># trace-cmd start -p function -l '*registe^Cnetdev' -l '*list_netde*'
># modprobe sit
># cat /proc/$$/net/dev
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo:     276       4    0    0    0     0          0         0      276       4    0    0    0     0       0          0
  sit0:       0       0    0    0    0     0          0         0        0       0    0    0    0     0       0          0
   em1:   67490     265    0    0    0     0          0        12    19439     144    0    0    0     0       0          0
># rmmod sit
># cat /proc/$$/net/dev
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo:     276       4    0    0    0     0          0         0      276       4    0    0    0     0       0          0
   em1:   92520     342    0    0    0     0          0        12    22761     168    0    0    0     0       0          0
># trace-cmd show
# tracer: function
#
# entries-in-buffer/entries-written: 6/6   #P:8
#
#                              _-------=> irqs-off
#                             / _------=> need-resched
#                            |/ _-----=> need-resched_lazy
#                            ||/ _----=> hardirq/softirq
#                            |||/ _---=> preempt-depth
#                            ||||/ _--=> preempt-lazy-depth
#                            ||||| / _-=> migrate-disable
#                            |||||| /     delay
#           TASK-PID   CPU#  ||||||   TIMESTAMP  FUNCTION
#              | |       |   ||||||      |         |
        modprobe-1118  [006] .......    86.080708: register_netdev <-
        modprobe-1118  [006] .......    86.080880: list_netdevice <-register_netdevice
        modprobe-1118  [006] .......    86.081016: register_netdev <-
        modprobe-1118  [006] .......    86.081286: list_netdevice <-register_netdevice
           rmmod-1127  [007] .......   121.384299: unlist_netdevice <-rollback_registered_many
           rmmod-1127  [006] .......   121.391317: unlist_netdevice <-rollback_registered_many
fixed in 3.10.27-rt25.21.el6rt
tried reproducer, no oopses observed.

##################
[indy@dhcp-27-216 linux-review]$ check_commit_presence ~/rpmbuild/BUILD/kernel-3.10.33/linux-3.10.33.x86_64/ 1b2a58ff62aa3acc7c539b325972c9b94ce89c3d 20300db1bd1b9f45477c7af15b358deb87742bd1
Reverting 1b2a58ff62aa3acc7c539b325972c9b94ce89c3d (v3.10.29~19) ... Applied
Reverting 20300db1bd1b9f45477c7af15b358deb87742bd1 (v3.10.16~69) ... Applied
Restoring .. Done
2 patch(es) was found applied.

##################
[indy@dhcp-27-216 linux-review]$ git show 1b2a58ff62aa3acc7c539b325972c9b94ce89c3d 20300db1bd1b9f45477c7af15b358deb87742bd1
commit 1b2a58ff62aa3acc7c539b325972c9b94ce89c3d
Author: Nicolas Dichtel <nicolas.dichtel>
Date:   Fri Jan 31 09:24:04 2014 +0100

    sit: fix double free of fb_tunnel_dev on exit

    [ No relevant upstream commit. ]

    This problem was fixed upstream by commit 9434266f2c64 ("sit: fix use after
    free of fb_tunnel_dev").
    The upstream patch depends on upstream commit 5e6700b3bf98 ("sit: add support
    of x-netns"), which was not backported into 3.10 branch.

    First, explain the problem: when the sit module is unloaded, sit_cleanup() is
    called.
    rmmod sit
    => sit_cleanup()
      => rtnl_link_unregister()
        => __rtnl_kill_links()
          => for_each_netdev(net, dev) {
                if (dev->rtnl_link_ops == ops)
                        ops->dellink(dev, &list_kill);
             }
      At this point, the FB device is deleted (and all sit tunnels).
      => unregister_pernet_device()
        => unregister_pernet_operations()
          => ops_exit_list()
            => sit_exit_net()
              => sit_destroy_tunnels()
              In this function, no tunnel is found.
              => unregister_netdevice_queue(sitn->fb_tunnel_dev, &list);
      We delete the FB device a second time here!

    Because we cannot simply remove the second deletion (sit_exit_net() must remove
    the FB device when a netns is deleted), we add an rtnl ops which delete all sit
    device excepting the FB device and thus we can keep the explicit deletion in
    sit_exit_net().

    CC: Steven Rostedt <rostedt>
    Signed-off-by: Nicolas Dichtel <nicolas.dichtel>
    Acked-by: Willem de Bruijn <willemb>
    Reported-by: Steven Rostedt <srostedt>
    Tested-by: Steven Rostedt <srostedt> (and our entire MRG team)
    Tested-by: "Luis Claudio R. Goncalves" <lgoncalv>
    Tested-by: John Kacur <jkacur>
    Signed-off-by: Greg Kroah-Hartman <gregkh>

diff --git a/net/ipv6/sit.c b/net/ipv6/sit.c
index 0491264..620d326 100644
--- a/net/ipv6/sit.c
+++ b/net/ipv6/sit.c
@@ -1507,6 +1507,15 @@ static const struct nla_policy ipip6_policy[IFLA_IPTUN_MAX + 1] = {
 #endif
 };
 
+static void ipip6_dellink(struct net_device *dev, struct list_head *head)
+{
+        struct net *net = dev_net(dev);
+        struct sit_net *sitn = net_generic(net, sit_net_id);
+
+        if (dev != sitn->fb_tunnel_dev)
+                unregister_netdevice_queue(dev, head);
+}
+
 static struct rtnl_link_ops sit_link_ops __read_mostly = {
         .kind           = "sit",
         .maxtype        = IFLA_IPTUN_MAX,
@@ -1517,6 +1526,7 @@ static struct rtnl_link_ops sit_link_ops __read_mostly = {
         .changelink     = ipip6_changelink,
         .get_size       = ipip6_get_size,
         .fill_info      = ipip6_fill_info,
+        .dellink        = ipip6_dellink,
 };
 
 static struct xfrm_tunnel sit_handler __read_mostly = {

commit 20300db1bd1b9f45477c7af15b358deb87742bd1
Author: Nicolas Dichtel <nicolas.dichtel>
Date:   Tue Oct 1 18:04:59 2013 +0200

    sit: allow to use rtnl ops on fb tunnel

    [ Upstream commit 205983c43700ac3a81e7625273a3fa83cd2759b5 ]

    rtnl ops where introduced by ba3e3f50a0e5 ("sit: advertise tunnel param via
    rtnl"), but I forget to assign rtnl ops to fb tunnels.

    Now that it is done, we must remove the explicit call to
    unregister_netdevice_queue(), because the fallback tunnel is added to the queue
    in sit_destroy_tunnels() when checking rtnl_link_ops of all netdevices (this
    is valid since commit 5e6700b3bf98 ("sit: add support of x-netns")).

    Signed-off-by: Nicolas Dichtel <nicolas.dichtel>
    Signed-off-by: David S. Miller <davem>
    Signed-off-by: Greg Kroah-Hartman <gregkh>

diff --git a/net/ipv6/sit.c b/net/ipv6/sit.c
index cf7c754..0491264 100644
--- a/net/ipv6/sit.c
+++ b/net/ipv6/sit.c
@@ -1561,6 +1561,7 @@ static int __net_init sit_init_net(struct net *net)
                 goto err_alloc_dev;
         }
         dev_net_set(sitn->fb_tunnel_dev, net);
+        sitn->fb_tunnel_dev->rtnl_link_ops = &sit_link_ops;
 
         err = ipip6_fb_tunnel_init(sitn->fb_tunnel_dev);
         if (err)

##################
[indy@dhcp-27-216 linux-review]$ tail -n +1507 ~/rpmbuild/BUILD/kernel-3.10.33/linux-3.10.33.x86_64/net/ipv6/sit.c | head -n 15
#endif
};

static void ipip6_dellink(struct net_device *dev, struct list_head *head)
{
        struct net *net = dev_net(dev);
        struct sit_net *sitn = net_generic(net, sit_net_id);

        if (dev != sitn->fb_tunnel_dev)
                unregister_netdevice_queue(dev, head);
}

static struct rtnl_link_ops sit_link_ops __read_mostly = {
        .kind           = "sit",
        .maxtype        = IFLA_IPTUN_MAX,
[indy@dhcp-27-216 linux-review]$ tail -n +1526 ~/rpmbuild/BUILD/kernel-3.10.33/linux-3.10.33.x86_64/net/ipv6/sit.c | head -n 7
        .changelink     = ipip6_changelink,
        .get_size       = ipip6_get_size,
        .fill_info      = ipip6_fill_info,
        .dellink        = ipip6_dellink,
};

static struct xfrm_tunnel sit_handler __read_mostly = {
[indy@dhcp-27-216 linux-review]$ tail -n +1571 ~/rpmbuild/BUILD/kernel-3.10.33/linux-3.10.33.x86_64/net/ipv6/sit.c | head -n 7
                goto err_alloc_dev;
        }
        dev_net_set(sitn->fb_tunnel_dev, net);
        sitn->fb_tunnel_dev->rtnl_link_ops = &sit_link_ops;

        err = ipip6_fb_tunnel_init(sitn->fb_tunnel_dev);
        if (err)
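For readers following the commit message for 1b2a58ff62aa, here is a small userspace model in C of the two rmmod paths it describes. This is a sketch under stated assumptions, not the kernel code: all names are hypothetical, the `is_sit` flag stands in for the dev->rtnl_link_ops == ops check, and a per-device counter replaces the real unregister queue. The sit_destroy_tunnels() step is left out because, per the commit message, it finds no tunnel at that point.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace model only; names loosely follow net/ipv6/sit.c but are hypothetical. */
#define MODEL_MAX_DEVS 8

struct model_dev {
    const char *name;
    int is_sit;            /* stands in for dev->rtnl_link_ops == &sit_link_ops */
    int unregister_count;  /* how many times the device was queued for removal */
};

struct model_sit_net {
    struct model_dev *devs[MODEL_MAX_DEVS];
    size_t ndevs;
    struct model_dev *fb_tunnel_dev;   /* the fallback device, sit0 */
};

typedef void (*model_dellink_fn)(struct model_sit_net *, struct model_dev *);

/* Behaviour without the fix: queue whatever device is passed in. */
void model_dellink_default(struct model_sit_net *sitn, struct model_dev *dev)
{
    (void)sitn;
    dev->unregister_count++;
}

/* Behaviour modelled on ipip6_dellink(): skip the fallback device. */
void model_dellink_skip_fb(struct model_sit_net *sitn, struct model_dev *dev)
{
    if (dev != sitn->fb_tunnel_dev)
        dev->unregister_count++;
}

/* Models the rtnl_link_unregister() step: dellink on every sit device. */
void model_kill_links(struct model_sit_net *sitn, model_dellink_fn dellink)
{
    size_t i;

    for (i = 0; i < sitn->ndevs; i++)
        if (sitn->devs[i]->is_sit)
            dellink(sitn, sitn->devs[i]);
}

/* Models the explicit removal of the fallback device in sit_exit_net(). */
void model_exit_net(struct model_sit_net *sitn)
{
    assert(sitn->fb_tunnel_dev != NULL);
    sitn->fb_tunnel_dev->unregister_count++;
}

/* Models "rmmod sit": kill links first, then run the per-netns exit. */
void model_rmmod(struct model_sit_net *sitn, model_dellink_fn dellink)
{
    model_kill_links(sitn, dellink);
    model_exit_net(sitn);
}
```

In this model the default dellink queues the fallback device twice (once per path), while the variant that skips it queues every sit device exactly once, which is the property the backported patch is after.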
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHSA-2014-0439.html