1. 18 Aug, 2019 4 commits
  2. 06 Aug, 2019 1 commit
    • net: bridge: move default pvid init/deinit to NETDEV_REGISTER/UNREGISTER · 091adf9b
      Nikolay Aleksandrov committed
      Most of the bridge device's vlan init bugs come from the fact that its
      default pvid is created at the wrong time, way too early in ndo_init()
      before the device is even assigned an ifindex. This introduces a bug:
      when the bridge's dev_addr is added as an fdb entry during the initial
      default pvid creation, the notification has ifindex/NDA_MASTER both
      equal to 0 (see example below), which makes no sense for user-space[0]
      and is wrong.
      Usually user-space software would ignore such entries, but they are
      actually valid and will eventually have all necessary attributes.
      It makes much more sense to send a notification *after* the device has
      registered and has a proper ifindex allocated, rather than before, when
      the registration might still fail or the notification might arrive
      with ifindex/NDA_MASTER == 0. Note that we can remove the fdb flush
      from br_vlan_flush() since that case can no longer happen. At
      NETDEV_REGISTER br->default_pvid is always == 1 as it's initialized by
      br_vlan_init() before that, and at NETDEV_UNREGISTER it can be anything
      depending on why it was called (if called due to a NETDEV_REGISTER error
      it'll still be == 1, otherwise it could be any value changed during the
      device lifetime).
      
      For the demonstration below a small change to iproute2 for printing all fdb
      notifications is added, because it contained a workaround not to show
      entries with ifindex == 0.
      Command executed while monitoring: $ ip l add br0 type bridge
      Before (both ifindex and master == 0):
      $ bridge monitor fdb
      36:7e:8a:b3:56:ba dev * vlan 1 master * permanent
      
      After (proper br0 ifindex):
      $ bridge monitor fdb
      e6:2a:ae:7a:b7:48 dev br0 vlan 1 master br0 permanent
      
      v4: move only the default pvid init/deinit to NETDEV_REGISTER/UNREGISTER
      v3: send the correct v2 patch with all changes (stub should return 0)
      v2: on error in br_vlan_init set br->vlgrp to NULL and return 0 in
          the br_vlan_bridge_event stub when bridge vlans are disabled
      
      [0] https://bugzilla.kernel.org/show_bug.cgi?id=204389

      Reported-by: michael-dev <michael-dev@fami-braun.de>
      Fixes: 5be5a2df ("bridge: Add filtering support for default_pvid")
      Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 01 Aug, 2019 2 commits
    • net: bridge: mcast: add delete due to fast-leave mdb flag · 3247b272
      Nikolay Aleksandrov committed
      In user-space there's no way to distinguish why an mdb entry was deleted
      and that is a problem for daemons which would like to keep the mdb in
      sync with remote ends (e.g. mlag) but would also like to converge faster.
      In almost all cases we'd like to age-out the remote entry for performance
      and convergence reasons except when fast-leave is enabled. In that case we
      want explicit immediate remote delete, thus add mdb flag which is set only
      when the entry is being deleted due to fast-leave.
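
As a rough illustration of how a user-space daemon might act on such a flag, here is a minimal sketch. The macro name, its bit position and the action enum are hypothetical stand-ins chosen for this example; the real flag definition should be taken from the kernel's uapi headers.

```c
#include <stdint.h>

/* Hypothetical name and bit position; check linux/if_bridge.h for the real flag. */
#define SKETCH_MDB_FLAGS_FAST_LEAVE (1u << 1)

enum sketch_remote_action {
    SKETCH_REMOTE_AGE_OUT,     /* let the remote entry time out on its own */
    SKETCH_REMOTE_DELETE_NOW   /* push an explicit delete to the remote end */
};

/* Decide what to do with the remote copy of an mdb entry that was deleted. */
static enum sketch_remote_action sketch_on_mdb_delete(uint8_t flags)
{
    if (flags & SKETCH_MDB_FLAGS_FAST_LEAVE)
        return SKETCH_REMOTE_DELETE_NOW;
    return SKETCH_REMOTE_AGE_OUT;
}
```

Only the fast-leave case triggers an immediate remote delete; every other delete reason falls back to ageing out.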
      Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: bridge: mcast: don't delete permanent entries when fast leave is enabled · 5c725b6b
      Nikolay Aleksandrov committed
      When permanent entries were introduced by the commit below, they were
      exempt from timing out, and thus an igmp leave wouldn't affect them unless
      fast leave was enabled on the port (fast leave was added before permanent
      entries existed). It shouldn't matter whether fast leave is enabled or
      not: if the user added a permanent entry, it shouldn't be deleted on
      igmp leave.
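
The intended rule can be written as a small predicate. This is only a sketch with a hypothetical, simplified port-group type, not the bridge's actual data structure or code path.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for a bridge port-group entry. */
struct sketch_port_group {
    bool permanent;   /* entry was added by the user as "permanent" */
};

/*
 * Decide whether an igmp leave should delete the entry immediately.
 * Fast leave removes an entry at once, but never a permanent one.
 */
static bool sketch_leave_deletes_entry(const struct sketch_port_group *pg,
                                       bool fast_leave)
{
    return fast_leave && !pg->permanent;
}
```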
      
      Before:
      $ echo 1 > /sys/class/net/eth4/brport/multicast_fast_leave
      $ bridge mdb add dev br0 port eth4 grp 229.1.1.1 permanent
      $ bridge mdb show
      dev br0 port eth4 grp 229.1.1.1 permanent
      
      < join and leave 229.1.1.1 on eth4 >
      
      $ bridge mdb show
      $
      
      After:
      $ echo 1 > /sys/class/net/eth4/brport/multicast_fast_leave
      $ bridge mdb add dev br0 port eth4 grp 229.1.1.1 permanent
      $ bridge mdb show
      dev br0 port eth4 grp 229.1.1.1 permanent
      
      < join and leave 229.1.1.1 on eth4 >
      
      $ bridge mdb show
      dev br0 port eth4 grp 229.1.1.1 permanent
      
      Fixes: ccb1c31a ("bridge: add flags to distinguish permanent mdb entires")
      Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  4. 30 Jul, 2019 2 commits
    • netfilter: ebtables: also count base chain policies · 3b48300d
      Florian Westphal committed
      ebtables doesn't include the base chain policies in the rule count,
      so we need to add them manually when we call into the x_tables core
      to allocate space for the compat offset table.
      
      This led syzbot to trigger:
      WARNING: CPU: 1 PID: 9012 at net/netfilter/x_tables.c:649
      xt_compat_add_offset.cold+0x11/0x36 net/netfilter/x_tables.c:649
      
      Reported-by: syzbot+276ddebab3382bbf72db@syzkaller.appspotmail.com
      Fixes: 2035f3ff ("netfilter: ebtables: compat: un-break 32bit setsockopt when no rules are present")
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • net: bridge: delete local fdb on device init failure · d7bae09f
      Nikolay Aleksandrov committed
      On initialization failure we have to delete the local fdb which was
      inserted due to the default pvid creation. This problem has been present
      since the inception of default_pvid. Note that currently there are 2 cases:
      1) in br_dev_init() when br_multicast_init() fails
      2) if register_netdevice() fails after calling ndo_init()
      
      This patch takes care of both since br_vlan_flush() is called on both
      occasions. Also the new fdb delete would be a no-op on normal bridge
      device destruction since the local fdb would've been already flushed by
      br_dev_delete(). This is not an issue for ports since nbp_vlan_init() is
      called last when adding a port thus nothing can fail after it.
      
      Reported-by: syzbot+88533dc8b582309bf3ee@syzkaller.appspotmail.com
      Fixes: 5be5a2df ("bridge: Add filtering support for default_pvid")
      Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  5. 25 Jul, 2019 2 commits
  6. 22 Jul, 2019 1 commit
    • netfilter: ebtables: fix a memory leak bug in compat · 15a78ba1
      Wenwen Wang committed
      In compat_do_replace(), a temporary buffer is allocated through vmalloc()
      to hold entries copied from user space. The buffer address is first
      saved to 'newinfo->entries', and later on assigned to 'entries_tmp'. Then
      the entries in this temporary buffer are copied to the internal kernel
      structure through compat_copy_entries(). If this copy process fails,
      compat_do_replace() should be terminated. However, the allocated temporary
      buffer is not freed on this path, leading to a memory leak.
      
      To fix the bug, free the buffer before returning from compat_do_replace().
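
The shape of the fix, sketched in user-space C under stated assumptions: sketch_copy_entries() is a hypothetical stand-in for compat_copy_entries(), malloc()/free() stand in for vmalloc()/vfree(), and the live-buffer counter exists only so the example can be checked.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

static int sketch_live_buffers;   /* outstanding temporary buffers (demo only) */

/* Hypothetical stand-in for compat_copy_entries(); rejects empty input. */
static int sketch_copy_entries(char *dst, const char *src, size_t len)
{
    if (len == 0)
        return -EINVAL;
    memcpy(dst, src, len);
    return 0;
}

static int sketch_do_replace(char *kernel_copy, const char *user, size_t len)
{
    char *entries_tmp = malloc(len ? len : 1);
    int ret;

    if (!entries_tmp)
        return -ENOMEM;
    sketch_live_buffers++;
    memcpy(entries_tmp, user, len);

    ret = sketch_copy_entries(kernel_copy, entries_tmp, len);
    if (ret < 0)
        goto out_free;   /* the path that previously returned without freeing */

    /* further processing of the copied entries would go here */

out_free:
    free(entries_tmp);   /* the temporary buffer is released on every path */
    sketch_live_buffers--;
    return ret;
}
```

Routing the failure through the same label as the success path keeps a single place where the temporary buffer is released.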
      Signed-off-by: Wenwen Wang <wenwen@cs.uga.edu>
      Reviewed-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
  7. 20 Jul, 2019 1 commit
    • netfilter: bridge: make NF_TABLES_BRIDGE tristate · dfee0e99
      Arnd Bergmann committed
      The new nft_meta_bridge code fails to link as built-in when NF_TABLES
      is a loadable module.
      
      net/bridge/netfilter/nft_meta_bridge.o: In function `nft_meta_bridge_get_eval':
      nft_meta_bridge.c:(.text+0x1e8): undefined reference to `nft_meta_get_eval'
      net/bridge/netfilter/nft_meta_bridge.o: In function `nft_meta_bridge_get_init':
      nft_meta_bridge.c:(.text+0x468): undefined reference to `nft_meta_get_init'
      nft_meta_bridge.c:(.text+0x49c): undefined reference to `nft_parse_register'
      nft_meta_bridge.c:(.text+0x4cc): undefined reference to `nft_validate_register_store'
      net/bridge/netfilter/nft_meta_bridge.o: In function `nft_meta_bridge_module_exit':
      nft_meta_bridge.c:(.exit.text+0x14): undefined reference to `nft_unregister_expr'
      net/bridge/netfilter/nft_meta_bridge.o: In function `nft_meta_bridge_module_init':
      nft_meta_bridge.c:(.init.text+0x14): undefined reference to `nft_register_expr'
      net/bridge/netfilter/nft_meta_bridge.o:(.rodata+0x60): undefined reference to `nft_meta_get_dump'
      net/bridge/netfilter/nft_meta_bridge.o:(.rodata+0x88): undefined reference to `nft_meta_set_eval'
      
      This can happen because the NF_TABLES_BRIDGE dependency itself is just a
      'bool'.  Make the symbol a 'tristate' instead so Kconfig can propagate the
      dependencies correctly.
      
      Fixes: 30e103fe ("netfilter: nft_meta: move bridge meta keys into nft_meta_bridge")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
  8. 19 Jul, 2019 1 commit
  9. 06 Jul, 2019 6 commits
  10. 04 Jul, 2019 1 commit
  11. 03 Jul, 2019 4 commits
    • net: bridge: stp: don't cache eth dest pointer before skb pull · 2446a68a
      Nikolay Aleksandrov committed
      Don't cache eth dest pointer before calling pskb_may_pull.
      
      Fixes: cf0f02d0 ("[BRIDGE]: use llc for receiving STP packets")
      Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: bridge: don't cache ether dest pointer on input · 3d26eb8a
      Nikolay Aleksandrov committed
      We would cache the ether dst pointer on input in br_handle_frame_finish, but
      after the neigh suppress code that could lead to a stale pointer since
      both the ipv4 and ipv6 suppress code do pskb_may_pull. This means we have to
      always reload it after the suppress code, so there's no point in having
      it cached; just retrieve it directly.
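
Why a cached header pointer goes stale can be modelled in a few lines of user-space C. Everything here is a hypothetical miniature: sketch_skb is not struct sk_buff, and sketch_may_pull() only imitates the fact that pskb_may_pull() may move the packet data to a new allocation.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical miniature of an skb: a heap buffer that a "pull" may move. */
struct sketch_skb {
    unsigned char *head;
    size_t len;
};

/*
 * Stand-in for pskb_may_pull(): to model the worst case it always moves
 * the data to a fresh allocation. Returns 1 on success, 0 on failure.
 */
static int sketch_may_pull(struct sketch_skb *skb, size_t need)
{
    size_t new_len = need > skb->len ? need : skb->len;
    unsigned char *new_head = calloc(new_len ? new_len : 1, 1);

    if (!new_head)
        return 0;
    memcpy(new_head, skb->head, skb->len);
    free(skb->head);            /* any pointer cached earlier is now stale */
    skb->head = new_head;
    skb->len = new_len;
    return 1;
}

/* Look the destination address up from the current head on every use. */
static const unsigned char *sketch_eth_dest(const struct sketch_skb *skb)
{
    return skb->head;
}
```

A caller that saved the result of sketch_eth_dest() before the pull would be reading freed memory afterwards; calling it again after the pull always yields a valid pointer.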
      
      Fixes: 057658cb ("bridge: suppress arp pkts on BR_NEIGH_SUPPRESS ports")
      Fixes: ed842fae ("bridge: suppress nd pkts on BR_NEIGH_SUPPRESS ports")
      Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: bridge: mcast: fix stale ipv6 hdr pointer when handling v6 query · 3b26a5d0
      Nikolay Aleksandrov committed
      We get a pointer to the ipv6 hdr in br_ip6_multicast_query but we may
      call pskb_may_pull afterwards and end up using a stale pointer.
      So use the header directly, it's just 1 place where it's needed.
      
      Fixes: 08b202b6 ("bridge br_multicast: IPv6 MLD support.")
      Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Tested-by: Martin Weinelt <martin@linuxlounge.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: bridge: mcast: fix stale nsrcs pointer in igmp3/mld2 report handling · e57f6185
      Nikolay Aleksandrov committed
      When handling igmp3 we take a pointer to grec prior to calling
      pskb_may_pull and use it afterwards to get nsrcs, so record nsrcs before
      the pull. When handling mld2 we get a pointer to nsrcs and then call
      pskb_may_pull, which again could lead to reading 2 bytes out-of-bounds.
      
       ==================================================================
       BUG: KASAN: use-after-free in br_multicast_rcv+0x480c/0x4ad0 [bridge]
       Read of size 2 at addr ffff8880421302b4 by task ksoftirqd/1/16
      
       CPU: 1 PID: 16 Comm: ksoftirqd/1 Tainted: G           OE     5.2.0-rc6+ #1
       Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
       Call Trace:
        dump_stack+0x71/0xab
        print_address_description+0x6a/0x280
        ? br_multicast_rcv+0x480c/0x4ad0 [bridge]
        __kasan_report+0x152/0x1aa
        ? br_multicast_rcv+0x480c/0x4ad0 [bridge]
        ? br_multicast_rcv+0x480c/0x4ad0 [bridge]
        kasan_report+0xe/0x20
        br_multicast_rcv+0x480c/0x4ad0 [bridge]
        ? br_multicast_disable_port+0x150/0x150 [bridge]
        ? ktime_get_with_offset+0xb4/0x150
        ? __kasan_kmalloc.constprop.6+0xa6/0xf0
        ? __netif_receive_skb+0x1b0/0x1b0
        ? br_fdb_update+0x10e/0x6e0 [bridge]
        ? br_handle_frame_finish+0x3c6/0x11d0 [bridge]
        br_handle_frame_finish+0x3c6/0x11d0 [bridge]
        ? br_pass_frame_up+0x3a0/0x3a0 [bridge]
        ? virtnet_probe+0x1c80/0x1c80 [virtio_net]
        br_handle_frame+0x731/0xd90 [bridge]
        ? select_idle_sibling+0x25/0x7d0
        ? br_handle_frame_finish+0x11d0/0x11d0 [bridge]
        __netif_receive_skb_core+0xced/0x2d70
        ? virtqueue_get_buf_ctx+0x230/0x1130 [virtio_ring]
        ? do_xdp_generic+0x20/0x20
        ? virtqueue_napi_complete+0x39/0x70 [virtio_net]
        ? virtnet_poll+0x94d/0xc78 [virtio_net]
        ? receive_buf+0x5120/0x5120 [virtio_net]
        ? __netif_receive_skb_one_core+0x97/0x1d0
        __netif_receive_skb_one_core+0x97/0x1d0
        ? __netif_receive_skb_core+0x2d70/0x2d70
        ? _raw_write_trylock+0x100/0x100
        ? __queue_work+0x41e/0xbe0
        process_backlog+0x19c/0x650
        ? _raw_read_lock_irq+0x40/0x40
        net_rx_action+0x71e/0xbc0
        ? __switch_to_asm+0x40/0x70
        ? napi_complete_done+0x360/0x360
        ? __switch_to_asm+0x34/0x70
        ? __switch_to_asm+0x40/0x70
        ? __schedule+0x85e/0x14d0
        __do_softirq+0x1db/0x5f9
        ? takeover_tasklets+0x5f0/0x5f0
        run_ksoftirqd+0x26/0x40
        smpboot_thread_fn+0x443/0x680
        ? sort_range+0x20/0x20
        ? schedule+0x94/0x210
        ? __kthread_parkme+0x78/0xf0
        ? sort_range+0x20/0x20
        kthread+0x2ae/0x3a0
        ? kthread_create_worker_on_cpu+0xc0/0xc0
        ret_from_fork+0x35/0x40
      
       The buggy address belongs to the page:
       page:ffffea0001084c00 refcount:0 mapcount:-128 mapping:0000000000000000 index:0x0
       flags: 0xffffc000000000()
       raw: 00ffffc000000000 ffffea0000cfca08 ffffea0001098608 0000000000000000
       raw: 0000000000000000 0000000000000003 00000000ffffff7f 0000000000000000
       page dumped because: kasan: bad access detected
      
       Memory state around the buggy address:
       ffff888042130180: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
       ffff888042130200: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
       > ffff888042130280: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
                                           ^
       ffff888042130300: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
       ffff888042130380: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
       ==================================================================
       Disabling lock debugging due to kernel taint
      
      Fixes: bc8c20ac ("bridge: multicast: treat igmpv3 report with INCLUDE and no sources as a leave")
      Reported-by: Martin Weinelt <martin@linuxlounge.net>
      Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Tested-by: Martin Weinelt <martin@linuxlounge.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  12. 21 Jun, 2019 1 commit
    • netfilter: bridge: Fix non-untagged fragment packet · 29099462
      wenxu committed
      ip netns exec ns1 ip a a dev eth0 10.0.0.7/24
      ip netns exec ns2 ip link a link eth0 name vlan type vlan id 200
      ip netns exec ns2 ip a a dev vlan 10.0.0.8/24
      
      ip l add dev br0 type bridge vlan_filtering 1
      brctl addif br0 veth1
      brctl addif br0 veth2
      
      bridge vlan add dev veth1 vid 200 pvid untagged
      bridge vlan add dev veth2 vid 200
      
      A two-fragment packet sent from ns2 contains the vlan tag 200.  In the
      bridge conntrack, this packet is defragmented into one skb with a fraglist.
      When the packet is forwarded to ns1 through veth1, the vlan tag of the
      first skb is cleared by the "untagged" flag. But the second skb still
      carries the vlan tag, so the second fragment reaches ns1 with tag 200.
      So if the first fragment doesn't contain the vlan tag, none of the
      remaining fragments should contain it either.
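
The rule in the last sentence can be sketched as a walk over the fragment list. The types and the function name are hypothetical and only model "tag present or not"; the actual patch operates on skbs and their fraglist.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified fragment: only the vlan state and the link matter. */
struct sketch_frag {
    bool vlan_tagged;
    struct sketch_frag *next;
};

/*
 * If the first fragment carries no vlan tag, strip the tag from every
 * following fragment so the whole packet leaves the port untagged.
 */
static void sketch_sync_vlan(struct sketch_frag *head)
{
    struct sketch_frag *f;

    if (head->vlan_tagged)
        return;                 /* tagged head: leave the others alone */
    for (f = head->next; f; f = f->next)
        f->vlan_tagged = false;
}
```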
      
      Fixes: 3c171f49 ("netfilter: bridge: add connection tracking system")
      Signed-off-by: wenxu <wenxu@ucloud.cn>
      Acked-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
  13. 20 Jun, 2019 1 commit
    • netfilter: bridge: prevent UAF in brnf_exit_net() · 7e6daf50
      Christian Brauner committed
      Prevent a UAF in brnf_exit_net().
      
      When unregister_net_sysctl_table() is called the ctl_hdr pointer will
      obviously be freed and so accessing it right after is invalid. Fix
      this by stashing a pointer to the table we want to free before we
      unregister the sysctl header.
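
The ordering described above, sketched in user-space C. All names are hypothetical stand-ins (sketch_unregister() plays the role of unregister_net_sysctl_table()), and the counter exists only so the example can be checked.

```c
#include <stdlib.h>

struct sketch_table { int dummy; };
struct sketch_header { struct sketch_table *table; };

static int sketch_tables_freed;   /* demo-only bookkeeping */

/* Allocate a header that points at a separately allocated table. */
static struct sketch_header *sketch_register(void)
{
    struct sketch_header *hdr = malloc(sizeof(*hdr));

    if (!hdr)
        return NULL;
    hdr->table = calloc(1, sizeof(*hdr->table));
    if (!hdr->table) {
        free(hdr);
        return NULL;
    }
    return hdr;
}

/* Stand-in for the unregister call: the header is gone afterwards. */
static void sketch_unregister(struct sketch_header *hdr)
{
    free(hdr);
}

static void sketch_exit(struct sketch_header *hdr)
{
    struct sketch_table *table = hdr->table;  /* stash before hdr is freed */

    sketch_unregister(hdr);                   /* hdr must not be touched now */
    free(table);
    sketch_tables_freed++;
}
```

Reading hdr->table after sketch_unregister() would be the use-after-free; taking the copy first removes the dependency on the freed header.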
      
      Note that syzkaller falsely chased this down to the drm tree so the
      Fixes tag that syzkaller requested would be wrong. This commit uses a
      different, but correct, Fixes tag.
      
      /* Splat */
      
      BUG: KASAN: use-after-free in br_netfilter_sysctl_exit_net
      net/bridge/br_netfilter_hooks.c:1121 [inline]
      BUG: KASAN: use-after-free in brnf_exit_net+0x38c/0x3a0
      net/bridge/br_netfilter_hooks.c:1141
      Read of size 8 at addr ffff8880a4078d60 by task kworker/u4:4/8749
      
      CPU: 0 PID: 8749 Comm: kworker/u4:4 Not tainted 5.2.0-rc5-next-20190618 #17
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google
      01/01/2011
      Workqueue: netns cleanup_net
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x172/0x1f0 lib/dump_stack.c:113
       print_address_description.cold+0xd4/0x306 mm/kasan/report.c:351
       __kasan_report.cold+0x1b/0x36 mm/kasan/report.c:482
       kasan_report+0x12/0x20 mm/kasan/common.c:614
       __asan_report_load8_noabort+0x14/0x20 mm/kasan/generic_report.c:132
       br_netfilter_sysctl_exit_net net/bridge/br_netfilter_hooks.c:1121 [inline]
       brnf_exit_net+0x38c/0x3a0 net/bridge/br_netfilter_hooks.c:1141
       ops_exit_list.isra.0+0xaa/0x150 net/core/net_namespace.c:154
       cleanup_net+0x3fb/0x960 net/core/net_namespace.c:553
       process_one_work+0x989/0x1790 kernel/workqueue.c:2269
       worker_thread+0x98/0xe40 kernel/workqueue.c:2415
       kthread+0x354/0x420 kernel/kthread.c:255
       ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352
      
      Allocated by task 11374:
       save_stack+0x23/0x90 mm/kasan/common.c:71
       set_track mm/kasan/common.c:79 [inline]
       __kasan_kmalloc mm/kasan/common.c:489 [inline]
       __kasan_kmalloc.constprop.0+0xcf/0xe0 mm/kasan/common.c:462
       kasan_kmalloc+0x9/0x10 mm/kasan/common.c:503
       __do_kmalloc mm/slab.c:3645 [inline]
       __kmalloc+0x15c/0x740 mm/slab.c:3654
       kmalloc include/linux/slab.h:552 [inline]
       kzalloc include/linux/slab.h:743 [inline]
       __register_sysctl_table+0xc7/0xef0 fs/proc/proc_sysctl.c:1327
       register_net_sysctl+0x29/0x30 net/sysctl_net.c:121
       br_netfilter_sysctl_init_net net/bridge/br_netfilter_hooks.c:1105 [inline]
       brnf_init_net+0x379/0x6a0 net/bridge/br_netfilter_hooks.c:1126
       ops_init+0xb3/0x410 net/core/net_namespace.c:130
       setup_net+0x2d3/0x740 net/core/net_namespace.c:316
       copy_net_ns+0x1df/0x340 net/core/net_namespace.c:439
       create_new_namespaces+0x400/0x7b0 kernel/nsproxy.c:103
       unshare_nsproxy_namespaces+0xc2/0x200 kernel/nsproxy.c:202
       ksys_unshare+0x444/0x980 kernel/fork.c:2822
       __do_sys_unshare kernel/fork.c:2890 [inline]
       __se_sys_unshare kernel/fork.c:2888 [inline]
       __x64_sys_unshare+0x31/0x40 kernel/fork.c:2888
       do_syscall_64+0xfd/0x680 arch/x86/entry/common.c:301
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Freed by task 9:
       save_stack+0x23/0x90 mm/kasan/common.c:71
       set_track mm/kasan/common.c:79 [inline]
       __kasan_slab_free+0x102/0x150 mm/kasan/common.c:451
       kasan_slab_free+0xe/0x10 mm/kasan/common.c:459
       __cache_free mm/slab.c:3417 [inline]
       kfree+0x10a/0x2c0 mm/slab.c:3746
       __rcu_reclaim kernel/rcu/rcu.h:215 [inline]
       rcu_do_batch kernel/rcu/tree.c:2092 [inline]
       invoke_rcu_callbacks kernel/rcu/tree.c:2310 [inline]
       rcu_core+0xcc7/0x1500 kernel/rcu/tree.c:2291
       __do_softirq+0x25c/0x94c kernel/softirq.c:292
      
      The buggy address belongs to the object at ffff8880a4078d40
       which belongs to the cache kmalloc-512 of size 512
      The buggy address is located 32 bytes inside of
       512-byte region [ffff8880a4078d40, ffff8880a4078f40)
      The buggy address belongs to the page:
      page:ffffea0002901e00 refcount:1 mapcount:0 mapping:ffff8880aa400a80
      index:0xffff8880a40785c0
      flags: 0x1fffc0000000200(slab)
      raw: 01fffc0000000200 ffffea0001d636c8 ffffea0001b07308 ffff8880aa400a80
      raw: ffff8880a40785c0 ffff8880a40780c0 0000000100000004 0000000000000000
      page dumped because: kasan: bad access detected
      
      Memory state around the buggy address:
       ffff8880a4078c00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
       ffff8880a4078c80: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
      > ffff8880a4078d00: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb
                                                             ^
       ffff8880a4078d80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
       ffff8880a4078e00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      
      Reported-by: syzbot+43a3fa52c0d9c5c94f41@syzkaller.appspotmail.com
      Fixes: 22567590 ("netfilter: bridge: namespace bridge netfilter sysctls")
      Signed-off-by: Christian Brauner <christian@brauner.io>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
  14. 19 Jun, 2019 1 commit
  15. 17 Jun, 2019 2 commits
    • netfilter: bridge: namespace bridge netfilter sysctls · 22567590
      Christian Brauner committed
      Currently, the /proc/sys/net/bridge folder is only created in the initial
      network namespace. This patch ensures that the /proc/sys/net/bridge folder
      is available in each network namespace if the module is loaded and
      disappears from all network namespaces when the module is unloaded.
      
      In doing so the patch makes the sysctls:
      
      bridge-nf-call-arptables
      bridge-nf-call-ip6tables
      bridge-nf-call-iptables
      bridge-nf-filter-pppoe-tagged
      bridge-nf-filter-vlan-tagged
      bridge-nf-pass-vlan-input-dev
      
      apply per network namespace. This unblocks some use-cases where users would
      like to e.g. not do bridge filtering for bridges in a specific network
      namespace while doing so for bridges located in another network namespace.
      
      The netfilter rules are afaict already per network namespace so it should
      be safe for users to specify whether bridge devices inside a network
      namespace are supposed to go through iptables et al. or not. Also, this can
      already be done per-bridge by setting an option for each individual bridge
      via Netlink. It should also be possible to do this for all bridges in a
      network namespace via sysctls.
      
      Cc: Tyler Hicks <tyhicks@canonical.com>
      Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: bridge: port sysctls to use brnf_net · ff6d090d
      Christian Brauner committed
      This ports the sysctls to use struct brnf_net.
      
      With this patch we make it possible to namespace the br_netfilter module in
      the following patch.
      Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
  16. 15 Jun, 2019 1 commit
    • docs: kbuild: convert docs to ReST and rename to *.rst · cd238eff
      Mauro Carvalho Chehab committed
      The kbuild documentation clearly shows that the documents
      there were written at different times: some use markdown,
      some use their own peculiar logic to split sections.
      
      Convert everything to ReST without affecting too much
      the author's style and avoiding adding unneeded markups.
      
      The conversion is actually:
        - add blank lines and indentation in order to identify paragraphs;
        - fix tables markups;
        - add some lists markups;
        - mark literal blocks;
        - adjust title markups.
      
      At its new index.rst, let's add a :orphan: while this is not linked to
      the main index.rst file, in order to avoid build warnings.
      Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
      Signed-off-by: Jonathan Corbet <corbet@lwn.net>
  17. 01 Jun, 2019 1 commit
    • netfilter: bridge: convert skb_make_writable to skb_ensure_writable · c1a83116
      Florian Westphal committed
      Back in the day, skb_ensure_writable did not exist.  By now, both functions
      have the same precondition:
      
      I. skb_make_writable will test in this order:
        1. wlen > skb->len -> error
        2. if not cloned and wlen <= headlen -> OK
        3. If cloned and wlen bytes of clone writeable -> OK
      
      After those checks, skb is either not cloned but needs to pull from
      nonlinear area, or writing to head would also alter data of another clone.
      
      In both cases skb_make_writable will then call __pskb_pull_tail, which will
      kmalloc a new memory area to use for skb->head.
      
      IOW, after successful skb_make_writable call, the requested length is in
      linear area and can be modified, even if skb was cloned.
      
      II. skb_ensure_writable will do this instead:
         1. call pskb_may_pull.  This handles case 1 above.
            After this, wlen is in linear area, but skb might be cloned.
         2. return if skb is not cloned
         3. return if wlen bytes of clone are writeable.
         4. fully copy the skb.
      
      So post-conditions are the same:
      *len bytes are writeable in linear area without altering any payload data
      of a clone, all header pointers might have been changed.
      
      The only differences are that skb_ensure_writable is in the core, whereas
      skb_make_writable lives in the netfilter core, and that the return value
      is inverted: skb_make_writable returns 0 on error, whereas
      skb_ensure_writable returns a negative value.
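
Because the return conventions are inverted, every converted call site has to flip its error test. A minimal sketch with hypothetical stand-in functions that model only the return values; the success value 1 for the old helper and the choice of -ENOMEM for the new one are assumptions made for illustration.

```c
#include <errno.h>
#include <stdbool.h>

/* Old convention: 0 on error, non-zero (assumed 1 here) on success. */
static int sketch_make_writable(bool ok)
{
    return ok ? 1 : 0;
}

/* New convention: 0 on success, negative value (assumed -ENOMEM) on error. */
static int sketch_ensure_writable(bool ok)
{
    return ok ? 0 : -ENOMEM;
}

/* Call site before the conversion: failure is "!ret". */
static bool sketch_old_caller_fails(bool ok)
{
    return !sketch_make_writable(ok);
}

/* Call site after the conversion: failure is "ret != 0". */
static bool sketch_new_caller_fails(bool ok)
{
    return sketch_ensure_writable(ok) != 0;
}
```

Both callers report the same outcome for the same input, which is the property a mechanical conversion has to preserve.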
      
      For the normal cases performance is similar:
      A. skb is not cloned and in linear area:
         pskb_may_pull is inline helper, so neither function copies.
      B. skb is cloned, write is in linear area and clone is writeable:
         both functions return with step 3.
      
      This series removes skb_make_writable from the kernel.
      
      While at it, pass the needed value instead, it's less confusing that way:
      There is no special-handling of "0-length" argument in either
      skb_make_writable or skb_ensure_writable.
      
      bridge already makes sure ethernet header is in linear area, only purpose
      of the make_writable() is to copy skb->head in case of cloned skbs.
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
  18. 31 May, 2019 3 commits
    • netfilter: nf_conntrack_bridge: add support for IPv6 · 764dd163
      Pablo Neira Ayuso committed
      br_defrag() and br_fragment() indirections are added in case that IPv6
      support comes as a module, to avoid pulling in unnecessary dependencies.
      
      The new fraglist iterator and fragment transformer APIs are used to
      implement the refragmentation code.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • netfilter: bridge: add connection tracking system · 3c171f49
      Pablo Neira Ayuso committed
      This patch adds basic connection tracking support for the bridge,
      including initial IPv4 support.
      
      This patch registers two hooks to deal with the bridge forwarding path,
      one from the bridge prerouting hook to call nf_conntrack_in(); and
      another from the bridge postrouting hook to confirm the entry.
      
      The conntrack bridge prerouting hook defragments packets before passing
      them to nf_conntrack_in() to look up for an existing entry, otherwise a
      new entry is allocated and it is attached to the skbuff. The conntrack
      bridge postrouting hook confirms new conntrack entries, ie. if this is
      the first packet seen, then it adds the entry to the hashtable and (if
      needed) it refragments the skbuff into the original fragments, leaving
      the geometry as is if possible. Exceptions are linearized skbuffs, eg.
      skbuffs that are passed up to nfqueue and conntrack helpers, as well as
      cloned skbuff for the local delivery (eg. tcpdump), also in case of
      bridge port flooding (cloned skbuff too).
      
      The packet defragmentation is done through the ip_defrag() call.  This
      forces us to save the bridge control buffer, reset the IP control buffer
      area and then restore it after the call. This function also bumps the IP
      fragmentation statistics; it would probably be desirable to have
      independent statistics for the bridge defragmentation/refragmentation.
      The maximum fragment length is stored in the control buffer and it is
      used to refragment the skbuff from the postrouting path.
      
      The new fraglist splitter and fragment transformer APIs are used to
      implement the bridge refragmentation code. The br_ip_fragment() function
      drops the packet in case the maximum fragment size seen is larger than
      the output port MTU.
      
      This patchset follows the principle that conntrack should not drop
      packets, so users can do it through policy via invalid state matching.
      
      Like br_netfilter, there is no refragmentation for packets that are
      passed up for local delivery, ie. prerouting -> input path. There have
      long been calls to nf_reset() in several spots in the stack, eg.
      af_packet, which show that skbuff fraglist handling from the
      netif_rx path is already supported.
      
      The helpers are called from the postrouting hook, before confirmation,
      from there we may see packet floods to bridge ports. Then, although
      unlikely, this may result in exercising the helpers many times for each
      clone. It would be good to explore how to pass all the packets in a list
      to the conntrack hook to do this handle only once for this case.
      
      Thanks to Florian Westphal for handing me over an initial patchset
      version to add support for conntrack bridge.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 · 2874c5fd
      Thomas Gleixner committed
      Based on 1 normalized pattern(s):
      
        this program is free software you can redistribute it and or modify
        it under the terms of the gnu general public license as published by
        the free software foundation either version 2 of the license or at
        your option any later version
      
      extracted by the scancode license scanner the SPDX license identifier
      
        GPL-2.0-or-later
      
      has been chosen to replace the boilerplate/reference in 3029 file(s).
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Allison Randal <allison@lohutok.net>
      Cc: linux-spdx@vger.kernel.org
      Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.de
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  19. 21 May, 2019 4 commits
  20. 11 May, 2019 1 commit
    • bridge: Fix error path for kobject_init_and_add() · bdfad5ae
      Tobin C. Harding committed
      Currently error return from kobject_init_and_add() is not followed by a
      call to kobject_put().  This means there is a memory leak.  We currently
      set p to NULL so that kfree() may be called on it as a noop, the code is
      arguably clearer if we move the kfree() up closer to where it is
      called (instead of after goto jump).
      
      Remove a goto label 'err1' and jump to call to kobject_put() in error
      return from kobject_init_and_add() fixing the memory leak.  Re-name goto
      label 'put_back' to 'err1' now that we don't use err1, following current
      nomenclature (err1, err2 ...).  Move call to kfree out of the error
      code at bottom of function up to closer to where memory was allocated.
      Add comment to clarify call to kfree().
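
The error-path rule (once the init-and-add step has run, release the object through its put function even if that step failed) can be sketched with a toy reference-counted object. All names are hypothetical; sketch_init_and_add() merely imitates kobject_init_and_add() taking the initial reference before it may fail.

```c
#include <stdlib.h>

struct sketch_obj { int refcount; };

static int sketch_released;   /* demo-only count of released objects */

/* Drop one reference; the object is freed when the last one goes away. */
static void sketch_put(struct sketch_obj *o)
{
    if (--o->refcount == 0) {
        free(o);
        sketch_released++;
    }
}

/* Takes the initial reference, then may fail (controlled by `fail`). */
static int sketch_init_and_add(struct sketch_obj *o, int fail)
{
    o->refcount = 1;
    return fail ? -1 : 0;
}

static struct sketch_obj *sketch_add_port(int fail)
{
    struct sketch_obj *p = calloc(1, sizeof(*p));

    if (!p)
        return NULL;
    if (sketch_init_and_add(p, fail)) {
        sketch_put(p);        /* release via put; skipping this leaks p */
        return NULL;
    }
    return p;
}
```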
      Signed-off-by: Tobin C. Harding <tobin@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>