1. 03 7月, 2019 2 次提交
    • N
      net: bridge: mcast: fix stale ipv6 hdr pointer when handling v6 query · 3b26a5d0
      Nikolay Aleksandrov 提交于
      We get a pointer to the ipv6 hdr in br_ip6_multicast_query but we may
      call pskb_may_pull afterwards and end up using a stale pointer.
      So use the header directly, it's just 1 place where it's needed.
      
      Fixes: 08b202b6 ("bridge br_multicast: IPv6 MLD support.")
      Signed-off-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Tested-by: NMartin Weinelt <martin@linuxlounge.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3b26a5d0
    • N
      net: bridge: mcast: fix stale nsrcs pointer in igmp3/mld2 report handling · e57f6185
      Nikolay Aleksandrov 提交于
      We take a pointer to grec prior to calling pskb_may_pull and use it
      afterwards to get nsrcs so record nsrcs before the pull when handling
      igmp3 and we get a pointer to nsrcs and call pskb_may_pull when handling
      mld2 which again could lead to reading 2 bytes out-of-bounds.
      
       ==================================================================
       BUG: KASAN: use-after-free in br_multicast_rcv+0x480c/0x4ad0 [bridge]
       Read of size 2 at addr ffff8880421302b4 by task ksoftirqd/1/16
      
       CPU: 1 PID: 16 Comm: ksoftirqd/1 Tainted: G           OE     5.2.0-rc6+ #1
       Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
       Call Trace:
        dump_stack+0x71/0xab
        print_address_description+0x6a/0x280
        ? br_multicast_rcv+0x480c/0x4ad0 [bridge]
        __kasan_report+0x152/0x1aa
        ? br_multicast_rcv+0x480c/0x4ad0 [bridge]
        ? br_multicast_rcv+0x480c/0x4ad0 [bridge]
        kasan_report+0xe/0x20
        br_multicast_rcv+0x480c/0x4ad0 [bridge]
        ? br_multicast_disable_port+0x150/0x150 [bridge]
        ? ktime_get_with_offset+0xb4/0x150
        ? __kasan_kmalloc.constprop.6+0xa6/0xf0
        ? __netif_receive_skb+0x1b0/0x1b0
        ? br_fdb_update+0x10e/0x6e0 [bridge]
        ? br_handle_frame_finish+0x3c6/0x11d0 [bridge]
        br_handle_frame_finish+0x3c6/0x11d0 [bridge]
        ? br_pass_frame_up+0x3a0/0x3a0 [bridge]
        ? virtnet_probe+0x1c80/0x1c80 [virtio_net]
        br_handle_frame+0x731/0xd90 [bridge]
        ? select_idle_sibling+0x25/0x7d0
        ? br_handle_frame_finish+0x11d0/0x11d0 [bridge]
        __netif_receive_skb_core+0xced/0x2d70
        ? virtqueue_get_buf_ctx+0x230/0x1130 [virtio_ring]
        ? do_xdp_generic+0x20/0x20
        ? virtqueue_napi_complete+0x39/0x70 [virtio_net]
        ? virtnet_poll+0x94d/0xc78 [virtio_net]
        ? receive_buf+0x5120/0x5120 [virtio_net]
        ? __netif_receive_skb_one_core+0x97/0x1d0
        __netif_receive_skb_one_core+0x97/0x1d0
        ? __netif_receive_skb_core+0x2d70/0x2d70
        ? _raw_write_trylock+0x100/0x100
        ? __queue_work+0x41e/0xbe0
        process_backlog+0x19c/0x650
        ? _raw_read_lock_irq+0x40/0x40
        net_rx_action+0x71e/0xbc0
        ? __switch_to_asm+0x40/0x70
        ? napi_complete_done+0x360/0x360
        ? __switch_to_asm+0x34/0x70
        ? __switch_to_asm+0x40/0x70
        ? __schedule+0x85e/0x14d0
        __do_softirq+0x1db/0x5f9
        ? takeover_tasklets+0x5f0/0x5f0
        run_ksoftirqd+0x26/0x40
        smpboot_thread_fn+0x443/0x680
        ? sort_range+0x20/0x20
        ? schedule+0x94/0x210
        ? __kthread_parkme+0x78/0xf0
        ? sort_range+0x20/0x20
        kthread+0x2ae/0x3a0
        ? kthread_create_worker_on_cpu+0xc0/0xc0
        ret_from_fork+0x35/0x40
      
       The buggy address belongs to the page:
       page:ffffea0001084c00 refcount:0 mapcount:-128 mapping:0000000000000000 index:0x0
       flags: 0xffffc000000000()
       raw: 00ffffc000000000 ffffea0000cfca08 ffffea0001098608 0000000000000000
       raw: 0000000000000000 0000000000000003 00000000ffffff7f 0000000000000000
       page dumped because: kasan: bad access detected
      
       Memory state around the buggy address:
       ffff888042130180: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
       ffff888042130200: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
       > ffff888042130280: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
                                           ^
       ffff888042130300: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
       ffff888042130380: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
       ==================================================================
       Disabling lock debugging due to kernel taint
      
      Fixes: bc8c20ac ("bridge: multicast: treat igmpv3 report with INCLUDE and no sources as a leave")
      Reported-by: NMartin Weinelt <martin@linuxlounge.net>
      Signed-off-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Tested-by: NMartin Weinelt <martin@linuxlounge.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e57f6185
  2. 19 6月, 2019 1 次提交
  3. 31 5月, 2019 1 次提交
  4. 21 5月, 2019 4 次提交
  5. 11 5月, 2019 1 次提交
    • T
      bridge: Fix error path for kobject_init_and_add() · bdfad5ae
      Tobin C. Harding 提交于
      Currently error return from kobject_init_and_add() is not followed by a
      call to kobject_put().  This means there is a memory leak.  We currently
      set p to NULL so that kfree() may be called on it as a noop, the code is
      arguably clearer if we move the kfree() up closer to where it is
      called (instead of after goto jump).
      
      Remove a goto label 'err1' and jump to call to kobject_put() in error
      return from kobject_init_and_add() fixing the memory leak.  Re-name goto
      label 'put_back' to 'err1' now that we don't use err1, following current
      nomenclature (err1, err2 ...).  Move call to kfree out of the error
      code at bottom of function up to closer to where memory was allocated.
      Add comment to clarify call to kfree().
      Signed-off-by: NTobin C. Harding <tobin@kernel.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      bdfad5ae
  6. 09 5月, 2019 1 次提交
  7. 28 4月, 2019 3 次提交
    • J
      netlink: make validation more configurable for future strictness · 8cb08174
      Johannes Berg 提交于
      We currently have two levels of strict validation:
      
       1) liberal (default)
           - undefined (type >= max) & NLA_UNSPEC attributes accepted
           - attribute length >= expected accepted
           - garbage at end of message accepted
       2) strict (opt-in)
           - NLA_UNSPEC attributes accepted
           - attribute length >= expected accepted
      
      Split out parsing strictness into four different options:
       * TRAILING     - check that there's no trailing data after parsing
                        attributes (in message or nested)
       * MAXTYPE      - reject attrs > max known type
       * UNSPEC       - reject attributes with NLA_UNSPEC policy entries
       * STRICT_ATTRS - strictly validate attribute size
      
      The default for future things should be *everything*.
      The current *_strict() is a combination of TRAILING and MAXTYPE,
      and is renamed to _deprecated_strict().
      The current regular parsing has none of this, and is renamed to
      *_parse_deprecated().
      
      Additionally it allows us to selectively set one of the new flags
      even on old policies. Notably, the UNSPEC flag could be useful in
      this case, since it can be arranged (by filling in the policy) to
      not be an incompatible userspace ABI change, but would then going
      forward prevent forgetting attribute entries. Similar can apply
      to the POLICY flag.
      
      We end up with the following renames:
       * nla_parse           -> nla_parse_deprecated
       * nla_parse_strict    -> nla_parse_deprecated_strict
       * nlmsg_parse         -> nlmsg_parse_deprecated
       * nlmsg_parse_strict  -> nlmsg_parse_deprecated_strict
       * nla_parse_nested    -> nla_parse_nested_deprecated
       * nla_validate_nested -> nla_validate_nested_deprecated
      
      Using spatch, of course:
          @@
          expression TB, MAX, HEAD, LEN, POL, EXT;
          @@
          -nla_parse(TB, MAX, HEAD, LEN, POL, EXT)
          +nla_parse_deprecated(TB, MAX, HEAD, LEN, POL, EXT)
      
          @@
          expression NLH, HDRLEN, TB, MAX, POL, EXT;
          @@
          -nlmsg_parse(NLH, HDRLEN, TB, MAX, POL, EXT)
          +nlmsg_parse_deprecated(NLH, HDRLEN, TB, MAX, POL, EXT)
      
          @@
          expression NLH, HDRLEN, TB, MAX, POL, EXT;
          @@
          -nlmsg_parse_strict(NLH, HDRLEN, TB, MAX, POL, EXT)
          +nlmsg_parse_deprecated_strict(NLH, HDRLEN, TB, MAX, POL, EXT)
      
          @@
          expression TB, MAX, NLA, POL, EXT;
          @@
          -nla_parse_nested(TB, MAX, NLA, POL, EXT)
          +nla_parse_nested_deprecated(TB, MAX, NLA, POL, EXT)
      
          @@
          expression START, MAX, POL, EXT;
          @@
          -nla_validate_nested(START, MAX, POL, EXT)
          +nla_validate_nested_deprecated(START, MAX, POL, EXT)
      
          @@
          expression NLH, HDRLEN, MAX, POL, EXT;
          @@
          -nlmsg_validate(NLH, HDRLEN, MAX, POL, EXT)
          +nlmsg_validate_deprecated(NLH, HDRLEN, MAX, POL, EXT)
      
      For this patch, don't actually add the strict, non-renamed versions
      yet so that it breaks compile if I get it wrong.
      
      Also, while at it, make nla_validate and nla_parse go down to a
      common __nla_validate_parse() function to avoid code duplication.
      
      Ultimately, this allows us to have very strict validation for every
      new caller of nla_parse()/nlmsg_parse() etc as re-introduced in the
      next patch, while existing things will continue to work as is.
      
      In effect then, this adds fully strict validation for any new command.
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8cb08174
    • M
      net: fix two coding style issues · f78c6032
      Michal Kubecek 提交于
      This is a simple cleanup addressing two coding style issues found by
      checkpatch.pl in an earlier patch. It's submitted as a separate patch to
      keep the original patch as it was generated by spatch.
      Signed-off-by: NMichal Kubecek <mkubecek@suse.cz>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f78c6032
    • M
      netlink: make nla_nest_start() add NLA_F_NESTED flag · ae0be8de
      Michal Kubecek 提交于
      Even if the NLA_F_NESTED flag was introduced more than 11 years ago, most
      netlink based interfaces (including recently added ones) are still not
      setting it in kernel generated messages. Without the flag, message parsers
      not aware of attribute semantics (e.g. wireshark dissector or libmnl's
      mnl_nlmsg_fprintf()) cannot recognize nested attributes and won't display
      the structure of their contents.
      
      Unfortunately we cannot just add the flag everywhere as there may be
      userspace applications which check nlattr::nla_type directly rather than
      through a helper masking out the flags. Therefore the patch renames
      nla_nest_start() to nla_nest_start_noflag() and introduces nla_nest_start()
      as a wrapper adding NLA_F_NESTED. The calls which add NLA_F_NESTED manually
      are rewritten to use nla_nest_start().
      
      Except for changes in include/net/netlink.h, the patch was generated using
      this semantic patch:
      
      @@ expression E1, E2; @@
      -nla_nest_start(E1, E2)
      +nla_nest_start_noflag(E1, E2)
      
      @@ expression E1, E2; @@
      -nla_nest_start_noflag(E1, E2 | NLA_F_NESTED)
      +nla_nest_start(E1, E2)
      Signed-off-by: NMichal Kubecek <mkubecek@suse.cz>
      Acked-by: NJiri Pirko <jiri@mellanox.com>
      Acked-by: NDavid Ahern <dsahern@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ae0be8de
  8. 23 4月, 2019 1 次提交
    • I
      bridge: Fix possible use-after-free when deleting bridge port · 697cd36c
      Ido Schimmel 提交于
      When a bridge port is being deleted, do not dereference it later in
      br_vlan_port_event() as it can result in a use-after-free [1] if the RCU
      callback was executed before invoking the function.
      
      [1]
      [  129.638551] ==================================================================
      [  129.646904] BUG: KASAN: use-after-free in br_vlan_port_event+0x53c/0x5fd
      [  129.654406] Read of size 8 at addr ffff8881e4aa1ae8 by task ip/483
      [  129.663008] CPU: 0 PID: 483 Comm: ip Not tainted 5.1.0-rc5-custom-02265-ga946bd73daac #1383
      [  129.672359] Hardware name: Mellanox Technologies Ltd. MSN2100-CB2FO/SA001017, BIOS 5.6.5 06/07/2016
      [  129.682484] Call Trace:
      [  129.685242]  dump_stack+0xa9/0x10e
      [  129.689068]  print_address_description.cold.2+0x9/0x25e
      [  129.694930]  kasan_report.cold.3+0x78/0x9d
      [  129.704420]  br_vlan_port_event+0x53c/0x5fd
      [  129.728300]  br_device_event+0x2c7/0x7a0
      [  129.741505]  notifier_call_chain+0xb5/0x1c0
      [  129.746202]  rollback_registered_many+0x895/0xe90
      [  129.793119]  unregister_netdevice_many+0x48/0x210
      [  129.803384]  rtnl_delete_link+0xe1/0x140
      [  129.815906]  rtnl_dellink+0x2a3/0x820
      [  129.844166]  rtnetlink_rcv_msg+0x397/0x910
      [  129.868517]  netlink_rcv_skb+0x137/0x3a0
      [  129.882013]  netlink_unicast+0x49b/0x660
      [  129.900019]  netlink_sendmsg+0x755/0xc90
      [  129.915758]  ___sys_sendmsg+0x761/0x8e0
      [  129.966315]  __sys_sendmsg+0xf0/0x1c0
      [  129.988918]  do_syscall_64+0xa4/0x470
      [  129.993032]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      [  129.998696] RIP: 0033:0x7ff578104b58
      ...
      [  130.073811] Allocated by task 479:
      [  130.077633]  __kasan_kmalloc.constprop.5+0xc1/0xd0
      [  130.083008]  kmem_cache_alloc_trace+0x152/0x320
      [  130.088090]  br_add_if+0x39c/0x1580
      [  130.092005]  do_set_master+0x1aa/0x210
      [  130.096211]  do_setlink+0x985/0x3100
      [  130.100224]  __rtnl_newlink+0xc52/0x1380
      [  130.104625]  rtnl_newlink+0x6b/0xa0
      [  130.108541]  rtnetlink_rcv_msg+0x397/0x910
      [  130.113136]  netlink_rcv_skb+0x137/0x3a0
      [  130.117538]  netlink_unicast+0x49b/0x660
      [  130.121939]  netlink_sendmsg+0x755/0xc90
      [  130.126340]  ___sys_sendmsg+0x761/0x8e0
      [  130.130645]  __sys_sendmsg+0xf0/0x1c0
      [  130.134753]  do_syscall_64+0xa4/0x470
      [  130.138864]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      [  130.146195] Freed by task 0:
      [  130.149421]  __kasan_slab_free+0x125/0x170
      [  130.154016]  kfree+0xf3/0x310
      [  130.157349]  kobject_put+0x1a8/0x4c0
      [  130.161363]  rcu_core+0x859/0x19b0
      [  130.165175]  __do_softirq+0x250/0xa26
      [  130.170956] The buggy address belongs to the object at ffff8881e4aa1ae8
                      which belongs to the cache kmalloc-1k of size 1024
      [  130.184972] The buggy address is located 0 bytes inside of
                      1024-byte region [ffff8881e4aa1ae8, ffff8881e4aa1ee8)
      
      Fixes: 9c0ec2e7 ("bridge: support binding vlan dev link state to vlan member bridge ports")
      Signed-off-by: NIdo Schimmel <idosch@mellanox.com>
      Cc: Mike Manning <mmanning@vyatta.att-mail.com>
      Acked-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Acked-by: NMike Manning <mmanning@vyatta.att-mail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      697cd36c
  9. 22 4月, 2019 1 次提交
  10. 20 4月, 2019 3 次提交
  11. 17 4月, 2019 2 次提交
    • N
      net: bridge: fix netlink export of vlan_stats_per_port option · 600bea7d
      Nikolay Aleksandrov 提交于
      Since the introduction of the vlan_stats_per_port option the netlink
      export of it has been broken since I made a typo and used the ifla
      attribute instead of the bridge option to retrieve its state.
      Sysfs export is fine, only netlink export has been affected.
      
      Fixes: 9163a0fc ("net: bridge: add support for per-port vlan stats")
      Signed-off-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      600bea7d
    • N
      net: bridge: fix per-port af_packet sockets · 3b2e2904
      Nikolay Aleksandrov 提交于
      When the commit below was introduced it changed two visible things:
       - the skb was no longer passed through the protocol handlers with the
         original device
       - the skb was passed up the stack with skb->dev = bridge
      
      The first change broke af_packet sockets on bridge ports. For example we
      use them for hostapd which listens for ETH_P_PAE packets on the ports.
      We discussed two possible fixes:
       - create a clone and pass it through NF_HOOK(), act on the original skb
         based on the result
       - somehow signal to the caller from the okfn() that it was called,
         meaning the skb is ok to be passed, which this patch is trying to
         implement via returning 1 from the bridge link-local okfn()
      
      Note that we rely on the fact that NF_QUEUE/STOLEN would return 0 and
      drop/error would return < 0 thus the okfn() is called only when the
      return was 1, so we signal to the caller that it was called by preserving
      the return value from nf_hook().
      
      Fixes: 8626c56c ("bridge: fix potential use-after-free when hook returns QUEUE or STOLEN verdict")
      Signed-off-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3b2e2904
  12. 16 4月, 2019 1 次提交
  13. 12 4月, 2019 4 次提交
  14. 08 4月, 2019 1 次提交
    • N
      rhashtable: use bit_spin_locks to protect hash bucket. · 8f0db018
      NeilBrown 提交于
      This patch changes rhashtables to use a bit_spin_lock on BIT(1) of the
      bucket pointer to lock the hash chain for that bucket.
      
      The benefits of a bit spin_lock are:
       - no need to allocate a separate array of locks.
       - no need to have a configuration option to guide the
         choice of the size of this array
       - locking cost is often a single test-and-set in a cache line
         that will have to be loaded anyway.  When inserting at, or removing
         from, the head of the chain, the unlock is free - writing the new
         address in the bucket head implicitly clears the lock bit.
         For __rhashtable_insert_fast() we ensure this always happens
         when adding a new key.
       - even when lockings costs 2 updates (lock and unlock), they are
         in a cacheline that needs to be read anyway.
      
      The cost of using a bit spin_lock is a little bit of code complexity,
      which I think is quite manageable.
      
      Bit spin_locks are sometimes inappropriate because they are not fair -
      if multiple CPUs repeatedly contend of the same lock, one CPU can
      easily be starved.  This is not a credible situation with rhashtable.
      Multiple CPUs may want to repeatedly add or remove objects, but they
      will typically do so at different buckets, so they will attempt to
      acquire different locks.
      
      As we have more bit-locks than we previously had spinlocks (by at
      least a factor of two) we can expect slightly less contention to
      go with the slightly better cache behavior and reduced memory
      consumption.
      
      To enhance type checking, a new struct is introduced to represent the
        pointer plus lock-bit
      that is stored in the bucket-table.  This is "struct rhash_lock_head"
      and is empty.  A pointer to this needs to be cast to either an
      unsigned lock, or a "struct rhash_head *" to be useful.
      Variables of this type are most often called "bkt".
      
      Previously "pprev" would sometimes point to a bucket, and sometimes a
      ->next pointer in an rhash_head.  As these are now different types,
      pprev is NULL when it would have pointed to the bucket. In that case,
      'blk' is used, together with correct locking protocol.
      Signed-off-by: NNeilBrown <neilb@suse.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8f0db018
  15. 05 4月, 2019 4 次提交
  16. 30 3月, 2019 2 次提交
  17. 21 3月, 2019 1 次提交
  18. 18 3月, 2019 1 次提交
  19. 01 3月, 2019 2 次提交
  20. 28 2月, 2019 1 次提交
  21. 27 2月, 2019 1 次提交
  22. 24 2月, 2019 1 次提交
  23. 22 2月, 2019 1 次提交