1. 21 10月, 2017 1 次提交
  2. 18 10月, 2017 1 次提交
    • J
      bpf: cpumap xdp_buff to skb conversion and allocation · 1c601d82
      Jesper Dangaard Brouer 提交于
      This patch makes cpumap functional, by adding SKB allocation and
      invoking the network stack on the dequeuing CPU.
      
      For constructing the SKB on the remote CPU, the xdp_buff in converted
      into a struct xdp_pkt, and it mapped into the top headroom of the
      packet, to avoid allocating separate mem.  For now, struct xdp_pkt is
      just a cpumap internal data structure, with info carried between
      enqueue to dequeue.
      
      If a driver doesn't have enough headroom it is simply dropped, with
      return code -EOVERFLOW.  This will be picked up the xdp tracepoint
      infrastructure, to allow users to catch this.
      
      V2: take into account xdp->data_meta
      
      V4:
       - Drop busypoll tricks, keeping it more simple.
       - Skip RPS and Generic-XDP-recursive-reinjection, suggested by Alexei
      
      V5: correct RCU read protection around __netif_receive_skb_core.
      
      V6: Setting TASK_RUNNING vs TASK_INTERRUPTIBLE based on talk with Rik van Riel
      Signed-off-by: NJesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1c601d82
  3. 05 10月, 2017 3 次提交
  4. 04 10月, 2017 1 次提交
  5. 01 10月, 2017 1 次提交
  6. 02 9月, 2017 1 次提交
  7. 01 9月, 2017 1 次提交
  8. 30 8月, 2017 1 次提交
  9. 29 8月, 2017 1 次提交
    • Y
      smp: Avoid using two cache lines for struct call_single_data · 966a9671
      Ying Huang 提交于
      struct call_single_data is used in IPIs to transfer information between
      CPUs.  Its size is bigger than sizeof(unsigned long) and less than
      cache line size.  Currently it is not allocated with any explicit alignment
      requirements.  This makes it possible for allocated call_single_data to
      cross two cache lines, which results in double the number of the cache lines
      that need to be transferred among CPUs.
      
      This can be fixed by requiring call_single_data to be aligned with the
      size of call_single_data. Currently the size of call_single_data is the
      power of 2.  If we add new fields to call_single_data, we may need to
      add padding to make sure the size of new definition is the power of 2
      as well.
      
      Fortunately, this is enforced by GCC, which will report bad sizes.
      
      To set alignment requirements of call_single_data to the size of
      call_single_data, a struct definition and a typedef is used.
      
      To test the effect of the patch, I used the vm-scalability multiple
      thread swap test case (swap-w-seq-mt).  The test will create multiple
      threads and each thread will eat memory until all RAM and part of swap
      is used, so that huge number of IPIs are triggered when unmapping
      memory.  In the test, the throughput of memory writing improves ~5%
      compared with misaligned call_single_data, because of faster IPIs.
      Suggested-by: NPeter Zijlstra <peterz@infradead.org>
      Signed-off-by: NHuang, Ying <ying.huang@intel.com>
      [ Add call_single_data_t and align with size of call_single_data. ]
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Aaron Lu <aaron.lu@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/87bmnqd6lz.fsf@yhuang-mobile.sh.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      966a9671
  10. 28 8月, 2017 1 次提交
    • A
      netfilter: convert hook list to an array · 960632ec
      Aaron Conole 提交于
      This converts the storage and layout of netfilter hook entries from a
      linked list to an array.  After this commit, hook entries will be
      stored adjacent in memory.  The next pointer is no longer required.
      
      The ops pointers are stored at the end of the array as they are only
      used in the register/unregister path and in the legacy br_netfilter code.
      
      nf_unregister_net_hooks() is slower than needed as it just calls
      nf_unregister_net_hook in a loop (i.e. at least n synchronize_net()
      calls), this will be addressed in followup patch.
      
      Test setup:
       - ixgbe 10gbit
       - netperf UDP_STREAM, 64 byte packets
       - 5 hooks: (raw + mangle prerouting, mangle+filter input, inet filter):
      empty mangle and raw prerouting, mangle and filter input hooks:
      353.9
      this patch:
      364.2
      Signed-off-by: NAaron Conole <aconole@bytheb.org>
      Signed-off-by: NFlorian Westphal <fw@strlen.de>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      960632ec
  11. 19 8月, 2017 3 次提交
  12. 14 8月, 2017 1 次提交
    • J
      net: export some generic xdp helpers · 7c497478
      Jason Wang 提交于
      This patch tries to export some generic xdp helpers to drivers. This
      can let driver to do XDP for a specific skb. This is useful for the
      case when the packet is hard to be processed at page level directly
      (e.g jumbo/GSO frame).
      
      With this patch, there's no need for driver to forbid the XDP set when
      configuration is not suitable. Instead, it can defer the XDP for
      packets that is hard to be processed directly after skb is created.
      Signed-off-by: NJason Wang <jasowang@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7c497478
  13. 08 8月, 2017 5 次提交
  14. 25 7月, 2017 1 次提交
  15. 20 7月, 2017 1 次提交
  16. 18 7月, 2017 3 次提交
  17. 08 7月, 2017 1 次提交
    • W
      bonding: avoid NETDEV_CHANGEMTU event when unregistering slave · f51048c3
      WANG Cong 提交于
      As Hongjun/Nicolas summarized in their original patch:
      
      "
      When a device changes from one netns to another, it's first unregistered,
      then the netns reference is updated and the dev is registered in the new
      netns. Thus, when a slave moves to another netns, it is first
      unregistered. This triggers a NETDEV_UNREGISTER event which is caught by
      the bonding driver. The driver calls bond_release(), which calls
      dev_set_mtu() and thus triggers NETDEV_CHANGEMTU (the device is still in
      the old netns).
      "
      
      This is a very special case, because the device is being unregistered
      no one should still care about the NETDEV_CHANGEMTU event triggered
      at this point, we can avoid broadcasting this event on this path,
      and avoid touching inetdev_event()/addrconf_notify() path.
      
      It requires to export __dev_set_mtu() to bonding driver.
      Reported-by: NHongjun Li <hongjun.li@6wind.com>
      Reported-by: NNicolas Dichtel <nicolas.dichtel@6wind.com>
      Cc: Jay Vosburgh <j.vosburgh@gmail.com>
      Cc: Veaceslav Falico <vfalico@gmail.com>
      Cc: Andy Gospodarek <andy@greyhouse.net>
      Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f51048c3
  18. 25 6月, 2017 1 次提交
  19. 24 6月, 2017 3 次提交
  20. 16 6月, 2017 1 次提交
  21. 15 6月, 2017 1 次提交
  22. 08 6月, 2017 3 次提交
    • D
      net: ipv6: Release route when device is unregistering · 8397ed36
      David Ahern 提交于
      Roopa reported attempts to delete a bond device that is referenced in a
      multipath route is hanging:
      
      $ ifdown bond2    # ifupdown2 command that deletes virtual devices
      unregister_netdevice: waiting for bond2 to become free. Usage count = 2
      
      Steps to reproduce:
          echo 1 > /proc/sys/net/ipv6/conf/all/ignore_routes_with_linkdown
          ip link add dev bond12 type bond
          ip link add dev bond13 type bond
          ip addr add 2001:db8:2::0/64 dev bond12
          ip addr add 2001:db8:3::0/64 dev bond13
          ip route add 2001:db8:33::0/64 nexthop via 2001:db8:2::2 nexthop via 2001:db8:3::2
          ip link del dev bond12
          ip link del dev bond13
      
      The root cause is the recent change to keep routes on a linkdown. Update
      the check to detect when the device is unregistering and release the
      route for that case.
      
      Fixes: a1a22c12 ("net: ipv6: Keep nexthop of multipath route on admin down")
      Reported-by: NRoopa Prabhu <roopa@cumulusnetworks.com>
      Signed-off-by: NDavid Ahern <dsahern@gmail.com>
      Acked-by: NRoopa Prabhu <roopa@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8397ed36
    • J
      net: propagate tc filter chain index down the ndo_setup_tc call · a5fcf8a6
      Jiri Pirko 提交于
      We need to push the chain index down to the drivers, so they have the
      information to which chain the rule belongs. For now, no driver supports
      multichain offload, so only chain 0 is supported. This is needed to
      prevent chain squashes during offload for now. Later this will be used
      to implement multichain offload.
      Signed-off-by: NJiri Pirko <jiri@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a5fcf8a6
    • D
      net: Fix inconsistent teardown and release of private netdev state. · cf124db5
      David S. Miller 提交于
      Network devices can allocate reasources and private memory using
      netdev_ops->ndo_init().  However, the release of these resources
      can occur in one of two different places.
      
      Either netdev_ops->ndo_uninit() or netdev->destructor().
      
      The decision of which operation frees the resources depends upon
      whether it is necessary for all netdev refs to be released before it
      is safe to perform the freeing.
      
      netdev_ops->ndo_uninit() presumably can occur right after the
      NETDEV_UNREGISTER notifier completes and the unicast and multicast
      address lists are flushed.
      
      netdev->destructor(), on the other hand, does not run until the
      netdev references all go away.
      
      Further complicating the situation is that netdev->destructor()
      almost universally does also a free_netdev().
      
      This creates a problem for the logic in register_netdevice().
      Because all callers of register_netdevice() manage the freeing
      of the netdev, and invoke free_netdev(dev) if register_netdevice()
      fails.
      
      If netdev_ops->ndo_init() succeeds, but something else fails inside
      of register_netdevice(), it does call ndo_ops->ndo_uninit().  But
      it is not able to invoke netdev->destructor().
      
      This is because netdev->destructor() will do a free_netdev() and
      then the caller of register_netdevice() will do the same.
      
      However, this means that the resources that would normally be released
      by netdev->destructor() will not be.
      
      Over the years drivers have added local hacks to deal with this, by
      invoking their destructor parts by hand when register_netdevice()
      fails.
      
      Many drivers do not try to deal with this, and instead we have leaks.
      
      Let's close this hole by formalizing the distinction between what
      private things need to be freed up by netdev->destructor() and whether
      the driver needs unregister_netdevice() to perform the free_netdev().
      
      netdev->priv_destructor() performs all actions to free up the private
      resources that used to be freed by netdev->destructor(), except for
      free_netdev().
      
      netdev->needs_free_netdev is a boolean that indicates whether
      free_netdev() should be done at the end of unregister_netdevice().
      
      Now, register_netdevice() can sanely release all resources after
      ndo_ops->ndo_init() succeeds, by invoking both ndo_ops->ndo_uninit()
      and netdev->priv_destructor().
      
      And at the end of unregister_netdevice(), we invoke
      netdev->priv_destructor() and optionally call free_netdev().
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      cf124db5
  23. 22 5月, 2017 1 次提交
  24. 20 5月, 2017 3 次提交