1. 20 Dec, 2018 20 commits
    • D
      Merge branch 'sk_buff-add-extension-infrastructure' · 4a54877e
      David S. Miller committed
      Florian Westphal says:
      
      ====================
      sk_buff: add extension infrastructure
      
      TL;DR:
       - objdiff shows no change if CONFIG_XFRM=n && BR_NETFILTER=n
       - small size reduction when one or both options are set
       - no changes in ipsec performance
      
       Changes since v1:
       - Allocate entire extension space from a kmem_cache.
       - Avoid atomic_dec_and_test operation on skb_ext_put() for refcnt == 1 case.
         (similar to kfree_skbmem() fclone_ref use).
      
      This adds an optional extension infrastructure, with ipsec (xfrm) and
      bridge netfilter as first users.
      
      The third (future) user is Multipath TCP which is still out-of-tree.
      MPTCP needs to map logical mptcp sequence numbers to the tcp sequence
      numbers used by individual subflows.
      
      This DSS mapping is read/written from tcp option space on receive and
      written to tcp option space on transmitted tcp packets that are part of
      an MPTCP connection.
      
      Extending skb_shared_info or adding a private data field to skb fclones
      doesn't work for incoming skb, so a different DSS propagation method would
      be required for the receive side.
      
      mptcp has same requirements as secpath/bridge netfilter:
      
      1. extension memory is released when the sk_buff is free'd.
      2. data is shared after cloning an skb (clone inherits extension)
      3. adding extension to an skb will COW the extension buffer if needed.
      
      Two new members are added to sk_buff:
      1. 'active_extensions' byte (filling a hole), telling which extensions
         are available for this skb.
         This has two purposes.
         a) avoids the need to initialize the pointer.
         b) allows to "delete" an extension by clearing its bit
         value in ->active_extensions.
      
         While it would be possible to store the active_extensions byte
         in the extension struct instead of sk_buff, there is one problem
         with this:
          When an extension has to be disabled, we can always clear the
          bit in skb->active_extensions.  But in case it would be stored in the
          extension buffer itself, we might have to COW it first, if
          we are dealing with a cloned skb.  On kmalloc failure we would
          be unable to turn an extension off.
      2. extension pointer, located at the end of the sk_buff.
         If the active_extensions byte is 0, the pointer is undefined,
         it is not initialized on skb allocation.
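The role of the two members can be illustrated with a minimal userspace sketch. It is not the kernel code: `demo_skb`, `demo_ext_exist()`, `demo_ext_del()` and the `DEMO_EXT_*` ids are hypothetical names chosen for illustration. It only shows why a bitmask kept in the buffer itself lets the pointer stay uninitialized and lets an extension be turned off without any allocation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical extension ids, one bit each in active_extensions. */
enum demo_ext_id { DEMO_EXT_BRIDGE_NF, DEMO_EXT_SEC_PATH, DEMO_EXT_NUM };

struct demo_ext;			/* opaque extension area */

struct demo_skb {
	uint8_t active_extensions;	/* bitmask of present extensions */
	struct demo_ext *extensions;	/* only meaningful if the mask is non-zero */
};

/* Allocation path: only the byte is cleared, the pointer is left untouched. */
static void demo_skb_init(struct demo_skb *skb)
{
	skb->active_extensions = 0;
}

static bool demo_ext_exist(const struct demo_skb *skb, enum demo_ext_id id)
{
	return (skb->active_extensions & (1u << id)) != 0;
}

/* "Delete" an extension: clears a bit, never allocates, so it cannot fail. */
static void demo_ext_del(struct demo_skb *skb, enum demo_ext_id id)
{
	skb->active_extensions &= (uint8_t)~(1u << id);
}
```

Because `demo_ext_del()` touches only the byte inside the buffer, it works the same on a cloned buffer, which is the argument made above against storing the mask in the shared extension area.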
      
      This adds extra code to skb clone and free paths (to deal with
      refcount/free of extension area) but this replaces similar code that
      manages skb->nf_bridge and skb->sp structs in the followup patches of
      the series.
      
      It is possible to add support for extensions that are not preserved on
      clones/copies:
      
      1. define a bitmask of all extensions that need copy/cow on clone
      2. change __skb_ext_copy() to check
         ->active_extensions & SKB_EXT_PRESERVE_ON_CLONE
      3. set clone->active_extensions to 0 if test is false.
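The three steps can be sketched as below. This is a hypothetical illustration only: the series does not implement it, and `DEMO_EXT_PRESERVE_ON_CLONE`, the `DEMO_EXT_*` ids and `demo_clone_active_extensions()` are assumed names standing in for the mask and the check that `__skb_ext_copy()` would perform.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical extension ids; the third one is not meant to survive a clone. */
enum { DEMO_EXT_A, DEMO_EXT_B, DEMO_EXT_TRANSIENT };

/* Step 1: bitmask of all extensions that need copy/cow on clone. */
#define DEMO_EXT_PRESERVE_ON_CLONE ((1u << DEMO_EXT_A) | (1u << DEMO_EXT_B))

/*
 * Steps 2 and 3: return the active_extensions value for the clone.
 * If no preserved extension is active, the clone starts with none.
 */
static uint8_t demo_clone_active_extensions(uint8_t active)
{
	if (active & DEMO_EXT_PRESERVE_ON_CLONE)
		return active;	/* clone inherits the extension area */
	return 0;		/* test is false: nothing carried over */
}
```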
      
      This isn't done here because all extensions that get added here
      need the copy/cow semantics.
      
      Last patch converts skb->sp, secpath information gets stored as
      new SKB_EXT_SEC_PATH, so the 'sp' pointer is removed from skbuff.
      
      Extra code added to skb clone and free paths (to deal with refcount/free
      of extension area) replaces the existing code that does the same for
      skb->nf_bridge and skb->secpath.
      
      I don't see any other in-tree users that could benefit from this
      infrastructure, it doesn't make sense to add an extension just for the sake
      of a single flag bit (like skb->nf_trace).
      
      Adding a new extension is a good fit if all of the following are true:
      
      1. Data is related to the skb/packet aggregate
      2. Data should be freed when the skb is free'd
      3. Data is not going to be relevant/needed in normal case (udp, tcp,
         forwarding workloads, ...)
      4. There are no fancy action(s) needed on clone/free, such as callbacks
         into kernel modules.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4a54877e
    • F
      net: switch secpath to use skb extension infrastructure · 4165079b
      Florian Westphal committed
      Remove skb->sp and allocate secpath storage via extension
      infrastructure.  This also reduces sk_buff by 8 bytes on x86_64.
      
      Total size of allyesconfig kernel is reduced slightly, as there is
      less inlined code (one conditional atomic op instead of two on
      skb_clone).
      
      No differences in throughput in following ipsec performance tests:
      - transport mode with aes on 10GB link
      - tunnel mode between two network namespaces with aes and null cipher
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4165079b
    • F
      xfrm: prefer secpath_set over secpath_dup · a84e3f53
      Florian Westphal committed
      secpath_set is a wrapper for secpath_dup that will not perform
      an allocation if the secpath attached to the skb has a reference count
      of one, i.e., it doesn't need to be COW'ed.
      
      Also, secpath_dup doesn't attach the secpath to the skb, it leaves
      this to the caller.
      
      Use secpath_set in places that immediately assign the return value to
      skb.
      
      This allows removing skb->sp without touching these spots again.
      
      secpath_dup can eventually be removed in a followup patch.
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a84e3f53
    • F
      drivers: chelsio: use skb_sec_path helper · a053c866
      Florian Westphal committed
      Reduce noise when skb->sp is removed later in the series.
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a053c866
    • F
      xfrm: use secpath_exist where applicable · 26912e37
      Florian Westphal committed
      Will reduce noise when skb->sp is removed later in this series.
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      26912e37
    • F
      drivers: net: netdevsim: use skb_sec_path helper · 56d1ac32
      Florian Westphal committed
      ... so this won't have to be changed when skb->sp goes away.
      
      v2: no changes, preserve ack.
      Acked-by: Shannon Nelson <shannon.lee.nelson@gmail.com>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      56d1ac32
    • F
      drivers: net: ethernet: mellanox: use skb_sec_path helper · 6362a6a0
      Florian Westphal committed
      Will avoid touching this when sp pointer is removed from sk_buff struct.
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6362a6a0
    • F
      drivers: net: intel: use secpath helpers in more places · 2fdb435b
      Florian Westphal committed
      Use skb_sec_path and secpath_exists helpers where possible.
      This reduces noise in the followup patch that removes the skb->sp pointer.
      
      v2: no changes, preserve acks from v1.
      Acked-by: Shannon Nelson <shannon.lee.nelson@gmail.com>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2fdb435b
    • F
      net: use skb_sec_path helper in more places · 2294be0f
      Florian Westphal committed
      skb_sec_path gains 'const' qualifier to avoid
      xt_policy.c: 'skb_sec_path' discards 'const' qualifier from pointer target type
      
      Same reasoning as previous conversions: won't need to touch these
      spots anymore when skb->sp is removed.
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2294be0f
    • F
      net: move secpath_exist helper to sk_buff.h · 7af8f4ca
      Florian Westphal committed
      Future patch will remove skb->sp pointer.
      To reduce noise in those patches, move existing helper to
      sk_buff and use it in more places to ease skb->sp replacement later.
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7af8f4ca
    • F
      xfrm: change secpath_set to return secpath struct, not error value · 0ca64da1
      Florian Westphal committed
      It can only return 0 (success) or -ENOMEM.
      Change return value to a pointer to secpath struct.
      
      This avoids direct access to skb->sp:
      
      err = secpath_set(skb);
      if (!err) ..
      skb->sp-> ...
      
      Becomes:
      sp = secpath_set(skb)
      if (!sp) ..
      sp-> ..
      
      This reduces noise in followup patch which is going to remove skb->sp.
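The calling convention after this change can be sketched in userspace C. This is an illustration under stated assumptions, not the xfrm code: `demo_skb`, `demo_sec_path`, `demo_secpath_set()` and `demo_input()` are hypothetical stand-ins, and the sketch omits the copy-on-write step for a shared secpath. The point is only that the caller works on the returned pointer and never dereferences the field in the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct demo_sec_path { int len; };
struct demo_skb { struct demo_sec_path *sp; };

/*
 * Return the secpath attached to the buffer, allocating one if needed.
 * NULL replaces the former -ENOMEM return value.
 */
static struct demo_sec_path *demo_secpath_set(struct demo_skb *skb)
{
	if (!skb->sp)
		skb->sp = calloc(1, sizeof(*skb->sp));
	return skb->sp;
}

/* Caller side: uses the returned pointer, never skb->sp directly. */
static int demo_input(struct demo_skb *skb)
{
	struct demo_sec_path *sp = demo_secpath_set(skb);

	if (!sp)
		return -1;	/* a real caller would return -ENOMEM */
	sp->len++;
	return 0;
}
```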
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0ca64da1
    • F
      net: convert bridge_nf to use skb extension infrastructure · de8bda1d
      Florian Westphal committed
      This converts the bridge netfilter (calling iptables hooks from bridge)
      facility to use the extension infrastructure.
      
      The bridge_nf specific hooks in skb clone and free paths are removed; they
      have been replaced by the skb_ext hooks that do the same as the bridge nf
      allocation hooks did.
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      de8bda1d
    • F
      sk_buff: add skb extension infrastructure · df5042f4
      Florian Westphal committed
      This adds an optional extension infrastructure, with ipsec (xfrm) and
      bridge netfilter as first users.
      objdiff shows no changes if kernel is built without xfrm and br_netfilter
      support.
      
      The third (planned future) user is Multipath TCP which is still
      out-of-tree.
      MPTCP needs to map logical mptcp sequence numbers to the tcp sequence
      numbers used by individual subflows.
      
      This DSS mapping is read/written from tcp option space on receive and
      written to tcp option space on transmitted tcp packets that are part of
      an MPTCP connection.
      
      Extending skb_shared_info or adding a private data field to skb fclones
      doesn't work for incoming skb, so a different DSS propagation method would
      be required for the receive side.
      
      mptcp has same requirements as secpath/bridge netfilter:
      
      1. extension memory is released when the sk_buff is free'd.
      2. data is shared after cloning an skb (clone inherits extension)
      3. adding extension to an skb will COW the extension buffer if needed.
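The three requirements can be sketched with a small single-threaded userspace model. It is an assumption-laden illustration, not the skb_ext implementation: `demo_skb`, `demo_ext`, `demo_clone()`, `demo_ext_writable()` and `demo_free()` are hypothetical names, and a plain `int` stands in for the atomic reference count the kernel would need.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct demo_ext { int refcnt; int data; };
struct demo_skb { struct demo_ext *ext; };

/* Requirement 2: a clone shares the extension area and takes a reference. */
static void demo_clone(struct demo_skb *dst, const struct demo_skb *src)
{
	dst->ext = src->ext;
	if (dst->ext)
		dst->ext->refcnt++;
}

/* Requirement 3: get a private, writable area, copying it if it is shared. */
static struct demo_ext *demo_ext_writable(struct demo_skb *skb)
{
	struct demo_ext *old = skb->ext, *new;

	if (old && old->refcnt == 1)
		return old;		/* sole owner, no copy needed */

	new = malloc(sizeof(*new));
	if (!new)
		return NULL;
	if (old) {
		*new = *old;		/* copy-on-write */
		old->refcnt--;		/* drop our share of the old area */
	} else {
		memset(new, 0, sizeof(*new));
	}
	new->refcnt = 1;
	skb->ext = new;
	return new;
}

/* Requirement 1: the extension memory goes away with its last user. */
static void demo_free(struct demo_skb *skb)
{
	if (skb->ext && --skb->ext->refcnt == 0)
		free(skb->ext);
	skb->ext = NULL;
}
```

In this model a write through the clone never becomes visible to the original buffer, because `demo_ext_writable()` hands the clone its own copy first.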
      
      The "MPTCP upstreaming" effort adds SKB_EXT_MPTCP extension to store the
      mapping for tx and rx processing.
      
      Two new members are added to sk_buff:
      1. 'active_extensions' byte (filling a hole), telling which extensions
         are available for this skb.
         This has two purposes.
         a) avoids the need to initialize the pointer.
         b) allows to "delete" an extension by clearing its bit
         value in ->active_extensions.
      
         While it would be possible to store the active_extensions byte
         in the extension struct instead of sk_buff, there is one problem
         with this:
          When an extension has to be disabled, we can always clear the
          bit in skb->active_extensions.  But in case it would be stored in the
          extension buffer itself, we might have to COW it first, if
          we are dealing with a cloned skb.  On kmalloc failure we would
          be unable to turn an extension off.
      
      2. extension pointer, located at the end of the sk_buff.
         If the active_extensions byte is 0, the pointer is undefined,
         it is not initialized on skb allocation.
      
      This adds extra code to skb clone and free paths (to deal with
      refcount/free of extension area) but this replaces similar code that
      manages skb->nf_bridge and skb->sp structs in the followup patches of
      the series.
      
      It is possible to add support for extensions that are not preserved on
      clones/copies.
      
      To do this, one would need to define a bitmask of all extensions that
      need copy/cow semantics, and change __skb_ext_copy() to check
      ->active_extensions & SKB_EXT_PRESERVE_ON_CLONE, then just set
      ->active_extensions to 0 on the new clone.
      
      This isn't done here because all extensions that get added here
      need the copy/cow semantics.
      
      v2:
      Allocate entire extension space using kmem_cache.
      Upside is that this allows better tracking of used memory,
      downside is that we will allocate more space than strictly needed in
      most cases (it's unlikely that all extensions are active/needed at the same
      time for the same skb).
      The allocated memory (except the small extension header) is not cleared,
      so there is no additional overhead aside from memory usage.
      
      Avoid atomic_dec_and_test operation on skb_ext_put()
      by using similar trick as kfree_skbmem() does with fclone_ref:
      If refcount is 1, there is no concurrent user and we can free right away.
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
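The fast path described above can be sketched with C11 atomics in userspace. This is only an approximation of the idea: `demo_ext`, `demo_ext_put()` and the `demo_frees` counter are hypothetical, and the memory orderings are a conservative choice rather than what the kernel uses.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

struct demo_ext { atomic_int refcnt; };

static int demo_frees;	/* counts frees so the behaviour can be observed */

static void demo_ext_put(struct demo_ext *ext)
{
	/*
	 * If we hold the only reference, nobody else can take a new one
	 * concurrently, so a plain load replaces the atomic decrement.
	 */
	if (atomic_load_explicit(&ext->refcnt, memory_order_acquire) == 1)
		goto free_now;

	/* fetch_sub returns the old value; anything but 1 means users remain. */
	if (atomic_fetch_sub_explicit(&ext->refcnt, 1,
				      memory_order_acq_rel) != 1)
		return;
free_now:
	demo_frees++;
	free(ext);
}
```

The saving is the read-modify-write operation in the common case of an unshared extension area; the shared case still pays for the atomic decrement.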
      df5042f4
    • F
      netfilter: avoid using skb->nf_bridge directly · c4b0e771
      Florian Westphal committed
      This pointer is going to be removed soon, so use the existing helpers in
      more places to avoid noise when the removal happens.
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c4b0e771
    • D
      Merge branch 'dpaa2-eth-add-QBMAN-statistics' · 8239d579
      David S. Miller committed
      Ioana Ciornei says:
      
      ====================
      dpaa2-eth: add QBMAN statistics
      
      This patch set adds ethtool statistics for pending frames/bytes
      in Rx/Tx conf FQs and number of buffers in pool.
      
      The first patch adds support for the query APIs in the DPIO driver
      while the latter actually exposes the statistics through ethtool.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8239d579
    • I
      dpaa2-eth: Add QBMAN related stats · 610febc6
      Ioana Radulescu committed
      Add statistics for pending frames in Rx/Tx conf FQs and
      number of buffers in pool. Available through ethtool -S.
      Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
      Signed-off-by: Ioana ciornei <ioana.ciornei@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      610febc6
    • R
      soc: fsl: dpio: Add BP and FQ query APIs · e80081c3
      Roy Pledge committed
      Add FQ (Frame Queue) and BP (Buffer Pool) query APIs that
      users of QBMan can invoke to see the status of the queues
      and pools that they are using.
      Signed-off-by: Roy Pledge <roy.pledge@nxp.com>
      Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
      Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e80081c3
    • R
      net: phy: mscc: Fix the VSC 8531/41 Chip Init sequence · 7b98f63e
      Raju Lakkaraju committed
      - Turn on Broadcast writes
      - UNH 1.8.1 clear bias for UNH 1000BT distortion
      - UNH 1.8.7 optimize pre-emphasis for 100BasTx UNH 100W fix
      - Enable Token-ring during 'Coma Mode'
      Signed-off-by: Raju Lakkaraju <Raju.Lakkaraju@microchip.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7b98f63e
    • D
      Merge branch 'for-upstream' of... · 29d3c047
      David S. Miller committed
      Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next
      
      Johan Hedberg says:
      
      ====================
      pull request: bluetooth-next 2018-12-19
      
      Here's the main bluetooth-next pull request for 4.21:
      
       - Multiple fixes & improvements for Broadcom-based controllers
       - New USB ID for an Intel controller
       - Support for new Broadcom controller variants
       - Use DEFINE_SHOW_ATTRIBUTE to simplify debugfs code
       - Eliminate confusing "last event is not cmd complete" warning message
       - Added vendor suspend/resume support for H:5 (3-Wire UART) controllers
       - Various other smaller improvements & fixes
      
      Please let me know if there are any issues pulling. Thanks.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      29d3c047
    • D
      Merge tag 'mac80211-next-for-davem-2018-12-19' of... · 5a862f86
      David S. Miller committed
      Merge tag 'mac80211-next-for-davem-2018-12-19' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next
      
      Johannes Berg says:
      
      ====================
      This time we have too many changes to list, highlights:
       * virt_wifi - wireless control simulation on top of
         another network interface
       * hwsim configurability to test capabilities similar
         to real hardware
       * various mesh improvements
       * various radiotap vendor data fixes in mac80211
       * finally the nl_set_extack_cookie_u64() we talked
         about previously, used for
       * peer measurement APIs, right now only with FTM
         (fine timing measurement) for location
       * made nl80211 radio/interface announcements more complete
       * various new HE (802.11ax) things:
         updates, TWT support, ...
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5a862f86
  2. 19 Dec, 2018 20 commits