1. 19 Oct, 2018 8 commits
    • net/mlx5: Take fs_counters dellist before addlist · fd330713
      Committed by Vlad Buslov
      In fs_counters elements from both addlist and dellist are removed by
      mlx5_fc_stats_work() without any locking. This introduces a race condition
      when a batch of new rules is created and then immediately deleted (for
      example, when an error occurs during flow creation). In such a case some of
      the rules might be in dellist, but not in addlist when mlx5_fc_stats_work()
      is executed concurrently with tc, which will result in rule deletion and
      a use-after-free on the next iteration because the deleted rules are still
      in addlist.
      
      Always take dellist first to guarantee that rules can only be deleted after
      they were removed from addlist.
      
      Fixes: 6e5e2283 ("net/mlx5: Add new list to store deleted flow counters")
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Reported-by: Chris Mi <chrism@mellanox.com>
      Reviewed-by: Roi Dayan <roid@mellanox.com>
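
The ordering argument in the commit message above can be illustrated with a small single-threaded model. This is only a sketch: `struct counter`, `create_then_delete()` and `pass_leaves_dangling_entry()` are hypothetical names, the lists are reduced to boolean membership flags, and the "producer" is simply run between the worker's two snapshots. It is not the driver's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a flow counter; not the driver's struct mlx5_fc. */
struct counter {
    bool on_addlist;  /* queued for insertion by the worker */
    bool on_dellist;  /* queued for release by the worker */
    bool freed;       /* memory has been released */
};

enum snapshot_order { ADDLIST_FIRST, DELLIST_FIRST };

/* Producer side: a rule is created and then immediately torn down. */
static void create_then_delete(struct counter *c)
{
    assert(c != NULL);
    c->on_addlist = true;
    c->on_dellist = true;
}

/*
 * One worker pass with the producer running between the two snapshots.
 * Returns true when the counter ends up freed while still on addlist,
 * which is the precondition for a use-after-free on the next pass.
 */
static bool pass_leaves_dangling_entry(enum snapshot_order order)
{
    struct counter c = { false, false, false };
    bool in_add_snapshot, in_del_snapshot;

    if (order == ADDLIST_FIRST) {
        in_add_snapshot = c.on_addlist;
        create_then_delete(&c);
        in_del_snapshot = c.on_dellist;
    } else {
        in_del_snapshot = c.on_dellist;
        create_then_delete(&c);
        in_add_snapshot = c.on_addlist;
    }

    if (in_add_snapshot)
        c.on_addlist = false;   /* worker consumed the addlist entry */
    if (in_del_snapshot) {
        c.on_dellist = false;
        c.freed = true;         /* worker released the counter */
    }
    return c.freed && c.on_addlist;
}
```

In this model, snapshotting addlist first leaves a freed counter on addlist, while snapshotting dellist first defers the release to a later pass, after the addlist entry has been consumed.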
    • net/mlx5: Refactor fragmented buffer struct fields and init flow · 4972e6fa
      Committed by Tariq Toukan
      Take struct mlx5_frag_buf out of mlx5_frag_buf_ctrl, as it is not
      needed to manage and control the datapath of the fragmented buffers API.
      
      struct mlx5_frag_buf contains control info to manage the allocation
      and de-allocation of the fragmented buffer.
      Its fields are not relevant for datapath, so here I take them out of the
      struct mlx5_frag_buf_ctrl, except for the fragments array itself.
      
      In addition, modified mlx5_fill_fbc to initialise the frags pointers
      as well. This implies that the buffer must be allocated before the
      function is called.
      
      A set of type-specific *_get_byte_size() functions is replaced by
      a generic one.
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • Merge branch 'sctp-fix-sk_wmem_queued-and-use-it-to-check-for-writable-space' · 3a3295bf
      Committed by David S. Miller
      Xin Long says:
      
      ====================
      sctp: fix sk_wmem_queued and use it to check for writable space
      
      sctp doesn't count and use asoc sndbuf_used, sk sk_wmem_alloc and
      sk_wmem_queued properly, which also causes some problems.
      
      This patchset is to improve it.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sctp: use sk_wmem_queued to check for writable space · cd305c74
      Committed by Xin Long
      sk->sk_wmem_queued is used to count the size of chunks in the out queue
      while sk->sk_wmem_alloc is for counting the size of chunks that have been
      sent. sctp is increasing both of them before enqueuing the chunks,
      and using sk->sk_wmem_alloc to check for writable space.
      
      However, sk_wmem_alloc is also increased by 1 for the skb allocated
      for sending in sctp_packet_transmit() but it will not wake up the
      waiters when sk_wmem_alloc is decreased in this skb's destructor.
      
      If msg size is equal to sk_sndbuf and sendmsg is waiting for sndbuf,
      the check 'msg_len <= sctp_wspace(asoc)' in sctp_wait_for_sndbuf()
      will keep waiting if there's an skb allocated in sctp_packet_transmit,
      and later even if this skb gets freed, the waiting thread will never
      be woken up.
      
      This issue has been there since the very beginning, so we change to use
      sk->sk_wmem_queued to check for writable space as sk_wmem_queued is
      not increased for the skb allocated for sending, as TCP does.
      
      SOCK_SNDBUF_LOCK check is also removed here as it's for tx buf auto
      tuning which I will add in another patch.
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
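
The stuck-sender case described in this commit can be sketched with plain arithmetic. The model below is an assumption-laden illustration: `struct sock_model` and the helper names are hypothetical, only the three counters named in the message are represented, and the real sctp_wspace() takes other inputs (such as the sndbuf policy) into account.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the three socket fields named in the commit. */
struct sock_model {
    int sndbuf;       /* stands in for sk_sndbuf */
    int wmem_alloc;   /* stands in for sk_wmem_alloc */
    int wmem_queued;  /* stands in for sk_wmem_queued */
};

/* Old check: free space derived from wmem_alloc. */
static int wspace_by_alloc(const struct sock_model *sk)
{
    assert(sk != NULL);
    return sk->sndbuf - sk->wmem_alloc;
}

/* New check: free space derived from wmem_queued. */
static int wspace_by_queued(const struct sock_model *sk)
{
    assert(sk != NULL);
    return sk->sndbuf - sk->wmem_queued;
}

/* Mirrors the shape of the 'msg_len <= sctp_wspace(asoc)' test. */
static int can_send(int msg_len, int wspace)
{
    return msg_len <= wspace;
}
```

With an empty out queue and one transmit skb in flight (wmem_alloc of 1, wmem_queued of 0), a message exactly as large as sndbuf fails the first check and passes the second.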
    • sctp: count both sk and asoc sndbuf with skb truesize and sctp_chunk size · 605c0ac1
      Committed by Xin Long
      Now it's confusing that asoc sndbuf_used is doing memory accounting with
      SCTP_DATA_SNDSIZE(chunk) + sizeof(sk_buff) + sizeof(sctp_chunk) while sk
      sk_wmem_alloc is doing that with skb->truesize + sizeof(sctp_chunk).
      
      It also causes sctp_prsctp_prune to count the wrong amount of freed
      memory when sndbuf_policy is not set.
      
      To make this right and also keep asoc sndbuf_used, sk sk_wmem_alloc and
      sk_wmem_queued consistent, use skb->truesize + sizeof(sctp_chunk)
      for them.
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue · 2d0f0ca2
      Committed by David S. Miller
      Jeff Kirsher says:
      
      ====================
      1GbE Intel Wired LAN Driver Updates 2018-10-17
      
      This series adds support for the new igc driver.
      
      The igc driver is the new client driver supporting the Intel I225
      Ethernet Controller, which supports 2.5GbE speeds.  The reason for
      creating a new client driver, instead of adding support for the new
      device in e1000e, is that the silicon behaves more like devices
      supported by the igb driver.  It also did not make sense to add a client
      part to the igb driver, which supports only 1GbE server parts.
      
      This initial set of patches is designed for basic support (i.e. link and
      pass traffic).  Follow-on patch series will add more advanced support
      like VLAN, Wake-on-LAN, etc.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge tag 'mlx5-updates-2018-10-17' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux · 99e9acd8
      Committed by David S. Miller
      mlx5-updates-2018-10-17
      
      ========================================================================
      
      From Or Gerlitz <ogerlitz@mellanox.com>:
      
      This series from Paul adds support to mlx5 e-switch tc offloading of multiple priorities and chains.
      
      This is made of four building blocks (along with a few minor driver refactors):
      
      [1] Split FDB fast path prio to multiple namespaces
      
      Currently the FDB name-space contains two priorities, fast path (p0) and slow path (p1).
      The slow path contains the per representor SQ send-to-vport TX rule and the match-all
      RX miss rule. As a pre-step to support multi-chains and priorities, we split the FDB fast path
      to multiple namespaces  (sub namespaces), each with multiple priorities.
      
      [2] E-Switch chains and priorities
      
      A chain is a group of priorities. We use the fdb parallel sub-namespaces to implement chains,
      and a flow table for each priority in them.
      
      Because these namespaces are parallel and in series to the slow path
      fdb, the chains aren't connected to each other (but to the slow path),
      and one must use an explicit goto action to reach a different chain.
      
      Flow tables for the priorities are created on demand and destroyed
      once not used.
      
      [3] Add a no-append flow insertion mode, use it for TC offloads
      
      Enhance the driver fs core, such that if a no-append flag is set by the caller,
      we add a new FTE, instead of appending the actions of the inserted rule when
      the same match already exists.
      
      For encap rules, we defer the HW offloading till we have a valid neighbor. This can
      result in the packet hitting a lower priority rule in the HW DP. Use the no-append API
      to push these packets to the slow path FDB table, so they go to the TC kernel DP as done
      before priorities were supported.
      
      [4] Offloading tc priorities and chains for eswitch flows
      
      Using [1], [2] and [3] above we add the support for offloading both chains
      and priorities. To get to a new chain, use the tc goto action. We support
      a fixed prio range 1-16, and chains 0-3.
      =============================================================================
      Signed-off-by: David S. Miller <davem@davemloft.net>
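
The fixed ranges stated in [4] (prio 1-16, chains 0-3) amount to a simple bounds check before a rule is offloaded. The helper below is a sketch of that check only; the macro and function names are hypothetical and do not come from the mlx5 driver, which may derive its limits differently.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical limits taken from the ranges quoted in the merge message. */
#define MODEL_MAX_CHAIN 3u
#define MODEL_MIN_PRIO  1u
#define MODEL_MAX_PRIO  16u

/* Returns true when a (chain, prio) pair falls inside the quoted ranges. */
static bool chain_prio_in_range(unsigned int chain, unsigned int prio)
{
    assert(MODEL_MIN_PRIO <= MODEL_MAX_PRIO);
    return chain <= MODEL_MAX_CHAIN &&
           prio >= MODEL_MIN_PRIO && prio <= MODEL_MAX_PRIO;
}
```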
    • Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next · 8f18da47
      Committed by David S. Miller
      Steffen Klassert says:
      
      ====================
      pull request (net-next): ipsec-next 2018-10-18
      
      1) Remove an unnecessary dev->tstats check in xfrmi_get_stats64.
         From Li RongQing.
      
      2) We currently do a sizeof(element) instead of a sizeof(array)
         check when initializing the ovec array of the secpath.
         Currently this array can have only one element, so the code is
         OK but error-prone. Change this to do a sizeof(array)
         check so that we can add more elements in the future.
         From Li RongQing.
      
      3) Improve xfrm IPv6 address hashing by using the complete IPv6
         addresses for a hash. From Michal Kubecek.
      
      Please pull or let me know if there are problems.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
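
Item 2) of the pull request hinges on the difference between sizeof(element) and sizeof(array). The sketch below shows why the two only agree while the array holds a single element. All names (`struct path_model`, `struct offload_entry`, `MODEL_DEPTH`) are hypothetical, and the depth is set to 2 on purpose so the difference is visible; this is not the xfrm secpath code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_DEPTH 2  /* hypothetical; the message says the real array has one element */

/* Hypothetical element type. */
struct offload_entry {
    int status;
    unsigned int flags;
};

/* Hypothetical container with a fixed-size array member. */
struct path_model {
    int olen;
    struct offload_entry ovec[MODEL_DEPTH];
};

/* Error-prone form: clears only the first element of the array. */
static size_t clear_by_element(struct path_model *p)
{
    assert(p != NULL);
    memset(p->ovec, 0, sizeof(p->ovec[0]));
    return sizeof(p->ovec[0]);
}

/* Robust form: clears the whole array regardless of its length. */
static size_t clear_by_array(struct path_model *p)
{
    assert(p != NULL);
    memset(p->ovec, 0, sizeof(p->ovec));
    return sizeof(p->ovec);
}
```

With a depth of 1 both helpers clear the same number of bytes, which is why the original code worked; once the depth grows, only the sizeof(array) form keeps clearing every element.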
  2. 18 Oct, 2018 32 commits