1. 11 7月, 2013 1 次提交
  2. 26 6月, 2013 1 次提交
  3. 20 6月, 2013 1 次提交
  4. 25 4月, 2013 1 次提交
    • A
      net/mlx4_en: Add HW timestamping (TS) support · ec693d47
      Amir Vadai 提交于
      The patch allows to enable/disable HW timestamping for incoming and/or
      outgoing packets. It adds and initializes all structs and callbacks
      needed by kernel TS API.
      To enable/disable HW timestamping appropriate ioctl should be used.
      Currently HWTSTAMP_FILTER_ALL/NONE and HWTSAMP_TX_ON/OFF only are
      supported.
      When enabling TS on receive flow - VLAN stripping will be disabled.
      Also were made all relevant changes in RX/TX flows to consider TS request
      and plant HW timestamps into relevant structures.
      mlx4_ib was fixed to compile with new mlx4_cq_alloc() signature.
      Signed-off-by: NEugenia Emantayev <eugenia@mellanox.com>
      Signed-off-by: NAmir Vadai <amirv@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ec693d47
  5. 20 4月, 2013 1 次提交
  6. 28 2月, 2013 1 次提交
    • S
      hlist: drop the node parameter from iterators · b67bfe0d
      Sasha Levin 提交于
      I'm not sure why, but the hlist for each entry iterators were conceived
      
              list_for_each_entry(pos, head, member)
      
      The hlist ones were greedy and wanted an extra parameter:
      
              hlist_for_each_entry(tpos, pos, head, member)
      
      Why did they need an extra pos parameter? I'm not quite sure. Not only
      they don't really need it, it also prevents the iterator from looking
      exactly like the list iterator, which is unfortunate.
      
      Besides the semantic patch, there was some manual work required:
      
       - Fix up the actual hlist iterators in linux/list.h
       - Fix up the declaration of other iterators based on the hlist ones.
       - A very small amount of places were using the 'node' parameter, this
       was modified to use 'obj->member' instead.
       - Coccinelle didn't handle the hlist_for_each_entry_safe iterator
       properly, so those had to be fixed up manually.
      
      The semantic patch which is mostly the work of Peter Senna Tschudin is here:
      
      @@
      iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host;
      
      type T;
      expression a,c,d,e;
      identifier b;
      statement S;
      @@
      
      -T b;
          <+... when != b
      (
      hlist_for_each_entry(a,
      - b,
      c, d) S
      |
      hlist_for_each_entry_continue(a,
      - b,
      c) S
      |
      hlist_for_each_entry_from(a,
      - b,
      c) S
      |
      hlist_for_each_entry_rcu(a,
      - b,
      c, d) S
      |
      hlist_for_each_entry_rcu_bh(a,
      - b,
      c, d) S
      |
      hlist_for_each_entry_continue_rcu_bh(a,
      - b,
      c) S
      |
      for_each_busy_worker(a, c,
      - b,
      d) S
      |
      ax25_uid_for_each(a,
      - b,
      c) S
      |
      ax25_for_each(a,
      - b,
      c) S
      |
      inet_bind_bucket_for_each(a,
      - b,
      c) S
      |
      sctp_for_each_hentry(a,
      - b,
      c) S
      |
      sk_for_each(a,
      - b,
      c) S
      |
      sk_for_each_rcu(a,
      - b,
      c) S
      |
      sk_for_each_from
      -(a, b)
      +(a)
      S
      + sk_for_each_from(a) S
      |
      sk_for_each_safe(a,
      - b,
      c, d) S
      |
      sk_for_each_bound(a,
      - b,
      c) S
      |
      hlist_for_each_entry_safe(a,
      - b,
      c, d, e) S
      |
      hlist_for_each_entry_continue_rcu(a,
      - b,
      c) S
      |
      nr_neigh_for_each(a,
      - b,
      c) S
      |
      nr_neigh_for_each_safe(a,
      - b,
      c, d) S
      |
      nr_node_for_each(a,
      - b,
      c) S
      |
      nr_node_for_each_safe(a,
      - b,
      c, d) S
      |
      - for_each_gfn_sp(a, c, d, b) S
      + for_each_gfn_sp(a, c, d) S
      |
      - for_each_gfn_indirect_valid_sp(a, c, d, b) S
      + for_each_gfn_indirect_valid_sp(a, c, d) S
      |
      for_each_host(a,
      - b,
      c) S
      |
      for_each_host_safe(a,
      - b,
      c, d) S
      |
      for_each_mesh_entry(a,
      - b,
      c, d) S
      )
          ...+>
      
      [akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c]
      [akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c]
      [akpm@linux-foundation.org: checkpatch fixes]
      [akpm@linux-foundation.org: fix warnings]
      [akpm@linux-foudnation.org: redo intrusive kvm changes]
      Tested-by: NPeter Senna Tschudin <peter.senna@gmail.com>
      Acked-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: NSasha Levin <sasha.levin@oracle.com>
      Cc: Wu Fengguang <fengguang.wu@intel.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Gleb Natapov <gleb@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b67bfe0d
  7. 09 2月, 2013 1 次提交
  8. 08 2月, 2013 3 次提交
  9. 27 11月, 2012 1 次提交
    • O
      mlx4: 64-byte CQE/EQE support · 08ff3235
      Or Gerlitz 提交于
      ConnectX-3 devices can use either 64- or 32-byte completion queue
      entries (CQEs) and event queue entries (EQEs).  Using 64-byte
      EQEs/CQEs performs better because each entry is aligned to a complete
      cacheline.  This patch queries the HCA's capabilities, and if it
      supports 64-byte CQEs and EQES the driver will configure the HW to
      work in 64-byte mode.
      
      The 32-byte vs 64-byte mode is global per HCA and not per CQ or EQ.
      
      Since this mode is global, userspace (libmlx4) must be updated to work
      with the configured CQE size, and guests using SR-IOV virtual
      functions need to know both EQE and CQE size.
      
      In case one of the 64-byte CQE/EQE capabilities is activated, the
      patch makes sure that older guest drivers that use the QUERY_DEV_FUNC
      command (e.g as done in mlx4_core of Linux 3.3..3.6) will notice that
      they need an update to be able to work with the PPF. This is done by
      changing the returned pf_context_behaviour not to be zero any more. In
      case none of these capabilities is activated that value remains zero
      and older guest drivers can run OK.
      
      The SRIOV related flow is as follows
      
      1. the PPF does the detection of the new capabilities using
         QUERY_DEV_CAP command.
      
      2. the PPF activates the new capabilities using INIT_HCA.
      
      3. the VF detects if the PPF activated the capabilities using
         QUERY_HCA, and if this is the case activates them for itself too.
      
      Note that the VF detects that it must be aware to the new PF behaviour
      using QUERY_FUNC_CAP.  Steps 1 and 2 apply also for native mode.
      
      User space notification is done through a new field introduced in
      struct mlx4_ib_ucontext which holds device capabilities for which user
      space must take action. This changes the binary interface so the ABI
      towards libmlx4 exposed through uverbs is bumped from 3 to 4 but only
      when **needed** i.e. only when the driver does use 64-byte CQEs or
      future device capabilities which must be in sync by user space. This
      practice allows to work with unmodified libmlx4 on older devices (e.g
      A0, B0) which don't support 64-byte CQEs.
      
      In order to keep existing systems functional when they update to a
      newer kernel that contains these changes in VF and userspace ABI, a
      module parameter enable_64b_cqe_eqe must be set to enable 64-byte
      mode; the default is currently false.
      Signed-off-by: NEli Cohen <eli@mellanox.com>
      Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: NRoland Dreier <roland@purestorage.com>
      08ff3235
  10. 20 11月, 2012 1 次提交
  11. 04 8月, 2012 1 次提交
  12. 20 7月, 2012 1 次提交
    • T
      mlx4_en: map entire pages to increase throughput · 4cce66cd
      Thadeu Lima de Souza Cascardo 提交于
      In its receive path, mlx4_en driver maps each page chunk that it pushes
      to the hardware and unmaps it when pushing it up the stack. This limits
      throughput to about 3Gbps on a Power7 8-core machine.
      
      One solution is to map the entire allocated page at once. However, this
      requires that we keep track of every page fragment we give to a
      descriptor. We also need to work with the discipline that all fragments will
      be released (in the sense that it will not be reused by the driver
      anymore) in the order they are allocated to the driver.
      
      This requires that we don't reuse any fragments, every single one of
      them must be reallocated. We do that by releasing all the fragments that
      are processed and only after finished processing the descriptors, we
      start the refill.
      
      We also must somehow guarantee that we either refill all fragments in a
      descriptor or none at all, without resorting to giving up a page
      fragment that we would have already given. Otherwise, we would break the
      discipline of only releasing the fragments in the order they were
      allocated.
      
      This has passed page allocation fault injections (restricted to the
      driver by using required-start and required-end) and device hotplug
      while 16 TCP streams were able to deliver more than 9Gbps.
      Signed-off-by: NThadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4cce66cd
  13. 19 7月, 2012 1 次提交
    • A
      net/mlx4_en: Add accelerated RFS support · 1eb8c695
      Amir Vadai 提交于
      Use RFS infrastructure and flow steering in HW to keep CPU
      affinity of rx interrupts and application per TCP stream.
      
      A flow steering filter is added to the HW whenever the RFS
      ndo callback is invoked by core networking code.
      
      Because the invocation takes place in interrupt context, the
      actual setup of HW is done using workqueue. Whenever new filter
      is added, the driver checks for expiry of existing filters.
      
      Since there's window in time between the point where the core
      RFS code invoked the ndo callback, to the point where the HW
      is configured from the workqueue context, the 2nd, 3rd etc
      packets from that stream will cause the net core to invoke
      the callback again and again.
      
      To prevent inefficient/double configuration of the HW, the filters
      are kept in a database which is indexed using hash function to enable
      fast access.
      Signed-off-by: NAmir Vadai <amirv@mellanox.com>
      Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1eb8c695
  14. 08 7月, 2012 1 次提交
  15. 05 4月, 2012 1 次提交
    • A
      net/mlx4_en: Force user priority by QP attribute · 0e98b523
      Amir Vadai 提交于
      Instead of relying on HW to change schedule queue by UP, schedule
      queue is fixed for a tx_ring, and UP in WQE is ignored in this aspect.  This
      resolves two issues with untagged traffic:
      1. untagged traffic has no UP in packet which is needed for QoS. The change
         above allows setting the schedule queue (and by that the UP) of such a stream.
      2. BlueFlame uses the same field used by vlan tag. So forcing UP from QPC
         allows using BF for untagged but prioritized traffic.
      
      In old firmware that force UP is not supported, untagged traffic will not subject to
      QoS.
      
      Because UP is set by QP, need to always have a tx ring per UP, even if pfcrx
      module paramter is false.
      Signed-off-by: NAmir Vadai <amirv@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0e98b523
  16. 07 3月, 2012 2 次提交
  17. 07 2月, 2012 3 次提交
  18. 01 2月, 2012 1 次提交
  19. 19 1月, 2012 1 次提交
  20. 21 12月, 2011 1 次提交
  21. 14 12月, 2011 1 次提交
  22. 28 11月, 2011 2 次提交
  23. 15 11月, 2011 1 次提交
  24. 21 10月, 2011 1 次提交
  25. 20 10月, 2011 1 次提交
  26. 19 10月, 2011 5 次提交
  27. 11 8月, 2011 1 次提交
  28. 22 7月, 2011 1 次提交
  29. 22 6月, 2011 1 次提交
  30. 16 4月, 2011 1 次提交