1. 24 Apr, 2008 1 commit
  2. 20 Apr, 2008 1 commit
  3. 17 Apr, 2008 12 commits
  4. 15 Feb, 2008 1 commit
  5. 09 Feb, 2008 1 commit
    • IB/mlx4: Use multiple WQ blocks to post smaller send WQEs · ea54b10c
      Jack Morgenstein committed
      ConnectX HCA supports shrinking WQEs, so that a single work request
      can be made of multiple units of wqe_shift.  This way, WRs can differ
      in size, and do not have to be a power of 2 in size, saving memory and
      speeding up send WR posting.  Unfortunately, if we do this then the
      wqe_index field in CQEs can't be used to look up the WR ID anymore, so
      our implementation does this only if selective signaling is off.
      
      Further, on 32-bit platforms, we can't use vmap() to make the QP
      buffer virtually contiguous. Thus we have to use constant-sized WRs to
      make sure a WR is always fully within a single page-sized chunk.
      
      Finally, we use WRs with the NOP opcode to avoid wrapping around the
      queue buffer in the middle of posting a WR, and we set the
      NoErrorCompletion bit to avoid getting completions with error for NOP
      WRs.  However, NEC is only supported starting with firmware 2.2.232,
      so we use constant-sized WRs for older firmware.  And, since MLX QPs
      only support SEND, we use constant-sized WRs in this case.
      
      When stamping during NOP posting, stamp only after setting the NOP
      WQE valid bit.
      Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
      Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
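The sizing and wrap-around rules described in the commit message above can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the driver's code: `struct sq_model`, `wr_blocks()`, and `post_wr()` are hypothetical names, the queue is modeled as a count of basic blocks of `1 << wqe_shift` bytes, and free-space tracking, selective signaling, and the firmware and QP-type checks are all left out.

```c
#include <assert.h>

/* Hypothetical model of a send queue built from fixed-size basic blocks. */
struct sq_model {
    unsigned wqe_shift; /* log2 of the basic block size in bytes */
    unsigned nblocks;   /* total number of basic blocks in the queue buffer */
    unsigned head;      /* blocks consumed so far (taken modulo nblocks) */
    unsigned nops;      /* number of padding NOP WRs posted */
};

/* Number of basic blocks a WR of wr_bytes occupies, rounded up. */
static unsigned wr_blocks(const struct sq_model *sq, unsigned wr_bytes)
{
    unsigned block = 1u << sq->wqe_shift;

    return (wr_bytes + block - 1) >> sq->wqe_shift;
}

/*
 * Reserve space for one WR and return the index of its first block.
 * If the WR would straddle the end of the buffer, the remaining tail
 * blocks are consumed by a NOP WR and the real WR starts at block 0.
 * Assumption: the caller has already checked that the queue has room.
 */
static unsigned post_wr(struct sq_model *sq, unsigned wr_bytes)
{
    unsigned need = wr_blocks(sq, wr_bytes);
    unsigned pos = sq->head % sq->nblocks;
    unsigned left = sq->nblocks - pos;

    assert(need > 0 && need <= sq->nblocks);

    if (need > left) {
        sq->head += left; /* pad the tail with a NOP WR */
        sq->nops++;
        pos = 0;
    }
    sq->head += need;
    return pos;
}
```

With 64-byte blocks and an 8-block queue, three consecutive 192-byte WRs in this model land at blocks 0, 3, and 0 again, with one NOP covering the two blocks left at the end of the buffer.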
  6. 07 Feb, 2008 1 commit
  7. 05 Feb, 2008 2 commits
  8. 26 Jan, 2008 1 commit
    • IB/mlx4: Micro-optimize mlx4_ib_poll_one() · b3226184
      Roland Dreier committed
      Rather than byte-swapping cqe->g_mlpath_rqpn each time we extract a
      field from it, byte-swap it once into a temporary variable.  This
      results in smaller, better code -- e.g., on 32-bit x86:
      
      add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-5 (-5)
      function                                     old     new   delta
      mlx4_ib_poll_cq                             1188    1183      -5
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
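The idea of swapping once and then extracting fields from the native-endian copy can be sketched in user-space C as follows. The types `struct cqe_model` and `struct wc_model`, the function `fill_wc()`, and the bit layout used here (low 24 bits for the remote QP number, 7 path bits above them, top bit as a GRH flag) are assumptions made for illustration, and POSIX `ntohl()` stands in for the kernel's `be32_to_cpu()`.

```c
#include <arpa/inet.h> /* ntohl() */
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the CQE: one big-endian 32-bit word. */
struct cqe_model {
    uint32_t g_mlpath_rqpn; /* stored big-endian, as written by hardware */
};

/* Hypothetical stand-in for the fields of a work completion. */
struct wc_model {
    uint32_t src_qp;
    uint8_t path_bits;
    int has_grh;
};

static void fill_wc(const struct cqe_model *cqe, struct wc_model *wc)
{
    uint32_t v;

    assert(cqe && wc);

    /* Byte-swap once into a temporary ... */
    v = ntohl(cqe->g_mlpath_rqpn);

    /* ... then extract every field from the native-endian copy. */
    wc->src_qp = v & 0xffffffu;
    wc->path_bits = (uint8_t)((v >> 24) & 0x7fu);
    wc->has_grh = (v & 0x80000000u) != 0;
}
```

The alternative, calling the swap separately inside each of the three extractions, computes the same values but leaves it to the compiler to notice that the swaps are redundant.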
  9. 09 Jan, 2008 1 commit
  10. 31 Oct, 2007 1 commit
  11. 19 Oct, 2007 1 commit
  12. 10 Oct, 2007 5 commits
  13. 24 Sep, 2007 1 commit
    • IB/mlx4: Fix data corruption triggered by wrong headroom marking order · 6e694ea3
      Jack Morgenstein committed
      This is an addendum to commit 0e6e7416 ("IB/mlx4: Handle new FW
      requirement for send request prefetching").  We also need to handle
      prefetch marking properly for S/G segments, or else the HCA may end up
      processing S/G segments that are not fully written and end up sending
      the wrong data.  This can actually cause data corruption in practice,
      especially on systems with relatively slow CPUs (where the HCA is more
      likely to prefetch while the CPU is in the middle of writing a work
      request into memory).
      
      We write S/G segments in reverse order into the WQE, in order to
      guarantee that the first dword of all cachelines containing S/G
      segments is written last (overwriting the headroom invalidation
      pattern).  The entire cacheline will thus contain valid data when the
      invalidation pattern is overwritten.
      Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
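The reverse-order write described in the commit message above can be modeled in plain C. This is a sketch under several assumptions, not the driver's code: `struct data_seg`, `struct sge`, `stamp_headroom()`, and `write_segs()` are hypothetical names; segments are 16 bytes and cachelines 64 bytes, so every fourth segment starts a cacheline; `0xffffffff` is assumed as the invalidation pattern; endianness conversion is omitted; and a C11 release fence stands in for the kernel's write barrier. Writing each segment's first dword last is an additional assumption of this sketch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define STAMP 0xffffffffu /* assumed headroom invalidation pattern */
#define SEGS_PER_LINE 4   /* assumed: 64-byte cacheline / 16-byte segment */

/* Hypothetical 16-byte data segment; byte_count is its first dword. */
struct data_seg {
    uint32_t byte_count;
    uint32_t lkey;
    uint64_t addr;
};

/* Hypothetical scatter/gather entry supplied by the caller. */
struct sge {
    uint64_t addr;
    uint32_t length;
    uint32_t lkey;
};

/* Mark the first dword of every cacheline as "not yet valid". */
static void stamp_headroom(struct data_seg *seg, size_t n)
{
    for (size_t i = 0; i < n; i += SEGS_PER_LINE)
        seg[i].byte_count = STAMP;
}

/*
 * Write segments last to first, so the segment that opens a cacheline
 * (and overwrites the stamp) is written after the rest of that line.
 * If order is non-NULL, it records the sequence in which the first
 * dwords were written, for inspection.
 */
static void write_segs(struct data_seg *seg, const struct sge *sg,
                       size_t n, size_t *order)
{
    assert(seg && sg);

    for (size_t i = n; i-- > 0; ) {
        seg[i].lkey = sg[i].lkey;
        seg[i].addr = sg[i].addr;

        /* Stand-in for a write barrier: body before first dword. */
        atomic_thread_fence(memory_order_release);

        seg[i].byte_count = sg[i].length;
        if (order)
            order[n - 1 - i] = i;
    }
}
```

In this model, for six segments the first dwords are written in the order 5, 4, 3, 2, 1, 0, so the stamps at segments 4 and 0 are each overwritten only after every other segment in the same cacheline already holds valid data.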
  14. 16 Aug, 2007 1 commit
  15. 04 Aug, 2007 1 commit
  16. 29 Jul, 2007 1 commit
  17. 21 Jul, 2007 2 commits
  18. 19 Jul, 2007 2 commits
  19. 18 Jul, 2007 4 commits