Commits · 98a3e879907644c0b7e2f16436eb5cf24b9cd61f · openanolis / cloud-kernel

12 Apr, 2014 1 commit

Authored by Steve Wise on Apr 09, 2014

There is a race when moving a QP from RTS->CLOSING where a SQ work
request could be posted after the FW receives the RDMA_RI/FINI WR.
The SQ work request will never get processed, and should be completed
with FLUSHED status.  Function c4iw_flush_sq(), however, was dropping
the oldest SQ work request when in CLOSING or IDLE states, instead of
completing the pending work request. If that oldest pending work
request was actually complete and has a CQE in the CQ, then when that
CQE is processed in poll_cq, we'll BUG_ON() due to the inconsistent
SQ/CQ state.

This is a very small timing hole and has only been hit once so far.

The fix is two-fold:

1) c4iw_flush_sq() MUST always flush all non-completed WRs with FLUSHED
   status regardless of the QP state.

2) In c4iw_modify_rc_qp(), always set the "in error" bit on the queue
   before moving the state out of RTS.  This ensures that the state
   transition will not happen while another thread is in
   post_rc_send(), because set_state() and post_rc_send() both acquire
   the qp spinlock.  Also, once we transition the state out of RTS,
   subsequent calls to post_rc_send() will fail because the "in error"
   bit is set.  I don't think this fully closes the race where the FW
   can get a FINI followed by a SQ work request being posted (because
   they are posted to different EQs), but the #1 fix will handle the
   issue by flushing the SQ work request.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

b4e2901c
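
The two-part fix described in this commit can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the driver's code: `struct demo_sq`, `demo_post_send()`, `demo_leave_rts()` and `demo_flush_sq()` are hypothetical names, and the qp spinlock is left out because the model is single-threaded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_SQ_SIZE 8

enum demo_status { DEMO_PENDING, DEMO_SUCCESS, DEMO_FLUSHED };

/* Hypothetical stand-in for a send queue; not the driver's real layout. */
struct demo_sq {
    enum demo_status wr[DEMO_SQ_SIZE];
    unsigned int count;   /* number of WRs posted so far */
    bool in_error;        /* models the "in error" bit */
};

/* Posting fails once the queue has been marked in error (or is full). */
static int demo_post_send(struct demo_sq *sq)
{
    assert(sq != NULL);
    if (sq->in_error || sq->count == DEMO_SQ_SIZE)
        return -1;
    sq->wr[sq->count++] = DEMO_PENDING;
    return 0;
}

/* Part 2 of the fix: set the error bit before the state leaves RTS. */
static void demo_leave_rts(struct demo_sq *sq)
{
    sq->in_error = true;
}

/*
 * Part 1 of the fix: complete every WR that has not completed yet with
 * FLUSHED status, whatever state the QP is in. Returns the number flushed.
 */
static unsigned int demo_flush_sq(struct demo_sq *sq)
{
    unsigned int i, flushed = 0;

    for (i = 0; i < sq->count; i++) {
        if (sq->wr[i] == DEMO_PENDING) {
            sq->wr[i] = DEMO_FLUSHED;
            flushed++;
        }
    }
    return flushed;
}
```

In this model a WR that already completed keeps its status, while anything still pending at flush time ends up FLUSHED rather than being dropped.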

25 Mar, 2014 1 commit

RDMA/cxgb4: Ignore read response type 1 CQEs · 70b9c660

Authored by Steve Wise on Mar 21, 2014

These are generated by HW in some error cases and need to be
silently discarded.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

70b9c660

21 Mar, 2014 3 commits

RDMA/cxgb4: Fix incorrect BUG_ON conditions · 8a9c399e

Authored by Steve Wise on Mar 19, 2014

Based on original work from Jay Hernandez <jay@chelsio.com>

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

8a9c399e

RDMA/cxgb4: Cap CQ size at T4_MAX_IQ_SIZE · ffd43592

Authored by Steve Wise on Mar 19, 2014

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

ffd43592

RDMA/cxgb4: Fix four byte info leak in c4iw_create_cq() · e24a72a3

Authored by Dan Carpenter on Oct 19, 2013

There is a four byte hole at the end of the "uresp" struct after the
->qid_mask member.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

e24a72a3
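
The commit message above only says that a four byte hole exists; it does not say how the patch closes it. One common way to avoid leaking stale bytes through such a hole is to make the padding explicit and zero the whole struct before filling it. The sketch below assumes a hypothetical `struct demo_uresp` whose field names and layout are illustrative and are not the driver's real response struct.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical response struct. Without `reserved`, an ABI that aligns
 * uint64_t to 8 bytes would add 4 bytes of trailing padding after
 * qid_mask, and those bytes would carry whatever was on the stack.
 */
struct demo_uresp {
    uint64_t key;
    uint32_t qid_mask;
    uint32_t reserved;   /* explicit padding, always zero */
};

/* Zero everything first so no uninitialized byte can be copied out. */
static void demo_fill_uresp(struct demo_uresp *r, uint64_t key,
                            uint32_t qid_mask)
{
    assert(r != NULL);
    memset(r, 0, sizeof(*r));
    r->key = key;
    r->qid_mask = qid_mask;
}
```

With the explicit field, the struct has the same size on 32-bit and 64-bit builds and every byte of it is initialized before it would be handed to user space.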

14 Aug, 2013 2 commits

RDMA/cxgb4: Fix accounting for unsignaled SQ WRs to deal with wrap · 27ca34f5

Authored by Steve Wise on Aug 06, 2013

When determining how many WRs are completed with a signaled CQE,
correctly deal with queue wraps.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Vipul Pandya <vipul@chelsio.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

27ca34f5
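
Counting how many ring slots a signaled CQE covers has to handle the case where the completed index has wrapped past the end of the queue. A minimal sketch of that arithmetic follows; `demo_ring_distance()` and its parameters are hypothetical names chosen for illustration, not the driver's.

```c
#include <assert.h>

/*
 * Number of slots from index `from` up to, but not including, index `to`
 * in a ring of `size` slots. When `to` is numerically smaller than
 * `from`, the range has wrapped around the end of the ring.
 */
static unsigned int demo_ring_distance(unsigned int from, unsigned int to,
                                       unsigned int size)
{
    assert(size != 0 && from < size && to < size);
    if (to >= from)
        return to - from;
    return size - from + to;
}
```

For a ring of 8 slots, going from index 6 to index 1 covers slots 6, 7 and 0, so the distance is 3 rather than a negative or huge unsigned value.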

RDMA/cxgb4: Fix QP flush logic · 1cf24dce

Authored by Steve Wise on Aug 06, 2013

This patch makes the following fixes in QP flush logic:

- correctly flushes unsignaled WRs followed by a signaled WR
- supports flushing a CQ bound to multiple QPs
- resets cidx_flush if an active queue starts getting HW CQEs again
- marks WQ in error when we leave RTS. This was only being done for
  user queues, but we need it for kernel queues too so that
  post_send/post_recv will start returning the appropriate error
  synchronously
- eats unsignaled read resp CQEs. HW always inserts CQEs so we must
  silently discard them if the read work request was unsignaled.
- handles QP flushes with pending SW CQEs. The flush and out of order
  completion logic has a bug where if out of order completions are
  flushed but not yet polled by the consumer and the qp is then
  flushed then we end up inserting duplicate completions.
- c4iw_flush_sq() should only flush wrs that have not already been
  flushed.  Since we already track where in the SQ we've flushed via
  sq.cidx_flush, just start at that point and flush any remaining.
  This bug only caused a problem in the presence of unsignaled work
  requests.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Vipul Pandya <vipul@chelsio.com>

[ Fixed sparse warning due to htonl/ntohl confusion.  - Roland ]
Signed-off-by: Roland Dreier <roland@purestorage.com>

1cf24dce

29 Nov, 2011 1 commit

RDMA/cxgb4: Fix iw_cxgb4 count_rcqes() logic · c34c97ad

Authored by Jonathan Lallinger on Oct 20, 2011

Fix another place in the code where logic dealing with the t4_cqe was
using the wrong QID.  This fixes the counting logic so that it tests
against the SQ QID instead of the RQ QID when counting RCQES.

Signed-off-by: Jonathan Lallinger <jonathan@ogc.us>
Signed-off-by: Steve Wise <swise@ogc.us>
Signed-off-by: Roland Dreier <roland@purestorage.com>

c34c97ad

01 Nov, 2011 1 commit

RDMA/cxgb4: Serialize calls to CQ's comp_handler · 581bbe2c

Authored by Kumar Sanghvi on Oct 24, 2011

Commit 01e7da6b ("RDMA/cxgb4: Make sure flush CQ entries are
collected on connection close") introduced a potential problem where a
CQ's comp_handler can get called simultaneously from different places
in the iw_cxgb4 driver.  This does not comply with
Documentation/infiniband/core_locking.txt, which states that at any
given point in time, only one callback per CQ should be active.

This problem was reported by Parav Pandit <Parav.Pandit@Emulex.Com>.
Based on discussion between Parav Pandit and Steve Wise, this patch
fixes the above problem by serializing the calls to a CQ's
comp_handler using a spin_lock.

Reported-by: Parav Pandit <Parav.Pandit@Emulex.Com>
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

581bbe2c

15 Oct, 2011 1 commit

RDMA/cxgb4: Use correct QID in insert_recv_cqe() · e14d62c0

Authored by Jonathan Lallinger on Oct 13, 2011

When creating flushed receive CQEs, set the QPID field in the t4_cqe
to the SQ QID and not the RQ QID.  Otherwise the poll code will not
find the correct QP context.

Signed-off-by: Jonathan Lallinger <jonathan@ogc.us>
Signed-off-by: Steve Wise <swise@ogc.us>
Signed-off-by: Roland Dreier <roland@purestorage.com>

e14d62c0

18 Jun, 2011 1 commit

RDMA/cxgb4: Don't exceed hw IQ depth limit for user CQs · 2ff7d09a

Authored by Steve Wise on Jun 01, 2011

Memory allocated for user CQs gets rounded up to the next page
boundary. And after rounding, we recalculate the resulting IQ depth
and we need to make sure we don't exceed the HW limits.

This bug can result in a much smaller CQ being allocated than was
expected if the HW size field is exceeded, resulting in CQ overflow
failures.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

2ff7d09a
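
The round-up-then-recheck sequence in this commit can be sketched as below. All constants and names here are assumptions for illustration (page size, entry size, the depth limit, and `demo_user_cq_memsize()`), and shrinking by whole pages when the limit is exceeded is just one way to stay within it; the commit message does not spell out the exact method.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_PAGE_SIZE   4096u    /* assumed page size */
#define DEMO_CQE_SIZE    64u      /* assumed bytes per CQ entry */
#define DEMO_MAX_IQ_SIZE 65520u   /* assumed hardware depth limit */

/*
 * Round the queue memory up to a page boundary, recompute the depth that
 * memory can hold, and shrink by whole pages until the depth fits the
 * hardware limit. Updates *hwentries and returns the memory size.
 */
static size_t demo_user_cq_memsize(unsigned int *hwentries)
{
    size_t memsize;

    assert(hwentries != NULL && *hwentries > 0);
    memsize = (size_t)*hwentries * DEMO_CQE_SIZE;
    memsize = (memsize + DEMO_PAGE_SIZE - 1) / DEMO_PAGE_SIZE * DEMO_PAGE_SIZE;
    *hwentries = (unsigned int)(memsize / DEMO_CQE_SIZE);
    while (*hwentries > DEMO_MAX_IQ_SIZE) {
        memsize -= DEMO_PAGE_SIZE;
        *hwentries = (unsigned int)(memsize / DEMO_CQE_SIZE);
    }
    return memsize;
}
```

A request for 100 entries grows to 128 after page rounding, while a request already at the assumed limit is rounded past it and then pulled back by one page.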

29 Sep, 2010 2 commits

RDMA/cxgb4: Centralize the wait logic · aadc4df3

Authored by Steve Wise on Sep 10, 2010

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

aadc4df3

RDMA/cxgb4: Ignore TERMINATE CQEs · 6ff0e343

Authored by Steve Wise on Sep 10, 2010

T4 incorrectly inserts TERM CQEs into the CQ.  Silently ignore them.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

6ff0e343

28 Sep, 2010 1 commit

RDMA/cxgb4: Fix warnings about casts to/from pointers of different sizes · c8e081a1

Authored by Roland Dreier on Sep 27, 2010

Fix:

drivers/infiniband/hw/cxgb4/qp.c: In function ‘create_qp’:
drivers/infiniband/hw/cxgb4/qp.c:147: warning: cast from pointer to integer of different size
drivers/infiniband/hw/cxgb4/qp.c: In function ‘rdma_fini’:
drivers/infiniband/hw/cxgb4/qp.c:988: warning: cast from pointer to integer of different size
drivers/infiniband/hw/cxgb4/qp.c: In function ‘rdma_init’:
drivers/infiniband/hw/cxgb4/qp.c:1063: warning: cast from pointer to integer of different size
drivers/infiniband/hw/cxgb4/mem.c: In function ‘write_adapter_mem’:
drivers/infiniband/hw/cxgb4/mem.c:74: warning: cast from pointer to integer of different size
drivers/infiniband/hw/cxgb4/cq.c: In function ‘destroy_cq’:
drivers/infiniband/hw/cxgb4/cq.c:58: warning: cast from pointer to integer of different size
drivers/infiniband/hw/cxgb4/cq.c: In function ‘create_cq’:
drivers/infiniband/hw/cxgb4/cq.c:135: warning: cast from pointer to integer of different size
drivers/infiniband/hw/cxgb4/cm.c: In function ‘fw6_msg’:
drivers/infiniband/hw/cxgb4/cm.c:2326: warning: cast to pointer from integer of different size

by casting pointers to unsigned long instead of u64.

Reported-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

c8e081a1
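
The warnings listed in this commit come from converting directly between a pointer and a 64-bit integer on a build where pointers are 32 bits wide. The sketch below shows the cast-through-`unsigned long` pattern the message describes, using hypothetical helper names. It assumes `unsigned long` is pointer-sized, which holds on Linux; portable user-space code would normally use `uintptr_t` instead.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Store a pointer in a 64-bit cookie. Going through unsigned long first
 * keeps the pointer-to-integer conversion the same width as the pointer,
 * and the widening to 64 bits is then an ordinary integer conversion.
 */
static uint64_t demo_ptr_to_cookie(void *p)
{
    assert(sizeof(unsigned long) == sizeof(void *));
    return (uint64_t)(unsigned long)p;
}

/* Recover the pointer by narrowing to unsigned long before the cast. */
static void *demo_cookie_to_ptr(uint64_t cookie)
{
    return (void *)(unsigned long)cookie;
}
```

The round trip returns the original pointer on both 32-bit and 64-bit builds where the size assumption holds.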

22 Jul, 2010 1 commit

RDMA/cxgb4: Remove dependency on __GFP_NOFAIL · d3c814e8

Authored by David Rientjes on Jul 21, 2010

The alloc_skb() calls in various allocations are failable, so remove
__GFP_NOFAIL from their masks.

Signed-off-by: David Rientjes <rientjes@google.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

d3c814e8

07 Jul, 2010 2 commits

RDMA/cxgb4: Avoid false GTS CIDX_INC overflows · 1973e8b8

Authored by Steve Wise on Jun 10, 2010

The T4 IQ hw design assumes CIDX_INC credits will be returned on a
regular basis and always before the CIDX counter crosses over the PIDX
counter. For RDMA CQs, however, returning CIDX_INC credits is only
needed and desired when and if the CQ is armed for notification. This
can lead to a GTS write returning credits that causes the HW to reject
the credit update because it causes CIDX to pass PIDX. Once this
happens, the CIDX/PIDX counters get out of whack and an application
can miss a notification and get stuck blocked awaiting a notification.

To avoid this, we allocate the HW IQ at 2x the requested size.
This seems to avoid the false overflow failures. If we see more
issues with this, then we'll have to add code in the poll path to
return credits periodically (like when the amount reaches 1/2 the queue
depth). I would like to avoid this as it adds a PCI write transaction
for applications that never arm the CQ (like most MPIs).

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

1973e8b8

RDMA/cxgb4: Use the DMA state API instead of the pci equivalents · f38926aa

Authored by FUJITA Tomonori on Jun 03, 2010

This replaces the PCI DMA state API (include/linux/pci-dma.h) with the
DMA equivalents since the PCI DMA state API will be obsolete.

No functional change.

For further information about the background:

http://marc.info/?l=linux-netdev&m=127037540020276&w=2

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

f38926aa

25 May, 2010 2 commits

RDMA/cxgb4: Optimize CQ overflow detection · 84172dee

Authored by Steve Wise on May 20, 2010

1) save the timestamp flit in the cq when we consume a CQE.

2) always compare the saved flit with the previous entry flit when
   reading the next CQE entry.  If the flits don't match, then we
   have overflowed.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

84172dee
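
The two steps in this commit can be modeled with a tiny ring. This is a sketch under assumptions, not the driver's implementation: `struct demo_cq`, `demo_consume()` and `demo_overflowed()` are hypothetical names, and the model assumes the queue and the saved flit both start out zeroed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_CQ_SIZE 4

struct demo_cqe {
    uint64_t ts_flit;          /* timestamp flit written by the producer */
};

struct demo_cq {
    struct demo_cqe queue[DEMO_CQ_SIZE];
    unsigned int cidx;         /* next entry the consumer will read */
    uint64_t saved_flit;       /* flit of the last consumed entry */
};

/* Step 1: remember the flit of the entry being consumed. */
static void demo_consume(struct demo_cq *cq)
{
    assert(cq->cidx < DEMO_CQ_SIZE);
    cq->saved_flit = cq->queue[cq->cidx].ts_flit;
    cq->cidx = (cq->cidx + 1) % DEMO_CQ_SIZE;
}

/*
 * Step 2: before reading the next entry, check that the previous slot
 * still holds the flit we saved. If it changed, the producer has lapped
 * the consumer and overwritten entries.
 */
static bool demo_overflowed(const struct demo_cq *cq)
{
    unsigned int prev = (cq->cidx + DEMO_CQ_SIZE - 1) % DEMO_CQ_SIZE;

    return cq->queue[prev].ts_flit != cq->saved_flit;
}
```

The check costs one extra comparison per poll and needs no separate counter shared with the producer.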

RDMA/cxgb4: CQ size must be IQ size - 2 · 895cf5f3

Authored by Steve Wise on May 20, 2010

We need 1 extra entry for the status page and 1 to always have 1 free
entry to detect when the queue is full.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

895cf5f3
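
The sizing rule in this commit is simple arithmetic, sketched below with hypothetical helper names. It covers only the two reserved entries named in the message and leaves out any rounding or scaling the driver may apply on top.

```c
#include <assert.h>

/* IQ entries needed so that a CQ can hold `cq_entries` completions. */
static unsigned int demo_iq_entries(unsigned int cq_entries)
{
    unsigned int iq = cq_entries;

    iq += 1;   /* one entry for the status page */
    iq += 1;   /* one entry kept free to detect a full queue */
    return iq;
}

/* Usable CQ capacity for a given IQ size: the inverse of the above. */
static unsigned int demo_cq_capacity(unsigned int iq_entries)
{
    assert(iq_entries >= 2);
    return iq_entries - 2;
}
```

Keeping one slot permanently free is what lets a ring tell "full" apart from "empty" when only producer and consumer indices are compared.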

22 Apr, 2010 1 commit

RDMA/cxgb4: Add driver for Chelsio T4 RNIC · cfdda9d7

Authored by Steve Wise on Apr 21, 2010

Add an RDMA/iWARP driver for Chelsio T4 Ethernet adapters.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

cfdda9d7

openanolis / cloud-kernel: last synced successfully more than 1 year ago
