提交 · 04e10e2164fcfa05e14eff3c2757a5097f11d258 · openanolis / cloud-kernel

16 7月, 2014 1 次提交

iw_cxgb4: Detect Ing. Padding Boundary at run-time · 04e10e21

由 Hariprasad Shenai 提交于 7月 14, 2014

Updates iw_cxgb4 to determine the Ingress Padding Boundary from
cxgb4_lld_info, and take subsequent actions.
Signed-off-by: NSteve Wise <swise@opengridcomputing.com>
Signed-off-by: NHariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

04e10e21

11 6月, 2014 1 次提交

iw_cxgb4: Allocate and use IQs specifically for indirect interrupts · cf38be6d

由 Hariprasad Shenai 提交于 6月 06, 2014

Currently indirect interrupts for RDMA CQs funnel through the LLD's RDMA
RXQs, which also handle direct interrupts for offload CPLs during RDMA
connection setup/teardown.  The intended T4 usage model, however, is to
have indirect interrupts flow through dedicated IQs. IE not to mix
indirect interrupts with CPL messages in an IQ.  This patch adds the
concept of RDMA concentrator IQs, or CIQs, setup and maintained by the
LLD and exported to iw_cxgb4 for use when creating CQs. RDMA CPLs will
flow through the LLD's RDMA RXQs, and CQ interrupts flow through the
CIQs.

Design:

cxgb4 creates and exports an array of CIQs for the RDMA ULD.  These IQs
are sized according to the max available CQs available at adapter init.
In addition, these IQs don't need FL buffers since they only service
indirect interrupts.  One CIQ is setup per RX channel similar to the
RDMA RXQs.

iw_cxgb4 will utilize these CIQs based on the vector value passed into
create_cq().  The num_comp_vectors advertised by iw_cxgb4 will be the
number of CIQs configured, and thus the vector value will be the index
into the array of CIQs.

Based on original work by Steve Wise <swise@opengridcomputing.com>
Signed-off-by: NSteve Wise <swise@opengridcomputing.com>
Signed-off-by: NHariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

cf38be6d

30 5月, 2014 1 次提交

RDMA/cxgb4: Add missing padding at end of struct c4iw_create_cq_resp · b6f04d3d

由 Yann Droneaud 提交于 5月 05, 2014

The i386 ABI disagrees with most other ABIs regarding alignment of
data types larger than 4 bytes: on most ABIs a padding must be added
at end of the structures, while it is not required on i386.

So for most ABI struct c4iw_create_cq_resp gets implicitly padded
to be aligned on a 8 bytes multiple, while for i386, such padding
is not added.

The tool pahole can be used to find such implicit padding:

  $ pahole --anon_include \
           --nested_anon_include \
           --recursive \
           --class_name c4iw_create_cq_resp \
           drivers/infiniband/hw/cxgb4/iw_cxgb4.o

Then, structure layout can be compared between i386 and x86_64:

  +++ obj-i386/drivers/infiniband/hw/cxgb4/iw_cxgb4.o.pahole.txt   2014-03-28 11:43:05.547432195 +0100
  --- obj-x86_64/drivers/infiniband/hw/cxgb4/iw_cxgb4.o.pahole.txt 2014-03-28 10:55:10.990133017 +0100
  @@ -14,9 +13,8 @@ struct c4iw_create_cq_resp {
          __u32                      size;                 /*    28     4 */
          __u32                      qid_mask;             /*    32     4 */

  -       /* size: 36, cachelines: 1, members: 6 */
  -       /* last cacheline: 36 bytes */
  +       /* size: 40, cachelines: 1, members: 6 */
  +       /* padding: 4 */
  +       /* last cacheline: 40 bytes */
   };

This ABI disagreement will make an x86_64 kernel try to write past the
buffer provided by an i386 binary.

When boundary check will be implemented, the x86_64 kernel will refuse
to write past the i386 userspace provided buffer and the uverbs will
fail.

If the structure is on a page boundary and the next page is not
mapped, ib_copy_to_udata() will fail and the uverb will fail.

This patch adds an explicit padding at end of structure
c4iw_create_cq_resp, and, like 92b0ca7c ("IB/mlx5: Fix stack info
leak in mlx5_ib_alloc_ucontext()"), makes function c4iw_create_cq()
not writting this padding field to userspace. This way, x86_64 kernel
will be able to write struct c4iw_create_cq_resp as expected by
unpatched and patched i386 libcxgb4.

Link: http://marc.info/?i=cover.1399309513.git.ydroneaud@opteya.com
Cc: <stable@vger.kernel.org>
Fixes: cfdda9d7 ("RDMA/cxgb4: Add driver for Chelsio T4 RNIC")
Fixes: e24a72a3 ("RDMA/cxgb4: Fix four byte info leak in c4iw_create_cq()")
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: NYann Droneaud <ydroneaud@opteya.com>
Acked-by: NSteve Wise <swise@opengridcomputing.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

b6f04d3d

12 4月, 2014 2 次提交

RDMA/cxgb4: Use uninitialized_var() · 97df1c67

由 Steve Wise 提交于 4月 09, 2014

Signed-off-by: NSteve Wise <swise@opengridcomputing.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

97df1c67

RDMA/cxgb4: SQ flush fix · b4e2901c

由 Steve Wise 提交于 4月 09, 2014

There is a race when moving a QP from RTS->CLOSING where a SQ work
request could be posted after the FW receives the RDMA_RI/FINI WR.
The SQ work request will never get processed, and should be completed
with FLUSHED status.  Function c4iw_flush_sq(), however was dropping
the oldest SQ work request when in CLOSING or IDLE states, instead of
completing the pending work request. If that oldest pending work
request was actually complete and has a CQE in the CQ, then when that
CQE is proceessed in poll_cq, we'll BUG_ON() due to the inconsistent
SQ/CQ state.

This is a very small timing hole and has only been hit once so far.

The fix is two-fold:

1) c4iw_flush_sq() MUST always flush all non-completed WRs with FLUSHED
   status regardless of the QP state.

2) In c4iw_modify_rc_qp(), always set the "in error" bit on the queue
   before moving the state out of RTS.  This ensures that the state
   transition will not happen while another thread is in
   post_rc_send(), because set_state() and post_rc_send() both aquire
   the qp spinlock.  Also, once we transition the state out of RTS,
   subsequent calls to post_rc_send() will fail because the "in error"
   bit is set.  I don't think this fully closes the race where the FW
   can get a FINI followed a SQ work request being posted (because
   they are posted to differente EQs), but the #1 fix will handle the
   issue by flushing the SQ work request.
Signed-off-by: NSteve Wise <swise@opengridcomputing.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

b4e2901c

25 3月, 2014 1 次提交

RDMA/cxgb4: Ignore read reponse type 1 CQEs · 70b9c660

由 Steve Wise 提交于 3月 21, 2014

These are generated by HW in some error cases and need to be
silently discarded.
Signed-off-by: NSteve Wise <swise@opengridcomputing.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

70b9c660

21 3月, 2014 3 次提交

RDMA/cxgb4: Fix incorrect BUG_ON conditions · 8a9c399e

由 Steve Wise 提交于 3月 19, 2014

Based on original work from Jay Hernandez <jay@chelsio.com>
Signed-off-by: NSteve Wise <swise@opengridcomputing.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

8a9c399e

RDMA/cxgb4: Cap CQ size at T4_MAX_IQ_SIZE · ffd43592

由 Steve Wise 提交于 3月 19, 2014

Signed-off-by: NSteve Wise <swise@opengridcomputing.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

ffd43592

RDMA/cxgb4: Fix four byte info leak in c4iw_create_cq() · e24a72a3

由 Dan Carpenter 提交于 10月 19, 2013

There is a four byte hole at the end of the "uresp" struct after the
->qid_mask member.
Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

e24a72a3

14 8月, 2013 2 次提交

RDMA/cxgb4: Fix accounting for unsignaled SQ WRs to deal with wrap · 27ca34f5

由 Steve Wise 提交于 8月 06, 2013

When determining how many WRs are completed with a signaled CQE,
correctly deal with queue wraps.
Signed-off-by: NSteve Wise <swise@opengridcomputing.com>
Signed-off-by: NVipul Pandya <vipul@chelsio.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

27ca34f5

RDMA/cxgb4: Fix QP flush logic · 1cf24dce

由 Steve Wise 提交于 8月 06, 2013

This patch makes following fixes in QP flush logic:

- correctly flushes unsignaled WRs followed by a signaled WR
- supports for flushing a CQ bound to multiple QPs
- resets cidx_flush if a active queue starts getting HW CQEs again
- marks WQ in error when we leave RTS. This was only being done for
  user queues, but we need it for kernel queues too so that
  post_send/post_recv will start returning the appropriate error
  synchronously
- eats unsignaled read resp CQEs. HW always inserts CQEs so we must
  silently discard them if the read work request was unsignaled.
- handles QP flushes with pending SW CQEs. The flush and out of order
  completion logic has a bug where if out of order completions are
  flushed but not yet polled by the consumer and the qp is then
  flushed then we end up inserting duplicate completions.
- c4iw_flush_sq() should only flush wrs that have not already been
  flushed.  Since we already track where in the SQ we've flushed via
  sq.cidx_flush, just start at that point and flush any remaining.
  This bug only caused a problem in the presence of unsignaled work
  requests.
Signed-off-by: NSteve Wise <swise@opengridcomputing.com>
Signed-off-by: NVipul Pandya <vipul@chelsio.com>

[ Fixed sparse warning due to htonl/ntohl confusion.  - Roland ]
Signed-off-by: NRoland Dreier <roland@purestorage.com>

1cf24dce

29 11月, 2011 1 次提交

RDMA/cxgb4: Fix iw_cxgb4 count_rcqes() logic · c34c97ad

由 Jonathan Lallinger 提交于 10月 20, 2011

Fix another place in the code where logic dealing with the t4_cqe was
using the wrong QID.  This fixes the counting logic so that it tests
against the SQ QID instead of the RQ QID when counting RCQES.

Signed-off by: Jonathan Lallinger <jonathan@ogc.us>
Signed-off by: Steve Wise <swise@ogc.us>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

c34c97ad

01 11月, 2011 1 次提交

RDMA/cxgb4: Serialize calls to CQ's comp_handler · 581bbe2c

由 Kumar Sanghvi 提交于 10月 24, 2011

Commit 01e7da6b ("RDMA/cxgb4: Make sure flush CQ entries are
collected on connection close") introduced a potential problem where a
CQ's comp_handler can get called simultaneously from different places
in the iw_cxgb4 driver.  This does not comply with
Documentation/infiniband/core_locking.txt, which states that at a
given point of time, there should be only one callback per CQ should
be active.

This problem was reported by Parav Pandit <Parav.Pandit@Emulex.Com>.
Based on discussion between Parav Pandit and Steve Wise, this patch
fixes the above problem by serializing the calls to a CQ's
comp_handler using a spin_lock.
Reported-by: NParav Pandit <Parav.Pandit@Emulex.Com>
Signed-off-by: NKumar Sanghvi <kumaras@chelsio.com>
Acked-by: NSteve Wise <swise@opengridcomputing.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

581bbe2c

15 10月, 2011 1 次提交

RDMA/cxgb4: Use correct QID in insert_recv_cqe() · e14d62c0

由 Jonathan Lallinger 提交于 10月 13, 2011

When creating flushed receive CQEs, set the QPID field in the t4_cqe
to the SQ QID and not the RQ QID.  Otherwise the poll code will not
find the correct QP context.

Signed-off by: Jonathan Lallinger <jonathan@ogc.us>
Signed-off by: Steve Wise <swise@ogc.us>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

e14d62c0

18 6月, 2011 1 次提交

RDMA/cxgb4: Don't exceed hw IQ depth limit for user CQs · 2ff7d09a

由 Steve Wise 提交于 6月 01, 2011

Memory allocated for user CQs gets rounded up to the next page
boundary. And after rounding, we recalculate the resulting IQ depth
and we need to make sure we don't exceed the HW limits.

This bug can result a much smaller CQ allocated than was expected if
the HW size field is exceeded, resulting in CQ overflow failures.
Signed-off-by: NSteve Wise <swise@opengridcomputing.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

2ff7d09a

29 9月, 2010 2 次提交

RDMA/cxgb4: Centralize the wait logic · aadc4df3

由 Steve Wise 提交于 9月 10, 2010

Signed-off-by: NSteve Wise <swise@opengridcomputing.com>
Signed-off-by: NRoland Dreier <rolandd@cisco.com>

aadc4df3

RDMA/cxgb4: Ignore TERMINATE CQEs · 6ff0e343

由 Steve Wise 提交于 9月 10, 2010

T4 incorrectly inserts TERM CQEs into the CQ.  Silently ignore them.
Signed-off-by: NSteve Wise <swise@opengridcomputing.com>
Signed-off-by: NRoland Dreier <rolandd@cisco.com>

6ff0e343

28 9月, 2010 1 次提交

RDMA/cxgb4: Fix warnings about casts to/from pointers of different sizes · c8e081a1

由 Roland Dreier 提交于 9月 27, 2010

Fix:

drivers/infiniband/hw/cxgb4/qp.c: In function ‘create_qp’:
drivers/infiniband/hw/cxgb4/qp.c:147: warning: cast from pointer to integer of different size
drivers/infiniband/hw/cxgb4/qp.c: In function ‘rdma_fini’:
drivers/infiniband/hw/cxgb4/qp.c:988: warning: cast from pointer to integer of different size
drivers/infiniband/hw/cxgb4/qp.c: In function ‘rdma_init’:
drivers/infiniband/hw/cxgb4/qp.c:1063: warning: cast from pointer to integer of different size
drivers/infiniband/hw/cxgb4/mem.c: In function ‘write_adapter_mem’:
drivers/infiniband/hw/cxgb4/mem.c:74: warning: cast from pointer to integer of different size
drivers/infiniband/hw/cxgb4/cq.c: In function ‘destroy_cq’:
drivers/infiniband/hw/cxgb4/cq.c:58: warning: cast from pointer to integer of different size
drivers/infiniband/hw/cxgb4/cq.c: In function ‘create_cq’:
drivers/infiniband/hw/cxgb4/cq.c:135: warning: cast from pointer to integer of different size
drivers/infiniband/hw/cxgb4/cm.c: In function ‘fw6_msg’:
drivers/infiniband/hw/cxgb4/cm.c:2326: warning: cast to pointer from integer of different size

by casting pointers to unsigned long instead of u64.
Reported-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NRoland Dreier <rolandd@cisco.com>

c8e081a1

22 7月, 2010 1 次提交

RDMA/cxgb4: Remove dependency on __GFP_NOFAIL · d3c814e8

由 David Rientjes 提交于 7月 21, 2010

The alloc_skb() in various allocations are failable, so remove
__GFP_NOFAIL from their masks.
Signed-off-by: NDavid Rientjes <rientjes@google.com>
Acked-by: NSteve Wise <swise@opengridcomputing.com>
Signed-off-by: NRoland Dreier <rolandd@cisco.com>

d3c814e8

07 7月, 2010 2 次提交

RDMA/cxgb4: Avoid false GTS CIDX_INC overflows · 1973e8b8

由 Steve Wise 提交于 6月 10, 2010

The T4 IQ hw design assumes CIDX_INC credits will be returned on a
regular basis and always before the CIDX counter crosses over the PIDX
counter. For RDMA CQs, however, returning CIDX_INC credits is only
needed and desired when and if the CQ is armed for notification. This
can lead to a GTS write returning credits that causes the HW to reject
the credit update because it causes CIDX to pass PIDX. Once this
happens, the CIDX/PIDX counters get out of whack and an application
can miss a notification and get stuck blocked awaiting a notification.

To avoid this, we allocate the HW IQ 2x times the requested size.
This seems to avoid the false overflow failures. If we see more
issues with this, then we'll have to add code in the poll path to
return credits periodically like when the amount reaches 1/2 the queue
depth). I would like to avoid this as it adds a PCI write transaction
for applications that never arm the CQ (like most MPIs).
Signed-off-by: NSteve Wise <swise@opengridcomputing.com>
Signed-off-by: NRoland Dreier <rolandd@cisco.com>

1973e8b8

RDMA/cxgb4: Use the DMA state API instead of the pci equivalents · f38926aa

由 FUJITA Tomonori 提交于 6月 03, 2010

This replace the PCI DMA state API (include/linux/pci-dma.h) with the
DMA equivalents since the PCI DMA state API will be obsolete.

No functional change.

For further information about the background:

http://marc.info/?l=linux-netdev&m=127037540020276&w=2Signed-off-by: NFUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: NSteve Wise <swise@opengridcomputing.com>
Signed-off-by: NRoland Dreier <rolandd@cisco.com>

f38926aa

25 5月, 2010 2 次提交

RDMA/cxgb4: Optimize CQ overflow detection · 84172dee

由 Steve Wise 提交于 5月 20, 2010

1) save the timestamp flit in the cq when we consume a CQE.

2) always compare the saved flit with the previous entry flit when
   reading the next CQE entry.  If the flits don't compare, then we
   have overflowed.
Signed-off-by: NSteve Wise <swise@opengridcomputing.com>
Signed-off-by: NRoland Dreier <rolandd@cisco.com>

84172dee

RDMA/cxgb4: CQ size must be IQ size - 2 · 895cf5f3

由 Steve Wise 提交于 5月 20, 2010

We need 1 extra entry for the status page and 1 to always have 1 free
entry to detect when the queue is full.
Signed-off-by: NSteve Wise <swise@opengridcomputing.com>
Signed-off-by: NRoland Dreier <rolandd@cisco.com>

895cf5f3

22 4月, 2010 1 次提交

RDMA/cxgb4: Add driver for Chelsio T4 RNIC · cfdda9d7

由 Steve Wise 提交于 4月 21, 2010

Add an RDMA/iWARP driver for Chelsio T4 Ethernet adapters.
Signed-off-by: NSteve Wise <swise@opengridcomputing.com>
Signed-off-by: NRoland Dreier <rolandd@cisco.com>

cfdda9d7

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功