Commits · 36a8f01cd24b125aa027c71c1288588edde5322d · openanolis / cloud-kernel

20 Jul, 2012 3 commits

IB/qib: Add congestion control agent implementation · 36a8f01c

Authored by Mike Marciniszyn on Jul 19, 2012

Add a congestion control agent in the driver that handles gets and
sets from the congestion control manager in the fabric for the
Performance Scale Messaging (PSM) library.
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

IB/qib: Reduce sdma_lock contention · 551ace12

Authored by Mike Marciniszyn on Jul 19, 2012

Profiling has shown that sdma_lock is proving a bottleneck for
performance. The situations include:
 - RDMA reads when krcvqs > 1
 - post sends from multiple threads

For RDMA read, the current global qib_wq mechanism runs on all CPUs
and contends for the sdma_lock when multiple RDMA read requests are
fielded on different CPUs. For post sends, the direct call to
qib_do_send() from multiple threads causes the contention.

Since the sdma mechanism is per port, this fix converts the existing
workqueue to a per port single thread workqueue to reduce the lock
contention in the RDMA read case, and for any other case where the QP
is scheduled via the workqueue mechanism from more than 1 CPU.

For the post send case, this patch modifies the post send code to test
for a non-empty sdma engine.  If the sdma is not idle the (now single
thread) workqueue will be used to trigger the send engine instead of
the direct call to qib_do_send().
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

IB/qib: Fix an incorrect log message · f3331f88

Authored by Betty Dall on Jul 19, 2012

There is a cut-and-paste typo in the function qib_pci_slot_reset()
where it prints that the "link_reset" function is called rather than
the "slot_reset" function.  This makes the message misleading.
Signed-off-by: Betty Dall <betty.dall@hp.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

18 Jul, 2012 1 commit

IB/qib: Fix QP RCU sparse warnings · 1fb9fed6

Authored by Mike Marciniszyn on Jul 16, 2012

Commit af061a64 ("IB/qib: Use RCU for qpn lookup") introduced sparse
warnings.

This patch corrects those issues.
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

11 Jul, 2012 1 commit

IB/qib: Fix sparse RCU warnings in qib_keys.c · 7e230177

Authored by Mike Marciniszyn on Jul 06, 2012

Commit 8aac4cc3 ("IB/qib: RCU locking for MR validation") introduced
new sparse warnings in qib_keys.c.
Acked-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

09 Jul, 2012 3 commits

IB/qib: RCU locking for MR validation · 8aac4cc3

Authored by Mike Marciniszyn on Jun 27, 2012

Profiling indicates that MR validation locking is expensive.  The MR
table is largely read-only and is a suitable candidate for RCU locking.

The patch uses RCU locking during validation to eliminate one
lock/unlock during that validation.
Reviewed-by: Mike Heinz <michael.william.heinz@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

IB/qib: Avoid returning EBUSY from MR deregister · 6a82649f

Authored by Mike Marciniszyn on Jun 27, 2012

A timing issue can occur where qib_mr_dereg can return -EBUSY if the
MR use count is not zero.

This can occur if the MR is de-registered while RDMA read response
packets are being progressed from the SDMA ring.  The suspicion is
that the peer sent an RDMA read request, which has already been copied
across to the peer.  The peer sees the completion of his request and
then communicates to the responder that the MR is not needed any
longer.  The responder tries to de-register the MR, catching some
responses remaining in the SDMA ring holding the MR use count.

The code now uses a get/put paradigm to track MR use counts and
coordinates with the MR de-registration process using a completion
when the count has reached zero.  A timeout on the delay is in place
to catch other EBUSY issues.

The reference count protocol is as follows:
- The return to the user counts as 1
- A reference from the lk_table or the qib_ibdev counts as 1.
- Transient I/O operations increase/decrease as necessary

A lot of code duplication has been folded into the new routines
init_qib_mregion() and deinit_qib_mregion().  Additionally, explicit
initialization of fields to zero is now handled by kzalloc().

Also, the duplicated 'while.*num_sge' code that decrements reference
counts has been consolidated in qib_put_ss().
Reviewed-by: Ramkrishna Vepa <ramkrishna.vepa@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
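
The reference count protocol described in this commit can be sketched in userspace C. This is only an illustration under stated assumptions: `struct mregion`, `mr_init()`, `mr_get()`, `mr_put()` and `mr_dereg()` are hypothetical names, C11 atomics stand in for the kernel primitives, and a boolean flag stands in for the completion the driver waits on (with a timeout) during de-registration.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a memory region; not the driver's struct. */
struct mregion {
    atomic_int refcount;   /* outstanding references */
    atomic_bool released;  /* stands in for the kernel completion */
};

static void mr_init(struct mregion *mr)
{
    /* One reference for the handle returned to the user, one for the table. */
    atomic_init(&mr->refcount, 2);
    atomic_init(&mr->released, false);
}

/* Transient I/O takes a reference while it uses the region. */
static void mr_get(struct mregion *mr)
{
    atomic_fetch_add(&mr->refcount, 1);
}

static void mr_put(struct mregion *mr)
{
    /* fetch_sub returns the previous value; 1 means this was the last reference. */
    if (atomic_fetch_sub(&mr->refcount, 1) == 1)
        atomic_store(&mr->released, true);
}

/* Drop the table and user references, then report whether the region is
 * already free. A real driver would wait here, bounded by a timeout,
 * instead of returning immediately. */
static bool mr_dereg(struct mregion *mr)
{
    mr_put(mr);  /* table reference */
    mr_put(mr);  /* user reference */
    return atomic_load(&mr->released);
}
```

In this sketch, a de-registration that races with in-flight I/O sees `mr_dereg()` return false, and the final `mr_put()` from the I/O path is what marks the region released.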

IB/qib: Fix UC MR refs for immediate operations · 354dff1b

Authored by Mike Marciniszyn on Jun 27, 2012

An MR reference leak exists when handling UC RDMA writes with
immediate data because we manipulate the reference counts as if the
operation had been a send.

This patch moves the last_imm label so that the RDMA write operations
with immediate data converge at the cq building code.  The copy/mr
deref code is now done correctly prior to the branch to last_imm.
Reviewed-by: Edward Mascarenhas <edward.mascarenhas@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

20 Jun, 2012 1 commit

RDMA/cma: QP type check on received REQs should be AND not OR · 4dd81e89

Authored by Sean Hefty on Jun 14, 2012

Change || check to the intended && when checking the QP type in a
received connection request against the listening endpoint.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
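
To show why the operator matters, here is a small sketch of the two forms of such a check. The enums and function names are hypothetical and the condition is simplified; it is not the actual cma code, only an assumed shape in which the event type and the QP type must both match.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the event and QP type enums. */
enum event_type { REQ_RECEIVED, SIDR_REQ_RECEIVED };
enum qp_type { QPT_RC, QPT_UD };

/* With "||", any connection request matches, whatever its QP type. */
static bool req_matches_or(enum event_type ev, enum qp_type req_qp,
                           enum qp_type listen_qp)
{
    return ev == REQ_RECEIVED || req_qp == listen_qp;
}

/* With "&&", the request matches only if its QP type equals the listener's. */
static bool req_matches_and(enum event_type ev, enum qp_type req_qp,
                            enum qp_type listen_qp)
{
    return ev == REQ_RECEIVED && req_qp == listen_qp;
}
```

Under these assumptions, a request carrying a UD QP type against an RC listener passes the `||` form but is rejected by the `&&` form.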

15 Jun, 2012 1 commit

RDMA/ocrdma: Fix off by one in ocrdma_query_gid() · 7b33dc2b

Authored by Dan Carpenter on Jun 14, 2012

The dev->sgid_tbl[] array is allocated in ocrdma_alloc_resources().
It has OCRDMA_MAX_SGID elements so the test here is off by one.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
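
A minimal sketch of this kind of bounds check follows. `MAX_SGID`, `struct gid` and `query_gid()` are hypothetical stand-ins, and the sketch assumes the off-by-one was a `>` test where `>=` was needed; the commit message does not spell out the exact expression.

```c
#include <stddef.h>

#define MAX_SGID 8  /* hypothetical table size, stands in for OCRDMA_MAX_SGID */

struct gid { unsigned char raw[16]; };

static struct gid sgid_tbl[MAX_SGID];

/* Returns a pointer to the entry, or NULL when index is out of range.
 * Valid indices are 0 .. MAX_SGID - 1, so the rejection test must be
 * ">=": a ">" test would let index == MAX_SGID read past the end. */
static const struct gid *query_gid(int index)
{
    if (index < 0 || index >= MAX_SGID)
        return NULL;
    return &sgid_tbl[index];
}
```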

12 Jun, 2012 4 commits

RDMA/ocrdma: Fixed RQ error CQE polling · a3698a9b

Authored by Parav Pandit on Jun 11, 2012

Fix RQ/SRQ error CQE polling.  Return the error CQE to the consumer in
the error case; previously it was not returned.
Signed-off-by: Parav Pandit <parav.pandit@emulex.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

RDMA/ocrdma: Correct queue SGE calculation · 634c5796

Authored by Mahesh Vardhamanaiah on Jun 08, 2012

Fix max sge calculation for sq, rq, srq for all hardware types.
Signed-off-by: Mahesh Vardhamanaiah <mahesh.vardhamanaiah@emulex.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

RDMA/ocrdma: Correct reported max queue sizes · 07bb5424

Authored by Mahesh Vardhamanaiah on Jun 08, 2012

Fix code to read the max wqe and max rqe values from mailbox response.
Signed-off-by: Mahesh Vardhamanaiah <mahesh.vardhamanaiah@emulex.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

RDMA/ocrdma: Fixed GID table for vlan and events · 6ab6827e

Authored by Parav Pandit on Jun 08, 2012

1. Fix reporting GID table addition events.
2. Enable vlan based GID entries only when VLAN is enabled at compile
   time (test CONFIG_VLAN_8021Q / CONFIG_VLAN_8021Q_MODULE).
Signed-off-by: Parav Pandit <parav.pandit@emulex.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

07 Jun, 2012 1 commit

IB/mlx4: Fix max_wqe capacity reported from query device · fc2d0044

Authored by Sagi Grimberg on May 24, 2012

1. Limit the max number of WQEs per QP reported when querying the
   device, so that ib_create_qp() will not fail for a QP size that the
   device claimed to support due to additional headroom WQEs being
   allocated.

2. Limit qp resources accepted for ib_create_qp() to the limits
   reported in ib_query_device().  In kernel space, make sure that the
   limits returned to the caller following qp creation also lie within
   the reported device limits. For userspace, report as before, and do
   adjustment in libmlx4 (so as not to break ABI).
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Sagi Grimberg <sagig@mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

04 Jun, 2012 2 commits

IB/mlx4: Fix EQ deallocation in legacy mode · 3aac6ff1

Authored by Shlomo Pongratz on May 24, 2012

Commit e605b743 ("IB/mlx4: Increase the number of vectors (EQs)
available for ULPs") didn't correctly handle the case where there
aren't enough MSI-X vectors to increase the number of EQs, so only the
legacy EQs are allocated.  This results in an attempt to memset() to
zero the EQ table which was never allocated and a kernel crash.

Fix this by checking in the teardown flow if the table of EQs was ever
allocated.  Also remove some unneeded setting to zero of the EQ
related fields in struct mlx4_ib_dev.
Signed-off-by: Shlomo Pongratz <shlomop@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

RDMA/cxgb4: Fix crash when peer address is 0.0.0.0 · 71b43fd5

Authored by Thadeu Lima de Souza Cascardo on May 17, 2012

When using rping -c -a 0.0.0.0 with iw_cxgb4, the system crashes when
rdma_connect() is called.  ip_dev_find() will return NULL, but pdev is
accessed anyway.

Checking that pdev is NULL and returning -ENODEV prevents the system
from crashing.
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

30 May, 2012 3 commits

RDMA/ocrdma: Remove unnecessary version.h includes · 7ad5e449

Authored by Devendra Naga on May 29, 2012

"make versioncheck" shows:

    drivers/infiniband/hw/ocrdma/ocrdma_main.c: 29 linux/version.h not needed.
    drivers/infiniband/hw/ocrdma/ocrdma_verbs.h: 31 linux/version.h not needed.
Signed-off-by: Devendra Naga <devendra.aaru@gmail.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

RDMA/ocrdma: Fix signaled event for SRQ_LIMIT_REACHED · 804eaf29

Authored by Parav Pandit on May 23, 2012

Signed-off-by: Parav Pandit <parav.pandit@emulex.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

RDMA/ocrdma: Correct queue free count math · cd4fedf9

Authored by Parav Pandit on May 23, 2012

Correct queue free count math for SQ and RQ for all hardware types.
Update the user-kernel ABI.
Signed-off-by: Parav Pandit <parav.pandit@emulex.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

22 May, 2012 1 commit

RDMA/cxgb4: Include vmalloc.h for vmalloc and vfree · e572568f

Authored by Vipul Pandya on May 21, 2012

Signed-off-by: Vipul Pandya <vipul@chelsio.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

19 May, 2012 10 commits

IB/mlx4: Fix mlx4_ib_add() error flow · 035b1032

Authored by Jack Morgenstein on May 10, 2012

We need to use a different loop index for mlx4_counter_alloc() and for
device_create_file() iterations: the mlx4_counter_alloc() loop index
is used in the error flow to free counters.

If the same loop index is used for device_create_file() and, say, the
device_create_file() loop fails on the first iteration, the allocated
counters will not be freed.
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <roland@purestorage.com>
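
The error-flow problem can be illustrated with a short sketch. All names here (`counter_alloc()`, `counter_free()`, `create_file()`, `device_add()`, `NUM_PORTS`, `NUM_ATTRS`) are hypothetical stand-ins, not the mlx4 functions; the point is only that the unwind loop relies on `i` still holding the number of allocated counters, so the second loop needs its own index `j`.

```c
#include <stdbool.h>

#define NUM_PORTS 2   /* hypothetical number of counters to allocate */
#define NUM_ATTRS 3   /* hypothetical number of attribute files */

static int counters_in_use;

/* Hypothetical resource helpers; allocation always succeeds in this sketch. */
static bool counter_alloc(void) { counters_in_use++; return true; }
static void counter_free(void) { counters_in_use--; }

/* Pretend file creation fails at position fail_at (use -1 for no failure). */
static bool create_file(int j, int fail_at) { return j != fail_at; }

static int device_add(int fail_at)
{
    int i, j;

    for (i = 0; i < NUM_PORTS; i++)
        if (!counter_alloc())
            goto err_counter;

    /* Separate index: reusing i here would reset it, and a failure on the
     * first iteration would leave the unwind loop with nothing to free. */
    for (j = 0; j < NUM_ATTRS; j++)
        if (!create_file(j, fail_at))
            goto err_counter;

    return 0;

err_counter:
    /* i counts the counters allocated so far; release each of them. */
    while (--i >= 0)
        counter_free();
    return -1;
}
```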

IB/iser: Fix error flow in iser ep connection establishment · 7d9c0de4

Authored by Or Gerlitz on Apr 29, 2012

The current error flow code was releasing the IB connection object and
calling iscsi_destroy_endpoint() directly without going through the
reference counting mechanism introduced in commit 39ff05db ("IB/iser:
Enhance disconnection logic for multi-pathing"). This resulted in a
double free of the iscsi endpoint object, which causes a kernel NULL
pointer dereference.  Fix that by plugging into the IB conn reference
counting correctly.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

IB/mlx4: Increase the number of vectors (EQs) available for ULPs · e605b743

Authored by Shlomo Pongratz on Apr 29, 2012

Enable IB ULPs to use a larger portion of the device EQs (which map to
IRQs). The mlx4_ib driver follows the mlx4_core framework of the EQs
to be divided among the device ports. In this scheme, for each IB
port, the number of allocated EQs follows the number of cores, subject
to other system constraints, such as the number of available MSI-X vectors.
Signed-off-by: Shlomo Pongratz <shlomop@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

RDMA/cxgb4: Add query_qp support · 67bbc055