提交 · e20d583818a5d6fc052e59fe2345d82ffd089462 · openanolis / cloud-kernel

01 10月, 2012 1 次提交

IB/qib: Add a qib driver version · e20d5838

由 Dean Luick 提交于 9月 13, 2012

Reviewed-by: NMike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: NDean Luick <dean.luick@intel.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

e20d5838

15 9月, 2012 2 次提交

IB/qib: Fix failure of compliance test C14-024#06_LocalPortNum · 4c355005

由 Mike Marciniszyn 提交于 9月 12, 2012

Commit 3236b2d4 ("IB/qib: MADs with misset M_Keys should return
failure") introduced a return code assignment that unfortunately
introduced an unconditional exit for the routine due to missing braces.

This patch adds the braces to correct the original patch.
Reviewed-by: NDean Luick <dean.luick@intel.com>
Signed-off-by: NMike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

4c355005

RDMA/ocrdma: Fix CQE expansion of unsignaled WQE · ae3bca90

由 Parav Pandit 提交于 8月 17, 2012

Fix CQE expansion of unsignaled WQE -- don't expand the CQE when the
WQE index of the completed CQE matches with last pending WQE (tail) in
the queue.
Signed-off-by: NParav Pandit <parav.pandit@emulex.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

ae3bca90

13 9月, 2012 2 次提交

IPoIB: Fix AB-BA deadlock when deleting neighbours · b5120a6e

由 Shlomo Pongratz 提交于 8月 29, 2012

Lockdep points out a circular locking dependency betwwen the ipoib
device priv spinlock (priv->lock) and the neighbour table rwlock
(ntbl->rwlock).

In the normal path, ie neigbour garbage collection task, the neigh
table rwlock is taken first and then if the neighbour needs to be
deleted, priv->lock is taken.

However in some error paths, such as in ipoib_cm_handle_tx_wc(),
priv->lock is taken first and then ipoib_neigh_free routine is called
which in turn takes the neighbour table ntbl->rwlock.

The solution is to get rid the neigh table rwlock completely and use
only priv->lock.
Signed-off-by: NShlomo Pongratz <shlomop@mellanox.com>
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

b5120a6e

IPoIB: Fix memory leak in the neigh table deletion flow · 66172c09

由 Shlomo Pongratz 提交于 8月 29, 2012

If the neighbours hash table is empty when unloading the module, then
ipoib_flush_neighs(), the cleanup routine, isn't called and the
memory used for the hash table itself leaked.

To fix this, ipoib_flush_neighs() is allways called, and another
completion object is added to signal when the table is freed.

Once invoked, ipoib_flush_neighs() flushes all the neighbours (if
there are any), calls the the hash table RCU free routine, which now
signals completion of the deletion process, and waits for the last
neighbour to be freed.
Signed-off-by: NShlomo Pongratz <shlomop@mellanox.com>
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

66172c09

08 9月, 2012 1 次提交

RDMA/cxgb4: Move dereference below NULL test · 92dd6c3d

由 Wei Yongjun 提交于 9月 07, 2012

spatch with a semantic match is used to found this.
(http://coccinelle.lip6.fr/)
Signed-off-by: NWei Yongjun <yongjun_wei@trendmicro.com.cn>
Acked-by: NSteve Wise <swise@opengridcomputing.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

92dd6c3d

17 8月, 2012 1 次提交

IB/mlx4: Check iboe netdev pointer before dereferencing it · a0675a38

由 Kleber Sacilotto de Souza 提交于 8月 10, 2012

Unlike other parts of the mlx4_ib code, the function build_mlx_header()
doesn't check if the iboe netdev of the given port is valid before
dereferencing it, which can cause a crash if the ethernet interface
has already been taken down.

Fix this by checking for a valid netdev pointer before using it to get
the port MAC address.
Signed-off-by: NKleber Sacilotto de Souza <klebers@linux.vnet.ibm.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

a0675a38

16 8月, 2012 3 次提交

IB/srp: Fix a race condition · 22032991

由 Bart Van Assche 提交于 8月 14, 2012

Avoid a crash caused by the scmnd->scsi_done(scmnd) call in
srp_process_rsp() being invoked with scsi_done == NULL.  This can
happen if a reply is received during or after a command abort.
Reported-by: NJoseph Glanville <joseph.glanville@orionvm.com.au>
Reference: http://marc.info/?l=linux-rdma&m=134314367801595
Cc: <stable@vger.kernel.org>
Acked-by: NDavid Dillow <dillowda@ornl.gov>
Signed-off-by: NBart Van Assche <bvanassche@acm.org>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

22032991

IB/qib: Fix error return code in qib_init_7322_variables() · 51fa3ca3

由 Julia Lawall 提交于 8月 14, 2012

Convert a 0 error return code to a negative one, as returned elsewhere
in the function.

A simplified version of the semantic match that finds this problem is
as follows: (http://coccinelle.lip6.fr/)

// <smpl>
@@
identifier ret;
expression e,e1,e2,e3,e4,x;
@@

(
if (\(ret != 0\|ret < 0\) || ...) { ... return ...; }
|
ret = 0
)
... when != ret = e1
*x = \(kmalloc\|kzalloc\|kcalloc\|devm_kzalloc\|ioremap\|ioremap_nocache\|devm_ioremap\|devm_ioremap_nocache\)(...);
... when != x = e2
    when != ret = e3
*if (x == NULL || ...)
{
  ... when != ret = e4
*  return ret;
}
// </smpl>
Signed-off-by: NJulia Lawall <Julia.Lawall@lip6.fr>
Acked-by: NMike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

51fa3ca3

IB: Fix typos in infiniband drivers · 142ad5db

由 Masanari Iida 提交于 8月 10, 2012

Correct spelling typos in comments in drivers/infiniband.
Signed-off-by: NMasanari Iida <standby24x7@gmail.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

142ad5db

15 8月, 2012 2 次提交

IB/ipoib: Fix RCU pointer dereference of wrong object · 6c723a68

由 Shlomo Pongratz 提交于 8月 13, 2012

Commit b63b70d8 ("IPoIB: Use a private hash table for path lookup
in xmit path") introduced a bug where in ipoib_neigh_free() (which is
called from a few errors flows in the driver), rcu_dereference() is
invoked with the wrong pointer object, which results in a crash.
Signed-off-by: NShlomo Pongratz <shlomop@mellanox.com>
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

6c723a68

IB/ipoib: Add missing locking when CM object is deleted · fa16ebed

由 Shlomo Pongratz 提交于 8月 13, 2012

Commit b63b70d8 ("IPoIB: Use a private hash table for path lookup
in xmit path") introduced a bug where in ipoib_cm_destroy_tx() a CM
object is moved between lists without any supported locking.  Under a
stress test, this eventually leads to list corruption and a crash.

Previously when this routine was called, callers were taking the
device priv lock.  Currently this function is called from the RCU
callback associated with neighbour deletion.  Fix the race by taking
the same lock we used to before.
Signed-off-by: NShlomo Pongratz <shlomop@mellanox.com>
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

fa16ebed

14 8月, 2012 1 次提交

RDMA/ucma.c: Fix for events with wrong context on iWARP · 418edaab

由 Tatyana Nikolova 提交于 8月 03, 2012

It is possible for asynchronous RDMA_CM_EVENT_ESTABLISHED events to be
generated with ctx->uid == 0, because ucma_set_event_context() copies
ctx->uid to the event structure outside of ctx->file->mut. This leads
to a crash in the userspace library, since it gets a bogus event.

Fix this by taking the mutex a bit earlier in ucma_event_handler.
Signed-off-by: NTatyana Nikolova <Tatyana.E.Nikolova@intel.com>
Signed-off-by: NSean Hefty <Sean.Hefty@intel.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

418edaab

11 8月, 2012 2 次提交

RDMA/ocrdma: Don't call vlan_dev_real_dev() for non-VLAN netdevs · d549f55f

由 Roland Dreier 提交于 8月 10, 2012

If CONFIG_VLAN_8021Q is not set, then vlan_dev_real_dev() just goes BUG(),
so we shouldn't call it unless we're actually dealing with a VLAN netdev.
Reported-by: NFengguang Wu <fengguang.wu@intel.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

d549f55f

IB/mlx4: Fix possible deadlock on sm_lock spinlock · df7fba66

由 Jack Morgenstein 提交于 8月 03, 2012

The sm_lock spinlock is taken in the process context by
mlx4_ib_modify_device, and in the interrupt context by update_sm_ah,
so we need to take that spinlock with irqsave, and release it with
irqrestore.

Lockdeps reports this as follows:

    [ INFO: inconsistent lock state ]
    3.5.0+ #20 Not tainted
    inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
    swapper/0/0 [HC1[1]:SC0[0]:HE0:SE1] takes:
    (&(&ibdev->sm_lock)->rlock){?.+...}, at: [<ffffffffa028af1d>] update_sm_ah+0xad/0x100 [mlx4_ib]
    {HARDIRQ-ON-W} state was registered at:
      [<ffffffff810b84a0>] mark_irqflags+0x120/0x190
      [<ffffffff810b9ce7>] __lock_acquire+0x307/0x4c0
      [<ffffffff810b9f51>] lock_acquire+0xb1/0x150
      [<ffffffff815523b1>] _raw_spin_lock+0x41/0x50
      [<ffffffffa028d563>] mlx4_ib_modify_device+0x63/0x240 [mlx4_ib]
      [<ffffffffa026d1fc>] ib_modify_device+0x1c/0x20 [ib_core]
      [<ffffffffa026c353>] set_node_desc+0x83/0xc0 [ib_core]
      [<ffffffff8136a150>] dev_attr_store+0x20/0x30
      [<ffffffff81201fd6>] sysfs_write_file+0xe6/0x170
      [<ffffffff8118da38>] vfs_write+0xc8/0x190
      [<ffffffff8118dc01>] sys_write+0x51/0x90
      [<ffffffff8155b869>] system_call_fastpath+0x16/0x1b

    ...
    *** DEADLOCK ***

    1 lock held by swapper/0/0:

    stack backtrace:
    Pid: 0, comm: swapper/0 Not tainted 3.5.0+ #20
    Call Trace:
    <IRQ>  [<ffffffff810b7bea>] print_usage_bug+0x18a/0x190
    [<ffffffff810b7370>] ? print_irq_inversion_bug+0x210/0x210
    [<ffffffff810b7fb2>] mark_lock_irq+0xf2/0x280
    [<ffffffff810b8290>] mark_lock+0x150/0x240
    [<ffffffff810b84ef>] mark_irqflags+0x16f/0x190
    [<ffffffff810b9ce7>] __lock_acquire+0x307/0x4c0
    [<ffffffffa028af1d>] ? update_sm_ah+0xad/0x100 [mlx4_ib]
    [<ffffffff810b9f51>] lock_acquire+0xb1/0x150
    [<ffffffffa028af1d>] ? update_sm_ah+0xad/0x100 [mlx4_ib]
    [<ffffffff815523b1>] _raw_spin_lock+0x41/0x50
    [<ffffffffa028af1d>] ? update_sm_ah+0xad/0x100 [mlx4_ib]
    [<ffffffffa026b2fa>] ? ib_create_ah+0x1a/0x40 [ib_core]
    [<ffffffffa028af1d>] update_sm_ah+0xad/0x100 [mlx4_ib]
    [<ffffffff810c27c3>] ? is_module_address+0x23/0x30
    [<ffffffffa028b05b>] handle_port_mgmt_change_event+0xeb/0x150 [mlx4_ib]
    [<ffffffffa028c177>] mlx4_ib_event+0x117/0x160 [mlx4_ib]
    [<ffffffff81552501>] ? _raw_spin_lock_irqsave+0x61/0x70
    [<ffffffffa022718c>] mlx4_dispatch_event+0x6c/0x90 [mlx4_core]
    [<ffffffffa0221b40>] mlx4_eq_int+0x500/0x950 [mlx4_core]

Reported by: Or Gerlitz <ogerlitz@mellanox.com>
Tested-by: NBart Van Assche <bvanassche@acm.org>
Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

df7fba66

30 7月, 2012 2 次提交

IPoIB: Use a private hash table for path lookup in xmit path · b63b70d8

由 Shlomo Pongratz 提交于 7月 24, 2012

Dave Miller <davem@davemloft.net> provided a detailed description of
why the way IPoIB is using neighbours for its own ipoib_neigh struct
is buggy:

    Any time an ipoib_neigh is changed, a sequence like the following is made:

    			spin_lock_irqsave(&priv->lock, flags);
    			/*
    			 * It's safe to call ipoib_put_ah() inside
    			 * priv->lock here, because we know that
    			 * path->ah will always hold one more reference,
    			 * so ipoib_put_ah() will never do more than
    			 * decrement the ref count.
    			 */
    			if (neigh->ah)
    				ipoib_put_ah(neigh->ah);
    			list_del(&neigh->list);
    			ipoib_neigh_free(dev, neigh);
    			spin_unlock_irqrestore(&priv->lock, flags);
    			ipoib_path_lookup(skb, n, dev);

    This doesn't work, because you're leaving a stale pointer to the freed up
    ipoib_neigh in the special neigh->ha pointer cookie.  Yes, it even fails
    with all the locking done to protect _changes_ to *ipoib_neigh(n), and
    with the code in ipoib_neigh_free() that NULLs out the pointer.

    The core issue is that read side calls to *to_ipoib_neigh(n) are not
    being synchronized at all, they are performed without any locking.  So
    whether we hold the lock or not when making changes to *ipoib_neigh(n)
    you still can have threads see references to freed up ipoib_neigh
    objects.

    	cpu 1			cpu 2
    	n = *ipoib_neigh()
    				*ipoib_neigh() = NULL
    				kfree(n)
    	n->foo == OOPS

    [..]

    Perhaps the ipoib code can have a private path database it manages
    entirely itself, which holds all the necessary information and is
    looked up by some generic key which is available easily at transmit
    time and does not involve generic neighbour entries.

See <http://marc.info/?l=linux-rdma&m=132812793105624&w=2> and
<http://marc.info/?l=linux-rdma&w=2&r=1&s=allows+references+to+freed+memory&q=b>
for the full discussion.

This patch aims to solve the race conditions found in the IPoIB driver.

The patch removes the connection between the core networking neighbour
structure and the ipoib_neigh structure.  In addition to avoiding the
race described above, it allows us to handle SKBs carrying IP packets
that don't have any associated neighbour.

We add an ipoib_neigh hash table with N buckets where the key is the
destination hardware address.  The ipoib_neigh is fetched from the
hash table and instead of the stashed location in the neighbour
structure. The hash table uses both RCU and reference counting to
guarantee that no ipoib_neigh instance is ever deleted while in use.

Fetching the ipoib_neigh structure instance from the hash also makes
the special code in ipoib_start_xmit that handles remote and local
bonding failover redundant.

Aged ipoib_neigh instances are deleted by a garbage collection task
that runs every M seconds and deletes every ipoib_neigh instance that
was idle for at least 2*M seconds. The deletion is safe since the
ipoib_neigh instances are protected using RCU and reference count
mechanisms.

The number of buckets (N) and frequency of running the GC thread (M),
are taken from the exported arb_tbl.
Signed-off-by: NShlomo Pongratz <shlomop@mellanox.com>
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

b63b70d8

IB/qib: Fix size of cc_supported_table_entries · 5d7fe4ef

由 Mike Marciniszyn 提交于 7月 23, 2012

Commit 36a8f01c ("IB/qib: Add congestion control agent
implementation") tries to store the value 1984 in a u8, which leads to
truncation.  Fix this by making the member big enough.

This bug was detected by a smatch warning.
Reported-by: NDan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: NRamkrishna Vepa <ramkrishna.vepa@intel.com>
Signed-off-by: NMike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

5d7fe4ef

28 7月, 2012 3 次提交

RDMA/ucma: Convert open-coded equivalent to memdup_user() · 0764c76e

由 Roland Dreier 提交于 7月 27, 2012

Suggested by scripts/coccinelle/api/memdup_user.cocci.
Reported-by: NFengguang Wu <fengguang.wu@intel.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

0764c76e

RDMA/ocrdma: Fix check of GSI CQs · 9e8fa040

由 Roland Dreier 提交于 7月 27, 2012

It looks like one check was accidentally duplicated, and the other 3
checks were left out. This was detected by scripts/coccinelle/tests/doubletest.cocci:

drivers/infiniband/hw/ocrdma/ocrdma_verbs.c:895:6-54: duplicated argument to && or ||
Reported-by: NFengguang Wu <fengguang.wu@intel.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

9e8fa040

F
RDMA/cma: Use PTR_RET rather than if (IS_ERR(...)) + PTR_ERR · 4e289045
由 Fengguang Wu 提交于 7月 27, 2012
```
Suggested by scripts/coccinelle/api/ptr_ret.cocci.
Signed-off-by: NRoland Dreier <roland@purestorage.com>
```
4e289045

20 7月, 2012 4 次提交

IB/qib: checkpatch fixes · 7fac3301

由 Mike Marciniszyn 提交于 7月 19, 2012

Elminate some simple_strto* usage.

checkpatch also noted pr_ conversations, which have been done as
recommended.  The pr_fmt() define is used to shorten line length.

Other multi-line string warnings are also elmininated.
Reviewed-by: NDean Luick <dean.luick@intel.com>
Signed-off-by: NMike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

7fac3301

IB/qib: Add congestion control agent implementation · 36a8f01c

由 Mike Marciniszyn 提交于 7月 19, 2012

Add a congestion control agent in the driver that handles gets and
sets from the congestion control manager in the fabric for the
Performance Scale Messaging (PSM) library.
Signed-off-by: NMike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

36a8f01c

IB/qib: Reduce sdma_lock contention · 551ace12

由 Mike Marciniszyn 提交于 7月 19, 2012

Profiling has shown that sdma_lock is proving a bottleneck for
performance. The situations include:
 - RDMA reads when krcvqs > 1
 - post sends from multiple threads

For RDMA read the current global qib_wq mechanism runs on all CPUs
and contends for the sdma_lock when multiple RMDA read requests are
fielded on differenct CPUs. For post sends, the direct call to
qib_do_send() from multiple threads causes the contention.

Since the sdma mechanism is per port, this fix converts the existing
workqueue to a per port single thread workqueue to reduce the lock
contention in the RDMA read case, and for any other case where the QP
is scheduled via the workqueue mechanism from more than 1 CPU.

For the post send case, This patch modifies the post send code to test
for a non empty sdma engine.  If the sdma is not idle the (now single
thread) workqueue will be used to trigger the send engine instead of
the direct call to qib_do_send().
Signed-off-by: NMike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

551ace12

IB/qib: Fix an incorrect log message · f3331f88

由 Betty Dall 提交于 7月 19, 2012

There is a cut-and-paste typo in the function qib_pci_slot_reset()
where it prints that the "link_reset" function is called rather than
the "slot_reset" function.  This makes the message misleading.
Signed-off-by: NBetty Dall <betty.dall@hp.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

f3331f88

19 7月, 2012 1 次提交

{NET,IB}/mlx4: Add rmap support to mlx4_assign_eq · d9236c3f

由 Amir Vadai 提交于 7月 18, 2012

Enable callers of mlx4_assign_eq to supply a pointer to cpu_rmap.
If supplied, the assigned IRQ is tracked using rmap infrastructure.
Signed-off-by: NAmir Vadai <amirv@mellanox.com>
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d9236c3f

18 7月, 2012 1 次提交

IB/qib: Fix QP RCU sparse warnings · 1fb9fed6

由 Mike Marciniszyn 提交于 7月 16, 2012

Commit af061a64 ("IB/qib: Use RCU for qpn lookup") introduced sparse
warnings.

This patch corrects those issues.
Signed-off-by: NMike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

1fb9fed6

17 7月, 2012 2 次提交

net: Pass optional SKB and SK arguments to dst_ops->{update_pmtu,redirect}() · 6700c270

由 David S. Miller 提交于 7月 17, 2012

This will be used so that we can compose a full flow key.

Even though we have a route in this context, we need more. In the
future the routes will be without destination address, source address,
etc. keying. One ipv4 route will cover entire subnets, etc.

In this environment we have to have a way to possess persistent storage
for redirects and PMTU information. This persistent storage will exist
in the FIB tables, and that's why we'll need to be able to rebuild a
full lookup flow key here. Using that flow key will do a fib_lookup()
and create/update the persistent entry.
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

6700c270

srpt: use target_execute_cmd for WRITEs in srpt_handle_rdma_comp · e672a47f

由 Christoph Hellwig 提交于 7月 08, 2012

srpt_handle_rdma_comp is called from kthread context and thus can execute
target_execute_cmd directly. srpt_abort_cmd sets the CMD_T_LUN_STOP
flag directly, and thus the abuse of transport_generic_handle_data can be
replaced with an opencoded variant of that code path. I'm still not happy
about a fabric driver poking into target core internals like this, but
let's defer the bigger architecture changes for now.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NNicholas Bellinger <nab@linux-iscsi.org>

e672a47f

12 7月, 2012 5 次提交

mlx4: Put physical GID and P_Key table sizes in mlx4_phys_caps struct and paravirtualize them · 6634961c

由 Jack Morgenstein 提交于 6月 19, 2012

To allow easy paravirtualization of P_Key and GID table sizes, keep
paravirtualized sizes in mlx4_dev->caps, but save the actual physical
sizes from FW in struct: mlx4_dev->phys_cap.

In addition, in SR-IOV mode, do the following:

1. Reduce reported P_Key table size by 1.
   This is done to reserve the highest P_Key index for internal use,
   for declaring an invalid P_Key in P_Key paravirtualization.
   We require a P_Key index which always contain an invalid P_Key
   value for this purpose (i.e., one which cannot be modified by
   the subnet manager).  The way to do this is to reduce the
   P_Key table size reported to the subnet manager by 1, so that
   it will not attempt to access the P_Key at index #127.

2. Paravirtualize the GID table size to 1. Thus, each guest sees
   only a single GID (at its paravirtualized index 0).

In addition, since we are paravirtualizing the GID table size to 1, we
add paravirtualization of the master GID event here (i.e., we do not
do ib_dispatch_event() for the GUID change event on the master, since
its (only) GUID never changes).
Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

6634961c

IB/cm: Destroy idr as part of the module init error flow · 87d4abda

由 Dotan Barak 提交于 7月 11, 2012

Clean the idr as part of the error flow since it is a resource too.
Signed-off-by: NDotan Barak <dotanb@dev.mellanox.co.il>
Reviewed-by: NSean Hefty <sean.hefty@intel.com>
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

87d4abda

IB/mlx4: Fill the masked_atomic_cap attribute in query device · 47e956b2

由 Dotan Barak 提交于 7月 11, 2012

When the user queries for device capabilities, fill in the
masked_atomic_cap attribute with the real support level of atomic
capabilities instead of using a hard coded value.
Signed-off-by: NDotan Barak <dotanb@dev.mellanox.co.il>
Reviewed-by: NEli Cohen <eli@mellanox.com>
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

47e956b2

IB/mthca: Fill in sq_sig_type in query QP · 16551d45

由 Dotan Barak 提交于 7月 11, 2012

The query QP code was didn't fill that attribute, do that.
Signed-off-by: NDotan Barak <dotanb@dev.mellanox.co.il>
Reviewed-by: NJack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

16551d45

IB/mthca: Warning about event for non-existent QPs should show event type · 9bbeb666

由 Dotan Barak 提交于 7月 11, 2012

Events received for non-existent QPs should generate a warning that includes
the event type that was received.
Signed-off-by: NDotan Barak <dotanb@dev.mellanox.co.il>
Reviewed-by: NJack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

9bbeb666

11 7月, 2012 3 次提交

IPoIB: fix skb truesize underestimatiom · b28ba726

由 Eric Dumazet 提交于 7月 10, 2012

Or Gerlitz reported triggering of WARN_ON_ONCE(delta < len); in
skb_try_coalesce()
This warning tracks drivers that incorrectly set skb->truesize

IPoIB indeed allocates a full page to store a fragment, but only
accounts in skb->truesize the used part of the page (frame length)

This patch fixes skb truesize underestimation, and
also fixes a performance issue, because RX skbs have not enough tailroom
to allow IP and TCP stacks to pull their header in skb linear part
without an expensive call to pskb_expand_head()
Signed-off-by: NEric Dumazet <edumazet@google.com>
Reported-by: NOr Gerlitz <ogerlitz@mellanox.com>
Cc: Erez Shitrit <erezsh@mellanox.com>
Cc: Shlomo Pongartz <shlomop@mellanox.com>
Cc: Roland Dreier <roland@purestorage.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b28ba726

IB/qib: Fix sparse RCU warnings in qib_keys.c · 7e230177

由 Mike Marciniszyn 提交于 7月 06, 2012

Commit 8aac4cc3 ("IB/qib: RCU locking for MR validation") introduced
new sparse warnings in qib_keys.c.
Acked-by: NFengguang Wu <fengguang.wu@intel.com>
Signed-off-by: NMike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

7e230177

mlx4: Use port management change event instead of smp_snoop · 00f5ce99

由 Jack Morgenstein 提交于 6月 19, 2012

The port management change event can replace smp_snoop.  If the
capability bit for this event is set in dev-caps, the event is used
(by the driver setting the PORT_MNG_CHG_EVENT bit in the async event
mask in the MAP_EQ fw command).  In this case, when the driver passes
incoming SMP PORT_INFO SET mads to the FW, the FW generates port
management change events to signal any changes to the driver.

If the FW generates these events, smp_snoop shouldn't be invoked in
ib_process_mad(), or duplicate events will occur (once from the
FW-generated event, and once from smp_snoop).

In the case where the FW does not generate port management change
events smp_snoop needs to be invoked to create these events.  The flow
in smp_snoop has been modified to make use of the same procedures as
in the fw-generated-event event case to generate the port management
events (LID change, Client-rereg, Pkey change, and/or GID change).

Port management change event handling required changing the
mlx4_ib_event and mlx4_dispatch_event prototypes; the "param" argument
(last argument) had to be changed to unsigned long in order to
accomodate passing the EQE pointer.

We also needed to move the definition of struct mlx4_eqe from
net/mlx4.h to file device.h -- to make it available to the IB driver,
to handle port management change events.
Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

00f5ce99

09 7月, 2012 4 次提交

IB/qib: RCU locking for MR validation · 8aac4cc3

由 Mike Marciniszyn 提交于 6月 27, 2012

Profiling indicates that MR validation locking is expensive.  The MR
table is largely read-only and is a suitable candidate for RCU locking.

The patch uses RCU locking during validation to eliminate one
lock/unlock during that validation.
Reviewed-by: NMike Heinz <michael.william.heinz@intel.com>
Signed-off-by: NMike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

8aac4cc3

IB/qib: Avoid returning EBUSY from MR deregister · 6a82649f

由 Mike Marciniszyn 提交于 6月 27, 2012

A timing issue can occur where qib_mr_dereg can return -EBUSY if the
MR use count is not zero.

This can occur if the MR is de-registered while RDMA read response
packets are being progressed from the SDMA ring.  The suspicion is
that the peer sent an RDMA read request, which has already been copied
across to the peer.  The peer sees the completion of his request and
then communicates to the responder that the MR is not needed any
longer.  The responder tries to de-register the MR, catching some
responses remaining in the SDMA ring holding the MR use count.

The code now uses a get/put paradigm to track MR use counts and
coordinates with the MR de-registration process using a completion
when the count has reached zero.  A timeout on the delay is in place
to catch other EBUSY issues.

The reference count protocol is as follows:
- The return to the user counts as 1
- A reference from the lk_table or the qib_ibdev counts as 1.
- Transient I/O operations increase/decrease as necessary

A lot of code duplication has been folded into the new routines
init_qib_mregion() and deinit_qib_mregion().  Additionally, explicit
initialization of fields to zero is now handled by kzalloc().

Also, duplicated code 'while.*num_sge' that decrements reference
counts have been consolidated in qib_put_ss().
Reviewed-by: NRamkrishna Vepa <ramkrishna.vepa@intel.com>
Signed-off-by: NMike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

6a82649f

IB/qib: Fix UC MR refs for immediate operations · 354dff1b

由 Mike Marciniszyn 提交于 6月 27, 2012

An MR reference leak exists when handling UC RDMA writes with
immediate data because we manipulate the reference counts as if the
operation had been a send.

This patch moves the last_imm label so that the RDMA write operations
with immediate data converge at the cq building code.  The copy/mr
deref code is now done correctly prior to the branch to last_imm.
Reviewed-by: NEdward Mascarenhas <edward.mascarenhas@intel.com>
Signed-off-by: NMike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

354dff1b

IB/core: Move CM_xxx_ATTR_ID macros from cm_msgs.h to ib_cm.h · 3045f092

由 Jack Morgenstein 提交于 6月 19, 2012

These macros will be reused by the mlx4 SRIOV-IB CM paravirtualization
code, and there is no reason to have them declared both in the IB core
in the mlx4 IB driver.
Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

3045f092

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功