Commits · b9c5d6a643589ad39064f652938baa698f0e884a · openeuler / Kernel

01 Oct, 2012 6 commits

IB/mlx4: Add multicast group (MCG) paravirtualization for SR-IOV · b9c5d6a6

Committed by Oren Duer on Aug 03, 2012

MCG paravirtualization support includes:
- Creating multicast groups by VFs, and keeping accounting of them
- Leaving multicast groups by VFs
- Updating SM only with real changes in the overall picture of MCGs status
- Creation of MGID=0 groups (let SM choose MGID)

Note that the MCG module maintains its own internal MCG object
reference counts.  The reason for this is that the IB core is used to
track only the multicast groups joins generated by the PF it runs
over.  The PF IB core layer is unaware of slaves, so it cannot be used
to keep track of MCG joins they generate.
Signed-off-by: Oren Duer <oren@mellanox.co.il>
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <roland@purestorage.com>

b9c5d6a6
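
The accounting described in this commit can be pictured with a minimal sketch. It is not the driver's code: `struct mcg_group`, `mcg_join()`, `mcg_leave()` and `MAX_VFS` are hypothetical names, and the assumption is simply that the SM needs to hear about a group only when its first member joins or its last member leaves.

```c
#include <stdbool.h>

#define MAX_VFS 8 /* hypothetical number of functions sharing the port */

/* Hypothetical per-group bookkeeping kept on the PF. */
struct mcg_group {
    unsigned int joins[MAX_VFS]; /* join count per function */
    unsigned int total;          /* sum of all per-function counts */
};

/* Record a join. Returns true only when the SM must be updated,
 * i.e. when this is the first member of the group. */
static bool mcg_join(struct mcg_group *g, int vf)
{
    g->joins[vf]++;
    return g->total++ == 0;
}

/* Record a leave. Returns true only when the SM must be updated,
 * i.e. when the last member has gone. */
static bool mcg_leave(struct mcg_group *g, int vf)
{
    if (g->joins[vf] == 0)
        return false; /* this function never joined: nothing changes */
    g->joins[vf]--;
    return --g->total == 0;
}
```

Keeping a count per function, and not only a total, is what lets the PF ignore a leave from a VF that never joined.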

mlx4: MAD_IFC paravirtualization · 0a9a0188

Committed by Jack Morgenstein on Aug 03, 2012

The MAD_IFC firmware command fulfills two functions.

First, it is used in the QP0/QP1 MAD-handling flow to obtain
information from the FW (for answering queries), and for setting
variables in the HCA (MAD SET packets).

For this, MAD_IFC should provide the FW (physical) view of the data.
This is the view that OpenSM needs.  We call this the "network view".

In the second case, MAD_IFC is used by various verbs to obtain data
regarding the local HCA (e.g., ib_query_device()).  We call this the
"host view".

This data needs to be paravirtualized.

MAD_IFC therefore needs a wrapper function, and also needs another
flag indicating whether it should provide the network view (when it is
called by ib_process_mad in special-qp packet handling), or the host
view (when it is called while implementing a verb).

There are currently 2 flag parameters in mlx4_MAD_IFC already:
ignore_bkey and ignore_mkey.  These two parameters are replaced by a
single "mad_ifc_flags" parameter, with different bits set for each
flag.  A third flag is added: "network-view/host-view".
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <roland@purestorage.com>

0a9a0188
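
A rough sketch of folding the two boolean parameters and the new view selector into one flags word. The enumerator and function names here are illustrative assumptions, not necessarily the identifiers used in the patch.

```c
#include <stdbool.h>

/* Illustrative bit names; the patch's real identifiers may differ. */
enum mad_ifc_flag {
    MAD_IFC_IGNORE_MKEY  = 1 << 0,
    MAD_IFC_IGNORE_BKEY  = 1 << 1,
    MAD_IFC_NETWORK_VIEW = 1 << 2 /* clear means host view */
};

/* Build the single flags word from what used to be separate arguments. */
static int mad_ifc_flags_from(bool ignore_mkey, bool ignore_bkey, bool net_view)
{
    int flags = 0;

    if (ignore_mkey)
        flags |= MAD_IFC_IGNORE_MKEY;
    if (ignore_bkey)
        flags |= MAD_IFC_IGNORE_BKEY;
    if (net_view)
        flags |= MAD_IFC_NETWORK_VIEW;
    return flags;
}

/* The wrapper would branch on this bit to pick the view it returns. */
static bool mad_ifc_wants_network_view(int flags)
{
    return (flags & MAD_IFC_NETWORK_VIEW) != 0;
}
```

A flags word means a later option can be added without changing the function prototype again.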

IB/mlx4: SR-IOV multiplex and demultiplex MADs · 37bfc7c1

Committed by Jack Morgenstein on Aug 03, 2012

Special QPs are paravirtualized.

vHCAs are not given direct access to QP0/1. Rather, these QPs are
operated by a special context hosted by the PF, which mediates access
to/from vHCAs.  This is done by opening a "tunnel" per vHCA port per
QP0/1. A tunnel comprises a pair of UD QPs: a "Tunnel QP" in the
PF-context and a "Proxy QP" in the vHCA.  All vHCA MAD traffic must
pass through the corresponding tunnel.  vHCA QPs cannot be assigned to
VL15 and are denied the well-known QKey.

Outgoing messages are "de-multiplexed" (i.e., directed to the wire via
the real special QP).

Incoming messages are "multiplexed" (i.e. steered by the PPF to the
correct VF or to the PF).

QP0 access is restricted to the PF vHCA. VF vHCAs also have (virtual)
QP0s, but they never receive any SMPs and all SMPs sent are discarded.
QP1 traffic is allowed for all vHCAs, but special care is required to
bridge the gap between the host and network views.

Specifically:
- Transaction IDs are mapped to guarantee uniqueness among vHCAs
- CM para-virtualization
  o   Incoming requests are steered to the correct vHCA according to the embedded GID
  o   Local communication IDs are mapped to ensure uniqueness among vHCAs
  (see the patch that adds CM paravirtualization.)
- Multicast para-virtualization
  o   The PF context aggregates membership state from all vHCAs
  o   The SA is contacted only when the aggregate membership changes
  o   If the aggregate does not change, the PF context will provide the
      requesting vHCA with the proper response.
  (see the patch that adds multicast group paravirtualization)

Incoming MADs are steered according to:
- the DGID if a GRH is present
- the mapped transaction ID for response MADs
- the embedded GID in CM requests
- the remote communication ID in other CM messages
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <roland@purestorage.com>

37bfc7c1
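
One way to picture the transaction ID mapping is to reserve some bits of the 64-bit TID for the vHCA number, so that responses can be steered back by inspecting those bits. The bit layout below (top 8 bits) and the helper names are assumptions made for illustration only; the commit message does not say which bits the driver uses.

```c
#include <stdint.h>

/* Assumption: the top 8 bits of the TID are free to carry the vHCA number. */
#define TID_SLAVE_SHIFT 56
#define TID_SLAVE_MASK  (0xffULL << TID_SLAVE_SHIFT)

/* Outgoing MAD: stamp the sender's vHCA number into the TID. */
static uint64_t tid_to_wire(uint64_t guest_tid, uint8_t slave)
{
    return (guest_tid & ~TID_SLAVE_MASK) | ((uint64_t)slave << TID_SLAVE_SHIFT);
}

/* Incoming response: recover which vHCA the MAD belongs to. */
static uint8_t tid_to_slave(uint64_t wire_tid)
{
    return (uint8_t)(wire_tid >> TID_SLAVE_SHIFT);
}
```

With this layout two vHCAs that pick the same guest TID still produce distinct TIDs on the wire.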

mlx4: Implement QP paravirtualization and maintain phys_pkey_cache for smp_snoop · 54679e14

Committed by Jack Morgenstein on Aug 03, 2012

This requires:

1. Replacing the paravirtualized P_Key index (inserted by the guest)
   with the real P_Key index.

2. For UD QPs, placing the guest's true source GID index in the
   address path structure mgid field, and setting the ud_force_mgid
   bit so that the mgid is taken from the QP context and not from the
   WQE when posting sends.

3. For UC and RC QPs, placing the guest's true source GID index in the
   address path structure mgid field.

4. For tunnel and proxy QPs, setting the Q_Key value reserved for that
   proxy/tunnel pair.

Since not all the above adjustments occur in all the QP transitions,
the QP transitions require separate wrapper functions.

Secondly, initialize the P_Key virtualization table to its default
values: Master virtualized table is 1-1 with the real P_Key table,
guest virtualized table has P_Key index 0 mapped to the real P_Key
index 0, and all the other P_Key indices mapped to the reserved
(invalid) P_Key at index 127.

Finally, add logic in smp_snoop for maintaining the phys_pkey_cache
and generating events on the master only if a P_Key actually changed.
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <roland@purestorage.com>

54679e14
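
The default P_Key virtualization table described above can be sketched as follows. The table length of 128 is inferred from the reserved index 127 and is an assumption, as are the names `init_pkey_virt()` and `virt2phys`.

```c
#define PKEY_TABLE_LEN   128                  /* assumed physical table length */
#define INVALID_PKEY_IDX (PKEY_TABLE_LEN - 1) /* index 127, reserved invalid P_Key */

/* Fill a virtual-to-physical P_Key index map with its default values. */
static void init_pkey_virt(unsigned char virt2phys[PKEY_TABLE_LEN], int is_master)
{
    for (int i = 0; i < PKEY_TABLE_LEN; i++) {
        if (is_master || i == 0)
            virt2phys[i] = (unsigned char)i; /* master: 1-1; guest: index 0 only */
        else
            virt2phys[i] = INVALID_PKEY_IDX; /* guest: everything else is invalid */
    }
}
```

A guest therefore starts with exactly one usable P_Key, the one at physical index 0.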

IB/mlx4: Initialize SR-IOV IB support for slaves in master context · fc06573d

Committed by Jack Morgenstein on Aug 03, 2012

Allocate SR-IOV paravirtualization resources and MAD demuxing contexts
on the master.

This has two parts.  The first part is to initialize the structures to
contain the contexts.  This is done at master startup time in
mlx4_ib_init_sriov().

The second part is to actually create the tunneling resources required
on the master to support a slave.  This is performed when the master
detects that a slave has started up (MLX4_DEV_EVENT_SLAVE_INIT event
generated when a slave initializes its comm channel).

For the master, there is no such startup event, so it creates its own
tunneling resources when it starts up.  In addition, the master also
creates the real special QPs.  The ib_core layer on the master causes
creation of proxy special QPs, since the master is also
paravirtualized at the ib_core layer.
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <roland@purestorage.com>

fc06573d

IB/mlx4: SR-IOV IB context objects and proxy/tunnel SQP support · 1ffeb2eb

Committed by Jack Morgenstein on Aug 03, 2012

1. Introduce the basic SR-IOV paravirtualization context objects for
   multiplexing and demultiplexing MADs.
2. Introduce support for the new proxy and tunnel QP types.

This patch introduces the objects required by the master for managing
QP paravirtualization for guests.

struct mlx4_ib_sriov is created by the master only.
It is a container for the following:

1. All the info required by the PPF to multiplex and de-multiplex MADs
   (including those from the PF). (struct mlx4_ib_demux_ctx demux)
2. All the info required to manage alias GUIDs (i.e., the GUID at
   index 0 that each guest perceives.  This is not the GUID which is
   actually at index 0, but the GUID which is at index[<VF number>]
   in the physical table).
3. structures which are used to manage CM paravirtualization
4. structures for managing the real special QPs when running in SR-IOV
   mode.  The real SQPs are controlled by the PPF in this case.  All
   SQPs created and controlled by the ib core layer are proxy SQPs.

struct mlx4_ib_demux_ctx contains the information per port needed
to manage paravirtualization:

1. All multicast paravirt info
2. All tunnel-qp paravirt info for the port.
3. GUID-table and GUID-prefix for the port
4. work queues.

struct mlx4_ib_demux_pv_ctx contains all the info for managing the
paravirtualized QPs for one slave/port.

struct mlx4_ib_demux_pv_qp contains the info needed to run an individual
QP (either tunnel qp or real SQP).

Note:  We made use of the 2 most significant bits in enum
mlx4_ib_qp_flags (based on enum ib_qp_create_flags in ib_verbs.h).
We need these bits in the low-level driver for internal purposes.
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <roland@purestorage.com>

1ffeb2eb

15 Sep, 2012 2 commits

IB/qib: Fix failure of compliance test C14-024#06_LocalPortNum · 4c355005

Committed by Mike Marciniszyn on Sep 12, 2012

Commit 3236b2d4 ("IB/qib: MADs with misset M_Keys should return
failure") introduced a return code assignment that unfortunately
introduced an unconditional exit for the routine due to missing braces.

This patch adds the braces to correct the original patch.
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

4c355005

RDMA/ocrdma: Fix CQE expansion of unsignaled WQE · ae3bca90

Committed by Parav Pandit on Aug 17, 2012

Fix CQE expansion of unsignaled WQE -- don't expand the CQE when the
WQE index of the completed CQE matches with last pending WQE (tail) in
the queue.
Signed-off-by: Parav Pandit <parav.pandit@emulex.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

ae3bca90

08 Sep, 2012 1 commit

RDMA/cxgb4: Move dereference below NULL test · 92dd6c3d

Committed by Wei Yongjun on Sep 07, 2012

spatch with a semantic match was used to find this.
(http://coccinelle.lip6.fr/)
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

92dd6c3d

17 Aug, 2012 1 commit

IB/mlx4: Check iboe netdev pointer before dereferencing it · a0675a38

Committed by Kleber Sacilotto de Souza on Aug 10, 2012

Unlike other parts of the mlx4_ib code, the function build_mlx_header()
doesn't check if the iboe netdev of the given port is valid before
dereferencing it, which can cause a crash if the ethernet interface
has already been taken down.

Fix this by checking for a valid netdev pointer before using it to get
the port MAC address.
Signed-off-by: Kleber Sacilotto de Souza <klebers@linux.vnet.ibm.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

a0675a38

16 Aug, 2012 2 commits

IB/qib: Fix error return code in qib_init_7322_variables() · 51fa3ca3

Committed by Julia Lawall on Aug 14, 2012

Convert a 0 error return code to a negative one, as returned elsewhere
in the function.

A simplified version of the semantic match that finds this problem is
as follows: (http://coccinelle.lip6.fr/)

// <smpl>
@@
identifier ret;
expression e,e1,e2,e3,e4,x;
@@

(
if (\(ret != 0\|ret < 0\) || ...) { ... return ...; }
|
ret = 0
)
... when != ret = e1
*x = \(kmalloc\|kzalloc\|kcalloc\|devm_kzalloc\|ioremap\|ioremap_nocache\|devm_ioremap\|devm_ioremap_nocache\)(...);
... when != x = e2
    when != ret = e3
*if (x == NULL || ...)
{
  ... when != ret = e4
*  return ret;
}
// </smpl>
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Acked-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

51fa3ca3
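
The bug class that the semantic match looks for can be shown in a small user-space sketch. This is not the qib code: `init_variables()` is a hypothetical function, `malloc()` stands in for `kmalloc()`, and `alloc_fails` is a test hook that simulates an allocation failure.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical init routine; alloc_fails simulates a failed kmalloc(). */
static int init_variables(int alloc_fails, void **out)
{
    int ret = 0;
    void *buf = alloc_fails ? NULL : malloc(16);

    if (buf == NULL) {
        /* The bug pattern is jumping to the exit here while ret is
         * still 0, which reports success to the caller. */
        ret = -ENOMEM;
        goto bail;
    }
    *out = buf;
bail:
    return ret;
}
```

Without the assignment to `ret`, the caller would see 0 and go on to use a buffer that was never allocated.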

IB: Fix typos in infiniband drivers · 142ad5db

Committed by Masanari Iida on Aug 10, 2012

Correct spelling typos in comments in drivers/infiniband.
Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

142ad5db

11 Aug, 2012 2 commits

RDMA/ocrdma: Don't call vlan_dev_real_dev() for non-VLAN netdevs · d549f55f

Committed by Roland Dreier on Aug 10, 2012

If CONFIG_VLAN_8021Q is not set, then vlan_dev_real_dev() just goes BUG(),
so we shouldn't call it unless we're actually dealing with a VLAN netdev.
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

d549f55f

IB/mlx4: Fix possible deadlock on sm_lock spinlock · df7fba66

Committed by Jack Morgenstein on Aug 03, 2012

The sm_lock spinlock is taken in the process context by
mlx4_ib_modify_device, and in the interrupt context by update_sm_ah,
so we need to take that spinlock with irqsave, and release it with
irqrestore.

Lockdep reports this as follows:

    [ INFO: inconsistent lock state ]
    3.5.0+ #20 Not tainted
    inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
    swapper/0/0 [HC1[1]:SC0[0]:HE0:SE1] takes:
    (&(&ibdev->sm_lock)->rlock){?.+...}, at: [<ffffffffa028af1d>] update_sm_ah+0xad/0x100 [mlx4_ib]
    {HARDIRQ-ON-W} state was registered at:
      [<ffffffff810b84a0>] mark_irqflags+0x120/0x190
      [<ffffffff810b9ce7>] __lock_acquire+0x307/0x4c0
      [<ffffffff810b9f51>] lock_acquire+0xb1/0x150
      [<ffffffff815523b1>] _raw_spin_lock+0x41/0x50
      [<ffffffffa028d563>] mlx4_ib_modify_device+0x63/0x240 [mlx4_ib]
      [<ffffffffa026d1fc>] ib_modify_device+0x1c/0x20 [ib_core]
      [<ffffffffa026c353>] set_node_desc+0x83/0xc0 [ib_core]
      [<ffffffff8136a150>] dev_attr_store+0x20/0x30
      [<ffffffff81201fd6>] sysfs_write_file+0xe6/0x170
      [<ffffffff8118da38>] vfs_write+0xc8/0x190
      [<ffffffff8118dc01>] sys_write+0x51/0x90
      [<ffffffff8155b869>] system_call_fastpath+0x16/0x1b

    ...
    *** DEADLOCK ***

    1 lock held by swapper/0/0:

    stack backtrace:
    Pid: 0, comm: swapper/0 Not tainted 3.5.0+ #20
    Call Trace:
    <IRQ>  [<ffffffff810b7bea>] print_usage_bug+0x18a/0x190
    [<ffffffff810b7370>] ? print_irq_inversion_bug+0x210/0x210
    [<ffffffff810b7fb2>] mark_lock_irq+0xf2/0x280
    [<ffffffff810b8290>] mark_lock+0x150/0x240
    [<ffffffff810b84ef>] mark_irqflags+0x16f/0x190
    [<ffffffff810b9ce7>] __lock_acquire+0x307/0x4c0
    [<ffffffffa028af1d>] ? update_sm_ah+0xad/0x100 [mlx4_ib]
    [<ffffffff810b9f51>] lock_acquire+0xb1/0x150
    [<ffffffffa028af1d>] ? update_sm_ah+0xad/0x100 [mlx4_ib]
    [<ffffffff815523b1>] _raw_spin_lock+0x41/0x50
    [<ffffffffa028af1d>] ? update_sm_ah+0xad/0x100 [mlx4_ib]
    [<ffffffffa026b2fa>] ? ib_create_ah+0x1a/0x40 [ib_core]
    [<ffffffffa028af1d>] update_sm_ah+0xad/0x100 [mlx4_ib]
    [<ffffffff810c27c3>] ? is_module_address+0x23/0x30
    [<ffffffffa028b05b>] handle_port_mgmt_change_event+0xeb/0x150 [mlx4_ib]
    [<ffffffffa028c177>] mlx4_ib_event+0x117/0x160 [mlx4_ib]
    [<ffffffff81552501>] ? _raw_spin_lock_irqsave+0x61/0x70
    [<ffffffffa022718c>] mlx4_dispatch_event+0x6c/0x90 [mlx4_core]
    [<ffffffffa0221b40>] mlx4_eq_int+0x500/0x950 [mlx4_core]

Reported by: Or Gerlitz <ogerlitz@mellanox.com>
Tested-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <roland@purestorage.com>

df7fba66

30 Jul, 2012 1 commit

IB/qib: Fix size of cc_supported_table_entries · 5d7fe4ef

Committed by Mike Marciniszyn on Jul 23, 2012

Commit 36a8f01c ("IB/qib: Add congestion control agent
implementation") tries to store the value 1984 in a u8, which leads to
truncation.  Fix this by making the member big enough.

This bug was detected by a smatch warning.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Ramkrishna Vepa <ramkrishna.vepa@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

5d7fe4ef
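
The truncation is easy to reproduce outside the kernel. In the sketch below, `uint8_t` plays the role of the kernel's `u8`, and `uint16_t` is assumed as the wider replacement type (the commit message only says the member was made "big enough"). The struct and helper names are hypothetical.

```c
#include <stdint.h>

struct cc_before { uint8_t  cc_supported_table_entries; };
struct cc_after  { uint16_t cc_supported_table_entries; }; /* assumed wider type */

/* Store v in the 8-bit member and read back what survived. */
static unsigned int stored_before(unsigned int v)
{
    struct cc_before s;

    s.cc_supported_table_entries = (uint8_t)v; /* keeps only the low 8 bits */
    return s.cc_supported_table_entries;
}

/* Same round trip through the 16-bit member. */
static unsigned int stored_after(unsigned int v)
{
    struct cc_after s;

    s.cc_supported_table_entries = (uint16_t)v;
    return s.cc_supported_table_entries;
}
```

Since 1984 is 0x7C0, only the low byte 0xC0 (192) survives in the 8-bit member, while the 16-bit member holds the full value.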

28 Jul, 2012 1 commit

RDMA/ocrdma: Fix check of GSI CQs · 9e8fa040

Committed by Roland Dreier on Jul 27, 2012

It looks like one check was accidentally duplicated, and the other 3
checks were left out. This was detected by scripts/coccinelle/tests/doubletest.cocci:

drivers/infiniband/hw/ocrdma/ocrdma_verbs.c:895:6-54: duplicated argument to && or ||
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

9e8fa040

20 Jul, 2012 4 commits

IB/qib: checkpatch fixes · 7fac3301

Committed by Mike Marciniszyn on Jul 19, 2012

Eliminate some simple_strto* usage.

checkpatch also noted pr_ conversions, which have been done as
recommended.  The pr_fmt() define is used to shorten line length.

Other multi-line string warnings are also eliminated.
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

7fac3301

IB/qib: Add congestion control agent implementation · 36a8f01c

Committed by Mike Marciniszyn on Jul 19, 2012

Add a congestion control agent in the driver that handles gets and
sets from the congestion control manager in the fabric for the
Performance Scale Messaging (PSM) library.
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

36a8f01c

IB/qib: Reduce sdma_lock contention · 551ace12

Committed by Mike Marciniszyn on Jul 19, 2012

Profiling has shown that sdma_lock is proving a bottleneck for
performance. The situations include:
 - RDMA reads when krcvqs > 1
 - post sends from multiple threads

For RDMA read the current global qib_wq mechanism runs on all CPUs
and contends for the sdma_lock when multiple RDMA read requests are
fielded on different CPUs. For post sends, the direct call to
qib_do_send() from multiple threads causes the contention.

Since the sdma mechanism is per port, this fix converts the existing
workqueue to a per port single thread workqueue to reduce the lock
contention in the RDMA read case, and for any other case where the QP
is scheduled via the workqueue mechanism from more than 1 CPU.

For the post send case, this patch modifies the post send code to test
for a non empty sdma engine.  If the sdma is not idle the (now single
thread) workqueue will be used to trigger the send engine instead of
the direct call to qib_do_send().
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

551ace12

IB/qib: Fix an incorrect log message · f3331f88

Committed by Betty Dall on Jul 19, 2012

There is a cut-and-paste typo in the function qib_pci_slot_reset()
where it prints that the "link_reset" function is called rather than
the "slot_reset" function.  This makes the message misleading.
Signed-off-by: Betty Dall <betty.dall@hp.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

f3331f88

19 Jul, 2012 1 commit

{NET,IB}/mlx4: Add rmap support to mlx4_assign_eq · d9236c3f

Committed by Amir Vadai on Jul 18, 2012

Enable callers of mlx4_assign_eq to supply a pointer to cpu_rmap.
If supplied, the assigned IRQ is tracked using rmap infrastructure.
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d9236c3f

18 Jul, 2012 1 commit

IB/qib: Fix QP RCU sparse warnings · 1fb9fed6

Committed by Mike Marciniszyn on Jul 16, 2012

Commit af061a64 ("IB/qib: Use RCU for qpn lookup") introduced sparse
warnings.

This patch corrects those issues.
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

1fb9fed6

12 Jul, 2012 4 commits

mlx4: Put physical GID and P_Key table sizes in mlx4_phys_caps struct and paravirtualize them · 6634961c

Committed by Jack Morgenstein on Jun 19, 2012

To allow easy paravirtualization of P_Key and GID table sizes, keep
paravirtualized sizes in mlx4_dev->caps, but save the actual physical
sizes from FW in struct: mlx4_dev->phys_cap.

In addition, in SR-IOV mode, do the following:

1. Reduce reported P_Key table size by 1.
   This is done to reserve the highest P_Key index for internal use,
   for declaring an invalid P_Key in P_Key paravirtualization.
   We require a P_Key index which always contains an invalid P_Key
   value for this purpose (i.e., one which cannot be modified by
   the subnet manager).  The way to do this is to reduce the
   P_Key table size reported to the subnet manager by 1, so that
   it will not attempt to access the P_Key at index #127.

2. Paravirtualize the GID table size to 1. Thus, each guest sees
   only a single GID (at its paravirtualized index 0).

In addition, since we are paravirtualizing the GID table size to 1, we
add paravirtualization of the master GID event here (i.e., we do not
do ib_dispatch_event() for the GUID change event on the master, since
its (only) GUID never changes).
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <roland@purestorage.com>

6634961c
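
A minimal sketch of the split between physical and reported table sizes in SR-IOV mode. `struct table_caps` and `reported_caps()` are hypothetical names; in the commit the corresponding values live in `mlx4_dev->phys_cap` and `mlx4_dev->caps`.

```c
/* Hypothetical container for the two sizes discussed in the commit. */
struct table_caps {
    int pkey_table_len;
    int gid_table_len;
};

/* Derive the sizes reported upward from the physical sizes read from FW. */
static struct table_caps reported_caps(struct table_caps phys, int sriov)
{
    struct table_caps caps = phys;

    if (sriov) {
        caps.pkey_table_len = phys.pkey_table_len - 1; /* top index stays reserved */
        caps.gid_table_len = 1;                        /* one GID per function */
    }
    return caps;
}
```

With a physical P_Key table of 128 entries this reports 127, which keeps index 127 out of the subnet manager's reach.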

IB/mlx4: Fill the masked_atomic_cap attribute in query device · 47e956b2

Committed by Dotan Barak on Jul 11, 2012

When the user queries for device capabilities, fill in the
masked_atomic_cap attribute with the real support level of atomic
capabilities instead of using a hard coded value.
Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il>
Reviewed-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

47e956b2

IB/mthca: Fill in sq_sig_type in query QP · 16551d45

Committed by Dotan Barak on Jul 11, 2012

The query QP code didn't fill in that attribute; do that.
Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il>
Reviewed-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

16551d45

IB/mthca: Warning about event for non-existent QPs should show event type · 9bbeb666

Committed by Dotan Barak on Jul 11, 2012

Events received for non-existent QPs should generate a warning that includes
the event type that was received.
Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il>
Reviewed-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

9bbeb666

11 Jul, 2012 2 commits

IB/qib: Fix sparse RCU warnings in qib_keys.c · 7e230177

Committed by Mike Marciniszyn on Jul 06, 2012

Commit 8aac4cc3 ("IB/qib: RCU locking for MR validation") introduced
new sparse warnings in qib_keys.c.
Acked-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

7e230177

mlx4: Use port management change event instead of smp_snoop · 00f5ce99

Committed by Jack Morgenstein on Jun 19, 2012

The port management change event can replace smp_snoop.  If the
capability bit for this event is set in dev-caps, the event is used
(by the driver setting the PORT_MNG_CHG_EVENT bit in the async event
mask in the MAP_EQ fw command).  In this case, when the driver passes
incoming SMP PORT_INFO SET mads to the FW, the FW generates port
management change events to signal any changes to the driver.

If the FW generates these events, smp_snoop shouldn't be invoked in
ib_process_mad(), or duplicate events will occur (once from the
FW-generated event, and once from smp_snoop).

In the case where the FW does not generate port management change
events, smp_snoop needs to be invoked to create these events.  The flow
in smp_snoop has been modified to make use of the same procedures as
in the FW-generated event case to generate the port management
events (LID change, Client-rereg, Pkey change, and/or GID change).

Port management change event handling required changing the
mlx4_ib_event and mlx4_dispatch_event prototypes; the "param" argument
(last argument) had to be changed to unsigned long in order to
accommodate passing the EQE pointer.

We also needed to move the definition of struct mlx4_eqe from
net/mlx4.h to file device.h -- to make it available to the IB driver,
to handle port management change events.
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

00f5ce99

09 Jul, 2012 6 commits

IB/qib: RCU locking for MR validation · 8aac4cc3

Committed by Mike Marciniszyn on Jun 27, 2012

Profiling indicates that MR validation locking is expensive.  The MR
table is largely read-only and is a suitable candidate for RCU locking.

The patch uses RCU locking during validation to eliminate one
lock/unlock during that validation.
Reviewed-by: Mike Heinz <michael.william.heinz@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

8aac4cc3

IB/qib: Avoid returning EBUSY from MR deregister · 6a82649f

Committed by Mike Marciniszyn on Jun 27, 2012

A timing issue can occur where qib_mr_dereg can return -EBUSY if the
MR use count is not zero.

This can occur if the MR is de-registered while RDMA read response
packets are being progressed from the SDMA ring.  The suspicion is
that the peer sent an RDMA read request, which has already been copied
across to the peer.  The peer sees the completion of his request and
then communicates to the responder that the MR is not needed any
longer.  The responder tries to de-register the MR, catching some
responses remaining in the SDMA ring holding the MR use count.

The code now uses a get/put paradigm to track MR use counts and
coordinates with the MR de-registration process using a completion
when the count has reached zero.  A timeout on the delay is in place
to catch other EBUSY issues.

The reference count protocol is as follows:
- The return to the user counts as 1
- A reference from the lk_table or the qib_ibdev counts as 1.
- Transient I/O operations increase/decrease as necessary

A lot of code duplication has been folded into the new routines
init_qib_mregion() and deinit_qib_mregion().  Additionally, explicit
initialization of fields to zero is now handled by kzalloc().

Also, the duplicated 'while.*num_sge' code that decrements reference
counts has been consolidated in qib_put_ss().
Reviewed-by: Ramkrishna Vepa <ramkrishna.vepa@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

6a82649f
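
The get/put protocol in this commit can be sketched in user space with C11 atomics. This is an illustration under assumptions, not the qib implementation: `struct mregion`, `mr_init()`, `mr_get()` and `mr_put()` are hypothetical, and the `done` flag stands in for the kernel completion that de-registration waits on.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical memory region with a use count. */
struct mregion {
    atomic_int refcount;
    bool done; /* stands in for signalling a completion */
};

/* The reference handed back to the user counts as 1. */
static void mr_init(struct mregion *mr)
{
    atomic_init(&mr->refcount, 1);
    mr->done = false;
}

/* Taken by the lookup table and by each transient I/O operation. */
static void mr_get(struct mregion *mr)
{
    atomic_fetch_add(&mr->refcount, 1);
}

/* Dropping the last reference is what lets de-registration proceed. */
static void mr_put(struct mregion *mr)
{
    if (atomic_fetch_sub(&mr->refcount, 1) == 1)
        mr->done = true;
}
```

De-registration drops its own reference and then waits for `done`, so responses still sitting in the SDMA ring delay the teardown instead of making it fail with -EBUSY.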

IB/qib: Fix UC MR refs for immediate operations · 354dff1b

Committed by Mike Marciniszyn on Jun 27, 2012

An MR reference leak exists when handling UC RDMA writes with
immediate data because we manipulate the reference counts as if the
operation had been a send.

This patch moves the last_imm label so that the RDMA write operations
with immediate data converge at the cq building code.  The copy/mr
deref code is now done correctly prior to the branch to last_imm.
Reviewed-by: Edward Mascarenhas <edward.mascarenhas@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

354dff1b

IB/mlx4: Add debug prints · b1d8eb5a

Committed by Jack Morgenstein on Jun 19, 2012

Define pr_fmt and add some pr_debug prints.
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <roland@purestorage.com>

b1d8eb5a

IB: Use IS_ENABLED(CONFIG_IPV6) · d90f9b35

Committed by Roland Dreier on Jul 05, 2012

Instead of testing defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE).
Signed-off-by: Roland Dreier <roland@purestorage.com>

d90f9b35

RDMA/cxgb4: Fix endianness of addition to mpa->private_data_size · f747c34a

Committed by Roland Dreier on Jul 05, 2012

sparse correctly warns that if mpa->private_data_size is __be16, then
doing += on it is wrong, even if we do += htons(<something>) -- on a
little endian system, carries will go the wrong way.  Fix this up by
doing the addition in native byte order.
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

f747c34a
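
The carry problem can be demonstrated in user space with the standard `htons()`/`ntohs()` helpers. The function names below are hypothetical, and plain `uint16_t` stands in for the kernel's `__be16`.

```c
#include <arpa/inet.h>
#include <stdint.h>

/* Wrong: adds the raw big-endian bit patterns. On a little-endian host
 * a carry out of the low-order byte propagates in the wrong direction. */
static uint16_t be16_add_wrong(uint16_t be, uint16_t n)
{
    return (uint16_t)(be + htons(n));
}

/* Right: convert to native order, add, then convert back. */
static uint16_t be16_add(uint16_t be, uint16_t n)
{
    return htons((uint16_t)(ntohs(be) + n));
}
```

Adding 1 to a stored 255 shows the difference: the native-order version yields 256 on any host, while the raw addition only happens to be right on big-endian machines.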

08 Jul, 2012 2 commits

{NET, IB}/mlx4: Add device managed flow steering firmware API · 0ff1fb65

Committed by Hadar Hen Zion on Jul 05, 2012

The driver is modified to support three operation modes.

If supported by firmware, use the device managed flow steering
API, which we call device managed steering mode. Else, if
the firmware supports the B0 steering mode, use it, and finally,
if none of the above, use the A0 steering mode.

When the steering mode is device managed, the code is modified
such that L2 based rules set by the mlx4_en driver for Ethernet
unicast and multicast, and the IB stack multicast attach calls
done through the mlx4_ib driver are all routed to use the device
managed API.

When attaching a rule using the device managed flow steering API,
the firmware returns a 64 bit registration id, which is to be
provided during detach.

Currently the firmware is always programmed during HCA initialization
to use standard L2 hashing. Future work should be done to allow
configuring the flow-steering hash function with common, non
proprietary means.
Signed-off-by: Hadar Hen Zion <hadarh@mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

0ff1fb65
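
The three-way fallback between steering modes can be sketched as a simple selection function. The capability struct and all identifiers below are hypothetical; the driver derives the equivalent information from the firmware's reported capabilities.

```c
enum steering_mode {
    STEER_A0,            /* last resort */
    STEER_B0,
    STEER_DEVICE_MANAGED /* preferred when the firmware supports it */
};

/* Hypothetical summary of what the firmware reports. */
struct fw_caps {
    int device_managed_fs;
    int b0_steering;
};

/* Pick the best mode the firmware supports, in order of preference. */
static enum steering_mode choose_steering_mode(const struct fw_caps *caps)
{
    if (caps->device_managed_fs)
        return STEER_DEVICE_MANAGED;
    if (caps->b0_steering)
        return STEER_B0;
    return STEER_A0;
}
```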

RDMA/ocrdma: Fix assignment of max_srq_sge in device query · d1e09ebf

Committed by Roland Dreier on Jul 07, 2012

We want to set attr->max_srq_sge to dev->attr.max_srq_sge, not to itself.

This was detected by Coverity (CID 709210).
Signed-off-by: Roland Dreier <roland@purestorage.com>

d1e09ebf

05 Jul, 2012 1 commit

cxgb3: Convert t3_l2t_get() over to dst_neigh_lookup(). · 534cb283

Committed by David S. Miller on Jul 02, 2012

This means passing in a suitable destination address.
Signed-off-by: David S. Miller <davem@davemloft.net>

534cb283

15 Jun, 2012 1 commit

RDMA/ocrdma: Fix off by one in ocrdma_query_gid() · 7b33dc2b

Committed by Dan Carpenter on Jun 14, 2012

The dev->sgid_tbl[] array is allocated in ocrdma_alloc_resources().
It has OCRDMA_MAX_SGID elements so the test here is off by one.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

7b33dc2b
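
An off-by-one of this kind typically comes from testing `index > MAX` where `index >= MAX` is needed. The sketch below is illustrative only: `MAX_SGID` is a small stand-in for `OCRDMA_MAX_SGID`, and `struct gid` and `query_gid()` are hypothetical simplifications of the driver's types.

```c
#include <errno.h>
#include <string.h>

#define MAX_SGID 8 /* stand-in for OCRDMA_MAX_SGID */

struct gid {
    unsigned char raw[16];
};

/* Copy one entry out of a table holding exactly MAX_SGID elements. */
static int query_gid(const struct gid tbl[MAX_SGID], int index, struct gid *out)
{
    /* Valid indices are 0 .. MAX_SGID - 1; a test of `index > MAX_SGID`
     * would let index == MAX_SGID read one element past the end. */
    if (index < 0 || index >= MAX_SGID)
        return -EINVAL;
    memcpy(out, &tbl[index], sizeof(*out));
    return 0;
}
```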

12 Jun, 2012 2 commits

RDMA/ocrdma: Fixed RQ error CQE polling · a3698a9b

Committed by Parav Pandit on Jun 11, 2012

Fix RQ/SRQ error CQE polling.  Return error CQE to consumer for error
case which was not returned previously.
Signed-off-by: Parav Pandit <parav.pandit@emulex.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

a3698a9b

RDMA/ocrdma: Correct queue SGE calculation · 634c5796

Committed by Mahesh Vardhamanaiah on Jun 08, 2012

Fix max sge calculation for sq, rq, srq for all hardware types.
Signed-off-by: Mahesh Vardhamanaiah <mahesh.vardhamanaiah@emulex.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

634c5796

openeuler / Kernel: synced successfully more than 1 year ago