Commits · adf30907d63893e4208dfe3f5c88ae12bc2f25d5 · openeuler / Kernel

03 Jun, 2009 1 commit

Authored by Eric Dumazet on Jun 02, 2009

Define three accessors to get/set the dst attached to an skb:

struct dst_entry *skb_dst(const struct sk_buff *skb)

void skb_dst_set(struct sk_buff *skb, struct dst_entry *dst)

void skb_dst_drop(struct sk_buff *skb)
This one should replace occurrences of:
dst_release(skb->dst);
skb->dst = NULL;

Delete the skb->dst field.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

adf30907
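
As a rough illustration of the three accessors named in the commit above, here is a user-space sketch. The struct layouts, the `_dst` field name and the mock `dst_release()` are assumptions made for this example only; the kernel's real definitions differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct dst_entry {
    int refcnt;                 /* simplified reference count */
};

struct sk_buff {
    struct dst_entry *_dst;     /* private: reach it only via the accessors */
};

/* Mock of dst_release(): drop one reference, tolerate NULL. */
static void dst_release(struct dst_entry *dst)
{
    if (dst)
        dst->refcnt--;
}

/* Read the dst attached to the skb. */
static struct dst_entry *skb_dst(const struct sk_buff *skb)
{
    return skb->_dst;
}

/* Attach a dst to the skb (no reference is taken in this mock). */
static void skb_dst_set(struct sk_buff *skb, struct dst_entry *dst)
{
    skb->_dst = dst;
}

/* Release the attached dst and clear the pointer in one step. */
static void skb_dst_drop(struct sk_buff *skb)
{
    dst_release(skb->_dst);
    skb->_dst = NULL;
}
```

Routing every access through these helpers is what allows the field itself to be hidden or changed later without touching each caller.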

19 May, 2009 1 commit

net: Fix ipoib rtnl_lock sysfs deadlock. · 26574401

Authored by Eric W. Biederman on May 13, 2009

Network device sysfs files that grab the rtnl_lock unconditionally
will deadlock if accessed when the network device is being
unregistered.  So use rtnl_trylock() and restart_syscall() to avoid
this deadlock.
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

26574401
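
The trylock-and-restart idea from the commit above can be sketched in user space as follows. The lock here is a C11 `atomic_flag`, and `SKETCH_RESTART`, `mock_store()` and the other `mock_` names are hypothetical; the sketch only shows the control flow, not the kernel's actual sysfs handlers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical "please retry" return code; the value is illustrative. */
#define SKETCH_RESTART (-1)

static atomic_flag rtnl_mock = ATOMIC_FLAG_INIT;   /* mock of the global lock */

static bool mock_rtnl_trylock(void)
{
    /* test_and_set returns the previous state: false means we got the lock */
    return !atomic_flag_test_and_set(&rtnl_mock);
}

static void mock_rtnl_unlock(void)
{
    atomic_flag_clear(&rtnl_mock);
}

/* sysfs-style store handler that never blocks on the lock. */
static int mock_store(int *field, int value)
{
    if (!mock_rtnl_trylock())
        return SKETCH_RESTART;  /* back out so the lock holder can make progress */
    *field = value;
    mock_rtnl_unlock();
    return 0;
}
```

Because the handler backs out instead of sleeping on the lock, the unregister path that holds the lock is never left waiting for the handler to finish.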

30 Oct, 2008 1 commit

net: replace %p6 with %pI6 · 5b095d98

Authored by Harvey Harrison on Oct 29, 2008

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

5b095d98

29 Oct, 2008 1 commit

infiniband: ipoib replace IPOIB_GID_FMT with %p6 · fcace2fe

Authored by Harvey Harrison on Oct 28, 2008

Replace all uses of IPOIB_GID_FMT, IPOIB_GID_RAW_ARG() and IPOIB_GID_ARG()
with the %p6 format specifier.
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fcace2fe

01 Oct, 2008 1 commit

IPoIB: Use netif_tx_lock() and get rid of private tx_lock, LLTX · 943c246e

Authored by Roland Dreier on Sep 30, 2008

Currently, IPoIB is an LLTX driver that uses its own IRQ-disabling
tx_lock.  Not only do we want to get rid of LLTX, this actually causes
problems because of the skb_orphan() done with this tx_lock held: some
skb destructors expect to be run with interrupts enabled.

The simplest fix for this is to get rid of the driver-private tx_lock
and stop using LLTX.  We kill off priv->tx_lock and use
netif_tx_lock[_bh]() instead; the patch to do this is a tiny bit
tricky because we need to update places that take priv->lock inside
the tx_lock to disable IRQs, rather than relying on tx_lock having
already disabled IRQs.

Also, there are a couple of places where we need to disable BHs to
make sure we have a consistent context to call netif_tx_lock() (since
we no longer can use _irqsave() variants), and we also have to change
ipoib_send_comp_handler() to call drain_tx_cq() through a timer rather
than directly, because ipoib_send_comp_handler() runs in interrupt
context and drain_tx_cq() must run in BH context so it can call
netif_tx_lock().
Signed-off-by: Roland Dreier <rolandd@cisco.com>

943c246e

09 Aug, 2008 1 commit

IPoIB/cm: Use vmalloc() to allocate rx_rings · b1404069

Authored by David J. Wilder on Aug 08, 2008

There are users that are running UDP applications that require a large
receive queue size in order to get good performance.  To prevent
allocation failures for rx_rings when using non-SRQ mode and large
recv_queue_size (1K or larger), use vmalloc() instead of kcalloc() to
allocate rx_rings.
Signed-off-by: David Wilder <dwilder@us.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

b1404069

30 Jul, 2008 1 commit

IPoIB/cm: Set correct SG list in ipoib_cm_init_rx_wr() · e0819816

Authored by Roland Dreier on Jul 30, 2008

wr->sg_list should be set to the sge pointer passed in, not
priv->cm.rx_sge.
Reported-by: Hoang-Nam Nguyen <HNGUYEN@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

e0819816

15 Jul, 2008 6 commits

IPoIB/cm: Reduce connected mode TX object size · e112373f

Authored by Eli Cohen on Jul 14, 2008

Since IPoIB connected mode does not set NETIF_F_SG, we only have one DMA
mapping per send, so we don't need a mapping[] array.  Define a new
struct with a single u64 mapping member and use it for the CM tx_ring.
Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

e112373f

IPoIB: Use dev_set_mtu() to change mtu · bd360671

Authored by Eli Cohen on Jul 14, 2008

When the driver sets the MTU of the net device outside of its
change_mtu method, it should make use of dev_set_mtu() instead of
directly setting the mtu field of struct net_device.  Otherwise
functions registered to be called upon MTU change will not get called
(this is done through call_netdevice_notifiers() in dev_set_mtu()).
Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

bd360671

IPoIB: Use rtnl lock/unlock when changing device flags · c8c2afe3

Authored by Eli Cohen on Jul 14, 2008

Use of this lock is required to synchronize changes to the netdevice's
data structs.  Also move the call to ipoib_flush_paths() after the
modification of the netdevice flags in set_mode().
Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

c8c2afe3

IPoIB/cm: Fix racy use of receive WR/SGL in ipoib_cm_post_receive_nonsrq() · a7d834c4

Authored by Roland Dreier on Jul 14, 2008

For devices that don't support SRQs, ipoib_cm_post_receive_nonsrq() is
called from both ipoib_cm_handle_rx_wc() and ipoib_cm_nonsrq_init_rx(),
and these two callers are not synchronized against each other.
However, ipoib_cm_post_receive_nonsrq() always reuses the same receive
work request and scatter list structures, so multiple callers can end
up stepping on each other, which leads to posting garbled work
requests.

Fix this by having the caller pass in the ib_recv_wr and ib_sge
structures to use, and allocating new local structures in
ipoib_cm_nonsrq_init_rx().

Based on a patch by Pradeep Satyanarayana <pradeep@us.ibm.com> and
David Wilder <dwilder@us.ibm.com>, with debugging help from Hoang-Nam
Nguyen <hnguyen@de.ibm.com>.
Signed-off-by: Roland Dreier <rolandd@cisco.com>

a7d834c4
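
The fix described above (each caller supplies its own work request and scatter list) can be sketched with mock types. `struct mock_sge`, `struct mock_recv_wr`, `SKETCH_NUM_SGE` and both functions are hypothetical stand-ins written for this example, not the driver's real structures or signatures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NUM_SGE 4            /* hypothetical scatter list length */

struct mock_sge {                   /* stand-in for struct ib_sge */
    uint64_t addr;
    uint32_t length;
};

struct mock_recv_wr {               /* stand-in for struct ib_recv_wr */
    uint64_t         wr_id;
    struct mock_sge *sg_list;
    int              num_sge;
};

/* Initialise a receive WR so that it points at the caller's own list. */
static void mock_init_rx_wr(struct mock_recv_wr *wr, struct mock_sge *sge)
{
    int i;

    for (i = 0; i < SKETCH_NUM_SGE; i++) {
        sge[i].addr = 0;
        sge[i].length = 4096;
    }
    wr->wr_id   = 0;
    wr->sg_list = sge;              /* the array passed in, not a shared one */
    wr->num_sge = SKETCH_NUM_SGE;
}

/* Fill in one receive using only caller-supplied storage. */
static void mock_post_receive(struct mock_recv_wr *wr, struct mock_sge *sge,
                              uint64_t id, uint64_t dma_addr)
{
    wr->wr_id   = id;
    sge[0].addr = dma_addr;
}
```

With separate storage per caller, two unsynchronized paths can no longer overwrite each other's work request while it is being built.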

IPoIB: Copy small received SKBs in connected mode · f89271da

Authored by Eli Cohen on Jul 14, 2008

The connected mode implementation in the IPoIB driver has a large
overhead in the way SKBs are handled in the receive flow.  It usually
allocates an SKB as big as the one used for the currently received SKB
and moves unused fragments from the old SKB to the new one.  This
involves a loop over all the remaining fragments and incurs overhead on
the CPU.  This patch, for small SKBs, allocates an SKB just large
enough to contain the received data and copies the data from the
received SKB into it.  The newly allocated SKB is passed to the stack
and the old SKB is reposted.

When running netperf, UDP small messages, without this patch I get:

    UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
    14.4.3.178 (14.4.3.178) port 0 AF_INET
    Socket  Message  Elapsed      Messages
    Size    Size     Time         Okay Errors   Throughput
    bytes   bytes    secs            #      #   10^6bits/sec

    114688     128   10.00     5142034      0     526.31
    114688           10.00     1130489            115.71

With this patch I get both send and receive at ~315 Mbps.

The reason that send performance actually slows down is as follows:
When using this patch, the overhead of the CPU for handling RX packets
is dramatically reduced.  As a result, we do not experience RNR NAK
messages from the receiver, which cause the connection to be closed and
reopened again; when the patch is not used, the receiver cannot handle
the packets fast enough, so there is less time to post new buffers and
hence the mentioned RNR NAKs.  So what happens is that the
application *thinks* it posted a certain number of packets for
transmission but these packets are flushed and do not really get
transmitted.  Since the connection gets opened and closed many times,
each time netperf gets the CPU time that otherwise would have been
given to IPoIB to actually transmit the packets.  This can be verified
when looking at the port counters -- the output of ifconfig and the
output of netperf (this is for the case without the patch):

    tx packets
    ==========
    port counter:   1,543,996
    ifconfig:       1,581,426
    netperf:        5,142,034

    rx packets
    ==========
    netperf:        1,130,489
Signed-off-by: Eli Cohen <eli@mellanox.co.il>

f89271da
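
The copy-for-small-packets approach in the commit above might look roughly like this in user space. The 256-byte threshold, `struct mock_buf` and the function names are assumptions for illustration; the driver's real threshold and SKB handling are not shown in the commit text.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_COPYBREAK 256        /* hypothetical threshold in bytes */

struct mock_buf {
    unsigned char *data;
    size_t         len;             /* bytes of valid data */
    size_t         cap;             /* allocated size */
};

/*
 * For a small payload, return a freshly allocated buffer just large
 * enough to hold a copy, leaving rx untouched so it can be reposted.
 * Return NULL when the payload is too large (or allocation fails) and
 * the caller has to fall back to the fragment-moving path.
 */
static struct mock_buf *mock_rx_copy_small(const struct mock_buf *rx)
{
    struct mock_buf *small;
    size_t cap = rx->len ? rx->len : 1;

    if (rx->len >= SKETCH_COPYBREAK)
        return NULL;
    small = malloc(sizeof(*small));
    if (!small)
        return NULL;
    small->data = malloc(cap);
    if (!small->data) {
        free(small);
        return NULL;
    }
    memcpy(small->data, rx->data, rx->len);
    small->len = rx->len;
    small->cap = cap;
    return small;
}

static void mock_buf_free(struct mock_buf *b)
{
    if (b) {
        free(b->data);
        free(b);
    }
}
```

The trade-off is one small copy in exchange for skipping the per-fragment loop and keeping the large receive buffer in the ring.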

RDMA: Remove subversion $Id tags · f3781d2e

Authored by Roland Dreier on Jul 14, 2008

They don't get updated by git and so they're worse than useless.
Signed-off-by: Roland Dreier <rolandd@cisco.com>

f3781d2e

30 Apr, 2008 1 commit

IPoIB: Use separate CQ for UD send completions · f56bcd80

Authored by Eli Cohen on Apr 29, 2008

Use a dedicated CQ for UD send completions. Also, do not arm the UD
send CQ, which reduces the number of interrupts generated. This patch
further reduces overhead by not calling poll CQ for every posted send
WR -- it polls only when there are 16 or more outstanding work requests.
Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

f56bcd80
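
A minimal sketch of the "poll only at 16 or more outstanding sends" policy from the commit above. The threshold comes from the commit text; `struct mock_tx`, its counters and the instantly-completing mock hardware are assumptions made so the example is self-contained.

```c
#include <assert.h>

#define SKETCH_POLL_THRESHOLD 16    /* from the commit text */

struct mock_tx {
    unsigned outstanding;   /* posted sends not yet reaped */
    unsigned completed;     /* completions sitting in the unarmed send CQ */
    unsigned polls;         /* number of poll calls, kept for illustration */
};

/* Reap everything currently sitting in the mock CQ. */
static void mock_poll_tx(struct mock_tx *tx)
{
    tx->outstanding -= tx->completed;
    tx->completed = 0;
    tx->polls++;
}

/* Post one send; the mock hardware completes it immediately. */
static void mock_post_send(struct mock_tx *tx)
{
    tx->outstanding++;
    tx->completed++;
    if (tx->outstanding >= SKETCH_POLL_THRESHOLD)
        mock_poll_tx(tx);
}
```

In this model the first fifteen sends cost no poll at all, and the sixteenth reaps the whole batch in one call.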

17 Apr, 2008 3 commits

IPoIB: Handle case when P_Key is deleted and re-added at same index · 9fdd5e5b

Authored by Roland Dreier on Apr 16, 2008

If a P_Key is deleted and then re-added at the same index, then IPoIB
gets confused because __ipoib_ib_dev_flush() only checks whether the
index is the same without checking whether the P_Key was present, so
the interface is stopped when the P_Key is deleted, but the event when
the P_Key is re-added gets ignored and the interface never gets
restarted.

Also, switch to using ib_find_pkey() instead of ib_find_cached_pkey()
everywhere in IPoIB, since none of the places that look for P_Keys are
in a fast path or in non-sleeping context, and in general we want to
kill off the whole caching infrastructure eventually.  This also fixes
consistency problems caused because some IPoIB queries were cached and
some were uncached during the window where the cache was not updated.

Thanks to Venkata Subramonyam <vsubramo@cisco.com> for debugging this
problem and testing this fix.
Signed-off-by: Roland Dreier <rolandd@cisco.com>

9fdd5e5b

IPoIB: Add LSO support · 40ca1988

Authored by Eli Cohen on Apr 16, 2008

For HCAs that support TCP segmentation offload (IB_DEVICE_UD_TSO), set
NETIF_F_TSO and use HW LSO to offload TCP segmentation.
Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

40ca1988

IPoIB: Use checksum offload support if available · 6046136c

Authored by Eli Cohen on Apr 16, 2008

For HCAs that support checksum offload (i.e., those that set
IB_DEVICE_UD_IP_CSUM in the device capability flags), have IPoIB set
NETIF_F_IP_CSUM and use the HCA to generate and verify IP checksums.
Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

6046136c

12 Mar, 2008 2 commits

IPoIB: Allocate priv->tx_ring with vmalloc() · 10313cbb

Authored by Roland Dreier on Mar 12, 2008

Commit 7143740d ("IPoIB: Add send gather support") made struct
ipoib_tx_buf significantly larger, since the mapping member changed
from a single u64 to an array with MAX_SKB_FRAGS + 1 entries. This
means that allocating tx_rings with kzalloc() may fail because there
is not enough contiguous memory for the new, much bigger size. Fix
this regression by allocating the rings with vmalloc() instead.
Signed-off-by: Roland Dreier <rolandd@cisco.com>

10313cbb
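
To make the size increase described above concrete, here is a small sketch comparing a ring entry with a single mapping against one with MAX_SKB_FRAGS + 1 mappings. The struct layouts and the value 18 for `SKETCH_MAX_SKB_FRAGS` are assumptions for illustration (the kernel derives the real value from its page size and configuration).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed value for illustration only. */
#define SKETCH_MAX_SKB_FRAGS 18

struct tx_buf_single {              /* before send gather: one DMA address */
    void    *skb;
    uint64_t mapping;
};

struct tx_buf_gather {              /* after: head plus one per fragment */
    void    *skb;
    uint64_t mapping[SKETCH_MAX_SKB_FRAGS + 1];
};

/* Bytes needed for a ring of `entries` elements of size `elem`. */
static size_t ring_bytes(size_t entries, size_t elem)
{
    return entries * elem;
}
```

On a typical LP64 target these structs come to 16 and 160 bytes, so a hypothetical 4096-entry ring grows from 64 KiB to 640 KiB. A physically contiguous allocation of that size is much more likely to fail than a virtually contiguous one.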

IPoIB/cm: Set tx_wr.num_sge in connected mode post_send() · 4200406b

Authored by Roland Dreier on Mar 11, 2008

Commit 7143740d ("IPoIB: Add send gather support") made it possible
for tx_wr.num_sge to be != 1 -- this happens if send gather support is
enabled. However, the code in the connected mode post_send() function
assumes the old invariant, namely that tx_wr.num_sge is always 1. Fix
this by explicitly setting tx_wr.num_sge to 1 in the CM post_send().
Signed-off-by: Roland Dreier <rolandd@cisco.com>

4200406b

20 Feb, 2008 1 commit

IPoIB/cm: Fix ipoib_cm_dev_stop() cleanup when drain times out · ec229e5e

Authored by Pradeep Satyanarayana on Feb 12, 2008

Commit efcd9971 ("IPoIB/cm: Factor out ipoib_cm_free_rx_reap_list()")
introduced a bug in ipoib_cm_dev_stop() when the receive drain times
out.  In that case, the function moves all the pending rx stuff into a
private list but then calls ipoib_cm_free_rx_reap_list(), which
handles a different list.

Fix this by moving everything to the rx_reap_list that will actually
get freed up.

This fixes <https://bugs.openfabrics.org/show_bug.cgi?id=906>.
Signed-off-by: Pradeep Satyanarayana <pradeeps@linux.vnet.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

ec229e5e

09 Feb, 2008 1 commit

IPoIB: Add send gather support · 7143740d

Authored by Eli Cohen on Jan 30, 2008

This patch prepares for using checksum offload on IB devices capable
of inserting/verifying checksums in IP packets.  The patch does not
actually turn on NETIF_F_SG - we defer that to the patches adding
checksum offload capabilities.

We only add support for send gathers for datagram mode, since existing
HW does not support checksum offload on connected QPs.
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

7143740d

26 Jan, 2008 6 commits

IPoIB/CM: Enable SRQ support on HCAs that support fewer than 16 SG entries · 586a6934

Authored by Pradeep Satyanarayana on Dec 21, 2007

Some HCAs (such as ehca2) support SRQ, but only support fewer than 16 SG
entries for SRQs. Currently IPoIB/CM implicitly assumes all HCAs will
support 16 SG entries for SRQs (to handle a 64K MTU with 4K pages). This
patch removes that restriction by limiting the maximum MTU in connected
mode to what the maximum number of SRQ SG entries allows.

This patch addresses <https://bugs.openfabrics.org/show_bug.cgi?id=728>
Signed-off-by: Pradeep Satyanarayana <pradeeps@linux.vnet.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

586a6934
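
The relationship stated above (16 SG entries cover a 64K MTU with 4K pages) is simple arithmetic, sketched here under the assumption that each SG entry maps exactly one page. The function name is hypothetical, and any header space the driver may reserve is ignored.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Largest payload one receive can hold when each SG entry maps one page.
 * Ignores any per-buffer header reservation the driver may subtract.
 */
static size_t sketch_max_cm_payload(unsigned max_srq_sge, size_t page_size)
{
    return (size_t)max_srq_sge * page_size;
}
```

Under the same assumption, an HCA limited to 8 SRQ SG entries would cap the connected mode payload at 32 KiB with 4 KiB pages.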

IPoIB/cm: Add connected mode support for devices without SRQs · 68e995a2

Authored by Pradeep Satyanarayana on Jan 25, 2008

Some IB adapters (notably IBM's eHCA) do not implement SRQs (shared
receive queues). The current IPoIB connected mode support only works
on devices that support SRQs.

Fix this by adding support for using the receive queue of each
connected mode receive QP. The disadvantage of this compared to using
an SRQ is that it means a full queue of receives must be posted for
each remote connected mode peer, which means that total memory usage
is potentially much higher than when using SRQs. To manage this, add
a new module parameter "max_nonsrq_conn_qp" that limits the number of
connections allowed per interface.

The rest of the changes are fairly straightforward: we use a table of
struct ipoib_cm_rx to hold all the active connections, and put the
table index of the connection in the high bits of receive WR IDs.
This is needed because we cannot rely on the struct ib_wc.qp field for
non-SRQ receive completions. Most of the rest of the changes just
test whether or not an SRQ is available, and post receives or find
received packets in the right place depending on the answer.

Cleaning up dead connections actually becomes simpler, because we do
not have to do the "last WQE reached" dance that is required to
destroy QPs attached to an SRQ. We just move the QP to the error
state and wait for all pending receives to be flushed.
Signed-off-by: Pradeep Satyanarayana <pradeeps@linux.vnet.ibm.com>

[ Completely rewritten and split up, based on Pradeep's work. Several
bugs fixed and no doubt several bugs introduced. - Roland ]
Signed-off-by: Roland Dreier <rolandd@cisco.com>

68e995a2

IPoIB/cm: Factor out ipoib_cm_free_rx_reap_list() · efcd9971

Authored by Roland Dreier on Jan 25, 2008

Factor out the code for going through the rx_reap list of struct
ipoib_cm_rx and freeing each one.  This consolidates the code
duplicated between ipoib_cm_dev_stop() and ipoib_cm_rx_reap() and
reduces the risk of error when adding additional accounting.
Signed-off-by: Roland Dreier <rolandd@cisco.com>

efcd9971

IPoIB/cm: Factor out ipoib_cm_create_srq() · 7b3687df

Authored by Roland Dreier on Jan 25, 2008

Factor out the code to create an SRQ and allocate the receive ring in
ipoib_cm_dev_init() into a new function ipoib_cm_create_srq().  This
will make the code neater when support for devices that don't implement
SRQs is added.
Signed-off-by: Roland Dreier <rolandd@cisco.com>

7b3687df

IPoIB/cm: Factor out ipoib_cm_free_rx_ring() · 1efb6144

Authored by Roland Dreier on Jan 25, 2008

Factor out the code to unmap/free skbs and free the receive ring in
ipoib_cm_dev_cleanup() into a new function ipoib_cm_free_rx_ring().
This function will be called from a couple of other places when
support for devices that don't implement SRQs is added.
Signed-off-by: Roland Dreier <rolandd@cisco.com>

1efb6144

IPoIB: Trivial formatting cleanups · 2337f809

Authored by Roland Dreier on Oct 23, 2007

Fix whitespace blunders, convert "foo* bar" to "foo *bar", etc.
Signed-off-by: Roland Dreier <rolandd@cisco.com>

2337f809

27 Oct, 2007 1 commit

IPoIB/cm: Fix receive QP cleanup · 09f60f8f

Authored by Roland Dreier on Oct 26, 2007

Commit 1b524963 ("IPoIB/cm: Use common CQ for CM send completions")
changed how the high-order bits of work request IDs were used, which
had the effect that IPOIB_CM_RX_DRAIN_WRID was no longer handled as a
connected mode receive completion.  This leads to the messages

    ib1: cm send completion event with wrid 1073741823 (> 64)
    ib1: RX drain timing out

when an interface with connected mode QPs is brought down.  Fix this
by making sure that both IPOIB_OP_CM and IPOIB_OP_RECV are set in
IPOIB_CM_RX_DRAIN_WRID.
Signed-off-by: Roland Dreier <rolandd@cisco.com>

09f60f8f
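
The wrid value in the log message above can be reproduced with a little bit arithmetic. The sketch assumes that the receive flag is bit 31, the connected mode flag is bit 30, and the pre-fix drain ID was 0x7fffffff; these values are assumptions chosen because they yield exactly 1073741823, not something the commit text states.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit positions, chosen to match the number in the log message. */
#define SKETCH_OP_RECV (UINT32_C(1) << 31)
#define SKETCH_OP_CM   (UINT32_C(1) << 30)

enum sketch_kind { SKETCH_CM_RX, SKETCH_UD_RX, SKETCH_CM_TX, SKETCH_UD_TX };

/* Decide which completion handler a work request ID belongs to. */
static enum sketch_kind sketch_classify(uint32_t wrid)
{
    if (wrid & SKETCH_OP_RECV)
        return (wrid & SKETCH_OP_CM) ? SKETCH_CM_RX : SKETCH_UD_RX;
    return (wrid & SKETCH_OP_CM) ? SKETCH_CM_TX : SKETCH_UD_TX;
}

/* Ring index left over once both flag bits are masked off. */
static uint32_t sketch_index(uint32_t wrid)
{
    return wrid & ~(SKETCH_OP_RECV | SKETCH_OP_CM);
}
```

With only bit 30 set, an ID of 0x7fffffff is classified as a connected mode send whose index is 0x3fffffff (1073741823), far beyond any send ring. Setting both flag bits, as the commit describes, routes the drain completion to the connected mode receive path instead.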

20 Oct, 2007 1 commit

IPoIB/cm: Use common CQ for CM send completions · 1b524963

Authored by Michael S. Tsirkin on Aug 16, 2007

Use the same CQ for CM send completions as for all other IPoIB
completions.  This means all completions are processed via the same
NAPI polling routine.  This should help reduce the number of
interrupts for bi-directional traffic (such as TCP) and fixes "driver
is hogging interrupts" errors reported for IPoIB send side, e.g.
<https://bugs.openfabrics.org/show_bug.cgi?id=508>

To do this, keep a per-interface counter of outstanding send WRs, and
stop the interface when this counter reaches the send queue size to
avoid CQ overruns.
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

1b524963

18 Oct, 2007 1 commit

IPoIB: Rewrite "if (!likely(...))" as "if (unlikely(!(...)))" · fd312561

Authored by Roland Dreier on Oct 17, 2007

It's too hard to figure out what "!likely(...)" really means, and who
knows how compilers interpret the hint.
Signed-off-by: Roland Dreier <rolandd@cisco.com>

fd312561
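
The two spellings from the commit above are logically equivalent and differ only in how the branch hint reads. The sketch below defines the macros in the same shape as the kernel's, which relies on the GCC/Clang `__builtin_expect` builtin; `fails_old()` and `fails_new()` are hypothetical names for this example.

```c
#include <assert.h>

/* Same shape as the kernel's macros; needs GCC or Clang. */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Old spelling: hints that cond is true, then negates the result. */
static int fails_old(int cond)
{
    if (!likely(cond))
        return 1;
    return 0;
}

/* New spelling: states directly that the failure branch is the rare one. */
static int fails_new(int cond)
{
    if (unlikely(!(cond)))
        return 1;
    return 0;
}
```

Both functions return the same value for every input; the rewrite only makes the intended hint explicit to the reader.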

11 Oct, 2007 1 commit

[IPoIB]: Convert to netdevice internal stats · de903512

Authored by Roland Dreier on Sep 28, 2007

Use the stats member of struct net_device in IPoIB, so we can save
memory by deleting the stats member of struct ipoib_dev_priv, and save
code by deleting ipoib_get_stats().
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

de903512

10 Oct, 2007 1 commit

IPoIB/cm: Clean up initialization of QP attr in ipoib_cm_create_tx_qp() · ede6bc04

Authored by Dotan Barak on Oct 07, 2007

Make the way the QP is created in ipoib_cm_create_tx_qp()
consistent with ipoib_cm_create_rx_qp().
Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

ede6bc04

11 Jul, 2007 2 commits

IB/cm: Include HCA ACK delay in local ACK timeout · 1d846126

Authored by Sean Hefty on Jun 18, 2007

The IB CM should include the HCA ACK delay when calculating the local
ACK timeout value to use for RC QPs.  If the HCA ACK delay is large
enough relative to the packet life time, then if it is not taken into
account, the calculated timeout value ends up being too small, which
can result in "retry exceeded" errors.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

1d846126

IPoIB/cm: Fix warning if IPV6 is not enabled · 20089ca5

Authored by Roland Dreier on Jul 10, 2007

Fix

    drivers/infiniband/ulp/ipoib/ipoib_cm.c:1151: warning: unused variable 'dev'

by getting rid of the variable dev, which is only used if CONFIG_IPV6
is enabled, and replacing the one use of it with the value it is
assigned, namely priv->dev.
Signed-off-by: Roland Dreier <rolandd@cisco.com>

20089ca5

03 Jul, 2007 1 commit

IPoIB/cm: Partial error clean up unmaps wrong address · 841adfca

Authored by Ralph Campbell on Jun 29, 2007

If a page can't be allocated for the frag list of a skb, the code to
unmap the partially allocated list is off by one.  For example, if
'frags' equals one, i == 0, and the alloc_page() fails, then the old
loop would have unmapped mapping[1], which is uninitialized.  The same
would happen if the call to ib_dma_map_page() failed.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Acked-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

841adfca
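
The off-by-one described above is easier to see in a small model. This sketch assumes slot 0 holds the head mapping and slot i + 1 holds fragment i, as the commit text implies; booleans stand in for DMA mappings, and `SKETCH_FRAGS`, the struct and the function are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_FRAGS 4              /* hypothetical fragment count */

struct sketch_ring_entry {
    bool mapped[SKETCH_FRAGS + 1];  /* [0] is the head, [i + 1] is fragment i */
};

/*
 * Map the head and `frags` fragments (frags <= SKETCH_FRAGS). `fail_at`
 * names the fragment whose allocation fails, or -1 for none.
 * Returns true on success; on failure everything mapped so far is undone.
 */
static bool sketch_alloc_rx(struct sketch_ring_entry *e, int frags, int fail_at)
{
    int i;

    e->mapped[0] = true;
    for (i = 0; i < frags; i++) {
        if (i == fail_at)
            goto partial_error;
        e->mapped[i + 1] = true;
    }
    return true;

partial_error:
    e->mapped[0] = false;
    /* fragments 0 .. i-1 live in slots 1 .. i; slot i + 1 was never mapped */
    for (; i > 0; --i)
        e->mapped[i] = false;
    return false;
}
```

When the very first fragment fails (i == 0), the unwind loop body never runs, so slot 1 is left alone. That is the case the old loop got wrong.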

22 Jun, 2007 3 commits

IPoIB/cm: Remove dead definition of struct ipoib_cm_id · 13ef5f44

Authored by Roland Dreier on Jun 21, 2007

It's completely unused.
Signed-off-by: Roland Dreier <rolandd@cisco.com>

13ef5f44

IPoIB/cm: Fix interoperability when MTU doesn't match · 82c3aca6

Authored by Michael S. Tsirkin on Jun 20, 2007

IPoIB connected mode currently rejects a connection request unless the
supported MTU is >= the local netdevice MTU. This breaks
interoperability with implementations that might have tweaked
IPOIB_CM_MTU, and there is really no longer any reason to do so: this
test is just a leftover from when we did not tweak MTU per-connection.
Fix this by making the test as permissive as possible.
Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

82c3aca6

IPoIB/cm: Initialize RX before moving QP to RTR · 3ec7393a

Authored by Michael S. Tsirkin on Jun 19, 2007

Fix a crasher bug in IPoIB CM: once a QP is in the RTR state, a
receive completion (or even an asynchronous error) might be observed
on this QP, so we have to initialize all of our receive data
structures before moving to the RTR state.

As an optimization (since modify_qp might take a long time), the
jiffies update done when moving RX to the passive_ids list is also
left in place to reduce the chance of the RX being misdetected as
stale.

This fixes bug <https://bugs.openfabrics.org/show_bug.cgi?id=662>.
Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

3ec7393a

30 May, 2007 1 commit

IPoIB/cm: Fix performance regression on Mellanox · ec56dc0b

Authored by Michael S. Tsirkin on May 28, 2007

Commit 518b1646 ("IPoIB/cm: Fix SRQ WR leak") introduced a severe
performance regression on Mellanox cards, because keeping a QP in the
error state for extended periods of time moves hardware to the slow
path (until the QP is destroyed).  For example, MPI latency goes from
~3 usecs to ~7 usecs.

Fix this by posting a send WR on one of the QPs that are being
flushed, instead of using a separate drain QP that is kept in the
error state.

This fixes bug <https://bugs.openfabrics.org/show_bug.cgi?id=636>,
reported and bisected by Scott Weitzenkamp at Cisco and debugged by
Sasha Mikheev at Voltaire.
Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

ec56dc0b

25 May, 2007 1 commit

IPoIB/cm: Drain cq in ipoib_cm_dev_stop() · 2dfbfc37

Authored by Michael S. Tsirkin on May 24, 2007

Since NAPI polling is disabled while ipoib_cm_dev_stop() is running,
ipoib_cm_dev_stop() must poll the CQ itself in order to see the
packets draining.
Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

2dfbfc37
