Commits · 87528227dfa8776d12779d073c217f0835fd6d20 · openeuler / raspberrypi-kernel

24 Apr, 2008 1 commit

IPoIB: Handle 4K IB MTU for UD (datagram) mode · bc7b3a36

Authored by Shirley Ma on Apr 23, 2008

This patch enables IPoIB to use 4K UD messages (when the underlying
device and fabrics support a 4K MTU) by using two scatter buffers when
PAGE_SIZE is less than or equal to the HCA IB MTU size.  The first
buffer is for IPoIB header + GRH header, and the second buffer is the
IPoIB payload, which is 4K-4.
Signed-off-by: Shirley Ma <xma@us.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

bc7b3a36
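
The buffer split described in this commit can be sketched in user-space C. This is a minimal illustration, not the driver's code: `ud_rx_plan()` and `struct ud_rx_layout` are hypothetical names, the 40-byte GRH size comes from the InfiniBand specification, and the 4-byte IPoIB header matches the "4K-4" payload in the message above. The single-buffer sizing (GRH plus IB MTU) is an assumption.

```c
#include <assert.h>
#include <stddef.h>

#define IB_GRH_BYTES    40u /* InfiniBand Global Route Header size */
#define IPOIB_ENCAP_LEN 4u  /* IPoIB encapsulation header size */

/* Hypothetical description of the buffers behind one receive request. */
struct ud_rx_layout {
    int    num_sge;     /* number of scatter entries (1 or 2) */
    size_t head_len;    /* length of the first buffer in bytes */
    size_t payload_len; /* length of the second buffer, 0 if unused */
};

/*
 * Use two scatter buffers when PAGE_SIZE <= IB MTU, as the commit
 * message states; otherwise assume one buffer holds GRH + IB MTU.
 */
struct ud_rx_layout ud_rx_plan(size_t ib_mtu, size_t page_size)
{
    struct ud_rx_layout l;

    assert(ib_mtu > IPOIB_ENCAP_LEN);
    if (page_size <= ib_mtu) {
        l.num_sge = 2;
        l.head_len = IB_GRH_BYTES + IPOIB_ENCAP_LEN; /* GRH + IPoIB header */
        l.payload_len = ib_mtu - IPOIB_ENCAP_LEN;    /* 4K-4 for a 4K MTU */
    } else {
        l.num_sge = 1;
        l.head_len = IB_GRH_BYTES + ib_mtu;
        l.payload_len = 0;
    }
    return l;
}
```

With a 4096-byte page and a 4096-byte IB MTU the sketch picks two buffers of 44 and 4092 bytes; with a 2048-byte IB MTU it keeps a single buffer.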

17 Apr, 2008 3 commits

IPoIB: Handle case when P_Key is deleted and re-added at same index · 9fdd5e5b

Authored by Roland Dreier on Apr 16, 2008

If a P_Key is deleted and then re-added at the same index, then IPoIB
gets confused because __ipoib_ib_dev_flush() only checks whether the
index is the same without checking whether the P_Key was present, so
the interface is stopped when the P_Key is deleted, but the event when
the P_Key is re-added gets ignored and the interface never gets
restarted.

Also, switch to using ib_find_pkey() instead of ib_find_cached_pkey()
everywhere in IPoIB, since none of the places that look for P_Keys are
in a fast path or in non-sleeping context, and in general we want to
kill off the whole caching infrastructure eventually.  This also fixes
consistency problems caused because some IPoIB queries were cached and
some were uncached during the window where the cache was not updated.

Thanks to Venkata Subramonyam <vsubramo@cisco.com> for debugging this
problem and testing this fix.
Signed-off-by: Roland Dreier <rolandd@cisco.com>

9fdd5e5b
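
The fix in this commit amounts to tracking whether the P_Key was present in addition to its index. The following is a simplified sketch of that decision, under the assumption that the flush handler is told whether the lookup succeeded; `on_pkey_event()`, `struct pkey_state`, and the `FLUSH_*` values are hypothetical names, not the driver's.

```c
#include <assert.h>
#include <stdbool.h>

enum flush_action { FLUSH_NONE, FLUSH_STOP, FLUSH_RESTART };

struct pkey_state {
    bool assigned; /* was the P_Key present at the last check? */
    int  index;    /* table index seen at the last check */
};

/*
 * Decide what to do on a P_Key change event.
 * found:     whether the P_Key is currently in the table
 * new_index: its index, meaningful only when found is true
 */
enum flush_action on_pkey_event(struct pkey_state *st, bool found, int new_index)
{
    bool was_assigned;

    assert(st);
    if (!found) {
        st->assigned = false; /* remember the deletion */
        return FLUSH_STOP;
    }
    was_assigned = st->assigned;
    st->assigned = true;
    /* Skip the restart only if the key never went away and did not move. */
    if (was_assigned && new_index == st->index)
        return FLUSH_NONE;
    st->index = new_index;
    return FLUSH_RESTART;
}
```

Comparing indices alone would return `FLUSH_NONE` for a key that was deleted and re-added at the same slot, which is the failure the message describes.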

IPoIB: Add LSO support · 40ca1988

Authored by Eli Cohen on Apr 16, 2008

For HCAs that support TCP segmentation offload (IB_DEVICE_UD_TSO), set
NETIF_F_TSO and use HW LSO to offload TCP segmentation.
Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

40ca1988

IPoIB: Use checksum offload support if available · 6046136c

Authored by Eli Cohen on Apr 16, 2008

For HCAs that support checksum offload (i.e., that set IB_DEVICE_UD_IP_CSUM
in the device capabilities flags), have IPoIB set NETIF_F_IP_CSUM and
use the HCA to generate and verify IP checksums.
Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

6046136c

15 Feb, 2008 1 commit

IPoIB: On P_Key change event, reset state properly · 167c4265

Authored by Jack Morgenstein on Feb 13, 2008

In P_Key event handling, if the old P_Key is no longer available, the
driver must call ipoib_ib_dev_stop() -- just as it does when the P_Key
is still available (see procedure __ipoib_ib_dev_flush()).

When a P_Key becomes available, the driver will perform ipoib_open(),
which assumes that the QP is in RESET, the cm_id has been
destroyed/deleted, etc.  If ipoib_ib_dev_stop() is not called as
described above, then these assumptions will be false, and the attempt
to bring the interface up will fail.

Found by Mellanox QA.
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

167c4265

09 Feb, 2008 1 commit

IPoIB: Add send gather support · 7143740d

Authored by Eli Cohen on Jan 30, 2008

This patch acts as a preparation for using checksum offload for IB
devices capable of inserting/verifying checksum in IP packets.  The
patch does not actually turn on NETIF_F_SG - we defer that to the
patches adding checksum offload capabilities.

We only add support for send gathers for datagram mode, since existing
HW does not support checksum offload on connected QPs.
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

7143740d

26 Jan, 2008 1 commit

IPoIB: Trivial formatting cleanups · 2337f809

Authored by Roland Dreier on Oct 23, 2007

Fix whitespace blunders, convert "foo* bar" to "foo *bar", etc.
Signed-off-by: Roland Dreier <rolandd@cisco.com>

2337f809

20 Oct, 2007 1 commit

IPoIB/cm: Use common CQ for CM send completions · 1b524963

Authored by Michael S. Tsirkin on Aug 16, 2007

Use the same CQ for CM send completions as for all other IPoIB
completions.  This means all completions are processed via the same
NAPI polling routine.  This should help reduce the number of
interrupts for bi-directional traffic (such as TCP) and fixes "driver
is hogging interrupts" errors reported for IPoIB send side, e.g.
<https://bugs.openfabrics.org/show_bug.cgi?id=508>

To do this, keep a per-interface counter of outstanding send WRs, and
stop the interface when this counter reaches the send queue size to
avoid CQ overruns.
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

1b524963
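
The per-interface counter of outstanding send WRs can be pictured with a small user-space sketch. `struct tx_state`, `tx_post()`, and `tx_complete()` are hypothetical names; the boolean stands in for stopping and waking the net device queue, and waking on the first completion is an assumption, since the message only describes when the interface is stopped.

```c
#include <assert.h>
#include <stdbool.h>

struct tx_state {
    unsigned int outstanding; /* send WRs posted but not yet completed */
    unsigned int sendq_size;  /* capacity of the send queue */
    bool queue_stopped;       /* stands in for the netdev queue state */
};

/* Account for one posted send WR; stop the queue when the ring is full. */
void tx_post(struct tx_state *tx)
{
    assert(!tx->queue_stopped && tx->outstanding < tx->sendq_size);
    if (++tx->outstanding == tx->sendq_size)
        tx->queue_stopped = true;
}

/* Account for one send completion; reopen the queue if it was stopped. */
void tx_complete(struct tx_state *tx)
{
    assert(tx->outstanding > 0);
    --tx->outstanding;
    if (tx->queue_stopped)
        tx->queue_stopped = false;
}
```

Because no more sends are posted once the counter reaches the send queue size, the number of pending completions stays bounded, which is how the shared CQ avoids overruns.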

17 Oct, 2007 1 commit

IPoIB: Use round_jiffies() for ah_reap_task · 69fc507a

Authored by Anton Blanchard on Oct 15, 2007

Use round_jiffies() to align the 1 second ah_reap_task with other work
and potentially save power by sleeping cores for longer.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

69fc507a

11 Oct, 2007 2 commits

[IPoIB]: Convert to netdevice internal stats · de903512

Authored by Roland Dreier on Sep 28, 2007

Use the stats member of struct netdevice in IPoIB, so we can save
memory by deleting the stats member of struct ipoib_dev_priv, and save
code by deleting ipoib_get_stats().
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

de903512

[NET]: Make NAPI polling independent of struct net_device objects. · bea3348e

Authored by Stephen Hemminger on Oct 03, 2007

Several devices have multiple independent RX queues per net
device, and some have a single interrupt doorbell for several
queues.

In either case, it's easier to support layouts like that if the
structure representing the poll is independent from the net
device itself.

The signature of the ->poll() call back goes from:

	int foo_poll(struct net_device *dev, int *budget)

to

	int foo_poll(struct napi_struct *napi, int budget)

The caller is returned the number of RX packets processed (or
the number of "NAPI credits" consumed if you want to get
abstract).  The callee no longer messes around bumping
dev->quota, *budget, etc. because that is all handled in the
caller upon return.

The napi_struct is to be embedded in the device driver private data
structures.

Furthermore, it is the driver's responsibility to disable all NAPI
instances in its ->stop() device close handler.  Since the
napi_struct is privatized into the driver's private data structures,
only the driver knows how to get at all of the napi_struct instances
it may have per-device.

With lots of help and suggestions from Rusty Russell, Roland Dreier,
Michael Chan, Jeff Garzik, and Jamal Hadi Salim.

Bug fixes from Thomas Graf, Roland Dreier, Peter Zijlstra,
Joseph Fannin, Scott Wood, Hans J. Koch, and Michael Chan.

[ Ported to current tree and all drivers converted.  Integrated
  Stephen's follow-on kerneldoc additions, and restored poll_list
  handling to the old style to fix mutual exclusion issues.  -DaveM ]
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

bea3348e
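
A user-space mock can show the shape of a poll routine under the new signature. This is only a sketch: `struct napi_struct` here is a stand-in with a single made-up field, `struct foo_priv` and `priv_of()` are hypothetical, and clearing `scheduled` merely represents finishing the poll when less than the full budget was used.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the kernel's napi_struct; only what the sketch needs. */
struct napi_struct {
    int scheduled;
};

/* Hypothetical driver private data with the NAPI context embedded. */
struct foo_priv {
    struct napi_struct napi;
    int rx_pending; /* packets waiting in the simulated RX ring */
};

/* Recover the private data from the embedded napi_struct pointer. */
#define priv_of(ptr) \
    ((struct foo_priv *)((char *)(ptr) - offsetof(struct foo_priv, napi)))

/* New-style poll: take a budget by value, return the work done. */
int foo_poll(struct napi_struct *napi, int budget)
{
    struct foo_priv *priv = priv_of(napi);
    int done = 0;

    assert(budget > 0);
    while (done < budget && priv->rx_pending > 0) {
        priv->rx_pending--; /* "process" one packet */
        done++;
    }
    if (done < budget)
        napi->scheduled = 0; /* ring drained: polling is finished */
    return done;
}
```

The callee only reports how many packets it handled; it never touches a device quota or a budget pointer, matching the description of the new calling convention.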

10 Oct, 2007 1 commit

IPoIB: Make sure no receives are handled when stopping device · ce423ef5

Authored by Roland Dreier on Oct 09, 2007

The current IPoIB code might process receive completions from
ipoib_drain_cq() when bringing down the interface.  This could cause
packets to be passed up the stack without the device's poll method
being called.  Avoid this by setting the status of any successful
completions to IB_WC_WR_FLUSH_ERR.
Signed-off-by: Roland Dreier <rolandd@cisco.com>

ce423ef5

11 Jul, 2007 1 commit

IPoIB: Recycle loopback skbs instead of freeing and reallocating · 1b844afe

Authored by Roland Dreier on Jul 10, 2007

InfiniBand HCAs replicate multicast packets back to the QP that sent
them if that QP is attached to the destination multicast group. This
means that IPoIB multicasts are often replicated back to the receive
queue of the interface that generated them. To avoid confusing the
network stack, we drop these duplicates within the IPoIB driver.

However, there's no reason to free the skb that received the duplicate
and then immediately allocate a new skb to post to the receive queue.
We can be more efficient and just repost the same skb.
Signed-off-by: Roland Dreier <rolandd@cisco.com>

1b844afe

25 May, 2007 1 commit

IPoIB/cm: Drain cq in ipoib_cm_dev_stop() · 2dfbfc37

Authored by Michael S. Tsirkin on May 24, 2007

Since NAPI polling is disabled while ipoib_cm_dev_stop() is running,
ipoib_cm_dev_stop() must poll the CQ itself in order to see the
packets draining.
Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

2dfbfc37

22 May, 2007 1 commit

IB/ipoib: Fix typos in error messages · 24bd1e4e

Authored by Michael S. Tsirkin on May 18, 2007

Trivial error message fixups.
Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

24bd1e4e

19 May, 2007 1 commit

IPoIB: Handle P_Key table reordering · 26bbf13c

Authored by Yosef Etigin on May 19, 2007

SM reconfiguration or failover possibly causes a shuffling of the values
in the P_Key table. Right now, IPoIB only queries for the P_Key index
once when it creates the device QP, and hence there are problems if the
index of a P_Key value changes.  Fix this by using the PKEY_CHANGE event
to trigger a recheck of the P_Key index.
Signed-off-by: Yosef Etigin <yosefe@voltaire.com>
Acked-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

26bbf13c

07 May, 2007 1 commit

IPoIB: Convert to NAPI · 8d1cc86a

Authored by Roland Dreier on May 06, 2007

Convert the IP-over-InfiniBand network device driver over to using
NAPI to handle completions for the main CQ.  This covers all receives
as well as datagram mode sends; send completions for connected mode
connections are still handled from interrupt context.
Signed-off-by: Roland Dreier <rolandd@cisco.com>

8d1cc86a

26 Apr, 2007 1 commit

[SK_BUFF]: Introduce skb_reset_mac_header(skb) · 459a98ed

Authored by Arnaldo Carvalho de Melo on Mar 19, 2007

For the common, open coded 'skb->mac.raw = skb->data' operation, so that we can
later turn skb->mac.raw into an offset, reducing the size of struct sk_buff in
64bit land while possibly keeping it as a pointer on 32bit.

This one touches just the most simple case, next will handle the slightly more
"complex" cases.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

459a98ed

19 Apr, 2007 1 commit

IPoIB: Remove pointless opcode field from debugging output · a89875fc

Authored by Roland Dreier on Apr 18, 2007

There's no point in printing the opcode field in the completion
handling debugging output, since the type of completion is already
printed at the beginning of the line. In fact the opcode field is not
even defined for completions with a status other than success.
Signed-off-by: Roland Dreier <rolandd@cisco.com>

a89875fc

23 Mar, 2007 1 commit

IB/ipoib: Fix thinko in packet length checks · 77d8e1ef

Authored by Michael S. Tsirkin on Mar 21, 2007

The packet length checks in ipoib are broken: we add 4 bytes (IPoIB
encapsulation header) when sending a packet, not 20 bytes (hardware
address length) to each packet. Therefore, if connected mode is
enabled so that the interface MTU is larger than the multicast MTU,
IPoIB may end up trying to send too-long multicast packets. For
example, multicast is broken if a message of size 2048 bytes is sent
on an interface with UD MTU 2048, because 2048 is bigger than the real
limit of 2044 but the code tests against the wrong limit of 2060.

This patch fixes <https://bugs.openfabrics.org/show_bug.cgi?id=418>,
submitted by Scott Weitzenkamp <sweitzen@cisco.com>.
Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

77d8e1ef
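
The arithmetic in this message can be checked with a few lines of C. The sketch below is illustrative only: `ud_payload_limit()` and `payload_fits()` are hypothetical helpers, and the limits 2044 and 2060 are taken directly from the text above rather than derived from the driver source.

```c
#include <assert.h>
#include <stdbool.h>

#define IPOIB_ENCAP_LEN 4u /* bytes added to each packet on send */

/* Largest payload that still fits once the 4-byte header is added. */
unsigned int ud_payload_limit(unsigned int ud_mtu)
{
    assert(ud_mtu > IPOIB_ENCAP_LEN);
    return ud_mtu - IPOIB_ENCAP_LEN;
}

/* Generic length check against whichever limit the caller supplies. */
bool payload_fits(unsigned int len, unsigned int limit)
{
    return len <= limit;
}
```

For a UD MTU of 2048 the real limit is 2044, so a 2048-byte message must be rejected, yet it passes a check against 2060.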

11 Feb, 2007 1 commit

IPoIB: Connected mode experimental support · 839fcaba

Authored by Michael S. Tsirkin on Feb 05, 2007

The following patch adds experimental support for IPoIB connected
mode, as defined by the draft from the IETF ipoib working group.  The
idea is to increase performance by increasing the MTU from the maximum
of 2K (theoretically 4K) supported by IPoIB on top of UD.  With this
code, I'm able to get 800MByte/sec or more with netperf without
options on a Mellanox 4x back-to-back DDR system.

Some notes on code:
1. SRQ is used for scalability to large cluster sizes
2. Only RC connections are used (UC does not support SRQ now)
3. Retry count is set to 0 since spec draft warns against retries
4. Each connection is used for data transfers in only 1 direction, so
   each connection is either active (TX) or passive (RX).  2 sides that
   want to communicate create 2 connections.
5. Each active (TX) connection has a separate CQ for send completions -
   this keeps the code simple without CQ resize and other tricks
6. To detect stale passive side connections (where the remote side is
   down), we keep an LRU list of passive connections (updated once per
   second per connection) and destroy a connection after it has been
   unused for several seconds. The LRU rule makes it possible to avoid
   scanning connections that have recently been active.
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

839fcaba
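
Note 6 above describes an LRU list of passive connections. A minimal user-space sketch of that idea follows; every name in it (`struct rx_conn`, `lru_touch()`, `lru_reap()`, and so on) is hypothetical, time is a plain count of seconds, and the timeout is a parameter because the message only says "several seconds".

```c
#include <assert.h>
#include <stddef.h>

struct rx_conn {
    struct rx_conn *prev, *next;
    long last_used; /* time of the last LRU update, in seconds */
};

/* Sentinel-based list: head.next is most recent, head.prev is oldest. */
struct rx_lru {
    struct rx_conn head;
};

void lru_init(struct rx_lru *l)
{
    l->head.prev = l->head.next = &l->head;
}

static void unlink_conn(struct rx_conn *c)
{
    c->prev->next = c->next;
    c->next->prev = c->prev;
}

static void push_front(struct rx_lru *l, struct rx_conn *c)
{
    c->next = l->head.next;
    c->prev = &l->head;
    l->head.next->prev = c;
    l->head.next = c;
}

void lru_add(struct rx_lru *l, struct rx_conn *c, long now)
{
    c->last_used = now;
    push_front(l, c);
}

/* On receive: refresh the position at most once per second. */
void lru_touch(struct rx_lru *l, struct rx_conn *c, long now)
{
    if (now - c->last_used < 1)
        return;
    unlink_conn(c);
    c->last_used = now;
    push_front(l, c);
}

/* Unlink connections idle for more than timeout seconds; return the count. */
int lru_reap(struct rx_lru *l, long now, long timeout)
{
    int reaped = 0;

    assert(timeout > 0);
    /* Walk from the oldest end and stop at the first fresh connection. */
    while (l->head.prev != &l->head &&
           now - l->head.prev->last_used > timeout) {
        unlink_conn(l->head.prev);
        reaped++;
    }
    return reaped;
}
```

Because the list is ordered by last use, the reaper stops at the first connection that is still fresh and never scans recently active ones.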

13 Dec, 2006 1 commit

IPoIB: Use the new verbs DMA mapping functions · 37ccf9df

Authored by Ralph Campbell on Dec 12, 2006

Convert IPoIB to use the new DMA mapping functions
for kernel verbs consumers.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

37ccf9df

22 Nov, 2006 1 commit

WorkStruct: make allyesconfig · c4028958

Authored by David Howells on Nov 22, 2006

Fix up for make allyesconfig.
Signed-Off-By: David Howells <dhowells@redhat.com>

c4028958

11 Oct, 2006 1 commit

IPoIB: Check for DMA mapping error for TX packets · 73fbe8be

Authored by Roland Dreier on Oct 10, 2006

Signed-off-by: Roland Dreier <rolandd@cisco.com>

73fbe8be

23 Sep, 2006 3 commits

IPoIB: Add some likely/unlikely annotations in hot path · a8bfca02

Authored by Eli Cohen on Sep 22, 2006

Signed-off-by: Eli Cohen <eli@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

a8bfca02

IPoIB: Rejoin all multicast groups after a port event · 5ccd0255

Authored by Eli Cohen on Sep 22, 2006

When ipoib_ib_dev_flush() is called because of a port event, the
driver needs to rejoin all multicast groups, since the flush will call
ipoib_mcast_dev_flush() (via ipoib_ib_dev_down()).  Otherwise no
(non-broadcast) multicast groups will be rejoined until the networking
core calls ->set_multicast_list again, and so multicast reception will
be broken for potentially a long time.
Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

5ccd0255

IPoIB: Refactor completion handling · 2439a6e6

Authored by Roland Dreier on Sep 22, 2006

Split up ipoib_ib_handle_wc() into ipoib_ib_handle_rx_wc() and
ipoib_ib_handle_tx_wc() to make the code easier to read.  This will
also help implement NAPI in the future.
Signed-off-by: Roland Dreier <rolandd@cisco.com>

2439a6e6

18 Jun, 2006 1 commit

IPoIB: Avoid using stale last_send counter when reaping AHs · 31c02e21

Authored by Roland Dreier on Jun 17, 2006

The comparisons of priv->tx_tail to ah->last_send in ipoib_free_ah()
and ipoib_post_receive() are slightly unsafe, because priv->tx_lock is
not held and hence a stale value of ah->last_send might be used, which
would lead to freeing an AH before the driver was really done with it.
The simple way to fix this is to remove the optimization of early free from
ipoib_free_ah() and unconditionally queue AHs for reaping, and then
take priv->tx_lock in __ipoib_reap_ah().
Signed-off-by: Roland Dreier <rolandd@cisco.com>

31c02e21

06 Jun, 2006 1 commit

IPoIB: Fix AH leak at interface down · 959eb392

Authored by Eli Cohen on Jun 05, 2006

When ipoib_stop() is called it first calls netif_stop_queue() to stop
the kernel from passing more packets to the network driver. However,
the completion handler may call netif_wake_queue() re-enabling packet
transfer.

This might result in leaks (we see AH leaks which we think can be
attributed to this bug) as new packets get posted while the interface
is going down.
Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Michael Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

959eb392

11 Apr, 2006 1 commit

IPoIB: Make send and receive queue sizes tunable · 0f485251

Authored by Shirley Ma on Apr 10, 2006

Make IPoIB's send and receive queue sizes tunable via module
parameters ("send_queue_size" and "recv_queue_size").  This allows the
queue sizes to be enlarged to fix disastrously bad performance on some
platforms and workloads, without bloating memory usage when large
queues aren't needed.
Signed-off-by: Shirley Ma <xma@us.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

0f485251

25 Mar, 2006 2 commits

IPoIB: P_Key change event handling · 7a343d4c

Authored by Leonid Arsh on Mar 23, 2006

This patch causes the network interface to respond to P_Key change
events correctly.  As a result, you'll see a child interface in the
"RUNNING" state (netif_carrier_on()) only when the corresponding P_Key
is configured by the SM.  When SM removes a P_Key, the "RUNNING" state
will be disabled for the corresponding network interface.  To
implement this, I added IB_EVENT_PKEY_CHANGE event handling.  To
prevent flushing the device before the device is open by the "delay
open" mechanism, I added an additional device flag called
IPOIB_FLAG_INITIALIZED.

This also prevents the child network interface from trying to join
multicast groups until the PKEY is configured.  We used to get error
messages like:

    ib0.f2f2: couldn't attach QP to multicast group ff12:401b:f2f2:0:0:0:ffff:ffff

in this case.  To fix this, I just check IPOIB_FLAG_OPER_UP flag in
ipoib_set_mcast_list().
Signed-off-by: Leonid Arsh <leonida@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

7a343d4c

IPoIB: Pass correct pointer when flushing child interfaces · 6f633c8d

Authored by Leonid Arsh on Mar 24, 2006

ipoib_ib_dev_flush() should get passed cpriv->dev, not &cpriv->dev.
Signed-off-by: Leonid Arsh <leonida@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

6f633c8d

21 Mar, 2006 2 commits

IPoIB: Move ipoib_ib_dev_flush() to ipoib workqueue · 0b3ea082

Authored by Jack Morgenstein on Mar 20, 2006

Move ipoib_ib_dev_flush() to ipoib's workqueue.  This keeps it ordered
with respect to other work scheduled by the ipoib driver.  This fixes
problems with races, for example:
 - ipoib_ib_dev_flush() has started running because of an IB event
 - user does ifconfig ib0 down
 - ipoib_mcast_stop_thread() gets called twice and waits for the same
   completion twice
Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

0b3ea082

IPoIB: Clean up if posting receives fails · 54d07e2a

Authored by Eli Cohen on Mar 02, 2006

If posting receives in ipoib_ib_dev_open() fails, call
ipoib_ib_dev_stop() to move the device's QP back to the RESET state so
that we can try again later.
Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

54d07e2a

14 Jan, 2006 1 commit

IB: convert from semaphores to mutexes · 95ed644f

Authored by Ingo Molnar on Jan 13, 2006

semaphore to mutex conversion by Ingo and Arjan's script.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
[ Sanity-checked on real IB hardware ]
Signed-off-by: Roland Dreier <rolandd@cisco.com>

95ed644f

13 Jan, 2006 1 commit

IPoIB: Fix memory leak of multicast group structures · 988bd503

Authored by Eli Cohen on Jan 12, 2006

The current handling of multicast groups in IPoIB ends up never
freeing send-only multicast groups.  It turns out the logic was much
more complicated than it needed to be; we can fix this bug and
completely kill ipoib_mcast_dev_down() at the same time.
Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

988bd503

30 Nov, 2005 1 commit

IPoIB: protect child list in ipoib_ib_dev_flush · 4f71055a

Authored by Michael S. Tsirkin on Nov 29, 2005

race condition: ipoib_ib_dev_flush is accessing child list without locks.
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

4f71055a

03 Nov, 2005 1 commit

[IPoIB] don't compile debug code if debugging isn't enabled · 8ae5a8a2

Authored by Roland Dreier on Nov 02, 2005

Don't build ipoib_mcast_iter_ functions if CONFIG_INFINIBAND_IPOIB_DEBUG
is not enabled -- their only callers will not be built either.

Also move the prototype for ipoib_open() to ipoib.h to fix a sparse warning.
Signed-off-by: Roland Dreier <rolandd@cisco.com>

8ae5a8a2

31 Oct, 2005 1 commit

[IPoIB] cleanups: fix comment, remove useless variables · 3bc12e75

Authored by Roland Dreier on Oct 30, 2005

Minor cleanups: fix a misleading comment, and get rid of attr_mask
variables that are only used to hold constants (just use the constants
directly).
Signed-off-by: Roland Dreier <rolandd@cisco.com>

3bc12e75

29 Oct, 2005 1 commit

[IPoIB] Drop RX packets when out of memory · 1993d683

Authored by Roland Dreier on Oct 28, 2005

Change the way IPoIB handles RX packets when it can't allocate a new
receive skbuff. If the allocation of a new receive skb fails, we now
drop the packet we just received and repost the original receive skb.
This means that the receive ring always stays full and we don't have
to monkey around with trying to schedule a refill task for later.
Signed-off-by: Roland Dreier <rolandd@cisco.com>

1993d683
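
The drop-and-repost policy in this last commit can be sketched outside the kernel. The names below (`struct rx_slot`, `rx_complete()`, `alloc_fail()`) are hypothetical, plain memory buffers stand in for skbs, and the allocator is passed in as a function pointer only so the failure path can be exercised.

```c
#include <assert.h>
#include <stdlib.h>

/* One receive ring entry holding the buffer currently posted. */
struct rx_slot {
    void *buf;
};

/* Allocator hook so the out-of-memory path can be simulated. */
typedef void *(*alloc_fn)(size_t);

/* Allocator that always fails, for exercising the drop path. */
void *alloc_fail(size_t n)
{
    (void)n;
    return NULL;
}

/*
 * Handle one receive completion. Returns the buffer to pass up the
 * stack, or NULL if the packet was dropped. Either way the slot still
 * holds a buffer afterwards, so the ring stays full.
 */
void *rx_complete(struct rx_slot *slot, size_t buf_size, alloc_fn alloc)
{
    void *received = slot->buf;
    void *fresh;

    assert(received != NULL);
    fresh = alloc(buf_size);
    if (!fresh)
        return NULL;   /* drop: the original buffer is reposted as is */
    slot->buf = fresh; /* post the replacement buffer */
    return received;
}
```

No deferred refill task is needed in this scheme, since a slot is never left empty.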