Commits · 521e575b9a7324a0bca762622139f69582a042bf · openeuler / Kernel

15 Jul, 2008 · 30 commits

IB/mlx4: Add support for blocking multicast loopback packets · 521e575b

Committed by Ron Livne on Jul 14, 2008

Add support for handling the IB_QP_CREATE_MULTICAST_BLOCK_LOOPBACK
flag by using the per-multicast group loopback blocking feature of
mlx4 hardware.
Signed-off-by: Ron Livne <ronli@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

RDMA/cxgb3: Add support for protocol statistics · 14cc180f

Committed by Steve Wise on Jul 14, 2008

- Add a new rdma ctl command called RDMA_GET_MIB to the cxgb3 low
  level driver to obtain the protocol mib from the rnic hardware.

- Add new iw_cxgb3 provider method to get the MIB from the low level
  driver.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

RDMA/core: Add iWARP protocol statistics attributes in sysfs · 7f624d02

Committed by Steve Wise on Jul 14, 2008

This patch adds a sysfs attribute group called "proto_stats" under
/sys/class/infiniband/$device/ and populates this group with protocol
statistics if they exist for a given device.  Currently, only iWARP
stats are defined, but the code is designed to allow InfiniBand
protocol stats if they become available.  These stats are per-device
and more importantly -not- per port.

Details:

- Add union rdma_protocol_stats in ib_verbs.h.  This union allows
  defining transport-specific stats.  Currently only iwarp stats are
  defined.

- Add struct iw_protocol_stats to define the current set of iwarp
  protocol stats.

- Add new ib_device method called get_proto_stats() to return protocol
  statistics.

- Add logic in core/sysfs.c to create iwarp protocol stats attributes
  if the device is an RNIC and has a get_proto_stats() method.
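
As a rough illustration of how a user-space consumer might read the new attribute group, here is a minimal Python sketch. It assumes each file under `proto_stats` holds a single integer counter; the helper name `read_proto_stats` and the `sysfs_root` parameter are hypothetical and not part of the patch.

```python
import os


def read_proto_stats(device, sysfs_root="/sys/class/infiniband"):
    """Return {attribute_name: value} for a device's proto_stats group.

    Returns an empty dict when the group does not exist, for example when
    the device exposes no protocol statistics.
    Assumption: every attribute file contains one integer.
    """
    group = os.path.join(sysfs_root, device, "proto_stats")
    stats = {}
    if not os.path.isdir(group):
        return stats
    for name in sorted(os.listdir(group)):
        path = os.path.join(group, name)
        if not os.path.isfile(path):
            continue
        with open(path) as f:
            stats[name] = int(f.read().strip())
    return stats
```

Calling `read_proto_stats("some_device")` with a real device name would return the per-device counters; because the stats are per device rather than per port, no port number is needed.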
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IPoIB/cm: Fix racy use of receive WR/SGL in ipoib_cm_post_receive_nonsrq() · a7d834c4

Committed by Roland Dreier on Jul 14, 2008

For devices that don't support SRQs, ipoib_cm_post_receive_nonsrq() is
called from both ipoib_cm_handle_rx_wc() and ipoib_cm_nonsrq_init_rx(),
and these two callers are not synchronized against each other.
However, ipoib_cm_post_receive_nonsrq() always reuses the same receive
work request and scatter list structures, so multiple callers can end
up stepping on each other, which leads to posting garbled work
requests.

Fix this by having the caller pass in the ib_recv_wr and ib_sge
structures to use, and allocating new local structures in
ipoib_cm_nonsrq_init_rx().

Based on a patch by Pradeep Satyanarayana <pradeep@us.ibm.com> and
David Wilder <dwilder@us.ibm.com>, with debugging help from Hoang-Nam
Nguyen <hnguyen@de.ibm.com>.
Signed-off-by: Roland Dreier <rolandd@cisco.com>

RDMA/cma: Add missing newlines to printk()s · 468f2239

Committed by Roland Dreier on Jul 14, 2008

Signed-off-by: Roland Dreier <rolandd@cisco.com>
Acked-by: Sean Hefty <sean.hefty@intel.com>

RDMA/cxgb3: Remove write-only iwch_rnic_attributes fields · eec8845d

Committed by Roland Dreier on Jul 14, 2008

The members struct iwch_rnic_attributes.vendor_id and .vendor_part_id
are write-only, so we might as well get rid of them.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>

RDMA/cxgb3: Fix up some ib_device_attr fields · 97d1cc80

Committed by Steve Wise on Jul 14, 2008

- set fw_ver
- set hw_ver
- set max_qp_wr to something reasonable
- set max_cqe to something reasonable
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/ehca: In case of lost interrupts, trigger EOI to reenable interrupts · 6f7bc01a

Committed by Stefan Roscher on Jul 14, 2008

During corner case testing, we noticed that some versions of ehca do
not properly transition to interrupt done in special load situations.
This can be resolved by periodically triggering EOI through H_EOI, if
EQEs are pending.
Signed-off-by: Stefan Roscher <stefan.roscher@de.ibm.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/ehca: Reject receive work requests if QP is in RESET state · 3e255eac

Committed by Joachim Fenkes on Jul 14, 2008

Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/mlx4: Remove extra code for RESET->ERR QP state transition · 7c27f358

Committed by Roland Dreier on Jul 14, 2008

Commit 65adfa91 ("IB/mlx4: Fix RESET to RESET and RESET to ERROR
transitions") added some extra code to handle a QP state transition
from RESET to ERROR.  However, the latest 1.2.1 version of the IB spec
has clarified that this transition is actually not allowed, so we can
remove this extra code again.
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/mthca: Remove extra code for RESET->ERR QP state transition · d3809ad0

Committed by Roland Dreier on Jul 14, 2008

Commit b18aad71 ("IB/mthca: Fix RESET to ERROR transition") added some
extra code to handle a QP state transition from RESET to ERROR.
However, the latest 1.2.1 version of the IB spec has clarified that
this transition is actually not allowed, so we can remove this extra
code again.
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/core: Reset to error QP state transition is not allowed · e5a5e7d5

Committed by Ralph Campbell on Jul 14, 2008

I was reviewing the QP state transition diagram in the IB 1.2.1 spec
and the code for qp_state_table[], and noticed that the code allows a
QP to be modified from IB_QPS_RESET to IB_QPS_ERR whereas the notes
for figure 124 (pg 457) specifically say that this transition isn't
allowed.  This is a clarification from earlier versions of the IB
spec, which were ambiguous in this area and suggested that the RESET
to ERR transition was allowed.

Fix up the qp_state_table[] to make RESET->ERR not allowed.
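
The effect of the change can be pictured with a small lookup table. The sketch below is a simplified Python model, not the kernel's `qp_state_table[]` (which also tracks required and optional attribute masks per QP type); the state names and the `transition_allowed` helper are illustrative assumptions. The point it shows is that RESET no longer lists ERR as a valid target.

```python
# Simplified, illustrative model of allowed QP state transitions.
# Assumption: this mirrors only the "is the transition valid" aspect.
ALLOWED_TRANSITIONS = {
    "RESET": {"RESET", "INIT"},          # ERR intentionally absent
    "INIT":  {"RESET", "ERR", "INIT", "RTR"},
    "RTR":   {"RESET", "ERR", "RTS"},
    "RTS":   {"RESET", "ERR", "RTS", "SQD"},
    "SQD":   {"RESET", "ERR", "RTS", "SQD"},
    "SQE":   {"RESET", "ERR", "RTS"},
    "ERR":   {"RESET", "ERR"},
}


def transition_allowed(cur_state, new_state):
    """Return True if the model permits cur_state -> new_state."""
    return new_state in ALLOWED_TRANSITIONS.get(cur_state, set())
```

With this model, `transition_allowed("RESET", "ERR")` is false while `transition_allowed("INIT", "ERR")` remains true.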
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/mlx4: Pass congestion management class MADs to the HCA · 6578cf33

Committed by Eli Cohen on Jul 14, 2008

ConnectX HCAs support the IB_MGMT_CLASS_CONG_MGMT management class, so
process MADs of this class through the MAD_IFC firmware command.
Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/mlx4: Configure QPs' max message size based on real device capability · d1f2cd89

Committed by Eli Cohen on Jul 14, 2008

ConnectX returns the max message size it supports through the
QUERY_DEV_CAP firmware command. When modifying a QP to RTR, the max
message size for the QP must be specified. This value must not exceed
the value declared through QUERY_DEV_CAP. The current code ignores
the max allowed size and unconditionally sets the value to 2^31. This
patch sets all QPs to the max value allowed as returned from firmware.
Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

RDMA/cxgb3: MEM_MGT_EXTENSIONS support · e7e55829

Committed by Steve Wise on Jul 14, 2008

- set IB_DEVICE_MEM_MGT_EXTENSIONS capability bit if fw supports it.
- set max_fast_reg_page_list_len device attribute.
- add iwch_alloc_fast_reg_mr function.
- add iwch_alloc_fastreg_pbl
- add iwch_free_fastreg_pbl
- adjust the WQ depth for kernel mode work queues to account for
  fastreg possibly taking 2 WR slots.
- add fastreg_mr work request support.
- add local_inv work request support.
- add send_with_inv and send_with_se_inv work request support.
- removed useless duplicate enums/defines for TPT/MW/MR stuff.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

RDMA/core: Add memory management extensions support · 00f7ec36

Committed by Steve Wise on Jul 14, 2008

This patch adds support for the IB "base memory management extension"
(BMME) and the equivalent iWARP operations (which the iWARP verbs
mandate that all devices implement).  The new operations are:

 - Allocate an ib_mr for use in fast register work requests.

 - Allocate/free physical buffer lists for use in fast register work
   requests.  This allows device drivers to allocate this memory as
   needed for use in posting send requests (eg via dma_alloc_coherent).

 - New send queue work requests:
   * send with remote invalidate
   * fast register memory region
   * local invalidate memory region
   * RDMA read with invalidate local memory region (iWARP only)

Consumer interface details:

 - A new device capability flag IB_DEVICE_MEM_MGT_EXTENSIONS is added
   to indicate device support for these features.

 - New send work request opcodes IB_WR_FAST_REG_MR, IB_WR_LOCAL_INV,
   IB_WR_RDMA_READ_WITH_INV are added.

 - A new consumer API function, ib_alloc_mr() is added to allocate
   fast register memory regions.

 - New consumer API functions, ib_alloc_fast_reg_page_list() and
   ib_free_fast_reg_page_list() are added to allocate and free
   device-specific memory for fast registration page lists.

 - A new consumer API function, ib_update_fast_reg_key(), is added to
   allow the key portion of the R_Key and L_Key of a fast registration
   MR to be updated.  Consumers call this if desired before posting
   an IB_WR_FAST_REG_MR work request.

Consumers can use this as follows:

 - MR is allocated with ib_alloc_mr().

 - Page list memory is allocated with ib_alloc_fast_reg_page_list().

 - MR R_Key/L_Key "key" field is updated with ib_update_fast_reg_key().

 - MR made VALID and bound to a specific page list via
   ib_post_send(IB_WR_FAST_REG_MR)

 - MR made INVALID via ib_post_send(IB_WR_LOCAL_INV),
   ib_post_send(IB_WR_RDMA_READ_WITH_INV) or an incoming send with
   invalidate operation.

 - MR is deallocated with ib_dereg_mr()

 - Page lists are deallocated with ib_free_fast_reg_page_list().

Applications can allocate a fast register MR once, and then can
repeatedly bind the MR to different physical block lists (PBLs) via
posting work requests to a send queue (SQ).  For each outstanding
MR-to-PBL binding in the SQ pipe, a fast_reg_page_list needs to be
allocated (the fast_reg_page_list is owned by the low-level driver
from the consumer posting a work request until the request completes).
Thus pipelining can be achieved while still allowing device-specific
page_list processing.

The 32-bit fast register memory key/STag is composed of a 24-bit index
and an 8-bit key.  The application can change the key each time it
fast registers thus allowing more control over the peer's use of the
key/STag (ie it can effectively be changed each time the rkey is
rebound to a page list).
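
The key/STag layout described above can be sketched with plain bit arithmetic. The Python below is a minimal model that assumes the 8-bit key occupies the low byte and the 24-bit index the upper bits; the function names are hypothetical and only illustrate the kind of update ib_update_fast_reg_key() performs on a 32-bit value.

```python
KEY_BITS = 8
KEY_MASK = (1 << KEY_BITS) - 1        # 0xFF
INDEX_MASK = 0xFFFFFFFF & ~KEY_MASK   # 0xFFFFFF00


def update_key(stag, new_key):
    """Return stag with its 8-bit key replaced, leaving the index untouched."""
    if not 0 <= new_key <= KEY_MASK:
        raise ValueError("key must fit in 8 bits")
    return (stag & INDEX_MASK) | new_key


def split_stag(stag):
    """Return (index, key) for a 32-bit key/STag under the assumed layout."""
    return (stag & INDEX_MASK) >> KEY_BITS, stag & KEY_MASK
```

A consumer that bumps the key before each fast registration effectively hands the peer a different rkey every time, even though the underlying index (and therefore the MR) stays the same.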
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IPoIB: Copy small received SKBs in connected mode · f89271da

Committed by Eli Cohen on Jul 14, 2008

The connected mode implementation in the IPoIB driver has a large
overhead in the way SKBs are handled in the receive flow.  It usually
allocates an SKB as big as the one used for the currently received SKB
and moves unused fragments from the old SKB to the new one. This
involves a loop on all the remaining fragments and incurs overhead on
the CPU.  This patch, for small SKBs, allocates an SKB just large
enough to contain the received data and copies to it the data from the
received SKB.  The newly allocated SKB is passed to the stack and the
old SKB is reposted.
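
The copy-versus-replace decision can be modelled in a few lines. This Python sketch is only an analogy for the receive path: buffers are `bytearray` objects, the `COPYBREAK` threshold value is a hypothetical choice, and the large-packet branch simply swaps in a fresh buffer rather than moving fragments as the driver does.

```python
COPYBREAK = 256  # hypothetical threshold in bytes, chosen for illustration


def handle_rx(ring, slot, length, buf_size, copybreak=COPYBREAK):
    """Return the payload handed to the stack for a completed receive.

    Small packets are copied into a right-sized object and the original
    buffer stays in the ring, ready to be reposted. Large packets hand the
    original buffer up the stack and a new buffer takes its place.
    """
    old = ring[slot]
    if length < copybreak:
        return bytes(old[:length])        # copy; old buffer is reused
    ring[slot] = bytearray(buf_size)      # replacement buffer for reposting
    return memoryview(old)[:length]       # zero-copy view of the old buffer
```

The small-packet branch trades one short copy for skipping the allocation of a full-size replacement buffer, which is where the CPU saving described above comes from.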

When running netperf with small UDP messages, without this patch I get:

    UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
    14.4.3.178 (14.4.3.178) port 0 AF_INET
    Socket  Message  Elapsed      Messages
    Size    Size     Time         Okay Errors   Throughput
    bytes   bytes    secs            #      #   10^6bits/sec

    114688     128   10.00     5142034      0     526.31
    114688           10.00     1130489            115.71

With this patch I get both send and receive at ~315 Mbps.

The reason that send performance actually slows down is as follows:
When using this patch, the overhead of the CPU for handling RX packets
is dramatically reduced.  As a result, we do not experience RNR NAK
messages from the receiver which cause the connection to be closed and
reopened again; when the patch is not used, the receiver cannot handle
the packets fast enough so there is less time to post new buffers and
hence the mentioned RNR NAKs.  So what happens is that the
application *thinks* it posted a certain number of packets for
transmission but these packets are flushed and do not really get
transmitted.  Since the connection gets opened and closed many times,
each time netperf gets the CPU time that otherwise would have been
given to IPoIB to actually transmit the packets.  This can be verified
when looking at the port counters -- the output of ifconfig and the
output of netperf (this is for the case without the patch):

    tx packets
    ==========
    port counter:   1,543,996
    ifconfig:       1,581,426
    netperf:        5,142,034

    rx packets
    ==========
    netperf:        1,130,489

Signed-off-by: Eli Cohen <eli@mellanox.co.il>

RDMA: Remove subversion $Id tags · f3781d2e

Committed by Roland Dreier on Jul 14, 2008

They don't get updated by git and so they're worse than useless.
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/ipath: Simplify code using ARRAY_SIZE() macro · fd91b1bf

Committed by Robert P. J. Day on Jul 14, 2008

Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/mlx4: Optimize QP stamping · 9670e553

Committed by Eli Cohen on Jul 14, 2008

The idea is that for QPs with fixed size work requests (eg selective
signaling QPs), before stamping the WQE, we read the value of the DS
field, which gives the effective size of the descriptor as used in the
previous post.  Then we stamp only that area, since the rest of the
descriptor is already stamped.

When initializing the send queue buffer, make sure the DS field is
initialized to the max descriptor size so that the subsequent stamping
will be done on the entire descriptor area.
Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/sa: Fail requests made while creating new SM AH · 164ba089

Committed by Moni Shoua on Jul 14, 2008

This patch solves a race that occurs after an event occurs that causes
the SA query module to flush its SM address handle (AH).  When SM AH
becomes invalid and needs an update it is handled by the global
workqueue.  On the other hand this event is also handled in the IPoIB
driver by queuing work in the ipoib_workqueue that does multicast
joins.  Although queuing is in the right order, it is done to 2
different workqueues and so there is no guarantee that the first to be
queued is the first to be executed.

This causes a problem because IPoIB may end up sending a request to
the old SM, which will take a long time to time out (since the old SM
is gone); this leads to a much longer than necessary interruption in
multicast traffic.

The patch sets the SA query module's SM AH to NULL when the event
occurs, and until update_sm_ah() is done, any request that needs sm_ah
fails with -EAGAIN return status.

For consumers, the patch doesn't make things worse.  Before the patch,
MADs are sent to the wrong SM so the request gets lost.  Consumers can
be improved if they examine the return code and respond to EAGAIN
properly but even without an improvement the situation is not getting
worse.
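
A consumer that does want to respond to the new return code could retry with a short delay. The following Python sketch shows that pattern in the abstract; `send_request` stands in for whatever call issues the SA query, and the retry count and delay are hypothetical values, not taken from the patch.

```python
import errno
import time


def send_with_retry(send_request, retries=5, delay=0.1, sleep=time.sleep):
    """Call send_request() until it stops returning -EAGAIN.

    Returns the last result, which is still -EAGAIN if every attempt
    hit the window in which the SM address handle was being rebuilt.
    """
    result = -errno.EAGAIN
    for attempt in range(retries):
        result = send_request()
        if result != -errno.EAGAIN:
            return result
        if attempt < retries - 1:
            sleep(delay)  # give update_sm_ah() time to finish
    return result
```

Because the window during which the AH is NULL should be short, a bounded retry like this is usually enough; an unbounded loop would risk spinning forever if the SM never comes back.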
Signed-off-by: Moni Levy <monil@voltaire.com>
Signed-off-by: Moni Shoua <monis@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

RDMA: Fix license text · a9474917

Committed by Sean Hefty on Jul 14, 2008

The license text for several files references a third software license
that was inadvertently copied in.  Update the license to what was
intended.  This update was based on a request from HP.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

RDMA/nes: Remove unnecessary memset() · 929555a2

Committed by Christophe Jaillet on Jul 14, 2008

Remove an explicit memset(..., 0, ...) of a 'listener' structure
allocated with kzalloc().
Signed-off-by: Christophe Jaillet <christophe.jaillet@wanadoo.fr>
Acked-by: Faisal Latif <faisal@neteffect.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/srp: Remove use of cached P_Key/GID queries · 969a60f9

Committed by Roland Dreier on Jul 14, 2008

The SRP initiator is currently using ib_find_cached_pkey() and
ib_get_cached_gid() in situations where the uncached ib_find_pkey()
and ib_query_gid() functions serve just as well: sleeping is allowed
and performance is not an issue.  Since we want to eliminate the
cached operations in the long term, convert SRP to use the uncached
variants.
Signed-off-by: Roland Dreier <rolandd@cisco.com>

dsp56k: use request_firmware · 7f127d5e

Committed by Jaswinder Singh on Jul 05, 2008

Signed-off-by: Jaswinder Singh <jaswinder@infradead.org>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>

edgeport-ti: use request_firmware() · d12b219a

Committed by Jaswinder Singh on Jul 04, 2008

Firmware blob looks like this...
        uint8_t  MajorVersion
        uint8_t  MinorVersion
        __le16   BuildNumber
        uint8_t  data[]
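
For readers who want to inspect such a blob offline, the header can be unpacked as follows. This is a minimal Python sketch assuming the four fields are packed back to back with no padding; the `FirmwareHeader` tuple and `parse_firmware_blob` helper are hypothetical names, not part of the driver.

```python
import struct
from collections import namedtuple

FirmwareHeader = namedtuple("FirmwareHeader", "major minor build")

# Two uint8_t fields followed by a little-endian 16-bit build number.
_HEADER = struct.Struct("<BBH")


def parse_firmware_blob(blob):
    """Split a blob into (FirmwareHeader, remaining data bytes)."""
    if len(blob) < _HEADER.size:
        raise ValueError("blob shorter than the 4-byte header")
    major, minor, build = _HEADER.unpack_from(blob, 0)
    return FirmwareHeader(major, minor, build), blob[_HEADER.size:]
```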
Signed-off-by: Jaswinder Singh <jaswinder@infradead.org>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>

edgeport: use request_firmware() · 5b9ea932

Committed by Jaswinder Singh on Jul 03, 2008

Version number provided in first HEX record.
Signed-off-by: Jaswinder Singh <jaswinder@infradead.org>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>

vicam: use request_firmware() · fb54be87

Committed by Jaswinder Singh on Jun 27, 2008

Although it wasn't actually using ihex records before, we use the Intel
HEX record format for this firmware -- because that gives us a simple
way to split it into separate chunks internally as we need, without
loading each part as a separate file.
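
To make the chunking idea concrete, here is a small Python sketch that parses one record of the textual Intel HEX format (`:LLAAAATT<data>CC`). It illustrates the record structure only; the on-disk form the kernel loads may differ, and the `parse_ihex_record` helper is a hypothetical name.

```python
def parse_ihex_record(line):
    """Parse one textual Intel HEX record into (address, record_type, data).

    Raises ValueError if the record is malformed or its checksum is wrong.
    """
    line = line.strip()
    if not line.startswith(":"):
        raise ValueError("record must start with ':'")
    raw = bytes.fromhex(line[1:])
    if len(raw) < 5:
        raise ValueError("record too short")
    # All bytes, including the trailing checksum, must sum to 0 modulo 256.
    if sum(raw) & 0xFF != 0:
        raise ValueError("checksum mismatch")
    length = raw[0]
    address = int.from_bytes(raw[1:3], "big")
    record_type = raw[3]
    data = raw[4:-1]
    if len(data) != length:
        raise ValueError("length field does not match payload size")
    return address, record_type, data
```

Each record carries its own length and address, so a single file can hold several independent chunks that the consumer walks one record at a time.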
Signed-off-by: Jaswinder Singh <jaswinder@infradead.org>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>

dabusb: use request_firmware() · c4667746

Committed by David Woodhouse on Jun 23, 2008

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>

cpia2: use request_firmware() · 04a33e40

Committed by David Woodhouse on Jun 23, 2008

Thanks to Jaswinder Singh for converting the firmware blob itself to ihex.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>

14 Jul, 2008 · 10 commits

[S390] sclp_tty: Fix scheduling while atomic bug. · 5e34599f

Committed by Heiko Carstens on Jul 14, 2008

Finally fixes a possible scheduling while in atomic context bug. The driver
used to wait on a waitqueue if no empty buffer was available. This could
lead to a deadlock if the driver was called from non-schedulable context.
So fix this. The write operation may fail now. It returns the number of
characters accepted. put_char will never fail, since it writes characters
to an intermediate buffer which gets flushed as soon as it is full.
That means the driver now can busy wait if something is in the intermediate
buffer and a write_string operation follows. Seems to be an acceptable
compromise, since that shouldn't happen too often.

Cc: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>

[S390] sclp_tty: remove ioctl interface. · 095761d2

Committed by Heiko Carstens on Jul 14, 2008

After all we came to the conclusion that this interface doesn't make any
sense. Besides that the ioctl number used was never registered, the header
file isn't exported, and we doubt there is even a single user.
So remove this interface, since it eases maintenance.

Cc: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>

[S390] Remove P390 support. · 1d030370

Committed by Heiko Carstens on Jul 14, 2008

Most likely it is broken anyway because of the changes in memory
detection. Since we can't test it and there are probably better ways
than using a P390 card, remove support for it.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

[S390] Cleanup vmcp printk messages. · a44008f2

Committed by Christian Borntraeger on Jul 14, 2008

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>

[S390] Cleanup lcs printk messages. · 6b648063

Committed by Klaus-D. Wacker on Jul 14, 2008

Cc: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: Klaus-D. Wacker <kdwacker@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>

[S390] Cleanup vmwatch printk messages. · 0d130066

Committed by Martin Schwidefsky on Jul 14, 2008

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>

[S390] Cleanup dcssblk printk messages. · ded77fb4

Committed by Hongjie Yang on Jul 14, 2008

Signed-off-by: Hongjie Yang <hongjie@us.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>

[S390] Cleanup zfcp dumper printk messages. · 2a062ab4

Committed by Michael Holzheu on Jul 14, 2008

Signed-off-by: Michael Holzheu <holzheu@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>

[S390] Cleanup vmlogrdr printk messages. · 2f6f2521

Committed by Martin Schwidefsky on Jul 14, 2008

The message descriptions are still missing though ..
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>

[S390] Cleanup monreader printk messages. · 2ca5b6e2

Committed by Gerald Schaefer on Jul 14, 2008

Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
