- 19 January 2014 (4 commits)
-
-
Committed by Sagi Grimberg
This routine may help with protection registration as well. This patch does not change any functionality. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
-
Committed by Sagi Grimberg
This routine may be called both by fast registration descriptors for data and for integrity buffers. This patch does not change any functionality. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
-
Committed by Sagi Grimberg
Use fast registration lingo; fast registration will also incorporate signature/DIF registration. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
-
Committed by Sagi Grimberg
It is more correct to separate the connections' protection domains and dma_mr handles; protection information support requires doing so. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
-
- 18 November 2013 (6 commits)
-
-
Committed by Matan Barak
This commit reverts commit 7afbddfa ("IB/core: Temporarily disable create_flow/destroy_flow uverbs"). Since the uverbs extensions functionality was experimental for v3.12, this patch re-enables support for it, and for flow steering, in v3.13. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
-
Committed by Yann Droneaud
Commit 400dbc96 ("IB/core: Infrastructure for extensible uverbs commands") added an infrastructure for extensible uverbs commands, and the later commit 436f2ad0 ("IB/core: Export ib_create/destroy_flow through uverbs") exported the ib_create_flow()/ib_destroy_flow() functions using this new infrastructure.

According to commit 400dbc96, the purpose of this infrastructure is to support passing around provider (eg. hardware) specific buffers when userspace issues commands to the kernel, so that it would be possible to extend the uverbs (eg. core) buffers independently from the provider buffers. But the new kernel command function prototypes were not modified to take advantage of this extension. This issue was exposed by Roland Dreier in a previous review [1].

So the following patch is an attempt at a revised extensible command infrastructure. The improved infrastructure distinguishes core (eg. legacy) command/response buffers from provider (eg. hardware) command/response buffers: each function implementing an extended command is given one struct ib_udata to hold the core (eg. uverbs) input and output buffers, and another struct ib_udata to hold the hw (eg. provider) input and output buffers. Having those buffers identified separately makes it easier to grow one buffer to support an extension without having to add code to guess the exact size of each command/response part. This should make the extended functions more reliable.

Additionally, instead of relying on the command identifier being greater than IB_USER_VERBS_CMD_THRESHOLD, the proposed infrastructure relies on unused bits in the command field: of the 32 bits provided by the command field, only 6 bits are really needed to encode the identifiers of the commands currently supported by the kernel. (Even using only 6 bits leaves room for about 23 new commands.) So this patch uses some high-order bits of the command field to store flags, leaving enough room for more command identifiers than anyone will ever need (eg. 256). The new flags specify whether the command should be processed as an extended one or a legacy one. While designing the new command format, care was taken to keep the usage of the flags itself extensible. Using high-order bits of the command field ensures that a newer libibverbs on an older kernel will properly fail when trying to call extended commands; conversely, an older libibverbs on a newer kernel will never be able to issue calls to extended commands.

The extended command header includes the optional response pointer, so that the output buffer length and output buffer pointer are located together in the command, allowing proper parameter checking. This should make implementing functions easier and safer. Additionally, the extended header ensures 64-bit alignment and makes all sizes multiples of 8 bytes, extending the maximum buffer size:

                             legacy      extended
    Maximum command buffer:  256KBytes   1024KBytes (512KBytes + 512KBytes)
    Maximum response buffer: 256KBytes   1024KBytes (512KBytes + 512KBytes)

For the purpose of proper buffer size accounting, the header sizes are no longer counted in "in_words". One oddity of the current extensible infrastructure, reading the "legacy" command header twice, is fixed by removing the "legacy" command header from the extended command header: they are processed as two different parts of the command, memory is read once, and information is not duplicated. This makes it clear that this is an extended command scheme, not a different command scheme.
The proposed scheme formats the input (command) and output (response) buffers this way:

- command: legacy header + extended header + command data (core + hw):

    +----------------------------------------+
    | flags     |   00      00    |  command |
    |        in_words    |   out_words       |
    +----------------------------------------+
    |                 response               |
    |                 response               |
    | provider_in_words | provider_out_words |
    |                 padding                |
    +----------------------------------------+
    |                                        |
    .              <uverbs input>            .
    .              (in_words * 8)            .
    |                                        |
    +----------------------------------------+
    |                                        |
    .             <provider input>           .
    .         (provider_in_words * 8)        .
    |                                        |
    +----------------------------------------+

- response, if present:

    +----------------------------------------+
    |                                        |
    .          <uverbs output space>         .
    .             (out_words * 8)            .
    |                                        |
    +----------------------------------------+
    |                                        |
    .         <provider output space>        .
    .        (provider_out_words * 8)        .
    |                                        |
    +----------------------------------------+

The overall design is to ensure that the extensible infrastructure is itself extensible, while being more reliable, with more input and bounds checking.

Note: the unused field in the extended header would be a perfect candidate to hold the command "comp_mask" (eg. a bit field used to handle compatibility). This was suggested by Roland Dreier in a previous review [2]. But a "comp_mask" field is likely to be present in the uverbs input and/or provider input, and likewise for the response, as noted by Matan Barak [3], so it doesn't make sense to put "comp_mask" in the header.

[1]: http://marc.info/?i=CAL1RGDWxmM17W2o_era24A-TTDeKyoL6u3NRu_=t_dhV_ZA9MA@mail.gmail.com
[2]: http://marc.info/?i=CAL1RGDXJtrc849M6_XNZT5xO1+ybKtLWGq6yg6LhoSsKpsmkYA@mail.gmail.com
[3]: http://marc.info/?i=525C1149.6000701@mellanox.com

Signed-off-by: Yann Droneaud <ydroneaud@opteya.com> Link: http://marc.info/?i=cover.1383773832.git.ydroneaud@opteya.com [ Convert "ret ? ret : 0" to the equivalent "ret". - Roland ] Signed-off-by: Roland Dreier <roland@purestorage.com>
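As a rough sketch of the two headers this describes (declarations follow the uapi eventually merged in include/uapi/rdma/ib_user_verbs.h, reproduced from memory; treat names and values as approximations):

    #include <linux/types.h>

    /* legacy header, at the front of every command */
    struct ib_uverbs_cmd_hdr {
    	__u32 command;		/* flags in the high-order bits, id in the low bits */
    	__u16 in_words;		/* size of the command data that follows */
    	__u16 out_words;	/* size of the expected response */
    };

    /* extended header, right after the legacy one for extended commands */
    struct ib_uverbs_ex_cmd_hdr {
    	__u64 response;			/* user pointer to the response buffer */
    	__u16 provider_in_words;	/* provider input, in 8-byte words */
    	__u16 provider_out_words;	/* provider output, in 8-byte words */
    	__u32 cmp_mask;			/* currently reserved */
    };

    /* flag stored in the high-order bits of the command field */
    #define IB_USER_VERBS_CMD_FLAGS_SHIFT	24
    #define IB_USER_VERBS_CMD_FLAG_EXTENDED	0x80u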
-
Committed by Yann Droneaud
The structure holding all the types of flow_spec is of no use to userspace. It would be wrong for userspace to do:

    struct ib_uverbs_flow_spec flow_spec;

    flow_spec.type = IB_FLOW_SPEC_TCP;
    flow_spec.size = sizeof(flow_spec);

Instead, userspace should use the dedicated flow_spec structure for:
- Ethernet: struct ib_uverbs_flow_spec_eth,
- IPv4: struct ib_uverbs_flow_spec_ipv4,
- TCP/UDP: struct ib_uverbs_flow_spec_tcp_udp.

In other words, struct ib_uverbs_flow_spec is a "virtual" data structure that the kernel can only use as an alias for the others. Signed-off-by: Yann Droneaud <ydroneaud@opteya.com> Link: http://marc.info/?i=cover.1383773832.git.ydroneaud@opteya.com Signed-off-by: Roland Dreier <roland@purestorage.com>
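For illustration, a minimal sketch of the intended pattern with the dedicated Ethernet structure (the filter fields are only mentioned in a comment; field and constant names follow the 3.13-era uapi header and should be treated as approximations):

    #include <string.h>
    #include <rdma/ib_user_verbs.h>

    static void fill_eth_spec(struct ib_uverbs_flow_spec_eth *spec)
    {
    	memset(spec, 0, sizeof(*spec));
    	spec->type = IB_FLOW_SPEC_ETH;	/* type matches the structure used */
    	spec->size = sizeof(*spec);	/* size of this spec, not of the union */
    	/* spec->val and spec->mask then select which Ethernet header
    	 * fields to match */
    }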
-
Committed by Yann Droneaud
This patch adds the "flow" prefix to most of the data structures added as part of commit 436f2ad0 ("IB/core: Export ib_create/destroy_flow through uverbs"), to keep those names in sync with the data structures added in commit 319a441d ("IB/core: Add receive flow steering support"). It's just a matter of translating 'ib_flow' to 'ib_uverbs_flow'. Signed-off-by: Yann Droneaud <ydroneaud@opteya.com> Link: http://marc.info/?i=cover.1383773832.git.ydroneaud@opteya.com Signed-off-by: Roland Dreier <roland@purestorage.com>
-
Committed by Yann Droneaud
Commit 436f2ad0 ("IB/core: Export ib_create/destroy_flow through uverbs") added public data structures to support receive flow steering. The new structs do not follow the 'uverbs' pattern: they lack the common 'ib_uverbs' prefix. This patch replaces the ib_kern prefix with ib_uverbs. Signed-off-by: Yann Droneaud <ydroneaud@opteya.com> Link: http://marc.info/?i=cover.1383773832.git.ydroneaud@opteya.com Signed-off-by: Roland Dreier <roland@purestorage.com>
-
Committed by Matan Barak
This patch fixes the following issues: 1. Removed unneeded checks. 2. Removed the fixed size from flow_attr.size, thus simplifying the checks. 3. Removed a 32-bit hole on 64-bit systems with strict alignment in struct ib_kern_flow_att by adding a reserved field. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
-
- 17 November 2013 (2 commits)
-
-
Committed by Joe Perches
This typedef is unnecessary and should just be removed. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
-
Committed by Zhao Hongjiang
Commit 3e6628c4 ("idr: introduce idr_alloc_cyclic()") added a new idr_alloc_cyclic() routine and converted several users to it. This is just a missed one: convert it as well. Signed-off-by: Zhao Hongjiang <zhaohongjiang@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Roland Dreier <roland@purestorage.com>
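As a sketch of the conversion pattern (the idr name and stored object are hypothetical), idr_alloc_cyclic() hands out ids starting just past the most recently allocated one, so freed ids are not reused immediately:

    #include <linux/idr.h>

    static DEFINE_IDR(query_idr);	/* hypothetical idr */

    static int example_new_id(void *obj)
    {
    	int id;

    	idr_preload(GFP_KERNEL);
    	/* start = 0, end = 0 means the whole non-negative int range */
    	id = idr_alloc_cyclic(&query_idr, obj, 0, 0, GFP_NOWAIT);
    	idr_preload_end();
    	return id;	/* new id, or a negative errno */
    }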
-
- 16 November 2013 (5 commits)
-
-
Committed by Eli Cohen
When creating a CQ, we must use the mlx5 adapter page shift. Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
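A sketch of the idea (field and constant names follow the mlx5 driver of that era; the exact expression is illustrative): the CQ buffer's log page size handed to firmware must be relative to the adapter's fixed 4KB page unit, not the host PAGE_SHIFT:

    /* MLX5_ADAPTER_PAGE_SHIFT is 12: the device counts pages in 4KB
     * units regardless of the host page size */
    (*cqb)->ctx.log_pg_sz = page_shift - MLX5_ADAPTER_PAGE_SHIFT;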
-
Committed by Eli Cohen
Move the check against the maximum supported number of CQEs to after the final number of entries has been computed. Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
-
Committed by Eli Cohen
The value of the local variable index is never used in reg_mr_callback(). Signed-off-by: Eli Cohen <eli@mellanox.com> [ Remove now-unused variable delta too. - Roland ] Signed-off-by: Roland Dreier <roland@purestorage.com>
-
Committed by Eli Cohen
Enforce the rule that when requesting remote write or atomic permissions, local write must be indicated as well. See IB spec 11.2.8.2. Spotted by: Hagay Abramovsky <hagaya@mellanox.com> Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
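A minimal sketch of such a check, assuming a surrounding function that returns an MR pointer (the IB_ACCESS_* flags are the standard ib_verbs access flags):

    /* IB spec 11.2.8.2: remote write or remote atomic permission
     * requires local write permission as well */
    if ((access_flags & (IB_ACCESS_REMOTE_WRITE | IB_ACCESS_REMOTE_ATOMIC)) &&
        !(access_flags & IB_ACCESS_LOCAL_WRITE))
    	return ERR_PTR(-EINVAL);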
-
Committed by Eli Cohen
When calling get_sw_cqe() we need to pass the consumer_index, not the masked value. Failing to do so will cause get_sw_cqe() to return an incorrect result, possibly leading to an endless loop. This problem was reported and analyzed by Michael Rice from HP. Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
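A sketch of why the unmasked index matters (names follow the mlx5 driver; this is illustrative, not the exact diff): the software-ownership test derives a wrap parity from the full index, and pre-masking destroys that parity:

    /* correct: pass the raw consumer index; get_sw_cqe() masks it
     * internally for addressing but uses the full value to compute
     * the wrap parity for the ownership check */
    cqe = get_sw_cqe(cq, cq->mcq.cons_index);

    /* wrong: pre-masking loses the wrap parity */
    /* cqe = get_sw_cqe(cq, cq->mcq.cons_index & cq->ibcq.cqe); */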
-
- 13 November 2013 (3 commits)
-
-
Committed by Al Viro
* lookup_one_len() really wants i_mutex held on the directory.
* Leaks galore: just mount ipathfs, then "cd /sys/bus/pci/drivers/qib_ib; echo *:*:*.* > unbind" on a box with that card present, and try to umount ipathfs...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
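For reference, a sketch of the locking rule the first point refers to (v3.13-era VFS, where the directory lock was still i_mutex; parent and name are hypothetical locals):

    struct dentry *dentry;

    /* lookup_one_len() requires the parent directory's i_mutex */
    mutex_lock(&parent->d_inode->i_mutex);
    dentry = lookup_one_len(name, parent, strlen(name));
    mutex_unlock(&parent->d_inode->i_mutex);
    if (IS_ERR(dentry))
    	return PTR_ERR(dentry);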
-
Committed by Nicholas Bellinger
This patch avoids a duplicate iscsit_increment_maxcmdsn() call for ISER_IB_RDMA_WRITE within isert_map_rdma() + isert_reg_rdma_frwr(), which will already be occurring once during the isert_put_datain() -> iscsit_build_rsp_pdu() operation. It also removes the local conn->stat_sn assignment + increment, and changes the third parameter to iscsit_build_rsp_pdu() to signal that this should be done by iscsi_target_mode code. Tested-by: Moussa Ba <moussaba@micron.com> Cc: <stable@vger.kernel.org> # v3.10+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
-
Committed by Vu Pham
This patch changes isert_reg_rdma_frwr() to not use FRMR for single-dma-entry requests from small I/Os, in order to avoid the associated memory registration overhead. Using the DMA MR is sufficient for single-dma-entry requests, and this addresses a >= v3.12 performance regression. Signed-off-by: Vu Pham <vu@mellanox.com> Cc: <stable@vger.kernel.org> # v3.12+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
-
- 12 November 2013 (2 commits)
-
-
Committed by Michal Nazarewicz
The dev variable is never assigned after being initialised. Signed-off-by: Michal Nazarewicz <mina86@mina86.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
-
Committed by Sean Hefty
Problem reported by Avneesh Pant <avneesh.pant@oracle.com>:

It looks like we are triggering a bug in RDMA CM/UCM interaction. The bug specifically hits when we have an incoming connection request and the connecting process dies BEFORE the passive end of the connection can process the request, i.e. it does not call rdma_get_cm_event() to retrieve the initial connection event.

We were able to triage this further and have some additional information now. In the example below, when P1 dies after issuing a connect request, all outstanding connects (to P2) are sent a reject message as the CM id is being destroyed. We see this reject message being received on the passive end, and the appropriate CM ID created for the initial connection message being retrieved in cm_match_req(). The problem is in the ucma_event_handler() code when this reject message is delivered to it while the initial connect message itself HAS NOT been delivered to the client. In fact the client has not even called rdma_cm_get_event() at this stage, so we haven't allocated a new ctx in ucma_get_event() and updated the new connection's CM_ID to point to the new UCMA context.

This results in the reject message not being dropped in ucma_event_handler() for the new connection request, since the (if (!ctx->uid)) block is skipped: the ctx it refers to is the listen CM id context, which does have a valid UID associated with it (I believe the new CM ID for the connection initially uses the listen CM ID -> context when it is created in cma_new_conn_id). Thus the assumption that new events for a connection can get dropped in ucma_event_handler() is incorrect IF the initial connect request has not been retrieved first. We end up getting a CM Reject event on the listen CM ID, and our upper layer code asserts (in fact this event does not even have the listen_id set, as librdmacm only sets that up for connect requests).

The solution is to verify that the cm_id being reported in the event is the same as the cm_id referenced by the ucma context. A mismatch indicates that the ucma context corresponds to the listen. This fix was validated by using a modified version of librdmacm that was able to verify the problem and see that the reject message was indeed dropped after this patch was applied. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
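A sketch of the check described above, as it might appear in ucma_event_handler() (the body paraphrases the rationale; the exact merged code may differ):

    } else if (!ctx->uid || ctx->cm_id != cm_id) {
    	/* The event is for a new connection whose CM id userspace
    	 * has not retrieved yet; ctx still refers to the listen.
    	 * Discard the event: if this was a reject, a later accept
    	 * will simply fail. */
    	kfree(uevent);
    	goto out;
    }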
-
- 09 November 2013 (18 commits)
-
-
Committed by Upinder Malhi (umalhi)
This patch adds a new RDMA node type and a new RDMA transport type, plus supporting code, used by Cisco's low-latency driver, usNIC. usNIC uses its own transport, distinct from IB and iWARP. Signed-off-by: Upinder Malhi <umalhi@cisco.com> Signed-off-by: Jeff Squyres <jsquyres@cisco.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
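A sketch of the kind of additions involved (enum values as I recall them from the merged code, with surrounding entries abridged; treat as illustrative):

    /* include/rdma/ib_verbs.h (abridged) */
    enum rdma_node_type {
    	RDMA_NODE_IB_CA = 1,
    	RDMA_NODE_IB_SWITCH,
    	RDMA_NODE_IB_ROUTER,
    	RDMA_NODE_RNIC,
    	RDMA_NODE_USNIC,	/* new */
    };

    enum rdma_transport_type {
    	RDMA_TRANSPORT_IB,
    	RDMA_TRANSPORT_IWARP,
    	RDMA_TRANSPORT_USNIC,	/* new */
    };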
-
Committed by Dave Jones
Assigning a value to itself is pointless. Spotted with Coverity; no hardware to test. Signed-off-by: Dave Jones <davej@fedoraproject.org> Signed-off-by: Roland Dreier <roland@purestorage.com>
-
Committed by Bart Van Assche
The IB spec does not guarantee that the opcode is available in error completions, hence do not rely on it. See also commit 948d1e88 ("IB/srp: Introduce srp_handle_qp_err()"). Signed-off-by: Bart Van Assche <bvanassche@acm.org> Cc: <stable@vger.kernel.org> # v3.8 Signed-off-by: Roland Dreier <roland@purestorage.com>
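A sketch of the resulting pattern (the function and the is_send flag are hypothetical; srp_handle_qp_err()'s signature is approximated from the referenced commit): whether the completion was a send is derived from the caller's context or the wr_id, never from wc->opcode:

    static void srp_example_completion(struct ib_wc *wc, bool is_send,
    				       struct srp_target_port *target)
    {
    	if (wc->status != IB_WC_SUCCESS) {
    		/* wc->opcode is undefined for error completions,
    		 * so classify by is_send instead */
    		srp_handle_qp_err(wc->status, is_send, target);
    		return;
    	}
    	/* normal completion processing continues here */
    }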
-
Committed by Bart Van Assche
If SCSI commands are submitted with a SCSI request timeout that is lower than the IB RC timeout, it can happen that the SCSI error handler has already started device recovery before transport layer error handling starts. So it can happen that the SCSI error handler tries to abort a SCSI command after it has been reset by srp_rport_reconnect(). Tell the SCSI error handler that such commands have finished and that it is not necessary to continue its recovery strategy for commands that have been reset by srp_rport_reconnect(). Signed-off-by: Bart Van Assche <bvanassche@acm.org> Cc: <stable@vger.kernel.org> Signed-off-by: Roland Dreier <roland@purestorage.com>
-
Committed by Vu Pham
Remove an SRP target from the SRP target list before invoking the last scsi_host_put() call. This change is necessary because that last put frees the memory holding the srp_target_port structure. This patch prevents the following kernel oops:

    RIP: 0010:[<ffffffff810b00d0>] __lock_acquire+0x500/0x1570
    Call Trace:
     [<ffffffff810b11e4>] lock_acquire+0xa4/0x120
     [<ffffffff81531206>] _spin_lock+0x36/0x70
     [<ffffffffa01b6d8f>] srp_remove_work+0xef/0x180 [ib_srp]
     [<ffffffff8109125c>] worker_thread+0x21c/0x3d0
     [<ffffffff81096e86>] kthread+0x96/0xa0
     [<ffffffff8100c20a>] child_rip+0xa/0x20

Signed-off-by: Vu Pham <vuhuong@mellanox.com> [ bvanassche - Modified patch description and CC'ed stable. ] Signed-off-by: Bart Van Assche <bvanassche@acm.org> Cc: <stable@vger.kernel.org> Signed-off-by: Roland Dreier <roland@purestorage.com>
-
Committed by Jack Wang
Currently it is not possible to change the queue depth of a device behind an SRP host. Sometimes we need to adjust queue_depth for performance reasons (e.g. when storage is busy, a lower queue_depth avoids running into the SCSI error handler), so this patch adds that support to the SRP driver. Signed-off-by: Jack Wang <jinpu.wang@profitbricks.com> Tested-by: Bart Van Assche <bvanassche@acm.org> Acked-by: David Dillow <dillowda@ornl.gov> Signed-off-by: Roland Dreier <roland@purestorage.com>
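A sketch of a .change_queue_depth callback against the SCSI midlayer API of that era (the body follows the usual pattern for such callbacks, not necessarily the exact patch):

    static int srp_change_queue_depth(struct scsi_device *sdev, int qdepth,
    				      int reason)
    {
    	if (reason == SCSI_QDEPTH_DEFAULT || reason == SCSI_QDEPTH_RAMP_UP)
    		scsi_adjust_queue_depth(sdev, scsi_get_tag_type(sdev), qdepth);
    	else if (reason == SCSI_QDEPTH_QFULL)
    		scsi_track_queue_full(sdev, qdepth);
    	else
    		return -EOPNOTSUPP;
    	return sdev->queue_depth;
    }

The function is then wired up via the .change_queue_depth member of the driver's scsi_host_template.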
-
Committed by Bart Van Assche
Certain storage configurations, e.g. a sufficiently large array of hard disks in a RAID configuration, need a queue depth above 64 to achieve optimal performance. Hence make the queue depth configurable. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Acked-by: David Dillow <dillowda@ornl.gov> Tested-by: Jack Wang <xjtuwjp@gmail.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
-
Committed by Bart Van Assche
This patch does not change any functionality. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Acked-by: David Dillow <dillowda@ornl.gov> Cc: Roland Dreier <roland@purestorage.com> Cc: Vu Pham <vu@mellanox.com> Cc: Sebastian Riemer <sebastian.riemer@profitbricks.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
-
Committed by Bart Van Assche
On an initiator system with multiple IB ports it is not yet possible to figure out what the originating port of an SRP connection is. Hence make the source GID available in sysfs. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Acked-by: David Dillow <dillowda@ornl.gov> Signed-off-by: Roland Dreier <roland@purestorage.com>
-
Committed by Bart Van Assche
After a transport layer error has occurred, periodically try to reconnect to the target until the dev_loss timer expires. Protect the callback functions that can be invoked from inside the SCSI EH against concurrent invocation with srp_reconnect_rport() via the rport mutex. Change the default dev_loss_tmo from 60s to 600s to give the reconnect mechanism a chance to kick in. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Acked-by: David Dillow <dillowda@ornl.gov> Signed-off-by: Roland Dreier <roland@purestorage.com>
-
Committed by Bart Van Assche
Add support for periodically reconnecting to an SRP target until the dev_loss timer expires. After the tenth reconnection attempt, gradually slow down subsequent reconnect attempts. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Acked-by: David Dillow <dillowda@ornl.gov> Signed-off-by: Roland Dreier <roland@purestorage.com>
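A sketch of how such a backoff can be scheduled from the reconnect work item (field and counter names approximate the scsi_transport_srp code of that era):

    /* after a failed attempt, scale the base delay once more than
     * ten consecutive reconnects have failed, capping the factor */
    delay = rport->reconnect_delay *
    	min(100, max(1, rport->failed_reconnects - 10));
    if (delay > 0)
    	queue_delayed_work(system_long_wq, &rport->reconnect_work,
    			   delay * HZ);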
-
Committed by Bart Van Assche
Start the reconnect timer, fast_io_fail timer and dev_loss timer if a transport layer error occurs. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Acked-by: David Dillow <dillowda@ornl.gov> Signed-off-by: Roland Dreier <roland@purestorage.com>
-
Committed by Bart Van Assche
Enable fast_io_fail_tmo and dev_loss_tmo functionality for the IB SRP initiator. Add kernel module parameters that allow specifying default values for these parameters. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Acked-by: David Dillow <dillowda@ornl.gov> Signed-off-by: Roland Dreier <roland@purestorage.com>
-
Committed by Bart Van Assche
Keep the rport data structure around after srp_remove_host() has finished, until cleanup of the IB transport layer has completed. This is necessary because later patches use the rport pointer inside the queuecommand callback. Without this patch, accessing the rport from inside a queuecommand callback is racy, because srp_remove_host() must be invoked before scsi_remove_host() and the queuecommand callback could get invoked after srp_remove_host() has finished. In other words, without this patch the queuecommand callback can get invoked after the rport data structure has been freed. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Acked-by: David Dillow <dillowda@ornl.gov> Signed-off-by: Roland Dreier <roland@purestorage.com>
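A sketch of the reference-counting helpers this implies (modeled on the scsi_transport_srp helpers; treat as illustrative):

    /* pin the rport's embedded device so the structure outlives
     * srp_remove_host() until the transport layer is done with it */
    void srp_rport_get(struct srp_rport *rport)
    {
    	get_device(&rport->dev);
    }

    void srp_rport_put(struct srp_rport *rport)
    {
    	put_device(&rport->dev);
    }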
-
Committed by Vu Pham
Allow the InfiniBand RC retry count to be configured by the user as an option in the target login string. Reducing this retry count reduces the path failover time. Signed-off-by: Vu Pham <vu@mellanox.com> [ bvanassche: Rewrote patch description / changed default retry count ] Signed-off-by: Bart Van Assche <bvanassche@acm.org> Acked-by: David Dillow <dillowda@ornl.gov> Signed-off-by: Roland Dreier <roland@purestorage.com>
-
Committed by Mike Marciniszyn
Commit 7fac3301 ("IB/qib: checkpatch fixes") was overzealous in removing a simple_strtoul() from a parse routine, setup_txselect(). That routine is required to handle a multi-value string; unwind that aspect of the fix. Cc: <stable@vger.kernel.org> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
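For illustration (variable names hypothetical): kstrtoul() rejects any trailing characters, while simple_strtoul() parses the leading number and reports where it stopped, which is exactly what a comma-separated parameter needs:

    char *endp;
    unsigned long val;

    /* parsing "12,13,14": val = 12, endp points at the ',' */
    val = simple_strtoul(str, &endp, 0);
    if (*endp == ',')
    	str = endp + 1;		/* advance to the next value */

    /* kstrtoul("12,13,14", 0, &val) would instead fail with -EINVAL */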
-
Committed by Mike Marciniszyn
Convert __attribute__ ((packed)) to __packed. Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
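The conversion pattern, on a hypothetical struct (__packed is the kernel's wrapper macro for the GCC packed attribute; the layout is unchanged):

    /* before */
    struct example_hdr {
    	u8  opcode;
    	u32 len;
    } __attribute__ ((packed));

    /* after: identical layout, kernel-preferred spelling */
    struct example_hdr {
    	u8  opcode;
    	u32 len;
    } __packed;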
-
Committed by Jan Kara
qib_user_sdma_queue_pkts() gets called with mmap_sem held for writing. Except for get_user_pages() deep down in qib_user_sdma_pin_pages(), we don't seem to need mmap_sem at all. Even more interestingly, qib_user_sdma_queue_pkts() (and also qib_user_sdma_coalesce(), called somewhat later) calls copy_from_user(), which can hit a page fault, and we deadlock trying to take mmap_sem while handling that fault. So just make qib_user_sdma_pin_pages() use get_user_pages_fast() and leave the mmap_sem locking to the mm code. This deadlock has actually been observed in the wild when the node is under memory pressure. Cc: <stable@vger.kernel.org> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Roland Dreier <roland@purestorage.com>
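The shape of the fix, as a sketch (addr, npages and pages stand in for the function's existing locals): get_user_pages_fast() takes mmap_sem internally only on its slow path, so the caller no longer holds it across a possibly faulting copy_from_user():

    int ret;

    /* pin the user buffer for reading; no caller-held mmap_sem */
    ret = get_user_pages_fast(addr, npages, 0, pages);
    if (ret < npages) {
    	/* release whatever was pinned, then fail */
    	while (ret > 0)
    		put_page(pages[--ret]);
    	return -ENOMEM;
    }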
-