提交 · 93079162bf0ed2934c7b0c3ee93ba894df8fb3cd · openeuler / Kernel

22 1月, 2014 1 次提交

scsi_transport_srp: Fix a race condition · 93079162

由 Bart Van Assche 提交于 12月 11, 2013

The rport timers must be stopped before the SRP initiator destroys the
resources associated with the SCSI host. This is necessary because
otherwise the callback functions invoked from the SRP transport layer
could trigger a use-after-free. Stopping the rport timers before
invoking scsi_remove_host() can trigger long delays in the SCSI error
handler if a transport layer failure occurs while scsi_remove_host()
is in progress. Hence move the code for stopping the rport timers from
srp_rport_release() into a new function and invoke that function after
scsi_remove_host() has finished. This patch fixes the following
sporadic kernel crash:

     kernel BUG at include/asm-generic/dma-mapping-common.h:64!
     invalid opcode: 0000 [#1] SMP
     RIP: 0010:[<ffffffffa03b20b1>]  [<ffffffffa03b20b1>] srp_unmap_data+0x121/0x130 [ib_srp]
     Call Trace:
     [<ffffffffa03b20fc>] srp_free_req+0x3c/0x80 [ib_srp]
     [<ffffffffa03b2188>] srp_finish_req+0x48/0x70 [ib_srp]
     [<ffffffffa03b21fb>] srp_terminate_io+0x4b/0x60 [ib_srp]
     [<ffffffffa03a6fb5>] __rport_fail_io_fast+0x75/0x80 [scsi_transport_srp]
     [<ffffffffa03a7438>] rport_fast_io_fail_timedout+0x88/0xc0 [scsi_transport_srp]
     [<ffffffff8108b370>] worker_thread+0x170/0x2a0
     [<ffffffff81090876>] kthread+0x96/0xa0
     [<ffffffff8100c0ca>] child_rip+0xa/0x20
Signed-off-by: NBart Van Assche <bvanassche@acm.org>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

93079162

04 1月, 2014 1 次提交

infiniband: make sure the src net is infiniband when create new link · 0d68fc4f

由 Hangbin Liu 提交于 1月 03, 2014

When we create a new infiniband link with uninfiniband device, e.g. `ip link
add link em1 type ipoib pkey 0x8001`. We will get a NULL pointer dereference
cause other dev like Ethernet don't have struct ib_device.

The code path is:
rtnl_newlink
  |-- ipoib_new_child_link
        |-- __ipoib_vlan_add
              |-- ipoib_set_dev_features
                    |-- ib_query_device

Fix this bug by make sure the src net is infiniband when create new link.
Signed-off-by: NHangbin Liu <liuhangbin@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

0d68fc4f

23 12月, 2013 3 次提交

RDMA/cxgb4: Use cxgb4_select_ntuple to correctly calculate ntuple fields · 41b4f86c

由 Kumar Sanghvi 提交于 12月 18, 2013

Signed-off-by: NKumar Sanghvi <kumaras@chelsio.com>
Signed-off-by: NHariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

41b4f86c

RDMA/cxgb4: Server filters are supported only for IPv4 · 8c044690

由 Kumar Sanghvi 提交于 12月 18, 2013

Signed-off-by: NKumar Sanghvi <kumaras@chelsio.com>
Signed-off-by: NHariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

8c044690

RDMA/cxgb4: Calculate the filter server TID properly · a4ea025f

由 Kumar Sanghvi 提交于 12月 18, 2013

Based on original work by Santosh Rastapur <santosh@chelsio.com>
Signed-off-by: NKumar Sanghvi <kumaras@chelsio.com>
Signed-off-by: NHariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a4ea025f

21 12月, 2013 7 次提交

IB/uverbs: Check access to userspace response buffer in extended command · 6cc3df84

由 Yann Droneaud 提交于 12月 11, 2013

This patch adds a check on the output buffer with access_ok(VERIFY_WRITE, ...)
to ensure the whole buffer is in userspace memory before using the
pointer in uverbs functions. If the buffer or a subset of it is not
valid, returns -EFAULT to the caller.

This will also catch invalid buffer before the final call to
copy_to_user() which happen late in most uverb functions.

Just like the check in read(2) syscall, it's a sanity check to detect
invalid parameters provided by userspace. This particular check was added
in vfs_read() by Linus Torvalds for v2.6.12 with following commit message:

https://git.kernel.org/cgit/linux/kernel/git/tglx/history.git/commit/?id=fd770e66c9a65b14ce114e171266cf6f393df502

Make read/write always do the full "access_ok()" tests.

The actual user copy will do them too, but only for the
range that ends up being actually copied. That hides
bugs when the range has been clamped by file size or other
issues.

Note: there's no need to check input buffer since vfs_write() already does
access_ok(VERIFY_READ, ...) as part of write() syscall.

Link: http://marc.info/?i=cover.1387273677.git.ydroneaud@opteya.comSigned-off-by: NYann Droneaud <ydroneaud@opteya.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

6cc3df84

IB/uverbs: Check input length in flow steering uverbs · 6bcca3d4

由 Yann Droneaud 提交于 12月 11, 2013

Since ib_copy_from_udata() doesn't check yet the available input data
length before accessing userspace memory, an explicit check of this
length is required to prevent:

- reading past the user provided buffer,
- underflow when subtracting the expected command size from the input
  length.

This will ensure the newly added flow steering uverbs don't try to
process truncated commands.

Link: http://marc.info/?i=cover.1386798254.git.ydroneaud@opteya.com>
Signed-off-by: NYann Droneaud <ydroneaud@opteya.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

6bcca3d4

IB/uverbs: Set error code when fail to consume all flow_spec items · 98a37510

由 Yann Droneaud 提交于 12月 11, 2013

If the flow_spec items parsed count does not match the number of items
declared in the flow_attr command, or if not all bytes are used for
flow_spec items (eg. trailing garbage), a log message is reported and
the function leave through the error path. Unfortunately the error
code is currently not set.

This patch set error code to -EINVAL in such cases, so that the error
is reported to userspace instead of silently fail.

Link: http://marc.info/?i=cover.1386798254.git.ydroneaud@opteya.com>
Signed-off-by: NYann Droneaud <ydroneaud@opteya.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

98a37510

IB/uverbs: Check reserved fields in create_flow · c780d82a

由 Yann Droneaud 提交于 12月 11, 2013

As noted by Daniel Vetter in its article "Botching up ioctls"[1]

  "Check *all* unused fields and flags and all the padding for whether
   it's 0, and reject the ioctl if that's not the case.  Otherwise
   your nice plan for future extensions is going right down the
   gutters since someone *will* submit an ioctl struct with random
   stack garbage in the yet unused parts. Which then bakes in the ABI
   that those fields can never be used for anything else but garbage."

It's important to ensure that reserved fields are set to known value,
so that it will be possible to use them latter to extend the ABI.

The same reasonning apply to comp_mask field present in newer uverbs
command: per commit 22878dbc ("IB/core: Better checking of
userspace values for receive flow steering"), unsupported values in
comp_mask are rejected.

[1] http://blog.ffwll.ch/2013/11/botching-up-ioctls.html

Link: http://marc.info/?i=cover.1386798254.git.ydroneaud@opteya.com>
Signed-off-by: NYann Droneaud <ydroneaud@opteya.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

c780d82a

IB/uverbs: Check comp_mask in destroy_flow · 2782c2d3

由 Yann Droneaud 提交于 12月 11, 2013

Just like the check added to create_flow in 22878dbc ("IB/core:
Better checking of userspace values for receive flow steering"),
comp_mask must be checked in destroy_flow too.

Since only empty comp_mask is currently supported, any other value
must be rejected.

This check was silently added in a previous patch[1] to move comp_mask
in extended command header, part of previous patchset[2] against
create/destroy_flow uverbs. The idea of moving comp_mask to the header
was discarded for the final patchset[3].

Unfortunately the check added in destroy_flow uverb was not integrated
in the final patchset.

[1] http://marc.info/?i=40175eda10d670d098204da6aa4c327a0171ae5f.1381510045.git.ydroneaud@opteya.com
[2] http://marc.info/?i=cover.1381510045.git.ydroneaud@opteya.com
[3] http://marc.info/?i=cover.1383773832.git.ydroneaud@opteya.com

Cc: Matan Barak <matanb@mellanox.com>
Link: http://marc.info/?i=cover.1386798254.git.ydroneaud@opteya.com>
Signed-off-by: NYann Droneaud <ydroneaud@opteya.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

2782c2d3

IB/uverbs: Check reserved field in extended command header · 7efb1b19

由 Yann Droneaud 提交于 12月 11, 2013

As noted by Daniel Vetter in its article "Botching up ioctls"[1]

  "Check *all* unused fields and flags and all the padding for whether
   it's 0, and reject the ioctl if that's not the case.  Otherwise
   your nice plan for future extensions is going right down the
   gutters since someone *will* submit an ioctl struct with random
   stack garbage in the yet unused parts. Which then bakes in the ABI
   that those fields can never be used for anything else but garbage."

It's important to ensure that reserved fields are set to known value,
so that it will be possible to use them latter to extend the ABI.

The same reasonning apply to comp_mask field present in newer uverbs
command: per commit 22878dbc ("IB/core: Better checking of
userspace values for receive flow steering"), unsupported values in
comp_mask are rejected.

[1] http://blog.ffwll.ch/2013/11/botching-up-ioctls.html

Link: http://marc.info/?i=cover.1386798254.git.ydroneaud@opteya.com>
Signed-off-by: NYann Droneaud <ydroneaud@opteya.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

7efb1b19

IB/uverbs: New macro to set pointers to NULL if length is 0 in INIT_UDATA() · a96e4e2f

由 Roland Dreier 提交于 12月 19, 2013

Trying to have a ternary operator to choose between NULL (or 0) and the
real pointer value in invocations leads to an impossible choice between
a sparse error about a literal 0 used as a NULL pointer, and a gcc
warning about "pointer/integer type mismatch in conditional expression."

Rather than clutter the source with more casts, move the ternary
operator into a new INIT_UDATA_BUF_OR_NULL() macro, which makes it
easier to use and simplifies its callers.
Reported-by: NYann Droneaud <ydroneaud@opteya.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

a96e4e2f

19 12月, 2013 1 次提交

iser-target: Move INIT_WORK setup into isert_create_device_ib_res · 2853c2b6

由 Nicholas Bellinger 提交于 12月 11, 2013

This patch moves INIT_WORK setup for cq_desc->cq_[rx,tx]_work into
isert_create_device_ib_res(), instead of being done each callback
invocation in isert_cq_[rx,tx]_callback().

This also fixes a 'INFO: trying to register non-static key' warning
when cancel_work_sync() is called before INIT_WORK has setup the
struct work_struct.
Reported-by: NOr Gerlitz <ogerlitz@mellanox.com>
Cc: <stable@vger.kernel.org> #3.12+
Signed-off-by: NNicholas Bellinger <nab@linux-iscsi.org>

2853c2b6

17 12月, 2013 1 次提交

IB/core: const'ify inbuf in struct ib_udata · 309243ec

由 Yann Droneaud 提交于 12月 11, 2013

Userspace input buffer is not modified by kernel, so it can be 'const'.

This is also a prerequisite to remove the implicit cast
from INIT_UDATA().

Link: http://marc.info/?i=cover.1386798254.git.ydroneaud@opteya.com>
Signed-off-by: NYann Droneaud <ydroneaud@opteya.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

309243ec

16 12月, 2013 2 次提交

RDMA/iwcm: Don't touch cm_id after deref in rem_ref · 6b59ba60

由 Steve Wise 提交于 11月 21, 2013

rem_ref() calls iwcm_deref_id(), which will wake up any blockers on
cm_id_priv->destroy_comp if the refcnt hits 0.  That will unblock
someone in iw_destroy_cm_id() which will free the cmid.  If that
happens before rem_ref() calls test_bit(IWCM_F_CALLBACK_DESTROY,
&cm_id_priv->flags), then the test_bit() will touch freed memory.

The fix is to read the bit first, then deref.  We should never be in
iw_destroy_cm_id() with IWCM_F_CALLBACK_DESTROY set, and there is a
BUG_ON() to make sure of that.
Signed-off-by: NSteve Wise <swise@opengridcomputing.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

6b59ba60

RDMA/cxgb4: Make _c4iw_write_mem_dma() static · c00850dd

由 Rashika 提交于 12月 14, 2013

This patch marks the function _c4iw_write_mem_dma() as static
because it is not used outside this file, which fixes the warning:

drivers/infiniband/hw/cxgb4/mem.c:176:5: warning: no previous prototype for ‘_c4iw_write_mem_dma’ [-Wmissing-prototypes]
Signed-off-by: NRashika Kheria <rashika.kheria@gmail.com>
Acked-by: NSteve Wise <swise@opengridcomputing.com>
Reviewed-by: NJosh Triplett <josh@joshtriplett.org>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

c00850dd

12 12月, 2013 1 次提交

iser-target: fix error return code in isert_create_device_ib_res() · 94a71110

由 Wei Yongjun 提交于 10月 29, 2013

Fix to return a negative error code from the error handling
case instead of 0, as done elsewhere in this function.
Signed-off-by: NWei Yongjun <yongjun_wei@trendmicro.com.cn>
Cc: <stable@vger.kernel.org> #3.10+
Signed-off-by: NNicholas Bellinger <nab@linux-iscsi.org>

94a71110

18 11月, 2013 6 次提交

IB/core: Re-enable create_flow/destroy_flow uverbs · 69ad5da4

由 Matan Barak 提交于 11月 06, 2013

This commit reverts commit 7afbddfa ("IB/core: Temporarily disable
create_flow/destroy_flow uverbs").  Since the uverbs extensions
functionality was experimental for v3.12, this patch re-enables the
support for them and flow-steering for v3.13.
Signed-off-by: NMatan Barak <matanb@mellanox.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

69ad5da4

IB/core: extended command: an improved infrastructure for uverbs commands · f21519b2

由 Yann Droneaud 提交于 11月 06, 2013

Commit 400dbc96 ("IB/core: Infrastructure for extensible uverbs
commands") added an infrastructure for extensible uverbs commands
while later commit 436f2ad0 ("IB/core: Export ib_create/destroy_flow
through uverbs") exported ib_create_flow()/ib_destroy_flow() functions
using this new infrastructure.

According to the commit 400dbc96, the purpose of this
infrastructure is to support passing around provider (eg. hardware)
specific buffers when userspace issue commands to the kernel, so that
it would be possible to extend uverbs (eg. core) buffers independently
from the provider buffers.

But the new kernel command function prototypes were not modified to
take advantage of this extension. This issue was exposed by Roland
Dreier in a previous review[1].

So the following patch is an attempt to a revised extensible command
infrastructure.

This improved extensible command infrastructure distinguish between
core (eg. legacy)'s command/response buffers from provider
(eg. hardware)'s command/response buffers: each extended command
implementing function is given a struct ib_udata to hold core
(eg. uverbs) input and output buffers, and another struct ib_udata to
hold the hw (eg. provider) input and output buffers.

Having those buffers identified separately make it easier to increase
one buffer to support extension without having to add some code to
guess the exact size of each command/response parts: This should make
the extended functions more reliable.

Additionally, instead of relying on command identifier being greater
than IB_USER_VERBS_CMD_THRESHOLD, the proposed infrastructure rely on
unused bits in command field: on the 32 bits provided by command
field, only 6 bits are really needed to encode the identifier of
commands currently supported by the kernel. (Even using only 6 bits
leaves room for about 23 new commands).

So this patch makes use of some high order bits in command field to
store flags, leaving enough room for more command identifiers than one
will ever need (eg. 256).

The new flags are used to specify if the command should be processed
as an extended one or a legacy one. While designing the new command
format, care was taken to make usage of flags itself extensible.

Using high order bits of the commands field ensure that newer
libibverbs on older kernel will properly fail when trying to call
extended commands. On the other hand, older libibverbs on newer kernel
will never be able to issue calls to extended commands.

The extended command header includes the optional response pointer so
that output buffer length and output buffer pointer are located
together in the command, allowing proper parameters checking. This
should make implementing functions easier and safer.

Additionally the extended header ensure 64bits alignment, while making
all sizes multiple of 8 bytes, extending the maximum buffer size:

                             legacy      extended

   Maximum command buffer:  256KBytes   1024KBytes (512KBytes + 512KBytes)
  Maximum response buffer:  256KBytes   1024KBytes (512KBytes + 512KBytes)

For the purpose of doing proper buffer size accounting, the headers
size are no more taken in account in "in_words".

One of the odds of the current extensible infrastructure, reading
twice the "legacy" command header, is fixed by removing the "legacy"
command header from the extended command header: they are processed as
two different parts of the command: memory is read once and
information are not duplicated: it's making clear that's an extended
command scheme and not a different command scheme.

The proposed scheme will format input (command) and output (response)
buffers this way:

- command:

  legacy header +
  extended header +
  command data (core + hw):

    +----------------------------------------+
    | flags     |   00      00    |  command |
    |        in_words    |   out_words       |
    +----------------------------------------+
    |                 response               |
    |                 response               |
    | provider_in_words | provider_out_words |
    |                 padding                |
    +----------------------------------------+
    |                                        |
    .              <uverbs input>            .
    .              (in_words * 8)            .
    |                                        |
    +----------------------------------------+
    |                                        |
    .             <provider input>           .
    .          (provider_in_words * 8)       .
    |                                        |
    +----------------------------------------+

- response, if present:

    +----------------------------------------+
    |                                        |
    .          <uverbs output space>         .
    .             (out_words * 8)            .
    |                                        |
    +----------------------------------------+
    |                                        |
    .         <provider output space>        .
    .         (provider_out_words * 8)       .
    |                                        |
    +----------------------------------------+

The overall design is to ensure that the extensible infrastructure is
itself extensible while begin more reliable with more input and bound
checking.

Note:

The unused field in the extended header would be perfect candidate to
hold the command "comp_mask" (eg. bit field used to handle
compatibility).  This was suggested by Roland Dreier in a previous
review[2].  But "comp_mask" field is likely to be present in the uverb
input and/or provider input, likewise for the response, as noted by
Matan Barak[3], so it doesn't make sense to put "comp_mask" in the
header.

[1]:
http://marc.info/?i=CAL1RGDWxmM17W2o_era24A-TTDeKyoL6u3NRu_=t_dhV_ZA9MA@mail.gmail.com

[2]:
http://marc.info/?i=CAL1RGDXJtrc849M6_XNZT5xO1+ybKtLWGq6yg6LhoSsKpsmkYA@mail.gmail.com

[3]:
http://marc.info/?i=525C1149.6000701@mellanox.comSigned-off-by: NYann Droneaud <ydroneaud@opteya.com>
Link: http://marc.info/?i=cover.1383773832.git.ydroneaud@opteya.com

[ Convert "ret ? ret : 0" to the equivalent "ret".  - Roland ]
Signed-off-by: NRoland Dreier <roland@purestorage.com>

f21519b2

IB/core: Remove ib_uverbs_flow_spec structure from userspace · 2490f20b

由 Yann Droneaud 提交于 11月 06, 2013

The structure holding any types of flow_spec is of no use to
userspace.  It would be wrong for userspace to do:

  struct ib_uverbs_flow_spec flow_spec;

  flow_spec.type = IB_FLOW_SPEC_TCP;
  flow_spec.size = sizeof(flow_spec);

Instead, userspace should use the dedicated flow_spec structure for
  - Ethernet : struct ib_uverbs_flow_spec_eth,
  - IPv4     : struct ib_uverbs_flow_spec_ipv4,
  - TCP/UDP  : struct ib_uverbs_flow_spec_tcp_udp.

In other words, struct ib_uverbs_flow_spec is a "virtual" data
structure that can only be use by the kernel as an alias to the other.
Signed-off-by: NYann Droneaud <ydroneaud@opteya.com>
Link: http://marc.info/?i=cover.1383773832.git.ydroneaud@opteya.comSigned-off-by: NRoland Dreier <roland@purestorage.com>

2490f20b

IB/core: Make uverbs flow structure use names like verbs ones · b68c9560

由 Yann Droneaud 提交于 11月 06, 2013

This patch adds "flow" prefix to most of data structure added as part
of commit 436f2ad0 ("IB/core: Export ib_create/destroy_flow through
uverbs") to keep those names in sync with the data structures added in
commit 319a441d ("IB/core: Add receive flow steering support").

It's just a matter of translating 'ib_flow' to 'ib_uverbs_flow'.
Signed-off-by: NYann Droneaud <ydroneaud@opteya.com>
Link: http://marc.info/?i=cover.1383773832.git.ydroneaud@opteya.comSigned-off-by: NRoland Dreier <roland@purestorage.com>

b68c9560

IB/core: Rename 'flow' structs to match other uverbs structs · d82693da

由 Yann Droneaud 提交于 11月 06, 2013

Commit 436f2ad0 ("IB/core: Export ib_create/destroy_flow through
uverbs") added public data structures to support receive flow
steering. The new structs are not following the 'uverbs' pattern:
they're lacking the common prefix 'ib_uverbs'.

This patch replaces ib_kern prefix by ib_uverbs.
Signed-off-by: NYann Droneaud <ydroneaud@opteya.com>
Link: http://marc.info/?i=cover.1383773832.git.ydroneaud@opteya.comSigned-off-by: NRoland Dreier <roland@purestorage.com>

d82693da

IB/core: clarify overflow/underflow checks on ib_create/destroy_flow · f8848274

由 Matan Barak 提交于 11月 06, 2013

This patch fixes the following issues:

1. Unneeded checks were removed

2. Removed the fixed size out of flow_attr.size, thus simplifying the checks.

3. Remove a 32bit hole on 64bit systems with strict alignment in
   struct ib_kern_flow_att by adding a reserved field.
Signed-off-by: NMatan Barak <matanb@mellanox.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

f8848274

17 11月, 2013 2 次提交

IB/ucma: Convert use of typedef ctl_table to struct ctl_table · f3a5e3e3

由 Joe Perches 提交于 10月 22, 2013

This typedef is unnecessary and should just be removed.
Signed-off-by: NJoe Perches <joe@perches.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

f3a5e3e3

IB/cm: Convert to using idr_alloc_cyclic() · ab626d1a

由 Zhao Hongjiang 提交于 11月 15, 2013

Commit 3e6628c4 ("idr: introduce idr_alloc_cyclic()") adds a new
idr_alloc_cyclic() routine and converts several of these users to it.
This is just a missed one - add it.
Signed-off-by: NZhao Hongjiang <zhaohongjiang@huawei.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

ab626d1a

16 11月, 2013 5 次提交

IB/mlx5: Fix page shift in create CQ for userspace · cf1c5e1f

由 Eli Cohen 提交于 10月 31, 2013

When creating a CQ, we must use mlx5 adapter page shift.
Signed-off-by: NEli Cohen <eli@mellanox.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

cf1c5e1f

IB/mlx4: Fix device max capabilities check · 79d3da9c

由 Eli Cohen 提交于 10月 31, 2013

Move the check on max supported CQEs after the final number of entries is
evaluated.
Signed-off-by: NEli Cohen <eli@mellanox.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

79d3da9c

IB/mlx5: Remove dead code · 7e2e1921

由 Eli Cohen 提交于 10月 31, 2013

The value of the local variable index is never used in reg_mr_callback().
Signed-off-by: NEli Cohen <eli@mellanox.com>

[ Remove now-unused variable delta too.  - Roland ]
Signed-off-by: NRoland Dreier <roland@purestorage.com>

7e2e1921

IB/core: Encorce MR access rights rules on kernel consumers · 1c636f80

由 Eli Cohen 提交于 10月 31, 2013

Enforce the rule that when requesting remote write or atomic permissions, local
write must be indicated as well. See IB spec 11.2.8.2.

Spotted by: Hagay Abramovsky <hagaya@mellanox.com>
Signed-off-by: NEli Cohen <eli@mellanox.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

1c636f80

IB/mlx4: Fix endless loop in resize CQ · 93b80ac2

由 Eli Cohen 提交于 10月 31, 2013

When calling get_sw_cqe() we need pass the consumer_index and not the
masked value. Failure to do so will cause incorrect result of
get_sw_cqe() possibly leading to endless loop.

This problem was reported and analyzed by Michael Rice from HP.
Signed-off-by: NEli Cohen <eli@mellanox.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

93b80ac2

13 11月, 2013 3 次提交

qib_fs: fix (some) dcache abuses · 441a9d0e

由 Al Viro 提交于 11月 13, 2013

* lookup_one_len() really wants i_mutex held on directory.
* leaks galore - just mount ipathfs, then
cd /sys/bus/pci/drivers/qib_ib; echo *:*:*.* >unbind
on a box with that card present and try to umount ipathfs...
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

441a9d0e

ib_isert: Avoid duplicate iscsit_increment_maxcmdsn call · 04d9cd12

由 Nicholas Bellinger 提交于 11月 12, 2013

This patch avoids a duplicate iscsit_increment_maxcmdsn() call for
ISER_IB_RDMA_WRITE within isert_map_rdma() + isert_reg_rdma_frwr(),
which will already be occuring once during isert_put_datain() ->
iscsit_build_rsp_pdu() operation.

It also removes the local conn->stat_sn assignment + increment,
and changes the third parameter to iscsit_build_rsp_pdu() to
signal this should be done by iscsi_target_mode code.
Tested-by: NMoussa Ba <moussaba@micron.com>
Cc: <stable@vger.kernel.org> # v3.10+
Signed-off-by: NNicholas Bellinger <nab@linux-iscsi.org>

04d9cd12

iser-target: Avoid using FRMR for single dma entry requests · f01b9f73

由 Vu Pham 提交于 11月 11, 2013

This patch changes isert_reg_rdma_frwr() to not use FRMR for single
dma entry requests from small I/Os, in order to avoid the associated
memory registration overhead.

Using DMA MR is sufficient here for the single dma entry requests,
and addresses a >= v3.12 performance regression.
Signed-off-by: NVu Pham <vu@mellanox.com>
Cc: <stable@vger.kernel.org> # v3.12+
Signed-off-by: NNicholas Bellinger <nab@linux-iscsi.org>

f01b9f73

12 11月, 2013 2 次提交

RDMA/cma: Remove unused argument and minor dead code · 352b9056

由 Michal Nazarewicz 提交于 11月 10, 2013

The dev variable is never assigned after being initialised.
Signed-off-by: NMichal Nazarewicz <mina86@mina86.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

352b9056

RDMA/ucma: Discard events for IDs not yet claimed by user space · c6b21824

由 Sean Hefty 提交于 11月 01, 2013

Problem reported by Avneesh Pant <avneesh.pant@oracle.com>:

It looks like we are triggering a bug in RDMA CM/UCM interaction.
The bug specifically hits when we have an incoming connection
request and the connecting process dies BEFORE the passive end of
the connection can process the request i.e. it does not call
rdma_get_cm_event() to retrieve the initial connection event. We
were able to triage this further and have some additional
information now.

In the example below when P1 dies after issuing a connect request
as the CM id is being destroyed all outstanding connects (to P2)
are sent a reject message. We see this reject message being
received on the passive end and the appropriate CM ID created for
the initial connection message being retrieved in cm_match_req().
The problem is in the ucma_event_handler() code when this reject
message is delivered to it and the initial connect message itself
HAS NOT been delivered to the client. In fact the client has not
even called rdma_cm_get_event() at this stage so we haven't
allocated a new ctx in ucma_get_event() and updated the new
connection CM_ID to point to the new UCMA context.

This results in the reject message not being dropped in
ucma_event_handler() for the new connection request as the
(if (!ctx->uid)) block is skipped since the ctx it refers to is
the listen CM id context which does have a valid UID associated
with it (I believe the new CMID for the connection initially
uses the listen CMID -> context when it is created in
cma_new_conn_id). Thus the assumption that new events for a
connection can get dropped in ucma_event_handler() is incorrect
IF the initial connect request has not been retrieved in the
first case. We end up getting a CM Reject event on the listen CM
ID and our upper layer code asserts (in fact this event does not
even have the listen_id set as that only gets set up librdmacm
for connect requests).

The solution is to verify that the cm_id being reported in the event
is the same as the cm_id referenced by the ucma context. A mismatch
indicates that the ucma context corresponds to the listen. This fix
was validated by using a modified version of librdmacm that was able
to verify the problem and see that the reject message was indeed
dropped after this patch was applied.
Signed-off-by: NSean Hefty <sean.hefty@intel.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

c6b21824

09 11月, 2013 5 次提交

IB/core: Add Cisco usNIC rdma node and transport types · 180771a3

由 Upinder Malhi \(umalhi\) 提交于 9月 10, 2013

This patch adds new rdma node and new rdma transport, and supporting
code used by Cisco's low latency driver called usNIC.  usNIC uses its
own transport, distinct from IB and iWARP.
Signed-off-by: NUpinder Malhi <umalhi@cisco.com>
Signed-off-by: NJeff Squyres <jsquyres@cisco.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

180771a3

RDMA/nes: Remove self-assignment from nes_query_qp() · 4127c365

由 Dave Jones 提交于 9月 17, 2013

Assigning a value to itself is pointless.

Spotted with coverity, no hardware to test.
Signed-off-by: NDave Jones <davej@fedoraproject.org>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

4127c365

IB/srp: Report receive errors correctly · cd4e3854

由 Bart Van Assche 提交于 10月 10, 2013

The IB spec does not guarantee that the opcode is available in error
completions.  Hence do not rely on it.  See also commit 948d1e88
("IB/srp: Introduce srp_handle_qp_err()").
Signed-off-by: NBart Van Assche <bvanassche@acm.org>
Cc: <stable@vger.kernel.org> # v3.8
Signed-off-by: NRoland Dreier <roland@purestorage.com>

cd4e3854

IB/srp: Avoid offlining operational SCSI devices · 99b6697a

由 Bart Van Assche 提交于 10月 10, 2013

If SCSI commands are submitted with a SCSI request timeout that is
lower than the the IB RC timeout, it can happen that the SCSI error
handler has already started device recovery before transport layer
error handling starts.  So it can happen that the SCSI error handler
tries to abort a SCSI command after it has been reset by
srp_rport_reconnect().

Tell the SCSI error handler that such commands have finished and that
it is not necessary to continue its recovery strategy for commands
that have been reset by srp_rport_reconnect().
Signed-off-by: NBart Van Assche <bvanassche@acm.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

99b6697a

IB/srp: Remove target from list before freeing Scsi_Host structure · 65d7dd2f

由 Vu Pham 提交于 10月 10, 2013

Remove an SRP target from the SRP target list before invoking the last
scsi_host_put() call.  This change is necessary because that last put
frees the memory that holds the srp_target_port structure.

This patch prevents the following kernel oops:

    RIP: 0010:[<ffffffff810b00d0>] __lock_acquire+0x500/0x1570
    Call Trace:
     [<ffffffff810b11e4>] lock_acquire+0xa4/0x120
     [<ffffffff81531206>] _spin_lock+0x36/0x70
     [<ffffffffa01b6d8f>] srp_remove_work+0xef/0x180 [ib_srp]
     [<ffffffff8109125c>] worker_thread+0x21c/0x3d0
     [<ffffffff81096e86>] kthread+0x96/0xa0
     [<ffffffff8100c20a>] child_rip+0xa/0x20
Signed-off-by: NVu Pham <vuhuong@mellanox.com>

[ bvanassche - Modified path description and CC'ed stable. ]
Signed-off-by: NBart Van Assche <bvanassche@acm.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

65d7dd2f

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功