提交 · 9268f72dcb24348c8b4cf9bcf8afeb24035157a5 · openeuler / Kernel

31 8月, 2015 3 次提交

IB/core: Find the network device matching connection parameters · 9268f72d

由 Yotam Kenneth 提交于 7月 30, 2015

In the case of IPoIB, and maybe in other cases, the network device is
managed by an upper-layer protocol (ULP). In order to expose this
network device to other users of the IB device, let ULPs implement
a callback that returns network device according to connection parameters.

The IB device and port, together with the P_Key and the GID should
be enough to uniquely identify the ULP net device. However, in current
kernels there can be multiple IPoIB interfaces created with the same GID.
Furthermore, such configuration may be desireable to support ipvlan-like
configurations for RDMA CM with IPoIB. To resolve the device in these
cases the code will also take the IP address as an additional input.
Reviewed-by: NJason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: NHaggai Eran <haggaie@mellanox.com>
Signed-off-by: NYotam Kenneth <yotamke@mellanox.com>
Signed-off-by: NShachar Raindel <raindel@mellanox.com>
Signed-off-by: NGuy Shapiro <guysh@mellanox.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

9268f72d

IB/core: lock client data with lists_rwsem · 7c1eb45a

由 Haggai Eran 提交于 7月 30, 2015

An ib_client callback that is called with the lists_rwsem locked only for
read is protected from changes to the IB client lists, but not from
ib_unregister_device() freeing its client data. This is because
ib_unregister_device() will remove the device from the device list with
lists_rwsem locked for write, but perform the rest of the cleanup,
including the call to remove() without that lock.

Mark client data that is undergoing de-registration with a new going_down
flag in the client data context. Lock the client data list with lists_rwsem
for write in addition to using the spinlock, so that functions calling the
callback would be able to lock only lists_rwsem for read and let callbacks
sleep.

Since ib_unregister_client() now marks the client data context, no need for
remove() to search the context again, so pass the client data directly to
remove() callbacks.
Reviewed-by: NJason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: NHaggai Eran <haggaie@mellanox.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

7c1eb45a

IB/core: Add rwsem to allow reading device list or client list · 5aa44bb9

由 Haggai Eran 提交于 7月 30, 2015

Currently the RDMA subsystem's device list and client list are protected by
a single mutex. This prevents adding user-facing APIs that iterate these
lists, since using them may cause a deadlock. The patch attempts to solve
this problem by adding a read-write semaphore to protect the lists. Readers
now don't need the mutex, and are safe just by read-locking the semaphore.

The ib_register_device, ib_register_client, ib_unregister_device, and
ib_unregister_client functions are modified to lock the semaphore for write
during their respective list modification. Also, in order to make sure
client callbacks are called only between add() and remove() calls, the code
is changed to only add items to the lists after the add() calls and remove
from the lists before the remove() calls.

This patch attempts to solve a similar need [1] that was seen in the RoCE
v2 patch series.

[1] http://www.spinics.net/lists/linux-rdma/msg24733.htmlReviewed-by: NJason Gunthorpe <jgunthorpe@obsidianresearch.com>
Cc: Matan Barak <matanb@mellanox.com>
Signed-off-by: NHaggai Eran <haggaie@mellanox.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

5aa44bb9

29 8月, 2015 14 次提交

ipath,qib: Expose max_sge_rd correctly · aaae91f4

由 Steve Wise 提交于 7月 27, 2015

Applications must not assume that max_sge and max_sge_rd are the same,
Hence expose max_sge_rd correctly as well.
Signed-off-by: NSteve Wise <swise@opengridcomputing.com>
Acked-by: NMike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

aaae91f4

mlx4, mlx5, mthca: Expose max_sge_rd correctly · 18ebd407

由 Sagi Grimberg 提交于 7月 27, 2015

Applications must not assume that max_sge and max_sge_rd are the same,
Hence expose max_sge_rd correctly as well.
Reported-by: NSteve Wise <swise@opengridcomputing.com>
Signed-off-by: NSagi Grimberg <sagig@mellanox.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

18ebd407

IB/core: Add core header changes needed for OPA · d4ab3470

由 Dennis Dalessandro 提交于 7月 30, 2015

This patch adds the value of the CNP opcode to the existing list of enumerated
opcodes in ib_pack.h

Add common OPA header definitions for driver
build:
- opa_port_info.h
- opa_smi.h
- hfi1_user.h

Additionally, ib_mad.h, has additional definitions
that are common to ib_drivers including:
- trap support
- cca support

The qib driver has the duplication removed in favor
those in ib_mad.h
Reviewed-by: NMike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: NJohn, Jubin <jubin.john@intel.com>
Signed-off-by: NIra Weiny <ira.weiny@intel.com>
Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

d4ab3470

RDMA/amso1100: Deprecate the amso1100 driver and move to staging · 072bf1f7

由 Steve Wise 提交于 7月 29, 2015

The HW hasn't been sold since 2005, and the SW has definite bit rot.
Its time to remove it.  So move it to staging for a few releases and
then remove it after that.
Signed-off-by: NSteve Wise <swise@opengridcomputing.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

072bf1f7

IB/ipath: Deprecate ipath driver and move to staging. · 6f9b3890

由 Dennis Dalessandro 提交于 7月 30, 2015

It is now time for the ipath driver to begin to be phased out of the kernel.
This patch moves the ipath driver from the Infiniband sub tree to the staging
area where it will remain until the code is removed from the kernel in a few
releases.
Reviewed-by: NMike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

6f9b3890

iw_cxgb4: Add support for clip · 84cc6ac6

由 Hariprasad S 提交于 8月 25, 2015

Add support for ipv6 address handling clip api provided by lld
Signed-off-by: NHariprasad Shenai <hariprasad@chelsio.com>
Acked-by: NSteve Wise <swise@opengridcomputing.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

84cc6ac6

RDMA/cma: fix IPv6 address resolution · 6c26a771

由 Spencer Baugh 提交于 8月 13, 2015

Resolving a link-local IPv6 address with an unspecified source address
was broken by commit 5462eddd, which prevented the IPv6 stack from
learning the scope id of the link-local IPv6 address, causing random
failures as the IP stack chose a random link to resolve the address on.

This commit 5462eddd made us bail out of cma_check_linklocal early if
the address passed in was not an IPv6 link-local address. On the address
resolution path, the address passed in is the source address; if the
source address is the unspecified address, which is not link-local, we
will bail out early.

This is mostly correct, but if the destination address is a link-local
address, then we will be following a link-local route, and we'll need to
tell the IPv6 stack what the scope id of the destination address is.
This used to be done by last line of cma_check_linklocal, which is
skipped when bailing out early:

	dev_addr->bound_dev_if = sin6->sin6_scope_id;

(In cma_bind_addr, the sin6_scope_id of the source address is set to the
sin6_scope_id of the destination address, so this is correct)
This line is required in turn for the following line, L279 of
addr6_resolve, to actually inform the IPv6 stack of the scope id:

      fl6.flowi6_oif = addr->bound_dev_if;

Since we can only know we are in this failure case when we have access
to both the source IPv6 address and destination IPv6 address, we have to
deal with this further up the stack. So detect this failure case in
cma_bind_addr, and set bound_dev_if to the destination address scope id
to correct it.
Signed-off-by: NSpencer Baugh <sbaugh@catern.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

6c26a771

IB/ucma: Fix theoretical user triggered use-after-free · 7e967fd0

由 Jason Gunthorpe 提交于 8月 04, 2015

Something like this:

CPU A                         CPU B
Acked-by: NSean Hefty <sean.hefty@intel.com>

========================      ================================
ucma_destroy_id()
 wait_for_completion()
                              .. anything
                                ucma_put_ctx()
                                  complete()
 .. continues ...
                              ucma_leave_multicast()
                               mutex_lock(mut)
                                 atomic_inc(ctx->ref)
                               mutex_unlock(mut)
 ucma_free_ctx()
  ucma_cleanup_multicast()
   mutex_lock(mut)
     kfree(mc)
                               rdma_leave_multicast(mc->ctx->cm_id,..

Fix it by latching the ref at 0. Once it goes to 0 mc and ctx cannot
leave the mutex(mut) protection.

The other atomic_inc in ucma_get_ctx is OK because mutex(mut) protects
it from racing with ucma_destroy_id.
Signed-off-by: NJason Gunthorpe <jgunthorpe@obsidianresearch.com>
Acked-by: NSean Hefty <sean.hefty@intel.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

7e967fd0

iw_cxgb4: set the default MPA version to 2 · b8ac3112

由 Hariprasad S 提交于 7月 27, 2015

This enables ORD/IRD negotiation and its about time to enable it by
default
Signed-off-by: NSteve Wise <swise@opengridcomputing.com>
Signed-off-by: NHariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

b8ac3112

RDMA/iser: Limit sgs to the device fastreg depth · 7854550a

由 Steve Wise 提交于 7月 28, 2015

Currently the sg tablesize, which dictates fast register page list
depth to use, does not take into account the limits of the rdma device.
So adjust it once we discover the device fastreg max depth limit. Also
adjust the max_sectors based on the resulting sg tablesize.
Signed-off-by: NSteve Wise <swise@opengridcomputing.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

7854550a

IB/mlx5: Remove dead code from alloc_cached_mr() · d6c7276b

由 Roland Dreier 提交于 7月 27, 2015

The only place that assigns mr inside the loop already does a break.
So "if (mr)" will never be true here since the function initializes mr
to NULL at the top.  We can just drop the extra if and break here.
Signed-off-by: NRoland Dreier <roland@purestorage.com>
Acked-by: NEli Cohen  <eli@mellanox.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

d6c7276b

IB/qib: Change lkey table allocation to support more MRs · d6f1c17e

由 Mike Marciniszyn 提交于 7月 21, 2015

The lkey table is allocated with with a get_user_pages() with an
order based on a number of index bits from a module parameter.

The underlying kernel code cannot allocate that many contiguous pages.

There is no reason the underlying memory needs to be physically
contiguous.

This patch:
- switches the allocation/deallocation to vmalloc/vfree
- caps the number of bits to 23 to insure at least 1 generation bit
  o this matches the module parameter description

Cc: stable@vger.kernel.org
Reviewed-by: NVinit Agnihotri <vinit.abhay.agnihotri@intel.com>
Signed-off-by: NMike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

d6f1c17e

mlx5: Expose correct page_size_cap in device attributes · e0238a6a

由 Sagi Grimberg 提交于 7月 21, 2015

Should be all the page sizes that are supported by the
device.
Reported-by: NJason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: NSagi Grimberg <sagig@mellanox.com>
Reviewed-by: NJason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

e0238a6a

mlx5: Fix missing device local_dma_lkey · a3c87420

由 Sagi Grimberg 提交于 7月 20, 2015

The mlx5 driver exposes device capability IB_DEVICE_LOCAL_DMA_LKEY
but does not set the the device local_dma_lkey. This breaks
rpcrdma drivers.

Query and set this lkey when creating the device resources.
Signed-off-by: NSagi Grimberg <sagig@mellanox.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

a3c87420

28 7月, 2015 1 次提交

iw_cxgb4: gracefully handle unknown CQE status errors · 3661df17

由 Hariprasad S 提交于 7月 27, 2015

c4iw_poll_cq_on() shouldn't fail the poll operation just because
the CQE status is unknown.  Rather, it should map this to the
"fatal error" status and log the anomaly.
Signed-off-by: NSteve Wise <swise@opengridcomputing.com>
Signed-off-by: NHariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

3661df17

25 7月, 2015 1 次提交

iser-target: Fix REJECT CM event use-after-free OOPs · ce9a9fc2

由 Nicholas Bellinger 提交于 7月 24, 2015

This patch fixes a bug in iser-target code where the REJECT CM event
handler code currently performs a isert_put_conn() for the final
isert_conn->kref put, while iscsi_np process context is still blocked
in isert_get_login_rx().

Once isert_get_login_rx() is awoking due to login timeout, iscsi_np
process context will attempt to invoke iscsi_target_login_sess_out()
to cleanup iscsi_conn as expected, and calls isert_wait_conn() +
isert_free_conn() which triggers the use-after-free OOPs.

To address this bug, move the kref_get_unless_zero() call from
isert_connected_handler() into isert_connect_request() immediately
preceeding isert_rdma_accept() to ensure the CM handler cleanup
paths and isert_free_conn() are always operating with two refs.

Cc: Sagi Grimberg <sagig@mellanox.com>
Cc: <stable@vger.kernel.org> # v3.10+
Signed-off-by: NNicholas Bellinger <nab@linux-iscsi.org>

ce9a9fc2

24 7月, 2015 4 次提交

RDMA/ocrdma: update ocrdma module license string · b8f5595e

由 Devesh Sharma 提交于 7月 24, 2015

Change module_license from "GPL" to "Dual BSD/GPL"

Cc: Tejun Heo <tj@kernel.org>
Cc: Duan Jiong <duanj.fnst@cn.fujitsu.com>
Cc: Roland Dreier <roland@purestorage.com>
Cc: Jes Sorensen <Jes.Sorensen@redhat.com>
Cc: Sasha Levin <levinsasha928@gmail.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Cc: Moni Shoua <monis@mellanox.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Li RongQing <roy.qing.li@gmail.com>
Cc: Devendra Naga <devendra.aaru@gmail.com>
Signed-off-by: NDevesh Sharma <devesh.sharma@avagotech.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

b8f5595e

RDMA/ocrdma: update ocrdma license to dual-license · 71ee6730

由 Devesh Sharma 提交于 7月 24, 2015

Change of license from GPLv2 to dual-license (GPLv2 and BSD 2-Clause)

All contributors were contacted off-list and permission to make this
change was received.  The complete list of contributors are Cc:ed here.

Cc: Tejun Heo <tj@kernel.org>
Cc: Duan Jiong <duanj.fnst@cn.fujitsu.com>
Cc: Roland Dreier <roland@purestorage.com>
Cc: Jes Sorensen <Jes.Sorensen@redhat.com>
Cc: Sasha Levin <levinsasha928@gmail.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Cc: Moni Shoua <monis@mellanox.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Li RongQing <roy.qing.li@gmail.com>
Cc: Devendra Naga <devendra.aaru@gmail.com>
Signed-off-by: NDevesh Sharma <devesh.sharma@avagotech.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

71ee6730

IB/ipoib: Fix CONFIG_INFINIBAND_IPOIB_CM · efc1eedb

由 Jason Gunthorpe 提交于 7月 22, 2015

If the above is turned off then ipoib_cm_dev_init unconditionally
returns ENOSYS, and the newly added error handling in
0b3957 prevents ipoib from coming up at all:

kernel: mlx4_0: ipoib_transport_dev_init failed
kernel: mlx4_0: failed to initialize port 1 (ret = -12)

Fixes: 0b39578b (IB/ipoib: Use dedicated workqueues per interface)
Signed-off-by: NJason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

efc1eedb

RDMA/cxgb3: fail get_dma_mr on 64 bit arches · 49fa63d8

由 Steve Wise 提交于 7月 22, 2015

T3 HW only supports 32 bit MRs.  If the system uses 64 bit memory
addresses, then a registered 32 bit MR will wrap and write to the
wrong memory when used with addresses > 4GB.  To prevent this,
simply fail to allocate an MR on 64 bit machines (other means
of registering memory are still available and software can still
work, we just don't allow this means of memory registration).
Signed-off-by: NSteve Wise <swise@opengridcomputing.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

49fa63d8

21 7月, 2015 1 次提交

x86/mm/pat, drivers/infiniband/ipath: Replace WARN() with pr_warn() · fd0a1b86

由 Luis R. Rodriguez 提交于 7月 17, 2015

WARN() may confuse users, fix that. ipath_init_one() is part the
device's probe so this would only be triggered if a
corresponding device was found.
Signed-off-by: NLuis R. Rodriguez <mcgrof@suse.com>
Acked-by: NDoug Ledford <dledford@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: andy@silverblocksystems.net
Cc: benh@kernel.crashing.org
Cc: bp@suse.de
Cc: dan.j.williams@intel.com
Cc: jkosina@suse.cz
Cc: julia.lawall@lip6.fr
Cc: luto@amacapital.net
Cc: mchehab@osg.samsung.com
Link: http://lkml.kernel.org/r/1437167245-28273-2-git-send-email-mcgrof@do-not-panic.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

fd0a1b86

15 7月, 2015 16 次提交

IB/core: Destroy ocrdma_dev_id IDR on module exit · d8b2ba7c

由 Johannes Thumshirn 提交于 7月 08, 2015

Destroy ocrdma_dev_id IDR on module exit, reclaiming the allocated memory.

This was detected by the following semantic patch (written by Luis Rodriguez
<mcgrof@suse.com>)
<SmPL>
@ defines_module_init @
declarer name module_init, module_exit;
declarer name DEFINE_IDR;
identifier init;
@@

module_init(init);

@ defines_module_exit @
identifier exit;
@@

module_exit(exit);

@ declares_idr depends on defines_module_init && defines_module_exit @
identifier idr;
@@

DEFINE_IDR(idr);

@ on_exit_calls_destroy depends on declares_idr && defines_module_exit @
identifier declares_idr.idr, defines_module_exit.exit;
@@

exit(void)
{
 ...
 idr_destroy(&idr);
 ...
}

@ missing_module_idr_destroy depends on declares_idr && defines_module_exit && !on_exit_calls_destroy @
identifier declares_idr.idr, defines_module_exit.exit;
@@

exit(void)
{
 ...
 +idr_destroy(&idr);
 }

</SmPL>
Signed-off-by: NJohannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

d8b2ba7c

IB/core: Destroy multcast_idr on module exit · 45d25420

由 Johannes Thumshirn 提交于 7月 08, 2015

Destroy multcast_idr on module exit, reclaiming the allocated memory.

This was detected by the following semantic patch (written by Luis Rodriguez
<mcgrof@suse.com>)
<SmPL>
@ defines_module_init @
declarer name module_init, module_exit;
declarer name DEFINE_IDR;
identifier init;
@@

module_init(init);

@ defines_module_exit @
identifier exit;
@@

module_exit(exit);

@ declares_idr depends on defines_module_init && defines_module_exit @
identifier idr;
@@

DEFINE_IDR(idr);

@ on_exit_calls_destroy depends on declares_idr && defines_module_exit @
identifier declares_idr.idr, defines_module_exit.exit;
@@

exit(void)
{
 ...
 idr_destroy(&idr);
 ...
}

@ missing_module_idr_destroy depends on declares_idr && defines_module_exit && !on_exit_calls_destroy @
identifier declares_idr.idr, defines_module_exit.exit;
@@

exit(void)
{
 ...
 +idr_destroy(&idr);
}

</SmPL>
Signed-off-by: NJohannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

45d25420

IB/mlx4: Optimize do_slave_init · d9a047ae

由 Doug Ledford 提交于 7月 09, 2015

There is little chance our memory allocation will fail, so we can
combine initializing the work structs with allocating them instead of
looping through all of them once to allocate and again to initialize.
Then when we need to actually find out if our device is up or in the
process of going down, have all of our work structs batched up, take the
spin_lock once and only once, and do all of the batch under the one
spin_lock invocation instead of incurring all of the locked memory cycles
we would otherwise incur to take/release the spin_lock over and over
again.
Signed-off-by: NDoug Ledford <dledford@redhat.com>

d9a047ae

IB/mlx4: Fix memory leak in do_slave_init · 9bbf282d

由 Doug Ledford 提交于 7月 09, 2015

We create a number of work structs to be queued up to a workqueue, and
on completion of the workqueue handler, the workqueue handler frees the
allocated memory. If, however, we don't queue the work struct because
the device is going down, then we need to free the memory ourselves.
Signed-off-by: NDoug Ledford <dledford@redhat.com>

9bbf282d

IB/mlx4: Optimize freeing of items on error unwind · a39a98ff

由 Maninder Singh 提交于 7月 08, 2015

On failure, we loop through all possible pointers and test them before
calling kfree. But really, why even attempt to free items we didn't
allocate when we can easily loop through exactly and only the devices
for which the original memory allocation succeeded and free just those.
Signed-off-by: NManinder Singh <maninder1.s@samsung.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

a39a98ff

IB/mlx4: Fix use of flow-counters for process_mad · 43bfb972

由 Or Gerlitz 提交于 6月 25, 2015

For IB links, reading HCA flow counters through iboe_process_mad() should
be used when mlx4_ib_process_mad() is invoked only for VFs PMA queries and
exactly nothing else.

Fixes: 7193a141 ('IB/mlx4: Set VF to read from QP counters')
Reported-by: NLinus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

43bfb972

IB/ipath: Convert use of __constant_<foo> to <foo> · cb1ff431

由 Vaishali Thakkar 提交于 6月 16, 2015

In little endian cases, the macros be16_to_cpu and cpu_to_be64
unfolds to __swab{16,64} which provides special case for constants.
In big endian cases, __constant_be16_to_cpu and be16_to_cpu
expand directly to the same expression. The same applies for
__constant_cpu_to_be64 and cpu_to_be64.

So, replace __constant_be16_to_cpu with be16_to_cpu and
__constant_cpu_to_be64 with cpu_to_be64, with the goal of getting
rid of the definition of __constant_be16_to_cpu and
__constant_cpu_to_be64 completely.
Signed-off-by: NVaishali Thakkar <vthakkar1994@gmail.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

cb1ff431

IB/ipoib: Set MTU to max allowed by mode when mode changes · edcd2a74

由 Erez Shitrit 提交于 6月 07, 2015

When switching between modes (datagram / connected) change the MTU
accordingly.
datagram mode up to 4K, connected mode up to (64K - 0x10).
Signed-off-by: NELi Cohen <eli@mellanox.com>
Signed-off-by: NErez Shitrit <erezsh@mellanox.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

edcd2a74

IB/ipoib: Scatter-Gather support in connected mode · c4268778

由 Yuval Shaia 提交于 7月 12, 2015

By default, IPoIB-CM driver uses 64k MTU. Larger MTU gives better
performance.
This MTU plus overhead puts the memory allocation for IP based packets at
32 4k pages (order 5), which have to be contiguous.
When the system memory under pressure, it was observed that allocating 128k
contiguous physical memory is difficult and causes serious errors (such as
system becomes unusable).

This enhancement resolve the issue by removing the physically contiguous
memory requirement using Scatter/Gather feature that exists in Linux stack.

With this fix Scatter-Gather will be supported also in connected mode.

This change reverts some of the change made in commit e112373f
("IPoIB/cm: Reduce connected mode TX object size").

The ability to use SG in IPoIB CM is possible because the coupling
between NETIF_F_SG and NETIF_F_CSUM was removed in commit
ec5f0615 ("net: Kill link between CSUM and SG features.")
Signed-off-by: NYuval Shaia <yuval.shaia@oracle.com>
Acked-by: NChristian Marie <christian@ponies.io>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

c4268778

IB/ucm: Fix bitmap wrap when devnum > IB_UCM_MAX_DEVICES · 59d40dd9

由 Carol L Soto 提交于 6月 11, 2015

ib_ucm_release_dev clears the wrong bit if devnum is greater
than IB_UCM_MAX_DEVICES.
Signed-off-by: NCarol L Soto <clsoto@linux.vnet.ibm.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

59d40dd9

IB/ipoib: Prevent lockdep warning in __ipoib_ib_dev_flush · 8b7cce0d

由 Haggai Eran 提交于 7月 07, 2015

__ipoib_ib_dev_flush calls itself recursively on child devices, and lockdep
complains about locking vlan_rwsem twice (see below). Use down_read_nested
instead of down_read to prevent the warning.

 =============================================
 [ INFO: possible recursive locking detected ]
 4.1.0-rc4+ #36 Tainted: G           O
 ---------------------------------------------
 kworker/u20:2/261 is trying to acquire lock:
  (&priv->vlan_rwsem){.+.+..}, at: [<ffffffffa0791e2a>] __ipoib_ib_dev_flush+0x3a/0x2b0 [ib_ipoib]

 but task is already holding lock:
  (&priv->vlan_rwsem){.+.+..}, at: [<ffffffffa0791e2a>] __ipoib_ib_dev_flush+0x3a/0x2b0 [ib_ipoib]

 other info that might help us debug this:
  Possible unsafe locking scenario:

        CPU0
        ----
   lock(&priv->vlan_rwsem);
   lock(&priv->vlan_rwsem);

  *** DEADLOCK ***

  May be due to missing lock nesting notation

 3 locks held by kworker/u20:2/261:
  #0:  ("%s""ipoib_flush"){.+.+..}, at: [<ffffffff810827cc>] process_one_work+0x15c/0x760
  #1:  ((&priv->flush_heavy)){+.+...}, at: [<ffffffff810827cc>] process_one_work+0x15c/0x760
  #2:  (&priv->vlan_rwsem){.+.+..}, at: [<ffffffffa0791e2a>] __ipoib_ib_dev_flush+0x3a/0x2b0 [ib_ipoib]

 stack backtrace:
 CPU: 3 PID: 261 Comm: kworker/u20:2 Tainted: G           O    4.1.0-rc4+ #36
 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
 Workqueue: ipoib_flush ipoib_ib_dev_flush_heavy [ib_ipoib]
  ffff8801c6c54790 ffff8801c9927af8 ffffffff81665238 0000000000000001
  ffffffff825b5b30 ffff8801c9927bd8 ffffffff810bba51 ffff880100000000
  ffffffff00000001 ffff880100000001 ffff8801c6c55428 ffff8801c6c54790
 Call Trace:
  [<ffffffff81665238>] dump_stack+0x4f/0x6f
  [<ffffffff810bba51>] __lock_acquire+0x741/0x1820
  [<ffffffff810bcbf8>] lock_acquire+0xc8/0x240
  [<ffffffffa0791e2a>] ? __ipoib_ib_dev_flush+0x3a/0x2b0 [ib_ipoib]
  [<ffffffff81669d2c>] down_read+0x4c/0x70
  [<ffffffffa0791e2a>] ? __ipoib_ib_dev_flush+0x3a/0x2b0 [ib_ipoib]
  [<ffffffffa0791e2a>] __ipoib_ib_dev_flush+0x3a/0x2b0 [ib_ipoib]
  [<ffffffffa0791e4a>] __ipoib_ib_dev_flush+0x5a/0x2b0 [ib_ipoib]
  [<ffffffffa07920ba>] ipoib_ib_dev_flush_heavy+0x1a/0x20 [ib_ipoib]
  [<ffffffff81082871>] process_one_work+0x201/0x760
  [<ffffffff810827cc>] ? process_one_work+0x15c/0x760
  [<ffffffff81082ef0>] worker_thread+0x120/0x4d0
  [<ffffffff81082dd0>] ? process_one_work+0x760/0x760
  [<ffffffff81082dd0>] ? process_one_work+0x760/0x760
  [<ffffffff81088b7e>] kthread+0xfe/0x120
  [<ffffffff81088a80>] ? __init_kthread_worker+0x70/0x70
  [<ffffffff8166c6e2>] ret_from_fork+0x42/0x70
  [<ffffffff81088a80>] ? __init_kthread_worker+0x70/0x70
Signed-off-by: NHaggai Eran <haggaie@mellanox.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

8b7cce0d

IB/ucma: Fix lockdep warning in ucma_lock_files · 31b57b87

由 Haggai Eran 提交于 7月 07, 2015

The ucma_lock_files() locks the mut mutex on two files, e.g. for migrating
an ID. Use mutex_lock_nested() to prevent the warning below.

 =============================================
 [ INFO: possible recursive locking detected ]
 4.1.0-rc6-hmm+ #40 Tainted: G           O
 ---------------------------------------------
 pingpong_rpc_se/10260 is trying to acquire lock:
  (&file->mut){+.+.+.}, at: [<ffffffffa047ac55>] ucma_migrate_id+0xc5/0x248 [rdma_ucm]

 but task is already holding lock:
  (&file->mut){+.+.+.}, at: [<ffffffffa047ac4b>] ucma_migrate_id+0xbb/0x248 [rdma_ucm]

 other info that might help us debug this:
  Possible unsafe locking scenario:

        CPU0
        ----
   lock(&file->mut);
   lock(&file->mut);

  *** DEADLOCK ***

  May be due to missing lock nesting notation

 1 lock held by pingpong_rpc_se/10260:
  #0:  (&file->mut){+.+.+.}, at: [<ffffffffa047ac4b>] ucma_migrate_id+0xbb/0x248 [rdma_ucm]

 stack backtrace:
 CPU: 0 PID: 10260 Comm: pingpong_rpc_se Tainted: G           O    4.1.0-rc6-hmm+ #40
 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
  ffff8801f85b63d0 ffff880195677b58 ffffffff81668f49 0000000000000001
  ffffffff825cbbe0 ffff880195677c38 ffffffff810bb991 ffff880100000000
  ffff880100000000 ffff880100000001 ffff8801f85b7010 ffffffff8121bee9
 Call Trace:
  [<ffffffff81668f49>] dump_stack+0x4f/0x6e
  [<ffffffff810bb991>] __lock_acquire+0x741/0x1820
  [<ffffffff8121bee9>] ? dput+0x29/0x320
  [<ffffffff810bcb38>] lock_acquire+0xc8/0x240
  [<ffffffffa047ac55>] ? ucma_migrate_id+0xc5/0x248 [rdma_ucm]
  [<ffffffff8166b901>] ? mutex_lock_nested+0x291/0x3e0
  [<ffffffff8166b6d5>] mutex_lock_nested+0x65/0x3e0
  [<ffffffffa047ac55>] ? ucma_migrate_id+0xc5/0x248 [rdma_ucm]
  [<ffffffff810baeed>] ? trace_hardirqs_on+0xd/0x10
  [<ffffffff8166b66e>] ? mutex_unlock+0xe/0x10
  [<ffffffffa047ac55>] ucma_migrate_id+0xc5/0x248 [rdma_ucm]
  [<ffffffffa0478474>] ucma_write+0xa4/0xb0 [rdma_ucm]
  [<ffffffff81200674>] __vfs_write+0x34/0x100
  [<ffffffff8112427c>] ? __audit_syscall_entry+0xac/0x110
  [<ffffffff810ec055>] ? current_kernel_time+0xc5/0xe0
  [<ffffffff812aa4d3>] ? security_file_permission+0x23/0x90
  [<ffffffff8120088d>] ? rw_verify_area+0x5d/0xe0
  [<ffffffff812009bb>] vfs_write+0xab/0x120
  [<ffffffff81201519>] SyS_write+0x59/0xd0
  [<ffffffff8112427c>] ? __audit_syscall_entry+0xac/0x110
  [<ffffffff8166ffee>] system_call_fastpath+0x12/0x76
Signed-off-by: NHaggai Eran <haggaie@mellanox.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

31b57b87

RDMA/nes: Fix for incorrect recording of the MAC address · 0a691272

由 Tatyana Nikolova 提交于 7月 02, 2015

Fix for incorrect recording of the MAC address
Signed-off-by: NTatyana Nikolova <Tatyana.E.Nikolova@intel.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

0a691272

RDMA/nes: Fix for resolving the neigh · 165d6822

由 Tatyana Nikolova 提交于 7月 02, 2015

Neighbor resolution doesn't work without this fix
Signed-off-by: NTatyana Nikolova <Tatyana.E.Nikolova@intel.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

165d6822

RDMA/core: Fixes for port mapper client registration · a7f2f24c

由 Tatyana Nikolova 提交于 7月 02, 2015

Fixes to allow clients to make remove mapping requests, after
they have provided the user space service with the mapping
information, they are using when the service is restarted.

1) Adding IWPM_REG_VALID, IWPM_REG_INCOMPL and IWPM_REG_UNDEF
   registration types for the port mapper clients and functions
   to set/check the registration type.
2) If the port mapper user space service is not available to register
   the client, then its registration stays IWPM_REG_UNDEF and the
   registration isn't checked until the service becomes available
   (no mappings are possible, if the user space service isn't running).
3) After the service is restarted, the user space port mapper pid is set
   to valid and the client registration is set to IWPM_REG_INCOMPL
   to allow the client to make remove mapping requests.
Signed-off-by: NTatyana Nikolova <Tatyana.E.Nikolova@intel.com>
Reviewed-by: NSteve Wise <swise@opengridcomputing.com>
Tested-by: NSteve Wise <swise@opengridcomputing.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

a7f2f24c

IB/IPoIB: Fix bad error flow in ipoib_add_port() · 58e9cc90

由 Amir Vadai 提交于 7月 01, 2015

Error values of ib_query_port() and ib_query_device() weren't propagated
correctly. Because of that, ipoib_add_port() could return NULL value,
which escaped the IS_ERR() check in ipoib_add_one() and we crashed.
Signed-off-by: NAmir Vadai <amirv@mellanox.com>
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: NIra Weiny <ira.weiny@intel.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

58e9cc90

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功