提交 · f1430536e008cd3b70794e12c414c20d54aabec2 · openeuler / Kernel

26 3月, 2019 20 次提交

mlx4: Convert pv_id_table to XArray · f1430536

由 Matthew Wilcox 提交于 2月 20, 2019

Signed-off-by: NMatthew Wilcox <willy@infradead.org>
Acked-by: NLeon Romanovsky <leonro@mellanox.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

f1430536

mlx5: Convert mlx5_srq_table to XArray · b02a29eb

由 Matthew Wilcox 提交于 2月 20, 2019

Remove the custom spinlock as the XArray handles its own locking.
Signed-off-by: NMatthew Wilcox <willy@infradead.org>
Acked-by: NLeon Romanovsky <leonro@mellanox.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

b02a29eb

IB/hfi1: Add running average for adaptive pio · 270a9833

由 Mike Marciniszyn 提交于 2月 26, 2019

The adaptive PIO implementation only considers the current packet size
when deciding between SDMA and pio for a packet.

This causes credit return forces if small and large packets are
interleaved.

Add a running average to avoid costly credit forces so that a large
sequence of small packets is required to go below the threshold that
chooses pio.
Reviewed-by: NMichael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: NMike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

270a9833

RDMA: Use __packed annotation instead of __attribute__ ((packed)) · 19b1a294

由 Erez Alfasi 提交于 2月 25, 2019

"__attribute__" set of macros has been standardized, have became more
potentially portable and consistent code back in v2.6.21 by commit
82ddcb04 ("[PATCH] extend the set of "__attribute__" shortcut macros").
Moreover, nowadays checkpatch.pl warns about using __attribute__((packed))
instead of __packed.

This patch converts all the "__attribute__ ((packed))" annotations to
"__packed" within the RDMA subsystem.
Signed-off-by: NErez Alfasi <ereza@mellanox.com>
Signed-off-by: NLeon Romanovsky <leonro@mellanox.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

19b1a294

RDMA/hns: Delete unused variable in hns_roce_v2_modify_qp function · d0a93556

由 Lijun Ou 提交于 2月 23, 2019

The src_mac array is not used in hns_roce_v2_modify_qp function.
Signed-off-by: NLijun Ou <oulijun@huawei.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

d0a93556

RDMA/hns: Bugfix for sending with invalidate · 82342e49

由 Lijun Ou 提交于 2月 23, 2019

According to IB protocol, the send with invalidate operation will not
invalidate mr that was created through a register mr or reregister mr.

Fixes: e93df010 ("RDMA/hns: Support local invalidate for hip08 in kernel space")
Signed-off-by: NLijun Ou <oulijun@huawei.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

82342e49

RDMA/hns: Hide error print information with roce vf device · 07c2339a

由 Lijun Ou 提交于 2月 23, 2019

The driver should not print the error information when the hip08 driver
not support virtual function.
Signed-off-by: NLijun Ou <oulijun@huawei.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

07c2339a

RDMA/hns: Only assgin some fields if the relatived attr_mask is set · 5b01b243

由 Lijun Ou 提交于 2月 23, 2019

According to IB protocol, some fields of qp context are filled with
optional when the relatived attr_mask are set. The relatived attr_mask
include IB_QP_TIMEOUT, IB_QP_RETRY_CNT, IB_QP_RNR_RETRY and
IB_QP_MIN_RNR_TIMER. Besides, we move some assignments of the fields of
qp context into the outside of the specific qp state jump function.
Signed-off-by: NLijun Ou <oulijun@huawei.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

5b01b243

RDMA/hns: Update the range of raq_psn field of qp context · 834fa8cf

由 Lijun Ou 提交于 2月 23, 2019

According to hip08 UM(User Manual), the raq_psn field size is [23:0].
Signed-off-by: NLijun Ou <oulijun@huawei.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

834fa8cf

RDMA/hns: Only assign the fields of the rq psn if IB_QP_RQ_PSN is set · 601f3e6d

由 Lijun Ou 提交于 2月 23, 2019

Only when the IB_QP_RQ_PSN flags of attr_mask is set is it valid to assign
the relatived fields of rq'psn into the qp context when modified qp.
Signed-off-by: NLijun Ou <oulijun@huawei.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

601f3e6d

RDMA/hns: Only assign the relatived fields of psn if IB_QP_SQ_PSN is set · f04cc178

由 Lijun Ou 提交于 2月 23, 2019

Only when the IB_QP_SQ_PSN flags of attr_mask is set is it valid to assign
the relatived fields of psn into the qp context when modified qp.
Signed-off-by: NLijun Ou <oulijun@huawei.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

f04cc178

cxgb4: Convert stid_idr to XArray · 401b4480

由 Matthew Wilcox 提交于 2月 20, 2019

Signed-off-by: NMatthew Wilcox <willy@infradead.org>
Acked-by: NSteve Wise <swise@opengridcomputing.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

401b4480

cxgb4: Convert atid_idr to XArray · 9f5a9632

由 Matthew Wilcox 提交于 2月 20, 2019

Signed-off-by: NMatthew Wilcox <willy@infradead.org>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

9f5a9632

cxgb4: Convert hwtid_idr to XArray · f254ba6a

由 Matthew Wilcox 提交于 2月 20, 2019

Signed-off-by: NMatthew Wilcox <willy@infradead.org>
Acked-by: NSteve Wise <swise@opengridcomputing.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

f254ba6a

cxgb4: Convert mmidr to XArray · 7a268a93

由 Matthew Wilcox 提交于 2月 20, 2019

Signed-off-by: NMatthew Wilcox <willy@infradead.org>
Acked-by: NSteve Wise <swise@opengridcomputing.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

7a268a93

cxgb4: Convert qpidr to XArray · 2f431291

由 Matthew Wilcox 提交于 2月 20, 2019

Signed-off-by: NMatthew Wilcox <willy@infradead.org>
Acked-by: NSteve Wise <swise@opengridcomputing.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

2f431291

cxgb4: Convert cqidr to XArray · 52e124c2

由 Matthew Wilcox 提交于 2月 20, 2019

Signed-off-by: NMatthew Wilcox <willy@infradead.org>
Acked-by: NSteve Wise <swise@opengridcomputing.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

52e124c2

cxgb3: Convert mmidr to XArray · e64a7c02

由 Matthew Wilcox 提交于 2月 20, 2019

Signed-off-by: NMatthew Wilcox <willy@infradead.org>
Acked-by: NSteve Wise <swise@opengridcomputing.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

e64a7c02

cxgb3: Convert qpidr to XArray · 27114876

由 Matthew Wilcox 提交于 2月 20, 2019

Signed-off-by: NMatthew Wilcox <willy@infradead.org>
Acked-by: NSteve Wise <swise@opengridcomputing.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

27114876

cxgb3: Convert cqidr to XArray · a2f40971

由 Matthew Wilcox 提交于 2月 20, 2019

It would make sense to convert this to an allocating XArray and remove the
kfifo that is currently used to allocate the CQID, but that work is better
done by someone who has the hardware to test with.
Signed-off-by: NMatthew Wilcox <willy@infradead.org>
Acked-by: NSteve Wise <swise@opengridcomputing.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

a2f40971

18 3月, 2019 4 次提交

i40iw: Avoid panic when handling the inetdev event · ec4fe4bc

由 Feng Tang 提交于 3月 14, 2019

There is a panic reported that on a system with x722 ethernet, when doing
the operations like:

	# ip link add br0 type bridge
	# ip link set eno1 master br0
	# systemctl restart systemd-networkd

The system will panic "BUG: unable to handle kernel null pointer
dereference at 0000000000000034", with call chain:

	i40iw_inetaddr_event
	notifier_call_chain
	blocking_notifier_call_chain
	notifier_call_chain
	__inet_del_ifa
	inet_rtm_deladdr
	rtnetlink_rcv_msg
	netlink_rcv_skb
	rtnetlink_rcv
	netlink_unicast
	netlink_sendmsg
	sock_sendmsg
	__sys_sendto

It is caused by "local_ipaddr = ntohl(in->ifa_list->ifa_address)", while
the in->ifa_list is NULL.

So add a check for the "in->ifa_list == NULL" case, and skip the ARP
operation accordingly.
Signed-off-by: NFeng Tang <feng.tang@intel.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

ec4fe4bc

IB/mlx5: Fix mapping of link-mode to IB width and speed · cd272875

由 Aya Levin 提交于 3月 11, 2019

Add mapping of link mode: CAUI4 100Gbps CR4/KR4 with 4 lines and 25Gbps.
Fix mapping of link mode: GAUI2 50Gbps CR2/KR2 to be 2 lines with 25Gbps.

Fixes: 08e8676f ("IB/mlx5: Add support for 50Gbps per lane link modes")
Signed-off-by: NAya Levin <ayal@mellanox.com>
Signed-off-by: NLeon Romanovsky <leonro@mellanox.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

cd272875

IB/mlx5: Use mlx5 core to create/destroy a DEVX DCT · c5ae1954

由 Yishai Hadas 提交于 3月 06, 2019

To prevent a hardware memory leak when a DEVX DCT object is destroyed
without calling DRAIN DCT before, (e.g. under cleanup flow), need to
manage its creation and destruction via mlx5 core.

In that case the DRAIN DCT command will be called and only once that it
will be completed the DESTROY DCT command will be called.  Otherwise, the
DESTROY DCT may fail and a hardware leak may occur.

As of that change the DRAIN DCT command should not be exposed any more
from DEVX, it's managed internally by the driver to work as expected by
the device specification.

Fixes: 7efce369 ("IB/mlx5: Add obj create and destroy functionality")
Signed-off-by: NYishai Hadas <yishaih@mellanox.com>
Reviewed-by: NArtemy Kovalyov <artemyko@mellanox.com>
Signed-off-by: NLeon Romanovsky <leonro@mellanox.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

c5ae1954

IB/mlx4: Fix race condition between catas error reset and aliasguid flows · 587443e7

由 Jack Morgenstein 提交于 3月 06, 2019

Code review revealed a race condition which could allow the catas error
flow to interrupt the alias guid query post mechanism at random points.
Thiis is fixed by doing cancel_delayed_work_sync() instead of
cancel_delayed_work() during the alias guid mechanism destroy flow.

Fixes: a0c64a17 ("mlx4: Add alias_guid mechanism")
Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: NLeon Romanovsky <leonro@mellanox.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

587443e7

07 3月, 2019 1 次提交

IB/hfi1: Close race condition on user context disable and close · bc5add09

由 Michael J. Ruhl 提交于 2月 26, 2019

When disabling and removing a receive context, it is possible for an
asynchronous event (i.e IRQ) to occur.  Because of this, there is a race
between cleaning up the context, and the context being used by the
asynchronous event.

cpu 0  (context cleanup)
    rc->ref_count-- (ref_count == 0)
    hfi1_rcd_free()
cpu 1  (IRQ (with rcd index))
	rcd_get_by_index()
	lock
	ref_count+++     <-- reference count race (WARNING)
	return rcd
	unlock
cpu 0
    hfi1_free_ctxtdata() <-- incorrect free location
    lock
    remove rcd from array
    unlock
    free rcd

This race will cause the following WARNING trace:

WARNING: CPU: 0 PID: 175027 at include/linux/kref.h:52 hfi1_rcd_get_by_index+0x84/0xa0 [hfi1]
CPU: 0 PID: 175027 Comm: IMB-MPI1 Kdump: loaded Tainted: G OE ------------ 3.10.0-957.el7.x86_64 #1
Hardware name: Intel Corporation S2600KP/S2600KP, BIOS SE5C610.86B.11.01.0076.C4.111920150602 11/19/2015
Call Trace:
  dump_stack+0x19/0x1b
  __warn+0xd8/0x100
  warn_slowpath_null+0x1d/0x20
  hfi1_rcd_get_by_index+0x84/0xa0 [hfi1]
  is_rcv_urgent_int+0x24/0x90 [hfi1]
  general_interrupt+0x1b6/0x210 [hfi1]
  __handle_irq_event_percpu+0x44/0x1c0
  handle_irq_event_percpu+0x32/0x80
  handle_irq_event+0x3c/0x60
  handle_edge_irq+0x7f/0x150
  handle_irq+0xe4/0x1a0
  do_IRQ+0x4d/0xf0
  common_interrupt+0x162/0x162

The race can also lead to a use after free which could be similar to:

general protection fault: 0000 1 SMP
CPU: 71 PID: 177147 Comm: IMB-MPI1 Kdump: loaded Tainted: G W OE ------------ 3.10.0-957.el7.x86_64 #1
Hardware name: Intel Corporation S2600KP/S2600KP, BIOS SE5C610.86B.11.01.0076.C4.111920150602 11/19/2015
task: ffff9962a8098000 ti: ffff99717a508000 task.ti: ffff99717a508000 __kmalloc+0x94/0x230
Call Trace:
  ? hfi1_user_sdma_process_request+0x9c8/0x1250 [hfi1]
  hfi1_user_sdma_process_request+0x9c8/0x1250 [hfi1]
  hfi1_aio_write+0xba/0x110 [hfi1]
  do_sync_readv_writev+0x7b/0xd0
  do_readv_writev+0xce/0x260
  ? handle_mm_fault+0x39d/0x9b0
  ? pick_next_task_fair+0x5f/0x1b0
  ? sched_clock_cpu+0x85/0xc0
  ? __schedule+0x13a/0x890
  vfs_writev+0x35/0x60
  SyS_writev+0x7f/0x110
  system_call_fastpath+0x22/0x27

Use the appropriate kref API to verify access.

Reorder context cleanup to ensure context removal before cleanup occurs
correctly.

Cc: stable@vger.kernel.org # v4.14.0+
Fixes: f683c80c ("IB/hfi1: Resolve kernel panics by reference counting receive contexts")
Reviewed-by: NMike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: NMichael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

bc5add09

06 3月, 2019 1 次提交

mm: replace all open encodings for NUMA_NO_NODE · 98fa15f3

由 Anshuman Khandual 提交于 3月 05, 2019

Patch series "Replace all open encodings for NUMA_NO_NODE", v3.

All these places for replacement were found by running the following
grep patterns on the entire kernel code.  Please let me know if this
might have missed some instances.  This might also have replaced some
false positives.  I will appreciate suggestions, inputs and review.

1. git grep "nid == -1"
2. git grep "node == -1"
3. git grep "nid = -1"
4. git grep "node = -1"

This patch (of 2):

At present there are multiple places where invalid node number is
encoded as -1.  Even though implicitly understood it is always better to
have macros in there.  Replace these open encodings for an invalid node
number with the global macro NUMA_NO_NODE.  This helps remove NUMA
related assumptions like 'invalid node' from various places redirecting
them to a common definition.

Link: http://lkml.kernel.org/r/1545127933-10711-2-git-send-email-anshuman.khandual@arm.comSigned-off-by: NAnshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: NDavid Hildenbrand <david@redhat.com>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	[ixgbe]
Acked-by: Jens Axboe <axboe@kernel.dk>			[mtip32xx]
Acked-by: Vinod Koul <vkoul@kernel.org>			[dmaengine.c]
Acked-by: Michael Ellerman <mpe@ellerman.id.au>		[powerpc]
Acked-by: Doug Ledford <dledford@redhat.com>		[drivers/infiniband]
Cc: Joseph Qi <jiangqi903@gmail.com>
Cc: Hans Verkuil <hverkuil@xs4all.nl>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

98fa15f3

05 3月, 2019 2 次提交

RDMA/hns: Use GFP_ATOMIC in hns_roce_v2_modify_qp · 4e69cf1f

由 YueHaibing 提交于 3月 04, 2019

The the below commit, hns_roce_v2_modify_qp is called inside spinlock
while using GFP_KERNEL. Change it to GFP_ATOMIC.

Fixes: 0425e3e6 ("RDMA/hns: Support flush cqe for hip08 in kernel space")
Signed-off-by: NYueHaibing <yuehaibing@huawei.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

4e69cf1f

cxgb4: kfree mhp after the debug print · 952a3cc9

由 Shaobo He 提交于 2月 28, 2019

In function `c4iw_dealloc_mw`, variable mhp's value is printed after
freed, it is clearer to have the print before the kfree.

Otherwise racing threads could allocate another mhp with the same pointer
value and create confusing tracing.
Signed-off-by: NShaobo He <shaobo@cs.utah.edu>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

952a3cc9

04 3月, 2019 2 次提交

IB/mlx5: Set correct write permissions for implicit ODP MR · 7095ec3c

由 Moni Shoua 提交于 2月 25, 2019

The write access of an implicit MR is inherited to all of its children.
Therefore we must set the correct write access to the parent MR.

Pass full access_flags when creating umem to let it calculate write access
correctly.

Fixes: da6a496a ("IB/mlx5: Ranges in implicit ODP MR inherit its write access")
Signed-off-by: NMoni Shoua <monis@mellanox.com>
Reviewed-by: NArtemy Kovalyov <artemyko@mellanox.com>
Signed-off-by: NLeon Romanovsky <leonro@mellanox.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

7095ec3c

bnxt_re: Clean cq for kernel consumers only · 0fca467e

由 Devesh Sharma 提交于 2月 25, 2019

Kernel space provider driver should clean the CQs belonging to kernel
space consumers only. The current implementation is doing reverse of it.

Fixing the same by avoiding the call to __clean_cq on a kernel qp during
destroy.

Fixes: c50866e2 ("bnxt_re: fix the regression due to changes in alloc_pbl")
Signed-off-by: NDevesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

0fca467e

27 2月, 2019 1 次提交

net: devlink: turn devlink into a built-in · f4b6bcc7

由 Jakub Kicinski 提交于 2月 25, 2019

Being able to build devlink as a module causes growing pains.
First all drivers had to add a meta dependency to make sure
they are not built in when devlink is built as a module.  Now
we are struggling to invoke ethtool compat code reliably.

Make devlink code built-in, users can still not build it at
all but the dynamically loadable module option is removed.
Signed-off-by: NJakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
Acked-by: NJiri Pirko <jiri@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f4b6bcc7

23 2月, 2019 3 次提交

RDMA: Handle ucontext allocations by IB/core · a2a074ef

由 Leon Romanovsky 提交于 2月 12, 2019

Following the PD conversion patch, do the same for ucontext allocations.
Signed-off-by: NLeon Romanovsky <leonro@mellanox.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

a2a074ef

bnxt_re: fix the regression due to changes in alloc_pbl · c50866e2

由 Devesh Sharma 提交于 2月 22, 2019

While adding the use of for_each_sg_dma_page iterator for Brodcom's rdma
driver, there was a regression added in the __alloc_pbl path. The change
left bnxt_re in DOA state in for-next branch.

Fixing the regression to avoid the host crash when a user space object is
created. Restricting the unconditional access to hwq.pg_arr when hwq is
initialized for user space objects.

Fixes: 161ebe24 ("RDMA/bnxt_re: Use for_each_sg_dma_page iterator on umem SGL")
Reported-by: NGal Pressman <galpress@amazon.com>
Signed-off-by: NSelvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: NDevesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

c50866e2

IB/mlx4: Increase the timeout for CM cache · 2612d723

由 Håkon Bugge 提交于 2月 17, 2019

Using CX-3 virtual functions, either from a bare-metal machine or
pass-through from a VM, MAD packets are proxied through the PF driver.

Since the VF drivers have separate name spaces for MAD Transaction Ids
(TIDs), the PF driver has to re-map the TIDs and keep the book keeping
in a cache.

Following the RDMA Connection Manager (CM) protocol, it is clear when
an entry has to evicted form the cache. But life is not perfect,
remote peers may die or be rebooted. Hence, it's a timeout to wipe out
a cache entry, when the PF driver assumes the remote peer has gone.

During workloads where a high number of QPs are destroyed concurrently,
excessive amount of CM DREQ retries has been observed

The problem can be demonstrated in a bare-metal environment, where two
nodes have instantiated 8 VFs each. This using dual ported HCAs, so we
have 16 vPorts per physical server.

64 processes are associated with each vPort and creates and destroys
one QP for each of the remote 64 processes. That is, 1024 QPs per
vPort, all in all 16K QPs. The QPs are created/destroyed using the
CM.

When tearing down these 16K QPs, excessive CM DREQ retries (and
duplicates) are observed. With some cat/paste/awk wizardry on the
infiniband_cm sysfs, we observe as sum of the 16 vPorts on one of the
nodes:

cm_rx_duplicates:
      dreq  2102
cm_rx_msgs:
      drep  1989
      dreq  6195
       rep  3968
       req  4224
       rtu  4224
cm_tx_msgs:
      drep  4093
      dreq 27568
       rep  4224
       req  3968
       rtu  3968
cm_tx_retries:
      dreq 23469

Note that the active/passive side is equally distributed between the
two nodes.

Enabling pr_debug in cm.c gives tons of:

[171778.814239] <mlx4_ib> mlx4_ib_multiplex_cm_handler: id{slave:
1,sl_cm_id: 0xd393089f} is NULL!

By increasing the CM_CLEANUP_CACHE_TIMEOUT from 5 to 30 seconds, the
tear-down phase of the application is reduced from approximately 90 to
50 seconds. Retries/duplicates are also significantly reduced:

cm_rx_duplicates:
      dreq  2460
[]
cm_tx_retries:
      dreq  3010
       req    47

Increasing the timeout further didn't help, as these duplicates and
retries stems from a too short CMA timeout, which was 20 (~4 seconds)
on the systems. By increasing the CMA timeout to 22 (~17 seconds), the
numbers fell down to about 10 for both of them.

Adjustment of the CMA timeout is not part of this commit.
Signed-off-by: NHåkon Bugge <haakon.bugge@oracle.com>
Acked-by: NJack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

2612d723

22 2月, 2019 3 次提交

IB/mlx5: Validate correct PD before prefetch MR · 81dd4c4b

由 Moni Shoua 提交于 2月 17, 2019

When prefetching odp mr it is required to verify that pd of the mr is
identical to the pd for which the advise_mr request arrived with.

This check was missing from synchronous flow and is added now.

Fixes: 813e90b1 ("IB/mlx5: Add advise_mr() support")
Reported-by: NParav Pandit <parav@mellanox.com>
Signed-off-by: NMoni Shoua <monis@mellanox.com>
Signed-off-by: NLeon Romanovsky <leonro@mellanox.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

81dd4c4b

IB/mlx5: Protect against prefetch of invalid MR · a6bc3875

由 Moni Shoua 提交于 2月 17, 2019

When deferring a prefetch request we need to protect against MR or PD
being destroyed while the request is still enqueued.

The first step is to validate that PD owns the lkey that describes the MR
and that the MR that the lkey refers to is owned by that PD.

The second step is to dequeue all requests when MR is destroyed.

Since PD can't be destroyed while it owns MRs it is guaranteed that when a
worker wakes up the request it refers to is still valid.

Now, it is possible to refrain from taking a reference on the device since
it is assured to be present as pd.

While that, replace the dedicated ordered workqueue with the system
unbound workqueue to reuse an existing resource and improve
performance. This will also fix a bug of queueing to the wrong workqueue.

Fixes: 813e90b1 ("IB/mlx5: Add advise_mr() support")
Reported-by: NParav Pandit <parav@mellanox.com>
Signed-off-by: NMoni Shoua <monis@mellanox.com>
Signed-off-by: NLeon Romanovsky <leonro@mellanox.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

a6bc3875

IB/hfi1: Add missing break in switch statement · 7264235e

由 Gustavo A. R. Silva 提交于 2月 20, 2019

Fix the following warning by adding a missing break:

drivers/infiniband/hw/hfi1/tid_rdma.c: In function ‘hfi1_tid_rdma_wqe_interlock’:
drivers/infiniband/hw/hfi1/tid_rdma.c:3251:3: warning: this statement may fall through [-Wimplicit-fallthrough=]
   switch (prev->wr.opcode) {
   ^~~~~~
drivers/infiniband/hw/hfi1/tid_rdma.c:3259:2: note: here
  case IB_WR_RDMA_READ:
  ^~~~

Warning level 3 was used: -Wimplicit-fallthrough=3

This patch is part of the ongoing efforts to enable
-Wimplicit-fallthrough.

Fixes: c6c23117 ("IB/hfi1: Add interlock between TID RDMA WRITE and other requests")
Signed-off-by: NGustavo A. R. Silva <gustavo@embeddedor.com>
Reviewed-by: NKaike Wan <Kaike.wan@intel.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

7264235e

21 2月, 2019 1 次提交

drivers/IB,qib: Fix pinned/locked limit check in qib_get_user_pages() · ec95e0fa

由 Davidlohr Bueso 提交于 2月 16, 2019

The current check does not take into account the previous value of
pinned_vm; thus it is quite bogus as is. Fix this by checking the
new value after the (optimistic) atomic inc.
Signed-off-by: NDavidlohr Bueso <dbueso@suse.de>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

ec95e0fa

20 2月, 2019 2 次提交

iw_cxgb4: Make function read_tcb() static · 3b8f8b95

由 Wei Yongjun 提交于 2月 18, 2019

Fixes the following sparse warning:

drivers/infiniband/hw/cxgb4/cm.c:658:6: warning:
 symbol 'read_tcb' was not declared. Should it be static?

Fixes: 11a27e21 ("iw_cxgb4: complete the cached SRQ buffers")
Signed-off-by: NWei Yongjun <weiyongjun1@huawei.com>
Acked-by: NRaju Rangoju <rajur@chelsio.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

3b8f8b95

RDMA/hns: Bugfix for set hem of SCC · 6ac16e40

由 Yangyang Li 提交于 2月 16, 2019

The method of set hem for scc context is different from other contexts. It
should notify the hardware with the detailed idx in bt0 for scc, while for
other contexts, it only need to notify the bt step and the hardware will
calculate the idx.

Here fixes the following error when unloading the hip08 driver:

[  123.570768] {1}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 0
[  123.579023] {1}[Hardware Error]: event severity: recoverable
[  123.584670] {1}[Hardware Error]:  Error 0, type: recoverable
[  123.590317] {1}[Hardware Error]:   section_type: PCIe error
[  123.595877] {1}[Hardware Error]:   version: 4.0
[  123.600395] {1}[Hardware Error]:   command: 0x0006, status: 0x0010
[  123.606562] {1}[Hardware Error]:   device_id: 0000:7d:00.0
[  123.612034] {1}[Hardware Error]:   slot: 0
[  123.616120] {1}[Hardware Error]:   secondary_bus: 0x00
[  123.621245] {1}[Hardware Error]:   vendor_id: 0x19e5, device_id: 0xa222
[  123.627847] {1}[Hardware Error]:   class_code: 000002
[  123.632977] hns3 0000:7d:00.0: aer_status: 0x00000000, aer_mask: 0x00000000
[  123.639928] hns3 0000:7d:00.0: aer_layer=Transaction Layer, aer_agent=Receiver ID
[  123.647400] hns3 0000:7d:00.0: aer_uncor_severity: 0x00000000
[  123.653136] hns3 0000:7d:00.0: PCI error detected, state(=1)!!
[  123.658959] hns3 0000:7d:00.0: ROCEE uncorrected RAS error identified
[  123.665395] hns3 0000:7d:00.0: ROCEE RAS AXI rresp error
[  123.670713] hns3 0000:7d:00.0: requesting reset due to PCI error
[  123.676715] hns3 0000:7d:00.0: received reset event , reset type is 5
[  123.683147] hns3 0000:7d:00.0: AER: Device recovery successful
[  123.688978] hns3 0000:7d:00.0: PF Reset requested
[  123.693684] hns3 0000:7d:00.0: PF failed(=-5) to send mailbox message to VF
[  123.700633] hns3 0000:7d:00.0: inform reset to vf(1) failded -5!

Fixes: 6a157f7d ("RDMA/hns: Add SCC context allocation support for hip08")
Signed-off-by: NYangyang Li <liyangyang20@huawei.com>
Reviewed-by: NYixian Liu <liuyixian@huawei.com>
Reviewed-by: NLijun Ou <oulijun@huawei.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

6ac16e40

openeuler / Kernel 接近 2 年 前同步成功

openeuler / Kernel
接近 2 年前同步成功