提交 · 46824288a5fc7b7e5166be2b6c303d98a41bd631 · openanolis / cloud-kernel

13 12月, 2019 3 次提交

IB/hfi1: Ignore LNI errors before DC8051 transitions to Polling state · 46824288

由 Kaike Wan 提交于 11月 28, 2018

[ Upstream commit c1a797c0818e0122c7ec8422edd971cfec9b15ea ]

When it is requested to change its physical state back to Offline while in
the process to go up, DC8051 will set the ERROR field in the
DC8051_DBG_ERR_INFO_SET_BY_8051 register. This ERROR field will remain
until the next time when DC8051 transitions from Offline to Polling.
Subsequently, when the host requests DC8051 to change its physical state
to Polling again, it may receive a DC8051 interrupt with the stale ERROR
field still in DC8051_DBG_ERR_INFO_SET_BY_8051. If the host link state has
been changed to Polling, this stale ERROR will force the host to
transition to Offline state, resulting in a vicious cycle of Polling
->Offline->Polling->Offline. On the other hand, if the host link state is
still Offline when the stale ERROR is received, the stale ERROR will be
ignored, and the link will come up correctly.  This patch implements the
correct behavior by changing host link state to Polling only after DC8051
changes its physical state to Polling.
Reviewed-by: NMike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: NKrzysztof Goreczny <krzysztof.goreczny@intel.com>
Signed-off-by: NKaike Wan <kaike.wan@intel.com>
Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
Signed-off-by: NSasha Levin <sashal@kernel.org>

46824288

iw_cxgb4: only reconnect with MPAv1 if the peer aborts · cd69bcc0

由 Steve Wise 提交于 11月 10, 2018

[ Upstream commit 9828ca654b52848e7eb7dcc9b0994ff130dd4546 ]

Only retry connection setup with MPAv1 if the peer actually aborted the
connection upon receiving the MPAv2 start message.  This avoids retrying
with MPAv1 in the case where the connection was aborted due to retransmit
timeouts.

Fixes: d2fe99e8 ("RDMA/cxgb4: Add support for MPAv2 Enhanced RDMA Negotiation")
Signed-off-by: NSteve Wise <swise@opengridcomputing.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
Signed-off-by: NSasha Levin <sashal@kernel.org>

cd69bcc0

RDMA/hns: Correct the value of HNS_ROCE_HEM_CHUNK_LEN · 65e5e991

由 Sirong Wang 提交于 11月 01, 2019

[ Upstream commit 531eb45b3da4267fc2a64233ba256c8ffb02edd2 ]

Size of pointer to buf field of struct hns_roce_hem_chunk should be
considered when calculating HNS_ROCE_HEM_CHUNK_LEN, or sg table size will
be larger than expected when allocating hem.

Fixes: 9a443537 ("IB/hns: Add driver files for hns RoCE driver")
Link: https://lore.kernel.org/r/1572575610-52530-2-git-send-email-liweihang@hisilicon.comSigned-off-by: NSirong Wang <wangsirong@huawei.com>
Signed-off-by: NWeihang Li <liweihang@hisilicon.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
Signed-off-by: NSasha Levin <sashal@kernel.org>

65e5e991

05 12月, 2019 9 次提交

RDMA/hns: Use GFP_ATOMIC in hns_roce_v2_modify_qp · ec277112

由 YueHaibing 提交于 3月 04, 2019

[ Upstream commit 4e69cf1fe2c52d189acdd06c1fd99cc258aba61f ]

The the below commit, hns_roce_v2_modify_qp is called inside spinlock
while using GFP_KERNEL. Change it to GFP_ATOMIC.

Fixes: 0425e3e6 ("RDMA/hns: Support flush cqe for hip08 in kernel space")
Signed-off-by: NYueHaibing <yuehaibing@huawei.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
Signed-off-by: NSasha Levin <sashal@kernel.org>

ec277112

RDMA/hns: Fix the state of rereg mr · cc84d7b1

由 Yixian Liu 提交于 2月 03, 2019

[ Upstream commit ab22bf05216a6bb4812448f3a8609489047cf311 ]

The state of mr after reregister operation should be set to valid
state. Otherwise, it will keep the same as the state before reregistered.
Signed-off-by: NYixian Liu <liuyixian@huawei.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
Signed-off-by: NSasha Levin <sashal@kernel.org>

cc84d7b1

RDMA/hns: Bugfix for the scene without receiver queue · 53ac1992

由 Lijun Ou 提交于 12月 12, 2018

[ Upstream commit 4d103905eb1e4f14cb62fcf962c9d35da7005dea ]

In some application scenario, the user could not have receive queue when
run rdma write or read operation.
Signed-off-by: NLijun Ou <oulijun@huawei.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
Signed-off-by: NSasha Levin <sashal@kernel.org>

53ac1992

RDMA/hns: Fix the bug with updating rq head pointer when flush cqe · c6657600

由 Lijun Ou 提交于 12月 12, 2018

[ Upstream commit 9c6ccc035c209dda07685e8dba829a203ba17499 ]

When flush cqe with srq, the driver disable to update the rq head pointer
into the hardware.
Signed-off-by: NLijun Ou <oulijun@huawei.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
Signed-off-by: NSasha Levin <sashal@kernel.org>

c6657600

infiniband/qedr: Potential null ptr dereference of qp · 77a9b90d

由 Aditya Pakki 提交于 12月 24, 2018

[ Upstream commit 9c6260de505b63638dd86fcc33849b17f6146d94 ]

idr_find() may fail and return a NULL pointer. The fix checks the return
value of the function and returns an error in case of NULL.
Signed-off-by: NAditya Pakki <pakki001@umn.edu>
Acked-by: NMichal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
Signed-off-by: NSasha Levin <sashal@kernel.org>

77a9b90d

infiniband: bnxt_re: qplib: Check the return value of send_message · 6a9d929b

由 Aditya Pakki 提交于 12月 26, 2018

[ Upstream commit 94edd87a1c59f3efa6fdf4e98d6d492e6cec6173 ]

In bnxt_qplib_map_tc2cos(), bnxt_qplib_rcfw_send_message() can return an
error value but it is lost. Propagate this error to the callers.
Signed-off-by: NAditya Pakki <pakki001@umn.edu>
Acked-By: NDevesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
Signed-off-by: NSasha Levin <sashal@kernel.org>

6a9d929b

IB/qib: Fix an error code in qib_sdma_verbs_send() · 360e6fe7

由 Dan Carpenter 提交于 12月 17, 2018

[ Upstream commit 5050ae5fa3d54c8e83e1e447cc7e3591110a7f57 ]

We accidentally return success on this error path.

Fixes: f931551b ("IB/qib: Add new qib driver for QLogic PCIe InfiniBand adapters")
Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
Signed-off-by: NSasha Levin <sashal@kernel.org>

360e6fe7

RDMA/vmw_pvrdma: Use atomic memory allocation in create AH · d2e3e3c3

由 Gal Pressman 提交于 12月 10, 2018

[ Upstream commit a276a4d93bf1580d737f38d1810e5f4b166f3edd ]

Create address handle callback should not sleep, use GFP_ATOMIC instead of
GFP_KERNEL for memory allocation.

Fixes: 29c8d9eb ("IB: Add vmw_pvrdma driver")
Cc: Adit Ranadive <aditr@vmware.com>
Signed-off-by: NGal Pressman <galpress@amazon.com>
Reviewed-by: NYuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
Signed-off-by: NSasha Levin <sashal@kernel.org>

d2e3e3c3

RDMA/hns: Fix the bug while use multi-hop of pbl · 2ec10345

由 Lijun Ou 提交于 12月 08, 2018

[ Upstream commit 4af07f01f7a787ba5158352b98c9e3cb74995a1c ]

It will prevent multiply overflow when defines the pbl for u64 type.
Signed-off-by: NLijun Ou <oulijun@huawei.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
Signed-off-by: NSasha Levin <sashal@kernel.org>

2ec10345

01 12月, 2019 3 次提交

RDMA/bnxt_re: Avoid resource leak in case the NQ registration fails · fae3cf88

由 Selvin Xavier 提交于 10月 08, 2018

[ Upstream commit 5df950994934814a8b91f0cf9f653842d2ba082d ]

In case the NQ alloc/enable fails, free up the already allocated/enabled
NQ before reporting failure. Also, track the alloc/enable using proper
state checking.
Signed-off-by: NSelvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
Signed-off-by: NSasha Levin <sashal@kernel.org>

fae3cf88

RDMA/bnxt_re: Fix qp async event reporting · b1bf1e42

由 Devesh Sharma 提交于 10月 08, 2018

[ Upstream commit 4c01f2e3a906a0d2d798be5751c331cf501bc129 ]

Reports affiliated async event on the qp-async event channel instead of
global event channel.
Signed-off-by: NDevesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: NSelvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
Signed-off-by: NSasha Levin <sashal@kernel.org>

b1bf1e42

RDMA/bnxt_re: Avoid NULL check after accessing the pointer · 2f241e33

由 Selvin Xavier 提交于 10月 08, 2018

[ Upstream commit eae4ad1b0c9a77ef0cbac212d58d46976eaacfc1 ]

This is reported by smatch check.  rcfw->creq_bar_reg_iomem is accessed in
bnxt_qplib_rcfw_stop_irq and this variable check afterwards doesn't make
sense.  Also, rcfw->creq_bar_reg_iomem will never be NULL.  So Removing
this check.
Reported-by: NDan Carpenter <dan.carpenter@oracle.com>
Fixes: 6e04b103 ("RDMA/bnxt_re: Fix broken RoCE driver due to recent L2 driver changes")
Signed-off-by: NSelvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
Signed-off-by: NSasha Levin <sashal@kernel.org>

2f241e33

24 11月, 2019 6 次提交

RDMA/hns: Limit the size of extend sge of sq · d5d78049

由 Lijun Ou 提交于 9月 30, 2018

[ Upstream commit 05ad5482a5904c416bcfd74afd7193e206e563ce ]

The hip08 split two hardware version. The version id are 0x20 and 0x21
according to the PCI revison. The max size of extend sge of sq is limited
to 2M for 0x20 version and 8M for 0x21 version. It may be exceeded to 2M
according to the algorithm that compute the product of wqe count and
extend sge number of every wqe. But the product always less than 8M.
Signed-off-by: NLijun Ou <oulijun@huawei.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
Signed-off-by: NSasha Levin <sashal@kernel.org>

d5d78049

RDMA/hns: Bugfix for CM test · 010af7a8

由 Lijun Ou 提交于 9月 30, 2018

[ Upstream commit 15fc056fba7b17b9abfbe80a12f188403fc949fb ]

It will print the warning when the MSB bit of SLID is not zero running
cm_req_handler function that test CM. It needs to fixed zero when test
RoCE device.
Signed-off-by: NLijun Ou <oulijun@huawei.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
Signed-off-by: NSasha Levin <sashal@kernel.org>

010af7a8

RDMA/hns: Submit bad wr when post send wr exception · 5b7064ad

由 Lijun Ou 提交于 9月 30, 2018

[ Upstream commit c80e066100b5fed722c8da67c1bd2312e7bcf129 ]

When user issues a RDMA read and enables sq inline, it needs to report a
bad wr to user.
Signed-off-by: NLijun Ou <oulijun@huawei.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
Signed-off-by: NSasha Levin <sashal@kernel.org>

5b7064ad

RDMA/hns: Bugfix for reserved qp number · af762655

由 Lijun Ou 提交于 9月 30, 2018

[ Upstream commit 06ef0ee4b569101f3a07ce08335dbf29fd1404ef ]

It needs to include two special qps for every port. The hip08 have four
ports and the all reserved qp numbers are eight.
Signed-off-by: NLijun Ou <oulijun@huawei.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
Signed-off-by: NSasha Levin <sashal@kernel.org>

af762655

IB/mthca: Fix error return code in __mthca_init_one() · e3db306d

由 Wei Yongjun 提交于 9月 29, 2018

[ Upstream commit 39f2495618c5e980d2873ea3f2d1877dd253e07a ]

Fix to return a negative error code from the mthca_cmd_init() error
handling case instead of 0, as done elsewhere in this function.

Fixes: 80fd8238 ("[PATCH] IB/mthca: Encapsulate command interface init")
Signed-off-by: NWei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
Signed-off-by: NSasha Levin <sashal@kernel.org>

e3db306d

IB/hfi1: Error path MAD response size is incorrect · e0dee1c8

由 Michael J. Ruhl 提交于 9月 28, 2018

[ Upstream commit 935c84ac649a147e1aad2c48ee5c5a1a9176b2d0 ]

If a MAD packet has incorrect header information, the logic uses the reply
path to report the error.  The reply path expects *resp_len to be set
prior to return.  Unfortunately, *resp_len is set to 0 for this path.
This causes an incorrect response packet.

Fix by ensuring that the *resp_len is defaulted to the incoming packet
size (wc->bytes_len - sizeof(GRH)).
Reviewed-by: NMike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: NMichael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
Signed-off-by: NSasha Levin <sashal@kernel.org>

e0dee1c8

21 11月, 2019 9 次提交

RDMA: Fix dependencies for rdma_user_mmap_io · 352668d3

由 Arnd Bergmann 提交于 9月 26, 2018

[ Upstream commit 46bdf777685677c1cc6b3da9220aace9da690731 ]

The mlx4 driver produces a link error when it is configured
as built-in while CONFIG_INFINIBAND_USER_ACCESS is set to =m:

drivers/infiniband/hw/mlx4/main.o: In function `mlx4_ib_mmap':
main.c:(.text+0x1af4): undefined reference to `rdma_user_mmap_io'

The same function is called from mlx5, which already has a
dependency to ensure we can call it, and from hns, which
appears to suffer from the same problem.

This adds the same dependency that mlx5 uses to the other two.

Fixes: 6745d356ab39 ("RDMA/hns: Use rdma_user_mmap_io")
Fixes: c282da4109e4 ("RDMA/mlx4: Use rdma_user_mmap_io")
Signed-off-by: NArnd Bergmann <arnd@arndb.de>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
Signed-off-by: NSasha Levin <sashal@kernel.org>

352668d3

iw_cxgb4: Use proper enumerated type in c4iw_bar2_addrs · 2f9d0f70

由 Nathan Chancellor 提交于 9月 24, 2018

[ Upstream commit 1b571086e869395b6a11ab24186b0104fe05c057 ]

Clang warns when one enumerated type is implicitly converted to another.

drivers/infiniband/hw/cxgb4/qp.c:287:8: warning: implicit conversion
from enumeration type 'enum t4_bar2_qtype' to different enumeration type
'enum cxgb4_bar2_qtype' [-Wenum-conversion]
                                                 T4_BAR2_QTYPE_EGRESS,
                                                 ^~~~~~~~~~~~~~~~~~~~

c4iw_bar2_addrs expects a value from enum cxgb4_bar2_qtype so use the
corresponding values from that type so Clang is satisfied without changing
the meaning of the code.

T4_BAR2_QTYPE_EGRESS = CXGB4_BAR2_QTYPE_EGRESS = 0
T4_BAR2_QTYPE_INGRESS = CXGB4_BAR2_QTYPE_INGRESS = 1
Reported-by: NNick Desaulniers <ndesaulniers@google.com>
Signed-off-by: NNathan Chancellor <natechancellor@gmail.com>
Acked-by: NSteve Wise <swise@opengridcomputing.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
Signed-off-by: NSasha Levin <sashal@kernel.org>

2f9d0f70

RDMA/i40iw: Fix incorrect iterator type · 7cfb3b04

由 Håkon Bugge 提交于 9月 17, 2018

[ Upstream commit 802fa45cd320de319e86c93bca72abec028ba059 ]

Commit f27b4746 ("i40iw: add connection management code") uses an
incorrect rcu iterator, whilst holding the rtnl_lock. Since the
critical region invokes i40iw_manage_qhash(), which is a sleeping
function, the rcu locking and traversal cannot be used.
Signed-off-by: NHåkon Bugge <haakon.bugge@oracle.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
Signed-off-by: NSasha Levin <sashal@kernel.org>

7cfb3b04

IB/hfi1: Missing return value in error path for user sdma · 93ae4ded

由 Michael J. Ruhl 提交于 9月 10, 2018

[ Upstream commit 2bf4b33f83dfe521c4c7c407b6b150aeec04d69c ]

If the set_txreq_header_agh() function returns an error, the exit path
is chosen.

In this path, the code fails to set the return value.  This will cause
the caller to not realize an error has occurred.

Set the return value correctly in the error path.
Signed-off-by: NMichael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
Signed-off-by: NSasha Levin <sashal@kernel.org>

93ae4ded

RDMA/hns: Fix an error code in hns_roce_v2_init_eq_table() · 45d0ddf9

由 Dan Carpenter 提交于 9月 10, 2018

[ Upstream commit f1a315420e79fe5c077fa119db9439ffabd2cda2 ]

The error code isn't set on this path.
Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
Signed-off-by: NSasha Levin <sashal@kernel.org>

45d0ddf9

IB/mlx5: Don't hold spin lock while checking device state · 2543eeba

由 Parav Pandit 提交于 8月 28, 2018

[ Upstream commit 6c75520f7e5a6a353f3b332509d205e213d05855 ]

mdev->state device state is not protected by the QP for which WRs are
being processed. Therefore, there is no need to hold spin lock while
checking mdev state.

Given that device fatal error is unlikely situation, wrap the condition
check with unlikely().

Additionally, kernel QP1 is also a kernel ULP for which soft CQEs needs
to be generated. Therefore, check for device fatal error before
processing QP1 work requests.

Fixes: 89ea94a7 ("IB/mlx5: Reset flow support for IB kernel ULPs")
Signed-off-by: NParav Pandit <parav@mellanox.com>
Reviewed-by: NDaniel Jurgens <danielj@mellanox.com>
Reviewed-by: NMaor Gottlieb <maorg@mellanox.com>
Signed-off-by: NLeon Romanovsky <leonro@mellanox.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
Signed-off-by: NSasha Levin <sashal@kernel.org>

2543eeba

IB/mlx5: Change TX affinity assignment in RoCE LAG mode · f5ee703f

由 Majd Dibbiny 提交于 8月 28, 2018

[ Upstream commit c6a21c3864fc7f5febae7d096cd136f397c791f2 ]

In the current code, the TX affinity is per RoCE device, which can cause
unfairness between different contexts. e.g. if we open two contexts, and
each open 10 QPs concurrently, all of the QPs of the first context might
end up on the first port instead of distributed on the two ports as
expected

To overcome this unfairness between processes, we maintain per device TX
affinity, and per process TX affinity.

The allocation algorithm is as follow:

1. Hold two tx_port_affinity atomic variables, one per RoCE device and one
   per ucontext. Both initialized to 0.

2. In mlx5_ib_alloc_ucontext do:
 2.1. ucontext.tx_port_affinity = device.tx_port_affinity
 2.2. device.tx_port_affinity += 1

3. In modify QP INIT2RST:
 3.1. qp.tx_port_affinity = ucontext.tx_port_affinity % MLX5_PORT_NUM
 3.2. ucontext.tx_port_affinity += 1
Signed-off-by: NMajd Dibbiny <majd@mellanox.com>
Reviewed-by: NMoni Shoua <monis@mellanox.com>
Signed-off-by: NLeon Romanovsky <leonro@mellanox.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
Signed-off-by: NSasha Levin <sashal@kernel.org>

f5ee703f

IB/hfi1: Use a common pad buffer for 9B and 16B packets · 9ace24bb

由 Mike Marciniszyn 提交于 10月 04, 2019

commit 22bb13653410424d9fce8d447506a41f8292f22f upstream.

There is no reason for a different pad buffer for the two
packet types.

Expand the current buffer allocation to allow for both
packet types.

Fixes: f8195f3b ("IB/hfi1: Eliminate allocation while atomic")
Reported-by: NDan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: NKaike Wan <kaike.wan@intel.com>
Reviewed-by: NDennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: NMike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com>
Link: https://lore.kernel.org/r/20191004204934.26838.13099.stgit@awfm-01.aw.intel.comSigned-off-by: NDoug Ledford <dledford@redhat.com>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

9ace24bb

IB/hfi1: Ensure full Gen3 speed in a Gen4 system · 6ec4a549

由 James Erwin 提交于 11月 01, 2019

commit a9c3c4c597704b3a1a2b9bef990e7d8a881f6533 upstream.

If an hfi1 card is inserted in a Gen4 systems, the driver will avoid the
gen3 speed bump and the card will operate at half speed.

This is because the driver avoids the gen3 speed bump when the parent bus
speed isn't identical to gen3, 8.0GT/s.  This is not compatible with gen4
and newer speeds.

Fix by relaxing the test to explicitly look for the lower capability
speeds which inherently allows for gen4 and all future speeds.

Fixes: 77241056 ("IB/hfi1: add driver files")
Link: https://lore.kernel.org/r/20191101192059.106248.1699.stgit@awfm-01.aw.intel.com
Cc: <stable@vger.kernel.org>
Reviewed-by: NDennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: NKaike Wan <kaike.wan@intel.com>
Signed-off-by: NJames Erwin <james.erwin@intel.com>
Signed-off-by: NMike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

6ec4a549

13 11月, 2019 5 次提交

RDMA/hns: Prevent memory leaks of eq->buf_list · f2bab3ed

由 Lijun Ou 提交于 10月 26, 2019

[ Upstream commit b681a0529968d2261aa15d7a1e78801b2c06bb07 ]

eq->buf_list->buf and eq->buf_list should also be freed when eqe_hop_num
is set to 0, or there will be memory leaks.

Fixes: a5073d60 ("RDMA/hns: Add eq support of hip08")
Link: https://lore.kernel.org/r/1572072995-11277-3-git-send-email-liweihang@hisilicon.comSigned-off-by: NLijun Ou <oulijun@huawei.com>
Signed-off-by: NWeihang Li <liweihang@hisilicon.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
Signed-off-by: NSasha Levin <sashal@kernel.org>

f2bab3ed

RDMA/iw_cxgb4: Avoid freeing skb twice in arp failure case · 55ca0834

由 Potnuri Bharat Teja 提交于 10月 25, 2019

[ Upstream commit d4934f45693651ea15357dd6c7c36be28b6da884 ]

_put_ep_safe() and _put_pass_ep_safe() free the skb before it is freed by
process_work(). fix double free by freeing the skb only in process_work().

Fixes: 1dad0ebe ("iw_cxgb4: Avoid touch after free error in ARP failure handlers")
Link: https://lore.kernel.org/r/1572006880-5800-1-git-send-email-bharat@chelsio.comSigned-off-by: NDakshaja Uppalapati <dakshaja@chelsio.com>
Signed-off-by: NPotnuri Bharat Teja <bharat@chelsio.com>
Reviewed-by: NJason Gunthorpe <jgg@mellanox.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
Signed-off-by: NSasha Levin <sashal@kernel.org>

55ca0834

RDMA/qedr: Fix reported firmware version · 48dd7128

由 Kamal Heib 提交于 10月 08, 2019

[ Upstream commit b806c94ee44e53233b8ce6c92d9078d9781786a5 ]

Remove spaces from the reported firmware version string.
Actual value:
$ cat /sys/class/infiniband/qedr0/fw_ver
8. 37. 7. 0

Expected value:
$ cat /sys/class/infiniband/qedr0/fw_ver
8.37.7.0

Fixes: ec72fce4 ("qedr: Add support for RoCE HW init")
Signed-off-by: NKamal Heib <kamalheib1@gmail.com>
Acked-by: Michal Kalderon <michal.kalderon@marvell.com>
Link: https://lore.kernel.org/r/20191007210730.7173-1-kamalheib1@gmail.comSigned-off-by: NDoug Ledford <dledford@redhat.com>
Signed-off-by: NSasha Levin <sashal@kernel.org>

48dd7128

iw_cxgb4: fix ECN check on the passive accept · 6208c2bf

由 Potnuri Bharat Teja 提交于 10月 03, 2019

[ Upstream commit 612e0486ad0845c41ac10492e78144f99e326375 ]

pass_accept_req() is using the same skb for handling accept request and
sending accept reply to HW. Here req and rpl structures are pointing to
same skb->data which is over written by INIT_TP_WR() and leads to
accessing corrupt req fields in accept_cr() while checking for ECN flags.
Reordered code in accept_cr() to fetch correct req fields.

Fixes: 92e7ae71 ("iw_cxgb4: Choose appropriate hw mtu index and ISS for iWARP connections")
Signed-off-by: NPotnuri Bharat Teja <bharat@chelsio.com>
Link: https://lore.kernel.org/r/20191003104353.11590-1-bharat@chelsio.comSigned-off-by: NDoug Ledford <dledford@redhat.com>
Signed-off-by: NSasha Levin <sashal@kernel.org>

6208c2bf

RDMA/mlx5: Clear old rate limit when closing QP · 89aa9e26

由 Rafi Wiener 提交于 10月 02, 2019

[ Upstream commit c8973df2da677f375f8b12b6eefca2f44c8884d5 ]

Before QP is closed it changes to ERROR state, when this happens
the QP was left with old rate limit that was already removed from
the table.

Fixes: 7d29f349 ("IB/mlx5: Properly adjust rate limit on QP state transitions")
Signed-off-by: NRafi Wiener <rafiw@mellanox.com>
Signed-off-by: NOleg Kuporosov <olegk@mellanox.com>
Signed-off-by: NLeon Romanovsky <leonro@mellanox.com>
Link: https://lore.kernel.org/r/20191002120243.16971-1-leon@kernel.orgSigned-off-by: NDoug Ledford <dledford@redhat.com>
Signed-off-by: NSasha Levin <sashal@kernel.org>

89aa9e26

06 11月, 2019 1 次提交

RDMA/hfi1: Prevent memory leak in sdma_init · 962cff4f

由 Navid Emamdoost 提交于 9月 25, 2019

[ Upstream commit 34b3be18a04ecdc610aae4c48e5d1b799d8689f6 ]

In sdma_init if rhashtable_init fails the allocated memory for
tmp_sdma_rht should be released.

Fixes: 5a52a7ac ("IB/hfi1: NULL pointer dereference when freeing rhashtable")
Link: https://lore.kernel.org/r/20190925144543.10141-1-navid.emamdoost@gmail.comSigned-off-by: NNavid Emamdoost <navid.emamdoost@gmail.com>
Acked-by: NDennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
Signed-off-by: NSasha Levin <sashal@kernel.org>

962cff4f

29 10月, 2019 1 次提交

RDMA/cxgb4: Do not dma memory off of the stack · 27414f90

由 Greg KH 提交于 10月 01, 2019

commit 3840c5b78803b2b6cc1ff820100a74a092c40cbb upstream.

Nicolas pointed out that the cxgb4 driver is doing dma off of the stack,
which is generally considered a very bad thing. On some architectures it
could be a security problem, but odds are none of them actually run this
driver, so it's just a "normal" bug.

Resolve this by allocating the memory for a message off of the heap
instead of the stack. kmalloc() always will give us a proper memory
location that DMA will work correctly from.

Link: https://lore.kernel.org/r/20191001165611.GA3542072@kroah.comReported-by: NNicolas Waisman <nico@semmle.com>
Tested-by: NPotnuri Bharat Teja <bharat@chelsio.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

27414f90

05 10月, 2019 2 次提交

IB/hfi1: Define variables as unsigned long to fix KASAN warning · ad6819cd

由 Ira Weiny 提交于 9月 11, 2019

commit f8659d68e2bee5b86a1beaf7be42d942e1fc81f4 upstream.

Define the working variables to be unsigned long to be compatible with
for_each_set_bit and change types as needed.

While we are at it remove unused variables from a couple of functions.

This was found because of the following KASAN warning:
 ==================================================================
   BUG: KASAN: stack-out-of-bounds in find_first_bit+0x19/0x70
   Read of size 8 at addr ffff888362d778d0 by task kworker/u308:2/1889

   CPU: 21 PID: 1889 Comm: kworker/u308:2 Tainted: G W         5.3.0-rc2-mm1+ #2
   Hardware name: Intel Corporation W2600CR/W2600CR, BIOS SE5C600.86B.02.04.0003.102320141138 10/23/2014
   Workqueue: ib-comp-unb-wq ib_cq_poll_work [ib_core]
   Call Trace:
    dump_stack+0x9a/0xf0
    ? find_first_bit+0x19/0x70
    print_address_description+0x6c/0x332
    ? find_first_bit+0x19/0x70
    ? find_first_bit+0x19/0x70
    __kasan_report.cold.6+0x1a/0x3b
    ? find_first_bit+0x19/0x70
    kasan_report+0xe/0x12
    find_first_bit+0x19/0x70
    pma_get_opa_portstatus+0x5cc/0xa80 [hfi1]
    ? ret_from_fork+0x3a/0x50
    ? pma_get_opa_port_ectrs+0x200/0x200 [hfi1]
    ? stack_trace_consume_entry+0x80/0x80
    hfi1_process_mad+0x39b/0x26c0 [hfi1]
    ? __lock_acquire+0x65e/0x21b0
    ? clear_linkup_counters+0xb0/0xb0 [hfi1]
    ? check_chain_key+0x1d7/0x2e0
    ? lock_downgrade+0x3a0/0x3a0
    ? match_held_lock+0x2e/0x250
    ib_mad_recv_done+0x698/0x15e0 [ib_core]
    ? clear_linkup_counters+0xb0/0xb0 [hfi1]
    ? ib_mad_send_done+0xc80/0xc80 [ib_core]
    ? mark_held_locks+0x79/0xa0
    ? _raw_spin_unlock_irqrestore+0x44/0x60
    ? rvt_poll_cq+0x1e1/0x340 [rdmavt]
    __ib_process_cq+0x97/0x100 [ib_core]
    ib_cq_poll_work+0x31/0xb0 [ib_core]
    process_one_work+0x4ee/0xa00
    ? pwq_dec_nr_in_flight+0x110/0x110
    ? do_raw_spin_lock+0x113/0x1d0
    worker_thread+0x57/0x5a0
    ? process_one_work+0xa00/0xa00
    kthread+0x1bb/0x1e0
    ? kthread_create_on_node+0xc0/0xc0
    ret_from_fork+0x3a/0x50

   The buggy address belongs to the page:
   page:ffffea000d8b5dc0 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0
   flags: 0x17ffffc0000000()
   raw: 0017ffffc0000000 0000000000000000 ffffea000d8b5dc8 0000000000000000
   raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
   page dumped because: kasan: bad access detected

   addr ffff888362d778d0 is located in stack of task kworker/u308:2/1889 at offset 32 in frame:
    pma_get_opa_portstatus+0x0/0xa80 [hfi1]

   this frame has 1 object:
    [32, 36) 'vl_select_mask'

   Memory state around the buggy address:
    ffff888362d77780: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    ffff888362d77800: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
   >ffff888362d77880: 00 00 00 00 00 00 f1 f1 f1 f1 04 f2 f2 f2 00 00
                                                    ^
    ffff888362d77900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    ffff888362d77980: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 04 f2 f2 f2

 ==================================================================

Cc: <stable@vger.kernel.org>
Fixes: 77241056 ("IB/hfi1: add driver files")
Link: https://lore.kernel.org/r/20190911113053.126040.47327.stgit@awfm-01.aw.intel.comReviewed-by: NMike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: NIra Weiny <ira.weiny@intel.com>
Signed-off-by: NKaike Wan <kaike.wan@intel.com>
Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

ad6819cd

IB/mlx5: Free mpi in mp_slave mode · a924850c

由 Danit Goldberg 提交于 9月 16, 2019

commit 5d44adebbb7e785939df3db36ac360f5e8b73e44 upstream.

ib_add_slave_port() allocates a multiport struct but never frees it.
Don't leak memory, free the allocated mpi struct during driver unload.

Cc: <stable@vger.kernel.org>
Fixes: 32f69e4b ("{net, IB}/mlx5: Manage port association for multiport RoCE")
Link: https://lore.kernel.org/r/20190916064818.19823-3-leon@kernel.orgSigned-off-by: NDanit Goldberg <danitg@mellanox.com>
Reviewed-by: NJason Gunthorpe <jgg@mellanox.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

a924850c

16 9月, 2019 1 次提交

IB/hfi1: Avoid hardlockup with flushlist_lock · 90ca4912

由 Mike Marciniszyn 提交于 6月 14, 2019

[ Upstream commit cf131a81967583ae737df6383a0893b9fee75b4e ]

Heavy contention of the sde flushlist_lock can cause hard lockups at
extreme scale when the flushing logic is under stress.

Mitigate by replacing the item at a time copy to the local list with
an O(1) list_splice_init() and using the high priority work queue to
do the flushes.

Fixes: 77241056 ("IB/hfi1: add driver files")
Cc: <stable@vger.kernel.org>
Reviewed-by: NDennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: NMike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>
Signed-off-by: NSasha Levin <sashal@kernel.org>

90ca4912

openanolis / cloud-kernel 接近 2 年 前同步成功

openanolis / cloud-kernel
接近 2 年前同步成功