- 16 10月, 2018 15 次提交
-
-
由 Yixian Liu 提交于
This patch adds fast register physical memory region (FRMR) support for hip08. Signed-off-by: NYixian Liu <liuyixian@huawei.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Selvin Xavier 提交于
In case the NQ alloc/enable fails, free up the already allocated/enabled NQ before reporting failure. Also, track the alloc/enable using proper state checking. Signed-off-by: NSelvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Selvin Xavier 提交于
Delayed work bnxt_re_worker would be still running even after cancel_delayed_work returns. This causes crash as the driver proceeds with device removal. To make sure that the work is finished before returning, use cancel_delayed_work_sync. Signed-off-by: NSelvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Devesh Sharma 提交于
Some FW versios return pkey values more than 0xFFFF. pkey_tbl_len of ib_port_attr is 16bit value. So restricting max_pkeys to 0xFFFF. Signed-off-by: NDevesh Sharma <devesh.sharma@broadcom.com> Signed-off-by: NSelvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Devesh Sharma 提交于
Reports affiliated async event on the qp-async event channel instead of global event channel. Signed-off-by: NDevesh Sharma <devesh.sharma@broadcom.com> Signed-off-by: NSelvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Selvin Xavier 提交于
Expose out of sequence errors received from FW. This counter is a 32 bit counter and driver has to accumulate the counter. Stores the previous value for calculating the difference in the next query. Also, update the HW statistics structure with new fields. Signed-off-by: NSelvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Selvin Xavier 提交于
Expose the RoCE discard and drop counters from the HW statistics context Signed-off-by: NSelvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Somnath Kotur 提交于
crsqe->resp would be NULL in case the host command timed out before getting a response from HW. Check for NULL pointer to avoid a potential crash while printing the error message. Signed-off-by: NSomnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: NSelvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Devesh Sharma 提交于
In some FW versions, RoCE driver also receives an async notification which was directed to L2 driver. RoCE driver does not handle this and print a message to syslog. Drop these notifications silently. Signed-off-by: NDevesh Sharma <devesh.sharma@broadcom.com> Signed-off-by: NSelvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Selvin Xavier 提交于
In the failure path, nq->bar_reg_iomem gets accessed without initializing. Avoid this by calling the bnxt_qplib_nq_stop_irq only if the initialization is complete. Reported-by: NDan Carpenter <dan.carpenter@oracle.com> Fixes: 1ac5a404 ("RDMA/bnxt_re: Add bnxt_re RoCE driver") Fixes: 6e04b103 ("RDMA/bnxt_re: Fix broken RoCE driver due to recent L2 driver changes") Signed-off-by: NSelvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Selvin Xavier 提交于
This is reported by smatch check. rcfw->creq_bar_reg_iomem is accessed in bnxt_qplib_rcfw_stop_irq and this variable check afterwards doesn't make sense. Also, rcfw->creq_bar_reg_iomem will never be NULL. So Removing this check. Reported-by: NDan Carpenter <dan.carpenter@oracle.com> Fixes: 6e04b103 ("RDMA/bnxt_re: Fix broken RoCE driver due to recent L2 driver changes") Signed-off-by: NSelvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Selvin Xavier 提交于
Version macro is not required as the driver is not maintaining the version. Removing the references of this macro too. Signed-off-by: NSelvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Selvin Xavier 提交于
Fix possible recursive lock warning. Its a false warning as the locks are part of two differnt HW Queue data structure - cmdq and creq. Debug kernel is throwing the following warning and stack trace. [ 783.914967] ============================================ [ 783.914970] WARNING: possible recursive locking detected [ 783.914973] 4.19.0-rc2+ #33 Not tainted [ 783.914976] -------------------------------------------- [ 783.914979] swapper/2/0 is trying to acquire lock: [ 783.914982] 000000002aa3949d (&(&hwq->lock)->rlock){..-.}, at: bnxt_qplib_service_creq+0x232/0x350 [bnxt_re] [ 783.914999] but task is already holding lock: [ 783.915002] 00000000be73920d (&(&hwq->lock)->rlock){..-.}, at: bnxt_qplib_service_creq+0x2a/0x350 [bnxt_re] [ 783.915013] other info that might help us debug this: [ 783.915016] Possible unsafe locking scenario: [ 783.915019] CPU0 [ 783.915021] ---- [ 783.915034] lock(&(&hwq->lock)->rlock); [ 783.915035] lock(&(&hwq->lock)->rlock); [ 783.915037] *** DEADLOCK *** [ 783.915038] May be due to missing lock nesting notation [ 783.915039] 1 lock held by swapper/2/0: [ 783.915040] #0: 00000000be73920d (&(&hwq->lock)->rlock){..-.}, at: bnxt_qplib_service_creq+0x2a/0x350 [bnxt_re] [ 783.915044] stack backtrace: [ 783.915046] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.19.0-rc2+ #33 [ 783.915047] Hardware name: Dell Inc. PowerEdge R730/0599V5, BIOS 1.0.4 08/28/2014 [ 783.915048] Call Trace: [ 783.915049] <IRQ> [ 783.915054] dump_stack+0x90/0xe3 [ 783.915058] __lock_acquire+0x106c/0x1080 [ 783.915061] ? sched_clock+0x5/0x10 [ 783.915063] lock_acquire+0xbd/0x1a0 [ 783.915065] ? bnxt_qplib_service_creq+0x232/0x350 [bnxt_re] [ 783.915069] _raw_spin_lock_irqsave+0x4a/0x90 [ 783.915071] ? bnxt_qplib_service_creq+0x232/0x350 [bnxt_re] [ 783.915073] bnxt_qplib_service_creq+0x232/0x350 [bnxt_re] [ 783.915078] tasklet_action_common.isra.17+0x197/0x1b0 [ 783.915081] __do_softirq+0xcb/0x3a6 [ 783.915084] irq_exit+0xe9/0x100 [ 783.915085] do_IRQ+0x6a/0x120 [ 783.915087] common_interrupt+0xf/0xf [ 783.915088] </IRQ> Use nested notation for the spin_lock to avoid this warning. Signed-off-by: NSelvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Selvin Xavier 提交于
Add the missing initalization of the cq_lock and qplib.flush_lock. Fixes: 942c9b6c ("RDMA/bnxt_re: Avoid Hard lockup during error CQE processing") Signed-off-by: NSelvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Jason Gunthorpe 提交于
From git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git This is required to resolve dependencies of the next series of RDMA patches. The code motion conflicts in drivers/infiniband/core/cache.c were resolved. Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
- 11 10月, 2018 1 次提交
-
-
由 Valentine Fatiev 提交于
The function that puts back the MR in cache also removes the DMA address from the HCA. Therefore we need to call this function before we remove the DMA mapping from MMU. Otherwise the HCA may access a memory that is no longer DMA mapped. Call trace: NMI: IOCK error (debug interrupt?) for reason 71 on CPU 0. CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.19.0-rc6+ #4 Hardware name: HP ProLiant DL360p Gen8, BIOS P71 08/20/2012 RIP: 0010:intel_idle+0x73/0x120 Code: 80 5c 01 00 0f ae 38 0f ae f0 31 d2 65 48 8b 04 25 80 5c 01 00 48 89 d1 0f 60 02 RSP: 0018:ffffffff9a403e38 EFLAGS: 00000046 RAX: 0000000000000030 RBX: 0000000000000005 RCX: 0000000000000001 RDX: 0000000000000000 RSI: ffffffff9a5790c0 RDI: 0000000000000000 RBP: 0000000000000030 R08: 0000000000000000 R09: 0000000000007cf9 R10: 000000000000030a R11: 0000000000000018 R12: 0000000000000000 R13: ffffffff9a5792b8 R14: ffffffff9a5790c0 R15: 0000002b48471e4d FS: 0000000000000000(0000) GS:ffff9c6caf400000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f5737185000 CR3: 0000000590c0a002 CR4: 00000000000606f0 Call Trace: cpuidle_enter_state+0x7e/0x2e0 do_idle+0x1ed/0x290 cpu_startup_entry+0x6f/0x80 start_kernel+0x524/0x544 ? set_init_arg+0x55/0x55 secondary_startup_64+0xa4/0xb0 DMAR: DRHD: handling fault status reg 2 DMAR: [DMA Read] Request device [04:00.0] fault addr b34d2000 [fault reason 06] PTE Read access is not set DMAR: [DMA Read] Request device [01:00.2] fault addr bff8b000 [fault reason 06] PTE Read access is not set Fixes: f3f134f5 ("RDMA/mlx5: Fix crash while accessing garbage pointer and freed memory") Signed-off-by: NValentine Fatiev <valentinef@mellanox.com> Reviewed-by: NMoni Shoua <monis@mellanox.com> Signed-off-by: NLeon Romanovsky <leonro@mellanox.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
- 06 10月, 2018 3 次提交
-
-
由 Leon Romanovsky 提交于
Tracking CM_ID resource is performed in two stages: creation of cm_id and connecting it to the cma_dev. It is needed because rdma-cm protocol exports two separate user-visible calls rdma_create_id and rdma_accept. At the time of CM_ID creation, the real owner of that object is unknown yet and we need to grab task_struct. This task_struct is released or reassigned in attach phase later on. but call to rdma_destroy_id left this task_struct unreleased. Such separation is unique to CM_ID and other restrack objects initialize in one shot. It means that it is safe to use "res->valid" check to catch unfinished CM_ID flow and release task_struct for that object. Fixes: 00313983 ("RDMA/nldev: provide detailed CM_ID information") Reported-by: NArtemy Kovalyov <artemyko@mellanox.com> Reviewed-by: NArtemy Kovalyov <artemyko@mellanox.com> Reviewed-by: NYossi Itigin <yosefe@mellanox.com> Signed-off-by: NLeon Romanovsky <leonro@mellanox.com> Reviewed-by: NSteve Wise <swise@opengridcomputing.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Leon Romanovsky 提交于
Unify task update and kernel name set in one place. Reviewed-by: NArtemy Kovalyov <artemyko@mellanox.com> Reviewed-by: NYossi Itigin <yosefe@mellanox.com> Signed-off-by: NLeon Romanovsky <leonro@mellanox.com> Reviewed-by: NSteve Wise <swise@opengridcomputing.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Leon Romanovsky 提交于
Prepare rdma_restrack_set_task() call to accommodate more code by moving its implementation from *.h to *.c. Reviewed-by: NArtemy Kovalyov <artemyko@mellanox.com> Reviewed-by: NYossi Itigin <yosefe@mellanox.com> Signed-off-by: NLeon Romanovsky <leonro@mellanox.com> Reviewed-by: NSteve Wise <swise@opengridcomputing.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
- 04 10月, 2018 21 次提交
-
-
由 Parav Pandit 提交于
rdma_find_ndev_for_src_ip_rcu() returns either valid netdev pointer or ERR_PTR(). Instead of checking for NULL, check for error. Fixes: caf1e3ae ("RDMA/core Introduce and use rdma_find_ndev_for_src_ip_rcu") Reported-by: syzbot+20c32fa6ff84a2d28c36@syzkaller.appspotmail.com Signed-off-by: NParav Pandit <parav@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
This patch moves ruc_loopback() from hfi1 into rdmavt for code sharing with the qib driver. Reviewed-by: NBrian Welty <brian.welty@intel.com> Reviewed-by: NMike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: NDennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: NVenkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com> Signed-off-by: NHarish Chegondi <harish.chegondi@intel.com> Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
Moving send completion code into rdmavt in order to have shared logic between qib and hfi1 drivers. Reviewed-by: NMike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: NBrian Welty <brian.welty@intel.com> Signed-off-by: NVenkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com> Signed-off-by: NHarish Chegondi <harish.chegondi@intel.com> Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Brian Welty 提交于
This patch moves hfi1_copy_sge() into rdmavt for sharing with qib. This patch also moves all the wss_*() functions into rdmavt as several wss_*() functions are called from hfi1_copy_sge() When SGE copy mode is adaptive, cacheless copy may be done in some cases for performance reasons. In those cases, X86 cacheless copy function is called since the drivers that use rdmavt and may set SGE copy mode to adaptive are X86 only. For this reason, this patch adds "depends on X86_64" to rdmavt/Kconfig. Reviewed-by: NAshutosh Dixit <ashutosh.dixit@intel.com> Reviewed-by: NMichael J. Ruhl <michael.j.ruhl@intel.com> Reviewed-by: NMike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: NDennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: NBrian Welty <brian.welty@intel.com> Signed-off-by: NHarish Chegondi <harish.chegondi@intel.com> Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Nathan Chancellor 提交于
Clang warns when one enumerated type is implicitly converted to another. drivers/infiniband/hw/mlx4/mad.c:1811:41: warning: implicit conversion from enumeration type 'enum mlx4_ib_qp_flags' to different enumeration type 'enum ib_qp_create_flags' [-Wenum-conversion] qp_init_attr.init_attr.create_flags = MLX4_IB_SRIOV_TUNNEL_QP; ~ ^~~~~~~~~~~~~~~~~~~~~~~ drivers/infiniband/hw/mlx4/mad.c:1819:41: warning: implicit conversion from enumeration type 'enum mlx4_ib_qp_flags' to different enumeration type 'enum ib_qp_create_flags' [-Wenum-conversion] qp_init_attr.init_attr.create_flags = MLX4_IB_SRIOV_SQP; ~ ^~~~~~~~~~~~~~~~~ The type mlx4_ib_qp_flags explicitly provides supplemental values to the type ib_qp_create_flags. Make that clear to Clang by changing the create_flags type to u32. Reported-by: NNick Desaulniers <ndesaulniers@google.com> Signed-off-by: NNathan Chancellor <natechancellor@gmail.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Lijun Ou 提交于
The atomic operation not supported inline. Besides, the standard atomic operation only support a sge and the sge is placed in the wqe. Fix: 384f8818("RDMA/hns: Add atomic support") Signed-off-by: NLijun Ou <oulijun@huawei.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Lijun Ou 提交于
In order to extend vlan device range, the design add two field of qp context for checking vlan packet in sender and in recevicer. Signed-off-by: NLijun Ou <oulijun@huawei.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Lijun Ou 提交于
This patch adds local invalidate Memory Region (MR) support in the kernel space driver. Signed-off-by: NYangyang Li <liyangyang20@huawei.com> Signed-off-by: NLijun Ou <oulijun@huawei.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Lijun Ou 提交于
The hip08 hardware has two version. the version id are 0x20 and 0x21 according to the pci revision. It needs to adjust some fields for extending new features. The specific updates include: 1. Add some fields for supporting new features by enabling some reserved fields in 0x20 version. 2. remove some fields which the user is not visiable in order to support the extend features. 3. Init some fields with zero. These updates is compatible with 0x20 version. Signed-off-by: NLijun Ou <oulijun@huawei.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Lijun Ou 提交于
According to hip08 limit, the buffer size of extend sge needs to be an integer wqe_sge_buf_page size. For example, the value of sge_shift field of qp context is greater or equal to eight when buffer page size is 4K size. The value of sge_shift field of qp context assigned by hr_qp->sge.sge_cnt. Signed-off-by: NLijun Ou <oulijun@huawei.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Lijun Ou 提交于
According to the IB protocol definition, the driver needs to show the correct device information and the information will be queryed by device attribute. Signed-off-by: NLijun Ou <oulijun@huawei.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Lijun Ou 提交于
In order to compatible with the third party RoCE device, The hardware modify the set method for the ecn field of ip header in new hip08 version. The high 6bit of tclass be assigned for dscp field of packet. Signed-off-by: NLijun Ou <oulijun@huawei.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Lijun Ou 提交于
The hip08 split two hardware version. The version id are 0x20 and 0x21 according to the PCI revison. The max size of extend sge of sq is limited to 2M for 0x20 version and 8M for 0x21 version. It may be exceeded to 2M according to the algorithm that compute the product of wqe count and extend sge number of every wqe. But the product always less than 8M. Signed-off-by: NLijun Ou <oulijun@huawei.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Lijun Ou 提交于
It will print the warning when the MSB bit of SLID is not zero running cm_req_handler function that test CM. It needs to fixed zero when test RoCE device. Signed-off-by: NLijun Ou <oulijun@huawei.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Lijun Ou 提交于
When user issues a RDMA read and enables sq inline, it needs to report a bad wr to user. Signed-off-by: NLijun Ou <oulijun@huawei.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Lijun Ou 提交于
It needs to include two special qps for every port. The hip08 have four ports and the all reserved qp numbers are eight. Signed-off-by: NLijun Ou <oulijun@huawei.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Leon Romanovsky 提交于
All users of rdma_nl_chk_listeners() are interested to get boolean answer if netlink socket has listeners, so update all places to boolean function. Signed-off-by: NLeon Romanovsky <leonro@mellanox.com> Reviewed-by: NDennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Kamal Heib 提交于
The ll parameter is not used in ib_modify_qp_is_ok(), so remove it. Signed-off-by: NKamal Heib <kamalheib1@gmail.com> Reviewed-by: NDennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Kamal Heib 提交于
This function is not in use - delete it. Signed-off-by: NKamal Heib <kamalheib1@gmail.com> Reviewed-by: NLeon Romanovsky <leonro@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Zhu Yanjun 提交于
In rxe_queue_init, q and q->buf are allocated. In do_mmap_info, q->ip is allocated. When error occurs, rxe_srq_from_init and the later error handler do not free these allocated memories. This will make memory leak. Signed-off-by: NZhu Yanjun <yanjun.zhu@oracle.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Wei Yongjun 提交于
Fix to return a negative error code from the mthca_cmd_init() error handling case instead of 0, as done elsewhere in this function. Fixes: 80fd8238 ("[PATCH] IB/mthca: Encapsulate command interface init") Signed-off-by: NWei Yongjun <weiyongjun1@huawei.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-