- 20 Feb, 2019 20 commits
-
-
Committed by Lijun Ou
According to hip08's limitation, the qp and cq specification is 1M and the mtpt specification is 1M in kernel space. Signed-off-by: Yangyang Li <liyangyang20@huawei.com> Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Committed by Steve Wise
Do not assume irq_poll_sched() is called from an interrupt context only. Use raise_softirq_irqoff() instead of __raise_softirq_irqoff() so that it kicks ksoftirqd if the schedule comes from a non-interrupt context. This is required for RDMA drivers, like soft iwarp, that generate cq completion notifications in a workqueue or kthread context. Without this change, siw completion notifications to the ULP can take several hundred usecs, depending on the system load. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Committed by Steve Wise
Add support for the RDMA_NLDEV_CMD_NEWLINK/DELLINK messages, which allow dynamically adding new RXE links. Deprecate the old module options for now. Cc: Moni Shoua <monis@mellanox.com> Reviewed-by: Yanjun Zhu <yanjun.zhu@oracle.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Committed by Steve Wise
Add support for new LINK messages to allow adding and deleting rdma interfaces. This will be used initially for soft rdma drivers, which instantiate device instances dynamically when the admin specifies a netdev device to use. The rdma_rxe module will be the first user of these messages. The design is modeled after RTNL_NEWLINK/DELLINK: rdma drivers register with the rdma core if they provide link add/delete functions. Each driver registers with a unique "type" string that is used to dispatch messages coming from user space. A new RDMA_NLDEV_ATTR is defined for the "type" string. User mode will pass 3 attributes in a NEWLINK message: RDMA_NLDEV_ATTR_DEV_NAME for the desired rdma device name to be created, RDMA_NLDEV_ATTR_LINK_TYPE for the "type" of link being added, and RDMA_NLDEV_ATTR_NDEV_NAME for the net_device interface to use for this link. The DELLINK message will contain the RDMA_NLDEV_ATTR_DEV_INDEX of the device to delete. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
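The type-string dispatch described above can be illustrated with a small user-space model. This is only a sketch under stated assumptions: `struct link_ops`, `link_register()`, `link_new()` and the "demo" driver are hypothetical names invented for illustration, not the kernel's actual structures or functions, and no netlink parsing or locking is modeled.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical user-space model of a per-driver link registration. */
struct link_ops {
    const char *type;   /* unique "type" string used for dispatch */
    int (*newlink)(const char *ibdev_name, const char *ndev_name);
    struct link_ops *next;
};

static struct link_ops *link_ops_list;   /* registered drivers */

/* Register a driver; a duplicate type string is refused with -1. */
int link_register(struct link_ops *ops)
{
    for (struct link_ops *p = link_ops_list; p; p = p->next)
        if (strcmp(p->type, ops->type) == 0)
            return -1;
    ops->next = link_ops_list;
    link_ops_list = ops;
    return 0;
}

/* Dispatch a NEWLINK-style request to the driver owning 'type'. */
int link_new(const char *type, const char *ibdev_name, const char *ndev_name)
{
    for (struct link_ops *p = link_ops_list; p; p = p->next)
        if (strcmp(p->type, type) == 0)
            return p->newlink(ibdev_name, ndev_name);
    return -2;   /* no driver registered for this type */
}

/* Illustrative driver callback: succeeds when both names are given. */
int demo_newlink(const char *ibdev_name, const char *ndev_name)
{
    return (ibdev_name && ndev_name) ? 0 : -3;
}

struct link_ops demo_ops = { .type = "demo", .newlink = demo_newlink };
```

In this model, the three strings passed to `link_new()` play the roles of the link type, device name and netdev name attributes carried by a NEWLINK message.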
-
Committed by Parvi Kaustubhi
There is a deadlock in the usnic ib_register and netdev_notify paths.

    usnic_ib_discover_pf()
     | mutex_lock(&usnic_ib_ibdev_list_lock);
      | usnic_ib_device_add();
       | ib_register_device()
        | usnic_ib_query_port()
         | mutex_lock(&us_ibdev->usdev_lock);
          | ib_get_eth_speed()
           | rtnl_lock()

order of lock: &usnic_ib_ibdev_list_lock -> usdev_lock -> rtnl_lock

    rtnl_lock()
     | usnic_ib_netdevice_event()
      | mutex_lock(&usnic_ib_ibdev_list_lock);

order of lock: rtnl_lock -> &usnic_ib_ibdev_list_lock

The solution is to use the core's lock-free ib_device_get_by_netdev() scheme to look up the ib_dev while handling netdev and inet events. Signed-off-by: Parvi Kaustubhi <pkaustub@cisco.com> Reviewed-by: Govindarajulu Varadarajan <gvaradar@cisco.com> Reviewed-by: Tanmay Inamdar <tinamdar@cisco.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Committed by Jason Gunthorpe
Since rxe allows unregistration from other threads, the rxe pointer can become invalid at any moment after ib_register_driver returns. This could cause a user-triggered use after free. Add another driver callback, to be called right after the device becomes registered, to complete any device setup required post-registration. This callback has enough core locking to prevent the device from becoming unregistered. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Committed by Jason Gunthorpe
rxe has an open-coded version of this that is not as safe as the core version. This lets us eliminate the internal device list entirely from rxe. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Committed by Jason Gunthorpe
rxe does not have correct locking for its registration/unregistration paths; use the core code to handle it instead. In this mode ib_unregister_device will also do the dealloc, so rxe is required to do clean up from a callback. The core code ensures that unregistration is done only once, and generally takes care of locking and concurrency problems for rxe. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Committed by Jason Gunthorpe
These APIs are intended to support drivers that exist outside the usual driver core probe()/remove() callbacks. Normally the driver core will prevent remove() from running concurrently with probe(); once this safety is lost, drivers need more support to get the locking and lifetimes right. ib_unregister_driver() is intended to be used during module_exit of a driver using these APIs. It unregisters all the associated ib_devices. ib_unregister_device_and_put() is to be used by a driver-specific removal function (i.e. removal by name, removal from a netdev notifier, removal from netlink). ib_unregister_queued() is to be used from netdev notifier chains where RTNL is held. The locking is tricky here, since once things become async it is possible to race unregister with registration. This is largely solved by relying on the registration refcount: unregistration will only ever work on something that has a positive registration refcount, and then an unregistration mutex serializes all competing unregistrations of the same device. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
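The "only act on a positive registration refcount" rule can be sketched in user space with C11 atomics. This is a simplified model, not the core's implementation: `struct demo_device`, `demo_try_get()` and `demo_put()` are hypothetical names, and the unregistration mutex mentioned above is not modeled.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a device with a registration refcount. */
struct demo_device {
    atomic_int reg_refcount;   /* > 0 only while the device is registered */
};

/* Take a reference only if the count is still positive. */
bool demo_try_get(struct demo_device *dev)
{
    int old = atomic_load(&dev->reg_refcount);

    while (old > 0) {
        /* On failure 'old' is reloaded and the loop re-checks it. */
        if (atomic_compare_exchange_weak(&dev->reg_refcount, &old, old + 1))
            return true;
    }
    return false;   /* already unregistered: the caller must not touch it */
}

/* Drop a reference; returns true when the last reference went away. */
bool demo_put(struct demo_device *dev)
{
    return atomic_fetch_sub(&dev->reg_refcount, 1) == 1;
}
```

A removal path in this model would call `demo_try_get()` first and simply give up when it returns false, which is what prevents an unregistration from racing with a device that is not (or no longer) registered.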
-
Committed by Jason Gunthorpe
The core API handles the locking correctly and is faster if there are multiple devices. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Committed by Jason Gunthorpe
Several drivers need to find the ib_device from a given netdev. rxe needs this at speed in an unsleepable context, so choose to implement the translation using an RCU-safe hash table. The hash table can have a many-to-one mapping. This is intended to support some future case where multiple IB drivers (i.e. iWarp and RoCE) connect to the same netdevs. driver_ids will need to be different to support this. In the process this makes the struct ib_device and ib_port_data RCU safe by deferring their kfrees. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
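A many-to-one lookup keyed by netdev and disambiguated by a driver ID might look like the following user-space sketch. It deliberately leaves out RCU and any locking, and all names (`struct demo_ibdev`, `demo_add()`, `demo_get_by_netdev()`) are hypothetical; only the idea of matching on both the netdev and the driver ID comes from the description above.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_BUCKETS 8

/* Hypothetical device entry; several may share the same netdev key. */
struct demo_ibdev {
    int driver_id;
    const void *netdev;        /* lookup key */
    struct demo_ibdev *next;   /* bucket chain */
};

static struct demo_ibdev *demo_table[DEMO_BUCKETS];

/* Hash the netdev pointer into a bucket index. */
static size_t demo_hash(const void *netdev)
{
    return ((uintptr_t)netdev >> 4) % DEMO_BUCKETS;
}

/* Insert a device at the head of its bucket chain. */
void demo_add(struct demo_ibdev *dev)
{
    size_t b = demo_hash(dev->netdev);

    dev->next = demo_table[b];
    demo_table[b] = dev;
}

/* Find the device bound to 'netdev' that belongs to 'driver_id'. */
struct demo_ibdev *demo_get_by_netdev(const void *netdev, int driver_id)
{
    for (struct demo_ibdev *d = demo_table[demo_hash(netdev)]; d; d = d->next)
        if (d->netdev == netdev && d->driver_id == driver_id)
            return d;
    return NULL;
}
```

Because two entries can share one netdev, the driver ID is what makes each lookup unambiguous, which is why the message notes that driver_ids must differ.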
-
Committed by Jason Gunthorpe
The associated netdev should not actually be very dynamic, so for most drivers there is no reason for a callback like this. Provide an API to inform the core code about the net dev affiliation and use a core-maintained data structure instead. This allows the core code to be more aware of the ndev relationship, which will allow some new APIs based around this. This also uses locking that makes some kind of sense; many drivers had a confusing RCU lock or missing locking, which isn't right. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Committed by Jason Gunthorpe
Like the other cases, there is no real reason to have another array just for the cache. This larger conversion gets its own patch. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Committed by Jason Gunthorpe
There is no reason to have three allocations of per-port data. Combine them together and make the lifetime for all the per-port data match the struct ib_device. Following patches will require more port-specific data; now there is a good place to put it. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Committed by Jason Gunthorpe
We have many loops iterating over all of the end port numbers on a struct ib_device; simplify them with a for_each helper. Reviewed-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
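A for_each helper of this kind is typically a thin macro around the open-coded loop. The sketch below is an assumption-laden illustration: `struct demo_ib_device`, its `first_port`/`last_port` fields and `demo_for_each_port()` are hypothetical, and the helper added by the patch may differ in name and form.

```c
/* Hypothetical device with ports numbered first_port..last_port inclusive. */
struct demo_ib_device {
    unsigned int first_port;
    unsigned int last_port;
};

/* Illustrative helper that hides the loop bounds from every caller. */
#define demo_for_each_port(dev, iter) \
    for ((iter) = (dev)->first_port; (iter) <= (dev)->last_port; (iter)++)

/* Example caller: count the ports by walking them with the helper. */
unsigned int demo_count_ports(const struct demo_ib_device *dev)
{
    unsigned int port, n = 0;

    demo_for_each_port(dev, port)
        n++;
    return n;
}
```

Centralizing the bounds in one macro means a caller cannot get the start port or the inclusive end condition wrong.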
-
Committed by Leon Romanovsky
The netlink dumpit handshake exchanges the index from which the kernel should start returning its values. In the current code, this index also counted items that are not visible in this PID, and so indirectly revealed the number of entries. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
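The idea of advancing the resume index only for entries the caller may see can be sketched as follows. This is a user-space model, not the nldev code: `struct demo_entry`, its `visible` flag and `demo_dump()` are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

struct demo_entry {
    int id;
    bool visible;   /* whether the caller may see this entry */
};

/*
 * Copy up to 'max' visible ids into 'out', skipping the first '*start'
 * visible entries. '*start' advances by visible entries only, so the
 * value handed back to the caller never counts hidden entries.
 */
size_t demo_dump(const struct demo_entry *tbl, size_t n,
                 size_t *start, int *out, size_t max)
{
    size_t idx = 0, filled = 0;

    for (size_t i = 0; i < n && filled < max; i++) {
        if (!tbl[i].visible)
            continue;   /* hidden entries do not consume an index */
        if (idx++ < *start)
            continue;   /* already returned by an earlier call */
        out[filled++] = tbl[i].id;
    }
    *start += filled;
    return filled;
}
```

Had the index been incremented before the visibility check, the resume value returned to user space would grow with hidden entries too, which is the leak the message describes.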
-
Committed by Leon Romanovsky
This patch adds the ability to query a specific QP based on its LQPN (local QPN), which is assigned by HW and needs special treatment when inserted into the restrack DB. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Committed by Leon Romanovsky
PD, MR and QP objects have parent objects: contexts and PDs. The exposed parent IDs make it possible to correlate various objects and simplify debug investigation. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Committed by Leon Romanovsky
Give user space tools a unique identifier for PD, MR, CQ and CM_ID objects, so that they can query them with .doit callbacks. QP .doit is not supported yet, until all drivers are updated to provide an LQPN equal to their restrack ID. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Committed by Leon Romanovsky
In preparation for extending rdma_restrack_root to provide software IDs, which will also be per-type, convert rdma_restrack_root from a struct with arrays to an array of structs. This conversion allows us to drop the rwsem lock in favour of the internal XArray lock. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
- 19 Feb, 2019 5 commits
-
-
Committed by Leon Romanovsky
There is no need to expose internals of the restrack DB to IB/core. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Committed by Leon Romanovsky
XArray uses an internal lock for updates to the XArray. This means that our external RW lock is needed to ensure that an entry is not deleted while we are iterating over the list. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Committed by Leon Romanovsky
Implement .doit callbacks and ensure that users do not provide port values for resource entries allocated in per-device mode, as needed for the .doit callback. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Committed by Leon Romanovsky
Add a new general helper to get a restrack entry given its ID and respective type. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Committed by Leon Romanovsky
The addition of .doit callbacks poses a new access pattern to the resource entries: access by some user-visible index. Back then, the legacy DB was implemented as a hash because per-index access wasn't needed and XArray wasn't accepted yet. Acceptance of XArray together with per-index access requires a refresh of the DB implementation. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
- 16 Feb, 2019 10 commits
-
-
Committed by Parav Pandit
Move core device addition and removal from sysfs.c to device.c, as device.c is a more appropriate place for device management. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Committed by Parav Pandit
Refactor code for device and port sysfs attributes for reuse. While at it, rename the counterpart free function to ib_free_port_attrs. Also, the attribute setup sequence is: (a) port-specific init, (b) device stats alloc/init. So for cleanup, follow the reverse sequence: (a) device stats dealloc, (b) port-specific cleanup. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Committed by Parav Pandit
Instead of holding an extra reference using get_device() that device_unregister() releases, simplify it as below. device_add() balances with device_del(). device_initialize() balances with put_device(), always via ib_dealloc_device(). Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Committed by Leon Romanovsky
Ucontext allocation and release aren't async events and don't need kref accounting. The common layer of the RDMA subsystem ensures that dealloc ucontext will be called after all other objects are released. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Tested-by: Raju Rangoju <rajur@chelsio.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Committed by Leon Romanovsky
The internal design of RDMA/core ensures that dealloc ucontext will be called only if alloc_ucontext succeeded, hence there is no need to manage an internal variable to mark the validity of the ucontext. As part of this change, remove a redundant memset too. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Committed by Kaike Wan
The following build warning was produced for the TID RDMA READ patch ("IB/hfi1: Enable TID RDMA READ protocol"):

    drivers/infiniband/hw/hfi1/qp.c: In function 'hfi1_setup_wqe':
    drivers/infiniband/hw/hfi1/qp.c:328:3: warning: this statement may fall through [-Wimplicit-fallthrough=]
       hfi1_setup_tid_rdma_wqe(qp, wqe);
       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    drivers/infiniband/hw/hfi1/qp.c:329:2: note: here
      case IB_QPT_UC:
      ^~~~

This patch fixes the issue by adding the "fall through" comment. Fixes: f1ab4efa ("IB/hfi1: Enable TID RDMA READ protocol") Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
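The shape of such a fix can be shown with a reduced example. This is not the hfi1 code: `enum demo_qp_type` and `demo_setup_wqe()` are hypothetical, and the bitmask return value exists only so the behavior can be observed.

```c
enum demo_qp_type { DEMO_QPT_RC, DEMO_QPT_UC, DEMO_QPT_UD };

/* Returns a bitmask of the steps that ran (bit 0: RC-only, bit 1: shared). */
int demo_setup_wqe(enum demo_qp_type type)
{
    int steps = 0;

    switch (type) {
    case DEMO_QPT_RC:
        steps |= 1;   /* RC-specific setup */
        /* fall through */
    case DEMO_QPT_UC:
        steps |= 2;   /* setup shared by RC and UC */
        break;
    default:
        break;
    }
    return steps;
}
```

The comment does not change the generated code; it only tells the compiler, and the reader, that the RC case continues into the UC case intentionally.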
-
Committed by Jason Gunthorpe
The new output_written block was wrongly placed before the ret=0, causing the error code to be lost. uverbs_output_written is not expected to fail, and even if it does fail it has no significant impact on the userspace flow. Reported-by: Bart Van Assche <bvanassche@acm.org> Fixes: d6f4a21f ("RDMA/uverbs: Mark ioctl responses with UVERBS_ATTR_F_VALID_OUTPUT") Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
-
Committed by Shamir Rabinovitch
Now that the udata is passed to all the ib_xxx object creation APIs, and the additional macro 'rdma_udata_to_drv_context' is available to get the ib_ucontext from the ib_udata stored in uverbs_attr_bundle, we can finally start to remove the drivers' dependency on ib_xxx->uobject->context. Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Committed by Shamir Rabinovitch
Add a helper function to get the driver's context out of the ib_udata wrapped in uverbs_attr_bundle for user objects, or NULL for kernel objects. Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
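The "udata embedded in a bundle" lookup can be sketched with a container_of-style macro. Everything here is a simplified, hypothetical model (`struct demo_udata`, `struct demo_attr_bundle`, `demo_udata_to_context()`); the real helper and structures in the kernel carry more state and may differ in detail.

```c
#include <stddef.h>

struct demo_ucontext { int id; };

struct demo_udata { int unused; };

/* Bundle that wraps the udata and remembers the owning user context. */
struct demo_attr_bundle {
    struct demo_udata driver_udata;
    struct demo_ucontext *context;
};

/* Recover the enclosing struct from a pointer to one of its members. */
#define demo_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* A NULL udata models a kernel object, which has no user context. */
struct demo_ucontext *demo_udata_to_context(struct demo_udata *udata)
{
    if (!udata)
        return NULL;
    return demo_container_of(udata, struct demo_attr_bundle,
                             driver_udata)->context;
}
```

The lookup is valid only when the udata really is embedded in a bundle, which is the invariant such a helper relies on.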
-
Committed by Shamir Rabinovitch
Add ib_ucontext to the uverbs_attr_bundle sent down the ioctl and cmd flows as soon as the flow has an ib_uobject. In addition, remove the rdma_get_ucontext helper function, which is only used by ib_umem_get. Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
- 15 Feb, 2019 5 commits
-
-
Committed by Erez Alfasi
Change debug statements to use %s and __func__ instead of the hard-coded function name. Signed-off-by: Erez Alfasi <ereza@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
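As a minimal illustration of the pattern, the sketch below formats a message with `%s` and `__func__` instead of spelling out the function name. The `demo_dbg()` macro and `demo_send_hello()` function are hypothetical, and the message is written to a buffer rather than the kernel log so the result can be inspected.

```c
#include <stdio.h>

/* Prefix a message with the enclosing function's name via __func__. */
#define demo_dbg(buf, len, msg) \
    snprintf((buf), (len), "%s: %s", __func__, (msg))

int demo_send_hello(char *buf, size_t len)
{
    /* No hard-coded "demo_send_hello" string to keep in sync on rename. */
    return demo_dbg(buf, len, "sending hello");
}
```

Because `__func__` is expanded inside whichever function uses the macro, renaming or copying the function cannot leave a stale name in the message.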
-
Committed by YueHaibing
Fixes gcc '-Wunused-but-set-variable' warning:

    drivers/infiniband/core/iwpm_util.c: In function 'iwpm_send_hello':
    drivers/infiniband/core/iwpm_util.c:811:6: warning: variable 'msg_seq' set but not used [-Wunused-but-set-variable]

It has never been used since its introduction in commit b0bad9ad ("RDMA/IWPM: Support no port mapping requirements"). Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Committed by Lijun Ou
This patch adds a new device capability, IB_DEVICE_MEM_MGT_EXTENSIONS, to indicate device support for the following features:
1. Fast register memory region.
2. Send with remote invalidate by frmr.
3. Local invalidate memory region.
It also adds the max depth of the frmr page list len. Signed-off-by: Yangyang Li <liyangyang20@huawei.com> Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Committed by Yixian Liu
Currently, all messages printed for aeq subtype events are wrong. Delete them and print only the value of the subtype event. Signed-off-by: Yixian Liu <liuyixian@huawei.com> Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Committed by Yixian Liu
The memory allocated for wrid should be initialized to zero. Signed-off-by: Yixian Liu <liuyixian@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
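In user-space terms, the change corresponds to switching from a non-zeroing allocator to a zeroing one. The sketch below uses the standard `calloc()` as a stand-in; `demo_alloc_wrid()` is a hypothetical name and the driver's actual allocation call is not shown in the message.

```c
#include <stdint.h>
#include <stdlib.h>

/* calloc() zero-fills the array, modeling a zeroing kernel allocator. */
uint64_t *demo_alloc_wrid(size_t wqe_cnt)
{
    return calloc(wqe_cnt, sizeof(uint64_t));
}
```

With a zeroed array, a slot that has not been written yet reads as 0 rather than as leftover heap contents.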
-