commit 7b1cd95d
Author: Linus Torvalds

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull RDMA subsystem updates from Jason Gunthorpe:
 "Overall this cycle did not have any major excitement, and did not
  require any shared branch with netdev.

  Lots of driver updates, particularly of the scale-up and performance
  variety. The largest body of core work was Parav's patches fixing and
   restructuring some of the core code to make way for future RDMA
  containerization.

  Summary:

   - misc small driver fixups to
     bnxt_re/hfi1/qib/hns/ocrdma/rdmavt/vmw_pvrdma/nes

   - several major feature adds to bnxt_re driver: SRIOV VF RoCE
     support, HugePages support, extended hardware stats support, and
     SRQ support

   - a notable number of fixes to the i40iw driver from debugging scale
     up testing

   - more work to enable the new hip08 chip in the hns driver

   - misc small ULP fixups to srp/srpt/ipoib

   - preparation for srp initiator and target to support the RDMA-CM
     protocol for connections

   - add RDMA-CM support to srp initiator, srp target is still a WIP

   - fixes for a couple of places where ipoib could spam the dmesg log

   - fix encode/decode of FDR/EDR data rates in the core

   - many patches from Parav with ongoing work to clean up
     inconsistencies and bugs in RoCE support around the rdma_cm

   - mlx5 driver support for the userspace features 'thread domain',
     'wallclock timestamps' and 'DV Direct Connected transport'. Support
     for the firmware dual port RoCE capability

   - core support for more than 32 rdma devices in the char dev
     allocation

   - kernel doc updates from Randy Dunlap

   - new netlink uAPI for inspecting RDMA objects similar in spirit to 'ss'

   - one minor change to the kobject code acked by Greg KH"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (259 commits)
  RDMA/nldev: Provide detailed QP information
  RDMA/nldev: Provide global resource utilization
  RDMA/core: Add resource tracking for create and destroy PDs
  RDMA/core: Add resource tracking for create and destroy CQs
  RDMA/core: Add resource tracking for create and destroy QPs
  RDMA/restrack: Add general infrastructure to track RDMA resources
  RDMA/core: Save kernel caller name when creating PD and CQ objects
  RDMA/core: Use the MODNAME instead of the function name for pd callers
  RDMA: Move enum ib_cq_creation_flags to uapi headers
  IB/rxe: Change RDMA_RXE kconfig to use select
  IB/qib: remove qib_keys.c
  IB/mthca: remove mthca_user.h
  RDMA/cm: Fix access to uninitialized variable
  RDMA/cma: Use existing netif_is_bond_master function
  IB/core: Avoid SGID attributes query while converting GID from OPA to IB
  RDMA/mlx5: Avoid memory leak in case of XRCD dealloc failure
  IB/umad: Fix use of unprotected device pointer
  IB/iser: Combine substrings for three messages
  IB/iser: Delete an unnecessary variable initialisation in iser_send_data_out()
  IB/iser: Delete an error message for a failed memory allocation in iser_send_data_out()
  ...
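Several commits in the list above add the new resource-tracking ("restrack") core and register PDs, CQs and QPs with it so they become visible through the new nldev netlink interface. A rough, hedged sketch of the registration pattern, mirroring the _ib_create_qp() helper that appears later in this diff (the wrapper function itself is hypothetical):

#include <rdma/ib_verbs.h>
#include <rdma/restrack.h>

/* Hypothetical helper, for illustration only. */
static void example_track_qp(struct ib_qp *qp)
{
	/* Tag the object type, then register it with the restrack core. */
	qp->res.type = RDMA_RESTRACK_QP;
	rdma_restrack_add(&qp->res);
}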
@@ -6892,7 +6892,7 @@ M:	Jason Gunthorpe <jgg@mellanox.com>
 L:	linux-rdma@vger.kernel.org
 W:	http://www.openfabrics.org/
 Q:	http://patchwork.kernel.org/project/linux-rdma/list/
-T:	git git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma.git
+T:	git git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git
 S:	Supported
 F:	Documentation/devicetree/bindings/infiniband/
 F:	Documentation/infiniband/
@@ -11218,7 +11218,8 @@ S:	Maintained
 F:	drivers/firmware/qemu_fw_cfg.c
 QIB DRIVER
-M:	Mike Marciniszyn <infinipath@intel.com>
+M:	Dennis Dalessandro <dennis.dalessandro@intel.com>
+M:	Mike Marciniszyn <mike.marciniszyn@intel.com>
 L:	linux-rdma@vger.kernel.org
 S:	Supported
 F:	drivers/infiniband/hw/qib/
@@ -11245,7 +11246,6 @@ F:	include/linux/qed/
 F:	drivers/net/ethernet/qlogic/qede/
 QLOGIC QL4xxx RDMA DRIVER
-M:	Ram Amrani <Ram.Amrani@cavium.com>
 M:	Michal Kalderon <Michal.Kalderon@cavium.com>
 M:	Ariel Elior <Ariel.Elior@cavium.com>
 L:	linux-rdma@vger.kernel.org
@@ -11507,6 +11507,7 @@ F:	drivers/net/ethernet/rdc/r6040.c
 RDMAVT - RDMA verbs software
 M:	Dennis Dalessandro <dennis.dalessandro@intel.com>
+M:	Mike Marciniszyn <mike.marciniszyn@intel.com>
 L:	linux-rdma@vger.kernel.org
 S:	Supported
 F:	drivers/infiniband/sw/rdmavt
......
@@ -12,7 +12,7 @@ ib_core-y := packer.o ud_header.o verbs.o cq.o rw.o sysfs.o \
 			device.o fmr_pool.o cache.o netlink.o \
 			roce_gid_mgmt.o mr_pool.o addr.o sa_query.o \
 			multicast.o mad.o smi.o agent.o mad_rmpp.o \
-			security.o nldev.o
+			security.o nldev.o restrack.o
 ib_core-$(CONFIG_INFINIBAND_USER_MEM) += umem.o
 ib_core-$(CONFIG_INFINIBAND_ON_DEMAND_PAGING) += umem_odp.o
......
...@@ -243,8 +243,7 @@ void rdma_copy_addr(struct rdma_dev_addr *dev_addr, ...@@ -243,8 +243,7 @@ void rdma_copy_addr(struct rdma_dev_addr *dev_addr,
EXPORT_SYMBOL(rdma_copy_addr); EXPORT_SYMBOL(rdma_copy_addr);
int rdma_translate_ip(const struct sockaddr *addr, int rdma_translate_ip(const struct sockaddr *addr,
struct rdma_dev_addr *dev_addr, struct rdma_dev_addr *dev_addr)
u16 *vlan_id)
{ {
struct net_device *dev; struct net_device *dev;
...@@ -266,9 +265,6 @@ int rdma_translate_ip(const struct sockaddr *addr, ...@@ -266,9 +265,6 @@ int rdma_translate_ip(const struct sockaddr *addr,
return -EADDRNOTAVAIL; return -EADDRNOTAVAIL;
rdma_copy_addr(dev_addr, dev, NULL); rdma_copy_addr(dev_addr, dev, NULL);
dev_addr->bound_dev_if = dev->ifindex;
if (vlan_id)
*vlan_id = rdma_vlan_dev_vlan_id(dev);
dev_put(dev); dev_put(dev);
break; break;
#if IS_ENABLED(CONFIG_IPV6) #if IS_ENABLED(CONFIG_IPV6)
...@@ -279,9 +275,6 @@ int rdma_translate_ip(const struct sockaddr *addr, ...@@ -279,9 +275,6 @@ int rdma_translate_ip(const struct sockaddr *addr,
&((const struct sockaddr_in6 *)addr)->sin6_addr, &((const struct sockaddr_in6 *)addr)->sin6_addr,
dev, 1)) { dev, 1)) {
rdma_copy_addr(dev_addr, dev, NULL); rdma_copy_addr(dev_addr, dev, NULL);
dev_addr->bound_dev_if = dev->ifindex;
if (vlan_id)
*vlan_id = rdma_vlan_dev_vlan_id(dev);
break; break;
} }
} }
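For reference, rdma_translate_ip() loses its vlan_id out-parameter in this series, so existing callers simply drop the third argument. A minimal, hedged sketch of the new calling convention (the wrapper below is hypothetical, not part of this diff):

#include <rdma/ib_addr.h>

/* Hypothetical wrapper, for illustration only. */
static int example_translate(const struct sockaddr *addr,
			     struct rdma_dev_addr *dev_addr)
{
	/* Old call: rdma_translate_ip(addr, dev_addr, &vlan_id); */
	return rdma_translate_ip(addr, dev_addr);
}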
...@@ -481,7 +474,7 @@ static int addr_resolve_neigh(struct dst_entry *dst, ...@@ -481,7 +474,7 @@ static int addr_resolve_neigh(struct dst_entry *dst,
if (dst->dev->flags & IFF_LOOPBACK) { if (dst->dev->flags & IFF_LOOPBACK) {
int ret; int ret;
ret = rdma_translate_ip(dst_in, addr, NULL); ret = rdma_translate_ip(dst_in, addr);
if (!ret) if (!ret)
memcpy(addr->dst_dev_addr, addr->src_dev_addr, memcpy(addr->dst_dev_addr, addr->src_dev_addr,
MAX_ADDR_LEN); MAX_ADDR_LEN);
...@@ -558,7 +551,7 @@ static int addr_resolve(struct sockaddr *src_in, ...@@ -558,7 +551,7 @@ static int addr_resolve(struct sockaddr *src_in,
} }
if (ndev->flags & IFF_LOOPBACK) { if (ndev->flags & IFF_LOOPBACK) {
ret = rdma_translate_ip(dst_in, addr, NULL); ret = rdma_translate_ip(dst_in, addr);
/* /*
* Put the loopback device and get the translated * Put the loopback device and get the translated
* device instead. * device instead.
...@@ -744,7 +737,6 @@ void rdma_addr_cancel(struct rdma_dev_addr *addr) ...@@ -744,7 +737,6 @@ void rdma_addr_cancel(struct rdma_dev_addr *addr)
EXPORT_SYMBOL(rdma_addr_cancel); EXPORT_SYMBOL(rdma_addr_cancel);
struct resolve_cb_context { struct resolve_cb_context {
struct rdma_dev_addr *addr;
struct completion comp; struct completion comp;
int status; int status;
}; };
...@@ -752,39 +744,31 @@ struct resolve_cb_context { ...@@ -752,39 +744,31 @@ struct resolve_cb_context {
static void resolve_cb(int status, struct sockaddr *src_addr, static void resolve_cb(int status, struct sockaddr *src_addr,
struct rdma_dev_addr *addr, void *context) struct rdma_dev_addr *addr, void *context)
{ {
if (!status)
memcpy(((struct resolve_cb_context *)context)->addr,
addr, sizeof(struct rdma_dev_addr));
((struct resolve_cb_context *)context)->status = status; ((struct resolve_cb_context *)context)->status = status;
complete(&((struct resolve_cb_context *)context)->comp); complete(&((struct resolve_cb_context *)context)->comp);
} }
int rdma_addr_find_l2_eth_by_grh(const union ib_gid *sgid, int rdma_addr_find_l2_eth_by_grh(const union ib_gid *sgid,
const union ib_gid *dgid, const union ib_gid *dgid,
u8 *dmac, u16 *vlan_id, int *if_index, u8 *dmac, const struct net_device *ndev,
int *hoplimit) int *hoplimit)
{ {
int ret = 0;
struct rdma_dev_addr dev_addr; struct rdma_dev_addr dev_addr;
struct resolve_cb_context ctx; struct resolve_cb_context ctx;
struct net_device *dev;
union { union {
struct sockaddr _sockaddr; struct sockaddr _sockaddr;
struct sockaddr_in _sockaddr_in; struct sockaddr_in _sockaddr_in;
struct sockaddr_in6 _sockaddr_in6; struct sockaddr_in6 _sockaddr_in6;
} sgid_addr, dgid_addr; } sgid_addr, dgid_addr;
int ret;
rdma_gid2ip(&sgid_addr._sockaddr, sgid); rdma_gid2ip(&sgid_addr._sockaddr, sgid);
rdma_gid2ip(&dgid_addr._sockaddr, dgid); rdma_gid2ip(&dgid_addr._sockaddr, dgid);
memset(&dev_addr, 0, sizeof(dev_addr)); memset(&dev_addr, 0, sizeof(dev_addr));
if (if_index) dev_addr.bound_dev_if = ndev->ifindex;
dev_addr.bound_dev_if = *if_index;
dev_addr.net = &init_net; dev_addr.net = &init_net;
ctx.addr = &dev_addr;
init_completion(&ctx.comp); init_completion(&ctx.comp);
ret = rdma_resolve_ip(&self, &sgid_addr._sockaddr, &dgid_addr._sockaddr, ret = rdma_resolve_ip(&self, &sgid_addr._sockaddr, &dgid_addr._sockaddr,
&dev_addr, 1000, resolve_cb, &ctx); &dev_addr, 1000, resolve_cb, &ctx);
...@@ -798,42 +782,9 @@ int rdma_addr_find_l2_eth_by_grh(const union ib_gid *sgid, ...@@ -798,42 +782,9 @@ int rdma_addr_find_l2_eth_by_grh(const union ib_gid *sgid,
return ret; return ret;
memcpy(dmac, dev_addr.dst_dev_addr, ETH_ALEN); memcpy(dmac, dev_addr.dst_dev_addr, ETH_ALEN);
dev = dev_get_by_index(&init_net, dev_addr.bound_dev_if); *hoplimit = dev_addr.hoplimit;
if (!dev) return 0;
return -ENODEV;
if (if_index)
*if_index = dev_addr.bound_dev_if;
if (vlan_id)
*vlan_id = rdma_vlan_dev_vlan_id(dev);
if (hoplimit)
*hoplimit = dev_addr.hoplimit;
dev_put(dev);
return ret;
}
EXPORT_SYMBOL(rdma_addr_find_l2_eth_by_grh);
int rdma_addr_find_smac_by_sgid(union ib_gid *sgid, u8 *smac, u16 *vlan_id)
{
int ret = 0;
struct rdma_dev_addr dev_addr;
union {
struct sockaddr _sockaddr;
struct sockaddr_in _sockaddr_in;
struct sockaddr_in6 _sockaddr_in6;
} gid_addr;
rdma_gid2ip(&gid_addr._sockaddr, sgid);
memset(&dev_addr, 0, sizeof(dev_addr));
dev_addr.net = &init_net;
ret = rdma_translate_ip(&gid_addr._sockaddr, &dev_addr, vlan_id);
if (ret)
return ret;
memcpy(smac, dev_addr.src_dev_addr, ETH_ALEN);
return ret;
} }
EXPORT_SYMBOL(rdma_addr_find_smac_by_sgid);
static int netevent_callback(struct notifier_block *self, unsigned long event, static int netevent_callback(struct notifier_block *self, unsigned long event,
void *ctx) void *ctx)
......
...@@ -573,27 +573,24 @@ static int ib_cache_gid_find_by_filter(struct ib_device *ib_dev, ...@@ -573,27 +573,24 @@ static int ib_cache_gid_find_by_filter(struct ib_device *ib_dev,
struct ib_gid_attr attr; struct ib_gid_attr attr;
if (table->data_vec[i].props & GID_TABLE_ENTRY_INVALID) if (table->data_vec[i].props & GID_TABLE_ENTRY_INVALID)
goto next; continue;
if (memcmp(gid, &table->data_vec[i].gid, sizeof(*gid))) if (memcmp(gid, &table->data_vec[i].gid, sizeof(*gid)))
goto next; continue;
memcpy(&attr, &table->data_vec[i].attr, sizeof(attr)); memcpy(&attr, &table->data_vec[i].attr, sizeof(attr));
if (filter(gid, &attr, context)) if (filter(gid, &attr, context)) {
found = true; found = true;
if (index)
next: *index = i;
if (found)
break; break;
}
} }
read_unlock_irqrestore(&table->rwlock, flags); read_unlock_irqrestore(&table->rwlock, flags);
if (!found) if (!found)
return -ENOENT; return -ENOENT;
if (index)
*index = i;
return 0; return 0;
} }
...@@ -824,12 +821,7 @@ static int gid_table_setup_one(struct ib_device *ib_dev) ...@@ -824,12 +821,7 @@ static int gid_table_setup_one(struct ib_device *ib_dev)
if (err) if (err)
return err; return err;
err = roce_rescan_device(ib_dev); rdma_roce_rescan_device(ib_dev);
if (err) {
gid_table_cleanup_one(ib_dev);
gid_table_release_one(ib_dev);
}
return err; return err;
} }
...@@ -883,7 +875,6 @@ int ib_find_gid_by_filter(struct ib_device *device, ...@@ -883,7 +875,6 @@ int ib_find_gid_by_filter(struct ib_device *device,
port_num, filter, port_num, filter,
context, index); context, index);
} }
EXPORT_SYMBOL(ib_find_gid_by_filter);
int ib_get_cached_pkey(struct ib_device *device, int ib_get_cached_pkey(struct ib_device *device,
u8 port_num, u8 port_num,
......
...@@ -452,13 +452,14 @@ static void cm_set_private_data(struct cm_id_private *cm_id_priv, ...@@ -452,13 +452,14 @@ static void cm_set_private_data(struct cm_id_private *cm_id_priv,
cm_id_priv->private_data_len = private_data_len; cm_id_priv->private_data_len = private_data_len;
} }
static void cm_init_av_for_response(struct cm_port *port, struct ib_wc *wc, static int cm_init_av_for_response(struct cm_port *port, struct ib_wc *wc,
struct ib_grh *grh, struct cm_av *av) struct ib_grh *grh, struct cm_av *av)
{ {
av->port = port; av->port = port;
av->pkey_index = wc->pkey_index; av->pkey_index = wc->pkey_index;
ib_init_ah_from_wc(port->cm_dev->ib_device, port->port_num, wc, return ib_init_ah_attr_from_wc(port->cm_dev->ib_device,
grh, &av->ah_attr); port->port_num, wc,
grh, &av->ah_attr);
} }
static int cm_init_av_by_path(struct sa_path_rec *path, struct cm_av *av, static int cm_init_av_by_path(struct sa_path_rec *path, struct cm_av *av,
...@@ -494,8 +495,11 @@ static int cm_init_av_by_path(struct sa_path_rec *path, struct cm_av *av, ...@@ -494,8 +495,11 @@ static int cm_init_av_by_path(struct sa_path_rec *path, struct cm_av *av,
return ret; return ret;
av->port = port; av->port = port;
ib_init_ah_from_path(cm_dev->ib_device, port->port_num, path, ret = ib_init_ah_attr_from_path(cm_dev->ib_device, port->port_num, path,
&av->ah_attr); &av->ah_attr);
if (ret)
return ret;
av->timeout = path->packet_life_time + 1; av->timeout = path->packet_life_time + 1;
spin_lock_irqsave(&cm.lock, flags); spin_lock_irqsave(&cm.lock, flags);
...@@ -1560,6 +1564,35 @@ static u16 cm_get_bth_pkey(struct cm_work *work) ...@@ -1560,6 +1564,35 @@ static u16 cm_get_bth_pkey(struct cm_work *work)
return pkey; return pkey;
} }
/**
* Convert OPA SGID to IB SGID
* ULPs (such as IPoIB) do not understand OPA GIDs and will
* reject them as the local_gid will not match the sgid. Therefore,
* change the pathrec's SGID to an IB SGID.
*
* @work: Work completion
* @path: Path record
*/
static void cm_opa_to_ib_sgid(struct cm_work *work,
struct sa_path_rec *path)
{
struct ib_device *dev = work->port->cm_dev->ib_device;
u8 port_num = work->port->port_num;
if (rdma_cap_opa_ah(dev, port_num) &&
(ib_is_opa_gid(&path->sgid))) {
union ib_gid sgid;
if (ib_get_cached_gid(dev, port_num, 0, &sgid, NULL)) {
dev_warn(&dev->dev,
"Error updating sgid in CM request\n");
return;
}
path->sgid = sgid;
}
}
static void cm_format_req_event(struct cm_work *work, static void cm_format_req_event(struct cm_work *work,
struct cm_id_private *cm_id_priv, struct cm_id_private *cm_id_priv,
struct ib_cm_id *listen_id) struct ib_cm_id *listen_id)
...@@ -1573,10 +1606,13 @@ static void cm_format_req_event(struct cm_work *work, ...@@ -1573,10 +1606,13 @@ static void cm_format_req_event(struct cm_work *work,
param->bth_pkey = cm_get_bth_pkey(work); param->bth_pkey = cm_get_bth_pkey(work);
param->port = cm_id_priv->av.port->port_num; param->port = cm_id_priv->av.port->port_num;
param->primary_path = &work->path[0]; param->primary_path = &work->path[0];
if (cm_req_has_alt_path(req_msg)) cm_opa_to_ib_sgid(work, param->primary_path);
if (cm_req_has_alt_path(req_msg)) {
param->alternate_path = &work->path[1]; param->alternate_path = &work->path[1];
else cm_opa_to_ib_sgid(work, param->alternate_path);
} else {
param->alternate_path = NULL; param->alternate_path = NULL;
}
param->remote_ca_guid = req_msg->local_ca_guid; param->remote_ca_guid = req_msg->local_ca_guid;
param->remote_qkey = be32_to_cpu(req_msg->local_qkey); param->remote_qkey = be32_to_cpu(req_msg->local_qkey);
param->remote_qpn = be32_to_cpu(cm_req_get_local_qpn(req_msg)); param->remote_qpn = be32_to_cpu(cm_req_get_local_qpn(req_msg));
...@@ -1826,9 +1862,11 @@ static int cm_req_handler(struct cm_work *work) ...@@ -1826,9 +1862,11 @@ static int cm_req_handler(struct cm_work *work)
cm_id_priv = container_of(cm_id, struct cm_id_private, id); cm_id_priv = container_of(cm_id, struct cm_id_private, id);
cm_id_priv->id.remote_id = req_msg->local_comm_id; cm_id_priv->id.remote_id = req_msg->local_comm_id;
cm_init_av_for_response(work->port, work->mad_recv_wc->wc, ret = cm_init_av_for_response(work->port, work->mad_recv_wc->wc,
work->mad_recv_wc->recv_buf.grh, work->mad_recv_wc->recv_buf.grh,
&cm_id_priv->av); &cm_id_priv->av);
if (ret)
goto destroy;
cm_id_priv->timewait_info = cm_create_timewait_info(cm_id_priv-> cm_id_priv->timewait_info = cm_create_timewait_info(cm_id_priv->
id.local_id); id.local_id);
if (IS_ERR(cm_id_priv->timewait_info)) { if (IS_ERR(cm_id_priv->timewait_info)) {
...@@ -1841,9 +1879,10 @@ static int cm_req_handler(struct cm_work *work) ...@@ -1841,9 +1879,10 @@ static int cm_req_handler(struct cm_work *work)
listen_cm_id_priv = cm_match_req(work, cm_id_priv); listen_cm_id_priv = cm_match_req(work, cm_id_priv);
if (!listen_cm_id_priv) { if (!listen_cm_id_priv) {
pr_debug("%s: local_id %d, no listen_cm_id_priv\n", __func__,
be32_to_cpu(cm_id->local_id));
ret = -EINVAL; ret = -EINVAL;
kfree(cm_id_priv->timewait_info); goto free_timeinfo;
goto destroy;
} }
cm_id_priv->id.cm_handler = listen_cm_id_priv->id.cm_handler; cm_id_priv->id.cm_handler = listen_cm_id_priv->id.cm_handler;
...@@ -1861,56 +1900,50 @@ static int cm_req_handler(struct cm_work *work) ...@@ -1861,56 +1900,50 @@ static int cm_req_handler(struct cm_work *work)
work->port->port_num, work->port->port_num,
grh->sgid_index, grh->sgid_index,
&gid, &gid_attr); &gid, &gid_attr);
if (!ret) { if (ret) {
if (gid_attr.ndev) { ib_send_cm_rej(cm_id, IB_CM_REJ_UNSUPPORTED, NULL, 0, NULL, 0);
work->path[0].rec_type = goto rejected;
sa_conv_gid_to_pathrec_type(gid_attr.gid_type); }
sa_path_set_ifindex(&work->path[0],
gid_attr.ndev->ifindex); if (gid_attr.ndev) {
sa_path_set_ndev(&work->path[0], work->path[0].rec_type =
dev_net(gid_attr.ndev)); sa_conv_gid_to_pathrec_type(gid_attr.gid_type);
dev_put(gid_attr.ndev); sa_path_set_ifindex(&work->path[0],
} else { gid_attr.ndev->ifindex);
cm_path_set_rec_type(work->port->cm_dev->ib_device, sa_path_set_ndev(&work->path[0],
work->port->port_num, dev_net(gid_attr.ndev));
&work->path[0], dev_put(gid_attr.ndev);
&req_msg->primary_local_gid); } else {
} cm_path_set_rec_type(work->port->cm_dev->ib_device,
if (cm_req_has_alt_path(req_msg)) work->port->port_num,
work->path[1].rec_type = work->path[0].rec_type; &work->path[0],
cm_format_paths_from_req(req_msg, &work->path[0], &req_msg->primary_local_gid);
&work->path[1]);
if (cm_id_priv->av.ah_attr.type == RDMA_AH_ATTR_TYPE_ROCE)
sa_path_set_dmac(&work->path[0],
cm_id_priv->av.ah_attr.roce.dmac);
work->path[0].hop_limit = grh->hop_limit;
ret = cm_init_av_by_path(&work->path[0], &cm_id_priv->av,
cm_id_priv);
} }
if (cm_req_has_alt_path(req_msg))
work->path[1].rec_type = work->path[0].rec_type;
cm_format_paths_from_req(req_msg, &work->path[0],
&work->path[1]);
if (cm_id_priv->av.ah_attr.type == RDMA_AH_ATTR_TYPE_ROCE)
sa_path_set_dmac(&work->path[0],
cm_id_priv->av.ah_attr.roce.dmac);
work->path[0].hop_limit = grh->hop_limit;
ret = cm_init_av_by_path(&work->path[0], &cm_id_priv->av,
cm_id_priv);
if (ret) { if (ret) {
int err = ib_get_cached_gid(work->port->cm_dev->ib_device, int err;
work->port->port_num, 0,
&work->path[0].sgid, err = ib_get_cached_gid(work->port->cm_dev->ib_device,
&gid_attr); work->port->port_num, 0,
if (!err && gid_attr.ndev) { &work->path[0].sgid,
work->path[0].rec_type = NULL);
sa_conv_gid_to_pathrec_type(gid_attr.gid_type); if (err)
sa_path_set_ifindex(&work->path[0], ib_send_cm_rej(cm_id, IB_CM_REJ_INVALID_GID,
gid_attr.ndev->ifindex); NULL, 0, NULL, 0);
sa_path_set_ndev(&work->path[0], else
dev_net(gid_attr.ndev)); ib_send_cm_rej(cm_id, IB_CM_REJ_INVALID_GID,
dev_put(gid_attr.ndev); &work->path[0].sgid,
} else { sizeof(work->path[0].sgid),
cm_path_set_rec_type(work->port->cm_dev->ib_device, NULL, 0);
work->port->port_num,
&work->path[0],
&req_msg->primary_local_gid);
}
if (cm_req_has_alt_path(req_msg))
work->path[1].rec_type = work->path[0].rec_type;
ib_send_cm_rej(cm_id, IB_CM_REJ_INVALID_GID,
&work->path[0].sgid, sizeof work->path[0].sgid,
NULL, 0);
goto rejected; goto rejected;
} }
if (cm_req_has_alt_path(req_msg)) { if (cm_req_has_alt_path(req_msg)) {
...@@ -1919,7 +1952,7 @@ static int cm_req_handler(struct cm_work *work) ...@@ -1919,7 +1952,7 @@ static int cm_req_handler(struct cm_work *work)
if (ret) { if (ret) {
ib_send_cm_rej(cm_id, IB_CM_REJ_INVALID_ALT_GID, ib_send_cm_rej(cm_id, IB_CM_REJ_INVALID_ALT_GID,
&work->path[0].sgid, &work->path[0].sgid,
sizeof work->path[0].sgid, NULL, 0); sizeof(work->path[0].sgid), NULL, 0);
goto rejected; goto rejected;
} }
} }
...@@ -1945,6 +1978,8 @@ static int cm_req_handler(struct cm_work *work) ...@@ -1945,6 +1978,8 @@ static int cm_req_handler(struct cm_work *work)
rejected: rejected:
atomic_dec(&cm_id_priv->refcount); atomic_dec(&cm_id_priv->refcount);
cm_deref_id(listen_cm_id_priv); cm_deref_id(listen_cm_id_priv);
free_timeinfo:
kfree(cm_id_priv->timewait_info);
destroy: destroy:
ib_destroy_cm_id(cm_id); ib_destroy_cm_id(cm_id);
return ret; return ret;
...@@ -1997,6 +2032,8 @@ int ib_send_cm_rep(struct ib_cm_id *cm_id, ...@@ -1997,6 +2032,8 @@ int ib_send_cm_rep(struct ib_cm_id *cm_id,
spin_lock_irqsave(&cm_id_priv->lock, flags); spin_lock_irqsave(&cm_id_priv->lock, flags);
if (cm_id->state != IB_CM_REQ_RCVD && if (cm_id->state != IB_CM_REQ_RCVD &&
cm_id->state != IB_CM_MRA_REQ_SENT) { cm_id->state != IB_CM_MRA_REQ_SENT) {
pr_debug("%s: local_comm_id %d, cm_id->state: %d\n", __func__,
be32_to_cpu(cm_id_priv->id.local_id), cm_id->state);
ret = -EINVAL; ret = -EINVAL;
goto out; goto out;
} }
...@@ -2063,6 +2100,8 @@ int ib_send_cm_rtu(struct ib_cm_id *cm_id, ...@@ -2063,6 +2100,8 @@ int ib_send_cm_rtu(struct ib_cm_id *cm_id,
spin_lock_irqsave(&cm_id_priv->lock, flags); spin_lock_irqsave(&cm_id_priv->lock, flags);
if (cm_id->state != IB_CM_REP_RCVD && if (cm_id->state != IB_CM_REP_RCVD &&
cm_id->state != IB_CM_MRA_REP_SENT) { cm_id->state != IB_CM_MRA_REP_SENT) {
pr_debug("%s: local_id %d, cm_id->state %d\n", __func__,
be32_to_cpu(cm_id->local_id), cm_id->state);
ret = -EINVAL; ret = -EINVAL;
goto error; goto error;
} }
...@@ -2170,6 +2209,8 @@ static int cm_rep_handler(struct cm_work *work) ...@@ -2170,6 +2209,8 @@ static int cm_rep_handler(struct cm_work *work)
cm_id_priv = cm_acquire_id(rep_msg->remote_comm_id, 0); cm_id_priv = cm_acquire_id(rep_msg->remote_comm_id, 0);
if (!cm_id_priv) { if (!cm_id_priv) {
cm_dup_rep_handler(work); cm_dup_rep_handler(work);
pr_debug("%s: remote_comm_id %d, no cm_id_priv\n", __func__,
be32_to_cpu(rep_msg->remote_comm_id));
return -EINVAL; return -EINVAL;
} }
...@@ -2183,6 +2224,10 @@ static int cm_rep_handler(struct cm_work *work) ...@@ -2183,6 +2224,10 @@ static int cm_rep_handler(struct cm_work *work)
default: default:
spin_unlock_irq(&cm_id_priv->lock); spin_unlock_irq(&cm_id_priv->lock);
ret = -EINVAL; ret = -EINVAL;
pr_debug("%s: cm_id_priv->id.state: %d, local_comm_id %d, remote_comm_id %d\n",
__func__, cm_id_priv->id.state,
be32_to_cpu(rep_msg->local_comm_id),
be32_to_cpu(rep_msg->remote_comm_id));
goto error; goto error;
} }
...@@ -2196,6 +2241,8 @@ static int cm_rep_handler(struct cm_work *work) ...@@ -2196,6 +2241,8 @@ static int cm_rep_handler(struct cm_work *work)
spin_unlock(&cm.lock); spin_unlock(&cm.lock);
spin_unlock_irq(&cm_id_priv->lock); spin_unlock_irq(&cm_id_priv->lock);
ret = -EINVAL; ret = -EINVAL;
pr_debug("%s: Failed to insert remote id %d\n", __func__,
be32_to_cpu(rep_msg->remote_comm_id));
goto error; goto error;
} }
/* Check for a stale connection. */ /* Check for a stale connection. */
...@@ -2213,6 +2260,10 @@ static int cm_rep_handler(struct cm_work *work) ...@@ -2213,6 +2260,10 @@ static int cm_rep_handler(struct cm_work *work)
IB_CM_REJ_STALE_CONN, CM_MSG_RESPONSE_REP, IB_CM_REJ_STALE_CONN, CM_MSG_RESPONSE_REP,
NULL, 0); NULL, 0);
ret = -EINVAL; ret = -EINVAL;
pr_debug("%s: Stale connection. local_comm_id %d, remote_comm_id %d\n",
__func__, be32_to_cpu(rep_msg->local_comm_id),
be32_to_cpu(rep_msg->remote_comm_id));
if (cur_cm_id_priv) { if (cur_cm_id_priv) {
cm_id = &cur_cm_id_priv->id; cm_id = &cur_cm_id_priv->id;
ib_send_cm_dreq(cm_id, NULL, 0); ib_send_cm_dreq(cm_id, NULL, 0);
...@@ -2359,6 +2410,8 @@ int ib_send_cm_dreq(struct ib_cm_id *cm_id, ...@@ -2359,6 +2410,8 @@ int ib_send_cm_dreq(struct ib_cm_id *cm_id,
cm_id_priv = container_of(cm_id, struct cm_id_private, id); cm_id_priv = container_of(cm_id, struct cm_id_private, id);
spin_lock_irqsave(&cm_id_priv->lock, flags); spin_lock_irqsave(&cm_id_priv->lock, flags);
if (cm_id->state != IB_CM_ESTABLISHED) { if (cm_id->state != IB_CM_ESTABLISHED) {
pr_debug("%s: local_id %d, cm_id->state: %d\n", __func__,
be32_to_cpu(cm_id->local_id), cm_id->state);
ret = -EINVAL; ret = -EINVAL;
goto out; goto out;
} }
...@@ -2428,6 +2481,8 @@ int ib_send_cm_drep(struct ib_cm_id *cm_id, ...@@ -2428,6 +2481,8 @@ int ib_send_cm_drep(struct ib_cm_id *cm_id,
if (cm_id->state != IB_CM_DREQ_RCVD) { if (cm_id->state != IB_CM_DREQ_RCVD) {
spin_unlock_irqrestore(&cm_id_priv->lock, flags); spin_unlock_irqrestore(&cm_id_priv->lock, flags);
kfree(data); kfree(data);
pr_debug("%s: local_id %d, cm_idcm_id->state(%d) != IB_CM_DREQ_RCVD\n",
__func__, be32_to_cpu(cm_id->local_id), cm_id->state);
return -EINVAL; return -EINVAL;
} }
...@@ -2493,6 +2548,9 @@ static int cm_dreq_handler(struct cm_work *work) ...@@ -2493,6 +2548,9 @@ static int cm_dreq_handler(struct cm_work *work)
atomic_long_inc(&work->port->counter_group[CM_RECV_DUPLICATES]. atomic_long_inc(&work->port->counter_group[CM_RECV_DUPLICATES].
counter[CM_DREQ_COUNTER]); counter[CM_DREQ_COUNTER]);
cm_issue_drep(work->port, work->mad_recv_wc); cm_issue_drep(work->port, work->mad_recv_wc);
pr_debug("%s: no cm_id_priv, local_comm_id %d, remote_comm_id %d\n",
__func__, be32_to_cpu(dreq_msg->local_comm_id),
be32_to_cpu(dreq_msg->remote_comm_id));
return -EINVAL; return -EINVAL;
} }
...@@ -2535,6 +2593,9 @@ static int cm_dreq_handler(struct cm_work *work) ...@@ -2535,6 +2593,9 @@ static int cm_dreq_handler(struct cm_work *work)
counter[CM_DREQ_COUNTER]); counter[CM_DREQ_COUNTER]);
goto unlock; goto unlock;
default: default:
pr_debug("%s: local_id %d, cm_id_priv->id.state: %d\n",
__func__, be32_to_cpu(cm_id_priv->id.local_id),
cm_id_priv->id.state);
goto unlock; goto unlock;
} }
cm_id_priv->id.state = IB_CM_DREQ_RCVD; cm_id_priv->id.state = IB_CM_DREQ_RCVD;
...@@ -2638,6 +2699,8 @@ int ib_send_cm_rej(struct ib_cm_id *cm_id, ...@@ -2638,6 +2699,8 @@ int ib_send_cm_rej(struct ib_cm_id *cm_id,
cm_enter_timewait(cm_id_priv); cm_enter_timewait(cm_id_priv);
break; break;
default: default:
pr_debug("%s: local_id %d, cm_id->state: %d\n", __func__,
be32_to_cpu(cm_id_priv->id.local_id), cm_id->state);
ret = -EINVAL; ret = -EINVAL;
goto out; goto out;
} }
...@@ -2748,6 +2811,9 @@ static int cm_rej_handler(struct cm_work *work) ...@@ -2748,6 +2811,9 @@ static int cm_rej_handler(struct cm_work *work)
/* fall through */ /* fall through */
default: default:
spin_unlock_irq(&cm_id_priv->lock); spin_unlock_irq(&cm_id_priv->lock);
pr_debug("%s: local_id %d, cm_id_priv->id.state: %d\n",
__func__, be32_to_cpu(cm_id_priv->id.local_id),
cm_id_priv->id.state);
ret = -EINVAL; ret = -EINVAL;
goto out; goto out;
} }
...@@ -2811,6 +2877,9 @@ int ib_send_cm_mra(struct ib_cm_id *cm_id, ...@@ -2811,6 +2877,9 @@ int ib_send_cm_mra(struct ib_cm_id *cm_id,
} }
/* fall through */ /* fall through */
default: default:
pr_debug("%s: local_id %d, cm_id_priv->id.state: %d\n",
__func__, be32_to_cpu(cm_id_priv->id.local_id),
cm_id_priv->id.state);
ret = -EINVAL; ret = -EINVAL;
goto error1; goto error1;
} }
...@@ -2912,6 +2981,9 @@ static int cm_mra_handler(struct cm_work *work) ...@@ -2912,6 +2981,9 @@ static int cm_mra_handler(struct cm_work *work)
counter[CM_MRA_COUNTER]); counter[CM_MRA_COUNTER]);
/* fall through */ /* fall through */
default: default:
pr_debug("%s local_id %d, cm_id_priv->id.state: %d\n",
__func__, be32_to_cpu(cm_id_priv->id.local_id),
cm_id_priv->id.state);
goto out; goto out;
} }
...@@ -3085,6 +3157,12 @@ static int cm_lap_handler(struct cm_work *work) ...@@ -3085,6 +3157,12 @@ static int cm_lap_handler(struct cm_work *work)
if (!cm_id_priv) if (!cm_id_priv)
return -EINVAL; return -EINVAL;
ret = cm_init_av_for_response(work->port, work->mad_recv_wc->wc,
work->mad_recv_wc->recv_buf.grh,
&cm_id_priv->av);
if (ret)
goto deref;
param = &work->cm_event.param.lap_rcvd; param = &work->cm_event.param.lap_rcvd;
memset(&work->path[0], 0, sizeof(work->path[1])); memset(&work->path[0], 0, sizeof(work->path[1]));
cm_path_set_rec_type(work->port->cm_dev->ib_device, cm_path_set_rec_type(work->port->cm_dev->ib_device,
...@@ -3131,9 +3209,6 @@ static int cm_lap_handler(struct cm_work *work) ...@@ -3131,9 +3209,6 @@ static int cm_lap_handler(struct cm_work *work)
cm_id_priv->id.lap_state = IB_CM_LAP_RCVD; cm_id_priv->id.lap_state = IB_CM_LAP_RCVD;
cm_id_priv->tid = lap_msg->hdr.tid; cm_id_priv->tid = lap_msg->hdr.tid;
cm_init_av_for_response(work->port, work->mad_recv_wc->wc,
work->mad_recv_wc->recv_buf.grh,
&cm_id_priv->av);
cm_init_av_by_path(param->alternate_path, &cm_id_priv->alt_av, cm_init_av_by_path(param->alternate_path, &cm_id_priv->alt_av,
cm_id_priv); cm_id_priv);
ret = atomic_inc_and_test(&cm_id_priv->work_count); ret = atomic_inc_and_test(&cm_id_priv->work_count);
...@@ -3386,6 +3461,7 @@ static int cm_sidr_req_handler(struct cm_work *work) ...@@ -3386,6 +3461,7 @@ static int cm_sidr_req_handler(struct cm_work *work)
struct cm_id_private *cm_id_priv, *cur_cm_id_priv; struct cm_id_private *cm_id_priv, *cur_cm_id_priv;
struct cm_sidr_req_msg *sidr_req_msg; struct cm_sidr_req_msg *sidr_req_msg;
struct ib_wc *wc; struct ib_wc *wc;
int ret;
cm_id = ib_create_cm_id(work->port->cm_dev->ib_device, NULL, NULL); cm_id = ib_create_cm_id(work->port->cm_dev->ib_device, NULL, NULL);
if (IS_ERR(cm_id)) if (IS_ERR(cm_id))
...@@ -3398,9 +3474,12 @@ static int cm_sidr_req_handler(struct cm_work *work) ...@@ -3398,9 +3474,12 @@ static int cm_sidr_req_handler(struct cm_work *work)
wc = work->mad_recv_wc->wc; wc = work->mad_recv_wc->wc;
cm_id_priv->av.dgid.global.subnet_prefix = cpu_to_be64(wc->slid); cm_id_priv->av.dgid.global.subnet_prefix = cpu_to_be64(wc->slid);
cm_id_priv->av.dgid.global.interface_id = 0; cm_id_priv->av.dgid.global.interface_id = 0;
cm_init_av_for_response(work->port, work->mad_recv_wc->wc, ret = cm_init_av_for_response(work->port, work->mad_recv_wc->wc,
work->mad_recv_wc->recv_buf.grh, work->mad_recv_wc->recv_buf.grh,
&cm_id_priv->av); &cm_id_priv->av);
if (ret)
goto out;
cm_id_priv->id.remote_id = sidr_req_msg->request_id; cm_id_priv->id.remote_id = sidr_req_msg->request_id;
cm_id_priv->tid = sidr_req_msg->hdr.tid; cm_id_priv->tid = sidr_req_msg->hdr.tid;
atomic_inc(&cm_id_priv->work_count); atomic_inc(&cm_id_priv->work_count);
...@@ -3692,6 +3771,7 @@ static void cm_work_handler(struct work_struct *_work) ...@@ -3692,6 +3771,7 @@ static void cm_work_handler(struct work_struct *_work)
ret = cm_timewait_handler(work); ret = cm_timewait_handler(work);
break; break;
default: default:
pr_debug("cm_event.event: 0x%x\n", work->cm_event.event);
ret = -EINVAL; ret = -EINVAL;
break; break;
} }
...@@ -3727,6 +3807,8 @@ static int cm_establish(struct ib_cm_id *cm_id) ...@@ -3727,6 +3807,8 @@ static int cm_establish(struct ib_cm_id *cm_id)
ret = -EISCONN; ret = -EISCONN;
break; break;
default: default:
pr_debug("%s: local_id %d, cm_id->state: %d\n", __func__,
be32_to_cpu(cm_id->local_id), cm_id->state);
ret = -EINVAL; ret = -EINVAL;
break; break;
} }
...@@ -3924,6 +4006,9 @@ static int cm_init_qp_init_attr(struct cm_id_private *cm_id_priv, ...@@ -3924,6 +4006,9 @@ static int cm_init_qp_init_attr(struct cm_id_private *cm_id_priv,
ret = 0; ret = 0;
break; break;
default: default:
pr_debug("%s: local_id %d, cm_id_priv->id.state: %d\n",
__func__, be32_to_cpu(cm_id_priv->id.local_id),
cm_id_priv->id.state);
ret = -EINVAL; ret = -EINVAL;
break; break;
} }
...@@ -3971,6 +4056,9 @@ static int cm_init_qp_rtr_attr(struct cm_id_private *cm_id_priv, ...@@ -3971,6 +4056,9 @@ static int cm_init_qp_rtr_attr(struct cm_id_private *cm_id_priv,
ret = 0; ret = 0;
break; break;
default: default:
pr_debug("%s: local_id %d, cm_id_priv->id.state: %d\n",
__func__, be32_to_cpu(cm_id_priv->id.local_id),
cm_id_priv->id.state);
ret = -EINVAL; ret = -EINVAL;
break; break;
} }
...@@ -4030,6 +4118,9 @@ static int cm_init_qp_rts_attr(struct cm_id_private *cm_id_priv, ...@@ -4030,6 +4118,9 @@ static int cm_init_qp_rts_attr(struct cm_id_private *cm_id_priv,
ret = 0; ret = 0;
break; break;
default: default:
pr_debug("%s: local_id %d, cm_id_priv->id.state: %d\n",
__func__, be32_to_cpu(cm_id_priv->id.local_id),
cm_id_priv->id.state);
ret = -EINVAL; ret = -EINVAL;
break; break;
} }
......
...@@ -601,7 +601,7 @@ static int cma_translate_addr(struct sockaddr *addr, struct rdma_dev_addr *dev_a ...@@ -601,7 +601,7 @@ static int cma_translate_addr(struct sockaddr *addr, struct rdma_dev_addr *dev_a
int ret; int ret;
if (addr->sa_family != AF_IB) { if (addr->sa_family != AF_IB) {
ret = rdma_translate_ip(addr, dev_addr, NULL); ret = rdma_translate_ip(addr, dev_addr);
} else { } else {
cma_translate_ib((struct sockaddr_ib *) addr, dev_addr); cma_translate_ib((struct sockaddr_ib *) addr, dev_addr);
ret = 0; ret = 0;
...@@ -612,11 +612,14 @@ static int cma_translate_addr(struct sockaddr *addr, struct rdma_dev_addr *dev_a ...@@ -612,11 +612,14 @@ static int cma_translate_addr(struct sockaddr *addr, struct rdma_dev_addr *dev_a
static inline int cma_validate_port(struct ib_device *device, u8 port, static inline int cma_validate_port(struct ib_device *device, u8 port,
enum ib_gid_type gid_type, enum ib_gid_type gid_type,
union ib_gid *gid, int dev_type, union ib_gid *gid,
int bound_if_index) struct rdma_id_private *id_priv)
{ {
int ret = -ENODEV; struct rdma_dev_addr *dev_addr = &id_priv->id.route.addr.dev_addr;
int bound_if_index = dev_addr->bound_dev_if;
int dev_type = dev_addr->dev_type;
struct net_device *ndev = NULL; struct net_device *ndev = NULL;
int ret = -ENODEV;
if ((dev_type == ARPHRD_INFINIBAND) && !rdma_protocol_ib(device, port)) if ((dev_type == ARPHRD_INFINIBAND) && !rdma_protocol_ib(device, port))
return ret; return ret;
...@@ -624,11 +627,13 @@ static inline int cma_validate_port(struct ib_device *device, u8 port, ...@@ -624,11 +627,13 @@ static inline int cma_validate_port(struct ib_device *device, u8 port,
if ((dev_type != ARPHRD_INFINIBAND) && rdma_protocol_ib(device, port)) if ((dev_type != ARPHRD_INFINIBAND) && rdma_protocol_ib(device, port))
return ret; return ret;
if (dev_type == ARPHRD_ETHER && rdma_protocol_roce(device, port)) if (dev_type == ARPHRD_ETHER && rdma_protocol_roce(device, port)) {
ndev = dev_get_by_index(&init_net, bound_if_index); ndev = dev_get_by_index(dev_addr->net, bound_if_index);
else if (!ndev)
return ret;
} else {
gid_type = IB_GID_TYPE_IB; gid_type = IB_GID_TYPE_IB;
}
ret = ib_find_cached_gid_by_port(device, gid, gid_type, port, ret = ib_find_cached_gid_by_port(device, gid, gid_type, port,
ndev, NULL); ndev, NULL);
...@@ -669,8 +674,7 @@ static int cma_acquire_dev(struct rdma_id_private *id_priv, ...@@ -669,8 +674,7 @@ static int cma_acquire_dev(struct rdma_id_private *id_priv,
rdma_protocol_ib(cma_dev->device, port) ? rdma_protocol_ib(cma_dev->device, port) ?
IB_GID_TYPE_IB : IB_GID_TYPE_IB :
listen_id_priv->gid_type, gidp, listen_id_priv->gid_type, gidp,
dev_addr->dev_type, id_priv);
dev_addr->bound_dev_if);
if (!ret) { if (!ret) {
id_priv->id.port_num = port; id_priv->id.port_num = port;
goto out; goto out;
...@@ -691,8 +695,7 @@ static int cma_acquire_dev(struct rdma_id_private *id_priv, ...@@ -691,8 +695,7 @@ static int cma_acquire_dev(struct rdma_id_private *id_priv,
rdma_protocol_ib(cma_dev->device, port) ? rdma_protocol_ib(cma_dev->device, port) ?
IB_GID_TYPE_IB : IB_GID_TYPE_IB :
cma_dev->default_gid_type[port - 1], cma_dev->default_gid_type[port - 1],
gidp, dev_addr->dev_type, gidp, id_priv);
dev_addr->bound_dev_if);
if (!ret) { if (!ret) {
id_priv->id.port_num = port; id_priv->id.port_num = port;
goto out; goto out;
...@@ -2036,6 +2039,33 @@ __be64 rdma_get_service_id(struct rdma_cm_id *id, struct sockaddr *addr) ...@@ -2036,6 +2039,33 @@ __be64 rdma_get_service_id(struct rdma_cm_id *id, struct sockaddr *addr)
} }
EXPORT_SYMBOL(rdma_get_service_id); EXPORT_SYMBOL(rdma_get_service_id);
void rdma_read_gids(struct rdma_cm_id *cm_id, union ib_gid *sgid,
union ib_gid *dgid)
{
struct rdma_addr *addr = &cm_id->route.addr;
if (!cm_id->device) {
if (sgid)
memset(sgid, 0, sizeof(*sgid));
if (dgid)
memset(dgid, 0, sizeof(*dgid));
return;
}
if (rdma_protocol_roce(cm_id->device, cm_id->port_num)) {
if (sgid)
rdma_ip2gid((struct sockaddr *)&addr->src_addr, sgid);
if (dgid)
rdma_ip2gid((struct sockaddr *)&addr->dst_addr, dgid);
} else {
if (sgid)
rdma_addr_get_sgid(&addr->dev_addr, sgid);
if (dgid)
rdma_addr_get_dgid(&addr->dev_addr, dgid);
}
}
EXPORT_SYMBOL(rdma_read_gids);
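rdma_read_gids() gives ULPs a single call that reports the source and destination GIDs of a cm_id for both RoCE ports (where the GIDs are derived from the IP addresses) and IB ports, as the implementation above shows. A minimal, hedged usage sketch (the debug helper is hypothetical):

#include <linux/printk.h>
#include <rdma/rdma_cm.h>

/* Hypothetical debug helper, for illustration only. */
static void example_print_gids(struct rdma_cm_id *cm_id)
{
	union ib_gid sgid, dgid;

	rdma_read_gids(cm_id, &sgid, &dgid);
	pr_info("sgid %pI6 dgid %pI6\n", sgid.raw, dgid.raw);
}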
static int cma_iw_handler(struct iw_cm_id *iw_id, struct iw_cm_event *iw_event) static int cma_iw_handler(struct iw_cm_id *iw_id, struct iw_cm_event *iw_event)
{ {
struct rdma_id_private *id_priv = iw_id->context; struct rdma_id_private *id_priv = iw_id->context;
...@@ -2132,7 +2162,7 @@ static int iw_conn_req_handler(struct iw_cm_id *cm_id, ...@@ -2132,7 +2162,7 @@ static int iw_conn_req_handler(struct iw_cm_id *cm_id,
mutex_lock_nested(&conn_id->handler_mutex, SINGLE_DEPTH_NESTING); mutex_lock_nested(&conn_id->handler_mutex, SINGLE_DEPTH_NESTING);
conn_id->state = RDMA_CM_CONNECT; conn_id->state = RDMA_CM_CONNECT;
ret = rdma_translate_ip(laddr, &conn_id->id.route.addr.dev_addr, NULL); ret = rdma_translate_ip(laddr, &conn_id->id.route.addr.dev_addr);
if (ret) { if (ret) {
mutex_unlock(&conn_id->handler_mutex); mutex_unlock(&conn_id->handler_mutex);
rdma_destroy_id(new_cm_id); rdma_destroy_id(new_cm_id);
...@@ -2414,6 +2444,26 @@ static void cma_ndev_work_handler(struct work_struct *_work) ...@@ -2414,6 +2444,26 @@ static void cma_ndev_work_handler(struct work_struct *_work)
kfree(work); kfree(work);
} }
static void cma_init_resolve_route_work(struct cma_work *work,
struct rdma_id_private *id_priv)
{
work->id = id_priv;
INIT_WORK(&work->work, cma_work_handler);
work->old_state = RDMA_CM_ROUTE_QUERY;
work->new_state = RDMA_CM_ROUTE_RESOLVED;
work->event.event = RDMA_CM_EVENT_ROUTE_RESOLVED;
}
static void cma_init_resolve_addr_work(struct cma_work *work,
struct rdma_id_private *id_priv)
{
work->id = id_priv;
INIT_WORK(&work->work, cma_work_handler);
work->old_state = RDMA_CM_ADDR_QUERY;
work->new_state = RDMA_CM_ADDR_RESOLVED;
work->event.event = RDMA_CM_EVENT_ADDR_RESOLVED;
}
static int cma_resolve_ib_route(struct rdma_id_private *id_priv, int timeout_ms) static int cma_resolve_ib_route(struct rdma_id_private *id_priv, int timeout_ms)
{ {
struct rdma_route *route = &id_priv->id.route; struct rdma_route *route = &id_priv->id.route;
...@@ -2424,11 +2474,7 @@ static int cma_resolve_ib_route(struct rdma_id_private *id_priv, int timeout_ms) ...@@ -2424,11 +2474,7 @@ static int cma_resolve_ib_route(struct rdma_id_private *id_priv, int timeout_ms)
if (!work) if (!work)
return -ENOMEM; return -ENOMEM;
work->id = id_priv; cma_init_resolve_route_work(work, id_priv);
INIT_WORK(&work->work, cma_work_handler);
work->old_state = RDMA_CM_ROUTE_QUERY;
work->new_state = RDMA_CM_ROUTE_RESOLVED;
work->event.event = RDMA_CM_EVENT_ROUTE_RESOLVED;
route->path_rec = kmalloc(sizeof *route->path_rec, GFP_KERNEL); route->path_rec = kmalloc(sizeof *route->path_rec, GFP_KERNEL);
if (!route->path_rec) { if (!route->path_rec) {
...@@ -2449,10 +2495,63 @@ static int cma_resolve_ib_route(struct rdma_id_private *id_priv, int timeout_ms) ...@@ -2449,10 +2495,63 @@ static int cma_resolve_ib_route(struct rdma_id_private *id_priv, int timeout_ms)
return ret; return ret;
} }
int rdma_set_ib_paths(struct rdma_cm_id *id, static enum ib_gid_type cma_route_gid_type(enum rdma_network_type network_type,
struct sa_path_rec *path_rec, int num_paths) unsigned long supported_gids,
enum ib_gid_type default_gid)
{
if ((network_type == RDMA_NETWORK_IPV4 ||
network_type == RDMA_NETWORK_IPV6) &&
test_bit(IB_GID_TYPE_ROCE_UDP_ENCAP, &supported_gids))
return IB_GID_TYPE_ROCE_UDP_ENCAP;
return default_gid;
}
/*
* cma_iboe_set_path_rec_l2_fields() is helper function which sets
* path record type based on GID type.
* It also sets up other L2 fields which includes destination mac address
* netdev ifindex, of the path record.
* It returns the netdev of the bound interface for this path record entry.
*/
static struct net_device *
cma_iboe_set_path_rec_l2_fields(struct rdma_id_private *id_priv)
{
struct rdma_route *route = &id_priv->id.route;
enum ib_gid_type gid_type = IB_GID_TYPE_ROCE;
struct rdma_addr *addr = &route->addr;
unsigned long supported_gids;
struct net_device *ndev;
if (!addr->dev_addr.bound_dev_if)
return NULL;
ndev = dev_get_by_index(addr->dev_addr.net,
addr->dev_addr.bound_dev_if);
if (!ndev)
return NULL;
supported_gids = roce_gid_type_mask_support(id_priv->id.device,
id_priv->id.port_num);
gid_type = cma_route_gid_type(addr->dev_addr.network,
supported_gids,
id_priv->gid_type);
/* Use the hint from IP Stack to select GID Type */
if (gid_type < ib_network_to_gid_type(addr->dev_addr.network))
gid_type = ib_network_to_gid_type(addr->dev_addr.network);
route->path_rec->rec_type = sa_conv_gid_to_pathrec_type(gid_type);
sa_path_set_ndev(route->path_rec, addr->dev_addr.net);
sa_path_set_ifindex(route->path_rec, ndev->ifindex);
sa_path_set_dmac(route->path_rec, addr->dev_addr.dst_dev_addr);
return ndev;
}
int rdma_set_ib_path(struct rdma_cm_id *id,
struct sa_path_rec *path_rec)
{ {
struct rdma_id_private *id_priv; struct rdma_id_private *id_priv;
struct net_device *ndev;
int ret; int ret;
id_priv = container_of(id, struct rdma_id_private, id); id_priv = container_of(id, struct rdma_id_private, id);
...@@ -2460,20 +2559,33 @@ int rdma_set_ib_paths(struct rdma_cm_id *id, ...@@ -2460,20 +2559,33 @@ int rdma_set_ib_paths(struct rdma_cm_id *id,
RDMA_CM_ROUTE_RESOLVED)) RDMA_CM_ROUTE_RESOLVED))
return -EINVAL; return -EINVAL;
id->route.path_rec = kmemdup(path_rec, sizeof *path_rec * num_paths, id->route.path_rec = kmemdup(path_rec, sizeof(*path_rec),
GFP_KERNEL); GFP_KERNEL);
if (!id->route.path_rec) { if (!id->route.path_rec) {
ret = -ENOMEM; ret = -ENOMEM;
goto err; goto err;
} }
id->route.num_paths = num_paths; if (rdma_protocol_roce(id->device, id->port_num)) {
ndev = cma_iboe_set_path_rec_l2_fields(id_priv);
if (!ndev) {
ret = -ENODEV;
goto err_free;
}
dev_put(ndev);
}
id->route.num_paths = 1;
return 0; return 0;
err_free:
kfree(id->route.path_rec);
id->route.path_rec = NULL;
err: err:
cma_comp_exch(id_priv, RDMA_CM_ROUTE_RESOLVED, RDMA_CM_ADDR_RESOLVED); cma_comp_exch(id_priv, RDMA_CM_ROUTE_RESOLVED, RDMA_CM_ADDR_RESOLVED);
return ret; return ret;
} }
EXPORT_SYMBOL(rdma_set_ib_paths); EXPORT_SYMBOL(rdma_set_ib_path);
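With this rename, rdma_set_ib_paths(id, rec, num_paths) becomes rdma_set_ib_path(id, rec) and always installs exactly one path record. A hedged sketch of an updated call site (the wrapper is hypothetical, and the header location is assumed to match the mainline tree):

#include <rdma/ib_sa.h>
#include <rdma/rdma_cm_ib.h>

/* Hypothetical ULP call site, for illustration only. */
static int example_set_path(struct rdma_cm_id *id, struct sa_path_rec *rec)
{
	/* Old call: rdma_set_ib_paths(id, rec, 1); */
	return rdma_set_ib_path(id, rec);
}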
static int cma_resolve_iw_route(struct rdma_id_private *id_priv, int timeout_ms) static int cma_resolve_iw_route(struct rdma_id_private *id_priv, int timeout_ms)
{ {
...@@ -2483,11 +2595,7 @@ static int cma_resolve_iw_route(struct rdma_id_private *id_priv, int timeout_ms) ...@@ -2483,11 +2595,7 @@ static int cma_resolve_iw_route(struct rdma_id_private *id_priv, int timeout_ms)
if (!work) if (!work)
return -ENOMEM; return -ENOMEM;
work->id = id_priv; cma_init_resolve_route_work(work, id_priv);
INIT_WORK(&work->work, cma_work_handler);
work->old_state = RDMA_CM_ROUTE_QUERY;
work->new_state = RDMA_CM_ROUTE_RESOLVED;
work->event.event = RDMA_CM_EVENT_ROUTE_RESOLVED;
queue_work(cma_wq, &work->work); queue_work(cma_wq, &work->work);
return 0; return 0;
} }
...@@ -2510,26 +2618,14 @@ static int iboe_tos_to_sl(struct net_device *ndev, int tos) ...@@ -2510,26 +2618,14 @@ static int iboe_tos_to_sl(struct net_device *ndev, int tos)
return 0; return 0;
} }
static enum ib_gid_type cma_route_gid_type(enum rdma_network_type network_type,
unsigned long supported_gids,
enum ib_gid_type default_gid)
{
if ((network_type == RDMA_NETWORK_IPV4 ||
network_type == RDMA_NETWORK_IPV6) &&
test_bit(IB_GID_TYPE_ROCE_UDP_ENCAP, &supported_gids))
return IB_GID_TYPE_ROCE_UDP_ENCAP;
return default_gid;
}
static int cma_resolve_iboe_route(struct rdma_id_private *id_priv) static int cma_resolve_iboe_route(struct rdma_id_private *id_priv)
{ {
struct rdma_route *route = &id_priv->id.route; struct rdma_route *route = &id_priv->id.route;
struct rdma_addr *addr = &route->addr; struct rdma_addr *addr = &route->addr;
struct cma_work *work; struct cma_work *work;
int ret; int ret;
struct net_device *ndev = NULL; struct net_device *ndev;
enum ib_gid_type gid_type = IB_GID_TYPE_IB;
u8 default_roce_tos = id_priv->cma_dev->default_roce_tos[id_priv->id.port_num - u8 default_roce_tos = id_priv->cma_dev->default_roce_tos[id_priv->id.port_num -
rdma_start_port(id_priv->cma_dev->device)]; rdma_start_port(id_priv->cma_dev->device)];
u8 tos = id_priv->tos_set ? id_priv->tos : default_roce_tos; u8 tos = id_priv->tos_set ? id_priv->tos : default_roce_tos;
...@@ -2539,9 +2635,6 @@ static int cma_resolve_iboe_route(struct rdma_id_private *id_priv) ...@@ -2539,9 +2635,6 @@ static int cma_resolve_iboe_route(struct rdma_id_private *id_priv)
if (!work) if (!work)
return -ENOMEM; return -ENOMEM;
work->id = id_priv;
INIT_WORK(&work->work, cma_work_handler);
route->path_rec = kzalloc(sizeof *route->path_rec, GFP_KERNEL); route->path_rec = kzalloc(sizeof *route->path_rec, GFP_KERNEL);
if (!route->path_rec) { if (!route->path_rec) {
ret = -ENOMEM; ret = -ENOMEM;
...@@ -2550,42 +2643,17 @@ static int cma_resolve_iboe_route(struct rdma_id_private *id_priv) ...@@ -2550,42 +2643,17 @@ static int cma_resolve_iboe_route(struct rdma_id_private *id_priv)
route->num_paths = 1; route->num_paths = 1;
if (addr->dev_addr.bound_dev_if) { ndev = cma_iboe_set_path_rec_l2_fields(id_priv);
unsigned long supported_gids;
ndev = dev_get_by_index(&init_net, addr->dev_addr.bound_dev_if);
if (!ndev) {
ret = -ENODEV;
goto err2;
}
supported_gids = roce_gid_type_mask_support(id_priv->id.device,
id_priv->id.port_num);
gid_type = cma_route_gid_type(addr->dev_addr.network,
supported_gids,
id_priv->gid_type);
route->path_rec->rec_type =
sa_conv_gid_to_pathrec_type(gid_type);
sa_path_set_ndev(route->path_rec, &init_net);
sa_path_set_ifindex(route->path_rec, ndev->ifindex);
}
if (!ndev) { if (!ndev) {
ret = -ENODEV; ret = -ENODEV;
goto err2; goto err2;
} }
sa_path_set_dmac(route->path_rec, addr->dev_addr.dst_dev_addr);
rdma_ip2gid((struct sockaddr *)&id_priv->id.route.addr.src_addr, rdma_ip2gid((struct sockaddr *)&id_priv->id.route.addr.src_addr,
&route->path_rec->sgid); &route->path_rec->sgid);
rdma_ip2gid((struct sockaddr *)&id_priv->id.route.addr.dst_addr, rdma_ip2gid((struct sockaddr *)&id_priv->id.route.addr.dst_addr,
&route->path_rec->dgid); &route->path_rec->dgid);
/* Use the hint from IP Stack to select GID Type */
if (gid_type < ib_network_to_gid_type(addr->dev_addr.network))
gid_type = ib_network_to_gid_type(addr->dev_addr.network);
route->path_rec->rec_type = sa_conv_gid_to_pathrec_type(gid_type);
if (((struct sockaddr *)&id_priv->id.route.addr.dst_addr)->sa_family != AF_IB) if (((struct sockaddr *)&id_priv->id.route.addr.dst_addr)->sa_family != AF_IB)
/* TODO: get the hoplimit from the inet/inet6 device */ /* TODO: get the hoplimit from the inet/inet6 device */
route->path_rec->hop_limit = addr->dev_addr.hoplimit; route->path_rec->hop_limit = addr->dev_addr.hoplimit;
...@@ -2607,11 +2675,7 @@ static int cma_resolve_iboe_route(struct rdma_id_private *id_priv) ...@@ -2607,11 +2675,7 @@ static int cma_resolve_iboe_route(struct rdma_id_private *id_priv)
goto err2; goto err2;
} }
work->old_state = RDMA_CM_ROUTE_QUERY; cma_init_resolve_route_work(work, id_priv);
work->new_state = RDMA_CM_ROUTE_RESOLVED;
work->event.event = RDMA_CM_EVENT_ROUTE_RESOLVED;
work->event.status = 0;
queue_work(cma_wq, &work->work); queue_work(cma_wq, &work->work);
return 0; return 0;
...@@ -2791,11 +2855,7 @@ static int cma_resolve_loopback(struct rdma_id_private *id_priv) ...@@ -2791,11 +2855,7 @@ static int cma_resolve_loopback(struct rdma_id_private *id_priv)
rdma_addr_get_sgid(&id_priv->id.route.addr.dev_addr, &gid); rdma_addr_get_sgid(&id_priv->id.route.addr.dev_addr, &gid);
rdma_addr_set_dgid(&id_priv->id.route.addr.dev_addr, &gid); rdma_addr_set_dgid(&id_priv->id.route.addr.dev_addr, &gid);
work->id = id_priv; cma_init_resolve_addr_work(work, id_priv);
INIT_WORK(&work->work, cma_work_handler);
work->old_state = RDMA_CM_ADDR_QUERY;
work->new_state = RDMA_CM_ADDR_RESOLVED;
work->event.event = RDMA_CM_EVENT_ADDR_RESOLVED;
queue_work(cma_wq, &work->work); queue_work(cma_wq, &work->work);
return 0; return 0;
err: err:
...@@ -2821,11 +2881,7 @@ static int cma_resolve_ib_addr(struct rdma_id_private *id_priv) ...@@ -2821,11 +2881,7 @@ static int cma_resolve_ib_addr(struct rdma_id_private *id_priv)
rdma_addr_set_dgid(&id_priv->id.route.addr.dev_addr, (union ib_gid *) rdma_addr_set_dgid(&id_priv->id.route.addr.dev_addr, (union ib_gid *)
&(((struct sockaddr_ib *) &id_priv->id.route.addr.dst_addr)->sib_addr)); &(((struct sockaddr_ib *) &id_priv->id.route.addr.dst_addr)->sib_addr));
work->id = id_priv; cma_init_resolve_addr_work(work, id_priv);
INIT_WORK(&work->work, cma_work_handler);
work->old_state = RDMA_CM_ADDR_QUERY;
work->new_state = RDMA_CM_ADDR_RESOLVED;
work->event.event = RDMA_CM_EVENT_ADDR_RESOLVED;
queue_work(cma_wq, &work->work); queue_work(cma_wq, &work->work);
return 0; return 0;
err: err:
...@@ -3404,9 +3460,10 @@ static int cma_sidr_rep_handler(struct ib_cm_id *cm_id, ...@@ -3404,9 +3460,10 @@ static int cma_sidr_rep_handler(struct ib_cm_id *cm_id,
event.status = ret; event.status = ret;
break; break;
} }
ib_init_ah_from_path(id_priv->id.device, id_priv->id.port_num, ib_init_ah_attr_from_path(id_priv->id.device,
id_priv->id.route.path_rec, id_priv->id.port_num,
&event.param.ud.ah_attr); id_priv->id.route.path_rec,
&event.param.ud.ah_attr);
event.param.ud.qp_num = rep->qpn; event.param.ud.qp_num = rep->qpn;
event.param.ud.qkey = rep->qkey; event.param.ud.qkey = rep->qkey;
event.event = RDMA_CM_EVENT_ESTABLISHED; event.event = RDMA_CM_EVENT_ESTABLISHED;
...@@ -3873,7 +3930,7 @@ static int cma_ib_mc_handler(int status, struct ib_sa_multicast *multicast) ...@@ -3873,7 +3930,7 @@ static int cma_ib_mc_handler(int status, struct ib_sa_multicast *multicast)
struct rdma_dev_addr *dev_addr = struct rdma_dev_addr *dev_addr =
&id_priv->id.route.addr.dev_addr; &id_priv->id.route.addr.dev_addr;
struct net_device *ndev = struct net_device *ndev =
dev_get_by_index(&init_net, dev_addr->bound_dev_if); dev_get_by_index(dev_addr->net, dev_addr->bound_dev_if);
enum ib_gid_type gid_type = enum ib_gid_type gid_type =
id_priv->cma_dev->default_gid_type[id_priv->id.port_num - id_priv->cma_dev->default_gid_type[id_priv->id.port_num -
rdma_start_port(id_priv->cma_dev->device)]; rdma_start_port(id_priv->cma_dev->device)];
...@@ -4010,8 +4067,10 @@ static void cma_iboe_set_mgid(struct sockaddr *addr, union ib_gid *mgid, ...@@ -4010,8 +4067,10 @@ static void cma_iboe_set_mgid(struct sockaddr *addr, union ib_gid *mgid,
} else if (addr->sa_family == AF_INET6) { } else if (addr->sa_family == AF_INET6) {
memcpy(mgid, &sin6->sin6_addr, sizeof *mgid); memcpy(mgid, &sin6->sin6_addr, sizeof *mgid);
} else { } else {
mgid->raw[0] = (gid_type == IB_GID_TYPE_IB) ? 0xff : 0; mgid->raw[0] =
mgid->raw[1] = (gid_type == IB_GID_TYPE_IB) ? 0x0e : 0; (gid_type == IB_GID_TYPE_ROCE_UDP_ENCAP) ? 0 : 0xff;
mgid->raw[1] =
(gid_type == IB_GID_TYPE_ROCE_UDP_ENCAP) ? 0 : 0x0e;
mgid->raw[2] = 0; mgid->raw[2] = 0;
mgid->raw[3] = 0; mgid->raw[3] = 0;
mgid->raw[4] = 0; mgid->raw[4] = 0;
...@@ -4061,7 +4120,7 @@ static int cma_iboe_join_multicast(struct rdma_id_private *id_priv, ...@@ -4061,7 +4120,7 @@ static int cma_iboe_join_multicast(struct rdma_id_private *id_priv,
mc->multicast.ib->rec.qkey = cpu_to_be32(RDMA_UDP_QKEY); mc->multicast.ib->rec.qkey = cpu_to_be32(RDMA_UDP_QKEY);
if (dev_addr->bound_dev_if) if (dev_addr->bound_dev_if)
ndev = dev_get_by_index(&init_net, dev_addr->bound_dev_if); ndev = dev_get_by_index(dev_addr->net, dev_addr->bound_dev_if);
if (!ndev) { if (!ndev) {
err = -ENODEV; err = -ENODEV;
goto out2; goto out2;
...@@ -4179,7 +4238,7 @@ void rdma_leave_multicast(struct rdma_cm_id *id, struct sockaddr *addr) ...@@ -4179,7 +4238,7 @@ void rdma_leave_multicast(struct rdma_cm_id *id, struct sockaddr *addr)
struct net_device *ndev = NULL; struct net_device *ndev = NULL;
if (dev_addr->bound_dev_if) if (dev_addr->bound_dev_if)
ndev = dev_get_by_index(&init_net, ndev = dev_get_by_index(dev_addr->net,
dev_addr->bound_dev_if); dev_addr->bound_dev_if);
if (ndev) { if (ndev) {
cma_igmp_send(ndev, cma_igmp_send(ndev,
...@@ -4235,7 +4294,7 @@ static int cma_netdev_callback(struct notifier_block *self, unsigned long event, ...@@ -4235,7 +4294,7 @@ static int cma_netdev_callback(struct notifier_block *self, unsigned long event,
if (event != NETDEV_BONDING_FAILOVER) if (event != NETDEV_BONDING_FAILOVER)
return NOTIFY_DONE; return NOTIFY_DONE;
if (!(ndev->flags & IFF_MASTER) || !(ndev->priv_flags & IFF_BONDING)) if (!netif_is_bond_master(ndev))
return NOTIFY_DONE; return NOTIFY_DONE;
mutex_lock(&lock); mutex_lock(&lock);
...@@ -4432,7 +4491,7 @@ static int cma_get_id_stats(struct sk_buff *skb, struct netlink_callback *cb) ...@@ -4432,7 +4491,7 @@ static int cma_get_id_stats(struct sk_buff *skb, struct netlink_callback *cb)
RDMA_NL_RDMA_CM_ATTR_SRC_ADDR)) RDMA_NL_RDMA_CM_ATTR_SRC_ADDR))
goto out; goto out;
if (ibnl_put_attr(skb, nlh, if (ibnl_put_attr(skb, nlh,
rdma_addr_size(cma_src_addr(id_priv)), rdma_addr_size(cma_dst_addr(id_priv)),
cma_dst_addr(id_priv), cma_dst_addr(id_priv),
RDMA_NL_RDMA_CM_ATTR_DST_ADDR)) RDMA_NL_RDMA_CM_ATTR_DST_ADDR))
goto out; goto out;
...@@ -4444,6 +4503,7 @@ static int cma_get_id_stats(struct sk_buff *skb, struct netlink_callback *cb) ...@@ -4444,6 +4503,7 @@ static int cma_get_id_stats(struct sk_buff *skb, struct netlink_callback *cb)
id_stats->qp_type = id->qp_type; id_stats->qp_type = id->qp_type;
i_id++; i_id++;
nlmsg_end(skb, nlh);
} }
cb->args[1] = 0; cb->args[1] = 0;
......
@@ -295,7 +295,7 @@ static struct config_group *make_cma_dev(struct config_group *group,
 		goto fail;
 	}
-	strncpy(cma_dev_group->name, name, sizeof(cma_dev_group->name));
+	strlcpy(cma_dev_group->name, name, sizeof(cma_dev_group->name));
 	config_group_init_type_name(&cma_dev_group->ports_group, "ports",
 				    &cma_ports_group_type);
......
...@@ -40,8 +40,12 @@ ...@@ -40,8 +40,12 @@
#include <rdma/ib_verbs.h> #include <rdma/ib_verbs.h>
#include <rdma/opa_addr.h> #include <rdma/opa_addr.h>
#include <rdma/ib_mad.h> #include <rdma/ib_mad.h>
#include <rdma/restrack.h>
#include "mad_priv.h" #include "mad_priv.h"
/* Total number of ports combined across all struct ib_devices's */
#define RDMA_MAX_PORTS 1024
struct pkey_index_qp_list { struct pkey_index_qp_list {
struct list_head pkey_index_list; struct list_head pkey_index_list;
u16 pkey_index; u16 pkey_index;
...@@ -137,7 +141,6 @@ int ib_cache_gid_del_all_netdev_gids(struct ib_device *ib_dev, u8 port, ...@@ -137,7 +141,6 @@ int ib_cache_gid_del_all_netdev_gids(struct ib_device *ib_dev, u8 port,
int roce_gid_mgmt_init(void); int roce_gid_mgmt_init(void);
void roce_gid_mgmt_cleanup(void); void roce_gid_mgmt_cleanup(void);
int roce_rescan_device(struct ib_device *ib_dev);
unsigned long roce_gid_type_mask_support(struct ib_device *ib_dev, u8 port); unsigned long roce_gid_type_mask_support(struct ib_device *ib_dev, u8 port);
int ib_cache_setup_one(struct ib_device *device); int ib_cache_setup_one(struct ib_device *device);
...@@ -191,13 +194,6 @@ void ib_sa_cleanup(void); ...@@ -191,13 +194,6 @@ void ib_sa_cleanup(void);
int rdma_nl_init(void); int rdma_nl_init(void);
void rdma_nl_exit(void); void rdma_nl_exit(void);
/**
* Check if there are any listeners to the netlink group
* @group: the netlink group ID
* Returns 0 on success or a negative for no listeners.
*/
int ibnl_chk_listeners(unsigned int group);
int ib_nl_handle_resolve_resp(struct sk_buff *skb, int ib_nl_handle_resolve_resp(struct sk_buff *skb,
struct nlmsghdr *nlh, struct nlmsghdr *nlh,
struct netlink_ext_ack *extack); struct netlink_ext_ack *extack);
...@@ -213,11 +209,6 @@ int ib_get_cached_subnet_prefix(struct ib_device *device, ...@@ -213,11 +209,6 @@ int ib_get_cached_subnet_prefix(struct ib_device *device,
u64 *sn_pfx); u64 *sn_pfx);
#ifdef CONFIG_SECURITY_INFINIBAND #ifdef CONFIG_SECURITY_INFINIBAND
int ib_security_pkey_access(struct ib_device *dev,
u8 port_num,
u16 pkey_index,
void *sec);
void ib_security_destroy_port_pkey_list(struct ib_device *device); void ib_security_destroy_port_pkey_list(struct ib_device *device);
void ib_security_cache_change(struct ib_device *device, void ib_security_cache_change(struct ib_device *device,
...@@ -240,14 +231,6 @@ int ib_mad_agent_security_setup(struct ib_mad_agent *agent, ...@@ -240,14 +231,6 @@ int ib_mad_agent_security_setup(struct ib_mad_agent *agent,
void ib_mad_agent_security_cleanup(struct ib_mad_agent *agent); void ib_mad_agent_security_cleanup(struct ib_mad_agent *agent);
int ib_mad_enforce_security(struct ib_mad_agent_private *map, u16 pkey_index); int ib_mad_enforce_security(struct ib_mad_agent_private *map, u16 pkey_index);
#else #else
static inline int ib_security_pkey_access(struct ib_device *dev,
u8 port_num,
u16 pkey_index,
void *sec)
{
return 0;
}
static inline void ib_security_destroy_port_pkey_list(struct ib_device *device) static inline void ib_security_destroy_port_pkey_list(struct ib_device *device)
{ {
} }
...@@ -318,4 +301,31 @@ struct ib_device *ib_device_get_by_index(u32 ifindex); ...@@ -318,4 +301,31 @@ struct ib_device *ib_device_get_by_index(u32 ifindex);
/* RDMA device netlink */ /* RDMA device netlink */
void nldev_init(void); void nldev_init(void);
void nldev_exit(void); void nldev_exit(void);
static inline struct ib_qp *_ib_create_qp(struct ib_device *dev,
struct ib_pd *pd,
struct ib_qp_init_attr *attr,
struct ib_udata *udata)
{
struct ib_qp *qp;
qp = dev->create_qp(pd, attr, udata);
if (IS_ERR(qp))
return qp;
qp->device = dev;
qp->pd = pd;
/*
* We don't track XRC QPs for now, because they don't have a PD
* and, more importantly, they are created internally by the driver;
* see mlx5 create_dev_resources() as an example.
*/
if (attr->qp_type < IB_QPT_XRC_INI) {
qp->res.type = RDMA_RESTRACK_QP;
rdma_restrack_add(&qp->res);
} else
qp->res.valid = false;
return qp;
}
#endif /* _CORE_PRIV_H */ #endif /* _CORE_PRIV_H */
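For orientation, a minimal sketch of a kernel-side caller going through this helper, assuming a PD and CQ already exist; the attribute values and the function name example_create_tracked_qp are illustrative only and not part of the patch. Because the qp_type is below IB_QPT_XRC_INI, the resulting QP lands in the restrack hash and becomes visible to the nldev resource dump.

static struct ib_qp *example_create_tracked_qp(struct ib_device *dev,
					       struct ib_pd *pd,
					       struct ib_cq *cq)
{
	struct ib_qp_init_attr attr = {
		.send_cq = cq,
		.recv_cq = cq,
		.qp_type = IB_QPT_RC,	/* < IB_QPT_XRC_INI, so it is tracked */
		.cap = {
			.max_send_wr  = 16,
			.max_recv_wr  = 16,
			.max_send_sge = 1,
			.max_recv_sge = 1,
		},
	};

	/* kernel consumers pass a NULL udata; XRC QPs would skip tracking */
	return _ib_create_qp(dev, pd, &attr, NULL);
}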
...@@ -25,9 +25,10 @@ ...@@ -25,9 +25,10 @@
#define IB_POLL_FLAGS \ #define IB_POLL_FLAGS \
(IB_CQ_NEXT_COMP | IB_CQ_REPORT_MISSED_EVENTS) (IB_CQ_NEXT_COMP | IB_CQ_REPORT_MISSED_EVENTS)
static int __ib_process_cq(struct ib_cq *cq, int budget) static int __ib_process_cq(struct ib_cq *cq, int budget, struct ib_wc *poll_wc)
{ {
int i, n, completed = 0; int i, n, completed = 0;
struct ib_wc *wcs = poll_wc ? : cq->wc;
/* /*
* budget might be (-1) if the caller does not * budget might be (-1) if the caller does not
...@@ -35,9 +36,9 @@ static int __ib_process_cq(struct ib_cq *cq, int budget) ...@@ -35,9 +36,9 @@ static int __ib_process_cq(struct ib_cq *cq, int budget)
* minimum here. * minimum here.
*/ */
while ((n = ib_poll_cq(cq, min_t(u32, IB_POLL_BATCH, while ((n = ib_poll_cq(cq, min_t(u32, IB_POLL_BATCH,
budget - completed), cq->wc)) > 0) { budget - completed), wcs)) > 0) {
for (i = 0; i < n; i++) { for (i = 0; i < n; i++) {
struct ib_wc *wc = &cq->wc[i]; struct ib_wc *wc = &wcs[i];
if (wc->wr_cqe) if (wc->wr_cqe)
wc->wr_cqe->done(cq, wc); wc->wr_cqe->done(cq, wc);
...@@ -60,18 +61,20 @@ static int __ib_process_cq(struct ib_cq *cq, int budget) ...@@ -60,18 +61,20 @@ static int __ib_process_cq(struct ib_cq *cq, int budget)
* @cq: CQ to process * @cq: CQ to process
* @budget: number of CQEs to poll for * @budget: number of CQEs to poll for
* *
* This function is used to process all outstanding CQ entries on a * This function is used to process all outstanding CQ entries.
* %IB_POLL_DIRECT CQ. It does not offload CQ processing to a different * It does not offload CQ processing to a different context and does
* context and does not ask for completion interrupts from the HCA. * not ask for completion interrupts from the HCA.
* Using direct processing on a CQ whose poll context is not
* IB_POLL_DIRECT may trigger concurrent processing.
* *
* Note: do not pass -1 as %budget unless it is guaranteed that the number * Note: do not pass -1 as %budget unless it is guaranteed that the number
* of completions that will be processed is small. * of completions that will be processed is small.
*/ */
int ib_process_cq_direct(struct ib_cq *cq, int budget) int ib_process_cq_direct(struct ib_cq *cq, int budget)
{ {
WARN_ON_ONCE(cq->poll_ctx != IB_POLL_DIRECT); struct ib_wc wcs[IB_POLL_BATCH];
return __ib_process_cq(cq, budget); return __ib_process_cq(cq, budget, wcs);
} }
EXPORT_SYMBOL(ib_process_cq_direct); EXPORT_SYMBOL(ib_process_cq_direct);
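A small usage sketch, assuming a ULP that wants to drain completions inline from its own context (the helper name and the 64-entry budget are made up). With the on-stack array above, the call no longer touches cq->wc, so it avoids corrupting the CQ's own buffer when the CQ is not an IB_POLL_DIRECT one — although, as the updated comment notes, it may then run concurrently with the CQ's normal polling context.

static void example_drain_cq_inline(struct ib_cq *cq)
{
	/* never pass -1 unless the number of completions is known to be small */
	while (ib_process_cq_direct(cq, 64) > 0)
		cpu_relax();
}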
...@@ -85,7 +88,7 @@ static int ib_poll_handler(struct irq_poll *iop, int budget) ...@@ -85,7 +88,7 @@ static int ib_poll_handler(struct irq_poll *iop, int budget)
struct ib_cq *cq = container_of(iop, struct ib_cq, iop); struct ib_cq *cq = container_of(iop, struct ib_cq, iop);
int completed; int completed;
completed = __ib_process_cq(cq, budget); completed = __ib_process_cq(cq, budget, NULL);
if (completed < budget) { if (completed < budget) {
irq_poll_complete(&cq->iop); irq_poll_complete(&cq->iop);
if (ib_req_notify_cq(cq, IB_POLL_FLAGS) > 0) if (ib_req_notify_cq(cq, IB_POLL_FLAGS) > 0)
...@@ -105,7 +108,7 @@ static void ib_cq_poll_work(struct work_struct *work) ...@@ -105,7 +108,7 @@ static void ib_cq_poll_work(struct work_struct *work)
struct ib_cq *cq = container_of(work, struct ib_cq, work); struct ib_cq *cq = container_of(work, struct ib_cq, work);
int completed; int completed;
completed = __ib_process_cq(cq, IB_POLL_BUDGET_WORKQUEUE); completed = __ib_process_cq(cq, IB_POLL_BUDGET_WORKQUEUE, NULL);
if (completed >= IB_POLL_BUDGET_WORKQUEUE || if (completed >= IB_POLL_BUDGET_WORKQUEUE ||
ib_req_notify_cq(cq, IB_POLL_FLAGS) > 0) ib_req_notify_cq(cq, IB_POLL_FLAGS) > 0)
queue_work(ib_comp_wq, &cq->work); queue_work(ib_comp_wq, &cq->work);
...@@ -117,20 +120,22 @@ static void ib_cq_completion_workqueue(struct ib_cq *cq, void *private) ...@@ -117,20 +120,22 @@ static void ib_cq_completion_workqueue(struct ib_cq *cq, void *private)
} }
/** /**
* ib_alloc_cq - allocate a completion queue * __ib_alloc_cq - allocate a completion queue
* @dev: device to allocate the CQ for * @dev: device to allocate the CQ for
* @private: driver private data, accessible from cq->cq_context * @private: driver private data, accessible from cq->cq_context
* @nr_cqe: number of CQEs to allocate * @nr_cqe: number of CQEs to allocate
* @comp_vector: HCA completion vectors for this CQ * @comp_vector: HCA completion vectors for this CQ
* @poll_ctx: context to poll the CQ from. * @poll_ctx: context to poll the CQ from.
* @caller: module owner name.
* *
* This is the proper interface to allocate a CQ for in-kernel users. A * This is the proper interface to allocate a CQ for in-kernel users. A
* CQ allocated with this interface will automatically be polled from the * CQ allocated with this interface will automatically be polled from the
* specified context. The ULP must use wr->wr_cqe instead of wr->wr_id * specified context. The ULP must use wr->wr_cqe instead of wr->wr_id
* to use this CQ abstraction. * to use this CQ abstraction.
*/ */
struct ib_cq *ib_alloc_cq(struct ib_device *dev, void *private, struct ib_cq *__ib_alloc_cq(struct ib_device *dev, void *private,
int nr_cqe, int comp_vector, enum ib_poll_context poll_ctx) int nr_cqe, int comp_vector,
enum ib_poll_context poll_ctx, const char *caller)
{ {
struct ib_cq_init_attr cq_attr = { struct ib_cq_init_attr cq_attr = {
.cqe = nr_cqe, .cqe = nr_cqe,
...@@ -154,6 +159,10 @@ struct ib_cq *ib_alloc_cq(struct ib_device *dev, void *private, ...@@ -154,6 +159,10 @@ struct ib_cq *ib_alloc_cq(struct ib_device *dev, void *private,
if (!cq->wc) if (!cq->wc)
goto out_destroy_cq; goto out_destroy_cq;
cq->res.type = RDMA_RESTRACK_CQ;
cq->res.kern_name = caller;
rdma_restrack_add(&cq->res);
switch (cq->poll_ctx) { switch (cq->poll_ctx) {
case IB_POLL_DIRECT: case IB_POLL_DIRECT:
cq->comp_handler = ib_cq_completion_direct; cq->comp_handler = ib_cq_completion_direct;
...@@ -178,11 +187,12 @@ struct ib_cq *ib_alloc_cq(struct ib_device *dev, void *private, ...@@ -178,11 +187,12 @@ struct ib_cq *ib_alloc_cq(struct ib_device *dev, void *private,
out_free_wc: out_free_wc:
kfree(cq->wc); kfree(cq->wc);
rdma_restrack_del(&cq->res);
out_destroy_cq: out_destroy_cq:
cq->device->destroy_cq(cq); cq->device->destroy_cq(cq);
return ERR_PTR(ret); return ERR_PTR(ret);
} }
EXPORT_SYMBOL(ib_alloc_cq); EXPORT_SYMBOL(__ib_alloc_cq);
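The new caller argument implies a thin wrapper so existing call sites keep using ib_alloc_cq() unchanged; presumably something along these lines in include/rdma/ib_verbs.h (a sketch only, the exact definition lives in the verbs header in this series):

#define ib_alloc_cq(device, private, nr_cqe, comp_vector, poll_ctx) \
	__ib_alloc_cq((device), (private), (nr_cqe), (comp_vector), \
		      (poll_ctx), KBUILD_MODNAME)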
/** /**
* ib_free_cq - free a completion queue * ib_free_cq - free a completion queue
...@@ -209,6 +219,7 @@ void ib_free_cq(struct ib_cq *cq) ...@@ -209,6 +219,7 @@ void ib_free_cq(struct ib_cq *cq)
} }
kfree(cq->wc); kfree(cq->wc);
rdma_restrack_del(&cq->res);
ret = cq->device->destroy_cq(cq); ret = cq->device->destroy_cq(cq);
WARN_ON_ONCE(ret); WARN_ON_ONCE(ret);
} }
......
...@@ -263,6 +263,8 @@ struct ib_device *ib_alloc_device(size_t size) ...@@ -263,6 +263,8 @@ struct ib_device *ib_alloc_device(size_t size)
if (!device) if (!device)
return NULL; return NULL;
rdma_restrack_init(&device->res);
device->dev.class = &ib_class; device->dev.class = &ib_class;
device_initialize(&device->dev); device_initialize(&device->dev);
...@@ -288,7 +290,7 @@ void ib_dealloc_device(struct ib_device *device) ...@@ -288,7 +290,7 @@ void ib_dealloc_device(struct ib_device *device)
{ {
WARN_ON(device->reg_state != IB_DEV_UNREGISTERED && WARN_ON(device->reg_state != IB_DEV_UNREGISTERED &&
device->reg_state != IB_DEV_UNINITIALIZED); device->reg_state != IB_DEV_UNINITIALIZED);
kobject_put(&device->dev.kobj); put_device(&device->dev);
} }
EXPORT_SYMBOL(ib_dealloc_device); EXPORT_SYMBOL(ib_dealloc_device);
...@@ -462,7 +464,6 @@ int ib_register_device(struct ib_device *device, ...@@ -462,7 +464,6 @@ int ib_register_device(struct ib_device *device,
struct ib_udata uhw = {.outlen = 0, .inlen = 0}; struct ib_udata uhw = {.outlen = 0, .inlen = 0};
struct device *parent = device->dev.parent; struct device *parent = device->dev.parent;
WARN_ON_ONCE(!parent);
WARN_ON_ONCE(device->dma_device); WARN_ON_ONCE(device->dma_device);
if (device->dev.dma_ops) { if (device->dev.dma_ops) {
/* /*
...@@ -471,16 +472,25 @@ int ib_register_device(struct ib_device *device, ...@@ -471,16 +472,25 @@ int ib_register_device(struct ib_device *device,
* into device->dev. * into device->dev.
*/ */
device->dma_device = &device->dev; device->dma_device = &device->dev;
if (!device->dev.dma_mask) if (!device->dev.dma_mask) {
device->dev.dma_mask = parent->dma_mask; if (parent)
if (!device->dev.coherent_dma_mask) device->dev.dma_mask = parent->dma_mask;
device->dev.coherent_dma_mask = else
parent->coherent_dma_mask; WARN_ON_ONCE(true);
}
if (!device->dev.coherent_dma_mask) {
if (parent)
device->dev.coherent_dma_mask =
parent->coherent_dma_mask;
else
WARN_ON_ONCE(true);
}
} else { } else {
/* /*
* The caller did not provide custom DMA operations. Use the * The caller did not provide custom DMA operations. Use the
* DMA mapping operations of the parent device. * DMA mapping operations of the parent device.
*/ */
WARN_ON_ONCE(!parent);
device->dma_device = parent; device->dma_device = parent;
} }
...@@ -588,6 +598,8 @@ void ib_unregister_device(struct ib_device *device) ...@@ -588,6 +598,8 @@ void ib_unregister_device(struct ib_device *device)
} }
up_read(&lists_rwsem); up_read(&lists_rwsem);
rdma_restrack_clean(&device->res);
ib_device_unregister_rdmacg(device); ib_device_unregister_rdmacg(device);
ib_device_unregister_sysfs(device); ib_device_unregister_sysfs(device);
...@@ -1033,32 +1045,22 @@ EXPORT_SYMBOL(ib_modify_port); ...@@ -1033,32 +1045,22 @@ EXPORT_SYMBOL(ib_modify_port);
/** /**
* ib_find_gid - Returns the port number and GID table index where * ib_find_gid - Returns the port number and GID table index where
* a specified GID value occurs. * a specified GID value occurs. It searches only the IB link layer.
* @device: The device to query. * @device: The device to query.
* @gid: The GID value to search for. * @gid: The GID value to search for.
* @gid_type: Type of GID.
* @ndev: The ndev related to the GID to search for. * @ndev: The ndev related to the GID to search for.
* @port_num: The port number of the device where the GID value was found. * @port_num: The port number of the device where the GID value was found.
* @index: The index into the GID table where the GID was found. This * @index: The index into the GID table where the GID was found. This
* parameter may be NULL. * parameter may be NULL.
*/ */
int ib_find_gid(struct ib_device *device, union ib_gid *gid, int ib_find_gid(struct ib_device *device, union ib_gid *gid,
enum ib_gid_type gid_type, struct net_device *ndev, struct net_device *ndev, u8 *port_num, u16 *index)
u8 *port_num, u16 *index)
{ {
union ib_gid tmp_gid; union ib_gid tmp_gid;
int ret, port, i; int ret, port, i;
for (port = rdma_start_port(device); port <= rdma_end_port(device); ++port) { for (port = rdma_start_port(device); port <= rdma_end_port(device); ++port) {
if (rdma_cap_roce_gid_table(device, port)) { if (rdma_cap_roce_gid_table(device, port))
if (!ib_find_cached_gid_by_port(device, gid, gid_type, port,
ndev, index)) {
*port_num = port;
return 0;
}
}
if (gid_type != IB_GID_TYPE_IB)
continue; continue;
for (i = 0; i < device->port_immutable[port].gid_tbl_len; ++i) { for (i = 0; i < device->port_immutable[port].gid_tbl_len; ++i) {
......
...@@ -388,13 +388,11 @@ int ib_flush_fmr_pool(struct ib_fmr_pool *pool) ...@@ -388,13 +388,11 @@ int ib_flush_fmr_pool(struct ib_fmr_pool *pool)
EXPORT_SYMBOL(ib_flush_fmr_pool); EXPORT_SYMBOL(ib_flush_fmr_pool);
/** /**
* ib_fmr_pool_map_phys - * ib_fmr_pool_map_phys - Map an FMR from an FMR pool.
* @pool:FMR pool to allocate FMR from * @pool_handle: FMR pool to allocate FMR from
* @page_list:List of pages to map * @page_list: List of pages to map
* @list_len:Number of pages in @page_list * @list_len: Number of pages in @page_list
* @io_virtual_address:I/O virtual address for new FMR * @io_virtual_address: I/O virtual address for new FMR
*
* Map an FMR from an FMR pool.
*/ */
struct ib_pool_fmr *ib_fmr_pool_map_phys(struct ib_fmr_pool *pool_handle, struct ib_pool_fmr *ib_fmr_pool_map_phys(struct ib_fmr_pool *pool_handle,
u64 *page_list, u64 *page_list,
......
...@@ -654,6 +654,7 @@ int iwpm_send_mapinfo(u8 nl_client, int iwpm_pid) ...@@ -654,6 +654,7 @@ int iwpm_send_mapinfo(u8 nl_client, int iwpm_pid)
} }
skb_num++; skb_num++;
spin_lock_irqsave(&iwpm_mapinfo_lock, flags); spin_lock_irqsave(&iwpm_mapinfo_lock, flags);
ret = -EINVAL;
for (i = 0; i < IWPM_MAPINFO_HASH_SIZE; i++) { for (i = 0; i < IWPM_MAPINFO_HASH_SIZE; i++) {
hlist_for_each_entry(map_info, &iwpm_hash_bucket[i], hlist_for_each_entry(map_info, &iwpm_hash_bucket[i],
hlist_node) { hlist_node) {
......
...@@ -49,7 +49,6 @@ ...@@ -49,7 +49,6 @@
#include "smi.h" #include "smi.h"
#include "opa_smi.h" #include "opa_smi.h"
#include "agent.h" #include "agent.h"
#include "core_priv.h"
static int mad_sendq_size = IB_MAD_QP_SEND_SIZE; static int mad_sendq_size = IB_MAD_QP_SEND_SIZE;
static int mad_recvq_size = IB_MAD_QP_RECV_SIZE; static int mad_recvq_size = IB_MAD_QP_RECV_SIZE;
......
...@@ -41,8 +41,6 @@ ...@@ -41,8 +41,6 @@
#include <linux/module.h> #include <linux/module.h>
#include "core_priv.h" #include "core_priv.h"
#include "core_priv.h"
static DEFINE_MUTEX(rdma_nl_mutex); static DEFINE_MUTEX(rdma_nl_mutex);
static struct sock *nls; static struct sock *nls;
static struct { static struct {
...@@ -83,15 +81,13 @@ static bool is_nl_valid(unsigned int type, unsigned int op) ...@@ -83,15 +81,13 @@ static bool is_nl_valid(unsigned int type, unsigned int op)
if (!is_nl_msg_valid(type, op)) if (!is_nl_msg_valid(type, op))
return false; return false;
cb_table = rdma_nl_types[type].cb_table; if (!rdma_nl_types[type].cb_table) {
#ifdef CONFIG_MODULES
if (!cb_table) {
mutex_unlock(&rdma_nl_mutex); mutex_unlock(&rdma_nl_mutex);
request_module("rdma-netlink-subsys-%d", type); request_module("rdma-netlink-subsys-%d", type);
mutex_lock(&rdma_nl_mutex); mutex_lock(&rdma_nl_mutex);
cb_table = rdma_nl_types[type].cb_table;
} }
#endif
cb_table = rdma_nl_types[type].cb_table;
if (!cb_table || (!cb_table[op].dump && !cb_table[op].doit)) if (!cb_table || (!cb_table[op].dump && !cb_table[op].doit))
return false; return false;
......
...@@ -31,6 +31,8 @@ ...@@ -31,6 +31,8 @@
*/ */
#include <linux/module.h> #include <linux/module.h>
#include <linux/pid.h>
#include <linux/pid_namespace.h>
#include <net/netlink.h> #include <net/netlink.h>
#include <rdma/rdma_netlink.h> #include <rdma/rdma_netlink.h>
...@@ -52,16 +54,42 @@ static const struct nla_policy nldev_policy[RDMA_NLDEV_ATTR_MAX] = { ...@@ -52,16 +54,42 @@ static const struct nla_policy nldev_policy[RDMA_NLDEV_ATTR_MAX] = {
[RDMA_NLDEV_ATTR_PORT_STATE] = { .type = NLA_U8 }, [RDMA_NLDEV_ATTR_PORT_STATE] = { .type = NLA_U8 },
[RDMA_NLDEV_ATTR_PORT_PHYS_STATE] = { .type = NLA_U8 }, [RDMA_NLDEV_ATTR_PORT_PHYS_STATE] = { .type = NLA_U8 },
[RDMA_NLDEV_ATTR_DEV_NODE_TYPE] = { .type = NLA_U8 }, [RDMA_NLDEV_ATTR_DEV_NODE_TYPE] = { .type = NLA_U8 },
[RDMA_NLDEV_ATTR_RES_SUMMARY] = { .type = NLA_NESTED },
[RDMA_NLDEV_ATTR_RES_SUMMARY_ENTRY] = { .type = NLA_NESTED },
[RDMA_NLDEV_ATTR_RES_SUMMARY_ENTRY_NAME] = { .type = NLA_NUL_STRING,
.len = 16 },
[RDMA_NLDEV_ATTR_RES_SUMMARY_ENTRY_CURR] = { .type = NLA_U64 },
[RDMA_NLDEV_ATTR_RES_QP] = { .type = NLA_NESTED },
[RDMA_NLDEV_ATTR_RES_QP_ENTRY] = { .type = NLA_NESTED },
[RDMA_NLDEV_ATTR_RES_LQPN] = { .type = NLA_U32 },
[RDMA_NLDEV_ATTR_RES_RQPN] = { .type = NLA_U32 },
[RDMA_NLDEV_ATTR_RES_RQ_PSN] = { .type = NLA_U32 },
[RDMA_NLDEV_ATTR_RES_SQ_PSN] = { .type = NLA_U32 },
[RDMA_NLDEV_ATTR_RES_PATH_MIG_STATE] = { .type = NLA_U8 },
[RDMA_NLDEV_ATTR_RES_TYPE] = { .type = NLA_U8 },
[RDMA_NLDEV_ATTR_RES_STATE] = { .type = NLA_U8 },
[RDMA_NLDEV_ATTR_RES_PID] = { .type = NLA_U32 },
[RDMA_NLDEV_ATTR_RES_KERN_NAME] = { .type = NLA_NUL_STRING,
.len = TASK_COMM_LEN },
}; };
static int fill_dev_info(struct sk_buff *msg, struct ib_device *device) static int fill_nldev_handle(struct sk_buff *msg, struct ib_device *device)
{ {
char fw[IB_FW_VERSION_NAME_MAX];
if (nla_put_u32(msg, RDMA_NLDEV_ATTR_DEV_INDEX, device->index)) if (nla_put_u32(msg, RDMA_NLDEV_ATTR_DEV_INDEX, device->index))
return -EMSGSIZE; return -EMSGSIZE;
if (nla_put_string(msg, RDMA_NLDEV_ATTR_DEV_NAME, device->name)) if (nla_put_string(msg, RDMA_NLDEV_ATTR_DEV_NAME, device->name))
return -EMSGSIZE; return -EMSGSIZE;
return 0;
}
static int fill_dev_info(struct sk_buff *msg, struct ib_device *device)
{
char fw[IB_FW_VERSION_NAME_MAX];
if (fill_nldev_handle(msg, device))
return -EMSGSIZE;
if (nla_put_u32(msg, RDMA_NLDEV_ATTR_PORT_INDEX, rdma_end_port(device))) if (nla_put_u32(msg, RDMA_NLDEV_ATTR_PORT_INDEX, rdma_end_port(device)))
return -EMSGSIZE; return -EMSGSIZE;
...@@ -92,10 +120,9 @@ static int fill_port_info(struct sk_buff *msg, ...@@ -92,10 +120,9 @@ static int fill_port_info(struct sk_buff *msg,
struct ib_port_attr attr; struct ib_port_attr attr;
int ret; int ret;
if (nla_put_u32(msg, RDMA_NLDEV_ATTR_DEV_INDEX, device->index)) if (fill_nldev_handle(msg, device))
return -EMSGSIZE;
if (nla_put_string(msg, RDMA_NLDEV_ATTR_DEV_NAME, device->name))
return -EMSGSIZE; return -EMSGSIZE;
if (nla_put_u32(msg, RDMA_NLDEV_ATTR_PORT_INDEX, port)) if (nla_put_u32(msg, RDMA_NLDEV_ATTR_PORT_INDEX, port))
return -EMSGSIZE; return -EMSGSIZE;
...@@ -126,6 +153,137 @@ static int fill_port_info(struct sk_buff *msg, ...@@ -126,6 +153,137 @@ static int fill_port_info(struct sk_buff *msg,
return 0; return 0;
} }
static int fill_res_info_entry(struct sk_buff *msg,
const char *name, u64 curr)
{
struct nlattr *entry_attr;
entry_attr = nla_nest_start(msg, RDMA_NLDEV_ATTR_RES_SUMMARY_ENTRY);
if (!entry_attr)
return -EMSGSIZE;
if (nla_put_string(msg, RDMA_NLDEV_ATTR_RES_SUMMARY_ENTRY_NAME, name))
goto err;
if (nla_put_u64_64bit(msg,
RDMA_NLDEV_ATTR_RES_SUMMARY_ENTRY_CURR, curr, 0))
goto err;
nla_nest_end(msg, entry_attr);
return 0;
err:
nla_nest_cancel(msg, entry_attr);
return -EMSGSIZE;
}
static int fill_res_info(struct sk_buff *msg, struct ib_device *device)
{
static const char * const names[RDMA_RESTRACK_MAX] = {
[RDMA_RESTRACK_PD] = "pd",
[RDMA_RESTRACK_CQ] = "cq",
[RDMA_RESTRACK_QP] = "qp",
};
struct rdma_restrack_root *res = &device->res;
struct nlattr *table_attr;
int ret, i, curr;
if (fill_nldev_handle(msg, device))
return -EMSGSIZE;
table_attr = nla_nest_start(msg, RDMA_NLDEV_ATTR_RES_SUMMARY);
if (!table_attr)
return -EMSGSIZE;
for (i = 0; i < RDMA_RESTRACK_MAX; i++) {
if (!names[i])
continue;
curr = rdma_restrack_count(res, i, task_active_pid_ns(current));
ret = fill_res_info_entry(msg, names[i], curr);
if (ret)
goto err;
}
nla_nest_end(msg, table_attr);
return 0;
err:
nla_nest_cancel(msg, table_attr);
return ret;
}
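Paraphrasing the nla_put calls above, each resource-summary reply message is laid out roughly as follows (illustrative only; the exact ordering is whatever the fill functions emit):

RDMA_NLDEV_ATTR_DEV_INDEX, RDMA_NLDEV_ATTR_DEV_NAME
RDMA_NLDEV_ATTR_RES_SUMMARY (nest)
  RDMA_NLDEV_ATTR_RES_SUMMARY_ENTRY (nest, one per tracked type: "pd", "cq", "qp")
    RDMA_NLDEV_ATTR_RES_SUMMARY_ENTRY_NAME
    RDMA_NLDEV_ATTR_RES_SUMMARY_ENTRY_CURR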
static int fill_res_qp_entry(struct sk_buff *msg,
struct ib_qp *qp, uint32_t port)
{
struct rdma_restrack_entry *res = &qp->res;
struct ib_qp_init_attr qp_init_attr;
struct nlattr *entry_attr;
struct ib_qp_attr qp_attr;
int ret;
ret = ib_query_qp(qp, &qp_attr, 0, &qp_init_attr);
if (ret)
return ret;
if (port && port != qp_attr.port_num)
return 0;
entry_attr = nla_nest_start(msg, RDMA_NLDEV_ATTR_RES_QP_ENTRY);
if (!entry_attr)
goto out;
/* In create_qp() port is not set yet */
if (qp_attr.port_num &&
nla_put_u32(msg, RDMA_NLDEV_ATTR_PORT_INDEX, qp_attr.port_num))
goto err;
if (nla_put_u32(msg, RDMA_NLDEV_ATTR_RES_LQPN, qp->qp_num))
goto err;
if (qp->qp_type == IB_QPT_RC || qp->qp_type == IB_QPT_UC) {
if (nla_put_u32(msg, RDMA_NLDEV_ATTR_RES_RQPN,
qp_attr.dest_qp_num))
goto err;
if (nla_put_u32(msg, RDMA_NLDEV_ATTR_RES_RQ_PSN,
qp_attr.rq_psn))
goto err;
}
if (nla_put_u32(msg, RDMA_NLDEV_ATTR_RES_SQ_PSN, qp_attr.sq_psn))
goto err;
if (qp->qp_type == IB_QPT_RC || qp->qp_type == IB_QPT_UC ||
qp->qp_type == IB_QPT_XRC_INI || qp->qp_type == IB_QPT_XRC_TGT) {
if (nla_put_u8(msg, RDMA_NLDEV_ATTR_RES_PATH_MIG_STATE,
qp_attr.path_mig_state))
goto err;
}
if (nla_put_u8(msg, RDMA_NLDEV_ATTR_RES_TYPE, qp->qp_type))
goto err;
if (nla_put_u8(msg, RDMA_NLDEV_ATTR_RES_STATE, qp_attr.qp_state))
goto err;
/*
 * The existence of res->task means this is a user QP: the netlink
 * consumer is expected to read /proc/PID/comm to get the task name,
 * and res->kern_name will be NULL.
 */
if (rdma_is_kernel_res(res)) {
if (nla_put_string(msg, RDMA_NLDEV_ATTR_RES_KERN_NAME, res->kern_name))
goto err;
} else {
if (nla_put_u32(msg, RDMA_NLDEV_ATTR_RES_PID, task_pid_vnr(res->task)))
goto err;
}
nla_nest_end(msg, entry_attr);
return 0;
err:
nla_nest_cancel(msg, entry_attr);
out:
return -EMSGSIZE;
}
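For reference, combined with the RDMA_NLDEV_ATTR_RES_QP table nest opened by the dumpit further below, each QP shows up in the dump roughly as follows (brackets mark attributes that are emitted only for the relevant QP types or once a port is assigned):

RDMA_NLDEV_ATTR_RES_QP (nest)
  RDMA_NLDEV_ATTR_RES_QP_ENTRY (nest, one per QP)
    [RDMA_NLDEV_ATTR_PORT_INDEX], RDMA_NLDEV_ATTR_RES_LQPN,
    [RDMA_NLDEV_ATTR_RES_RQPN, RDMA_NLDEV_ATTR_RES_RQ_PSN],
    RDMA_NLDEV_ATTR_RES_SQ_PSN, [RDMA_NLDEV_ATTR_RES_PATH_MIG_STATE],
    RDMA_NLDEV_ATTR_RES_TYPE, RDMA_NLDEV_ATTR_RES_STATE,
    RDMA_NLDEV_ATTR_RES_PID or RDMA_NLDEV_ATTR_RES_KERN_NAME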
static int nldev_get_doit(struct sk_buff *skb, struct nlmsghdr *nlh, static int nldev_get_doit(struct sk_buff *skb, struct nlmsghdr *nlh,
struct netlink_ext_ack *extack) struct netlink_ext_ack *extack)
{ {
...@@ -321,6 +479,213 @@ static int nldev_port_get_dumpit(struct sk_buff *skb, ...@@ -321,6 +479,213 @@ static int nldev_port_get_dumpit(struct sk_buff *skb,
return skb->len; return skb->len;
} }
static int nldev_res_get_doit(struct sk_buff *skb, struct nlmsghdr *nlh,
struct netlink_ext_ack *extack)
{
struct nlattr *tb[RDMA_NLDEV_ATTR_MAX];
struct ib_device *device;
struct sk_buff *msg;
u32 index;
int ret;
ret = nlmsg_parse(nlh, 0, tb, RDMA_NLDEV_ATTR_MAX - 1,
nldev_policy, extack);
if (ret || !tb[RDMA_NLDEV_ATTR_DEV_INDEX])
return -EINVAL;
index = nla_get_u32(tb[RDMA_NLDEV_ATTR_DEV_INDEX]);
device = ib_device_get_by_index(index);
if (!device)
return -EINVAL;
msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
if (!msg)
goto err;
nlh = nlmsg_put(msg, NETLINK_CB(skb).portid, nlh->nlmsg_seq,
RDMA_NL_GET_TYPE(RDMA_NL_NLDEV, RDMA_NLDEV_CMD_RES_GET),
0, 0);
ret = fill_res_info(msg, device);
if (ret)
goto err_free;
nlmsg_end(msg, nlh);
put_device(&device->dev);
return rdma_nl_unicast(msg, NETLINK_CB(skb).portid);
err_free:
nlmsg_free(msg);
err:
put_device(&device->dev);
return ret;
}
static int _nldev_res_get_dumpit(struct ib_device *device,
struct sk_buff *skb,
struct netlink_callback *cb,
unsigned int idx)
{
int start = cb->args[0];
struct nlmsghdr *nlh;
if (idx < start)
return 0;
nlh = nlmsg_put(skb, NETLINK_CB(cb->skb).portid, cb->nlh->nlmsg_seq,
RDMA_NL_GET_TYPE(RDMA_NL_NLDEV, RDMA_NLDEV_CMD_RES_GET),
0, NLM_F_MULTI);
if (fill_res_info(skb, device)) {
nlmsg_cancel(skb, nlh);
goto out;
}
nlmsg_end(skb, nlh);
idx++;
out:
cb->args[0] = idx;
return skb->len;
}
static int nldev_res_get_dumpit(struct sk_buff *skb,
struct netlink_callback *cb)
{
return ib_enum_all_devs(_nldev_res_get_dumpit, skb, cb);
}
static int nldev_res_get_qp_dumpit(struct sk_buff *skb,
struct netlink_callback *cb)
{
struct nlattr *tb[RDMA_NLDEV_ATTR_MAX];
struct rdma_restrack_entry *res;
int err, ret = 0, idx = 0;
struct nlattr *table_attr;
struct ib_device *device;
int start = cb->args[0];
struct ib_qp *qp = NULL;
struct nlmsghdr *nlh;
u32 index, port = 0;
err = nlmsg_parse(cb->nlh, 0, tb, RDMA_NLDEV_ATTR_MAX - 1,
nldev_policy, NULL);
/*
 * Right now a device index is required to get QP information, but this
 * code could be extended to cover all devices in one shot by checking
 * for the presence of RDMA_NLDEV_ATTR_DEV_INDEX: if it is absent,
 * iterate over all devices.
 *
 * That is not needed for now.
 */
if (err || !tb[RDMA_NLDEV_ATTR_DEV_INDEX])
return -EINVAL;
index = nla_get_u32(tb[RDMA_NLDEV_ATTR_DEV_INDEX]);
device = ib_device_get_by_index(index);
if (!device)
return -EINVAL;
/*
* If no PORT_INDEX is supplied, we will return all QPs from that device
*/
if (tb[RDMA_NLDEV_ATTR_PORT_INDEX]) {
port = nla_get_u32(tb[RDMA_NLDEV_ATTR_PORT_INDEX]);
if (!rdma_is_port_valid(device, port)) {
ret = -EINVAL;
goto err_index;
}
}
nlh = nlmsg_put(skb, NETLINK_CB(cb->skb).portid, cb->nlh->nlmsg_seq,
RDMA_NL_GET_TYPE(RDMA_NL_NLDEV, RDMA_NLDEV_CMD_RES_QP_GET),
0, NLM_F_MULTI);
if (fill_nldev_handle(skb, device)) {
ret = -EMSGSIZE;
goto err;
}
table_attr = nla_nest_start(skb, RDMA_NLDEV_ATTR_RES_QP);
if (!table_attr) {
ret = -EMSGSIZE;
goto err;
}
down_read(&device->res.rwsem);
hash_for_each_possible(device->res.hash, res, node, RDMA_RESTRACK_QP) {
if (idx < start)
goto next;
if ((rdma_is_kernel_res(res) &&
task_active_pid_ns(current) != &init_pid_ns) ||
(!rdma_is_kernel_res(res) &&
task_active_pid_ns(current) != task_active_pid_ns(res->task)))
/*
 * 1. Kernel QPs should be visible in the init namespace only
 * 2. Report only QPs that are visible in the current namespace
 */
goto next;
if (!rdma_restrack_get(res))
/*
 * The resource is being released, but we are not releasing
 * the lock now, so it will be released on a later pass, once
 * we have advanced to the next entry.
 */
goto next;
qp = container_of(res, struct ib_qp, res);
up_read(&device->res.rwsem);
ret = fill_res_qp_entry(skb, qp, port);
down_read(&device->res.rwsem);
/*
 * Put the resource back; it will not actually be released until
 * &device->res.rwsem can be taken for write.
 */
rdma_restrack_put(res);
if (ret == -EMSGSIZE)
/*
* There is a chance to optimize here.
* It can be done by using list_prepare_entry
* and list_for_each_entry_continue afterwards.
*/
break;
if (ret)
goto res_err;
next: idx++;
}
up_read(&device->res.rwsem);
nla_nest_end(skb, table_attr);
nlmsg_end(skb, nlh);
cb->args[0] = idx;
/*
* No more QPs to fill, cancel the message and
* return 0 to mark end of dumpit.
*/
if (!qp)
goto err;
put_device(&device->dev);
return skb->len;
res_err:
nla_nest_cancel(skb, table_attr);
up_read(&device->res.rwsem);
err:
nlmsg_cancel(skb, nlh);
err_index:
put_device(&device->dev);
return ret;
}
static const struct rdma_nl_cbs nldev_cb_table[RDMA_NLDEV_NUM_OPS] = { static const struct rdma_nl_cbs nldev_cb_table[RDMA_NLDEV_NUM_OPS] = {
[RDMA_NLDEV_CMD_GET] = { [RDMA_NLDEV_CMD_GET] = {
.doit = nldev_get_doit, .doit = nldev_get_doit,
...@@ -330,6 +695,23 @@ static const struct rdma_nl_cbs nldev_cb_table[RDMA_NLDEV_NUM_OPS] = { ...@@ -330,6 +695,23 @@ static const struct rdma_nl_cbs nldev_cb_table[RDMA_NLDEV_NUM_OPS] = {
.doit = nldev_port_get_doit, .doit = nldev_port_get_doit,
.dump = nldev_port_get_dumpit, .dump = nldev_port_get_dumpit,
}, },
[RDMA_NLDEV_CMD_RES_GET] = {
.doit = nldev_res_get_doit,
.dump = nldev_res_get_dumpit,
},
[RDMA_NLDEV_CMD_RES_QP_GET] = {
.dump = nldev_res_get_qp_dumpit,
/*
 * .doit is not implemented yet for two reasons:
 * 1. It is not needed yet.
 * 2. A .doit handler needs a unique identifier; that is easy for
 *    QPs (device index + port index + LQPN) but not for the other
 *    resources (PD and CQ). Since it is better to provide a similar
 *    interface for all resources, wait until the other resources
 *    are implemented too.
 */
},
}; };
void __init nldev_init(void) void __init nldev_init(void)
......
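To show how this command table is reached from userspace, here is a minimal, hedged sketch of dumping the per-device resource summary over a raw NETLINK_RDMA socket (the iproute2 'rdma' utility is the real consumer; the constants are assumed to come from the uapi headers that accompany this series, and error and attribute handling are stripped down):

#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <rdma/rdma_netlink.h>	/* RDMA_NL_GET_TYPE, RDMA_NL_NLDEV, RDMA_NLDEV_CMD_RES_GET */

int main(void)
{
	struct sockaddr_nl dst = { .nl_family = AF_NETLINK };
	struct nlmsghdr req = {
		.nlmsg_len   = NLMSG_LENGTH(0),
		.nlmsg_type  = RDMA_NL_GET_TYPE(RDMA_NL_NLDEV,
						RDMA_NLDEV_CMD_RES_GET),
		.nlmsg_flags = NLM_F_REQUEST | NLM_F_DUMP,
	};
	char buf[8192];
	int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_RDMA);

	if (fd < 0)
		return 1;
	if (sendto(fd, &req, req.nlmsg_len, 0,
		   (struct sockaddr *)&dst, sizeof(dst)) < 0)
		return 1;

	for (;;) {
		ssize_t len = recv(fd, buf, sizeof(buf), 0);
		struct nlmsghdr *nlh = (struct nlmsghdr *)buf;

		if (len <= 0 || nlh->nlmsg_type == NLMSG_DONE ||
		    nlh->nlmsg_type == NLMSG_ERROR)
			break;
		/* each message carries DEV_INDEX/DEV_NAME plus the nested
		 * RDMA_NLDEV_ATTR_RES_SUMMARY built by fill_res_info() above;
		 * attribute parsing is left out of this sketch */
	}
	close(fd);
	return 0;
}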
/* SPDX-License-Identifier: (GPL-2.0+ OR BSD-3-Clause) */
/*
* Copyright (c) 2017-2018 Mellanox Technologies. All rights reserved.
*/
#include <rdma/ib_verbs.h>
#include <rdma/restrack.h>
#include <linux/mutex.h>
#include <linux/sched/task.h>
#include <linux/uaccess.h>
#include <linux/pid_namespace.h>
void rdma_restrack_init(struct rdma_restrack_root *res)
{
init_rwsem(&res->rwsem);
}
void rdma_restrack_clean(struct rdma_restrack_root *res)
{
WARN_ON_ONCE(!hash_empty(res->hash));
}
int rdma_restrack_count(struct rdma_restrack_root *res,
enum rdma_restrack_type type,
struct pid_namespace *ns)
{
struct rdma_restrack_entry *e;
u32 cnt = 0;
down_read(&res->rwsem);
hash_for_each_possible(res->hash, e, node, type) {
if (ns == &init_pid_ns ||
(!rdma_is_kernel_res(e) &&
ns == task_active_pid_ns(e->task)))
cnt++;
}
up_read(&res->rwsem);
return cnt;
}
EXPORT_SYMBOL(rdma_restrack_count);
static void set_kern_name(struct rdma_restrack_entry *res)
{
enum rdma_restrack_type type = res->type;
struct ib_qp *qp;
if (type != RDMA_RESTRACK_QP)
/* PD and CQ types already have this name embedded in them */
return;
qp = container_of(res, struct ib_qp, res);
if (!qp->pd) {
WARN_ONCE(true, "XRC QPs are not supported\n");
/* Survive, despite the programmer's error */
res->kern_name = " ";
return;
}
res->kern_name = qp->pd->res.kern_name;
}
static struct ib_device *res_to_dev(struct rdma_restrack_entry *res)
{
enum rdma_restrack_type type = res->type;
struct ib_device *dev;
struct ib_xrcd *xrcd;
struct ib_pd *pd;
struct ib_cq *cq;
struct ib_qp *qp;
switch (type) {
case RDMA_RESTRACK_PD:
pd = container_of(res, struct ib_pd, res);
dev = pd->device;
break;
case RDMA_RESTRACK_CQ:
cq = container_of(res, struct ib_cq, res);
dev = cq->device;
break;
case RDMA_RESTRACK_QP:
qp = container_of(res, struct ib_qp, res);
dev = qp->device;
break;
case RDMA_RESTRACK_XRCD:
xrcd = container_of(res, struct ib_xrcd, res);
dev = xrcd->device;
break;
default:
WARN_ONCE(true, "Wrong resource tracking type %u\n", type);
return NULL;
}
return dev;
}
void rdma_restrack_add(struct rdma_restrack_entry *res)
{
struct ib_device *dev = res_to_dev(res);
if (!dev)
return;
if (!uaccess_kernel()) {
get_task_struct(current);
res->task = current;
res->kern_name = NULL;
} else {
set_kern_name(res);
res->task = NULL;
}
kref_init(&res->kref);
init_completion(&res->comp);
res->valid = true;
down_write(&dev->res.rwsem);
hash_add(dev->res.hash, &res->node, res->type);
up_write(&dev->res.rwsem);
}
EXPORT_SYMBOL(rdma_restrack_add);
int __must_check rdma_restrack_get(struct rdma_restrack_entry *res)
{
return kref_get_unless_zero(&res->kref);
}
EXPORT_SYMBOL(rdma_restrack_get);
static void restrack_release(struct kref *kref)
{
struct rdma_restrack_entry *res;
res = container_of(kref, struct rdma_restrack_entry, kref);
complete(&res->comp);
}
int rdma_restrack_put(struct rdma_restrack_entry *res)
{
return kref_put(&res->kref, restrack_release);
}
EXPORT_SYMBOL(rdma_restrack_put);
void rdma_restrack_del(struct rdma_restrack_entry *res)
{
struct ib_device *dev;
if (!res->valid)
return;
dev = res_to_dev(res);
if (!dev)
return;
rdma_restrack_put(res);
wait_for_completion(&res->comp);
down_write(&dev->res.rwsem);
hash_del(&res->node);
res->valid = false;
if (res->task)
put_task_struct(res->task);
up_write(&dev->res.rwsem);
}
EXPORT_SYMBOL(rdma_restrack_del);
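As a usage sketch (hedged: in this series the verbs core does this on behalf of kernel ULPs for PDs and CQs, so a driver would not normally call the tracker itself), a kernel-owned object is registered and torn down roughly like this:

static void example_track_pd(struct ib_pd *pd)
{
	pd->res.type = RDMA_RESTRACK_PD;
	pd->res.kern_name = KBUILD_MODNAME;	/* kernel owner, no task PID */
	rdma_restrack_add(&pd->res);
}

static void example_untrack_pd(struct ib_pd *pd)
{
	/* waits until every concurrent rdma_restrack_get() has been put */
	rdma_restrack_del(&pd->res);
}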
...@@ -410,15 +410,18 @@ static void enum_all_gids_of_dev_cb(struct ib_device *ib_dev, ...@@ -410,15 +410,18 @@ static void enum_all_gids_of_dev_cb(struct ib_device *ib_dev,
rtnl_unlock(); rtnl_unlock();
} }
/* This function will rescan all of the network devices in the system
 * and add their gids, as needed, to the relevant RoCE devices. */
int roce_rescan_device(struct ib_device *ib_dev)
/**
 * rdma_roce_rescan_device - Rescan all of the network devices in the system
 * and add their gids, as needed, to the relevant RoCE devices.
 *
 * @device: the rdma device
 */
void rdma_roce_rescan_device(struct ib_device *ib_dev)
{ {
ib_enum_roce_netdev(ib_dev, pass_all_filter, NULL, ib_enum_roce_netdev(ib_dev, pass_all_filter, NULL,
enum_all_gids_of_dev_cb, NULL); enum_all_gids_of_dev_cb, NULL);
return 0;
} }
EXPORT_SYMBOL(rdma_roce_rescan_device);
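A hedged example of the intended caller: a RoCE driver that only finishes wiring up its netdev after ib_register_device() can now ask the core to repopulate its GID tables instead of open-coding the scan (the driver hook name below is made up):

static void example_drv_netdev_ready(struct ib_device *ibdev)
{
	/* walk all netdevs and (re)add their GIDs to ibdev's RoCE ports */
	rdma_roce_rescan_device(ibdev);
}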
static void callback_for_addr_gid_device_scan(struct ib_device *device, static void callback_for_addr_gid_device_scan(struct ib_device *device,
u8 port, u8 port,
......
...@@ -1227,9 +1227,9 @@ static u8 get_src_path_mask(struct ib_device *device, u8 port_num) ...@@ -1227,9 +1227,9 @@ static u8 get_src_path_mask(struct ib_device *device, u8 port_num)
return src_path_mask; return src_path_mask;
} }
int ib_init_ah_from_path(struct ib_device *device, u8 port_num, int ib_init_ah_attr_from_path(struct ib_device *device, u8 port_num,
struct sa_path_rec *rec, struct sa_path_rec *rec,
struct rdma_ah_attr *ah_attr) struct rdma_ah_attr *ah_attr)
{ {
int ret; int ret;
u16 gid_index; u16 gid_index;
...@@ -1341,10 +1341,11 @@ int ib_init_ah_from_path(struct ib_device *device, u8 port_num, ...@@ -1341,10 +1341,11 @@ int ib_init_ah_from_path(struct ib_device *device, u8 port_num,
return 0; return 0;
} }
EXPORT_SYMBOL(ib_init_ah_from_path); EXPORT_SYMBOL(ib_init_ah_attr_from_path);
static int alloc_mad(struct ib_sa_query *query, gfp_t gfp_mask) static int alloc_mad(struct ib_sa_query *query, gfp_t gfp_mask)
{ {
struct rdma_ah_attr ah_attr;
unsigned long flags; unsigned long flags;
spin_lock_irqsave(&query->port->ah_lock, flags); spin_lock_irqsave(&query->port->ah_lock, flags);
...@@ -1356,6 +1357,15 @@ static int alloc_mad(struct ib_sa_query *query, gfp_t gfp_mask) ...@@ -1356,6 +1357,15 @@ static int alloc_mad(struct ib_sa_query *query, gfp_t gfp_mask)
query->sm_ah = query->port->sm_ah; query->sm_ah = query->port->sm_ah;
spin_unlock_irqrestore(&query->port->ah_lock, flags); spin_unlock_irqrestore(&query->port->ah_lock, flags);
/*
 * Always check that sm_ah has a valid dlid assigned
 * before querying for class port info.
 */
if ((rdma_query_ah(query->sm_ah->ah, &ah_attr) < 0) ||
!rdma_is_valid_unicast_lid(&ah_attr)) {
kref_put(&query->sm_ah->ref, free_sm_ah);
return -EAGAIN;
}
query->mad_buf = ib_create_send_mad(query->port->agent, 1, query->mad_buf = ib_create_send_mad(query->port->agent, 1,
query->sm_ah->pkey_index, query->sm_ah->pkey_index,
0, IB_MGMT_SA_HDR, IB_MGMT_SA_DATA, 0, IB_MGMT_SA_HDR, IB_MGMT_SA_DATA,
......
...@@ -653,12 +653,11 @@ int ib_security_modify_qp(struct ib_qp *qp, ...@@ -653,12 +653,11 @@ int ib_security_modify_qp(struct ib_qp *qp,
} }
return ret; return ret;
} }
EXPORT_SYMBOL(ib_security_modify_qp);
int ib_security_pkey_access(struct ib_device *dev, static int ib_security_pkey_access(struct ib_device *dev,
u8 port_num, u8 port_num,
u16 pkey_index, u16 pkey_index,
void *sec) void *sec)
{ {
u64 subnet_prefix; u64 subnet_prefix;
u16 pkey; u16 pkey;
...@@ -678,7 +677,6 @@ int ib_security_pkey_access(struct ib_device *dev, ...@@ -678,7 +677,6 @@ int ib_security_pkey_access(struct ib_device *dev,
return security_ib_pkey_access(sec, subnet_prefix, pkey); return security_ib_pkey_access(sec, subnet_prefix, pkey);
} }
EXPORT_SYMBOL(ib_security_pkey_access);
static int ib_mad_agent_security_change(struct notifier_block *nb, static int ib_mad_agent_security_change(struct notifier_block *nb,
unsigned long event, unsigned long event,
......
...@@ -1276,7 +1276,6 @@ int ib_device_register_sysfs(struct ib_device *device, ...@@ -1276,7 +1276,6 @@ int ib_device_register_sysfs(struct ib_device *device,
int ret; int ret;
int i; int i;
WARN_ON_ONCE(!device->dev.parent);
ret = dev_set_name(class_dev, "%s", device->name); ret = dev_set_name(class_dev, "%s", device->name);
if (ret) if (ret)
return ret; return ret;
......
...@@ -53,6 +53,8 @@ ...@@ -53,6 +53,8 @@
#include <rdma/ib_user_cm.h> #include <rdma/ib_user_cm.h>
#include <rdma/ib_marshall.h> #include <rdma/ib_marshall.h>
#include "core_priv.h"
MODULE_AUTHOR("Libor Michalek"); MODULE_AUTHOR("Libor Michalek");
MODULE_DESCRIPTION("InfiniBand userspace Connection Manager access"); MODULE_DESCRIPTION("InfiniBand userspace Connection Manager access");
MODULE_LICENSE("Dual BSD/GPL"); MODULE_LICENSE("Dual BSD/GPL");
...@@ -104,10 +106,13 @@ struct ib_ucm_event { ...@@ -104,10 +106,13 @@ struct ib_ucm_event {
enum { enum {
IB_UCM_MAJOR = 231, IB_UCM_MAJOR = 231,
IB_UCM_BASE_MINOR = 224, IB_UCM_BASE_MINOR = 224,
IB_UCM_MAX_DEVICES = 32 IB_UCM_MAX_DEVICES = RDMA_MAX_PORTS,
IB_UCM_NUM_FIXED_MINOR = 32,
IB_UCM_NUM_DYNAMIC_MINOR = IB_UCM_MAX_DEVICES - IB_UCM_NUM_FIXED_MINOR,
}; };
#define IB_UCM_BASE_DEV MKDEV(IB_UCM_MAJOR, IB_UCM_BASE_MINOR) #define IB_UCM_BASE_DEV MKDEV(IB_UCM_MAJOR, IB_UCM_BASE_MINOR)
static dev_t dynamic_ucm_dev;
static void ib_ucm_add_one(struct ib_device *device); static void ib_ucm_add_one(struct ib_device *device);
static void ib_ucm_remove_one(struct ib_device *device, void *client_data); static void ib_ucm_remove_one(struct ib_device *device, void *client_data);
...@@ -1199,7 +1204,6 @@ static int ib_ucm_close(struct inode *inode, struct file *filp) ...@@ -1199,7 +1204,6 @@ static int ib_ucm_close(struct inode *inode, struct file *filp)
return 0; return 0;
} }
static DECLARE_BITMAP(overflow_map, IB_UCM_MAX_DEVICES);
static void ib_ucm_release_dev(struct device *dev) static void ib_ucm_release_dev(struct device *dev)
{ {
struct ib_ucm_device *ucm_dev; struct ib_ucm_device *ucm_dev;
...@@ -1210,10 +1214,7 @@ static void ib_ucm_release_dev(struct device *dev) ...@@ -1210,10 +1214,7 @@ static void ib_ucm_release_dev(struct device *dev)
static void ib_ucm_free_dev(struct ib_ucm_device *ucm_dev) static void ib_ucm_free_dev(struct ib_ucm_device *ucm_dev)
{ {
if (ucm_dev->devnum < IB_UCM_MAX_DEVICES) clear_bit(ucm_dev->devnum, dev_map);
clear_bit(ucm_dev->devnum, dev_map);
else
clear_bit(ucm_dev->devnum - IB_UCM_MAX_DEVICES, overflow_map);
} }
static const struct file_operations ucm_fops = { static const struct file_operations ucm_fops = {
...@@ -1235,27 +1236,6 @@ static ssize_t show_ibdev(struct device *dev, struct device_attribute *attr, ...@@ -1235,27 +1236,6 @@ static ssize_t show_ibdev(struct device *dev, struct device_attribute *attr,
} }
static DEVICE_ATTR(ibdev, S_IRUGO, show_ibdev, NULL); static DEVICE_ATTR(ibdev, S_IRUGO, show_ibdev, NULL);
static dev_t overflow_maj;
static int find_overflow_devnum(void)
{
int ret;
if (!overflow_maj) {
ret = alloc_chrdev_region(&overflow_maj, 0, IB_UCM_MAX_DEVICES,
"infiniband_cm");
if (ret) {
pr_err("ucm: couldn't register dynamic device number\n");
return ret;
}
}
ret = find_first_zero_bit(overflow_map, IB_UCM_MAX_DEVICES);
if (ret >= IB_UCM_MAX_DEVICES)
return -1;
return ret;
}
static void ib_ucm_add_one(struct ib_device *device) static void ib_ucm_add_one(struct ib_device *device)
{ {
int devnum; int devnum;
...@@ -1274,19 +1254,14 @@ static void ib_ucm_add_one(struct ib_device *device) ...@@ -1274,19 +1254,14 @@ static void ib_ucm_add_one(struct ib_device *device)
ucm_dev->dev.release = ib_ucm_release_dev; ucm_dev->dev.release = ib_ucm_release_dev;
devnum = find_first_zero_bit(dev_map, IB_UCM_MAX_DEVICES); devnum = find_first_zero_bit(dev_map, IB_UCM_MAX_DEVICES);
if (devnum >= IB_UCM_MAX_DEVICES) { if (devnum >= IB_UCM_MAX_DEVICES)
devnum = find_overflow_devnum(); goto err;
if (devnum < 0) ucm_dev->devnum = devnum;
goto err; set_bit(devnum, dev_map);
if (devnum >= IB_UCM_NUM_FIXED_MINOR)
ucm_dev->devnum = devnum + IB_UCM_MAX_DEVICES; base = dynamic_ucm_dev + devnum - IB_UCM_NUM_FIXED_MINOR;
base = devnum + overflow_maj; else
set_bit(devnum, overflow_map); base = IB_UCM_BASE_DEV + devnum;
} else {
ucm_dev->devnum = devnum;
base = devnum + IB_UCM_BASE_DEV;
set_bit(devnum, dev_map);
}
cdev_init(&ucm_dev->cdev, &ucm_fops); cdev_init(&ucm_dev->cdev, &ucm_fops);
ucm_dev->cdev.owner = THIS_MODULE; ucm_dev->cdev.owner = THIS_MODULE;
...@@ -1334,13 +1309,20 @@ static int __init ib_ucm_init(void) ...@@ -1334,13 +1309,20 @@ static int __init ib_ucm_init(void)
{ {
int ret; int ret;
ret = register_chrdev_region(IB_UCM_BASE_DEV, IB_UCM_MAX_DEVICES, ret = register_chrdev_region(IB_UCM_BASE_DEV, IB_UCM_NUM_FIXED_MINOR,
"infiniband_cm"); "infiniband_cm");
if (ret) { if (ret) {
pr_err("ucm: couldn't register device number\n"); pr_err("ucm: couldn't register device number\n");
goto error1; goto error1;
} }
ret = alloc_chrdev_region(&dynamic_ucm_dev, 0, IB_UCM_NUM_DYNAMIC_MINOR,
"infiniband_cm");
if (ret) {
pr_err("ucm: couldn't register dynamic device number\n");
goto err_alloc;
}
ret = class_create_file(&cm_class, &class_attr_abi_version.attr); ret = class_create_file(&cm_class, &class_attr_abi_version.attr);
if (ret) { if (ret) {
pr_err("ucm: couldn't create abi_version attribute\n"); pr_err("ucm: couldn't create abi_version attribute\n");
...@@ -1357,7 +1339,9 @@ static int __init ib_ucm_init(void) ...@@ -1357,7 +1339,9 @@ static int __init ib_ucm_init(void)
error3: error3:
class_remove_file(&cm_class, &class_attr_abi_version.attr); class_remove_file(&cm_class, &class_attr_abi_version.attr);
error2: error2:
unregister_chrdev_region(IB_UCM_BASE_DEV, IB_UCM_MAX_DEVICES); unregister_chrdev_region(dynamic_ucm_dev, IB_UCM_NUM_DYNAMIC_MINOR);
err_alloc:
unregister_chrdev_region(IB_UCM_BASE_DEV, IB_UCM_NUM_FIXED_MINOR);
error1: error1:
return ret; return ret;
} }
...@@ -1366,9 +1350,8 @@ static void __exit ib_ucm_cleanup(void) ...@@ -1366,9 +1350,8 @@ static void __exit ib_ucm_cleanup(void)
{ {
ib_unregister_client(&ucm_client); ib_unregister_client(&ucm_client);
class_remove_file(&cm_class, &class_attr_abi_version.attr); class_remove_file(&cm_class, &class_attr_abi_version.attr);
unregister_chrdev_region(IB_UCM_BASE_DEV, IB_UCM_MAX_DEVICES); unregister_chrdev_region(IB_UCM_BASE_DEV, IB_UCM_NUM_FIXED_MINOR);
if (overflow_maj) unregister_chrdev_region(dynamic_ucm_dev, IB_UCM_NUM_DYNAMIC_MINOR);
unregister_chrdev_region(overflow_maj, IB_UCM_MAX_DEVICES);
idr_destroy(&ctx_id_table); idr_destroy(&ctx_id_table);
} }
......
...@@ -904,13 +904,14 @@ static ssize_t ucma_query_path(struct ucma_context *ctx, ...@@ -904,13 +904,14 @@ static ssize_t ucma_query_path(struct ucma_context *ctx,
resp->path_data[i].flags = IB_PATH_GMP | IB_PATH_PRIMARY | resp->path_data[i].flags = IB_PATH_GMP | IB_PATH_PRIMARY |
IB_PATH_BIDIRECTIONAL; IB_PATH_BIDIRECTIONAL;
if (rec->rec_type == SA_PATH_REC_TYPE_IB) { if (rec->rec_type == SA_PATH_REC_TYPE_OPA) {
ib_sa_pack_path(rec, &resp->path_data[i].path_rec);
} else {
struct sa_path_rec ib; struct sa_path_rec ib;
sa_convert_path_opa_to_ib(&ib, rec); sa_convert_path_opa_to_ib(&ib, rec);
ib_sa_pack_path(&ib, &resp->path_data[i].path_rec); ib_sa_pack_path(&ib, &resp->path_data[i].path_rec);
} else {
ib_sa_pack_path(rec, &resp->path_data[i].path_rec);
} }
} }
...@@ -943,8 +944,8 @@ static ssize_t ucma_query_gid(struct ucma_context *ctx, ...@@ -943,8 +944,8 @@ static ssize_t ucma_query_gid(struct ucma_context *ctx,
} else { } else {
addr->sib_family = AF_IB; addr->sib_family = AF_IB;
addr->sib_pkey = (__force __be16) resp.pkey; addr->sib_pkey = (__force __be16) resp.pkey;
rdma_addr_get_sgid(&ctx->cm_id->route.addr.dev_addr, rdma_read_gids(ctx->cm_id, (union ib_gid *)&addr->sib_addr,
(union ib_gid *) &addr->sib_addr); NULL);
addr->sib_sid = rdma_get_service_id(ctx->cm_id, (struct sockaddr *) addr->sib_sid = rdma_get_service_id(ctx->cm_id, (struct sockaddr *)
&ctx->cm_id->route.addr.src_addr); &ctx->cm_id->route.addr.src_addr);
} }
...@@ -956,8 +957,8 @@ static ssize_t ucma_query_gid(struct ucma_context *ctx, ...@@ -956,8 +957,8 @@ static ssize_t ucma_query_gid(struct ucma_context *ctx,
} else { } else {
addr->sib_family = AF_IB; addr->sib_family = AF_IB;
addr->sib_pkey = (__force __be16) resp.pkey; addr->sib_pkey = (__force __be16) resp.pkey;
rdma_addr_get_dgid(&ctx->cm_id->route.addr.dev_addr, rdma_read_gids(ctx->cm_id, NULL,
(union ib_gid *) &addr->sib_addr); (union ib_gid *)&addr->sib_addr);
addr->sib_sid = rdma_get_service_id(ctx->cm_id, (struct sockaddr *) addr->sib_sid = rdma_get_service_id(ctx->cm_id, (struct sockaddr *)
&ctx->cm_id->route.addr.dst_addr); &ctx->cm_id->route.addr.dst_addr);
} }
...@@ -1231,9 +1232,9 @@ static int ucma_set_ib_path(struct ucma_context *ctx, ...@@ -1231,9 +1232,9 @@ static int ucma_set_ib_path(struct ucma_context *ctx,
struct sa_path_rec opa; struct sa_path_rec opa;
sa_convert_path_ib_to_opa(&opa, &sa_path); sa_convert_path_ib_to_opa(&opa, &sa_path);
ret = rdma_set_ib_paths(ctx->cm_id, &opa, 1); ret = rdma_set_ib_path(ctx->cm_id, &opa);
} else { } else {
ret = rdma_set_ib_paths(ctx->cm_id, &sa_path, 1); ret = rdma_set_ib_path(ctx->cm_id, &sa_path);
} }
if (ret) if (ret)
return ret; return ret;
......
...@@ -352,7 +352,7 @@ int ib_umem_copy_from(void *dst, struct ib_umem *umem, size_t offset, ...@@ -352,7 +352,7 @@ int ib_umem_copy_from(void *dst, struct ib_umem *umem, size_t offset,
return -EINVAL; return -EINVAL;
} }
ret = sg_pcopy_to_buffer(umem->sg_head.sgl, umem->nmap, dst, length, ret = sg_pcopy_to_buffer(umem->sg_head.sgl, umem->npages, dst, length,
offset + ib_umem_offset(umem)); offset + ib_umem_offset(umem));
if (ret < 0) if (ret < 0)
......
...@@ -55,16 +55,21 @@ ...@@ -55,16 +55,21 @@
#include <rdma/ib_mad.h> #include <rdma/ib_mad.h>
#include <rdma/ib_user_mad.h> #include <rdma/ib_user_mad.h>
#include "core_priv.h"
MODULE_AUTHOR("Roland Dreier"); MODULE_AUTHOR("Roland Dreier");
MODULE_DESCRIPTION("InfiniBand userspace MAD packet access"); MODULE_DESCRIPTION("InfiniBand userspace MAD packet access");
MODULE_LICENSE("Dual BSD/GPL"); MODULE_LICENSE("Dual BSD/GPL");
enum { enum {
IB_UMAD_MAX_PORTS = 64, IB_UMAD_MAX_PORTS = RDMA_MAX_PORTS,
IB_UMAD_MAX_AGENTS = 32, IB_UMAD_MAX_AGENTS = 32,
IB_UMAD_MAJOR = 231, IB_UMAD_MAJOR = 231,
IB_UMAD_MINOR_BASE = 0 IB_UMAD_MINOR_BASE = 0,
IB_UMAD_NUM_FIXED_MINOR = 64,
IB_UMAD_NUM_DYNAMIC_MINOR = IB_UMAD_MAX_PORTS - IB_UMAD_NUM_FIXED_MINOR,
IB_ISSM_MINOR_BASE = IB_UMAD_NUM_FIXED_MINOR,
}; };
/* /*
...@@ -127,9 +132,12 @@ struct ib_umad_packet { ...@@ -127,9 +132,12 @@ struct ib_umad_packet {
static struct class *umad_class; static struct class *umad_class;
static const dev_t base_dev = MKDEV(IB_UMAD_MAJOR, IB_UMAD_MINOR_BASE); static const dev_t base_umad_dev = MKDEV(IB_UMAD_MAJOR, IB_UMAD_MINOR_BASE);
static const dev_t base_issm_dev = MKDEV(IB_UMAD_MAJOR, IB_UMAD_MINOR_BASE) +
IB_UMAD_NUM_FIXED_MINOR;
static dev_t dynamic_umad_dev;
static dev_t dynamic_issm_dev;
static DEFINE_SPINLOCK(port_lock);
static DECLARE_BITMAP(dev_map, IB_UMAD_MAX_PORTS); static DECLARE_BITMAP(dev_map, IB_UMAD_MAX_PORTS);
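To make the new numbering concrete (plain arithmetic from the definitions above, shown only as an illustration): devnum 0..63 come from the statically registered region, with umadN at MKDEV(231, N) and issmN at MKDEV(231, 64 + N); devnum 64..1023 are carved out of the alloc_chrdev_region() ranges, offset by devnum - 64 from dynamic_umad_dev and dynamic_issm_dev respectively. For example:

	devnum 10  -> umad10:  MKDEV(231, 10),        issm10:  MKDEV(231, 74)
	devnum 100 -> umad100: dynamic_umad_dev + 36, issm100: dynamic_issm_dev + 36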
static void ib_umad_add_one(struct ib_device *device); static void ib_umad_add_one(struct ib_device *device);
...@@ -233,8 +241,7 @@ static void recv_handler(struct ib_mad_agent *agent, ...@@ -233,8 +241,7 @@ static void recv_handler(struct ib_mad_agent *agent,
* On OPA devices it is okay to lose the upper 16 bits of LID as this * On OPA devices it is okay to lose the upper 16 bits of LID as this
* information is obtained elsewhere. Mask off the upper 16 bits. * information is obtained elsewhere. Mask off the upper 16 bits.
*/ */
if (agent->device->port_immutable[agent->port_num].core_cap_flags & if (rdma_cap_opa_mad(agent->device, agent->port_num))
RDMA_CORE_PORT_INTEL_OPA)
packet->mad.hdr.lid = ib_lid_be16(0xFFFF & packet->mad.hdr.lid = ib_lid_be16(0xFFFF &
mad_recv_wc->wc->slid); mad_recv_wc->wc->slid);
else else
...@@ -246,10 +253,14 @@ static void recv_handler(struct ib_mad_agent *agent, ...@@ -246,10 +253,14 @@ static void recv_handler(struct ib_mad_agent *agent,
if (packet->mad.hdr.grh_present) { if (packet->mad.hdr.grh_present) {
struct rdma_ah_attr ah_attr; struct rdma_ah_attr ah_attr;
const struct ib_global_route *grh; const struct ib_global_route *grh;
int ret;
ib_init_ah_from_wc(agent->device, agent->port_num, ret = ib_init_ah_attr_from_wc(agent->device, agent->port_num,
mad_recv_wc->wc, mad_recv_wc->recv_buf.grh, mad_recv_wc->wc,
&ah_attr); mad_recv_wc->recv_buf.grh,
&ah_attr);
if (ret)
goto err2;
grh = rdma_ah_read_grh(&ah_attr); grh = rdma_ah_read_grh(&ah_attr);
packet->mad.hdr.gid_index = grh->sgid_index; packet->mad.hdr.gid_index = grh->sgid_index;
...@@ -500,7 +511,7 @@ static ssize_t ib_umad_write(struct file *filp, const char __user *buf, ...@@ -500,7 +511,7 @@ static ssize_t ib_umad_write(struct file *filp, const char __user *buf,
} }
memset(&ah_attr, 0, sizeof ah_attr); memset(&ah_attr, 0, sizeof ah_attr);
ah_attr.type = rdma_ah_find_type(file->port->ib_dev, ah_attr.type = rdma_ah_find_type(agent->device,
file->port->port_num); file->port->port_num);
rdma_ah_set_dlid(&ah_attr, be16_to_cpu(packet->mad.hdr.lid)); rdma_ah_set_dlid(&ah_attr, be16_to_cpu(packet->mad.hdr.lid));
rdma_ah_set_sl(&ah_attr, packet->mad.hdr.sl); rdma_ah_set_sl(&ah_attr, packet->mad.hdr.sl);
...@@ -1139,54 +1150,26 @@ static DEVICE_ATTR(port, S_IRUGO, show_port, NULL); ...@@ -1139,54 +1150,26 @@ static DEVICE_ATTR(port, S_IRUGO, show_port, NULL);
static CLASS_ATTR_STRING(abi_version, S_IRUGO, static CLASS_ATTR_STRING(abi_version, S_IRUGO,
__stringify(IB_USER_MAD_ABI_VERSION)); __stringify(IB_USER_MAD_ABI_VERSION));
static dev_t overflow_maj;
static DECLARE_BITMAP(overflow_map, IB_UMAD_MAX_PORTS);
static int find_overflow_devnum(struct ib_device *device)
{
int ret;
if (!overflow_maj) {
ret = alloc_chrdev_region(&overflow_maj, 0, IB_UMAD_MAX_PORTS * 2,
"infiniband_mad");
if (ret) {
dev_err(&device->dev,
"couldn't register dynamic device number\n");
return ret;
}
}
ret = find_first_zero_bit(overflow_map, IB_UMAD_MAX_PORTS);
if (ret >= IB_UMAD_MAX_PORTS)
return -1;
return ret;
}
static int ib_umad_init_port(struct ib_device *device, int port_num, static int ib_umad_init_port(struct ib_device *device, int port_num,
struct ib_umad_device *umad_dev, struct ib_umad_device *umad_dev,
struct ib_umad_port *port) struct ib_umad_port *port)
{ {
int devnum; int devnum;
dev_t base; dev_t base_umad;
dev_t base_issm;
spin_lock(&port_lock);
devnum = find_first_zero_bit(dev_map, IB_UMAD_MAX_PORTS); devnum = find_first_zero_bit(dev_map, IB_UMAD_MAX_PORTS);
if (devnum >= IB_UMAD_MAX_PORTS) { if (devnum >= IB_UMAD_MAX_PORTS)
spin_unlock(&port_lock); return -1;
devnum = find_overflow_devnum(device); port->dev_num = devnum;
if (devnum < 0) set_bit(devnum, dev_map);
return -1; if (devnum >= IB_UMAD_NUM_FIXED_MINOR) {
base_umad = dynamic_umad_dev + devnum - IB_UMAD_NUM_FIXED_MINOR;
spin_lock(&port_lock); base_issm = dynamic_issm_dev + devnum - IB_UMAD_NUM_FIXED_MINOR;
port->dev_num = devnum + IB_UMAD_MAX_PORTS;
base = devnum + overflow_maj;
set_bit(devnum, overflow_map);
} else { } else {
port->dev_num = devnum; base_umad = devnum + base_umad_dev;
base = devnum + base_dev; base_issm = devnum + base_issm_dev;
set_bit(devnum, dev_map);
} }
spin_unlock(&port_lock);
port->ib_dev = device; port->ib_dev = device;
port->port_num = port_num; port->port_num = port_num;
...@@ -1198,7 +1181,7 @@ static int ib_umad_init_port(struct ib_device *device, int port_num, ...@@ -1198,7 +1181,7 @@ static int ib_umad_init_port(struct ib_device *device, int port_num,
port->cdev.owner = THIS_MODULE; port->cdev.owner = THIS_MODULE;
cdev_set_parent(&port->cdev, &umad_dev->kobj); cdev_set_parent(&port->cdev, &umad_dev->kobj);
kobject_set_name(&port->cdev.kobj, "umad%d", port->dev_num); kobject_set_name(&port->cdev.kobj, "umad%d", port->dev_num);
if (cdev_add(&port->cdev, base, 1)) if (cdev_add(&port->cdev, base_umad, 1))
goto err_cdev; goto err_cdev;
port->dev = device_create(umad_class, device->dev.parent, port->dev = device_create(umad_class, device->dev.parent,
...@@ -1212,12 +1195,11 @@ static int ib_umad_init_port(struct ib_device *device, int port_num, ...@@ -1212,12 +1195,11 @@ static int ib_umad_init_port(struct ib_device *device, int port_num,
if (device_create_file(port->dev, &dev_attr_port)) if (device_create_file(port->dev, &dev_attr_port))
goto err_dev; goto err_dev;
base += IB_UMAD_MAX_PORTS;
cdev_init(&port->sm_cdev, &umad_sm_fops); cdev_init(&port->sm_cdev, &umad_sm_fops);
port->sm_cdev.owner = THIS_MODULE; port->sm_cdev.owner = THIS_MODULE;
cdev_set_parent(&port->sm_cdev, &umad_dev->kobj); cdev_set_parent(&port->sm_cdev, &umad_dev->kobj);
kobject_set_name(&port->sm_cdev.kobj, "issm%d", port->dev_num); kobject_set_name(&port->sm_cdev.kobj, "issm%d", port->dev_num);
if (cdev_add(&port->sm_cdev, base, 1)) if (cdev_add(&port->sm_cdev, base_issm, 1))
goto err_sm_cdev; goto err_sm_cdev;
port->sm_dev = device_create(umad_class, device->dev.parent, port->sm_dev = device_create(umad_class, device->dev.parent,
...@@ -1244,10 +1226,7 @@ static int ib_umad_init_port(struct ib_device *device, int port_num, ...@@ -1244,10 +1226,7 @@ static int ib_umad_init_port(struct ib_device *device, int port_num,
err_cdev: err_cdev:
cdev_del(&port->cdev); cdev_del(&port->cdev);
if (port->dev_num < IB_UMAD_MAX_PORTS) clear_bit(devnum, dev_map);
clear_bit(devnum, dev_map);
else
clear_bit(devnum, overflow_map);
return -1; return -1;
} }
...@@ -1281,11 +1260,7 @@ static void ib_umad_kill_port(struct ib_umad_port *port) ...@@ -1281,11 +1260,7 @@ static void ib_umad_kill_port(struct ib_umad_port *port)
} }
mutex_unlock(&port->file_mutex); mutex_unlock(&port->file_mutex);
clear_bit(port->dev_num, dev_map);
if (port->dev_num < IB_UMAD_MAX_PORTS)
clear_bit(port->dev_num, dev_map);
else
clear_bit(port->dev_num - IB_UMAD_MAX_PORTS, overflow_map);
} }
static void ib_umad_add_one(struct ib_device *device) static void ib_umad_add_one(struct ib_device *device)
...@@ -1361,13 +1336,23 @@ static int __init ib_umad_init(void) ...@@ -1361,13 +1336,23 @@ static int __init ib_umad_init(void)
{ {
int ret; int ret;
ret = register_chrdev_region(base_dev, IB_UMAD_MAX_PORTS * 2, ret = register_chrdev_region(base_umad_dev,
IB_UMAD_NUM_FIXED_MINOR * 2,
"infiniband_mad"); "infiniband_mad");
if (ret) { if (ret) {
pr_err("couldn't register device number\n"); pr_err("couldn't register device number\n");
goto out; goto out;
} }
ret = alloc_chrdev_region(&dynamic_umad_dev, 0,
IB_UMAD_NUM_DYNAMIC_MINOR * 2,
"infiniband_mad");
if (ret) {
pr_err("couldn't register dynamic device number\n");
goto out_alloc;
}
dynamic_issm_dev = dynamic_umad_dev + IB_UMAD_NUM_DYNAMIC_MINOR;
umad_class = class_create(THIS_MODULE, "infiniband_mad"); umad_class = class_create(THIS_MODULE, "infiniband_mad");
if (IS_ERR(umad_class)) { if (IS_ERR(umad_class)) {
ret = PTR_ERR(umad_class); ret = PTR_ERR(umad_class);
...@@ -1395,7 +1380,12 @@ static int __init ib_umad_init(void) ...@@ -1395,7 +1380,12 @@ static int __init ib_umad_init(void)
class_destroy(umad_class); class_destroy(umad_class);
out_chrdev: out_chrdev:
unregister_chrdev_region(base_dev, IB_UMAD_MAX_PORTS * 2); unregister_chrdev_region(dynamic_umad_dev,
IB_UMAD_NUM_DYNAMIC_MINOR * 2);
out_alloc:
unregister_chrdev_region(base_umad_dev,
IB_UMAD_NUM_FIXED_MINOR * 2);
out: out:
return ret; return ret;
...@@ -1405,9 +1395,10 @@ static void __exit ib_umad_cleanup(void) ...@@ -1405,9 +1395,10 @@ static void __exit ib_umad_cleanup(void)
{ {
ib_unregister_client(&umad_client); ib_unregister_client(&umad_client);
class_destroy(umad_class); class_destroy(umad_class);
unregister_chrdev_region(base_dev, IB_UMAD_MAX_PORTS * 2); unregister_chrdev_region(base_umad_dev,
if (overflow_maj) IB_UMAD_NUM_FIXED_MINOR * 2);
unregister_chrdev_region(overflow_maj, IB_UMAD_MAX_PORTS * 2); unregister_chrdev_region(dynamic_umad_dev,
IB_UMAD_NUM_DYNAMIC_MINOR * 2);
} }
module_init(ib_umad_init); module_init(ib_umad_init);
......
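Note: the ib_umad hunks above drop the old on-demand "overflow major" scheme and instead register two minor ranges up front: the historic fixed block at base_umad_dev plus a dynamically allocated block (dynamic_umad_dev), each sized times two because every port exposes both a umadN and an issmN node. A minimal kernel-style sketch of that registration pattern; the EX_* constants and ex_base_dev are placeholders standing in for IB_UMAD_NUM_FIXED_MINOR, IB_UMAD_NUM_DYNAMIC_MINOR and base_umad_dev.

#include <linux/fs.h>
#include <linux/module.h>

/* Placeholder sizes and fixed base; the driver derives the real values
 * from RDMA_MAX_PORTS and its historic device numbers.
 */
#define EX_NUM_FIXED	64
#define EX_NUM_DYNAMIC	(1024 - EX_NUM_FIXED)
static const dev_t ex_base_dev = MKDEV(231, 0);	/* assumed fixed base */
static dev_t ex_dynamic_dev;

static int __init ex_umad_like_init(void)
{
	int ret;

	/* Keep the historic fixed major:minor block ("* 2": umad + issm). */
	ret = register_chrdev_region(ex_base_dev, EX_NUM_FIXED * 2,
				     "infiniband_mad");
	if (ret)
		return ret;

	/* Dynamically numbered block for ports beyond the fixed range. */
	ret = alloc_chrdev_region(&ex_dynamic_dev, 0, EX_NUM_DYNAMIC * 2,
				  "infiniband_mad");
	if (ret)
		unregister_chrdev_region(ex_base_dev, EX_NUM_FIXED * 2);
	return ret;
}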
...@@ -340,6 +340,8 @@ ssize_t ib_uverbs_alloc_pd(struct ib_uverbs_file *file, ...@@ -340,6 +340,8 @@ ssize_t ib_uverbs_alloc_pd(struct ib_uverbs_file *file,
uobj->object = pd; uobj->object = pd;
memset(&resp, 0, sizeof resp); memset(&resp, 0, sizeof resp);
resp.pd_handle = uobj->id; resp.pd_handle = uobj->id;
pd->res.type = RDMA_RESTRACK_PD;
rdma_restrack_add(&pd->res);
if (copy_to_user(u64_to_user_ptr(cmd.response), &resp, sizeof resp)) { if (copy_to_user(u64_to_user_ptr(cmd.response), &resp, sizeof resp)) {
ret = -EFAULT; ret = -EFAULT;
...@@ -1033,6 +1035,8 @@ static struct ib_ucq_object *create_cq(struct ib_uverbs_file *file, ...@@ -1033,6 +1035,8 @@ static struct ib_ucq_object *create_cq(struct ib_uverbs_file *file,
goto err_cb; goto err_cb;
uobj_alloc_commit(&obj->uobject); uobj_alloc_commit(&obj->uobject);
cq->res.type = RDMA_RESTRACK_CQ;
rdma_restrack_add(&cq->res);
return obj; return obj;
...@@ -1145,10 +1149,7 @@ int ib_uverbs_ex_create_cq(struct ib_uverbs_file *file, ...@@ -1145,10 +1149,7 @@ int ib_uverbs_ex_create_cq(struct ib_uverbs_file *file,
min(ucore->inlen, sizeof(cmd)), min(ucore->inlen, sizeof(cmd)),
ib_uverbs_ex_create_cq_cb, NULL); ib_uverbs_ex_create_cq_cb, NULL);
if (IS_ERR(obj)) return PTR_ERR_OR_ZERO(obj);
return PTR_ERR(obj);
return 0;
} }
ssize_t ib_uverbs_resize_cq(struct ib_uverbs_file *file, ssize_t ib_uverbs_resize_cq(struct ib_uverbs_file *file,
...@@ -1199,7 +1200,7 @@ static int copy_wc_to_user(struct ib_device *ib_dev, void __user *dest, ...@@ -1199,7 +1200,7 @@ static int copy_wc_to_user(struct ib_device *ib_dev, void __user *dest,
tmp.opcode = wc->opcode; tmp.opcode = wc->opcode;
tmp.vendor_err = wc->vendor_err; tmp.vendor_err = wc->vendor_err;
tmp.byte_len = wc->byte_len; tmp.byte_len = wc->byte_len;
tmp.ex.imm_data = (__u32 __force) wc->ex.imm_data; tmp.ex.imm_data = wc->ex.imm_data;
tmp.qp_num = wc->qp->qp_num; tmp.qp_num = wc->qp->qp_num;
tmp.src_qp = wc->src_qp; tmp.src_qp = wc->src_qp;
tmp.wc_flags = wc->wc_flags; tmp.wc_flags = wc->wc_flags;
...@@ -1517,7 +1518,7 @@ static int create_qp(struct ib_uverbs_file *file, ...@@ -1517,7 +1518,7 @@ static int create_qp(struct ib_uverbs_file *file,
if (cmd->qp_type == IB_QPT_XRC_TGT) if (cmd->qp_type == IB_QPT_XRC_TGT)
qp = ib_create_qp(pd, &attr); qp = ib_create_qp(pd, &attr);
else else
qp = device->create_qp(pd, &attr, uhw); qp = _ib_create_qp(device, pd, &attr, uhw);
if (IS_ERR(qp)) { if (IS_ERR(qp)) {
ret = PTR_ERR(qp); ret = PTR_ERR(qp);
...@@ -1530,7 +1531,6 @@ static int create_qp(struct ib_uverbs_file *file, ...@@ -1530,7 +1531,6 @@ static int create_qp(struct ib_uverbs_file *file,
goto err_cb; goto err_cb;
qp->real_qp = qp; qp->real_qp = qp;
qp->device = device;
qp->pd = pd; qp->pd = pd;
qp->send_cq = attr.send_cq; qp->send_cq = attr.send_cq;
qp->recv_cq = attr.recv_cq; qp->recv_cq = attr.recv_cq;
......
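Note: both uverbs_cmd.c hunks above plug newly created PDs and CQs into the resource-tracking (restrack) core added this cycle, which is what the new nldev "ss-like" interface reads back. A minimal sketch of the pairing the core expects; the helper names here are illustrative, while the calls themselves (rdma_restrack_add()/rdma_restrack_del() on the object's embedded res entry) are the ones used in the diff.

#include <rdma/ib_verbs.h>
#include <rdma/restrack.h>

/* Register an object once it is fully initialized ... */
static void ex_track_pd(struct ib_pd *pd)
{
	pd->res.type = RDMA_RESTRACK_PD;
	rdma_restrack_add(&pd->res);	/* becomes visible via nldev */
}

/* ... and drop it from tracking before the object is destroyed. */
static void ex_untrack_pd(struct ib_pd *pd)
{
	rdma_restrack_del(&pd->res);	/* must precede dealloc_pd() */
}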
...@@ -243,16 +243,13 @@ static long ib_uverbs_cmd_verbs(struct ib_device *ib_dev, ...@@ -243,16 +243,13 @@ static long ib_uverbs_cmd_verbs(struct ib_device *ib_dev,
size_t ctx_size; size_t ctx_size;
uintptr_t data[UVERBS_OPTIMIZE_USING_STACK_SZ / sizeof(uintptr_t)]; uintptr_t data[UVERBS_OPTIMIZE_USING_STACK_SZ / sizeof(uintptr_t)];
if (hdr->reserved)
return -EINVAL;
object_spec = uverbs_get_object(ib_dev, hdr->object_id); object_spec = uverbs_get_object(ib_dev, hdr->object_id);
if (!object_spec) if (!object_spec)
return -EOPNOTSUPP; return -EPROTONOSUPPORT;
method_spec = uverbs_get_method(object_spec, hdr->method_id); method_spec = uverbs_get_method(object_spec, hdr->method_id);
if (!method_spec) if (!method_spec)
return -EOPNOTSUPP; return -EPROTONOSUPPORT;
if ((method_spec->flags & UVERBS_ACTION_FLAG_CREATE_ROOT) ^ !file->ucontext) if ((method_spec->flags & UVERBS_ACTION_FLAG_CREATE_ROOT) ^ !file->ucontext)
return -EINVAL; return -EINVAL;
...@@ -305,6 +302,16 @@ static long ib_uverbs_cmd_verbs(struct ib_device *ib_dev, ...@@ -305,6 +302,16 @@ static long ib_uverbs_cmd_verbs(struct ib_device *ib_dev,
err = uverbs_handle_method(buf, ctx->uattrs, hdr->num_attrs, ib_dev, err = uverbs_handle_method(buf, ctx->uattrs, hdr->num_attrs, ib_dev,
file, method_spec, ctx->uverbs_attr_bundle); file, method_spec, ctx->uverbs_attr_bundle);
/*
* EPROTONOSUPPORT is ONLY to be returned if the ioctl framework can
* not invoke the method because the request is not supported. No
* other cases should return this code.
*/
if (unlikely(err == -EPROTONOSUPPORT)) {
WARN_ON_ONCE(err == -EPROTONOSUPPORT);
err = -EINVAL;
}
out: out:
if (ctx != (void *)data) if (ctx != (void *)data)
kfree(ctx); kfree(ctx);
...@@ -341,7 +348,7 @@ long ib_uverbs_ioctl(struct file *filp, unsigned int cmd, unsigned long arg) ...@@ -341,7 +348,7 @@ long ib_uverbs_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
} }
if (hdr.reserved) { if (hdr.reserved) {
err = -EOPNOTSUPP; err = -EPROTONOSUPPORT;
goto out; goto out;
} }
......
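Note: the ioctl-dispatch hunks above settle on -EPROTONOSUPPORT as the "framework does not know this object/method" code (previously -EOPNOTSUPP) and, per the added comment, guarantee that a method handler can never leak the same value back to userspace: if one does, it is warned about and remapped to -EINVAL. A small sketch of that policy; example_dispatch and the handler pointer are illustrative stand-ins for ib_uverbs_cmd_verbs() and uverbs_handle_method().

#include <linux/bug.h>
#include <linux/errno.h>
#include <linux/types.h>

static long example_dispatch(bool method_known, long (*handler)(void))
{
	long err;

	if (!method_known)
		return -EPROTONOSUPPORT;	/* reserved for the framework */

	err = handler();
	if (WARN_ON_ONCE(err == -EPROTONOSUPPORT))
		err = -EINVAL;			/* handlers must not use it */
	return err;
}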
...@@ -62,14 +62,16 @@ MODULE_LICENSE("Dual BSD/GPL"); ...@@ -62,14 +62,16 @@ MODULE_LICENSE("Dual BSD/GPL");
enum { enum {
IB_UVERBS_MAJOR = 231, IB_UVERBS_MAJOR = 231,
IB_UVERBS_BASE_MINOR = 192, IB_UVERBS_BASE_MINOR = 192,
IB_UVERBS_MAX_DEVICES = 32 IB_UVERBS_MAX_DEVICES = RDMA_MAX_PORTS,
IB_UVERBS_NUM_FIXED_MINOR = 32,
IB_UVERBS_NUM_DYNAMIC_MINOR = IB_UVERBS_MAX_DEVICES - IB_UVERBS_NUM_FIXED_MINOR,
}; };
#define IB_UVERBS_BASE_DEV MKDEV(IB_UVERBS_MAJOR, IB_UVERBS_BASE_MINOR) #define IB_UVERBS_BASE_DEV MKDEV(IB_UVERBS_MAJOR, IB_UVERBS_BASE_MINOR)
static dev_t dynamic_uverbs_dev;
static struct class *uverbs_class; static struct class *uverbs_class;
static DEFINE_SPINLOCK(map_lock);
static DECLARE_BITMAP(dev_map, IB_UVERBS_MAX_DEVICES); static DECLARE_BITMAP(dev_map, IB_UVERBS_MAX_DEVICES);
static ssize_t (*uverbs_cmd_table[])(struct ib_uverbs_file *file, static ssize_t (*uverbs_cmd_table[])(struct ib_uverbs_file *file,
...@@ -1005,34 +1007,6 @@ static DEVICE_ATTR(abi_version, S_IRUGO, show_dev_abi_version, NULL); ...@@ -1005,34 +1007,6 @@ static DEVICE_ATTR(abi_version, S_IRUGO, show_dev_abi_version, NULL);
static CLASS_ATTR_STRING(abi_version, S_IRUGO, static CLASS_ATTR_STRING(abi_version, S_IRUGO,
__stringify(IB_USER_VERBS_ABI_VERSION)); __stringify(IB_USER_VERBS_ABI_VERSION));
static dev_t overflow_maj;
static DECLARE_BITMAP(overflow_map, IB_UVERBS_MAX_DEVICES);
/*
* If we have more than IB_UVERBS_MAX_DEVICES, dynamically overflow by
* requesting a new major number and doubling the number of max devices we
* support. It's stupid, but simple.
*/
static int find_overflow_devnum(void)
{
int ret;
if (!overflow_maj) {
ret = alloc_chrdev_region(&overflow_maj, 0, IB_UVERBS_MAX_DEVICES,
"infiniband_verbs");
if (ret) {
pr_err("user_verbs: couldn't register dynamic device number\n");
return ret;
}
}
ret = find_first_zero_bit(overflow_map, IB_UVERBS_MAX_DEVICES);
if (ret >= IB_UVERBS_MAX_DEVICES)
return -1;
return ret;
}
static void ib_uverbs_add_one(struct ib_device *device) static void ib_uverbs_add_one(struct ib_device *device)
{ {
int devnum; int devnum;
...@@ -1062,24 +1036,15 @@ static void ib_uverbs_add_one(struct ib_device *device) ...@@ -1062,24 +1036,15 @@ static void ib_uverbs_add_one(struct ib_device *device)
INIT_LIST_HEAD(&uverbs_dev->uverbs_file_list); INIT_LIST_HEAD(&uverbs_dev->uverbs_file_list);
INIT_LIST_HEAD(&uverbs_dev->uverbs_events_file_list); INIT_LIST_HEAD(&uverbs_dev->uverbs_events_file_list);
spin_lock(&map_lock);
devnum = find_first_zero_bit(dev_map, IB_UVERBS_MAX_DEVICES); devnum = find_first_zero_bit(dev_map, IB_UVERBS_MAX_DEVICES);
if (devnum >= IB_UVERBS_MAX_DEVICES) { if (devnum >= IB_UVERBS_MAX_DEVICES)
spin_unlock(&map_lock); goto err;
devnum = find_overflow_devnum(); uverbs_dev->devnum = devnum;
if (devnum < 0) set_bit(devnum, dev_map);
goto err; if (devnum >= IB_UVERBS_NUM_FIXED_MINOR)
base = dynamic_uverbs_dev + devnum - IB_UVERBS_NUM_FIXED_MINOR;
spin_lock(&map_lock); else
uverbs_dev->devnum = devnum + IB_UVERBS_MAX_DEVICES; base = IB_UVERBS_BASE_DEV + devnum;
base = devnum + overflow_maj;
set_bit(devnum, overflow_map);
} else {
uverbs_dev->devnum = devnum;
base = devnum + IB_UVERBS_BASE_DEV;
set_bit(devnum, dev_map);
}
spin_unlock(&map_lock);
rcu_assign_pointer(uverbs_dev->ib_dev, device); rcu_assign_pointer(uverbs_dev->ib_dev, device);
uverbs_dev->num_comp_vectors = device->num_comp_vectors; uverbs_dev->num_comp_vectors = device->num_comp_vectors;
...@@ -1124,10 +1089,7 @@ static void ib_uverbs_add_one(struct ib_device *device) ...@@ -1124,10 +1089,7 @@ static void ib_uverbs_add_one(struct ib_device *device)
err_cdev: err_cdev:
cdev_del(&uverbs_dev->cdev); cdev_del(&uverbs_dev->cdev);
if (uverbs_dev->devnum < IB_UVERBS_MAX_DEVICES) clear_bit(devnum, dev_map);
clear_bit(devnum, dev_map);
else
clear_bit(devnum, overflow_map);
err: err:
if (atomic_dec_and_test(&uverbs_dev->refcount)) if (atomic_dec_and_test(&uverbs_dev->refcount))
...@@ -1219,11 +1181,7 @@ static void ib_uverbs_remove_one(struct ib_device *device, void *client_data) ...@@ -1219,11 +1181,7 @@ static void ib_uverbs_remove_one(struct ib_device *device, void *client_data)
dev_set_drvdata(uverbs_dev->dev, NULL); dev_set_drvdata(uverbs_dev->dev, NULL);
device_destroy(uverbs_class, uverbs_dev->cdev.dev); device_destroy(uverbs_class, uverbs_dev->cdev.dev);
cdev_del(&uverbs_dev->cdev); cdev_del(&uverbs_dev->cdev);
clear_bit(uverbs_dev->devnum, dev_map);
if (uverbs_dev->devnum < IB_UVERBS_MAX_DEVICES)
clear_bit(uverbs_dev->devnum, dev_map);
else
clear_bit(uverbs_dev->devnum - IB_UVERBS_MAX_DEVICES, overflow_map);
if (device->disassociate_ucontext) { if (device->disassociate_ucontext) {
/* We disassociate HW resources and immediately return. /* We disassociate HW resources and immediately return.
...@@ -1265,13 +1223,22 @@ static int __init ib_uverbs_init(void) ...@@ -1265,13 +1223,22 @@ static int __init ib_uverbs_init(void)
{ {
int ret; int ret;
ret = register_chrdev_region(IB_UVERBS_BASE_DEV, IB_UVERBS_MAX_DEVICES, ret = register_chrdev_region(IB_UVERBS_BASE_DEV,
IB_UVERBS_NUM_FIXED_MINOR,
"infiniband_verbs"); "infiniband_verbs");
if (ret) { if (ret) {
pr_err("user_verbs: couldn't register device number\n"); pr_err("user_verbs: couldn't register device number\n");
goto out; goto out;
} }
ret = alloc_chrdev_region(&dynamic_uverbs_dev, 0,
IB_UVERBS_NUM_DYNAMIC_MINOR,
"infiniband_verbs");
if (ret) {
pr_err("couldn't register dynamic device number\n");
goto out_alloc;
}
uverbs_class = class_create(THIS_MODULE, "infiniband_verbs"); uverbs_class = class_create(THIS_MODULE, "infiniband_verbs");
if (IS_ERR(uverbs_class)) { if (IS_ERR(uverbs_class)) {
ret = PTR_ERR(uverbs_class); ret = PTR_ERR(uverbs_class);
...@@ -1299,7 +1266,12 @@ static int __init ib_uverbs_init(void) ...@@ -1299,7 +1266,12 @@ static int __init ib_uverbs_init(void)
class_destroy(uverbs_class); class_destroy(uverbs_class);
out_chrdev: out_chrdev:
unregister_chrdev_region(IB_UVERBS_BASE_DEV, IB_UVERBS_MAX_DEVICES); unregister_chrdev_region(dynamic_uverbs_dev,
IB_UVERBS_NUM_DYNAMIC_MINOR);
out_alloc:
unregister_chrdev_region(IB_UVERBS_BASE_DEV,
IB_UVERBS_NUM_FIXED_MINOR);
out: out:
return ret; return ret;
...@@ -1309,9 +1281,10 @@ static void __exit ib_uverbs_cleanup(void) ...@@ -1309,9 +1281,10 @@ static void __exit ib_uverbs_cleanup(void)
{ {
ib_unregister_client(&uverbs_client); ib_unregister_client(&uverbs_client);
class_destroy(uverbs_class); class_destroy(uverbs_class);
unregister_chrdev_region(IB_UVERBS_BASE_DEV, IB_UVERBS_MAX_DEVICES); unregister_chrdev_region(IB_UVERBS_BASE_DEV,
if (overflow_maj) IB_UVERBS_NUM_FIXED_MINOR);
unregister_chrdev_region(overflow_maj, IB_UVERBS_MAX_DEVICES); unregister_chrdev_region(dynamic_uverbs_dev,
IB_UVERBS_NUM_DYNAMIC_MINOR);
} }
module_init(ib_uverbs_init); module_init(ib_uverbs_init);
......
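Note: with find_overflow_devnum() gone, ib_uverbs_add_one() above derives the dev_t from a single bitmap index: slots below IB_UVERBS_NUM_FIXED_MINOR map onto the fixed IB_UVERBS_BASE_DEV range, everything above maps onto the alloc_chrdev_region()-provided dynamic range. A self-contained sketch of that arithmetic with placeholder base values (the real bases are IB_UVERBS_BASE_DEV and dynamic_uverbs_dev).

#include <stdio.h>

#define NUM_FIXED  32u      /* IB_UVERBS_NUM_FIXED_MINOR */
#define FIXED_BASE 1000u    /* placeholder for IB_UVERBS_BASE_DEV */
#define DYN_BASE   5000u    /* placeholder for dynamic_uverbs_dev */

static unsigned int devnum_to_base(unsigned int devnum)
{
	if (devnum >= NUM_FIXED)
		return DYN_BASE + devnum - NUM_FIXED;
	return FIXED_BASE + devnum;
}

int main(void)
{
	/* devnums 0..31 land in the fixed range, 32 and up in the dynamic one */
	printf("%u %u %u\n", devnum_to_base(0), devnum_to_base(31),
	       devnum_to_base(32));
	return 0;
}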
...@@ -35,6 +35,7 @@ ...@@ -35,6 +35,7 @@
#include <rdma/ib_verbs.h> #include <rdma/ib_verbs.h>
#include <linux/bug.h> #include <linux/bug.h>
#include <linux/file.h> #include <linux/file.h>
#include <rdma/restrack.h>
#include "rdma_core.h" #include "rdma_core.h"
#include "uverbs.h" #include "uverbs.h"
...@@ -319,6 +320,8 @@ static int uverbs_create_cq_handler(struct ib_device *ib_dev, ...@@ -319,6 +320,8 @@ static int uverbs_create_cq_handler(struct ib_device *ib_dev,
obj->uobject.object = cq; obj->uobject.object = cq;
obj->uobject.user_handle = user_handle; obj->uobject.user_handle = user_handle;
atomic_set(&cq->usecnt, 0); atomic_set(&cq->usecnt, 0);
cq->res.type = RDMA_RESTRACK_CQ;
rdma_restrack_add(&cq->res);
ret = uverbs_copy_to(attrs, CREATE_CQ_RESP_CQE, &cq->cqe); ret = uverbs_copy_to(attrs, CREATE_CQ_RESP_CQE, &cq->cqe);
if (ret) if (ret)
......
...@@ -124,16 +124,24 @@ EXPORT_SYMBOL(ib_wc_status_msg); ...@@ -124,16 +124,24 @@ EXPORT_SYMBOL(ib_wc_status_msg);
__attribute_const__ int ib_rate_to_mult(enum ib_rate rate) __attribute_const__ int ib_rate_to_mult(enum ib_rate rate)
{ {
switch (rate) { switch (rate) {
case IB_RATE_2_5_GBPS: return 1; case IB_RATE_2_5_GBPS: return 1;
case IB_RATE_5_GBPS: return 2; case IB_RATE_5_GBPS: return 2;
case IB_RATE_10_GBPS: return 4; case IB_RATE_10_GBPS: return 4;
case IB_RATE_20_GBPS: return 8; case IB_RATE_20_GBPS: return 8;
case IB_RATE_30_GBPS: return 12; case IB_RATE_30_GBPS: return 12;
case IB_RATE_40_GBPS: return 16; case IB_RATE_40_GBPS: return 16;
case IB_RATE_60_GBPS: return 24; case IB_RATE_60_GBPS: return 24;
case IB_RATE_80_GBPS: return 32; case IB_RATE_80_GBPS: return 32;
case IB_RATE_120_GBPS: return 48; case IB_RATE_120_GBPS: return 48;
default: return -1; case IB_RATE_14_GBPS: return 6;
case IB_RATE_56_GBPS: return 22;
case IB_RATE_112_GBPS: return 45;
case IB_RATE_168_GBPS: return 67;
case IB_RATE_25_GBPS: return 10;
case IB_RATE_100_GBPS: return 40;
case IB_RATE_200_GBPS: return 80;
case IB_RATE_300_GBPS: return 120;
default: return -1;
} }
} }
EXPORT_SYMBOL(ib_rate_to_mult); EXPORT_SYMBOL(ib_rate_to_mult);
...@@ -141,16 +149,24 @@ EXPORT_SYMBOL(ib_rate_to_mult); ...@@ -141,16 +149,24 @@ EXPORT_SYMBOL(ib_rate_to_mult);
__attribute_const__ enum ib_rate mult_to_ib_rate(int mult) __attribute_const__ enum ib_rate mult_to_ib_rate(int mult)
{ {
switch (mult) { switch (mult) {
case 1: return IB_RATE_2_5_GBPS; case 1: return IB_RATE_2_5_GBPS;
case 2: return IB_RATE_5_GBPS; case 2: return IB_RATE_5_GBPS;
case 4: return IB_RATE_10_GBPS; case 4: return IB_RATE_10_GBPS;
case 8: return IB_RATE_20_GBPS; case 8: return IB_RATE_20_GBPS;
case 12: return IB_RATE_30_GBPS; case 12: return IB_RATE_30_GBPS;
case 16: return IB_RATE_40_GBPS; case 16: return IB_RATE_40_GBPS;
case 24: return IB_RATE_60_GBPS; case 24: return IB_RATE_60_GBPS;
case 32: return IB_RATE_80_GBPS; case 32: return IB_RATE_80_GBPS;
case 48: return IB_RATE_120_GBPS; case 48: return IB_RATE_120_GBPS;
default: return IB_RATE_PORT_CURRENT; case 6: return IB_RATE_14_GBPS;
case 22: return IB_RATE_56_GBPS;
case 45: return IB_RATE_112_GBPS;
case 67: return IB_RATE_168_GBPS;
case 10: return IB_RATE_25_GBPS;
case 40: return IB_RATE_100_GBPS;
case 80: return IB_RATE_200_GBPS;
case 120: return IB_RATE_300_GBPS;
default: return IB_RATE_PORT_CURRENT;
} }
} }
EXPORT_SYMBOL(mult_to_ib_rate); EXPORT_SYMBOL(mult_to_ib_rate);
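Note: the two switches above teach the rate helpers the FDR/EDR and 25/50/100G-style speeds; the multiplier is the link rate expressed in 2.5 Gb/s SDR units, rounded to the agreed table value (for example 14 Gb/s maps to 6 and 100 Gb/s to 40), so the two functions must remain inverses of each other. A short kernel-style round-trip check, assuming only the declarations in <rdma/ib_verbs.h>.

#include <rdma/ib_verbs.h>

/* Every supported rate should survive rate -> mult -> rate unchanged. */
static bool ex_rate_roundtrips(enum ib_rate rate)
{
	int mult = ib_rate_to_mult(rate);

	return mult > 0 && mult_to_ib_rate(mult) == rate;
}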
...@@ -247,6 +263,10 @@ struct ib_pd *__ib_alloc_pd(struct ib_device *device, unsigned int flags, ...@@ -247,6 +263,10 @@ struct ib_pd *__ib_alloc_pd(struct ib_device *device, unsigned int flags,
mr_access_flags |= IB_ACCESS_REMOTE_READ | IB_ACCESS_REMOTE_WRITE; mr_access_flags |= IB_ACCESS_REMOTE_READ | IB_ACCESS_REMOTE_WRITE;
} }
pd->res.type = RDMA_RESTRACK_PD;
pd->res.kern_name = caller;
rdma_restrack_add(&pd->res);
if (mr_access_flags) { if (mr_access_flags) {
struct ib_mr *mr; struct ib_mr *mr;
...@@ -296,6 +316,7 @@ void ib_dealloc_pd(struct ib_pd *pd) ...@@ -296,6 +316,7 @@ void ib_dealloc_pd(struct ib_pd *pd)
requires the caller to guarantee we can't race here. */ requires the caller to guarantee we can't race here. */
WARN_ON(atomic_read(&pd->usecnt)); WARN_ON(atomic_read(&pd->usecnt));
rdma_restrack_del(&pd->res);
/* Making delalloc_pd a void return is a WIP, no driver should return /* Making delalloc_pd a void return is a WIP, no driver should return
an error here. */ an error here. */
ret = pd->device->dealloc_pd(pd); ret = pd->device->dealloc_pd(pd);
...@@ -421,8 +442,7 @@ static bool find_gid_index(const union ib_gid *gid, ...@@ -421,8 +442,7 @@ static bool find_gid_index(const union ib_gid *gid,
const struct ib_gid_attr *gid_attr, const struct ib_gid_attr *gid_attr,
void *context) void *context)
{ {
struct find_gid_index_context *ctx = struct find_gid_index_context *ctx = context;
(struct find_gid_index_context *)context;
if (ctx->gid_type != gid_attr->gid_type) if (ctx->gid_type != gid_attr->gid_type)
return false; return false;
...@@ -481,8 +501,53 @@ int ib_get_gids_from_rdma_hdr(const union rdma_network_hdr *hdr, ...@@ -481,8 +501,53 @@ int ib_get_gids_from_rdma_hdr(const union rdma_network_hdr *hdr,
} }
EXPORT_SYMBOL(ib_get_gids_from_rdma_hdr); EXPORT_SYMBOL(ib_get_gids_from_rdma_hdr);
/* Resolve destination mac address and hop limit for unicast destination
* GID entry, considering the source GID entry as well.
* ah_attribute must have valid port_num, sgid_index.
*/
static int ib_resolve_unicast_gid_dmac(struct ib_device *device,
struct rdma_ah_attr *ah_attr)
{
struct ib_gid_attr sgid_attr;
struct ib_global_route *grh;
int hop_limit = 0xff;
union ib_gid sgid;
int ret;
grh = rdma_ah_retrieve_grh(ah_attr);
ret = ib_query_gid(device,
rdma_ah_get_port_num(ah_attr),
grh->sgid_index,
&sgid, &sgid_attr);
if (ret || !sgid_attr.ndev) {
if (!ret)
ret = -ENXIO;
return ret;
}
/* If destination is link local and source GID is RoCEv1,
* IP stack is not used.
*/
if (rdma_link_local_addr((struct in6_addr *)grh->dgid.raw) &&
sgid_attr.gid_type == IB_GID_TYPE_ROCE) {
rdma_get_ll_mac((struct in6_addr *)grh->dgid.raw,
ah_attr->roce.dmac);
goto done;
}
ret = rdma_addr_find_l2_eth_by_grh(&sgid, &grh->dgid,
ah_attr->roce.dmac,
sgid_attr.ndev, &hop_limit);
done:
dev_put(sgid_attr.ndev);
grh->hop_limit = hop_limit;
return ret;
}
/* /*
* This function creates ah from the incoming packet. * This function initializes address handle attributes from the incoming packet.
* Incoming packet has dgid of the receiver node on which this code is * Incoming packet has dgid of the receiver node on which this code is
* getting executed and, sgid contains the GID of the sender. * getting executed and, sgid contains the GID of the sender.
* *
...@@ -490,13 +555,10 @@ EXPORT_SYMBOL(ib_get_gids_from_rdma_hdr); ...@@ -490,13 +555,10 @@ EXPORT_SYMBOL(ib_get_gids_from_rdma_hdr);
* as sgid and, sgid is used as dgid because sgid contains destinations * as sgid and, sgid is used as dgid because sgid contains destinations
* GID whom to respond to. * GID whom to respond to.
* *
* This is why when calling rdma_addr_find_l2_eth_by_grh() function, the
* position of arguments dgid and sgid do not match the order of the
* parameters.
*/ */
int ib_init_ah_from_wc(struct ib_device *device, u8 port_num, int ib_init_ah_attr_from_wc(struct ib_device *device, u8 port_num,
const struct ib_wc *wc, const struct ib_grh *grh, const struct ib_wc *wc, const struct ib_grh *grh,
struct rdma_ah_attr *ah_attr) struct rdma_ah_attr *ah_attr)
{ {
u32 flow_class; u32 flow_class;
u16 gid_index; u16 gid_index;
...@@ -523,57 +585,33 @@ int ib_init_ah_from_wc(struct ib_device *device, u8 port_num, ...@@ -523,57 +585,33 @@ int ib_init_ah_from_wc(struct ib_device *device, u8 port_num,
if (ret) if (ret)
return ret; return ret;
rdma_ah_set_sl(ah_attr, wc->sl);
rdma_ah_set_port_num(ah_attr, port_num);
if (rdma_protocol_roce(device, port_num)) { if (rdma_protocol_roce(device, port_num)) {
int if_index = 0;
u16 vlan_id = wc->wc_flags & IB_WC_WITH_VLAN ? u16 vlan_id = wc->wc_flags & IB_WC_WITH_VLAN ?
wc->vlan_id : 0xffff; wc->vlan_id : 0xffff;
struct net_device *idev;
struct net_device *resolved_dev;
if (!(wc->wc_flags & IB_WC_GRH)) if (!(wc->wc_flags & IB_WC_GRH))
return -EPROTOTYPE; return -EPROTOTYPE;
if (!device->get_netdev) ret = get_sgid_index_from_eth(device, port_num,
return -EOPNOTSUPP; vlan_id, &dgid,
gid_type, &gid_index);
idev = device->get_netdev(device, port_num);
if (!idev)
return -ENODEV;
ret = rdma_addr_find_l2_eth_by_grh(&dgid, &sgid,
ah_attr->roce.dmac,
wc->wc_flags & IB_WC_WITH_VLAN ?
NULL : &vlan_id,
&if_index, &hoplimit);
if (ret) {
dev_put(idev);
return ret;
}
resolved_dev = dev_get_by_index(&init_net, if_index);
rcu_read_lock();
if (resolved_dev != idev && !rdma_is_upper_dev_rcu(idev,
resolved_dev))
ret = -EHOSTUNREACH;
rcu_read_unlock();
dev_put(idev);
dev_put(resolved_dev);
if (ret) if (ret)
return ret; return ret;
ret = get_sgid_index_from_eth(device, port_num, vlan_id, flow_class = be32_to_cpu(grh->version_tclass_flow);
&dgid, gid_type, &gid_index); rdma_ah_set_grh(ah_attr, &sgid,
if (ret) flow_class & 0xFFFFF,
return ret; (u8)gid_index, hoplimit,
} (flow_class >> 20) & 0xFF);
return ib_resolve_unicast_gid_dmac(device, ah_attr);
rdma_ah_set_dlid(ah_attr, wc->slid); } else {
rdma_ah_set_sl(ah_attr, wc->sl); rdma_ah_set_dlid(ah_attr, wc->slid);
rdma_ah_set_path_bits(ah_attr, wc->dlid_path_bits); rdma_ah_set_path_bits(ah_attr, wc->dlid_path_bits);
rdma_ah_set_port_num(ah_attr, port_num);
if (wc->wc_flags & IB_WC_GRH) { if (wc->wc_flags & IB_WC_GRH) {
if (!rdma_cap_eth_ah(device, port_num)) {
if (dgid.global.interface_id != cpu_to_be64(IB_SA_WELL_KNOWN_GUID)) { if (dgid.global.interface_id != cpu_to_be64(IB_SA_WELL_KNOWN_GUID)) {
ret = ib_find_cached_gid_by_port(device, &dgid, ret = ib_find_cached_gid_by_port(device, &dgid,
IB_GID_TYPE_IB, IB_GID_TYPE_IB,
...@@ -584,18 +622,17 @@ int ib_init_ah_from_wc(struct ib_device *device, u8 port_num, ...@@ -584,18 +622,17 @@ int ib_init_ah_from_wc(struct ib_device *device, u8 port_num,
} else { } else {
gid_index = 0; gid_index = 0;
} }
}
flow_class = be32_to_cpu(grh->version_tclass_flow);
rdma_ah_set_grh(ah_attr, &sgid,
flow_class & 0xFFFFF,
(u8)gid_index, hoplimit,
(flow_class >> 20) & 0xFF);
flow_class = be32_to_cpu(grh->version_tclass_flow);
rdma_ah_set_grh(ah_attr, &sgid,
flow_class & 0xFFFFF,
(u8)gid_index, hoplimit,
(flow_class >> 20) & 0xFF);
}
return 0;
} }
return 0;
} }
EXPORT_SYMBOL(ib_init_ah_from_wc); EXPORT_SYMBOL(ib_init_ah_attr_from_wc);
struct ib_ah *ib_create_ah_from_wc(struct ib_pd *pd, const struct ib_wc *wc, struct ib_ah *ib_create_ah_from_wc(struct ib_pd *pd, const struct ib_wc *wc,
const struct ib_grh *grh, u8 port_num) const struct ib_grh *grh, u8 port_num)
...@@ -603,7 +640,7 @@ struct ib_ah *ib_create_ah_from_wc(struct ib_pd *pd, const struct ib_wc *wc, ...@@ -603,7 +640,7 @@ struct ib_ah *ib_create_ah_from_wc(struct ib_pd *pd, const struct ib_wc *wc,
struct rdma_ah_attr ah_attr; struct rdma_ah_attr ah_attr;
int ret; int ret;
ret = ib_init_ah_from_wc(pd->device, port_num, wc, grh, &ah_attr); ret = ib_init_ah_attr_from_wc(pd->device, port_num, wc, grh, &ah_attr);
if (ret) if (ret)
return ERR_PTR(ret); return ERR_PTR(ret);
...@@ -850,7 +887,7 @@ struct ib_qp *ib_create_qp(struct ib_pd *pd, ...@@ -850,7 +887,7 @@ struct ib_qp *ib_create_qp(struct ib_pd *pd,
if (qp_init_attr->cap.max_rdma_ctxs) if (qp_init_attr->cap.max_rdma_ctxs)
rdma_rw_init_qp(device, qp_init_attr); rdma_rw_init_qp(device, qp_init_attr);
qp = device->create_qp(pd, qp_init_attr, NULL); qp = _ib_create_qp(device, pd, qp_init_attr, NULL);
if (IS_ERR(qp)) if (IS_ERR(qp))
return qp; return qp;
...@@ -860,7 +897,6 @@ struct ib_qp *ib_create_qp(struct ib_pd *pd, ...@@ -860,7 +897,6 @@ struct ib_qp *ib_create_qp(struct ib_pd *pd,
return ERR_PTR(ret); return ERR_PTR(ret);
} }
qp->device = device;
qp->real_qp = qp; qp->real_qp = qp;
qp->uobject = NULL; qp->uobject = NULL;
qp->qp_type = qp_init_attr->qp_type; qp->qp_type = qp_init_attr->qp_type;
...@@ -890,7 +926,6 @@ struct ib_qp *ib_create_qp(struct ib_pd *pd, ...@@ -890,7 +926,6 @@ struct ib_qp *ib_create_qp(struct ib_pd *pd,
atomic_inc(&qp_init_attr->srq->usecnt); atomic_inc(&qp_init_attr->srq->usecnt);
} }
qp->pd = pd;
qp->send_cq = qp_init_attr->send_cq; qp->send_cq = qp_init_attr->send_cq;
qp->xrcd = NULL; qp->xrcd = NULL;
...@@ -1269,16 +1304,8 @@ static int ib_resolve_eth_dmac(struct ib_device *device, ...@@ -1269,16 +1304,8 @@ static int ib_resolve_eth_dmac(struct ib_device *device,
if (!rdma_is_port_valid(device, rdma_ah_get_port_num(ah_attr))) if (!rdma_is_port_valid(device, rdma_ah_get_port_num(ah_attr)))
return -EINVAL; return -EINVAL;
if (ah_attr->type != RDMA_AH_ATTR_TYPE_ROCE)
return 0;
grh = rdma_ah_retrieve_grh(ah_attr); grh = rdma_ah_retrieve_grh(ah_attr);
if (rdma_link_local_addr((struct in6_addr *)grh->dgid.raw)) {
rdma_get_ll_mac((struct in6_addr *)grh->dgid.raw,
ah_attr->roce.dmac);
return 0;
}
if (rdma_is_multicast_addr((struct in6_addr *)ah_attr->grh.dgid.raw)) { if (rdma_is_multicast_addr((struct in6_addr *)ah_attr->grh.dgid.raw)) {
if (ipv6_addr_v4mapped((struct in6_addr *)ah_attr->grh.dgid.raw)) { if (ipv6_addr_v4mapped((struct in6_addr *)ah_attr->grh.dgid.raw)) {
__be32 addr = 0; __be32 addr = 0;
...@@ -1290,40 +1317,52 @@ static int ib_resolve_eth_dmac(struct ib_device *device, ...@@ -1290,40 +1317,52 @@ static int ib_resolve_eth_dmac(struct ib_device *device,
(char *)ah_attr->roce.dmac); (char *)ah_attr->roce.dmac);
} }
} else { } else {
union ib_gid sgid; ret = ib_resolve_unicast_gid_dmac(device, ah_attr);
struct ib_gid_attr sgid_attr; }
int ifindex; return ret;
int hop_limit; }
ret = ib_query_gid(device,
rdma_ah_get_port_num(ah_attr),
grh->sgid_index,
&sgid, &sgid_attr);
if (ret || !sgid_attr.ndev) {
if (!ret)
ret = -ENXIO;
goto out;
}
ifindex = sgid_attr.ndev->ifindex;
ret = /**
rdma_addr_find_l2_eth_by_grh(&sgid, &grh->dgid, * IB core internal function to perform QP attributes modification.
ah_attr->roce.dmac, */
NULL, &ifindex, &hop_limit); static int _ib_modify_qp(struct ib_qp *qp, struct ib_qp_attr *attr,
int attr_mask, struct ib_udata *udata)
{
u8 port = attr_mask & IB_QP_PORT ? attr->port_num : qp->port;
int ret;
dev_put(sgid_attr.ndev); if (rdma_ib_or_roce(qp->device, port)) {
if (attr_mask & IB_QP_RQ_PSN && attr->rq_psn & ~0xffffff) {
pr_warn("%s: %s rq_psn overflow, masking to 24 bits\n",
__func__, qp->device->name);
attr->rq_psn &= 0xffffff;
}
grh->hop_limit = hop_limit; if (attr_mask & IB_QP_SQ_PSN && attr->sq_psn & ~0xffffff) {
pr_warn("%s: %s sq_psn overflow, masking to 24 bits\n",
__func__, qp->device->name);
attr->sq_psn &= 0xffffff;
}
} }
out:
ret = ib_security_modify_qp(qp, attr, attr_mask, udata);
if (!ret && (attr_mask & IB_QP_PORT))
qp->port = attr->port_num;
return ret; return ret;
} }
static bool is_qp_type_connected(const struct ib_qp *qp)
{
return (qp->qp_type == IB_QPT_UC ||
qp->qp_type == IB_QPT_RC ||
qp->qp_type == IB_QPT_XRC_INI ||
qp->qp_type == IB_QPT_XRC_TGT);
}
/** /**
* ib_modify_qp_with_udata - Modifies the attributes for the specified QP. * ib_modify_qp_with_udata - Modifies the attributes for the specified QP.
* @qp: The QP to modify. * @ib_qp: The QP to modify.
* @attr: On input, specifies the QP attributes to modify. On output, * @attr: On input, specifies the QP attributes to modify. On output,
* the current values of selected QP attributes are returned. * the current values of selected QP attributes are returned.
* @attr_mask: A bit-mask used to specify which attributes of the QP * @attr_mask: A bit-mask used to specify which attributes of the QP
...@@ -1332,21 +1371,20 @@ static int ib_resolve_eth_dmac(struct ib_device *device, ...@@ -1332,21 +1371,20 @@ static int ib_resolve_eth_dmac(struct ib_device *device,
* are being modified. * are being modified.
* It returns 0 on success and returns appropriate error code on error. * It returns 0 on success and returns appropriate error code on error.
*/ */
int ib_modify_qp_with_udata(struct ib_qp *qp, struct ib_qp_attr *attr, int ib_modify_qp_with_udata(struct ib_qp *ib_qp, struct ib_qp_attr *attr,
int attr_mask, struct ib_udata *udata) int attr_mask, struct ib_udata *udata)
{ {
struct ib_qp *qp = ib_qp->real_qp;
int ret; int ret;
if (attr_mask & IB_QP_AV) { if (attr_mask & IB_QP_AV &&
attr->ah_attr.type == RDMA_AH_ATTR_TYPE_ROCE &&
is_qp_type_connected(qp)) {
ret = ib_resolve_eth_dmac(qp->device, &attr->ah_attr); ret = ib_resolve_eth_dmac(qp->device, &attr->ah_attr);
if (ret) if (ret)
return ret; return ret;
} }
ret = ib_security_modify_qp(qp, attr, attr_mask, udata); return _ib_modify_qp(qp, attr, attr_mask, udata);
if (!ret && (attr_mask & IB_QP_PORT))
qp->port = attr->port_num;
return ret;
} }
EXPORT_SYMBOL(ib_modify_qp_with_udata); EXPORT_SYMBOL(ib_modify_qp_with_udata);
...@@ -1409,7 +1447,7 @@ int ib_modify_qp(struct ib_qp *qp, ...@@ -1409,7 +1447,7 @@ int ib_modify_qp(struct ib_qp *qp,
struct ib_qp_attr *qp_attr, struct ib_qp_attr *qp_attr,
int qp_attr_mask) int qp_attr_mask)
{ {
return ib_modify_qp_with_udata(qp, qp_attr, qp_attr_mask, NULL); return _ib_modify_qp(qp->real_qp, qp_attr, qp_attr_mask, NULL);
} }
EXPORT_SYMBOL(ib_modify_qp); EXPORT_SYMBOL(ib_modify_qp);
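Note: the refactor above funnels both ib_modify_qp() and ib_modify_qp_with_udata() through the new _ib_modify_qp(), which, for IB and RoCE ports, masks rq_psn/sq_psn down to 24 bits (the width a packet sequence number actually has on the wire) and warns instead of passing an overflowing value to the driver. A minimal sketch of just that clamp.

#include <linux/kernel.h>

/* PSNs are 24-bit; warn and mask anything wider, as _ib_modify_qp() does. */
static u32 ex_clamp_psn(u32 psn, const char *which)
{
	if (psn & ~0xffffff) {
		pr_warn("%s psn 0x%x overflows 24 bits, masking\n", which, psn);
		psn &= 0xffffff;
	}
	return psn;
}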
...@@ -1503,6 +1541,7 @@ int ib_destroy_qp(struct ib_qp *qp) ...@@ -1503,6 +1541,7 @@ int ib_destroy_qp(struct ib_qp *qp)
if (!qp->uobject) if (!qp->uobject)
rdma_rw_cleanup_mrs(qp); rdma_rw_cleanup_mrs(qp);
rdma_restrack_del(&qp->res);
ret = qp->device->destroy_qp(qp); ret = qp->device->destroy_qp(qp);
if (!ret) { if (!ret) {
if (pd) if (pd)
...@@ -1545,6 +1584,8 @@ struct ib_cq *ib_create_cq(struct ib_device *device, ...@@ -1545,6 +1584,8 @@ struct ib_cq *ib_create_cq(struct ib_device *device,
cq->event_handler = event_handler; cq->event_handler = event_handler;
cq->cq_context = cq_context; cq->cq_context = cq_context;
atomic_set(&cq->usecnt, 0); atomic_set(&cq->usecnt, 0);
cq->res.type = RDMA_RESTRACK_CQ;
rdma_restrack_add(&cq->res);
} }
return cq; return cq;
...@@ -1563,6 +1604,7 @@ int ib_destroy_cq(struct ib_cq *cq) ...@@ -1563,6 +1604,7 @@ int ib_destroy_cq(struct ib_cq *cq)
if (atomic_read(&cq->usecnt)) if (atomic_read(&cq->usecnt))
return -EBUSY; return -EBUSY;
rdma_restrack_del(&cq->res);
return cq->device->destroy_cq(cq); return cq->device->destroy_cq(cq);
} }
EXPORT_SYMBOL(ib_destroy_cq); EXPORT_SYMBOL(ib_destroy_cq);
...@@ -1747,7 +1789,7 @@ int ib_detach_mcast(struct ib_qp *qp, union ib_gid *gid, u16 lid) ...@@ -1747,7 +1789,7 @@ int ib_detach_mcast(struct ib_qp *qp, union ib_gid *gid, u16 lid)
} }
EXPORT_SYMBOL(ib_detach_mcast); EXPORT_SYMBOL(ib_detach_mcast);
struct ib_xrcd *ib_alloc_xrcd(struct ib_device *device) struct ib_xrcd *__ib_alloc_xrcd(struct ib_device *device, const char *caller)
{ {
struct ib_xrcd *xrcd; struct ib_xrcd *xrcd;
...@@ -1765,7 +1807,7 @@ struct ib_xrcd *ib_alloc_xrcd(struct ib_device *device) ...@@ -1765,7 +1807,7 @@ struct ib_xrcd *ib_alloc_xrcd(struct ib_device *device)
return xrcd; return xrcd;
} }
EXPORT_SYMBOL(ib_alloc_xrcd); EXPORT_SYMBOL(__ib_alloc_xrcd);
int ib_dealloc_xrcd(struct ib_xrcd *xrcd) int ib_dealloc_xrcd(struct ib_xrcd *xrcd)
{ {
...@@ -1790,11 +1832,11 @@ EXPORT_SYMBOL(ib_dealloc_xrcd); ...@@ -1790,11 +1832,11 @@ EXPORT_SYMBOL(ib_dealloc_xrcd);
* ib_create_wq - Creates a WQ associated with the specified protection * ib_create_wq - Creates a WQ associated with the specified protection
* domain. * domain.
* @pd: The protection domain associated with the WQ. * @pd: The protection domain associated with the WQ.
* @wq_init_attr: A list of initial attributes required to create the * @wq_attr: A list of initial attributes required to create the
* WQ. If WQ creation succeeds, then the attributes are updated to * WQ. If WQ creation succeeds, then the attributes are updated to
* the actual capabilities of the created WQ. * the actual capabilities of the created WQ.
* *
* wq_init_attr->max_wr and wq_init_attr->max_sge determine * wq_attr->max_wr and wq_attr->max_sge determine
* the requested size of the WQ, and set to the actual values allocated * the requested size of the WQ, and set to the actual values allocated
* on return. * on return.
* If ib_create_wq() succeeds, then max_wr and max_sge will always be * If ib_create_wq() succeeds, then max_wr and max_sge will always be
...@@ -2156,16 +2198,16 @@ static void __ib_drain_sq(struct ib_qp *qp) ...@@ -2156,16 +2198,16 @@ static void __ib_drain_sq(struct ib_qp *qp)
struct ib_send_wr swr = {}, *bad_swr; struct ib_send_wr swr = {}, *bad_swr;
int ret; int ret;
swr.wr_cqe = &sdrain.cqe;
sdrain.cqe.done = ib_drain_qp_done;
init_completion(&sdrain.done);
ret = ib_modify_qp(qp, &attr, IB_QP_STATE); ret = ib_modify_qp(qp, &attr, IB_QP_STATE);
if (ret) { if (ret) {
WARN_ONCE(ret, "failed to drain send queue: %d\n", ret); WARN_ONCE(ret, "failed to drain send queue: %d\n", ret);
return; return;
} }
swr.wr_cqe = &sdrain.cqe;
sdrain.cqe.done = ib_drain_qp_done;
init_completion(&sdrain.done);
ret = ib_post_send(qp, &swr, &bad_swr); ret = ib_post_send(qp, &swr, &bad_swr);
if (ret) { if (ret) {
WARN_ONCE(ret, "failed to drain send queue: %d\n", ret); WARN_ONCE(ret, "failed to drain send queue: %d\n", ret);
...@@ -2190,16 +2232,16 @@ static void __ib_drain_rq(struct ib_qp *qp) ...@@ -2190,16 +2232,16 @@ static void __ib_drain_rq(struct ib_qp *qp)
struct ib_recv_wr rwr = {}, *bad_rwr; struct ib_recv_wr rwr = {}, *bad_rwr;
int ret; int ret;
rwr.wr_cqe = &rdrain.cqe;
rdrain.cqe.done = ib_drain_qp_done;
init_completion(&rdrain.done);
ret = ib_modify_qp(qp, &attr, IB_QP_STATE); ret = ib_modify_qp(qp, &attr, IB_QP_STATE);
if (ret) { if (ret) {
WARN_ONCE(ret, "failed to drain recv queue: %d\n", ret); WARN_ONCE(ret, "failed to drain recv queue: %d\n", ret);
return; return;
} }
rwr.wr_cqe = &rdrain.cqe;
rdrain.cqe.done = ib_drain_qp_done;
init_completion(&rdrain.done);
ret = ib_post_recv(qp, &rwr, &bad_rwr); ret = ib_post_recv(qp, &rwr, &bad_rwr);
if (ret) { if (ret) {
WARN_ONCE(ret, "failed to drain recv queue: %d\n", ret); WARN_ONCE(ret, "failed to drain recv queue: %d\n", ret);
......
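Note: the final verbs.c hunks reorder __ib_drain_sq()/__ib_drain_rq() so the drain CQE and completion are only initialized after the QP has successfully been moved to the error state; if ib_modify_qp() fails there is nothing to post or wait for. A condensed view of the send-side sequence after the change; struct ib_drain_cqe and ib_drain_qp_done are the file-local drain helpers in verbs.c, shown here only to illustrate the ordering.

static void ex_drain_sq(struct ib_qp *qp, struct ib_drain_cqe *sdrain)
{
	struct ib_qp_attr attr = { .qp_state = IB_QPS_ERR };
	struct ib_send_wr swr = {}, *bad_swr;

	if (ib_modify_qp(qp, &attr, IB_QP_STATE))
		return;				/* nothing will be flushed */

	/* Only now build the sentinel WR and arm the completion. */
	swr.wr_cqe = &sdrain->cqe;
	sdrain->cqe.done = ib_drain_qp_done;
	init_completion(&sdrain->done);

	if (!ib_post_send(qp, &swr, &bad_swr))
		wait_for_completion(&sdrain->done);
}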
...@@ -43,20 +43,41 @@ ...@@ -43,20 +43,41 @@
#define ROCE_DRV_MODULE_VERSION "1.0.0" #define ROCE_DRV_MODULE_VERSION "1.0.0"
#define BNXT_RE_DESC "Broadcom NetXtreme-C/E RoCE Driver" #define BNXT_RE_DESC "Broadcom NetXtreme-C/E RoCE Driver"
#define BNXT_RE_PAGE_SHIFT_4K (12)
#define BNXT_RE_PAGE_SIZE_4K BIT(12) #define BNXT_RE_PAGE_SHIFT_8K (13)
#define BNXT_RE_PAGE_SIZE_8K BIT(13) #define BNXT_RE_PAGE_SHIFT_64K (16)
#define BNXT_RE_PAGE_SIZE_64K BIT(16) #define BNXT_RE_PAGE_SHIFT_2M (21)
#define BNXT_RE_PAGE_SIZE_2M BIT(21) #define BNXT_RE_PAGE_SHIFT_8M (23)
#define BNXT_RE_PAGE_SIZE_8M BIT(23) #define BNXT_RE_PAGE_SHIFT_1G (30)
#define BNXT_RE_PAGE_SIZE_1G BIT(30)
#define BNXT_RE_PAGE_SIZE_4K BIT(BNXT_RE_PAGE_SHIFT_4K)
#define BNXT_RE_MAX_MR_SIZE BIT(30) #define BNXT_RE_PAGE_SIZE_8K BIT(BNXT_RE_PAGE_SHIFT_8K)
#define BNXT_RE_PAGE_SIZE_64K BIT(BNXT_RE_PAGE_SHIFT_64K)
#define BNXT_RE_PAGE_SIZE_2M BIT(BNXT_RE_PAGE_SHIFT_2M)
#define BNXT_RE_PAGE_SIZE_8M BIT(BNXT_RE_PAGE_SHIFT_8M)
#define BNXT_RE_PAGE_SIZE_1G BIT(BNXT_RE_PAGE_SHIFT_1G)
#define BNXT_RE_MAX_MR_SIZE_LOW BIT(BNXT_RE_PAGE_SHIFT_1G)
#define BNXT_RE_MAX_MR_SIZE_HIGH BIT(39)
#define BNXT_RE_MAX_MR_SIZE BNXT_RE_MAX_MR_SIZE_HIGH
#define BNXT_RE_MAX_QPC_COUNT (64 * 1024) #define BNXT_RE_MAX_QPC_COUNT (64 * 1024)
#define BNXT_RE_MAX_MRW_COUNT (64 * 1024) #define BNXT_RE_MAX_MRW_COUNT (64 * 1024)
#define BNXT_RE_MAX_SRQC_COUNT (64 * 1024) #define BNXT_RE_MAX_SRQC_COUNT (64 * 1024)
#define BNXT_RE_MAX_CQ_COUNT (64 * 1024) #define BNXT_RE_MAX_CQ_COUNT (64 * 1024)
#define BNXT_RE_MAX_MRW_COUNT_64K (64 * 1024)
#define BNXT_RE_MAX_MRW_COUNT_256K (256 * 1024)
/* Number of MRs to reserve for PF, leaving remainder for VFs */
#define BNXT_RE_RESVD_MR_FOR_PF (32 * 1024)
#define BNXT_RE_MAX_GID_PER_VF 128
/*
* Percentage of resources of each type reserved for PF.
* Remaining resources are divided equally among VFs.
* [0, 100]
*/
#define BNXT_RE_PCT_RSVD_FOR_PF 50
#define BNXT_RE_UD_QP_HW_STALL 0x400000 #define BNXT_RE_UD_QP_HW_STALL 0x400000
...@@ -100,6 +121,7 @@ struct bnxt_re_dev { ...@@ -100,6 +121,7 @@ struct bnxt_re_dev {
#define BNXT_RE_FLAG_RCFW_CHANNEL_EN 4 #define BNXT_RE_FLAG_RCFW_CHANNEL_EN 4
#define BNXT_RE_FLAG_QOS_WORK_REG 5 #define BNXT_RE_FLAG_QOS_WORK_REG 5
#define BNXT_RE_FLAG_TASK_IN_PROG 6 #define BNXT_RE_FLAG_TASK_IN_PROG 6
#define BNXT_RE_FLAG_ISSUE_ROCE_STATS 29
struct net_device *netdev; struct net_device *netdev;
unsigned int version, major, minor; unsigned int version, major, minor;
struct bnxt_en_dev *en_dev; struct bnxt_en_dev *en_dev;
...@@ -145,6 +167,9 @@ struct bnxt_re_dev { ...@@ -145,6 +167,9 @@ struct bnxt_re_dev {
struct bnxt_re_ah *sqp_ah; struct bnxt_re_ah *sqp_ah;
struct bnxt_re_sqp_entries sqp_tbl[1024]; struct bnxt_re_sqp_entries sqp_tbl[1024];
atomic_t nq_alloc_cnt; atomic_t nq_alloc_cnt;
u32 is_virtfn;
u32 num_vfs;
struct bnxt_qplib_roce_stats stats;
}; };
#define to_bnxt_re_dev(ptr, member) \ #define to_bnxt_re_dev(ptr, member) \
......
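Note: the new constants in bnxt_re.h above describe how device resources are shared once SR-IOV VFs are enabled: BNXT_RE_PCT_RSVD_FOR_PF percent of each resource type stays with the PF and the remainder is split evenly across the VFs. The exact division lives elsewhere in the driver; the self-contained sketch below only illustrates the arithmetic the comment describes, with an example VF count and a 64K total.

#include <stdio.h>

#define PCT_RSVD_FOR_PF 50	/* copied from the hunk above */

static unsigned int per_vf_share(unsigned int total, unsigned int num_vfs)
{
	unsigned int pf_part = total * PCT_RSVD_FOR_PF / 100;

	return num_vfs ? (total - pf_part) / num_vfs : 0;
}

int main(void)
{
	/* 64K QP contexts, 8 VFs: PF keeps 32768, each VF gets 4096 */
	printf("%u\n", per_vf_share(64 * 1024, 8));
	return 0;
}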
...@@ -58,16 +58,55 @@ ...@@ -58,16 +58,55 @@
#include "hw_counters.h" #include "hw_counters.h"
static const char * const bnxt_re_stat_name[] = { static const char * const bnxt_re_stat_name[] = {
[BNXT_RE_ACTIVE_QP] = "active_qps", [BNXT_RE_ACTIVE_QP] = "active_qps",
[BNXT_RE_ACTIVE_SRQ] = "active_srqs", [BNXT_RE_ACTIVE_SRQ] = "active_srqs",
[BNXT_RE_ACTIVE_CQ] = "active_cqs", [BNXT_RE_ACTIVE_CQ] = "active_cqs",
[BNXT_RE_ACTIVE_MR] = "active_mrs", [BNXT_RE_ACTIVE_MR] = "active_mrs",
[BNXT_RE_ACTIVE_MW] = "active_mws", [BNXT_RE_ACTIVE_MW] = "active_mws",
[BNXT_RE_RX_PKTS] = "rx_pkts", [BNXT_RE_RX_PKTS] = "rx_pkts",
[BNXT_RE_RX_BYTES] = "rx_bytes", [BNXT_RE_RX_BYTES] = "rx_bytes",
[BNXT_RE_TX_PKTS] = "tx_pkts", [BNXT_RE_TX_PKTS] = "tx_pkts",
[BNXT_RE_TX_BYTES] = "tx_bytes", [BNXT_RE_TX_BYTES] = "tx_bytes",
[BNXT_RE_RECOVERABLE_ERRORS] = "recoverable_errors" [BNXT_RE_RECOVERABLE_ERRORS] = "recoverable_errors",
[BNXT_RE_TO_RETRANSMITS] = "to_retransmits",
[BNXT_RE_SEQ_ERR_NAKS_RCVD] = "seq_err_naks_rcvd",
[BNXT_RE_MAX_RETRY_EXCEEDED] = "max_retry_exceeded",
[BNXT_RE_RNR_NAKS_RCVD] = "rnr_naks_rcvd",
[BNXT_RE_MISSING_RESP] = "missin_resp",
[BNXT_RE_UNRECOVERABLE_ERR] = "unrecoverable_err",
[BNXT_RE_BAD_RESP_ERR] = "bad_resp_err",
[BNXT_RE_LOCAL_QP_OP_ERR] = "local_qp_op_err",
[BNXT_RE_LOCAL_PROTECTION_ERR] = "local_protection_err",
[BNXT_RE_MEM_MGMT_OP_ERR] = "mem_mgmt_op_err",
[BNXT_RE_REMOTE_INVALID_REQ_ERR] = "remote_invalid_req_err",
[BNXT_RE_REMOTE_ACCESS_ERR] = "remote_access_err",
[BNXT_RE_REMOTE_OP_ERR] = "remote_op_err",
[BNXT_RE_DUP_REQ] = "dup_req",
[BNXT_RE_RES_EXCEED_MAX] = "res_exceed_max",
[BNXT_RE_RES_LENGTH_MISMATCH] = "res_length_mismatch",
[BNXT_RE_RES_EXCEEDS_WQE] = "res_exceeds_wqe",
[BNXT_RE_RES_OPCODE_ERR] = "res_opcode_err",
[BNXT_RE_RES_RX_INVALID_RKEY] = "res_rx_invalid_rkey",
[BNXT_RE_RES_RX_DOMAIN_ERR] = "res_rx_domain_err",
[BNXT_RE_RES_RX_NO_PERM] = "res_rx_no_perm",
[BNXT_RE_RES_RX_RANGE_ERR] = "res_rx_range_err",
[BNXT_RE_RES_TX_INVALID_RKEY] = "res_tx_invalid_rkey",
[BNXT_RE_RES_TX_DOMAIN_ERR] = "res_tx_domain_err",
[BNXT_RE_RES_TX_NO_PERM] = "res_tx_no_perm",
[BNXT_RE_RES_TX_RANGE_ERR] = "res_tx_range_err",
[BNXT_RE_RES_IRRQ_OFLOW] = "res_irrq_oflow",
[BNXT_RE_RES_UNSUP_OPCODE] = "res_unsup_opcode",
[BNXT_RE_RES_UNALIGNED_ATOMIC] = "res_unaligned_atomic",
[BNXT_RE_RES_REM_INV_ERR] = "res_rem_inv_err",
[BNXT_RE_RES_MEM_ERROR] = "res_mem_err",
[BNXT_RE_RES_SRQ_ERR] = "res_srq_err",
[BNXT_RE_RES_CMP_ERR] = "res_cmp_err",
[BNXT_RE_RES_INVALID_DUP_RKEY] = "res_invalid_dup_rkey",
[BNXT_RE_RES_WQE_FORMAT_ERR] = "res_wqe_format_err",
[BNXT_RE_RES_CQ_LOAD_ERR] = "res_cq_load_err",
[BNXT_RE_RES_SRQ_LOAD_ERR] = "res_srq_load_err",
[BNXT_RE_RES_TX_PCI_ERR] = "res_tx_pci_err",
[BNXT_RE_RES_RX_PCI_ERR] = "res_rx_pci_err"
}; };
int bnxt_re_ib_get_hw_stats(struct ib_device *ibdev, int bnxt_re_ib_get_hw_stats(struct ib_device *ibdev,
...@@ -76,6 +115,7 @@ int bnxt_re_ib_get_hw_stats(struct ib_device *ibdev, ...@@ -76,6 +115,7 @@ int bnxt_re_ib_get_hw_stats(struct ib_device *ibdev,
{ {
struct bnxt_re_dev *rdev = to_bnxt_re_dev(ibdev, ibdev); struct bnxt_re_dev *rdev = to_bnxt_re_dev(ibdev, ibdev);
struct ctx_hw_stats *bnxt_re_stats = rdev->qplib_ctx.stats.dma; struct ctx_hw_stats *bnxt_re_stats = rdev->qplib_ctx.stats.dma;
int rc = 0;
if (!port || !stats) if (!port || !stats)
return -EINVAL; return -EINVAL;
...@@ -97,6 +137,91 @@ int bnxt_re_ib_get_hw_stats(struct ib_device *ibdev, ...@@ -97,6 +137,91 @@ int bnxt_re_ib_get_hw_stats(struct ib_device *ibdev,
stats->value[BNXT_RE_TX_BYTES] = stats->value[BNXT_RE_TX_BYTES] =
le64_to_cpu(bnxt_re_stats->tx_ucast_bytes); le64_to_cpu(bnxt_re_stats->tx_ucast_bytes);
} }
if (test_bit(BNXT_RE_FLAG_ISSUE_ROCE_STATS, &rdev->flags)) {
rc = bnxt_qplib_get_roce_stats(&rdev->rcfw, &rdev->stats);
if (rc)
clear_bit(BNXT_RE_FLAG_ISSUE_ROCE_STATS,
&rdev->flags);
stats->value[BNXT_RE_TO_RETRANSMITS] =
rdev->stats.to_retransmits;
stats->value[BNXT_RE_SEQ_ERR_NAKS_RCVD] =
rdev->stats.seq_err_naks_rcvd;
stats->value[BNXT_RE_MAX_RETRY_EXCEEDED] =
rdev->stats.max_retry_exceeded;
stats->value[BNXT_RE_RNR_NAKS_RCVD] =
rdev->stats.rnr_naks_rcvd;
stats->value[BNXT_RE_MISSING_RESP] =
rdev->stats.missing_resp;
stats->value[BNXT_RE_UNRECOVERABLE_ERR] =
rdev->stats.unrecoverable_err;
stats->value[BNXT_RE_BAD_RESP_ERR] =
rdev->stats.bad_resp_err;
stats->value[BNXT_RE_LOCAL_QP_OP_ERR] =
rdev->stats.local_qp_op_err;
stats->value[BNXT_RE_LOCAL_PROTECTION_ERR] =
rdev->stats.local_protection_err;
stats->value[BNXT_RE_MEM_MGMT_OP_ERR] =
rdev->stats.mem_mgmt_op_err;
stats->value[BNXT_RE_REMOTE_INVALID_REQ_ERR] =
rdev->stats.remote_invalid_req_err;
stats->value[BNXT_RE_REMOTE_ACCESS_ERR] =
rdev->stats.remote_access_err;
stats->value[BNXT_RE_REMOTE_OP_ERR] =
rdev->stats.remote_op_err;
stats->value[BNXT_RE_DUP_REQ] =
rdev->stats.dup_req;
stats->value[BNXT_RE_RES_EXCEED_MAX] =
rdev->stats.res_exceed_max;
stats->value[BNXT_RE_RES_LENGTH_MISMATCH] =
rdev->stats.res_length_mismatch;
stats->value[BNXT_RE_RES_EXCEEDS_WQE] =
rdev->stats.res_exceeds_wqe;
stats->value[BNXT_RE_RES_OPCODE_ERR] =
rdev->stats.res_opcode_err;
stats->value[BNXT_RE_RES_RX_INVALID_RKEY] =
rdev->stats.res_rx_invalid_rkey;
stats->value[BNXT_RE_RES_RX_DOMAIN_ERR] =
rdev->stats.res_rx_domain_err;
stats->value[BNXT_RE_RES_RX_NO_PERM] =
rdev->stats.res_rx_no_perm;
stats->value[BNXT_RE_RES_RX_RANGE_ERR] =
rdev->stats.res_rx_range_err;
stats->value[BNXT_RE_RES_TX_INVALID_RKEY] =
rdev->stats.res_tx_invalid_rkey;
stats->value[BNXT_RE_RES_TX_DOMAIN_ERR] =
rdev->stats.res_tx_domain_err;
stats->value[BNXT_RE_RES_TX_NO_PERM] =
rdev->stats.res_tx_no_perm;
stats->value[BNXT_RE_RES_TX_RANGE_ERR] =
rdev->stats.res_tx_range_err;
stats->value[BNXT_RE_RES_IRRQ_OFLOW] =
rdev->stats.res_irrq_oflow;
stats->value[BNXT_RE_RES_UNSUP_OPCODE] =
rdev->stats.res_unsup_opcode;
stats->value[BNXT_RE_RES_UNALIGNED_ATOMIC] =
rdev->stats.res_unaligned_atomic;
stats->value[BNXT_RE_RES_REM_INV_ERR] =
rdev->stats.res_rem_inv_err;
stats->value[BNXT_RE_RES_MEM_ERROR] =
rdev->stats.res_mem_error;
stats->value[BNXT_RE_RES_SRQ_ERR] =
rdev->stats.res_srq_err;
stats->value[BNXT_RE_RES_CMP_ERR] =
rdev->stats.res_cmp_err;
stats->value[BNXT_RE_RES_INVALID_DUP_RKEY] =
rdev->stats.res_invalid_dup_rkey;
stats->value[BNXT_RE_RES_WQE_FORMAT_ERR] =
rdev->stats.res_wqe_format_err;
stats->value[BNXT_RE_RES_CQ_LOAD_ERR] =
rdev->stats.res_cq_load_err;
stats->value[BNXT_RE_RES_SRQ_LOAD_ERR] =
rdev->stats.res_srq_load_err;
stats->value[BNXT_RE_RES_TX_PCI_ERR] =
rdev->stats.res_tx_pci_err;
stats->value[BNXT_RE_RES_RX_PCI_ERR] =
rdev->stats.res_rx_pci_err;
}
return ARRAY_SIZE(bnxt_re_stat_name); return ARRAY_SIZE(bnxt_re_stat_name);
} }
......
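Note: the hw_counters.c hunk extends the counter string table and fills stats->value[] at the matching enum index when the firmware RoCE statistics query succeeds; the name table and the enum must stay in lock-step because the rdma hw-stats framework builds the exported counter list straight from that array. A minimal sketch of how such a table is handed to the core; the two example counter names are illustrative, while rdma_alloc_hw_stats_struct() and the alloc_hw_stats hook are the real core APIs.

#include <linux/kernel.h>
#include <rdma/ib_verbs.h>

static const char * const ex_stat_name[] = {
	[0] = "active_qps",		/* index matches the enum value */
	[1] = "to_retransmits",
};

static struct rdma_hw_stats *ex_alloc_hw_stats(struct ib_device *ibdev,
					       u8 port_num)
{
	/* One rdma_hw_stats entry per name; get_hw_stats() later fills
	 * value[i] for the same index and returns how many are valid.
	 */
	return rdma_alloc_hw_stats_struct(ex_stat_name,
					  ARRAY_SIZE(ex_stat_name),
					  RDMA_HW_STATS_DEFAULT_LIFESPAN);
}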
...@@ -51,6 +51,45 @@ enum bnxt_re_hw_stats { ...@@ -51,6 +51,45 @@ enum bnxt_re_hw_stats {
BNXT_RE_TX_PKTS, BNXT_RE_TX_PKTS,
BNXT_RE_TX_BYTES, BNXT_RE_TX_BYTES,
BNXT_RE_RECOVERABLE_ERRORS, BNXT_RE_RECOVERABLE_ERRORS,
BNXT_RE_TO_RETRANSMITS,
BNXT_RE_SEQ_ERR_NAKS_RCVD,
BNXT_RE_MAX_RETRY_EXCEEDED,
BNXT_RE_RNR_NAKS_RCVD,
BNXT_RE_MISSING_RESP,
BNXT_RE_UNRECOVERABLE_ERR,
BNXT_RE_BAD_RESP_ERR,
BNXT_RE_LOCAL_QP_OP_ERR,
BNXT_RE_LOCAL_PROTECTION_ERR,
BNXT_RE_MEM_MGMT_OP_ERR,
BNXT_RE_REMOTE_INVALID_REQ_ERR,
BNXT_RE_REMOTE_ACCESS_ERR,
BNXT_RE_REMOTE_OP_ERR,
BNXT_RE_DUP_REQ,
BNXT_RE_RES_EXCEED_MAX,
BNXT_RE_RES_LENGTH_MISMATCH,
BNXT_RE_RES_EXCEEDS_WQE,
BNXT_RE_RES_OPCODE_ERR,
BNXT_RE_RES_RX_INVALID_RKEY,
BNXT_RE_RES_RX_DOMAIN_ERR,
BNXT_RE_RES_RX_NO_PERM,
BNXT_RE_RES_RX_RANGE_ERR,
BNXT_RE_RES_TX_INVALID_RKEY,
BNXT_RE_RES_TX_DOMAIN_ERR,
BNXT_RE_RES_TX_NO_PERM,
BNXT_RE_RES_TX_RANGE_ERR,
BNXT_RE_RES_IRRQ_OFLOW,
BNXT_RE_RES_UNSUP_OPCODE,
BNXT_RE_RES_UNALIGNED_ATOMIC,
BNXT_RE_RES_REM_INV_ERR,
BNXT_RE_RES_MEM_ERROR,
BNXT_RE_RES_SRQ_ERR,
BNXT_RE_RES_CMP_ERR,
BNXT_RE_RES_INVALID_DUP_RKEY,
BNXT_RE_RES_WQE_FORMAT_ERR,
BNXT_RE_RES_CQ_LOAD_ERR,
BNXT_RE_RES_SRQ_LOAD_ERR,
BNXT_RE_RES_TX_PCI_ERR,
BNXT_RE_RES_RX_PCI_ERR,
BNXT_RE_NUM_COUNTERS BNXT_RE_NUM_COUNTERS
}; };
......
...@@ -141,12 +141,13 @@ int bnxt_re_query_device(struct ib_device *ibdev, ...@@ -141,12 +141,13 @@ int bnxt_re_query_device(struct ib_device *ibdev,
struct bnxt_qplib_dev_attr *dev_attr = &rdev->dev_attr; struct bnxt_qplib_dev_attr *dev_attr = &rdev->dev_attr;
memset(ib_attr, 0, sizeof(*ib_attr)); memset(ib_attr, 0, sizeof(*ib_attr));
memcpy(&ib_attr->fw_ver, dev_attr->fw_ver,
ib_attr->fw_ver = (u64)(unsigned long)(dev_attr->fw_ver); min(sizeof(dev_attr->fw_ver),
sizeof(ib_attr->fw_ver)));
bnxt_qplib_get_guid(rdev->netdev->dev_addr, bnxt_qplib_get_guid(rdev->netdev->dev_addr,
(u8 *)&ib_attr->sys_image_guid); (u8 *)&ib_attr->sys_image_guid);
ib_attr->max_mr_size = BNXT_RE_MAX_MR_SIZE; ib_attr->max_mr_size = BNXT_RE_MAX_MR_SIZE;
ib_attr->page_size_cap = BNXT_RE_PAGE_SIZE_4K; ib_attr->page_size_cap = BNXT_RE_PAGE_SIZE_4K | BNXT_RE_PAGE_SIZE_2M;
ib_attr->vendor_id = rdev->en_dev->pdev->vendor; ib_attr->vendor_id = rdev->en_dev->pdev->vendor;
ib_attr->vendor_part_id = rdev->en_dev->pdev->device; ib_attr->vendor_part_id = rdev->en_dev->pdev->device;
...@@ -247,8 +248,7 @@ int bnxt_re_query_port(struct ib_device *ibdev, u8 port_num, ...@@ -247,8 +248,7 @@ int bnxt_re_query_port(struct ib_device *ibdev, u8 port_num,
IB_PORT_VENDOR_CLASS_SUP | IB_PORT_VENDOR_CLASS_SUP |
IB_PORT_IP_BASED_GIDS; IB_PORT_IP_BASED_GIDS;
/* Max MSG size set to 2G for now */ port_attr->max_msg_sz = (u32)BNXT_RE_MAX_MR_SIZE_LOW;
port_attr->max_msg_sz = 0x80000000;
port_attr->bad_pkey_cntr = 0; port_attr->bad_pkey_cntr = 0;
port_attr->qkey_viol_cntr = 0; port_attr->qkey_viol_cntr = 0;
port_attr->pkey_tbl_len = dev_attr->max_pkey; port_attr->pkey_tbl_len = dev_attr->max_pkey;
...@@ -281,6 +281,15 @@ int bnxt_re_get_port_immutable(struct ib_device *ibdev, u8 port_num, ...@@ -281,6 +281,15 @@ int bnxt_re_get_port_immutable(struct ib_device *ibdev, u8 port_num,
return 0; return 0;
} }
void bnxt_re_query_fw_str(struct ib_device *ibdev, char *str)
{
struct bnxt_re_dev *rdev = to_bnxt_re_dev(ibdev, ibdev);
snprintf(str, IB_FW_VERSION_NAME_MAX, "%d.%d.%d.%d",
rdev->dev_attr.fw_ver[0], rdev->dev_attr.fw_ver[1],
rdev->dev_attr.fw_ver[2], rdev->dev_attr.fw_ver[3]);
}
int bnxt_re_query_pkey(struct ib_device *ibdev, u8 port_num, int bnxt_re_query_pkey(struct ib_device *ibdev, u8 port_num,
u16 index, u16 *pkey) u16 index, u16 *pkey)
{ {
...@@ -532,7 +541,7 @@ static int bnxt_re_create_fence_mr(struct bnxt_re_pd *pd) ...@@ -532,7 +541,7 @@ static int bnxt_re_create_fence_mr(struct bnxt_re_pd *pd)
mr->qplib_mr.total_size = BNXT_RE_FENCE_BYTES; mr->qplib_mr.total_size = BNXT_RE_FENCE_BYTES;
pbl_tbl = dma_addr; pbl_tbl = dma_addr;
rc = bnxt_qplib_reg_mr(&rdev->qplib_res, &mr->qplib_mr, &pbl_tbl, rc = bnxt_qplib_reg_mr(&rdev->qplib_res, &mr->qplib_mr, &pbl_tbl,
BNXT_RE_FENCE_PBL_SIZE, false); BNXT_RE_FENCE_PBL_SIZE, false, PAGE_SIZE);
if (rc) { if (rc) {
dev_err(rdev_to_dev(rdev), "Failed to register fence-MR\n"); dev_err(rdev_to_dev(rdev), "Failed to register fence-MR\n");
goto fail; goto fail;
...@@ -1018,6 +1027,7 @@ struct ib_qp *bnxt_re_create_qp(struct ib_pd *ib_pd, ...@@ -1018,6 +1027,7 @@ struct ib_qp *bnxt_re_create_qp(struct ib_pd *ib_pd,
struct bnxt_qplib_dev_attr *dev_attr = &rdev->dev_attr; struct bnxt_qplib_dev_attr *dev_attr = &rdev->dev_attr;
struct bnxt_re_qp *qp; struct bnxt_re_qp *qp;
struct bnxt_re_cq *cq; struct bnxt_re_cq *cq;
struct bnxt_re_srq *srq;
int rc, entries; int rc, entries;
if ((qp_init_attr->cap.max_send_wr > dev_attr->max_qp_wqes) || if ((qp_init_attr->cap.max_send_wr > dev_attr->max_qp_wqes) ||
...@@ -1073,9 +1083,15 @@ struct ib_qp *bnxt_re_create_qp(struct ib_pd *ib_pd, ...@@ -1073,9 +1083,15 @@ struct ib_qp *bnxt_re_create_qp(struct ib_pd *ib_pd,
} }
if (qp_init_attr->srq) { if (qp_init_attr->srq) {
dev_err(rdev_to_dev(rdev), "SRQ not supported"); srq = container_of(qp_init_attr->srq, struct bnxt_re_srq,
rc = -ENOTSUPP; ib_srq);
goto fail; if (!srq) {
dev_err(rdev_to_dev(rdev), "SRQ not found");
rc = -EINVAL;
goto fail;
}
qp->qplib_qp.srq = &srq->qplib_srq;
qp->qplib_qp.rq.max_wqe = 0;
} else { } else {
/* Allocate 1 more than what's provided so posting max doesn't /* Allocate 1 more than what's provided so posting max doesn't
* mean empty * mean empty
...@@ -1280,6 +1296,237 @@ static enum ib_mtu __to_ib_mtu(u32 mtu) ...@@ -1280,6 +1296,237 @@ static enum ib_mtu __to_ib_mtu(u32 mtu)
} }
} }
/* Shared Receive Queues */
int bnxt_re_destroy_srq(struct ib_srq *ib_srq)
{
struct bnxt_re_srq *srq = container_of(ib_srq, struct bnxt_re_srq,
ib_srq);
struct bnxt_re_dev *rdev = srq->rdev;
struct bnxt_qplib_srq *qplib_srq = &srq->qplib_srq;
struct bnxt_qplib_nq *nq = NULL;
int rc;
if (qplib_srq->cq)
nq = qplib_srq->cq->nq;
rc = bnxt_qplib_destroy_srq(&rdev->qplib_res, qplib_srq);
if (rc) {
dev_err(rdev_to_dev(rdev), "Destroy HW SRQ failed!");
return rc;
}
if (srq->umem && !IS_ERR(srq->umem))
ib_umem_release(srq->umem);
kfree(srq);
atomic_dec(&rdev->srq_count);
if (nq)
nq->budget--;
return 0;
}
static int bnxt_re_init_user_srq(struct bnxt_re_dev *rdev,
struct bnxt_re_pd *pd,
struct bnxt_re_srq *srq,
struct ib_udata *udata)
{
struct bnxt_re_srq_req ureq;
struct bnxt_qplib_srq *qplib_srq = &srq->qplib_srq;
struct ib_umem *umem;
int bytes = 0;
struct ib_ucontext *context = pd->ib_pd.uobject->context;
struct bnxt_re_ucontext *cntx = container_of(context,
struct bnxt_re_ucontext,
ib_uctx);
if (ib_copy_from_udata(&ureq, udata, sizeof(ureq)))
return -EFAULT;
bytes = (qplib_srq->max_wqe * BNXT_QPLIB_MAX_RQE_ENTRY_SIZE);
bytes = PAGE_ALIGN(bytes);
umem = ib_umem_get(context, ureq.srqva, bytes,
IB_ACCESS_LOCAL_WRITE, 1);
if (IS_ERR(umem))
return PTR_ERR(umem);
srq->umem = umem;
qplib_srq->nmap = umem->nmap;
qplib_srq->sglist = umem->sg_head.sgl;
qplib_srq->srq_handle = ureq.srq_handle;
qplib_srq->dpi = &cntx->dpi;
return 0;
}
struct ib_srq *bnxt_re_create_srq(struct ib_pd *ib_pd,
struct ib_srq_init_attr *srq_init_attr,
struct ib_udata *udata)
{
struct bnxt_re_pd *pd = container_of(ib_pd, struct bnxt_re_pd, ib_pd);
struct bnxt_re_dev *rdev = pd->rdev;
struct bnxt_qplib_dev_attr *dev_attr = &rdev->dev_attr;
struct bnxt_re_srq *srq;
struct bnxt_qplib_nq *nq = NULL;
int rc, entries;
if (srq_init_attr->attr.max_wr >= dev_attr->max_srq_wqes) {
dev_err(rdev_to_dev(rdev), "Create CQ failed - max exceeded");
rc = -EINVAL;
goto exit;
}
if (srq_init_attr->srq_type != IB_SRQT_BASIC) {
rc = -ENOTSUPP;
goto exit;
}
srq = kzalloc(sizeof(*srq), GFP_KERNEL);
if (!srq) {
rc = -ENOMEM;
goto exit;
}
srq->rdev = rdev;
srq->qplib_srq.pd = &pd->qplib_pd;
srq->qplib_srq.dpi = &rdev->dpi_privileged;
/* Allocate 1 more than what's provided so posting max doesn't
* mean empty
*/
entries = roundup_pow_of_two(srq_init_attr->attr.max_wr + 1);
if (entries > dev_attr->max_srq_wqes + 1)
entries = dev_attr->max_srq_wqes + 1;
srq->qplib_srq.max_wqe = entries;
srq->qplib_srq.max_sge = srq_init_attr->attr.max_sge;
srq->qplib_srq.threshold = srq_init_attr->attr.srq_limit;
srq->srq_limit = srq_init_attr->attr.srq_limit;
srq->qplib_srq.eventq_hw_ring_id = rdev->nq[0].ring_id;
nq = &rdev->nq[0];
if (udata) {
rc = bnxt_re_init_user_srq(rdev, pd, srq, udata);
if (rc)
goto fail;
}
rc = bnxt_qplib_create_srq(&rdev->qplib_res, &srq->qplib_srq);
if (rc) {
dev_err(rdev_to_dev(rdev), "Create HW SRQ failed!");
goto fail;
}
if (udata) {
struct bnxt_re_srq_resp resp;
resp.srqid = srq->qplib_srq.id;
rc = ib_copy_to_udata(udata, &resp, sizeof(resp));
if (rc) {
dev_err(rdev_to_dev(rdev), "SRQ copy to udata failed!");
bnxt_qplib_destroy_srq(&rdev->qplib_res,
&srq->qplib_srq);
goto exit;
}
}
if (nq)
nq->budget++;
atomic_inc(&rdev->srq_count);
return &srq->ib_srq;
fail:
if (udata && srq->umem && !IS_ERR(srq->umem)) {
ib_umem_release(srq->umem);
srq->umem = NULL;
}
kfree(srq);
exit:
return ERR_PTR(rc);
}
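Note: bnxt_re_create_srq() above sizes the hardware queue as roundup_pow_of_two(max_wr + 1), capped at max_srq_wqes + 1; the extra slot is what lets a completely posted queue be told apart from an empty one. A self-contained sketch of that sizing, where hw_limit stands in for dev_attr->max_srq_wqes.

#include <stdio.h>

static unsigned int srq_entries(unsigned int max_wr, unsigned int hw_limit)
{
	unsigned int want = max_wr + 1;	/* +1 so full != empty */
	unsigned int entries = 1;

	while (entries < want)		/* roundup_pow_of_two() */
		entries <<= 1;

	return entries > hw_limit + 1 ? hw_limit + 1 : entries;
}

int main(void)
{
	printf("%u\n", srq_entries(1000, 64 * 1024));	/* prints 1024 */
	return 0;
}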
int bnxt_re_modify_srq(struct ib_srq *ib_srq, struct ib_srq_attr *srq_attr,
enum ib_srq_attr_mask srq_attr_mask,
struct ib_udata *udata)
{
struct bnxt_re_srq *srq = container_of(ib_srq, struct bnxt_re_srq,
ib_srq);
struct bnxt_re_dev *rdev = srq->rdev;
int rc;
switch (srq_attr_mask) {
case IB_SRQ_MAX_WR:
/* SRQ resize is not supported */
break;
case IB_SRQ_LIMIT:
/* Change the SRQ threshold */
if (srq_attr->srq_limit > srq->qplib_srq.max_wqe)
return -EINVAL;
srq->qplib_srq.threshold = srq_attr->srq_limit;
rc = bnxt_qplib_modify_srq(&rdev->qplib_res, &srq->qplib_srq);
if (rc) {
dev_err(rdev_to_dev(rdev), "Modify HW SRQ failed!");
return rc;
}
/* On success, update the shadow */
srq->srq_limit = srq_attr->srq_limit;
/* No need to Build and send response back to udata */
break;
default:
dev_err(rdev_to_dev(rdev),
"Unsupported srq_attr_mask 0x%x", srq_attr_mask);
return -EINVAL;
}
return 0;
}
int bnxt_re_query_srq(struct ib_srq *ib_srq, struct ib_srq_attr *srq_attr)
{
struct bnxt_re_srq *srq = container_of(ib_srq, struct bnxt_re_srq,
ib_srq);
struct bnxt_re_srq tsrq;
struct bnxt_re_dev *rdev = srq->rdev;
int rc;
/* Get live SRQ attr */
tsrq.qplib_srq.id = srq->qplib_srq.id;
rc = bnxt_qplib_query_srq(&rdev->qplib_res, &tsrq.qplib_srq);
if (rc) {
dev_err(rdev_to_dev(rdev), "Query HW SRQ failed!");
return rc;
}
srq_attr->max_wr = srq->qplib_srq.max_wqe;
srq_attr->max_sge = srq->qplib_srq.max_sge;
srq_attr->srq_limit = tsrq.qplib_srq.threshold;
return 0;
}
int bnxt_re_post_srq_recv(struct ib_srq *ib_srq, struct ib_recv_wr *wr,
struct ib_recv_wr **bad_wr)
{
struct bnxt_re_srq *srq = container_of(ib_srq, struct bnxt_re_srq,
ib_srq);
struct bnxt_qplib_swqe wqe;
unsigned long flags;
int rc = 0, payload_sz = 0;
spin_lock_irqsave(&srq->lock, flags);
while (wr) {
/* Transcribe each ib_recv_wr to qplib_swqe */
wqe.num_sge = wr->num_sge;
payload_sz = bnxt_re_build_sgl(wr->sg_list, wqe.sg_list,
wr->num_sge);
wqe.wr_id = wr->wr_id;
wqe.type = BNXT_QPLIB_SWQE_TYPE_RECV;
rc = bnxt_qplib_post_srq_recv(&srq->qplib_srq, &wqe);
if (rc) {
*bad_wr = wr;
break;
}
wr = wr->next;
}
spin_unlock_irqrestore(&srq->lock, flags);
return rc;
}
static int bnxt_re_modify_shadow_qp(struct bnxt_re_dev *rdev, static int bnxt_re_modify_shadow_qp(struct bnxt_re_dev *rdev,
struct bnxt_re_qp *qp1_qp, struct bnxt_re_qp *qp1_qp,
int qp_attr_mask) int qp_attr_mask)
...@@ -2295,10 +2542,14 @@ int bnxt_re_post_recv(struct ib_qp *ib_qp, struct ib_recv_wr *wr, ...@@ -2295,10 +2542,14 @@ int bnxt_re_post_recv(struct ib_qp *ib_qp, struct ib_recv_wr *wr,
/* Completion Queues */ /* Completion Queues */
int bnxt_re_destroy_cq(struct ib_cq *ib_cq) int bnxt_re_destroy_cq(struct ib_cq *ib_cq)
{ {
struct bnxt_re_cq *cq = container_of(ib_cq, struct bnxt_re_cq, ib_cq);
struct bnxt_re_dev *rdev = cq->rdev;
int rc; int rc;
struct bnxt_qplib_nq *nq = cq->qplib_cq.nq; struct bnxt_re_cq *cq;
struct bnxt_qplib_nq *nq;
struct bnxt_re_dev *rdev;
cq = container_of(ib_cq, struct bnxt_re_cq, ib_cq);
rdev = cq->rdev;
nq = cq->qplib_cq.nq;
rc = bnxt_qplib_destroy_cq(&rdev->qplib_res, &cq->qplib_cq); rc = bnxt_qplib_destroy_cq(&rdev->qplib_res, &cq->qplib_cq);
if (rc) { if (rc) {
...@@ -2308,12 +2559,11 @@ int bnxt_re_destroy_cq(struct ib_cq *ib_cq) ...@@ -2308,12 +2559,11 @@ int bnxt_re_destroy_cq(struct ib_cq *ib_cq)
if (!IS_ERR_OR_NULL(cq->umem)) if (!IS_ERR_OR_NULL(cq->umem))
ib_umem_release(cq->umem); ib_umem_release(cq->umem);
if (cq) {
kfree(cq->cql);
kfree(cq);
}
atomic_dec(&rdev->cq_count); atomic_dec(&rdev->cq_count);
nq->budget--; nq->budget--;
kfree(cq->cql);
kfree(cq);
return 0; return 0;
} }
...@@ -3078,7 +3328,8 @@ struct ib_mr *bnxt_re_get_dma_mr(struct ib_pd *ib_pd, int mr_access_flags) ...@@ -3078,7 +3328,8 @@ struct ib_mr *bnxt_re_get_dma_mr(struct ib_pd *ib_pd, int mr_access_flags)
mr->qplib_mr.hwq.level = PBL_LVL_MAX; mr->qplib_mr.hwq.level = PBL_LVL_MAX;
mr->qplib_mr.total_size = -1; /* Infinite length */ mr->qplib_mr.total_size = -1; /* Infinite length */
rc = bnxt_qplib_reg_mr(&rdev->qplib_res, &mr->qplib_mr, &pbl, 0, false); rc = bnxt_qplib_reg_mr(&rdev->qplib_res, &mr->qplib_mr, &pbl, 0, false,
PAGE_SIZE);
if (rc)
goto fail_mr;
@@ -3104,10 +3355,8 @@ int bnxt_re_dereg_mr(struct ib_mr *ib_mr)
int rc;
rc = bnxt_qplib_free_mrw(&rdev->qplib_res, &mr->qplib_mr);
if (rc) {
if (rc)
dev_err(rdev_to_dev(rdev), "Dereg MR failed: %#x\n", rc);
return rc;
}
if (mr->pages) {
rc = bnxt_qplib_free_fast_reg_page_list(&rdev->qplib_res,
@@ -3170,7 +3419,7 @@ struct ib_mr *bnxt_re_alloc_mr(struct ib_pd *ib_pd, enum ib_mr_type type,
rc = bnxt_qplib_alloc_mrw(&rdev->qplib_res, &mr->qplib_mr);
if (rc)
goto fail;
goto bail;
mr->ib_mr.lkey = mr->qplib_mr.lkey;
mr->ib_mr.rkey = mr->ib_mr.lkey;
@@ -3192,9 +3441,10 @@ struct ib_mr *bnxt_re_alloc_mr(struct ib_pd *ib_pd, enum ib_mr_type type,
return &mr->ib_mr;
fail_mr:
bnxt_qplib_free_mrw(&rdev->qplib_res, &mr->qplib_mr);
fail:
kfree(mr->pages);
fail:
bnxt_qplib_free_mrw(&rdev->qplib_res, &mr->qplib_mr);
bail:
kfree(mr);
return ERR_PTR(rc);
}
@@ -3248,6 +3498,46 @@ int bnxt_re_dealloc_mw(struct ib_mw *ib_mw)
return rc;
}
static int bnxt_re_page_size_ok(int page_shift)
{
switch (page_shift) {
case CMDQ_REGISTER_MR_LOG2_PBL_PG_SIZE_PG_4K:
case CMDQ_REGISTER_MR_LOG2_PBL_PG_SIZE_PG_8K:
case CMDQ_REGISTER_MR_LOG2_PBL_PG_SIZE_PG_64K:
case CMDQ_REGISTER_MR_LOG2_PBL_PG_SIZE_PG_2M:
case CMDQ_REGISTER_MR_LOG2_PBL_PG_SIZE_PG_256K:
case CMDQ_REGISTER_MR_LOG2_PBL_PG_SIZE_PG_1M:
case CMDQ_REGISTER_MR_LOG2_PBL_PG_SIZE_PG_4M:
case CMDQ_REGISTER_MR_LOG2_PBL_PG_SIZE_PG_1G:
return 1;
default:
return 0;
}
}
static int fill_umem_pbl_tbl(struct ib_umem *umem, u64 *pbl_tbl_orig,
int page_shift)
{
u64 *pbl_tbl = pbl_tbl_orig;
u64 paddr;
u64 page_mask = (1ULL << page_shift) - 1;
int i, pages;
struct scatterlist *sg;
int entry;
for_each_sg(umem->sg_head.sgl, sg, umem->nmap, entry) {
pages = sg_dma_len(sg) >> PAGE_SHIFT;
for (i = 0; i < pages; i++) {
paddr = sg_dma_address(sg) + (i << PAGE_SHIFT);
if (pbl_tbl == pbl_tbl_orig)
*pbl_tbl++ = paddr & ~page_mask;
else if ((paddr & page_mask) == 0)
*pbl_tbl++ = paddr;
}
}
return pbl_tbl - pbl_tbl_orig;
}
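fill_umem_pbl_tbl() records one PBL entry per device-sized page: the first entry is the aligned base of the first system page, and later system pages contribute an entry only when they start a new device page. A small stand-alone sketch of the same arithmetic (plain user-space C, hypothetical addresses, 4K system pages assumed, not driver code):

/* Hedged sketch of the PBL-fill logic above. */
#include <stdio.h>
#include <stdint.h>

#define SYS_PAGE_SHIFT 12 /* assume 4K system pages */

static int fill_pbl(const uint64_t *sys_pages, int nr_sys_pages,
		    uint64_t *pbl, int page_shift)
{
	uint64_t page_mask = (1ULL << page_shift) - 1;
	int n = 0;

	for (int i = 0; i < nr_sys_pages; i++) {
		uint64_t paddr = sys_pages[i];
		if (n == 0)
			pbl[n++] = paddr & ~page_mask; /* first entry: aligned base */
		else if ((paddr & page_mask) == 0)
			pbl[n++] = paddr;              /* later entries: new device page only */
	}
	return n;
}

int main(void)
{
	/* 4M of contiguous 4K system pages starting at a made-up address */
	uint64_t sys_pages[1024], pbl[8];
	for (int i = 0; i < 1024; i++)
		sys_pages[i] = 0x40000000ULL + ((uint64_t)i << SYS_PAGE_SHIFT);
	int n = fill_pbl(sys_pages, 1024, pbl, 21); /* 21 -> 2M device pages */
	printf("%d PBL entries, first 0x%llx\n", n, (unsigned long long)pbl[0]);
	return 0;
}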
/* uverbs */
struct ib_mr *bnxt_re_reg_user_mr(struct ib_pd *ib_pd, u64 start, u64 length,
u64 virt_addr, int mr_access_flags,
@@ -3257,10 +3547,8 @@ struct ib_mr *bnxt_re_reg_user_mr(struct ib_pd *ib_pd, u64 start, u64 length,
struct bnxt_re_dev *rdev = pd->rdev;
struct bnxt_re_mr *mr;
struct ib_umem *umem;
u64 *pbl_tbl, *pbl_tbl_orig;
u64 *pbl_tbl = NULL;
int i, umem_pgs, pages, rc;
int umem_pgs, page_shift, rc;
struct scatterlist *sg;
int entry;
if (length > BNXT_RE_MAX_MR_SIZE) {
dev_err(rdev_to_dev(rdev), "MR Size: %lld > Max supported:%ld\n",
@@ -3277,64 +3565,68 @@ struct ib_mr *bnxt_re_reg_user_mr(struct ib_pd *ib_pd, u64 start, u64 length,
mr->qplib_mr.flags = __from_ib_access_flags(mr_access_flags);
mr->qplib_mr.type = CMDQ_ALLOCATE_MRW_MRW_FLAGS_MR;
rc = bnxt_qplib_alloc_mrw(&rdev->qplib_res, &mr->qplib_mr);
if (rc) {
dev_err(rdev_to_dev(rdev), "Failed to allocate MR");
goto free_mr;
}
/* The fixed portion of the rkey is the same as the lkey */
mr->ib_mr.rkey = mr->qplib_mr.rkey;
umem = ib_umem_get(ib_pd->uobject->context, start, length,
mr_access_flags, 0);
if (IS_ERR(umem)) {
dev_err(rdev_to_dev(rdev), "Failed to get umem");
rc = -EFAULT;
goto free_mr;
goto free_mrw;
}
mr->ib_umem = umem;
rc = bnxt_qplib_alloc_mrw(&rdev->qplib_res, &mr->qplib_mr);
if (rc) {
dev_err(rdev_to_dev(rdev), "Failed to allocate MR");
goto release_umem;
}
/* The fixed portion of the rkey is the same as the lkey */
mr->ib_mr.rkey = mr->qplib_mr.rkey;
mr->qplib_mr.va = virt_addr;
umem_pgs = ib_umem_page_count(umem);
if (!umem_pgs) {
dev_err(rdev_to_dev(rdev), "umem is invalid!");
rc = -EINVAL;
goto free_mrw;
goto free_umem;
}
mr->qplib_mr.total_size = length;
pbl_tbl = kcalloc(umem_pgs, sizeof(u64 *), GFP_KERNEL);
if (!pbl_tbl) {
rc = -EINVAL;
rc = -ENOMEM;
goto free_mrw;
goto free_umem;
}
pbl_tbl_orig = pbl_tbl;
if (umem->hugetlb) {
page_shift = umem->page_shift;
dev_err(rdev_to_dev(rdev), "umem hugetlb not supported!");
if (!bnxt_re_page_size_ok(page_shift)) {
dev_err(rdev_to_dev(rdev), "umem page size unsupported!");
rc = -EFAULT;
goto fail;
}
if (umem->page_shift != PAGE_SHIFT) {
dev_err(rdev_to_dev(rdev), "umem page shift unsupported!");
rc = -EFAULT;
if (!umem->hugetlb && length > BNXT_RE_MAX_MR_SIZE_LOW) {
dev_err(rdev_to_dev(rdev), "Requested MR Sz:%llu Max sup:%llu",
length, (u64)BNXT_RE_MAX_MR_SIZE_LOW);
rc = -EINVAL;
goto fail;
}
/* Map umem buf ptrs to the PBL */
for_each_sg(umem->sg_head.sgl, sg, umem->nmap, entry) {
pages = sg_dma_len(sg) >> umem->page_shift;
for (i = 0; i < pages; i++, pbl_tbl++)
*pbl_tbl = sg_dma_address(sg) + (i << umem->page_shift);
if (umem->hugetlb && length > BNXT_RE_PAGE_SIZE_2M) {
page_shift = BNXT_RE_PAGE_SHIFT_2M;
dev_warn(rdev_to_dev(rdev), "umem hugetlb set page_size %x",
1 << page_shift);
}
rc = bnxt_qplib_reg_mr(&rdev->qplib_res, &mr->qplib_mr, pbl_tbl_orig,
umem_pgs, false);
/* Map umem buf ptrs to the PBL */
umem_pgs = fill_umem_pbl_tbl(umem, pbl_tbl, page_shift);
rc = bnxt_qplib_reg_mr(&rdev->qplib_res, &mr->qplib_mr, pbl_tbl,
umem_pgs, false, 1 << page_shift);
if (rc) {
dev_err(rdev_to_dev(rdev), "Failed to register user MR");
goto fail;
}
kfree(pbl_tbl_orig);
kfree(pbl_tbl);
mr->ib_mr.lkey = mr->qplib_mr.lkey;
mr->ib_mr.rkey = mr->qplib_mr.lkey;
@@ -3342,11 +3634,11 @@ struct ib_mr *bnxt_re_reg_user_mr(struct ib_pd *ib_pd, u64 start, u64 length,
return &mr->ib_mr;
fail:
kfree(pbl_tbl_orig);
kfree(pbl_tbl);
free_umem:
ib_umem_release(umem);
free_mrw:
bnxt_qplib_free_mrw(&rdev->qplib_res, &mr->qplib_mr);
release_umem:
ib_umem_release(umem);
free_mr:
kfree(mr);
return ERR_PTR(rc);
......
@@ -68,6 +68,15 @@ struct bnxt_re_ah {
struct bnxt_qplib_ah qplib_ah;
};
struct bnxt_re_srq {
struct bnxt_re_dev *rdev;
u32 srq_limit;
struct ib_srq ib_srq;
struct bnxt_qplib_srq qplib_srq;
struct ib_umem *umem;
spinlock_t lock; /* protect srq */
};
struct bnxt_re_qp {
struct list_head list;
struct bnxt_re_dev *rdev;
@@ -143,6 +152,7 @@ int bnxt_re_query_port(struct ib_device *ibdev, u8 port_num,
struct ib_port_attr *port_attr);
int bnxt_re_get_port_immutable(struct ib_device *ibdev, u8 port_num,
struct ib_port_immutable *immutable);
void bnxt_re_query_fw_str(struct ib_device *ibdev, char *str);
int bnxt_re_query_pkey(struct ib_device *ibdev, u8 port_num,
u16 index, u16 *pkey);
int bnxt_re_del_gid(struct ib_device *ibdev, u8 port_num,
@@ -164,6 +174,16 @@ struct ib_ah *bnxt_re_create_ah(struct ib_pd *pd,
int bnxt_re_modify_ah(struct ib_ah *ah, struct rdma_ah_attr *ah_attr);
int bnxt_re_query_ah(struct ib_ah *ah, struct rdma_ah_attr *ah_attr);
int bnxt_re_destroy_ah(struct ib_ah *ah);
struct ib_srq *bnxt_re_create_srq(struct ib_pd *pd,
struct ib_srq_init_attr *srq_init_attr,
struct ib_udata *udata);
int bnxt_re_modify_srq(struct ib_srq *srq, struct ib_srq_attr *srq_attr,
enum ib_srq_attr_mask srq_attr_mask,
struct ib_udata *udata);
int bnxt_re_query_srq(struct ib_srq *srq, struct ib_srq_attr *srq_attr);
int bnxt_re_destroy_srq(struct ib_srq *srq);
int bnxt_re_post_srq_recv(struct ib_srq *srq, struct ib_recv_wr *recv_wr,
struct ib_recv_wr **bad_recv_wr);
struct ib_qp *bnxt_re_create_qp(struct ib_pd *pd,
struct ib_qp_init_attr *qp_init_attr,
struct ib_udata *udata);
......
@@ -80,6 +80,79 @@ static DEFINE_MUTEX(bnxt_re_dev_lock);
static struct workqueue_struct *bnxt_re_wq;
static void bnxt_re_ib_unreg(struct bnxt_re_dev *rdev, bool lock_wait);
/* SR-IOV helper functions */
static void bnxt_re_get_sriov_func_type(struct bnxt_re_dev *rdev)
{
struct bnxt *bp;
bp = netdev_priv(rdev->en_dev->net);
if (BNXT_VF(bp))
rdev->is_virtfn = 1;
}
/* Set the maximum number of each resource that the driver actually wants
* to allocate. This may be up to the maximum number the firmware has
* reserved for the function. The driver may choose to allocate fewer
* resources than the firmware maximum.
*/
static void bnxt_re_set_resource_limits(struct bnxt_re_dev *rdev)
{
u32 vf_qps = 0, vf_srqs = 0, vf_cqs = 0, vf_mrws = 0, vf_gids = 0;
u32 i;
u32 vf_pct;
u32 num_vfs;
struct bnxt_qplib_dev_attr *dev_attr = &rdev->dev_attr;
rdev->qplib_ctx.qpc_count = min_t(u32, BNXT_RE_MAX_QPC_COUNT,
dev_attr->max_qp);
rdev->qplib_ctx.mrw_count = BNXT_RE_MAX_MRW_COUNT_256K;
/* Use max_mr from fw since max_mrw does not get set */
rdev->qplib_ctx.mrw_count = min_t(u32, rdev->qplib_ctx.mrw_count,
dev_attr->max_mr);
rdev->qplib_ctx.srqc_count = min_t(u32, BNXT_RE_MAX_SRQC_COUNT,
dev_attr->max_srq);
rdev->qplib_ctx.cq_count = min_t(u32, BNXT_RE_MAX_CQ_COUNT,
dev_attr->max_cq);
for (i = 0; i < MAX_TQM_ALLOC_REQ; i++)
rdev->qplib_ctx.tqm_count[i] =
rdev->dev_attr.tqm_alloc_reqs[i];
if (rdev->num_vfs) {
/*
* Reserve a set of resources for the PF. Divide the remaining
* resources among the VFs
*/
vf_pct = 100 - BNXT_RE_PCT_RSVD_FOR_PF;
num_vfs = 100 * rdev->num_vfs;
vf_qps = (rdev->qplib_ctx.qpc_count * vf_pct) / num_vfs;
vf_srqs = (rdev->qplib_ctx.srqc_count * vf_pct) / num_vfs;
vf_cqs = (rdev->qplib_ctx.cq_count * vf_pct) / num_vfs;
/*
* The driver allows many more MRs than other resources. If the
* firmware does also, then reserve a fixed amount for the PF
* and divide the rest among VFs. VFs may use many MRs for NFS
* mounts, ISER, NVME applications, etc. If the firmware
* severely restricts the number of MRs, then let PF have
* half and divide the rest among VFs, as for the other
* resource types.
*/
if (rdev->qplib_ctx.mrw_count < BNXT_RE_MAX_MRW_COUNT_64K)
vf_mrws = rdev->qplib_ctx.mrw_count * vf_pct / num_vfs;
else
vf_mrws = (rdev->qplib_ctx.mrw_count -
BNXT_RE_RESVD_MR_FOR_PF) / rdev->num_vfs;
vf_gids = BNXT_RE_MAX_GID_PER_VF;
}
rdev->qplib_ctx.vf_res.max_mrw_per_vf = vf_mrws;
rdev->qplib_ctx.vf_res.max_gid_per_vf = vf_gids;
rdev->qplib_ctx.vf_res.max_qp_per_vf = vf_qps;
rdev->qplib_ctx.vf_res.max_srq_per_vf = vf_srqs;
rdev->qplib_ctx.vf_res.max_cq_per_vf = vf_cqs;
}
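As a worked example of the split above (the numbers are made up; the real limits come from the BNXT_RE_* constants and the firmware caps): with a fixed percentage of the function-wide resources reserved for the PF, each VF may use count * (100 - pct) / (100 * num_vfs) of a given resource. A minimal sketch, assuming a hypothetical 10% PF reservation:

/* Hedged sketch: per-VF resource split with made-up numbers. */
#include <stdio.h>

int main(void)
{
	unsigned int qpc_count = 65536; /* hypothetical function-wide QP contexts */
	unsigned int num_vfs   = 8;
	unsigned int pf_pct    = 10;    /* assumed percentage reserved for the PF */

	/* vf_qps = qpc_count * (100 - pf_pct) / (100 * num_vfs) */
	unsigned int vf_qps = qpc_count * (100 - pf_pct) / (100 * num_vfs);
	printf("each of %u VFs may create up to %u QPs; the PF keeps the rest\n",
	       num_vfs, vf_qps);
	return 0;
}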
/* for handling bnxt_en callbacks later */
static void bnxt_re_stop(void *p)
{
@@ -91,6 +164,15 @@ static void bnxt_re_start(void *p)
static void bnxt_re_sriov_config(void *p, int num_vfs)
{
struct bnxt_re_dev *rdev = p;
if (!rdev)
return;
rdev->num_vfs = num_vfs;
bnxt_re_set_resource_limits(rdev);
bnxt_qplib_set_func_resources(&rdev->qplib_res, &rdev->rcfw,
&rdev->qplib_ctx);
}
static void bnxt_re_shutdown(void *p)
@@ -417,7 +499,7 @@ static struct bnxt_en_dev *bnxt_re_dev_probe(struct net_device *netdev)
return ERR_PTR(-EINVAL);
if (!(en_dev->flags & BNXT_EN_FLAG_ROCE_CAP)) {
dev_dbg(&pdev->dev,
dev_info(&pdev->dev,
"%s: probe error: RoCE is not supported on this device",
ROCE_DRV_MODULE_NAME);
return ERR_PTR(-ENODEV);
@@ -490,6 +572,7 @@ static int bnxt_re_register_ib(struct bnxt_re_dev *rdev)
ibdev->query_port = bnxt_re_query_port;
ibdev->get_port_immutable = bnxt_re_get_port_immutable;
ibdev->get_dev_fw_str = bnxt_re_query_fw_str;
ibdev->query_pkey = bnxt_re_query_pkey;
ibdev->query_gid = bnxt_re_query_gid;
ibdev->get_netdev = bnxt_re_get_netdev;
@@ -505,6 +588,12 @@ static int bnxt_re_register_ib(struct bnxt_re_dev *rdev)
ibdev->query_ah = bnxt_re_query_ah;
ibdev->destroy_ah = bnxt_re_destroy_ah;
ibdev->create_srq = bnxt_re_create_srq;
ibdev->modify_srq = bnxt_re_modify_srq;
ibdev->query_srq = bnxt_re_query_srq;
ibdev->destroy_srq = bnxt_re_destroy_srq;
ibdev->post_srq_recv = bnxt_re_post_srq_recv;
ibdev->create_qp = bnxt_re_create_qp;
ibdev->modify_qp = bnxt_re_modify_qp;
ibdev->query_qp = bnxt_re_query_qp;
@@ -541,14 +630,6 @@ static ssize_t show_rev(struct device *device, struct device_attribute *attr,
return scnprintf(buf, PAGE_SIZE, "0x%x\n", rdev->en_dev->pdev->vendor);
}
static ssize_t show_fw_ver(struct device *device, struct device_attribute *attr,
char *buf)
{
struct bnxt_re_dev *rdev = to_bnxt_re_dev(device, ibdev.dev);
return scnprintf(buf, PAGE_SIZE, "%s\n", rdev->dev_attr.fw_ver);
}
static ssize_t show_hca(struct device *device, struct device_attribute *attr,
char *buf)
{
@@ -558,12 +639,10 @@ static ssize_t show_hca(struct device *device, struct device_attribute *attr,
}
static DEVICE_ATTR(hw_rev, 0444, show_rev, NULL);
static DEVICE_ATTR(fw_rev, 0444, show_fw_ver, NULL);
static DEVICE_ATTR(hca_type, 0444, show_hca, NULL);
static struct device_attribute *bnxt_re_attributes[] = {
&dev_attr_hw_rev,
&dev_attr_fw_rev,
&dev_attr_hca_type
};
@@ -616,10 +695,10 @@ static struct bnxt_re_dev *bnxt_re_dev_add(struct net_device *netdev,
return rdev;
}
static int bnxt_re_aeq_handler(struct bnxt_qplib_rcfw *rcfw,
struct creq_func_event *aeqe)
static int bnxt_re_handle_unaffi_async_event(struct creq_func_event
*unaffi_async)
{
switch (aeqe->event) {
switch (unaffi_async->event) {
case CREQ_FUNC_EVENT_EVENT_TX_WQE_ERROR:
break;
case CREQ_FUNC_EVENT_EVENT_TX_DATA_ERROR:
@@ -648,6 +727,93 @@ static int bnxt_re_aeq_handler(struct bnxt_qplib_rcfw *rcfw,
return 0;
}
static int bnxt_re_handle_qp_async_event(struct creq_qp_event *qp_event,
struct bnxt_re_qp *qp)
{
struct ib_event event;
memset(&event, 0, sizeof(event));
if (qp->qplib_qp.srq) {
event.device = &qp->rdev->ibdev;
event.element.qp = &qp->ib_qp;
event.event = IB_EVENT_QP_LAST_WQE_REACHED;
}
if (event.device && qp->ib_qp.event_handler)
qp->ib_qp.event_handler(&event, qp->ib_qp.qp_context);
return 0;
}
static int bnxt_re_handle_affi_async_event(struct creq_qp_event *affi_async,
void *obj)
{
int rc = 0;
u8 event;
if (!obj)
return rc; /* QP was already dead, still return success */
event = affi_async->event;
if (event == CREQ_QP_EVENT_EVENT_QP_ERROR_NOTIFICATION) {
struct bnxt_qplib_qp *lib_qp = obj;
struct bnxt_re_qp *qp = container_of(lib_qp, struct bnxt_re_qp,
qplib_qp);
rc = bnxt_re_handle_qp_async_event(affi_async, qp);
}
return rc;
}
static int bnxt_re_aeq_handler(struct bnxt_qplib_rcfw *rcfw,
void *aeqe, void *obj)
{
struct creq_qp_event *affi_async;
struct creq_func_event *unaffi_async;
u8 type;
int rc;
type = ((struct creq_base *)aeqe)->type;
if (type == CREQ_BASE_TYPE_FUNC_EVENT) {
unaffi_async = aeqe;
rc = bnxt_re_handle_unaffi_async_event(unaffi_async);
} else {
affi_async = aeqe;
rc = bnxt_re_handle_affi_async_event(affi_async, obj);
}
return rc;
}
static int bnxt_re_srqn_handler(struct bnxt_qplib_nq *nq,
struct bnxt_qplib_srq *handle, u8 event)
{
struct bnxt_re_srq *srq = container_of(handle, struct bnxt_re_srq,
qplib_srq);
struct ib_event ib_event;
int rc = 0;
if (!srq) {
dev_err(NULL, "%s: SRQ is NULL, SRQN not handled",
ROCE_DRV_MODULE_NAME);
rc = -EINVAL;
goto done;
}
ib_event.device = &srq->rdev->ibdev;
ib_event.element.srq = &srq->ib_srq;
if (event == NQ_SRQ_EVENT_EVENT_SRQ_THRESHOLD_EVENT)
ib_event.event = IB_EVENT_SRQ_LIMIT_REACHED;
else
ib_event.event = IB_EVENT_SRQ_ERR;
if (srq->ib_srq.event_handler) {
/* Lock event_handler? */
(*srq->ib_srq.event_handler)(&ib_event,
srq->ib_srq.srq_context);
}
done:
return rc;
}
static int bnxt_re_cqn_handler(struct bnxt_qplib_nq *nq,
struct bnxt_qplib_cq *handle)
{
@@ -690,7 +856,8 @@ static int bnxt_re_init_res(struct bnxt_re_dev *rdev)
rc = bnxt_qplib_enable_nq(rdev->en_dev->pdev, &rdev->nq[i - 1],
i - 1, rdev->msix_entries[i].vector,
rdev->msix_entries[i].db_offset,
&bnxt_re_cqn_handler, NULL);
&bnxt_re_cqn_handler,
&bnxt_re_srqn_handler);
if (rc) {
dev_err(rdev_to_dev(rdev),
@@ -734,7 +901,8 @@ static int bnxt_re_alloc_res(struct bnxt_re_dev *rdev)
/* Configure and allocate resources for qplib */
rdev->qplib_res.rcfw = &rdev->rcfw;
rc = bnxt_qplib_get_dev_attr(&rdev->rcfw, &rdev->dev_attr);
rc = bnxt_qplib_get_dev_attr(&rdev->rcfw, &rdev->dev_attr,
rdev->is_virtfn);
if (rc)
goto fail;
@@ -1035,19 +1203,6 @@ static void bnxt_re_ib_unreg(struct bnxt_re_dev *rdev, bool lock_wait)
}
}
static void bnxt_re_set_resource_limits(struct bnxt_re_dev *rdev)
{
u32 i;
rdev->qplib_ctx.qpc_count = BNXT_RE_MAX_QPC_COUNT;
rdev->qplib_ctx.mrw_count = BNXT_RE_MAX_MRW_COUNT;
rdev->qplib_ctx.srqc_count = BNXT_RE_MAX_SRQC_COUNT;
rdev->qplib_ctx.cq_count = BNXT_RE_MAX_CQ_COUNT;
for (i = 0; i < MAX_TQM_ALLOC_REQ; i++)
rdev->qplib_ctx.tqm_count[i] =
rdev->dev_attr.tqm_alloc_reqs[i];
}
/* worker thread for polling periodic events. Now used for QoS programming*/
static void bnxt_re_worker(struct work_struct *work)
{
@@ -1070,6 +1225,9 @@ static int bnxt_re_ib_reg(struct bnxt_re_dev *rdev)
}
set_bit(BNXT_RE_FLAG_NETDEV_REGISTERED, &rdev->flags);
/* Check whether VF or PF */
bnxt_re_get_sriov_func_type(rdev);
rc = bnxt_re_request_msix(rdev);
if (rc) {
pr_err("Failed to get MSI-X vectors: %#x\n", rc);
@@ -1101,16 +1259,18 @@ static int bnxt_re_ib_reg(struct bnxt_re_dev *rdev)
(rdev->en_dev->pdev, &rdev->rcfw,
rdev->msix_entries[BNXT_RE_AEQ_IDX].vector,
rdev->msix_entries[BNXT_RE_AEQ_IDX].db_offset,
0, &bnxt_re_aeq_handler);
rdev->is_virtfn, &bnxt_re_aeq_handler);
if (rc) {
pr_err("Failed to enable RCFW channel: %#x\n", rc);
goto free_ring;
}
rc = bnxt_qplib_get_dev_attr(&rdev->rcfw, &rdev->dev_attr);
rc = bnxt_qplib_get_dev_attr(&rdev->rcfw, &rdev->dev_attr,
rdev->is_virtfn);
if (rc)
goto disable_rcfw;
bnxt_re_set_resource_limits(rdev);
if (!rdev->is_virtfn)
bnxt_re_set_resource_limits(rdev);
rc = bnxt_qplib_alloc_ctx(rdev->en_dev->pdev, &rdev->qplib_ctx, 0);
if (rc) {
@@ -1125,7 +1285,8 @@ static int bnxt_re_ib_reg(struct bnxt_re_dev *rdev)
goto free_ctx;
}
rc = bnxt_qplib_init_rcfw(&rdev->rcfw, &rdev->qplib_ctx, 0);
rc = bnxt_qplib_init_rcfw(&rdev->rcfw, &rdev->qplib_ctx,
rdev->is_virtfn);
if (rc) {
pr_err("Failed to initialize RCFW: %#x\n", rc);
goto free_sctx;
@@ -1144,13 +1305,15 @@ static int bnxt_re_ib_reg(struct bnxt_re_dev *rdev)
goto fail;
}
rc = bnxt_re_setup_qos(rdev);
if (rc)
pr_info("RoCE priority not yet configured\n");
if (!rdev->is_virtfn) {
rc = bnxt_re_setup_qos(rdev);
if (rc)
pr_info("RoCE priority not yet configured\n");
INIT_DELAYED_WORK(&rdev->worker, bnxt_re_worker);
set_bit(BNXT_RE_FLAG_QOS_WORK_REG, &rdev->flags);
schedule_delayed_work(&rdev->worker, msecs_to_jiffies(30000));
}
/* Register ib dev */
rc = bnxt_re_register_ib(rdev);
@@ -1176,6 +1339,7 @@ static int bnxt_re_ib_reg(struct bnxt_re_dev *rdev)
set_bit(BNXT_RE_FLAG_IBDEV_REGISTERED, &rdev->flags);
ib_get_eth_speed(&rdev->ibdev, 1, &rdev->active_speed,
&rdev->active_width);
set_bit(BNXT_RE_FLAG_ISSUE_ROCE_STATS, &rdev->flags);
bnxt_re_dispatch_event(&rdev->ibdev, NULL, 1, IB_EVENT_PORT_ACTIVE);
bnxt_re_dispatch_event(&rdev->ibdev, NULL, 1, IB_EVENT_GID_CHANGE);
@@ -1400,7 +1564,7 @@ static int __init bnxt_re_mod_init(void)
static void __exit bnxt_re_mod_exit(void)
{
struct bnxt_re_dev *rdev;
struct bnxt_re_dev *rdev, *next;
LIST_HEAD(to_be_deleted);
mutex_lock(&bnxt_re_dev_lock);
@@ -1408,8 +1572,11 @@ static void __exit bnxt_re_mod_exit(void)
if (!list_empty(&bnxt_re_dev_list))
list_splice_init(&bnxt_re_dev_list, &to_be_deleted);
mutex_unlock(&bnxt_re_dev_lock);
list_for_each_entry(rdev, &to_be_deleted, list) {
/*
* Cleanup the devices in reverse order so that the VF device
* cleanup is done before PF cleanup
*/
list_for_each_entry_safe_reverse(rdev, next, &to_be_deleted, list) {
dev_info(rdev_to_dev(rdev), "Unregistering Device"); dev_info(rdev_to_dev(rdev), "Unregistering Device");
bnxt_re_dev_stop(rdev); bnxt_re_dev_stop(rdev);
bnxt_re_ib_unreg(rdev, true); bnxt_re_ib_unreg(rdev, true);
......
@@ -39,6 +39,27 @@
#ifndef __BNXT_QPLIB_FP_H__
#define __BNXT_QPLIB_FP_H__
struct bnxt_qplib_srq {
struct bnxt_qplib_pd *pd;
struct bnxt_qplib_dpi *dpi;
void __iomem *dbr_base;
u64 srq_handle;
u32 id;
u32 max_wqe;
u32 max_sge;
u32 threshold;
bool arm_req;
struct bnxt_qplib_cq *cq;
struct bnxt_qplib_hwq hwq;
struct bnxt_qplib_swq *swq;
struct scatterlist *sglist;
int start_idx;
int last_idx;
u32 nmap;
u16 eventq_hw_ring_id;
spinlock_t lock; /* protect SRQE link list */
};
struct bnxt_qplib_sge {
u64 addr;
u32 lkey;
@@ -79,6 +100,7 @@ static inline u32 get_psne_idx(u32 val)
struct bnxt_qplib_swq {
u64 wr_id;
int next_idx;
u8 type;
u8 flags;
u32 start_psn;
@@ -404,29 +426,27 @@ struct bnxt_qplib_cq {
writel(NQ_DB_CP_FLAGS | ((raw_cons) & ((cp_bit) - 1)), db)
struct bnxt_qplib_nq {
struct pci_dev *pdev;
int vector;
cpumask_t mask;
int budget;
bool requested;
struct tasklet_struct worker;
struct bnxt_qplib_hwq hwq;
u16 bar_reg;
u16 bar_reg_off;
u16 ring_id;
void __iomem *bar_reg_iomem;
int (*cqn_handler)
(struct bnxt_qplib_nq *nq,
struct bnxt_qplib_cq *cq);
int (*srqn_handler)
(struct bnxt_qplib_nq *nq,
void *srq,
u8 event);
struct workqueue_struct *cqn_wq;
char name[32];
int (*cqn_handler)(struct bnxt_qplib_nq *nq,
struct bnxt_qplib_cq *cq);
int (*srqn_handler)(struct bnxt_qplib_nq *nq,
struct bnxt_qplib_srq *srq,
u8 event);
struct workqueue_struct *cqn_wq;
char name[32];
};
struct bnxt_qplib_nq_work {
@@ -441,8 +461,18 @@ int bnxt_qplib_enable_nq(struct pci_dev *pdev, struct bnxt_qplib_nq *nq,
int (*cqn_handler)(struct bnxt_qplib_nq *nq,
struct bnxt_qplib_cq *cq),
int (*srqn_handler)(struct bnxt_qplib_nq *nq,
void *srq,
struct bnxt_qplib_srq *srq,
u8 event));
int bnxt_qplib_create_srq(struct bnxt_qplib_res *res,
struct bnxt_qplib_srq *srq);
int bnxt_qplib_modify_srq(struct bnxt_qplib_res *res,
struct bnxt_qplib_srq *srq);
int bnxt_qplib_query_srq(struct bnxt_qplib_res *res,
struct bnxt_qplib_srq *srq);
int bnxt_qplib_destroy_srq(struct bnxt_qplib_res *res,
struct bnxt_qplib_srq *srq);
int bnxt_qplib_post_srq_recv(struct bnxt_qplib_srq *srq,
struct bnxt_qplib_swqe *wqe);
int bnxt_qplib_create_qp1(struct bnxt_qplib_res *res, struct bnxt_qplib_qp *qp);
int bnxt_qplib_create_qp(struct bnxt_qplib_res *res, struct bnxt_qplib_qp *qp);
int bnxt_qplib_modify_qp(struct bnxt_qplib_res *res, struct bnxt_qplib_qp *qp);
......
@@ -93,7 +93,8 @@ static int __send_message(struct bnxt_qplib_rcfw *rcfw, struct cmdq_base *req,
opcode = req->opcode;
if (!test_bit(FIRMWARE_INITIALIZED_FLAG, &rcfw->flags) &&
(opcode != CMDQ_BASE_OPCODE_QUERY_FUNC &&
opcode != CMDQ_BASE_OPCODE_INITIALIZE_FW)) {
opcode != CMDQ_BASE_OPCODE_INITIALIZE_FW &&
opcode != CMDQ_BASE_OPCODE_QUERY_VERSION)) {
dev_err(&rcfw->pdev->dev,
"QPLIB: RCFW not initialized, reject opcode 0x%x",
opcode);
@@ -615,7 +616,7 @@ int bnxt_qplib_enable_rcfw_channel(struct pci_dev *pdev,
int msix_vector,
int cp_bar_reg_off, int virt_fn,
int (*aeq_handler)(struct bnxt_qplib_rcfw *,
struct creq_func_event *))
void *, void *))
{
resource_size_t res_base;
struct cmdq_init init;
......
@@ -167,7 +167,7 @@ struct bnxt_qplib_rcfw {
#define FIRMWARE_TIMED_OUT 3
wait_queue_head_t waitq;
int (*aeq_handler)(struct bnxt_qplib_rcfw *,
struct creq_func_event *);
void *, void *);
u32 seq_num;
/* Bar region info */
@@ -199,9 +199,8 @@ int bnxt_qplib_enable_rcfw_channel(struct pci_dev *pdev,
struct bnxt_qplib_rcfw *rcfw,
int msix_vector,
int cp_bar_reg_off, int virt_fn,
int (*aeq_handler)
(struct bnxt_qplib_rcfw *,
struct creq_func_event *));
int (*aeq_handler)(struct bnxt_qplib_rcfw *,
void *aeqe, void *obj));
struct bnxt_qplib_rcfw_sbuf *bnxt_qplib_rcfw_alloc_sbuf(
struct bnxt_qplib_rcfw *rcfw,
......
@@ -104,13 +104,12 @@ static int __alloc_pbl(struct pci_dev *pdev, struct bnxt_qplib_pbl *pbl,
if (!sghead) {
for (i = 0; i < pages; i++) {
pbl->pg_arr[i] = dma_alloc_coherent(&pdev->dev,
pbl->pg_arr[i] = dma_zalloc_coherent(&pdev->dev,
pbl->pg_size,
&pbl->pg_map_arr[i],
GFP_KERNEL);
if (!pbl->pg_arr[i])
goto fail;
memset(pbl->pg_arr[i], 0, pbl->pg_size);
pbl->pg_count++;
}
} else {
......
@@ -64,8 +64,28 @@ static bool bnxt_qplib_is_atomic_cap(struct bnxt_qplib_rcfw *rcfw)
return !!(pcie_ctl2 & PCI_EXP_DEVCTL2_ATOMIC_REQ);
}
static void bnxt_qplib_query_version(struct bnxt_qplib_rcfw *rcfw,
char *fw_ver)
{
struct cmdq_query_version req;
struct creq_query_version_resp resp;
u16 cmd_flags = 0;
int rc = 0;
RCFW_CMD_PREP(req, QUERY_VERSION, cmd_flags);
rc = bnxt_qplib_rcfw_send_message(rcfw, (void *)&req,
(void *)&resp, NULL, 0);
if (rc)
return;
fw_ver[0] = resp.fw_maj;
fw_ver[1] = resp.fw_minor;
fw_ver[2] = resp.fw_bld;
fw_ver[3] = resp.fw_rsvd;
}
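The four bytes captured here replace the old hard-coded "20.6.28.0" string (see the strlcpy removed below) and are roughly what the new get_dev_fw_str hook, bnxt_re_query_fw_str, reports through ib_core. A hedged sketch of that formatting with made-up values:

/* Hedged sketch: turn the four firmware version bytes into a dotted string. */
#include <stdio.h>

int main(void)
{
	unsigned char fw_ver[4] = { 20, 8, 1, 0 }; /* hypothetical values */
	char str[32];

	snprintf(str, sizeof(str), "%d.%d.%d.%d",
		 fw_ver[0], fw_ver[1], fw_ver[2], fw_ver[3]);
	printf("firmware %s\n", str);
	return 0;
}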
int bnxt_qplib_get_dev_attr(struct bnxt_qplib_rcfw *rcfw,
struct bnxt_qplib_dev_attr *attr)
struct bnxt_qplib_dev_attr *attr, bool vf)
{
struct cmdq_query_func req;
struct creq_query_func_resp resp;
@@ -95,7 +115,8 @@ int bnxt_qplib_get_dev_attr(struct bnxt_qplib_rcfw *rcfw,
/* Extract the context from the side buffer */
attr->max_qp = le32_to_cpu(sb->max_qp);
/* max_qp value reported by FW for PF doesn't include the QP1 for PF */
attr->max_qp += 1;
if (!vf)
attr->max_qp += 1;
attr->max_qp_rd_atom =
sb->max_qp_rd_atom > BNXT_QPLIB_MAX_OUT_RD_ATOM ?
BNXT_QPLIB_MAX_OUT_RD_ATOM : sb->max_qp_rd_atom;
@@ -133,7 +154,7 @@ int bnxt_qplib_get_dev_attr(struct bnxt_qplib_rcfw *rcfw,
attr->l2_db_size = (sb->l2_db_space_size + 1) * PAGE_SIZE;
attr->max_sgid = le32_to_cpu(sb->max_gid);
strlcpy(attr->fw_ver, "20.6.28.0", sizeof(attr->fw_ver));
bnxt_qplib_query_version(rcfw, attr->fw_ver);
for (i = 0; i < MAX_TQM_ALLOC_REQ / 4; i++) {
temp = le32_to_cpu(sb->tqm_alloc_reqs[i]);
@@ -150,6 +171,38 @@ int bnxt_qplib_get_dev_attr(struct bnxt_qplib_rcfw *rcfw,
return rc;
}
int bnxt_qplib_set_func_resources(struct bnxt_qplib_res *res,
struct bnxt_qplib_rcfw *rcfw,
struct bnxt_qplib_ctx *ctx)
{
struct cmdq_set_func_resources req;
struct creq_set_func_resources_resp resp;
u16 cmd_flags = 0;
int rc = 0;
RCFW_CMD_PREP(req, SET_FUNC_RESOURCES, cmd_flags);
req.number_of_qp = cpu_to_le32(ctx->qpc_count);
req.number_of_mrw = cpu_to_le32(ctx->mrw_count);
req.number_of_srq = cpu_to_le32(ctx->srqc_count);
req.number_of_cq = cpu_to_le32(ctx->cq_count);
req.max_qp_per_vf = cpu_to_le32(ctx->vf_res.max_qp_per_vf);
req.max_mrw_per_vf = cpu_to_le32(ctx->vf_res.max_mrw_per_vf);
req.max_srq_per_vf = cpu_to_le32(ctx->vf_res.max_srq_per_vf);
req.max_cq_per_vf = cpu_to_le32(ctx->vf_res.max_cq_per_vf);
req.max_gid_per_vf = cpu_to_le32(ctx->vf_res.max_gid_per_vf);
rc = bnxt_qplib_rcfw_send_message(rcfw, (void *)&req,
(void *)&resp,
NULL, 0);
if (rc) {
dev_err(&res->pdev->dev,
"QPLIB: Failed to set function resources");
}
return rc;
}
/* SGID */
int bnxt_qplib_get_sgid(struct bnxt_qplib_res *res,
struct bnxt_qplib_sgid_tbl *sgid_tbl, int index,
@@ -604,7 +657,7 @@ int bnxt_qplib_dereg_mrw(struct bnxt_qplib_res *res, struct bnxt_qplib_mrw *mrw,
}
int bnxt_qplib_reg_mr(struct bnxt_qplib_res *res, struct bnxt_qplib_mrw *mr,
u64 *pbl_tbl, int num_pbls, bool block)
u64 *pbl_tbl, int num_pbls, bool block, u32 buf_pg_size)
{
struct bnxt_qplib_rcfw *rcfw = res->rcfw;
struct cmdq_register_mr req;
@@ -615,6 +668,9 @@ int bnxt_qplib_reg_mr(struct bnxt_qplib_res *res, struct bnxt_qplib_mrw *mr,
u32 pg_size;
if (num_pbls) {
/* Allocate memory for the non-leaf pages to store buf ptrs.
* Non-leaf pages always uses system PAGE_SIZE
*/
pg_ptrs = roundup_pow_of_two(num_pbls);
pages = pg_ptrs >> MAX_PBL_LVL_1_PGS_SHIFT;
if (!pages)
@@ -632,6 +688,7 @@ int bnxt_qplib_reg_mr(struct bnxt_qplib_res *res, struct bnxt_qplib_mrw *mr,
bnxt_qplib_free_hwq(res->pdev, &mr->hwq);
mr->hwq.max_elements = pages;
/* Use system PAGE_SIZE */
rc = bnxt_qplib_alloc_init_hwq(res->pdev, &mr->hwq, NULL, 0,
&mr->hwq.max_elements,
PAGE_SIZE, 0, PAGE_SIZE,
@@ -652,18 +709,22 @@ int bnxt_qplib_reg_mr(struct bnxt_qplib_res *res, struct bnxt_qplib_mrw *mr,
/* Configure the request */
if (mr->hwq.level == PBL_LVL_MAX) {
/* No PBL provided, just use system PAGE_SIZE */
level = 0;
req.pbl = 0;
pg_size = PAGE_SIZE;
} else {
level = mr->hwq.level + 1;
req.pbl = cpu_to_le64(mr->hwq.pbl[PBL_LVL_0].pg_map_arr[0]);
pg_size = mr->hwq.pbl[PBL_LVL_0].pg_size;
}
pg_size = buf_pg_size ? buf_pg_size : PAGE_SIZE;
req.log2_pg_size_lvl = (level << CMDQ_REGISTER_MR_LVL_SFT) |
((ilog2(pg_size) <<
CMDQ_REGISTER_MR_LOG2_PG_SIZE_SFT) &
CMDQ_REGISTER_MR_LOG2_PG_SIZE_MASK);
req.log2_pbl_pg_size = cpu_to_le16(((ilog2(PAGE_SIZE) <<
CMDQ_REGISTER_MR_LOG2_PBL_PG_SIZE_SFT) &
CMDQ_REGISTER_MR_LOG2_PBL_PG_SIZE_MASK));
req.access = (mr->flags & 0xFFFF);
req.va = cpu_to_le64(mr->va);
req.key = cpu_to_le32(mr->lkey);
@@ -729,3 +790,73 @@ int bnxt_qplib_map_tc2cos(struct bnxt_qplib_res *res, u16 *cids)
0);
return 0;
}
int bnxt_qplib_get_roce_stats(struct bnxt_qplib_rcfw *rcfw,
struct bnxt_qplib_roce_stats *stats)
{
struct cmdq_query_roce_stats req;
struct creq_query_roce_stats_resp resp;
struct bnxt_qplib_rcfw_sbuf *sbuf;
struct creq_query_roce_stats_resp_sb *sb;
u16 cmd_flags = 0;
int rc = 0;
RCFW_CMD_PREP(req, QUERY_ROCE_STATS, cmd_flags);
sbuf = bnxt_qplib_rcfw_alloc_sbuf(rcfw, sizeof(*sb));
if (!sbuf) {
dev_err(&rcfw->pdev->dev,
"QPLIB: SP: QUERY_ROCE_STATS alloc side buffer failed");
return -ENOMEM;
}
sb = sbuf->sb;
req.resp_size = sizeof(*sb) / BNXT_QPLIB_CMDQE_UNITS;
rc = bnxt_qplib_rcfw_send_message(rcfw, (void *)&req, (void *)&resp,
(void *)sbuf, 0);
if (rc)
goto bail;
/* Extract the context from the side buffer */
stats->to_retransmits = le64_to_cpu(sb->to_retransmits);
stats->seq_err_naks_rcvd = le64_to_cpu(sb->seq_err_naks_rcvd);
stats->max_retry_exceeded = le64_to_cpu(sb->max_retry_exceeded);
stats->rnr_naks_rcvd = le64_to_cpu(sb->rnr_naks_rcvd);
stats->missing_resp = le64_to_cpu(sb->missing_resp);
stats->unrecoverable_err = le64_to_cpu(sb->unrecoverable_err);
stats->bad_resp_err = le64_to_cpu(sb->bad_resp_err);
stats->local_qp_op_err = le64_to_cpu(sb->local_qp_op_err);
stats->local_protection_err = le64_to_cpu(sb->local_protection_err);
stats->mem_mgmt_op_err = le64_to_cpu(sb->mem_mgmt_op_err);
stats->remote_invalid_req_err = le64_to_cpu(sb->remote_invalid_req_err);
stats->remote_access_err = le64_to_cpu(sb->remote_access_err);
stats->remote_op_err = le64_to_cpu(sb->remote_op_err);
stats->dup_req = le64_to_cpu(sb->dup_req);
stats->res_exceed_max = le64_to_cpu(sb->res_exceed_max);
stats->res_length_mismatch = le64_to_cpu(sb->res_length_mismatch);
stats->res_exceeds_wqe = le64_to_cpu(sb->res_exceeds_wqe);
stats->res_opcode_err = le64_to_cpu(sb->res_opcode_err);
stats->res_rx_invalid_rkey = le64_to_cpu(sb->res_rx_invalid_rkey);
stats->res_rx_domain_err = le64_to_cpu(sb->res_rx_domain_err);
stats->res_rx_no_perm = le64_to_cpu(sb->res_rx_no_perm);
stats->res_rx_range_err = le64_to_cpu(sb->res_rx_range_err);
stats->res_tx_invalid_rkey = le64_to_cpu(sb->res_tx_invalid_rkey);
stats->res_tx_domain_err = le64_to_cpu(sb->res_tx_domain_err);
stats->res_tx_no_perm = le64_to_cpu(sb->res_tx_no_perm);
stats->res_tx_range_err = le64_to_cpu(sb->res_tx_range_err);
stats->res_irrq_oflow = le64_to_cpu(sb->res_irrq_oflow);
stats->res_unsup_opcode = le64_to_cpu(sb->res_unsup_opcode);
stats->res_unaligned_atomic = le64_to_cpu(sb->res_unaligned_atomic);
stats->res_rem_inv_err = le64_to_cpu(sb->res_rem_inv_err);
stats->res_mem_error = le64_to_cpu(sb->res_mem_error);
stats->res_srq_err = le64_to_cpu(sb->res_srq_err);
stats->res_cmp_err = le64_to_cpu(sb->res_cmp_err);
stats->res_invalid_dup_rkey = le64_to_cpu(sb->res_invalid_dup_rkey);
stats->res_wqe_format_err = le64_to_cpu(sb->res_wqe_format_err);
stats->res_cq_load_err = le64_to_cpu(sb->res_cq_load_err);
stats->res_srq_load_err = le64_to_cpu(sb->res_srq_load_err);
stats->res_tx_pci_err = le64_to_cpu(sb->res_tx_pci_err);
stats->res_rx_pci_err = le64_to_cpu(sb->res_rx_pci_err);
bail:
bnxt_qplib_rcfw_free_sbuf(rcfw, sbuf);
return rc;
}
@@ -236,7 +236,7 @@ int c4iw_ev_handler(struct c4iw_dev *dev, u32 qid)
if (atomic_dec_and_test(&chp->refcnt))
wake_up(&chp->wait);
} else {
pr_warn("%s unknown cqid 0x%x\n", __func__, qid);
pr_debug("unknown cqid 0x%x\n", qid);
spin_unlock_irqrestore(&dev->lock, flag);
}
return 0;
......
@@ -153,8 +153,8 @@ struct c4iw_hw_queue {
};
struct wr_log_entry {
struct timespec post_host_ts;
struct timespec poll_host_ts;
ktime_t post_host_time;
ktime_t poll_host_time;
u64 post_sge_ts;
u64 cqe_sge_ts;
u64 poll_sge_ts;
......
@@ -1042,7 +1042,7 @@ int c4iw_post_send(struct ib_qp *ibqp, struct ib_send_wr *wr,
if (c4iw_wr_log) {
swsqe->sge_ts = cxgb4_read_sge_timestamp(
qhp->rhp->rdev.lldi.ports[0]);
getnstimeofday(&swsqe->host_ts);
swsqe->host_time = ktime_get();
}
init_wr_hdr(wqe, qhp->wq.sq.pidx, fw_opcode, fw_flags, len16);
@@ -1117,8 +1117,8 @@ int c4iw_post_receive(struct ib_qp *ibqp, struct ib_recv_wr *wr,
qhp->wq.rq.sw_rq[qhp->wq.rq.pidx].sge_ts =
cxgb4_read_sge_timestamp(
qhp->rhp->rdev.lldi.ports[0]);
getnstimeofday(
&qhp->wq.rq.sw_rq[qhp->wq.rq.pidx].host_ts);
qhp->wq.rq.sw_rq[qhp->wq.rq.pidx].host_time =
ktime_get();
}
wqe->recv.opcode = FW_RI_RECV_WR;
......
@@ -1272,6 +1272,8 @@ struct hfi1_devdata *hfi1_alloc_devdata(struct pci_dev *pdev, size_t extra)
"Could not allocate unit ID: error %d\n", -ret);
goto bail;
}
rvt_set_ibdev_name(&dd->verbs_dev.rdi, "%s_%d", class_name(), dd->unit);
/*
* Initialize all locks for the device. This needs to be as early as
* possible so locks are usable.
......
@@ -5,4 +5,3 @@ config INFINIBAND_I40IW
select GENERIC_ALLOCATOR
---help---
Intel(R) Ethernet X722 iWARP Driver
INET && I40IW && INFINIBAND && I40E