- 09 11月, 2013 1 次提交
-
-
由 Doug Ledford 提交于
The cma_acquire_dev function was changed by commit 3c86aa70 ("RDMA/cm: Add RDMA CM support for IBoE devices") to use find_gid_port() because multiport devices might have either IB or IBoE formatted gids. The old function assumed that all ports on the same device used the same GID format. However, when it was changed to use find_gid_port(), we inadvertently lost usage of the GID cache. This turned out to be a very costly change. In our testing, each iteration through each index of the GID table takes roughly 35us. When you have multiple devices in a system, and the GID you are looking for is on one of the later devices, the code loops through all of the GID indexes on all of the early devices before it finally succeeds on the target device. This pathological search behavior combined with 35us per GID table index retrieval results in results such as the following from the cmtime application that's part of the latest librdmacm git repo: ib1: step total ms max ms min us us / conn create id : 29.42 0.04 1.00 2.94 bind addr : 186705.66 19.00 18556.00 18670.57 resolve addr : 41.93 9.68 619.00 4.19 resolve route: 486.93 0.48 101.00 48.69 create qp : 4021.95 6.18 330.00 402.20 connect : 68350.39 68588.17 24632.00 6835.04 disconnect : 1460.43 252.65-1862269.00 146.04 destroy : 41.16 0.04 2.00 4.12 ib0: step total ms max ms min us us / conn create id : 28.61 0.68 1.00 2.86 bind addr : 2178.86 2.95 201.00 217.89 resolve addr : 51.26 16.85 845.00 5.13 resolve route: 620.08 0.43 92.00 62.01 create qp : 3344.40 6.36 273.00 334.44 connect : 6435.99 6368.53 7844.00 643.60 disconnect : 5095.38 321.90 757.00 509.54 destroy : 37.13 0.02 2.00 3.71 Clearly, both the bind address and connect operations suffer a huge penalty for being anything other than the default GID on the first port in the system. After applying this patch, the numbers now look like this: ib1: step total ms max ms min us us / conn create id : 30.15 0.03 1.00 3.01 bind addr : 80.27 0.04 7.00 8.03 resolve addr : 43.02 13.53 589.00 4.30 resolve route: 482.90 0.45 100.00 48.29 create qp : 3986.55 5.80 330.00 398.66 connect : 7141.53 7051.29 5005.00 714.15 disconnect : 5038.85 193.63 918.00 503.88 destroy : 37.02 0.04 2.00 3.70 ib0: step total ms max ms min us us / conn create id : 34.27 0.05 1.00 3.43 bind addr : 26.45 0.04 1.00 2.64 resolve addr : 38.25 10.54 760.00 3.82 resolve route: 604.79 0.43 97.00 60.48 create qp : 3314.95 6.34 273.00 331.49 connect : 12399.26 12351.10 8609.00 1239.93 disconnect : 5096.76 270.72 1015.00 509.68 destroy : 37.10 0.03 2.00 3.71 It's worth noting that we still suffer a bit of a penalty on connect to the wrong device, but the penalty is much less than it used to be. Follow on patches deal with this penalty. Many thanks to Neil Horman for helping to track the source of slow function that allowed us to track down the fact that the original patch I mentioned above backed out cache usage and identify just how much that impacted the system. Signed-off-by: NDoug Ledford <dledford@redhat.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
- 13 8月, 2013 1 次提交
-
-
由 Steve Wise 提交于
Modify the type of local_addr and remote_addr fields in struct iw_cm_id from struct sockaddr_in to struct sockaddr_storage to hold IPv6 and IPv4 addresses uniformly. Change the references of local_addr and remote_addr in cxgb4, cxgb3, nes and amso drivers to match this. However to be able to actully run traffic over IPv6, low-level drivers have to add code to support this. Signed-off-by: NSteve Wise <swise@opengridcomputing.com> Reviewed-by: NSean Hefty <sean.hefty@intel.com> [ Fix unused variable warnings when INFINIBAND_NES_DEBUG not set. - Roland ] Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
- 31 7月, 2013 3 次提交
-
-
由 Sean Hefty 提交于
Calling cma_save_ib_info() for CM SIDR REQs results in a crash accessing an invalid path record pointer. Signed-off-by: NSean Hefty <sean.hefty@intel.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Sean Hefty 提交于
If a application is using AF_IB with a UD QP, but does not provide any private data, we will end up accessing invalid memory. Check for this case and handle it appropriately. Signed-off-by: NSean Hefty <sean.hefty@intel.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Paul Bolle 提交于
Building cma.o triggers this gcc warning: drivers/infiniband/core/cma.c: In function ‘rdma_resolve_addr’: drivers/infiniband/core/cma.c:465:23: warning: ‘port’ may be used uninitialized in this function [-Wmaybe-uninitialized] drivers/infiniband/core/cma.c:426:5: note: ‘port’ was declared here This is a false positive, as "port" will always be initialized if we're at "found". But if we assign to "id_priv->id.port_num" directly, we can drop "port". That will, obviously, silence gcc. Signed-off-by: NPaul Bolle <pebolle@tiscali.nl> Signed-off-by: NSean Hefty <sean.hefty@intel.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
- 15 7月, 2013 1 次提交
-
-
由 Rusty Russell 提交于
Sweep of the simple cases. Cc: netdev@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-arm-kernel@lists.infradead.org Cc: Julia Lawall <julia.lawall@lip6.fr> Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> Acked-by: NDavid S. Miller <davem@davemloft.net> Acked-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 21 6月, 2013 20 次提交
-
-
由 Sean Hefty 提交于
Report AF_IB source and destination addresses through netlink interface. Signed-off-by: NSean Hefty <sean.hefty@intel.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Sean Hefty 提交于
Allow user space applications to join multicast groups using MGIDs directly. MGIDs may be passed using AF_IB addresses. Since the current multicast join command only supports addresses as large as sockaddr_in6, define a new structure for joining addresses specified using sockaddr_ib. Since AF_IB allows the user to specify the qkey when resolving a remote UD QP address, when joining the multicast group use the qkey value, if one has been assigned. Signed-off-by: NSean Hefty <sean.hefty@intel.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Sean Hefty 提交于
Allow the rdma_ucm to query the IB service ID formed or allocated by the rdma_cm by exporting the cma_get_service_id() functionality. Signed-off-by: NSean Hefty <sean.hefty@intel.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Sean Hefty 提交于
If an rdma_cm_id is bound to AF_IB, with a wild card address, only listen on IB devices. Signed-off-by: NSean Hefty <sean.hefty@intel.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Sean Hefty 提交于
Allow the user to specify the qkey when using AF_IB. The qkey is added to struct rdma_ucm_conn_param in place of a reserved field, but for backwards compatability, is only accessed if the associated rdma_cm_id is using AF_IB. Signed-off-by: NSean Hefty <sean.hefty@intel.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Sean Hefty 提交于
If the source or destination address is AF_IB, then do not reserve a portion of the private data in the IB CM REQ or SIDR REQ messages for the cma header. Instead, all private data should be exported to the user. When AF_IB is used, the rdma cm does not have sufficient information to fill in the cma header. Additionally, this will be necessary to support any IB connection through the rdma cm interface, Signed-off-by: NSean Hefty <sean.hefty@intel.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Sean Hefty 提交于
With the removal of SDP related code, we can merge cma_get_net_info() with cma_save_net_info(), since we're only ever dealing with a single header format. Signed-off-by: NSean Hefty <sean.hefty@intel.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Sean Hefty 提交于
The SDP protocol was never merged upstream. Remove unused SDP related code from the RDMA CM. Signed-off-by: NSean Hefty <sean.hefty@intel.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Sean Hefty 提交于
cma_get_service_id() forms the service ID based on the port space and port number of the rdma_cm_id. Extend the call to support AF_IB, which contains the service ID directly. This will be needed to support any arbitrary SID. Signed-off-by: NSean Hefty <sean.hefty@intel.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Sean Hefty 提交于
Allow rdma_resolve_route() to handle the case where the user specified the source and destination addresses using AF_IB. Signed-off-by: NSean Hefty <sean.hefty@intel.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Sean Hefty 提交于
Allow the user to specify the remote address using AF_IB format. When AF_IB is used, the remote address simply needs to be recorded, and no resolution using ARP is done. The local address may still need to be matched with a local IB device. Signed-off-by: NSean Hefty <sean.hefty@intel.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Sean Hefty 提交于
Signed-off-by: NSean Hefty <sean.hefty@intel.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Sean Hefty 提交于
If a user specifies AF_IB as the source address for a loopback connection, limit the resolution to IB devices only. Signed-off-by: NSean Hefty <sean.hefty@intel.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Sean Hefty 提交于
Provide inline helpers to extract source and destination address data from the rdma_cm_id. Signed-off-by: NSean Hefty <sean.hefty@intel.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Sean Hefty 提交于
cma_resolve_loopback is called after an rdma_cm_id has been bound to a specific sa_family and port. Once the source sa_family for the id has been set, do not modify it. Only the actual IP address portion of the source address needs to be set. As part of this fix, we can simplify setting the source address by moving the loopback address assignment from cma_resolve_loopback to cma_bind_loopback. cma_bind_loopback is only invoked when the source address is the loopback address. Finally, add loopback support for AF_IB as part of the change. Signed-off-by: NSean Hefty <sean.hefty@intel.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Sean Hefty 提交于
Modify rdma_bind_addr to allow the user to specify AF_IB when binding to a device. AF_IB indicates that the user is not mapping an IP address to the native IB addressing. (The mapping may have already been done, or is not needed) Signed-off-by: NSean Hefty <sean.hefty@intel.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Sean Hefty 提交于
The AF_IB uses a 64-bit service id (SID), which the user can control through the use of a mask. The rdma_cm will assign values to the unmasked portions of the SID based on the selected port space and port number. Because the IB spec divides the SID range into several regions, a SID/mask combination may fall into one of the existing port space ranges as defined by the RDMA CM IP Annex. Map the AF_IB SID to the correct RDMA port space. Signed-off-by: NSean Hefty <sean.hefty@intel.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Sean Hefty 提交于
Add support for AF_IB to ip_addr_size, and rename the function to account for the change. Give the compiler more control over whether the call should be inline or not by moving the definition into the .c file, removing the static inline, and exporting it. Signed-off-by: NSean Hefty <sean.hefty@intel.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Sean Hefty 提交于
Enhance checks for loopback and any address to support AF_IB in addition to AF_INET and AF_INT6. This will allow future patches to use AF_IB when binding and resolving addresses. Signed-off-by: NSean Hefty <sean.hefty@intel.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Sean Hefty 提交于
The rdma_cm only allows setting reuseaddr if the corresponding rdma_cm_id is in the idle state. Allow setting this value in other states. This brings the behavior more inline with sockets. Signed-off-by: NSean Hefty <sean.hefty@intel.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
- 29 5月, 2013 1 次提交
-
-
由 Jiri Pirko 提交于
So far, only net_device * could be passed along with netdevice notifier event. This patch provides a possibility to pass custom structure able to provide info that event listener needs to know. Signed-off-by: NJiri Pirko <jiri@resnulli.us> v2->v3: fix typo on simeth shortened dev_getter shortened notifier_info struct name v1->v2: fix notifier_call parameter in call_netdevice_notifier() Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 28 2月, 2013 2 次提交
-
-
由 Sasha Levin 提交于
I'm not sure why, but the hlist for each entry iterators were conceived list_for_each_entry(pos, head, member) The hlist ones were greedy and wanted an extra parameter: hlist_for_each_entry(tpos, pos, head, member) Why did they need an extra pos parameter? I'm not quite sure. Not only they don't really need it, it also prevents the iterator from looking exactly like the list iterator, which is unfortunate. Besides the semantic patch, there was some manual work required: - Fix up the actual hlist iterators in linux/list.h - Fix up the declaration of other iterators based on the hlist ones. - A very small amount of places were using the 'node' parameter, this was modified to use 'obj->member' instead. - Coccinelle didn't handle the hlist_for_each_entry_safe iterator properly, so those had to be fixed up manually. The semantic patch which is mostly the work of Peter Senna Tschudin is here: @@ iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host; type T; expression a,c,d,e; identifier b; statement S; @@ -T b; <+... when != b ( hlist_for_each_entry(a, - b, c, d) S | hlist_for_each_entry_continue(a, - b, c) S | hlist_for_each_entry_from(a, - b, c) S | hlist_for_each_entry_rcu(a, - b, c, d) S | hlist_for_each_entry_rcu_bh(a, - b, c, d) S | hlist_for_each_entry_continue_rcu_bh(a, - b, c) S | for_each_busy_worker(a, c, - b, d) S | ax25_uid_for_each(a, - b, c) S | ax25_for_each(a, - b, c) S | inet_bind_bucket_for_each(a, - b, c) S | sctp_for_each_hentry(a, - b, c) S | sk_for_each(a, - b, c) S | sk_for_each_rcu(a, - b, c) S | sk_for_each_from -(a, b) +(a) S + sk_for_each_from(a) S | sk_for_each_safe(a, - b, c, d) S | sk_for_each_bound(a, - b, c) S | hlist_for_each_entry_safe(a, - b, c, d, e) S | hlist_for_each_entry_continue_rcu(a, - b, c) S | nr_neigh_for_each(a, - b, c) S | nr_neigh_for_each_safe(a, - b, c, d) S | nr_node_for_each(a, - b, c) S | nr_node_for_each_safe(a, - b, c, d) S | - for_each_gfn_sp(a, c, d, b) S + for_each_gfn_sp(a, c, d) S | - for_each_gfn_indirect_valid_sp(a, c, d, b) S + for_each_gfn_indirect_valid_sp(a, c, d) S | for_each_host(a, - b, c) S | for_each_host_safe(a, - b, c, d) S | for_each_mesh_entry(a, - b, c, d) S ) ...+> [akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c] [akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c] [akpm@linux-foundation.org: checkpatch fixes] [akpm@linux-foundation.org: fix warnings] [akpm@linux-foudnation.org: redo intrusive kvm changes] Tested-by: NPeter Senna Tschudin <peter.senna@gmail.com> Acked-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: NSasha Levin <sasha.levin@oracle.com> Cc: Wu Fengguang <fengguang.wu@intel.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Gleb Natapov <gleb@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Tejun Heo 提交于
Convert to the much saner new idr interface. v2: Mike triggered WARN_ON() in idr_preload() because send_mad(), which may be used from non-process context, was calling idr_preload() unconditionally. Preload iff @gfp_mask has __GFP_WAIT. Signed-off-by: NTejun Heo <tj@kernel.org> Reviewed-by: NSean Hefty <sean.hefty@intel.com> Reported-by: N"Marciniszyn, Mike" <mike.marciniszyn@intel.com> Cc: Roland Dreier <roland@kernel.org> Cc: Sean Hefty <sean.hefty@intel.com> Cc: Hal Rosenstock <hal.rosenstock@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 30 11月, 2012 1 次提交
-
-
由 shefty 提交于
Problem reported by Dan Carpenter <dan.carpenter@oracle.com>: The patch 3c86aa70: "RDMA/cm: Add RDMA CM support for IBoE devices" from Oct 13, 2010, leads to the following warning: net/sunrpc/xprtrdma/svc_rdma_transport.c:722 svc_rdma_create() error: passing non neg 1 to ERR_PTR This bug would result in a NULL dereference. svc_rdma_create() is supposed to return ERR_PTRs or valid pointers, but instead it returns ERR_PTRs, valid pointers and 1. The call tree is: svc_rdma_create() => rdma_bind_addr() => cma_acquire_dev() => find_gid_port() rdma_bind_addr() should return a valid errno. Fix this by having find_gid_port() also return a valid errno. If we can't find the specified GID on a given port, return -EADDRNOTAVAIL, rather than -EAGAIN, to better indicate the error. We also drop using the special return value of '1' and instead pass through the error returned by the underlying verbs call. On such errors, rather than aborting the search, we simply continue to check the next device/port. Signed-off-by: NSean Hefty <sean.hefty@intel.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
- 07 10月, 2012 1 次提交
-
-
由 Gao feng 提交于
set netlink_dump_control.module to avoid panic. Signed-off-by: NGao feng <gaofeng@cn.fujitsu.com> Cc: Roland Dreier <roland@kernel.org> Cc: Sean Hefty <sean.hefty@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 05 10月, 2012 1 次提交
-
-
由 Sean Hefty 提交于
The retry_count and rnr_retry_count connection parameters are both 3-bit values. Check that the values are in range and reduce if they're not. This fixes a problem reported by Doug Ledford <dledford@redhat.com> that resulted in the userspace rping test (part of the librdmacm samples) failing to run over Intel IB HCAs. Signed-off-by: NSean Hefty <sean.hefty@intel.com> [ Use min_t() to avoid warnings about type mismatch. - Roland ] Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
- 01 10月, 2012 1 次提交
-
-
由 Dotan Barak 提交于
CMA multicast joins for the IPoIB port space need to use the same component mask used by the ipoib driver. Otherwise, it's possible for the CMA to create a group to which a join made by ipoib will fail, or vise-versa. Some of the component mask fields set by ipoib weren't set by the CMA, fix that. Signed-off-by: NDotan Barak <dotanb@dev.mellanox.co.il> Reviewed-by: NJack Morgenstein <jackm@dev.mellanox.co.il> Acked-by: NSean Hefty <sean.hefty@intel.com> Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
- 28 7月, 2012 1 次提交
-
-
由 Fengguang Wu 提交于
Suggested by scripts/coccinelle/api/ptr_ret.cocci. Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
- 09 7月, 2012 4 次提交
-
-
由 Roland Dreier 提交于
Instead of testing defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Sean Hefty 提交于
Provide an option for the user to specify that listens should only accept connections where the incoming address family matches that of the locally bound address. This is used to support the equivalent of IPV6_V6ONLY socket option, which allows an app to only accept connection requests directed to IPv6 addresses. Signed-off-by: NSean Hefty <sean.hefty@intel.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Sean Hefty 提交于
The rdma_cm maps IPv4 and IPv6 addresses to the same service ID. This prevents apps from listening only for IPv4 or IPv6 addresses. It also results in an app binding to an IPv4 address receiving connection requests for an IPv6 address. Change this to match socket behavior: restrict listens on IPv4 addresses to only IPv4 addresses, and if a listen is on an IPv6 address, allow it to receive either IPv4 or IPv6 addresses, based on its address family binding. Signed-off-by: NSean Hefty <sean.hefty@intel.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Sean Hefty 提交于
The RDMA CM uses a single port space for all associated (tcp, udp, etc.) port bindings, regardless of the address family that the user binds to. The result is that if a user binds to AF_INET, but does not specify an IP address, the bind will occur for AF_INET6. This causes an attempt to bind to the same port using AF_INET6 to fail, and connection requests to AF_INET6 will match with the AF_INET listener. Align the behavior with sockets and restrict the bind to AF_INET only. If a user binds to AF_INET6, we bind the port to AF_INET6 and AF_INET depending on the value of bindv6only. Signed-off-by: NSean Hefty <sean.hefty@intel.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
- 20 6月, 2012 1 次提交
-
-
由 Sean Hefty 提交于
Change || check to the intended && when checking the QP type in a received connection request against the listening endpoint. Signed-off-by: NSean Hefty <sean.hefty@intel.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
- 09 5月, 2012 1 次提交
-
-
由 Sean Hefty 提交于
The following lockdep problem was reported by Or Gerlitz <ogerlitz@mellanox.com>: [ INFO: possible recursive locking detected ] 3.3.0-32035-g1b2649e-dirty #4 Not tainted --------------------------------------------- kworker/5:1/418 is trying to acquire lock: (&id_priv->handler_mutex){+.+.+.}, at: [<ffffffffa0138a41>] rdma_destroy_i d+0x33/0x1f0 [rdma_cm] but task is already holding lock: (&id_priv->handler_mutex){+.+.+.}, at: [<ffffffffa0135130>] cma_disable_ca llback+0x24/0x45 [rdma_cm] other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&id_priv->handler_mutex); lock(&id_priv->handler_mutex); *** DEADLOCK *** May be due to missing lock nesting notation 3 locks held by kworker/5:1/418: #0: (ib_cm){.+.+.+}, at: [<ffffffff81042ac1>] process_one_work+0x210/0x4a 6 #1: ((&(&work->work)->work)){+.+.+.}, at: [<ffffffff81042ac1>] process_on e_work+0x210/0x4a6 #2: (&id_priv->handler_mutex){+.+.+.}, at: [<ffffffffa0135130>] cma_disab le_callback+0x24/0x45 [rdma_cm] stack backtrace: Pid: 418, comm: kworker/5:1 Not tainted 3.3.0-32035-g1b2649e-dirty #4 Call Trace: [<ffffffff8102b0fb>] ? console_unlock+0x1f4/0x204 [<ffffffff81068771>] __lock_acquire+0x16b5/0x174e [<ffffffff8106461f>] ? save_trace+0x3f/0xb3 [<ffffffff810688fa>] lock_acquire+0xf0/0x116 [<ffffffffa0138a41>] ? rdma_destroy_id+0x33/0x1f0 [rdma_cm] [<ffffffff81364351>] mutex_lock_nested+0x64/0x2ce [<ffffffffa0138a41>] ? rdma_destroy_id+0x33/0x1f0 [rdma_cm] [<ffffffff81065a78>] ? trace_hardirqs_on_caller+0x11e/0x155 [<ffffffff81065abc>] ? trace_hardirqs_on+0xd/0xf [<ffffffffa0138a41>] rdma_destroy_id+0x33/0x1f0 [rdma_cm] [<ffffffffa0139c02>] cma_req_handler+0x418/0x644 [rdma_cm] [<ffffffffa012ee88>] cm_process_work+0x32/0x119 [ib_cm] [<ffffffffa0130299>] cm_req_handler+0x928/0x982 [ib_cm] [<ffffffffa01302f3>] ? cm_req_handler+0x982/0x982 [ib_cm] [<ffffffffa0130326>] cm_work_handler+0x33/0xfe5 [ib_cm] [<ffffffff81065a78>] ? trace_hardirqs_on_caller+0x11e/0x155 [<ffffffffa01302f3>] ? cm_req_handler+0x982/0x982 [ib_cm] [<ffffffff81042b6e>] process_one_work+0x2bd/0x4a6 [<ffffffff81042ac1>] ? process_one_work+0x210/0x4a6 [<ffffffff813669f3>] ? _raw_spin_unlock_irq+0x2b/0x40 [<ffffffff8104316e>] worker_thread+0x1d6/0x350 [<ffffffff81042f98>] ? rescuer_thread+0x241/0x241 [<ffffffff81046a32>] kthread+0x84/0x8c [<ffffffff8136e854>] kernel_thread_helper+0x4/0x10 [<ffffffff81366d59>] ? retint_restore_args+0xe/0xe [<ffffffff810469ae>] ? __init_kthread_worker+0x56/0x56 [<ffffffff8136e850>] ? gs_change+0xb/0xb The actual locking is fine, since we're dealing with different locks, but from the same lock class. cma_disable_callback() acquires the listening id mutex, whereas rdma_destroy_id() acquires the mutex for the new connection id. To fix this, delay the call to rdma_destroy_id() until we've released the listening id mutex. Signed-off-by: NSean Hefty <sean.hefty@intel.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-