- 15 7月, 2008 1 次提交
-
-
由 Steve Wise 提交于
- set IB_DEVICE_MEM_MGT_EXTENSIONS capability bit if fw supports it. - set max_fast_reg_page_list_len device attribute. - add iwch_alloc_fast_reg_mr function. - add iwch_alloc_fastreg_pbl - add iwch_free_fastreg_pbl - adjust the WQ depth for kernel mode work queues to account for fastreg possibly taking 2 WR slots. - add fastreg_mr work request support. - add local_inv work request support. - add send_with_inv and send_with_se_inv work request support. - removed useless duplicate enums/defines for TPT/MW/MR stuff. Signed-off-by: NSteve Wise <swise@opengridcomputing.com> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
- 09 7月, 2008 1 次提交
-
-
由 Steve Wise 提交于
The change to iwch_provider.c in commit f4e91eb4 ("IB: convert struct class_device to struct device") undid the fix done in commit 7f049f2f ("RDMA/cxgb3: Hold rtnl_lock() around ethtool get_drvinfo call"). It removed the calls to rtnl_lock() that serialized the iw_cxgb3 ethtool ops calls into the cxgb3 driver. This locking is needed to avoid messing up the internal state of the cxgb3 driver. Signed-off-by: NSteve Wise <swise@opengridcomputing.com> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
- 17 5月, 2008 1 次提交
-
-
由 Roland Dreier 提交于
drivers/infiniband/hw/cxgb3/iwch_qp.c: In function 'iwch_post_send': drivers/infiniband/hw/cxgb3/iwch_qp.c:232: warning: 't3_wr_flit_cnt' may be used uninitialized in this function This is what akpm describes as "the dopey gcc-doesn't-know-that-foo(&var)-writes-to-var problem." Signed-off-by: NRoland Dreier <rolandd@cisco.com> Acked-by: NSteve Wise <swise@opengridcomputing.com>
-
- 14 5月, 2008 1 次提交
-
-
由 Steve Wise 提交于
cxio_flush_sq() was failing to wrap around the software send queue causing garbage completion entries on a flush operation. Signed-off-by: NSteve Wise <swise@opengridcomputing.com> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
- 07 5月, 2008 2 次提交
-
-
由 Roland Dreier 提交于
Currently, iw_cxgb3 is severely limited on the amount of userspace memory that can be registered in in a single memory region, which causes big problems for applications that expect to be able to register 100s of MB. The problem is that the driver uses a single kmalloc()ed buffer to hold the physical buffer list (PBL) for the entire memory region during registration, which means that 8 bytes of contiguous memory are required for each page of memory being registered. For example, a 64 MB registration will require 128 KB of contiguous memory with 4 KB pages, and it unlikely that such an allocation will succeed on a busy system. This is purely a driver problem: the temporary page list buffer is not needed by the hardware, so we can fix this by writing the PBL to the hardware in page-sized chunks rather than all at once. We do this by splitting the memory registration operation up into several steps: - Allocate PBL space in adapter memory for the full registration - Copy PBL to adapter memory in chunks - Allocate STag and enable memory region This also allows several other cleanups to the __cxio_tpt_op() interface and related parts of the driver. This change leaves the reregister memory region and memory window operations broken, but they already didn't work due to other longstanding bugs, so fixing them will be left to a later patch. Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
由 Roland Dreier 提交于
Current iw_cxgb3 code adds PBL memory to the driver's gen_pool in 2 MB chunks. This limits the largest single allocation that can be done to the same size, which means that with 4 KB pages, each of which takes 8 bytes of PBL memory, the largest memory region that can be allocated is 1 GB (256K PBL entries * 4 KB/entry). Remove this limit by adding all the PBL memory in a single gen_pool chunk, if possible. Add code that falls back to smaller chunks if gen_pool_add() fails, which can happen if there is not sufficient contiguous lowmem for the internal gen_pool bitmap. Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
- 03 5月, 2008 3 次提交
-
-
由 Steve Wise 提交于
Testing on large clusters shows its way too short at 10 secs. Signed-off-by: NSteve Wise <swise@opengridcomputing.com> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
由 Steve Wise 提交于
Remove bad BUG_ON() that can trigger in correct operation from close_con_rpl(). It is possible to get a close_rpl message on a dead connection. The sequence is: - host refs ep for close exchange - host posts close_req - hw posts PEER_ABORT from incoming RST - host marks ep DEAD - host posts ABORT_RPL and releases ep resources - hw posts CLOSE_RPL - host derefs ep and ep freed. Signed-off-by: NSteve Wise <swise@opengridcomputing.com> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
由 Steve Wise 提交于
- Flush the QP only after the HW disables the connection. Currently we flush the QP when transitioning to CLOSING. This exposes a race condition where the HW can complete a RECV WR, for instance, -and- the SW can flush that same WR. - Only call CQ event handlers on flush IFF we actually flushed something. Signed-off-by: NSteve Wise <swise@opengridcomputing.com> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
- 30 4月, 2008 3 次提交
-
-
由 Steve Wise 提交于
Open MPI, Intel MPI and other applications don't respect the iWARP requirement that the client (active) side of the connection send the first RDMA message. This class of application connection setup is called peer-to-peer. Typically once the connection is setup, _both_ sides want to send data. This patch enables supporting peer-to-peer over the chelsio RNIC by enforcing this iWARP requirement in the driver itself as part of RDMA connection setup. Connection setup is extended, when the peer2peer module option is 1, such that the MPA initiator will send a 0B Read (the RTR) just after connection setup. The MPA responder will suspend SQ processing until the RTR message is received and reply-to. In the longer term, this will be handled in a standardized way by enhancing the MPA negotiation so peers can indicate whether they want/need the RTR and what type of RTR (0B read, 0B write, or 0B send) should be sent. This will be done by standardizing a few bits of the private data in order to negotiate all this. However this patch enables peer-to-peer applications now and allows most of the required firmware and driver changes to be done and tested now. Design: - Add a module option, peer2peer, to enable this mode. - New firmware support for peer-to-peer mode: - a new bit in the rdma_init WR to tell it to do peer-2-peer and what form of RTR message to send or expect. - process _all_ preposted recvs before moving the connection into rdma mode. - passive side: defer completing the rdma_init WR until all pre-posted recvs are processed. Suspend SQ processing until the RTR is received. - active side: expect and process the 0B read WR on offload TX queue. Defer completing the rdma_init WR until all pre-posted recvs are processed. Suspend SQ processing until the 0B read WR is processed from the offload TX queue. - If peer2peer is set, driver posts 0B read request on offload TX queue just after posting the rdma_init WR to the offload TX queue. - Add CQ poll logic to ignore unsolicitied read responses. Signed-off-by: NSteve Wise <swise@opengridcomputing.com> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
由 Steve Wise 提交于
cxgb3 only supports 4GB memory regions. The lustre RDMA code uses this attribute and currently has to code around our bad setting. Signed-off-by: NSteve Wise <swise@opengridcomputing.com> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
由 Steve Wise 提交于
Open MPI and other stress testing exposed a few bad bugs in handling aborts in the middle of a normal close. Fix these by: - serializing abort reply and peer abort processing with disconnect processing - warning (and ignoring) if ep timer is stopped when it wasn't running - cleaning up disconnect path to correctly deal with aborting and dead endpoints - in iwch_modify_qp(), taking a ref on the ep before releasing the qp lock if iwch_ep_disconnect() will be called. The ref is dropped after calling disconnect. Signed-off-by: NSteve Wise <swise@opengridcomputing.com> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
- 29 4月, 2008 1 次提交
-
-
由 Arthur Kepner 提交于
Add a new parameter, dmasync, to the ib_umem_get() prototype. Use dmasync = 1 when mapping user-allocated CQs with ib_umem_get(). Signed-off-by: NArthur Kepner <akepner@sgi.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Cc: Jes Sorensen <jes@sgi.com> Cc: Randy Dunlap <randy.dunlap@oracle.com> Cc: Roland Dreier <rdreier@cisco.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: David Miller <davem@davemloft.net> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Grant Grundler <grundler@parisc-linux.org> Cc: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 20 4月, 2008 1 次提交
-
-
由 Tony Jones 提交于
This converts the main ib_device to use struct device instead of struct class_device as class_device is going away. Signed-off-by: NTony Jones <tonyj@suse.de> Signed-off-by: NKay Sievers <kay.sievers@vrfy.org> Cc: Roland Dreier <rolandd@cisco.com> Cc: Sean Hefty <sean.hefty@intel.com> Cc: Hal Rosenstock <hal.rosenstock@gmail.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
-
- 17 4月, 2008 3 次提交
-
-
由 Roland Dreier 提交于
Add a new IB_WR_SEND_WITH_INV send opcode that can be used to mark a "send with invalidate" work request as defined in the iWARP verbs and the InfiniBand base memory management extensions. Also put "imm_data" and a new "invalidate_rkey" member in a new "ex" union in struct ib_send_wr. The invalidate_rkey member can be used to pass in an R_Key/STag to be invalidated. Add this new union to struct ib_uverbs_send_wr. Add code to copy the invalidate_rkey field in ib_uverbs_post_send(). Fix up low-level drivers to deal with the change to struct ib_send_wr, and just remove the imm_data initialization from net/sunrpc/xprtrdma/, since that code never does any send with immediate operations. Also, move the existing IB_DEVICE_SEND_W_INV flag to a new bit, since the iWARP drivers currently in the tree set the bit. The amso1100 driver at least will silently fail to honor the IB_SEND_INVALIDATE bit if passed in as part of userspace send requests (since it does not implement kernel bypass work request queueing). Remove the flag from all existing drivers that set it until we know which ones are OK. The values chosen for the new flag is not consecutive to avoid clashing with flags defined in the XRC patches, which are not merged yet but which are already in use and are likely to be merged soon. This resurrects a patch sent long ago by Mikkel Hagen <mhagen@iol.unh.edu>. Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
由 Harvey Harrison 提交于
__FUNCTION__ is gcc-specific, use __func__ instead. Signed-off-by: NHarvey Harrison <harvey.harrison@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
由 Roland Dreier 提交于
Fix sparse warnings about pointer signedness by using a signed int when calling idr_get_new_above(). Signed-off-by: NRoland Dreier <rolandd@cisco.com> Acked-by: NSteve Wise <swise@opengridcomputing.com>
-
- 29 3月, 2008 1 次提交
-
-
由 Roland Dreier 提交于
Because of a typo in iwch_accept_cr(), the cxgb3 connection handling code programs the hardware IRD (incoming RDMA read queue depth) with the value that is passed in for the ORD (outgoing RDMA read queue depth). In particular this means that if an application passes in IRD > 0 and ORD = 0 (which is a completely sane and valid thing to do for an app that expects only incoming RDMA read requests), then the hardware will end up programmed with IRD = 0 and the app will fail in a mysterious way. Fix this by using "ep->ird" instead of "ep->ord" in the intended place. Signed-off-by: NRoland Dreier <rolandd@cisco.com> Acked-by: NSteve Wise <swise@opengridcomputing.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 10 3月, 2008 1 次提交
-
-
由 Jon Mason 提交于
The cxbg3 driver is unnecessarily decreasing the number of CQ entries by one when creating a CQ. This will cause the CQ not to have as many entries as requested by the user if the user requests a power of 2 size. Signed-off-by: NJon Mason <jon@opengridcomputing.com> Acked-by: NSteve Wise <swise@opengridcomputing.com> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
- 01 3月, 2008 1 次提交
-
-
由 Jon Mason 提交于
Set cap.max_inline_data to the actual max inline data that the adapter support, so that userspace apps see the right value returned. Signed-off-by: NJon Mason <jon@opengridcomputing.com> Acked-by: NSteve Wise <swise@opengridcomputing.com> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
- 26 2月, 2008 1 次提交
-
-
由 Bryan Rosenburg 提交于
A single entry (addr 0x10001000, size 0x2000) will get converted to page address 0x10000000 with a page size of 0x4000. The code as it stands doesn't address the single buffer case, but in fact it allows the subsequent single-buffer special case to be eliminated entirely. Because the mask now includes the (page adjusted) starting and ending addresses, the general case works for the single buffer case as well. Signed-off-by: NBryan Rosenburg <rosnbrg@us.ibm.com> Acked-by: NSteve Wise <swise@opengridcomputing.com> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
- 13 2月, 2008 1 次提交
-
-
由 Steve Wise 提交于
The cxgb3 HW and driver don't support loopback RDMA connections. So fail any connection attempt where the destination address is local. Signed-off-by: NSteve Wise <swise@opengridcomputing.com> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
- 29 1月, 2008 2 次提交
-
-
由 Denis V. Lunev 提交于
Needed to propagate it down to the __ip_route_output_key. Signed_off_by: Denis V. Lunev <den@openvz.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 WANG Cong 提交于
This patch removes TOPDIR from infiniband Makefile and delete one include statement pointing to a non-existing directory Cc: Roland Dreier <rolandd@cisco.com> Cc: Sean Hefty <mshefty@ichips.intel.com> Cc: Hal Rosenstock <hal.rosenstock@gmail.com> Signed-off-by: NWANG Cong <xiyou.wangcong@gmail.com> Signed-off-by: NSam Ravnborg <sam@ravnborg.org>
-
- 26 1月, 2008 8 次提交
-
-
由 Steve Wise 提交于
Correctly work around T3A issues by checking "hwtype != T3A" instead of "hwtype == T3B". This will be needed for new hardware types. Signed-off-by: NSteve Wise <swise@opengridcomputing.com> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
由 Steve Wise 提交于
This is needed to support zero-stag properly. Signed-off-by: NSteve Wise <swise@opengridcomputing.com> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
由 Steve Wise 提交于
The existing logic incorrectly maps this buffer list: 0: addr 0x10001000, size 0x1000 1: addr 0x10002000, size 0x1000 To this bogus page list: 0: 0x10000000 1: 0x10002000 The shift calculation must also take into account the address of the first entry masked by the page_mask as well as the last address+size rounded up to the next page size. Signed-off-by: NSteve Wise <swise@opengridcomputing.com> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
由 Steve Wise 提交于
- for kernel mode cqs, call event notification handler when flushing. - flush QP when moving from RTS -> CLOSING. - fix logic to identify a kernel mode qp. Signed-off-by: NSteve Wise <swise@opengridcomputing.com> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
由 Roland Dreier 提交于
t3_rdma_init_wr.irs is a big-endian field, so declare it as __be32. This fixes one sparse warning. Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
由 Joe Perches 提交于
Signed-off-by: NJoe Perches <joe@perches.com> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
由 Steve Wise 提交于
The 5.0 firmware now supports translating sgls in recv work requests, so remove the host driver logic currently doing the translation. Note: this change requires 5.0 firmware. Signed-off-by: NSteve Wise <swise@opengridcomputing.com> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
由 Steve Wise 提交于
Currently the call into cxgb3 to get the driver info is not serialized. The iw_cxgb3 module needs to hold the rtnl_lock around the ethtool ops call like dev_ioctl() does. Signed-off-by: NSteve Wise <swise@opengridcomputing.com> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
- 14 11月, 2007 1 次提交
-
-
由 Steve Wise 提交于
The device attribute max_qp_init_rd_atom is not getting set in cxgb3's query_device method. Version 1.0.4 of librdmacm now validates the user's requested initiator and responder resources against the max supported by the device. Since iw_cxgb3 wasn't setting this attribute (and it defaulted to 0), all rdma_connect()s fail if there are initiator resources requested by the app. Fix this by setting the correct value in iwch_query_device(). Signed-off-by: NSteve Wise <swise@opengridcomputing.com> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
- 11 10月, 2007 1 次提交
-
-
由 Eric W. Biederman 提交于
This patch makes most of the generic device layer network namespace safe. This patch makes dev_base_head a network namespace variable, and then it picks up a few associated variables. The functions: dev_getbyhwaddr dev_getfirsthwbytype dev_get_by_flags dev_get_by_name __dev_get_by_name dev_get_by_index __dev_get_by_index dev_ioctl dev_ethtool dev_load wireless_process_ioctl were modified to take a network namespace argument, and deal with it. vlan_ioctl_set and brioctl_set were modified so their hooks will receive a network namespace argument. So basically anthing in the core of the network stack that was affected to by the change of dev_base was modified to handle multiple network namespaces. The rest of the network stack was simply modified to explicitly use &init_net the initial network namespace. This can be fixed when those components of the network stack are modified to handle multiple network namespaces. For now the ifindex generator is left global. Fundametally ifindex numbers are per namespace, or else we will have corner case problems with migration when we get that far. At the same time there are assumptions in the network stack that the ifindex of a network device won't change. Making the ifindex number global seems a good compromise until the network stack can cope with ifindex changes when you change namespaces, and the like. Signed-off-by: NEric W. Biederman <ebiederm@xmission.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 10 10月, 2007 1 次提交
-
-
由 Steve Wise 提交于
Allow changing parameter values without having to reload the module. This is safe because these parameters are only looked at when a new connection is established. Signed-off-by: NSteve Wise <swise@opengridcomputing.com> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
- 31 8月, 2007 1 次提交
-
-
由 Divy Le Ray 提交于
cxgb3 used netdev_priv() and dev->priv for different purposes. In 2.6.23, netdev_priv() == dev->priv, cxgb3 needs a fix. This patch is a partial backport of Dave Miller's changes in the net-2.6.24 git branch. Without this fix, cxgb3 crashes on 2.6.23. Signed-off-by: NDivy Le Ray <divy@chelsio.com> Signed-off-by: NJeff Garzik <jeff@garzik.org>
-
- 04 8月, 2007 1 次提交
-
-
由 Steve Wise 提交于
This avoids deadlocks. Signed-off-by: NSteve Wise <swise@opengridcomputing.com> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
- 20 7月, 2007 1 次提交
-
-
由 Yoann Padioleau 提交于
Transform some calls to kmalloc/memset to a single kzalloc (or kcalloc). Here is a short excerpt of the semantic patch performing this transformation: @@ type T2; expression x; identifier f,fld; expression E; expression E1,E2; expression e1,e2,e3,y; statement S; @@ x = - kmalloc + kzalloc (E1,E2) ... when != \(x->fld=E;\|y=f(...,x,...);\|f(...,x,...);\|x=E;\|while(...) S\|for(e1;e2;e3) S\) - memset((T2)x,0,E1); @@ expression E1,E2,E3; @@ - kzalloc(E1 * E2,E3) + kcalloc(E1,E2,E3) [akpm@linux-foundation.org: get kcalloc args the right way around] Signed-off-by: NYoann Padioleau <padator@wanadoo.fr> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Acked-by: NRussell King <rmk@arm.linux.org.uk> Cc: Bryan Wu <bryan.wu@analog.com> Acked-by: NJiri Slaby <jirislaby@gmail.com> Cc: Dave Airlie <airlied@linux.ie> Acked-by: NRoland Dreier <rolandd@cisco.com> Cc: Jiri Kosina <jkosina@suse.cz> Acked-by: NDmitry Torokhov <dtor@mail.ru> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: NMauro Carvalho Chehab <mchehab@infradead.org> Acked-by: NPierre Ossman <drzeus-list@drzeus.cx> Cc: Jeff Garzik <jeff@garzik.org> Cc: "David S. Miller" <davem@davemloft.net> Acked-by: NGreg KH <greg@kroah.com> Cc: James Bottomley <James.Bottomley@steeleye.com> Cc: "Antonino A. Daplas" <adaplas@pol.net> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 18 7月, 2007 1 次提交
-
-
由 Steve Wise 提交于
Signed-off-by: NSteve Wise <swise@opengridcomputing.com> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
- 10 7月, 2007 1 次提交
-
-
由 Jan Engelhardt 提交于
Change Kconfig objects from "menu, config" into "menuconfig" so that the user can disable the whole feature without having to enter the menu first. Signed-off-by: NJan Engelhardt <jengelh@gmx.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-