- 22 10月, 2011 4 次提交
-
-
由 Mike Marciniszyn 提交于
The heavy weight spinlock in qib_lookup_qpn() is replaced with RCU. The hash list itself is now accessed via jhash functions instead of mod. The changes should benefit multiple receive contexts in different processors by not contending for the lock just to read the hash structures. The patch also adds a lookaside_qp (pointer) and a lookaside_qpn in the context. The interrupt handler will test the current packet's qpn against lookaside_qpn if the lookaside_qp pointer is non-NULL. The pointer is NULL'ed when the interrupt handler exits. Signed-off-by: NMike Marciniszyn <mike.marciniszyn@qlogic.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Mike Marciniszyn 提交于
The context init now saves a shift from rcvegrbufs_perchunk rcvegrbufs_perchunk_shift using ilog2. A BUG_ON() protects the power of 2 assumption. Signed-off-by: NMike Marciniszyn <mike.marciniszyn@qlogic.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Mike Marciniszyn 提交于
Store both the encoded and decoded MTU in the QP structure as a minor optimization for UC/RC receive routines. Signed-off-by: NMike Marciniszyn <mike.marciniszyn@qlogic.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Mike Marciniszyn 提交于
The memset for zeroing work completions had been unconditional. This patch removes the memset and moves the zeroing into the work completion with a more explicit field by field set. With this patch, non-ONLY/non-LAST packets will avoid the overhead since they will not generate a completion. Signed-off-by: NMike Marciniszyn <mike.marciniszyn@qlogic.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
- 07 10月, 2011 1 次提交
-
-
由 Mike Marciniszyn 提交于
The code that was recently introduced to report the number of free contexts is flawed for multiple HCAs: /* Return the number of free user ports (contexts) available. */ return scnprintf(buf, PAGE_SIZE, "%u\n", dd->cfgctxts - dd->first_user_ctxt - (u32)qib_stats.sps_ctxts); The qib_stats is global to the module, not per HCA, so the code is broken for multiple HCAs. This patch adds a qib_devdata field, freectxts, that reflects the free contexts for this HCA. Signed-off-by: NMike Marciniszyn <mike.marciniszyn@qlogic.com> Reviewed-by: NRam Vepa <ram.vepa@qlogic.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
- 23 7月, 2011 1 次提交
-
-
由 Mike Marciniszyn 提交于
With ib_qib options: options ib_qib krcvqs=1 pcie_caps=0x51 rcvhdrcnt=4096 singleport=1 ibmtu=4 a run of ib_write_bw -a yields the following: ------------------------------------------------------------------ #bytes #iterations BW peak[MB/sec] BW average[MB/sec] 1048576 5000 2910.64 229.80 ------------------------------------------------------------------ The top cpu use in a profile is: CPU: Intel Architectural Perfmon, speed 2400.15 MHz (estimated) Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 1002300 Counted LLC_MISSES events (Last level cache demand requests from this core that missed the LLC) with a unit mask of 0x41 (No unit mask) count 10000 samples % samples % app name symbol name 15237 29.2642 964 17.1195 ib_qib.ko qib_7322intr 12320 23.6618 1040 18.4692 ib_qib.ko handle_7322_errors 4106 7.8860 0 0 vmlinux vsnprintf Analysis of the stats, profile, the code, and the annotated profile indicate: - All of the overflow interrupts (one per packet overflow) are serviced on CPU0 with no mitigation on the frequency. - All of the receive interrupts are being serviced by CPU0. (That is the way truescale.cmds statically allocates the kctx IRQs to CPU) - The code is spending all of its time servicing QIB_I_C_ERROR RcvEgrFullErr interrupts on CPU0, starving the packet receive processing. - The decode_err routine is very inefficient, using a printf variant to format a "%s" and continues to loop when the errs mask has been cleared. - Both qib_7322intr and handle_7322_errors read pci registers, which is very inefficient. The fix does the following: - Adds a tasklet to service QIB_I_C_ERROR - Replaces the very inefficient scnprintf() with a memcpy(). A field is added to qib_hwerror_msgs to save the sizeof("string") at compile time so that a strlen is not needed during err_decode(). - The most frequent errors (Overflows) are serviced first to exit the loop as early as possible. - The loop now exits as soon as the errs mask is clear rather than fruitlessly looping through the msp array. With this fix the performance changes to: ------------------------------------------------------------------ #bytes #iterations BW peak[MB/sec] BW average[MB/sec] 1048576 5000 2990.64 2941.35 ------------------------------------------------------------------ During testing of the error handling overflow patch, it was determined that some CPU's were slower when servicing both overflow and receive interrupts on CPU0 with different MSI interrupt vectors. This patch adds an option (krcvq01_no_msi) to not use a dedicated MSI interrupt for kctx's < 2 and to service them on the default interrupt. For some CPUs, the cost of the interrupt enter/exit is more costly than then the additional PCI read in the default handler. Signed-off-by: NMike Marciniszyn <mike.marciniszyn@qlogic.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
- 19 7月, 2011 7 次提交
-
-
由 Or Gerlitz 提交于
Move the various definitions and mad structures needed for software implementation of IBA PM agent from the ipath and qib drivers into a single include file, which in turn could be used by more consumers. Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.co.il> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Mitko Haralanov 提交于
Update the active link width on QLE7220 chips when link goes down if chip width does not match shadowed width. Signed-off-by: NMitko Haralanov <mitko@qlogic.com> Signed-off-by: NMike Marciniszyn <mike.marciniszyn@qlogic.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Ram Vepa 提交于
There is a possibility of a deadlock due to the way locks are acquired and released in qib_set_uevent_bits(). The function qib_set_uevent_bits() is called in process context and it uses spin_lock() and spin_unlock(). This same lock is acquired/released in interrupt context which can lead to a deadlock when running on the same cpu. The fix is to replace spin_lock() and spin_unlock() with spin_lock_irqsave() and spin_unlock_irqrestore() respectively in qib_set_uevent_bits(). Signed-off-by: NMike Marciniszyn <mike.marciniszyn@qlogic.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Ram Vepa 提交于
Indicate the number of free user contexts via the sysfs file /sys/class/infiniband/qib0/nfreectxts as required for PSM. Signed-off-by: NMike Marciniszyn <mike.marciniszyn@qlogic.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Edwin van Vliet 提交于
Signed-off-by: NEdwin van Vliet <edwin@cheatah.nl> Reviewed-by: NJesper Juhl <jj@chaosbits.net> Acked-by: NMike Marciniszyn <mike.marciniszyn@qlogic.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Jon Mason 提交于
The PCIE capability offset is saved during PCI bus walking. It will remove an unnecessary search in the PCI configuration space if this value is referenced instead of reacquiring it. Signed-off-by: NJon Mason <jdmason@kudzu.us> Acked-by: NMike Marciniszyn <mike.marciniszyn@qlogic.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Motohiro KOSAKI 提交于
Adapt to use new APIs. We plan to remove old one later and plan to change current->cpus_allowed implementation. No functional change. Signed-off-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Acked-by: NMike Marciniszyn <mike.marciniszyn@qlogic.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
- 18 6月, 2011 1 次提交
-
-
由 Mitko Haralanov 提交于
Due to timing, it is possible for the LOS and DFE to remain on. This is due to the link progressing to LinkUP prior to the driver getting the first Status Changed interrupt. By expanding the conditions under which LOS is turned off and DFE timeout is being set, timing is no longer an issue. Signed-off-by: NMitko Haralanov <mitko@qlogic.com> Signed-off-by: NMike Marciniszyn <mike.marciniszyn@qlogic.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
- 21 5月, 2011 1 次提交
-
-
由 Roland Dreier 提交于
Add basic RDMA netlink infrastructure that allows for registration of RDMA clients for which data is to be exported and supplies message construction callbacks. Signed-off-by: NNir Muchtar <nirm@voltaire.com> [ Reorganize a few things, add CONFIG_NET dependency. - Roland ] Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
- 12 5月, 2011 1 次提交
-
-
由 Sergei Shtylyov 提交于
The driver reads PCI revision ID from the PCI configuration register while it's already stored by PCI subsystem in the revision field of struct pci_dev. Signed-off-by: NSergei Shtylyov <sshtylyov@ru.mvista.com> Acked-by: NMike Marciniszyn <mike.marciniszyn@qlogic.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
- 10 5月, 2011 1 次提交
-
-
由 Mitko Haralanov 提交于
The time limit test now correctly checks against current jiffies to avoid the hang. Signed-off-by: NMitko Haralanov <mitko@qlogic.com> Signed-off-by: NMike Marciniszyn <mike.marciniszyn@qlogic.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
- 27 4月, 2011 1 次提交
-
-
由 Lucas De Marchi 提交于
These changes were incorrectly fixed by codespell. They were now manually corrected. Signed-off-by: NLucas De Marchi <lucas.demarchi@profusion.mobi>
-
- 31 3月, 2011 1 次提交
-
-
由 Lucas De Marchi 提交于
Fixes generated by 'codespell' and manually reviewed. Signed-off-by: NLucas De Marchi <lucas.demarchi@profusion.mobi>
-
- 18 3月, 2011 1 次提交
-
-
由 Huang Ying 提交于
In most cases, get_user_pages and get_user_pages_fast should be used to pin user pages in memory. But sometimes, some special flags except FOLL_GET, FOLL_WRITE and FOLL_FORCE are needed, for example in following patch, KVM needs FOLL_HWPOISON. To support these users, __get_user_pages is exported directly. There are some symbol name conflicts in infiniband driver, fixed them too. Signed-off-by: NHuang Ying <ying.huang@intel.com> CC: Andrew Morton <akpm@linux-foundation.org> CC: Michel Lespinasse <walken@google.com> CC: Roland Dreier <roland@kernel.org> CC: Ralph Campbell <infinipath@qlogic.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
-
- 15 3月, 2011 2 次提交
-
-
由 Mitko Haralanov 提交于
Set the M_Key field in SubnGet and SugnGetResp MADs based on correctly interpreting the protection level specified in the M_KeyProtBits field. Signed-off-by: NMitko Haralanov <mitko@qlogic.com> Signed-off-by: NMike Marciniszyn <mike.marciniszyn@qlogic.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Mitko Haralanov 提交于
For active and far-EQ cables use an LE2 value of 0 for improved SI. Signed-off-by: NMitko Haralanov <mitko@qlogic.com> Signed-off-by: NMike Marciniszyn <mike.marciniszyn@qlogic.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
- 23 2月, 2011 1 次提交
-
-
由 Mitko Haralanov 提交于
Fix a bug which causes the driver to return incorrect MADs as a response to Set(PortInfo) which sets the link width to 0xFF or link speed to 0xF. Signed-off-by: NMitko Haralanov <mitko@qlogic.com> Signed-off-by: NMike Marciniszyn <mike.marciniszyn@qlogic.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
- 18 2月, 2011 1 次提交
-
-
由 Mike Marciniszyn 提交于
There is a double completion associated with error handling for RC QPs. The sequence is: - The do_rc_ack() routine fields an RNR nack and there are 0 rnr_retries configured on the QP. - qib_error_qp() stops the pending timer - qib_rc_send_complete() is called from sdma_complete() - qib_rc_send_complete() starts the timer because the msb of the psn just completed says an ack is needed. - a bunch of flushes occur as ipoib posts WQEs to an error'ed QP - rc_timeout() calls qib_restart_rc() - qib_restart_rc() calls qib_send_complete() with a IB_WC_RETRY_EXC_ERR on a wqe that has already been completed in the past The fix avoids starting the timer since another packet will never arrive. Signed-off-by: NMike Marciniszyn <mike.marciniszyn@qlogic.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
- 11 2月, 2011 1 次提交
-
-
由 Mike Marciniszyn 提交于
The following panic BUG_ON occurs during qib testing: Kernel BUG at include/linux/timer.h:82 RIP [<ffffffff881f7109>] :ib_qib:start_timer+0x73/0x89 RSP <ffffffff80425bd0> <0>Kernel panic - not syncing: Fatal exception <0>Dumping qib trace buffer from panic qib_set_lid INFO: IB0:1 got a lid: 0xf8 Done dumping qib trace buffer BUG: warning at kernel/panic.c:137/panic() (Tainted: G The flaw is due to a missing state test when processing responses that results in an add_timer() call when the same timer is already queued. This code was executing in parallel with a QP destroy on another CPU that had changed the state to reset, but the missing test caused to response handling code to run on into the panic. Signed-off-by: NMike Marciniszyn <mike.marciniszyn@qlogic.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
- 29 1月, 2011 1 次提交
-
-
由 Mitko Haralanov 提交于
Hold the IB link at DISABLED until we get the correct TX settings on mezz boards. Signed-off-by: NMike Marciniszyn <mike.marciniszyn@qlogic.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
- 17 1月, 2011 1 次提交
-
-
由 Tejun Heo 提交于
* ib_wq is added, which is used as the common workqueue for infiniband instead of the system workqueue. All system workqueue usages including flush_scheduled_work() callers are converted to use and flush ib_wq. * cancel_delayed_work() + flush_scheduled_work() converted to cancel_delayed_work_sync(). * qib_wq is removed and ib_wq is used instead. This is to prepare for deprecation of flush_scheduled_work(). Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
- 13 1月, 2011 1 次提交
-
-
由 Joe Perches 提交于
Signed-off-by: NJoe Perches <joe@perches.com> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
- 11 1月, 2011 12 次提交
-
-
由 Mike Marciniszyn 提交于
The mr optimization introduced a reference count leak on an exception test. The lock/refcount manipulation is moved down and the problematic exception test now calls bail to insure that the lock is released. Additional fixes as suggested by Ralph Campbell <ralph.campbell@qlogic.org>: - reduce lock scope of dma regions - use explicit values on returns vs. automatic ret value Signed-off-by: NMike Marciniszyn <mike.marciniszyn@qlogic.com> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
由 Mike Marciniszyn 提交于
Improve the QMH SERDES tunning on initial driver load by having the driver go through a link state change. Signed-off-by: NMike Marciniszyn <mike.marciniszyn@qlogic.com> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
由 Mike Marciniszyn 提交于
Currently on receipt of a response message (ACKs, RDMA Response, Atomic Responses etc.) if the SDMA completion counter is not advanced the driver delays the completion of the WQE. In most cases this is overly pessimistic as the response (ACK) to a previously transmitted send implies that the send is complete. Ensure that SDMA queue is progressed appropriately before determining if a send has delayed completions. Signed-off-by: NMike Marciniszyn <mike.marciniszyn@qlogic.com> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
由 Mike Marciniszyn 提交于
Under congestion resulting in eager buffer overflow attempt to send pre-emptive NAKs if header queue entries with TID errors are generated and a valid header is present. This prevents long timeouts and flow restarts if a trailing set of packets are dropped due to eager overflows. Pre-emptive NAKs are currently only supported for RDMA writes. Signed-off-by: NMike Marciniszyn <mike.marciniszyn@qlogic.com> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
由 Mike Marciniszyn 提交于
The current code loops during rkey/lkey validiation to isolate the MR for the RDMA, which is expensive when the current operation is inside a very large memory region. This fix optimizes rkey/lkey validation routines for user memory regions and fast memory regions. The MR entry can be isolated by shifts/mods instead of looping. The existing loop is preserved for phys memory regions for now. Signed-off-by: NMike Marciniszyn <mike.marciniszyn@qlogic.com> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
由 Mike Marciniszyn 提交于
Changing from +1 to +2 allows for better QP distribution across receive contexts. Signed-off-by: NMike Marciniszyn <mike.marciniszyn@qlogic.com> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
由 Mike Marciniszyn 提交于
The upstream code was missing part of a receive/error race fix from the internal tree. Add the missing part, which makes future merges possible. Signed-off-by: NMike Marciniszyn <mike.marciniszyn@qlogic.com> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
由 Mike Marciniszyn 提交于
The basic idea is that on SusieQ, the difficult part of mapping QPN to context is handled by the mapping registers so the generic QPN allocation doesn't need to worry about chip specifics. For Monty and Linda, there is no mapping table so the qpt->mask (same as dd->qpn_mask), is used to see if the QPN to context falls within [zero..dd->n_krcv_queues). Signed-off-by: NMike Marciniszyn <mike.marciniszyn@qlogic.com> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
由 Mike Marciniszyn 提交于
For SusieQ we need to write to the interrupt timer register before updating the header queue head with interrupt count. This is to ensure that the timer is enabled properly and a receive available interrupt is delivered. Otherwise this interrupt can be lost if the receiver header/eager queues are full before the timer is enabled. Signed-off-by: NMike Marciniszyn <mike.marciniszyn@qlogic.com> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
由 Mike Marciniszyn 提交于
Avoid duplicate writes to the head register as this can lead to lost interrupts if the context goes full before the second write is done. Signed-off-by: NMike Marciniszyn <mike.marciniszyn@qlogic.com> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
由 Mike Marciniszyn 提交于
Add new SERDES tuning to aid manufacturing. Signed-off-by: NMike Marciniszyn <mike.marciniszyn@qlogic.com> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
由 Mike Marciniszyn 提交于
Reset the list pointers after freeing the SDMA packet list. This is done to any potential double-free cases. Signed-off-by: NMike Marciniszyn <mike.marciniszyn@qlogic.com> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-