- 18 Sep 2012, 1 commit
-
-
By Wei Yongjun
Use kmem_cache_zalloc() instead of kmem_cache_alloc() followed by memset(). spatch with a semantic match was used to find this problem (http://coccinelle.lip6.fr/).

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
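The rewrite this semantic patch performs is mechanical; a minimal sketch of the before/after pattern, with an illustrative cache name and gfp flag (not necessarily the ones in this driver):

    /* before: allocate, then zero by hand */
    desc = kmem_cache_alloc(desc_cache, GFP_ATOMIC);
    if (desc)
            memset(desc, 0, sizeof(*desc));

    /* after: a single call that returns zeroed memory */
    desc = kmem_cache_zalloc(desc_cache, GFP_ATOMIC);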
-
- 06 Apr 2012, 1 commit
-
-
By Dave Jiang
The alloc order can be up to 16, and 1 << 16 overflows a 16-bit integer. Change the appropriate variables to 32-bit to avoid the overflow.

Reported-by: Jim Harris <james.r.harris@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
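A hedged illustration of the overflow, using kernel fixed-width types and invented variable names:

    u16 narrow = 1 << 16;   /* 1 << 16 == 65536, truncated to 0 in 16 bits */
    u32 wide   = 1 << 16;   /* 65536 fits comfortably in 32 bits */

The shift itself is computed as a 32-bit int, so the damage happens at the assignment: a ring sized with the narrow variable would appear to hold zero descriptors.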
-
- 24 Mar 2012, 1 commit
-
-
By Dan Williams
Starting with v3.2 Jonathan reports that Xen crashes loading the ioatdma driver. A debug run shows:

  ioatdma 0000:00:16.4: desc[0]: (0x300cc7000->0x300cc7040) cookie: 0 flags: 0x2 ctl: 0x29 (op: 0 int_en: 1 compl: 1)
  ...
  ioatdma 0000:00:16.4: ioat_get_current_completion: phys_complete: 0xcc7000

...which shows that in this environment GFP_KERNEL memory may be backed by a 64-bit dma address. This breaks the driver's assumption that an unsigned long should be able to contain the physical address for descriptor memory. Switch to dma_addr_t, which beyond being the right size is the true type for the data, i.e. an io-virtual address indicating the engine's last processed descriptor.

[stable: 3.2+]
Cc: <stable@vger.kernel.org>
Reported-by: Jonathan Nieder <jrnieder@gmail.com>
Reported-by: William Dauchy <wdauchy@gmail.com>
Tested-by: William Dauchy <wdauchy@gmail.com>
Tested-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
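The fix reduces to a type change; a sketch with an illustrative field name:

    /* before: on a 32-bit kernel, unsigned long silently truncates dma
     * addresses above 4 GiB -- exactly the 0x300cc7000 -> 0xcc7000 loss
     * in the log above */
    unsigned long phys_complete;

    /* after: dma_addr_t is sized for the platform's dma addressing */
    dma_addr_t phys_complete;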
-
- 13 Mar 2012, 4 commits
-
-
By Russell King - ARM Linux
Provide a common function to do the cookie mechanics for completing a DMA descriptor.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Tested-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Jassi Brar <jassisinghbrar@gmail.com>
[imx-sdma.c & mxs-dma.c]
Tested-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
-
By Russell King - ARM Linux
Everyone deals with assigning DMA cookies in the same way (it's part of the API, so they should be), so let's consolidate the common code into a helper function to avoid this duplication.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Tested-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Jassi Brar <jassisinghbrar@gmail.com>
[imx-sdma.c & mxs-dma.c]
Tested-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
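These two patches leave each driver calling a pair of helpers; a sketch of the consolidated code, modeled on what drivers/dma/dmaengine.h provides (treat details as approximate):

    /* assign the next cookie to a descriptor at submit time */
    static inline dma_cookie_t dma_cookie_assign(struct dma_async_tx_descriptor *tx)
    {
            struct dma_chan *chan = tx->chan;
            dma_cookie_t cookie = chan->cookie + 1;

            if (cookie < DMA_MIN_COOKIE)    /* cookies wrap but stay positive */
                    cookie = DMA_MIN_COOKIE;
            tx->cookie = chan->cookie = cookie;
            return cookie;
    }

    /* record a descriptor as the channel's last completed transaction */
    static inline void dma_cookie_complete(struct dma_async_tx_descriptor *tx)
    {
            tx->chan->completed_cookie = tx->cookie;
            tx->cookie = 0;
    }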
-
By Russell King - ARM Linux
Add a local private header file to contain definitions and declarations which should only be used by DMA engine drivers. We also fix linux/dmaengine.h to use LINUX_DMAENGINE_H to guard against multiple inclusion.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Tested-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Jassi Brar <jassisinghbrar@gmail.com>
[imx-sdma.c & mxs-dma.c]
Tested-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
-
By Russell King - ARM Linux
Every DMA engine implementation declares a last completed dma cookie in its private dma channel structure. This is pointless, and forces driver-specific code. Move this out into the common dma_chan structure.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Tested-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Jassi Brar <jassisinghbrar@gmail.com>
[imx-sdma.c & mxs-dma.c]
Tested-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
-
- 27 May 2011, 1 commit
-
-
By Dimitri Sivanich
For certain system configurations a 5 usec udelay before checking I/OAT DMA channel status is sometimes not sufficient, resulting in a false failure status and unnecessary freeing of channel resources. Conversely, for many configurations 5 usec is longer than necessary. Loop for up to 20 usec waiting for successful status before failing.

Signed-off-by: Dimitri Sivanich <sivanich@sgi.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
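A sketch of the shape of that change; the status accessor and success test are invented stand-ins for the driver's real ones:

    u32 status;
    int i;

    /* before: a single fixed delay, then one status check */
    udelay(5);

    /* after: poll in 1 us steps for up to 20 us, exiting early on success */
    for (i = 0; i < 20; i++) {
            udelay(1);
            status = read_chan_status(chan);   /* hypothetical accessor */
            if (chan_status_ok(status))        /* hypothetical test */
                    break;
    }
    if (i == 20)
            return -ENODEV;   /* still failing after 20 us */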
-
- 23 May 2011, 1 commit
-
-
By Paul Gortmaker
After discovering that wide use of prefetch on modern CPUs could be a net loss instead of a win, net drivers which were relying on the implicit inclusion of prefetch.h via the list headers showed up in the resulting cleanup fallout. Give them an explicit include via the following $0.02 script.

=========================================
#!/bin/bash
MANUAL=""
for i in `git grep -l 'prefetch(.*)' .` ; do
	grep -q '<linux/prefetch.h>' $i
	if [ $? = 0 ] ; then
		continue
	fi

	( echo '?^#include <linux/?a'
	  echo '#include <linux/prefetch.h>'
	  echo .
	  echo w
	  echo q
	) | ed -s $i > /dev/null 2>&1

	if [ $? != 0 ]; then
		echo $i needs manual fixup
		MANUAL="$i $MANUAL"
	fi
done
echo ------------------- 8\<----------------------
echo vi $MANUAL
=========================================

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
[ Fixed up some incorrect #include placements, and added some non-network drivers and the fib_trie.c case - Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 14 Oct 2010, 1 commit
-
-
By Dan Williams
Commit 07934481 "DMAENGINE: generic channel status v2" changed the interface for how dma channel progress is retrieved. It inadvertently exported an internal helper function ioat_tx_status() instead of ioat_dma_tx_status(). The latter polls the hardware to get the latest completion state, while the helper just evaluates the current state without touching hardware. The effect is that we end up waiting for completion timeouts or descriptor allocation errors before the completion state is updated.

iperf (before fix):
[SUM]  0.0-41.3 sec   364 MBytes  73.9 Mbits/sec

iperf (after fix):
[SUM]  0.0- 4.5 sec   499 MBytes   940 Mbits/sec

This is a regression starting with 2.6.35.

Cc: <stable@kernel.org>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
Cc: Linus Walleij <linus.walleij@stericsson.com>
Cc: Maciej Sosnowski <maciej.sosnowski@intel.com>
Reported-by: Richard Scobie <richard@sauce.co.nz>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
-
- 05 Aug 2010, 1 commit
-
-
By Dan Williams
On some platforms (MacPro3,1) the BIOS assigns the ioatdma device to the incorrect iommu, causing faults when the driver initializes. Add a quirk to catch this misconfiguration and try falling back to untranslated operation (which works in the MacPro3,1 case).

Assuming there are other platforms with misconfigured iommus, teach the ioatdma driver to treat initialization failures as non-fatal (just fail the driver load and emit a warning instead of triggering a BUG_ON).

This can be classified as a boot regression since 2.6.32 on affected platforms, since the ioatdma module did not autoload prior to that kernel.

Cc: <stable@kernel.org>
Acked-by: David Woodhouse <David.Woodhouse@intel.com>
Reported-by: Chris Li <lkml@chrisli.org>
Tested-by: Chris Li <lkml@chrisli.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
-
- 02 May 2010, 2 commits
-
-
By Dan Williams
Use separate locks for the descriptor prep (producer) and descriptor cleanup (consumer) paths. This allows the producer path to run concurrently with the cleanup path. Inspired by Documentation/circular-buffer.txt.

Cc: David Howells <dhowells@redhat.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
-
By Dan Williams
Use the common power-of-2 circular buffer macros.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
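Combined with the split locks above, this is the classic single-producer/single-consumer ring; a sketch using the common macros from linux/circ_buf.h (the ring fields are illustrative):

    #include <linux/circ_buf.h>

    #define RING_SIZE (1 << alloc_order)   /* must be a power of 2 */

    /* producer path, under the prep lock */
    if (CIRC_SPACE(ring->head, ring->tail, RING_SIZE) >= num_descs) {
            /* fill descriptors at head, then publish the new head */
            ring->head = (ring->head + num_descs) & (RING_SIZE - 1);
    }

    /* consumer path, under the cleanup lock */
    while (CIRC_CNT(ring->head, ring->tail, RING_SIZE) > 0) {
            /* complete the descriptor at tail, then advance it */
            ring->tail = (ring->tail + 1) & (RING_SIZE - 1);
    }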
-
- 30 Mar 2010, 1 commit
-
-
By Tejun Heo
include cleanup: update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h.

percpu.h is included by sched.h and module.h, and thus ends up being included when building most .c files. percpu.h includes slab.h, which in turn includes gfp.h, making everything defined by the two files universally available and complicating inclusion dependencies.

The percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities to include those headers directly instead of assuming availability. As this conversion needs to touch a large number of source files, the following script was used as the basis of conversion.

  http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the following:

* Scan files for gfp and slab usages and update includes such that only the necessary includes are there, i.e. if only gfp is used, gfp.h; if slab is used, slab.h.

* When the script inserts a new include, it looks at the include blocks and tries to place the new include such that its order conforms to its surroundings. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree, or at the end if there doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly because the file doesn't have a fitting include block), it prints out an error message indicating which .h file needs to be added to the file.

The conversion was done in the following steps:

1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files.

2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition, while adding it to the implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files.

3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed, e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs, requiring slab.h to be added manually.

5. The script was run on all .h files, but without automatically editing them, as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored, as stuff from gfp.h was usually widely available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary.

6. percpu.h was updated not to include slab.h.

7. Build tests were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64, which failed due to missing writeq).

   * x86 and x86_64 UP and SMP allmodconfig and a custom test config
   * powerpc and powerpc64 SMP allmodconfig
   * sparc and sparc64 SMP allmodconfig
   * ia64 SMP allmodconfig
   * s390 SMP allmodconfig
   * alpha SMP allmodconfig
   * um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as a bisection point.

Given that I had only a couple of failures from the build tests in step 7, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers, which should be easily discoverable on most builds of the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
-
- 27 Mar 2010, 1 commit
-
-
By Linus Walleij
Convert the device_is_tx_complete() operation on the DMA engine to a generic device_tx_status() operation which can return three states: DMA_TX_RUNNING, DMA_TX_COMPLETE, DMA_TX_PAUSED.

[dan.j.williams@intel.com: update for timberdale]
Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Cc: Maciej Sosnowski <maciej.sosnowski@intel.com>
Cc: Nicolas Ferre <nicolas.ferre@atmel.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Li Yang <leoli@freescale.com>
Cc: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
Cc: Magnus Damm <damm@opensource.se>
Cc: Liam Girdwood <lrg@slimlogic.co.uk>
Cc: Joe Perches <joe@perches.com>
Cc: Roland Dreier <rdreier@cisco.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
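A sketch of a caller consulting the generalized operation; the enum spellings shown (DMA_IN_PROGRESS, DMA_PAUSED, DMA_SUCCESS) are the dmaengine API's names for the three states described above, and the surrounding code is illustrative:

    struct dma_tx_state state;
    enum dma_status status;

    status = chan->device->device_tx_status(chan, cookie, &state);
    switch (status) {
    case DMA_IN_PROGRESS:   /* running; 'state.residue' reports bytes left */
            pr_debug("residue: %u\n", state.residue);
            break;
    case DMA_PAUSED:        /* halted by a control operation */
            break;
    default:                /* DMA_SUCCESS: transaction complete */
            break;
    }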
-
- 04 Mar 2010, 3 commits
-
-
By Dan Williams
If the calling conventions of ->timer_fn() and ->cleanup_fn() are unified across hardware versions, we can drop parameters to ioat_init_channel() and unify ioat_is_dma_complete() implementations. Both ->timer_fn() and ->cleanup_fn() are modified to expect a struct dma_chan pointer.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
-
By Dan Williams
Since ioat_cleanup_preamble() and the update of the last completed descriptor are not synchronized, there is a chance that two cleanup threads can see descriptors to clean. If the first cleans up all pending descriptors then the second will trigger the BUG_ON.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
-
By Dan Williams
The pending == 2 case no longer exists in the driver, so we can use ioat2_ring_pending() outside the lock to determine if there might be any descriptors in the ring that the hardware has not seen.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
-
- 03 Feb 2010, 1 commit
-
-
By Dan Williams
Fix a typo in ioat2_quiesce: check that 'tmo' is zero, not 'end'. Also applies to 2.6.32.3.

Cc: <stable@kernel.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
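A hedged sketch of the corrected check; the quiesce loop is approximated from context, with an invented predicate:

    unsigned long end = jiffies + tmo;

    while (!channel_quiesced(chan)) {          /* hypothetical predicate */
            /* tmo == 0 means "wait forever"; testing 'end' here was the
             * bug, since end = jiffies + 0 is (almost) never zero */
            if (tmo && time_after(jiffies, end))
                    return -ETIMEDOUT;
            cpu_relax();
    }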
-
- 20 Dec 2009, 1 commit
-
-
By Dan Williams
Put the ioat2 and ioat3 state machines in the halted state with all errors cleared. The ioat1 init path is not disturbed for stability; there are no reported ioat1 initialization issues.

Cc: <stable@kernel.org>
Reported-by: Roland Dreier <rdreier@cisco.com>
Tested-by: Roland Dreier <rdreier@cisco.com>
Acked-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
-
- 20 Nov 2009, 1 commit
-
-
By Dan Williams
Modify is_ioat_bug() to catch all errors that are uncorrectable, or not currently handled.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
-
- 22 Sep 2009, 1 commit
-
-
By Andrew Morton
drivers/dma/ioat/dma_v2.c: In function 'ioat2_dma_prep_memcpy_lock':
drivers/dma/ioat/dma_v2.c:680: warning: 'hw' may be used uninitialized in this function
drivers/dma/ioat/dma_v2.c:681: warning: 'desc' may be used uninitialized in this function

Cc: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
-
- 17 Sep 2009, 1 commit
-
-
By Dan Williams
With the addition of ioat_max_alloc_order it is not clear what the maximum allocation order is, so document that in the modinfo. Also take the opportunity to kill a stray semicolon.

Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
-
- 09 Sep 2009, 15 commits
-
-
By Dan Williams
All the necessary fields for handling an ioat2,3 ring entry can fit into one cacheline. Move ->len prior to ->txd in struct ioat_ring_ent, and move allocation of these entries to a hw-cache-aligned kmem cache to reduce the number of cachelines dirtied for descriptor management.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
-
By Dan Williams
The cleanup routine for the raid cases imposes extra checks for handling raid descriptors and extended descriptors. If the channel does not support raid, it can avoid this extra overhead by using the ioat2 cleanup path.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
-
By Dan Williams
This adds a hardware-specific self test to be called from ioat_probe. In the ioat3 case we will have tests for all the different raid operations, while ioat1 and ioat2 will continue to just test memcpy.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
-
By Dan Williams
ioat3.2 adds xor offload support for up to 8 sources. It can also perform an xor-zero-sum operation to validate whether all given sources sum to zero, without writing to a destination. Xor descriptors differ from memcpy in that one operation may require multiple descriptors, depending on the number of sources; when the number of sources exceeds 5, an extended descriptor is needed. These descriptors need to be accounted for when updating the DMA_COUNT register.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
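The descriptor accounting reduces to a small rule; a sketch with an invented helper name:

    /* an xor with more than 5 sources spills the extra source addresses
     * into an extended descriptor; both slots must be counted when the
     * DMA_COUNT register is advanced */
    static int xor_desc_slots(int src_cnt)
    {
            return src_cnt > 5 ? 2 : 1;
    }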
-
By Dan Williams
Export driver attributes for diagnostic purposes:

'ring_size': total number of descriptors available to the engine
'ring_active': number of descriptors in-flight
'capabilities': supported operation types for this channel
'version': Intel(R) QuickData specification revision

This also allows some chattiness to be removed from the driver startup, as this information is now available via sysfs.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
-
By Dan Williams
Up until this point the drivers for Intel(R) QuickData Technology engines, specification versions 2 and 3, were mostly identical save for a few quirks. Version 3.2 hardware adds many new capabilities (like raid offload support) requiring some infrastructure that is not relevant for v2. For better code organization of the new functionality, move v3 and v3.2 support to its own file, dma_v3.c, and export some routines from the base files (dma.c and dma_v2.c) that can be reused directly. The first new capability included in this code reorganization is support for v3.2 memset operations.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
-
By Dan Williams
In preparation for adding more operation types to the ioat3 path, the driver needs to honor the DMA_PREP_FENCE flag. For example, the async_tx api will hand xor->memcpy->xor chains to the driver with the 'fence' flag set on the first xor and the memcpy operation. This flag in turn sets the 'fence' flag in the descriptor control field, telling the hardware that future descriptors in the chain depend on the result of the current descriptor, so wait for all writes to complete before starting the next operation. Note that ioat1 does not prefetch the descriptor chain, so it does not require/support fenced operations.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
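A sketch of how a submitter marks the ordering-sensitive steps; the prep callback follows the dmaengine API, and the buffers are assumed to be set up elsewhere:

    /* the first xor's result feeds the memcpy, so fence it: the engine
     * must drain its writes before starting the dependent operation */
    tx = chan->device->device_prep_dma_xor(chan, dest, srcs, src_cnt,
                                           len, DMA_PREP_FENCE);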
-
By Dan Williams
Increment the allocation order of the descriptor ring every time we run out of descriptors, up to a maximum allocation order specified by the module parameter 'ioat_max_alloc_order'. After each idle period decrement the allocation order to a minimum order of 'ioat_ring_alloc_order' (i.e. the default ring size, tunable as a module parameter).

Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
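A sketch of the grow/shrink policy; the two module parameters are the ones named above, while the fields and helpers are invented for illustration:

    /* grow when the ring is exhausted, up to the module-parameter cap */
    if (ring_exhausted(ioat) && ioat->alloc_order < ioat_max_alloc_order)
            reshape_ring(ioat, ioat->alloc_order + 1);

    /* shrink back toward the default after an idle period */
    if (ring_idle(ioat) && ioat->alloc_order > ioat_ring_alloc_order)
            reshape_ring(ioat, ioat->alloc_order - 1);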
-
By Dan Williams
In order to support dynamic resizing of the descriptor ring, or polling for a descriptor in the presence of a hung channel, the reset handler needs to make progress while in a non-preemptible context. The current workqueue implementation precludes polling channel reset completion under spin_lock(). This conversion also allows us to return to opportunistic cleanup in the ioat2 case, as the timer implementation guarantees at least one cleanup after every descriptor is submitted. This means the worst-case completion latency becomes the timer frequency (for exceptional circumstances), but with the benefit of avoiding busy waiting when the lock is contended.

Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
-
By Dan Williams
Mark all single-use initialization routines with __devinit.

Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
-
By Dan Williams
The register write in ioat_dma_cleanup_tasklet is unfortunate in two ways:
1/ It clears the extra 'enable' bits that we set at alloc_chan_resources time
2/ It gives the impression that it disables interrupts when it is in fact re-arming interrupts

[ Impact: fix, persist the value of the chanctrl register when re-arming ]

Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
-
By Dan Williams
Don't trust that the reserved bits are always zero; also sanity check the returned value.

Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
-
By Dan Williams
The cleanup path makes an effort to only perform an atomic read of the 64-bit completion address. However in the 32-bit case it does not matter if we read the upper-32 and lower-32 non-atomically, because the upper-32 will always be zero.

Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
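A sketch of why the relaxed read is safe on 32-bit, assuming a little-endian layout of the 64-bit completion word (details approximate):

    /* with 32-bit dma addressing the upper half is always zero, so the
     * two halves may be read without an atomic 64-bit load */
    static u64 read_completion_nonatomic(const volatile u64 *completion)
    {
            const volatile u32 *half = (const volatile u32 *)completion;

            return half[0] | ((u64)half[1] << 32);
    }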
-
By Dan Williams
Provide some output for debugging the driver.

Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
-
By Dan Williams
Replace the current linked list munged into a ring with a native ring buffer implementation. The benefit of this approach is reduced overhead, as many parameters can be derived from ring position with simple pointer comparisons, and descriptor allocation/freeing becomes just a manipulation of head/tail pointers. It requires a contiguous allocation for the software descriptor information.

Since this arrangement is significantly different from the ioat1 chain, move ioat2,3 support into its own file and header. Common routines are exported from drivers/dma/ioat/dma.[ch].

Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
-