- 09 9月, 2009 28 次提交
-
-
由 Dan Williams 提交于
Export driver attributes for diagnostic purposes: 'ring_size': total number of descriptors available to the engine 'ring_active': number of descriptors in-flight 'capabilities': supported operation types for this channel 'version': Intel(R) QuickData specfication revision This also allows some chattiness to be removed from the driver startup as this information is now available via sysfs. Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 Dan Williams 提交于
Up until this point the driver for Intel(R) QuickData Technology engines, specification versions 2 and 3, were mostly identical save for a few quirks. Version 3.2 hardware adds many new capabilities (like raid offload support) requiring some infrastructure that is not relevant for v2. For better code organization of the new funcionality move v3 and v3.2 support to its own file dma_v3.c, and export some routines from the base files (dma.c and dma_v2.c) that can be reused directly. The first new capability included in this code reorganization is support for v3.2 memset operations. Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 Dan Williams 提交于
ioat3.2 adds raid5 and raid6 offload capabilities. Signed-off-by: NTom Picard <tom.s.picard@intel.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 Dan Williams 提交于
In preparation for adding more operation types to the ioat3 path the driver needs to honor the DMA_PREP_FENCE flag. For example the async_tx api will hand xor->memcpy->xor chains to the driver with the 'fence' flag set on the first xor and the memcpy operation. This flag in turn sets the 'fence' flag in the descriptor control field telling the hardware that future descriptors in the chain depend on the result of the current descriptor, so wait for all writes to complete before starting the next operation. Note that ioat1 does not prefetch the descriptor chain, so does not require/support fenced operations. Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 Dan Williams 提交于
Some engines have transfer size and address alignment restrictions. Add a per-operation alignment property to struct dma_device that the async routines and dmatest can use to check alignment capabilities. Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 Dan Williams 提交于
No drivers currently implement these operation types, so they can be deleted. Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 Dan Williams 提交于
Channel switching is problematic for some dmaengine drivers as the architecture precludes separating the ->prep from ->submit. In these cases the driver can select ASYNC_TX_DISABLE_CHANNEL_SWITCH to modify the async_tx allocator to only return channels that support all of the required asynchronous operations. For example MD_RAID456=y selects support for asynchronous xor, xor validate, pq, pq validate, and memcpy. When ASYNC_TX_DISABLE_CHANNEL_SWITCH=y any channel with all these capabilities is marked DMA_ASYNC_TX allowing async_tx_find_channel() to quickly locate compatible channels with the guarantee that dependency chains will remain on one channel. When ASYNC_TX_DISABLE_CHANNEL_SWITCH=n async_tx_find_channel() may select channels that lead to operation chains that need to cross channel boundaries using the async_tx channel switch capability. Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 Dan Williams 提交于
Handle descriptor allocation failures by polling for a descriptor. The driver will force forward progress when polled. In the best case this polling interval will be the time it takes for one dma memcpy transaction to complete. In the worst case, channel hang, we will need to wait 100ms for the cleanup watchdog to fire (ioatdma driver). Signed-off-by: NMaciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 Dan Williams 提交于
Increment the allocation order of the descriptor ring every time we run out of descriptors up to a maximum of allocation order specified by the module parameter 'ioat_max_alloc_order'. After each idle period decrement the allocation order to a minimum order of 'ioat_ring_alloc_order' (i.e. the default ring size, tunable as a module parameter). Signed-off-by: NMaciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 Dan Williams 提交于
In order to support dynamic resizing of the descriptor ring or polling for a descriptor in the presence of a hung channel the reset handler needs to make progress while in a non-preemptible context. The current workqueue implementation precludes polling channel reset completion under spin_lock(). This conversion also allows us to return to opportunistic cleanup in the ioat2 case as the timer implementation guarantees at least one cleanup after every descriptor is submitted. This means the worst case completion latency becomes the timer frequency (for exceptional circumstances), but with the benefit of avoiding busy waiting when the lock is contended. Signed-off-by: NMaciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 Dan Williams 提交于
Save 4 bytes per software descriptor by transmitting tx_cnt in an unused portion of the hardware descriptor. Signed-off-by: NMaciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 Dan Williams 提交于
Mark all single use initialization routines with __devinit. Signed-off-by: NMaciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 Dan Williams 提交于
The register write in ioat_dma_cleanup_tasklet is unfortunate in two ways: 1/ It clears the extra 'enable' bits that we set at alloc_chan_resources time 2/ It gives the impression that it disables interrupts when it is in fact re-arming interrupts [ Impact: fix, persist the value of the chanctrl register when re-arming ] Signed-off-by: NMaciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 Dan Williams 提交于
Don't trust that the reserved bits are always zero, also sanity check the returned value. Signed-off-by: NMaciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 Dan Williams 提交于
The cleanup path makes an effort to only perform an atomic read of the 64-bit completion address. However in the 32-bit case it does not matter if we read the upper-32 and lower-32 non-atomically because the upper-32 will always be zero. Signed-off-by: NMaciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 Dan Williams 提交于
Provide some output for debugging the driver. Signed-off-by: NMaciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 Dan Williams 提交于
The unified ioat1/ioat2 ioat_dma_unmap() implementation derives the source and dest addresses from the unmap descriptor. There is no longer a need to track this information in struct ioat_desc_sw. Signed-off-by: NMaciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 Dan Williams 提交于
Replace the current linked list munged into a ring with a native ring buffer implementation. The benefit of this approach is reduced overhead as many parameters can be derived from ring position with simple pointer comparisons and descriptor allocation/freeing becomes just a manipulation of head/tail pointers. It requires a contiguous allocation for the software descriptor information. Since this arrangement is significantly different from the ioat1 chain, move ioat2,3 support into its own file and header. Common routines are exported from driver/dma/ioat/dma.[ch]. Signed-off-by: NMaciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 Dan Williams 提交于
Prepare the code for the conversion of the ioat2 linked-list-ring into a native ring buffer. After this conversion ioat2 channels will share less of the ioat1 infrastructure, but there will still be places where sharing is possible. struct ioat_chan_common is created to house the channel attributes that will remain common between ioat1 and ioat2 channels. For every routine that accesses both common and hardware specific fields the old unified 'ioat_chan' pointer is split into an 'ioat' and 'chan' pointer. Where 'chan' references common fields and 'ioat' the hardware/version specific. [ Impact: pure structure member movement/variable renames, no logic changes ] Signed-off-by: NMaciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 Dan Williams 提交于
If a callback is to be attached to a descriptor the channel needs to know at ->prep time so it can set the interrupt enable bit. This is in preparation for moving descriptor ioat2 descriptor preparation from ->submit to ->prep. Signed-off-by: NMaciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 Dan Williams 提交于
The async_tx api assumes that after a successful ->prep a subsequent ->submit will not fail due to a lack of resources. This also fixes a bug in the allocation failure case. Previously the descriptors allocated prior to the allocation failure would not be returned to the free list. Signed-off-by: NMaciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 Dan Williams 提交于
This cleans up a mess of and'ing and or'ing bit definitions, and allows simple assignments from the specified dma_ctrl_flags parameter. Signed-off-by: NMaciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 Dan Williams 提交于
->dmacount tracks the sequence number of active descriptors. It is written to the DMACOUNT register to update the channel's view of pending descriptors in the chain. The register is 16-bits so ->dmacount should be unsigned and 16-bit as well. Also modify ->desccount to maintain alignment. This was never a problem in practice because we never compared dmacount values, but this is a bug waiting to happen. Signed-off-by: NMaciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 Dan Williams 提交于
Towards the removal of ioatdma_device.version split the initialization path into distinct versions. This conversion: 1/ moves version specific probe code to version specific routines 2/ removes the need for ioat_device 3/ turns off the ioat1 msi quirk if the device is reinitialized for intx Signed-off-by: NMaciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 Dan Williams 提交于
The only .c files that utilize these protected prototypes depend on CONFIG_INTEL_IOATDMA=y, so there is no value gained in providing empty prototypes. [ Impact: pure cleanup ] Signed-off-by: NMaciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 Dan Williams 提交于
* reduce device->common. to dma-> in ioat_dma_{probe,remove,selftest} * ioat_lookup_chan_by_index to ioat_chan_by_index * multi-line function definitions * ioat_desc_sw.async_tx to ioat_desc_sw.txd * desc->txd. to tx-> in cleanup routine Signed-off-by: NMaciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 Dan Williams 提交于
The driver currently duplicates much of what these routines offer, so just use the common code. For example ->irq_mode tracks what interrupt mode was initialized, which duplicates the ->msix_enabled and ->msi_enabled handling in pcim_release. This also adds a check to the return value of dma_async_device_register, which can fail. Signed-off-by: NMaciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 Dan Williams 提交于
Some of these defines may be useful outside of dma.c and the header is private so there are no namespace pollution concerns. Signed-off-by: NMaciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
- 30 8月, 2009 3 次提交
-
-
由 Dan Williams 提交于
Test raid6 p+q operations with a simple "always multiply by 1" q calculation to fit into dmatest's current destination verification scheme. Reviewed-by: NAndre Noll <maan@systemlinux.org> Acked-by: NMaciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 Dan Williams 提交于
[ Based on an original patch by Yuri Tikhonov ] This adds support for doing asynchronous GF multiplication by adding two additional functions to the async_tx API: async_gen_syndrome() does simultaneous XOR and Galois field multiplication of sources. async_syndrome_val() validates the given source buffers against known P and Q values. When a request is made to run async_pq against more than the hardware maximum number of supported sources we need to reuse the previous generated P and Q values as sources into the next operation. Care must be taken to remove Q from P' and P from Q'. For example to perform a 5 source pq op with hardware that only supports 4 sources at a time the following approach is taken: p, q = PQ(src0, src1, src2, src3, COEF({01}, {02}, {04}, {08})) p', q' = PQ(p, q, q, src4, COEF({00}, {01}, {00}, {10})) p' = p + q + q + src4 = p + src4 q' = {00}*p + {01}*q + {00}*q + {10}*src4 = q + {10}*src4 Note: 4 is the minimum acceptable maxpq otherwise we punt to synchronous-software path. The DMA_PREP_CONTINUE flag indicates to the driver to reuse p and q as sources (in the above manner) and fill the remaining slots up to maxpq with the new sources/coefficients. Note1: Some devices have native support for P+Q continuation and can skip this extra work. Devices with this capability can advertise it with dma_set_maxpq. It is up to each driver how to handle the DMA_PREP_CONTINUE flag. Note2: The api supports disabling the generation of P when generating Q, this is ignored by the synchronous path but is implemented by some dma devices to save unnecessary writes. In this case the continuation algorithm is simplified to only reuse Q as a source. Cc: H. Peter Anvin <hpa@zytor.com> Cc: David Woodhouse <David.Woodhouse@intel.com> Signed-off-by: NYuri Tikhonov <yur@emcraft.com> Signed-off-by: NIlya Yanok <yanok@emcraft.com> Reviewed-by: NAndre Noll <maan@systemlinux.org> Acked-by: NMaciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
由 Dan Williams 提交于
We currently walk the parent chain when waiting for a given tx to complete however this walk may race with the driver cleanup routine. The routines in async_raid6_recov.c may fall back to the synchronous path at any point so we need to be prepared to call async_tx_quiesce() (which calls dma_wait_for_async_tx). To remove the ->parent walk we guarantee that every time a dependency is attached ->issue_pending() is invoked, then we can simply poll the initial descriptor until completion. This also allows for a lighter weight 'issue pending' implementation as there is no longer a requirement to iterate through all the channels' ->issue_pending() routines as long as operations have been submitted in an ordered chain. async_tx_issue_pending() is added for this case. Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
- 29 7月, 2009 1 次提交
-
-
由 Dan Williams 提交于
When first created the ioat driver was the only inhabitant of drivers/dma/. Now, it is the only multi-file (more than a .c and a .h) driver in the directory. Moving it to an ioat/ subdirectory allows the naming convention to be cleaned up, and allows for future splitting of the source files by hardware version (v1, v2, and v3). Signed-off-by: NMaciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
- 04 6月, 2009 1 次提交
-
-
由 Dan Williams 提交于
async_xor() needs space to perform dma and page address conversions. In most cases the code can simply reuse the struct page * array because the size of the native pointer matches the size of a dma/page address. In order to support archs where sizeof(dma_addr_t) is larger than sizeof(struct page *), or to preserve the input parameters, we utilize a memory region passed in by the caller. Since the code is now prepared to handle the case where it cannot perform address conversions on the stack, we no longer need the !HIGHMEM64G dependency in drivers/dma/Kconfig. [ Impact: don't clobber input buffers for address conversions ] Reviewed-by: NAndre Noll <maan@systemlinux.org> Acked-by: NMaciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
- 29 5月, 2009 1 次提交
-
-
由 Len Brown 提交于
Testing the i7300_idle driver on i5000-series hardware required an edit to i7300_idle.h to "#define SUPPORT_I5000 1" and a re-build of both i7300_idle and ioat_dma. Replace that build-time scheme with a load-time module parameter: "7300_idle.forceload=1" to make it easier to test the driver on hardware that while not officially validated, works fine and is much more commonly available. By default (no modparam) the driver will continue to load only on the i7300. Note that ioat_dma runs a copy of i7300_idle's probe routine to know to reserve an IOAT channel for i7300_idle. This change makes ioat_dma do that always on the i5000, just like it does on the i7300. Signed-off-by: NLen Brown <len.brown@intel.com> Acked-by: NAndrew Henroid <andrew.d.henroid@intel.com>
-
- 28 5月, 2009 1 次提交
-
-
由 Kumar Gala 提交于
We we build with dma_addr_t as a 64-bit quantity we get: drivers/dma/fsldma.c: In function 'fsl_chan_xfer_ld_queue': drivers/dma/fsldma.c:625: warning: cast to pointer from integer of different size drivers/dma/fsldma.c: In function 'fsl_dma_chan_do_interrupt': drivers/dma/fsldma.c:737: warning: cast to pointer from integer of different size drivers/dma/fsldma.c:737: warning: cast to pointer from integer of different size drivers/dma/fsldma.c: In function 'of_fsl_dma_probe': drivers/dma/fsldma.c:927: warning: cast to pointer from integer of different Signed-off-by: NKumar Gala <galak@kernel.crashing.org> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
- 22 5月, 2009 5 次提交
-
-
由 Ira Snyder 提交于
When preparing a memcpy operation, if the kernel fails to allocate memory for a link descriptor after the first link descriptor has already been allocated, then some memory will never be released. Fix the problem by walking the list of allocated descriptors backwards, and freeing the allocated descriptors back into the DMA pool. Signed-off-by: NIra W. Snyder <iws@ovro.caltech.edu> Signed-off-by: NLi Yang <leoli@freescale.com>
-
由 Ira Snyder 提交于
On the 83xx controller, snooping is necessary for the DMA controller to ensure cache coherence with the CPU when transferring to/from RAM. The last descriptor in a chain will always have the End-of-Chain interrupt bit set, so we can set the snoop bit while adding the End-of-Chain interrupt bit. Signed-off-by: NIra W. Snyder <iws@ovro.caltech.edu> Signed-off-by: NLi Yang <leoli@freescale.com>
-
由 Ira Snyder 提交于
When creating a DMA transaction with multiple descriptors, the async_tx cookie is set to 0 for each descriptor in the chain, excluding the last descriptor, whose cookie is set to -EBUSY. When fsl_dma_tx_submit() is run, it only assigns a cookie to the first descriptor. All of the remaining descriptors keep their original value, including the last descriptor, which is set to -EBUSY. After the DMA completes, the driver will update the last completed cookie to be -EBUSY, which is an error code instead of a valid cookie. This causes dma_async_is_complete() to always return DMA_IN_PROGRESS. This causes the fsldma driver to never cleanup the queue of link descriptors, and the driver will re-run the DMA transaction on the hardware each time it receives the End-of-Chain interrupt. This causes an infinite loop. With this patch, fsl_dma_tx_submit() is changed to assign a cookie to every descriptor in the chain. The rest of the code then works without problems. Signed-off-by: NIra W. Snyder <iws@ovro.caltech.edu> Signed-off-by: NLi Yang <leoli@freescale.com>
-
由 Ira Snyder 提交于
When using the DMA controller from multiple threads at the same time, it is possible to get lots of "DMA halt timeout!" errors printed to the kernel log. This occurs due to a race between fsl_dma_memcpy_issue_pending() and the interrupt handler, fsl_dma_chan_do_interrupt(). Both call the fsl_chan_xfer_ld_queue() function, which does not protect against concurrent accesses to dma_halt() and dma_start(). The existing spinlock is moved to cover the dma_halt() and dma_start() functions. Testing shows that the "DMA halt timeout!" errors disappear. Signed-off-by: NIra W. Snyder <iws@ovro.caltech.edu> Signed-off-by: NLi Yang <leoli@freescale.com>
-
由 Roel Kluin 提交于
Fix the check of potential array overflow when using corrupted channel device tree nodes. Signed-off-by: NRoel Kluin <roel.kluin@gmail.com> Signed-off-by: NLi Yang <leoli@freescale.com>
-