1. 02 May, 2010 1 commit
    • D
      ioat2,3: convert to producer/consumer locking · 074cc476
      Dan Williams committed
      Use separate locks for the descriptor prep (producer) and descriptor
      cleanup (consumer) paths.  Allows the producer path to run concurrently
      with the cleanup path.  Inspired by Documentation/circular-buffer.txt.
      
      Cc: David Howells <dhowells@redhat.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Maciej Sosnowski <maciej.sosnowski@intel.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      074cc476
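The split between a producer lock and a consumer lock can be sketched in userspace as follows. This is a minimal illustration, not the driver's code: the names `struct ring`, `ring_prep`, `ring_cleanup`, `prep_lock` and `cleanup_lock` are hypothetical, and pthread mutexes plus C11 atomics stand in for the kernel's spinlocks and memory barriers.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

#define RING_SIZE 8u /* power of two so index masking works */

/* Hypothetical ring: 'head' is only written under prep_lock,
 * 'tail' is only written under cleanup_lock. */
struct ring {
    int slots[RING_SIZE];
    atomic_uint head;             /* next slot to fill (producer side) */
    atomic_uint tail;             /* next slot to reclaim (consumer side) */
    pthread_mutex_t prep_lock;    /* serializes producers only */
    pthread_mutex_t cleanup_lock; /* serializes consumers only */
};

static void ring_init(struct ring *r)
{
    atomic_init(&r->head, 0);
    atomic_init(&r->tail, 0);
    pthread_mutex_init(&r->prep_lock, NULL);
    pthread_mutex_init(&r->cleanup_lock, NULL);
}

/* Producer path: takes prep_lock only, so it can run while a
 * consumer holds cleanup_lock. Returns false when the ring is full. */
static bool ring_prep(struct ring *r, int value)
{
    bool ok = false;

    pthread_mutex_lock(&r->prep_lock);
    unsigned head = atomic_load_explicit(&r->head, memory_order_relaxed);
    unsigned tail = atomic_load_explicit(&r->tail, memory_order_acquire);
    if (head - tail < RING_SIZE) {
        r->slots[head & (RING_SIZE - 1)] = value;
        /* publish the slot contents before the new head */
        atomic_store_explicit(&r->head, head + 1, memory_order_release);
        ok = true;
    }
    pthread_mutex_unlock(&r->prep_lock);
    return ok;
}

/* Consumer path: takes cleanup_lock only. Adds every pending slot
 * to *sum and returns the number of slots reclaimed. */
static unsigned ring_cleanup(struct ring *r, long *sum)
{
    pthread_mutex_lock(&r->cleanup_lock);
    unsigned tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    unsigned head = atomic_load_explicit(&r->head, memory_order_acquire);
    unsigned n = head - tail;

    for (unsigned i = 0; i < n; i++)
        *sum += r->slots[(tail + i) & (RING_SIZE - 1)];
    /* hand the reclaimed slots back to the producer */
    atomic_store_explicit(&r->tail, head, memory_order_release);
    pthread_mutex_unlock(&r->cleanup_lock);
    return n;
}
```

Because each index has exactly one writing side, the two paths never contend on the same lock. The acquire/release pairs order the slot contents against the index updates, which is the pattern Documentation/circular-buffer.txt describes.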
  2. 30 Mar, 2010 1 commit
    • T
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking... · 5a0e3ad6
      Tejun Heo committed
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
      
      percpu.h is included by sched.h and module.h and thus ends up being
      included when building most .c files.  percpu.h includes slab.h which
      in turn includes gfp.h making everything defined by the two files
      universally available and complicating inclusion dependencies.
      
      percpu.h -> slab.h dependency is about to be removed.  Prepare for
      this change by updating users of gfp and slab facilities to include those
      headers directly instead of assuming availability.  As this conversion
      needs to touch a large number of source files, the following script is
      used as the basis of conversion.
      
        http://userweb.kernel.org/~tj/misc/slabh-sweep.py
      
      The script does the following.
      
      * Scan files for gfp and slab usages and update includes such that
        only the necessary includes are there.  ie. if only gfp is used,
        gfp.h, if slab is used, slab.h.
      
      * When the script inserts a new include, it looks at the include
        blocks and tries to put the new include such that its order conforms
        to its surroundings.  It's put in the include block which contains
        core kernel includes, in the same order that the rest are ordered -
        alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
        doesn't seem to be any matching order.
      
      * If the script can't find a place to put a new include (mostly
        because the file doesn't have fitting include block), it prints out
        an error message indicating which .h file needs to be added to the
        file.
      
      The conversion was done in the following steps.
      
      1. The initial automatic conversion of all .c files updated slightly
         over 4000 files, deleting around 700 includes and adding ~480 gfp.h
         and ~3000 slab.h inclusions.  The script emitted errors for ~400
         files.
      
      2. Each error was manually checked.  Some didn't need the inclusion,
         some needed manual addition while adding it to implementation .h or
         embedding .c file was more appropriate for others.  This step added
         inclusions to around 150 files.
      
      3. The script was run again and the output was compared to the edits
         from #2 to make sure no file was left behind.
      
      4. Several build tests were done and a couple of problems were fixed.
         e.g. lib/decompress_*.c used malloc/free() wrappers around slab
         APIs requiring slab.h to be added manually.
      
      5. The script was run on all .h files but without automatically
         editing them as sprinkling gfp.h and slab.h inclusions around .h
         files could easily lead to inclusion dependency hell.  Most gfp.h
         inclusion directives were ignored as stuff from gfp.h was usually
         widely available and often used in preprocessor macros.  Each
         slab.h inclusion directive was examined and added manually as
         necessary.
      
      6. percpu.h was updated not to include slab.h.
      
      7. Build tests were done on the following configurations and failures
         were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
         distributed build env didn't work with gcov compiles) and a few
         more options had to be turned off depending on archs to make things
         build (like ipr on powerpc/64 which failed due to missing writeq).
      
         * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
         * powerpc and powerpc64 SMP allmodconfig
         * sparc and sparc64 SMP allmodconfig
         * ia64 SMP allmodconfig
         * s390 SMP allmodconfig
         * alpha SMP allmodconfig
         * um on x86_64 SMP allmodconfig
      
      8. percpu.h modifications were reverted so that it could be applied as
         a separate patch and serve as bisection point.
      
      Given the fact that I had only a couple of failures from tests on step
      6, I'm fairly confident about the coverage of this conversion patch.
      If there is a breakage, it's likely to be something in one of the arch
      headers which should be easily discoverable on most builds of
      the specific arch.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
      5a0e3ad6
  3. 04 Mar, 2010 5 commits
  4. 20 Dec, 2009 1 commit
  5. 18 Dec, 2009 1 commit
    • D
      ioat3: fix p-disabled q-continuation · cd78809f
      Dan Williams committed
      When continuing a pq calculation the driver needs 3 extra sources.  The
      driver can perform a 3 source calculation with a single descriptor, but
      needs an extended descriptor to process up to 8 sources in one
      operation.  However, in the p-disabled case only one extra source is
      needed.  When continuing a p-disabled operation there are occasions
      (i.e. 0 < src_cnt % 8 < 3) where the tail operation does not need an
      extended descriptor.  Properly account for this fact otherwise invalid
      'dmacount' values will be written to hardware usually causing the
      channel to halt with 'invalid descriptor' errors.
      
      Cc: <stable@kernel.org>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      cd78809f
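The accounting described in this message can be sketched as below. It is a simplified illustration that uses only the figures quoted above: a single descriptor covers up to 3 sources, a p+q continuation adds 3 implied sources, and a p-disabled continuation adds 1. The helper names are hypothetical. The real driver also multiplies by the number of descriptors needed for the transfer length, which is omitted here.

```c
#include <stdbool.h>

#define PQ_SINGLE_DESC_MAX_SRCS 3 /* per the commit message */

/* Implied extra sources when continuing a previous pq calculation
 * (hypothetical helper). */
static int pq_extra_sources(bool cont, bool p_disabled)
{
    if (!cont)
        return 0;
    return p_disabled ? 1 : 3;
}

/* True when the operation no longer fits a single descriptor and
 * needs an extended descriptor as well. */
static bool pq_needs_ext_desc(int src_cnt, bool cont, bool p_disabled)
{
    return src_cnt + pq_extra_sources(cont, p_disabled)
           > PQ_SINGLE_DESC_MAX_SRCS;
}

/* Descriptor slots consumed per operation: base plus optional extended. */
static int pq_descs_per_op(int src_cnt, bool cont, bool p_disabled)
{
    return pq_needs_ext_desc(src_cnt, cont, p_disabled) ? 2 : 1;
}
```

With 2 sources, a p-disabled continuation totals 3 sources and stays within one descriptor. Charging it 3 extra sources would count an extended descriptor that is never written, which is the kind of mismatch that leads to a wrong 'dmacount'.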
  6. 20 Nov, 2009 5 commits
  7. 22 Sep, 2009 1 commit
    • D
      ioat3: fix uninitialized var warnings · cdef57db
      Dan Williams committed
      drivers/dma/ioat/dma_v3.c: In function 'ioat3_prep_memset_lock':
      drivers/dma/ioat/dma_v3.c:439: warning: 'fill' may be used uninitialized in this function
      drivers/dma/ioat/dma_v3.c:437: warning: 'desc' may be used uninitialized in this function
      drivers/dma/ioat/dma_v3.c: In function '__ioat3_prep_xor_lock':
      drivers/dma/ioat/dma_v3.c:489: warning: 'xor' may be used uninitialized in this function
      drivers/dma/ioat/dma_v3.c:486: warning: 'desc' may be used uninitialized in this function
      drivers/dma/ioat/dma_v3.c: In function '__ioat3_prep_pq_lock':
      drivers/dma/ioat/dma_v3.c:631: warning: 'pq' may be used uninitialized in this function
      drivers/dma/ioat/dma_v3.c:628: warning: 'desc' may be used uninitialized in this function
      
      gcc-4.0, unlike gcc-4.3, does not see that these variables are
      initialized before use.  Convert the descriptor loops to do-while to make
      this initialization apparent.
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      cdef57db
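A small standalone sketch of the loop shape this message describes. `struct desc` and `fill_descs` are hypothetical stand-ins for the driver's descriptor types and prep routines, and the function assumes the caller always passes `num_descs >= 1`.

```c
#include <stddef.h>

/* Hypothetical stand-in for a hardware descriptor. */
struct desc {
    size_t len;
};

/*
 * With "while (i < num_descs) { last = &ring[i]; ... }" the compiler
 * must prove that num_descs >= 1 to know 'last' is assigned before the
 * return; older compilers give up and warn.  A do-while body always
 * executes once, so the initialization is apparent from the control flow.
 *
 * Assumes num_descs >= 1 (the caller's invariant).
 */
static struct desc *fill_descs(struct desc *ring, int num_descs, size_t len)
{
    struct desc *last;
    int i = 0;

    do {
        last = &ring[i];
        last->len = len;
    } while (++i < num_descs);

    return last; /* last descriptor written */
}
```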
  8. 09 Sep, 2009 9 commits
    • D
      ioat3: segregate raid engines · e3232714
      Dan Williams committed
      The cleanup routine for the raid cases imposes extra checks for handling
      raid descriptors and extended descriptors.  If the channel does not
      support raid it can avoid this extra overhead by using the ioat2 cleanup
      path.
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      e3232714
    • D
      ioat3: interrupt descriptor support · 58c8649e
      Dan Williams committed
      The async_tx api uses the DMA_INTERRUPT operation type to terminate a
      chain of issued operations with a callback routine.
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      58c8649e
    • D
      ioat3: support xor via pq descriptors · ae786624
      Dan Williams committed
      If a platform advertises pq capabilities, but not xor, then use
      ioat3_prep_pqxor and ioat3_prep_pqxor_val to simulate xor support.
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      ae786624
    • D
      ioat3: pq support · d69d235b
      Dan Williams committed
      ioat3.2 adds support for raid6 syndrome generation (xor sum of galois
      field multiplication products) using up to 8 sources.  It can also
      perform a pq-zero-sum operation to validate whether the syndrome for a
      given set of sources matches a previously computed syndrome.
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      d69d235b
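A software sketch of what "xor sum of galois field multiplication products" means. It assumes the conventional RAID-6 setup, where source i is multiplied by 2^i in GF(2^8) with the polynomial 0x11d. The function names `gf_mul2`, `pq_gen` and `pq_zero_sum` are hypothetical, and the code is a reference model, not the hardware's interface.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Multiply by the generator (2) in GF(2^8), polynomial 0x11d. */
static uint8_t gf_mul2(uint8_t a)
{
    return (uint8_t)((a << 1) ^ ((a & 0x80) ? 0x1d : 0));
}

/* P is the plain xor of all sources; Q is the xor of 2^i * src[i],
 * evaluated with Horner's rule from the last source down to the first. */
static void pq_gen(const uint8_t *const *srcs, int src_cnt, size_t len,
                   uint8_t *p, uint8_t *q)
{
    for (size_t i = 0; i < len; i++) {
        uint8_t wp = 0, wq = 0;

        for (int s = src_cnt - 1; s >= 0; s--) {
            wp ^= srcs[s][i];
            wq = (uint8_t)(gf_mul2(wq) ^ srcs[s][i]);
        }
        p[i] = wp;
        q[i] = wq;
    }
}

/* Zero-sum style validation: recompute P/Q from the sources and report
 * whether they match the previously stored values. */
static bool pq_zero_sum(const uint8_t *const *srcs, int src_cnt, size_t len,
                        const uint8_t *p, const uint8_t *q)
{
    for (size_t i = 0; i < len; i++) {
        uint8_t wp = 0, wq = 0;

        for (int s = src_cnt - 1; s >= 0; s--) {
            wp ^= srcs[s][i];
            wq = (uint8_t)(gf_mul2(wq) ^ srcs[s][i]);
        }
        if (wp != p[i] || wq != q[i])
            return false;
    }
    return true;
}
```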
    • D
      ioat3: xor self test · 9de6fc71
      Dan Williams committed
      This adds a hardware specific self test to be called from ioat_probe.
      In the ioat3 case we will have tests for all the different raid
      operations, while ioat1 and ioat2 will continue to just test memcpy.
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      9de6fc71
    • D
      ioat3: xor support · b094ad3b
      Dan Williams committed
      ioat3.2 adds xor offload support for up to 8 sources.  It can also
      perform an xor-zero-sum operation to validate whether all given sources
      sum to zero, without writing to a destination.  Xor descriptors differ
      from memcpy in that one operation may require multiple descriptors
      depending on the number of sources.  When the number of sources exceeds
      5 an extended descriptor is needed.  These descriptors need to be
      accounted for when updating the DMA_COUNT register.
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      b094ad3b
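The descriptor accounting for xor can be sketched like this, using only the thresholds stated in the message (an extended descriptor once the source count exceeds 5, at most 8 sources). `xor_descs_per_op` and `xor_dma_count` are hypothetical names, and `num_ops` stands for however many operations the transfer length is split into.

```c
#define XOR_BASE_DESC_MAX_SRCS 5 /* per the commit message */

/* Descriptor slots one xor operation occupies: the base descriptor,
 * plus an extended descriptor when there are more than 5 sources.
 * Assumes src_cnt is within the supported range (at most 8). */
static int xor_descs_per_op(int src_cnt)
{
    return src_cnt > XOR_BASE_DESC_MAX_SRCS ? 2 : 1;
}

/* Amount by which the DMA_COUNT register must advance for num_ops
 * operations: extended descriptors occupy ring slots too, so they
 * have to be counted. */
static int xor_dma_count(int src_cnt, int num_ops)
{
    return num_ops * xor_descs_per_op(src_cnt);
}
```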
    • D
      ioat3: enable dca for completion writes · e61dacae
      Dan Williams committed
      Tag completion writes for direct cache access to reduce the latency of
      checking for descriptor completions.
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      e61dacae
    • D
      ioat: add 'ioat' sysfs attributes · 5669e31c
      Dan Williams committed
      Export driver attributes for diagnostic purposes:
      'ring_size': total number of descriptors available to the engine
      'ring_active': number of descriptors in-flight
      'capabilities': supported operation types for this channel
      'version': Intel(R) QuickData specification revision
      
      This also allows some chattiness to be removed from the driver startup
      as this information is now available via sysfs.
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      5669e31c
    • D
      ioat3: split ioat3 support to its own file, add memset · bf40a686
      Dan Williams committed
      Up until this point the drivers for Intel(R) QuickData Technology
      engines, specification versions 2 and 3, were mostly identical save for
      a few quirks.  Version 3.2 hardware adds many new capabilities (like
      raid offload support) requiring some infrastructure that is not relevant
      for v2.  For better code organization of the new functionality, move v3
      and v3.2 support to its own file dma_v3.c, and export some routines from
      the base files (dma.c and dma_v2.c) that can be reused directly.
      
      The first new capability included in this code reorganization is support
      for v3.2 memset operations.
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      bf40a686