1. 18 Dec, 2009 1 commit
  2. 23 Nov, 2009 2 commits
  3. 20 Nov, 2009 1 commit
    • async_tx: build-time toggling of async_{syndrome,xor}_val dma support · 7b3cc2b1
      Dan Williams committed
      ioat3.2 does not support asynchronous error notifications which makes
      the driver experience latencies when non-zero pq validate results are
      expected.  Provide a mechanism for turning off async_xor_val and
      async_syndrome_val via Kconfig.  This approach is generally useful for
      any driver that specifies ASYNC_TX_DISABLE_CHANNEL_SWITCH and would like
      to force the async_tx api to fall back to the synchronous path for
      certain operations.
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
  4. 19 Nov, 2009 1 commit
  5. 16 Nov, 2009 1 commit
    • crypto: gcm - fix another complete call in complete function · 62c5593a
      Huang Ying committed
      The flow of the complete function (xxx_done) in gcm.c is as follows:
      
      void complete(struct crypto_async_request *areq, int err)
      {
      	struct aead_request *req = areq->data;
      
      	if (!err) {
      		err = async_next_step();
      		if (err == -EINPROGRESS || err == -EBUSY)
      			return;
      	}
      
      	complete_for_next_step(areq, err);
      }
      
      However, *areq may be destroyed in async_next_step(), in which case
      complete_for_next_step() cannot work properly. To fix this, one of the
      following methods is used for each complete function:
      
      - Add a __complete() for each complete() that accepts struct
        aead_request *req instead of areq, which avoids using areq after it
        is destroyed.
      
      - Expand complete_for_next_step().
      
      The fix is based on an idea from Herbert Xu.
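
The first method can be sketched in plain C as follows. This is a minimal illustration, not the gcm.c code: the struct layouts, `complete_step()`, `do_complete_for_next_step()` (standing in for `__complete()`), and the body of `async_next_step()` are all hypothetical stand-ins. The point it shows is that `req` is read from `areq` once, up front, and everything after the next step works only on `req`.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct aead_request {
	int final_err;	/* error code handed to the final completion */
	int completed;	/* set once the final completion has run */
};

struct crypto_async_request {
	void *data;	/* points at the owning aead_request */
};

/* Hypothetical next step; it models "areq destroyed" by clearing it. */
static int async_next_step(struct aead_request *req,
			   struct crypto_async_request *areq)
{
	(void)req;
	areq->data = NULL;
	return 0;
}

/* Plays the role of __complete(): takes req, never touches areq. */
static void do_complete_for_next_step(struct aead_request *req, int err)
{
	req->final_err = err;
	req->completed = 1;
}

void complete_step(struct crypto_async_request *areq, int err)
{
	/* Read req before anything can invalidate areq. */
	struct aead_request *req = areq->data;

	if (!err) {
		err = async_next_step(req, areq);
		if (err == -EINPROGRESS || err == -EBUSY)
			return;
	}

	/* areq may be gone here; only req is used from this point on. */
	do_complete_for_next_step(req, err);
}
```

Calling `complete_step()` on a request whose `areq` is invalidated by the next step still completes the request, because nothing dereferences `areq` after `async_next_step()` returns.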
      Signed-off-by: Huang Ying <ying.huang@intel.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
  6. 12 Nov, 2009 1 commit
  7. 27 Oct, 2009 3 commits
  8. 20 Oct, 2009 3 commits
    • async_tx: fix asynchronous raid6 recovery for ddf layouts · da17bf43
      Dan Williams committed
      The raid6 recovery code currently requires special handling of the
      4-disk and 5-disk recovery scenarios for the native layout.  Quoting
      from commit 0a82a623:
      
           In these situations the default N-disk algorithm will present
           0-source or 1-source operations to dma devices.  To cover for
           dma devices where the minimum source count is 2 we implement
           4-disk and 5-disk handling in the recovery code.
      
      The ddf layout presents disks=6 and disks=7 to the recovery code in
      these situations.  Instead of looking at the number of disks, count the
      number of non-zero sources in the list and call the special case code
      when the number of non-failed sources is 0 or 1.
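
A rough sketch of that selection logic, assuming a block list where the last two entries are P and Q and a NULL entry marks a hole. The function name, the enum, and the exact counting convention are hypothetical and chosen for illustration; the kernel code may count differently.

```c
#include <stddef.h>

/* Hypothetical labels for the three recovery paths. */
enum recov_path {
	RECOV_4DISK_SPECIAL,	/* no good data sources left */
	RECOV_5DISK_SPECIAL,	/* exactly one good data source left */
	RECOV_GENERIC		/* default N-disk algorithm */
};

/*
 * blocks[0 .. disks-3] are data blocks, blocks[disks-2] is P and
 * blocks[disks-1] is Q.  A NULL data block is a hole (ddf layout).
 * faila and failb are the indices of the two failed data blocks.
 */
enum recov_path pick_2data_recov_path(void *const *blocks, int disks,
				      int faila, int failb)
{
	int good = 0;

	for (int i = 0; i < disks - 2; i++) {
		if (i == faila || i == failb)
			continue;	/* failed blocks are destinations */
		if (blocks[i])
			good++;		/* non-NULL, non-failed source */
	}

	if (good == 0)
		return RECOV_4DISK_SPECIAL;
	if (good == 1)
		return RECOV_5DISK_SPECIAL;
	return RECOV_GENERIC;
}
```

With this approach a ddf array that reports disks=6 but carries two holes is routed to the same special case as a native 4-disk array, without a separate layout flag.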
      
      [neilb@suse.de: replace 'ddf' flag with counting good sources]
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
    • async_pq: rename scribble page · 030b0772
      Dan Williams committed
      The global scribble page is used as a temporary destination buffer when
      disabling the P or Q result is requested.  The local scribble buffer
      contains memory for performing address conversions.  Rename the global
      variable to avoid confusion.
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
    • async_pq: kill a stray dma_map() call and other cleanups · 5676470f
      Dan Williams committed
      - update the kernel doc for async_syndrome to indicate what NULL in the
        source list means
      - whitespace fixups
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
  9. 19 Oct, 2009 4 commits
  10. 16 Oct, 2009 2 commits
    • raid6/async_tx: handle holes in block list in async_syndrome_val · b2141e69
      NeilBrown committed
      async_syndrome_val checks the P and Q blocks used for RAID6
      calculations.
      With DDF raid6, some of the data blocks might be NULL, so
      this needs to be handled in the same way that async_gen_syndrome
      handles it.
      
      As async_syndrome_val calls async_xor, also enhance async_xor
      to detect and skip NULL blocks in the list.
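
The "detect and skip" behaviour for xor can be illustrated with a small synchronous sketch. `xor_skip_null()` is a hypothetical helper written for this example, not the async_xor implementation; it only shows that a NULL entry contributes nothing to the result.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * XOR all non-NULL source blocks into dest.
 * Returns the number of sources actually folded in.
 */
int xor_skip_null(uint8_t *dest, const uint8_t *const *srcs,
		  int count, size_t len)
{
	int used = 0;

	memset(dest, 0, len);
	for (int i = 0; i < count; i++) {
		if (!srcs[i])
			continue;	/* hole in the block list */
		for (size_t b = 0; b < len; b++)
			dest[b] ^= srcs[i][b];
		used++;
	}
	return used;
}
```

Skipping is safe for xor because a block of zeros is the identity element, so leaving it out yields the same parity as including it.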
      Signed-off-by: NeilBrown <neilb@suse.de>
    • md/async: don't pass a memory pointer as a page pointer. · 5dd33c9a
      NeilBrown committed
      md/raid6 passes a list of 'struct page *' to the async_tx routines,
      which then either DMA map them for offload, or take the page_address
      for CPU based calculations.
      
      For RAID6 we sometimes leave 'blanks' in the list of pages.
      For CPU based calcs, we want to treat these as a page of zeros.
      For offloaded calculations, we simply don't pass a page to the
      hardware.
      
      Currently the 'blanks' are encoded as a pointer to
      raid6_empty_zero_page.  This is a 4096 byte memory region, not a
      'struct page'.  This is mostly handled correctly but is rather ugly.
      
      So change the code to pass and expect a NULL pointer for the blanks.
      When taking page_address of a page, we need to check for a NULL and
      in that case use raid6_empty_zero_page.
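
For the CPU path, the NULL check described above might look like the following sketch. It uses an 8-byte block in place of a 4096-byte page, and `zero_block`, `block_address()`, and `gen_syndrome_sketch()` are hypothetical names standing in for raid6_empty_zero_page, the page_address lookup, and the syndrome routine. Unlike plain xor, the Q calculation cannot simply drop a blank, since each position has its own coefficient, so the blank is substituted with zeros instead.

```c
#include <stddef.h>
#include <stdint.h>

#define BLOCK_LEN 8	/* tiny stand-in for a 4096-byte page */

/* Plays the role of raid6_empty_zero_page in this sketch. */
static const uint8_t zero_block[BLOCK_LEN] = { 0 };

/* Resolve a list entry to readable memory; NULL marks a blank. */
const uint8_t *block_address(const uint8_t *blk)
{
	return blk ? blk : zero_block;
}

/* Multiply by {02} in GF(2^8) using the RAID6 polynomial 0x11d. */
static uint8_t gf_mul2(uint8_t v)
{
	return (uint8_t)((v << 1) ^ ((v & 0x80) ? 0x1d : 0));
}

/* Compute P and Q over 'count' data blocks, some of which may be NULL. */
void gen_syndrome_sketch(const uint8_t *const *data, int count,
			 uint8_t *p, uint8_t *q)
{
	for (size_t b = 0; b < BLOCK_LEN; b++) {
		uint8_t wp = 0, wq = 0;

		/* Horner evaluation from the highest data index down. */
		for (int d = count - 1; d >= 0; d--) {
			uint8_t v = block_address(data[d])[b];

			wq = gf_mul2(wq) ^ v;
			wp ^= v;
		}
		p[b] = wp;
		q[b] = wq;
	}
}
```

A list containing a NULL entry produces the same P and Q as the same list with an explicit block of zeros in that slot.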
      Signed-off-by: NeilBrown <neilb@suse.de>
  11. 12 Oct, 2009 1 commit
  12. 03 Oct, 2009 1 commit
  13. 22 Sep, 2009 1 commit
  14. 17 Sep, 2009 1 commit
    • raid6test: fix stack overflow · 1b6df693
      Dan Williams committed
      Testing on x86_64 with NDISKS=255 yields:
      
         do_IRQ: modprobe near stack overflow (cur:ffff88007d19c000,sp:ffff88007d19c128)
      
      ...and eventually
      
         general protection fault: 0000 [#1]
      
      Moving the scribble buffers off the stack allows the test to complete
      successfully.
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
  15. 09 Sep, 2009 3 commits
    • dmaengine, async_tx: support alignment checks · 83544ae9
      Dan Williams committed
      Some engines have transfer size and address alignment restrictions.  Add
      a per-operation alignment property to struct dma_device that the async
      routines and dmatest can use to check alignment capabilities.
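
An alignment check of this kind could be sketched as below. The assumption here, not stated in the commit message, is that the per-operation property is stored as a log2 shift with 0 meaning "no restriction"; `check_align_sketch()` is a hypothetical helper name.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * align_shift: log2 of the required alignment (assumed encoding),
 *              0 means the engine has no restriction.
 * off1, off2:  source and destination offsets.
 * len:         transfer length in bytes.
 */
bool check_align_sketch(unsigned int align_shift,
			size_t off1, size_t off2, size_t len)
{
	size_t mask;

	if (!align_shift)
		return true;

	mask = ((size_t)1 << align_shift) - 1;

	/* Any low bit set in an offset or the length breaks alignment. */
	return (mask & (off1 | off2 | len)) == 0;
}
```

A caller would run the check before choosing the offload path and fall back to the synchronous routine when it returns false.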
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
    • dmaengine, async_tx: add a "no channel switch" allocator · 138f4c35
      Dan Williams committed
      Channel switching is problematic for some dmaengine drivers as the
      architecture precludes separating the ->prep from ->submit.  In these
      cases the driver can select ASYNC_TX_DISABLE_CHANNEL_SWITCH to modify
      the async_tx allocator to only return channels that support all of the
      required asynchronous operations.
      
      For example MD_RAID456=y selects support for asynchronous xor, xor
      validate, pq, pq validate, and memcpy.  When
      ASYNC_TX_DISABLE_CHANNEL_SWITCH=y any channel with all these
      capabilities is marked DMA_ASYNC_TX allowing async_tx_find_channel() to
      quickly locate compatible channels with the guarantee that dependency
      chains will remain on one channel.  When
      ASYNC_TX_DISABLE_CHANNEL_SWITCH=n async_tx_find_channel() may select
      channels that lead to operation chains that need to cross channel
      boundaries using the async_tx channel switch capability.
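
The "all required capabilities" test amounts to a bitmask comparison. The sketch below uses made-up capability bits and a made-up channel struct, not the dmaengine types; it only illustrates how a channel qualifies when every required operation is present.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical capability bits for this sketch. */
enum cap_sketch {
	CAP_MEMCPY  = 1 << 0,
	CAP_XOR     = 1 << 1,
	CAP_XOR_VAL = 1 << 2,
	CAP_PQ      = 1 << 3,
	CAP_PQ_VAL  = 1 << 4
};

/* The operations MD_RAID456=y asks for, per the commit message. */
#define RAID456_REQUIRED \
	(CAP_MEMCPY | CAP_XOR | CAP_XOR_VAL | CAP_PQ | CAP_PQ_VAL)

struct chan_sketch {
	const char *name;
	unsigned int caps;
};

/* True only if the channel offers every required operation. */
bool chan_supports_all(const struct chan_sketch *c, unsigned int required)
{
	return (c->caps & required) == required;
}

/* Return the first fully capable channel, or NULL if there is none. */
const struct chan_sketch *find_full_channel(const struct chan_sketch *chans,
					    size_t n, unsigned int required)
{
	for (size_t i = 0; i < n; i++)
		if (chan_supports_all(&chans[i], required))
			return &chans[i];
	return NULL;
}
```

Because a qualifying channel can run every operation in a dependency chain, the chain never needs to hop to another channel.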
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
    • dmaengine: add fence support · 0403e382
      Dan Williams committed
      Some engines optimize operation by reading ahead in the descriptor chain
      such that descriptor2 may start execution before descriptor1 completes.
      If descriptor2 depends on the result from descriptor1 then a fence is
      required (on descriptor2) to disable this optimization.  The async_tx
      api could implicitly identify dependencies via the 'depend_tx'
      parameter, but that would constrain cases where the dependency chain
      only specifies a completion order rather than a data dependency.  So,
      provide an ASYNC_TX_FENCE to explicitly identify data dependencies.
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
  16. 02 Sep, 2009 1 commit
  17. 31 Aug, 2009 1 commit
    • crypto: api - Do not displace newly registered algorithms · 2bf29016
      Herbert Xu committed
      We have a mechanism where newly registered algorithms of a higher
      priority can displace existing instances that use a different
      implementation of the same algorithm with a lower priority.
      
      Unfortunately the same mechanism can cause a newly registered
      algorithm to displace itself if it depends on an existing version
      of the same algorithm.
      
      This patch fixes this by keeping all algorithms that the newly
      registered algorithm depends on, thus protecting them from being
      removed.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
  18. 30 Aug, 2009 6 commits
    • async_tx: raid6 recovery self test · cb3c8299
      Dan Williams committed
      Port drivers/md/raid6test/test.c to use the async raid6 recovery
      routines.  This is meant as a unit test for raid6 acceleration drivers.  In
      addition to the 16-drive test case this implements tests for the 4-disk and
      5-disk special cases (dma devices cannot generically handle fewer than 2
      sources), and adds a test for the D+Q case.
      Reviewed-by: Andre Noll <maan@systemlinux.org>
      Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
    • async_tx: add support for asynchronous RAID6 recovery operations · 0a82a623
      Dan Williams committed
       async_raid6_2data_recov() recovers two data disk failures
      
       async_raid6_datap_recov() recovers a data disk and the P disk
      
      These routines are a port of the synchronous versions found in
      drivers/md/raid6recov.c.  The primary difference is breaking out the xor
      operations into separate calls to async_xor.  Two helper routines are
      introduced to perform scalar multiplication where needed.
      async_sum_product() multiplies two sources by scalar coefficients and
      then sums (xor) the result.  async_mult() simply multiplies a single
      source by a scalar.
      
      This implementation also includes, in contrast to the original
      synchronous-only code, special case handling for the 4-disk and 5-disk
      array cases.  In these situations the default N-disk algorithm will
      present 0-source or 1-source operations to dma devices.  To cover for
      dma devices where the minimum source count is 2 we implement 4-disk and
      5-disk handling in the recovery code.
      
      [ Impact: asynchronous raid6 recovery routines for 2data and datap cases ]
      
      Cc: Yuri Tikhonov <yur@emcraft.com>
      Cc: Ilya Yanok <yanok@emcraft.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: David Woodhouse <David.Woodhouse@intel.com>
      Reviewed-by: Andre Noll <maan@systemlinux.org>
      Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
    • async_tx: add support for asynchronous GF multiplication · b2f46fd8
      Dan Williams committed
      [ Based on an original patch by Yuri Tikhonov ]
      
      This adds support for doing asynchronous GF multiplication by adding
      two additional functions to the async_tx API:
      
       async_gen_syndrome() does simultaneous XOR and Galois field
          multiplication of sources.
      
       async_syndrome_val() validates the given source buffers against known P
          and Q values.
      
      When a request is made to run async_pq against more than the hardware
      maximum number of supported sources, we need to reuse the previously
      generated P and Q values as sources into the next operation.  Care must
      be taken to remove Q from P' and P from Q'.  For example, to perform a
      5-source pq op with hardware that only supports 4 sources at a time, the
      following approach is taken:
      
      p, q = PQ(src0, src1, src2, src3, COEF({01}, {02}, {04}, {08}))
      p', q' = PQ(p, q, q, src4, COEF({00}, {01}, {00}, {10}))
      
      p' = p + q + q + src4 = p + src4
      q' = {00}*p + {01}*q + {00}*q + {10}*src4 = q + {10}*src4
      
      Note: 4 is the minimum acceptable maxpq; otherwise we punt to the
      synchronous software path.
      
      The DMA_PREP_CONTINUE flag indicates to the driver to reuse p and q as
      sources (in the above manner) and fill the remaining slots up to maxpq
      with the new sources/coefficients.
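
The 5-source continuation example can be checked numerically with a short sketch that operates on single bytes instead of whole pages. Everything here (`pq_sketch`, `pq_op()`, `pq5_direct()`, `pq5_continued()`) is hypothetical scaffolding for the arithmetic; GF(2^8) multiplication is assumed to use the RAID6 polynomial 0x11d.

```c
#include <stdint.h>

struct pq_sketch {
	uint8_t p;
	uint8_t q;
};

/* GF(2^8) multiply with the RAID6 polynomial 0x11d (assumed). */
static uint8_t gf_mul(uint8_t a, uint8_t b)
{
	uint8_t r = 0;

	while (b) {
		if (b & 1)
			r ^= a;
		a = (uint8_t)((a << 1) ^ ((a & 0x80) ? 0x1d : 0));
		b >>= 1;
	}
	return r;
}

/* One PQ operation: p = sum(src), q = sum(coef * src). */
static struct pq_sketch pq_op(const uint8_t *src, const uint8_t *coef, int n)
{
	struct pq_sketch r = { 0, 0 };

	for (int i = 0; i < n; i++) {
		r.p ^= src[i];
		r.q ^= gf_mul(coef[i], src[i]);
	}
	return r;
}

/* Reference result: all five sources in a single operation. */
struct pq_sketch pq5_direct(const uint8_t s[5])
{
	static const uint8_t coef[5] = { 0x01, 0x02, 0x04, 0x08, 0x10 };

	return pq_op(s, coef, 5);
}

/* Two 4-source operations, reusing p and q as described above. */
struct pq_sketch pq5_continued(const uint8_t s[5])
{
	static const uint8_t coef1[4] = { 0x01, 0x02, 0x04, 0x08 };
	static const uint8_t coef2[4] = { 0x00, 0x01, 0x00, 0x10 };
	struct pq_sketch first = pq_op(s, coef1, 4);
	/* q appears twice so it cancels out of p'; {00} removes p from q'. */
	uint8_t src2[4] = { first.p, first.q, first.q, s[4] };

	return pq_op(src2, coef2, 4);
}
```

Both routes give the same P and Q for any input bytes, which is the property the DMA_PREP_CONTINUE handling relies on.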
      
      Note1: Some devices have native support for P+Q continuation and can skip
      this extra work.  Devices with this capability can advertise it with
      dma_set_maxpq.  It is up to each driver how to handle the
      DMA_PREP_CONTINUE flag.
      
      Note2: The api supports disabling the generation of P when generating Q;
      this is ignored by the synchronous path but is implemented by some dma
      devices to save unnecessary writes.  In this case the continuation
      algorithm is simplified to only reuse Q as a source.
      
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: David Woodhouse <David.Woodhouse@intel.com>
      Signed-off-by: Yuri Tikhonov <yur@emcraft.com>
      Signed-off-by: Ilya Yanok <yanok@emcraft.com>
      Reviewed-by: Andre Noll <maan@systemlinux.org>
      Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
    • async_tx: remove walk of tx->parent chain in dma_wait_for_async_tx · 95475e57
      Dan Williams committed
      We currently walk the parent chain when waiting for a given tx to
      complete; however, this walk may race with the driver cleanup routine.
      The routines in async_raid6_recov.c may fall back to the synchronous
      path at any point, so we need to be prepared to call async_tx_quiesce()
      (which calls dma_wait_for_async_tx).  To remove the ->parent walk we
      guarantee that ->issue_pending() is invoked every time a dependency is
      attached; then we can simply poll the initial descriptor until
      completion.
      
      This also allows for a lighter weight 'issue pending' implementation as
      there is no longer a requirement to iterate through all the channels'
      ->issue_pending() routines as long as operations have been submitted in
      an ordered chain.  async_tx_issue_pending() is added for this case.
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
    • async_tx: kill needless module_{init|exit} · af1f951e
      Dan Williams committed
      If module_init and module_exit are nops then neither needs to be defined.
      
      [ Impact: pure cleanup ]
      Reviewed-by: Andre Noll <maan@systemlinux.org>
      Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
    • async_tx: add sum check flags · ad283ea4
      Dan Williams committed
      Replace the flat zero_sum_result with a collection of flags to contain
      the P (xor) zero-sum result, and the soon to be utilized Q (raid6 reed
      solomon syndrome) zero-sum result.  Use the SUM_CHECK_ namespace instead
      of DMA_ since these flags will be used on non-dma-zero-sum enabled
      platforms.
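
A flag-based result of this kind might be used as in the sketch below. The flag names `SUM_CHECK_P_RESULT` and `SUM_CHECK_Q_RESULT` and their bit positions are assumptions for this example (the commit message only names the SUM_CHECK_ namespace), and `syndrome_val_sketch()` is a hypothetical one-byte validator, not the async_tx routine.

```c
#include <stdint.h>

/* Assumed flag names and values within the SUM_CHECK_ namespace. */
enum sum_check_flags_sketch {
	SUM_CHECK_P_RESULT = 1 << 0,	/* set when the P (xor) check fails */
	SUM_CHECK_Q_RESULT = 1 << 1	/* set when the Q syndrome check fails */
};

/* Multiply by {02} in GF(2^8) using the RAID6 polynomial 0x11d. */
static uint8_t gf_mul2(uint8_t v)
{
	return (uint8_t)((v << 1) ^ ((v & 0x80) ? 0x1d : 0));
}

/*
 * Validate one byte per data disk against known p and q.
 * Returns 0 when both match, otherwise a mask of failed checks.
 */
unsigned int syndrome_val_sketch(const uint8_t *data, int count,
				 uint8_t p, uint8_t q)
{
	uint8_t wp = 0, wq = 0;
	unsigned int result = 0;

	for (int d = count - 1; d >= 0; d--) {
		wq = gf_mul2(wq) ^ data[d];
		wp ^= data[d];
	}

	if (wp != p)
		result |= SUM_CHECK_P_RESULT;
	if (wq != q)
		result |= SUM_CHECK_Q_RESULT;
	return result;
}
```

Separate bits let a caller tell a P mismatch from a Q mismatch, which a single flat zero-sum value cannot express.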
      Reviewed-by: Andre Noll <maan@systemlinux.org>
      Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
  19. 29 Aug, 2009 2 commits
  20. 20 Aug, 2009 2 commits
  21. 14 Aug, 2009 1 commit
    • crypto: blkcipher - Do not use eseqiv on stream ciphers · 63b5ac28
      Herbert Xu committed
      Recently we switched to using eseqiv on SMP machines in preference
      over chainiv.  However, eseqiv does not support stream ciphers so
      they should still default to chainiv.
      
      This patch applies the same check as done by eseqiv to weed out
      the stream ciphers.  In particular, all algorithms where the IV
      size is not equal to the block size will now default to chainiv.
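
The selection rule reads roughly as follows in code. This is a simplified sketch of the decision as described in the message; the enum, the function name, and the `smp` parameter are hypothetical and do not mirror the blkcipher code.

```c
#include <stdbool.h>

/* Hypothetical labels for the two IV generators discussed here. */
enum ivgen_sketch {
	IVGEN_CHAINIV,
	IVGEN_ESEQIV
};

/*
 * Pick a default IV generator.
 * ivsize and blocksize are in bytes; smp says whether this is an SMP machine.
 */
enum ivgen_sketch default_ivgen_sketch(unsigned int ivsize,
				       unsigned int blocksize, bool smp)
{
	if (!smp)
		return IVGEN_CHAINIV;

	/* Stream ciphers and similar: IV size differs from block size. */
	if (ivsize != blocksize)
		return IVGEN_CHAINIV;

	return IVGEN_ESEQIV;
}
```

For instance, a cipher with a 16-byte IV and a 1-byte block size would fall through to chainiv even on SMP.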
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
  22. 13 Aug, 2009 1 commit
    • crypto: ctr - Use chainiv on raw counter mode · aef27136
      Herbert Xu committed
      Raw counter mode only works with chainiv, which is no longer
      the default IV generator on SMP machines.  This broke raw counter
      mode as it can no longer instantiate as a givcipher.
      
      This patch fixes it by always picking chainiv on raw counter
      mode.  This is based on the diagnosis and a patch by Huang
      Ying.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>