1. 22 Apr, 2015 4 commits
    • md/raid6 algorithms: xor_syndrome() for SSE2 · a582564b
      Committed by Markus Stockhausen
      The second (and last) optimized XOR syndrome calculation. This version
      supports right and left side optimization. All CPUs with an architecture
      older than Haswell will benefit from it.
      
      It should be noted that SSE2 movntdq kills performance for memory areas
      that are read and written simultaneously in chunks smaller than cache
      line size. So use movdqa instead for P/Q writes in sse21 and sse22 XOR
      functions.
      Signed-off-by: Markus Stockhausen <stockhausen@collogia.de>
      Signed-off-by: NeilBrown <neilb@suse.de>
      a582564b
    • md/raid6 algorithms: xor_syndrome() for generic int · 9a5ce91d
      Committed by Markus Stockhausen
      Start the algorithms with the very basic one. It is left and right
      optimized. That means we can avoid all calculations for unneeded pages
      above the right stop offset. For pages below the left start offset we
      still need the syndrome multiplication but without reading data pages.
      Signed-off-by: Markus Stockhausen <stockhausen@collogia.de>
      Signed-off-by: NeilBrown <neilb@suse.de>
      9a5ce91d
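
The left and right optimization described in this commit can be sketched byte by byte. This is only an illustration, not the kernel code: the names `gf_mul2` and `sketch_xor_syndrome` are hypothetical, the loop works on single bytes instead of unrolled machine words, and it assumes the usual RAID6 conventions (field polynomial 0x11d, data pages first, then P, then Q).

```c
#include <stddef.h>
#include <stdint.h>

/* Multiply a GF(2^8) element by 2, assuming the polynomial 0x11d. */
static uint8_t gf_mul2(uint8_t v)
{
	return (uint8_t)((v << 1) ^ ((v & 0x80) ? 0x1d : 0));
}

/*
 * Hypothetical byte-wise sketch of a left/right optimized xor_syndrome().
 * Assumed layout: ptrs[0..disks-3] are data pages, ptrs[disks-2] is P,
 * ptrs[disks-1] is Q. The contribution of pages start..stop is XORed
 * into the existing P and Q instead of overwriting them.
 */
static void sketch_xor_syndrome(int disks, int start, int stop,
				size_t bytes, uint8_t **ptrs)
{
	uint8_t *p = ptrs[disks - 2];
	uint8_t *q = ptrs[disks - 1];

	for (size_t d = 0; d < bytes; d++) {
		/* Right side: start at 'stop'; pages above it are skipped. */
		uint8_t wp = ptrs[stop][d];
		uint8_t wq = wp;

		for (int z = stop - 1; z >= start; z--) {
			uint8_t wd = ptrs[z][d];

			wp ^= wd;
			wq = (uint8_t)(gf_mul2(wq) ^ wd);
		}

		/* Left side: keep multiplying Q, but read no data pages. */
		for (int z = start - 1; z >= 0; z--)
			wq = gf_mul2(wq);

		p[d] ^= wp;
		q[d] ^= wq;
	}
}
```

Calling the function twice with the same arguments and unchanged data restores the original P and Q, which is the property the rmw path relies on.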
    • md/raid6 algorithms: improve test program · 7e92e1d7
      Committed by Markus Stockhausen
      It is always helpful to have a test tool in place if we implement
      new data critical algorithms. So add some test routines to the raid6
      checker that can prove if the new xor_syndrome() works as expected.
      
      Run through all permutations of start/stop pages per algorithm and
      simulate an xor_syndrome()-assisted rmw run. After each rmw, check
      whether the recovery algorithm still confirms that the stripe is fine.
      Signed-off-by: Markus Stockhausen <stockhausen@collogia.de>
      Signed-off-by: NeilBrown <neilb@suse.de>
      7e92e1d7
    • md/raid6 algorithms: delta syndrome functions · fe5cbc6e
      Committed by Markus Stockhausen
      v3: s-o-b comment, explanation of performance and decision for
      the start/stop implementation
      
      Implementing rmw functionality for RAID6 requires optimized syndrome
      calculation. Up to now we can only generate a complete syndrome. The
      target P/Q pages are always overwritten. With this patch we provide
      a framework for in-place P/Q modification. For now, simply fill those
      function pointers with NULL values.
      
      xor_syndrome() has two additional parameters: start & stop. These
      indicate the first and last page that change during an rmw run.
      That makes it possible to avoid several unnecessary loops and
      speed up calculation. The caller needs to implement the following
      logic to make the functions work.
      
      1) xor_syndrome(disks, start, stop, ...): "Remove" all data of source
      blocks inside P/Q between (and including) start and stop.
      
      2) modify any block with start <= block <= stop
      
      3) xor_syndrome(disks, start, stop, ...): "Reinsert" all data of
      source blocks into P/Q between (and including) start and stop.
      
      Pages between start and stop that won't be changed should be filled
      with a pointer to the kernel zero page. The reasons for not taking NULL
      pages are:
      
      1) Algorithms cross the whole source data line by line. Thus avoid
      additional branches.
      
      2) Having a NULL page avoids calculating the XOR P parity but still
      needs calculation steps for the Q parity. Depending on the algorithm
      unrolling that might be a difference of only 2 instructions per loop.
      
      The benchmark numbers of the gen_syndrome() functions are displayed in
      the kernel log. Do the same for the xor_syndrome() functions. This
      will help to analyze performance problems and give a rough estimate
      of how well the algorithm works. The choice of the fastest algorithm
      will still depend on the gen_syndrome() performance.
      
      With the start/stop page implementation the speed can vary a lot in real
      life. E.g. a change of page 0 & page 15 on a stripe will be harder to
      compute than the case where page 0 & page 1 are XOR candidates. To avoid
      being too enthusiastic about the expected speeds, we run a worst-case
      test that simulates a change on the upper half of the stripe. So we do:
      
      1) calculation of P/Q for the upper pages
      
      2) continuation of Q for the lower (empty) pages
      Signed-off-by: Markus Stockhausen <stockhausen@collogia.de>
      Signed-off-by: NeilBrown <neilb@suse.de>
      fe5cbc6e
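
The three caller steps from this commit message (remove, modify, reinsert) can be illustrated with a small user-space sketch. Everything here is an assumption made for the demo: the names `demo_gen_syndrome`, `demo_xor_syndrome` and `demo_rmw`, the tiny page size, the four data pages, and a static zero buffer standing in for the kernel zero page. It is not the md driver's actual call sequence.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NDATA 4	/* data pages per stripe (assumed for the demo) */
#define PAGE  8	/* tiny "page" size in bytes (assumed for the demo) */

/* Multiply a GF(2^8) element by 2, assuming the polynomial 0x11d. */
static uint8_t gf_mul2(uint8_t v)
{
	return (uint8_t)((v << 1) ^ ((v & 0x80) ? 0x1d : 0));
}

/* Full recompute: P and Q are overwritten from all data pages. */
static void demo_gen_syndrome(uint8_t data[NDATA][PAGE], uint8_t *p, uint8_t *q)
{
	for (size_t d = 0; d < PAGE; d++) {
		uint8_t wp = 0, wq = 0;

		for (int z = NDATA - 1; z >= 0; z--) {
			wp ^= data[z][d];
			wq = (uint8_t)(gf_mul2(wq) ^ data[z][d]);
		}
		p[d] = wp;
		q[d] = wq;
	}
}

/* Delta update: XOR the contribution of pages start..stop into P and Q. */
static void demo_xor_syndrome(const uint8_t **src, int start, int stop,
			      uint8_t *p, uint8_t *q)
{
	for (size_t d = 0; d < PAGE; d++) {
		uint8_t wp = 0, wq = 0;

		for (int z = stop; z >= start; z--) {
			wp ^= src[z][d];
			wq = (uint8_t)(gf_mul2(wq) ^ src[z][d]);
		}
		for (int z = start - 1; z >= 0; z--)
			wq = gf_mul2(wq);	/* no data page is read here */
		p[d] ^= wp;
		q[d] ^= wq;
	}
}

/* rmw over [start, stop]; new_data[z] == NULL means page z stays as it is. */
static void demo_rmw(uint8_t data[NDATA][PAGE], const uint8_t **new_data,
		     int start, int stop, uint8_t *p, uint8_t *q)
{
	static const uint8_t zero_page[PAGE];	/* stand-in for the zero page */
	const uint8_t *src[NDATA];

	/* Unchanged pages point at the zero page, never at NULL. */
	for (int z = 0; z < NDATA; z++)
		src[z] = (z >= start && z <= stop && new_data[z]) ?
			 data[z] : zero_page;

	/* 1) "Remove" the old data of the changing pages from P/Q. */
	demo_xor_syndrome(src, start, stop, p, q);

	/* 2) Modify the pages inside the window. */
	for (int z = start; z <= stop; z++)
		if (new_data[z])
			memcpy(data[z], new_data[z], PAGE);

	/* 3) "Reinsert": src still points at the pages, which now hold new data. */
	demo_xor_syndrome(src, start, stop, p, q);
}
```

After `demo_rmw()` returns, P and Q should match what `demo_gen_syndrome()` would produce from the updated stripe, which is the same kind of check the improved test program performs for every start/stop combination.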
  2. 04 Feb, 2015 1 commit
  3. 14 Oct, 2014 1 commit
  4. 27 Aug, 2013 2 commits
  5. 09 Jul, 2013 1 commit
  6. 13 Dec, 2012 3 commits
  7. 28 May, 2012 1 commit
  8. 22 May, 2012 4 commits
  9. 29 Mar, 2012 2 commits
  10. 01 Nov, 2011 2 commits
  11. 20 Oct, 2011 1 commit
  12. 30 Aug, 2010 1 commit
  13. 12 Aug, 2010 2 commits
  14. 11 Aug, 2010 1 commit
  15. 29 Oct, 2009 1 commit