1. 29 May 2011, 5 commits
    • dm kcopyd: alloc pages from the main page allocator · d0471458
      Committed by Mikulas Patocka
      This patch changes dm-kcopyd so that it allocates pages from the main
      page allocator with __GFP_NOWARN | __GFP_NORETRY flags (so that it can
      fail in case of memory pressure). If the allocation fails, dm-kcopyd
      allocates pages from its own reserve.
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
      d0471458
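      A minimal sketch of the pattern described above, with illustrative names
      (the GFP_NOIO base flag and the reserve list type are assumptions; the
      commit itself only names __GFP_NOWARN | __GFP_NORETRY and a private
      reserve):

        #include <linux/gfp.h>
        #include <linux/list.h>
        #include <linux/mm.h>

        /* Sketch only: 'reserve' stands in for kcopyd's private page reserve. */
        static struct page *alloc_page_or_reserve(struct list_head *reserve)
        {
            struct page *p;

            /* Fail fast under memory pressure instead of retrying or warning. */
            p = alloc_page(GFP_NOIO | __GFP_NOWARN | __GFP_NORETRY);
            if (p)
                return p;

            /* Main allocator refused: fall back to our own reserve. */
            if (!list_empty(reserve)) {
                p = list_first_entry(reserve, struct page, lru);
                list_del(&p->lru);
            }
            return p;    /* may still be NULL if the reserve is empty too */
        }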
    • dm kcopyd: add gfp parm to alloc_pl · f99b55ee
      Committed by Mikulas Patocka
      Introduce a parameter for gfp flags to alloc_pl() for use in following
      patches.
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
      f99b55ee
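      The shape of such a helper, sketched with a simplified node type (the
      real code operates on struct page_list from the dm-io interface):

        #include <linux/gfp.h>
        #include <linux/mm.h>
        #include <linux/slab.h>

        struct pl_node {                /* simplified stand-in for struct page_list */
            struct pl_node *next;
            struct page *page;
        };

        /* Callers can now decide how hard the allocation is allowed to try. */
        static struct pl_node *alloc_pl(gfp_t gfp)
        {
            struct pl_node *pl;

            pl = kmalloc(sizeof(*pl), gfp);
            if (!pl)
                return NULL;

            pl->page = alloc_page(gfp);
            if (!pl->page) {
                kfree(pl);
                return NULL;
            }
            pl->next = NULL;
            return pl;
        }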
    • dm kcopyd: remove superfluous page allocation spinlock · 4cc1b4cf
      Committed by Mikulas Patocka
      Remove the spinlock protecting the pages allocation.  The spinlock is only
      taken on initialization or from the single-threaded workqueue.  Therefore, the
      spinlock is useless.
      
      The spinlock is taken in kcopyd_get_pages and kcopyd_put_pages.
      
      kcopyd_get_pages is only called from run_pages_job, which is only
      called from process_jobs called from do_work.
      
      kcopyd_put_pages is called from client_alloc_pages (which is an initialization
      function) or from run_complete_job. run_complete_job is only called from
      process_jobs called from do_work.
      
      Another spinlock, kc->job_lock is taken each time someone pushes or pops
      some work for the worker thread.  Once we take kc->job_lock, we
      guarantee that any written memory is visible to the other CPUs.
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
      4cc1b4cf
    • dm kcopyd: preallocate sub jobs to avoid deadlock · c6ea41fb
      Committed by Mikulas Patocka
      There's a possible theoretical deadlock in dm-kcopyd because multiple
      allocations from the same mempool are required to finish a request.
      Avoid this by preallocating sub jobs.
      
      There is a mempool of 512 entries. Each request requires up to 9
      entries from the mempool. If we have at least 57 concurrent requests
      running, the mempool may overflow and mempool allocations may start
      blocking until another entry is freed to the mempool. Because the same
      thread is used to free entries to the mempool and allocate entries from
      the mempool, this may result in a deadlock.
      
      This patch changes it so that one mempool entry contains all 9 "struct
      kcopyd_job" required to fulfill the whole request. The allocation is
      done only once in dm_kcopyd_copy and no further mempool allocations are
      done during request processing.
      
      If dm_kcopyd_copy is not run in the completion thread, this
      implementation is deadlock-free.
      
      MIN_JOBS needs reducing accordingly and we've chosen to reduce it
      further to 8.
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
      c6ea41fb
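      To make the arithmetic concrete: 512 / 9 = 56, so 56 requests can hold
      504 entries and a 57th request may block on the remaining 8. A rough
      sketch of the fix, with illustrative names and a simplified job type:

        #include <linux/list.h>
        #include <linux/mempool.h>
        #include <linux/slab.h>

        struct sub_job {                        /* stand-in for struct kcopyd_job */
            struct list_head list;
            void *context;
        };

        #define SPLIT_COUNT      8                  /* sub-jobs a request may spawn */
        #define JOBS_PER_REQUEST (SPLIT_COUNT + 1)  /* plus the master job = 9 */
        #define MIN_JOBS         8                  /* pool elements, down from 512 */

        struct job_block {                      /* one mempool element per request */
            struct sub_job jobs[JOBS_PER_REQUEST];
        };

        /* One mempool_alloc() of a whole block per dm_kcopyd_copy(); no further
         * pool allocations happen while the request is processed, so the worker
         * can never block waiting for entries that only it could return. */
        static mempool_t *create_job_pool(void)
        {
            return mempool_create_kmalloc_pool(MIN_JOBS, sizeof(struct job_block));
        }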
    • dm kcopyd: avoid pointless job splitting · a705a34a
      Committed by Mikulas Patocka
      Don't split SUB_JOB_SIZE jobs
      
      If the job size equals SUB_JOB_SIZE, there is no point in splitting it.
      Splitting it just unnecessarily wastes time, because the split job size
      is SUB_JOB_SIZE too.
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
      a705a34a
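      In effect the dispatch test becomes inclusive. A hedged sketch (the
      SUB_JOB_SIZE value and helper names are illustrative):

        #define SUB_JOB_SIZE 128    /* sectors; illustrative value */

        /* Previously the comparison was '<', so a job of exactly SUB_JOB_SIZE
         * sectors was "split" into a single, identical sub-job. */
        static void dispatch_or_split(unsigned long sectors,
                                      void (*dispatch_job)(void),
                                      void (*split_job)(void))
        {
            if (sectors <= SUB_JOB_SIZE)
                dispatch_job();
            else
                split_job();
        }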
  2. 10 March 2011, 2 commits
    • block: kill off REQ_UNPLUG · 721a9602
      Committed by Jens Axboe
      With the plugging now being explicitly controlled by the
      submitter, callers need not pass down unplugging hints
      to the block layer. If they want to unplug, it's because they
      manually plugged on their own - in which case, they should just
      unplug at will.
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
      721a9602
    • block: remove per-queue plugging · 7eaceacc
      Committed by Jens Axboe
      Code has been converted over to the new explicit on-stack plugging,
      and delay users have been converted to use the new API for that.
      So lets kill off the old plugging along with aops->sync_page().
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
      7eaceacc
  3. 14 January 2011, 4 commits
    • dm: use non reentrant workqueues if equivalent · 9c4376de
      Committed by Tejun Heo
      kmirrord_wq, kcopyd_work and md->wq are created per dm instance and
      serve only a single work item from the dm instance, so non-reentrant
      workqueues would provide the same ordering guarantees as ordered ones
      while allowing CPU affinity and use of the workqueues for other
      purposes.  Switch them to non-reentrant workqueues.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
      9c4376de
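      For a workqueue of that era the switch amounts to something like the
      following (queue name and the WQ_MEM_RECLAIM flag are illustrative;
      WQ_NON_REENTRANT has since been removed from mainline after all
      workqueues became non-reentrant):

        #include <linux/workqueue.h>

        /* Non-reentrant: a given work item never runs on two CPUs at once,
         * preserving the ordering a single-threaded queue provided while
         * letting the workqueue core pick a convenient CPU. */
        static struct workqueue_struct *create_kcopyd_wq(void)
        {
            return alloc_workqueue("kcopyd", WQ_NON_REENTRANT | WQ_MEM_RECLAIM, 0);
        }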
    • dm: convert workqueues to alloc_ordered · 4d4d66ab
      Committed by Tejun Heo
      Convert all create[_singlethread]_workqueue() users to the new
      alloc[_ordered]_workqueue().  This conversion is mechanical and
      doesn't introduce any behavior change.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
      4d4d66ab
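      The mechanical shape of the conversion, roughly (queue name and flags
      are illustrative):

        #include <linux/errno.h>
        #include <linux/workqueue.h>

        static struct workqueue_struct *kcopyd_wq;

        static int example_init(void)
        {
            /* Before: kcopyd_wq = create_singlethread_workqueue("kcopyd"); */
            kcopyd_wq = alloc_ordered_workqueue("kcopyd", WQ_MEM_RECLAIM);
            return kcopyd_wq ? 0 : -ENOMEM;
        }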
    • dm kcopyd: delay unplugging · 8d35d3e3
      Committed by Mikulas Patocka
      Make kcopyd merge more I/O requests by using device unplugging.
      
      Without this patch, each I/O request is dispatched separately to the device.
      If the device supports tagged queuing, there are many small requests sent
      to the device. To improve performance, this patch will batch as many requests
      as possible, allowing the queue to merge consecutive requests, and send them
      to the device at once.
      
      In my tests (15k SCSI disk), this patch improves sequential write throughput:
      
        Sequential write throughput (chunksize of 4k, 32k, 512k)
        unpatched: 15.2, 18.5, 17.5 MB/s
        patched:   14.4, 22.6, 23.0 MB/s
      
      In most common uses (snapshot or two-way mirror), kcopyd is only used for
      two devices, one for reading and the other for writing, thus this optimization
      is implemented only for two devices. The optimization may be extended to n-way
      mirrors with some code complexity increase.
      
      We keep track of two block devices to unplug (one for read and the
      other for write) and unplug them when exiting "do_work" thread.  If
      there are more devices used (in theory it could happen, in practice it
      is rare), we unplug immediately.
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
      8d35d3e3
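      A sketch of the bookkeeping described above, using the block-layer
      interface of that kernel generation (blk_unplug() was later removed by
      the on-stack plugging rework; all names here are illustrative):

        #include <linux/blkdev.h>

        /* Remember at most one device per direction while do_work() runs. */
        static struct block_device *unplug_dev[2];    /* [READ], [WRITE] */

        static void note_device(struct block_device *bdev, int rw)
        {
            if (!unplug_dev[rw])
                unplug_dev[rw] = bdev;                 /* batch: unplug later */
            else if (unplug_dev[rw] != bdev)
                blk_unplug(bdev_get_queue(bdev));      /* rare extra device */
        }

        /* Kick both queues once, when the worker finishes its batch of jobs. */
        static void unplug_batched(void)
        {
            int rw;

            for (rw = 0; rw < 2; rw++) {
                if (unplug_dev[rw])
                    blk_unplug(bdev_get_queue(unplug_dev[rw]));
                unplug_dev[rw] = NULL;
            }
        }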
    • dm io: remove BIO_RW_SYNCIO flag from kcopyd · d9bf0b50
      Committed by Mikulas Patocka
      Remove the REQ_SYNC flag to improve write throughput when writing
      to the origin with a snapshot on the same device (using the CFQ I/O
      scheduler).
      
      Sequential write throughput (chunksize of 4k, 32k, 512k)
        unpatched:  8.5,  8.6,  9.3 MB/s
        patched:   15.2, 18.5, 17.5 MB/s
      
      Snapshot exception reallocations are triggered by writes that are
      usually async, so mark the associated dm_io_request as async as well.
      This helps when using the CFQ I/O scheduler because it has separate
      queues for sync and async I/O.  Async is optimized for throughput; sync
      for latency.  With this change we're consciously favoring throughput over
      latency.
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
      d9bf0b50
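      In dm-io terms the change boils down to no longer OR-ing the sync hint
      into the request flags; a minimal sketch using the field name of that
      era (bi_rw, later renamed):

        #include <linux/dm-io.h>
        #include <linux/fs.h>

        /* kcopyd writes stop carrying the sync hint, so CFQ places them on
         * its async queue and optimises them for throughput. */
        static void mark_kcopyd_write_async(struct dm_io_request *io_req)
        {
            io_req->bi_rw = WRITE;    /* previously: WRITE | the sync flag */
        }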
  4. 08 August 2010, 1 commit
    • block: unify flags for struct bio and struct request · 7b6d91da
      Committed by Christoph Hellwig
      Remove the current bio flags and reuse the request flags for the bio, too.
      This makes it easier to trace the type of I/O from the filesystem
      down to the block driver.  There were two flags in the bio that were
      missing in the requests:  BIO_RW_UNPLUG and BIO_RW_AHEAD.  Also I've
      renamed two request flags that had a superfluous RW in them.
      
      Note that the flags are in bio.h despite having the REQ_ name - as
      blkdev.h includes bio.h that is the only way to go for now.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
      7b6d91da
  5. 11 December 2009, 1 commit
    • dm kcopyd: accept zero size jobs · 9ca170a3
      Committed by Mikulas Patocka
      dm-kcopyd: accept zero-size jobs
      
      This patch changes dm-kcopyd so that it accepts zero-size jobs and completes
      them immediately via its completion thread.
      
      It is needed for multisnapshots snapshot resizing. When we are writing to
      a chunk beyond origin end, no copying is done. To simplify the code, we submit
      an empty request to kcopyd and let kcopyd complete it. If we didn't submit
      a request to kcopyd and called the completion routine immediately, it would
      violate the principle that completion is called only from one thread and
      it would need additional locking.
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
      9ca170a3
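      Conceptually the submission path gains an early exit of this shape
      (names and the simplified job type are illustrative; the point is that
      a zero-size job is queued straight for completion, so the callback still
      arrives from the kcopyd thread):

        #include <linux/list.h>
        #include <linux/workqueue.h>

        struct zjob {                     /* stand-in for struct kcopyd_job */
            struct list_head list;
            unsigned long count;          /* sectors to copy; may be 0 */
        };

        static void submit_job(struct zjob *job,
                               struct list_head *pages_jobs,
                               struct list_head *complete_jobs,
                               struct workqueue_struct *wq,
                               struct work_struct *work)
        {
            if (job->count)
                list_add_tail(&job->list, pages_jobs);       /* normal copy path */
            else
                list_add_tail(&job->list, complete_jobs);    /* nothing to copy */

            queue_work(wq, work);    /* completion reported by the worker thread */
        }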
  6. 09 April 2009, 2 commits
    • dm kcopyd: fix callback race · 340cd444
      Committed by Mikulas Patocka
      If the thread calling dm_kcopyd_copy is delayed due to scheduling inside
      split_job/segment_complete and the subjobs complete before the loop in
      split_job completes, the kcopyd callback could be invoked from the
      thread that called dm_kcopyd_copy instead of the kcopyd workqueue.
      
      dm_kcopyd_copy -> split_job -> segment_complete -> job->fn()
      
      Snapshots depend on the fact that callbacks are called from the single-threaded
      kcopyd workqueue and expect that there is no racing between individual
      callbacks. Racing between callbacks can lead to corruption of the exception
      store, and it can also mean that exception store callbacks are called twice
      for the same exception - a likely reason for crashes reported inside
      pending_complete() / remove_exception().
      
      This patch fixes two problems:
      
      1. job->fn being called from the thread that submitted the job (see above).
      
      - Fix: hand over the completion callback to the kcopyd thread.
      
      2. job->fn(read_err, write_err, job->context); in segment_complete
      reports the error of the last subjob, not the union of all errors.
      
      - Fix: pass job->write_err to the callback to report all error bits
        (it is done already in run_complete_job)
      
      Cc: stable@kernel.org
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
      340cd444
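      A sketch of the first fix: when the last sub-job finishes,
      segment_complete() no longer invokes the callback itself but queues the
      master job - carrying the accumulated write_err - for the completion
      path, so job->fn() always runs from the kcopyd workqueue (simplified,
      illustrative types):

        #include <linux/list.h>

        struct mjob {                       /* stand-in for struct kcopyd_job */
            struct list_head list;
            unsigned long write_err;        /* union of all sub-job error bits */
            void (*fn)(int read_err, unsigned long write_err, void *context);
            void *context;
        };

        /* Called when the final sub-job of a split request completes.  Instead
         * of calling job->fn() here - possibly in the submitter's context, and
         * with only the last sub-job's error - hand the job to the worker. */
        static void last_segment_done(struct mjob *job, struct list_head *complete_jobs)
        {
            list_add_tail(&job->list, complete_jobs);
        }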
    • dm kcopyd: prepare for callback race fix · 73830857
      Committed by Mikulas Patocka
      Use a variable in segment_complete() to point to the dm_kcopyd_client
      struct and only release job->pages in run_complete_job() if any are
      defined.  These changes are needed by the next patch.
      
      Cc: stable@kernel.org
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
      73830857
  7. 18 February 2009, 1 commit
  8. 22 October 2008, 2 commits
    • dm: remove dm header from targets · 586e80e6
      Committed by Mikulas Patocka
      Change #include "dm.h" to #include <linux/device-mapper.h> in all targets.
      Targets should not need direct access to internal DM structures.
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
      586e80e6
    • dm kcopyd: avoid queue shuffle · b673c3a8
      Committed by Kazuo Ito
      Write throughput to an LVM snapshot origin volume is an order
      of magnitude slower than that to an LV without snapshots or to
      snapshot target volumes, especially in the case of sequential
      writes with O_SYNC on.
      
      The following patch, originally written by Kevin Jamieson and
      Jan Blunck and slightly modified for the current RCs by myself,
      tries to improve the performance by modifying the behaviour of
      kcopyd so that it pushes an I/O job back to the head of the
      job queue, instead of to the tail as process_jobs() currently
      does, when it has to wait for free pages. This way, write
      requests aren't shuffled in a way that causes extra seeks.
      
      I tested the patch against 2.6.27-rc5 and got the following results.
      The test is a dd command writing to snapshot origin followed by fsync
      to the file just created/updated.  A couple of filesystem benchmarks
      gave me similar results in case of sequential writes, while random
      writes didn't suffer much.
      
      dd if=/dev/zero of=<somewhere on snapshot origin> bs=4096 count=...
         [conv=notrunc when updating]
      
      1) linux 2.6.27-rc5 without the patch, write to snapshot origin,
      average throughput (MB/s)
                           10M     100M    1000M
      create,dd         511.46   610.72    11.81
      create,dd+fsync     7.10     6.77     8.13
      update,dd         431.63   917.41    12.75
      update,dd+fsync     7.79     7.43     8.12
      
      Compared with the write throughput to an LV without any snapshots (shown below),
      all dd+fsync and 1000 MiB writes perform very poorly.
      
                           10M     100M    1000M
      create,dd         555.03   608.98   123.29
      create,dd+fsync   114.27    72.78    76.65
      update,dd         152.34  1267.27   124.04
      update,dd+fsync   130.56    77.81    77.84
      
      2) linux 2.6.27-rc5 with the patch, write to snapshot origin,
      average throughput (MB/s)
      
                           10M     100M    1000M
      create,dd         537.06   589.44    46.21
      create,dd+fsync    31.63    29.19    29.23
      update,dd         487.59   897.65    37.76
      update,dd+fsync    34.12    30.07    26.85
      
      Although still not on par with plain LV performance - which
      cannot be avoided because it's copy-on-write anyway - this
      simple patch successfully improves the throughput of dd+fsync
      while not affecting the rest.
      Signed-off-by: Jan Blunck <jblunck@suse.de>
      Signed-off-by: Kazuo Ito <ito.kazuo@oss.ntt.co.jp>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
      Cc: stable@kernel.org
      b673c3a8
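      The behavioural change is essentially list_add() versus list_add_tail()
      when a job has to wait for free pages (illustrative sketch):

        #include <linux/list.h>

        struct qjob {                       /* stand-in for struct kcopyd_job */
            struct list_head list;
        };

        /* When no pages are available, requeue the job at the *head* so it is
         * retried first, keeping jobs - and the resulting writes - in their
         * original order instead of shuffling them to the back of the queue. */
        static void requeue_for_pages(struct qjob *job, struct list_head *pages_jobs)
        {
            list_add(&job->list, pages_jobs);    /* was: list_add_tail(...) */
        }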
  9. 25 April 2008, 8 commits
    • dm: unplug queues in threads · 7ff14a36
      Committed by Mikulas Patocka
      Remove an avoidable 3ms delay on some dm-raid1 and kcopyd I/O.
      
      It is specified that any submitted bio without the BIO_RW_SYNC flag may plug the
      queue (i.e. block the requests from being dispatched to the physical device).

      The queue is unplugged when the caller calls the blk_unplug() function. Usually, the
      sequence is that someone calls submit_bh to submit IO on a buffer. The IO plugs
      the queue and waits (to be possibly joined with other adjacent bios). Then, when
      the caller calls wait_on_buffer(), it unplugs the queue and submits the IOs to
      the disk.
      
      This is what was happening:
      
      When doing O_SYNC writes, function fsync_buffers_list() submits a list of
      bios to dm_raid1, the bios are added to dm_raid1 write queue and kmirrord is
      woken up.
      
      fsync_buffers_list() calls wait_on_buffer().  That unplugs the queue, but
      there are no bios on the device queue as they are still in the dm_raid1 queue.
      
      wait_on_buffer() starts waiting until the IO is finished.
      
      kmirrord is scheduled, kmirrord takes bios and submits them to the devices.
      
      The submitted bio plugs the hard disk queue, but there is no one to unplug it.
      (The process that called wait_on_buffer() is already sleeping.)
      
      So there is a 3ms timeout, after which the queues on the hard disks are
      unplugged and requests are processed.

      This 3ms timeout meant that in certain workloads (e.g. O_SYNC, 8kb writes),
      dm-raid1 was 10 times slower than md raid1.
      
      Every time we submit something asynchronously via dm_io, we must actually
      unplug the queue to send the request to the device.
      
      This patch adds an unplug call to kmirrord - while processing requests, it keeps
      the queue plugged (so that adjacent bios can be merged); when it finishes
      processing all the bios, it unplugs the queue to submit the bios.
      
      It also fixes kcopyd which has the same potential problem. All kcopyd requests
      are submitted with BIO_RW_SYNC.
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
      Acked-by: Jens Axboe <jens.axboe@oracle.com>
      7ff14a36
    • dm: move include files · a765e20e
      Committed by Alasdair G Kergon
      Publish the dm-io, dm-log and dm-kcopyd headers in include/linux.
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
      a765e20e
    • dm kcopyd: rename · 2d1e580a
      Committed by Alasdair G Kergon
      Rename kcopyd.[ch] to dm-kcopyd.[ch].
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
      2d1e580a
    • dm kcopyd: remove redundant client counting · 945fa4d2
      Committed by Mikulas Patocka
      Remove client counting code that is no longer needed.
      
      Initialization and destruction are done globally from dm_init and dm_exit and are
      not based on client counts. Initialization allocates only one empty slab cache,
      so there is no negative impact from performing the initialization always,
      regardless of whether some client uses kcopyd or not.
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
      945fa4d2
    • dm kcopyd: private mempool · 08d8757a
      Committed by Mikulas Patocka
      Change the global mempool in kcopyd into a per-device mempool to avoid
      deadlock possibilities.
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
      08d8757a
    • dm kcopyd: per device · 8c0cbc2f
      Committed by Mikulas Patocka
      Make one kcopyd thread per device.
      
      The original shared kcopyd could deadlock.
      
      Configuration:
      8c0cbc2f
    • dm kcopyd: clean interface · eb69aca5
      Committed by Heinz Mauelshagen
      Clean up the kcopyd interface to prepare for publishing it in include/linux.
      Signed-off-by: Heinz Mauelshagen <hjm@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
      eb69aca5
    • dm io: clean interface · 22a1ceb1
      Committed by Heinz Mauelshagen
      Clean up the dm-io interface to prepare for publishing it in include/linux.
      Signed-off-by: Heinz Mauelshagen <hjm@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
      22a1ceb1
  10. 29 March 2008, 1 commit
  11. 20 October 2007, 1 commit
  12. 13 July 2007, 1 commit
  13. 10 May 2007, 1 commit
  14. 08 December 2006, 1 commit
  15. 22 November 2006, 1 commit
  16. 01 July 2006, 1 commit
  17. 27 June 2006, 1 commit
  18. 28 March 2006, 2 commits
  19. 27 March 2006, 2 commits
  20. 19 January 2006, 1 commit
    • [PATCH] EDAC: atomic scrub operations · 715b49ef
      Committed by Alan Cox
      EDAC requires a way to scrub memory if an ECC error is found and the chipset
      does not do the work automatically.  That means rewriting memory locations
      atomically with respect to all CPUs _and_ bus masters.  That means we can't
      use atomic_add(foo, 0) as it gets optimised for non-SMP
      
      This adds a function to include/asm-foo/atomic.h for the platforms currently
      supported which implements a scrub of a mapped block.
      
      It also adjusts the include order in a few other files where atomic.h is included
      before types.h, as this now causes an error because atomic_scrub uses u32.
      Signed-off-by: Alan Cox <alan@redhat.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      715b49ef
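      On x86 the added helper boils down to rewriting each word of the mapped
      block with a locked no-op add, roughly as below (a sketch of the idea;
      per-architecture details vary):

        #include <linux/types.h>

        /* Rewrite 'size' bytes at 'va' (mapped, 4-byte aligned) so the memory
         * controller recomputes the ECC.  The lock prefix keeps each
         * read-modify-write atomic with respect to other CPUs and bus masters,
         * which a plain atomic_add() would not guarantee on a non-SMP build. */
        static inline void atomic_scrub(void *va, u32 size)
        {
            u32 i, *p = va;

            for (i = 0; i < size / 4; i++, p++)
                asm volatile("lock; addl $0, %0" : "+m" (*p));
        }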
  21. 07 January 2006, 1 commit