1. 27 Jun, 2009 4 commits
    • Add new __init_task_data macro to be used in arch init_task.c files. · 857eceeb
      Committed by Tim Abbott
      This patch is preparation for replacing most ".data.init_task" in the
      kernel with macros, so that the section name can later be changed
      without having to touch a lot of the kernel.
      
      The long-term goal here is to be able to change the kernel's magic
      section names to those that are compatible with -ffunction-sections
      -fdata-sections.  This requires renaming all magic sections with names
      of the form ".data.foo".
      Signed-off-by: Tim Abbott <tabbott@ksplice.com>
      Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
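The idea behind this commit can be sketched in userspace C, assuming GCC or Clang on an ELF target. The macro body shown here is an illustrative assumption about how such a macro could be written, and `demo_thread_union`, `demo_init_thread_union` and `demo_first_stack_word` are hypothetical names that exist only for this example.

```c
/*
 * Illustrative stand-in for the macro: the section name is spelled out in
 * exactly one place, so a later rename only has to touch this line.
 */
#define __init_task_data __attribute__((__section__(".data.init_task")))

/* Hypothetical, much simplified stand-in for an arch's initial stack object. */
union demo_thread_union {
    unsigned long stack[64];
};

/* The arch file uses the macro instead of naming the section itself. */
union demo_thread_union demo_init_thread_union __init_task_data = {
    .stack = { 42 },
};

/* Small accessor so the example has observable behaviour. */
unsigned long demo_first_stack_word(void)
{
    return demo_init_thread_union.stack[0];
}
```

The rename matters because -fdata-sections places an ordinary variable named foo in a section called ".data.foo", which can collide with a hand-picked section of the same form.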
    • asm-generic/vmlinux.lds.h: shuffle INIT_TASK* macro names in vmlinux.lds.h · 39a449d9
      Committed by Tim Abbott
      We recently added an INIT_TASK(align) in include/asm-generic/vmlinux.lds.h,
      but there is already a macro INIT_TASK in include/linux/init_task.h, which
      is quite confusing.  We should switch the macro in the linker script to
      INIT_TASK_DATA. (Sorry that I missed this in reviewing the patch).  Since
      the macros are new, there is only one user of the INIT_TASK in
      vmlinux.lds.h, arch/mn10300/kernel/vmlinux.lds.S.
      
      However, we are currently using INIT_TASK_DATA for laying down an entire
      .data.init_task section.  So rename that to INIT_TASK_DATA_SECTION.
      
      I would be worried about changing the meaning of INIT_TASK_DATA, but the
      old INIT_TASK_DATA implementation had no users, and in fact if anyone had
      tried to use it, it would have failed to compile because it didn't pass
      the alignment to the old INIT_TASK.
      Signed-off-by: Tim Abbott <tabbott@ksplice.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Jesper Nilsson <Jesper.Nilsson@axis.com>
      Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
    • Add new macros for page-aligned data and bss sections. · d2af12ae
      Committed by Tim Abbott
      This patch is preparation for replacing most uses of
      ".bss.page_aligned" and ".data.page_aligned" in the kernel with
      macros, so that the section name can later be changed without having
      to touch a lot of the kernel.
      
      The long-term goal here is to be able to change the kernel's magic
      section names to those that are compatible with -ffunction-sections
      -fdata-sections.  This requires renaming all magic sections with names
      of the form ".data.foo".
      Signed-off-by: Tim Abbott <tabbott@ksplice.com>
      Acked-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
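A rough userspace sketch of what page-aligned section macros could look like, assuming GCC or Clang on an ELF target. Only the two section strings come from the commit message; the macro names, `DEMO_PAGE_SIZE` and the helper function are hypothetical and chosen for this example.

```c
#include <stdint.h>

#define DEMO_PAGE_SIZE 4096  /* assumed page size for this sketch */

/* Hypothetical macro names; each bundles a section name with page alignment. */
#define demo_page_aligned_data \
    __attribute__((__section__(".data.page_aligned"), __aligned__(DEMO_PAGE_SIZE)))
#define demo_page_aligned_bss \
    __attribute__((__section__(".bss.page_aligned"), __aligned__(DEMO_PAGE_SIZE)))

/* Initialized object placed in the page-aligned data section. */
char demo_table[DEMO_PAGE_SIZE] demo_page_aligned_data = { 1 };

/* Zero-initialized object placed in the page-aligned bss section. */
char demo_zero_page[DEMO_PAGE_SIZE] demo_page_aligned_bss;

/* Returns nonzero when the address is a multiple of the assumed page size. */
int demo_is_page_aligned(const void *p)
{
    return ((uintptr_t)p % DEMO_PAGE_SIZE) == 0;
}
```

Callers that use such macros never spell out the section string, so the string can change in one header without touching them.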
    • asm-generic/vmlinux.lds.h: Fix up RW_DATA_SECTION definition. · 73f1d939
      Committed by Paul Mundt
      RW_DATA_SECTION is defined to take 4 different alignment parameters,
      while NOSAVE_DATA currently uses a fixed PAGE_SIZE alignment as noted
      in the comments.
      
      There are presently no in-tree users of this, and I just stumbled
      across it while implementing the simplified script on a new
      architecture port, which subsequently resulted in a syntax error.
      Signed-off-by: Paul Mundt <lethal@linux-sh.org>
      Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
  2. 26 Jun, 2009 1 commit
  3. 25 Jun, 2009 2 commits
  4. 24 Jun, 2009 16 commits
  5. 23 Jun, 2009 15 commits
  6. 22 Jun, 2009 2 commits
    • dm: prepare for request based option · cec47e3d
      Committed by Kiyoshi Ueda
      This patch adds core functions for request-based dm.
      
      When struct mapped_device (md) is initialized, md->queue has
      an I/O scheduler and the following functions are used for
      request-based dm as the queue functions:
          make_request_fn: dm_make_request()
          prep_rq_fn:      dm_prep_fn()
          request_fn:      dm_request_fn()
          softirq_done_fn: dm_softirq_done()
          lld_busy_fn:     dm_lld_busy()
      Actual initializations are done in another patch (PATCH 2).
      
      Below is a brief summary of how request-based dm behaves, including:
        - making request from bio
        - cloning, mapping and dispatching request
        - completing request and bio
        - suspending md
        - resuming md
      
        bio to request
        ==============
        md->queue->make_request_fn() (dm_make_request()) calls __make_request()
        for a bio submitted to the md.
        Then, the bio is kept in the queue as a new request or merged into
        another request in the queue if possible.
      
        Cloning and Mapping
        ===================
        Cloning and mapping are done in md->queue->request_fn() (dm_request_fn()),
        when requests are dispatched after they are sorted by the I/O scheduler.
      
        dm_request_fn() checks busy state of underlying devices using
        target's busy() function and stops dispatching requests to keep them
        on the dm device's queue if busy.
        This improves I/O merging, since no merge is done for a request
        once it is dispatched to underlying devices.
      
        Actual cloning and mapping are done in dm_prep_fn() and map_request()
        called from dm_request_fn().
        dm_prep_fn() clones not only request but also bios of the request
        so that dm can hold bio completion in error cases and prevent
        the bio submitter from noticing the error.
        (See the "Completion" section below for details.)
      
        After the cloning, the clone is mapped by target's map_rq() function
        and inserted into the underlying device's queue using
        blk_insert_cloned_request().
      
        Completion
        ==========
        Request completion can be hooked by rq->end_io(), but then, all bios
        in the request will have been completed even in error cases, and the bio
        submitter will have noticed the error.
        To prevent the bio completion in error cases, request-based dm clones
        both bio and request and hooks both bio->bi_end_io() and rq->end_io():
            bio->bi_end_io(): end_clone_bio()
            rq->end_io():     end_clone_request()
      
        Summary of the request completion flow is below:
        blk_end_request() for a clone request
          => blk_update_request()
             => bio->bi_end_io() == end_clone_bio() for each clone bio
                => Free the clone bio
                => Success: Complete the original bio (blk_update_request())
                   Error:   Don't complete the original bio
          => blk_finish_request()
             => rq->end_io() == end_clone_request()
                => blk_complete_request()
                   => dm_softirq_done()
                      => Free the clone request
                      => Success: Complete the original request (blk_end_request())
                         Error:   Requeue the original request
      
        end_clone_bio() partially completes the original request by the size
        of the original bio in successful cases.
        Even if all bios in the original request are completed by that
        completion, the original request must not be completed yet to keep
        the ordering of request completion for the stacking.
        So end_clone_bio() uses blk_update_request() instead of
        blk_end_request().
        In error cases, end_clone_bio() doesn't complete the original bio.
        It just frees the cloned bio and hands the error handling over to
        end_clone_request().
      
        end_clone_request(), which is called with queue lock held, completes
        the clone request and the original request in a softirq context
        (dm_softirq_done()), which has no queue lock, to avoid a deadlock
        issue on submission of another request during the completion:
            - The submitted request may be mapped to the same device
            - Request submission requires the queue lock, but the completion
              path already holds it and the submission code doesn't know that
      
        The clone request has no clone bio when dm_softirq_done() is called.
        So target drivers can't resubmit it, even in error cases.
        Instead, they can ask dm core to requeue and remap
        the original request in such cases.
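The completion flow above can be modelled with a small userspace sketch. This is a toy under stated assumptions, not the dm code: every `demo_*` name is hypothetical, sizes are plain byte counts, and the queue lock and softirq context are not modelled at all.

```c
#include <stdbool.h>
#include <stddef.h>

enum demo_state { DEMO_PENDING, DEMO_COMPLETED, DEMO_REQUEUED };

/* Toy model of an original request made up of several bios. */
struct demo_orig_request {
    size_t total_bytes;    /* sum of all original bio sizes */
    size_t updated_bytes;  /* progress recorded by clone bio completions */
    enum demo_state state;
};

/*
 * Models end_clone_bio(): on success the original request advances by the
 * bio size but is never finished here; on error nothing is completed.
 */
void demo_end_clone_bio(struct demo_orig_request *orig, size_t bio_bytes,
                        bool error)
{
    if (!error)
        orig->updated_bytes += bio_bytes;
}

/*
 * Models dm_softirq_done(): the original request is finished on success
 * and handed back for requeueing on error.
 */
void demo_softirq_done(struct demo_orig_request *orig, bool error)
{
    orig->state = error ? DEMO_REQUEUED : DEMO_COMPLETED;
}

/* Drives one clone request through both steps and returns the outcome. */
enum demo_state demo_complete_clone(struct demo_orig_request *orig,
                                    const size_t *bio_bytes,
                                    const bool *bio_error, size_t nr_bios)
{
    bool error = false;

    for (size_t i = 0; i < nr_bios; i++) {
        demo_end_clone_bio(orig, bio_bytes[i], bio_error[i]);
        error = error || bio_error[i];
    }
    demo_softirq_done(orig, error);
    return orig->state;
}
```

Splitting the work into two steps mirrors the text: per-bio completion only records progress, and the decision to finish or requeue the original request is taken once, after the whole clone request is done.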
      
        suspend
        =======
        Request-based dm suspends the md by stopping md->queue.
        For noflush suspend, it just stops md->queue.

        For flush suspend, it inserts a marker request at the tail of md->queue
        and dispatches all requests in md->queue until the marker comes to
        the front of md->queue.  Then it stops dispatching requests and waits
        for all dispatched requests to complete.
        After that, it completes the marker request, stops md->queue and
        wakes up the waiter on the suspend queue, md->wait.
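The marker-based flush suspend described above can be illustrated with a toy queue. This is a sketch under assumptions, not the dm implementation: the array-backed queue, the `DEMO_MARKER` id and all `demo_*` names are hypothetical, request ids are assumed to differ from the marker id, and waiting for in-flight requests is reduced to a comment.

```c
#include <stdbool.h>
#include <stddef.h>

#define DEMO_QUEUE_MAX 16
#define DEMO_MARKER    (-1)  /* hypothetical id reserved for the marker */

struct demo_queue {
    int reqs[DEMO_QUEUE_MAX];  /* request ids; the front is at index head */
    size_t head, tail;
    bool stopped;
};

/* Appends a request at the tail; returns false when the queue is full. */
bool demo_enqueue(struct demo_queue *q, int id)
{
    if (q->tail == DEMO_QUEUE_MAX)
        return false;
    q->reqs[q->tail++] = id;
    return true;
}

/* noflush suspend: stop the queue and leave pending requests in place. */
void demo_suspend_noflush(struct demo_queue *q)
{
    q->stopped = true;
}

/*
 * flush suspend: insert a marker at the tail, dispatch everything ahead of
 * it, then complete the marker and stop the queue. Returns the number of
 * requests dispatched, or -1 if the marker could not be queued.
 */
int demo_suspend_flush(struct demo_queue *q)
{
    int dispatched = 0;

    if (!demo_enqueue(q, DEMO_MARKER))
        return -1;
    while (q->reqs[q->head] != DEMO_MARKER) {
        q->head++;  /* dispatch the request at the front */
        dispatched++;
    }
    /* Real code would wait here for the dispatched requests to complete. */
    q->head++;      /* complete the marker request */
    q->stopped = true;
    return dispatched;
}

/* resume: start the queue again. */
void demo_resume(struct demo_queue *q)
{
    q->stopped = false;
}
```

The marker gives the suspend path a clear boundary: requests queued before the suspend are flushed, and the queue stops as soon as the marker reaches the front.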
      
        resume
        ======
        Starts md->queue.
      Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
      Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
    • dm raid1: add userspace log · f5db4af4
      Committed by Jonathan Brassow
      This patch contains a device-mapper mirror log module that forwards
      requests to userspace for processing.
      
      The structures used for communication between kernel and userspace are
      located in include/linux/dm-log-userspace.h.  Due to the frequency,
      diversity, and 2-way communication nature of the exchanges between
      kernel and userspace, 'connector' was chosen as the interface for
      communication.
      
      The first log implementations written in userspace - "clustered-disk"
      and "clustered-core" - support clustered shared storage.   A userspace
      daemon (in the LVM2 source code repository) uses openAIS/corosync to
      process requests in an ordered fashion with the rest of the nodes in the
      cluster so as to prevent log state corruption.  Other implementations
      with no association to LVM or openAIS/corosync are certainly possible.
      
      (Imagine if two machines are writing to the same region of a mirror.
      They would both mark the region dirty, but you need a cluster-aware
      entity that can handle properly marking the region clean when they are
      done.  Otherwise, you might clear the region when the first machine is
      done, not the second.)
      Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
      Cc: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
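The two-machine scenario in the commit message above can be illustrated with a minimal sketch of cluster-aware region tracking. It assumes one bit per node and at most 32 nodes; the `demo_*` names are hypothetical and do not reflect the actual userspace daemon or the kernel log interface.

```c
#include <stdbool.h>
#include <stdint.h>

/* One bit per cluster node that currently has the region marked dirty. */
struct demo_region {
    uint32_t writers;
};

/* A node starts writing to the region (node must be below 32). */
void demo_mark_region(struct demo_region *r, unsigned node)
{
    r->writers |= UINT32_C(1) << node;
}

/* A node has finished writing to the region (node must be below 32). */
void demo_clear_region(struct demo_region *r, unsigned node)
{
    r->writers &= ~(UINT32_C(1) << node);
}

/* The region is clean only once every node that marked it has cleared it. */
bool demo_region_is_clean(const struct demo_region *r)
{
    return r->writers == 0;
}
```

With a single shared dirty flag, the first machine to finish would clear the region while the second is still writing; tracking writers per node avoids that.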