1. 08 1月, 2009 1 次提交
    • D
      NOMMU: Make VMAs per MM as for MMU-mode linux · 8feae131
      David Howells 提交于
      Make VMAs per mm_struct as for MMU-mode linux.  This solves two problems:
      
       (1) In SYSV SHM where nattch for a segment does not reflect the number of
           shmat's (and forks) done.
      
       (2) In mmap() where the VMA's vm_mm is set to point to the parent mm by an
           exec'ing process when VM_EXECUTABLE is specified, regardless of the fact
           that a VMA might be shared and already have its vm_mm assigned to another
           process or a dead process.
      
      A new struct (vm_region) is introduced to track a mapped region and to remember
      the circumstances under which it may be shared and the vm_list_struct structure
      is discarded as it's no longer required.
      
      This patch makes the following additional changes:
      
       (1) Regions are now allocated with alloc_pages() rather than kmalloc() and
           with no recourse to __GFP_COMP, so the pages are not composite.  Instead,
           each page has a reference on it held by the region.  Anything else that is
           interested in such a page will have to get a reference on it to retain it.
           When the pages are released due to unmapping, each page is passed to
           put_page() and will be freed when the page usage count reaches zero.
      
       (2) Excess pages are trimmed after an allocation as the allocation must be
           made as a power-of-2 quantity of pages.
      
       (3) VMAs are added to the parent MM's R/B tree and mmap lists.  As an MM may
           end up with overlapping VMAs within the tree, the VMA struct address is
           appended to the sort key.
      
       (4) Non-anonymous VMAs are now added to the backing inode's prio list.
      
       (5) Holes may be punched in anonymous VMAs with munmap(), releasing parts of
           the backing region.  The VMA and region structs will be split if
           necessary.
      
       (6) sys_shmdt() only releases one attachment to a SYSV IPC shared memory
           segment instead of all the attachments at that addresss.  Multiple
           shmat()'s return the same address under NOMMU-mode instead of different
           virtual addresses as under MMU-mode.
      
       (7) Core dumping for ELF-FDPIC requires fewer exceptions for NOMMU-mode.
      
       (8) /proc/maps is now the global list of mapped regions, and may list bits
           that aren't actually mapped anywhere.
      
       (9) /proc/meminfo gains a line (tagged "MmapCopy") that indicates the amount
           of RAM currently allocated by mmap to hold mappable regions that can't be
           mapped directly.  These are copies of the backing device or file if not
           anonymous.
      
      These changes make NOMMU mode more similar to MMU mode.  The downside is that
      NOMMU mode requires some extra memory to track things over NOMMU without this
      patch (VMAs are no longer shared, and there are now region structs).
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Tested-by: NMike Frysinger <vapier.adi@gmail.com>
      Acked-by: NPaul Mundt <lethal@linux-sh.org>
      8feae131
  2. 07 1月, 2009 2 次提交
  3. 01 1月, 2009 2 次提交
  4. 13 12月, 2008 1 次提交
  5. 20 10月, 2008 2 次提交
    • M
      container freezer: implement freezer cgroup subsystem · dc52ddc0
      Matt Helsley 提交于
      This patch implements a new freezer subsystem in the control groups
      framework.  It provides a way to stop and resume execution of all tasks in
      a cgroup by writing in the cgroup filesystem.
      
      The freezer subsystem in the container filesystem defines a file named
      freezer.state.  Writing "FROZEN" to the state file will freeze all tasks
      in the cgroup.  Subsequently writing "RUNNING" will unfreeze the tasks in
      the cgroup.  Reading will return the current state.
      
      * Examples of usage :
      
         # mkdir /containers/freezer
         # mount -t cgroup -ofreezer freezer  /containers
         # mkdir /containers/0
         # echo $some_pid > /containers/0/tasks
      
      to get status of the freezer subsystem :
      
         # cat /containers/0/freezer.state
         RUNNING
      
      to freeze all tasks in the container :
      
         # echo FROZEN > /containers/0/freezer.state
         # cat /containers/0/freezer.state
         FREEZING
         # cat /containers/0/freezer.state
         FROZEN
      
      to unfreeze all tasks in the container :
      
         # echo RUNNING > /containers/0/freezer.state
         # cat /containers/0/freezer.state
         RUNNING
      
      This is the basic mechanism which should do the right thing for user space
      task in a simple scenario.
      
      It's important to note that freezing can be incomplete.  In that case we
      return EBUSY.  This means that some tasks in the cgroup are busy doing
      something that prevents us from completely freezing the cgroup at this
      time.  After EBUSY, the cgroup will remain partially frozen -- reflected
      by freezer.state reporting "FREEZING" when read.  The state will remain
      "FREEZING" until one of these things happens:
      
      	1) Userspace cancels the freezing operation by writing "RUNNING" to
      		the freezer.state file
      	2) Userspace retries the freezing operation by writing "FROZEN" to
      		the freezer.state file (writing "FREEZING" is not legal
      		and returns EIO)
      	3) The tasks that blocked the cgroup from entering the "FROZEN"
      		state disappear from the cgroup's set of tasks.
      
      [akpm@linux-foundation.org: coding-style fixes]
      [akpm@linux-foundation.org: export thaw_process]
      Signed-off-by: NCedric Le Goater <clg@fr.ibm.com>
      Signed-off-by: NMatt Helsley <matthltc@us.ibm.com>
      Acked-by: NSerge E. Hallyn <serue@us.ibm.com>
      Tested-by: NMatt Helsley <matthltc@us.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      dc52ddc0
    • M
      container freezer: add TIF_FREEZE flag to all architectures · 83224b08
      Matt Helsley 提交于
      This patch series introduces a cgroup subsystem that utilizes the swsusp
      freezer to freeze a group of tasks.  It's immediately useful for batch job
      management scripts.  It should also be useful in the future for
      implementing container checkpoint/restart.
      
      The freezer subsystem in the container filesystem defines a cgroup file
      named freezer.state.  Reading freezer.state will return the current state
      of the cgroup.  Writing "FROZEN" to the state file will freeze all tasks
      in the cgroup.  Subsequently writing "RUNNING" will unfreeze the tasks in
      the cgroup.
      
      * Examples of usage :
      
         # mkdir /containers/freezer
         # mount -t cgroup -ofreezer freezer  /containers
         # mkdir /containers/0
         # echo $some_pid > /containers/0/tasks
      
      to get status of the freezer subsystem :
      
         # cat /containers/0/freezer.state
         RUNNING
      
      to freeze all tasks in the container :
      
         # echo FROZEN > /containers/0/freezer.state
         # cat /containers/0/freezer.state
         FREEZING
         # cat /containers/0/freezer.state
         FROZEN
      
      to unfreeze all tasks in the container :
      
         # echo RUNNING > /containers/0/freezer.state
         # cat /containers/0/freezer.state
         RUNNING
      
      This patch:
      
      The first step in making the refrigerator() available to all
      architectures, even for those without power management.
      
      The purpose of such a change is to be able to use the refrigerator() in a
      new control group subsystem which will implement a control group freezer.
      
      [akpm@linux-foundation.org: fix sparc]
      Signed-off-by: NCedric Le Goater <clg@fr.ibm.com>
      Signed-off-by: NMatt Helsley <matthltc@us.ibm.com>
      Acked-by: NPavel Machek <pavel@suse.cz>
      Acked-by: NSerge E. Hallyn <serue@us.ibm.com>
      Acked-by: NRafael J. Wysocki <rjw@sisk.pl>
      Acked-by: NNigel Cunningham <nigel@tuxonice.net>
      Tested-by: NMatt Helsley <matthltc@us.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      83224b08
  6. 16 10月, 2008 1 次提交
  7. 07 9月, 2008 2 次提交
  8. 12 8月, 2008 1 次提交
  9. 07 8月, 2008 1 次提交
  10. 27 7月, 2008 1 次提交
  11. 25 7月, 2008 1 次提交
    • A
      PAGE_ALIGN(): correctly handle 64-bit values on 32-bit architectures · 27ac792c
      Andrea Righi 提交于
      On 32-bit architectures PAGE_ALIGN() truncates 64-bit values to the 32-bit
      boundary. For example:
      
      	u64 val = PAGE_ALIGN(size);
      
      always returns a value < 4GB even if size is greater than 4GB.
      
      The problem resides in PAGE_MASK definition (from include/asm-x86/page.h for
      example):
      
      #define PAGE_SHIFT      12
      #define PAGE_SIZE       (_AC(1,UL) << PAGE_SHIFT)
      #define PAGE_MASK       (~(PAGE_SIZE-1))
      ...
      #define PAGE_ALIGN(addr)       (((addr)+PAGE_SIZE-1)&PAGE_MASK)
      
      The "~" is performed on a 32-bit value, so everything in "and" with
      PAGE_MASK greater than 4GB will be truncated to the 32-bit boundary.
      Using the ALIGN() macro seems to be the right way, because it uses
      typeof(addr) for the mask.
      
      Also move the PAGE_ALIGN() definitions out of include/asm-*/page.h in
      include/linux/mm.h.
      
      See also lkml discussion: http://lkml.org/lkml/2008/6/11/237
      
      [akpm@linux-foundation.org: fix drivers/media/video/uvc/uvc_queue.c]
      [akpm@linux-foundation.org: fix v850]
      [akpm@linux-foundation.org: fix powerpc]
      [akpm@linux-foundation.org: fix arm]
      [akpm@linux-foundation.org: fix mips]
      [akpm@linux-foundation.org: fix drivers/media/video/pvrusb2/pvrusb2-dvb.c]
      [akpm@linux-foundation.org: fix drivers/mtd/maps/uclinux.c]
      [akpm@linux-foundation.org: fix powerpc]
      Signed-off-by: NAndrea Righi <righi.andrea@gmail.com>
      Cc: <linux-arch@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      27ac792c
  12. 23 7月, 2008 13 次提交
  13. 28 6月, 2008 1 次提交
    • A
      PCI: remove unused arch pcibios_update_resource() functions · 0aea5313
      Adrian Bunk 提交于
      Russell King did the following back in 2003:
      
      <--  snip  -->
      
          [PCI] pci-9: Kill per-architecture pcibios_update_resource()
      
          Kill pcibios_update_resource(), replacing it with pci_update_resource().
          pci_update_resource() uses pcibios_resource_to_bus() to convert a
          resource to a device BAR - the transformation should be exactly the
          same as the transformation used for the PCI bridges.
      
          pci_update_resource "knows" about 64-bit BARs, but doesn't attempt to
          set the high 32-bits to anything non-zero - currently no architecture
          attempts to do something different.  If anyone cares, please fix; I'm
          going to reflect current behaviour for the time being.
      
          Ivan pointed out the following architectures need to examine their
          pcibios_update_resource() implementation - they should make sure that
          this new implementation does the right thing.  #warning's have been
          added where appropriate.
      
              ia64
              mips
              mips64
      
          This cset also includes a fix for the problem reported by AKPM where
          64-bit arch compilers complain about the resource mask being placed
          in a u32.
      
      <--  snip  -->
      
      This patch removes the unused pcibios_update_resource() functions the
      kernel gained since, from FRV, m68k, mips & sh architectures.
      Signed-off-by: NAdrian Bunk <bunk@kernel.org>
      Acked-by: NDavid Howells <dhowells@redhat.com>
      Acked-by: NGreg Ungerer <gerg@uclinux.org>
      Acked-by: NPaul Mundt <lethal@linux-sh.org>
      Acked-by: NRalf Baechle <ralf@linux-mips.org>
      Signed-off-by: NJesse Barnes <jbarnes@virtuousgeek.org>
      0aea5313
  14. 13 6月, 2008 1 次提交
  15. 17 5月, 2008 1 次提交
  16. 13 5月, 2008 2 次提交
  17. 04 5月, 2008 1 次提交
    • U
      unified (weak) sys_pipe implementation · d35c7b0e
      Ulrich Drepper 提交于
      This replaces the duplicated arch-specific versions of "sys_pipe()" with
      one unified implementation.  This removes almost 250 lines of duplicated
      code.
      
      It's marked __weak, so that *if* an architecture wants to override the
      default implementation it can do so by simply having its own replacement
      version, since many architectures use alternate calling conventions for
      the 'pipe()' system call for legacy reasons (ie traditional UNIX
      implementations often return the two file descriptors in registers)
      
      I still haven't changed the cris version even though Linus says the BKL
      isn't needed.  The arch maintainer can easily do it if there are really
      no obstacles.
      Signed-off-by: NUlrich Drepper <drepper@redhat.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      d35c7b0e
  18. 01 5月, 2008 6 次提交