1. 10 Jun, 2008 1 commit
  2. 07 Jun, 2008 1 commit
  3. 04 May, 2008 1 commit
  4. 29 Apr, 2008 19 commits
  5. 28 Apr, 2008 2 commits
    • mempolicy: rework mempolicy Reference Counting [yet again] · 52cd3b07
      Committed by Lee Schermerhorn
      After further discussion with Christoph Lameter, it has become clear that my
      earlier attempts to clean up the mempolicy reference counting were a bit of
      overkill in some areas, resulting in superfluous ref/unref in what are usually
      fast paths.  In other areas, further inspection reveals that I botched the
      unref for interleave policies.
      
      A separate patch, suitable for upstream/stable trees, fixes up the known
      errors in the previous attempt to fix reference counting.
      
      This patch reworks the memory policy reference counting and, one hopes,
      simplifies the code.  Maybe I'll get it right this time.
      
      See the update to the numa_memory_policy.txt document for a discussion of
      memory policy reference counting that motivates this patch.
      
      Summary:
      
      Lookup of mempolicy, based on (vma, address) need only add a reference for
      shared policy, and we need only unref the policy when finished for shared
      policies.  So, this patch backs out all of the unneeded extra reference
      counting added by my previous attempt.  It then unrefs only shared policies
      when we're finished with them, using the mpol_cond_put() [conditional put]
      helper function introduced by this patch.
      
      Note that shmem_swapin() calls read_swap_cache_async() with a dummy vma
      containing just the policy.  read_swap_cache_async() can call alloc_page_vma()
      multiple times, so we can't let alloc_page_vma() unref the shared policy in
      this case.  To avoid this, we make a copy of any non-null shared policy and
      remove the MPOL_F_SHARED flag from the copy.  This copy occurs before reading
      a page [or multiple pages] from swap, so the overhead should not be an issue
      here.
      
      I introduced a new static inline function "mpol_cond_copy()" to copy the
      shared policy to an on-stack policy and remove the flags that would require a
      conditional free.  The current implementation of mpol_cond_copy() assumes that
      the struct mempolicy contains no pointers to dynamically allocated structures
      that must be duplicated or reference counted during copy.
      Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
      Cc: Christoph Lameter <clameter@sgi.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
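The conditional put and conditional copy described in this commit can be sketched as a small userspace model. This is not the kernel code: the struct layout, the plain-int reference count, the flag value, and the `lookup_policy()` helper are assumptions made for illustration, and having the copy drop the reference on the original is likewise an assumption of this model rather than something the commit text states.

```c
#include <stddef.h>

/* Userspace model only. Names follow the commit text; the layout,
 * the flag value and lookup_policy() are illustrative assumptions. */
#define MPOL_F_SHARED 0x1u

struct mempolicy {
	int refcnt;          /* plain int here; a real kernel would use an atomic */
	unsigned int flags;
	int mode;
};

/* Lookup: only shared policies gain a reference. */
static struct mempolicy *lookup_policy(struct mempolicy *pol)
{
	if (pol && (pol->flags & MPOL_F_SHARED))
		pol->refcnt++;
	return pol;
}

/* Conditional put: drop the reference only if lookup took one. */
static void mpol_cond_put(struct mempolicy *pol)
{
	if (pol && (pol->flags & MPOL_F_SHARED))
		pol->refcnt--;
}

/* Conditional copy: snapshot a shared policy into caller-provided
 * storage and clear MPOL_F_SHARED on the copy, so later conditional
 * puts on the copy are no-ops. Non-shared policies are returned as is.
 * Assumes struct mempolicy holds no pointers that need duplicating. */
static struct mempolicy *mpol_cond_copy(struct mempolicy *to,
					struct mempolicy *from)
{
	if (!from || !(from->flags & MPOL_F_SHARED))
		return from;
	*to = *from;
	to->flags &= ~MPOL_F_SHARED;
	mpol_cond_put(from);   /* assumption: the copy replaces the reference */
	return to;
}
```

With the copy in hand, a caller that allocates several pages in a row can pass the same on-stack policy each time without unbalancing the reference count of the shared original.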
    • mempolicy: fixup Fallback for Default Shmem Policy · ae4d8c16
      Committed by Lee Schermerhorn
      get_vma_policy() is not handling fallback to task policy correctly when the
      get_policy() vm_op returns NULL.  The NULL overwrites the 'pol' variable that
      was holding the fallback task mempolicy.  So, it was falling back directly to
      system default policy.
      
      Fix get_vma_policy() to use only non-NULL policy returned from the vma
      get_policy op.
      
      shm_get_policy() was falling back to current task's mempolicy if the "backing
      file system" [tmpfs vs hugetlbfs] does not support the get_policy vm_op and
      the vma policy is null.  This is incorrect for show_numa_maps() which is
      likely querying the numa_maps of some task other than current.  Remove this
      fallback.
      Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
      Cc: Christoph Lameter <clameter@sgi.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
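The fallback fix in this commit can be illustrated with a simplified userspace model. The struct definitions, the explicit `task_policy` parameter, and the two example ops (`null_get_policy`, `shared_get_policy`) are hypothetical stand-ins; only the idea of not letting a NULL result overwrite the fallback comes from the commit text.

```c
#include <stddef.h>

/* Userspace model; all types here are simplified assumptions. */
struct mempolicy { int mode; };
struct vma;
typedef struct mempolicy *(*get_policy_fn)(struct vma *vma);

struct vma {
	get_policy_fn get_policy;     /* optional op, may be NULL */
	struct mempolicy *vm_policy;  /* may be NULL */
};

static struct mempolicy default_policy = { 0 };
static struct mempolicy shared_policy = { 3 };

/* Hypothetical ops: one finds no policy, one returns a shared policy. */
static struct mempolicy *null_get_policy(struct vma *vma)
{
	(void)vma;
	return NULL;
}

static struct mempolicy *shared_get_policy(struct vma *vma)
{
	(void)vma;
	return &shared_policy;
}

/* Start from the task policy and replace it only with a non-NULL
 * policy, so a NULL from the op no longer skips the task fallback. */
static struct mempolicy *get_vma_policy(struct mempolicy *task_policy,
					struct vma *vma)
{
	struct mempolicy *pol = task_policy;

	if (vma) {
		if (vma->get_policy) {
			struct mempolicy *vpol = vma->get_policy(vma);
			if (vpol)
				pol = vpol;
		} else if (vma->vm_policy) {
			pol = vma->vm_policy;
		}
	}
	if (!pol)
		pol = &default_policy;  /* last resort: system default */
	return pol;
}
```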
  6. 19 Apr, 2008 2 commits
  7. 11 Mar, 2008 1 commit
    • mempolicy: fix reference counting bugs · 69682d85
      Committed by Lee Schermerhorn
      Address 3 known bugs in the current memory policy reference counting method.
      I have a series of patches to rework the reference counting to reduce overhead
      in the allocation path.  However, that series will require testing in -mm once
      I repost it.
      
      1) alloc_page_vma() does not release the extra reference taken for
         vma/shared mempolicy when the mode == MPOL_INTERLEAVE.  This can result in
         leaking mempolicy structures.  This is probably occurring, but not being
         noticed.
      
         Fix:  add the conditional release of the reference.
      
      2) huge_zonelist() unconditionally releases a reference on the mempolicy when
         mode == MPOL_INTERLEAVE.  This can result in decrementing the reference
         count for system default policy [should have no ill effect] or premature
         freeing of task policy.  If this occurred, the next allocation using task
         mempolicy would use the freed structure and probably BUG out.
      
         Fix:  add the necessary check to the release.
      
      3) The current reference counting method assumes that vma 'get_policy()'
         methods automatically add an extra reference to a non-NULL returned mempolicy.
         This is true for shmem_get_policy() used by tmpfs mappings, including
         regular page shm segments.  However, SHM_HUGETLB shm's, backed by
         hugetlbfs, just use the vma policy without the extra reference.  This
         results in freeing of the vma policy on the first allocation, with reuse of
         the freed mempolicy structure on subsequent allocations.
      
         Fix: Rather than add another condition to the conditional reference
         release, which occurs in the allocation path, just add a reference when
         returning the vma policy in shm_get_policy() to match the assumptions.
      Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
      Cc: Greg KH <greg@kroah.com>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Christoph Lameter <clameter@sgi.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: David Rientjes <rientjes@google.com>
      Cc: <eric.whitney@hp.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
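The three fixes in this commit share one invariant: every reference taken at lookup must be released exactly once, and only for policies that actually received one. The sketch below models that invariant in userspace. The function names ending in `_model`, the `cond_release()` helper, and the rule "release unless the policy is the task or default policy" are assumptions for illustration, not the kernel implementation.

```c
#include <stddef.h>

enum { MPOL_DEFAULT, MPOL_INTERLEAVE };

/* Userspace model; refcnt is a plain int for simplicity. */
struct mempolicy { int refcnt; int mode; };

static struct mempolicy default_policy = { 1, MPOL_DEFAULT };

/* Fix 3 (modelled): the get_policy op always returns a non-NULL
 * policy with an extra reference, matching what the caller assumes. */
static struct mempolicy *shm_get_policy_model(struct mempolicy *vma_pol)
{
	if (vma_pol)
		vma_pol->refcnt++;
	return vma_pol;
}

/* Fix 2 (modelled): release only policies that got a lookup
 * reference, never the task policy or the system default. */
static void cond_release(struct mempolicy *pol, struct mempolicy *task_policy)
{
	if (pol != &default_policy && pol != task_policy)
		pol->refcnt--;
}

/* Fix 1 (modelled): the interleave branch must release as well.
 * Returns 1 when the interleave branch was taken, 0 otherwise. */
static int alloc_page_model(struct mempolicy *pol,
			    struct mempolicy *task_policy)
{
	if (pol->mode == MPOL_INTERLEAVE) {
		cond_release(pol, task_policy);  /* the leak was a missing release here */
		return 1;
	}
	cond_release(pol, task_policy);
	return 0;
}
```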
  8. 09 Feb, 2008 7 commits
    • Pidns: fix badly converted mqueues pid handling · 56496c1d
      Committed by Pavel Emelyanov
      When sending the pid namespaces patches I wrongly converted the tsk->tgid into
      task_pid_vnr(tsk) in mqueue-s (the git id of this patch is
      b488893a).
      
      The proper behavior is to get the task_tgid_vnr(tsk).
      
      This seems to be the only mistake of that kind.
      Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Oleg Nesterov <oleg@tv-sign.ru>
      Cc: Balbir Singh <balbir@in.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Pidns: make full use of xxx_vnr() calls · 6c5f3e7b
      Committed by Pavel Emelyanov
      Some time ago the xxx_vnr() calls (e.g.  pid_vnr or find_task_by_vpid) were
      _all_ converted to operate on the current pid namespace.  After this, each call
      like xxx_nr_ns(foo, current->nsproxy->pid_ns) is equivalent to a plain
      xxx_vnr(foo) call.
      
      Switch all the xxx_nr_ns() callers to use the xxx_vnr() calls where
      appropriate.
      Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
      Reviewed-by: Oleg Nesterov <oleg@tv-sign.ru>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Balbir Singh <balbir@in.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
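The equivalence this commit relies on can be shown with a toy model of per-namespace pid numbers. The structs below are heavily simplified assumptions, and the `current_ns` global merely stands in for `current->nsproxy->pid_ns`; the point is only that the `_vnr` form is the `_nr_ns` form with the current namespace filled in.

```c
#include <stddef.h>

#define MAX_LEVEL 4

/* Toy model: a pid has one number per namespace level it is visible in. */
struct pid_namespace { int level; };
struct upid { int nr; const struct pid_namespace *ns; };
struct pid { int level; struct upid numbers[MAX_LEVEL]; };

/* Stand-in for current->nsproxy->pid_ns in this model. */
static const struct pid_namespace *current_ns;

/* Number of the pid as seen from ns, or 0 if it is not visible there. */
static int pid_nr_ns(const struct pid *pid, const struct pid_namespace *ns)
{
	if (pid && ns && ns->level <= pid->level) {
		const struct upid *upid = &pid->numbers[ns->level];
		if (upid->ns == ns)
			return upid->nr;
	}
	return 0;
}

/* "Virtual" number: the pid as seen from the caller's own namespace. */
static int pid_vnr(const struct pid *pid)
{
	return pid_nr_ns(pid, current_ns);
}
```

Any caller that passes the current namespace explicitly to the `_nr_ns` variant can therefore use the shorter `_vnr` variant without a change in behavior.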
    • IPC: consolidate sem_exit_ns(), msg_exit_ns() and shm_exit_ns() · 01b8b07a
      Committed by Pierre Peiffer
      sem_exit_ns(), msg_exit_ns() and shm_exit_ns() are all called when an
      ipc_namespace is released to free all ipcs of each type.  But in fact, they
      do the same thing: they loop around all ipcs to free them individually by
      calling a specific routine.
      
      This patch proposes to consolidate this by introducing a common function,
      free_ipcs(), that does the job.  The specific routine to call on each
      individual ipc is passed as a parameter.  For this, these ipc-specific
      'free' routines are reworked to take a generic 'struct ipc_perm' as
      a parameter.
      Signed-off-by: Pierre Peiffer <pierre.peiffer@bull.net>
      Cc: Cedric Le Goater <clg@fr.ibm.com>
      Cc: Pavel Emelyanov <xemul@openvz.org>
      Cc: Nadia Derbey <Nadia.Derbey@bull.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
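The consolidation pattern in this commit, one generic loop plus a type-specific callback that receives the embedded generic header, might look roughly like the following userspace sketch. All names ending in `_model` are hypothetical, and a flat pointer table stands in for whatever container the kernel uses to hold the ipcs.

```c
#include <stddef.h>

/* Generic header embedded in every ipc object (model of struct ipc_perm). */
struct ipc_perm_model { int id; int freed; };

/* One ipc-specific object that embeds the generic header. */
struct sem_array_model { struct ipc_perm_model perm; int nsems; };

/* Recover the enclosing object from a pointer to its embedded member. */
#define container_of_model(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

typedef void (*free_fn)(struct ipc_perm_model *perm);

/* Common loop: walk all live entries and hand each to the callback.
 * Returns the number of entries released. */
static int free_ipcs_model(struct ipc_perm_model **table, size_t n,
			   free_fn free_one)
{
	int count = 0;

	for (size_t i = 0; i < n; i++) {
		if (!table[i])
			continue;
		free_one(table[i]);
		table[i] = NULL;
		count++;
	}
	return count;
}

/* Semaphore-specific routine, reworked to take the generic header. */
static void freeary_model(struct ipc_perm_model *perm)
{
	struct sem_array_model *sma =
		container_of_model(perm, struct sem_array_model, perm);

	sma->nsems = 0;
	perm->freed = 1;
}
```

Message queues and shared memory segments would each supply their own callback with the same signature, so the loop itself is written once.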
    • IPC: make struct ipc_ids static in ipc_namespace · ed2ddbf8
      Committed by Pierre Peiffer
      Each ipc_namespace contains a table of 3 pointers to struct ipc_ids (one each
      for msg, sem and shm; the structure stores all ipcs of that type).  These
      'struct ipc_ids' are dynamically allocated for each ipc_namespace, as is the
      ipc_namespace itself (for the init namespace, they are initialized with
      pointers to static variables instead).
      
      This is so for historical reasons: before idr was used to store the
      ipcs, the ipcs were stored in tables of variable length, depending on the
      maximum number of ipcs allowed.  Now, these 'struct ipc_ids' have a fixed
      size.  As they are allocated in any case for each new ipc_namespace, there
      is no memory saving in allocating them separately from the struct
      ipc_namespace.
      
      This patch proposes to make this table static in the struct ipc_namespace.
      Thus, we can allocate everything at once and get rid of all the code needed to
      allocate and free these ipc_ids separately.
      Signed-off-by: Pierre Peiffer <pierre.peiffer@bull.net>
      Acked-by: Cedric Le Goater <clg@fr.ibm.com>
      Cc: Pavel Emelyanov <xemul@openvz.org>
      Cc: Nadia Derbey <Nadia.Derbey@bull.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
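The layout change this commit describes can be contrasted in a short userspace sketch. The struct names and fields are invented for illustration, and `calloc`/`free` stand in for the kernel allocators; the contrast between three extra allocations with their error unwinding and a single allocation is the point.

```c
#include <stdlib.h>

/* Fixed-size bookkeeping structure (model of struct ipc_ids). */
struct ipc_ids_model { int in_use; unsigned short seq; };

/* Before: a table of pointers, each target allocated separately. */
struct ns_before { struct ipc_ids_model *ids[3]; };

/* After: the table is embedded, so one allocation covers everything. */
struct ns_after { struct ipc_ids_model ids[3]; };

static struct ns_before *ns_before_alloc(void)
{
	struct ns_before *ns = calloc(1, sizeof(*ns));

	if (!ns)
		return NULL;
	for (int i = 0; i < 3; i++) {
		ns->ids[i] = calloc(1, sizeof(*ns->ids[i]));
		if (!ns->ids[i]) {
			/* unwind the allocations made so far */
			while (i-- > 0)
				free(ns->ids[i]);
			free(ns);
			return NULL;
		}
	}
	return ns;
}

static void ns_before_free(struct ns_before *ns)
{
	if (!ns)
		return;
	for (int i = 0; i < 3; i++)
		free(ns->ids[i]);
	free(ns);
}

/* The embedded layout needs no per-table allocation or unwinding. */
static struct ns_after *ns_after_alloc(void)
{
	return calloc(1, sizeof(struct ns_after));
}
```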
    • IPC/semaphores: consolidate SEM_STAT and IPC_STAT commands · 4b9fcb0e
      Committed by Pierre Peiffer
      These commands (SEM_STAT and IPC_STAT) do much the same thing
      (only the meaning of the id given as input and the return value differ).
      However, for semaphores, they are handled in two different places (two
      different functions).
      
      This patch consolidates this for clarity by handling both of these
      commands in the same place, in semctl_nolock().  It also removes one unused
      parameter from this function.
      Signed-off-by: Pierre Peiffer <pierre.peiffer@bull.net>
      Cc: Nadia Derbey <Nadia.Derbey@bull.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
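To make the difference between the two commands concrete, here is a toy userspace model in which both share one handler. For SEM_STAT the input is an index into an internal table and the return value is the identifier; for IPC_STAT the input is the identifier and the return value is 0. The table, the `CMD_*` constants, the `SEQ_MULT` id scheme, and the output struct are assumptions of this sketch, not the kernel data structures.

```c
#include <stddef.h>

#define NSETS 4
#define SEQ_MULT 32768  /* hypothetical: id = seq * SEQ_MULT + index */

enum stat_cmd { CMD_IPC_STAT, CMD_SEM_STAT };

struct sem_set { int in_use; int id; int nsems; };
struct sem_stat_out { int nsems; };

static struct sem_set sets[NSETS];

/* One handler for both commands: only the lookup and the return
 * value differ, the copy-out of the status is shared. */
static int sem_stat_model(enum stat_cmd cmd, int semid,
			  struct sem_stat_out *out)
{
	struct sem_set *set = NULL;
	int ret;

	if (cmd == CMD_SEM_STAT) {
		/* semid is an index; the result is the identifier */
		if (semid < 0 || semid >= NSETS || !sets[semid].in_use)
			return -1;
		set = &sets[semid];
		ret = set->id;
	} else {
		/* semid is an identifier; the result is 0 */
		for (int i = 0; i < NSETS; i++)
			if (sets[i].in_use && sets[i].id == semid)
				set = &sets[i];
		if (!set)
			return -1;
		ret = 0;
	}
	out->nsems = set->nsems;  /* shared part */
	return ret;
}
```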
    • ipc: uninline some code from util.h · b2d75cdd
      Committed by Pavel Emelyanov
      ipc_lock_check_down(), ipc_lock_check() and ipcget() seem too large to be
      inline.  Besides, inlining them gains nothing, as they make function
      calls internally in any case.
      
      Moving them into ipc/util.c saves about 500 bytes of vmlinux and shrinks the
      IPC internal API.
      
      $ ./scripts/bloat-o-meter vmlinux-orig vmlinux
      add/remove: 3/2 grow/shrink: 0/10 up/down: 490/-989 (-499)
      function                                     old     new   delta
      ipcget                                         -     392    +392
      ipc_lock_check_down                            -      49     +49
      ipc_lock_check                                 -      49     +49
      sys_semget                                   119     105     -14
      sys_shmget                                   108      86     -22
      sys_msgget                                   100      78     -22
      do_msgsnd                                    665     631     -34
      do_msgrcv                                    680     644     -36
      do_shmat                                     771     733     -38
      sys_msgctl                                  1302    1229     -73
      ipcget_new                                    80       -     -80
      sys_semtimedop                              1534    1452     -82
      sys_semctl                                  2034    1922    -112
      sys_shmctl                                  1919    1765    -154
      ipcget_public                                322       -    -322
      
      The ipcget() growth is the result of gcc inlining of currently static
      ipcget_new/_public.
      Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
      Cc: Nadia Derbey <Nadia.Derbey@bull.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • namespaces: move the IPC namespace under IPC_NS option · ae5e1b22
      Committed by Pavel Emelyanov
      Currently the IPC namespace management code is spread over the ipc/*.c files.
      I moved this code into the ipc/namespace.c file, which is compiled out when
      IPC namespaces are not configured.
      
      The linux/ipc_namespace.h file is used to store the prototypes of the
      functions in namespace.c and the stubs for the NAMESPACES=n case.  This is
      done because the stub for copy_ipc_namespace requires knowledge of the
      CLONE_NEWIPC flag, which is in sched.h.  But the linux/ipc.h file itself is
      included into a great many .c files via the sys.h->sem.h sequence, so adding
      sched.h to it would make all these .c files depend on sched.h, which is not
      good.  On the other hand, knowledge of the namespaces stuff is required
      in only 4 .c files.
      
      Besides, this patch compiles out some auxiliary functions from ipc/sem.c,
      msg.c and shm.c files.  It turned out that moving these functions into
      namespace.c is not that easy because they use many other calls and macros
      from the original file.  Moving them would make this patch complicated.  On
      the other hand all these functions can be consolidated, so I will send a
      separate patch doing this a bit later.
      Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
      Acked-by: Serge Hallyn <serue@us.ibm.com>
      Cc: Cedric Le Goater <clg@fr.ibm.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Herbert Poetzl <herbert@13thfloor.at>
      Cc: Kirill Korotaev <dev@sw.ru>
      Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
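The header arrangement described in this commit, real prototypes when the option is enabled and inline stubs otherwise, can be sketched as below. The `CONFIG_IPC_NS_MODEL` macro, the `_model` names, and the error-reporting convention through an `int *err` parameter are assumptions of this illustration; the flag value mirrors the Linux `CLONE_NEWIPC` constant but is redefined locally so the example stays self-contained.

```c
#include <errno.h>
#include <stddef.h>

/* Local copy of the clone flag for this model. */
#define CLONE_NEWIPC_MODEL 0x08000000UL

struct ipc_namespace_model { int refcount; };

#ifdef CONFIG_IPC_NS_MODEL
/* Option enabled: only the prototype lives in the header, the body
 * would be in a separate namespace source file. */
struct ipc_namespace_model *
copy_ipc_namespace_model(unsigned long flags,
			 struct ipc_namespace_model *ns, int *err);
#else
/* Option disabled: the stub must know the clone flag so that it can
 * reject a request for a new namespace instead of ignoring it. */
static inline struct ipc_namespace_model *
copy_ipc_namespace_model(unsigned long flags,
			 struct ipc_namespace_model *ns, int *err)
{
	if (flags & CLONE_NEWIPC_MODEL) {
		*err = -EINVAL;
		return NULL;
	}
	*err = 0;
	return ns;  /* keep sharing the existing namespace */
}
#endif
```

Because the stub needs the clone flag, keeping it in its own small header avoids dragging the scheduler header into every file that includes the generic ipc header.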
  9. 07 Feb, 2008 2 commits
  10. 30 Nov, 2007 1 commit
  11. 07 Nov, 2007 1 commit
  12. 21 Oct, 2007 1 commit
  13. 20 Oct, 2007 1 commit