1. 21 7月, 2010 1 次提交
  2. 28 5月, 2010 4 次提交
    • J
      ipc/sem.c: use ERR_CAST · 4de85cd6
      Julia Lawall 提交于
      Use ERR_CAST(x) rather than ERR_PTR(PTR_ERR(x)).  The former makes more
      clear what is the purpose of the operation, which otherwise looks like a
      no-op.
      
      The semantic patch that makes this change is as follows:
      (http://coccinelle.lip6.fr/)
      
      // <smpl>
      @@
      type T;
      T x;
      identifier f;
      @@
      
      T f (...) { <+...
      - ERR_PTR(PTR_ERR(x))
      + x
       ...+> }
      
      @@
      expression x;
      @@
      
      - ERR_PTR(PTR_ERR(x))
      + ERR_CAST(x)
      // </smpl>
      Signed-off-by: NJulia Lawall <julia@diku.dk>
      Cc: Manfred Spraul <manfred@colorfullife.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      4de85cd6
    • M
      ipc/sem.c: update description of the implementation · c5cf6359
      Manfred Spraul 提交于
      ipc/sem.c begins with a 15 year old description about bugs in the initial
      implementation in Linux-1.0.  The patch replaces that with a top level
      description of the current code.
      
      A TODO could be derived from this text:
      
      The opengroup man page for semop() does not mandate FIFO.  Thus there is
      no need for a semaphore array list of pending operations.
      
      If
      
      - this list is removed
      - the per-semaphore array spinlock is removed (possible if there is no
        list to protect)
      - sem_otime is moved into the semaphores and calculated on demand during
        semctl()
      
      then the array would be read-mostly - which would significantly improve
      scaling for applications that use semaphore arrays with lots of entries.
      
      The price would be expensive semctl() calls:
      
      	for(i=0;i<sma->sem_nsems;i++) spin_lock(sma->sem_lock);
      	<do stuff>
      	for(i=0;i<sma->sem_nsems;i++) spin_unlock(sma->sem_lock);
      
      I'm not sure if the complexity is worth the effort, thus here is the
      documentation of the current behavior first.
      Signed-off-by: NManfred Spraul <manfred@colorfullife.com>
      Cc: Chris Mason <chris.mason@oracle.com>
      Cc: Zach Brown <zach.brown@oracle.com>
      Cc: Jens Axboe <jens.axboe@oracle.com>
      Cc: Nick Piggin <npiggin@suse.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      c5cf6359
    • M
      ipc/sem.c: move wake_up_process out of the spinlock section · 0a2b9d4c
      Manfred Spraul 提交于
      The wake-up part of semtimedop() consists out of two steps:
      
      - the right tasks must be identified.
      - they must be woken up.
      
      Right now, both steps run while the array spinlock is held.  This patch
      reorders the code and moves the actual wake_up_process() behind the point
      where the spinlock is dropped.
      
      The code also moves setting sem->sem_otime to one place: It does not make
      sense to set the last modify time multiple times.
      
      [akpm@linux-foundation.org: repair kerneldoc]
      [akpm@linux-foundation.org: fix uninitialised retval]
      Signed-off-by: NManfred Spraul <manfred@colorfullife.com>
      Cc: Chris Mason <chris.mason@oracle.com>
      Cc: Zach Brown <zach.brown@oracle.com>
      Cc: Jens Axboe <jens.axboe@oracle.com>
      Cc: Nick Piggin <npiggin@suse.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      0a2b9d4c
    • M
      ipc/sem.c: optimize update_queue() for bulk wakeup calls · fd5db422
      Manfred Spraul 提交于
      The following series of patches tries to fix the spinlock contention
      reported by Chris Mason - his benchmark exposes problems of the current
      code:
      
      - In the worst case, the algorithm used by update_queue() is O(N^2).
        Bulk wake-up calls can enter this worst case.  The patch series fix
        that.
      
        Note that the benchmark app doesn't expose the problem, it just should
        be fixed: Real world apps might do the wake-ups in another order than
        perfect FIFO.
      
      - The part of the code that runs within the semaphore array spinlock is
        significantly larger than necessary.
      
        The patch series fixes that.  This change is responsible for the main
        improvement.
      
      - The cacheline with the spinlock is also used for a variable that is
        read in the hot path (sem_base) and for a variable that is unnecessarily
        written to multiple times (sem_otime).  The last step of the series
        cacheline-aligns the spinlock.
      
      This patch:
      
      The SysV semaphore code allows to perform multiple operations on all
      semaphores in the array as atomic operations.  After a modification,
      update_queue() checks which of the waiting tasks can complete.
      
      The algorithm that is used to identify the tasks is O(N^2) in the worst
      case.  For some cases, it is simple to avoid the O(N^2).
      
      The patch adds a detection logic for some cases, especially for the case
      of an array where all sleeping tasks are single sembuf operations and a
      multi-sembuf operation is used to wake up multiple tasks.
      
      A big database application uses that approach.
      
      The patch fixes wakeup due to semctl(,,SETALL,) - the initial version of
      the patch breaks that.
      
      [akpm@linux-foundation.org: make do_smart_update() static]
      Signed-off-by: NManfred Spraul <manfred@colorfullife.com>
      Cc: Chris Mason <chris.mason@oracle.com>
      Cc: Zach Brown <zach.brown@oracle.com>
      Cc: Jens Axboe <jens.axboe@oracle.com>
      Cc: Nick Piggin <npiggin@suse.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      fd5db422
  3. 16 12月, 2009 9 次提交
  4. 15 4月, 2009 1 次提交
  5. 14 1月, 2009 2 次提交
  6. 07 1月, 2009 1 次提交
  7. 06 1月, 2009 1 次提交
  8. 17 10月, 2008 1 次提交
  9. 26 7月, 2008 4 次提交
  10. 29 4月, 2008 8 次提交
  11. 09 2月, 2008 4 次提交
    • P
      IPC: consolidate sem_exit_ns(), msg_exit_ns() and shm_exit_ns() · 01b8b07a
      Pierre Peiffer 提交于
      sem_exit_ns(), msg_exit_ns() and shm_exit_ns() are all called when an
      ipc_namespace is released to free all ipcs of each type.  But in fact, they
      do the same thing: they loop around all ipcs to free them individually by
      calling a specific routine.
      
      This patch proposes to consolidate this by introducing a common function,
      free_ipcs(), that do the job.  The specific routine to call on each
      individual ipcs is passed as parameter.  For this, these ipc-specific
      'free' routines are reworked to take a generic 'struct ipc_perm' as
      parameter.
      Signed-off-by: NPierre Peiffer <pierre.peiffer@bull.net>
      Cc: Cedric Le Goater <clg@fr.ibm.com>
      Cc: Pavel Emelyanov <xemul@openvz.org>
      Cc: Nadia Derbey <Nadia.Derbey@bull.net>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      01b8b07a
    • P
      IPC: make struct ipc_ids static in ipc_namespace · ed2ddbf8
      Pierre Peiffer 提交于
      Each ipc_namespace contains a table of 3 pointers to struct ipc_ids (3 for
      msg, sem and shm, structure used to store all ipcs) These 'struct ipc_ids'
      are dynamically allocated for each icp_namespace as the ipc_namespace
      itself (for the init namespace, they are initialized with pointers to
      static variables instead)
      
      It is so for historical reason: in fact, before the use of idr to store the
      ipcs, the ipcs were stored in tables of variable length, depending of the
      maximum number of ipc allowed.  Now, these 'struct ipc_ids' have a fixed
      size.  As they are allocated in any cases for each new ipc_namespace, there
      is no gain of memory in having them allocated separately of the struct
      ipc_namespace.
      
      This patch proposes to make this table static in the struct ipc_namespace.
      Thus, we can allocate all in once and get rid of all the code needed to
      allocate and free these ipc_ids separately.
      Signed-off-by: NPierre Peiffer <pierre.peiffer@bull.net>
      Acked-by: NCedric Le Goater <clg@fr.ibm.com>
      Cc: Pavel Emelyanov <xemul@openvz.org>
      Cc: Nadia Derbey <Nadia.Derbey@bull.net>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ed2ddbf8
    • P
      IPC/semaphores: consolidate SEM_STAT and IPC_STAT commands · 4b9fcb0e
      Pierre Peiffer 提交于
      These commands (SEM_STAT and IPC_STAT) are rather doing the same things
      (only the meaning of the id given as input and the return value differ).
      However, for the semaphores, they are handled in two different places (two
      different functions).
      
      This patch consolidates this for clarification by handling these both
      commands in the same place in semctl_nolock().  It also removes one unused
      parameter for this function.
      Signed-off-by: NPierre Peiffer <pierre.peiffer@bull.net>
      Cc: Nadia Derbey <Nadia.Derbey@bull.net>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      4b9fcb0e
    • P
      namespaces: move the IPC namespace under IPC_NS option · ae5e1b22
      Pavel Emelyanov 提交于
      Currently the IPC namespace management code is spread over the ipc/*.c files.
      I moved this code into ipc/namespace.c file which is compiled out when needed.
      
      The linux/ipc_namespace.h file is used to store the prototypes of the
      functions in namespace.c and the stubs for NAMESPACES=n case.  This is done
      so, because the stub for copy_ipc_namespace requires the knowledge of the
      CLONE_NEWIPC flag, which is in sched.h.  But the linux/ipc.h file itself in
      included into many many .c files via the sys.h->sem.h sequence so adding the
      sched.h into it will make all these .c depend on sched.h which is not that
      good.  On the other hand the knowledge about the namespaces stuff is required
      in 4 .c files only.
      
      Besides, this patch compiles out some auxiliary functions from ipc/sem.c,
      msg.c and shm.c files.  It turned out that moving these functions into
      namespaces.c is not that easy because they use many other calls and macros
      from the original file.  Moving them would make this patch complicated.  On
      the other hand all these functions can be consolidated, so I will send a
      separate patch doing this a bit later.
      Signed-off-by: NPavel Emelyanov <xemul@openvz.org>
      Acked-by: NSerge Hallyn <serue@us.ibm.com>
      Cc: Cedric Le Goater <clg@fr.ibm.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Herbert Poetzl <herbert@13thfloor.at>
      Cc: Kirill Korotaev <dev@sw.ru>
      Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ae5e1b22
  12. 07 2月, 2008 1 次提交
    • P
      IPC: fix error check in all new xxx_lock() and xxx_exit_ns() functions · b1ed88b4
      Pierre Peiffer 提交于
      In the new implementation of the [sem|shm|msg]_lock[_check]() routines, we
      use the return value of ipc_lock() in container_of() without any check.
      But ipc_lock may return a errcode.  The use of this errcode in
      container_of() may alter this errcode, and we don't want this.
      
      And in xxx_exit_ns, the pointer return by idr_find is of type 'struct
      kern_ipc_per'...
      
      Today, the code will work as is because the member used in these
      container_of() is the first member of its container (offset == 0), the
      errcode isn't changed then.  But in the general case, we can't count on
      this assumption and this may lead later to a real bug if we don't correct
      this.
      
      Again, the proposed solution is simple and correct.  But, as pointed by
      Nadia, with this solution, the same check will be done several times (in
      all sub-callers...), what is not very funny/optimal...
      Signed-off-by: NPierre Peiffer <pierre.peiffer@bull.net>
      Cc: Nadia Derbey <Nadia.Derbey@bull.net>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b1ed88b4
  13. 20 10月, 2007 3 次提交