1. 17 12月, 2009 1 次提交
  2. 03 11月, 2009 1 次提交
  3. 09 9月, 2009 1 次提交
  4. 28 7月, 2009 1 次提交
  5. 19 6月, 2009 2 次提交
    • O
      kthreads: rework kthread_stop() · 63706172
      Oleg Nesterov 提交于
      Based on Eric's patch which in turn was based on my patch.
      
      kthread_stop() has the nasty problems:
      
      - it runs unpredictably long with the global semaphore held.
      
      - it deadlocks if kthread itself does kthread_stop() before it obeys
        the kthread_should_stop() request.
      
      - it is not useable if kthread exits on its own, see for example the
        ugly "wait_to_die:" hack in migration_thread()
      
      - it is not possible to just tell kthread it should stop, we must always
        wait for its exit.
      
      With this patch kthread() allocates all neccesary data (struct kthread) on
      its own stack, globals kthread_stop_xxx are deleted.  ->vfork_done is used
      as a pointer into "struct kthread", this means kthread_stop() can easily
      wait for kthread's exit.
      Signed-off-by: NOleg Nesterov <oleg@redhat.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Pavel Emelyanov <xemul@openvz.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Vitaliy Gusev <vgusev@openvz.org
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      63706172
    • O
      kthreads: simplify the startup synchronization · cdd140bd
      Oleg Nesterov 提交于
      We use two completions two create the kernel thread, this is a bit ugly.
      kthread() wakes up create_kthread() via ->started, then create_kthread()
      wakes up the caller kthread_create() via ->done.  But kthread() does not
      need to wait for kthread(), it can just return.  Instead kthread() itself
      can wake up the caller of kthread_create().
      
      Kill kthread_create_info->started, ->done is enough.  This improves the
      scalability a bit and sijmplifies the code.
      
      The only problem if kernel_thread() fails, in that case create_kthread()
      must do complete(&create->done).
      Signed-off-by: NOleg Nesterov <oleg@redhat.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Pavel Emelyanov <xemul@openvz.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Vitaliy Gusev <vgusev@openvz.org
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      cdd140bd
  6. 17 6月, 2009 1 次提交
    • M
      cpuset,mm: update tasks' mems_allowed in time · 58568d2a
      Miao Xie 提交于
      Fix allocating page cache/slab object on the unallowed node when memory
      spread is set by updating tasks' mems_allowed after its cpuset's mems is
      changed.
      
      In order to update tasks' mems_allowed in time, we must modify the code of
      memory policy.  Because the memory policy is applied in the process's
      context originally.  After applying this patch, one task directly
      manipulates anothers mems_allowed, and we use alloc_lock in the
      task_struct to protect mems_allowed and memory policy of the task.
      
      But in the fast path, we didn't use lock to protect them, because adding a
      lock may lead to performance regression.  But if we don't add a lock,the
      task might see no nodes when changing cpuset's mems_allowed to some
      non-overlapping set.  In order to avoid it, we set all new allowed nodes,
      then clear newly disallowed ones.
      
      [lee.schermerhorn@hp.com:
        The rework of mpol_new() to extract the adjusting of the node mask to
        apply cpuset and mpol flags "context" breaks set_mempolicy() and mbind()
        with MPOL_PREFERRED and a NULL nodemask--i.e., explicit local
        allocation.  Fix this by adding the check for MPOL_PREFERRED and empty
        node mask to mpol_new_mpolicy().
      
        Remove the now unneeded 'nodes = NULL' from mpol_new().
      
        Note that mpol_new_mempolicy() is always called with a non-NULL
        'nodes' parameter now that it has been removed from mpol_new().
        Therefore, we don't need to test nodes for NULL before testing it for
        'empty'.  However, just to be extra paranoid, add a VM_BUG_ON() to
        verify this assumption.]
      [lee.schermerhorn@hp.com:
      
        I don't think the function name 'mpol_new_mempolicy' is descriptive
        enough to differentiate it from mpol_new().
      
        This function applies cpuset set context, usually constraining nodes
        to those allowed by the cpuset.  However, when the 'RELATIVE_NODES flag
        is set, it also translates the nodes.  So I settled on
        'mpol_set_nodemask()', because the comment block for mpol_new() mentions
        that we need to call this function to "set nodes".
      
        Some additional minor line length, whitespace and typo cleanup.]
      Signed-off-by: NMiao Xie <miaox@cn.fujitsu.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Paul Menage <menage@google.com>
      Cc: Nick Piggin <nickpiggin@yahoo.com.au>
      Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: David Rientjes <rientjes@google.com>
      Signed-off-by: NLee Schermerhorn <lee.schermerhorn@hp.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      58568d2a
  7. 15 4月, 2009 2 次提交
    • S
      tracing/events: move trace point headers into include/trace/events · ad8d75ff
      Steven Rostedt 提交于
      Impact: clean up
      
      Create a sub directory in include/trace called events to keep the
      trace point headers in their own separate directory. Only headers that
      declare trace points should be defined in this directory.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Neil Horman <nhorman@tuxdriver.com>
      Cc: Zhao Lei <zhaolei@cn.fujitsu.com>
      Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      ad8d75ff
    • S
      tracing: create automated trace defines · a8d154b0
      Steven Rostedt 提交于
      This patch lowers the number of places a developer must modify to add
      new tracepoints. The current method to add a new tracepoint
      into an existing system is to write the trace point macro in the
      trace header with one of the macros TRACE_EVENT, TRACE_FORMAT or
      DECLARE_TRACE, then they must add the same named item into the C file
      with the macro DEFINE_TRACE(name) and then add the trace point.
      
      This change cuts out the needing to add the DEFINE_TRACE(name).
      Every file that uses the tracepoint must still include the trace/<type>.h
      file, but the one C file must also add a define before the including
      of that file.
      
       #define CREATE_TRACE_POINTS
       #include <trace/mytrace.h>
      
      This will cause the trace/mytrace.h file to also produce the C code
      necessary to implement the trace point.
      
      Note, if more than one trace/<type>.h is used to create the C code
      it is best to list them all together.
      
       #define CREATE_TRACE_POINTS
       #include <trace/foo.h>
       #include <trace/bar.h>
       #include <trace/fido.h>
      
      Thanks to Mathieu Desnoyers and Christoph Hellwig for coming up with
      the cleaner solution of the define above the includes over my first
      design to have the C code include a "special" header.
      
      This patch converts sched, irq and lockdep and skb to use this new
      method.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Neil Horman <nhorman@tuxdriver.com>
      Cc: Zhao Lei <zhaolei@cn.fujitsu.com>
      Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      a8d154b0
  8. 09 4月, 2009 2 次提交
  9. 30 3月, 2009 1 次提交
    • R
      cpumask: remove dangerous CPU_MASK_ALL_PTR, &CPU_MASK_ALL · 1a2142af
      Rusty Russell 提交于
      Impact: cleanup
      
      (Thanks to Al Viro for reminding me of this, via Ingo)
      
      CPU_MASK_ALL is the (deprecated) "all bits set" cpumask, defined as so:
      
      	#define CPU_MASK_ALL (cpumask_t) { { ... } }
      
      Taking the address of such a temporary is questionable at best,
      unfortunately 321a8e9d (cpumask: add CPU_MASK_ALL_PTR macro) added
      CPU_MASK_ALL_PTR:
      
      	#define CPU_MASK_ALL_PTR (&CPU_MASK_ALL)
      
      Which formalizes this practice.  One day gcc could bite us over this
      usage (though we seem to have gotten away with it so far).
      
      So replace everywhere which used &CPU_MASK_ALL or CPU_MASK_ALL_PTR
      with the modern "cpu_all_mask" (a real const struct cpumask *).
      Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
      Acked-by: NIngo Molnar <mingo@elte.hu>
      Reported-by: NAl Viro <viro@zeniv.linux.org.uk>
      Cc: Mike Travis <travis@sgi.com>
      1a2142af
  10. 16 11月, 2008 1 次提交
    • M
      tracepoints: add DECLARE_TRACE() and DEFINE_TRACE() · 7e066fb8
      Mathieu Desnoyers 提交于
      Impact: API *CHANGE*. Must update all tracepoint users.
      
      Add DEFINE_TRACE() to tracepoints to let them declare the tracepoint
      structure in a single spot for all the kernel. It helps reducing memory
      consumption, especially when declaring a lot of tracepoints, e.g. for
      kmalloc tracing.
      
      *API CHANGE WARNING*: now, DECLARE_TRACE() must be used in headers for
      tracepoint declarations rather than DEFINE_TRACE(). This is the sane way
      to do it. The name previously used was misleading.
      
      Updates scheduler instrumentation to follow this API change.
      Signed-off-by: NMathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      7e066fb8
  11. 20 10月, 2008 1 次提交
  12. 14 10月, 2008 1 次提交
    • M
      tracing, sched: LTTng instrumentation - scheduler · 0a16b607
      Mathieu Desnoyers 提交于
      Instrument the scheduler activity (sched_switch, migration, wakeups,
      wait for a task, signal delivery) and process/thread
      creation/destruction (fork, exit, kthread stop). Actually, kthread
      creation is not instrumented in this patch because it is architecture
      dependent. It allows to connect tracers such as ftrace which detects
      scheduling latencies, good/bad scheduler decisions. Tools like LTTng can
      export this scheduler information along with instrumentation of the rest
      of the kernel activity to perform post-mortem analysis on the scheduler
      activity.
      
      About the performance impact of tracepoints (which is comparable to
      markers), even without immediate values optimizations, tests done by
      Hideo Aoki on ia64 show no regression. His test case was using hackbench
      on a kernel where scheduler instrumentation (about 5 events in code
      scheduler code) was added. See the "Tracepoints" patch header for
      performance result detail.
      
      Changelog :
      
      - Change instrumentation location and parameter to match ftrace
        instrumentation, previously done with kernel markers.
      
      [ mingo@elte.hu: conflict resolutions ]
      Signed-off-by: NMathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Acked-by: N'Peter Zijlstra' <peterz@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      0a16b607
  13. 27 7月, 2008 1 次提交
  14. 19 7月, 2008 1 次提交
  15. 17 7月, 2008 1 次提交
    • R
      Freezer: Introduce PF_FREEZER_NOSIG · ebb12db5
      Rafael J. Wysocki 提交于
      The freezer currently attempts to distinguish kernel threads from
      user space tasks by checking if their mm pointer is unset and it
      does not send fake signals to kernel threads.  However, there are
      kernel threads, mostly related to networking, that behave like
      user space tasks and may want to be sent a fake signal to be frozen.
      
      Introduce the new process flag PF_FREEZER_NOSIG that will be set
      by default for all kernel threads and make the freezer only send
      fake signals to the tasks having PF_FREEZER_NOSIG unset.  Provide
      the set_freezable_with_signal() function to be called by the kernel
      threads that want to be sent a fake signal for freezing.
      
      This patch should not change the freezer's observable behavior.
      Signed-off-by: NRafael J. Wysocki <rjw@sisk.pl>
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Acked-by: NPavel Machek <pavel@suse.cz>
      Signed-off-by: NLen Brown <len.brown@intel.com>
      ebb12db5
  16. 10 6月, 2008 1 次提交
  17. 30 4月, 2008 1 次提交
  18. 29 4月, 2008 1 次提交
  19. 20 4月, 2008 1 次提交
  20. 19 4月, 2008 1 次提交
  21. 26 1月, 2008 1 次提交
  22. 01 8月, 2007 1 次提交
  23. 17 7月, 2007 1 次提交
  24. 24 5月, 2007 1 次提交
  25. 10 5月, 2007 2 次提交
    • O
      change kernel threads to ignore signals instead of blocking them · 10ab825b
      Oleg Nesterov 提交于
      Currently kernel threads use sigprocmask(SIG_BLOCK) to protect against
      signals.  This doesn't prevent the signal delivery, this only blocks
      signal_wake_up().  Every "killall -33 kthreadd" means a "struct siginfo"
      leak.
      
      Change kthreadd_setup() to set all handlers to SIG_IGN instead of blocking
      them (make a new helper ignore_signals() for that).  If the kernel thread
      needs some signal, it should use allow_signal() anyway, and in that case it
      should not use CLONE_SIGHAND.
      
      Note that we can't change daemonize() (should die!) in the same way,
      because it can be used along with CLONE_SIGHAND.  This means that
      allow_signal() still should unblock the signal to work correctly with
      daemonize()ed threads.
      
      However, disallow_signal() doesn't block the signal any longer but ignores
      it.
      
      NOTE: with or without this patch the kernel threads are not protected from
      handle_stop_signal(), this seems harmless, but not good.
      Signed-off-by: NOleg Nesterov <oleg@tv-sign.ru>
      Acked-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      10ab825b
    • E
      kthread: don't depend on work queues · 73c27992
      Eric W. Biederman 提交于
      Currently there is a circular reference between work queue initialization
      and kthread initialization.  This prevents the kthread infrastructure from
      initializing until after work queues have been initialized.
      
      We want the properties of tasks created with kthread_create to be as close
      as possible to the init_task and to not be contaminated by user processes.
      The later we start our kthreadd that creates these tasks the harder it is
      to avoid contamination from user processes and the more of a mess we have
      to clean up because the defaults have changed on us.
      
      So this patch modifies the kthread support to not use work queues but to
      instead use a simple list of structures, and to have kthreadd start from
      init_task immediately after our kernel thread that execs /sbin/init.
      
      By being a true child of init_task we only have to change those process
      settings that we want to have different from init_task, such as our process
      name, the cpus that are allowed, blocking all signals and setting SIGCHLD
      to SIG_IGN so that all of our children are reaped automatically.
      
      By being a true child of init_task we also naturally get our ppid set to 0
      and do not wind up as a child of PID == 1.  Ensuring that tasks generated
      by kthread_create will not slow down the functioning of the wait family of
      functions.
      
      [akpm@linux-foundation.org: use interruptible sleeps]
      Signed-off-by: NEric W. Biederman <ebiederm@xmission.com>
      Cc: Oleg Nesterov <oleg@tv-sign.ru>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      73c27992
  26. 12 2月, 2007 1 次提交
  27. 22 11月, 2006 1 次提交
    • D
      WorkStruct: Pass the work_struct pointer instead of context data · 65f27f38
      David Howells 提交于
      Pass the work_struct pointer to the work function rather than context data.
      The work function can use container_of() to work out the data.
      
      For the cases where the container of the work_struct may go away the moment the
      pending bit is cleared, it is made possible to defer the release of the
      structure by deferring the clearing of the pending bit.
      
      To make this work, an extra flag is introduced into the management side of the
      work_struct.  This governs auto-release of the structure upon execution.
      
      Ordinarily, the work queue executor would release the work_struct for further
      scheduling or deallocation by clearing the pending bit prior to jumping to the
      work function.  This means that, unless the driver makes some guarantee itself
      that the work_struct won't go away, the work function may not access anything
      else in the work_struct or its container lest they be deallocated..  This is a
      problem if the auxiliary data is taken away (as done by the last patch).
      
      However, if the pending bit is *not* cleared before jumping to the work
      function, then the work function *may* access the work_struct and its container
      with no problems.  But then the work function must itself release the
      work_struct by calling work_release().
      
      In most cases, automatic release is fine, so this is the default.  Special
      initiators exist for the non-auto-release case (ending in _NAR).
      Signed-Off-By: NDavid Howells <dhowells@redhat.com>
      65f27f38
  28. 15 7月, 2006 1 次提交
  29. 26 6月, 2006 1 次提交
  30. 26 3月, 2006 1 次提交
    • A
      [PATCH] find_task_by_pid() needs tasklist_lock · 05eeae20
      Andrew Morton 提交于
      A couple of places are forgetting to take it.
      
      The kswapd case is probably unimportant.  keventd_create_kthread() was racy.
      
      The whole thing is a bit flakey: you start a kernel thread, get its pid from
      kernel_thread() then look up its task_struct.
      
      a) It assumes that pid recycling takes a "long" time.
      
      b) We get a task_struct but no reference was taken on it.  The owner of the
         kswapd and kthread task_struct*'s must assume that the new thread won't
         exit unexpectedly.  Because if it does, they're left holding dead memory
         and any attempt to control or stop that task will crash.
      
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      05eeae20
  31. 23 3月, 2006 1 次提交
  32. 31 10月, 2005 1 次提交
  33. 01 5月, 2005 1 次提交
  34. 17 4月, 2005 1 次提交
    • L
      Linux-2.6.12-rc2 · 1da177e4
      Linus Torvalds 提交于
      Initial git repository build. I'm not bothering with the full history,
      even though we have it. We can create a separate "historical" git
      archive of that later if we want to, and in the meantime it's about
      3.2GB when imported into git - space that would just make the early
      git days unnecessarily complicated, when we don't have a lot of good
      infrastructure for it.
      
      Let it rip!
      1da177e4