1. 16 Sep 2016, 1 commit
  2. 05 Sep 2015, 1 commit
  3. 12 Aug 2015, 1 commit
  4. 07 Aug 2015, 1 commit
  5. 10 Oct 2014, 1 commit
    • kernel/kthread.c: partial revert of 81c98869 ("kthread: ensure locality of task_struct allocations") · 10922838
      Committed by Nishanth Aravamudan
      kernel/kthread.c: partial revert of 81c98869 ("kthread: ensure locality of task_struct allocations")
      
      After discussions with Tejun, we don't want to spread the use of
      cpu_to_mem() (and thus knowledge of allocators/NUMA topology details) into
      callers, but would rather ensure the callees correctly handle memoryless
      nodes.  With the previous patches ("topology: add support for
      node_to_mem_node() to determine the fallback node" and "slub: fallback to
      node_to_mem_node() node if allocating on memoryless node") adding and
      using node_to_mem_node(), we can safely undo part of the change to the
      kthread logic from 81c98869.
      Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Han Pingtian <hanpt@linux.vnet.ibm.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Anton Blanchard <anton@samba.org>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      10922838
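      For reference, a minimal sketch of the reverted call site, assuming the
      hunk is the kthread_create_on_cpu() allocation path (illustrative and
      trimmed, not the verbatim diff):

        struct task_struct *kthread_create_on_cpu(int (*threadfn)(void *data),
                                                  void *data, unsigned int cpu,
                                                  const char *namefmt)
        {
                /* pass the plain NUMA node again; with node_to_mem_node()
                 * in place the slab allocator handles memoryless nodes
                 * itself, so callers no longer need cpu_to_mem() */
                return kthread_create_on_node(threadfn, data,
                                              cpu_to_node(cpu), namefmt, cpu);
        }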
  6. 29 Jul 2014, 1 commit
  7. 05 Jun 2014, 1 commit
  8. 04 Apr 2014, 1 commit
  9. 13 Nov 2013, 1 commit
  10. 01 May 2013, 1 commit
    • kthread: implement probe_kthread_data() · cd42d559
      Committed by Tejun Heo
      One of the problems that arise when converting a dedicated custom
      threadpool to workqueue is that the shared worker pool used by workqueue
      anonymizes each worker, making it more difficult to identify what the
      worker was doing on which target from the output of sysrq-t or a debug
      dump from oops, BUG() and friends.

      For example, after writeback is converted to use workqueue instead of a
      private thread pool, there is no easy way to tell which backing device a
      writeback work item was working on at the time of a task dump, which,
      according to our writeback brethren, is important in tracking down issues
      with a lot of mounted file systems on a lot of different devices.
      
      This patchset implements a way for a work function to mark its execution
      instance so that the task dump of the worker task includes information
      to indicate what the work item was doing.
      
      An example WARN dump would look like the following.
      
       WARNING: at fs/fs-writeback.c:1015 bdi_writeback_workfn+0x2b4/0x3c0()
       Modules linked in:
       CPU: 0 Pid: 28 Comm: kworker/u18:0 Not tainted 3.9.0-rc1-work+ #24
       Hardware name: empty empty/S3992, BIOS 080011  10/26/2007
       Workqueue: writeback bdi_writeback_workfn (flush-8:16)
        ffffffff820a3a98 ffff88015b927cb8 ffffffff81c61855 ffff88015b927cf8
        ffffffff8108f500 0000000000000000 ffff88007a171948 ffff88007a1716b0
        ffff88015b49df00 ffff88015b8d3940 0000000000000000 ffff88015b927d08
       Call Trace:
        [<ffffffff81c61855>] dump_stack+0x19/0x1b
        [<ffffffff8108f500>] warn_slowpath_common+0x70/0xa0
        ...
      
      This patch:
      
      Implement probe_kthread_data() which returns kthread_data if accessible.
      The function is equivalent to kthread_data() except that the specified
      @task may not be a kthread, or its vfork_done may already be cleared,
      rendering struct kthread inaccessible.  In the former case,
      probe_kthread_data() may return any value; in the latter, NULL.
      
      This will be used to safely print debug information without affecting
      synchronization in the normal paths.  Workqueue debug info printing on
      dump_stack() and friends will make use of it.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Jan Kara <jack@suse.cz>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      cd42d559
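      A sketch of what such a tolerant accessor can look like, assuming the
      copy is done with probe_kernel_read() (treat the body as illustrative):

        void *probe_kthread_data(struct task_struct *task)
        {
                struct kthread *kthread = to_kthread(task);
                void *data = NULL;

                /* @task may not be a kthread at all, so kthread may point
                 * at garbage; copy tolerantly instead of dereferencing */
                probe_kernel_read(&data, &kthread->data, sizeof(data));
                return data;
        }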
  11. 30 Apr 2013, 2 commits
    • kthread: kill task_get_live_kthread() · b5c5442b
      Committed by Oleg Nesterov
      task_get_live_kthread() looks confusing and unneeded.  It does
      get_task_struct(), but only kthread_stop() needs that: it can be called
      even when the caller doesn't have a reference, as long as we know this
      kthread cannot exit until we do kthread_stop().

      kthread_park() and kthread_unpark() do not need get_task_struct(); the
      callers already have the reference.  And it cannot help if we race with
      an exiting kthread anyway: kthread_park() can hang forever in that case.
      
      Change kthread_park() and kthread_unpark() to use to_live_kthread(),
      change kthread_stop() to do get_task_struct() by hand and remove
      task_get_live_kthread().
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      b5c5442b
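      A sketch of the resulting kthread_stop() with the reference taken by
      hand as described (unparking details elided; illustrative):

        int kthread_stop(struct task_struct *k)
        {
                struct kthread *kthread;
                int ret;

                get_task_struct(k);             /* done by hand now */
                kthread = to_live_kthread(k);
                if (kthread) {
                        set_bit(KTHREAD_SHOULD_STOP, &kthread->flags);
                        wake_up_process(k);
                        wait_for_completion(&kthread->exited);
                }
                ret = k->exit_code;
                put_task_struct(k);
                return ret;
        }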
    • kthread: introduce to_live_kthread() · 4ecdafc8
      Committed by Oleg Nesterov
      "k->vfork_done != NULL" with a barrier() after to_kthread(k) in
      task_get_live_kthread(k) looks unclear, and is suboptimal because we
      load ->vfork_done twice.

      All we need is to ensure that we do not return to_kthread(NULL).  Add a
      new trivial helper which loads and checks ->vfork_done once; this also
      reads more clearly.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      4ecdafc8
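      A sketch of the helper, assuming __to_kthread() is the container_of()
      step over ->exited (illustrative):

        #define __to_kthread(vfork) \
                container_of(vfork, struct kthread, exited)

        static struct kthread *to_live_kthread(struct task_struct *k)
        {
                struct completion *vfork = ACCESS_ONCE(k->vfork_done);

                /* load ->vfork_done exactly once; NULL means @k is not a
                 * live kthread and struct kthread must not be touched */
                if (likely(vfork))
                        return __to_kthread(vfork);
                return NULL;
        }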
  12. 12 Apr 2013, 1 commit
    • kthread: Prevent unpark race which puts threads on the wrong cpu · f2530dc7
      Committed by Thomas Gleixner
      The smpboot threads rely on the park/unpark mechanism, which binds
      per-cpu threads to a particular core.  The functionality is racy,
      though:
      
      CPU0	       	 	CPU1  	     	    CPU2
      unpark(T)				    wake_up_process(T)
        clear(SHOULD_PARK)	T runs
      			leave parkme() due to !SHOULD_PARK  
        bind_to(CPU2)		BUG_ON(wrong CPU)						    
      
      We cannot let the tasks move themselves to the target CPU, as one of
      those tasks is actually the migration thread itself, which requires
      that it starts running on the target cpu right away.
      
      The solution to this problem is to prevent wakeups in park mode which
      are not from unpark(). That way we can guarantee that the association
      of the task to the target cpu is working correctly.
      
      Add a new task state (TASK_PARKED) which prevents other wakeups and
      use this state explicitly for the unpark wakeup.
      
      Peter noticed: also, since the task state is visible to userspace and
      all the parked tasks are still in the PID space, it's a good hint in ps
      and friends that these tasks aren't really there for the moment.
      
      The migration thread has another related issue.
      
      CPU0	      	     	 CPU1
      Bring up CPU2
      create_thread(T)
      park(T)
       wait_for_completion()
      			 parkme()
      			 complete()
      sched_set_stop_task()
      			 schedule(TASK_PARKED)
      
      The sched_set_stop_task() call is issued while the task is on the
      runqueue of CPU1 and that confuses the hell out of the stop_task class
      on that cpu.  So we need the same synchronization before
      sched_set_stop_task().
      Reported-by: Dave Jones <davej@redhat.com>
      Reported-and-tested-by: Dave Hansen <dave@sr71.net>
      Reported-and-tested-by: Borislav Petkov <bp@alien8.de>
      Acked-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Cc: dhillf@gmail.com
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: stable@vger.kernel.org
      Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1304091635430.21884@ionos
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      f2530dc7
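      A sketch of how TASK_PARKED closes the race, assuming the park loop and
      the unpark wakeup look roughly like this (per-cpu rebinding elided;
      illustrative):

        static void __kthread_parkme(struct kthread *self)
        {
                __set_current_state(TASK_PARKED);
                while (test_bit(KTHREAD_SHOULD_PARK, &self->flags)) {
                        if (!test_and_set_bit(KTHREAD_IS_PARKED, &self->flags))
                                complete(&self->parked);
                        schedule();
                        __set_current_state(TASK_PARKED);
                }
                clear_bit(KTHREAD_IS_PARKED, &self->flags);
                __set_current_state(TASK_RUNNING);
        }

        void kthread_unpark(struct task_struct *k)
        {
                struct kthread *kthread = to_kthread(k);

                clear_bit(KTHREAD_SHOULD_PARK, &kthread->flags);
                if (test_and_clear_bit(KTHREAD_IS_PARKED, &kthread->flags))
                        /* only this targeted wakeup can leave TASK_PARKED;
                         * a stray wake_up_process() no longer suffices */
                        wake_up_state(k, TASK_PARKED);
        }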
  13. 20 Mar 2013, 1 commit
    • sched: replace PF_THREAD_BOUND with PF_NO_SETAFFINITY · 14a40ffc
      Committed by Tejun Heo
      PF_THREAD_BOUND was originally used to mark kernel threads which were
      bound to a specific CPU using kthread_bind(); a task with the flag set
      allows cpus_allowed modifications only by itself.  Workqueue is
      currently abusing it to prevent userland from meddling with the
      cpus_allowed of workqueue workers.

      What we need is a flag to prevent userland from messing with the
      cpus_allowed of certain kernel tasks.  In kernel, anyone can
      (incorrectly) squash the flag, and, for worker-type usages,
      restricting cpus_allowed modification to the task itself doesn't
      provide meaningful extra protection as other tasks can inject work
      items to the task anyway.

      This patch replaces PF_THREAD_BOUND with PF_NO_SETAFFINITY.
      sched_setaffinity() checks the flag and returns -EINVAL if it is set.
      set_cpus_allowed_ptr() is no longer affected by the flag.
      
      This will allow simplifying workqueue worker CPU affinity management.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Ingo Molnar <mingo@kernel.org>
      Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      14a40ffc
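      A sketch of the check this adds; check_setaffinity_allowed() is a
      hypothetical name for what is an inline test in sched_setaffinity():

        static int check_setaffinity_allowed(const struct task_struct *p)
        {
                /* userland may no longer touch the affinity of marked
                 * tasks; in-kernel set_cpus_allowed_ptr() ignores the
                 * flag */
                if (p->flags & PF_NO_SETAFFINITY)
                        return -EINVAL;
                return 0;
        }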
  14. 13 Dec 2012, 1 commit
  15. 13 Oct 2012, 1 commit
    • infrastructure for saner ret_from_kernel_thread semantics · a74fb73c
      Committed by Al Viro
      * allow kernel_execve() to leave the actual return to userland to the
      caller (selected by CONFIG_GENERIC_KERNEL_EXECVE).  Callers updated
      accordingly.
      * an architecture that selects GENERIC_KERNEL_EXECVE in its
      Kconfig should have its ret_from_kernel_thread() do this:
      	call schedule_tail
      	call the callback left for it by copy_thread(); if it ever
      returns, that's because it has just done a successful kernel_execve()
      	jump to return from syscall
      IOW, its only difference from ret_from_fork() is that it does call the
      callback.
      * such an architecture should also get rid of ret_from_kernel_execve()
      and __ARCH_WANT_KERNEL_EXECVE
      
      This is the last of the infrastructure patches in that area - from
      this point on, work on different architectures can live independently.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      a74fb73c
  16. 13 Aug 2012, 1 commit
  17. 23 Jul 2012, 2 commits
    • kthread_worker: reimplement flush_kthread_work() to allow freeing the work item being executed · 46f3d976
      Committed by Tejun Heo
      kthread_worker provides a minimalistic workqueue-like interface for
      users which need a dedicated worker thread (e.g. for realtime
      priority).  It has basic queue, flush_work and flush_worker operations
      which mostly match their workqueue counterparts; however, due to the
      way flush_work() is implemented, it has a noticeable difference: it
      does not allow work items to be freed while being executed.

      While the current users of kthread_worker are okay with the current
      behavior, the restriction does impede some valid use cases.  Also,
      removing this difference isn't difficult and actually makes the code
      easier to understand.
      
      This patch reimplements flush_kthread_work() such that it uses a
      flush_work item instead of queue/done sequence numbers.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      46f3d976
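      A simplified sketch of the reimplementation: flushing queues a
      dedicated flush work right behind @work (or behind the running
      instance) and waits on a completion, so @work itself is never
      dereferenced again and may be freed mid-flight (insert_kthread_work()
      is the helper factored out in the entry below; retry logic elided):

        struct kthread_flush_work {
                struct kthread_work     work;
                struct completion       done;
        };

        static void kthread_flush_work_fn(struct kthread_work *work)
        {
                struct kthread_flush_work *fwork =
                        container_of(work, struct kthread_flush_work, work);
                complete(&fwork->done);
        }

        void flush_kthread_work(struct kthread_work *work)
        {
                struct kthread_flush_work fwork = {
                        KTHREAD_WORK_INIT(fwork.work, kthread_flush_work_fn),
                        COMPLETION_INITIALIZER_ONSTACK(fwork.done),
                };
                struct kthread_worker *worker = work->worker;
                bool noop = false;

                if (!worker)
                        return;

                spin_lock_irq(&worker->lock);
                if (!list_empty(&work->node))
                        insert_kthread_work(worker, &fwork.work,
                                            work->node.next);
                else if (worker->current_work == work)
                        insert_kthread_work(worker, &fwork.work,
                                            worker->work_list.next);
                else
                        noop = true;
                spin_unlock_irq(&worker->lock);

                if (!noop)
                        wait_for_completion(&fwork.done);
        }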
    • kthread_worker: reorganize to prepare for flush_kthread_work() reimplementation · 9a2e03d8
      Committed by Tejun Heo
      Make the following two non-functional changes.
      
      * Separate out insert_kthread_work() from queue_kthread_work().
      
      * Relocate struct kthread_flush_work and kthread_flush_work_fn()
        definitions above flush_kthread_work().
      
      v2: Added lockdep_assert_held() in insert_kthread_work() as suggested
          by Andy Walls.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Andy Walls <awalls@md.metrocast.net>
      9a2e03d8
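      A sketch of the factored-out helper with the suggested assertion
      (simplified):

        /* insert @work before @pos; caller must hold worker->lock */
        static void insert_kthread_work(struct kthread_worker *worker,
                                        struct kthread_work *work,
                                        struct list_head *pos)
        {
                lockdep_assert_held(&worker->lock);

                list_add_tail(&work->node, pos);
                if (likely(worker->task))
                        wake_up_process(worker->task);
        }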
  18. 24 Nov 2011, 1 commit
    • freezer: kill unused set_freezable_with_signal() · 34b087e4
      Committed by Tejun Heo
      There's no in-kernel user of set_freezable_with_signal() left.  Mixing
      TIF_SIGPENDING with kernel threads can lead to nasty corner cases as
      kernel threads never travel the signal delivery path on their own.

      e.g. the current implementation is buggy in the cancellation path of
      __thaw_task().  It calls recalc_sigpending_and_wake() in an attempt to
      clear TIF_SIGPENDING, but the function never clears it regardless of
      sigpending state.  This means that signallable freezable kthreads may
      continue executing with !freezing() && stuck TIF_SIGPENDING, which can
      be troublesome.
      
      This patch removes set_freezable_with_signal() along with
      PF_FREEZER_NOSIG and recalc_sigpending*() calls in freezer.  User
      tasks get TIF_SIGPENDING, kernel tasks get woken up and the spurious
      sigpending is dealt with in the usual signal delivery path.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Oleg Nesterov <oleg@redhat.com>
      34b087e4
  19. 22 Nov 2011, 1 commit
    • freezer: implement and use kthread_freezable_should_stop() · 8a32c441
      Committed by Tejun Heo
      Writeback and thinkpad_acpi have been using thaw_process() to prevent
      deadlock between the freezer and kthread_stop(); unfortunately, this
      is inherently racy - nothing prevents freezing from happening between
      thaw_process() and kthread_stop().

      This patch implements kthread_freezable_should_stop() which enters the
      refrigerator if necessary but is guaranteed to return if
      kthread_stop() is invoked.  Both thaw_process() users are converted to
      use the new function.

      Note that this deadlock condition exists for many freezable
      kthreads.  They need to be converted to use the new should_stop or a
      freezable workqueue.

      Tested with a synthetic test case.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Henrique de Moraes Holschuh <ibm-acpi@hmh.eng.br>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Oleg Nesterov <oleg@redhat.com>
      8a32c441
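      A sketch of the new helper; the key point is that __refrigerator() is
      asked to also watch kthread_should_stop(), so the freeze wait cannot
      deadlock against kthread_stop() (illustrative):

        bool kthread_freezable_should_stop(bool *was_frozen)
        {
                bool frozen = false;

                might_sleep();

                if (unlikely(freezing(current)))
                        /* 'true' asks the refrigerator to bail out as
                         * soon as kthread_should_stop() becomes true */
                        frozen = __refrigerator(true);

                if (was_frozen)
                        *was_frozen = frozen;

                return kthread_should_stop();
        }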
  20. 31 Oct 2011, 1 commit
  21. 28 May 2011, 1 commit
  22. 31 Mar 2011, 1 commit
  23. 23 Mar 2011, 1 commit
  24. 07 Jan 2011, 1 commit
  25. 22 Dec 2010, 1 commit
  26. 23 Oct 2010, 1 commit
  27. 29 Jun 2010, 2 commits
    • kthread: implement kthread_data() · 82805ab7
      Committed by Tejun Heo
      Implement kthread_data() which takes @task pointing to a kthread and
      returns @data specified when creating the kthread.  The caller is
      responsible for ensuring the validity of @task when calling this
      function.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      82805ab7
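      The accessor itself is essentially a one-liner (illustrative):

        void *kthread_data(struct task_struct *task)
        {
                /* the caller guarantees @task is a live kthread */
                return to_kthread(task)->data;
        }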
    • kthread: implement kthread_worker · b56c0d89
      Committed by Tejun Heo
      Implement a simple work processor for kthreads.  This is to make
      kthreads easier to use.  Single-threaded workqueues used to be used for
      things like this, but workqueue no longer guarantees a fixed kthread
      association, in order to enable worker sharing.

      This can be used in cases where a specific kthread association is
      necessary, for example, when the thread should have RT priority or be
      assigned to a certain cgroup.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      b56c0d89
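      A usage sketch with the API names of this era; my_work_fn() and
      my_init() are hypothetical, and the worker thread is a plain kthread,
      so it can be given RT priority or a cgroup before work is queued:

        static void my_work_fn(struct kthread_work *work)
        {
                pr_info("running in the dedicated worker thread\n");
        }

        static int my_init(void)
        {
                static struct kthread_worker worker;
                static struct kthread_work work;
                struct task_struct *task;

                init_kthread_worker(&worker);
                task = kthread_run(kthread_worker_fn, &worker, "my-worker");
                if (IS_ERR(task))
                        return PTR_ERR(task);

                init_kthread_work(&work, my_work_fn);
                queue_kthread_work(&worker, &work);
                flush_kthread_work(&work);
                return 0;
        }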
  28. 25 Mar 2010, 1 commit
  29. 09 Feb 2010, 2 commits
  30. 17 Dec 2009, 1 commit
  31. 03 Nov 2009, 1 commit
  32. 09 Sep 2009, 1 commit
  33. 28 Jul 2009, 1 commit
  34. 19 Jun 2009, 2 commits
    • kthreads: rework kthread_stop() · 63706172
      Committed by Oleg Nesterov
      Based on Eric's patch which in turn was based on my patch.
      
      kthread_stop() has several nasty problems:

      - it runs unpredictably long with the global semaphore held.

      - it deadlocks if the kthread itself does kthread_stop() before it
        obeys the kthread_should_stop() request.

      - it is not usable if the kthread exits on its own, see for example the
        ugly "wait_to_die:" hack in migration_thread()

      - it is not possible to just tell a kthread it should stop, we must
        always wait for its exit.
      
      With this patch kthread() allocates all necessary data (struct kthread)
      on its own stack, and the kthread_stop_xxx globals are deleted.
      ->vfork_done is used as a pointer into "struct kthread", which means
      kthread_stop() can easily wait for the kthread's exit.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Pavel Emelyanov <xemul@openvz.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Vitaliy Gusev <vgusev@openvz.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      63706172
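      A sketch of the reworked kthread() with struct kthread living on the
      thread's own stack (simplified from the description above):

        static int kthread(void *_create)
        {
                struct kthread_create_info *create = _create;
                int (*threadfn)(void *data) = create->threadfn;
                void *data = create->data;
                struct kthread self;            /* on this thread's stack */
                int ret;

                self.should_stop = 0;
                init_completion(&self.exited);
                /* ->vfork_done now points into struct kthread, which is
                 * how kthread_stop() finds it and waits on ->exited */
                current->vfork_done = &self.exited;

                /* tell the creator we're spawned, wait for stop/wakeup */
                __set_current_state(TASK_UNINTERRUPTIBLE);
                create->result = current;
                complete(&create->done);
                schedule();

                ret = -EINTR;
                if (!self.should_stop)
                        ret = threadfn(data);

                /* must not simply return: do_exit() fires ->vfork_done */
                do_exit(ret);
        }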
    • kthreads: simplify the startup synchronization · cdd140bd
      Committed by Oleg Nesterov
      We use two completions to create the kernel thread, which is a bit
      ugly.  kthread() wakes up create_kthread() via ->started, then
      create_kthread() wakes up the caller kthread_create() via ->done.  But
      create_kthread() does not need to wait for kthread(); it can just
      return.  Instead, kthread() itself can wake up the caller of
      kthread_create().

      Kill kthread_create_info->started; ->done is enough.  This improves the
      scalability a bit and simplifies the code.

      The only problem is when kernel_thread() fails; in that case
      create_kthread() must do complete(&create->done) itself.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Pavel Emelyanov <xemul@openvz.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Vitaliy Gusev <vgusev@openvz.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      cdd140bd
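      A sketch of the failure handling this leaves in create_kthread()
      (illustrative):

        static void create_kthread(struct kthread_create_info *create)
        {
                int pid;

                pid = kernel_thread(kthread, create,
                                    CLONE_FS | CLONE_FILES | SIGCHLD);
                if (pid < 0) {
                        /* kthread() will never run, so complete ->done
                         * here or kthread_create() would sleep forever */
                        create->result = ERR_PTR(pid);
                        complete(&create->done);
                }
        }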
  35. 17 Jun 2009, 1 commit
    • cpuset,mm: update tasks' mems_allowed in time · 58568d2a
      Committed by Miao Xie
      Fix allocating page cache/slab objects on an unallowed node when memory
      spread is set, by updating tasks' mems_allowed after their cpuset's
      mems is changed.

      In order to update tasks' mems_allowed in time, we must modify the
      memory policy code, because the memory policy was originally applied in
      the process's own context.  After applying this patch, one task
      directly manipulates another's mems_allowed, and we use alloc_lock in
      the task_struct to protect the mems_allowed and memory policy of the
      task.

      In the fast path, however, we don't use the lock to protect them,
      because adding a lock may lead to performance regression.  But without
      a lock, the task might see no nodes when its cpuset's mems_allowed is
      changed to some non-overlapping set.  In order to avoid that, we set
      all new allowed nodes first, then clear the newly disallowed ones.
      
      [lee.schermerhorn@hp.com:
        The rework of mpol_new() to extract the adjusting of the node mask to
        apply cpuset and mpol flags "context" breaks set_mempolicy() and mbind()
        with MPOL_PREFERRED and a NULL nodemask--i.e., explicit local
        allocation.  Fix this by adding the check for MPOL_PREFERRED and an
        empty node mask to mpol_new_mempolicy().

        Remove the now unneeded 'nodes = NULL' from mpol_new().

        Note that mpol_new_mempolicy() is always called with a non-NULL
        'nodes' parameter now that it has been removed from mpol_new().
        Therefore, we don't need to test nodes for NULL before testing it for
        'empty'.  However, just to be extra paranoid, add a VM_BUG_ON() to
        verify this assumption.]
      [lee.schermerhorn@hp.com:
        I don't think the function name 'mpol_new_mempolicy' is descriptive
        enough to differentiate it from mpol_new().

        This function applies cpuset set context, usually constraining nodes
        to those allowed by the cpuset.  However, when the 'RELATIVE_NODES'
        flag is set, it also translates the nodes.  So I settled on
        'mpol_set_nodemask()', because the comment block for mpol_new()
        mentions that we need to call this function to "set nodes".

        Some additional minor line length, whitespace and typo cleanup.]
      Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Paul Menage <menage@google.com>
      Cc: Nick Piggin <nickpiggin@yahoo.com.au>
      Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: David Rientjes <rientjes@google.com>
      Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      58568d2a
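      A sketch of the grow-then-shrink update described above;
      update_task_nodemask() is a hypothetical name, while task_lock()
      (which takes alloc_lock) and mpol_rebind_task() are real:

        static void update_task_nodemask(struct task_struct *tsk,
                                         const nodemask_t *newmems)
        {
                task_lock(tsk);                 /* tsk->alloc_lock */
                /* grow first: a lockless reader in the allocation fast
                 * path never observes an empty mems_allowed */
                nodes_or(tsk->mems_allowed, tsk->mems_allowed, *newmems);
                mpol_rebind_task(tsk, newmems);
                /* then shrink to exactly the new set */
                tsk->mems_allowed = *newmems;
                task_unlock(tsk);
        }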