1. 11 4月, 2006 1 次提交
    • J
      [PATCH] splice: add direct fd <-> fd splicing support · b92ce558
      Jens Axboe 提交于
      It's more efficient for sendfile() emulation. Basically we cache an
      internal private pipe and just use that as the intermediate area for
      pages. Direct splicing is not available from sys_splice(), it is only
      meant to be used for sendfile() emulation.
      
      Additional patch from Ingo Molnar to avoid the PIPE_BUFFERS loop at
      exit for the normal fast path.
      Signed-off-by: NJens Axboe <axboe@suse.de>
      b92ce558
  2. 01 4月, 2006 1 次提交
    • E
      [PATCH] task: RCU protect task->usage · 8c7904a0
      Eric W. Biederman 提交于
      A big problem with rcu protected data structures that are also reference
      counted is that you must jump through several hoops to increase the reference
      count.  I think someone finally implemented atomic_inc_not_zero(&count) to
      automate the common case.  Unfortunately this means you must special case the
      rcu access case.
      
      When data structures are only visible via rcu in a manner that is not
      determined by the reference count on the object (i.e.  tasks are visible until
      their zombies are reaped) there is a much simpler technique we can employ.
      Simply delaying the decrement of the reference count until the rcu interval is
      over.
      
      What that means is that the proc code that looks up a task and later
      wants to sleep can now do:
      
      rcu_read_lock();
      task = find_task_by_pid(some_pid);
      if (task) {
      	get_task_struct(task);
      }
      rcu_read_unlock();
      
      The effect on the rest of the kernel is that put_task_struct becomes cheaper
      and immediate, and in the case where the task has been reaped it frees the
      task immediate instead of unnecessarily waiting an until the rcu interval is
      over.
      
      Cleanup of task_struct does not happen when its reference count drops to
      zero, instead cleanup happens when release_task is called.  Tasks can only
      be looked up via rcu before release_task is called.  All rcu protected
      members of task_struct are freed by release_task.
      
      Therefore we can move call_rcu from put_task_struct into release_task.  And
      we can modify release_task to not immediately release the reference count
      but instead have it call put_task_struct from the function it gives to
      call_rcu.
      
      The end result:
      
      - get_task_struct is safe in an rcu context where we have just looked
        up the task.
      
      - put_task_struct() simplifies into its old pre rcu self.
      
      This reorganization also makes put_task_struct uncallable from modules as
      it is not exported but it does not appear to be called from any modules so
      this should not be an issue, and is trivially fixed.
      Signed-off-by: NEric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      8c7904a0
  3. 29 3月, 2006 14 次提交
  4. 28 3月, 2006 2 次提交
  5. 23 3月, 2006 1 次提交
  6. 19 3月, 2006 1 次提交
  7. 21 2月, 2006 1 次提交
  8. 15 1月, 2006 2 次提交
  9. 12 1月, 2006 1 次提交
  10. 11 1月, 2006 2 次提交
  11. 10 1月, 2006 1 次提交
  12. 09 1月, 2006 3 次提交
  13. 14 11月, 2005 1 次提交
  14. 07 11月, 2005 1 次提交
  15. 31 10月, 2005 3 次提交
  16. 30 10月, 2005 1 次提交
    • H
      [PATCH] mm: update_hiwaters just in time · 365e9c87
      Hugh Dickins 提交于
      update_mem_hiwater has attracted various criticisms, in particular from those
      concerned with mm scalability.  Originally it was called whenever rss or
      total_vm got raised.  Then many of those callsites were replaced by a timer
      tick call from account_system_time.  Now Frank van Maarseveen reports that to
      be found inadequate.  How about this?  Works for Frank.
      
      Replace update_mem_hiwater, a poor combination of two unrelated ops, by macros
      update_hiwater_rss and update_hiwater_vm.  Don't attempt to keep
      mm->hiwater_rss up to date at timer tick, nor every time we raise rss (usually
      by 1): those are hot paths.  Do the opposite, update only when about to lower
      rss (usually by many), or just before final accounting in do_exit.  Handle
      mm->hiwater_vm in the same way, though it's much less of an issue.  Demand
      that whoever collects these hiwater statistics do the work of taking the
      maximum with rss or total_vm.
      
      And there has been no collector of these hiwater statistics in the tree.  The
      new convention needs an example, so match Frank's usage by adding a VmPeak
      line above VmSize to /proc/<pid>/status, and also a VmHWM line above VmRSS
      (High-Water-Mark or High-Water-Memory).
      
      There was a particular anomaly during mremap move, that hiwater_vm might be
      captured too high.  A fleeting such anomaly remains, but it's quickly
      corrected now, whereas before it would stick.
      
      What locking?  None: if the app is racy then these statistics will be racy,
      it's not worth any overhead to make them exact.  But whenever it suits,
      hiwater_vm is updated under exclusive mmap_sem, and hiwater_rss under
      page_table_lock (for now) or with preemption disabled (later on): without
      going to any trouble, minimize the time between reading current values and
      updating, to minimize those occasions when a racing thread bumps a count up
      and back down in between.
      Signed-off-by: NHugh Dickins <hugh@veritas.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      365e9c87
  17. 28 10月, 2005 1 次提交
  18. 24 10月, 2005 1 次提交
    • O
      [PATCH] posix-timers: remove false BUG_ON() from run_posix_cpu_timers() · 3de463c7
      Oleg Nesterov 提交于
      do_exit() clears ->it_##clock##_expires, but nothing prevents
      another cpu to attach the timer to exiting process after that.
      
      After exit_notify() does 'write_unlock_irq(&tasklist_lock)' and
      before do_exit() calls 'schedule() local timer interrupt can find
      tsk->exit_state != 0. If that state was EXIT_DEAD (or another cpu
      does sys_wait4) interrupted task has ->signal == NULL.
      
      At this moment exiting task has no pending cpu timers, they were cleaned
      up in __exit_signal()->posix_cpu_timers_exit{,_group}(), so we can just
      return from irq.
      Signed-off-by: NOleg Nesterov <oleg@tv-sign.ru>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      3de463c7
  19. 22 10月, 2005 1 次提交
    • R
      [PATCH] Call exit_itimers from do_exit, not __exit_signal · 25f407f0
      Roland McGrath 提交于
      When I originally moved exit_itimers into __exit_signal, that was the only
      place where we could reliably know it was the last thread in the group
      dying, without races.  Since then we've gotten the signal_struct.live
      counter, and do_exit can reliably do group-wide cleanup work.
      
      This patch moves the call to do_exit, where it's made without locks.  This
      avoids the deadlock issues that the old __exit_signal code's comment talks
      about, and the one that Oleg found recently with process CPU timers.
      
      [ This replaces e03d13e9, which is why
        it was just reverted. ]
      Signed-off-by: NRoland McGrath <roland@redhat.com>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      25f407f0
  20. 02 10月, 2005 1 次提交
    • L
      Fix inequality comparison against "task->state" · 14bf01bb
      Linus Torvalds 提交于
      We should always use bitmask ops, rather than depend on some ordering of
      the different states.  With the TASK_NONINTERACTIVE flag, the inequality
      doesn't really work.
      
      Oleg Nesterov argues (likely correctly) that this test is unnecessary in
      the first place.  However, the minimal fix for now is to at least make
      it work in the presense of TASK_NONINTERACTIVE.  Waiting for consensus
      from Roland & co on potential bigger cleanups.
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      14bf01bb