1. 28 Jul, 2010 18 commits
  2. 22 Jul, 2010 5 commits
  3. 19 Jul, 2010 1 commit
  4. 05 Jul, 2010 1 commit
  5. 01 Jul, 2010 2 commits
    • P
      sched: Cure nr_iowait_cpu() users · 8c215bd3
      Peter Zijlstra committed
      Commit 0224cf4c (sched: Intoduce get_cpu_iowait_time_us())
      broke things by not making sure preemption was indeed disabled
      by the callers of nr_iowait_cpu() which took the iowait value of
      the current cpu.
      
      This resulted in a heap of preempt warnings. Cure this by making
      nr_iowait_cpu() take a cpu number and fix up the callers to pass
      in the right number.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Rafael J. Wysocki <rjw@sisk.pl>
      Cc: Maxim Levitsky <maximlevitsky@gmail.com>
      Cc: Len Brown <len.brown@intel.com>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: Jiri Slaby <jslaby@suse.cz>
      Cc: linux-pm@lists.linux-foundation.org
      LKML-Reference: <1277968037.1868.120.camel@laptop>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      8c215bd3
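
The change described in this commit can be illustrated with a minimal userspace sketch in C. It is not the kernel code: `NR_CPUS_SKETCH`, `struct rq_sketch`, `runqueues_sketch`, and `nr_iowait_cpu_sketch()` are hypothetical stand-ins. The only point is that the caller names the CPU explicitly instead of the function reading the counter of whichever CPU it happens to run on.

```c
#include <assert.h>

/* Hypothetical CPU count for this sketch; the kernel sizes this differently. */
#define NR_CPUS_SKETCH 4

/* Stand-in for a per-CPU runqueue; only the iowait counter is modelled. */
struct rq_sketch {
    int nr_iowait;
};

struct rq_sketch runqueues_sketch[NR_CPUS_SKETCH];

/*
 * The caller passes the CPU number, so the result does not depend on which
 * CPU the caller is running on at the moment the lookup executes.
 */
int nr_iowait_cpu_sketch(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS_SKETCH);
    return runqueues_sketch[cpu].nr_iowait;
}
```

With this shape, a caller that already knows which CPU it is accounting for passes that number through, and a migration between picking the CPU and reading the counter no longer changes which counter is read.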
    • M
      futex: futex_find_get_task remove credentails check · 7a0ea09a
      Michal Hocko committed
      futex_find_get_task is currently used (through lookup_pi_state) from two
      contexts, futex_requeue and futex_lock_pi_atomic.  Neither path appears
      to need the credentials check, though.  Different (e)uids shouldn't
      matter at all because the only thing that is important for a shared
      futex is the accessibility of the shared memory.
      
      The credentials check results in a glibc assert failure or a process
      hang (if glibc is compiled without assert support) for a shared robust
      pthread mutex with priority inheritance if a process tries to lock an
      already-held lock owned by a process with a different euid:
      
      pthread_mutex_lock.c:312: __pthread_mutex_lock_full: Assertion `(-(e)) != 3 || !robust' failed.
      
      The problem is that futex_lock_pi_atomic, which is called when we try to
      lock an already-held lock, checks the current holder (the tid is stored
      in the futex value) to get the PI state.  It uses lookup_pi_state, which
      in turn gets the task struct from futex_find_get_task.  ESRCH is returned
      either when the task is not found or when the credentials check fails.
      
      futex_lock_pi_atomic simply returns if it gets ESRCH.  The glibc code,
      however, doesn't expect a robust lock to return ESRCH because it
      should get either success or owner died.
      Signed-off-by: Michal Hocko <mhocko@suse.cz>
      Acked-by: Darren Hart <dvhltc@us.ibm.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Nick Piggin <npiggin@suse.de>
      Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      7a0ea09a
  6. 30 Jun, 2010 1 commit
  7. 25 Jun, 2010 1 commit
  8. 24 Jun, 2010 1 commit
  9. 23 Jun, 2010 1 commit
    • D
      rcu: apply RCU protection to wake_affine() · f3b577de
      Daniel J Blueman committed
      The task_group() function returns a pointer that must be protected
      by either RCU, the ->alloc_lock, or the cgroup lock (see the
      rcu_dereference_check() in task_subsys_state(), which is invoked by
      task_group()).  The wake_affine() function currently does none of these,
      which means that a concurrent update would be within its rights to free
      the structure returned by task_group().  Because wake_affine() uses this
      structure only to compute load-balancing heuristics, there is no reason
      to acquire either of the two locks.
      
      Therefore, this commit introduces an RCU read-side critical section that
      starts before the first call to task_group() and ends after the last use
      of the "tg" pointer returned from task_group().  Thanks to Li Zefan for
      pointing out the need to extend the RCU read-side critical section from
      that proposed by the original patch.
      Signed-off-by: Daniel J Blueman <daniel.blueman@gmail.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      f3b577de
  10. 18 Jun, 2010 2 commits
    • A
      sched: Fix over-scheduling bug · 3c93717c
      Alex,Shi committed
      Commit e7097159 ("sched: Optimize unused cgroup configuration") introduced
      an imbalanced scheduling bug.
      
      If CGROUP is not used, update_h_load() does not update h_load. When the
      system runs far more tasks than it has logical CPUs, the incorrect
      cfs_rq[cpu]->h_load value causes load_balance() to pull too many tasks
      from the busiest CPU to the local CPU. The busiest CPU therefore keeps
      changing in round-robin fashion, which hurts performance.
      
      The issue was originally found with a scientific calculation workload
      developed by Yanmin. With that commit, the workload's performance drops
      by about 40%.
      
       CPU  before    after
      
       00   : 2       : 7
       01   : 1       : 7
       02   : 11      : 6
       03   : 12      : 7
       04   : 6       : 6
       05   : 11      : 7
       06   : 10      : 6
       07   : 12      : 7
       08   : 11      : 6
       09   : 12      : 6
       10   : 1       : 6
       11   : 1       : 6
       12   : 6       : 6
       13   : 2       : 6
       14   : 2       : 6
       15   : 1       : 6
      Reviewed-by: Yanmin zhang <yanmin.zhang@intel.com>
      Signed-off-by: Alex Shi <alex.shi@intel.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <1276754893.9452.5442.camel@debian>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      3c93717c
    • P
      nohz: Fix nohz ratelimit · 3310d4d3
      Peter Zijlstra committed
      Chris Wedgwood reports that 39c0cbe2 (sched: Rate-limit nohz) causes a
      serial console regression (unresponsiveness), and indeed it does. The
      reason is that the nohz code is skipped even when the tick was already
      stopped before the nohz_ratelimit(cpu) condition changed.
      
      Move the nohz_ratelimit() check to the other conditions which prevent
      long idle sleeps.
      Reported-by: Chris Wedgwood <cw@f00f.org>
      Tested-by: Brian Bloniarz <bmb@athenacr.com>
      Signed-off-by: Mike Galbraith <efault@gmx.de>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Greg KH <gregkh@suse.de>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
      Cc: Jef Driesen <jefdriesen@telenet.be>
      LKML-Reference: <1276790557.27822.516.camel@twins>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      3310d4d3
  11. 11 Jun, 2010 1 commit
    • S
      perf/tracing: Fix regression of perf losing kprobe events · a8fb2608
      Steven Rostedt committed
      With the addition of the code to shrink the kernel tracepoint
      infrastructure, we lost kprobes being traced by perf. The reason
      is that I tested if the "tp_event->class->perf_probe" existed before
      enabling it. This prevents "ftrace only" events (like the function
      trace events) from being enabled by perf.
      
      Unfortunately, kprobe events do not use perf_probe. This causes
      kprobes to be missed by perf. To fix this, we add the test to
      see if "tp_event->class->reg" exists as well as perf_probe.
      
      Normal trace events have only "perf_probe" but no "reg" function,
      and kprobes and syscalls have the "reg" but no "perf_probe".
      The ftrace unique events do not have either, so this is a valid
      test. If a kprobe or syscall is not to be probed by perf, the
      "reg" function is called anyway, and will return a failure and
      prevent perf from probing it.
      Reported-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Tested-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      a8fb2608
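
The test described in this commit message can be sketched in userspace C as follows. This is an illustration under stated assumptions, not the kernel's code: `struct event_class_sketch`, the three sample instances, and `perf_may_enable_sketch()` are hypothetical names, and the callbacks are empty placeholders. The sketch only encodes the rule from the message: perf may enable an event if its class has `perf_probe` or `reg`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an event class with the two callbacks at issue. */
struct event_class_sketch {
    void (*perf_probe)(void);
    int (*reg)(void);
};

/* Placeholder callbacks; they exist only so the pointers are non-NULL. */
void sample_perf_probe(void) { }
int sample_reg(void) { return 0; }

/* The three kinds of events named in the commit message. */
const struct event_class_sketch normal_trace_event = { sample_perf_probe, NULL };
const struct event_class_sketch kprobe_event = { NULL, sample_reg };
const struct event_class_sketch ftrace_only_event = { NULL, NULL };

/* Allow the event if either callback exists; reject "ftrace only" events. */
int perf_may_enable_sketch(const struct event_class_sketch *cls)
{
    assert(cls != NULL);
    return cls->perf_probe != NULL || cls->reg != NULL;
}
```

Checking only `perf_probe` would reject `kprobe_event`, which is the regression the commit describes; checking both still rejects `ftrace_only_event`.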
  12. 10 Jun, 2010 1 commit
  13. 09 Jun, 2010 3 commits
    • T
      genirq: Deal with desc->set_type() changing desc->chip · 46732475
      Thomas Gleixner committed
      The set_type() function can change the chip implementation when the
      trigger mode changes. That might result in using a non-initialized
      irq chip when called from __setup_irq() or when called via
      set_irq_type() on an already enabled irq.
      
      The set_irq_type() function should not be called on an enabled irq,
      but because we forgot to put a check into it, we have a bunch of users
      which grew the habit of doing that and it never blew up as the
      function is serialized via desc->lock against all users of desc->chip
      and they never hit the non-initialized irq chip issue.
      
      The easy fix for the __setup_irq() issue would be to move the
      irq_chip_set_defaults(desc->chip) call after the trigger setting to
      make sure that a chip change is covered.
      
      But as we already have users that do the type setting after
      request_irq(), the safe fix for now is to call irq_chip_set_defaults()
      from __irq_set_trigger() when desc->set_type() changed the irq chip.
      
      It needs a deeper analysis whether we should refuse to change the chip
      on an already enabled irq, but that'd be a large scale change to fix
      all the existing users. So that's neither stable nor 2.6.35 material.
      Reported-by: Esben Haabendal <eha@doredevelopment.dk>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: linuxppc-dev <linuxppc-dev@ozlabs.org>
      Cc: stable@kernel.org
      46732475
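
The "safe fix" described above can be sketched in userspace C. Everything here is a hypothetical model rather than the genirq code: `struct irq_chip_sketch`, `struct irq_desc_sketch`, `example_set_type()`, and the `defaults_set` flag are invented for illustration. The sketch shows only the control flow: remember the chip, call `set_type()`, and apply defaults if the chip pointer changed.

```c
#include <assert.h>
#include <stddef.h>

enum trigger_sketch { TRIGGER_LEVEL, TRIGGER_EDGE };

/* Hypothetical chip; defaults_set models "irq_chip_set_defaults() has run". */
struct irq_chip_sketch {
    const char *name;
    int defaults_set;
};

/* Hypothetical descriptor holding the current chip and a set_type hook. */
struct irq_desc_sketch {
    struct irq_chip_sketch *chip;
    int (*set_type)(struct irq_desc_sketch *desc, enum trigger_sketch type);
};

struct irq_chip_sketch level_chip = { "level", 1 };
struct irq_chip_sketch edge_chip = { "edge", 0 };  /* not yet initialized */

void chip_set_defaults_sketch(struct irq_chip_sketch *chip)
{
    chip->defaults_set = 1;
}

/* A set_type hook that swaps the chip when the trigger mode changes. */
int example_set_type(struct irq_desc_sketch *desc, enum trigger_sketch type)
{
    desc->chip = (type == TRIGGER_EDGE) ? &edge_chip : &level_chip;
    return 0;
}

/* Apply defaults only when set_type() succeeded and replaced the chip. */
int irq_set_trigger_sketch(struct irq_desc_sketch *desc, enum trigger_sketch type)
{
    struct irq_chip_sketch *old_chip;
    int ret;

    assert(desc != NULL && desc->set_type != NULL);
    old_chip = desc->chip;
    ret = desc->set_type(desc, type);
    if (ret == 0 && desc->chip != old_chip)
        chip_set_defaults_sketch(desc->chip);
    return ret;
}

struct irq_desc_sketch example_desc = { &level_chip, example_set_type };
```

Calling `irq_set_trigger_sketch(&example_desc, TRIGGER_EDGE)` switches the descriptor to `edge_chip` and marks it initialized in the same step, so no later user sees a chip without defaults.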
    • P
      sched: Fix PROVE_RCU vs cpu_cgroup · dc61b1d6
      Peter Zijlstra committed
      PROVE_RCU has a few issues with the cpu_cgroup because the scheduler
      typically holds rq->lock around the css rcu derefs but the generic
      cgroup code doesn't (and can't) know about that lock.
      
      Provide means to add extra checks to the css dereference and use that
      in the scheduler to annotate its users.
      
      The addition of rq->lock to these checks is correct because the
      cgroup_subsys::attach() method takes the rq->lock for each task it
      moves, therefore by holding that lock, we ensure the task is pinned to
      the current cgroup and the RCU dereference is valid.
      
      That leaves one genuine race in __sched_setscheduler() where we used
      task_group() without holding any of the required locks and thus raced
      with the cgroup code. Solve this by moving the check under the
      appropriate lock.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      dc61b1d6
    • P
      perf: Fix signed comparison in perf_adjust_period() · f6ab91ad
      Peter Zijlstra committed
      Frederic reported that frequency driven swevents didn't work properly
      and even caused a division-by-zero error.
      
      It turns out there are two bugs. The division-by-zero comes from
      perf_calculate_period() failing to handle that case.
      
      The other was more interesting and turned out to be a wrong comparison
      in perf_adjust_period(). The comparison was between an s64 and u64 and
      got implicitly converted to an unsigned comparison. The problem is
      that period_left is typically < 0, so it ended up being always true.
      
      Cure this by making the local period variables s64.
      Reported-by: Frederic Weisbecker <fweisbec@gmail.com>
      Tested-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: <stable@kernel.org>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      f6ab91ad
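
The signed-versus-unsigned comparison pitfall described in this commit can be reproduced in a few lines of standard C. This is a sketch, not the code from perf_adjust_period(): the function names and the `limit` parameter are hypothetical, and the explicit cast in the first function only makes visible the conversion that C performs implicitly when an s64 is compared with a u64.

```c
#include <assert.h>
#include <stdint.h>

typedef int64_t s64;
typedef uint64_t u64;

/*
 * Mixed s64/u64 comparison. C converts the signed operand to unsigned, so a
 * negative period_left becomes a huge positive value. The cast spells out
 * that implicit conversion.
 */
int exceeds_mixed_sketch(s64 period_left, u64 limit)
{
    return (u64)period_left > limit;
}

/* Both operands signed: a negative period_left stays negative. */
int exceeds_signed_sketch(s64 period_left, s64 limit)
{
    assert(limit >= 0);
    return period_left > limit;
}
```

For a negative `period_left`, the mixed version reports that the limit is exceeded while the signed version does not, which matches the "always true" behaviour the commit message describes and the cure of making the local period variables s64.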
  14. 05 Jun, 2010 2 commits
    • R
      module: fix bne2 "gave up waiting for init of module libcrc32c" · 9bea7f23
      Rusty Russell committed
      Problem: it's hard to avoid an init routine stumbling over a
      request_module these days.  And it's not clear it's always a bad idea:
      for example, a module like kvm with dynamic dependencies on kvm-intel
      or kvm-amd would be neater if it could simply request_module the right
      one.
      
      In this particular case, it's libcrc32c:
      
      	libcrc32c_mod_init
      	 crypto_alloc_shash
      	  crypto_alloc_tfm
      	   crypto_find_alg
      	    crypto_alg_mod_lookup
      	     crypto_larval_lookup
      	      request_module
      
      If another module is waiting inside resolve_symbol() for libcrc32c to
      finish initializing (ie. bne2 depends on libcrc32c) then it does so
      holding the module lock, and our request_module() can't make progress
      until that is released.
      
      Waiting inside resolve_symbol() without the lock isn't all that hard:
      we just need to pass the -EBUSY up the call chain so we can sleep
      where we don't hold the lock.  Error reporting is a bit trickier: we
      need to copy the name of the unfinished module before releasing the
      lock.
      
      Other notes:
      1) This also fixes a theoretical issue where a weak dependency would allow
         symbol version mismatches to be ignored.
      2) We rename use_module to ref_module to make life easier for the only
         external user (the out-of-tree ksplice patches).
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Tim Abbot <tabbott@ksplice.com>
      Tested-by: Brandon Philips <bphilips@suse.de>
      9bea7f23
    • R
      module: verify_export_symbols under the lock · be593f4c
      Rusty Russell committed
      It disabled preempt so it was "safe", but nothing stops another module
      from slipping in before this module is added to the global list, now
      that we don't hold the lock the whole time.
      
      So we check this just after we check for duplicate modules, and just
      before we put the module in the global list.
      
      (find_symbol finds symbols in coming and going modules, too).
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      be593f4c