1. 07 Dec, 2007 1 commit
  2. 10 Nov, 2007 1 commit
    • sched: restore deterministic CPU accounting on powerpc · fa13a5a1
      Paul Mackerras authored
      Since powerpc started using CONFIG_GENERIC_CLOCKEVENTS, the
      deterministic CPU accounting (CONFIG_VIRT_CPU_ACCOUNTING) has been
      broken on powerpc, because we end up counting user time twice: once in
      timer_interrupt() and once in update_process_times().
      
      This fixes the problem by pulling the code in update_process_times
      that updates utime and stime into a separate function called
      account_process_tick.  If CONFIG_VIRT_CPU_ACCOUNTING is not defined,
      there is a version of account_process_tick in kernel/timer.c that
      simply accounts a whole tick to either utime or stime as before.  If
      CONFIG_VIRT_CPU_ACCOUNTING is defined, then arch code gets to
      implement account_process_tick.
      
      This also lets us simplify the s390 code a bit; it means that the s390
      timer interrupt can now call update_process_times even when
      CONFIG_VIRT_CPU_ACCOUNTING is turned on, and can just implement a
      suitable account_process_tick().
      
      account_process_tick() now takes the task_struct * as an argument.
      Tested both with and without CONFIG_VIRT_CPU_ACCOUNTING.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
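A minimal user-space sketch of the split described in the commit above, assuming a heavily simplified task structure. The names `struct task_model`, `account_tick_model`, `update_times_model` and the `user_tick` parameter are illustrative assumptions, not the kernel's definitions; only the idea of charging one whole tick to either utime or stime from a single hook comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's task_struct; only the two
 * counters mentioned in the commit message are modelled. */
struct task_model {
    unsigned long utime; /* ticks charged to user mode */
    unsigned long stime; /* ticks charged to kernel mode */
};

/* Generic fallback: charge one whole tick to either utime or stime.
 * With precise (virtual) accounting, arch code would supply its own
 * version of this hook instead. */
static void account_tick_model(struct task_model *p, bool user_tick)
{
    assert(p != NULL);
    if (user_tick)
        p->utime += 1;
    else
        p->stime += 1;
}

/* Models the common per-tick path: accounting is delegated to the
 * single hook above, so each tick is counted exactly once. */
static void update_times_model(struct task_model *p, bool user_tick)
{
    account_tick_model(p, user_tick);
    /* Other per-tick work would follow here in a real kernel. */
}
```

Routing all accounting through one hook is what lets the timer interrupt call the common tick path without the same interval being charged twice.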
  3. 06 Nov, 2007 1 commit
  4. 20 Oct, 2007 1 commit
    • pid namespaces: changes to show virtual ids to user · b488893a
      Pavel Emelyanov authored
      This is the largest patch in the set. Make all (I hope) the places where
      the pid is shown to or obtained from the user operate on the virtual pids.
      
      The idea is:
       - all in-kernel data structures must store either struct pid itself
         or the pid's global nr, obtained with pid_nr() call;
       - when seeking the task from kernel code with the stored id one
         should use find_task_by_pid() call that works with global pids;
       - when showing a pid's numerical value to the user the virtual one
         should be used; however, when one shows a task's pid outside this
         task's namespace the global one is to be used;
       - when getting the pid from userspace one needs to consider this as
         the virtual one and use appropriate task/pid-searching functions.
      
      [akpm@linux-foundation.org: build fix]
      [akpm@linux-foundation.org: nuther build fix]
      [akpm@linux-foundation.org: yet nuther build fix]
      [akpm@linux-foundation.org: remove unneeded casts]
      Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
      Signed-off-by: Alexey Dobriyan <adobriyan@openvz.org>
      Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com>
      Cc: Oleg Nesterov <oleg@tv-sign.ru>
      Cc: Paul Menage <menage@google.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  5. 19 Oct, 2007 2 commits
  6. 21 Jul, 2007 1 commit
  7. 20 Jul, 2007 1 commit
  8. 18 Jul, 2007 1 commit
  9. 17 Jul, 2007 2 commits
  10. 30 May, 2007 1 commit
  11. 15 May, 2007 1 commit
  12. 11 May, 2007 1 commit
  13. 10 May, 2007 3 commits
  14. 09 May, 2007 3 commits
    • Introduce a handy list_first_entry macro · b5e61818
      Pavel Emelianov authored
      There are many places in the kernel where a construction like
      
         foo = list_entry(head->next, struct foo_struct, list);
      
      is used.
      The code might look more descriptive and neat if using the macro
      
         list_first_entry(head, type, member) \
                   list_entry((head)->next, type, member)
      
      Here is the macro itself and examples of its usage in the generic code.
      If it turns out to be useful, I can prepare a set of patches to
      inject it into arch-specific code, drivers, networking, etc.
      Signed-off-by: Pavel Emelianov <xemul@openvz.org>
      Signed-off-by: Kirill Korotaev <dev@openvz.org>
      Cc: Randy Dunlap <randy.dunlap@oracle.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Zach Brown <zach.brown@oracle.com>
      Cc: Davide Libenzi <davidel@xmailserver.org>
      Cc: John McCutchan <ttb@tentacle.dhs.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: john stultz <johnstul@us.ibm.com>
      Cc: Ram Pai <linuxram@us.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
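The macro from the commit above can be tried outside the kernel with a small user-space sketch. The `list_head` and `list_entry` definitions below are simplified re-creations written for this example, and `list_init`, `list_add_tail_model` and `first_value` are hypothetical helpers; only the shape of `list_first_entry` is taken from the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified doubly linked list node, modelled on the kernel's. */
struct list_head {
    struct list_head *next, *prev;
};

/* Recover the containing structure from a pointer to its list member. */
#define list_entry(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* The macro from the commit message: first element after the head. */
#define list_first_entry(head, type, member) \
    list_entry((head)->next, type, member)

struct foo_struct {
    int value;
    struct list_head list;
};

/* An empty list is a head that points at itself. */
static void list_init(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

/* Append node just before head, i.e. at the tail of the list. */
static void list_add_tail_model(struct list_head *node, struct list_head *head)
{
    node->prev = head->prev;
    node->next = head;
    head->prev->next = node;
    head->prev = node;
}

/* Read the first element's value; the list must not be empty,
 * because the macro does not check for that case. */
static int first_value(struct list_head *head)
{
    assert(head->next != head);
    return list_first_entry(head, struct foo_struct, list)->value;
}
```

Note that the macro assumes a non-empty list: on an empty list `head->next` is the head itself, so the result would not point at a real `foo_struct`.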
    • Move timekeeping code to timekeeping.c · 8524070b
      john stultz authored
      Move the timekeeping code out of kernel/timer.c and into
      kernel/time/timekeeping.c.  I made no cleanups or other changes in transit.
      
      [akpm@linux-foundation.org: build fix]
      Signed-off-by: John Stultz <johnstul@us.ibm.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Add support for deferrable timers · 6e453a67
      Venki Pallipadi authored
      Introduce a new flag for timers, deferrable: such timers work normally
      when the system is busy, but will not cause the CPU to come out of idle
      (just to service this timer) when the CPU is idle.  Instead, the timer
      will be serviced when the CPU eventually wakes up for a subsequent
      non-deferrable timer.
      
      The main advantage of this is to avoid unnecessary timer interrupts when
      the CPU is idle.  If the routine currently called by a timer can wait until
      the next event without any issues, this new timer can be used to set up
      the timer event for that routine.  This, with dynticks, allows CPUs to be
      lazy, staying idle for extended periods of time by reducing
      unnecessary wakeups and thereby reducing power consumption.
      
      This patch:
      
      Builds this new timer on top of the existing timer infrastructure.  It uses
      the last bit in the 'base' pointer of the timer_list structure to store the
      deferrable timer flag.  The __next_timer_interrupt() function skips over
      these deferrable timers when the CPU looks for the next timer event for
      which it has to wake up.
      
      This is exported by a new interface init_timer_deferrable() that can be
      called in place of regular init_timer().
      
      [akpm@linux-foundation.org: Privatise a #define]
      Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Oleg Nesterov <oleg@tv-sign.ru>
      Cc: Dave Jones <davej@codemonkey.org.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
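The pointer-tagging trick described in the commit above can be sketched in user space as follows. Everything here is an assumption made for illustration: `struct base_model`, `struct timer_model`, `DEFERRABLE_FLAG` and the helper functions are hypothetical names, and `next_wakeup` is only a toy stand-in for the skipping behaviour the commit attributes to `__next_timer_interrupt()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEFERRABLE_FLAG ((uintptr_t)1) /* hypothetical name for bit 0 */

/* Stand-in for a per-CPU timer base. Its alignment is at least two
 * bytes, so bit 0 of any valid pointer to it is zero and free to use. */
struct base_model {
    int cpu;
};

struct timer_model {
    unsigned long expires;
    uintptr_t base; /* pointer bits, with the deferrable flag in bit 0 */
};

/* Store the base pointer and the flag in one word. */
static void timer_set_base(struct timer_model *t, struct base_model *b,
                           bool deferrable)
{
    assert(((uintptr_t)b & DEFERRABLE_FLAG) == 0);
    t->base = (uintptr_t)b | (deferrable ? DEFERRABLE_FLAG : 0);
}

/* Mask the flag off to recover the real pointer. */
static struct base_model *timer_get_base(const struct timer_model *t)
{
    return (struct base_model *)(t->base & ~DEFERRABLE_FLAG);
}

static bool timer_is_deferrable(const struct timer_model *t)
{
    return (t->base & DEFERRABLE_FLAG) != 0;
}

/* Earliest expiry among non-deferrable timers, or fallback if there
 * are none; deferrable timers never force a wakeup. */
static unsigned long next_wakeup(const struct timer_model *timers, size_t n,
                                 unsigned long fallback)
{
    unsigned long best = fallback;
    bool found = false;

    for (size_t i = 0; i < n; i++) {
        if (timer_is_deferrable(&timers[i]))
            continue;
        if (!found || timers[i].expires < best) {
            best = timers[i].expires;
            found = true;
        }
    }
    return best;
}
```

Keeping the flag inside an existing pointer avoids growing the timer structure, at the cost of having to mask the bit off on every access to the base.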
  15. 27 Apr, 2007 1 commit
  16. 08 Apr, 2007 1 commit
    • [PATCH] high-res timers: resume fix · 995f054f
      Ingo Molnar authored
      Soeren Sonnenburg reported that upon resume he is getting
      this backtrace:
      
       [<c0119637>] smp_apic_timer_interrupt+0x57/0x90
       [<c0142d30>] retrigger_next_event+0x0/0xb0
       [<c0104d30>] apic_timer_interrupt+0x28/0x30
       [<c0142d30>] retrigger_next_event+0x0/0xb0
       [<c0140068>] __kfifo_put+0x8/0x90
       [<c0130fe5>] on_each_cpu+0x35/0x60
       [<c0143538>] clock_was_set+0x18/0x20
       [<c0135cdc>] timekeeping_resume+0x7c/0xa0
       [<c02aabe1>] __sysdev_resume+0x11/0x80
       [<c02ab0c7>] sysdev_resume+0x47/0x80
       [<c02b0b05>] device_power_up+0x5/0x10
      
      It turns out that on resume we mistakenly re-enable interrupts too
      early.  Do the timer retrigger only on the current CPU.
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Acked-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: Soeren Sonnenburg <kernel@nn7.de>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  17. 26 Mar, 2007 1 commit
  18. 07 Mar, 2007 2 commits
  19. 05 Mar, 2007 1 commit
    • [PATCH] timer/hrtimer: take per cpu locks in sane order · e81ce1f7
      Heiko Carstens authored
      Doing something like this on a two cpu system
      
        # echo 0 > /sys/devices/system/cpu/cpu0/online
        # echo 1 > /sys/devices/system/cpu/cpu0/online
        # echo 0 > /sys/devices/system/cpu/cpu1/online
      
      will give me this:
      
        =======================================================
        [ INFO: possible circular locking dependency detected ]
        2.6.21-rc2-g562aa1d4-dirty #7
        -------------------------------------------------------
        bash/1282 is trying to acquire lock:
         (&cpu_base->lock_key){.+..}, at: [<000000000005f17e>] hrtimer_cpu_notify+0xc6/0x240
      
        but task is already holding lock:
         (&cpu_base->lock_key#2){.+..}, at: [<000000000005f174>] hrtimer_cpu_notify+0xbc/0x240
      
        which lock already depends on the new lock.
      
      This happens because we have the following code in kernel/hrtimer.c:
      
        migrate_hrtimers(int cpu)
        [...]
        old_base = &per_cpu(hrtimer_bases, cpu);
        new_base = &get_cpu_var(hrtimer_bases);
        [...]
        spin_lock(&new_base->lock);
        spin_lock(&old_base->lock);
      
      Which means the spinlocks are taken in an order which depends on which cpu
      gets shut down from which other cpu. Therefore lockdep complains that there
      might be an ABBA deadlock. Since migrate_hrtimers() gets only called on
      cpu hotplug it's safe to assume that it isn't executed concurrently on a
      
      The same problem exists in kernel/timer.c: migrate_timers().
      
      As pointed out by Christian Borntraeger one possible solution to avoid
      the locking order complaints would be to make sure that the locks are
      always taken in the same order. E.g. by taking the lock of the cpu with
      the lower number first.
      
      To achieve this we introduce two new spinlock functions double_spin_lock
      and double_spin_unlock which lock or unlock two locks in a given order.
      
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Roman Zippel <zippel@linux-m68k.org>
      Cc: John Stultz <johnstul@us.ibm.com>
      Cc: Christian Borntraeger <cborntra@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
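The lock-ordering idea from the commit above can be illustrated with a user-space sketch built on C11 `atomic_flag`. The commit names `double_spin_lock` and `double_spin_unlock` but does not show their signatures, so `double_lock_model`, `double_unlock_model`, the `l1_first` parameter and the per-CPU wrappers below are assumptions made for this example rather than the kernel's interface.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Toy spinlock built on a C11 atomic flag. */
struct spin_model {
    atomic_flag locked;
};

static void spin_lock_model(struct spin_model *l)
{
    while (atomic_flag_test_and_set_explicit(&l->locked, memory_order_acquire))
        ; /* busy-wait until the flag was previously clear */
}

static void spin_unlock_model(struct spin_model *l)
{
    atomic_flag_clear_explicit(&l->locked, memory_order_release);
}

/* Take both locks; l1_first decides which one is taken first. */
static void double_lock_model(struct spin_model *l1, struct spin_model *l2,
                              bool l1_first)
{
    assert(l1 != l2);
    if (l1_first) {
        spin_lock_model(l1);
        spin_lock_model(l2);
    } else {
        spin_lock_model(l2);
        spin_lock_model(l1);
    }
}

/* Release both locks in the reverse of the acquisition order. */
static void double_unlock_model(struct spin_model *l1, struct spin_model *l2,
                                bool l1_first)
{
    if (l1_first) {
        spin_unlock_model(l2);
        spin_unlock_model(l1);
    } else {
        spin_unlock_model(l1);
        spin_unlock_model(l2);
    }
}

/* Per-CPU usage: the lock of the lower-numbered CPU always goes first,
 * regardless of which CPU is "old" and which is "new". */
static void lock_cpu_pair(struct spin_model *locks, int cpu_a, int cpu_b)
{
    double_lock_model(&locks[cpu_a], &locks[cpu_b], cpu_a < cpu_b);
}

static void unlock_cpu_pair(struct spin_model *locks, int cpu_a, int cpu_b)
{
    double_unlock_model(&locks[cpu_a], &locks[cpu_b], cpu_a < cpu_b);
}
```

Because every caller derives the order from the CPU numbers alone, two paths that need the same pair of locks can never acquire them in opposite orders, which is the ABBA pattern lockdep warns about.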
  20. 02 Mar, 2007 2 commits
  21. 17 Feb, 2007 11 commits
  22. 13 Feb, 2007 1 commit
    • [PATCH] x86-64: get rid of ARCH_HAVE_XTIME_LOCK · 5809f9d4
      Eric Dumazet authored
      ARCH_HAVE_XTIME_LOCK is used by the x86_64 arch.  This arch needs to place a
      read-only copy of xtime_lock into the vsyscall page.  This read-only copy is
      named __xtime_lock, and xtime_lock is defined in
      arch/x86_64/kernel/vmlinux.lds.S as an alias.  So the declaration of
      xtime_lock in kernel/timer.c was guarded by the ARCH_HAVE_XTIME_LOCK define,
      defined to true on x86_64.
      
      We can get the same result with __attribute__((weak)) in the declaration.
      The linker should do the job.
      Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
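A small sketch of the weak-symbol mechanism the commit above relies on, assuming a GCC- or Clang-compatible toolchain. The names `clock_lock_model` and `read_clock_lock_model` and the value 42 are invented for this example; the real change concerns `xtime_lock` and a linker-script alias, which a single-file example cannot reproduce.

```c
#include <assert.h>

/* Weak default definition. If another object file (or, as in the
 * commit, a linker-script alias) supplies a strong definition of the
 * same symbol, the linker uses that one and silently drops this one. */
__attribute__((weak)) int clock_lock_model = 42;

/* All users refer to the symbol by name and never need to know which
 * definition the linker picked. */
int read_clock_lock_model(void)
{
    return clock_lock_model;
}
```

Built on its own, with no strong definition anywhere else, the weak default is the one that ends up in the program, which is the generic case the commit wants to cover without an `#ifdef` guard.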