1. 24 Feb, 2013 3 commits
    •
      cputime: Use local_clock() for full dynticks cputime accounting · 7f6575f1
      Committed by Frederic Weisbecker
      Running the full dynticks cputime accounting with preemptible
      kernel debugging triggers the following warning:
      
      	[    4.488303] BUG: using smp_processor_id() in preemptible [00000000] code: init/1
      	[    4.490971] caller is native_sched_clock+0x22/0x80
      	[    4.493663] Pid: 1, comm: init Not tainted 3.8.0+ #13
      	[    4.496376] Call Trace:
      	[    4.498996]  [<ffffffff813410eb>] debug_smp_processor_id+0xdb/0xf0
      	[    4.501716]  [<ffffffff8101e642>] native_sched_clock+0x22/0x80
      	[    4.504434]  [<ffffffff8101db99>] sched_clock+0x9/0x10
      	[    4.507185]  [<ffffffff81096ccd>] fetch_task_cputime+0xad/0x120
      	[    4.509916]  [<ffffffff81096dd5>] task_cputime+0x35/0x60
      	[    4.512622]  [<ffffffff810f146e>] acct_update_integrals+0x1e/0x40
      	[    4.515372]  [<ffffffff8117d2cf>] do_execve_common+0x4ff/0x5c0
      	[    4.518117]  [<ffffffff8117cf14>] ? do_execve_common+0x144/0x5c0
      	[    4.520844]  [<ffffffff81867a10>] ? rest_init+0x160/0x160
      	[    4.523554]  [<ffffffff8117d457>] do_execve+0x37/0x40
      	[    4.526276]  [<ffffffff810021a3>] run_init_process+0x23/0x30
      	[    4.528953]  [<ffffffff81867aac>] kernel_init+0x9c/0xf0
      	[    4.531608]  [<ffffffff8188356c>] ret_from_fork+0x7c/0xb0
      
      We use sched_clock() to perform and fix up the cputime
      accounting. However, we are calling it with preemption enabled
      from the read side, which triggers the bug above.
      
      To fix this, use local_clock() instead. It takes care of
      preemption and also provides a more reliable clock source. This
      is welcome for this kind of statistic, which is widely relied on
      in userspace.
      Reported-by: Thomas Gleixner <tglx@linutronix.de>
      Reported-by: Ingo Molnar <mingo@kernel.org>
      Suggested-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Kevin Hilman <khilman@linaro.org>
      Link: http://lkml.kernel.org/r/1361636925-22288-3-git-send-email-fweisbec@gmail.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    •
      page-writeback.c: subtract min_free_kbytes from dirtyable memory · 75f7ad8e
      Committed by Paul Szabo
      When calculating the amount of dirtyable memory, min_free_kbytes should be
      subtracted because it is not intended for dirty pages.
      
      Addresses http://bugs.debian.org/695182
      
      [akpm@linux-foundation.org: fix up min_free_kbytes extern declarations]
      [akpm@linux-foundation.org: fix min() warning]
      Signed-off-by: Paul Szabo <psz@maths.usyd.edu.au>
      Acked-by: Rik van Riel <riel@redhat.com>
      Cc: Wu Fengguang <fengguang.wu@intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
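The adjustment described in the page-writeback commit above can be illustrated with a small userspace sketch. This is not the kernel's code: the function names `kbytes_to_pages` and `dirtyable_pages`, their parameters, and the clamp to zero are assumptions made for illustration only.

```c
#include <assert.h>

/* Hypothetical helper: convert a size in kilobytes to whole pages. */
static unsigned long kbytes_to_pages(unsigned long kbytes,
				     unsigned long page_size_bytes)
{
	/* Assumes the page size is a whole number of kilobytes. */
	assert(page_size_bytes >= 1024 && page_size_bytes % 1024 == 0);
	return kbytes / (page_size_bytes / 1024);
}

/*
 * Illustrative model: memory that may hold dirty pages is the free plus
 * reclaimable pages, minus the reserve implied by min_free_kbytes.
 */
static unsigned long dirtyable_pages(unsigned long free_pages,
				     unsigned long reclaimable_pages,
				     unsigned long min_free_kbytes,
				     unsigned long page_size_bytes)
{
	unsigned long total = free_pages + reclaimable_pages;
	unsigned long reserve = kbytes_to_pages(min_free_kbytes, page_size_bytes);

	/* Subtract at most `total` so the unsigned result cannot wrap. */
	return total - (reserve < total ? reserve : total);
}
```

With 4096-byte pages, a 400 kB reserve corresponds to 100 pages, so 1000 free and 500 reclaimable pages would leave 1400 dirtyable pages under this model.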
    •
      sched: do not use cpu_to_node() to find an offlined cpu's node. · aa00d89c
      Committed by Tang Chen
      If a cpu is offline, its nid will be set to -1, and cpu_to_node(cpu)
      will return -1.  As a result, cpumask_of_node(nid) will return NULL.  In
      this case, find_next_bit() in for_each_cpu will get a NULL pointer and
      cause a panic.
      
      Here is a call trace:
        Call Trace:
         <IRQ>
          select_fallback_rq+0x71/0x190
          try_to_wake_up+0x2cb/0x2f0
          wake_up_process+0x15/0x20
          hrtimer_wakeup+0x22/0x30
          __run_hrtimer+0x83/0x320
          hrtimer_interrupt+0x106/0x280
          smp_apic_timer_interrupt+0x69/0x99
          apic_timer_interrupt+0x6f/0x80
      
      There is a hrtimer process sleeping, whose cpu has already been
      offlined.  When it is woken up, it tries to find another cpu to run on, and
      gets a nid of -1.  As a result, cpumask_of_node(-1) returns NULL and causes
      a kernel panic.
      
      This patch fixes the problem by checking whether the nid is -1.  If nid is
      not -1, a cpu on the same node will be picked.  Otherwise, an online cpu on
      another node will be picked.
      Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
      Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
      Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Jiang Liu <liuj97@gmail.com>
      Cc: Minchan Kim <minchan.kim@gmail.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
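The check described in the sched commit above (skip the node lookup when the nid is -1) can be modelled in userspace roughly as follows. This is a sketch under stated assumptions, not the kernel's select_fallback_rq(): the bitmask type, `first_cpu`, `pick_fallback_cpu`, and the two-node example topology are all hypothetical.

```c
#include <assert.h>

#define NR_CPUS 8
#define NO_NODE (-1)

/* One bit per CPU; bit i set means CPU i is in the mask (illustrative). */
typedef unsigned int cpumask_t;

/* Hypothetical example topology: CPUs 0-3 on node 0, CPUs 4-7 on node 1. */
static const cpumask_t example_nodes[2] = { 0x0Fu, 0xF0u };

/* Return the lowest-numbered CPU in the mask, or -1 if the mask is empty. */
static int first_cpu(cpumask_t mask)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (mask & (1u << cpu))
			return cpu;
	return -1;
}

/*
 * Prefer an online CPU on the task's old node. The node mask is only
 * consulted when the nid is valid, which avoids indexing with -1.
 */
static int pick_fallback_cpu(int nid, const cpumask_t *node_masks,
			     int nr_nodes, cpumask_t online)
{
	if (nid != NO_NODE) {
		assert(nid >= 0 && nid < nr_nodes);
		int cpu = first_cpu(node_masks[nid] & online);
		if (cpu >= 0)
			return cpu;
	}
	/* No usable node information: fall back to any online CPU. */
	return first_cpu(online);
}
```

The point of the guard is that the node mask is never looked up for an invalid node id; the fallback path works from the online mask alone.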
  2. 22 Feb, 2013 9 commits
  3. 20 Feb, 2013 2 commits
  4. 19 Feb, 2013 12 commits
  5. 15 Feb, 2013 2 commits
  6. 14 Feb, 2013 8 commits
  7. 13 Feb, 2013 3 commits
    •
      tracing/syscalls: Allow archs to ignore tracing compat syscalls · f431b634
      Committed by Steven Rostedt
      The tracing of ia32 compat system calls has been a bit of a pain as they
      use different system call numbers than the 64bit equivalents.
      
      I wrote a simple 'lls' program that lists files. I compiled it as an i686
      ELF binary and ran it on an x86_64 box. This is the result:
      
      echo 0 > /debug/tracing/tracing_on
      echo 1 > /debug/tracing/events/syscalls/enable
      echo 1 > /debug/tracing/tracing_on ; ./lls ; echo 0 > /debug/tracing/tracing_on
      
      grep lls /debug/tracing/trace
      
      [.. skipping calls before TS_COMPAT is set ...]
      
                   lls-1127  [005] d...   936.409188: sys_recvfrom(fd: 0, ubuf: 4d560fc4, size: 0, flags: 8048034, addr: 8, addr_len: f7700420)
                   lls-1127  [005] d...   936.409190: sys_recvfrom -> 0x8a77000
                   lls-1127  [005] d...   936.409211: sys_lgetxattr(pathname: 0, name: 1000, value: 3, size: 22)
                   lls-1127  [005] d...   936.409215: sys_lgetxattr -> 0xf76ff000
                   lls-1127  [005] d...   936.409223: sys_dup2(oldfd: 4d55ae9b, newfd: 4)
                   lls-1127  [005] d...   936.409228: sys_dup2 -> 0xfffffffffffffffe
                   lls-1127  [005] d...   936.409236: sys_newfstat(fd: 4d55b085, statbuf: 80000)
                   lls-1127  [005] d...   936.409242: sys_newfstat -> 0x3
                   lls-1127  [005] d...   936.409243: sys_removexattr(pathname: 3, name: ffcd0060)
                   lls-1127  [005] d...   936.409244: sys_removexattr -> 0x0
                   lls-1127  [005] d...   936.409245: sys_lgetxattr(pathname: 0, name: 19614, value: 1, size: 2)
                   lls-1127  [005] d...   936.409248: sys_lgetxattr -> 0xf76e5000
                   lls-1127  [005] d...   936.409248: sys_newlstat(filename: 3, statbuf: 19614)
                   lls-1127  [005] d...   936.409249: sys_newlstat -> 0x0
                   lls-1127  [005] d...   936.409262: sys_newfstat(fd: f76fb588, statbuf: 80000)
                   lls-1127  [005] d...   936.409279: sys_newfstat -> 0x3
                   lls-1127  [005] d...   936.409279: sys_close(fd: 3)
                   lls-1127  [005] d...   936.421550: sys_close -> 0x200
                   lls-1127  [005] d...   936.421558: sys_removexattr(pathname: 3, name: ffcd00d0)
                   lls-1127  [005] d...   936.421560: sys_removexattr -> 0x0
                   lls-1127  [005] d...   936.421569: sys_lgetxattr(pathname: 4d564000, name: 1b1abc, value: 5, size: 802)
                   lls-1127  [005] d...   936.421574: sys_lgetxattr -> 0x4d564000
                   lls-1127  [005] d...   936.421575: sys_capget(header: 4d70f000, dataptr: 1000)
                   lls-1127  [005] d...   936.421580: sys_capget -> 0x0
                   lls-1127  [005] d...   936.421580: sys_lgetxattr(pathname: 4d710000, name: 3000, value: 3, size: 812)
                   lls-1127  [005] d...   936.421589: sys_lgetxattr -> 0x4d710000
                   lls-1127  [005] d...   936.426130: sys_lgetxattr(pathname: 4d713000, name: 2abc, value: 3, size: 32)
                   lls-1127  [005] d...   936.426141: sys_lgetxattr -> 0x4d713000
                   lls-1127  [005] d...   936.426145: sys_newlstat(filename: 3, statbuf: f76ff3f0)
                   lls-1127  [005] d...   936.426146: sys_newlstat -> 0x0
                   lls-1127  [005] d...   936.431748: sys_lgetxattr(pathname: 0, name: 1000, value: 3, size: 22)
      
      Obviously I'm not calling newfstat with a fd of 4d55b085. The calls are
      obviously incorrect, and confusing.
      
      Other efforts have been made to fix this:
      
      https://lkml.org/lkml/2012/3/26/367
      
      But the real solution is to rewrite the syscall internals and come up
      with a proper fix, one that doesn't require all the kludges that the
      current solution has.
      
      Thus for now, instead of outputting incorrect data, simply ignore them.
      With this patch, the output becomes:
      
       #> grep lls /debug/tracing/trace
       #>
      
      Compat system calls simply are not traced. If users need compat
      syscalls, then they should just use the raw syscall tracepoints.
      
      For an architecture to have its compat syscalls ignored, it must
      define ARCH_TRACE_IGNORE_COMPAT_SYSCALLS (done in asm/ftrace.h) and also
      define an arch_trace_is_compat_syscall() function that returns true
      if tracing of the current task's syscall should be skipped.
      
      I want to stress that this change does not affect actual syscalls in any
      way, shape or form. It is only used within the tracing system and
      doesn't interfere with the syscall logic at all. The changes are
      consolidated nicely into trace_syscalls.c and asm/ftrace.h.
      
      I had to make one small modification to asm/thread_info.h and that was
      to remove the include of asm/ftrace.h. As asm/ftrace.h required the
      current_thread_info() it was causing include hell. That include was
      added back in 2008 when the function graph tracer was added:
      
       commit caf4b323 "tracing, x86: add low level support for ftrace return tracing"
      
      It does not need to be included there.
      
      Link: http://lkml.kernel.org/r/1360703939.21867.99.camel@gandalf.local.home
      Acked-by: H. Peter Anvin <hpa@zytor.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
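The opt-in mechanism described in the tracing commit above might look roughly like the following userspace model. Only ARCH_TRACE_IGNORE_COMPAT_SYSCALLS and arch_trace_is_compat_syscall() are named in the commit message; `struct fake_regs`, `traced_syscall_nr`, and the two sample calls are hypothetical, and the real function takes kernel register state rather than this simplified struct.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for the register state of a task entering a syscall. */
struct fake_regs {
	int  syscall_nr;  /* number as seen by the syscall entry path */
	bool compat;      /* true when the task issued a compat (32-bit) syscall */
};

/* An architecture opts in by defining this macro (name from the commit). */
#define ARCH_TRACE_IGNORE_COMPAT_SYSCALLS

#ifdef ARCH_TRACE_IGNORE_COMPAT_SYSCALLS
/* Architecture hook: report whether this syscall should be ignored. */
static bool arch_trace_is_compat_syscall(const struct fake_regs *regs)
{
	return regs->compat;
}
#else
/* Architectures that do not opt in never ignore a syscall. */
static bool arch_trace_is_compat_syscall(const struct fake_regs *regs)
{
	(void)regs;
	return false;
}
#endif

/* Tracer-side lookup: -1 means "do not trace this syscall". */
static int traced_syscall_nr(const struct fake_regs *regs)
{
	if (arch_trace_is_compat_syscall(regs))
		return -1;
	return regs->syscall_nr;
}

/* Hypothetical sample calls: the same number, issued natively and via compat. */
static const struct fake_regs native_call = { .syscall_nr = 3, .compat = false };
static const struct fake_regs compat_call = { .syscall_nr = 3, .compat = true };
```

In this model the syscall itself is untouched; only the tracer's view of the syscall number changes, which matches the commit's statement that the change stays within the tracing code.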
    •
      kernel/pid.c: reenable interrupts when alloc_pid() fails because init has exited · 6e666884
      Committed by Eric W. Biederman
      We're forgetting to reenable local interrupts on an error path.
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      Reported-by: Josh Boyer <jwboyer@redhat.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    •
      clockevents: Fix generic broadcast for FEAT_C3STOP · 5d1d9a29
      Committed by Mark Rutland
      Commit 12ad1000: "clockevents: Add generic timer broadcast function"
      made tick_device_uses_broadcast set up the generic broadcast function
      for dummy devices (where !tick_device_is_functional(dev)), but neglected
      to set up the broadcast function for devices that stop in low power
      states (with the CLOCK_EVT_FEAT_C3STOP flag).
      
      When these devices enter low power states they will not have the generic
      broadcast function assigned, and will bring down the system when an
      attempt is made to broadcast to them.
      
      This patch ensures that the broadcast function is also assigned for
      devices which require broadcast in low power states.
      Reported-by: Stephen Warren <swarren@nvidia.com>
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Tested-by: Stephen Warren <swarren@nvidia.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: nico@linaro.org
      Cc: Marc.Zyngier@arm.com
      Cc: Will.Deacon@arm.com
      Cc: santosh.shilimkar@ti.com
      Cc: john.stultz@linaro.org
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
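The condition fixed by the clockevents commit above can be sketched as a single predicate: a device needs the generic broadcast function if it is a dummy (non-functional) device or if it stops in low power states. The function name `needs_broadcast` and the flag value below are assumptions for illustration; the kernel's real flag is CLOCK_EVT_FEAT_C3STOP and its value is not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag bit; stands in for the kernel's CLOCK_EVT_FEAT_C3STOP. */
#define FEAT_C3STOP 0x1u

/*
 * Return true when the generic broadcast function should be assigned:
 * either the device is a dummy (not functional), or it stops in low
 * power states and therefore depends on broadcast while idle.
 */
static bool needs_broadcast(bool functional, unsigned int features)
{
	return !functional || (features & FEAT_C3STOP) != 0;
}
```

Before the fix, only the first half of this condition was covered, which is why devices carrying the C3STOP flag were left without a broadcast function.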
  8. 10 Feb, 2013 1 commit
    •
      suspend: enable freeze timeout configuration through sys · 957d1282
      Committed by Li Fei
      At present, the freezing timeout is 20s. This is pointless when one
      thread is frozen while holding a mutex and another thread is trying
      to take that mutex, because that freezing attempt is bound to fail.
      If no new wakeup event is registered, the system wastes up to 20s
      on such a futile freezing attempt.
      
      With this patch, the timeout can be configured to a smaller value,
      so a futile freezing attempt is aborted earlier and the next attempt
      can be triggered sooner, which saves power.
      In the normal case on a mobile phone, freezing processes takes very
      little time. On some platforms, it takes only about 20ms to freeze
      user space processes and 10ms to freeze freezable kernel threads.
      Signed-off-by: Liu Chuansheng <chuansheng.liu@intel.com>
      Signed-off-by: Li Fei <fei.li@intel.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
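The trade-off argued in the suspend commit above can be made concrete with a toy model. It does not freeze anything and is not the kernel's freezer code; `struct freeze_result`, `simulate_freeze`, and the notion of a "blocked" attempt are hypothetical simplifications of the mutex scenario in the commit message.

```c
#include <assert.h>
#include <stdbool.h>

/* Outcome of one simulated freezing attempt. */
struct freeze_result {
	bool         frozen;      /* true if all tasks froze within the timeout */
	unsigned int elapsed_ms;  /* time spent on the attempt */
};

/*
 * Toy model: an attempt succeeds after `needed_ms` unless it is blocked
 * (for example, a task waiting on a mutex held by an already frozen task),
 * in which case it runs until the timeout expires and then fails.
 */
static struct freeze_result simulate_freeze(unsigned int timeout_ms,
					    unsigned int needed_ms,
					    bool blocked)
{
	struct freeze_result result;

	if (!blocked && needed_ms <= timeout_ms) {
		result.frozen = true;
		result.elapsed_ms = needed_ms;
	} else {
		result.frozen = false;
		result.elapsed_ms = timeout_ms;
	}
	return result;
}
```

Under this model, a blocked attempt costs the full timeout regardless of how fast freezing normally is, so lowering the timeout from 20000 ms to, say, 1000 ms shortens the wasted time without affecting attempts that complete in tens of milliseconds.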