1. 14 10月, 2009 2 次提交
    • F
      tracing: Move syscalls metadata handling from arch to core · c44fc770
      Frederic Weisbecker 提交于
      Most of the syscalls metadata processing is done from arch.
      But these operations are mostly generic accross archs. Especially now
      that we have a common variable name that expresses the number of
      syscalls supported by an arch: NR_syscalls, the only remaining bits
      that need to reside in arch is the syscall nr to addr translation.
      
      v2: Compare syscalls symbols only after the "sys" prefix so that we
          avoid spurious mismatches with archs that have syscalls wrappers,
          in which case syscalls symbols have "SyS" prefixed aliases.
          (Reported by: Heiko Carstens)
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Acked-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      c44fc770
    • S
      function-graph/x86: Replace unbalanced ret with jmp · 194ec341
      Steven Rostedt 提交于
      The function graph tracer replaces the return address with a hook
      to trace the exit of the function call. This hook will finish by
      returning to the real location the function should return to.
      
      But the current implementation uses a ret to jump to the real
      return location. This causes a imbalance between calls and ret.
      That is the original function does a call, the ret goes to the
      handler and then the handler does a ret without a matching call.
      
      Although the function graph tracer itself still breaks the branch
      predictor by replacing the original ret, by using a second ret and
      causing an imbalance, it breaks the predictor even more.
      
      This patch replaces the ret with a jmp to keep the calls and ret
      balanced. I tested this on one box and it showed a 1.7% increase in
      performance. Another box only showed a small 0.3% increase. But no
      box that I tested this on showed a decrease in performance by
      making this change.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      Acked-by: NMathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <20091013203425.042034383@goodmis.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      194ec341
  2. 12 10月, 2009 2 次提交
  3. 02 10月, 2009 2 次提交
  4. 01 10月, 2009 1 次提交
    • A
      x86: Provide an alternative() based cmpxchg64() · 79e1dd05
      Arjan van de Ven 提交于
      cmpxchg64() today generates, to quote Linus, "barf bag" code.
      
      cmpxchg64() is about to get used in the scheduler to fix a bug there,
      but it's a prerequisite that cmpxchg64() first be made non-sucking.
      
      This patch turns cmpxchg64() into an efficient implementation that
      uses the alternative() mechanism to just use the raw instruction on
      all modern systems.
      
      Note: the fallback is NOT smp safe, just like the current fallback
      is not SMP safe. (Interested parties with i486 based SMP systems
      are welcome to submit fix patches for that.)
      Signed-off-by: NArjan van de Ven <arjan@linux.intel.com>
      Acked-by: NLinus Torvalds <torvalds@linux-foundation.org>
      [ fixed asm constraint bug ]
      Fixed-by: NEric Dumazet <eric.dumazet@gmail.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: John Stultz <johnstul@us.ibm.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <20090930170754.0886ff2e@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      79e1dd05
  5. 30 9月, 2009 1 次提交
  6. 28 9月, 2009 1 次提交
  7. 27 9月, 2009 2 次提交
  8. 24 9月, 2009 11 次提交
    • A
      sysctl: remove "struct file *" argument of ->proc_handler · 8d65af78
      Alexey Dobriyan 提交于
      It's unused.
      
      It isn't needed -- read or write flag is already passed and sysctl
      shouldn't care about the rest.
      
      It _was_ used in two places at arch/frv for some reason.
      Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: James Morris <jmorris@namei.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      8d65af78
    • R
      x86: Remove redundant non-NUMA topology functions · b0c6fbe4
      Rusty Russell 提交于
      arch/x86/include/asm/topology.h declares inline fns cpu_to_node and
      cpumask_of_node for !NUMA, even though they are then declared as
      macros by asm-generic/topology.h, which is #included just below.
      
      The macros (which are the same) end up being used; these functions
      are just confusing.
      Noticed-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
      Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
      Cc: "Greg Kroah-Hartman" <gregkh@suse.de>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Tejun Heo <tj@kernel.org>
      LKML-Reference: <200909241748.45629.rusty@rustcorp.com.au>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      b0c6fbe4
    • J
      x86: early_printk: Protect against using the same device twice · 429a6e5e
      Jason Wessel 提交于
      If you use the kernel argument:
      
        earlyprintk=serial,ttyS0,115200
      
      This will cause a recursive hang printing the same line
      again and again:
      
       BIOS-e820: 000000003fff3000 - 0000000040000000 (ACPI data)
       BIOS-e820: 00000000e0000000 - 00000000f0000000 (reserved)
       BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
      bootconsole [earlyser0] enabled
      Linux version 2.6.31-07863-gb64ada6b (mingo@sirius) (gcc version 4.3.2 20081105 (Red Hat 4.3.2-7) (GCC) ) #16789 SMP Wed Sep 23 21:09:43 CEST 2009
      Linux version 2.6.31-07863-gb64ada6b (mingo@sirius) (gcc version 4.3.2 20081105 (Red Hat 4.3.2-7) (GCC) ) #16789 SMP Wed Sep 23 21:09:43 CEST 2009
      Linux version 2.6.31-07863-gb64ada6b (mingo@sirius) (gcc version 4.3.2 20081105 (Red Hat 4.3.2-7) (GCC) ) #16789 SMP Wed Sep 23 21:09:43 CEST 2009
      Linux version 2.6.31-07863-gb64ada6b (mingo@sirius) (gcc version 4.3.2 20081105 (Red Hat 4.3.2-7) (GCC) ) #16789 SMP Wed Sep 23 21:09:43 CEST 2009
      Linux version 2.6.31-07863-gb64ada6b (mingo@sirius) (gcc version 4.3.2 20081105 (Red Hat 4.3.2-7) (GCC) ) #16789 SMP Wed Sep 23 21:09:43 CEST 2009
      
      Instead warn the end user that they specified the device
      a second time, and ignore that second console.
      Reported-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NJason Wessel <jason.wessel@windriver.com>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Greg KH <gregkh@suse.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      LKML-Reference: <4ABAAB89.1080407@windriver.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      429a6e5e
    • R
      x86: Reduce verbosity of "PAT enabled" kernel message · e23a8b6a
      Roland Dreier 提交于
      On modern systems, the kernel prints the message
      
          x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106
      
      once for every CPU.
      
      This gets kind of ridiculous on huge systems; for example, on a
      64-thread system I was lucky enough to get:
      
          dmesg| grep 'PAT enabled' | wc
               64     704    5174
      
      There is already a BUG() if non-boot CPUs have PAT capabilities
      that don't match the boot CPU, so just print the message on the
      boot CPU. (I kept the print after the wrmsrl() that enables PAT,
      so that the log output continues to mean that the system survived
      enabling PAT on the boot CPU)
      Signed-off-by: NRoland Dreier <rolandd@cisco.com>
      Cc: Suresh Siddha <suresh.b.siddha@intel.com>
      Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
      LKML-Reference: <adavdj92sso.fsf@cisco.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      e23a8b6a
    • R
      x86: Reduce verbosity of "TSC is reliable" message · ea01c0d7
      Roland Dreier 提交于
      On modern systems, the kernel prints the message
      
          Skipping synchronization checks as TSC is reliable.
      
      once for every non-boot CPU.
      
      This gets kind of ridiculous on huge systems; for example, on a
      64-thread system I was lucky enough to get:
      
          $ dmesg | grep 'TSC is reliable' | wc
               63     567    4221
      
      There's no point to doing this for every CPU, since the code is
      just checking the boot CPU anyway, so change this to a
      printk_once() to make the message appears only once.
      Signed-off-by: NRoland Dreier <rolandd@cisco.com>
      LKML-Reference: <adazl8l2swc.fsf@cisco.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      ea01c0d7
    • A
      headers: utsname.h redux · 2bcd57ab
      Alexey Dobriyan 提交于
      * remove asm/atomic.h inclusion from linux/utsname.h --
         not needed after kref conversion
       * remove linux/utsname.h inclusion from files which do not need it
      
      NOTE: it looks like fs/binfmt_elf.c do not need utsname.h, however
      due to some personality stuff it _is_ needed -- cowardly leave ELF-related
      headers and files alone.
      Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      2bcd57ab
    • R
      cpumask: use mm_cpumask() wrapper: x86 · 78f1c4d6
      Rusty Russell 提交于
      Makes code futureproof against the impending change to mm->cpu_vm_mask (to be a pointer).
      
      It's also a chance to use the new cpumask_ ops which take a pointer
      (the older ones are deprecated, but there's no hurry for arch code).
      Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
      78f1c4d6
    • R
      cpumask: remove arch_send_call_function_ipi · 0748bd01
      Rusty Russell 提交于
      Now everyone is converted to arch_send_call_function_ipi_mask, remove
      the shim and the #defines.
      Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
      0748bd01
    • R
      cpumask: remove last assignment to mask field of struct irqaction. · 1d1afc19
      Rusty Russell 提交于
      This snuck in after the patch which removed all the others.
      Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
      Cc: Ingo Molnar <mingo@elte.hu>
      1d1afc19
    • L
      cpumask: use zalloc_cpumask_var() where possible · 79f55997
      Li Zefan 提交于
      Remove open-coded zalloc_cpumask_var() and zalloc_cpumask_var_node().
      Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com>
      Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
      79f55997
    • I
      x86: mce: Use safer ways to access MCE registers · 11868a2d
      Ingo Molnar 提交于
      Use rdmsrl_safe() when accessing MCE registers. While in
      theory we always 'know' which ones are safe to access from
      the capability bits, there's a lot of hardware variations
      and reality might differ from theory, as it did in this case:
      
         http://bugzilla.kernel.org/show_bug.cgi?id=14204
      
      [    0.010016] mce: CPU supports 5 MCE banks
      [    0.011029] general protection fault: 0000 [#1]
      [    0.011998] last sysfs file:
      [    0.011998] Modules linked in:
      [    0.011998]
      [    0.011998] Pid: 0, comm: swapper Not tainted (2.6.31_router #1) HP Vectra
      [    0.011998] EIP: 0060:[<c100d9b9>] EFLAGS: 00010246 CPU: 0
      [    0.011998] EIP is at mce_rdmsrl+0x19/0x60
      [    0.011998] EAX: 00000000 EBX: 00000001 ECX: 00000407 EDX: 08000000
      [    0.011998] ESI: 00000000 EDI: 8c000000 EBP: 00000405 ESP: c17d5eac
      
      So WARN_ONCE() instead of crashing the box.
      
      ( also fix a number of stylistic inconsistencies in the code. )
      
      Note, we might still crash in wrmsrl() if we get that far, but
      we shouldnt if the registers are truly inaccessible.
      Reported-by: NGNUtoo <GNUtoo@no-log.org>
      Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
      Cc: Huang Ying <ying.huang@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      LKML-Reference: <bug-14204-5438@http.bugzilla.kernel.org/>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      11868a2d
  9. 23 9月, 2009 14 次提交
  10. 22 9月, 2009 4 次提交
    • I
      x86: mce: Fix thermal throttling message storm · b417c9fd
      Ingo Molnar 提交于
      If a system switches back and forth between hot and cold mode,
      the MCE code will print a stream of critical kernel messages.
      
      Extend the throttling code to properly notice this, by
      only printing the first hot + cold transition and omitting
      the rest up to CHECK_INTERVAL (5 minutes).
      
      This way we'll only get a single incident of:
      
       [  102.356584] CPU0: Temperature above threshold, cpu clock throttled (total events = 1)
       [  102.357000] Disabling lock debugging due to kernel taint
       [  102.369223] CPU0: Temperature/speed normal
      
      Every 5 minutes. The 'total events' count tells the number of cold/hot
      transitions detected, should overheating occur after 5 minutes again:
      
      [  402.357580] CPU0: Temperature above threshold, cpu clock throttled (total events = 24891)
      [  402.358001] CPU0: Temperature/speed normal
      [  450.704142] Machine check events logged
      
      Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
      Cc: Huang Ying <ying.huang@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      b417c9fd
    • I
      x86: mce: Clean up thermal throttling state tracking code · 39676840
      Ingo Molnar 提交于
      Instead of a mess of three separate percpu variables, consolidate
      the state into a single structure.
      
      Also clean up therm_throt_process(), use cleaner and more
      understandable variable names and a clearer logic.
      
      This, without changing the logic, makes the code more
      streamlined, more readable and smaller as well:
      
         text	   data	    bss	    dec	    hex	filename
         1487	    169	      4	   1660	    67c	therm_throt.o.before
         1432	    176	      4	   1612	    64c	therm_throt.o.after
      
      Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
      Cc: Huang Ying <ying.huang@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      39676840
    • J
      mm: don't use alloc_bootmem_low() where not strictly needed · 3c1596ef
      Jan Beulich 提交于
      Since alloc_bootmem() will never return inaccessible (via virtual
      addressing) memory anyway, using the ..._low() variant only makes sense
      when the physical address range of the allocated memory must fulfill
      further constraints, espacially since on 64-bits (or more generally in all
      cases where the pools the two variants allocate from are than the full
      available range.
      
      Probably the use in alloc_tce_table() could also be eliminated (based on
      code inspection of pci-calgary_64.c), but that seems too risky given I
      know nothing about that hardware and have no way to test it.
      Signed-off-by: NJan Beulich <jbeulich@novell.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      3c1596ef
    • J
      mm: replace various uses of num_physpages by totalram_pages · 4481374c
      Jan Beulich 提交于
      Sizing of memory allocations shouldn't depend on the number of physical
      pages found in a system, as that generally includes (perhaps a huge amount
      of) non-RAM pages.  The amount of what actually is usable as storage
      should instead be used as a basis here.
      
      Some of the calculations (i.e.  those not intending to use high memory)
      should likely even use (totalram_pages - totalhigh_pages).
      Signed-off-by: NJan Beulich <jbeulich@novell.com>
      Acked-by: NRusty Russell <rusty@rustcorp.com.au>
      Acked-by: NIngo Molnar <mingo@elte.hu>
      Cc: Dave Airlie <airlied@linux.ie>
      Cc: Kyle McMartin <kyle@mcmartin.ca>
      Cc: Jeremy Fitzhardinge <jeremy@goop.org>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Patrick McHardy <kaber@trash.net>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      4481374c