1. 29 5月, 2009 3 次提交
    • M
      x86: ignore VM_LOCKED when determining if hugetlb-backed page tables can be shared or not · 32b154c0
      Mel Gorman 提交于
      Addresses http://bugzilla.kernel.org/show_bug.cgi?id=13302
      
      On x86 and x86-64, it is possible that page tables are shared beween
      shared mappings backed by hugetlbfs.  As part of this,
      page_table_shareable() checks a pair of vma->vm_flags and they must match
      if they are to be shared.  All VMA flags are taken into account, including
      VM_LOCKED.
      
      The problem is that VM_LOCKED is cleared on fork().  When a process with a
      shared memory segment forks() to exec() a helper, there will be shared
      VMAs with different flags.  The impact is that the shared segment is
      sometimes considered shareable and other times not, depending on what
      process is checking.
      
      What happens is that the segment page tables are being shared but the
      count is inaccurate depending on the ordering of events.  As the page
      tables are freed with put_page(), bad pmd's are found when some of the
      children exit.  The hugepage counters also get corrupted and the Total and
      Free count will no longer match even when all the hugepage-backed regions
      are freed.  This requires a reboot of the machine to "fix".
      
      This patch addresses the problem by comparing all flags except VM_LOCKED
      when deciding if pagetables should be shared or not for hugetlbfs-backed
      mapping.
      Signed-off-by: NMel Gorman <mel@csn.ul.ie>
      Acked-by: NHugh Dickins <hugh.dickins@tiscali.co.uk>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: <stable@kernel.org>
      Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: <starlight@binnacle.cx>
      Cc: Eric B Munson <ebmunson@us.ibm.com>
      Cc: Adam Litke <agl@us.ibm.com>
      Cc: Andy Whitcroft <apw@canonical.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      32b154c0
    • O
      flat: fix data sections alignment · c3dc5bec
      Oskar Schirmer 提交于
      The flat loader uses an architecture's flat_stack_align() to align the
      stack but assumes word-alignment is enough for the data sections.
      
      However, on the Xtensa S6000 we have registers up to 128bit width
      which can be used from userspace and therefor need userspace stack and
      data-section alignment of at least this size.
      
      This patch drops flat_stack_align() and uses the same alignment that
      is required for slab caches, ARCH_SLAB_MINALIGN, or wordsize if it's
      not defined by the architecture.
      
      It also fixes m32r which was obviously kaput, aligning an
      uninitialized stack entry instead of the stack pointer.
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: NOskar Schirmer <os@emlix.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Bryan Wu <cooloney@kernel.org>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Acked-by: NPaul Mundt <lethal@linux-sh.org>
      Cc: Greg Ungerer <gerg@uclinux.org>
      Signed-off-by: NJohannes Weiner <jw@emlix.com>
      Acked-by: NMike Frysinger <vapier.adi@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      c3dc5bec
    • Y
      perf_counter/x86: Always use NMI for performance-monitoring interrupt · c323d95f
      Yong Wang 提交于
      Always use NMI for performance-monitoring interrupt as there could be
      racy situations if we switch between irq and nmi mode frequently.
      Signed-off-by: NYong Wang <yong.y.wang@intel.com>
      LKML-Reference: <20090529052835.GA13657@ywang-moblin2.bj.intel.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      c323d95f
  2. 27 5月, 2009 13 次提交
  3. 26 5月, 2009 11 次提交
    • I
      perf_counter, x86: Make NMI lockups more robust · aaba9801
      Ingo Molnar 提交于
      We have a debug check that detects stuck NMIs and returns with
      the PMU disabled in the global ctrl MSR - but i managed to trigger
      a situation where this was not enough to deassert the NMI.
      
      So clear/reset the full PMU and keep the disable count balanced when
      exiting from here. This way the box produces a debug warning but
      stays up and is more debuggable.
      
      [ Impact: in case of PMU related bugs, recover more gracefully ]
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: John Kacur <jkacur@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      aaba9801
    • I
      perf_counter, x86: Fix APIC NMI programming · 79202ba9
      Ingo Molnar 提交于
      My Nehalem box locks up in certain situations (with an
      always-asserted NMI causing a lockup) if the PMU LVT
      entry is programmed between NMI and IRQ mode with a
      high frequency.
      
      Standardize exlusively on NMIs instead.
      
      [ Impact: fix lockup ]
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: John Kacur <jkacur@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      79202ba9
    • P
      perf_counter: powerpc: Implement interrupt throttling · 8a7b8cb9
      Paul Mackerras 提交于
      This implements interrupt throttling on powerpc.  Since we don't have
      individual count enable/disable or interrupt enable/disable controls
      per counter, this simply sets the hardware counter to 0, meaning that
      it will not interrupt again until it has counted 2^31 counts, which
      will take at least 2^30 cycles assuming a maximum of 2 counts per
      cycle.  Also, we set counter->hw.period_left to the maximum possible
      value (2^63 - 1), so we won't report overflows for this counter for
      the forseeable future.
      
      The unthrottle operation restores counter->hw.period_left and the
      hardware counter so that we will once again report a counter overflow
      after counter->hw.irq_period counts.
      
      [ Impact: new perfcounters robustness feature on PowerPC ]
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      LKML-Reference: <18971.35823.643362.446774@cargo.ozlabs.ibm.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      8a7b8cb9
    • T
      x86, relocs: ignore R_386_NONE in kernel relocation entries · 46176b4f
      Tejun Heo 提交于
      For relocatable 32bit kernels, boot/compressed/relocs.c processes
      relocation entries in the kernel image and appends it to the kernel
      image such that boot/compressed/head_32.S can relocate the kernel.
      The kernel image is one statically linked object and only uses two
      relocation types - R_386_PC32 and R_386_32, of the two only the latter
      needs massaging during kernel relocation and thus handled by relocs.
      R_386_PC32 is ignored and all other relocation types are considered
      error.
      
      When the target of a relocation resides in a discarded section,
      binutils doesn't throw away the relocation record but nullifies it by
      changing it to R_386_NONE, which unfortunately makes relocs fail.
      
      The problem was triggered by yet out-of-tree x86 stack unwind patches
      but given the binutils behavior, ignoring R_386_NONE is the right
      thing to do.
      
      The problem has been tracked down to binutils behavior by Jan Beulich.
      
      [ Impact: fix build with certain binutils by ignoring R_386_NONE ]
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Jan Beulich <JBeulich@novell.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      LKML-Reference: <4A1B8150.40702@kernel.org>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      46176b4f
    • H
      powerpc/mm: Fix broken MMU PID stealing on !SMP · 8e35961b
      Hideo Saito 提交于
      The recent rework of the MMU PID handling for non-hash CPUs has a
      subtle bug in the !SMP "optimized" variant of the PID stealing
      function.  It clears the PID in the mm context before it calls
      local_flush_tlb_mm(). However, the later will not flush anything
      if the PID in the context is clear...
      Signed-off-by: NHideo Saito <hsaito.ppc@gmail.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      8e35961b
    • I
      Revert "perf_counter, x86: speed up the scheduling fast-path" · 53b441a5
      Ingo Molnar 提交于
      This reverts commit b68f1d2e.
      
      It is causing problems (stuck/stuttering profiling) - when mixed
      NMI and non-NMI counters are used.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: John Kacur <jkacur@redhat.com>
      LKML-Reference: <20090525153931.703093461@chello.nl>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      53b441a5
    • P
      perf_counter: Generic per counter interrupt throttle · a78ac325
      Peter Zijlstra 提交于
      Introduce a generic per counter interrupt throttle.
      
      This uses the perf_counter_overflow() quick disable to throttle a specific
      counter when its going too fast when a pmu->unthrottle() method is provided
      which can undo the quick disable.
      
      Power needs to implement both the quick disable and the unthrottle method.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: John Kacur <jkacur@redhat.com>
      LKML-Reference: <20090525153931.703093461@chello.nl>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      a78ac325
    • P
      perf_counter: x86: Remove interrupt throttle · 48e22d56
      Peter Zijlstra 提交于
      remove the x86 specific interrupt throttle
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: John Kacur <jkacur@redhat.com>
      LKML-Reference: <20090525153931.616671838@chello.nl>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      48e22d56
    • P
      perf_counter: x86: Expose INV and EDGE bits · ff99be57
      Peter Zijlstra 提交于
      Expose the INV and EDGE bits of the PMU to raw configs.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: John Kacur <jkacur@redhat.com>
      LKML-Reference: <20090525153931.494709027@chello.nl>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      ff99be57
    • A
      KVM: Fix PDPTR reloading on CR4 writes · a2edf57f
      Avi Kivity 提交于
      The processor is documented to reload the PDPTRs while in PAE mode if any
      of the CR4 bits PSE, PGE, or PAE change.  Linux relies on this
      behaviour when zapping the low mappings of PAE kernels during boot.
      
      The code already handled changes to CR4.PAE; augment it to also notice changes
      to PSE and PGE.
      
      This triggered while booting an F11 PAE kernel; the futex initialization code
      runs before any CR3 reloads and writes to a NULL pointer; the futex subsystem
      ended up uninitialized, killing PI futexes and pulseaudio which uses them.
      
      Cc: stable@kernel.org
      Signed-off-by: NAvi Kivity <avi@redhat.com>
      a2edf57f
    • A
      KVM: Make paravirt tlb flush also reload the PAE PDPTRs · a8cd0244
      Avi Kivity 提交于
      The paravirt tlb flush may be used not only to flush TLBs, but also
      to reload the four page-directory-pointer-table entries, as it is used
      as a replacement for reloading CR3.  Change the code to do the entire
      CR3 reloading dance instead of simply flushing the TLB.
      
      Cc: stable@kernel.org
      Signed-off-by: NAvi Kivity <avi@redhat.com>
      a8cd0244
  4. 25 5月, 2009 1 次提交
  5. 23 5月, 2009 4 次提交
  6. 22 5月, 2009 7 次提交
    • R
      MIPS: IP32: Remove unnecessary if not even harmful volatile keywords. · d2f82c2f
      Ralf Baechle 提交于
      They are unneeded and as the issue fixed in lmo commit
      63f7ec59053e3f850ab67a9938e631bcba64c6ce shows even harmful.
      Signed-off-by: NRalf Baechle <ralf@linux-mips.org>
      d2f82c2f
    • R
      MIPS: IP32: Fix build error due to uninitialized variable. · 63c901c7
      Ralf Baechle 提交于
        CC      arch/mips/sgi-ip32/ip32-reset.o
      cc1: warnings being treated as errors
      arch/mips/sgi-ip32/ip32-reset.c: In function 'debounce':
      arch/mips/sgi-ip32/ip32-reset.c:97: error: 'reg_a' is used uninitialized in this function
      
      The issues is old but due to the volatile keyword gcc older than 4.4 did
      not warn about this obvious bug.
      Signed-off-by: NRalf Baechle <ralf@linux-mips.org>
      63c901c7
    • W
      MIPS: Fix sparse warning in incompatiable argument type of clear_user. · 63d38923
      Wu Zhangjin 提交于
      The type of the second argument of access_ok should be (void __user *).
      The unnecessary conversion of the clear_user address argument was causing
      sparse to emit warnings on the __chk_user_ptr check.
      Signed-off-by: NWu Zhangjin <wuzhangjin@gmail.com>
      Signed-off-by: NRalf Baechle <ralf@linux-mips.org>
      63d38923
    • P
      perf_counter: Dynamically allocate tasks' perf_counter_context struct · a63eaf34
      Paul Mackerras 提交于
      This replaces the struct perf_counter_context in the task_struct with
      a pointer to a dynamically allocated perf_counter_context struct.  The
      main reason for doing is this is to allow us to transfer a
      perf_counter_context from one task to another when we do lazy PMU
      switching in a later patch.
      
      This has a few side-benefits: the task_struct becomes a little smaller,
      we save some memory because only tasks that have perf_counters attached
      get a perf_counter_context allocated for them, and we can remove the
      inclusion of <linux/perf_counter.h> in sched.h, meaning that we don't
      end up recompiling nearly everything whenever perf_counter.h changes.
      
      The perf_counter_context structures are reference-counted and freed
      when the last reference is dropped.  A context can have references
      from its task and the counters on its task.  Counters can outlive the
      task so it is possible that a context will be freed well after its
      task has exited.
      
      Contexts are allocated on fork if the parent had a context, or
      otherwise the first time that a per-task counter is created on a task.
      In the latter case, we set the context pointer in the task struct
      locklessly using an atomic compare-and-exchange operation in case we
      raced with some other task in creating a context for the subject task.
      
      This also removes the task pointer from the perf_counter struct.  The
      task pointer was not used anywhere and would make it harder to move a
      context from one task to another.  Anything that needed to know which
      task a counter was attached to was already using counter->ctx->task.
      
      The __perf_counter_init_context function moves up in perf_counter.c
      so that it can be called from find_get_context, and now initializes
      the refcount, but is otherwise unchanged.
      
      We were potentially calling list_del_counter twice: once from
      __perf_counter_exit_task when the task exits and once from
      __perf_counter_remove_from_context when the counter's fd gets closed.
      This adds a check in list_del_counter so it doesn't do anything if
      the counter has already been removed from the lists.
      
      Since perf_counter_task_sched_in doesn't do anything if the task doesn't
      have a context, and leaves cpuctx->task_ctx = NULL, this adds code to
      __perf_install_in_context to set cpuctx->task_ctx if necessary, i.e. in
      the case where the current task adds the first counter to itself and
      thus creates a context for itself.
      
      This also adds similar code to __perf_counter_enable to handle a
      similar situation which can arise when the counters have been disabled
      using prctl; that also leaves cpuctx->task_ctx = NULL.
      
      [ Impact: refactor counter context management to prepare for new feature ]
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <18966.10075.781053.231153@cargo.ozlabs.ibm.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      a63eaf34
    • Z
      x86: DMI match for the Sony VGN-Z540N as it needs BIOS reboot · 88dff493
      Zhang Rui 提交于
      x86: DMI match for the Sony VGN-Z540N as it needs BIOS reboot,
      see:
      
        http://bugzilla.kernel.org/show_bug.cgi?id=12901
      
      [ Impact: fix hung reboot on certain systems ]
      Signed-off-by: NZhang Rui <rui.zhang@intel.com>
      Cc: Len Brown <lenb@kernel.org>
      LKML-Reference: <1242963350.32574.53.camel@rzhang-dt>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      88dff493
    • M
      powerpc/maple: Add a quirk to disable MSI for IPR on Bimini · 6eb0ac03
      Michael Ellerman 提交于
      Something in the HW or FW setup is busted and MSIs aren't working with
      IPR on Bimini, so until we figure out exaxtly what's up, we quirk them
      out
      Signed-off-by: NMichael Ellerman <michael@ellerman.id.au>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      6eb0ac03
    • M
      sh: ap325 camera without i2c driver fix · 37869fa2
      Magnus Damm 提交于
      This patch fixes the ap325rxa ncm03j camera code to handle
      the case where no i2c driver is present. Without this fix
      i2c_transfer() may be passed NULL as adapter which results
      in a crash.
      
      Triggered when i2c-sh_mobile.c failed to probe() due to
      missing MSTP clocks.
      Signed-off-by: NMagnus Damm <damm@igel.co.jp>
      Signed-off-by: NPaul Mundt <lethal@linux-sh.org>
      37869fa2
  7. 21 5月, 2009 1 次提交
    • I
      perf_counter: Fix context removal deadlock · 34adc806
      Ingo Molnar 提交于
      Disable the PMU globally before removing a counter from a
      context. This fixes the following lockup:
      
      [22081.741922] ------------[ cut here ]------------
      [22081.746668] WARNING: at arch/x86/kernel/cpu/perf_counter.c:803 intel_pmu_handle_irq+0x9b/0x24e()
      [22081.755624] Hardware name: X8DTN
      [22081.758903] perfcounters: irq loop stuck!
      [22081.762985] Modules linked in:
      [22081.766136] Pid: 11082, comm: perf Not tainted 2.6.30-rc6-tip #226
      [22081.772432] Call Trace:
      [22081.774940]  <NMI>  [<ffffffff81019aed>] ? intel_pmu_handle_irq+0x9b/0x24e
      [22081.781993]  [<ffffffff81019aed>] ? intel_pmu_handle_irq+0x9b/0x24e
      [22081.788368]  [<ffffffff8104505c>] ? warn_slowpath_common+0x77/0xa3
      [22081.794649]  [<ffffffff810450d3>] ? warn_slowpath_fmt+0x40/0x45
      [22081.800696]  [<ffffffff81019aed>] ? intel_pmu_handle_irq+0x9b/0x24e
      [22081.807080]  [<ffffffff814d1a72>] ? perf_counter_nmi_handler+0x3f/0x4a
      [22081.813751]  [<ffffffff814d2d09>] ? notifier_call_chain+0x58/0x86
      [22081.819951]  [<ffffffff8105b250>] ? notify_die+0x2d/0x32
      [22081.825392]  [<ffffffff814d1414>] ? do_nmi+0x8e/0x242
      [22081.830538]  [<ffffffff814d0f0a>] ? nmi+0x1a/0x20
      [22081.835342]  [<ffffffff8117e102>] ? selinux_file_free_security+0x0/0x1a
      [22081.842105]  [<ffffffff81018793>] ? x86_pmu_disable_counter+0x15/0x41
      [22081.848673]  <<EOE>>  [<ffffffff81018f3d>] ? x86_pmu_disable+0x86/0x103
      [22081.855512]  [<ffffffff8108fedd>] ? __perf_counter_remove_from_context+0x0/0xfe
      [22081.862926]  [<ffffffff8108fcbc>] ? counter_sched_out+0x30/0xce
      [22081.868909]  [<ffffffff8108ff36>] ? __perf_counter_remove_from_context+0x59/0xfe
      [22081.876382]  [<ffffffff8106808a>] ? smp_call_function_single+0x6c/0xe6
      [22081.882955]  [<ffffffff81091b96>] ? perf_release+0x86/0x14c
      [22081.888600]  [<ffffffff810c4c84>] ? __fput+0xe7/0x195
      [22081.893718]  [<ffffffff810c213e>] ? filp_close+0x5b/0x62
      [22081.899107]  [<ffffffff81046a70>] ? put_files_struct+0x64/0xc2
      [22081.905031]  [<ffffffff8104841a>] ? do_exit+0x1e2/0x6ef
      [22081.910360]  [<ffffffff814d0a60>] ? _spin_lock_irqsave+0x9/0xe
      [22081.916292]  [<ffffffff8104898e>] ? do_group_exit+0x67/0x93
      [22081.921953]  [<ffffffff810489cc>] ? sys_exit_group+0x12/0x16
      [22081.927759]  [<ffffffff8100baab>] ? system_call_fastpath+0x16/0x1b
      [22081.934076] ---[ end trace 3a3936ce3e1b4505 ]---
      
      And could potentially also fix the lockup reported by Marcelo Tosatti.
      
      Also, print more debug info in case of a detected lockup.
      
      [ Impact: fix lockup ]
      Reported-by: NMarcelo Tosatti <mtosatti@redhat.com>
      Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      34adc806