1. 20 6月, 2013 2 次提交
    • M
      x86: Fix trigger_all_cpu_backtrace() implementation · b52e0a7c
      Michel Lespinasse 提交于
      The following change fixes the x86 implementation of
      trigger_all_cpu_backtrace(), which was previously (accidentally,
      as far as I can tell) disabled to always return false as on
      architectures that do not implement this function.
      
      trigger_all_cpu_backtrace(), as defined in include/linux/nmi.h,
      should call arch_trigger_all_cpu_backtrace() if available, or
      return false if the underlying arch doesn't implement this
      function.
      
      x86 did provide a suitable arch_trigger_all_cpu_backtrace()
      implementation, but it wasn't actually being used because it was
      declared in asm/nmi.h, which linux/nmi.h doesn't include. Also,
      linux/nmi.h couldn't easily be fixed by including asm/nmi.h,
      because that file is not available on all architectures.
      
      I am proposing to fix this by moving the x86 definition of
      arch_trigger_all_cpu_backtrace() to asm/irq.h.
      
      Tested via: echo l > /proc/sysrq-trigger
      
      Before the change, this uses a fallback implementation which
      shows backtraces on active CPUs (using
      smp_call_function_interrupt() )
      
      After the change, this shows NMI backtraces on all CPUs
      Signed-off-by: NMichel Lespinasse <walken@google.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/1370518875-1346-1-git-send-email-walken@google.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      b52e0a7c
    • P
      x86: Fix section mismatch on load_ucode_ap · 94978599
      Paul Gortmaker 提交于
      We are in the process of removing all the __cpuinit annotations.
      While working on making that change, an existing problem was
      made evident:
      
        WARNING: arch/x86/kernel/built-in.o(.text+0x198f2): Section mismatch
        in reference from the function cpu_init() to the function
        .init.text:load_ucode_ap()   The function cpu_init() references
        the function __init load_ucode_ap().  This is often because cpu_init
        lacks a __init annotation or the annotation of load_ucode_ap is wrong.
      
      This now appears because in my working tree, cpu_init() is no longer
      tagged as __cpuinit, and so the audit picks up the mismatch.  The 2nd
      hypothesis from the audit is the correct one, as there was an incorrect
      __init tag on the prototype in the header (but __cpuinit was used on
      the function itself.)
      
      The audit is telling us that the prototype's __init annotation took
      effect and the function did land in the .init.text section.  Checking
      with objdump on a mainline tree that still has __cpuinit shows that
      the __cpuinit on the function takes precedence over the __init on the
      prototype, but that won't be true once we make __cpuinit a no-op.
      
      Even though we are removing __cpuinit, we temporarily align both
      the function and the prototype on __cpuinit so that the changeset
      can be applied to stable trees  if desired.
      
      [ hpa: build fix only, no object code change ]
      
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: stable <stable@vger.kernel.org> # 3.9+
      Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
      Link: http://lkml.kernel.org/r/1371654926-11729-1-git-send-email-paul.gortmaker@windriver.comSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      94978599
  2. 11 6月, 2013 1 次提交
    • M
      Modify UEFI anti-bricking code · f8b84043
      Matthew Garrett 提交于
      This patch reworks the UEFI anti-bricking code, including an effective
      reversion of cc5a080c and 31ff2f20. It turns out that calling
      QueryVariableInfo() from boot services results in some firmware
      implementations jumping to physical addresses even after entering virtual
      mode, so until we have 1:1 mappings for UEFI runtime space this isn't
      going to work so well.
      
      Reverting these gets us back to the situation where we'd refuse to create
      variables on some systems because they classify deleted variables as "used"
      until the firmware triggers a garbage collection run, which they won't do
      until they reach a lower threshold. This results in it being impossible to
      install a bootloader, which is unhelpful.
      
      Feedback from Samsung indicates that the firmware doesn't need more than
      5KB of storage space for its own purposes, so that seems like a reasonable
      threshold. However, there's still no guarantee that a platform will attempt
      garbage collection merely because it drops below this threshold. It seems
      that this is often only triggered if an attempt to write generates a
      genuine EFI_OUT_OF_RESOURCES error. We can force that by attempting to
      create a variable larger than the remaining space. This should fail, but if
      it somehow succeeds we can then immediately delete it.
      
      I've tested this on the UEFI machines I have available, but I don't have
      a Samsung and so can't verify that it avoids the bricking problem.
      Signed-off-by: NMatthew Garrett <matthew.garrett@nebula.com>
      Signed-off-by: Lee, Chun-Y <jlee@suse.com> [ dummy variable cleanup ]
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NMatt Fleming <matt.fleming@intel.com>
      f8b84043
  3. 31 5月, 2013 1 次提交
  4. 10 5月, 2013 1 次提交
  5. 07 5月, 2013 1 次提交
  6. 03 5月, 2013 3 次提交
  7. 01 5月, 2013 1 次提交
    • T
      dump_stack: unify debug information printed by show_regs() · a43cb95d
      Tejun Heo 提交于
      show_regs() is inherently arch-dependent but it does make sense to print
      generic debug information and some archs already do albeit in slightly
      different forms.  This patch introduces a generic function to print debug
      information from show_regs() so that different archs print out the same
      information and it's much easier to modify what's printed.
      
      show_regs_print_info() prints out the same debug info as dump_stack()
      does plus task and thread_info pointers.
      
      * Archs which didn't print debug info now do.
      
        alpha, arc, blackfin, c6x, cris, frv, h8300, hexagon, ia64, m32r,
        metag, microblaze, mn10300, openrisc, parisc, score, sh64, sparc,
        um, xtensa
      
      * Already prints debug info.  Replaced with show_regs_print_info().
        The printed information is superset of what used to be there.
      
        arm, arm64, avr32, mips, powerpc, sh32, tile, unicore32, x86
      
      * s390 is special in that it used to print arch-specific information
        along with generic debug info.  Heiko and Martin think that the
        arch-specific extra isn't worth keeping s390 specfic implementation.
        Converted to use the generic version.
      
      Note that now all archs print the debug info before actual register
      dumps.
      
      An example BUG() dump follows.
      
       kernel BUG at /work/os/work/kernel/workqueue.c:4841!
       invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
       Modules linked in:
       CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.9.0-rc1-work+ #7
       Hardware name: empty empty/S3992, BIOS 080011  10/26/2007
       task: ffff88007c85e040 ti: ffff88007c860000 task.ti: ffff88007c860000
       RIP: 0010:[<ffffffff8234a07e>]  [<ffffffff8234a07e>] init_workqueues+0x4/0x6
       RSP: 0000:ffff88007c861ec8  EFLAGS: 00010246
       RAX: ffff88007c861fd8 RBX: ffffffff824466a8 RCX: 0000000000000001
       RDX: 0000000000000046 RSI: 0000000000000001 RDI: ffffffff8234a07a
       RBP: ffff88007c861ec8 R08: 0000000000000000 R09: 0000000000000000
       R10: 0000000000000001 R11: 0000000000000000 R12: ffffffff8234a07a
       R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
       FS:  0000000000000000(0000) GS:ffff88007dc00000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
       CR2: ffff88015f7ff000 CR3: 00000000021f1000 CR4: 00000000000007f0
       DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
       DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
       Stack:
        ffff88007c861ef8 ffffffff81000312 ffffffff824466a8 ffff88007c85e650
        0000000000000003 0000000000000000 ffff88007c861f38 ffffffff82335e5d
        ffff88007c862080 ffffffff8223d8c0 ffff88007c862080 ffffffff81c47760
       Call Trace:
        [<ffffffff81000312>] do_one_initcall+0x122/0x170
        [<ffffffff82335e5d>] kernel_init_freeable+0x9b/0x1c8
        [<ffffffff81c47760>] ? rest_init+0x140/0x140
        [<ffffffff81c4776e>] kernel_init+0xe/0xf0
        [<ffffffff81c6be9c>] ret_from_fork+0x7c/0xb0
        [<ffffffff81c47760>] ? rest_init+0x140/0x140
        ...
      
      v2: Typo fix in x86-32.
      
      v3: CPU number dropped from show_regs_print_info() as
          dump_stack_print_info() has been updated to print it.  s390
          specific implementation dropped as requested by s390 maintainers.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NDavid S. Miller <davem@davemloft.net>
      Acked-by: NJesper Nilsson <jesper.nilsson@axis.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Fengguang Wu <fengguang.wu@intel.com>
      Cc: Mike Frysinger <vapier@gentoo.org>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Sam Ravnborg <sam@ravnborg.org>
      Acked-by: Chris Metcalf <cmetcalf@tilera.com>		[tile bits]
      Acked-by: Richard Kuo <rkuo@codeaurora.org>		[hexagon bits]
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a43cb95d
  8. 30 4月, 2013 1 次提交
    • G
      mm/hugetlb: add more arch-defined huge_pte functions · 106c992a
      Gerald Schaefer 提交于
      Commit abf09bed ("s390/mm: implement software dirty bits")
      introduced another difference in the pte layout vs.  the pmd layout on
      s390, thoroughly breaking the s390 support for hugetlbfs.  This requires
      replacing some more pte_xxx functions in mm/hugetlbfs.c with a
      huge_pte_xxx version.
      
      This patch introduces those huge_pte_xxx functions and their generic
      implementation in asm-generic/hugetlb.h, which will now be included on
      all architectures supporting hugetlbfs apart from s390.  This change
      will be a no-op for those architectures.
      
      [akpm@linux-foundation.org: fix warning]
      Signed-off-by: NGerald Schaefer <gerald.schaefer@de.ibm.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Hillf Danton <dhillf@gmail.com>
      Acked-by: Michal Hocko <mhocko@suse.cz>	[for !s390 parts]
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      106c992a
  9. 28 4月, 2013 2 次提交
  10. 27 4月, 2013 1 次提交
  11. 26 4月, 2013 1 次提交
    • I
      perf/x86/intel/P4: Robistify P4 PMU types · 5ac2b5c2
      Ingo Molnar 提交于
      Linus found, while extending integer type extension checks in the
      sparse static code checker, various fragile patterns of mixed
      signed/unsigned  64-bit/32-bit integer use in perf_events_p4.c.
      
      The relevant hardware register ABI is 64 bit wide on 32-bit
      kernels as  well, so clean it all up a bit, remove unnecessary
      casts, and make sure we  use 64-bit unsigned integers in these
      places.
      
      [ Unfortunately this patch was not tested on real P4 hardware,
        those are pretty rare already. If this patch causes any
        problems on P4 hardware then please holler ... ]
      Reported-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      Cc: David Miller <davem@davemloft.net>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/20130424072630.GB1780@gmail.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      5ac2b5c2
  12. 25 4月, 2013 6 次提交
  13. 22 4月, 2013 5 次提交
  14. 21 4月, 2013 1 次提交
  15. 20 4月, 2013 1 次提交
  16. 19 4月, 2013 2 次提交
    • W
      mutex: Back out architecture specific check for negative mutex count · cc189d25
      Waiman Long 提交于
      Linus suggested that probably all the supported architectures can
      allow a negative mutex count without incorrect behavior, so we can
      then back out the architecture specific change and allow the
      mutex count to go to any negative number. That should further
      reduce contention for non-x86 architecture.
      Suggested-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NWaiman Long <Waiman.Long@hp.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Chandramouleeswaran Aswin <aswin@hp.com>
      Cc: Davidlohr Bueso <davidlohr.bueso@hp.com>
      Cc: Norton Scott J <scott.norton@hp.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Dave Jones <davej@redhat.com>
      Cc: Clark Williams <williams@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1366226594-5506-5-git-send-email-Waiman.Long@hp.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      cc189d25
    • W
      mutex: Make more scalable by doing less atomic operations · 0dc8c730
      Waiman Long 提交于
      In the __mutex_lock_common() function, an initial entry into
      the lock slow path will cause two atomic_xchg instructions to be
      issued. Together with the atomic decrement in the fast path, a
      total of three atomic read-modify-write instructions will be
      issued in rapid succession. This can cause a lot of cache
      bouncing when many tasks are trying to acquire the mutex at the
      same time.
      
      This patch will reduce the number of atomic_xchg instructions
      used by checking the counter value first before issuing the
      instruction. The atomic_read() function is just a simple memory
      read. The atomic_xchg() function, on the other hand, can be up
      to 2 order of magnitude or even more in cost when compared with
      atomic_read(). By using atomic_read() to check the value first
      before calling atomic_xchg(), we can avoid a lot of unnecessary
      cache coherency traffic. The only downside with this change is
      that a task on the slow path will have a tiny bit less chance of
      getting the mutex when competing with another task in the fast
      path.
      
      The same is true for the atomic_cmpxchg() function in the
      mutex-spin-on-owner loop. So an atomic_read() is also performed
      before calling atomic_cmpxchg().
      
      The mutex locking and unlocking code for the x86 architecture
      can allow any negative number to be used in the mutex count to
      indicate that some tasks are waiting for the mutex. I am not so
      sure if that is the case for the other architectures. So the
      default is to avoid atomic_xchg() if the count has already been
      set to -1. For x86, the check is modified to include all
      negative numbers to cover a larger case.
      
      The following table shows the jobs per minutes (JPM) scalability
      data on an 8-node 80-core Westmere box with a 3.7.10 kernel. The
      numactl command is used to restrict the running of the
      high_systime workloads to 1/2/4/8 nodes with hyperthreading on
      and off.
      
      +-----------------+-----------+------------+----------+
      |  Configuration  | Mean JPM  |  Mean JPM  | % Change |
      |		  | w/o patch | with patch |	      |
      +-----------------+-----------------------------------+
      |		  |      User Range 1100 - 2000	      |
      +-----------------+-----------------------------------+
      | 8 nodes, HT on  |    36980   |   148590  | +301.8%  |
      | 8 nodes, HT off |    42799   |   145011  | +238.8%  |
      | 4 nodes, HT on  |    61318   |   118445  |  +51.1%  |
      | 4 nodes, HT off |   158481   |   158592  |   +0.1%  |
      | 2 nodes, HT on  |   180602   |   173967  |   -3.7%  |
      | 2 nodes, HT off |   198409   |   198073  |   -0.2%  |
      | 1 node , HT on  |   149042   |   147671  |   -0.9%  |
      | 1 node , HT off |   126036   |   126533  |   +0.4%  |
      +-----------------+-----------------------------------+
      |		  |       User Range 200 - 1000	      |
      +-----------------+-----------------------------------+
      | 8 nodes, HT on  |   41525    |   122349  | +194.6%  |
      | 8 nodes, HT off |   49866    |   124032  | +148.7%  |
      | 4 nodes, HT on  |   66409    |   106984  |  +61.1%  |
      | 4 nodes, HT off |  119880    |   130508  |   +8.9%  |
      | 2 nodes, HT on  |  138003    |   133948  |   -2.9%  |
      | 2 nodes, HT off |  132792    |   131997  |   -0.6%  |
      | 1 node , HT on  |  116593    |   115859  |   -0.6%  |
      | 1 node , HT off |  104499    |   104597  |   +0.1%  |
      +-----------------+------------+-----------+----------+
      
      At low user range 10-100, the JPM differences were within +/-1%.
      So they are not that interesting.
      
      AIM7 benchmark run has a pretty large run-to-run variance due to
      random nature of the subtests executed. So a difference of less
      than +-5% may not be really significant.
      
      This patch improves high_systime workload performance at 4 nodes
      and up by maintaining transaction rates without significant
      drop-off at high node count.  The patch has practically no
      impact on 1 and 2 nodes system.
      
      The table below shows the percentage time (as reported by perf
      record -a -s -g) spent on the __mutex_lock_slowpath() function
      by the high_systime workload at 1500 users for 2/4/8-node
      configurations with hyperthreading off.
      
      +---------------+-----------------+------------------+---------+
      | Configuration | %Time w/o patch | %Time with patch | %Change |
      +---------------+-----------------+------------------+---------+
      |    8 nodes    |      65.34%     |      0.69%       |  -99%   |
      |    4 nodes    |       8.70%	  |      1.02%	     |  -88%   |
      |    2 nodes    |       0.41%     |      0.32%       |  -22%   |
      +---------------+-----------------+------------------+---------+
      
      It is obvious that the dramatic performance improvement at 8
      nodes was due to the drastic cut in the time spent within the
      __mutex_lock_slowpath() function.
      
      The table below show the improvements in other AIM7 workloads
      (at 8 nodes, hyperthreading off).
      
      +--------------+---------------+----------------+-----------------+
      |   Workload   | mean % change | mean % change  | mean % change   |
      |              | 10-100 users  | 200-1000 users | 1100-2000 users |
      +--------------+---------------+----------------+-----------------+
      | alltests     |     +0.6%     |   +104.2%      |   +185.9%       |
      | five_sec     |     +1.9%     |     +0.9%      |     +0.9%       |
      | fserver      |     +1.4%     |     -7.7%      |     +5.1%       |
      | new_fserver  |     -0.5%     |     +3.2%      |     +3.1%       |
      | shared       |    +13.1%     |   +146.1%      |   +181.5%       |
      | short        |     +7.4%     |     +5.0%      |     +4.2%       |
      +--------------+---------------+----------------+-----------------+
      Signed-off-by: NWaiman Long <Waiman.Long@hp.com>
      Reviewed-by: NDavidlohr Bueso <davidlohr.bueso@hp.com>
      Reviewed-by: NRik van Riel <riel@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Chandramouleeswaran Aswin <aswin@hp.com>
      Cc: Norton: Scott J <scott.norton@hp.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Dave Jones <davej@redhat.com>
      Cc: Clark Williams <williams@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1366226594-5506-3-git-send-email-Waiman.Long@hp.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      0dc8c730
  17. 18 4月, 2013 1 次提交
    • N
      iommu/vt-d: add quirk for broken interrupt remapping on 55XX chipsets · 03bbcb2e
      Neil Horman 提交于
      A few years back intel published a spec update:
      http://www.intel.com/content/dam/doc/specification-update/5520-and-5500-chipset-ioh-specification-update.pdf
      
      For the 5520 and 5500 chipsets which contained an errata (specificially errata
      53), which noted that these chipsets can't properly do interrupt remapping, and
      as a result the recommend that interrupt remapping be disabled in bios.  While
      many vendors have a bios update to do exactly that, not all do, and of course
      not all users update their bios to a level that corrects the problem.  As a
      result, occasionally interrupts can arrive at a cpu even after affinity for that
      interrupt has be moved, leading to lost or spurrious interrupts (usually
      characterized by the message:
      kernel: do_IRQ: 7.71 No irq handler for vector (irq -1)
      
      There have been several incidents recently of people seeing this error, and
      investigation has shown that they have system for which their BIOS level is such
      that this feature was not properly turned off.  As such, it would be good to
      give them a reminder that their systems are vulnurable to this problem.  For
      details of those that reported the problem, please see:
      https://bugzilla.redhat.com/show_bug.cgi?id=887006
      
      [ Joerg: Removed CONFIG_IRQ_REMAP ifdef from early-quirks.c ]
      Signed-off-by: NNeil Horman <nhorman@tuxdriver.com>
      CC: Prarit Bhargava <prarit@redhat.com>
      CC: Don Zickus <dzickus@redhat.com>
      CC: Don Dutile <ddutile@redhat.com>
      CC: Bjorn Helgaas <bhelgaas@google.com>
      CC: Asit Mallick <asit.k.mallick@intel.com>
      CC: David Woodhouse <dwmw2@infradead.org>
      CC: linux-pci@vger.kernel.org
      CC: Joerg Roedel <joro@8bytes.org>
      CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      CC: Arkadiusz Miśkiewicz <arekm@maven.pl>
      Signed-off-by: NJoerg Roedel <joro@8bytes.org>
      03bbcb2e
  18. 17 4月, 2013 4 次提交
  19. 16 4月, 2013 1 次提交
    • M
      efi: Pass boot services variable info to runtime code · cc5a080c
      Matthew Garrett 提交于
      EFI variables can be flagged as being accessible only within boot services.
      This makes it awkward for us to figure out how much space they use at
      runtime. In theory we could figure this out by simply comparing the results
      from QueryVariableInfo() to the space used by all of our variables, but
      that fails if the platform doesn't garbage collect on every boot. Thankfully,
      calling QueryVariableInfo() while still inside boot services gives a more
      reliable answer. This patch passes that information from the EFI boot stub
      up to the efi platform code.
      Signed-off-by: NMatthew Garrett <matthew.garrett@nebula.com>
      Signed-off-by: NMatt Fleming <matt.fleming@intel.com>
      cc5a080c
  20. 14 4月, 2013 1 次提交
  21. 13 4月, 2013 2 次提交
    • A
      uretprobes/x86: Hijack return address · 791eca10
      Anton Arapov 提交于
      Hijack the return address and replace it with a trampoline address.
      Signed-off-by: NAnton Arapov <anton@redhat.com>
      Acked-by: NSrikar Dronamraju <srikar@linux.vnet.ibm.com>
      Signed-off-by: NOleg Nesterov <oleg@redhat.com>
      791eca10
    • D
      x86-32: Fix possible incomplete TLB invalidate with PAE pagetables · 1de14c3c
      Dave Hansen 提交于
      This patch attempts to fix:
      
      	https://bugzilla.kernel.org/show_bug.cgi?id=56461
      
      The symptom is a crash and messages like this:
      
      	chrome: Corrupted page table at address 34a03000
      	*pdpt = 0000000000000000 *pde = 0000000000000000
      	Bad pagetable: 000f [#1] PREEMPT SMP
      
      Ingo guesses this got introduced by commit 611ae8e3 ("x86/tlb:
      enable tlb flush range support for x86") since that code started to free
      unused pagetables.
      
      On x86-32 PAE kernels, that new code has the potential to free an entire
      PMD page and will clear one of the four page-directory-pointer-table
      (aka pgd_t entries).
      
      The hardware aggressively "caches" these top-level entries and invlpg
      does not actually affect the CPU's copy.  If we clear one we *HAVE* to
      do a full TLB flush, otherwise we might continue using a freed pmd page.
      (note, we do this properly on the population side in pud_populate()).
      
      This patch tracks whenever we clear one of these entries in the 'struct
      mmu_gather', and ensures that we follow up with a full tlb flush.
      
      BTW, I disassembled and checked that:
      
      	if (tlb->fullmm == 0)
      and
      	if (!tlb->fullmm && !tlb->need_flush_all)
      
      generate essentially the same code, so there should be zero impact there
      to the !PAE case.
      Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
      Cc: Peter Anvin <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Artem S Tashkinov <t.artem@mailcity.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      1de14c3c
  22. 12 4月, 2013 1 次提交