1. 24 8月, 2016 10 次提交
    • B
      sched/x86: Add 'struct inactive_task_frame' to better document the sleeping task stack frame · 7b32aead
      Brian Gerst 提交于
      Add 'struct inactive_task_frame', which defines the layout of the stack for
      a sleeping process.  For now, the only defined field is the BP register
      (frame pointer).
      Signed-off-by: NBrian Gerst <brgerst@gmail.com>
      Reviewed-by: NJosh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1471106302-10159-4-git-send-email-brgerst@gmail.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      7b32aead
    • B
      sched/x86/64, kgdb: Clear GDB_PS on 64-bit · 16363019
      Brian Gerst 提交于
      switch_to() no longer saves EFLAGS, so it's bogus to look for it on the
      stack.  Set it to zero like 32-bit.
      Signed-off-by: NBrian Gerst <brgerst@gmail.com>
      Reviewed-by: NJosh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jason Wessel <jason.wessel@windriver.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1471106302-10159-3-git-send-email-brgerst@gmail.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      16363019
    • B
      sched/x86/32, kgdb: Don't use thread.ip in sleeping_thread_to_gdb_regs() · 4e047aa7
      Brian Gerst 提交于
      Match 64-bit and set gdb_regs[GDB_PC] to zero.  thread.ip is always the
      same point in the scheduler (except for newly forked processes), and will
      be removed in a future patch.
      Signed-off-by: NBrian Gerst <brgerst@gmail.com>
      Reviewed-by: NJosh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jason Wessel <jason.wessel@windriver.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1471106302-10159-2-git-send-email-brgerst@gmail.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      4e047aa7
    • J
      x86/dumpstack/ftrace: Don't print unreliable addresses in print_context_stack_bp() · 13e25bab
      Josh Poimboeuf 提交于
      When function graph tracing is enabled, print_context_stack_bp() can
      report return_to_handler() as an unreliable address, which is confusing
      and misleading: return_to_handler() is really only useful as a hint for
      debugging, whereas print_context_stack_bp() users only care about the
      actual 'reliable' call path.
      Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
      Acked-by: NSteven Rostedt <rostedt@goodmis.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Byungchul Park <byungchul.park@lge.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Nilay Vaish <nilayvaish@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/c51aef578d8027791b38d2ad9bac0c7f499fde91.1471607358.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      13e25bab
    • J
      x86/dumpstack/ftrace: Mark function graph handler function as unreliable · 6f727b84
      Josh Poimboeuf 提交于
      When function graph tracing is enabled for a function, its return
      address on the stack is replaced with the address of an ftrace handler
      (return_to_handler).
      
      Currently 'return_to_handler' can be reported as reliable.  That's not
      ideal, and can actually be misleading.  When saving or dumping the
      stack, you normally only care about what led up to that point (the call
      path), rather than what will happen in the future (the return path).
      
      That's especially true in the non-oops stack trace case, which isn't
      used for debugging.  For example, in a perf profiling operation,
      reporting return_to_handler() in the trace would just be confusing.
      
      And in the oops case, where debugging is important, "unreliable" is also
      more appropriate there because it serves as a hint that graph tracing
      was involved, instead of trying to imply that return_to_handler() was
      the real caller.
      Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
      Acked-by: NSteven Rostedt <rostedt@goodmis.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Byungchul Park <byungchul.park@lge.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Nilay Vaish <nilayvaish@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/f8af15749c7d632d3e7f815995831d5b7f82950d.1471607358.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      6f727b84
    • J
      ftrace/x86: Implement HAVE_FUNCTION_GRAPH_RET_ADDR_PTR · 471bd10f
      Josh Poimboeuf 提交于
      Use the more reliable version of ftrace_graph_ret_addr() so we no longer
      have to worry about the unwinder getting out of sync with the function
      graph ret_stack index, which can happen if the unwinder skips any frames
      before calling ftrace_graph_ret_addr().
      
      This fixes this issue (and several others like it):
      
        $ cat /proc/self/stack
        [<ffffffff810489a2>] save_stack_trace_tsk+0x22/0x40
        [<ffffffff81311a89>] proc_pid_stack+0xb9/0x110
        [<ffffffff813127c4>] proc_single_show+0x54/0x80
        [<ffffffff812be088>] seq_read+0x108/0x3e0
        [<ffffffff812923d7>] __vfs_read+0x37/0x140
        [<ffffffff812929d9>] vfs_read+0x99/0x140
        [<ffffffff81293f28>] SyS_read+0x58/0xc0
        [<ffffffff818af97c>] entry_SYSCALL_64_fastpath+0x1f/0xbd
        [<ffffffffffffffff>] 0xffffffffffffffff
      
        $ echo function_graph > /sys/kernel/debug/tracing/current_tracer
      
        $ cat /proc/self/stack
        [<ffffffff818b2428>] return_to_handler+0x0/0x27
        [<ffffffff810394cc>] print_context_stack+0xfc/0x100
        [<ffffffff818b2428>] return_to_handler+0x0/0x27
        [<ffffffff8103891b>] dump_trace+0x12b/0x350
        [<ffffffff818b2428>] return_to_handler+0x0/0x27
        [<ffffffff810489a2>] save_stack_trace_tsk+0x22/0x40
        [<ffffffff818b2428>] return_to_handler+0x0/0x27
        [<ffffffff81311a89>] proc_pid_stack+0xb9/0x110
        [<ffffffff818b2428>] return_to_handler+0x0/0x27
        [<ffffffff813127c4>] proc_single_show+0x54/0x80
        [<ffffffff818b2428>] return_to_handler+0x0/0x27
        [<ffffffff812be088>] seq_read+0x108/0x3e0
        [<ffffffff818b2428>] return_to_handler+0x0/0x27
        [<ffffffff812923d7>] __vfs_read+0x37/0x140
        [<ffffffff818b2428>] return_to_handler+0x0/0x27
        [<ffffffff812929d9>] vfs_read+0x99/0x140
        [<ffffffffffffffff>] 0xffffffffffffffff
      
      Enabling function graph tracing causes the stack trace to change in two
      ways:
      
      First, the real call addresses are confusingly interspersed with
      'return_to_handler' addresses.  This issue will be fixed by the next
      patch.
      
      Second, the stack trace is offset by two frames, because the unwinder
      skipped the first two frames and got out of sync with the ret_stack
      index.  This patch fixes this issue.
      Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
      Acked-by: NSteven Rostedt <rostedt@goodmis.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Byungchul Park <byungchul.park@lge.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Nilay Vaish <nilayvaish@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/a6d623e36f8d08f9a17bd74d804d201177a23afd.1471607358.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      471bd10f
    • J
      x86/dumpstack/ftrace: Convert dump_trace() callbacks to use ftrace_graph_ret_addr() · 408fe5de
      Josh Poimboeuf 提交于
      Convert print_context_stack() and print_context_stack_bp() to use the
      arch-independent ftrace_graph_ret_addr() helper.
      Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
      Acked-by: NSteven Rostedt <rostedt@goodmis.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Byungchul Park <byungchul.park@lge.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Nilay Vaish <nilayvaish@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/56ec97cafc1bf2e34d1119e6443d897db406da86.1471607358.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      408fe5de
    • J
      ftrace: Add return address pointer to ftrace_ret_stack · 9a7c348b
      Josh Poimboeuf 提交于
      Storing this value will help prevent unwinders from getting out of sync
      with the function graph tracer ret_stack.  Now instead of needing a
      stateful iterator, they can compare the return address pointer to find
      the right ret_stack entry.
      
      Note that an array of 50 ftrace_ret_stack structs is allocated for every
      task.  So when an arch implements this, it will add either 200 or 400
      bytes of memory usage per task (depending on whether it's a 32-bit or
      64-bit platform).
      Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
      Acked-by: NSteven Rostedt <rostedt@goodmis.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Byungchul Park <byungchul.park@lge.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Nilay Vaish <nilayvaish@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/a95cfcc39e8f26b89a430c56926af0bb217bc0a1.1471607358.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      9a7c348b
    • A
      x86/mm/64: Enable vmapped stacks (CONFIG_HAVE_ARCH_VMAP_STACK=y) · e37e43a4
      Andy Lutomirski 提交于
      This allows x86_64 kernels to enable vmapped stacks by setting
      HAVE_ARCH_VMAP_STACK=y - which enables the CONFIG_VMAP_STACK=y
      high level Kconfig option.
      
      There are a couple of interesting bits:
      
      First, x86 lazily faults in top-level paging entries for the vmalloc
      area.  This won't work if we get a page fault while trying to access
      the stack: the CPU will promote it to a double-fault and we'll die.
      To avoid this problem, probe the new stack when switching stacks and
      forcibly populate the pgd entry for the stack when switching mms.
      
      Second, once we have guard pages around the stack, we'll want to
      detect and handle stack overflow.
      
      I didn't enable it on x86_32.  We'd need to rework the double-fault
      code a bit and I'm concerned about running out of vmalloc virtual
      addresses under some workloads.
      
      This patch, by itself, will behave somewhat erratically when the
      stack overflows while RSP is still more than a few tens of bytes
      above the bottom of the stack.  Specifically, we'll get #PF and make
      it to no_context and them oops without reliably triggering a
      double-fault, and no_context doesn't know about stack overflows.
      The next patch will improve that case.
      
      Thank you to Nadav and Brian for helping me pay enough attention to
      the SDM to hopefully get this right.
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Nadav Amit <nadav.amit@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/c88f3e2920b18e6cc621d772a04a62c06869037e.1470907718.git.luto@kernel.org
      [ Minor edits. ]
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      e37e43a4
    • B
      x86/entry: Remove outdated comment about SYSCALL targets · 556b6723
      Borislav Petkov 提交于
      The comment probably meant some old AMD64 incarnation which most likely
      never saw the light of day. STAR and LSTAR are two different registers
      and STAR sets CS/SS(DS) selectors for *all* modes, not only 32-bit.
      
      So simply remove that comment.
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20160823172356.15879-1-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      556b6723
  2. 19 8月, 2016 6 次提交
    • J
      x86/dumpstack: Remove 64-byte gap at end of irq stack · 4950d6d4
      Josh Poimboeuf 提交于
      There has been a 64-byte gap at the end of the irq stack for at least 12
      years.  It predates git history, and I can't find any good reason for
      it.  Remove it.  What's the worst that could happen?
      Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Byungchul Park <byungchul.park@lge.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Nilay Vaish <nilayvaish@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/14f9281c5475cc44af95945ea7546bff2e3836db.1471535549.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      4950d6d4
    • J
      x86/dumpstack: Fix x86_32 kernel_stack_pointer() previous stack access · 72b4f6a5
      Josh Poimboeuf 提交于
      On x86_32, when an interrupt happens from kernel space, SS and SP aren't
      pushed and the existing stack is used.  So pt_regs is effectively two
      words shorter, and the previous stack pointer is normally the memory
      after the shortened pt_regs, aka '&regs->sp'.
      
      But in the rare case where the interrupt hits right after the stack
      pointer has been changed to point to an empty stack, like for example
      when call_on_stack() is used, the address immediately after the
      shortened pt_regs is no longer on the stack.  In that case, instead of
      '&regs->sp', the previous stack pointer should be retrieved from the
      beginning of the current stack page.
      
      kernel_stack_pointer() wants to do that, but it forgets to dereference
      the pointer.  So instead of returning a pointer to the previous stack,
      it returns a pointer to the beginning of the current stack.
      
      Note that it's probably outside of kernel_stack_pointer()'s scope to be
      switching stacks at all.  The x86_64 version of this function doesn't do
      it, and it would be better for the caller to do it if necessary.  But
      that's a patch for another day.  This just fixes the original intent.
      Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Byungchul Park <byungchul.park@lge.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Nilay Vaish <nilayvaish@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Fixes: 0788aa6a ("x86: Prepare removal of previous_esp from i386 thread_info structure")
      Link: http://lkml.kernel.org/r/472453d6e9f6a2d4ab16aaed4935f43117111566.1471535549.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      72b4f6a5
    • J
      x86/head: Remove useless zeroed word · ae952ffd
      Josh Poimboeuf 提交于
      This zeroed word has no apparent purpose, so remove it.
      
      Brian Gerst says:
      
        "FYI the word used to be the SS segment selector for the LSS
         instruction, which isn't needed in 64-bit mode."
      Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Byungchul Park <byungchul.park@lge.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Nilay Vaish <nilayvaish@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/b056855c295bbb3825b97c1e9f7958539a4d6cf2.1471535549.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      ae952ffd
    • J
      x86/dumpstack: Remove extra brackets around "<EOE>" · 6225f323
      Josh Poimboeuf 提交于
      When starting the dump of an exception stack, it shows "<<EOE>>" instead
      of "<EOE>".  print_trace_stack() already adds brackets, no need to add
      them again.
      Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Byungchul Park <byungchul.park@lge.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Nilay Vaish <nilayvaish@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/77f185fd5b81845869b400aa619415458df6b6cc.1471535549.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      6225f323
    • J
      x86/asm/head: Rename 'stack_start' -> 'initial_stack' · b32f96c7
      Josh Poimboeuf 提交于
      The 'stack_start' variable is similar in usage to 'initial_code' and
      'initial_gs': they're all stored in head_64.S and they're all updated by
      SMP and ACPI suspend before starting a CPU.
      
      Rename it to 'initial_stack' to be consistent with the others.
      Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Byungchul Park <byungchul.park@lge.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Nilay Vaish <nilayvaish@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/87063d773a3212051b77e17b0ee427f6582a5050.1471535549.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      b32f96c7
    • J
      x86/dumpstack: Remove show_trace() · bf255bda
      Josh Poimboeuf 提交于
      There are a bewildering array of options for dumping the stack.
      Simplify things a little by removing show_trace(), which is unused.
      Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
      Reviewed-by: NAndy Lutomirski <luto@kernel.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Byungchul Park <byungchul.park@lge.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Nilay Vaish <nilayvaish@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/fe02292eac9d409001ec0cf6d06f90ced242570d.1471535549.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      bf255bda
  3. 18 8月, 2016 2 次提交
    • J
      x86/smp: Fix __max_logical_packages value setup · 7b0501b1
      Jiri Olsa 提交于
      Frank reported kernel panic when he disabled several cores in BIOS
      via following option:
      
        Core Disable Bitmap(Hex)   [0]
      
      with number 0xFFE, which leaves 16 CPUs in system (out of 48).
      
      The kernel panic below goes along with following messages:
      
       smpboot: Max logical packages: 2^M
       smpboot: APIC(0) Converting physical 0 to logical package 0^M
       smpboot: APIC(20) Converting physical 1 to logical package 1^M
       smpboot: APIC(40) Package 2 exceeds logical package map^M
       smpboot: CPU 8 APICId 40 disabled^M
       smpboot: APIC(60) Package 3 exceeds logical package map^M
       smpboot: CPU 12 APICId 60 disabled^M
       ...
       general protection fault: 0000 [#1] SMP^M
       Modules linked in:^M
       CPU: 15 PID: 1 Comm: swapper/0 Not tainted 4.7.0-rc5+ #1^M
       Hardware name: SGI UV300/UV300, BIOS SGI UV 300 series BIOS 05/25/2016^M
       task: ffff8801673e0000 ti: ffff8801673ac000 task.ti: ffff8801673ac000^M
       RIP: 0010:[<ffffffff81014d54>]  [<ffffffff81014d54>] uncore_change_context+0xd4/0x180^M
       ...
        [<ffffffff810158ac>] uncore_event_init_cpu+0x6c/0x70^M
        [<ffffffff81d8c91c>] intel_uncore_init+0x1c2/0x2dd^M
        [<ffffffff81d8c75a>] ? uncore_cpu_setup+0x17/0x17^M
        [<ffffffff81002190>] do_one_initcall+0x50/0x190^M
        [<ffffffff810ab193>] ? parse_args+0x293/0x480^M
        [<ffffffff81d87365>] kernel_init_freeable+0x1a5/0x249^M
        [<ffffffff81d86a35>] ? set_debug_rodata+0x12/0x12^M
        [<ffffffff816dc19e>] kernel_init+0xe/0x110^M
        [<ffffffff816e93bf>] ret_from_fork+0x1f/0x40^M
        [<ffffffff816dc190>] ? rest_init+0x80/0x80^M
      
      The reason for the panic is wrong value of __max_logical_packages,
      which lets logical_package_map uninitialized and the uncore code
      relying on this map being properly initialized (maybe we should
      add some safety checks there as well).
      
      The __max_logical_packages is computed as:
      
        DIV_ROUND_UP(total_cpus, ncpus);
        - ncpus being number of cores
      
      With above BIOS setup we get total_cpus == 16 which set
      __max_logical_packages to 2 (ncpus is 12).
      
      Once topology_update_package_map processes CPU with logical
      pkg over 2 we display above messages and fail to initialize
      the physical_to_logical_pkg map, which makes the uncore code
      crash.
      
      The fix is to remove logical_package_map bitmap completely
      and keep and update the logical_packages number instead.
      
      After we enumerate all the present CPUs, we check if the
      enumerated logical packages count is within its computed
      maximum from BIOS data.
      
      If it's not the case, we set this maximum to the new enumerated
      value and freeze any new addition of logical packages.
      
      The freeze is because lot of init code like uncore/rapl/cqm
      depends on having maximum logical package value set to allocate
      their data, so we can't change it later on.
      
      Prarit Bhargava tested the patch and confirms that it solves
      the problem:
      
        From dmidecode:
                Core Count: 24
                Core Enabled: 24
                Thread Count: 48
      
      Orig kernel boot log:
      
       [    0.464981] smpboot: Max logical packages: 19
       [    0.469861] smpboot: APIC(0) Converting physical 0 to logical package 0
       [    0.477261] smpboot: APIC(40) Converting physical 1 to logical package 1
       [    0.484760] smpboot: APIC(80) Converting physical 2 to logical package 2
       [    0.492258] smpboot: APIC(c0) Converting physical 3 to logical package 3
      
      1.  nr_cpus=8, should stop enumerating in package 0:
      
       [    0.533664] smpboot: APIC(0) Converting physical 0 to logical package 0
       [    0.539596] smpboot: Max logical packages: 19
      
      2.  max_cpus=8, should still enumerate all packages:
      
       [    0.526494] smpboot: APIC(0) Converting physical 0 to logical package 0
       [    0.532428] smpboot: APIC(40) Converting physical 1 to logical package 1
       [    0.538456] smpboot: APIC(80) Converting physical 2 to logical package 2
       [    0.544486] smpboot: APIC(c0) Converting physical 3 to logical package 3
       [    0.550524] smpboot: Max logical packages: 19
      
      3.  nr_cpus=49 ( 2 socket + 1 core on 3rd socket), should stop enumerating in
          package 2:
      
       [    0.521378] smpboot: APIC(0) Converting physical 0 to logical package 0
       [    0.527314] smpboot: APIC(40) Converting physical 1 to logical package 1
       [    0.533345] smpboot: APIC(80) Converting physical 2 to logical package 2
       [    0.539368] smpboot: Max logical packages: 19
      
      4.  maxcpus=49, should still enumerate all packages:
      
       [    0.525591] smpboot: APIC(0) Converting physical 0 to logical package 0
       [    0.531525] smpboot: APIC(40) Converting physical 1 to logical package 1
       [    0.537547] smpboot: APIC(80) Converting physical 2 to logical package 2
       [    0.543579] smpboot: APIC(c0) Converting physical 3 to logical package 3
       [    0.549624] smpboot: Max logical packages: 19
      
      5.  kdump (nr_cpus=1) works as well.
      Reported-by: NFrank Ramsay <framsay@redhat.com>
      Tested-by: NPrarit Bhargava <prarit@redhat.com>
      Signed-off-by: NJiri Olsa <jolsa@kernel.org>
      Reviewed-by: NPrarit Bhargava <prarit@redhat.com>
      Acked-by: NPeter Zijlstra <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20160815101700.GA30090@kravaSigned-off-by: NIngo Molnar <mingo@kernel.org>
      7b0501b1
    • B
      x86/microcode/AMD: Fix initrd loading with CONFIG_RANDOMIZE_MEMORY=y · 88b2f634
      Borislav Petkov 提交于
      Similar to:
      
        efaad554 ("x86/microcode/intel: Fix initrd loading with CONFIG_RANDOMIZE_MEMORY=y")
      
      ... fix microcode loading from the initrd on AMD by adding the
      randomization offset to the microcode patch container within the initrd.
      Reported-and-tested-by: NBrian Gerst <brgerst@gmail.com>
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-tip-commits@vger.kernel.org
      Link: http://lkml.kernel.org/r/20160817113314.GA19221@nazgul.tnicSigned-off-by: NIngo Molnar <mingo@kernel.org>
      88b2f634
  4. 12 8月, 2016 1 次提交
    • D
      uprobes/x86: Fix RIP-relative handling of EVEX-encoded instructions · 68187872
      Denys Vlasenko 提交于
      Since instruction decoder now supports EVEX-encoded instructions, two fixes
      are needed to correctly handle them in uprobes.
      
      Extended bits for MODRM.rm field need to be sanitized just like we do it
      for VEX3, to avoid encoding wrong register for register-relative access.
      
      EVEX has _two_ extended bits: b and x. Theoretically, EVEX.x should be
      ignored by the CPU (since GPRs go only up to 15, not 31), but let's be
      paranoid here: proper encoding for register-relative access
      should have EVEX.x = 1.
      
      Secondly, we should fetch vex.vvvv for EVEX too.
      This is now super easy because instruction decoder populates
      vex_prefix.bytes[2] for all flavors of (e)vex encodings, even for VEX2.
      Signed-off-by: NDenys Vlasenko <dvlasenk@redhat.com>
      Acked-by: NMasami Hiramatsu <mhiramat@kernel.org>
      Acked-by: NSrikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jim Keniston <jkenisto@us.ibm.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: linux-kernel@vger.kernel.org
      Cc: <stable@vger.kernel.org> # v4.1+
      Fixes: 8a764a87 ("x86/asm/decoder: Create artificial 3rd byte for 2-byte VEX")
      Link: http://lkml.kernel.org/r/20160811154521.20469-1-dvlasenk@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      68187872
  5. 11 8月, 2016 5 次提交
    • S
      x86/apic/x2apic, smp/hotplug: Don't use before alloc in x2apic_cluster_probe() · d52c0569
      Sebastian Andrzej Siewior 提交于
      I made a mistake while converting the driver to the hotplug state
      machine and as a result x2apic_cluster_probe() was accessing
      cpus_in_cluster before allocating it.
      
      This patch fixes it by setting the cpumask after the allocation the
      memory succeeded.
      
      While at it, I marked two functions static which are only used within
      this file.
      Reported-by: NLaura Abbott <labbott@redhat.com>
      Signed-off-by: NSebastian Andrzej Siewior <bigeasy@linutronix.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Fixes: 6b2c2847 ("x86/x2apic: Convert to CPU hotplug state machine")
      Link: http://lkml.kernel.org/r/1470924515-9444-1-git-send-email-bigeasy@linutronix.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      d52c0569
    • A
      x86/boot: Defer setup_real_mode() to early_initcall time · d0de0f68
      Andy Lutomirski 提交于
      There's no need to run setup_real_mode() as early as we run it.
      Defer it to the same early_initcall that sets up the page
      permissions for the real mode code.
      
      This should be a code size reduction.  More importantly, it give us
      a longer window in which we can allocate the real mode trampoline.
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mario Limonciello <mario_limonciello@dell.com>
      Cc: Matt Fleming <mfleming@suse.de>
      Cc: Matthew Garrett <mjg59@srcf.ucam.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/fd62f0da4f79357695e9bf3e365623736b05f119.1470821230.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      d0de0f68
    • A
      x86/boot: Synchronize trampoline_cr4_features and mmu_cr4_features directly · 18bc7bd5
      Andy Lutomirski 提交于
      The initialization process for trampoline_cr4_features and
      mmu_cr4_features was confusing.  The intent is for mmu_cr4_features
      and *trampoline_cr4_features to stay in sync, but
      trampoline_cr4_features is NULL until setup_real_mode() runs.  The
      old code synchronized *trampoline_cr4_features *twice*, once in
      setup_real_mode() and once in setup_arch().  It also initialized
      mmu_cr4_features in setup_real_mode(), which causes the actual value
      of mmu_cr4_features to potentially depend on when setup_real_mode()
      is called.
      
      With this patch, mmu_cr4_features is initialized directly in
      setup_arch(), and *trampoline_cr4_features is synchronized to
      mmu_cr4_features when the trampoline is set up.
      
      After this patch, it should be safe to defer setup_real_mode().
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mario Limonciello <mario_limonciello@dell.com>
      Cc: Matt Fleming <mfleming@suse.de>
      Cc: Matthew Garrett <mjg59@srcf.ucam.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/d48a263f9912389b957dd495a7127b009259ffe0.1470821230.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      18bc7bd5
    • A
      x86/boot: Run reserve_bios_regions() after we initialize the memory map · 007b7560
      Andy Lutomirski 提交于
      reserve_bios_regions() is a quirk that reserves memory that we might
      otherwise think is available.  There's no need to run it so early,
      and running it before we have the memory map initialized with its
      non-quirky inputs makes it hard to make reserve_bios_regions() more
      intelligent.
      
      Move it right after we populate the memblock state.
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mario Limonciello <mario_limonciello@dell.com>
      Cc: Matt Fleming <mfleming@suse.de>
      Cc: Matthew Garrett <mjg59@srcf.ucam.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/59f58618911005c799c6c9979ce6ae4881d907c2.1470821230.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      007b7560
    • A
      x86/irq: Do not substract irq_tlb_count from irq_call_count · 82ba4fac
      Aaron Lu 提交于
      Since commit:
      
        52aec330 ("x86/tlb: replace INVALIDATE_TLB_VECTOR by CALL_FUNCTION_VECTOR")
      
      the TLB remote shootdown is done through call function vector. That
      commit didn't take care of irq_tlb_count, which a later commit:
      
        fd0f5869 ("x86: Distinguish TLB shootdown interrupts from other functions call interrupts")
      
      ... tried to fix.
      
      The fix assumes every increase of irq_tlb_count has a corresponding
      increase of irq_call_count. So the irq_call_count is always bigger than
      irq_tlb_count and we could substract irq_tlb_count from irq_call_count.
      
      Unfortunately this is not true for the smp_call_function_single() case.
      The IPI is only sent if the target CPU's call_single_queue is empty when
      adding a csd into it in generic_exec_single. That means if two threads
      are both adding flush tlb csds to the same CPU's call_single_queue, only
      one IPI is sent. In other words, the irq_call_count is incremented by 1
      but irq_tlb_count is incremented by 2. Over time, irq_tlb_count will be
      bigger than irq_call_count and the substract will produce a very large
      irq_call_count value due to overflow.
      
      Considering that:
      
        1) it's not worth to send more IPIs for the sake of accurate counting of
           irq_call_count in generic_exec_single();
      
        2) it's not easy to tell if the call function interrupt is for TLB
           shootdown in __smp_call_function_single_interrupt().
      
      Not to exclude TLB shootdown from call function count seems to be the
      simplest fix and this patch just does that.
      
      This bug was found by LKP's cyclic performance regression tracking recently
      with the vm-scalability test suite. I have bisected to commit:
      
        3dec0ba0 ("mm/rmap: share the i_mmap_rwsem")
      
      This commit didn't do anything wrong but revealed the irq_call_count
      problem. IIUC, the commit makes rwc->remap_one in rmap_walk_file
      concurrent with multiple threads.  When remap_one is try_to_unmap_one(),
      then multiple threads could queue flush TLB to the same CPU but only
      one IPI will be sent.
      
      Since the commit was added in Linux v3.19, the counting problem only
      shows up from v3.19 onwards.
      Signed-off-by: NAaron Lu <aaron.lu@intel.com>
      Cc: Alex Shi <alex.shi@linaro.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Huang Ying <ying.huang@intel.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tomoki Sekiyama <tomoki.sekiyama.qu@hitachi.com>
      Link: http://lkml.kernel.org/r/20160811074430.GA18163@aaronlu.sh.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      82ba4fac
  6. 10 8月, 2016 8 次提交
    • D
      x86/mm/pkeys: Fix compact mode by removing protection keys' XSAVE buffer manipulation · b79daf85
      Dave Hansen 提交于
      The Memory Protection Keys "rights register" (PKRU) is
      XSAVE-managed, and is saved/restored along with the FPU state.
      
      When kernel code accesses FPU regsisters, it does a delicate
      dance with preempt.  Otherwise, the context switching code can
      get confused as to whether the most up-to-date state is in the
      registers themselves or in the XSAVE buffer.
      
      But, PKRU is not a normal FPU register.  Using it does not
      generate the normal device-not-available (#NM) exceptions which
      means we can not manage it lazily, and the kernel completley
      disallows using lazy mode when it is enabled.
      
      The dance with preempt *only* occurs when managing the FPU
      lazily.  Since we never manage PKRU lazily, we do not have to do
      the dance with preempt; we can access it directly.  Doing it
      this way saves a ton of complicated code (and is faster too).
      
      Further, the XSAVES reenabling failed to patch a bit of code
      in fpu__xfeature_set_state() the checked for compacted buffers.
      That check caused fpu__xfeature_set_state() to silently refuse to
      work when the kernel is using compacted XSAVE buffers.  This
      broke execute-only and future pkey_mprotect() support when using
      compact XSAVE buffers.
      
      But, removing fpu__xfeature_set_state() gets rid of this issue,
      in addition to the nice cleanup and speedup.
      
      This fixes the same thing as a fix that Sai posted:
      
        https://lkml.org/lkml/2016/7/25/637
      
      The fix that he posted is a much more obviously correct, but I
      think we should just do this instead.
      Reported-by: NSai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
      Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Dave Hansen <dave@sr71.net>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
      Cc: Ravi Shankar <ravi.v.shankar@intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Yu-Cheng Yu <yu-cheng.yu@intel.com>
      Link: http://lkml.kernel.org/r/20160727232040.7D060DAD@viggo.jf.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      b79daf85
    • M
      x86/platform/UV: Fix kernel panic running RHEL kdump kernel on UV systems · 5a52e8f8
      Mike Travis 提交于
      The latest UV kernel support panics when RHEL7 kexec's the kdump kernel
      to make a dumpfile.  This patch fixes the problem by turning off all UV
      support if NUMA is off.
      Tested-by: NFrank Ramsay <framsay@sgi.com>
      Tested-by: NJohn Estabrook <estabrook@sgi.com>
      Signed-off-by: NMike Travis <travis@sgi.com>
      Reviewed-by: NDimitri Sivanich <sivanich@sgi.com>
      Reviewed-by: NNathan Zimmer <nzimmer@sgi.com>
      Cc: Alex Thorlton <athorlton@sgi.com>
      Cc: Andrew Banman <abanman@sgi.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Russ Anderson <rja@sgi.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20160801184050.577755634@asylum.americas.sgi.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      5a52e8f8
    • M
      x86/platform/UV: Fix problem with UV4 BIOS providing incorrect PXM values · 22ac2bca
      Mike Travis 提交于
      There are some circumstances where the UV4 BIOS cannot provide the
      correct Proximity Node values to associate with specific Sockets and
      Physical Nodes.  The decision was made to remove these values from BIOS
      and for the kernel to get these values from the standard ACPI tables.
      Tested-by: NFrank Ramsay <framsay@sgi.com>
      Tested-by: NJohn Estabrook <estabrook@sgi.com>
      Signed-off-by: NMike Travis <travis@sgi.com>
      Reviewed-by: NDimitri Sivanich <sivanich@sgi.com>
      Reviewed-by: NNathan Zimmer <nzimmer@sgi.com>
      Cc: Alex Thorlton <athorlton@sgi.com>
      Cc: Andrew Banman <abanman@sgi.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Russ Anderson <rja@sgi.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20160801184050.414210079@asylum.americas.sgi.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      22ac2bca
    • M
      x86/platform/UV: Fix problem with UV4 Socket IDs not being contiguous · 054f621f
      Mike Travis 提交于
      The UV4 Socket IDs are not guaranteed to equate to Node values which
      can cause the GAM (Global Addressable Memory) table lookups to fail.
      Fix this by using an independent index into the GAM table instead of
      the Socket ID to reference the base address.
      Tested-by: NFrank Ramsay <framsay@sgi.com>
      Tested-by: NJohn Estabrook <estabrook@sgi.com>
      Signed-off-by: NMike Travis <travis@sgi.com>
      Reviewed-by: NDimitri Sivanich <sivanich@sgi.com>
      Reviewed-by: NNathan Zimmer <nzimmer@sgi.com>
      Cc: Alex Thorlton <athorlton@sgi.com>
      Cc: Andrew Banman <abanman@sgi.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Russ Anderson <rja@sgi.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20160801184050.048755337@asylum.americas.sgi.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      054f621f
    • T
      x86/mm/KASLR: Fix physical memory calculation on KASLR memory randomization · c7d2361f
      Thomas Garnier 提交于
      Initialize KASLR memory randomization after max_pfn is initialized. Also
      ensure the size is rounded up. It could create problems on machines
      with more than 1Tb of memory on certain random addresses.
      Signed-off-by: NThomas Garnier <thgarnie@google.com>
      Cc: Aleksey Makarov <aleksey.makarov@linaro.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Fabian Frederick <fabf@skynet.be>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Joerg Roedel <jroedel@suse.de>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Lv Zheng <lv.zheng@intel.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rafael J . Wysocki <rafael.j.wysocki@intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: kernel-hardening@lists.openwall.com
      Fixes: 021182e5 ("Enable KASLR for physical mapping memory regions")
      Link: http://lkml.kernel.org/r/1470762665-88032-1-git-send-email-thgarnie@google.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      c7d2361f
    • A
      x86/hpet: Fix /dev/rtc breakage caused by RTC cleanup · 22cc1ca3
      Arnd Bergmann 提交于
      Ville Syrjälä reports "The first time I run hwclock after rebooting
      I get this:
      
       open("/dev/rtc", O_RDONLY)              = 3
       ioctl(3, PHN_SET_REGS or RTC_UIE_ON, 0) = 0
       select(4, [3], NULL, NULL, {10, 0})     = 0 (Timeout)
       ioctl(3, PHN_NOT_OH or RTC_UIE_OFF, 0)  = 0
       close(3)                                = 0
      
      On all subsequent runs I get this:
      
       open("/dev/rtc", O_RDONLY)              = 3
       ioctl(3, PHN_SET_REGS or RTC_UIE_ON, 0) = -1 EINVAL (Invalid argument)
       ioctl(3, RTC_RD_TIME, 0x7ffd76b3ae70)   = -1 EINVAL (Invalid argument)
       close(3)                                = 0"
      
      This was caused by a stupid typo in a patch that should have been
      a simple rename to move around contents of a header file, but
      accidentally wrote zeroes into the rtc rather than reading from
      it:
      
        463a8630 ("char/genrtc: x86: remove remnants of asm/rtc.h")
      Reported-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
      Tested-by: NJarkko Nikula <jarkko.nikula@linux.intel.com>
      Tested-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      Cc: Alessandro Zummo <a.zummo@towertech.it>
      Cc: Alexandre Belloni <alexandre.belloni@free-electrons.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: rtc-linux@googlegroups.com
      Fixes: 463a8630 ("char/genrtc: x86: remove remnants of asm/rtc.h")
      Link: http://lkml.kernel.org/r/20160809195528.1604312-1-arnd@arndb.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      22cc1ca3
    • N
      x86/timers/apic: Inform TSC deadline clockevent device about recalibration · 6731b0d6
      Nicolai Stange 提交于
      This patch eliminates a source of imprecise APIC timer interrupts,
      which imprecision may result in double interrupts or even late
      interrupts.
      
      The TSC deadline clockevent devices' configuration and registration
      happens before the TSC frequency calibration is refined in
      tsc_refine_calibration_work().
      
      This results in the TSC clocksource and the TSC deadline clockevent
      devices being configured with slightly different frequencies: the former
      gets the refined one and the latter are configured with the inaccurate
      frequency detected earlier by means of the "Fast TSC calibration using PIT".
      
      Within the APIC code, introduce the notifier function
      lapic_update_tsc_freq() which reconfigures all per-CPU TSC deadline
      clockevent devices with the current tsc_khz.
      
      Call it from the TSC code after TSC calibration refinement has happened.
      Signed-off-by: NNicolai Stange <nicstange@gmail.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Christopher S. Hall <christopher.s.hall@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
      Cc: Len Brown <len.brown@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Viresh Kumar <viresh.kumar@linaro.org>
      Link: http://lkml.kernel.org/r/20160714152255.18295-3-nicstange@gmail.com
      [ Pushed #ifdef CONFIG_X86_LOCAL_APIC into header, improved changelog. ]
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      6731b0d6
    • N
      x86/timers/apic: Fix imprecise timer interrupts by eliminating TSC clockevents... · 1a9e4c56
      Nicolai Stange 提交于
      x86/timers/apic: Fix imprecise timer interrupts by eliminating TSC clockevents frequency roundoff error
      
      I noticed the following bug/misbehavior on certain Intel systems: with a
      single task running on a NOHZ CPU on an Intel Haswell, I recognized
      that I did not only get the one expected local_timer APIC interrupt, but
      two per second at minimum. (!)
      
      Further tracing showed that the first one precedes the programmed deadline
      by up to ~50us and hence, it did nothing except for reprogramming the TSC
      deadline clockevent device to trigger shortly thereafter again.
      
      The reason for this is imprecise calibration, the timeout we program into
      the APIC results in 'too short' timer interrupts. The core (hr)timer code
      notices this (because it has a precise ktime source and sees the short
      interrupt) and fixes it up by programming an additional very short
      interrupt period.
      
      This is obviously suboptimal.
      
      The reason for the imprecise calibration is twofold, and this patch
      fixes the first reason:
      
      In setup_APIC_timer(), the registered clockevent device's frequency
      is calculated by first dividing tsc_khz by TSC_DIVISOR and multiplying
      it with 1000 afterwards:
      
        (tsc_khz / TSC_DIVISOR) * 1000
      
      The multiplication with 1000 is done for converting from kHz to Hz and the
      division by TSC_DIVISOR is carried out in order to make sure that the final
      result fits into an u32.
      
      However, with the order given in this calculation, the roundoff error
      introduced by the division gets magnified by a factor of 1000 by the
      following multiplication.
      
      To fix it, reversing the order of the division and the multiplication a la:
      
        (tsc_khz * 1000) / TSC_DIVISOR
      
      ... reduces the roundoff error already.
      
      Furthermore, if TSC_DIVISOR divides 1000, associativity holds:
      
        (tsc_khz * 1000) / TSC_DIVISOR = tsc_khz * (1000 / TSC_DIVISOR)
      
      and thus, the roundoff error even vanishes and the whole operation can be
      carried out within 32 bits.
      
      The powers of two that divide 1000 are 2, 4 and 8. A value of 8 for
      TSC_DIVISOR still allows for TSC frequencies up to
      2^32 / 10^9ns * 8 = 34.4GHz which is way larger than anything to expect
      in the next years.
      
      Thus we also replace the current TSC_DIVISOR value of 32 by 8. Reverse
      the order of the divison and the multiplication in the calculation of
      the registered clockevent device's frequency.
      Signed-off-by: NNicolai Stange <nicstange@gmail.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com>
      Acked-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Christopher S. Hall <christopher.s.hall@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
      Cc: Len Brown <len.brown@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Viresh Kumar <viresh.kumar@linaro.org>
      Link: http://lkml.kernel.org/r/20160714152255.18295-2-nicstange@gmail.com
      [ Improved changelog. ]
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      1a9e4c56
  7. 04 8月, 2016 3 次提交
    • K
      dma-mapping: use unsigned long for dma_attrs · 00085f1e
      Krzysztof Kozlowski 提交于
      The dma-mapping core and the implementations do not change the DMA
      attributes passed by pointer.  Thus the pointer can point to const data.
      However the attributes do not have to be a bitfield.  Instead unsigned
      long will do fine:
      
      1. This is just simpler.  Both in terms of reading the code and setting
         attributes.  Instead of initializing local attributes on the stack
         and passing pointer to it to dma_set_attr(), just set the bits.
      
      2. It brings safeness and checking for const correctness because the
         attributes are passed by value.
      
      Semantic patches for this change (at least most of them):
      
          virtual patch
          virtual context
      
          @r@
          identifier f, attrs;
      
          @@
          f(...,
          - struct dma_attrs *attrs
          + unsigned long attrs
          , ...)
          {
          ...
          }
      
          @@
          identifier r.f;
          @@
          f(...,
          - NULL
          + 0
           )
      
      and
      
          // Options: --all-includes
          virtual patch
          virtual context
      
          @r@
          identifier f, attrs;
          type t;
      
          @@
          t f(..., struct dma_attrs *attrs);
      
          @@
          identifier r.f;
          @@
          f(...,
          - NULL
          + 0
           )
      
      Link: http://lkml.kernel.org/r/1468399300-5399-2-git-send-email-k.kozlowski@samsung.comSigned-off-by: NKrzysztof Kozlowski <k.kozlowski@samsung.com>
      Acked-by: NVineet Gupta <vgupta@synopsys.com>
      Acked-by: NRobin Murphy <robin.murphy@arm.com>
      Acked-by: NHans-Christian Noren Egtvedt <egtvedt@samfundet.no>
      Acked-by: Mark Salter <msalter@redhat.com> [c6x]
      Acked-by: Jesper Nilsson <jesper.nilsson@axis.com> [cris]
      Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> [drm]
      Reviewed-by: NBart Van Assche <bart.vanassche@sandisk.com>
      Acked-by: Joerg Roedel <jroedel@suse.de> [iommu]
      Acked-by: Fabien Dessenne <fabien.dessenne@st.com> [bdisp]
      Reviewed-by: Marek Szyprowski <m.szyprowski@samsung.com> [vb2-core]
      Acked-by: David Vrabel <david.vrabel@citrix.com> [xen]
      Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> [xen swiotlb]
      Acked-by: Joerg Roedel <jroedel@suse.de> [iommu]
      Acked-by: Richard Kuo <rkuo@codeaurora.org> [hexagon]
      Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k]
      Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> [s390]
      Acked-by: NBjorn Andersson <bjorn.andersson@linaro.org>
      Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no> [avr32]
      Acked-by: Vineet Gupta <vgupta@synopsys.com> [arc]
      Acked-by: Robin Murphy <robin.murphy@arm.com> [arm64 and dma-iommu]
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      00085f1e
    • M
      tree-wide: replace config_enabled() with IS_ENABLED() · 97f2645f
      Masahiro Yamada 提交于
      The use of config_enabled() against config options is ambiguous.  In
      practical terms, config_enabled() is equivalent to IS_BUILTIN(), but the
      author might have used it for the meaning of IS_ENABLED().  Using
      IS_ENABLED(), IS_BUILTIN(), IS_MODULE() etc.  makes the intention
      clearer.
      
      This commit replaces config_enabled() with IS_ENABLED() where possible.
      This commit is only touching bool config options.
      
      I noticed two cases where config_enabled() is used against a tristate
      option:
      
       - config_enabled(CONFIG_HWMON)
        [ drivers/net/wireless/ath/ath10k/thermal.c ]
      
       - config_enabled(CONFIG_BACKLIGHT_CLASS_DEVICE)
        [ drivers/gpu/drm/gma500/opregion.c ]
      
      I did not touch them because they should be converted to IS_BUILTIN()
      in order to keep the logic, but I was not sure it was the authors'
      intention.
      
      Link: http://lkml.kernel.org/r/1465215656-20569-1-git-send-email-yamada.masahiro@socionext.comSigned-off-by: NMasahiro Yamada <yamada.masahiro@socionext.com>
      Acked-by: NKees Cook <keescook@chromium.org>
      Cc: Stas Sergeev <stsp@list.ru>
      Cc: Matt Redfearn <matt.redfearn@imgtec.com>
      Cc: Joshua Kinard <kumba@gentoo.org>
      Cc: Jiri Slaby <jslaby@suse.com>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Markos Chandras <markos.chandras@imgtec.com>
      Cc: "Dmitry V. Levin" <ldv@altlinux.org>
      Cc: yu-cheng yu <yu-cheng.yu@intel.com>
      Cc: James Hogan <james.hogan@imgtec.com>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Johannes Berg <johannes@sipsolutions.net>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Will Drewry <wad@chromium.org>
      Cc: Nikolay Martynov <mar.kolya@gmail.com>
      Cc: Huacai Chen <chenhc@lemote.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Leonid Yegoshin <Leonid.Yegoshin@imgtec.com>
      Cc: Rafal Milecki <zajec5@gmail.com>
      Cc: James Cowgill <James.Cowgill@imgtec.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Alex Smith <alex.smith@imgtec.com>
      Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
      Cc: Qais Yousef <qais.yousef@imgtec.com>
      Cc: Jiang Liu <jiang.liu@linux.intel.com>
      Cc: Mikko Rapeli <mikko.rapeli@iki.fi>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Brian Norris <computersforpeace@gmail.com>
      Cc: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
      Cc: "Luis R. Rodriguez" <mcgrof@do-not-panic.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Roland McGrath <roland@hack.frob.com>
      Cc: Paul Burton <paul.burton@imgtec.com>
      Cc: Kalle Valo <kvalo@qca.qualcomm.com>
      Cc: Viresh Kumar <viresh.kumar@linaro.org>
      Cc: Tony Wu <tung7970@gmail.com>
      Cc: Huaitong Han <huaitong.han@intel.com>
      Cc: Sumit Semwal <sumit.semwal@linaro.org>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Jason Cooper <jason@lakedaemon.net>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Andrea Gelmini <andrea.gelmini@gelma.net>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Rabin Vincent <rabin@rab.in>
      Cc: "Maciej W. Rozycki" <macro@imgtec.com>
      Cc: David Daney <david.daney@cavium.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      97f2645f
    • P
      pvclock: introduce seqcount-like API · 3aed64f6
      Paolo Bonzini 提交于
      The version field in struct pvclock_vcpu_time_info basically implements
      a seqcount.  Wrap it with the usual read_begin and read_retry functions,
      and use these APIs instead of peppering the code with smp_rmb()s.
      While at it, change it to the more pedantically correct virt_rmb().
      
      With this change, __pvclock_read_cycles can be simplified noticeably.
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      3aed64f6
  8. 27 7月, 2016 3 次提交
  9. 25 7月, 2016 1 次提交
  10. 22 7月, 2016 1 次提交
    • A
      x86/boot: Simplify EBDA-vs-BIOS reservation logic · 6a79296c
      Andy Lutomirski 提交于
      Both the intent and the effect of reserve_bios_regions() is simple:
      reserve the range from the apparent BIOS start (suitably filtered)
      through 1MB and, if the EBDA start address is sensible, extend that
      reservation downward to cover the EBDA as well.
      
      The code is overcomplicated, though, and contains head-scratchers
      like:
      
      	if (ebda_start < BIOS_START_MIN)
      		ebda_start = BIOS_START_MAX;
      
      That snipped is trying to say "if ebda_start < BIOS_START_MIN,
      ignore it".
      
      Simplify it: reorder the code so that it makes sense.  This should
      have no functional effect under any circumstances.
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Luis R. Rodriguez <mcgrof@suse.com>
      Cc: Mario Limonciello <mario_limonciello@dell.com>
      Cc: Matthew Garrett <mjg59@srcf.ucam.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Link: http://lkml.kernel.org/r/ef89c0c761be20ead8bd9a3275743e6259b6092a.1469135598.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      6a79296c