1. 20 1月, 2017 1 次提交
    • P
      sched/clock: Fix hotplug crash · acb04058
      Peter Zijlstra 提交于
      Mike reported that he could trigger the WARN_ON_ONCE() in
      set_sched_clock_stable() using hotplug.
      
      This exposed a fundamental problem with the interface, we should never
      mark the TSC stable if we ever find it to be unstable. Therefore
      set_sched_clock_stable() is a broken interface.
      
      The reason it existed is that not having it is a pain, it means all
      relevant architecture code needs to call clear_sched_clock_stable()
      where appropriate.
      
      Of the three architectures that select HAVE_UNSTABLE_SCHED_CLOCK ia64
      and parisc are trivial in that they never called
      set_sched_clock_stable(), so add an unconditional call to
      clear_sched_clock_stable() to them.
      
      For x86 the story is a lot more involved, and what this patch tries to
      do is ensure we preserve the status quo. So even is Cyrix or Transmeta
      have usable TSC they never called set_sched_clock_stable() so they now
      get an explicit mark unstable.
      Reported-by: NMike Galbraith <efault@gmx.de>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Fixes: 9881b024 ("sched/clock: Delay switching sched_clock to stable")
      Link: http://lkml.kernel.org/r/20170119133633.GB6536@twins.programming.kicks-ass.netSigned-off-by: NIngo Molnar <mingo@kernel.org>
      acb04058
  2. 19 1月, 2017 1 次提交
  3. 18 1月, 2017 1 次提交
  4. 14 1月, 2017 2 次提交
  5. 12 1月, 2017 2 次提交
    • J
      x86/unwind: Disable KASAN checks for non-current tasks · 84936118
      Josh Poimboeuf 提交于
      There are a handful of callers to save_stack_trace_tsk() and
      show_stack() which try to unwind the stack of a task other than current.
      In such cases, it's remotely possible that the task is running on one
      CPU while the unwinder is reading its stack from another CPU, causing
      the unwinder to see stack corruption.
      
      These cases seem to be mostly harmless.  The unwinder has checks which
      prevent it from following bad pointers beyond the bounds of the stack.
      So it's not really a bug as long as the caller understands that
      unwinding another task will not always succeed.
      
      In such cases, it's possible that the unwinder may read a KASAN-poisoned
      region of the stack.  Account for that by using READ_ONCE_NOCHECK() when
      reading the stack of another task.
      
      Use READ_ONCE() when reading the stack of the current task, since KASAN
      warnings can still be useful for finding bugs in that case.
      Reported-by: NDmitry Vyukov <dvyukov@google.com>
      Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Jones <davej@codemonkey.org.uk>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Miroslav Benes <mbenes@suse.cz>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/4c575eb288ba9f73d498dfe0acde2f58674598f1.1483978430.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      84936118
    • J
      x86/unwind: Silence warnings for non-current tasks · 900742d8
      Josh Poimboeuf 提交于
      There are a handful of callers to save_stack_trace_tsk() and
      show_stack() which try to unwind the stack of a task other than current.
      In such cases, it's remotely possible that the task is running on one
      CPU while the unwinder is reading its stack from another CPU, causing
      the unwinder to see stack corruption.
      
      These cases seem to be mostly harmless.  The unwinder has checks which
      prevent it from following bad pointers beyond the bounds of the stack.
      So it's not really a bug as long as the caller understands that
      unwinding another task will not always succeed.
      
      Since stack "corruption" on another task's stack isn't necessarily a
      bug, silence the warnings when unwinding tasks other than current.
      Reported-by: NDave Jones <davej@codemonkey.org.uk>
      Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Miroslav Benes <mbenes@suse.cz>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/00d8c50eea3446c1524a2a755397a3966629354c.1483978430.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      900742d8
  6. 10 1月, 2017 4 次提交
  7. 06 1月, 2017 1 次提交
  8. 05 1月, 2017 1 次提交
  9. 27 12月, 2016 1 次提交
  10. 25 12月, 2016 4 次提交
  11. 24 12月, 2016 1 次提交
    • J
      Revert "x86/unwind: Detect bad stack return address" · c280f773
      Josh Poimboeuf 提交于
      Revert the following commit:
      
        b6959a36 ("x86/unwind: Detect bad stack return address")
      
      ... because Andrey Konovalov reported an unwinder warning:
      
        WARNING: unrecognized kernel stack return address ffffffffa0000001 at ffff88006377fa18 in a.out:4467
      
      The unwind was initiated from an interrupt which occurred while running in the
      generated code for a kprobe.  The unwinder printed the warning because it
      expected regs->ip to point to a valid text address, but instead it pointed to
      the generated code.
      
      Eventually we may want come up with a way to identify generated kprobe
      code so the unwinder can know that it's a valid return address.  Until
      then, just remove the warning.
      Reported-by: NAndrey Konovalov <andreyknvl@google.com>
      Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/02f296848fbf49fb72dfeea706413ecbd9d4caf6.1482418739.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      c280f773
  12. 23 12月, 2016 1 次提交
    • P
      x86/paravirt: Mark unused patch_default label · cef4402d
      Peter Zijlstra 提交于
      A bugfix commit:
      
        45dbea5f ("x86/paravirt: Fix native_patch()")
      
      ... introduced a harmless warning:
      
        arch/x86/kernel/paravirt_patch_32.c: In function 'native_patch':
        arch/x86/kernel/paravirt_patch_32.c:71:1: error: label 'patch_default' defined but not used [-Werror=unused-label]
      
      Fix it by annotating the label as __maybe_unused.
      Reported-by: NArnd Bergmann <arnd@arndb.de>
      Reported-by: NPiotr Gregor <piotrgregor@rsyncme.org>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Fixes: 45dbea5f ("x86/paravirt: Fix native_patch()")
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      cef4402d
  13. 21 12月, 2016 1 次提交
  14. 20 12月, 2016 2 次提交
    • B
      x86/alternatives: Do not use sync_core() to serialize I$ · 34bfab0e
      Borislav Petkov 提交于
      We use sync_core() in the alternatives code to stop speculative
      execution of prefetched instructions because we are potentially changing
      them and don't want to execute stale bytes.
      
      What it does on most machines is call CPUID which is a serializing
      instruction. And that's expensive.
      
      However, the instruction cache is serialized when we're on the local CPU
      and are changing the data through the same virtual address. So then, we
      don't need the serializing CPUID but a simple control flow change. Last
      being accomplished with a CALL/RET which the noinline causes.
      Suggested-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Reviewed-by: NAndy Lutomirski <luto@kernel.org>
      Cc: Andrew Cooper <andrew.cooper3@citrix.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
      Cc: Matthew Whitehead <tedheadster@gmail.com>
      Cc: One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20161203150258.vwr5zzco7ctgc4pe@pd.tnicSigned-off-by: NIngo Molnar <mingo@kernel.org>
      34bfab0e
    • V
      x86/hyperv: Handle unknown NMIs on one CPU when unknown_nmi_panic · 59107e2f
      Vitaly Kuznetsov 提交于
      There is a feature in Hyper-V ('Debug-VM --InjectNonMaskableInterrupt')
      which injects NMI to the guest. We may want to crash the guest and do kdump
      on this NMI by enabling unknown_nmi_panic. To make kdump succeed we need to
      allow the kdump kernel to re-establish VMBus connection so it will see
      VMBus devices (storage, network,..).
      
      To properly unload VMBus making it possible to start over during kdump we
      need to do the following:
      
       - Send an 'unload' message to the hypervisor. This can be done on any CPU
         so we do this the crashing CPU.
      
       - Receive the 'unload finished' reply message. WS2012R2 delivers this
         message to the CPU which was used to establish VMBus connection during
         module load and this CPU may differ from the CPU sending 'unload'.
      
      Receiving a VMBus message means the following:
      
       - There is a per-CPU slot in memory for one message. This slot can in
         theory be accessed by any CPU.
      
       - We get an interrupt on the CPU when a message was placed into the slot.
      
       - When we read the message we need to clear the slot and signal the fact
         to the hypervisor. In case there are more messages to this CPU pending
         the hypervisor will deliver the next message. The signaling is done by
         writing to an MSR so this can only be done on the appropriate CPU.
      
      To avoid doing cross-CPU work on crash we have vmbus_wait_for_unload()
      function which checks message slots for all CPUs in a loop waiting for the
      'unload finished' messages. However, there is an issue which arises when
      these conditions are met:
      
       - We're crashing on a CPU which is different from the one which was used
         to initially contact the hypervisor.
      
       - The CPU which was used for the initial contact is blocked with interrupts
         disabled and there is a message pending in the message slot.
      
      In this case we won't be able to read the 'unload finished' message on the
      crashing CPU. This is reproducible when we receive unknown NMIs on all CPUs
      simultaneously: the first CPU entering panic() will proceed to crash and
      all other CPUs will stop themselves with interrupts disabled.
      
      The suggested solution is to handle unknown NMIs for Hyper-V guests on the
      first CPU which gets them only. This will allow us to rely on VMBus
      interrupt handler being able to receive the 'unload finish' message in
      case it is delivered to a different CPU.
      
      The issue is not reproducible on WS2016 as Debug-VM delivers NMI to the
      boot CPU only, WS2012R2 and earlier Hyper-V versions are affected.
      Signed-off-by: NVitaly Kuznetsov <vkuznets@redhat.com>
      Acked-by: NK. Y. Srinivasan <kys@microsoft.com>
      Cc: devel@linuxdriverproject.org
      Cc: Haiyang Zhang <haiyangz@microsoft.com>
      Link: http://lkml.kernel.org/r/20161202100720.28121-1-vkuznets@redhat.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      59107e2f
  15. 19 12月, 2016 12 次提交
  16. 18 12月, 2016 2 次提交
    • T
      x86/tsc: Limit the adjust value further · 8c9b9d87
      Thomas Gleixner 提交于
      Adjust value 0x80000000 and other values larger than that render the TSC
      deadline timer disfunctional.
      
      We have not yet any information about this from Intel, but experimentation
      clearly proves that this is a 32/64 bit and sign extension issue.
      
      If adjust values larger than that are actually required, which might be the
      case for physical CPU hotplug, then we need to disable the deadline timer
      on the affected package/CPUs and use the local APIC timer instead.
      
      That requires some surgery in the APIC setup code, so we just limit the
      ADJUST register value into the known to work range for now and revisit this
      when Intel comes forth with proper information.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Roland Scheidegger <rscheidegger_lists@hispeed.ch>
      Cc: Bruce Schlobohm <bruce.schlobohm@intel.com>
      Cc: Kevin Stanton <kevin.b.stanton@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      8c9b9d87
    • T
      x86/tsc: Annotate printouts as firmware bug · 16588f65
      Thomas Gleixner 提交于
      Make it more obvious that the BIOS is screwed up.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Roland Scheidegger <rscheidegger_lists@hispeed.ch>
      Cc: Bruce Schlobohm <bruce.schlobohm@intel.com>
      Cc: Kevin Stanton <kevin.b.stanton@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      16588f65
  17. 15 12月, 2016 3 次提交
    • T
      x86/tsc: Force TSC_ADJUST register to value >= zero · 5bae1562
      Thomas Gleixner 提交于
      Roland reported that his DELL T5810 sports a value add BIOS which
      completely wreckages the TSC. The squirmware [(TM) Ingo Molnar] boots with
      random negative TSC_ADJUST values, different on all CPUs. That renders the
      TSC useless because the sycnchronization check fails.
      
      Roland tested the new TSC_ADJUST mechanism. While it manages to readjust
      the TSCs he needs to disable the TSC deadline timer, otherwise the machine
      just stops booting.
      
      Deeper investigation unearthed that the TSC deadline timer is sensitive to
      the TSC_ADJUST value. Writing TSC_ADJUST to a negative value results in an
      interrupt storm caused by the TSC deadline timer.
      
      This does not make any sense and it's hard to imagine what kind of hardware
      wreckage is behind that misfeature, but it's reliably reproducible on other
      systems which have TSC_ADJUST and TSC deadline timer.
      
      While it would be understandable that a big enough negative value which
      moves the resulting TSC readout into the negative space could have the
      described effect, this happens even with a adjust value of -1, which keeps
      the TSC readout definitely in the positive space. The compare register for
      the TSC deadline timer is set to a positive value larger than the TSC, but
      despite not having reached the deadline the interrupt is raised
      immediately. If this happens on the boot CPU, then the machine dies
      silently because this setup happens before the NMI watchdog is armed.
      
      Further experiments showed that any other adjustment of TSC_ADJUST works as
      expected as long as it stays in the positive range. The direction of the
      adjustment has no influence either. See the lkml link for further analysis.
      
      Yet another proof for the theory that timers are designed by janitors and
      the underlying (obviously undocumented) mechanisms which allow BIOSes to
      wreckage them are considered a feature. Well done Intel - NOT!
      
      To address this wreckage add the following sanity measures:
      
      - If the TSC_ADJUST value on the boot cpu is not 0, set it to 0
      
      - If the TSC_ADJUST value on any cpu is negative, set it to 0
      
      - Prevent the cross package synchronization mechanism from setting negative
        TSC_ADJUST values.
      Reported-and-tested-by: NRoland Scheidegger <rscheidegger_lists@hispeed.ch>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Bruce Schlobohm <bruce.schlobohm@intel.com>
      Cc: Kevin Stanton <kevin.b.stanton@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Allen Hung <allen_hung@dell.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Link: http://lkml.kernel.org/r/20161213131211.397588033@linutronix.deSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      5bae1562
    • T
      x86/tsc: Validate TSC_ADJUST after resume · 6a369583
      Thomas Gleixner 提交于
      Some 'feature' BIOSes fiddle with the TSC_ADJUST register during
      suspend/resume which renders the TSC unusable.
      
      Add sanity checks into the resume path and restore the
      original value if it was adjusted.
      Reported-and-tested-by: NRoland Scheidegger <rscheidegger_lists@hispeed.ch>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Bruce Schlobohm <bruce.schlobohm@intel.com>
      Cc: Kevin Stanton <kevin.b.stanton@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Allen Hung <allen_hung@dell.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Link: http://lkml.kernel.org/r/20161213131211.317654500@linutronix.deSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      6a369583
    • B
      x86/acpi: Use proper macro for invalid node · 4370a3ef
      Boris Ostrovsky 提交于
      Use NUMA_NO_NODE instead of -1.
      Signed-off-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: len.brown@intel.com
      Cc: rjw@rjwysocki.net
      Cc: linux-acpi@vger.kernel.org
      Cc: pavel@ucw.cz
      Link: http://lkml.kernel.org/r/1481570993-13941-1-git-send-email-boris.ostrovsky@oracle.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      4370a3ef