1. 04 5月, 2016 5 次提交
  2. 28 4月, 2016 1 次提交
    • K
      x86/apic: Handle zero vector gracefully in clear_vector_irq() · 1bdb8970
      Keith Busch 提交于
      If x86_vector_alloc_irq() fails x86_vector_free_irqs() is invoked to cleanup
      the already allocated vectors. This subsequently calls clear_vector_irq().
      
      The failed irq has no vector assigned, which triggers the BUG_ON(!vector) in
      clear_vector_irq().
      
      We cannot suppress the call to x86_vector_free_irqs() for the failed
      interrupt, because the other data related to this irq must be cleaned up as
      well. So calling clear_vector_irq() with vector == 0 is legitimate.
      
      Remove the BUG_ON and return if vector is zero,
      
      [ tglx: Massaged changelog ]
      
      Fixes: b5dc8e6c "x86/irq: Use hierarchical irqdomain to manage CPU interrupt vectors"
      Signed-off-by: NKeith Busch <keith.busch@intel.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      1bdb8970
  3. 23 3月, 2016 1 次提交
    • D
      kernel: add kcov code coverage · 5c9a8750
      Dmitry Vyukov 提交于
      kcov provides code coverage collection for coverage-guided fuzzing
      (randomized testing).  Coverage-guided fuzzing is a testing technique
      that uses coverage feedback to determine new interesting inputs to a
      system.  A notable user-space example is AFL
      (http://lcamtuf.coredump.cx/afl/).  However, this technique is not
      widely used for kernel testing due to missing compiler and kernel
      support.
      
      kcov does not aim to collect as much coverage as possible.  It aims to
      collect more or less stable coverage that is function of syscall inputs.
      To achieve this goal it does not collect coverage in soft/hard
      interrupts and instrumentation of some inherently non-deterministic or
      non-interesting parts of kernel is disbled (e.g.  scheduler, locking).
      
      Currently there is a single coverage collection mode (tracing), but the
      API anticipates additional collection modes.  Initially I also
      implemented a second mode which exposes coverage in a fixed-size hash
      table of counters (what Quentin used in his original patch).  I've
      dropped the second mode for simplicity.
      
      This patch adds the necessary support on kernel side.  The complimentary
      compiler support was added in gcc revision 231296.
      
      We've used this support to build syzkaller system call fuzzer, which has
      found 90 kernel bugs in just 2 months:
      
        https://github.com/google/syzkaller/wiki/Found-Bugs
      
      We've also found 30+ bugs in our internal systems with syzkaller.
      Another (yet unexplored) direction where kcov coverage would greatly
      help is more traditional "blob mutation".  For example, mounting a
      random blob as a filesystem, or receiving a random blob over wire.
      
      Why not gcov.  Typical fuzzing loop looks as follows: (1) reset
      coverage, (2) execute a bit of code, (3) collect coverage, repeat.  A
      typical coverage can be just a dozen of basic blocks (e.g.  an invalid
      input).  In such context gcov becomes prohibitively expensive as
      reset/collect coverage steps depend on total number of basic
      blocks/edges in program (in case of kernel it is about 2M).  Cost of
      kcov depends only on number of executed basic blocks/edges.  On top of
      that, kernel requires per-thread coverage because there are always
      background threads and unrelated processes that also produce coverage.
      With inlined gcov instrumentation per-thread coverage is not possible.
      
      kcov exposes kernel PCs and control flow to user-space which is
      insecure.  But debugfs should not be mapped as user accessible.
      
      Based on a patch by Quentin Casasnovas.
      
      [akpm@linux-foundation.org: make task_struct.kcov_mode have type `enum kcov_mode']
      [akpm@linux-foundation.org: unbreak allmodconfig]
      [akpm@linux-foundation.org: follow x86 Makefile layout standards]
      Signed-off-by: NDmitry Vyukov <dvyukov@google.com>
      Reviewed-by: NKees Cook <keescook@chromium.org>
      Cc: syzkaller <syzkaller@googlegroups.com>
      Cc: Vegard Nossum <vegard.nossum@oracle.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Tavis Ormandy <taviso@google.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
      Cc: Kostya Serebryany <kcc@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Kees Cook <keescook@google.com>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Sasha Levin <sasha.levin@oracle.com>
      Cc: David Drysdale <drysdale@google.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Cc: Kirill A. Shutemov <kirill@shutemov.name>
      Cc: Jiri Slaby <jslaby@suse.cz>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      5c9a8750
  4. 19 3月, 2016 1 次提交
  5. 18 3月, 2016 1 次提交
    • T
      x86/irq: Cure live lock in fixup_irqs() · 551adc60
      Thomas Gleixner 提交于
      Harry reported, that he's able to trigger a system freeze with cpu hot
      unplug. The freeze turned out to be a live lock caused by recent changes in
      irq_force_complete_move().
      
      When fixup_irqs() and from there irq_force_complete_move() is called on the
      dying cpu, then all other cpus are in stop machine an wait for the dying cpu
      to complete the teardown. If there is a move of an interrupt pending then
      irq_force_complete_move() sends the cleanup IPI to the cpus in the old_domain
      mask and waits for them to clear the mask. That's obviously impossible as
      those cpus are firmly stuck in stop machine with interrupts disabled.
      
      I should have known that, but I completely overlooked it being concentrated on
      the locking issues around the vectors. And the existance of the call to
      __irq_complete_move() in the code, which actually sends the cleanup IPI made
      it reasonable to wait for that cleanup to complete. That call was bogus even
      before the recent changes as it was just a pointless distraction.
      
      We have to look at two cases:
      
      1) The move_in_progress flag of the interrupt is set
      
         This means the ioapic has been updated with the new vector, but it has not
         fired yet. In theory there is a race:
      
         set_ioapic(new_vector) <-- Interrupt is raised before update is effective,
         			      i.e. it's raised on the old vector. 
      
         So if the target cpu cannot handle that interrupt before the old vector is
         cleaned up, we get a spurious interrupt and in the worst case the ioapic
         irq line becomes stale, but my experiments so far have only resulted in
         spurious interrupts.
      
         But in case of cpu hotplug this should be a non issue because if the
         affinity update happens right before all cpus rendevouz in stop machine,
         there is no way that the interrupt can be blocked on the target cpu because
         all cpus loops first with interrupts enabled in stop machine, so the old
         vector is not yet cleaned up when the interrupt fires.
      
         So the only way to run into this issue is if the delivery of the interrupt
         on the apic/system bus would be delayed beyond the point where the target
         cpu disables interrupts in stop machine. I doubt that it can happen, but at
         least there is a theroretical chance. Virtualization might be able to
         expose this, but AFAICT the IOAPIC emulation is not as stupid as the real
         hardware.
      
         I've spent quite some time over the weekend to enforce that situation,
         though I was not able to trigger the delayed case.
      
      2) The move_in_progress flag is not set and the old_domain cpu mask is not
         empty.
      
         That means, that an interrupt was delivered after the change and the
         cleanup IPI has been sent to the cpus in old_domain, but not all CPUs have
         responded to it yet.
      
      In both cases we can assume that the next interrupt will arrive on the new
      vector, so we can cleanup the old vectors on the cpus in the old_domain cpu
      mask.
      
      Fixes: 98229aa3 "x86/irq: Plug vector cleanup race"
      Reported-by: NHarry Junior <harryjr@outlook.fr>
      Tested-by: NTony Luck <tony.luck@intel.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Joe Lawrence <joe.lawrence@stratus.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Ben Hutchings <ben@decadent.org.uk>
      Cc: stable@vger.kernel.org
      Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1603140931430.3657@nanosSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      551adc60
  6. 08 3月, 2016 2 次提交
  7. 29 2月, 2016 1 次提交
    • T
      x86/topology: Create logical package id · 1f12e32f
      Thomas Gleixner 提交于
      For per package oriented services we must be able to rely on the number of CPU
      packages to be within bounds. Create a tracking facility, which
      
      - calculates the number of possible packages depending on nr_cpu_ids after boot
      
      - makes sure that the package id is within the number of possible packages. If
        the apic id is outside we map it to a logical package id if there is enough
        space available.
      
      Provide interfaces for drivers to query the mapping and do translations from
      physcial to logical ids.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andi Kleen <andi.kleen@intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Harish Chegondi <harish.chegondi@intel.com>
      Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Luis R. Rodriguez <mcgrof@suse.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: linux-kernel@vger.kernel.org
      Link: http://lkml.kernel.org/r/20160222221011.541071755@linutronix.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      1f12e32f
  8. 24 2月, 2016 1 次提交
  9. 30 1月, 2016 1 次提交
    • B
      x86/cpufeature: Replace the old static_cpu_has() with safe variant · bc696ca0
      Borislav Petkov 提交于
      So the old one didn't work properly before alternatives had run.
      And it was supposed to provide an optimized JMP because the
      assumption was that the offset it is jumping to is within a
      signed byte and thus a two-byte JMP.
      
      So I did an x86_64 allyesconfig build and dumped all possible
      sites where static_cpu_has() was used. The optimization amounted
      to all in all 12(!) places where static_cpu_has() had generated
      a 2-byte JMP. Which has saved us a whopping 36 bytes!
      
      This clearly is not worth the trouble so we can remove it. The
      only place where the optimization might count - in __switch_to()
      - we will handle differently. But that's not subject of this
      patch.
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1453842730-28463-6-git-send-email-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      bc696ca0
  10. 19 1月, 2016 1 次提交
  11. 15 1月, 2016 14 次提交
  12. 31 12月, 2015 1 次提交
  13. 20 12月, 2015 1 次提交
  14. 19 12月, 2015 1 次提交
    • H
      x86/apic: Introduce apic_extnmi command line parameter · b7c4948e
      Hidehiro Kawai 提交于
      This patch introduces a command line parameter apic_extnmi:
      
       apic_extnmi=( bsp|all|none )
      
      The default value is "bsp" and this is the current behavior: only the
      Boot-Strapping Processor receives an external NMI.
      
      "all" allows external NMIs to be broadcast to all CPUs. This would
      raise the success rate of panic on NMI when BSP hangs in NMI context
      or the external NMI is swallowed by other NMI handlers on the BSP.
      
      If you specify "none", no CPUs receive external NMIs. This is useful for
      the dump capture kernel so that it cannot be shot down by accidentally
      pressing the external NMI button (on platforms which have it) while
      saving a crash dump.
      Signed-off-by: NHidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
      Acked-by: NMichal Hocko <mhocko@suse.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Bandan Das <bsd@redhat.com>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiang Liu <jiang.liu@linux.intel.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: kexec@lists.infradead.org
      Cc: linux-doc@vger.kernel.org
      Cc: "Maciej W. Rozycki" <macro@linux-mips.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Viresh Kumar <viresh.kumar@linaro.org>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: x86-ml <x86@kernel.org>
      Link: http://lkml.kernel.org/r/20151210014632.25437.43778.stgit@softrsSigned-off-by: NBorislav Petkov <bp@suse.de>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      b7c4948e
  15. 24 11月, 2015 1 次提交
  16. 07 11月, 2015 1 次提交
    • V
      x86/irq: Probe for PIC presence before allocating descs for legacy IRQs · 8c058b0b
      Vitaly Kuznetsov 提交于
      Commit d32932d0 ("x86/irq: Convert IOAPIC to use hierarchical irqdomain
      interfaces") brought a regression for Hyper-V Gen2 instances. These
      instances don't have i8259 legacy PIC but they use legacy IRQs for serial
      port, rtc, and acpi. With this commit included we end up with these IRQs
      not initialized. Earlier, there was a special workaround for legacy IRQs
      in mp_map_pin_to_irq() doing mp_irqdomain_map() without looking at
      nr_legacy_irqs() and now we fail in __irq_domain_alloc_irqs() when
      irq_domain_alloc_descs() returns -EEXIST.
      
      The essence of the issue seems to be that early_irq_init() calls
      arch_probe_nr_irqs() to figure out the number of legacy IRQs before
      we probe for i8259 and gets 16. Later when init_8259A() is called we switch
      to NULL legacy PIC and nr_legacy_irqs() starts to return 0 but we already
      have 16 descs allocated.
      
      Solve the issue by separating i8259 probe from init and calling it in
      arch_probe_nr_irqs() before we actually use nr_legacy_irqs() information.
      
      Fixes: d32932d0 ("x86/irq: Convert IOAPIC to use hierarchical irqdomain interfaces")
      Signed-off-by: NVitaly Kuznetsov <vkuznets@redhat.com>
      Cc: Jiang Liu <jiang.liu@linux.intel.com>
      Cc: K. Y. Srinivasan <kys@microsoft.com>
      Cc: stable@vger.kernel.org
      Link: http://lkml.kernel.org/r/1446543614-3621-1-git-send-email-vkuznets@redhat.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      8c058b0b
  17. 05 11月, 2015 6 次提交