1. 04 May, 2016 8 commits
  2. 28 April, 2016 4 commits
    • perf/x86/intel: Fix incorrect lbr_sel_mask value · cf3beb7c
      Committed by Kan Liang
      This patch fixes a bug which was introduced by:
      
       b16a5b52 ("perf/x86: Add option to disable reading branch flags/cycles")
      
      In this patch, lbr_sel_mask is used to mask the lbr_select. But LBR_SEL_MASK
      doesn't include the bit for LBR_CALL_STACK. So LBR call stack will never be
      set in lbr_select.
      
      This patch corrects the LBR_SEL_MASK by including all valid bits in
      LBR_SELECT. Also, the LBR_CALL_STACK bit differs from the other bits in
      LBR_SELECT. It does not operate in suppress mode, so it needs to be
      specially handled in intel_pmu_setup_hw_lbr_filter.
      Signed-off-by: Kan Liang <kan.liang@intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Link: http://lkml.kernel.org/r/1461231010-4399-1-git-send-email-kan.liang@intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
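      The mask handling described in the commit above can be sketched roughly as
      follows. This is not the kernel code: the bit position, both mask values and
      the function names are assumptions chosen for illustration. It shows why
      the call-stack bit must be excluded from the inversion applied to the
      suppress-mode bits, and why a mask without that bit silently drops it.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative layout, assumed for this sketch only. */
#define LBR_CALL_STACK   (1ULL << 9)  /* enable-mode bit: set = feature on */
#define LBR_SEL_MASK     0x3ffULL     /* all valid bits, call-stack included */
#define LBR_SEL_MASK_OLD 0x1ffULL     /* hypothetical mask lacking that bit */

/*
 * 'wanted' has one bit set per branch class to capture, plus
 * LBR_CALL_STACK when call-stack mode is requested. The other bits
 * operate in suppress mode (set = filter out), so they are inverted;
 * the call-stack bit is passed through unchanged.
 */
static uint64_t lbr_select_from_wanted(uint64_t wanted)
{
	assert((wanted & ~LBR_SEL_MASK) == 0);
	return wanted ^ (LBR_SEL_MASK & ~LBR_CALL_STACK);
}

/* Restrict a select value to the bits a given mask considers valid. */
static uint64_t lbr_apply_mask(uint64_t select, uint64_t mask)
{
	return select & mask;
}
```

      Applying LBR_SEL_MASK_OLD to a value that requests call-stack mode clears
      bit 9, which mirrors the symptom the commit message describes.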
    • perf/x86/intel/pt: Don't die on VMXON · 1c5ac21a
      Committed by Alexander Shishkin
      Some versions of Intel PT do not support tracing across VMXON. More
      specifically, VMXON will clear the TraceEn control bit and any attempt to
      set it before VMXOFF will throw a #GP, which in the current state of
      things will crash the kernel. Namely:
      
        $ perf record -e intel_pt// kvm -nographic
      
      on such a machine will kill it.
      
      To avoid this, notify the intel_pt driver before VMXON and after
      VMXOFF so that it knows when not to enable itself.
      Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Gleb Natapov <gleb@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: hpa@zytor.com
      Link: http://lkml.kernel.org/r/87oa9dwrfk.fsf@ashishki-desk.ger.corp.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • perf/x86/amd: Set the size of event map array to PERF_COUNT_HW_MAX · 0a25556f
      Committed by Adam Borowski
      The entry for PERF_COUNT_HW_REF_CPU_CYCLES is not used on AMD, but is
      referenced by filter_events() which expects undefined events to have a
      value of 0.
      
      Found via UBSAN:
      
        UBSAN: Undefined behaviour in arch/x86/events/amd/core.c:132:30
        index 9 is out of range for type 'u64 [9]'
        UBSAN: Undefined behaviour in arch/x86/events/amd/core.c:132:9
        load of address ffffffff81c021c8 with insufficient space for an object of type 'const u64'
      Signed-off-by: Adam Borowski <kilobyte@angband.pl>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Link: http://lkml.kernel.org/r/1461749731-30979-1-git-send-email-kilobyte@angband.pl
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
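      The effect of sizing the array explicitly can be illustrated with a small
      sketch. The enum, the event codes and lookup_event() below are placeholders
      made up for this example, not the contents of arch/x86/events/amd/core.c.
      Giving the array its full size means any entry without an initializer is
      zero, which is the value the commit says filter_events() expects for
      undefined events.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical event ids; HW_MAX plays the role of PERF_COUNT_HW_MAX. */
enum hw_event {
	HW_CPU_CYCLES,
	HW_INSTRUCTIONS,
	HW_CACHE_REFERENCES,
	HW_REF_CPU_CYCLES,	/* deliberately has no entry in the map */
	HW_MAX
};

/*
 * Declaring the array as [HW_MAX] instead of [] makes it cover every id.
 * Entries that are not listed are implicitly zero-initialized.
 */
static const uint64_t event_map[HW_MAX] = {
	[HW_CPU_CYCLES]       = 0x0076,	/* placeholder event codes */
	[HW_INSTRUCTIONS]     = 0x00c0,
	[HW_CACHE_REFERENCES] = 0x0080,
};

/* Bounds-checked lookup; returns 0 for events the map does not define. */
static uint64_t lookup_event(int id)
{
	assert(id >= 0 && id < HW_MAX);
	return event_map[id];
}
```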
    • x86/apic: Handle zero vector gracefully in clear_vector_irq() · 1bdb8970
      Committed by Keith Busch
      If x86_vector_alloc_irq() fails, x86_vector_free_irqs() is invoked to clean up
      the already allocated vectors. This subsequently calls clear_vector_irq().
      
      The failed irq has no vector assigned, which triggers the BUG_ON(!vector) in
      clear_vector_irq().
      
      We cannot suppress the call to x86_vector_free_irqs() for the failed
      interrupt, because the other data related to this irq must be cleaned up as
      well. So calling clear_vector_irq() with vector == 0 is legitimate.
      
      Remove the BUG_ON and return if vector is zero.
      
      [ tglx: Massaged changelog ]
      
      Fixes: b5dc8e6c "x86/irq: Use hierarchical irqdomain to manage CPU interrupt vectors"
      Signed-off-by: Keith Busch <keith.busch@intel.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
  3. 27 April, 2016 5 commits
    • ARC: add support for reserved memory defined by device tree · 1b10cb21
      Committed by Alexey Brodkin
      Enable reserved memory initialization from device tree.
      Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
      Cc: Grant Likely <grant.likely@linaro.org>
      Cc: Marek Szyprowski <m.szyprowski@samsung.com>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
    • ARC: support generic per-device coherent dma mem · 32ed9a0e
      Committed by Alexey Brodkin
      Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
    • nios2: memset: use the right constraint modifier for the %4 output operand · a8950e49
      Committed by Romain Perier
      Depending on the size of the area to be memset'ed, the nios2 memset implementation
      either uses a naive loop (for buffers of 8 bytes or fewer) or a more optimized
      implementation (for buffers larger than 8 bytes). The optimized implementation does
      4-byte stores rather than 1-byte stores to speed up memset.
      
      However, we discovered that on our nios2 platform, memset() was not properly setting the
      buffer to the expected value. A memset of 0xff would not set the entire buffer to 0xff, but to:
      
      0xff 0x00 0xff 0x00 0xff 0x00 0xff 0x00 ...
      
      Which is obviously incorrect. Our investigation has revealed that the problem lies in the
      incorrect constraints used in the inline assembly.
      
      The following piece of assembly, from the nios2 memset implementation, is supposed to
      create a 4-byte value that repeats 4 times the 1-byte pattern passed as memset argument:
      
      /* fill8 %3, %5 (c & 0xff) */
      "       slli    %4, %5, 8\n"
      "       or      %4, %4, %5\n"
      "       slli    %3, %4, 16\n"
      "       or      %3, %3, %4\n"
      
      However, depending on the compiler and optimization level, this code might be compiled as:
      
      34:	280a923a 	slli	r5,r5,8
      38:	294ab03a 	or	r5,r5,r5
      3c:	2808943a 	slli	r4,r5,16
      40:	2148b03a 	or	r4,r4,r5
      
      This is wrong because r5 gets used both for %5 and %4, which leads to the final pattern
      stored in r4 to be 0xff00ff00 rather than the expected 0xffffffff.
      
      %4 is defined with the "=r" constraint, i.e. as an output operand. However, as explained in
      http://www.ethernut.de/en/documents/arm-inline-asm.html, this does not prevent gcc from
      using the same register for an output operand (%4) and an input operand (%5). By using the
      constraint modifier '&', we mark %4 as an early-clobber operand, so that its register is not
      shared with any input. With this change, we get the following assembly output:
      
      34:	2810923a 	slli	r8,r5,8
      38:	4150b03a 	or	r8,r8,r5
      3c:	400e943a 	slli	r7,r8,16
      40:	3a0eb03a 	or	r7,r7,r8
      
      Which correctly produces the 0xffffffff pattern when 0xff is passed as the memset() pattern.
      
      It is worth mentioning the observed consequence of this bug: we were hitting the kernel
      BUG() in mm/bootmem.c:__free() that verifies when marking a page as free that it was
      previously marked as occupied (i.e. that the bit was set to 1). The entire bootmem bitmap is
      set to 0xff via a memset() during the bootmem initialization. The bootmem_free() call right
      after the initialization was finding some bits to be set to 0, which didn't make sense since the
      bitmap had just been memset'ed to 0xff. Except that due to the bug explained above, the
      bitmap was in fact initialized to 0xff00ff00.
      
      Thanks to Marek Vasut for his help and feedback.
      Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
      Acked-by: Marek Vasut <marex@denx.de>
      Acked-by: Ley Foon Tan <lftan@altera.com>
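      The register aliasing described in the commit above can be reproduced in
      portable C. The sketch below is not the nios2 inline assembly: it models
      each instruction as a C statement, and the function names are made up for
      illustration. The first function keeps the input (%5) and the temporary (%4)
      in separate variables; the second reuses one variable for both, as the
      miscompiled code did with r5.

```c
#include <assert.h>
#include <stdint.h>

static_assert(sizeof(uint32_t) == 4, "sketch assumes 32-bit words");

/* Intended behaviour: input and temporary live in different registers. */
static uint32_t fill_pattern(uint8_t c)
{
	uint32_t in = c;		/* %5 */
	uint32_t half = in << 8;	/* slli %4, %5, 8  */
	half |= in;			/* or   %4, %4, %5 */
	uint32_t word = half << 16;	/* slli %3, %4, 16 */
	word |= half;			/* or   %3, %3, %4 */
	return word;
}

/* Miscompiled behaviour: r5 serves as both %4 and %5. */
static uint32_t fill_pattern_aliased(uint8_t c)
{
	uint32_t r5 = c;
	r5 = r5 << 8;			/* slli r5, r5, 8: the input byte is lost */
	r5 = r5 | r5;			/* or   r5, r5, r5: no effect */
	uint32_t r4 = r5 << 16;		/* slli r4, r5, 16 */
	r4 |= r5;			/* or   r4, r4, r5 */
	return r4;
}
```

      For an input of 0xff the first function yields 0xffffffff and the second
      yields 0xff00ff00, the two values quoted in the commit message.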
    • powerpc: wire up preadv2 and pwritev2 syscalls · d701cca6
      Committed by Rui Salvaterra
      Wire up preadv2/pwritev2 in the same way as preadv/pwritev. Fixes two
      build warnings on ppc64.
      
      mpe: Lightly tested with fio (slightly hacked to add the syscall
      wrappers):
      
        fio-4217  [009] ....  1304.635300: sys_preadv2(fd: 3, vec:
        10025821de0, vlen: 1, pos_l: 6253000, pos_h: 0, flags: 1)
        fio-4217  [009] ....  1304.635474: sys_preadv2 -> 0x1000
      Signed-off-by: Rui Salvaterra <rsalvaterra@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • Revert "x86/mm/32: Set NX in __supported_pte_mask before enabling paging" · e16d8a6c
      Committed by Andy Lutomirski
      This reverts commit 320d25b6.
      
      This change was problematic for a couple of reasons:
      
      1. It missed some entry points (Xen things and 64-bit native).
      
      2. The entry it changed can be executed more than once.  This isn't
         really a problem, but it conflated per-cpu state setup and global
         state setup.
      
      3. It broke 64-bit non-NX.  64-bit non-NX worked the other way around from
         32-bit -- __supported_pte_mask had NX set initially and was *cleared*
         in x86_configure_nx.  With the patch applied, it never got cleared.
      Reported-and-tested-by: Meelis Roos <mroos@linux.ee>
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/59bd15f7f4b56b633a611b7f70876c6d2ad01a98.1461685884.git.luto@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  4. 24 April, 2016 1 commit
  5. 23 April, 2016 3 commits
    • perf/x86/intel/rapl: Add missing Haswell model · e1089602
      Committed by Srinivas Pandruvada
      Added one missing Haswell model.
      Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: bp@alien8.de
      Cc: hpa@zytor.com
      Link: http://lkml.kernel.org/r/1460907809-11897-1-git-send-email-srinivas.pandruvada@linux.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • perf/x86/intel: Add model number for Skylake Server to perf · b89c1737
      Committed by Andi Kleen
      Everything the same as base Skylake, just a new model number.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Link: http://lkml.kernel.org/r/1460751933-2264-1-git-send-email-andi@firstfloor.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • xen/qspinlock: Don't kick CPU if IRQ is not initialized · 707e59ba
      Committed by Ross Lagerwall
      The following commit:
      
        1fb3a8b2 ("xen/spinlock: Fix locking path engaging too soon under PVHVM.")
      
      ... moved the initialization of the kicker interrupt until after
      native_cpu_up() is called.
      
      However, when using qspinlocks, a CPU may try to kick another CPU that is
      spinning (because it has not yet initialized its kicker interrupt), resulting
      in the following crash during boot:
      
        kernel BUG at /build/linux-Ay7j_C/linux-4.4.0/drivers/xen/events/events_base.c:1210!
        invalid opcode: 0000 [#1] SMP
        ...
        RIP: 0010:[<ffffffff814c97c9>]  [<ffffffff814c97c9>] xen_send_IPI_one+0x59/0x60
        ...
        Call Trace:
         [<ffffffff8102be9e>] xen_qlock_kick+0xe/0x10
         [<ffffffff810cabc2>] __pv_queued_spin_unlock+0xb2/0xf0
         [<ffffffff810ca6d1>] ? __raw_callee_save___pv_queued_spin_unlock+0x11/0x20
         [<ffffffff81052936>] ? check_tsc_warp+0x76/0x150
         [<ffffffff81052aa6>] check_tsc_sync_source+0x96/0x160
         [<ffffffff81051e28>] native_cpu_up+0x3d8/0x9f0
         [<ffffffff8102b315>] xen_hvm_cpu_up+0x35/0x80
         [<ffffffff8108198c>] _cpu_up+0x13c/0x180
         [<ffffffff81081a4a>] cpu_up+0x7a/0xa0
         [<ffffffff81f80dfc>] smp_init+0x7f/0x81
         [<ffffffff81f5a121>] kernel_init_freeable+0xef/0x212
         [<ffffffff81817f30>] ? rest_init+0x80/0x80
         [<ffffffff81817f3e>] kernel_init+0xe/0xe0
         [<ffffffff8182488f>] ret_from_fork+0x3f/0x70
         [<ffffffff81817f30>] ? rest_init+0x80/0x80
      
      To fix this, only send the kick if the target CPU's interrupt has been
      initialized. This check isn't racy, because the target is waiting for
      the spinlock, so it won't have initialized the interrupt in the
      meantime.
      Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
      Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: David Vrabel <david.vrabel@citrix.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Cc: xen-devel@lists.xenproject.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
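      The guard described in the commit above amounts to checking a per-CPU
      interrupt number against a "not initialized" sentinel before sending the
      kick. The sketch below is a user-space model under assumptions: the array,
      the -1 sentinel, the counter standing in for the IPI and all function names
      are illustrative and do not come from the Xen code.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4

/* -1 means the CPU has not set up its kicker interrupt yet (assumed sentinel). */
static int lock_kicker_irq[NR_CPUS] = { -1, -1, -1, -1 };
static int kicks_sent;	/* stands in for actually sending an IPI */

/* Called once a CPU has bound its kicker interrupt. */
static void init_kicker_irq(int cpu, int irq)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	lock_kicker_irq[cpu] = irq;
}

/* Kick a waiting CPU only if it is able to receive the kick. */
static bool qlock_kick(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	if (lock_kicker_irq[cpu] == -1)
		return false;	/* target still spinning without an IRQ: skip */
	kicks_sent++;
	return true;
}
```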
  6. 22 April, 2016 3 commits
    • ARCv2: Enable LOCKDEP · d9676fa1
      Committed by Evgeny Voevodin
      - The asm helpers for calling into irq tracer were missing
      
      - Add calls to above helpers in low level assembly entry code for ARCv2
      
      - irq_save() uses CLRI to disable interrupts and returns the prev interrupt
        state (in STATUS32) in a specific encoding (and not the raw value of
        STATUS32). This is usable with SETI in irq_restore(). However
        save_flags() reads the raw value of STATUS32 which doesn't pair with
        irq_save/restore() and thus needs fixing.
      Signed-off-by: Evgeny Voevodin <evgeny.voevodin@intel.com>
      [vgupta: updated changelog and also added some comments]
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
    • x86/mm/xen: Suppress hugetlbfs in PV guests · 103f6112
      Committed by Jan Beulich
      Huge pages are not normally available to PV guests. Not suppressing
      hugetlbfs use results in an endless loop of page faults when user mode
      code tries to access a hugetlbfs mapped area (since the hypervisor
      refuses to create such PTEs, but error indications can't be
      propagated out of xen_set_pte_at(), just like for various of its
      siblings), and - once killed - in an oops like this:
      
        kernel BUG at .../fs/hugetlbfs/inode.c:428!
        invalid opcode: 0000 [#1] SMP
        ...
        RIP: e030:[<ffffffff811c333b>]  [<ffffffff811c333b>] remove_inode_hugepages+0x25b/0x320
        ...
        Call Trace:
         [<ffffffff811c3415>] hugetlbfs_evict_inode+0x15/0x40
         [<ffffffff81167b3d>] evict+0xbd/0x1b0
         [<ffffffff8116514a>] __dentry_kill+0x19a/0x1f0
         [<ffffffff81165b0e>] dput+0x1fe/0x220
         [<ffffffff81150535>] __fput+0x155/0x200
         [<ffffffff81079fc0>] task_work_run+0x60/0xa0
         [<ffffffff81063510>] do_exit+0x160/0x400
         [<ffffffff810637eb>] do_group_exit+0x3b/0xa0
         [<ffffffff8106e8bd>] get_signal+0x1ed/0x470
         [<ffffffff8100f854>] do_signal+0x14/0x110
         [<ffffffff810030e9>] prepare_exit_to_usermode+0xe9/0xf0
         [<ffffffff814178a5>] retint_user+0x8/0x13
      
      This is CVE-2016-3961 / XSA-174.
      Reported-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Signed-off-by: Jan Beulich <jbeulich@suse.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: David Vrabel <david.vrabel@citrix.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Juergen Gross <JGross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Luis R. Rodriguez <mcgrof@suse.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: stable@vger.kernel.org
      Cc: xen-devel <xen-devel@lists.xenproject.org>
      Link: http://lkml.kernel.org/r/57188ED802000078000E431C@prv-mh.provo.novell.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • arm64: Fix EL1/EL2 early init inconsistencies with VHE · 882416c1
      Committed by Dave Martin
      When using the Virtualisation Host Extensions, EL1 is not used in
      the host and requires no separate configuration.
      
      In addition, with VHE enabled, non-hyp-specific EL2 configuration
      that does not need to be done early will be done anyway in
      __cpu_setup via the _EL1 system register aliases.  In particular,
      the layout and definition of CPTR_EL2 are changed by enabling VHE
      so that they resemble CPACR_EL1, so existing code to initialise
      CPTR_EL2 becomes architecturally wrong in this case.
      
      This patch simply skips the affected initialisation code in the
      VHE case.
      Signed-off-by: Dave Martin <Dave.Martin@arm.com>
      Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
  7. 21 April, 2016 2 commits
    • s390/mm: fix asce_bits handling with dynamic pagetable levels · 723cacbd
      Committed by Gerald Schaefer
      There is a race with multi-threaded applications between context switch and
      pagetable upgrade. In switch_mm() a new user_asce is built from mm->pgd and
      mm->context.asce_bits, w/o holding any locks. A concurrent mmap with a
      pagetable upgrade on another thread in crst_table_upgrade() could already
      have set new asce_bits, but not yet the new mm->pgd. This would result in a
      corrupt user_asce in switch_mm(), and eventually in a kernel panic from a
      translation exception.
      
      Fix this by storing the complete asce instead of just the asce_bits, which
      can then be read atomically from switch_mm(), so that it either sees the
      old value or the new value, but no mixture. Both cases are OK. Having the
      old value would result in a page fault on access to the higher level memory,
      but the fault handler would see the new mm->pgd, if it was a valid access
      after the mmap on the other thread has completed. So as worst-case scenario
      we would have a page fault loop for the racing thread until the next time
      slice.
      
      Also remove dead code and simplify the upgrade/downgrade path: there are no
      upgrades from 2 levels, and only downgrades from 3 levels for compat tasks.
      There are also no concurrent upgrades, because the mmap_sem is held with
      down_write() in do_mmap, so the flush and table checks during upgrade can
      be removed.
      Reported-by: Michael Munday <munday@ca.ibm.com>
      Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
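      The idea of publishing the table origin and its bits as one value, so that
      a reader sees either the old pair or the new pair, can be sketched with C11
      atomics. This is a user-space model, not the s390 code: the struct, the
      12-bit split between address and flag bits, and the function names are
      assumptions made for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Assumed for this sketch: the low 12 bits hold the type/flag bits. */
#define ASCE_BITS_MASK 0xfffULL

struct mm_ctx {
	_Atomic uint64_t asce;	/* table origin and bits combined in one word */
};

/* Writer side: publish origin and bits with a single atomic store. */
static void mm_set_asce(struct mm_ctx *mm, uint64_t pgd_addr, uint64_t bits)
{
	assert((pgd_addr & ASCE_BITS_MASK) == 0);
	assert((bits & ~ASCE_BITS_MASK) == 0);
	atomic_store(&mm->asce, pgd_addr | bits);
}

/* Reader side: one atomic load, so the two parts can never be mixed. */
static uint64_t mm_load_asce(struct mm_ctx *mm)
{
	return atomic_load(&mm->asce);
}
```

      A reader that instead combined two separately stored fields could observe
      new bits together with the old origin, which is the race the commit
      describes.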
    • s390/pci: fix use after free in dma_init · dba59909
      Committed by Sebastian Ott
      After a failure during registration of the dma_table (because of the
      function being in error state) we free its memory but don't reset the
      associated pointer to zero.
      
      When we then receive a notification from firmware (about the function
      being in error state) we'll try to walk and free the dma_table again.
      
      Fix this by resetting the dma_table pointer. In addition to that make
      sure that we free the iommu_bitmap when appropriate.
      Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
      Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
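      The pattern behind this fix, resetting a pointer after freeing it so that
      a later cleanup pass finds nothing to free, can be sketched in user-space
      C. The struct, the allocation sizes, the register_fails flag and the
      function names are hypothetical; only the free-then-reset idea is taken
      from the commit message.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-function device state. */
struct zdev {
	unsigned long *dma_table;
	unsigned long *iommu_bitmap;
};

/* Safe to call more than once: freed pointers are reset to NULL. */
static void dma_cleanup(struct zdev *z)
{
	assert(z);
	free(z->dma_table);
	z->dma_table = NULL;
	free(z->iommu_bitmap);
	z->iommu_bitmap = NULL;
}

/* 'register_fails' simulates the table registration step failing. */
static int dma_init(struct zdev *z, int register_fails)
{
	assert(z);
	z->dma_table = calloc(16, sizeof(*z->dma_table));
	z->iommu_bitmap = calloc(16, sizeof(*z->iommu_bitmap));
	if (!z->dma_table || !z->iommu_bitmap || register_fails) {
		dma_cleanup(z);	/* frees both and leaves no dangling pointer */
		return -1;
	}
	return 0;
}
```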
  8. 20 April, 2016 6 commits
  9. 19 April, 2016 1 commit
  10. 18 April, 2016 4 commits
    • arm64: fix invalidation of wrong __early_cpu_boot_status cacheline · adb49070
      Committed by Ard Biesheuvel
      In head.S, the str_l macro, which takes a source register, a symbol name
      and a temp register, is used to store a status value to the variable
      __early_cpu_boot_status. Subsequently, the value of the temp register is
      reused to invalidate any cachelines covering this variable.
      
      However, since str_l resolves to
      
            adrp    \tmp, \sym
            str     \src, [\tmp, :lo12:\sym]
      
      the temp register never actually holds the address of the variable but
      only of the 4 KB window that covers it, and reusing it leads to the
      wrong cacheline being invalidated. So instead, take the address
      explicitly before doing the store, and reuse that value to perform
      the cache invalidation.
      
      Fixes: bb905274 ("arm64: Handle early CPU boot failures")
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Acked-by: Mark Rutland <mark.rutland@arm.com>
      Acked-by: Suzuki K Poulose <Suzuki.Poulose@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
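      The address arithmetic behind this bug can be worked through in plain C.
      The helpers below are illustrative only: they model what adrp and the
      :lo12: relocation compute for a symbol address, and the sample address and
      64-byte line size used in the note afterwards are assumptions.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_OFFSET_MASK 0xfffULL	/* adrp works on 4 KB granules */

/* What 'adrp tmp, sym' leaves in tmp: the base of the 4 KB window. */
static uint64_t adrp_page(uint64_t sym)
{
	return sym & ~PAGE_OFFSET_MASK;
}

/* What ':lo12:sym' contributes: the offset inside that window. */
static uint64_t lo12(uint64_t sym)
{
	return sym & PAGE_OFFSET_MASK;
}

/* Base address of the cache line containing 'addr'. */
static uint64_t cacheline_base(uint64_t addr, uint64_t line_size)
{
	assert(line_size != 0 && (line_size & (line_size - 1)) == 0);
	return addr & ~(line_size - 1);
}
```

      For a symbol at 0x40001234 with 64-byte lines, the line that needs
      invalidating starts at 0x40001200, whereas the value left in the temp
      register (0x40001000) selects the first line of the window instead.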
    • powerpc: Update TM user feature bits in scan_features() · 4705e024
      Committed by Anton Blanchard
      We need to update the user TM feature bits (PPC_FEATURE2_HTM and
      PPC_FEATURE2_HTM_NOSC) to mirror what we do with the kernel TM feature
      bit.
      
      At the moment, if firmware reports TM is not available we turn off
      the kernel TM feature bit but leave the userspace ones on. Userspace
      thinks it can execute TM instructions and it dies trying.
      
      This (together with a QEMU patch) fixes PR KVM, which doesn't currently
      support TM.
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc: Update cpu_user_features2 in scan_features() · beff8237
      Committed by Anton Blanchard
      scan_features() updates cpu_user_features but not cpu_user_features2.
      
      Amongst other things, cpu_user_features2 contains the user TM feature
      bits which we must keep in sync with the kernel TM feature bit.
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc: scan_features() updates incorrect bits for REAL_LE · 6997e57d
      Committed by Anton Blanchard
      The REAL_LE feature entry in the ibm_pa_feature struct is missing an MMU
      feature value, meaning all the remaining elements initialise the wrong
      values.
      
      This means instead of checking for byte 5, bit 0, we check for byte 0,
      bit 0, and then we incorrectly set the CPU feature bit as well as MMU
      feature bit 1 and CPU user feature bits 0 and 2 (5).
      
      Checking byte 0 bit 0 (IBM numbering), means we're looking at the
      "Memory Management Unit (MMU)" feature - ie. does the CPU have an MMU.
      In practice that bit is set on all platforms which have the property.
      
      This means we set CPU_FTR_REAL_LE always. In practice that seems not to
      matter because all the modern cpus which have this property also
      implement REAL_LE, and we've never needed to disable it.
      
      We're also incorrectly setting MMU feature bit 1, which is:
      
        #define MMU_FTR_TYPE_8xx		0x00000002
      
      Luckily the only place that looks for MMU_FTR_TYPE_8xx is in Book3E
      code, which can't run on the same cpus as scan_features(). So this also
      doesn't matter in practice.
      
      Finally in the CPU user feature mask, we're setting bits 0 and 2. Bit 2
      is not currently used, and bit 0 is:
      
        #define PPC_FEATURE_PPC_LE		0x00000001
      
      Which says the CPU supports the old style "PPC Little Endian" mode.
      Again this should be harmless in practice as no 64-bit CPUs implement
      that mode.
      
      Fix the code by adding the missing initialisation of the MMU feature.
      
      Also add a comment marking CPU user feature bit 2 (0x4) as reserved. It
      would be unsafe to start using it as old kernels incorrectly set it.
      
      Fixes: 44ae3ab3 ("powerpc: Free up some CPU feature bits by moving out MMU-related features")
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Cc: stable@vger.kernel.org
      [mpe: Flesh out changelog, add comment reserving 0x4]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
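      How one missing value in a positional initializer shifts every later field
      can be shown with a small sketch. The struct layout below is modelled on
      the description in the commit message, and the constant values are
      assumptions for illustration; it is not copied from the kernel source.

```c
#include <assert.h>

#define CPU_FTR_REAL_LE     0x400UL	/* placeholder value */
#define PPC_FEATURE_TRUE_LE 0x2U	/* assumed value of the user feature bit */

struct pa_feature {
	unsigned long cpu_features;	/* CPU_FTR_xxx bit */
	unsigned long mmu_features;	/* MMU_FTR_xxx bit */
	unsigned int  cpu_user_ftrs;	/* PPC_FEATURE_xxx bit */
	unsigned char pabyte;		/* byte number in the property */
	unsigned char pabit;		/* bit number (big-endian) */
	unsigned char invert;		/* if 1, property bit is inverted */
};

/* Missing the mmu_features value: every later value lands one field early. */
static const struct pa_feature real_le_broken =
	{ CPU_FTR_REAL_LE, PPC_FEATURE_TRUE_LE, 5, 0, 0 };

/* With the explicit 0 for mmu_features, the fields line up as intended. */
static const struct pa_feature real_le_fixed =
	{ CPU_FTR_REAL_LE, 0, PPC_FEATURE_TRUE_LE, 5, 0, 0 };

/* Which property byte an entry ends up testing. */
static int pa_feature_byte(const struct pa_feature *f)
{
	assert(f);
	return f->pabyte;
}
```

      In the broken entry the user-feature constant becomes the MMU feature
      value (0x2), the byte number 5 becomes the user feature mask, and byte 0
      bit 0 is the property bit that gets tested. Designated initializers would
      make this kind of slip harder to introduce.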
  11. 16 April, 2016 3 commits
    • x86/hyperv: Avoid reporting bogus NMI status for Gen2 instances · 1e2ae9ec
      Committed by Vitaly Kuznetsov
      Generation 2 instances don't support reporting the NMI status on port 0x61;
      a read from there returns 'ff' and we end up reporting a nonsensical PCI
      error (as there is no PCI bus in these instances) on all NMIs:
      
          NMI: PCI system error (SERR) for reason ff on CPU 0.
          Dazed and confused, but trying to continue
      
      Fix the issue by overriding x86_platform.get_nmi_reason. Use 'booted on
      EFI' flag to detect Gen2 instances.
      Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Cathy Avery <cavery@redhat.com>
      Cc: Haiyang Zhang <haiyangz@microsoft.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: K. Y. Srinivasan <kys@microsoft.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: devel@linuxdriverproject.org
      Link: http://lkml.kernel.org/r/1460728232-31433-1-git-send-email-vkuznets@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • s390: add CPU_BIG_ENDIAN config option · 2fd92273
      Committed by Heiko Carstens
      Make sure that s390 appears to be a big endian machine by defining
      this config option.
      
      Without this s390 appears to be little endian as seen by e.g. the
      recordmcount script: "perl ./scripts/recordmcount.pl "s390" "little"
      "64"".
      This has no practical impact within the script since the endian
      variable is only evaluated for mips. However there are already a
      couple of common code places which evaluate this config option. None
      of them is relevant for s390 currently though.
      
      To avoid any issues in the future (and fix the recordmcount oddity)
      add the new config option.
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
    • s390/spinlock: avoid yield to non existent cpu · 84976952
      Committed by Heiko Carstens
      arch_spin_lock_wait_flags() checks if a spinlock is not held before
      trying a compare and swap instruction. If the lock is unlocked it
      tries the compare and swap instruction, however if a different cpu
      grabbed the lock in the meantime the instruction will fail as
      expected.
      
      Subsequently the arch_spin_lock_wait_flags() incorrectly tries to
      figure out if the cpu that holds the lock is running. However it is
      using the wrong cpu number for this (-1) and then will also yield the
      current cpu to the wrong cpu.
      
      Fix this by adding a missing continue statement.
      
      Fixes: 470ada6b ("s390/spinlock: refactor arch_spin_lock_wait[_flags]")
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>