1. 18 9月, 2015 2 次提交
  2. 17 9月, 2015 14 次提交
    • J
      x86/pci/dma: Fix gfp flags for coherent DMA memory allocation · 590f0787
      Junichi Nomura 提交于
      Commit 6894258e reversed the order of gfp_flags adjustment in
      dma_alloc_attrs() for x86 [arch/x86/kernel/pci-dma.c] As a result,
      relevant flags set by dma_alloc_coherent_gfp_flags() are just
      discarded and cause coherent DMA memory allocation failure on some
      devices.
      
      Fixes: 6894258e ("dma-mapping: consolidate dma_{alloc,free}_{attrs,coherent}")
      Signed-off-by: NJun'ichi Nomura <j-nomura@ce.jp.nec.com>
      Tested-by: NTony Luck <tony.luck@intel.com>
      Acked-by: NChristoph Hellwig <hch@lst.de>
      Link: http://lkml.kernel.org/r/20150914073834.GA13077@xzibit.linux.bs1.fc.nec.co.jpSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      590f0787
    • M
      arm/arm64: KVM: Remove 'config KVM_ARM_MAX_VCPUS' · ef748917
      Ming Lei 提交于
      This patch removes config option of KVM_ARM_MAX_VCPUS,
      and like other ARCHs, just choose the maximum allowed
      value from hardware, and follows the reasons:
      
      1) from distribution view, the option has to be
      defined as the max allowed value because it need to
      meet all kinds of virtulization applications and
      need to support most of SoCs;
      
      2) using a bigger value doesn't introduce extra memory
      consumption, and the help text in Kconfig isn't accurate
      because kvm_vpu structure isn't allocated until request
      of creating VCPU is sent from QEMU;
      
      3) the main effect is that the field of vcpus[] in 'struct kvm'
      becomes a bit bigger(sizeof(void *) per vcpu) and need more cache
      lines to hold the structure, but 'struct kvm' is one generic struct,
      and it has worked well on other ARCHs already in this way. Also,
      the world switch frequecy is often low, for example, it is ~2000
      when running kernel building load in VM from APM xgene KVM host,
      so the effect is very small, and the difference can't be observed
      in my test at all.
      
      Cc: Dann Frazier <dann.frazier@canonical.com>
      Signed-off-by: NMing Lei <ming.lei@canonical.com>
      Reviewed-by: NChristoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      ef748917
    • W
      arm64: KVM: Remove all traces of the ThumbEE registers · 34c3faa3
      Will Deacon 提交于
      Although the ThumbEE registers and traps were present in earlier
      versions of the v8 architecture, it was retrospectively removed and so
      we can do the same.
      
      Whilst this breaks migrating a guest started on a previous version of
      the kernel, it is much better to kill these (non existent) registers
      as soon as possible.
      Reviewed-by: NMarc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      [maz: added commend about migration]
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      34c3faa3
    • M
      arm: KVM: Disable virtual timer even if the guest is not using it · 688bc577
      Marc Zyngier 提交于
      When running a guest with the architected timer disabled (with QEMU and
      the kernel_irqchip=off option, for example), it is important to make
      sure the timer gets turned off. Otherwise, the guest may try to
      enable it anyway, leading to a screaming HW interrupt.
      
      The fix is to unconditionally turn off the virtual timer on guest
      exit.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      688bc577
    • M
      arm64: KVM: Disable virtual timer even if the guest is not using it · c4cbba9f
      Marc Zyngier 提交于
      When running a guest with the architected timer disabled (with QEMU and
      the kernel_irqchip=off option, for example), it is important to make
      sure the timer gets turned off. Otherwise, the guest may try to
      enable it anyway, leading to a screaming HW interrupt.
      
      The fix is to unconditionally turn off the virtual timer on guest
      exit.
      
      Cc: stable@vger.kernel.org
      Reviewed-by: NChristoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      c4cbba9f
    • W
      arm64: errata: add module build workaround for erratum #843419 · df057cc7
      Will Deacon 提交于
      Cortex-A53 processors <= r0p4 are affected by erratum #843419 which can
      lead to a memory access using an incorrect address in certain sequences
      headed by an ADRP instruction.
      
      There is a linker fix to generate veneers for ADRP instructions, but
      this doesn't work for kernel modules which are built as unlinked ELF
      objects.
      
      This patch adds a new config option for the erratum which, when enabled,
      builds kernel modules with the mcmodel=large flag. This uses absolute
      addressing for all kernel symbols, thereby removing the use of ADRP as
      a PC-relative form of addressing. The ADRP relocs are removed from the
      module loader so that we fail to load any potentially affected modules.
      
      Cc: <stable@vger.kernel.org>
      Acked-by: NCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      df057cc7
    • W
      arm64: compat: fix vfp save/restore across signal handlers in big-endian · bdec97a8
      Will Deacon 提交于
      When saving/restoring the VFP registers from a compat (AArch32)
      signal frame, we rely on the compat registers forming a prefix of the
      native register file and therefore make use of copy_{to,from}_user to
      transfer between the native fpsimd_state and the compat_vfp_sigframe.
      
      Unfortunately, this doesn't work so well in a big-endian environment.
      Our fpsimd save/restore code operates directly on 128-bit quantities
      (Q registers) whereas the compat_vfp_sigframe represents the registers
      as an array of 64-bit (D) registers. The architecture packs the compat D
      registers into the Q registers, with the least significant bytes holding
      the lower register. Consequently, we need to swap the 64-bit halves when
      converting between these two representations on a big-endian machine.
      
      This patch replaces the __copy_{to,from}_user invocations in our
      compat VFP signal handling code with explicit __put_user loops that
      operate on 64-bit values and swap them accordingly.
      
      Cc: <stable@vger.kernel.org>
      Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      bdec97a8
    • W
      arm64: cpu hotplug: ensure we mask out CPU_TASKS_FROZEN in notifiers · e56d82a1
      Will Deacon 提交于
      We have a couple of CPU hotplug notifiers for resetting the CPU debug
      state to a sane value when a CPU comes online.
      
      This patch ensures that we mask out CPU_TASKS_FROZEN so that we don't
      miss any online events occuring due to suspend/resume.
      Acked-by: NLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      e56d82a1
    • L
      powerpc32: memset: only use dcbz once cache is enabled · 400c47d8
      LEROY Christophe 提交于
      memset() uses instruction dcbz to speed up clearing by not wasting time
      loading cache line with data that will be overwritten.
      Some platform like mpc52xx do no have cache active at startup and
      can therefore not use memset(). Allthough no part of the code
      explicitly uses memset(), GCC may make calls to it.
      
      This patch modifies memset() such that at startup, memset()
      unconditionally skip the optimised bloc that uses dcbz instruction.
      
      Once the initial MMU is set up, in machine_init() we patch memset()
      by replacing this inconditional jump by a NOP
      Tested-by: NThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NChristophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      400c47d8
    • L
      powerpc32: memcpy: only use dcbz once cache is enabled · 1cd03890
      LEROY Christophe 提交于
      memcpy() uses instruction dcbz to speed up copy by not wasting time
      loading cache line with data that will be overwritten.
      Some platform like mpc52xx do no have cache active at startup and
      can therefore not use memcpy(). Allthough no part of the code
      explicitly uses memcpy(), GCC makes calls to it.
      
      This patch modifies memcpy() such that at startup, memcpy()
      unconditionally jumps to generic_memcpy() which doesn't use
      the dcbz instruction.
      
      Once the initial MMU is set up, in machine_init() we patch memcpy()
      by replacing this inconditional jump by a NOP
      Reported-by: NMichal Sojka <sojkam1@fel.cvut.cz>
      Tested-by: NThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NChristophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      1cd03890
    • D
      ARM: 8425/1: kgdb: Don't try to stop the machine when setting breakpoints · 7ae85dc7
      Doug Anderson 提交于
      In (23a4e405 arm: kgdb: Handle read-only text / modules) we moved to
      using patch_text() to set breakpoints so that we could handle the case
      when we had CONFIG_DEBUG_RODATA.  That patch used patch_text().
      Unfortunately, patch_text() assumes that we're not in atomic context
      when it runs since it needs to grab a mutex and also wait for other
      CPUs to stop (which it does with a completion).
      
      This would result in a stack crawl if you had
      CONFIG_DEBUG_ATOMIC_SLEEP and tried to set a breakpoint in kgdb.  The
      crawl looked something like:
      
       BUG: scheduling while atomic: swapper/0/0/0x00010007
       CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.2.0-rc7-00133-geb63b34b #1073
       Hardware name: Rockchip (Device Tree)
        (unwind_backtrace) from [<c00133d4>] (show_stack+0x20/0x24)
        (show_stack) from [<c05400e8>] (dump_stack+0x84/0xb8)
        (dump_stack) from [<c004913c>] (__schedule_bug+0x54/0x6c)
        (__schedule_bug) from [<c054065c>] (__schedule+0x80/0x668)
        (__schedule) from [<c0540cfc>] (schedule+0xb8/0xd4)
        (schedule) from [<c0543a3c>] (schedule_timeout+0x2c/0x234)
        (schedule_timeout) from [<c05417c0>] (wait_for_common+0xf4/0x188)
        (wait_for_common) from [<c0541874>] (wait_for_completion+0x20/0x24)
        (wait_for_completion) from [<c00a0104>] (__stop_cpus+0x58/0x70)
        (__stop_cpus) from [<c00a0580>] (stop_cpus+0x3c/0x54)
        (stop_cpus) from [<c00a06c4>] (__stop_machine+0xcc/0xe8)
        (__stop_machine) from [<c00a0714>] (stop_machine+0x34/0x44)
        (stop_machine) from [<c00173e8>] (patch_text+0x28/0x34)
        (patch_text) from [<c001733c>] (kgdb_arch_set_breakpoint+0x40/0x4c)
        (kgdb_arch_set_breakpoint) from [<c00a0d68>] (kgdb_validate_break_address+0x2c/0x60)
        (kgdb_validate_break_address) from [<c00a0e90>] (dbg_set_sw_break+0x1c/0xdc)
        (dbg_set_sw_break) from [<c00a2e88>] (gdb_serial_stub+0x9c4/0xba4)
        (gdb_serial_stub) from [<c00a11cc>] (kgdb_cpu_enter+0x1f8/0x60c)
        (kgdb_cpu_enter) from [<c00a18cc>] (kgdb_handle_exception+0x19c/0x1d0)
        (kgdb_handle_exception) from [<c0016f7c>] (kgdb_compiled_brk_fn+0x30/0x3c)
        (kgdb_compiled_brk_fn) from [<c00091a4>] (do_undefinstr+0x1a4/0x20c)
        (do_undefinstr) from [<c001400c>] (__und_svc_finish+0x0/0x34)
      
      It turns out that when we're in kgdb all the CPUs are stopped anyway
      so there's no reason we should be calling patch_text().  We can
      instead directly call __patch_text() which assumes that CPUs have
      already been stopped.
      
      Fixes: 23a4e405 ("arm: kgdb: Handle read-only text / modules")
      Reported-by: NAapo Vienamo <avienamo@nvidia.com>
      Signed-off-by: NDouglas Anderson <dianders@chromium.org>
      Reviewed-by: NStephen Boyd <sboyd@codeaurora.org>
      Acked-by: NKees Cook <keescook@chromium.org>
      Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>
      7ae85dc7
    • A
      ARM: 8437/1: dma-mapping: fix build warning with new DMA_ERROR_CODE definition · 90cde558
      Andre Przywara 提交于
      Commit 96231b26: ("ARM: 8419/1: dma-mapping: harmonize definition
      of DMA_ERROR_CODE") changed the definition of DMA_ERROR_CODE to use
      dma_addr_t, which makes the compiler barf on assigning this to an
      "int" variable on ARM with LPAE enabled:
      *************
      In file included from /src/linux/include/linux/dma-mapping.h:86:0,
                       from /src/linux/arch/arm/mm/dma-mapping.c:21:
      /src/linux/arch/arm/mm/dma-mapping.c: In function '__iommu_create_mapping':
      /src/linux/arch/arm/include/asm/dma-mapping.h:16:24: warning:
      overflow in implicit constant conversion [-Woverflow]
       #define DMA_ERROR_CODE (~(dma_addr_t)0x0)
                              ^
      /src/linux/arch/arm/mm/dma-mapping.c:1252:15: note: in expansion of
      macro DMA_ERROR_CODE'
        int i, ret = DMA_ERROR_CODE;
                     ^
      *************
      
      Remove the actually unneeded initialization of "ret" in
      __iommu_create_mapping() and move the variable declaration inside the
      for-loop to make the scope of this variable more clear.
      Signed-off-by: NAndre Przywara <andre.przywara@arm.com>
      Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>
      90cde558
    • R
      ARM: get rid of needless #if in signal handling code · 12fc7306
      Russell King 提交于
      Remove the #if statement which caused trouble for kernels that support
      both ARMv6 and ARMv7.  Older architectures do not implement these bits,
      so it should be safe to always clear them.
      Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>
      12fc7306
    • P
      arm/arm64: KVM: vgic: Check for !irqchip_in_kernel() when mapping resources · c2f58514
      Pavel Fedin 提交于
      Until b26e5fda ("arm/arm64: KVM: introduce per-VM ops"),
      kvm_vgic_map_resources() used to include a check on irqchip_in_kernel(),
      and vgic_v2_map_resources() still has it.
      
      But now vm_ops are not initialized until we call kvm_vgic_create().
      Therefore kvm_vgic_map_resources() can being called without a VGIC,
      and we die because vm_ops.map_resources is NULL.
      
      Fixing this restores QEMU's kernel-irqchip=off option to a working state,
      allowing to use GIC emulation in userspace.
      
      Fixes: b26e5fda ("arm/arm64: KVM: introduce per-VM ops")
      Cc: stable@vger.kernel.org
      Signed-off-by: NPavel Fedin <p.fedin@samsung.com>
      [maz: reworked commit message]
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      c2f58514
  3. 16 9月, 2015 18 次提交
  4. 15 9月, 2015 5 次提交
    • J
      powerpc, irq: Use access helper irq_data_get_affinity_mask() · da92b4eb
      Jiang Liu 提交于
      Use access helper irq_data_get_affinity_mask() so we can move the
      affinity mask to irq_common_data.
      Signed-off-by: NJiang Liu <jiang.liu@linux.intel.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: linuxppc-dev@lists.ozlabs.org
      Link: http://lkml.kernel.org/r/1433145945-789-25-git-send-email-jiang.liu@linux.intel.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      da92b4eb
    • W
      arm64: head.S: initialise mdcr_el2 in el2_setup · d10bcd47
      Will Deacon 提交于
      When entering the kernel at EL2, we fail to initialise the MDCR_EL2
      register which controls debug access and PMU capabilities at EL1.
      
      This patch ensures that the register is initialised so that all traps
      are disabled and all the PMU counters are available to the host. When a
      guest is scheduled, KVM takes care to configure trapping appropriately.
      
      Cc: <stable@vger.kernel.org>
      Acked-by: NMarc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      d10bcd47
    • L
      arm64: enable generic idle loop · 2314ee4d
      Leo Yan 提交于
      Enable generic idle loop for ARM64, so can support for hlt/nohlt
      command line options to override default idle loop behavior.
      Acked-by: NCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: NLeo Yan <leo.yan@linaro.org>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      2314ee4d
    • S
      x86/apic: Serialize LVTT and TSC_DEADLINE writes · 5d7c631d
      Shaohua Li 提交于
      The APIC LVTT register is MMIO mapped but the TSC_DEADLINE register is an
      MSR. The write to the TSC_DEADLINE MSR is not serializing, so it's not
      guaranteed that the write to LVTT has reached the APIC before the
      TSC_DEADLINE MSR is written. In such a case the write to the MSR is
      ignored and as a consequence the local timer interrupt never fires.
      
      The SDM decribes this issue for xAPIC and x2APIC modes. The
      serialization methods recommended by the SDM differ.
      
      xAPIC:
       "1. Memory-mapped write to LVT Timer Register, setting bits 18:17 to 10b.
        2. WRMSR to the IA32_TSC_DEADLINE MSR a value much larger than current time-stamp counter.
        3. If RDMSR of the IA32_TSC_DEADLINE MSR returns zero, go to step 2.
        4. WRMSR to the IA32_TSC_DEADLINE MSR the desired deadline."
      
      x2APIC:
       "To allow for efficient access to the APIC registers in x2APIC mode,
        the serializing semantics of WRMSR are relaxed when writing to the
        APIC registers. Thus, system software should not use 'WRMSR to APIC
        registers in x2APIC mode' as a serializing instruction. Read and write
        accesses to the APIC registers will occur in program order. A WRMSR to
        an APIC register may complete before all preceding stores are globally
        visible; software can prevent this by inserting a serializing
        instruction, an SFENCE, or an MFENCE before the WRMSR."
      
      The xAPIC method is to just wait for the memory mapped write to hit
      the LVTT by checking whether the MSR write has reached the hardware.
      There is no reason why a proper MFENCE after the memory mapped write would
      not do the same. Andi Kleen confirmed that MFENCE is sufficient for the
      xAPIC case as well.
      
      Issue MFENCE before writing to the TSC_DEADLINE MSR. This can be done
      unconditionally as all CPUs which have TSC_DEADLINE also have MFENCE
      support.
      
      [ tglx: Massaged the changelog ]
      Signed-off-by: NShaohua Li <shli@fb.com>
      Reviewed-by: NIngo Molnar <mingo@kernel.org>
      Cc: <Kernel-team@fb.com>
      Cc: <lenb@kernel.org>
      Cc: <fenghua.yu@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: stable@vger.kernel.org #v3.7+
      Link: http://lkml.kernel.org/r/20150909041352.GA2059853@devbig257.prn2.facebook.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      5d7c631d
    • T
      x86/ioapic: Force affinity setting in setup_ioapic_dest() · 4857c91f
      Thomas Gleixner 提交于
      The recent ioapic cleanups changed the affinity setting in
      setup_ioapic_dest() from a direct write to the hardware to the delayed
      affinity setup via irq_set_affinity().
      
      That results in a warning from chained_irq_exit():
      WARNING: CPU: 0 PID: 5 at kernel/irq/migration.c:32 irq_move_masked_irq
      [<ffffffff810a0a88>] irq_move_masked_irq+0xb8/0xc0
      [<ffffffff8103c161>] ioapic_ack_level+0x111/0x130
      [<ffffffff812bbfe8>] intel_gpio_irq_handler+0x148/0x1c0
      
      The reason is that irq_set_affinity() does not write directly to the
      hardware. It marks the affinity setting as pending and executes it
      from the next interrupt. The chained handler infrastructure does not
      take the irq descriptor lock for performance reasons because such a
      chained interrupt is not visible to any interfaces. So the delayed
      affinity setting triggers the warning in irq_move_masked_irq().
      
      Restore the old behaviour by calling the set_affinity function of the
      ioapic chip in setup_ioapic_dest(). This is safe as none of the
      interrupts can be on the fly at this point.
      
      Fixes: aa5cb97f 'x86/irq: Remove x86_io_apic_ops.set_affinity and related interfaces'
      Reported-and-tested-by: NMika Westerberg <mika.westerberg@linux.intel.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Jiang Liu <jiang.liu@linux.intel.com>
      Cc: jarkko.nikula@linux.intel.com
      4857c91f
  5. 14 9月, 2015 1 次提交
    • W
      KVM: arm64: add workaround for Cortex-A57 erratum #852523 · 43297dda
      Will Deacon 提交于
      When restoring the system register state for an AArch32 guest at EL2,
      writes to DACR32_EL2 may not be correctly synchronised by Cortex-A57,
      which can lead to the guest effectively running with junk in the DACR
      and running into unexpected domain faults.
      
      This patch works around the issue by re-ordering our restoration of the
      AArch32 register aliases so that they happen before the AArch64 system
      registers. Ensuring that the registers are restored in this order
      guarantees that they will be correctly synchronised by the core.
      
      Cc: <stable@vger.kernel.org>
      Reviewed-by: NMarc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      43297dda