1. 26 8月, 2016 9 次提交
    • J
      Revert "arm64: hibernate: Refuse to hibernate if the boot cpu is offline" · b2d8b0cb
      James Morse 提交于
      Now that we use the MPIDR to resume on the same CPU that we hibernated on,
      we no longer need to refuse to hibernate if the boot cpu is offline. (Which
      we can't possibly know if kexec causes logical CPUs to be renumbered).
      
      This reverts commit 1fe492ce.
      Signed-off-by: NJames Morse <james.morse@arm.com>
      Acked-by: NCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      b2d8b0cb
    • J
      arm64: hibernate: Resume when hibernate image created on non-boot CPU · 8ec058fd
      James Morse 提交于
      disable_nonboot_cpus() assumes that the lowest numbered online CPU is
      the boot CPU, and that this is the correct CPU to run any power
      management code on.
      
      On arm64 CPU0 can be taken offline. For hibernate/resume this means we
      may hibernate on a CPU other than CPU0. If the system is rebooted with
      kexec 'CPU0' will be assigned to a different CPU. This complicates
      hibernate/resume as now we can't trust the CPU numbers.
      
      We currently forbid hibernate if CPU0 has been hotplugged out to avoid
      this situation without kexec.
      
      Save the MPIDR of the CPU we hibernated on in the hibernate arch-header,
      use hibernate_resume_nonboot_cpu_disable() to direct which CPU we should
      resume on based on the MPIDR of the CPU we hibernated on. This allows us to
      hibernate/resume on any CPU, even if the logical numbers have been
      shuffled by kexec.
      Signed-off-by: NJames Morse <james.morse@arm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      8ec058fd
    • M
      arm64: always enable DEBUG_RODATA and remove the Kconfig option · 40982fd6
      Mark Rutland 提交于
      Follow the example set by x86 in commit 9ccaf77c ("x86/mm:
      Always enable CONFIG_DEBUG_RODATA and remove the Kconfig option"), and
      make these protections a fundamental security feature rather than an
      opt-in. This also results in a minor code simplification.
      
      For those rare cases when users wish to disable this protection (e.g.
      for debugging), this can be done by passing 'rodata=off' on the command
      line.
      
      As DEBUG_RODATA_ALIGN is only intended to address a performance/memory
      tradeoff, and does not affect correctness, this is left user-selectable.
      DEBUG_MODULE_RONX is also left user-selectable until the core code
      provides a boot-time option to disable the protection for debugging
      use-cases.
      
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Acked-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      Acked-by: NKees Cook <keescook@chromium.org>
      Acked-by: NLaura Abbott <labbott@redhat.com>
      Signed-off-by: NMark Rutland <mark.rutland@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      40982fd6
    • A
      arm64: mark reserved memblock regions explicitly in iomem · e7cd1903
      AKASHI Takahiro 提交于
      Kdump(kexec-tools) parses /proc/iomem to identify all the memory regions
      on the system. Since the current kernel names "nomap" regions, like UEFI
      runtime services code/data, as "System RAM," kexec-tools sets up elf core
      header to include them in a crash dump file (/proc/vmcore).
      
      Then crash dump kernel parses UEFI memory map again, re-marks those regions
      as "nomap" and does not create a memory mapping for them unlike the other
      areas of System RAM. In this case, copying /proc/vmcore through
      copy_oldmem_page() on crash dump kernel will end up with a kernel abort,
      as reported in [1].
      
      This patch names all the "nomap" regions explicitly as "reserved" so that
      we can exclude them from a crash dump file. acpi_os_ioremap() must also
      be modified because those regions have WB attributes [2].
      
      Apart from kdump, this change also matches x86's use of acpi (and
      /proc/iomem).
      
      [1] http://lists.infradead.org/pipermail/linux-arm-kernel/2016-August/448186.html
      [2] http://lists.infradead.org/pipermail/linux-arm-kernel/2016-August/450089.htmlReviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
      Tested-by: NJames Morse <james.morse@arm.com>
      Reviewed-by: NJames Morse <james.morse@arm.com>
      Signed-off-by: NAKASHI Takahiro <takahiro.akashi@linaro.org>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      e7cd1903
    • J
      arm64: hibernate: Support DEBUG_PAGEALLOC · 5ebe3a44
      James Morse 提交于
      DEBUG_PAGEALLOC removes the valid bit of page table entries to prevent
      any access to unallocated memory. Hibernate uses this as a hint that those
      pages don't need to be saved/restored. This patch adds the
      kernel_page_present() function it uses.
      
      hibernate.c copies the resume kernel's linear map for use during restore.
      Add _copy_pte() to fill-in the holes made by DEBUG_PAGEALLOC in the resume
      kernel, so we can restore data the original kernel had at these addresses.
      
      Finally, DEBUG_PAGEALLOC means the linear-map alias of KERNEL_START to
      KERNEL_END may have holes in it, so we can't lazily clean this whole
      area to the PoC. Only clean the new mmuoff region, and the kernel/kvm
      idmaps.
      
      This reverts commit da24eb1f.
      Reported-by: NWill Deacon <will.deacon@arm.com>
      Signed-off-by: NJames Morse <james.morse@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      5ebe3a44
    • J
      arm64: vmlinux.ld: Add mmuoff data sections and move mmuoff text into idmap · b6113038
      James Morse 提交于
      Resume from hibernate needs to clean any text executed by the kernel with
      the MMU off to the PoC. Collect these functions together into the
      .idmap.text section as all this code is tightly coupled and also needs
      the same cleaning after resume.
      
      Data is more complicated, secondary_holding_pen_release is written with
      the MMU on, clean and invalidated, then read with the MMU off. In contrast
      __boot_cpu_mode is written with the MMU off, the corresponding cache line
      is invalidated, so when we read it with the MMU on we don't get stale data.
      These cache maintenance operations conflict with each other if the values
      are within a Cache Writeback Granule (CWG) of each other.
      Collect the data into two sections .mmuoff.data.read and .mmuoff.data.write,
      the linker script ensures mmuoff.data.write section is aligned to the
      architectural maximum CWG of 2KB.
      Signed-off-by: NJames Morse <james.morse@arm.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      b6113038
    • J
      arm64: Create sections.h · ee78fdc7
      James Morse 提交于
      Each time new section markers are added, kernel/vmlinux.ld.S is updated,
      and new extern char __start_foo[] definitions are scattered through the
      tree.
      
      Create asm/include/sections.h to collect these definitions (and include
      the existing asm-generic version).
      Signed-off-by: NJames Morse <james.morse@arm.com>
      Reviewed-by: NMark Rutland <mark.rutland@arm.com>
      Tested-by: NMark Rutland <mark.rutland@arm.com>
      Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      ee78fdc7
    • C
      arm64: Introduce execute-only page access permissions · cab15ce6
      Catalin Marinas 提交于
      The ARMv8 architecture allows execute-only user permissions by clearing
      the PTE_UXN and PTE_USER bits. However, the kernel running on a CPU
      implementation without User Access Override (ARMv8.2 onwards) can still
      access such page, so execute-only page permission does not protect
      against read(2)/write(2) etc. accesses. Systems requiring such
      protection must enable features like SECCOMP.
      
      This patch changes the arm64 __P100 and __S100 protection_map[] macros
      to the new __PAGE_EXECONLY attributes. A side effect is that
      pte_user() no longer triggers for __PAGE_EXECONLY since PTE_USER isn't
      set. To work around this, the check is done on the PTE_NG bit via the
      pte_ng() macro. VM_READ is also checked now for page faults.
      Reviewed-by: NWill Deacon <will.deacon@arm.com>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      cab15ce6
    • P
      arm64: kprobe: Always clear pstate.D in breakpoint exception handler · 7419333f
      Pratyush Anand 提交于
      Whenever we are hitting a kprobe from a none-kprobe debug exception handler,
      we hit an infinite occurrences of "Unexpected kernel single-step exception
      at EL1"
      
      PSTATE.D is debug exception mask bit. It is set whenever we enter into an
      exception mode. When it is set then Watchpoint, Breakpoint, and Software
      Step exceptions are masked. However, software Breakpoint Instruction
      exceptions can never be masked. Therefore, if we ever execute a BRK
      instruction, irrespective of D-bit setting, we will be receiving a
      corresponding breakpoint exception.
      
      For example:
      
      - We are executing kprobe pre/post handler, and kprobe has been inserted in
        one of the instruction of a function called by handler. So, it executes
        BRK instruction and we land into the case of KPROBE_REENTER. (This case is
        already handled by current code)
      
      - We are executing uprobe handler or any other BRK handler such as in
        WARN_ON (BRK BUG_BRK_IMM), and we trace that path using kprobe.So, we
        enter into kprobe breakpoint handler,from another BRK handler.(This case
        is not being handled currently)
      
      In all such cases kprobe breakpoint exception will be raised when we were
      already in debug exception mode. SPSR's D bit (bit 9) shows the value of
      PSTATE.D immediately before the exception was taken. So, in above example
      cases we would find it set in kprobe breakpoint handler.  Single step
      exception will always be followed by a kprobe breakpoint exception.However,
      it will only be raised gracefully if we clear D bit while returning from
      breakpoint exception.  If D bit is set then, it results into undefined
      exception and when it's handler enables dbg then single step exception is
      generated, however it will never be handled(because address does not match
      and therefore treated as unexpected).
      
      This patch clears D-flag unconditionally in setup_singlestep, so that we can
      always get single step exception correctly after returning from breakpoint
      exception. Additionally, it also removes D-flag set statement for
      KPROBE_REENTER return path, because debug exception for KPROBE_REENTER will
      always take place in a debug exception state. So, D-flag will already be set
      in this case.
      Acked-by: NSandeepa Prabhu <sandeepa.s.prabhu@gmail.com>
      Acked-by: NMasami Hiramatsu <mhiramat@kernel.org>
      Signed-off-by: NPratyush Anand <panand@redhat.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      7419333f
  2. 22 8月, 2016 10 次提交
  3. 20 8月, 2016 2 次提交
    • H
      parisc: Fix order of EREFUSED define in errno.h · 3eb53b20
      Helge Deller 提交于
      When building gccgo in userspace, errno.h gets parsed and the go include file
      sysinfo.go is generated.
      
      Since EREFUSED is defined to the same value as ECONNREFUSED, and ECONNREFUSED
      is defined later on in errno.h, this leads to go complaining that EREFUSED
      isn't defined yet.
      
      Fix this trivial problem by moving the define of EREFUSED down after
      ECONNREFUSED in errno.h (and clean up the indenting while touching this line).
      Signed-off-by: NHelge Deller <deller@gmx.de>
      Cc: stable@vger.kernel.org
      3eb53b20
    • H
      parisc: Fix automatic selection of cr16 clocksource · ae141830
      Helge Deller 提交于
      Commit 54b66800 (parisc: Add native high-resolution sched_clock()
      implementation) added support to use the CPU-internal cr16 counters as reliable
      clocksource with the help of HAVE_UNSTABLE_SCHED_CLOCK.
      
      Sadly the commit missed to remove the hack which prevented cr16 to become the
      default clocksource even on SMP systems.
      Signed-off-by: NHelge Deller <deller@gmx.de>
      Cc: stable@vger.kernel.org # 4.7+
      ae141830
  4. 18 8月, 2016 5 次提交
    • C
      arm64: Fix shift warning in arch/arm64/mm/dump.c · a93a4d62
      Catalin Marinas 提交于
      When building with 48-bit VAs and 16K page configuration, it's possible
      to get the following warning when building the arm64 page table dumping
      code:
      
      arch/arm64/mm/dump.c: In function ‘walk_pud’:
      arch/arm64/mm/dump.c:274:102: warning: right shift count >= width of type [-Wshift-count-overflow]
      
      This is because pud_offset(pgd, 0) performs a shift to the right by 36
      while the value 0 has the type 'int' by default, therefore 32-bit.
      
      This patch modifies all the p*_offset() uses in arch/arm64/mm/dump.c to
      use 0UL for the address argument.
      Acked-by: NMark Rutland <mark.rutland@arm.com>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      a93a4d62
    • J
      x86/smp: Fix __max_logical_packages value setup · 7b0501b1
      Jiri Olsa 提交于
      Frank reported kernel panic when he disabled several cores in BIOS
      via following option:
      
        Core Disable Bitmap(Hex)   [0]
      
      with number 0xFFE, which leaves 16 CPUs in system (out of 48).
      
      The kernel panic below goes along with following messages:
      
       smpboot: Max logical packages: 2^M
       smpboot: APIC(0) Converting physical 0 to logical package 0^M
       smpboot: APIC(20) Converting physical 1 to logical package 1^M
       smpboot: APIC(40) Package 2 exceeds logical package map^M
       smpboot: CPU 8 APICId 40 disabled^M
       smpboot: APIC(60) Package 3 exceeds logical package map^M
       smpboot: CPU 12 APICId 60 disabled^M
       ...
       general protection fault: 0000 [#1] SMP^M
       Modules linked in:^M
       CPU: 15 PID: 1 Comm: swapper/0 Not tainted 4.7.0-rc5+ #1^M
       Hardware name: SGI UV300/UV300, BIOS SGI UV 300 series BIOS 05/25/2016^M
       task: ffff8801673e0000 ti: ffff8801673ac000 task.ti: ffff8801673ac000^M
       RIP: 0010:[<ffffffff81014d54>]  [<ffffffff81014d54>] uncore_change_context+0xd4/0x180^M
       ...
        [<ffffffff810158ac>] uncore_event_init_cpu+0x6c/0x70^M
        [<ffffffff81d8c91c>] intel_uncore_init+0x1c2/0x2dd^M
        [<ffffffff81d8c75a>] ? uncore_cpu_setup+0x17/0x17^M
        [<ffffffff81002190>] do_one_initcall+0x50/0x190^M
        [<ffffffff810ab193>] ? parse_args+0x293/0x480^M
        [<ffffffff81d87365>] kernel_init_freeable+0x1a5/0x249^M
        [<ffffffff81d86a35>] ? set_debug_rodata+0x12/0x12^M
        [<ffffffff816dc19e>] kernel_init+0xe/0x110^M
        [<ffffffff816e93bf>] ret_from_fork+0x1f/0x40^M
        [<ffffffff816dc190>] ? rest_init+0x80/0x80^M
      
      The reason for the panic is wrong value of __max_logical_packages,
      which lets logical_package_map uninitialized and the uncore code
      relying on this map being properly initialized (maybe we should
      add some safety checks there as well).
      
      The __max_logical_packages is computed as:
      
        DIV_ROUND_UP(total_cpus, ncpus);
        - ncpus being number of cores
      
      With above BIOS setup we get total_cpus == 16 which set
      __max_logical_packages to 2 (ncpus is 12).
      
      Once topology_update_package_map processes CPU with logical
      pkg over 2 we display above messages and fail to initialize
      the physical_to_logical_pkg map, which makes the uncore code
      crash.
      
      The fix is to remove logical_package_map bitmap completely
      and keep and update the logical_packages number instead.
      
      After we enumerate all the present CPUs, we check if the
      enumerated logical packages count is within its computed
      maximum from BIOS data.
      
      If it's not the case, we set this maximum to the new enumerated
      value and freeze any new addition of logical packages.
      
      The freeze is because lot of init code like uncore/rapl/cqm
      depends on having maximum logical package value set to allocate
      their data, so we can't change it later on.
      
      Prarit Bhargava tested the patch and confirms that it solves
      the problem:
      
        From dmidecode:
                Core Count: 24
                Core Enabled: 24
                Thread Count: 48
      
      Orig kernel boot log:
      
       [    0.464981] smpboot: Max logical packages: 19
       [    0.469861] smpboot: APIC(0) Converting physical 0 to logical package 0
       [    0.477261] smpboot: APIC(40) Converting physical 1 to logical package 1
       [    0.484760] smpboot: APIC(80) Converting physical 2 to logical package 2
       [    0.492258] smpboot: APIC(c0) Converting physical 3 to logical package 3
      
      1.  nr_cpus=8, should stop enumerating in package 0:
      
       [    0.533664] smpboot: APIC(0) Converting physical 0 to logical package 0
       [    0.539596] smpboot: Max logical packages: 19
      
      2.  max_cpus=8, should still enumerate all packages:
      
       [    0.526494] smpboot: APIC(0) Converting physical 0 to logical package 0
       [    0.532428] smpboot: APIC(40) Converting physical 1 to logical package 1
       [    0.538456] smpboot: APIC(80) Converting physical 2 to logical package 2
       [    0.544486] smpboot: APIC(c0) Converting physical 3 to logical package 3
       [    0.550524] smpboot: Max logical packages: 19
      
      3.  nr_cpus=49 ( 2 socket + 1 core on 3rd socket), should stop enumerating in
          package 2:
      
       [    0.521378] smpboot: APIC(0) Converting physical 0 to logical package 0
       [    0.527314] smpboot: APIC(40) Converting physical 1 to logical package 1
       [    0.533345] smpboot: APIC(80) Converting physical 2 to logical package 2
       [    0.539368] smpboot: Max logical packages: 19
      
      4.  maxcpus=49, should still enumerate all packages:
      
       [    0.525591] smpboot: APIC(0) Converting physical 0 to logical package 0
       [    0.531525] smpboot: APIC(40) Converting physical 1 to logical package 1
       [    0.537547] smpboot: APIC(80) Converting physical 2 to logical package 2
       [    0.543579] smpboot: APIC(c0) Converting physical 3 to logical package 3
       [    0.549624] smpboot: Max logical packages: 19
      
      5.  kdump (nr_cpus=1) works as well.
      Reported-by: NFrank Ramsay <framsay@redhat.com>
      Tested-by: NPrarit Bhargava <prarit@redhat.com>
      Signed-off-by: NJiri Olsa <jolsa@kernel.org>
      Reviewed-by: NPrarit Bhargava <prarit@redhat.com>
      Acked-by: NPeter Zijlstra <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20160815101700.GA30090@kravaSigned-off-by: NIngo Molnar <mingo@kernel.org>
      7b0501b1
    • B
      x86/microcode/AMD: Fix initrd loading with CONFIG_RANDOMIZE_MEMORY=y · 88b2f634
      Borislav Petkov 提交于
      Similar to:
      
        efaad554 ("x86/microcode/intel: Fix initrd loading with CONFIG_RANDOMIZE_MEMORY=y")
      
      ... fix microcode loading from the initrd on AMD by adding the
      randomization offset to the microcode patch container within the initrd.
      Reported-and-tested-by: NBrian Gerst <brgerst@gmail.com>
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-tip-commits@vger.kernel.org
      Link: http://lkml.kernel.org/r/20160817113314.GA19221@nazgul.tnicSigned-off-by: NIngo Molnar <mingo@kernel.org>
      88b2f634
    • A
      arm64: kernel: avoid literal load of virtual address with MMU off · bc9f3d77
      Ard Biesheuvel 提交于
      Literal loads of virtual addresses are subject to runtime relocation when
      CONFIG_RELOCATABLE=y, and given that the relocation routines run with the
      MMU and caches enabled, literal loads of relocated values performed with
      the MMU off are not guaranteed to return the latest value unless the
      memory covering the literal is cleaned to the PoC explicitly.
      
      So defer the literal load until after the MMU has been enabled, just like
      we do for primary_switch() and secondary_switch() in head.S.
      
      Fixes: 1e48ef7f ("arm64: add support for building vmlinux as a relocatable PIE binary")
      Cc: <stable@vger.kernel.org> # 4.6+
      Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      Acked-by: NMark Rutland <mark.rutland@arm.com>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      bc9f3d77
    • C
      arm64: Fix NUMA build error when !CONFIG_ACPI · bfe6c8a8
      Catalin Marinas 提交于
      Since asm/acpi.h is only included by linux/acpi.h when CONFIG_ACPI is
      enabled, disabling the latter leads to the following build error on
      arm64:
      
      arch/arm64/mm/numa.c: In function ‘arm64_numa_init’:
      arch/arm64/mm/numa.c:395:24: error: ‘arm64_acpi_numa_init’ undeclared (first use in this function)
         if (!acpi_disabled && !numa_init(arm64_acpi_numa_init))
      
      This patch include the asm/acpi.h explicitly in arch/arm64/mm/numa.c for
      the arm64_acpi_numa_init() definition.
      
      Fixes: d8b47fca ("arm64, ACPI, NUMA: NUMA support based on SRAT and SLIT")
      Reviewed-by: NHanjun Guo <hanjun.guo@linaro.org>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      bfe6c8a8
  5. 16 8月, 2016 1 次提交
  6. 13 8月, 2016 7 次提交
    • G
      h8300: Add missing include file to asm/io.h · 2b05980d
      Guenter Roeck 提交于
      h8300 builds fail with
      
      arch/h8300/include/asm/io.h:9:15: error: unknown type name ‘u8’
      arch/h8300/include/asm/io.h:15:15: error: unknown type name ‘u16’
      arch/h8300/include/asm/io.h:21:15: error: unknown type name ‘u32’
      
      and many related errors.
      
      Fixes: 23c82d41bdf4 ("kexec-allow-architectures-to-override-boot-mapping-fix")
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NGuenter Roeck <linux@roeck-us.net>
      2b05980d
    • G
      unicore32: mm: Add missing parameter to arch_vma_access_permitted · 783011b1
      Guenter Roeck 提交于
      unicore32 fails to compile with the following errors.
      
      mm/memory.c: In function ‘__handle_mm_fault’:
      mm/memory.c:3381: error:
      	too many arguments to function ‘arch_vma_access_permitted’
      mm/gup.c: In function ‘check_vma_flags’:
      mm/gup.c:456: error:
      	too many arguments to function ‘arch_vma_access_permitted’
      mm/gup.c: In function ‘vma_permits_fault’:
      mm/gup.c:640: error:
      	too many arguments to function ‘arch_vma_access_permitted’
      
      Fixes: d61172b4 ("mm/core, x86/mm/pkeys: Differentiate instruction fetches")
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: NGuenter Roeck <linux@roeck-us.net>
      Acked-by: NGuan Xuetao <gxt@mprc.pku.edu.cn>
      783011b1
    • M
      arm64: defconfig: enable CONFIG_LOCALVERSION_AUTO · 53fb45d3
      Masahiro Yamada 提交于
      When CONFIG_LOCALVERSION_AUTO is disabled, the version string is
      just a tag name (or with a '+' appended if HEAD is not a tagged
      commit).
      
      During the development (and especially when git-bisecting), longer
      version string would be helpful to identify the commit we are running.
      
      This is a default y option, so drop the unset to enable it.
      Signed-off-by: NMasahiro Yamada <yamada.masahiro@socionext.com>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      53fb45d3
    • R
      arm64: defconfig: add options for virtualization and containers · 2323439f
      Riku Voipio 提交于
      Enable options commonly needed by popular virtualization
      and container applications. Use modules when possible to
      avoid too much overhead for users not interested.
      
      - add namespace and cgroup options needed
      - add seccomp - optional, but enhances Qemu etc
      - bridge, nat, veth, macvtap and multicast for routing
        guests and containers
      - btfrs and overlayfs modules for container COW backends
      - while near it, make fuse a module instead of built-in.
      
      Generated with make saveconfig and dropping unrelated spurious
      change hunks while commiting. bloat-o-meter old-vmlinux vmlinux:
      
      add/remove: 905/390 grow/shrink: 767/229 up/down: 183513/-94861 (88652)
      ....
      Total: Before=10515408, After=10604060, chg +0.84%
      Signed-off-by: NRiku Voipio <riku.voipio@linaro.org>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      2323439f
    • M
      arm64: hibernate: handle allocation failures · dfbca61a
      Mark Rutland 提交于
      In create_safe_exec_page(), we create a copy of the hibernate exit text,
      along with some page tables to map this via TTBR0. We then install the
      new tables in TTBR0.
      
      In swsusp_arch_resume() we call create_safe_exec_page() before trying a
      number of operations which may fail (e.g. copying the linear map page
      tables). If these fail, we bail out of swsusp_arch_resume() and return
      an error code, but leave TTBR0 as-is. Subsequently, the core hibernate
      code will call free_basic_memory_bitmaps(), which will free all of the
      memory allocations we made, including the page tables installed in
      TTBR0.
      
      Thus, we may have TTBR0 pointing at dangling freed memory for some
      period of time. If the hibernate attempt was triggered by a user
      requesting a hibernate test via the reboot syscall, we may return to
      userspace with the clobbered TTBR0 value.
      
      Avoid these issues by reorganising swsusp_arch_resume() such that we
      have no failure paths after create_safe_exec_page(). We also add a check
      that the zero page allocation succeeded, matching what we have for other
      allocations.
      
      Fixes: 82869ac5 ("arm64: kernel: Add support for hibernate/suspend-to-disk")
      Signed-off-by: NMark Rutland <mark.rutland@arm.com>
      Acked-by: NJames Morse <james.morse@arm.com>
      Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: <stable@vger.kernel.org> # 4.7+
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      dfbca61a
    • M
      arm64: hibernate: avoid potential TLB conflict · 0194e760
      Mark Rutland 提交于
      In create_safe_exec_page we install a set of global mappings in TTBR0,
      then subsequently invalidate TLBs. While TTBR0 points at the zero page,
      and the TLBs should be free of stale global entries, we may have stale
      ASID-tagged entries (e.g. from the EFI runtime services mappings) for
      the same VAs. Per the ARM ARM these ASID-tagged entries may conflict
      with newly-allocated global entries, and we must follow a
      Break-Before-Make approach to avoid issues resulting from this.
      
      This patch reworks create_safe_exec_page to invalidate TLBs while the
      zero page is still in place, ensuring that there are no potential
      conflicts when the new TTBR0 value is installed. As a single CPU is
      online while this code executes, we do not need to perform broadcast TLB
      maintenance, and can call local_flush_tlb_all(), which also subsumes
      some barriers. The remaining assembly is converted to use write_sysreg()
      and isb().
      
      Other than this, we safely manipulate TTBRs in the hibernate dance. The
      code we install as part of the new TTBR0 mapping (the hibernated
      kernel's swsusp_arch_suspend_exit) installs a zero page into TTBR1,
      invalidates TLBs, then installs its preferred value. Upon being restored
      to the middle of swsusp_arch_suspend, the new image will call
      __cpu_suspend_exit, which will call cpu_uninstall_idmap, installing the
      zero page in TTBR0 and invalidating all TLB entries.
      
      Fixes: 82869ac5 ("arm64: kernel: Add support for hibernate/suspend-to-disk")
      Signed-off-by: NMark Rutland <mark.rutland@arm.com>
      Acked-by: NJames Morse <james.morse@arm.com>
      Tested-by: NJames Morse <james.morse@arm.com>
      Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: <stable@vger.kernel.org> # 4.7+
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      0194e760
    • L
      arm64: Handle el1 synchronous instruction aborts cleanly · 9adeb8e7
      Laura Abbott 提交于
      Executing from a non-executable area gives an ugly message:
      
      lkdtm: Performing direct entry EXEC_RODATA
      lkdtm: attempting ok execution at ffff0000084c0e08
      lkdtm: attempting bad execution at ffff000008880700
      Bad mode in Synchronous Abort handler detected on CPU2, code 0x8400000e -- IABT (current EL)
      CPU: 2 PID: 998 Comm: sh Not tainted 4.7.0-rc2+ #13
      Hardware name: linux,dummy-virt (DT)
      task: ffff800077e35780 ti: ffff800077970000 task.ti: ffff800077970000
      PC is at lkdtm_rodata_do_nothing+0x0/0x8
      LR is at execute_location+0x74/0x88
      
      The 'IABT (current EL)' indicates the error but it's a bit cryptic
      without knowledge of the ARM ARM. There is also no indication of the
      specific address which triggered the fault. The increase in kernel
      page permissions makes hitting this case more likely as well.
      Handling the case in the vectors gives a much more familiar looking
      error message:
      
      lkdtm: Performing direct entry EXEC_RODATA
      lkdtm: attempting ok execution at ffff0000084c0840
      lkdtm: attempting bad execution at ffff000008880680
      Unable to handle kernel paging request at virtual address ffff000008880680
      pgd = ffff8000089b2000
      [ffff000008880680] *pgd=00000000489b4003, *pud=0000000048904003, *pmd=0000000000000000
      Internal error: Oops: 8400000e [#1] PREEMPT SMP
      Modules linked in:
      CPU: 1 PID: 997 Comm: sh Not tainted 4.7.0-rc1+ #24
      Hardware name: linux,dummy-virt (DT)
      task: ffff800077f9f080 ti: ffff800008a1c000 task.ti: ffff800008a1c000
      PC is at lkdtm_rodata_do_nothing+0x0/0x8
      LR is at execute_location+0x74/0x88
      Acked-by: NMark Rutland <mark.rutland@arm.com>
      Signed-off-by: NLaura Abbott <labbott@redhat.com>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      9adeb8e7
  7. 12 8月, 2016 6 次提交
    • J
      MIPS: KVM: Propagate kseg0/mapped tlb fault errors · 9b731bcf
      James Hogan 提交于
      Propagate errors from kvm_mips_handle_kseg0_tlb_fault() and
      kvm_mips_handle_mapped_seg_tlb_fault(), usually triggering an internal
      error since they normally indicate the guest accessed bad physical
      memory or the commpage in an unexpected way.
      
      Fixes: 858dd5d4 ("KVM/MIPS32: MMU/TLB operations for the Guest.")
      Fixes: e685c689 ("KVM/MIPS32: Privileged instruction/target branch emulation.")
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      Cc: <stable@vger.kernel.org> # 3.10.x-
      Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
      9b731bcf
    • J
      MIPS: KVM: Fix gfn range check in kseg0 tlb faults · 0741f52d
      James Hogan 提交于
      Two consecutive gfns are loaded into host TLB, so ensure the range check
      isn't off by one if guest_pmap_npages is odd.
      
      Fixes: 858dd5d4 ("KVM/MIPS32: MMU/TLB operations for the Guest.")
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      Cc: <stable@vger.kernel.org> # 3.10.x-
      Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
      0741f52d
    • J
      MIPS: KVM: Add missing gfn range check · 8985d503
      James Hogan 提交于
      kvm_mips_handle_mapped_seg_tlb_fault() calculates the guest frame number
      based on the guest TLB EntryLo values, however it is not range checked
      to ensure it lies within the guest_pmap. If the physical memory the
      guest refers to is out of range then dump the guest TLB and emit an
      internal error.
      
      Fixes: 858dd5d4 ("KVM/MIPS32: MMU/TLB operations for the Guest.")
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      Cc: <stable@vger.kernel.org> # 3.10.x-
      Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
      8985d503
    • J
      MIPS: KVM: Fix mapped fault broken commpage handling · c604cffa
      James Hogan 提交于
      kvm_mips_handle_mapped_seg_tlb_fault() appears to map the guest page at
      virtual address 0 to PFN 0 if the guest has created its own mapping
      there. The intention is unclear, but it may have been an attempt to
      protect the zero page from being mapped to anything but the comm page in
      code paths you wouldn't expect from genuine commpage accesses (guest
      kernel mode cache instructions on that address, hitting trapping
      instructions when executing from that address with a coincidental TLB
      eviction during the KVM handling, and guest user mode accesses to that
      address).
      
      Fix this to check for mappings exactly at KVM_GUEST_COMMPAGE_ADDR (it
      may not be at address 0 since commit 42aa12e7 ("MIPS: KVM: Move
      commpage so 0x0 is unmapped")), and set the corresponding EntryLo to be
      interpreted as 0 (invalid).
      
      Fixes: 858dd5d4 ("KVM/MIPS32: MMU/TLB operations for the Guest.")
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      Cc: <stable@vger.kernel.org> # 3.10.x-
      Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
      c604cffa
    • C
      KVM: Protect device ops->create and list_add with kvm->lock · a28ebea2
      Christoffer Dall 提交于
      KVM devices were manipulating list data structures without any form of
      synchronization, and some implementations of the create operations also
      suffered from a lack of synchronization.
      
      Now when we've split the xics create operation into create and init, we
      can hold the kvm->lock mutex while calling the create operation and when
      manipulating the devices list.
      
      The error path in the generic code gets slightly ugly because we have to
      take the mutex again and delete the device from the list, but holding
      the mutex during anon_inode_getfd or releasing/locking the mutex in the
      common non-error path seemed wrong.
      Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>
      Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com>
      Acked-by: NChristian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
      a28ebea2
    • C
      KVM: PPC: Move xics_debugfs_init out of create · 023e9fdd
      Christoffer Dall 提交于
      As we are about to hold the kvm->lock during the create operation on KVM
      devices, we should move the call to xics_debugfs_init into its own
      function, since holding a mutex over extended amounts of time might not
      be a good idea.
      
      Introduce an init operation on the kvm_device_ops struct which cannot
      fail and call this, if configured, after the device has been created.
      Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>
      Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
      023e9fdd