1. 12 Feb 2019: 19 commits
  2. 10 Feb 2019: 1 commit
• x86/mm: Make set_pmd_at() paravirt aware · 20e55bc1
  Committed by Juergen Gross
      set_pmd_at() calls native_set_pmd() unconditionally on x86. This was
      fine as long as only huge page entries were written via set_pmd_at(),
      as Xen pv guests don't support those.
      
      Commit 2c91bd4a ("mm: speed up mremap by 20x on large regions")
      introduced a usage of set_pmd_at() possible on pv guests, leading to
      failures like:
      
      BUG: unable to handle kernel paging request at ffff888023e26778
      #PF error: [PROT] [WRITE]
      RIP: e030:move_page_tables+0x7c1/0xae0
      move_vma.isra.3+0xd1/0x2d0
      __se_sys_mremap+0x3c6/0x5b0
       do_syscall_64+0x49/0x100
      entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Make set_pmd_at() paravirt aware by just letting it use set_pmd().
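      As an illustration, the fix amounts to routing through the paravirt-aware
      accessor (a minimal sketch of the idea described above):
      
      	/* Sketch: set_pmd() is a paravirt operation on CONFIG_PARAVIRT
      	 * kernels, so Xen PV can intercept the write; native_set_pmd()
      	 * stores the entry directly and cannot. */
      	static inline void set_pmd_at(struct mm_struct *mm, unsigned long addr,
      				      pmd_t *pmdp, pmd_t pmd)
      	{
      		set_pmd(pmdp, pmd);
      	}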
      
      Fixes: 2c91bd4a ("mm: speed up mremap by 20x on large regions")
Reported-by: Sander Eikelenboom <linux@eikelenboom.it>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: xen-devel@lists.xenproject.org
      Cc: boris.ostrovsky@oracle.com
      Cc: sstabellini@kernel.org
      Cc: hpa@zytor.com
      Cc: bp@alien8.de
      Cc: torvalds@linux-foundation.org
      Link: https://lkml.kernel.org/r/20190210074056.11842-1-jgross@suse.com
      20e55bc1
3. 08 Feb 2019: 5 commits
  4. 07 Feb 2019: 2 commits
  5. 05 Feb 2019: 4 commits
• arm64: kexec_file: handle empty command-line · ea573680
  Committed by Jean-Philippe Brucker
      Calling strlen() on cmdline == NULL produces a kernel oops. Since having
      a NULL cmdline is valid, handle this case explicitly.
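      A minimal sketch of the guard (surrounding function abbreviated; variable
      names illustrative):
      
      	/* Sketch: tolerate cmdline == NULL instead of oopsing in strlen(). */
      	size_t cmdline_len = 0;
      
      	if (cmdline)
      		cmdline_len = strlen(cmdline) + 1;	/* include the NUL */
      	/* ...only add a "bootargs" property when cmdline_len != 0... */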
      
      Fixes: 52b2a8af ("arm64: kexec_file: load initrd and device-tree")
Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
      ea573680
• MIPS: Remove function size check in get_frame_info() · 2b424cfc
  Committed by Jun-Ru Chang
Commit b6c7a324 ("MIPS: Fix get_frame_info() handling of
microMIPS function size.") introduced an additional function size
check for microMIPS by only checking instructions between ip and
ip + func_size. However, func_size in get_frame_info() is always 0 if
KALLSYMS is not enabled. This causes get_frame_info() to return
immediately without calculating the correct frame_size, which in turn
causes "Can't analyze schedule() prologue" warning messages at boot
time.

This patch removes the func_size check and lets the frame_size check
run over up to 128 instructions for both MIPS and microMIPS.
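      As a sketch, the scan becomes bounded by a constant instead of the
      unreliable func_size (loop shape illustrative, not the exact diff):
      
      	/* Sketch: look for the prologue in at most 128 instructions,
      	 * since func_size is always 0 when KALLSYMS is disabled. */
      	for (i = 0; i < 128; i++, ip = (void *)ip + insn_size) {
      		if (is_sp_move_ins(ip))	/* "addiu sp, sp, -framesize" */
      			break;
      	}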
Signed-off-by: Jun-Ru Chang <jrjang@realtek.com>
Signed-off-by: Tony Wu <tonywu@realtek.com>
Signed-off-by: Paul Burton <paul.burton@mips.com>
      Fixes: b6c7a324 ("MIPS: Fix get_frame_info() handling of microMIPS function size.")
      Cc: <ralf@linux-mips.org>
      Cc: <jhogan@kernel.org>
      Cc: <macro@mips.com>
      Cc: <yamada.masahiro@socionext.com>
      Cc: <peterz@infradead.org>
      Cc: <mingo@kernel.org>
      Cc: <linux-mips@vger.kernel.org>
      Cc: <linux-kernel@vger.kernel.org>
      2b424cfc
• MIPS: Use lower case for addresses in nexys4ddr.dts · 047f2d94
  Committed by Paul Burton
      DTC introduced an i2c_bus_reg check in v1.4.7, used since Linux v4.20,
      which complains about upper case addresses used in the unit name.
      
      nexys4ddr.dts names an I2C device node "ad7420@4B", leading to:
      
        arch/mips/boot/dts/xilfpga/nexys4ddr.dts:109.16-112.8: Warning
          (i2c_bus_reg): /i2c@10A00000/ad7420@4B: I2C bus unit address format
          error, expected "4b"
      
      Fix this by switching to lower case addresses throughout the file, as is
      *mostly* the case in the file already & fairly standard throughout the
      tree.
Signed-off-by: Paul Burton <paul.burton@mips.com>
      Cc: stable@vger.kernel.org # v4.20+
      Cc: linux-mips@vger.kernel.org
      047f2d94
• MIPS: Loongson: Introduce and use loongson_llsc_mb() · e02e07e3
  Committed by Huacai Chen
On the Loongson-2G/2H/3A/3B there is a hardware flaw: ll/sc and
lld/scd have very weak ordering. We should add sync instructions
"before each ll/lld" and "at the branch-target between ll/sc" as a
workaround. Otherwise, this flaw occasionally causes deadlocks (e.g.
when doing heavy load tests with LTP).
      
Below is the explanation from the CPU designer:
      
      "For Loongson 3 family, when a memory access instruction (load, store,
      or prefetch)'s executing occurs between the execution of LL and SC, the
      success or failure of SC is not predictable. Although programmer would
      not insert memory access instructions between LL and SC, the memory
      instructions before LL in program-order, may dynamically executed
      between the execution of LL/SC, so a memory fence (SYNC) is needed
      before LL/LLD to avoid this situation.
      
      Since Loongson-3A R2 (3A2000), we have improved our hardware design to
      handle this case. But we later deduce a rarely circumstance that some
      speculatively executed memory instructions due to branch misprediction
      between LL/SC still fall into the above case, so a memory fence (SYNC)
      at branch-target (if its target is not between LL/SC) is needed for
      Loongson 3A1000, 3B1500, 3A2000 and 3A3000.
      
      Our processor is continually evolving and we aim to to remove all these
      workaround-SYNCs around LL/SC for new-come processor."
      
      Here is an example:
      
Both cpu1 and cpu2 simultaneously run atomic_add by 1 on the same
atomic variable. This bug sometimes causes both 'sc' instructions run
by the two CPUs (in atomic_add) to succeed at the same time ('sc'
returns 1), while the variable is only *added by 1*, which is wrong and
unacceptable (it should be added by 2).
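      As a sketch of what the workaround barrier can look like (modelled on the
      description above; the exact asm in the tree may differ):
      
      	/* Sketch: a SYNC that compiles away when the workaround is off. */
      	#ifdef CONFIG_CPU_LOONGSON3_WORKAROUNDS
      	#define loongson_llsc_mb()	__asm__ __volatile__("sync" : : : "memory")
      	#else
      	#define loongson_llsc_mb()	do { } while (0)
      	#endif
      
      	/* Placement: before each ll/lld, and at any branch target that can
      	 * jump back between the ll and the sc:
      	 *
      	 *	loongson_llsc_mb();
      	 *	1:	ll	t0, (addr)
      	 *		addu	t0, t0, 1
      	 *		sc	t0, (addr)
      	 *		beqz	t0, 1b
      	 */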
      
Why is fix-loongson3-llsc disabled in the compiler?
Because the compiler-side fix would cause problems in the kernel's
__ex_table section.
      
This patch fixes all the cases in the kernel, but note:

+. The fix at the end of futex_atomic_cmpxchg_inatomic() is for the
branch-target of 'bne'; in the other cases, such as
atomic_sub_if_positive()/cmpxchg()/xchg(), smp_mb__before_llsc() and
smp_llsc_mb() already cover the ll and the branch-target coincidentally,
just like this one.

+. Loongson 3 does not support CONFIG_EDAC_ATOMIC_SCRUB, so there is no
need to touch edac.h.

+. local_ops and cmpxchg_local should not be affected by this bug,
since only the owner can write.

+. mips_atomic_set() for syscall.c is deprecated and rarely used, so
just let it go.
Signed-off-by: Huacai Chen <chenhc@lemote.com>
Signed-off-by: Huang Pei <huangpei@loongson.cn>
      [paul.burton@mips.com:
        - Simplify the addition of -mno-fix-loongson3-llsc to cflags, and add
          a comment describing why it's there.
        - Make loongson_llsc_mb() a no-op when
          CONFIG_CPU_LOONGSON3_WORKAROUNDS=n, rather than a compiler memory
          barrier.
        - Add a comment describing the bug & how loongson_llsc_mb() helps
          in asm/barrier.h.]
Signed-off-by: Paul Burton <paul.burton@mips.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: ambrosehua@gmail.com
      Cc: Steven J . Hill <Steven.Hill@cavium.com>
      Cc: linux-mips@linux-mips.org
      Cc: Fuxin Zhang <zhangfx@lemote.com>
      Cc: Zhangjin Wu <wuzhangjin@gmail.com>
      Cc: Li Xuefeng <lixuefeng@loongson.cn>
      Cc: Xu Chenghua <xuchenghua@loongson.cn>
      e02e07e3
6. 04 Feb 2019: 3 commits
• arm64: ptdump: Don't iterate kernel page tables using PTRS_PER_PXX · d23c808c
  Committed by Will Deacon
      When 52-bit virtual addressing is enabled for userspace
      (CONFIG_ARM64_USER_VA_BITS_52=y), the kernel continues to utilise 48-bit
      virtual addressing in TTBR1. Consequently, PTRS_PER_PGD reflects the
      larger page table size for userspace and the pgd pointer for kernel page
      tables is offset before being written to TTBR1.
      
      This means that we can't use PTRS_PER_PGD to iterate over kernel page
      tables unless we apply the same offset, which is fiddly to get right and
      leads to some non-idiomatic walking code. Instead, just follow the usual
      pattern when walking page tables by using a while loop driven by
      pXd_offset() and pXd_addr_end().
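      A minimal sketch of that pattern (helper names are illustrative):
      
      	/* Sketch: drive the walk by addresses, not by PTRS_PER_PGD. */
      	unsigned long addr = start, next;
      	pgd_t *pgdp = pgd_offset(mm, start);
      
      	do {
      		next = pgd_addr_end(addr, end);
      		if (!pgd_none(READ_ONCE(*pgdp)))
      			walk_pud(pgdp, addr, next);	/* hypothetical helper */
      		pgdp++;
      	} while (addr = next, addr != end);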
Reported-by: Qian Cai <cai@lca.pw>
Tested-by: Qian Cai <cai@lca.pw>
Acked-by: Steve Capper <steve.capper@arm.com>
Tested-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
      d23c808c
• perf/x86/intel: Delay memory deallocation until x86_pmu_dead_cpu() · 602cae04
  Committed by Peter Zijlstra
      intel_pmu_cpu_prepare() allocated memory for ->shared_regs among other
      members of struct cpu_hw_events. This memory is released in
      intel_pmu_cpu_dying() which is wrong. The counterpart of the
      intel_pmu_cpu_prepare() callback is x86_pmu_dead_cpu().
      
      Otherwise if the CPU fails on the UP path between CPUHP_PERF_X86_PREPARE
      and CPUHP_AP_PERF_X86_STARTING then it won't release the memory but
      allocate new memory on the next attempt to online the CPU (leaking the
      old memory).
      Also, if the CPU down path fails between CPUHP_AP_PERF_X86_STARTING and
      CPUHP_PERF_X86_PREPARE then the CPU will go back online but never
      allocate the memory that was released in x86_pmu_dying_cpu().
      
      Make the memory allocation/free symmetrical in regard to the CPU hotplug
      notifier by moving the deallocation to intel_pmu_cpu_dead().
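      Schematically (the freeing helper here is hypothetical; the real patch
      moves the existing free logic between callbacks):
      
      	/* Sketch: DEAD mirrors PREPARE, so free there, not in DYING. */
      	static void intel_pmu_cpu_dead(int cpu)
      	{
      		struct cpu_hw_events *cpuc = &per_cpu(cpu_hw_events, cpu);
      
      		free_shared_regs(cpuc);	/* hypothetical name for the
      					   teardown of ->shared_regs */
      	}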
      
      This started in commit:
      
         a7e3ed1e ("perf: Add support for supplementary event registers").
      
      In principle the bug was introduced in v2.6.39 (!), but it will almost
      certainly not backport cleanly across the big CPU hotplug rewrite between v4.7-v4.15...
      
      [ bigeasy: Added patch description. ]
      [ mingo: Added backporting guidance. ]
Reported-by: He Zhe <zhe.he@windriver.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> # With developer hat on
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> # With maintainer hat on
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: acme@kernel.org
      Cc: bp@alien8.de
      Cc: hpa@zytor.com
      Cc: jolsa@kernel.org
      Cc: kan.liang@linux.intel.com
      Cc: namhyung@kernel.org
      Cc: <stable@vger.kernel.org>
      Fixes: a7e3ed1e ("perf: Add support for supplementary event registers").
Link: https://lkml.kernel.org/r/20181219165350.6s3jvyxbibpvlhtq@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
      602cae04
• perf/x86/intel/uncore: Add Node ID mask · 9e63a789
  Committed by Kan Liang
      Some PCI uncore PMUs cannot be registered on an 8-socket system (HPE
      Superdome Flex).
      
To understand which socket a PCI uncore PMU belongs to, perf retrieves
the local Node ID of the uncore device from CPUNODEID (0xC0) in the PCI
configuration space, and the mapping between Socket ID and Node ID from
GIDNIDMAP (0xD4). The Socket ID can be calculated accordingly.

The local Node ID is only available in bits 2:0, but the current code
doesn't mask it. If the BIOS doesn't clear the rest of the bits, an
incorrect Node ID is fetched.
      
      Filter the Node ID by adding a mask.
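      Schematically (the mask value follows from the bits 2:0 layout above;
      surrounding code abbreviated):
      
      	/* Sketch: keep only bits 2:0, which hold the local Node ID. */
      	#define NODE_ID_MASK	0x7
      
      	pci_read_config_dword(ubox_dev, 0xC0, &config);	/* CPUNODEID */
      	nodeid = config & NODE_ID_MASK;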
Reported-by: Song Liu <songliubraving@fb.com>
Tested-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: <stable@vger.kernel.org> # v3.7+
      Fixes: 7c94ee2e ("perf/x86: Add Intel Nehalem and Sandy Bridge-EP uncore support")
Link: https://lkml.kernel.org/r/1548600794-33162-1-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
      9e63a789
7. 03 Feb 2019: 1 commit
• x86/MCE: Initialize mce.bank in the case of a fatal error in mce_no_way_out() · d28af26f
  Committed by Tony Luck
      Internal injection testing crashed with a console log that said:
      
        mce: [Hardware Error]: CPU 7: Machine Check Exception: f Bank 0: bd80000000100134
      
      This caused a lot of head scratching because the MCACOD (bits 15:0) of
      that status is a signature from an L1 data cache error. But Linux says
      that it found it in "Bank 0", which on this model CPU only reports L1
      instruction cache errors.
      
      The answer was that Linux doesn't initialize "m->bank" in the case that
      it finds a fatal error in the mce_no_way_out() pre-scan of banks. If
      this was a local machine check, then this partially initialized struct
      mce is being passed to mce_panic().
      
      Fix is simple: just initialize m->bank in the case of a fatal error.
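      Roughly (loop and severity call abbreviated; the point is recording i):
      
      	/* Sketch: while pre-scanning banks for a fatal status, remember
      	 * which bank it came from before handing *m to mce_panic(). */
      	for (i = 0; i < mca_cfg.banks; i++) {
      		m->status = mce_rdmsrl(msr_ops.status(i));
      		if (!(m->status & MCI_STATUS_VAL))
      			continue;
      		if (mce_severity(m, mca_cfg.tolerant, &msg, true) >=
      		    MCE_PANIC_SEVERITY)
      			m->bank = i;	/* the missing initialization */
      	}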
      
      Fixes: 40c36e27 ("x86/mce: Fix incorrect "Machine check from unknown source" message")
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vishal Verma <vishal.l.verma@intel.com>
      Cc: x86-ml <x86@kernel.org>
Cc: stable@vger.kernel.org # v4.18
Note: pre-v5.0, arch/x86/kernel/cpu/mce/core.c was called arch/x86/kernel/cpu/mcheck/mce.c
      Link: https://lkml.kernel.org/r/20190201003341.10638-1-tony.luck@intel.com
      d28af26f
8. 02 Feb 2019: 5 commits
• x86/resctrl: Avoid confusion over the new X86_RESCTRL config · e6d42931
  Committed by Johannes Weiner
      "Resource Control" is a very broad term for this CPU feature, and a term
      that is also associated with containers, cgroups etc. This can easily
      cause confusion.
      
      Make the user prompt more specific. Match the config symbol name.
      
       [ bp: In the future, the corresponding ARM arch-specific code will be
         under ARM_CPU_RESCTRL and the arch-agnostic bits will be carved out
         under the CPU_RESCTRL umbrella symbol. ]
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: Babu Moger <Babu.Moger@amd.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Morse <james.morse@arm.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: linux-doc@vger.kernel.org
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Pu Wen <puwen@hygon.cn>
      Cc: Reinette Chatre <reinette.chatre@intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: x86-ml <x86@kernel.org>
      Link: https://lkml.kernel.org/r/20190130195621.GA30653@cmpxchg.org
      e6d42931
• x86_64: increase stack size for KASAN_EXTRA · a8e911d1
  Committed by Qian Cai
If the kernel is configured with KASAN_EXTRA, the stack size is
increased significantly because this option sets "-fstack-reuse" to
"none" in GCC [1].  As a result, it triggers stack overruns quite often
with a 32k stack size when compiled using GCC 8.  For example, this reproducer
      
        https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/syscalls/madvise/madvise06.c
      
      triggers a "corrupted stack end detected inside scheduler" very reliably
      with CONFIG_SCHED_STACK_END_CHECK enabled.
      
There are just too many functions that can have a large stack with
KASAN_EXTRA due to large local variables, and they are called over and
over again without being able to reuse stack slots.  Some noticeable ones
are
      
        size
        7648 shrink_page_list
        3584 xfs_rmap_convert
        3312 migrate_page_move_mapping
        3312 dev_ethtool
        3200 migrate_misplaced_transhuge_page
        3168 copy_process
      
There are another 49 functions over 2k in size when compiling the kernel
with "-Wframe-larger-than=", even with a relatively minimal config on this
machine.  Hence, it is too much work to change the Makefiles so that each
object is compiled without "-fsanitize-address-use-after-scope" individually.
      
      [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81715#c23
      
Although there is a patch in GCC 9 to help the situation, GCC 9 probably
won't be released for a few months, and then it will probably take another
six months to a year for all major distros to include it as a default.
Hence, the stack usage with KASAN_EXTRA can be revisited again in 2020
when GCC 9 is everywhere.  Until then, this patch will help users avoid
stack overruns.
      
      This has already been fixed for arm64 for the same reason via
      6e883067 ("arm64: kasan: Increase stack size for KASAN_EXTRA").
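      The x86-64 change mirrors the arm64 one; a sketch of the approach
      (surrounding context abbreviated):
      
      	/* Sketch: one extra page order per level of KASAN stack bloat. */
      	#ifdef CONFIG_KASAN_EXTRA
      	#define KASAN_STACK_ORDER 2	/* 32k -> 64k stacks */
      	#elif defined(CONFIG_KASAN)
      	#define KASAN_STACK_ORDER 1
      	#else
      	#define KASAN_STACK_ORDER 0
      	#endif
      
      	#define THREAD_SIZE_ORDER	(2 + KASAN_STACK_ORDER)
      	#define THREAD_SIZE		(PAGE_SIZE << THREAD_SIZE_ORDER)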
      
Link: http://lkml.kernel.org/r/20190109215209.2903-1-cai@lca.pw
Signed-off-by: Qian Cai <cai@lca.pw>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      a8e911d1
• arch: unexport asm/shmparam.h for all architectures · 36c0f7f0
  Committed by Masahiro Yamada
      Most architectures do not export shmparam.h to user-space.
      
        $ find arch -name shmparam.h  | sort
        arch/alpha/include/asm/shmparam.h
        arch/arc/include/asm/shmparam.h
        arch/arm64/include/asm/shmparam.h
        arch/arm/include/asm/shmparam.h
        arch/csky/include/asm/shmparam.h
        arch/ia64/include/asm/shmparam.h
        arch/mips/include/asm/shmparam.h
        arch/nds32/include/asm/shmparam.h
        arch/nios2/include/asm/shmparam.h
        arch/parisc/include/asm/shmparam.h
        arch/powerpc/include/asm/shmparam.h
        arch/s390/include/asm/shmparam.h
        arch/sh/include/asm/shmparam.h
        arch/sparc/include/asm/shmparam.h
        arch/x86/include/asm/shmparam.h
        arch/xtensa/include/asm/shmparam.h
      
      Strangely, some users of the asm-generic wrapper export shmparam.h
      
        $ git grep 'generic-y += shmparam.h'
        arch/c6x/include/uapi/asm/Kbuild:generic-y += shmparam.h
        arch/h8300/include/uapi/asm/Kbuild:generic-y += shmparam.h
        arch/hexagon/include/uapi/asm/Kbuild:generic-y += shmparam.h
        arch/m68k/include/uapi/asm/Kbuild:generic-y += shmparam.h
        arch/microblaze/include/uapi/asm/Kbuild:generic-y += shmparam.h
        arch/openrisc/include/uapi/asm/Kbuild:generic-y += shmparam.h
        arch/riscv/include/asm/Kbuild:generic-y += shmparam.h
        arch/unicore32/include/uapi/asm/Kbuild:generic-y += shmparam.h
      
The newly added riscv correctly creates the asm-generic wrapper
in the kernel space, but the others (c6x, h8300, hexagon, m68k,
microblaze, openrisc, unicore32) create it in the uapi directory.

Digging into the git history, I now guess fcc8487d ("uapi:
export all headers under uapi directories") was the misconversion.
Prior to that commit, no architecture exported shmparam.h.
As its commit description says, that commit exported shmparam.h
for c6x, h8300, hexagon, m68k, openrisc and unicore32.
      
      83f0124a ("microblaze: remove asm-generic wrapper headers")
      accidentally exported shmparam.h for microblaze.
      
      This commit unexports shmparam.h for those architectures.
      
      There is no more reason to export include/uapi/asm-generic/shmparam.h,
      so it has been moved to include/asm-generic/shmparam.h
      
Link: http://lkml.kernel.org/r/1546904307-11124-1-git-send-email-yamada.masahiro@socionext.com
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Stafford Horne <shorne@gmail.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: Richard Kuo <rkuo@codeaurora.org>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Aurelien Jacquiot <jacquiot.aurelien@gmail.com>
      Cc: Greentime Hu <green.hu@gmail.com>
      Cc: Guo Ren <guoren@kernel.org>
      Cc: Palmer Dabbelt <palmer@sifive.com>
      Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Albert Ou <aou@eecs.berkeley.edu>
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: Vincent Chen <deanbo422@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      36c0f7f0
• x86/kexec: Don't setup EFI info if EFI runtime is not enabled · 2aa958c9
  Committed by Kairui Song
      Kexec-ing a kernel with "efi=noruntime" on the first kernel's command
      line causes the following null pointer dereference:
      
        BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
        #PF error: [normal kernel read fault]
        Call Trace:
         efi_runtime_map_copy+0x28/0x30
         bzImage64_load+0x688/0x872
         arch_kexec_kernel_image_load+0x6d/0x70
         kimage_file_alloc_init+0x13e/0x220
         __x64_sys_kexec_file_load+0x144/0x290
         do_syscall_64+0x55/0x1a0
         entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Just skip the EFI info setup if EFI runtime services are not enabled.
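      Schematically, the guard at the top of the EFI setup path (function name
      and context abbreviated):
      
      	/* Sketch: with efi=noruntime there is no runtime map to copy,
      	 * so leave the efi_info in boot_params empty. */
      	if (!efi_enabled(EFI_RUNTIME_SERVICES))
      		return 0;	/* nothing to hand over to the kexec'd kernel */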
      
       [ bp: Massage commit message. ]
Suggested-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Kairui Song <kasong@redhat.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Dave Young <dyoung@redhat.com>
      Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: bhe@redhat.com
      Cc: David Howells <dhowells@redhat.com>
      Cc: erik.schmauss@intel.com
      Cc: fanc.fnst@cn.fujitsu.com
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: kexec@lists.infradead.org
      Cc: lenb@kernel.org
      Cc: linux-acpi@vger.kernel.org
      Cc: Philipp Rudo <prudo@linux.vnet.ibm.com>
      Cc: rafael.j.wysocki@intel.com
      Cc: robert.moore@intel.com
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: x86-ml <x86@kernel.org>
      Cc: Yannik Sembritzki <yannik@sembritzki.me>
      Link: https://lkml.kernel.org/r/20190118111310.29589-2-kasong@redhat.com
      2aa958c9
• x86: explicitly align IO accesses in memcpy_{to,from}io · c228d294
  Committed by Linus Torvalds
      In commit 170d13ca ("x86: re-introduce non-generic memcpy_{to,from}io")
      I made our copy from IO space use a separate copy routine rather than
      rely on the generic memcpy.  I did that because our generic memory copy
      isn't actually well-defined when it comes to internal access ordering or
      alignment, and will in fact depend on various CPUID flags.
      
      In particular, the default memcpy() for a modern Intel CPU will
      generally be just a "rep movsb", which works reasonably well for
      medium-sized memory copies of regular RAM, since the CPU will turn it
      into fairly optimized microcode.
      
      However, for non-cached memory and IO, "rep movs" ends up being
      horrendously slow and will just do the architectural "one byte at a
      time" accesses implied by the movsb.
      
      At the other end of the spectrum, if you _don't_ end up using the "rep
      movsb" code, you'd likely fall back to the software copy, which does
      overlapping accesses for the tail, and may copy things backwards.
      Again, for regular memory that's fine, for IO memory not so much.
      
The thinking was that clearly nobody really cared (because things
worked), but some people had seen horrible performance due to the byte
accesses, so let's just revert back to our long-ago version that did
"rep movsl" for the bulk of the copy, and then fixed up the last few
bytes of the tail with "movsw/b".
      
      Interestingly (and perhaps not entirely surprisingly), while that was
      our original memory copy implementation, and had been used before for
      IO, in the meantime many new users of memcpy_*io() had come about.  And
      while the access patterns for the memory copy weren't well-defined (so
      arguably _any_ access pattern should work), in practice the "rep movsb"
      case had been very common for the last several years.
      
In particular, Jarkko Sakkinen reported that the memcpy_*io() change
resulted in weird errors from his Geminilake NUC TPM module.
      
      And it turns out that the TPM TCG accesses according to spec require
      that the accesses be
      
       (a) done strictly sequentially
      
 (b) naturally aligned
      
      otherwise the TPM chip will abort the PCI transaction.
      
      And, in fact, the tpm_crb.c driver did this:
      
      	memcpy_fromio(buf, priv->rsp, 6);
      	...
      	memcpy_fromio(&buf[6], &priv->rsp[6], expected - 6);
      
      which really should never have worked in the first place, but back
      before commit 170d13ca it *happened* to work, because the
      memcpy_fromio() would be expanded to a regular memcpy, and
      
       (a) gcc would expand the first memcpy in-line, and turn it into a
           4-byte and a 2-byte read, and they happened to be in the right
           order, and the alignment was right.
      
 (b) gcc would call "memcpy()" for the second one, and the machines that
     had this TPM chip also apparently always ended up having ERMS
     ("Enhanced REP MOVSB/STOSB instructions"), so we'd use "rep
     movsb" for that copy.
      
      In other words, basically by pure luck, the code happened to use the
      right access sizes in the (two different!) memcpy() implementations to
      make it all work.
      
      But after commit 170d13ca, both of the memcpy_fromio() calls
      resulted in a call to the routine with the consistent memory accesses,
      and in both cases it started out transferring with 4-byte accesses.
      Which worked for the first copy, but resulted in the second copy doing a
      32-bit read at an address that was only 2-byte aligned.
      
      Jarkko is actually fixing the fragile code in the TPM driver, but since
      this is an excellent example of why we absolutely must not use a generic
      memcpy for IO accesses, _and_ an IO-specific one really should strive to
      align the IO accesses, let's do exactly that.
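      In outline, an alignment-respecting MMIO copy looks like this (a sketch
      of the approach, not the exact x86 implementation, which keeps using
      movs-based helpers):
      
      	/* Sketch: byte-copy up to a 4-byte boundary, use aligned 32-bit
      	 * accesses for the bulk, then finish with a 2/1-byte tail. */
      	static void fromio_aligned(void *to, const volatile void __iomem *from,
      				   size_t n)
      	{
      		while (n && ((unsigned long)from & 3)) {
      			*(u8 *)to = readb(from);
      			to++; from++; n--;
      		}
      		while (n >= 4) {
      			*(u32 *)to = readl(from);
      			to += 4; from += 4; n -= 4;
      		}
      		if (n >= 2) {
      			*(u16 *)to = readw(from);
      			to += 2; from += 2; n -= 2;
      		}
      		if (n)
      			*(u8 *)to = readb(from);
      	}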
      
      Side note: Jarkko also noted that the driver had been used on ARM
      platforms, and had worked.  That was because on 32-bit ARM, memcpy_*io()
      ends up always doing byte accesses, and on 64-bit ARM it first does byte
      accesses to align to 8-byte boundaries, and then does 8-byte accesses
      for the bulk.
      
      So ARM actually worked by design, and the x86 case worked by pure luck.
      
      We *might* want to make x86-64 do the 8-byte case too.  That should be a
      pretty straightforward extension, but let's do one thing at a time.  And
      generally MMIO accesses aren't really all that performance-critical, as
      shown by the fact that for a long time we just did them a byte at a
      time, and very few people ever noticed.
Reported-and-tested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Tested-by: Jerry Snitselaar <jsnitsel@redhat.com>
      Cc: David Laight <David.Laight@aculab.com>
      Fixes: 170d13ca ("x86: re-introduce non-generic memcpy_{to,from}io")
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      c228d294