1. 12 Feb 2019, 19 commits
  2. 10 Feb 2019, 1 commit
    • x86/mm: Make set_pmd_at() paravirt aware · 20e55bc1
      Authored by Juergen Gross
      set_pmd_at() calls native_set_pmd() unconditionally on x86. This was
      fine as long as only huge page entries were written via set_pmd_at(),
      as Xen pv guests don't support those.
      
      Commit 2c91bd4a ("mm: speed up mremap by 20x on large regions")
      introduced a usage of set_pmd_at() possible on pv guests, leading to
      failures like:
      
      BUG: unable to handle kernel paging request at ffff888023e26778
      #PF error: [PROT] [WRITE]
      RIP: e030:move_page_tables+0x7c1/0xae0
      move_vma.isra.3+0xd1/0x2d0
      __se_sys_mremap+0x3c6/0x5b0
       do_syscall_64+0x49/0x100
      entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Make set_pmd_at() paravirt aware by just letting it use set_pmd().
      
      Fixes: 2c91bd4a ("mm: speed up mremap by 20x on large regions")
      Reported-by: Sander Eikelenboom <linux@eikelenboom.it>
      Signed-off-by: Juergen Gross <jgross@suse.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: xen-devel@lists.xenproject.org
      Cc: boris.ostrovsky@oracle.com
      Cc: sstabellini@kernel.org
      Cc: hpa@zytor.com
      Cc: bp@alien8.de
      Cc: torvalds@linux-foundation.org
      Link: https://lkml.kernel.org/r/20190210074056.11842-1-jgross@suse.com
      20e55bc1
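      A minimal sketch of the change described above, in the style of
      arch/x86/include/asm/pgtable.h (context reconstructed from the commit
      message, not quoted from the patch):

        static inline void set_pmd_at(struct mm_struct *mm, unsigned long addr,
                                      pmd_t *pmdp, pmd_t pmd)
        {
                /*
                 * Go through set_pmd(), which is paravirt aware, instead of
                 * calling native_set_pmd() directly, so Xen pv guests take
                 * the paravirt hook.
                 */
                set_pmd(pmdp, pmd);
        }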
  3. 08 Feb 2019, 3 commits
  4. 07 Feb 2019, 1 commit
  5. 04 Feb 2019, 2 commits
    • perf/x86/intel: Delay memory deallocation until x86_pmu_dead_cpu() · 602cae04
      Authored by Peter Zijlstra
      intel_pmu_cpu_prepare() allocated memory for ->shared_regs among other
      members of struct cpu_hw_events. This memory is released in
      intel_pmu_cpu_dying() which is wrong. The counterpart of the
      intel_pmu_cpu_prepare() callback is x86_pmu_dead_cpu().
      
      Otherwise if the CPU fails on the UP path between CPUHP_PERF_X86_PREPARE
      and CPUHP_AP_PERF_X86_STARTING then it won't release the memory but
      allocate new memory on the next attempt to online the CPU (leaking the
      old memory).
      Also, if the CPU down path fails between CPUHP_AP_PERF_X86_STARTING and
      CPUHP_PERF_X86_PREPARE then the CPU will go back online but never
      allocate the memory that was released in x86_pmu_dying_cpu().
      
      Make the memory allocation/free symmetrical in regard to the CPU hotplug
      notifier by moving the deallocation to intel_pmu_cpu_dead().
      
      This started in commit:
      
         a7e3ed1e ("perf: Add support for supplementary event registers").
      
      In principle the bug was introduced in v2.6.39 (!), but it will almost
      certainly not backport cleanly across the big CPU hotplug rewrite between v4.7-v4.15...
      
      [ bigeasy: Added patch description. ]
      [ mingo: Added backporting guidance. ]
      Reported-by: He Zhe <zhe.he@windriver.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> # With developer hat on
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> # With maintainer hat on
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: acme@kernel.org
      Cc: bp@alien8.de
      Cc: hpa@zytor.com
      Cc: jolsa@kernel.org
      Cc: kan.liang@linux.intel.com
      Cc: namhyung@kernel.org
      Cc: <stable@vger.kernel.org>
      Fixes: a7e3ed1e ("perf: Add support for supplementary event registers").
      Link: https://lkml.kernel.org/r/20181219165350.6s3jvyxbibpvlhtq@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      602cae04
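      A rough sketch of the symmetry this restores, assuming the helper and
      struct names used in arch/x86/events/intel/core.c (an illustration, not
      the verbatim patch):

        static void intel_pmu_cpu_dying(int cpu)
        {
                /* Dying phase: stop the debug store, but free nothing here. */
                fini_debug_store_on_cpu(cpu);
        }

        static void intel_pmu_cpu_dead(int cpu)
        {
                /*
                 * Counterpart of intel_pmu_cpu_prepare(), reached via
                 * x86_pmu_dead_cpu(): release the per-CPU allocations here.
                 */
                struct cpu_hw_events *cpuc = &per_cpu(cpu_hw_events, cpu);
                struct intel_shared_regs *pc = cpuc->shared_regs;

                if (pc) {
                        if (pc->core_id == -1 || --pc->refcnt == 0)
                                kfree(pc);
                        cpuc->shared_regs = NULL;
                }

                free_excl_cntrs(cpu);
        }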
    • perf/x86/intel/uncore: Add Node ID mask · 9e63a789
      Authored by Kan Liang
      Some PCI uncore PMUs cannot be registered on an 8-socket system (HPE
      Superdome Flex).
      
      To understand which Socket the PCI uncore PMUs belong to, perf retrieves
      the local Node ID of the uncore device from CPUNODEID(0xC0) of the PCI
      configuration space, and the mapping between Socket ID and Node ID from
      GIDNIDMAP(0xD4). The Socket ID can be calculated accordingly.
      
      The local Node ID is only available in bits 2:0, but the current code
      doesn't mask it. If a BIOS doesn't clear the rest of the bits, an incorrect Node ID
      will be fetched.
      
      Filter the Node ID by adding a mask.
      Reported-by: Song Liu <songliubraving@fb.com>
      Tested-by: Song Liu <songliubraving@fb.com>
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: <stable@vger.kernel.org> # v3.7+
      Fixes: 7c94ee2e ("perf/x86: Add Intel Nehalem and Sandy Bridge-EP uncore support")
      Link: https://lkml.kernel.org/r/1548600794-33162-1-git-send-email-kan.liang@linux.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      9e63a789
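      A sketch of the masking this describes, assuming the config-space read
      in the uncore setup code (the NODE_ID_MASK name is illustrative):

        #define NODE_ID_MASK  0x7  /* local Node ID lives in bits 2:0 of CPUNODEID (0xC0) */

        /* while building the Node ID -> Socket ID mapping: */
        err = pci_read_config_dword(ubox_dev, nodeid_loc, &config);
        if (err)
                break;
        nodeid = config & NODE_ID_MASK;  /* ignore bits a BIOS may leave set */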
  6. 03 Feb 2019, 1 commit
    • x86/MCE: Initialize mce.bank in the case of a fatal error in mce_no_way_out() · d28af26f
      Authored by Tony Luck
      Internal injection testing crashed with a console log that said:
      
        mce: [Hardware Error]: CPU 7: Machine Check Exception: f Bank 0: bd80000000100134
      
      This caused a lot of head scratching because the MCACOD (bits 15:0) of
      that status is a signature from an L1 data cache error. But Linux says
      that it found it in "Bank 0", which on this model CPU only reports L1
      instruction cache errors.
      
      The answer was that Linux doesn't initialize "m->bank" in the case that
      it finds a fatal error in the mce_no_way_out() pre-scan of banks. If
      this was a local machine check, then this partially initialized struct
      mce is being passed to mce_panic().
      
      Fix is simple: just initialize m->bank in the case of a fatal error.
      
      Fixes: 40c36e27 ("x86/mce: Fix incorrect "Machine check from unknown source" message")
      Signed-off-by: Tony Luck <tony.luck@intel.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vishal Verma <vishal.l.verma@intel.com>
      Cc: x86-ml <x86@kernel.org>
      Cc: stable@vger.kernel.org # v4.18 Note pre-v5.0 arch/x86/kernel/cpu/mce/core.c was called arch/x86/kernel/cpu/mcheck/mce.c
      Link: https://lkml.kernel.org/r/20190201003341.10638-1-tony.luck@intel.com
      d28af26f
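      A hedged sketch of the fix inside mce_no_way_out(), with the bank scan
      reduced to its essentials (field and helper names assumed from
      arch/x86/kernel/cpu/mcheck/mce.c):

        for (i = 0; i < mca_cfg.banks; i++) {
                m->status = mce_rdmsrl(msr_ops.status(i));
                if (!(m->status & MCI_STATUS_VAL))
                        continue;

                if (mce_severity(m, mca_cfg.tolerant, &tmp, true) >= MCE_PANIC_SEVERITY) {
                        m->bank = i;    /* record the bank before m reaches mce_panic() */
                        *msg = tmp;
                        return 1;
                }
        }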
  7. 02 Feb 2019, 4 commits
    • x86/resctrl: Avoid confusion over the new X86_RESCTRL config · e6d42931
      Authored by Johannes Weiner
      "Resource Control" is a very broad term for this CPU feature, and a term
      that is also associated with containers, cgroups etc. This can easily
      cause confusion.
      
      Make the user prompt more specific. Match the config symbol name.
      
       [ bp: In the future, the corresponding ARM arch-specific code will be
         under ARM_CPU_RESCTRL and the arch-agnostic bits will be carved out
         under the CPU_RESCTRL umbrella symbol. ]
      Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: Babu Moger <Babu.Moger@amd.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Morse <james.morse@arm.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: linux-doc@vger.kernel.org
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Pu Wen <puwen@hygon.cn>
      Cc: Reinette Chatre <reinette.chatre@intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: x86-ml <x86@kernel.org>
      Link: https://lkml.kernel.org/r/20190130195621.GA30653@cmpxchg.org
      e6d42931
    • x86_64: increase stack size for KASAN_EXTRA · a8e911d1
      Authored by Qian Cai
      If the kernel is configured with KASAN_EXTRA, the stack size is
      increased significantly because this option sets "-fstack-reuse" to
      "none" in GCC [1].  As a result, stack overruns happen quite often
      with the 32k stack size when compiling with GCC 8.  For example, this reproducer
      
        https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/syscalls/madvise/madvise06.c
      
      triggers a "corrupted stack end detected inside scheduler" very reliably
      with CONFIG_SCHED_STACK_END_CHECK enabled.
      
      There are just too many functions that could have a large stack with
      KASAN_EXTRA due to large local variables, and they get called over and
      over again without being able to reuse stack slots.  Some noticeable ones
      are
      
        size
        7648 shrink_page_list
        3584 xfs_rmap_convert
        3312 migrate_page_move_mapping
        3312 dev_ethtool
        3200 migrate_misplaced_transhuge_page
        3168 copy_process
      
      There are another 49 functions over 2k in size when compiling the kernel
      with "-Wframe-larger-than=", even with a relatively minimal config on this
      machine.  Hence, it is too much work to change the Makefiles so that each
      object is compiled without "-fsanitize-address-use-after-scope" individually.
      
      [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81715#c23
      
      Although there is a patch in GCC 9 to help the situation, GCC 9 probably
      won't be released for another few months, and it will then probably take
      another six months to a year for all major distros to ship it as the default.
      Hence, the stack usage with KASAN_EXTRA can be revisited in 2020
      when GCC 9 is everywhere.  Until then, this patch will help users avoid
      stack overruns.
      
      This has already been fixed for arm64 for the same reason via
      6e883067 ("arm64: kasan: Increase stack size for KASAN_EXTRA").
      
      Link: http://lkml.kernel.org/r/20190109215209.2903-1-cai@lca.pw
      Signed-off-by: Qian Cai <cai@lca.pw>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      a8e911d1
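      A sketch of the size bump, assuming the KASAN_STACK_ORDER scheme in
      arch/x86/include/asm/page_64_types.h:

        #ifdef CONFIG_KASAN
        #ifdef CONFIG_KASAN_EXTRA
        #define KASAN_STACK_ORDER 2     /* 64k stacks: -fstack-reuse=none needs the headroom */
        #else
        #define KASAN_STACK_ORDER 1     /* 32k stacks for plain KASAN */
        #endif
        #else
        #define KASAN_STACK_ORDER 0
        #endif

        #define THREAD_SIZE_ORDER       (2 + KASAN_STACK_ORDER)
        #define THREAD_SIZE             (PAGE_SIZE << THREAD_SIZE_ORDER)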
    • x86/kexec: Don't setup EFI info if EFI runtime is not enabled · 2aa958c9
      Authored by Kairui Song
      Kexec-ing a kernel with "efi=noruntime" on the first kernel's command
      line causes the following null pointer dereference:
      
        BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
        #PF error: [normal kernel read fault]
        Call Trace:
         efi_runtime_map_copy+0x28/0x30
         bzImage64_load+0x688/0x872
         arch_kexec_kernel_image_load+0x6d/0x70
         kimage_file_alloc_init+0x13e/0x220
         __x64_sys_kexec_file_load+0x144/0x290
         do_syscall_64+0x55/0x1a0
         entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Just skip the EFI info setup if EFI runtime services are not enabled.
      
       [ bp: Massage commit message. ]
      Suggested-by: Dave Young <dyoung@redhat.com>
      Signed-off-by: Kairui Song <kasong@redhat.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Acked-by: Dave Young <dyoung@redhat.com>
      Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: bhe@redhat.com
      Cc: David Howells <dhowells@redhat.com>
      Cc: erik.schmauss@intel.com
      Cc: fanc.fnst@cn.fujitsu.com
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: kexec@lists.infradead.org
      Cc: lenb@kernel.org
      Cc: linux-acpi@vger.kernel.org
      Cc: Philipp Rudo <prudo@linux.vnet.ibm.com>
      Cc: rafael.j.wysocki@intel.com
      Cc: robert.moore@intel.com
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: x86-ml <x86@kernel.org>
      Cc: Yannik Sembritzki <yannik@sembritzki.me>
      Link: https://lkml.kernel.org/r/20190118111310.29589-2-kasong@redhat.com
      2aa958c9
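      A sketch of the early return this adds to the bzImage64 EFI setup path;
      the exact placement within setup_efi_state() is assumed:

        /* in setup_efi_state(), before copying EFI data for the new kernel: */
        if (efi_enabled(EFI_OLD_MEMMAP))
                return 0;

        /* efi=noruntime or noefi: there is no runtime map to hand over */
        if (!efi_enabled(EFI_RUNTIME_SERVICES))
                return 0;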
    • x86: explicitly align IO accesses in memcpy_{to,from}io · c228d294
      Authored by Linus Torvalds
      In commit 170d13ca ("x86: re-introduce non-generic memcpy_{to,from}io")
      I made our copy from IO space use a separate copy routine rather than
      rely on the generic memcpy.  I did that because our generic memory copy
      isn't actually well-defined when it comes to internal access ordering or
      alignment, and will in fact depend on various CPUID flags.
      
      In particular, the default memcpy() for a modern Intel CPU will
      generally be just a "rep movsb", which works reasonably well for
      medium-sized memory copies of regular RAM, since the CPU will turn it
      into fairly optimized microcode.
      
      However, for non-cached memory and IO, "rep movs" ends up being
      horrendously slow and will just do the architectural "one byte at a
      time" accesses implied by the movsb.
      
      At the other end of the spectrum, if you _don't_ end up using the "rep
      movsb" code, you'd likely fall back to the software copy, which does
      overlapping accesses for the tail, and may copy things backwards.
      Again, for regular memory that's fine, for IO memory not so much.
      
      The thinking was that clearly nobody really cared (because things
      worked), but some people had seen horrible performance due to the byte
      accesses, so let's just revert to our long-ago version that did
      "rep movsl" for the bulk of the copy, and then fixed up the
      last few bytes of the tail with "movsw/b".
      
      Interestingly (and perhaps not entirely surprisingly), while that was
      our original memory copy implementation, and had been used before for
      IO, in the meantime many new users of memcpy_*io() had come about.  And
      while the access patterns for the memory copy weren't well-defined (so
      arguably _any_ access pattern should work), in practice the "rep movsb"
      case had been very common for the last several years.
      
      In particular, Jarkko Sakkinen reported that the memcpy_*io() change
      resulted in weird errors from his Geminilake NUC TPM module.
      
      And it turns out that the TPM TCG accesses according to spec require
      that the accesses be
      
       (a) done strictly sequentially
      
       (b) naturally aligned
      
      otherwise the TPM chip will abort the PCI transaction.
      
      And, in fact, the tpm_crb.c driver did this:
      
      	memcpy_fromio(buf, priv->rsp, 6);
      	...
      	memcpy_fromio(&buf[6], &priv->rsp[6], expected - 6);
      
      which really should never have worked in the first place, but back
      before commit 170d13ca it *happened* to work, because the
      memcpy_fromio() would be expanded to a regular memcpy, and
      
       (a) gcc would expand the first memcpy in-line, and turn it into a
           4-byte and a 2-byte read, and they happened to be in the right
           order, and the alignment was right.
      
       (b) gcc would call "memcpy()" for the second one, and the machines that
           had this TPM chip also apparently ended up always having ERMS
           ("Enhanced REP MOVSB/STOSB instructions"), so we'd use the "rep
           movsb" for that copy.
      
      In other words, basically by pure luck, the code happened to use the
      right access sizes in the (two different!) memcpy() implementations to
      make it all work.
      
      But after commit 170d13ca, both of the memcpy_fromio() calls
      resulted in a call to the routine with the consistent memory accesses,
      and in both cases it started out transferring with 4-byte accesses.
      Which worked for the first copy, but resulted in the second copy doing a
      32-bit read at an address that was only 2-byte aligned.
      
      Jarkko is actually fixing the fragile code in the TPM driver, but since
      this is an excellent example of why we absolutely must not use a generic
      memcpy for IO accesses, _and_ an IO-specific one really should strive to
      align the IO accesses, let's do exactly that.
      
      Side note: Jarkko also noted that the driver had been used on ARM
      platforms, and had worked.  That was because on 32-bit ARM, memcpy_*io()
      ends up always doing byte accesses, and on 64-bit ARM it first does byte
      accesses to align to 8-byte boundaries, and then does 8-byte accesses
      for the bulk.
      
      So ARM actually worked by design, and the x86 case worked by pure luck.
      
      We *might* want to make x86-64 do the 8-byte case too.  That should be a
      pretty straightforward extension, but let's do one thing at a time.  And
      generally MMIO accesses aren't really all that performance-critical, as
      shown by the fact that for a long time we just did them a byte at a
      time, and very few people ever noticed.
      Reported-and-tested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
      Tested-by: Jerry Snitselaar <jsnitsel@redhat.com>
      Cc: David Laight <David.Laight@aculab.com>
      Fixes: 170d13ca ("x86: re-introduce non-generic memcpy_{to,from}io")
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      c228d294
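      A simplified readb/readw/readl based illustration of the access pattern
      (the real x86 routine is built around movs string instructions, so treat
      this purely as a sketch of the alignment strategy, not the kernel code):

        static void memcpy_fromio_aligned(void *to,
                                          const volatile void __iomem *from,
                                          size_t n)
        {
                u8 *d = to;

                /* Peel off a leading byte/word so the bulk is 4-byte aligned. */
                if (n && ((unsigned long)from & 1)) {
                        *d++ = readb(from);
                        from++;
                        n--;
                }
                if (n >= 2 && ((unsigned long)from & 2)) {
                        put_unaligned(readw(from), (u16 *)d);
                        d += 2; from += 2; n -= 2;
                }
                while (n >= 4) {                /* naturally aligned 32-bit reads */
                        put_unaligned(readl(from), (u32 *)d);
                        d += 4; from += 4; n -= 4;
                }
                if (n >= 2) {                   /* aligned 16-bit tail */
                        put_unaligned(readw(from), (u16 *)d);
                        d += 2; from += 2; n -= 2;
                }
                if (n)
                        *d = readb(from);
        }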
  8. 31 Jan 2019, 2 commits
    • x86/microcode/amd: Don't falsely trick the late loading mechanism · 912139cf
      Authored by Thomas Lendacky
      The load_microcode_amd() function searches for microcode patches and
      attempts to apply a microcode patch if it is of different level than the
      currently installed level.
      
      While the processor won't actually load a level that is less than
      what is already installed, the logic wrongly returns UCODE_NEW thus
      signaling to its caller reload_store() that a late loading should be
      attempted.
      
      If the file-system contains an older microcode revision than what is
      currently running, such a late microcode reload can result in these
      misleading messages:
      
        x86/CPU: CPU features have changed after loading microcode, but might not take effect.
        x86/CPU: Please consider either early loading through initrd/built-in or a potential BIOS update.
      
      These messages were issued on a system where SME/SEV are not
      enabled by the BIOS (MSR C001_0010[23] = 0b), because during boot
      early_detect_mem_encrypt() is called and clears the SME and SEV
      feature bits in this case.
      
      However, after the wrong late load attempt, get_cpu_cap() is called and
      reloads the SME and SEV feature bits, resulting in the messages.
      
      Update the microcode level check to not attempt microcode loading if the
      currently installed level is greater than (and not only equal to) the
      patch level.
      
       [ bp: massage commit message. ]
      
      Fixes: 2613f36e ("x86/microcode: Attempt late loading only when new microcode is present")
      Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: x86-ml <x86@kernel.org>
      Link: https://lkml.kernel.org/r/154894518427.9406.8246222496874202773.stgit@tlendack-t1.amdoffice.net
      912139cf
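      A hedged sketch of the changed comparison in load_microcode_amd(); the
      '>=' instead of '==' is the point, the surrounding code is assumed:

        p = find_patch(cpu);
        if (!p) {
                return ret;
        } else {
                /*
                 * The CPU refuses to load a lower-level patch anyway, so only
                 * report UCODE_NEW when the found patch is strictly newer.
                 */
                if (boot_cpu_data.microcode >= p->patch_id)
                        return ret;

                ret = UCODE_NEW;
        }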
    • cpu/hotplug: Fix "SMT disabled by BIOS" detection for KVM · b284909a
      Authored by Josh Poimboeuf
      With the following commit:
      
        73d5e2b4 ("cpu/hotplug: detect SMT disabled by BIOS")
      
      ... the hotplug code attempted to detect when SMT was disabled by BIOS,
      in which case it reported SMT as permanently disabled.  However, that
      code broke a virt hotplug scenario, where the guest is booted with only
      primary CPU threads, and a sibling is brought online later.
      
      The problem is that there doesn't seem to be a way to reliably
      distinguish between the HW "SMT disabled by BIOS" case and the virt
      "sibling not yet brought online" case.  So the above-mentioned commit
      was a bit misguided, as it permanently disabled SMT for both cases,
      preventing future virt sibling hotplugs.
      
      Going back and reviewing the original problems which were attempted to
      be solved by that commit, when SMT was disabled in BIOS:
      
        1) /sys/devices/system/cpu/smt/control showed "on" instead of
           "notsupported"; and
      
        2) vmx_vm_init() was incorrectly showing the L1TF_MSG_SMT warning.
      
      I'd propose that we instead consider #1 above to not actually be a
      problem.  Because, at least in the virt case, it's possible that SMT
      wasn't disabled by BIOS and a sibling thread could be brought online
      later.  So it makes sense to just always default the smt control to "on"
      to allow for that possibility (assuming cpuid indicates that the CPU
      supports SMT).
      
      The real problem is #2, which has a simple fix: change vmx_vm_init() to
      query the actual current SMT state -- i.e., whether any siblings are
      currently online -- instead of looking at the SMT "control" sysfs value.
      
      So fix it by:
      
        a) reverting the original "fix" and its followup fix:
      
           73d5e2b4 ("cpu/hotplug: detect SMT disabled by BIOS")
           bc2d8d26 ("cpu/hotplug: Fix SMT supported evaluation")
      
           and
      
        b) changing vmx_vm_init() to query the actual current SMT state --
           instead of the sysfs control value -- to determine whether the L1TF
           warning is needed.  This also requires the 'sched_smt_present'
           variable to be exported, instead of 'cpu_smt_control'.
      
      Fixes: 73d5e2b4 ("cpu/hotplug: detect SMT disabled by BIOS")
      Reported-by: Igor Mammedov <imammedo@redhat.com>
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Joe Mario <jmario@redhat.com>
      Cc: Jiri Kosina <jikos@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: kvm@vger.kernel.org
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/e3a85d585da28cc333ecbc1e78ee9216e6da9396.1548794349.git.jpoimboe@redhat.com
      b284909a
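      A sketch of the KVM side of the fix, querying the live SMT state rather
      than the sysfs control value (the switch context is an approximation):

        /* in vmx_vm_init(): */
        if (boot_cpu_has(X86_BUG_L1TF) && enable_ept) {
                switch (l1tf_mitigation) {
                case L1TF_MITIGATION_FLUSH:
                case L1TF_MITIGATION_FLUSH_NOSMT:
                case L1TF_MITIGATION_FULL:
                        /*
                         * Warn when starting the first VM in a potentially
                         * insecure environment: check whether siblings are
                         * actually online, not cpu_smt_control.
                         */
                        if (sched_smt_active())
                                pr_warn_once(L1TF_MSG_SMT);
                        if (l1tf_vmx_mitigation == VMENTER_L1D_FLUSH_NEVER)
                                pr_warn_once(L1TF_MSG_L1D);
                        break;
                default:
                        break;
                }
        }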
  9. 30 Jan 2019, 2 commits
  10. 29 Jan 2019, 1 commit
  11. 26 Jan 2019, 4 commits
    • KVM: x86: Mark expected switch fall-throughs · b2869f28
      Authored by Gustavo A. R. Silva
      In preparation for enabling -Wimplicit-fallthrough, mark switch
      cases where we are expecting to fall through.
      
      This patch fixes the following warnings:
      
      arch/x86/kvm/lapic.c:1037:27: warning: this statement may fall through [-Wimplicit-fallthrough=]
      arch/x86/kvm/lapic.c:1876:3: warning: this statement may fall through [-Wimplicit-fallthrough=]
      arch/x86/kvm/hyperv.c:1637:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
      arch/x86/kvm/svm.c:4396:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
      arch/x86/kvm/mmu.c:4372:36: warning: this statement may fall through [-Wimplicit-fallthrough=]
      arch/x86/kvm/x86.c:3835:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
      arch/x86/kvm/x86.c:7938:23: warning: this statement may fall through [-Wimplicit-fallthrough=]
      arch/x86/kvm/vmx/vmx.c:2015:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
      arch/x86/kvm/vmx/vmx.c:1773:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
      
      Warning level 3 was used: -Wimplicit-fallthrough=3
      
      This patch is part of the ongoing effort to enable -Wimplicit-fallthrough.
      Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      b2869f28
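      The change itself is mechanical; a generic example of the annotation
      that -Wimplicit-fallthrough looks for (a made-up helper, not one of the
      exact KVM hunks):

        static int reg_access_width(int reg, int *bytes)
        {
                switch (reg) {
                case 0:
                        *bytes = 4;     /* case 0 intentionally also returns 0 */
                        /* fall through */
                case 1:
                        return 0;
                default:
                        return -EINVAL;
                }
        }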
    • KVM: x86: fix TRACE_INCLUDE_PATH and remove -I. header search paths · 5cd5548f
      Authored by Masahiro Yamada
      The header search path -I. in kernel Makefiles is very suspicious;
      it allows the compiler to search for headers in the top of $(srctree),
      where obviously no header file exists.
      
      The reason for having -I. here is to make the incorrectly set
      TRACE_INCLUDE_PATH work.
      
      As the comment block in include/trace/define_trace.h says,
      TRACE_INCLUDE_PATH should be a path relative to define_trace.h.
      Fix the TRACE_INCLUDE_PATH, and remove the iffy include paths.
      Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      5cd5548f
    • x86/kvm/hyper-v: nested_enable_evmcs() sets vmcs_version incorrectly · 3a2f5773
      Authored by Vitaly Kuznetsov
      Commit e2e871ab ("x86/kvm/hyper-v: Introduce nested_get_evmcs_version()
      helper") broke EVMCS enablement: to set vmcs_version we now call
      nested_get_evmcs_version(), but this function checks the
      enlightened_vmcs_enabled flag, which is not yet set, so we end up
      returning zero.
      
      Fix the issue by re-arranging things in nested_enable_evmcs().
      
      Fixes: e2e871ab ("x86/kvm/hyper-v: Introduce nested_get_evmcs_version() helper")
      Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      3a2f5773
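      A sketch of the re-arrangement: set the flag before deriving
      vmcs_version from it (the tail of the function, which filters
      eVMCS-incompatible VMX controls, is omitted here):

        static int nested_enable_evmcs(struct kvm_vcpu *vcpu,
                                       uint16_t *vmcs_version)
        {
                struct vcpu_vmx *vmx = to_vmx(vcpu);

                /*
                 * Flip the flag first: nested_get_evmcs_version() returns 0
                 * while enlightened_vmcs_enabled is still false.
                 */
                vmx->nested.enlightened_vmcs_enabled = true;

                if (vmcs_version)
                        *vmcs_version = nested_get_evmcs_version(vcpu);

                /* ... clear eVMCS-unsupported control bits here ... */

                return 0;
        }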
    • KVM: VMX: Move vmx_vcpu_run()'s VM-Enter asm blob to a helper function · 5ad6ece8
      Authored by Sean Christopherson
      ...along with the function's STACK_FRAME_NON_STANDARD tag.  Moving the
      asm blob results in a significantly smaller amount of code that is
      marked with STACK_FRAME_NON_STANDARD, which makes it far less likely
      that gcc will split the function and trigger a spurious objtool warning.
      As a bonus, removing STACK_FRAME_NON_STANDARD from vmx_vcpu_run() allows
      the bulk of code to be properly checked by objtool.
      
      Because %rbp is not loaded via VMCS fields, vmx_vcpu_run() must manually
      save/restore the host's RBP and load the guest's RBP prior to calling
      vmx_vmenter().  Modifying %rbp triggers objtool's stack validation code,
      and so vmx_vcpu_run() is tagged with STACK_FRAME_NON_STANDARD since it's
      impossible to avoid modifying %rbp.
      
      Unfortunately, vmx_vcpu_run() is also a gigantic function that gcc will
      split into separate functions, e.g. so that pieces of the function can
      be inlined.  Splitting the function means that the compiled ELF file
      will contain one or more vmx_vcpu_run.part.* functions in addition to
      a vmx_vcpu_run function.  Depending on where the function is split,
      objtool may warn about a "call without frame pointer save/setup" in
      vmx_vcpu_run.part.* since objtool's stack validation looks for exact
      names when whitelisting functions tagged with STACK_FRAME_NON_STANDARD.
      
      Up until recently, the undesirable function splitting was effectively
      blocked because vmx_vcpu_run() was tagged with __noclone.  At the time,
      __noclone had an unintended side effect that put vmx_vcpu_run() into a
      separate optimization unit, which in turn prevented gcc from inlining
      the function (or any of its own function calls) and thus eliminated gcc's
      motivation to split the function.  Removing the __noclone attribute
      allowed gcc to optimize vmx_vcpu_run(), exposing the objtool warning.
      
      Kudos to Qian Cai for root causing that the fnsplit optimization is what
      caused objtool to complain.
      
      Fixes: 453eafbe ("KVM: VMX: Move VM-Enter + VM-Exit handling to non-inline sub-routines")
      Tested-by: Qian Cai <cai@lca.pw>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Reported-by: kbuild test robot <lkp@intel.com>
      Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      5ad6ece8