1. 11 Feb 2019, 5 commits
  2. 06 Feb 2019, 1 commit
  3. 04 Feb 2019, 2 commits
    • perf/x86/intel: Delay memory deallocation until x86_pmu_dead_cpu() · 602cae04
      Committed by Peter Zijlstra
      intel_pmu_cpu_prepare() allocated memory for ->shared_regs among other
      members of struct cpu_hw_events. This memory is released in
      intel_pmu_cpu_dying() which is wrong. The counterpart of the
      intel_pmu_cpu_prepare() callback is x86_pmu_dead_cpu().
      
      Otherwise if the CPU fails on the UP path between CPUHP_PERF_X86_PREPARE
      and CPUHP_AP_PERF_X86_STARTING then it won't release the memory but
      allocate new memory on the next attempt to online the CPU (leaking the
      old memory).
      Also, if the CPU down path fails between CPUHP_AP_PERF_X86_STARTING and
      CPUHP_PERF_X86_PREPARE then the CPU will go back online but never
      allocate the memory that was released in x86_pmu_dying_cpu().
      
      Make the memory allocation/free symmetrical in regard to the CPU hotplug
      notifier by moving the deallocation to intel_pmu_cpu_dead().
      
      This started in commit:
      
         a7e3ed1e ("perf: Add support for supplementary event registers").
      
      In principle the bug was introduced in v2.6.39 (!), but it will almost
      certainly not backport cleanly across the big CPU hotplug rewrite between v4.7-v4.15...
      
      [ bigeasy: Added patch description. ]
      [ mingo: Added backporting guidance. ]
      Reported-by: He Zhe <zhe.he@windriver.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> # With developer hat on
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> # With maintainer hat on
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: acme@kernel.org
      Cc: bp@alien8.de
      Cc: hpa@zytor.com
      Cc: jolsa@kernel.org
      Cc: kan.liang@linux.intel.com
      Cc: namhyung@kernel.org
      Cc: <stable@vger.kernel.org>
      Fixes: a7e3ed1e ("perf: Add support for supplementary event registers").
      Link: https://lkml.kernel.org/r/20181219165350.6s3jvyxbibpvlhtq@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • perf/x86/intel/uncore: Add Node ID mask · 9e63a789
      Committed by Kan Liang
      Some PCI uncore PMUs cannot be registered on an 8-socket system (HPE
      Superdome Flex).
      
      To determine which Socket the PCI uncore PMUs belong to, perf retrieves
      the local Node ID of the uncore device from CPUNODEID (0xC0) in the PCI
      configuration space, and the mapping between Socket ID and Node ID from
      GIDNIDMAP (0xD4). The Socket ID can then be calculated from these.

      The local Node ID is only available in bits 2:0, but the current code
      doesn't mask it. If the BIOS doesn't clear the remaining bits, an
      incorrect Node ID will be fetched.
      
      Filter the Node ID by adding a mask.
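      A minimal sketch of the lookup with the mask applied (the function name
      and NODE_ID_MASK are illustrative; the register offsets are the ones
      quoted above):

        #define NODE_ID_MASK	0x7	/* local Node ID lives in bits 2:0 */

        static int example_pci_node_to_socket(struct pci_dev *ubox, int *socket)
        {
        	u32 nodeid, gidmap;
        	int i;

        	if (pci_read_config_dword(ubox, 0xC0, &nodeid))	/* CPUNODEID */
        		return -ENXIO;
        	nodeid &= NODE_ID_MASK;		/* the actual fix: drop bits 31:3 */

        	if (pci_read_config_dword(ubox, 0xD4, &gidmap))	/* GIDNIDMAP */
        		return -ENXIO;

        	/* GIDNIDMAP holds one 3-bit Node ID per Socket slot. */
        	for (i = 0; i < 8; i++) {
        		if (((gidmap >> (i * 3)) & 0x7) == nodeid) {
        			*socket = i;
        			return 0;
        		}
        	}
        	return -ENXIO;
        }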
      Reported-by: Song Liu <songliubraving@fb.com>
      Tested-by: Song Liu <songliubraving@fb.com>
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: <stable@vger.kernel.org> # v3.7+
      Fixes: 7c94ee2e ("perf/x86: Add Intel Nehalem and Sandy Bridge-EP uncore support")
      Link: https://lkml.kernel.org/r/1548600794-33162-1-git-send-email-kan.liang@linux.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  4. 02 Feb 2019, 4 commits
    • x86/resctrl: Avoid confusion over the new X86_RESCTRL config · e6d42931
      Committed by Johannes Weiner
      "Resource Control" is a very broad term for this CPU feature, and a term
      that is also associated with containers, cgroups etc. This can easily
      cause confusion.
      
      Make the user prompt more specific. Match the config symbol name.
      
       [ bp: In the future, the corresponding ARM arch-specific code will be
         under ARM_CPU_RESCTRL and the arch-agnostic bits will be carved out
         under the CPU_RESCTRL umbrella symbol. ]
      Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: Babu Moger <Babu.Moger@amd.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Morse <james.morse@arm.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: linux-doc@vger.kernel.org
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Pu Wen <puwen@hygon.cn>
      Cc: Reinette Chatre <reinette.chatre@intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: x86-ml <x86@kernel.org>
      Link: https://lkml.kernel.org/r/20190130195621.GA30653@cmpxchg.org
    • x86_64: increase stack size for KASAN_EXTRA · a8e911d1
      Committed by Qian Cai
      If the kernel is configured with KASAN_EXTRA, the stack size is
      increased significantly because this option sets "-fstack-reuse" to
      "none" in GCC [1]. As a result, stack overruns are triggered quite often
      with the 32k stack size on kernels compiled with GCC 8. For example,
      this reproducer
      
        https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/syscalls/madvise/madvise06.c
      
      triggers a "corrupted stack end detected inside scheduler" very reliably
      with CONFIG_SCHED_STACK_END_CHECK enabled.
      
      There are just too many functions that could have a large stack with
      KASAN_EXTRA due to large local variables, and that are called over and
      over again without being able to reuse their stack slots. Some
      noticeable ones are:
      
        size
        7648 shrink_page_list
        3584 xfs_rmap_convert
        3312 migrate_page_move_mapping
        3312 dev_ethtool
        3200 migrate_misplaced_transhuge_page
        3168 copy_process
      
      There are 49 other functions that are over 2k in size when compiling the
      kernel with "-Wframe-larger-than=", even with a related minimal config
      on this machine. Hence, it is too much work to change the Makefiles for
      each object to compile without "-fsanitize-address-use-after-scope"
      individually.
      
      [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81715#c23
      
      Although there is a patch in GCC 9 to help the situation, GCC 9 probably
      won't be released for a few months, and it will then probably take
      another six months to a year for all major distros to ship it as a
      default. Hence, the stack usage with KASAN_EXTRA can be revisited in
      2020 when GCC 9 is everywhere. Until then, this patch will help users
      avoid stack overruns.
      
      This has already been fixed for arm64 for the same reason via
      6e883067 ("arm64: kasan: Increase stack size for KASAN_EXTRA").
      
      Link: http://lkml.kernel.org/r/20190109215209.2903-1-cai@lca.pw
      Signed-off-by: Qian Cai <cai@lca.pw>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • x86/kexec: Don't setup EFI info if EFI runtime is not enabled · 2aa958c9
      Committed by Kairui Song
      Kexec-ing a kernel with "efi=noruntime" on the first kernel's command
      line causes the following null pointer dereference:
      
        BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
        #PF error: [normal kernel read fault]
        Call Trace:
         efi_runtime_map_copy+0x28/0x30
         bzImage64_load+0x688/0x872
         arch_kexec_kernel_image_load+0x6d/0x70
         kimage_file_alloc_init+0x13e/0x220
         __x64_sys_kexec_file_load+0x144/0x290
         do_syscall_64+0x55/0x1a0
         entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Just skip the EFI info setup if EFI runtime services are not enabled.
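      A minimal sketch of the check (the surrounding function name is
      illustrative; the point is only to bail out before any EFI runtime data
      is dereferenced):

        static int example_setup_efi_info(struct boot_params *params)
        {
        	/* Booted with efi=noruntime (or no EFI at all): there is no
        	 * runtime map to hand over to the kexec'd kernel, so skip the
        	 * EFI setup_data entirely instead of dereferencing NULL. */
        	if (!efi_enabled(EFI_RUNTIME_SERVICES))
        		return 0;

        	/* ...otherwise proceed with efi_runtime_map_copy() as before. */
        	return 0;
        }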
      
       [ bp: Massage commit message. ]
      Suggested-by: Dave Young <dyoung@redhat.com>
      Signed-off-by: Kairui Song <kasong@redhat.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Acked-by: Dave Young <dyoung@redhat.com>
      Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: bhe@redhat.com
      Cc: David Howells <dhowells@redhat.com>
      Cc: erik.schmauss@intel.com
      Cc: fanc.fnst@cn.fujitsu.com
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: kexec@lists.infradead.org
      Cc: lenb@kernel.org
      Cc: linux-acpi@vger.kernel.org
      Cc: Philipp Rudo <prudo@linux.vnet.ibm.com>
      Cc: rafael.j.wysocki@intel.com
      Cc: robert.moore@intel.com
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: x86-ml <x86@kernel.org>
      Cc: Yannik Sembritzki <yannik@sembritzki.me>
      Link: https://lkml.kernel.org/r/20190118111310.29589-2-kasong@redhat.com
    • x86: explicitly align IO accesses in memcpy_{to,from}io · c228d294
      Committed by Linus Torvalds
      In commit 170d13ca ("x86: re-introduce non-generic memcpy_{to,from}io")
      I made our copy from IO space use a separate copy routine rather than
      rely on the generic memcpy.  I did that because our generic memory copy
      isn't actually well-defined when it comes to internal access ordering or
      alignment, and will in fact depend on various CPUID flags.
      
      In particular, the default memcpy() for a modern Intel CPU will
      generally be just a "rep movsb", which works reasonably well for
      medium-sized memory copies of regular RAM, since the CPU will turn it
      into fairly optimized microcode.
      
      However, for non-cached memory and IO, "rep movs" ends up being
      horrendously slow and will just do the architectural "one byte at a
      time" accesses implied by the movsb.
      
      At the other end of the spectrum, if you _don't_ end up using the "rep
      movsb" code, you'd likely fall back to the software copy, which does
      overlapping accesses for the tail, and may copy things backwards.
      Again, for regular memory that's fine, for IO memory not so much.
      
      The thinking was that clearly nobody really cared (because things
      worked), but some people had seen horrible performance due to the byte
      accesses, so let's just revert back to our long-ago version that did
      "rep movsl" for the bulk of the copy, and then fixed up the last few
      bytes of the tail with "movsw/b".
      
      Interestingly (and perhaps not entirely surprisingly), while that was
      our original memory copy implementation, and had been used before for
      IO, in the meantime many new users of memcpy_*io() had come about.  And
      while the access patterns for the memory copy weren't well-defined (so
      arguably _any_ access pattern should work), in practice the "rep movsb"
      case had been very common for the last several years.
      
      In particular Jarkko Sakkinen reported that the memcpy_*io() change
      resulted in weird errors from his Geminilake NUC TPM module.
      
      And it turns out that the TPM TCG accesses according to spec require
      that the accesses be
      
       (a) done strictly sequentially
      
       (b) naturally aligned
      
      otherwise the TPM chip will abort the PCI transaction.
      
      And, in fact, the tpm_crb.c driver did this:
      
      	memcpy_fromio(buf, priv->rsp, 6);
      	...
      	memcpy_fromio(&buf[6], &priv->rsp[6], expected - 6);
      
      which really should never have worked in the first place, but back
      before commit 170d13ca it *happened* to work, because the
      memcpy_fromio() would be expanded to a regular memcpy, and
      
       (a) gcc would expand the first memcpy in-line, and turn it into a
           4-byte and a 2-byte read, and they happened to be in the right
           order, and the alignment was right.
      
       (b) gcc would call "memcpy()" for the second one, and the machines that
           had this TPM chip also apparently ended up always having ERMS
           ("Enhanced REP MOVSB/STOSB instructions"), so we'd use the "rep
           movsb" for that copy.
      
      In other words, basically by pure luck, the code happened to use the
      right access sizes in the (two different!) memcpy() implementations to
      make it all work.
      
      But after commit 170d13ca, both of the memcpy_fromio() calls
      resulted in a call to the routine with the consistent memory accesses,
      and in both cases it started out transferring with 4-byte accesses.
      Which worked for the first copy, but resulted in the second copy doing a
      32-bit read at an address that was only 2-byte aligned.
      
      Jarkko is actually fixing the fragile code in the TPM driver, but since
      this is an excellent example of why we absolutely must not use a generic
      memcpy for IO accesses, _and_ an IO-specific one really should strive to
      align the IO accesses, let's do exactly that.
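      A minimal sketch of the access pattern (not necessarily the routine
      that was merged, just the alignment idea): byte reads until the IO
      address is 4-byte aligned, 32-bit reads for the bulk, byte reads for
      the tail.

        static void example_memcpy_fromio(void *to,
        				  const volatile void __iomem *from,
        				  size_t n)
        {
        	/* Head: byte accesses until the IO side is 4-byte aligned. */
        	while (n && ((unsigned long)from & 3)) {
        		*(u8 *)to = readb(from);
        		to++; from++; n--;
        	}

        	/* Bulk: naturally aligned 32-bit accesses, strictly in order. */
        	while (n >= 4) {
        		*(u32 *)to = readl(from);
        		to += 4; from += 4; n -= 4;
        	}

        	/* Tail: whatever is left, one byte at a time. */
        	while (n) {
        		*(u8 *)to = readb(from);
        		to++; from++; n--;
        	}
        }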
      
      Side note: Jarkko also noted that the driver had been used on ARM
      platforms, and had worked.  That was because on 32-bit ARM, memcpy_*io()
      ends up always doing byte accesses, and on 64-bit ARM it first does byte
      accesses to align to 8-byte boundaries, and then does 8-byte accesses
      for the bulk.
      
      So ARM actually worked by design, and the x86 case worked by pure luck.
      
      We *might* want to make x86-64 do the 8-byte case too.  That should be a
      pretty straightforward extension, but let's do one thing at a time.  And
      generally MMIO accesses aren't really all that performance-critical, as
      shown by the fact that for a long time we just did them a byte at a
      time, and very few people ever noticed.
      Reported-and-tested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
      Tested-by: Jerry Snitselaar <jsnitsel@redhat.com>
      Cc: David Laight <David.Laight@aculab.com>
      Fixes: 170d13ca ("x86: re-introduce non-generic memcpy_{to,from}io")
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  5. 31 Jan 2019, 2 commits
    • x86/microcode/amd: Don't falsely trick the late loading mechanism · 912139cf
      Committed by Thomas Lendacky
      The load_microcode_amd() function searches for microcode patches and
      attempts to apply a microcode patch if it is of a different level than
      the currently installed level.
      
      While the processor won't actually load a level that is less than
      what is already installed, the logic wrongly returns UCODE_NEW thus
      signaling to its caller reload_store() that a late loading should be
      attempted.
      
      If the file-system contains an older microcode revision than what is
      currently running, such a late microcode reload can result in these
      misleading messages:
      
        x86/CPU: CPU features have changed after loading microcode, but might not take effect.
        x86/CPU: Please consider either early loading through initrd/built-in or a potential BIOS update.
      
      These messages were issued on a system where SME/SEV are not enabled
      by the BIOS (MSR C001_0010[23] = 0b), because during boot
      early_detect_mem_encrypt() is called and clears the SME and SEV feature
      bits in this case.
      
      However, after the wrong late load attempt, get_cpu_cap() is called and
      reloads the SME and SEV feature bits, resulting in the messages.
      
      Update the microcode level check to not attempt microcode loading if the
      currently installed level is greater than (!) the patch level, and not
      only when the two are equal.
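      A minimal sketch of the corrected condition (identifiers are
      illustrative; in the driver the comparison is between the patch level
      already running on the CPU and the level of the patch found in the
      container):

        static enum ucode_state example_check_patch(u32 current_rev, u32 patch_level)
        {
        	/* The old check effectively treated any *different* level as new,
        	 * so an older container also triggered a late load attempt. */
        	if (patch_level <= current_rev)
        		return UCODE_OK;	/* nothing worth late-loading */

        	return UCODE_NEW;		/* strictly newer: allow a reload */
        }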
      
       [ bp: massage commit message. ]
      
      Fixes: 2613f36e ("x86/microcode: Attempt late loading only when new microcode is present")
      Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: x86-ml <x86@kernel.org>
      Link: https://lkml.kernel.org/r/154894518427.9406.8246222496874202773.stgit@tlendack-t1.amdoffice.net
    • cpu/hotplug: Fix "SMT disabled by BIOS" detection for KVM · b284909a
      Committed by Josh Poimboeuf
      With the following commit:
      
        73d5e2b4 ("cpu/hotplug: detect SMT disabled by BIOS")
      
      ... the hotplug code attempted to detect when SMT was disabled by BIOS,
      in which case it reported SMT as permanently disabled.  However, that
      code broke a virt hotplug scenario, where the guest is booted with only
      primary CPU threads, and a sibling is brought online later.
      
      The problem is that there doesn't seem to be a way to reliably
      distinguish between the HW "SMT disabled by BIOS" case and the virt
      "sibling not yet brought online" case.  So the above-mentioned commit
      was a bit misguided, as it permanently disabled SMT for both cases,
      preventing future virt sibling hotplugs.
      
      Going back and reviewing the original problems which that commit
      attempted to solve when SMT was disabled in the BIOS:
      
        1) /sys/devices/system/cpu/smt/control showed "on" instead of
           "notsupported"; and
      
        2) vmx_vm_init() was incorrectly showing the L1TF_MSG_SMT warning.
      
      I'd propose that we instead consider #1 above to not actually be a
      problem.  Because, at least in the virt case, it's possible that SMT
      wasn't disabled by BIOS and a sibling thread could be brought online
      later.  So it makes sense to just always default the smt control to "on"
      to allow for that possibility (assuming cpuid indicates that the CPU
      supports SMT).
      
      The real problem is #2, which has a simple fix: change vmx_vm_init() to
      query the actual current SMT state -- i.e., whether any siblings are
      currently online -- instead of looking at the SMT "control" sysfs value.
      
      So fix it by:
      
        a) reverting the original "fix" and its followup fix:
      
           73d5e2b4 ("cpu/hotplug: detect SMT disabled by BIOS")
           bc2d8d26 ("cpu/hotplug: Fix SMT supported evaluation")
      
           and
      
        b) changing vmx_vm_init() to query the actual current SMT state --
           instead of the sysfs control value -- to determine whether the L1TF
           warning is needed.  This also requires the 'sched_smt_present'
           variable to be exported instead of 'cpu_smt_control'; see the
           sketch below.
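      A minimal sketch of the vmx_vm_init() side of (b), reduced to the L1TF
      warning logic (the helper name is illustrative; sched_smt_active() is
      the accessor built on the exported 'sched_smt_present' key):

        static void example_vmx_warn_l1tf_if_smt(void)
        {
        	/* Query the *current* SMT state -- are any siblings online right
        	 * now? -- rather than the sysfs SMT control value, which now
        	 * stays "on" even when no sibling has been hotplugged yet. */
        	if (boot_cpu_has(X86_BUG_L1TF) && sched_smt_active())
        		pr_warn_once(L1TF_MSG_SMT);
        }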
      
      Fixes: 73d5e2b4 ("cpu/hotplug: detect SMT disabled by BIOS")
      Reported-by: Igor Mammedov <imammedo@redhat.com>
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Joe Mario <jmario@redhat.com>
      Cc: Jiri Kosina <jikos@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: kvm@vger.kernel.org
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/e3a85d585da28cc333ecbc1e78ee9216e6da9396.1548794349.git.jpoimboe@redhat.com
  6. 30 Jan 2019, 2 commits
  7. 29 Jan 2019, 1 commit
  8. 26 Jan 2019, 16 commits
  9. 23 Jan 2019, 1 commit
  10. 21 Jan 2019, 2 commits
    • perf/core, arch/x86: Strengthen exclusion checks with PERF_PMU_CAP_NO_EXCLUDE · 88dbe3c9
      Committed by Andrew Murray
      For x86 PMUs that do not support context exclusion let's advertise the
      PERF_PMU_CAP_NO_EXCLUDE capability. This ensures that perf will
      prevent us from handling events where any exclusion flags are set.
      Let's also remove the now unnecessary check for exclusion flags.
      
      This change means that amd/iommu and amd/uncore will now also
      indicate that they do not support exclude_{hv|idle} and intel/uncore
      that it does not support exclude_{guest|host}.
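      A minimal sketch of what advertising the capability looks like in a PMU
      driver (the structure and names below are illustrative, not a specific
      x86 driver):

        static int example_uncore_event_init(struct perf_event *event)
        {
        	/* No open-coded "if (event->attr.exclude_user || ...)" check is
        	 * needed here any more: perf core rejects events with any
        	 * exclude_* bit set before this callback runs, because of the
        	 * capability advertised below. */
        	return 0;
        }

        static struct pmu example_uncore_pmu = {
        	.task_ctx_nr	= perf_invalid_context,
        	.event_init	= example_uncore_event_init,
        	.capabilities	= PERF_PMU_CAP_NO_EXCLUDE,
        };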
      Signed-off-by: Andrew Murray <andrew.murray@arm.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Sascha Hauer <s.hauer@pengutronix.de>
      Cc: Shawn Guo <shawnguo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: linuxppc-dev@lists.ozlabs.org
      Cc: robin.murphy@arm.com
      Cc: suzuki.poulose@arm.com
      Link: https://lkml.kernel.org/r/1547128414-50693-12-git-send-email-andrew.murray@arm.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • perf/core, arch/x86: Use PERF_PMU_CAP_NO_EXCLUDE for exclusion incapable PMUs · 2ff40250
      Committed by Andrew Murray
      For drivers that do not support context exclusion let's advertise the
      PERF_PMU_CAP_NO_EXCLUDE capability. This ensures that perf will
      prevent us from handling events where any exclusion flags are set.
      Let's also remove the now unnecessary check for exclusion flags.
      
      PMU drivers that support at least one exclude flag won't have the
      PERF_PMU_CAP_NO_EXCLUDE capability set - these PMU drivers should still
      check and fail on unsupported exclude flags. These missing tests are
      not added in this patch.
      Signed-off-by: Andrew Murray <andrew.murray@arm.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Sascha Hauer <s.hauer@pengutronix.de>
      Cc: Shawn Guo <shawnguo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: linuxppc-dev@lists.ozlabs.org
      Cc: robin.murphy@arm.com
      Cc: suzuki.poulose@arm.com
      Link: https://lkml.kernel.org/r/1547128414-50693-11-git-send-email-andrew.murray@arm.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  11. 20 Jan 2019, 1 commit
    • x86: uaccess: Inhibit speculation past access_ok() in user_access_begin() · 6e693b3f
      Committed by Will Deacon
      Commit 594cc251 ("make 'user_access_begin()' do 'access_ok()'")
      makes the access_ok() check part of the user_access_begin() preceding a
      series of 'unsafe' accesses.  This has the desirable effect of ensuring
      that all 'unsafe' accesses have been range-checked, without having to
      pick through all of the callsites to verify whether the appropriate
      checking has been made.
      
      However, the consolidated range check does not inhibit speculation, so
      it is still up to the caller to ensure that they are not susceptible to
      any speculative side-channel attacks for user addresses that ultimately
      fail the access_ok() check.
      
      This is an oversight, so use __uaccess_begin_nospec() to ensure that
      speculation is inhibited until the access_ok() check has passed.
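      A sketch of the resulting x86 helper (close to, though not necessarily
      identical to, the upstream definition): the range check and the
      speculation barrier are now tied together, so no 'unsafe' access can be
      issued speculatively for an address that fails access_ok().

        static __must_check __always_inline bool
        user_access_begin(const void __user *ptr, size_t len)
        {
        	if (unlikely(!access_ok(ptr, len)))
        		return 0;
        	__uaccess_begin_nospec();	/* STAC plus speculation barrier */
        	return 1;
        }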
      Reported-by: Julien Thierry <julien.thierry@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  12. 18 Jan 2019, 2 commits
  13. 17 Jan 2019, 1 commit
    • xen: Fix x86 sched_clock() interface for xen · 867cefb4
      Committed by Juergen Gross
      Commit f94c8d11 ("sched/clock, x86/tsc: Rework the x86 'unstable'
      sched_clock() interface") broke Xen guest time handling across
      migration:
      
      [  187.249951] Freezing user space processes ... (elapsed 0.001 seconds) done.
      [  187.251137] OOM killer disabled.
      [  187.251137] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
      [  187.252299] suspending xenstore...
      [  187.266987] xen:grant_table: Grant tables using version 1 layout
      [18446743811.706476] OOM killer enabled.
      [18446743811.706478] Restarting tasks ... done.
      [18446743811.720505] Setting capacity to 16777216
      
      Fix that by setting xen_sched_clock_offset at resume time to ensure a
      monotonic clock value.
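      A minimal sketch of the idea (the suspend/resume hook names are
      illustrative; xen_clocksource_read() and xen_sched_clock_offset are the
      identifiers used by the Xen time code): save the sched_clock() value at
      suspend and re-base the offset at resume so the clock continues
      monotonically instead of jumping by the host-side delta.

        static u64 xen_sched_clock_offset;
        static u64 xen_clock_value_saved;	/* sched_clock() value at suspend */

        static u64 xen_sched_clock(void)
        {
        	return xen_clocksource_read() - xen_sched_clock_offset;
        }

        static void example_xen_clock_suspend(void)
        {
        	xen_clock_value_saved = xen_clocksource_read() - xen_sched_clock_offset;
        }

        static void example_xen_clock_resume(void)
        {
        	/* Xen system time on the migration target differs from the source,
        	 * so re-base the offset such that sched_clock() resumes from the
        	 * value it had at suspend. */
        	xen_sched_clock_offset = xen_clocksource_read() - xen_clock_value_saved;
        }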
      
      [boris: replaced pr_info() with pr_info_once() in xen_callback_vector()
       to avoid printing with incorrect timestamp during resume (as we
       haven't re-adjusted the clock yet)]
      
      Fixes: f94c8d11 ("sched/clock, x86/tsc: Rework the x86 'unstable' sched_clock() interface")
      Cc: <stable@vger.kernel.org> # 4.11
      Reported-by: Hans van Kranenburg <hans.van.kranenburg@mendix.com>
      Signed-off-by: Juergen Gross <jgross@suse.com>
      Tested-by: Hans van Kranenburg <hans.van.kranenburg@mendix.com>
      Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>