1. 03 7月, 2018 1 次提交
  2. 17 5月, 2018 1 次提交
    • T
      x86/speculation: Handle HT correctly on AMD · 1f50ddb4
      Thomas Gleixner 提交于
      The AMD64_LS_CFG MSR is a per core MSR on Family 17H CPUs. That means when
      hyperthreading is enabled the SSBD bit toggle needs to take both cores into
      account. Otherwise the following situation can happen:
      
      CPU0		CPU1
      
      disable SSB
      		disable SSB
      		enable  SSB <- Enables it for the Core, i.e. for CPU0 as well
      
      So after the SSB enable on CPU1 the task on CPU0 runs with SSB enabled
      again.
      
      On Intel the SSBD control is per core as well, but the synchronization
      logic is implemented behind the per thread SPEC_CTRL MSR. It works like
      this:
      
        CORE_SPEC_CTRL = THREAD0_SPEC_CTRL | THREAD1_SPEC_CTRL
      
      i.e. if one of the threads enables a mitigation then this affects both and
      the mitigation is only disabled in the core when both threads disabled it.
      
      Add the necessary synchronization logic for AMD family 17H. Unfortunately
      that requires a spinlock to serialize the access to the MSR, but the locks
      are only shared between siblings.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NBorislav Petkov <bp@suse.de>
      Reviewed-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      1f50ddb4
  3. 06 5月, 2018 1 次提交
  4. 26 4月, 2018 1 次提交
  5. 17 4月, 2018 1 次提交
    • A
      x86,sched: Allow topologies where NUMA nodes share an LLC · 1340ccfa
      Alison Schofield 提交于
      Intel's Skylake Server CPUs have a different LLC topology than previous
      generations. When in Sub-NUMA-Clustering (SNC) mode, the package is divided
      into two "slices", each containing half the cores, half the LLC, and one
      memory controller and each slice is enumerated to Linux as a NUMA
      node. This is similar to how the cores and LLC were arranged for the
      Cluster-On-Die (CoD) feature.
      
      CoD allowed the same cache line to be present in each half of the LLC.
      But, with SNC, each line is only ever present in *one* slice. This means
      that the portion of the LLC *available* to a CPU depends on the data being
      accessed:
      
          Remote socket: entire package LLC is shared
          Local socket->local slice: data goes into local slice LLC
          Local socket->remote slice: data goes into remote-slice LLC. Slightly
                          		higher latency than local slice LLC.
      
      The biggest implication from this is that a process accessing all
      NUMA-local memory only sees half the LLC capacity.
      
      The CPU describes its cache hierarchy with the CPUID instruction. One of
      the CPUID leaves enumerates the "logical processors sharing this
      cache". This information is used for scheduling decisions so that tasks
      move more freely between CPUs sharing the cache.
      
      But, the CPUID for the SNC configuration discussed above enumerates the LLC
      as being shared by the entire package. This is not 100% precise because the
      entire cache is not usable by all accesses. But, it *is* the way the
      hardware enumerates itself, and this is not likely to change.
      
      The userspace visible impact of all the above is that the sysfs info
      reports the entire LLC as being available to the entire package. As noted
      above, this is not true for local socket accesses. This patch does not
      correct the sysfs info. It is the same, pre and post patch.
      
      The current code emits the following warning:
      
       sched: CPU #3's llc-sibling CPU #0 is not on the same node! [node: 1 != 0]. Ignoring dependency.
      
      The warning is coming from the topology_sane() check in smpboot.c because
      the topology is not matching the expectations of the model for obvious
      reasons.
      
      To fix this, add a vendor and model specific check to never call
      topology_sane() for these systems. Also, just like "Cluster-on-Die" disable
      the "coregroup" sched_domain_topology_level and use NUMA information from
      the SRAT alone.
      
      This is OK at least on the hardware we are immediately concerned about
      because the LLC sharing happens at both the slice and at the package level,
      which are also NUMA boundaries.
      Signed-off-by: NAlison Schofield <alison.schofield@intel.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NBorislav Petkov <bp@suse.de>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: brice.goglin@gmail.com
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Igor Mammedov <imammedo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@linux.intel.com>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Link: https://lkml.kernel.org/r/20180407002130.GA18984@alison-desk.jf.intel.com
      1340ccfa
  6. 23 2月, 2018 1 次提交
  7. 17 2月, 2018 1 次提交
  8. 13 2月, 2018 1 次提交
    • M
      x86/smpboot: Fix uncore_pci_remove() indexing bug when hot-removing a physical CPU · 295cc7eb
      Masayoshi Mizuma 提交于
      When a physical CPU is hot-removed, the following warning messages
      are shown while the uncore device is removed in uncore_pci_remove():
      
        WARNING: CPU: 120 PID: 5 at arch/x86/events/intel/uncore.c:988
        uncore_pci_remove+0xf1/0x110
        ...
        CPU: 120 PID: 5 Comm: kworker/u1024:0 Not tainted 4.15.0-rc8 #1
        Workqueue: kacpi_hotplug acpi_hotplug_work_fn
        ...
        Call Trace:
        pci_device_remove+0x36/0xb0
        device_release_driver_internal+0x145/0x210
        pci_stop_bus_device+0x76/0xa0
        pci_stop_root_bus+0x44/0x60
        acpi_pci_root_remove+0x1f/0x80
        acpi_bus_trim+0x54/0x90
        acpi_bus_trim+0x2e/0x90
        acpi_device_hotplug+0x2bc/0x4b0
        acpi_hotplug_work_fn+0x1a/0x30
        process_one_work+0x141/0x340
        worker_thread+0x47/0x3e0
        kthread+0xf5/0x130
      
      When uncore_pci_remove() runs, it tries to get the package ID to
      clear the value of uncore_extra_pci_dev[].dev[] by using
      topology_phys_to_logical_pkg(). The warning messesages are
      shown because topology_phys_to_logical_pkg() returns -1.
      
        arch/x86/events/intel/uncore.c:
        static void uncore_pci_remove(struct pci_dev *pdev)
        {
        ...
                phys_id = uncore_pcibus_to_physid(pdev->bus);
        ...
                        pkg = topology_phys_to_logical_pkg(phys_id); // returns -1
                        for (i = 0; i < UNCORE_EXTRA_PCI_DEV_MAX; i++) {
                                if (uncore_extra_pci_dev[pkg].dev[i] == pdev) {
                                        uncore_extra_pci_dev[pkg].dev[i] = NULL;
                                        break;
                                }
                        }
                        WARN_ON_ONCE(i >= UNCORE_EXTRA_PCI_DEV_MAX); // <=========== HERE!!
      
      topology_phys_to_logical_pkg() tries to find
      cpuinfo_x86->phys_proc_id that matches the phys_pkg argument.
      
        arch/x86/kernel/smpboot.c:
        int topology_phys_to_logical_pkg(unsigned int phys_pkg)
        {
                int cpu;
      
                for_each_possible_cpu(cpu) {
                        struct cpuinfo_x86 *c = &cpu_data(cpu);
      
                        if (c->initialized && c->phys_proc_id == phys_pkg)
                                return c->logical_proc_id;
                }
                return -1;
        }
      
      However, the phys_proc_id was already set to 0 by remove_siblinginfo()
      when the CPU was offlined.
      
      So, topology_phys_to_logical_pkg() cannot find the correct
      logical_proc_id and always returns -1.
      
      As the result, uncore_pci_remove() calls WARN_ON_ONCE() and the warning
      messages are shown.
      
      What is worse is that the bogus 'pkg' index results in two bugs:
      
       - We dereference uncore_extra_pci_dev[] with a negative index
       - We fail to clean up a stale pointer in uncore_extra_pci_dev[][]
      
      To fix these bugs, remove the clearing of ->phys_proc_id from remove_siblinginfo().
      
      This should not cause any problems, because ->phys_proc_id is not
      used after it is hot-removed and it is re-set while hot-adding.
      Signed-off-by: NMasayoshi Mizuma <m.mizuma@jp.fujitsu.com>
      Acked-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: yasu.isimatu@gmail.com
      Cc: <stable@vger.kernel.org>
      Fixes: 30bb9811 ("x86/topology: Avoid wasting 128k for package id array")
      Link: http://lkml.kernel.org/r/ed738d54-0f01-b38b-b794-c31dc118c207@gmail.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      295cc7eb
  9. 15 1月, 2018 1 次提交
  10. 31 12月, 2017 1 次提交
  11. 23 12月, 2017 1 次提交
    • T
      init: Invoke init_espfix_bsp() from mm_init() · 613e396b
      Thomas Gleixner 提交于
      init_espfix_bsp() needs to be invoked before the page table isolation
      initialization. Move it into mm_init() which is the place where pti_init()
      will be added.
      
      While at it get rid of the #ifdeffery and provide proper stub functions.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      613e396b
  12. 12 12月, 2017 1 次提交
  13. 07 12月, 2017 1 次提交
    • P
      x86/smpboot: Do not use smp_num_siblings in __max_logical_packages calculation · 947134d9
      Prarit Bhargava 提交于
      Documentation/x86/topology.txt defines smp_num_siblings as "The number of
      threads in a core".  Since commit bbb65d2d ("x86: use cpuid vector 0xb
      when available for detecting cpu topology") smp_num_siblings is the
      maximum number of threads in a core.  If Simultaneous MultiThreading
      (SMT) is disabled on a system, smp_num_siblings is 2 and not 1 as
      expected.
      
      Use topology_max_smt_threads(), which contains the active numer of threads,
      in the __max_logical_packages calculation.
      
      On a single socket, single core, single thread system __max_smt_threads has
      not been updated when the __max_logical_packages calculation happens, so its
      zero which makes the package estimate fail. Initialize it to one, which is
      the minimum number of threads on a core.
      
      [ tglx: Folded the __max_smt_threads fix in ]
      
      Fixes: b4c0a732 ("x86/smpboot: Fix __max_logical_packages estimate")
      Reported-by: NJakub Kicinski <kubakici@wp.pl>
      Signed-off-by: Prarit Bhargava <prarit@redhat.com
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Tested-by: NJakub Kicinski <kubakici@wp.pl>
      Cc: netdev@vger.kernel.org
      Cc: "netdev@vger.kernel.org"
      Cc: Clark Williams <williams@redhat.com>
      Link: https://lkml.kernel.org/r/20171204164521.17870-1-prarit@redhat.com
      947134d9
  14. 28 11月, 2017 1 次提交
  15. 17 11月, 2017 2 次提交
    • P
      x86/smpboot: Fix __max_logical_packages estimate · b4c0a732
      Prarit Bhargava 提交于
      A system booted with a small number of cores enabled per package
      panics because the estimate of __max_logical_packages is too low.
      
      This occurs when the total number of active cores across all packages is
      less than the maximum core count for a single package. e.g.:
      
        On a 4 package system with 20 cores/package where only 4 cores are
        enabled on each package, the value of __max_logical_packages is
        calculated as DIV_ROUND_UP(16 / 20) = 1 and not 4.
      
      Calculate __max_logical_packages after the cpu enumeration has completed.
      Use the boot cpu's data to extrapolate the number of packages.
      Signed-off-by: NPrarit Bhargava <prarit@redhat.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: He Chen <he.chen@linux.intel.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Piotr Luc <piotr.luc@intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arvind Yadav <arvind.yadav.cs@gmail.com>
      Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Mathias Krause <minipli@googlemail.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Link: https://lkml.kernel.org/r/20171114124257.22013-4-prarit@redhat.com
      b4c0a732
    • A
      x86/topology: Avoid wasting 128k for package id array · 30bb9811
      Andi Kleen 提交于
      Analyzing large early boot allocations unveiled the logical package id
      storage as a prominent memory waste. Since commit 1f12e32f
      ("x86/topology: Create logical package id") every 64-bit system allocates a
      128k array to convert logical package ids.
      
      This happens because the array is sized for MAX_LOCAL_APIC which is always
      32k on 64bit systems, and it needs 4 bytes for each entry.
      
      This is fairly wasteful, especially for the common case of having only one
      socket, which uses exactly 4 byte out of 128K.
      
      There is no user of the package id map which is performance critical, so
      the lookup is not required to be O(1). Store the logical processor id in
      cpu_data and use a loop based lookup.
      
      To keep the mapping stable accross cpu hotplug operations, add a flag to
      cpu_data which is set when the CPU is brought up the first time. When the
      flag is set, then cpu_data is not reinitialized by copying boot_cpu_data on
      subsequent bringups.
      
      [ tglx: Rename the flag to 'initialized', use proper pointers instead of
        	repeated cpu_data(x) evaluation and massage changelog. ]
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Signed-off-by: NPrarit Bhargava <prarit@redhat.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: He Chen <he.chen@linux.intel.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Piotr Luc <piotr.luc@intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arvind Yadav <arvind.yadav.cs@gmail.com>
      Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Mathias Krause <minipli@googlemail.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Link: https://lkml.kernel.org/r/20171114124257.22013-3-prarit@redhat.com
      30bb9811
  16. 08 11月, 2017 1 次提交
  17. 07 11月, 2017 1 次提交
  18. 02 11月, 2017 1 次提交
  19. 30 10月, 2017 1 次提交
  20. 10 10月, 2017 1 次提交
  21. 04 10月, 2017 1 次提交
    • J
      x86/boot: Spell out "boot CPU" for BP · a1652bb8
      Jean Delvare 提交于
      It's not obvious to everybody that BP stands for boot processor. At
      least it was not for me. And BP is also a CPU register on x86, so it
      is ambiguous. Spell out "boot CPU" everywhere instead.
      Signed-off-by: NJean Delvare <jdelvare@suse.de>
      Cc: Alok Kataria <akataria@vmware.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      a1652bb8
  22. 26 9月, 2017 4 次提交
    • T
      x86/irq: Simplify hotplug vector accounting · 2cffad7b
      Thomas Gleixner 提交于
      Before a CPU is taken offline the number of active interrupt vectors on the
      outgoing CPU and the number of vectors which are available on the other
      online CPUs are counted and compared. If the active vectors are more than
      the available vectors on the other CPUs then the CPU hot-unplug operation
      is aborted. This again uses loop based search and is inaccurate.
      
      The bitmap matrix allocator has accurate accounting information and can
      tell exactly whether the vector space is sufficient or not.
      
      Emit a message when the number of globaly reserved (unallocated) vectors is
      larger than the number of available vectors after offlining a CPU because
      after that point request_irq() might fail.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Tested-by: NJuergen Gross <jgross@suse.com>
      Tested-by: NYu Chen <yu.c.chen@intel.com>
      Acked-by: NJuergen Gross <jgross@suse.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Alok Kataria <akataria@vmware.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Rui Zhang <rui.zhang@intel.com>
      Cc: "K. Y. Srinivasan" <kys@microsoft.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Len Brown <lenb@kernel.org>
      Link: https://lkml.kernel.org/r/20170913213156.351193962@linutronix.de
      2cffad7b
    • T
      x86/smpboot: Set online before setting up vectors · 8ed4f3e6
      Thomas Gleixner 提交于
      There is no reason to set the CPU online after establishing the vectors on
      the upcoming CPU. The vector space is protected by the vector lock so no
      changes can happen.
      
      Marking the CPU online before setting up the vector space makes tracing
      work in the early vector management cpu online code.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Tested-by: NJuergen Gross <jgross@suse.com>
      Tested-by: NYu Chen <yu.c.chen@intel.com>
      Acked-by: NJuergen Gross <jgross@suse.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Alok Kataria <akataria@vmware.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Rui Zhang <rui.zhang@intel.com>
      Cc: "K. Y. Srinivasan" <kys@microsoft.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Len Brown <lenb@kernel.org>
      Link: https://lkml.kernel.org/r/20170913213155.264311994@linutronix.de
      8ed4f3e6
    • T
      x86/irq/vector: Initialize matrix allocator · 0fa115da
      Thomas Gleixner 提交于
      Initialize the matrix allocator and add the proper accounting points to the
      code.
      
      No functional change, just preparation.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Tested-by: NJuergen Gross <jgross@suse.com>
      Tested-by: NYu Chen <yu.c.chen@intel.com>
      Acked-by: NJuergen Gross <jgross@suse.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Alok Kataria <akataria@vmware.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Rui Zhang <rui.zhang@intel.com>
      Cc: "K. Y. Srinivasan" <kys@microsoft.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Len Brown <lenb@kernel.org>
      Link: https://lkml.kernel.org/r/20170913213155.108410660@linutronix.de
      0fa115da
    • T
      x86/ioapic: Remove obsolete post hotplug update · ef9e56d8
      Thomas Gleixner 提交于
      With single CPU affinities the post SMP boot vector update is pointless as
      it will just leave the affinities on the same vectors and the same CPUs.
      
      Remove it.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Tested-by: NJuergen Gross <jgross@suse.com>
      Tested-by: NYu Chen <yu.c.chen@intel.com>
      Acked-by: NJuergen Gross <jgross@suse.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Alok Kataria <akataria@vmware.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Rui Zhang <rui.zhang@intel.com>
      Cc: "K. Y. Srinivasan" <kys@microsoft.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Len Brown <lenb@kernel.org>
      Link: https://lkml.kernel.org/r/20170913213154.308697243@linutronix.de
      ef9e56d8
  23. 25 9月, 2017 6 次提交
  24. 18 9月, 2017 1 次提交
  25. 13 9月, 2017 1 次提交
    • A
      x86/mm/64: Initialize CR4.PCIDE early · c7ad5ad2
      Andy Lutomirski 提交于
      cpu_init() is weird: it's called rather late (after early
      identification and after most MMU state is initialized) on the boot
      CPU but is called extremely early (before identification) on secondary
      CPUs.  It's called just late enough on the boot CPU that its CR4 value
      isn't propagated to mmu_cr4_features.
      
      Even if we put CR4.PCIDE into mmu_cr4_features, we'd hit two
      problems.  First, we'd crash in the trampoline code.  That's
      fixable, and I tried that.  It turns out that mmu_cr4_features is
      totally ignored by secondary_start_64(), though, so even with the
      trampoline code fixed, it wouldn't help.
      
      This means that we don't currently have CR4.PCIDE reliably initialized
      before we start playing with cpu_tlbstate.  This is very fragile and
      tends to cause boot failures if I make even small changes to the TLB
      handling code.
      
      Make it more robust: initialize CR4.PCIDE earlier on the boot CPU
      and propagate it to secondary CPUs in start_secondary().
      
      ( Yes, this is ugly.  I think we should have improved mmu_cr4_features
        to actually control CR4 during secondary bootup, but that would be
        fairly intrusive at this stage. )
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Reported-by: NSai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
      Tested-by: NSai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
      Cc: Borislav Petkov <bpetkov@suse.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Fixes: 660da7c9 ("x86/mm: Enable CR4.PCIDE on supported systems")
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      c7ad5ad2
  26. 09 9月, 2017 1 次提交
  27. 10 8月, 2017 1 次提交
    • V
      x86/smpboot: Unbreak CPU0 hotplug · 10e66760
      Vitaly Kuznetsov 提交于
      A hang on CPU0 onlining after a preceding offlining is observed. Trace
      shows that CPU0 is stuck in check_tsc_sync_target() waiting for source
      CPU to run check_tsc_sync_source() but this never happens. Source CPU,
      in its turn, is stuck on synchronize_sched() which is called from
      native_cpu_up() -> do_boot_cpu() -> unregister_nmi_handler().
      
      So it's a classic ABBA deadlock, due to the use of synchronize_sched() in
      unregister_nmi_handler().
      
      Fix the bug by moving unregister_nmi_handler() from do_boot_cpu() to
      native_cpu_up() after cpu onlining is done.
      Signed-off-by: NVitaly Kuznetsov <vkuznets@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20170803105818.9934-1-vkuznets@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      10e66760
  28. 22 6月, 2017 1 次提交
  29. 23 5月, 2017 1 次提交
  30. 16 3月, 2017 1 次提交
    • T
      x86: Remap GDT tables in the fixmap section · 69218e47
      Thomas Garnier 提交于
      Each processor holds a GDT in its per-cpu structure. The sgdt
      instruction gives the base address of the current GDT. This address can
      be used to bypass KASLR memory randomization. With another bug, an
      attacker could target other per-cpu structures or deduce the base of
      the main memory section (PAGE_OFFSET).
      
      This patch relocates the GDT table for each processor inside the
      fixmap section. The space is reserved based on number of supported
      processors.
      
      For consistency, the remapping is done by default on 32 and 64-bit.
      
      Each processor switches to its remapped GDT at the end of
      initialization. For hibernation, the main processor returns with the
      original GDT and switches back to the remapping at completion.
      
      This patch was tested on both architectures. Hibernation and KVM were
      both tested specially for their usage of the GDT.
      
      Thanks to Boris Ostrovsky <boris.ostrovsky@oracle.com> for testing and
      recommending changes for Xen support.
      Signed-off-by: NThomas Garnier <thgarnie@google.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Kosina <jikos@kernel.org>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Len Brown <len.brown@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Lorenzo Stoakes <lstoakes@gmail.com>
      Cc: Luis R . Rodriguez <mcgrof@kernel.org>
      Cc: Matt Fleming <matt@codeblueprint.co.uk>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Rafael J . Wysocki <rjw@rjwysocki.net>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Stanislaw Gruszka <sgruszka@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
      Cc: kasan-dev@googlegroups.com
      Cc: kernel-hardening@lists.openwall.com
      Cc: kvm@vger.kernel.org
      Cc: lguest@lists.ozlabs.org
      Cc: linux-doc@vger.kernel.org
      Cc: linux-efi@vger.kernel.org
      Cc: linux-mm@kvack.org
      Cc: linux-pm@vger.kernel.org
      Cc: xen-devel@lists.xenproject.org
      Cc: zijun_hu <zijun_hu@htc.com>
      Link: http://lkml.kernel.org/r/20170314170508.100882-2-thgarnie@google.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      69218e47
  31. 02 3月, 2017 1 次提交