1. 13 July 2017, 5 commits
  2. 07 July 2017, 1 commit
  3. 05 July 2017, 3 commits
    • x86/boot/e820: Introduce the bootloader provided e820_table_firmware[] table · 12df216c
      Authored by Chen Yu
      Add the real e820_table_firmware[] that will not be modified by the kernel
      or the EFI boot stub under any circumstance.
      
      In addition, modify the code so that e820_table_firmware[] is
      exposed via sysfs to represent the real firmware memory layout,
      rather than exposing the e820_table_kexec[] table.
      
      This fixes a hibernation bug/warning in the code that uses e820_table_kexec[]
      to check RAM layout consistency across hibernation/resume:
      
        The suspend kernel:
        [    0.000000] e820: update [mem 0x76671018-0x76679457] usable ==> usable
      
        The resume kernel:
        [    0.000000] e820: update [mem 0x7666f018-0x76677457] usable ==> usable
        ...
        [   15.752088] PM: Using 3 thread(s) for decompression.
        [   15.752088] PM: Loading and decompressing image data (471870 pages)...
        [   15.764971] Hibernate inconsistent memory map detected!
        [   15.770833] PM: Image mismatch: architecture specific data
      
      Actually it is safe to restore these pages, because E820_TYPE_RAM and
      E820_TYPE_RESERVED_KERN are treated the same during hibernation; therefore
      the original e820 table provided by the bootloader can be used for the
      hibernation MD5 fingerprint check.
      
      The side effect is that this newly introduced table might increase the
      kernel size at compile time.
      Suggested-by: Ingo Molnar <mingo@redhat.com>
      Signed-off-by: Chen Yu <yu.c.chen@intel.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Xunlei Pang <xlpang@redhat.com>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      12df216c
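
      For reference, a minimal C sketch of the pattern the commit above describes:
      take one copy of the bootloader-provided table before the kernel modifies
      anything, and let sysfs and consistency checks read only that pristine copy.
      The names (boot_mem_table, snapshot_firmware_table(), MAX_ENTRIES) are
      illustrative stand-ins, not the kernel's real struct e820_table API.

        /* Sketch only: "preserve a pristine copy before any edits". */
        #include <string.h>

        #define MAX_ENTRIES 128

        struct boot_mem_entry {
                unsigned long long addr;
                unsigned long long size;
                unsigned int type;
        };

        struct boot_mem_table {
                unsigned int nr_entries;
                struct boot_mem_entry entries[MAX_ENTRIES];
        };

        static struct boot_mem_table boot_table;           /* live table, may be modified */
        static struct boot_mem_table boot_table_firmware;  /* pristine copy */

        /* Called exactly once, before any code is allowed to modify boot_table. */
        void snapshot_firmware_table(void)
        {
                memcpy(&boot_table_firmware, &boot_table, sizeof(boot_table_firmware));
        }

        /* The sysfs export and the hibernation MD5 fingerprint check read only
         * boot_table_firmware, so later kernel-side edits cannot trigger spurious
         * "inconsistent memory map" mismatches across suspend/resume. */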
    • x86/boot/e820: Rename the e820_table_firmware to e820_table_kexec · a09bae0f
      Authored by Chen Yu
      Currently the e820_table_firmware[] table is mainly used by kexec,
      and it is not what it is supposed to be: despite its name, it might
      be modified by the kernel.
      
      So change its name to e820_table_kexec[]. In the next patch we will
      introduce the real e820_table_firmware[] table.
      
      No functional change.
      Signed-off-by: Chen Yu <yu.c.chen@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Xunlei Pang <xlpang@redhat.com>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      a09bae0f
    • x86/mm/pat: Don't report PAT on CPUs that don't support it · 99c13b8c
      Authored by Mikulas Patocka
      The pat_enabled() logic is broken on CPUs which do not support PAT and
      where the initialization code fails to call pat_init(). Due to that, the
      enabled flag stays true and pat_enabled() wrongly returns true.
      
      As a consequence the mappings, e.g. for Xorg, are set up with the wrong
      caching mode and the required MTRR setups are omitted.
      
      To cure this the following changes are required:
      
        1) Make pat_enabled() return true only if PAT initialization was
           invoked and successful.
      
        2) Invoke init_cache_modes() unconditionally in setup_arch() and
           remove the extra callsites in pat_disable() and the pat disabled
           code path in pat_init().
      
      Also rename __pat_enabled to pat_disabled to reflect the real purpose of
      this variable.
      
      Fixes: 9cd25aac ("x86/mm/pat: Emulate PAT when it is disabled")
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Bernhard Held <berny156@gmx.de>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: "Luis R. Rodriguez" <mcgrof@suse.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: stable@vger.kernel.org
      Link: http://lkml.kernel.org/r/alpine.LRH.2.02.1707041749300.3456@file01.intranet.prod.int.rdu2.redhat.com
      99c13b8c
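
      A minimal sketch of the "report enabled only after a successful init"
      pattern the commit above describes; the variable names are illustrative,
      not necessarily the kernel's exact symbols.

        #include <stdbool.h>

        static bool pat_disabled;      /* set when the CPU lacks PAT or PAT is disabled */
        static bool pat_initialized;   /* set only after pat_init() actually ran */

        void pat_init(void)
        {
                if (pat_disabled)
                        return;

                /* ... program the IA32_PAT MSR here ... */

                pat_initialized = true;
        }

        /* Callers (e.g. the code that sets up mappings for Xorg) now get 'false'
         * unless PAT was really configured, so they fall back to MTRR-compatible
         * caching modes instead of assuming PAT semantics. */
        bool pat_enabled(void)
        {
                return pat_initialized && !pat_disabled;
        }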
  4. 03 July 2017, 2 commits
  5. 29 June 2017, 2 commits
  6. 28 June 2017, 4 commits
  7. 24 June 2017, 2 commits
  8. 23 June 2017, 6 commits
  9. 22 June 2017, 4 commits
    • KVM: x86: fix singlestepping over syscall · c8401dda
      Authored by Paolo Bonzini
      TF is handled a bit differently for syscall and sysret, compared
      to the other instructions: TF is checked after the instruction completes,
      so that the OS can disable #DB at a syscall by adding TF to FMASK.
      When the sysret is executed the #DB is taken "as if" the syscall insn
      just completed.
      
      KVM emulates syscall so that it can trap 32-bit syscall on Intel processors.
      Fix the behavior; otherwise you could get #DB on a user stack, which is
      not nice.  This does not affect Linux guests, as they use an IST or task
      gate for #DB.
      
      This fixes CVE-2017-7518.
      
      Cc: stable@vger.kernel.org
      Reported-by: Andy Lutomirski <luto@kernel.org>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
      c8401dda
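
      As a rough illustration of the TF/FMASK interaction described above, here
      is a simplified, self-contained model (not KVM's emulator code; registers
      and MSRs are reduced to plain variables): SYSCALL masks RFLAGS with
      ~IA32_FMASK, so an OS that puts TF in FMASK sees no #DB on kernel entry,
      and the trap is instead delivered after SYSRET restores the saved flags,
      "as if" the SYSCALL had just completed.

        #include <stdbool.h>
        #include <stdint.h>

        #define X86_EFLAGS_TF  (1ULL << 8)

        struct vcpu_model {
                uint64_t rflags;
                uint64_t msr_fmask;     /* IA32_FMASK */
                uint64_t saved_rflags;  /* what SYSCALL stashes (architecturally, R11) */
        };

        /* SYSCALL: save RFLAGS, then clear the bits listed in FMASK.  TF is only
         * checked after that, so an OS with TF in FMASK gets no #DB on entry. */
        bool emulate_syscall(struct vcpu_model *v)
        {
                v->saved_rflags = v->rflags;
                v->rflags &= ~v->msr_fmask;

                return (v->rflags & X86_EFLAGS_TF) != 0;  /* caller injects #DB if true */
        }

        /* SYSRET: restore the saved flags.  If TF was set at the original SYSCALL,
         * the #DB is now taken "as if" that instruction just completed -- in user
         * context, which is what the fix has to emulate correctly. */
        bool emulate_sysret(struct vcpu_model *v)
        {
                v->rflags = v->saved_rflags;

                return (v->rflags & X86_EFLAGS_TF) != 0;
        }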
    • x86/mm: Remove reset_lazy_tlbstate() · d5436812
      Authored by Andy Lutomirski
      The only call site also calls idle_task_exit(), and idle_task_exit()
      puts us into a clean state by explicitly switching to init_mm.
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Reviewed-by: Rik van Riel <riel@redhat.com>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Borislav Petkov <bp@suse.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Nadav Amit <nadav.amit@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: linux-mm@kvack.org
      Link: http://lkml.kernel.org/r/3acc7ad02a2ec060d2321a1e0f6de1cb90069517.1498022414.git.luto@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      d5436812
    • x86/ldt: Simplify the LDT switching logic · 73534258
      Authored by Andy Lutomirski
      Originally, Linux reloaded the LDT whenever the prev mm or the next
      mm had an LDT. It was changed in 2002 in:
      
        0bbed3beb4f2 ("[PATCH] Thread-Local Storage (TLS) support")
      
      (commit from the historical tree), like this:
      
      -		/* load_LDT, if either the previous or next thread
      -		 * has a non-default LDT.
      +		/*
      +		 * load the LDT, if the LDT is different:
      		 */
      -		if (next->context.size+prev->context.size)
      +		if (unlikely(prev->context.ldt != next->context.ldt))
      			load_LDT(&next->context);
      
      The current code is unlikely to avoid any LDT reloads, since different
      mms won't share an LDT.
      
      When we redo lazy mode to stop flush IPIs without switching to
      init_mm, though, the current logic would become incorrect: it will
      be possible to have real_prev == next but nonetheless have a stale
      LDT descriptor.
      
      Simplify the code to update LDTR if either the previous or the next
      mm has an LDT, i.e. effectively restore the historical logic.
      While we're at it, clean up the code by moving all the ifdeffery to
      a header where it belongs.
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Borislav Petkov <bp@suse.de>
      Acked-by: Rik van Riel <riel@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Nadav Amit <nadav.amit@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: linux-mm@kvack.org
      Link: http://lkml.kernel.org/r/2a859ac01245f9594c58f9d0a8b2ed8a7cd2507e.1498022414.git.luto@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      73534258
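
      A small, self-contained sketch of the simplified rule restored by the
      commit above: reload the LDT if either the outgoing or the incoming mm
      has one. The types and the reload hook are reduced placeholders, not the
      kernel's mmu_context code.

        #include <stdio.h>

        struct ldt_struct { int nr_entries; };

        struct mm_ctx {
                struct ldt_struct *ldt;   /* NULL means "default (no) LDT" */
        };

        /* Stand-in for programming LDTR (load_LDT / lldt). */
        static void reload_ldt(const struct mm_ctx *next)
        {
                printf("reloading LDT (%d entries)\n",
                       next->ldt ? next->ldt->nr_entries : 0);
        }

        /* If either side has an LDT, LDTR may be stale: reload it for 'next'.
         * This mirrors the historical "prev or next has an LDT" check rather
         * than a pointer-equality test, which breaks once lazy mode can leave
         * real_prev == next with a stale descriptor. */
        static void switch_ldt(const struct mm_ctx *prev, const struct mm_ctx *next)
        {
                if (prev->ldt || next->ldt)
                        reload_ldt(next);
        }

        int main(void)
        {
                struct ldt_struct l = { 3 };
                struct mm_ctx a = { &l }, b = { NULL };

                switch_ldt(&a, &b);   /* prev had an LDT -> reload to the default */
                switch_ldt(&b, &b);   /* neither has one -> nothing to do */
                return 0;
        }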
    • x86/power/64: Use char arrays for asm function names · c0944883
      Authored by Kees Cook
      This switches the hibernate_64.S function names into character arrays
      to match other areas of the kernel where this is done (e.g., linker
      scripts). Specifically this fixes a compile-time error noticed by the
      future CONFIG_FORTIFY_SOURCE routines that complained about PAGE_SIZE
      being copied out of the "single byte" core_restore_code variable.
      
      Additionally drops the "acpi_save_state_mem" extern, which does not
      appear to be used anywhere else in the kernel.
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Acked-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      c0944883
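
      A short sketch of the C-side pattern this commit applies: declare a symbol
      that assembly (or a linker script) defines as an incomplete char array, so
      fortified string routines see an object of unknown size rather than a
      single-byte variable. core_restore_code comes from the commit text; the
      surrounding helper and PAGE_SIZE value are illustrative assumptions.

        /* Before (what FORTIFY_SOURCE complains about):
         *     extern char core_restore_code;     // a single-byte object
         * After (the linker-script-style declaration):
         *     extern char core_restore_code[];   // unknown size, address still usable
         */
        #include <string.h>

        #define PAGE_SIZE 4096            /* assumed for the sketch */

        extern char core_restore_code[];  /* defined in hibernate_64.S */

        void stash_restore_code(void *relocated_page)
        {
                /* Copying PAGE_SIZE bytes out of a char[] of unknown size is fine;
                 * copying it out of a one-byte variable trips fortified memcpy(). */
                memcpy(relocated_page, core_restore_code, PAGE_SIZE);
        }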
  10. 16 June 2017, 3 commits
    • x86, dax, libnvdimm: remove wb_cache_pmem() indirection · 4e4f00a9
      Authored by Dan Williams
      With all handling of the CONFIG_ARCH_HAS_PMEM_API case being moved to
      libnvdimm and the pmem driver directly, we do not need to provide global
      wrappers and fallbacks in the CONFIG_ARCH_HAS_PMEM_API=n case. The pmem
      driver will simply not link to arch_wb_cache_pmem() in that case.  Same
      as before, pmem flushing is only defined for x86_64, via
      clean_cache_range(), but it is straightforward to add other archs in the
      future.
      
      arch_wb_cache_pmem() is an exported function since the pmem module needs
      to find it, but it is privately declared in drivers/nvdimm/pmem.h because
      there are no consumers outside of the pmem driver.
      
      Cc: <x86@kernel.org>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Jeff Moyer <jmoyer@redhat.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Oliver O'Halloran <oohall@gmail.com>
      Cc: Matthew Wilcox <mawilcox@microsoft.com>
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Suggested-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      4e4f00a9
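
      For illustration, a rough sketch of what a clean_cache_range()-style helper
      does: round the start down to a cache-line boundary and write back every
      line that overlaps the range. The flush_cacheline() hook and the 64-byte
      line size are assumptions for the sketch; the real x86_64 code issues
      CLWB/CLFLUSHOPT instructions.

        #include <stddef.h>
        #include <stdint.h>

        #define CACHE_LINE_SIZE 64UL     /* assumed line size */

        /* Placeholder for the architecture's per-line write-back instruction. */
        static void flush_cacheline(void *addr)
        {
                (void)addr;              /* CLWB on x86_64 in the real implementation */
        }

        void clean_cache_range_sketch(void *addr, size_t size)
        {
                uintptr_t line = (uintptr_t)addr & ~(CACHE_LINE_SIZE - 1);
                uintptr_t end  = (uintptr_t)addr + size;

                for (; line < end; line += CACHE_LINE_SIZE)
                        flush_cacheline((void *)line);
        }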
    • x86, dax: replace clear_pmem() with open coded memset + dax_ops->flush · 81f55870
      Authored by Dan Williams
      The clear_pmem() helper simply combines a memset() plus a cache flush.
      Now that the flush routine is optionally provided by the dax device
      driver we can avoid unnecessary cache management on dax devices fronting
      volatile memory.
      
      With clear_pmem() gone we can follow on with a patch to make pmem cache
      management completely defined within the pmem driver.
      
      Cc: <x86@kernel.org>
      Cc: Jeff Moyer <jmoyer@redhat.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Matthew Wilcox <mawilcox@microsoft.com>
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      81f55870
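
      A hedged sketch of the open-coded replacement named in the title above: a
      plain memset() followed by a driver-provided flush callback, which a dax
      device fronting volatile memory can simply leave unset. The callback type
      and struct below are illustrative, not the exact dax_operations layout.

        #include <stddef.h>
        #include <string.h>

        typedef void (*dax_flush_fn)(void *addr, size_t size);

        struct dax_dev_sketch {
                dax_flush_fn flush;      /* NULL for volatile-memory backed devices */
        };

        void dax_zero_range(struct dax_dev_sketch *dev, void *addr, size_t size)
        {
                memset(addr, 0, size);

                /* Only persistent-memory drivers need cache management here;
                 * devices fronting volatile memory skip it entirely. */
                if (dev->flush)
                        dev->flush(addr, size);
        }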
    • filesystem-dax: convert to dax_copy_from_iter() · fec53774
      Authored by Dan Williams
      Now that all possible providers of the dax_operations copy_from_iter
      method are implemented, switch filesystem-dax to call the driver rather
      than copy_to_iter_pmem.
      Reviewed-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      fec53774
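
      A rough sketch of the dispatch change in the commit above: instead of
      calling one global pmem copy helper, filesystem-dax asks the backing dax
      device for its copy_from_iter implementation. The struct layout and wrapper
      below are simplified assumptions, not the exact kernel API.

        #include <stddef.h>

        struct iov_iter_sketch {         /* stand-in for struct iov_iter */
                const void *data;
                size_t len;
        };

        struct dax_ops_sketch {
                size_t (*copy_from_iter)(void *dst, size_t bytes,
                                         struct iov_iter_sketch *iter);
        };

        struct dax_device_sketch {
                const struct dax_ops_sketch *ops;
        };

        /* Filesystem-side wrapper: previously a hard-coded pmem helper, now a
         * call into whatever driver backs this dax device. */
        size_t dax_copy_from_iter_sketch(struct dax_device_sketch *dax, void *dst,
                                         size_t bytes, struct iov_iter_sketch *iter)
        {
                return dax->ops->copy_from_iter(dst, bytes, iter);
        }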
  11. 14 June 2017, 2 commits
  12. 13 June 2017, 6 commits