提交 · 71b3c126e61177eb693423f2e18a1914205b165e · openeuler / Kernel

11 1月, 2016 1 次提交

x86/mm: Add barriers and document switch_mm()-vs-flush synchronization · 71b3c126

由 Andy Lutomirski 提交于 9年前

When switch_mm() activates a new PGD, it also sets a bit that
tells other CPUs that the PGD is in use so that TLB flush IPIs
will be sent.  In order for that to work correctly, the bit
needs to be visible prior to loading the PGD and therefore
starting to fill the local TLB.

Document all the barriers that make this work correctly and add
a couple that were missing.
Signed-off-by: NAndy Lutomirski <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-mm@kvack.org
Cc: stable@vger.kernel.org
Signed-off-by: NIngo Molnar <mingo@kernel.org>

71b3c126

07 1月, 2016 1 次提交

kvm: x86: only channel 0 of the i8254 is linked to the HPET · e5e57e7a

由 Paolo Bonzini 提交于 9年前

While setting the KVM PIT counters in 'kvm_pit_load_count', if
'hpet_legacy_start' is set, the function disables the timer on
channel[0], instead of the respective index 'channel'. This is
because channels 1-3 are not linked to the HPET.  Fix the caller
to only activate the special HPET processing for channel 0.
Reported-by: NP J P <pjp@fedoraproject.org>
Fixes: 0185604cSigned-off-by: NPaolo Bonzini <pbonzini@redhat.com>

e5e57e7a

31 12月, 2015 1 次提交

x86/numachip: Fix NumaConnect2 MMCFG PCI access · dd7a5ab4

由 Daniel J Blueman 提交于 9年前

The MMCFG PCI accessors weren't being setup for NumacConnect2
correctly due to over-early assignment; this would create the
potential for the wrong PCI domain to be accessed.

Fix this by using the correct arch-specific PCI init function.
Signed-off-by: NDaniel J Blueman <daniel@numascale.com>
Acked-by: NSteffen Persvold <sp@numascale.com>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1451498807-15920-1-git-send-email-daniel@numascale.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

dd7a5ab4

30 12月, 2015 1 次提交

arch/x86/xen/suspend.c: include xen/xen.h · facca616

由 Andrew Morton 提交于 9年前

Fix the build warning:

  arch/x86/xen/suspend.c: In function 'xen_arch_pre_suspend':
  arch/x86/xen/suspend.c:70:9: error: implicit declaration of function 'xen_pv_domain' [-Werror=implicit-function-declaration]
          if (xen_pv_domain())
              ^
Reported-by: Nkbuild test robot <fengguang.wu@intel.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

facca616

23 12月, 2015 1 次提交

um: Fix pointer cast · de379379

由 Mickaël Salaün 提交于 9年前

Fix a pointer cast typo introduced in v4.4-rc5 especially visible for
the i386 subarchitecture where it results in a kernel crash.

[ Also removed pointless cast as per Al Viro - Linus ]

Fixes: 8090bfd2 ("um: Fix fpstate handling")
Signed-off-by: NMickaël Salaün <mic@digikod.net>
Cc: Jeff Dike <jdike@addtoit.com>
Acked-by: NRichard Weinberger <richard@nod.at>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

de379379

22 12月, 2015 4 次提交

KVM: x86: Reload pit counters for all channels when restoring state · 0185604c

由 Andrew Honig 提交于 9年前

Currently if userspace restores the pit counters with a count of 0
on channels 1 or 2 and the guest attempts to read the count on those
channels, then KVM will perform a mod of 0 and crash.  This will ensure
that 0 values are converted to 65536 as per the spec.

This is CVE-2015-7513.
Signed-off-by: NAndy Honig <ahonig@google.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

0185604c

KVM: MTRR: treat memory as writeback if MTRR is disabled in guest CPUID · e24dea2a

由 Paolo Bonzini 提交于 9年前

Virtual machines can be run with CPUID such that there are no MTRRs.
In that case, the firmware will never enable MTRRs and it is obviously
undesirable to run the guest entirely with UC memory. Check out guest
CPUID, and use WB memory if MTRR do not exist.

Cc: qemu-stable@nongnu.org
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=107561Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

e24dea2a

KVM: MTRR: observe maxphyaddr from guest CPUID, not host · fa7c4ebd

由 Paolo Bonzini 提交于 9年前

Conversion of MTRRs to ranges used the maxphyaddr from the boot CPU.
This is wrong, because var_mtrr_range's mask variable then is discontiguous
(like FF00FFFF000, where the first run of 0s corresponds to the bits
between host and guest maxphyaddr).  Instead always set up the masks
to be full 64-bit values---we know that the reserved bits at the top
are zero, and we can restore them when reading the MSR.  This way
var_mtrr_range gets a mask that just works.

Fixes: a13842dc
Cc: qemu-stable@nongnu.org
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=107561Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

fa7c4ebd

KVM: MTRR: fix fixed MTRR segment look up · a7f2d786

由 Alexis Dambricourt 提交于 9年前

This fixes the slow-down of VM running with pci-passthrough, since some MTRR
range changed from MTRR_TYPE_WRBACK to MTRR_TYPE_UNCACHABLE.  Memory in the
0K-640K range was incorrectly treated as uncacheable.

Fixes: f7bfb57b
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=107561
Cc: qemu-stable@nongnu.org
Signed-off-by: NAlexis Dambricourt <alexis.dambricourt@gmail.com>
[Use correct BZ for "Fixes" annotation.  - Paolo]
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

a7f2d786

21 12月, 2015 2 次提交

x86/entry: Restore traditional SYSENTER calling convention · 30bfa7b3

由 Andy Lutomirski 提交于 9年前

It turns out that some Android versions hardcode the SYSENTER
calling convention.  This is buggy and will cause problems no
matter what the kernel does.  Nonetheless, we should try to
support it.

Credit goes to Linus for pointing out a clean way to handle
the SYSENTER/SYSCALL clobber differences while preserving
straightforward DWARF annotations.

I believe that the original offending Android commit was:

https://android.googlesource.com/platform%2Fbionic/+/7dc3684d7a2587e43e6d2a8e0e3f39bf759bd535Reported-by: NQiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: NAndy Lutomirski <luto@kernel.org>
Reviewed-and-tested-by: NBorislav Petkov <bp@alien8.de>
Cc: <mark.gross@intel.com>
Cc: Su Tao <tao.su@intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: <frank.wang@intel.com>
Cc: <borun.fu@intel.com>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Mingwei Shi <mingwei.shi@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

30bfa7b3

x86/entry: Fix some comments · 6a613ac6

由 Andy Lutomirski 提交于 9年前

Signed-off-by: NAndy Lutomirski <luto@kernel.org>
Reviewed-and-tested-by: NBorislav Petkov <bp@alien8.de>
Cc: <mark.gross@intel.com>
Cc: Su Tao <tao.su@intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: <qiuxu.zhuo@intel.com>
Cc: <frank.wang@intel.com>
Cc: <borun.fu@intel.com>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Mingwei Shi <mingwei.shi@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

6a613ac6

20 12月, 2015 1 次提交

x86/paravirt: Prevent rtc_cmos platform device init on PV guests · d8c98a1d

由 David Vrabel 提交于 9年前

Adding the rtc platform device in non-privileged Xen PV guests causes
an IRQ conflict because these guests do not have legacy PIC and may
allocate irqs in the legacy range.

In a single VCPU Xen PV guest we should have:

/proc/interrupts:
           CPU0
  0:       4934  xen-percpu-virq      timer0
  1:          0  xen-percpu-ipi       spinlock0
  2:          0  xen-percpu-ipi       resched0
  3:          0  xen-percpu-ipi       callfunc0
  4:          0  xen-percpu-virq      debug0
  5:          0  xen-percpu-ipi       callfuncsingle0
  6:          0  xen-percpu-ipi       irqwork0
  7:        321   xen-dyn-event     xenbus
  8:         90   xen-dyn-event     hvc_console
  ...

But hvc_console cannot get its interrupt because it is already in use
by rtc0 and the console does not work.

  genirq: Flags mismatch irq 8. 00000000 (hvc_console) vs. 00000000 (rtc0)

We can avoid this problem by realizing that unprivileged PV guests (both
Xen and lguests) are not supposed to have rtc_cmos device and so
adding it is not necessary.

Privileged guests (i.e. Xen's dom0) do use it but they should not have
irq conflicts since they allocate irqs above legacy range (above
gsi_top, in fact).

Instead of explicitly testing whether the guest is privileged we can
extend pv_info structure to include information about guest's RTC
support.
Reported-and-tested-by: NSander Eikelenboom <linux@eikelenboom.it>
Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>
Signed-off-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: vkuznets@redhat.com
Cc: xen-devel@lists.xenproject.org
Cc: konrad.wilk@oracle.com
Cc: stable@vger.kernel.org # 4.2+
Link: http://lkml.kernel.org/r/1449842873-2613-1-git-send-email-boris.ostrovsky@oracle.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

d8c98a1d

19 12月, 2015 2 次提交

x86/xen: Avoid fast syscall path for Xen PV guests · 91e2eea9

由 Boris Ostrovsky 提交于 9年前

After 32-bit syscall rewrite, and specifically after commit:

  5f310f73 ("x86/entry/32: Re-implement SYSENTER using the new C path")

... the stack frame that is passed to xen_sysexit is no longer a
"standard" one (i.e. it's not pt_regs).

Since we end up calling xen_iret from xen_sysexit we don't need
to fix up the stack and instead follow entry_SYSENTER_32's IRET
path directly to xen_iret.

We can do the same thing for compat mode even though stack does
not need to be fixed. This will allow us to drop usergs_sysret32
paravirt op (in the subsequent patch)
Suggested-by: NAndy Lutomirski <luto@amacapital.net>
Signed-off-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: NBorislav Petkov <bp@suse.de>
Acked-by: NAndy Lutomirski <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: david.vrabel@citrix.com
Cc: konrad.wilk@oracle.com
Cc: virtualization@lists.linux-foundation.org
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/1447970147-1733-2-git-send-email-boris.ostrovsky@oracle.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

91e2eea9

x86/mce: Ensure offline CPUs don't participate in rendezvous process · d90167a9

由 Ashok Raj 提交于 9年前

Intel's MCA implementation broadcasts MCEs to all CPUs on the
node. This poses a problem for offlined CPUs which cannot
participate in the rendezvous process:

  Kernel panic - not syncing: Timeout: Not all CPUs entered broadcast exception handler
  Kernel Offset: disabled
  Rebooting in 100 seconds..

More specifically, Linux does a soft offline of a CPU when
writing a 0 to /sys/devices/system/cpu/cpuX/online, which
doesn't prevent the #MC exception from being broadcasted to that
CPU.

Ensure that offline CPUs don't participate in the MCE rendezvous
and clear the RIP valid status bit so that a second MCE won't
cause a shutdown.

Without the patch, mce_start() will increment mce_callin and
wait for all CPUs. Offlined CPUs should avoid participating in
the rendezvous process altogether.
Signed-off-by: NAshok Raj <ashok.raj@intel.com>
[ Massage commit message. ]
Signed-off-by: NBorislav Petkov <bp@suse.de>
Reviewed-by: NTony Luck <tony.luck@intel.com>
Cc: <stable@vger.kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1449742346-21470-2-git-send-email-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

d90167a9

16 12月, 2015 1 次提交

Fix user-visible spelling error · 173ae9ba

由 Linus Torvalds 提交于 9年前

Pavel Machek reports a warning about W+X pages found in the "Persisent"
kmap area. After grepping for it (using the correct spelling), and not
finding it, I noticed how the debug printk was just misspelled. Fix it.

The actual mapping bug that Pavel reported is still open. It's
apparently a separate issue from the known EFI page tables, looks like
it's related to the HIGHMEM mappings.
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

173ae9ba

14 12月, 2015 2 次提交

KVM: VMX: Fix host initiated access to guest MSR_TSC_AUX · 81b1b9ca

由 Haozhong Zhang 提交于 9年前

The current handling of accesses to guest MSR_TSC_AUX returns error if
vcpu does not support rdtscp, though those accesses are initiated by
host. This can result in the reboot failure of some versions of
QEMU. This patch fixes this issue by passing those host initiated
accesses for further handling instead.
Signed-off-by: NHaozhong Zhang <haozhong.zhang@intel.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

81b1b9ca

xen/x86/pvh: Use HVM's flush_tlb_others op · 20f36e03

由 Boris Ostrovsky 提交于 9年前

Using MMUEXT_TLB_FLUSH_MULTI doesn't buy us much since the hypervisor
will likely perform same IPIs as would have the guest.

More importantly, using MMUEXT_INVLPG_MULTI may not to invalidate the
guest's address on remote CPU (when, for example, VCPU from another guest
is running there).
Signed-off-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com>
Suggested-by: NJan Beulich <jbeulich@suse.com>
Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>

20f36e03

11 12月, 2015 1 次提交

kvm: x86: move tracepoints outside extended quiescent state · 8b89fe1f

由 Paolo Bonzini 提交于 9年前

Invoking tracepoints within kvm_guest_enter/kvm_guest_exit causes a
lockdep splat.
Reported-by: NBorislav Petkov <bp@alien8.de>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

8b89fe1f

09 12月, 2015 1 次提交

um: Fix fpstate handling · 8090bfd2

由 Richard Weinberger 提交于 9年前

The x86 FPU cleanup changed fpstate to a plain integer.
UML on x86 has to deal with that too.
Signed-off-by: NRichard Weinberger <richard@nod.at>

8090bfd2

06 12月, 2015 4 次提交

perf/x86/intel: Fix INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_NA macro · 169b932a

由 Jiri Olsa 提交于 9年前

We need to add rest of the flags to the constraint mask
instead of another INTEL_ARCH_EVENT_MASK, fixing a typo.
Signed-off-by: NJiri Olsa <jolsa@kernel.org>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/1447061071-28085-1-git-send-email-jolsa@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

169b932a

perf/x86/intel: Make L1D_PEND_MISS.FB_FULL not constrained on Haswell · e0fbac1c

由 Yuanfang Chen 提交于 9年前

There was a mistake in the Haswell constraints table.
Signed-off-by: NYuanfang Chen <cheny@udel.edu>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: NAndi Kleen <ak@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/1448384701-9110-1-git-send-email-cheny@udel.eduSigned-off-by: NIngo Molnar <mingo@kernel.org>

e0fbac1c

x86/signal: Fix restart_syscall number for x32 tasks · 22eab110

由 Dmitry V. Levin 提交于 9年前

When restarting a syscall with regs->ax == -ERESTART_RESTARTBLOCK,
regs->ax is assigned to a restart_syscall number. For x32 tasks, this
syscall number must have __X32_SYSCALL_BIT set, otherwise it will be
an x86_64 syscall number instead of a valid x32 syscall number. This
issue has been there since the introduction of x32.

Reported-by: strace/tests/restart_syscall.test
Reported-and-tested-by: NElvira Khabirova <lineprinter0@gmail.com>
Signed-off-by: NDmitry V. Levin <ldv@altlinux.org>
Cc: Elvira Khabirova <lineprinter0@gmail.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20151130215436.GA25996@altlinux.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

22eab110

x86/mpx: Fix instruction decoder condition · 8e8efe03

由 Dave Hansen 提交于 9年前

MPX decodes instructions in order to tell which bounds register
was violated.  Part of this decoding involves looking at the "REX
prefix" which is a special instrucion prefix used to retrofit
support for new registers in to old instructions.

The X86_REX_*() macros are defined to return actual bit values:

	#define X86_REX_R(rex) ((rex) & 4)

*not* boolean values.  However, the MPX code was checking for
them like they were booleans.  This might have led to us
mis-decoding the "REX prefix" and giving false information out to
userspace about bounds violations.  X86_REX_B() actually is bit 1,
so this is really only broken for the X86_REX_X() case.

Fix the conditionals up to tolerate the non-boolean values.

Fixes: fcc7ffd6 "x86, mpx: Decode MPX instruction to get bound violation information"
Reported-by: NDan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
Cc: x86@kernel.org
Cc: Dave Hansen <dave@sr71.net>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20151201003113.D800C1E0@viggo.jf.intel.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

8e8efe03

04 12月, 2015 1 次提交

x86/mm: Fix regression with huge pages on PAE · 70f15287

由 Kirill A. Shutemov 提交于 9年前

Recent PAT patchset has caused issue on 32-bit PAE machines:

  page:eea45000 count:0 mapcount:-128 mapping:  (null) index:0x0 flags: 0x40000000()
  page dumped because: VM_BUG_ON_PAGE(page_mapcount(page) < 0)
  ------------[ cut here ]------------
  kernel BUG at /home/build/linux-boris/mm/huge_memory.c:1485!
  invalid opcode: 0000 [#1] SMP
  [...]
  Call Trace:
   unmap_single_vma
   ? __wake_up
   unmap_vmas
   unmap_region
   do_munmap
   vm_munmap
   SyS_munmap
   do_fast_syscall_32
   ? __do_page_fault
   sysenter_past_esp
  Code: ...
  EIP: [<c11bde80>] zap_huge_pmd+0x240/0x260 SS:ESP 0068:f6459d98

The problem is in pmd_pfn_mask() and pmd_flags_mask(). These
helpers use PMD_PAGE_MASK to calculate resulting mask.
PMD_PAGE_MASK is 'unsigned long', not 'unsigned long long' as
phys_addr_t is on 32-bit PAE (ARCH_PHYS_ADDR_T_64BIT). As a
result, the upper bits of resulting mask get truncated.

pud_pfn_mask() and pud_flags_mask() aren't problematic since we
don't have PUD page table level on 32-bit systems, but it's
reasonable to keep them consistent with PMD counterpart.

Introduce PHYSICAL_PMD_PAGE_MASK and PHYSICAL_PUD_PAGE_MASK in
addition to existing PHYSICAL_PAGE_MASK and reworks helpers to
use them.
Reported-and-Tested-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
[ Fix -Woverflow warnings from the realmode code. ]
Signed-off-by: NBorislav Petkov <bp@suse.de>
Reviewed-by: NToshi Kani <toshi.kani@hpe.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jürgen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: elliott@hpe.com
Cc: konrad.wilk@oracle.com
Cc: linux-mm <linux-mm@kvack.org>
Fixes: f70abb0f ("x86/asm: Fix pud/pmd interfaces to handle large PAT bit")
Link: http://lkml.kernel.org/r/1448878233-11390-2-git-send-email-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
Signed-off-by: NIngo Molnar <mingo@kernel.org>

70f15287

03 12月, 2015 1 次提交

xen: Resume PMU from non-atomic context · de0afc9b

由 Boris Ostrovsky 提交于 9年前

Resuming PMU currently triggers a warning from ___might_sleep() (assuming
CONFIG_DEBUG_ATOMIC_SLEEP is set) when xen_pmu_init() allocates GFP_KERNEL
page because we are in state resembling atomic context.

Move resuming PMU to xen_arch_resume() which is called in regular context.
For symmetry move suspending PMU to xen_arch_suspend() as well.
Signed-off-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com>
Reported-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: <stable@vger.kernel.org> # 4.3
Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>

de0afc9b

02 12月, 2015 1 次提交

x86/PCI/ACPI: Fix regression caused by commit · 727ae8be

由 Liu Jiang 提交于 9年前

Commit 4d6b4e69 ("x86/PCI/ACPI: Use common interface to support
PCI host bridge") converted x86 to use the common interface
acpi_pci_root_create, but the conversion missed on code piece in
arch/x86/pci/bus_numa.c, which causes regression on some legacy
AMD platforms as reported by Arthur Marsh <arthur.marsh@internode.on.net>.
The root causes is that acpi_pci_root_create() fails to insert
host bridge resources into iomem_resource/ioport_resource because
x86_pci_root_bus_resources() has already inserted those resources.
So change x86_pci_root_bus_resources() to not insert resources into
iomem_resource/ioport_resource.

Fixes: 4d6b4e69 ("x86/PCI/ACPI: Use common interface to support PCI host bridge")
Signed-off-by: NJiang Liu <jiang.liu@linux.intel.com>
Reported-and-tested-by: NArthur Marsh <arthur.marsh@internode.on.net>
Reported-and-tested-by: NKrzysztof Kolasa <kkolasa@winsoft.pl>
Reported-and-tested-by: NKeith Busch <keith.busch@intel.com>
Reported-and-tested-by: NHans de Bruin <jmdebruin@xmsnet.nl>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

727ae8be

01 12月, 2015 1 次提交

libnvdimm, e820: skip module loading when no type-12 · bc0d0d09

由 Dan Williams 提交于 9年前

If there are no persistent memory ranges present then don't bother
creating the platform device.  Otherwise, it loads the full libnvdimm
sub-system only to discover no resources present.
Reported-by: NAndy Lutomirski <luto@amacapital.net>
Acked-by: NAndy Lutomirski <luto@amacapital.net>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

bc0d0d09

26 11月, 2015 1 次提交

x86 smpboot: Re-enable init_udelay=0 by default on modern CPUs · 656279a1

由 Len Brown 提交于 9年前

commit f1ccd249 allowed the cmdline "cpu_init_udelay=" to work
with all values, including the default of 10000.

But in setting the default of 10000, it over-rode the code that sets
the delay 0 on modern processors.

Also, tidy up use of INT/UINT.

Fixes: f1ccd249 "x86/smpboot: Fix cpu_init_udelay=10000 corner case boot parameter misbehavior"
Reported-by: NShane <shrybman@teksavvy.com>
Signed-off-by: NLen Brown <len.brown@intel.com>
Cc: dparsons@brightdsl.net
Cc: stable@kernel.org
Link: http://lkml.kernel.org/r/9082eb809ef40dad02db714759c7aaf618c518d4.1448232494.git.len.brown@intel.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

656279a1

25 11月, 2015 1 次提交

KVM: nVMX: remove incorrect vpid check in nested invvpid emulation · b2467e74

由 Haozhong Zhang 提交于 9年前

This patch removes the vpid check when emulating nested invvpid
instruction of type all-contexts invalidation. The existing code is
incorrect because:
 (1) According to Intel SDM Vol 3, Section "INVVPID - Invalidate
     Translations Based on VPID", invvpid instruction does not check
     vpid in the invvpid descriptor when its type is all-contexts
     invalidation.
 (2) According to the same document, invvpid of type all-contexts
     invalidation does not require there is an active VMCS, so/and
     get_vmcs12() in the existing code may result in a NULL-pointer
     dereference. In practice, it can crash both KVM itself and L1
     hypervisors that use invvpid (e.g. Xen).
Signed-off-by: NHaozhong Zhang <haozhong.zhang@intel.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

b2467e74

24 11月, 2015 1 次提交

x86/entry/64: Fix irqflag tracing wrt context tracking · f1075053

由 Andy Lutomirski 提交于 9年前

Paolo pointed out that enter_from_user_mode could be called
while irqflags were traced as though IRQs were on.

In principle, this could confuse lockdep.  It doesn't cause any
problems that I've seen in any configuration, but if I build
with CONFIG_DEBUG_LOCKDEP=y, enable a nohz_full CPU, and add
code like:

	if (irqs_disabled()) {
		spin_lock(&something);
		spin_unlock(&something);
	}

to the top of enter_from_user_mode, then lockdep will complain
without this fix.  It seems that lockdep's irqflags sanity
checks are too weak to detect this bug without forcing the
issue.

This patch adds one byte to normal kernels, and it's IMO a bit
ugly. I haven't spotted a better way to do this yet, though.
The issue is that we can't do TRACE_IRQS_OFF until after SWAPGS
(if needed), but we're also supposed to do it before calling C
code.

An alternative approach would be to call trace_hardirqs_off in
enter_from_user_mode.  That would be less code and would not
bloat normal kernels at all, but it would be harder to see how
the code worked.
Signed-off-by: NAndy Lutomirski <luto@kernel.org>
Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/86237e362390dfa6fec12de4d75a238acb0ae787.1447361906.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

f1075053

23 11月, 2015 4 次提交

x86/microcode: Initialize the driver late when facilities are up · 2d5be37d

由 Borislav Petkov 提交于 9年前

Running microcode_init() from setup_arch() is a bad idea because
not even kmalloc() is ready at that point and the loader does
all kinds of allocations and init/registration with various
subsystems.

Make it a late initcall when required facilities are initialized
so that the microcode driver initialization can succeed too.
Reported-and-tested-by: NMarkus Trippelsdorf <markus@trippelsdorf.de>
Signed-off-by: NBorislav Petkov <bp@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20151120112400.GC4028@pd.tnicSigned-off-by: NIngo Molnar <mingo@kernel.org>

2d5be37d

treewide: Remove old email address · 90eec103

由 Peter Zijlstra 提交于 9年前

There were still a number of references to my old Red Hat email
address in the kernel source. Remove these while keeping the
Red Hat copyright notices intact.
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: NIngo Molnar <mingo@kernel.org>

90eec103

perf/x86: Fix LBR call stack save/restore · b28ae956

由 Andi Kleen 提交于 9年前

This fixes a bug I added in the following commit:

  90405aa0 ("perf/x86/intel/lbr: Limit LBR accesses to TOS in callstack mode")

The bug could lead to lost LBR call stacks. When restoring the LBR state
we need to use the TOS of the previous context, not the current context.
To do that we need to save/restore the TOS.
Signed-off-by: NAndi Kleen <ak@linux.intel.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: acme@kernel.org
Cc: jolsa@kernel.org
Link: http://lkml.kernel.org/r/1445366797-30894-1-git-send-email-andi@firstfloor.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

b28ae956

perf/core: Robustify the perf_cgroup_from_task() RCU checks · 614e4c4e

由 Stephane Eranian 提交于 9年前

This patch reinforces the lockdep checks performed by
perf_cgroup_from_tsk() by passing the perf_event_context
whenever possible. It is okay to not hold the RCU read lock
when we know we hold the ctx->lock. This patch makes sure this
property holds.

In some functions, such as perf_cgroup_sched_in(), we do not
pass the context because we are sure we are holding the RCU
read lock.
Signed-off-by: NStephane Eranian <eranian@google.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: edumazet@google.com
Link: http://lkml.kernel.org/r/1447322404-10920-3-git-send-email-eranian@google.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

614e4c4e

19 11月, 2015 2 次提交

x86/cpu: Fix SMAP check in PVOPS environments · 581b7f15

由 Andrew Cooper 提交于 9年前

There appears to be no formal statement of what pv_irq_ops.save_fl() is
supposed to return precisely.  Native returns the full flags, while lguest and
Xen only return the Interrupt Flag, and both have comments by the
implementations stating that only the Interrupt Flag is looked at.  This may
have been true when initially implemented, but no longer is.

To make matters worse, the Xen PVOP leaves the upper bits undefined, making
the BUG_ON() undefined behaviour.  Experimentally, this now trips for 32bit PV
guests on Broadwell hardware.  The BUG_ON() is consistent for an individual
build, but not consistent for all builds.  It has also been a sitting timebomb
since SMAP support was introduced.

Use native_save_fl() instead, which will obtain an accurate view of the AC
flag.
Signed-off-by: NAndrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: NDavid Vrabel <david.vrabel@citrix.com>
Tested-by: NRusty Russell <rusty@rustcorp.com.au>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: <lguest@lists.ozlabs.org>
Cc: Xen-devel <xen-devel@lists.xen.org>
CC: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1433323874-6927-1-git-send-email-andrew.cooper3@citrix.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

581b7f15

x86/ftrace: Add comment on static function tracing · 112677d6

由 Namhyung Kim 提交于 9年前

There was a confusion between update_ftrace_function() and static
function tracing trampoline regarding 3rd parameter (ftrace_ops).
Add a comment for clarification.
Suggested-by: NSteven Rostedt <rostedt@goodmis.org>
Signed-off-by: NNamhyung Kim <namhyung@kernel.org>
Cc: H. Peter Anvin <hpa@linux.intel.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/1447721004-2551-1-git-send-email-namhyung@kernel.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

112677d6

18 11月, 2015 4 次提交

KVM: x86: request interrupt window when IRQ chip is split · 62a193ed

由 Matt Gingell 提交于 9年前

Before this patch, we incorrectly enter the guest without requesting an
interrupt window if the IRQ chip is split between user space and the
kernel.

Because lapic_in_kernel no longer implies the PIC is in the kernel, this
patch tests pic_in_kernel to determining whether an interrupt window
should be requested when entering the guest.

If the APIC is in the kernel and we request an interrupt window the
guest will return immediately. If the APIC is masked the guest will not
not make forward progress and unmask it, leading to a loop when KVM
reenters and requests again. This patch adds a check to ensure the APIC
is ready to accept an interrupt before requesting a window.
Reviewed-by: NSteve Rutherford <srutherford@google.com>
Signed-off-by: NMatt Gingell <gingell@google.com>
[Use the other newly introduced functions. - Paolo]
Fixes: 1c1a9ce9
Cc: stable@vger.kernel.org
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

62a193ed

KVM: x86: set KVM_REQ_EVENT on local interrupt request from user space · 934bf653

由 Matt Gingell 提交于 9年前

Set KVM_REQ_EVENT when a PIC in user space injects a local interrupt.

Currently a request is only made when neither the PIC nor the APIC is in
the kernel, which is not sufficient in the split IRQ chip case.

This addresses a problem in QEMU where interrupts are delayed until
another path invokes the event loop.
Reviewed-by: NSteve Rutherford <srutherford@google.com>
Signed-off-by: NMatt Gingell <gingell@google.com>
Fixes: 1c1a9ce9
Cc: stable@vger.kernel.org
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

934bf653

KVM: x86: split kvm_vcpu_ready_for_interrupt_injection out of dm_request_for_irq_injection · 782d422b

由 Matt Gingell 提交于 9年前

This patch breaks out a new function kvm_vcpu_ready_for_interrupt_injection.
This routine encapsulates the logic required to determine whether a vcpu
is ready to accept an interrupt injection, which is now required on
multiple paths.
Reviewed-by: NSteve Rutherford <srutherford@google.com>
Signed-off-by: NMatt Gingell <gingell@google.com>
Fixes: 1c1a9ce9
Cc: stable@vger.kernel.org
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

782d422b

KVM: x86: fix interrupt window handling in split IRQ chip case · 127a457a

由 Matt Gingell 提交于 9年前

This patch ensures that dm_request_for_irq_injection and
post_kvm_run_save are in sync, avoiding that an endless ping-pong
between userspace (who correctly notices that IF=0) and
the kernel (who insists that userspace handles its request
for the interrupt window).

To synchronize them, it also adds checks for kvm_arch_interrupt_allowed
and !kvm_event_needs_reinjection.  These are always needed, not
just for in-kernel LAPIC.
Signed-off-by: NMatt Gingell <gingell@google.com>
[A collage of two patches from Matt. - Paolo]
Fixes: 1c1a9ce9
Cc: stable@vger.kernel.org
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

127a457a

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功