1. 29 4月, 2016 1 次提交
  2. 22 4月, 2016 1 次提交
    • J
      x86/mm/xen: Suppress hugetlbfs in PV guests · 103f6112
      Jan Beulich 提交于
      Huge pages are not normally available to PV guests. Not suppressing
      hugetlbfs use results in an endless loop of page faults when user mode
      code tries to access a hugetlbfs mapped area (since the hypervisor
      denies such PTEs to be created, but error indications can't be
      propagated out of xen_set_pte_at(), just like for various of its
      siblings), and - once killed in an oops like this:
      
        kernel BUG at .../fs/hugetlbfs/inode.c:428!
        invalid opcode: 0000 [#1] SMP
        ...
        RIP: e030:[<ffffffff811c333b>]  [<ffffffff811c333b>] remove_inode_hugepages+0x25b/0x320
        ...
        Call Trace:
         [<ffffffff811c3415>] hugetlbfs_evict_inode+0x15/0x40
         [<ffffffff81167b3d>] evict+0xbd/0x1b0
         [<ffffffff8116514a>] __dentry_kill+0x19a/0x1f0
         [<ffffffff81165b0e>] dput+0x1fe/0x220
         [<ffffffff81150535>] __fput+0x155/0x200
         [<ffffffff81079fc0>] task_work_run+0x60/0xa0
         [<ffffffff81063510>] do_exit+0x160/0x400
         [<ffffffff810637eb>] do_group_exit+0x3b/0xa0
         [<ffffffff8106e8bd>] get_signal+0x1ed/0x470
         [<ffffffff8100f854>] do_signal+0x14/0x110
         [<ffffffff810030e9>] prepare_exit_to_usermode+0xe9/0xf0
         [<ffffffff814178a5>] retint_user+0x8/0x13
      
      This is CVE-2016-3961 / XSA-174.
      Reported-by: NVitaly Kuznetsov <vkuznets@redhat.com>
      Signed-off-by: NJan Beulich <jbeulich@suse.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: David Vrabel <david.vrabel@citrix.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Juergen Gross <JGross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Luis R. Rodriguez <mcgrof@suse.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: stable@vger.kernel.org
      Cc: xen-devel <xen-devel@lists.xenproject.org>
      Link: http://lkml.kernel.org/r/57188ED802000078000E431C@prv-mh.provo.novell.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      103f6112
  3. 19 4月, 2016 1 次提交
  4. 13 4月, 2016 20 次提交
  5. 08 4月, 2016 1 次提交
  6. 02 4月, 2016 1 次提交
    • N
      mm/rmap: batched invalidations should use existing api · 858eaaa7
      Nadav Amit 提交于
      The recently introduced batched invalidations mechanism uses its own
      mechanism for shootdown.  However, it does wrong accounting of
      interrupts (e.g., inc_irq_stat is called for local invalidations),
      trace-points (e.g., TLB_REMOTE_SHOOTDOWN for local invalidations) and
      may break some platforms as it bypasses the invalidation mechanisms of
      Xen and SGI UV.
      
      This patch reuses the existing TLB flushing mechnaisms instead.  We use
      NULL as mm to indicate a global invalidation is required.
      
      Fixes 72b252ae ("mm: send one IPI per CPU to TLB flush all entries after unmapping pages")
      Signed-off-by: NNadav Amit <namit@vmware.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      858eaaa7
  7. 01 4月, 2016 1 次提交
    • P
      KVM: x86: reduce default value of halt_poll_ns parameter · 14ebda33
      Paolo Bonzini 提交于
      Windows lets applications choose the frequency of the timer tick,
      and in Windows 10 the maximum rate was changed from 1024 Hz to
      2048 Hz.  Unfortunately, because of the way the Windows API
      works, most applications who need a higher rate than the default
      64 Hz will just do
      
         timeGetDevCaps(&tc, sizeof(tc));
         timeBeginPeriod(tc.wPeriodMin);
      
      and pick the maximum rate.  This causes very high CPU usage when
      playing media or games on Windows 10, even if the guest does not
      actually use the CPU very much, because the frequent timer tick
      causes halt_poll_ns to kick in.
      
      There is no really good solution, especially because Microsoft
      could sooner or later bump the limit to 4096 Hz, but for now
      the best we can do is lower a bit the upper limit for
      halt_poll_ns. :-(
      Reported-by: NJon Panozzo <jonp@lime-technology.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      14ebda33
  8. 31 3月, 2016 10 次提交
  9. 29 3月, 2016 4 次提交
    • T
      x86/xen, pat: Remove PAT table init code from Xen · 88ba2811
      Toshi Kani 提交于
      Xen supports PAT without MTRRs for its guests.  In order to
      enable WC attribute, it was necessary for xen_start_kernel()
      to call pat_init_cache_modes() to update PAT table before
      starting guest kernel.
      
      Now that the kernel initializes PAT table to the BIOS handoff
      state when MTRR is disabled, this Xen-specific PAT init code
      is no longer necessary.  Delete it from xen_start_kernel().
      
      Also change __init_cache_modes() to a static function since
      PAT table should not be tweaked by other modules.
      Signed-off-by: NToshi Kani <toshi.kani@hpe.com>
      Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
      Acked-by: NJuergen Gross <jgross@suse.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Luis R. Rodriguez <mcgrof@suse.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: elliott@hpe.com
      Cc: paul.gortmaker@windriver.com
      Cc: xen-devel@lists.xenproject.org
      Link: http://lkml.kernel.org/r/1458769323-24491-7-git-send-email-toshi.kani@hpe.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      88ba2811
    • T
      x86/mtrr: Fix Xorg crashes in Qemu sessions · edfe63ec
      Toshi Kani 提交于
      A Xorg failure on qemu32 was reported as a regression [1] caused by
      commit 9cd25aac ("x86/mm/pat: Emulate PAT when it is disabled").
      
      This patch fixes the Xorg crash.
      
      Negative effects of this regression were the following two failures [2]
      in Xorg on QEMU with QEMU CPU model "qemu32" (-cpu qemu32), which were
      triggered by the fact that its virtual CPU does not support MTRRs.
      
       #1. copy_process() failed in the check in reserve_pfn_range()
      
          copy_process
           copy_mm
            dup_mm
             dup_mmap
              copy_page_range
               track_pfn_copy
                reserve_pfn_range
      
       A WC map request was tracked as WC in memtype, which set a PTE as
       UC (pgprot) per __cachemode2pte_tbl[].  This led to this error in
       reserve_pfn_range() called from track_pfn_copy(), which obtained
       a pgprot from a PTE.  It converts pgprot to page_cache_mode, which
       does not necessarily result in the original page_cache_mode since
       __cachemode2pte_tbl[] redirects multiple types to UC.
      
       #2. error path in copy_process() then hit WARN_ON_ONCE in
           untrack_pfn().
      
           x86/PAT: Xorg:509 map pfn expected mapping type uncached-
           minus for [mem 0xfd000000-0xfdffffff], got write-combining
            Call Trace:
           dump_stack
           warn_slowpath_common
           ? untrack_pfn
           ? untrack_pfn
           warn_slowpath_null
           untrack_pfn
           ? __kunmap_atomic
           unmap_single_vma
           ? pagevec_move_tail_fn
           unmap_vmas
           exit_mmap
           mmput
           copy_process.part.47
           _do_fork
           SyS_clone
           do_syscall_32_irqs_on
           entry_INT80_32
      
      These negative effects are caused by two separate bugs, but they
      can be addressed in separate patches.  Fixing the pat_init() issue
      described below addresses the root cause, and avoids Xorg to hit
      these cases.
      
      When the CPU does not support MTRRs, MTRR does not call pat_init(),
      which leaves PAT enabled without initializing PAT.  This pat_init()
      issue is a long-standing issue, but manifested as issue #1 (and then
      hit issue #2) with the above-mentioned commit because the memtype
      now tracks cache attribute with 'page_cache_mode'.
      
      This pat_init() issue existed before the commit, but we used pgprot
      in memtype.  Hence, we did not have issue #1 before.  But WC request
      resulted in WT in effect because WC pgrot is actually WT when PAT
      is not initialized.  This is not how it was designed to work.  When
      PAT is set to disable properly, WC is converted to UC.  The use of
      WT can result in a system crash if the target range does not support
      WT.  Fortunately, nobody ran into such issue before.
      
      To fix this pat_init() issue, PAT code has been enhanced to provide
      pat_disable() interface.  Call this interface when MTRRs are disabled.
      By setting PAT to disable properly, PAT bypasses the memtype check,
      and avoids issue #1.
      
        [1]: https://lkml.org/lkml/2016/3/3/828
        [2]: https://lkml.org/lkml/2016/3/4/775Signed-off-by: NToshi Kani <toshi.kani@hpe.com>
      Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Luis R. Rodriguez <mcgrof@suse.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: elliott@hpe.com
      Cc: konrad.wilk@oracle.com
      Cc: paul.gortmaker@windriver.com
      Cc: xen-devel@lists.xenproject.org
      Link: http://lkml.kernel.org/r/1458769323-24491-5-git-send-email-toshi.kani@hpe.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      edfe63ec
    • T
      x86/mm/pat: Add pat_disable() interface · 224bb1e5
      Toshi Kani 提交于
      In preparation for fixing a regression caused by:
      
        9cd25aac ("x86/mm/pat: Emulate PAT when it is disabled")
      
      ... PAT needs to provide an interface that prevents the OS from
      initializing the PAT MSR.
      
      PAT MSR initialization must be done on all CPUs using the specific
      sequence of operations defined in the Intel SDM.  This requires MTRRs
      to be enabled since pat_init() is called as part of MTRR init
      from mtrr_rendezvous_handler().
      
      Make pat_disable() as the interface that prevents the OS from
      initializing the PAT MSR.  MTRR will call this interface when it
      cannot provide the SDM-defined sequence to initialize PAT.
      
      This also assures that pat_disable() called from pat_bsp_init()
      will set the PAT table properly when CPU does not support PAT.
      Signed-off-by: NToshi Kani <toshi.kani@hpe.com>
      Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Luis R. Rodriguez <mcgrof@suse.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Robert Elliott <elliott@hpe.com>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: konrad.wilk@oracle.com
      Cc: paul.gortmaker@windriver.com
      Cc: xen-devel@lists.xenproject.org
      Link: http://lkml.kernel.org/r/1458769323-24491-3-git-send-email-toshi.kani@hpe.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      224bb1e5
    • T
      x86/mm/pat: Add support of non-default PAT MSR setting · 02f037d6
      Toshi Kani 提交于
      In preparation for fixing a regression caused by:
      
        9cd25aac ("x86/mm/pat: Emulate PAT when it is disabled")'
      
      ... PAT needs to support a case that PAT MSR is initialized with a
      non-default value.
      
      When pat_init() is called and PAT is disabled, it initializes the
      PAT table with the BIOS default value. Xen, however, sets PAT MSR
      with a non-default value to enable WC. This causes inconsistency
      between the PAT table and PAT MSR when PAT is set to disable on Xen.
      
      Change pat_init() to handle the PAT disable cases properly.  Add
      init_cache_modes() to handle two cases when PAT is set to disable.
      
       1. CPU supports PAT: Set PAT table to be consistent with PAT MSR.
       2. CPU does not support PAT: Set PAT table to be consistent with
          PWT and PCD bits in a PTE.
      
      Note, __init_cache_modes(), renamed from pat_init_cache_modes(),
      will be changed to a static function in a later patch.
      Signed-off-by: NToshi Kani <toshi.kani@hpe.com>
      Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Luis R. Rodriguez <mcgrof@suse.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: elliott@hpe.com
      Cc: konrad.wilk@oracle.com
      Cc: paul.gortmaker@windriver.com
      Cc: xen-devel@lists.xenproject.org
      Link: http://lkml.kernel.org/r/1458769323-24491-2-git-send-email-toshi.kani@hpe.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      02f037d6