1. 21 1月, 2015 2 次提交
    • B
      kvm: Fix CR3_PCID_INVD type on 32-bit · cfaa790a
      Borislav Petkov 提交于
      arch/x86/kvm/emulate.c: In function ‘check_cr_write’:
      arch/x86/kvm/emulate.c:3552:4: warning: left shift count >= width of type
          rsvd = CR3_L_MODE_RESERVED_BITS & ~CR3_PCID_INVD;
      
      happens because sizeof(UL) on 32-bit is 4 bytes but we shift it 63 bits
      to the left.
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      cfaa790a
    • M
      KVM: x86: workaround SuSE's 2.6.16 pvclock vs masterclock issue · 54750f2c
      Marcelo Tosatti 提交于
      SuSE's 2.6.16 kernel fails to boot if the delta between tsc_timestamp
      and rdtsc is larger than a given threshold:
      
       * If we get more than the below threshold into the future, we rerequest
       * the real time from the host again which has only little offset then
       * that we need to adjust using the TSC.
       *
       * For now that threshold is 1/5th of a jiffie. That should be good
       * enough accuracy for completely broken systems, but also give us swing
       * to not call out to the host all the time.
       */
      #define PVCLOCK_DELTA_MAX ((1000000000ULL / HZ) / 5)
      
      Disable masterclock support (which increases said delta) in case the
      boot vcpu does not use MSR_KVM_SYSTEM_TIME_NEW.
      
      Upstreams kernels which support pvclock vsyscalls (and therefore make
      use of PVCLOCK_STABLE_BIT) use MSR_KVM_SYSTEM_TIME_NEW.
      Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      54750f2c
  2. 09 1月, 2015 4 次提交
  3. 18 12月, 2014 3 次提交
  4. 16 12月, 2014 17 次提交
  5. 12 12月, 2014 2 次提交
    • A
      arch: Add lightweight memory barriers dma_rmb() and dma_wmb() · 1077fa36
      Alexander Duyck 提交于
      There are a number of situations where the mandatory barriers rmb() and
      wmb() are used to order memory/memory operations in the device drivers
      and those barriers are much heavier than they actually need to be.  For
      example in the case of PowerPC wmb() calls the heavy-weight sync
      instruction when for coherent memory operations all that is really needed
      is an lsync or eieio instruction.
      
      This commit adds a coherent only version of the mandatory memory barriers
      rmb() and wmb().  In most cases this should result in the barrier being the
      same as the SMP barriers for the SMP case, however in some cases we use a
      barrier that is somewhere in between rmb() and smp_rmb().  For example on
      ARM the rmb barriers break down as follows:
      
        Barrier   Call     Explanation
        --------- -------- ----------------------------------
        rmb()     dsb()    Data synchronization barrier - system
        dma_rmb() dmb(osh) data memory barrier - outer sharable
        smp_rmb() dmb(ish) data memory barrier - inner sharable
      
      These new barriers are not as safe as the standard rmb() and wmb().
      Specifically they do not guarantee ordering between coherent and incoherent
      memories.  The primary use case for these would be to enforce ordering of
      reads and writes when accessing coherent memory that is shared between the
      CPU and a device.
      
      It may also be noted that there is no dma_mb().  Most architectures don't
      provide a good mechanism for performing a coherent only full barrier without
      resorting to the same mechanism used in mb().  As such there isn't much to
      be gained in trying to define such a function.
      
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Cc: Michael Ellerman <michael@ellerman.id.au>
      Cc: Michael Neuling <mikey@neuling.org>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: David Miller <davem@davemloft.net>
      Acked-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Acked-by: NWill Deacon <will.deacon@arm.com>
      Signed-off-by: NAlexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1077fa36
    • A
      arch: Cleanup read_barrier_depends() and comments · 8a449718
      Alexander Duyck 提交于
      This patch is meant to cleanup the handling of read_barrier_depends and
      smp_read_barrier_depends.  In multiple spots in the kernel headers
      read_barrier_depends is defined as "do {} while (0)", however we then go
      into the SMP vs non-SMP sections and have the SMP version reference
      read_barrier_depends, and the non-SMP define it as yet another empty
      do/while.
      
      With this commit I went through and cleaned out the duplicate definitions
      and reduced the number of definitions down to 2 per header.  In addition I
      moved the 50 line comments for the macro from the x86 and mips headers that
      defined it as an empty do/while to those that were actually defining the
      macro, alpha and blackfin.
      Signed-off-by: NAlexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8a449718
  6. 11 12月, 2014 4 次提交
    • B
      x86/asm: Unify segment selector defines · be9d1738
      Borislav Petkov 提交于
      Those are identical on 32- and 64-bit, unify them. No functional
      change.
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/1418127959-29902-1-git-send-email-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      be9d1738
    • X
      x86/mm: Fix zone ranges boot printout · c072b90c
      Xishi Qiu 提交于
      This is the usual physical memory layout boot printout:
      	...
      	[    0.000000] Zone ranges:
      	[    0.000000]   DMA      [mem 0x00001000-0x00ffffff]
      	[    0.000000]   DMA32    [mem 0x01000000-0xffffffff]
      	[    0.000000]   Normal   [mem 0x100000000-0xc3fffffff]
      	[    0.000000] Movable zone start for each node
      	[    0.000000] Early memory node ranges
      	[    0.000000]   node   0: [mem 0x00001000-0x00099fff]
      	[    0.000000]   node   0: [mem 0x00100000-0xbf78ffff]
      	[    0.000000]   node   0: [mem 0x100000000-0x63fffffff]
      	[    0.000000]   node   1: [mem 0x640000000-0xc3fffffff]
      	...
      
      This is the log when we set "mem=2G" on the boot cmdline:
      	...
      	[    0.000000] Zone ranges:
      	[    0.000000]   DMA      [mem 0x00001000-0x00ffffff]
      	[    0.000000]   DMA32    [mem 0x01000000-0xffffffff]  // should be 0x7fffffff, right?
      	[    0.000000]   Normal   empty
      	[    0.000000] Movable zone start for each node
      	[    0.000000] Early memory node ranges
      	[    0.000000]   node   0: [mem 0x00001000-0x00099fff]
      	[    0.000000]   node   0: [mem 0x00100000-0x7fffffff]
      	...
      
      This patch fixes the printout, the following log shows the right
      ranges:
      	...
      	[    0.000000] Zone ranges:
      	[    0.000000]   DMA      [mem 0x00001000-0x00ffffff]
      	[    0.000000]   DMA32    [mem 0x01000000-0x7fffffff]
      	[    0.000000]   Normal   empty
      	[    0.000000] Movable zone start for each node
      	[    0.000000] Early memory node ranges
      	[    0.000000]   node   0: [mem 0x00001000-0x00099fff]
      	[    0.000000]   node   0: [mem 0x00100000-0x7fffffff]
      	...
      Suggested-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NXishi Qiu <qiuxishi@huawei.com>
      Cc: Linux MM <linux-mm@kvack.org>
      Cc: <dave@sr71.net>
      Cc: Rik van Riel <riel@redhat.com>
      Link: http://lkml.kernel.org/r/5487AB3D.6070306@huawei.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      c072b90c
    • K
      mm: fix huge zero page accounting in smaps report · c164e038
      Kirill A. Shutemov 提交于
      As a small zero page, huge zero page should not be accounted in smaps
      report as normal page.
      
      For small pages we rely on vm_normal_page() to filter out zero page, but
      vm_normal_page() is not designed to handle pmds.  We only get here due
      hackish cast pmd to pte in smaps_pte_range() -- pte and pmd format is not
      necessary compatible on each and every architecture.
      
      Let's add separate codepath to handle pmds.  follow_trans_huge_pmd() will
      detect huge zero page for us.
      
      We would need pmd_dirty() helper to do this properly.  The patch adds it
      to THP-enabled architectures which don't yet have one.
      
      [akpm@linux-foundation.org: use do_div to fix 32-bit build]
      Signed-off-by: N"Kirill A. Shutemov" <kirill@shutemov.name>
      Reported-by: NFengguang Wu <fengguang.wu@intel.com>
      Tested-by: NFengwei Yin <yfw.kernel@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      c164e038
    • D
      net, lib: kill arch_fast_hash library bits · 0cb6c969
      Daniel Borkmann 提交于
      As there are now no remaining users of arch_fast_hash(), lets kill
      it entirely.
      
      This basically reverts commit 71ae8aac ("lib: introduce arch
      optimized hash library") and follow-up work, that is f.e., commit
      23721754 ("lib: hash: follow-up fixups for arch hash"),
      commit e3fec2f7 ("lib: Add missing arch generic-y entries for
      asm-generic/hash.h") and last but not least commit 6a02652d
      ("perf tools: Fix include for non x86 architectures").
      
      Cc: Francesco Fusco <fusco@ntop.org>
      Cc: Thomas Graf <tgraf@suug.ch>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: NDaniel Borkmann <dborkman@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0cb6c969
  7. 08 12月, 2014 2 次提交
  8. 06 12月, 2014 1 次提交
    • B
      x86, microcode: Reload microcode on resume · fbae4ba8
      Borislav Petkov 提交于
      Normally, we do reapply microcode on resume. However, in the cases where
      that microcode comes from the early loader and the late loader hasn't
      been utilized yet, there's no easy way for us to go and apply the patch
      applied during boot by the early loader.
      
      Thus, reuse the patch stashed by the early loader for the BSP.
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      fbae4ba8
  9. 05 12月, 2014 3 次提交
  10. 04 12月, 2014 2 次提交
    • J
      xen: switch to linear virtual mapped sparse p2m list · 054954eb
      Juergen Gross 提交于
      At start of the day the Xen hypervisor presents a contiguous mfn list
      to a pv-domain. In order to support sparse memory this mfn list is
      accessed via a three level p2m tree built early in the boot process.
      Whenever the system needs the mfn associated with a pfn this tree is
      used to find the mfn.
      
      Instead of using a software walked tree for accessing a specific mfn
      list entry this patch is creating a virtual address area for the
      entire possible mfn list including memory holes. The holes are
      covered by mapping a pre-defined  page consisting only of "invalid
      mfn" entries. Access to a mfn entry is possible by just using the
      virtual base address of the mfn list and the pfn as index into that
      list. This speeds up the (hot) path of determining the mfn of a
      pfn.
      
      Kernel build on a Dell Latitude E6440 (2 cores, HT) in 64 bit Dom0
      showed following improvements:
      
      Elapsed time: 32:50 ->  32:35
      System:       18:07 ->  17:47
      User:        104:00 -> 103:30
      
      Tested with following configurations:
      - 64 bit dom0, 8GB RAM
      - 64 bit dom0, 128 GB RAM, PCI-area above 4 GB
      - 32 bit domU, 512 MB, 8 GB, 43 GB (more wouldn't work even without
                                          the patch)
      - 32 bit domU, ballooning up and down
      - 32 bit domU, save and restore
      - 32 bit domU with PCI passthrough
      - 64 bit domU, 8 GB, 2049 MB, 5000 MB
      - 64 bit domU, ballooning up and down
      - 64 bit domU, save and restore
      - 64 bit domU with PCI passthrough
      Signed-off-by: NJuergen Gross <jgross@suse.com>
      Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>
      054954eb
    • J
      xen: Hide get_phys_to_machine() to be able to tune common path · 0aad5689
      Juergen Gross 提交于
      Today get_phys_to_machine() is always called when the mfn for a pfn
      is to be obtained. Add a wrapper __pfn_to_mfn() as inline function
      to be able to avoid calling get_phys_to_machine() when possible as
      soon as the switch to a linear mapped p2m list has been done.
      Signed-off-by: NJuergen Gross <jgross@suse.com>
      Reviewed-by: NDavid Vrabel <david.vrabel@citrix.com>
      Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>
      0aad5689