1. 08 Jun, 2018 1 commit
    • L
      mm: introduce ARCH_HAS_PTE_SPECIAL · 3010a5ea
      Laurent Dufour committed
      Currently, PTE special support is turned on in per-architecture
      header files.  Most of the time, it is defined in
      arch/*/include/asm/pgtable.h, depending or not on some other
      per-architecture static definition.
      
      This patch introduces a new configuration variable to manage this
      directly in the Kconfig files.  It will later replace
      __HAVE_ARCH_PTE_SPECIAL.

      Here are notes for some architectures where the definition of
      __HAVE_ARCH_PTE_SPECIAL is not obvious:
      
      arm
      __HAVE_ARCH_PTE_SPECIAL is currently defined in
      arch/arm/include/asm/pgtable-3level.h, which is included by
      arch/arm/include/asm/pgtable.h when CONFIG_ARM_LPAE is set.
      So select ARCH_HAS_PTE_SPECIAL if ARM_LPAE.
      
      powerpc
      __HAVE_ARCH_PTE_SPECIAL is defined in 2 files:
       - arch/powerpc/include/asm/book3s/64/pgtable.h
       - arch/powerpc/include/asm/pte-common.h
      The first one is included if (PPC_BOOK3S & PPC64) while the second is
      included in all the other cases.
      So select ARCH_HAS_PTE_SPECIAL all the time.
      
      sparc
      __HAVE_ARCH_PTE_SPECIAL is defined if defined(__sparc__) &&
      defined(__arch64__), which are defined through the compiler in
      sparc/Makefile if !SPARC32, which I assume to mean SPARC64.
      So select ARCH_HAS_PTE_SPECIAL if SPARC64.
      
      There is no functional change introduced by this patch.
      
      Link: http://lkml.kernel.org/r/1523433816-14460-2-git-send-email-ldufour@linux.vnet.ibm.com
      Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
      Suggested-by: Jerome Glisse <jglisse@redhat.com>
      Reviewed-by: Jerome Glisse <jglisse@redhat.com>
      Acked-by: David Rientjes <rientjes@google.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: Rich Felker <dalias@libc.org>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Palmer Dabbelt <palmer@sifive.com>
      Cc: Albert Ou <albert@sifive.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Robin Murphy <robin.murphy@arm.com>
      Cc: Christophe LEROY <christophe.leroy@c-s.fr>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
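As a rough illustration of what the new symbol makes possible, the C sketch below guards a PTE helper on CONFIG_ARCH_HAS_PTE_SPECIAL instead of on a macro buried in an architecture header. Everything here is a simplified stand-in and an assumption: the one-word pte_t, the bit position PTE_SPECIAL_BIT, and the helper bodies are not the kernel's definitions, and the #define only simulates what Kconfig would generate for an architecture that selects the symbol.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: simulate the Kconfig-generated define for an architecture
 * that selects ARCH_HAS_PTE_SPECIAL. */
#define CONFIG_ARCH_HAS_PTE_SPECIAL 1

/* Hypothetical, simplified PTE: a single 64-bit word. */
typedef struct { uint64_t val; } pte_t;
static_assert(sizeof(pte_t) == 8, "sketch assumes a one-word PTE");

/* Hypothetical software bit marking a "special" mapping. */
#define PTE_SPECIAL_BIT (UINT64_C(1) << 9)

#ifdef CONFIG_ARCH_HAS_PTE_SPECIAL
static inline int pte_special(pte_t pte)
{
    return (pte.val & PTE_SPECIAL_BIT) != 0;
}

static inline pte_t pte_mkspecial(pte_t pte)
{
    pte.val |= PTE_SPECIAL_BIT;
    return pte;
}
#else
/* Without the config symbol, no PTE is ever reported as special. */
static inline int pte_special(pte_t pte) { (void)pte; return 0; }
static inline pte_t pte_mkspecial(pte_t pte) { return pte; }
#endif
```

Because the symbol comes from Kconfig, other Kconfig entries could also depend on it, which a header-private macro does not allow.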
  2. 02 Jun, 2018 1 commit
  3. 01 Jun, 2018 1 commit
  4. 31 May, 2018 10 commits
    • K
      perf/x86/intel/uncore: Clean up client IMC uncore · 9aae1780
      Kan Liang committed
      The counters in the client IMC uncore are free-running counters, not
      fixed counters, so the code should treat them as such and use the new
      infrastructure for free-running counters.
      
      Introduce a new type SNB_PCI_UNCORE_IMC_DATA for client IMC
      free-running counters.

      Keep the customized event_init() function to stay compatible with the
      old event encoding.

      Clean up the other customized event_*() functions.
      Signed-off-by: Kan Liang <kan.liang@intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: acme@kernel.org
      Cc: eranian@google.com
      Link: http://lkml.kernel.org/r/1525371913-10597-8-git-send-email-kan.liang@intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • K
      perf/x86/intel/uncore: Expose uncore_pmu_event*() functions · 5a6c9d94
      Kan Liang committed
      Some uncores have a customized PMU, but a customized PMU does not need
      to customize everything. For example, the client IMC uncore only needs
      a customized init() function; other functions such as
      add()/del()/start()/stop()/read() can use the generic code.

      Expose the uncore_pmu_event_add/del/start/stop() functions.
      Signed-off-by: Kan Liang <kan.liang@intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: acme@kernel.org
      Cc: eranian@google.com
      Link: http://lkml.kernel.org/r/1525371913-10597-7-git-send-email-kan.liang@intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • K
      perf/x86/intel/uncore: Support IIO free-running counters on SKX · 0f519f03
      Kan Liang committed
      As of Skylake Server, there are a number of free running counters in
      each IIO Box that collect counts of per-box IO clocks and per-port
      Input/Output x BW/Utilization.
      
      The free running counters cannot be part of the existing IIO BOX,
      because, quoting from Peter Zijlstra:
      
        "This will result in some (probably) unexpected scheduling artifacts.
         Probably the only way to really cure that is to have the free running
         counters in their own PMU and not share with the GP counters of this
         box."
      
      So let's add a new PMU for the free running counters, as suggested.
      
      The free-running counter is read-only and always active. Counting will
      be suspended only when the IIO Box is powered down.
      
      There are three types of IIO free-running counters on Skylake Server:
       - The IO CLOCKS counter counts the clock of the IIO box.
       - The BANDWIDTH counters count inbound (PCIe->CPU) and outbound
         (CPU->PCIe) bandwidth.
       - The UTILIZATION counters count input/output utilization.

      The free-running counters are 36 bits wide.
      Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: Kan Liang <kan.liang@intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: acme@kernel.org
      Cc: eranian@google.com
      Link: http://lkml.kernel.org/r/1525371913-10597-6-git-send-email-kan.liang@intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
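Because these counters are read-only and only 36 bits wide, a reader has to compute deltas between successive raw reads and cope with wraparound at the counter width. The sketch below shows one common shift-based way to do that in plain C. The function name freerunning_delta and the constant IIO_FREERUNNING_BITS are illustrative assumptions, not names taken from the patch, and the sketch assumes the counter wraps at most once between two reads.

```c
#include <assert.h>
#include <stdint.h>

/* Bit width of the SKX IIO free-running counters, per the commit message. */
#define IIO_FREERUNNING_BITS 36

/*
 * Return how far a free-running counter advanced between two raw reads.
 * Shifting both values to the top of a 64-bit word makes the unsigned
 * subtraction wrap at the counter width instead of at 64 bits.
 */
static inline uint64_t freerunning_delta(uint64_t prev, uint64_t now,
                                         unsigned int bits)
{
    assert(bits >= 1 && bits <= 64);
    unsigned int shift = 64 - bits;
    uint64_t delta = (now << shift) - (prev << shift);
    return delta >> shift;
}
```

For example, with a 36-bit counter a previous read of 0xFFFFFFFFF followed by a read of 5 yields a delta of 6, since the counter wrapped once in between.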
    • K
      perf/x86/intel/uncore: Add infrastructure for free running counters · 0e0162df
      Kan Liang committed
      There are a number of free running counters introduced for uncore, which
      provide highly valuable information to a wide array of customers.
      However, the generic uncore code doesn't support them yet.
      
      The free running counters will be specially handled based on their
      unique attributes:
      
       - They are read-only. They cannot be enabled or disabled.

       - The event and the counter are always 1:1 mapped. The event does
         not need to be assigned to a counter or tracked by event_list.

       - They are always active. Their availability does not need to be
         checked.

       - Their bit widths differ.

      Also, use inline helpers to replace the checks for fixed counter and
      free running counter.
      Signed-off-by: Kan Liang <kan.liang@intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: acme@kernel.org
      Cc: eranian@google.com
      Link: http://lkml.kernel.org/r/1525371913-10597-5-git-send-email-kan.liang@intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • K
      perf/x86/intel/uncore: Add new data structures for free running counters · 927b2deb
      Kan Liang committed
      There are a number of free running counters introduced for uncore, which
      provide highly valuable information to a wide array of customers.
      For example, Skylake Server has IIO free running counters to collect
      Input/Output x BW/Utilization.
      
      There is NO event available on the general purpose counters that is
      exactly the same as the free running counters. The generic uncore code
      needs to be enhanced to support the new counters.
      
      In the uncore document, there is no event-code assigned to free running
      counters. Some events need to be defined to indicate the free running
      counters. The events are encoded as event-code + umask-code.
      
      The event-code for all free running counters is 0xff, which is the same
      as the fixed counters:
      
      - It has not been decided what code will be used for common events on
        future platforms. 0xff is the only one which will definitely not be
        used as any common event-code.
      - Current events on the general purpose counters cannot be reused,
        because NO available event is exactly the same as the free
        running counters.
      - Even in the existing code, the fixed counters for core, which have
        the same event-code, may count different things. Hence, it should
        not surprise users if the free running counters that share the
        same event-code also count different things. The umask is used to
        distinguish the counters.
      
      The umask-code is used to distinguish a fixed counter and a free running
      counter, and different types of free running counters.
      
      For fixed counters, the umask-code is 0x0X, where X indicates the index
      of the fixed counter, which starts from 0.
      
       - Compatible with the old event encoding.
      
       - Currently, there is only one fixed counter. There are still 15
         reserved spaces for extension.
      
      For free running counters, the umask-code uses the rest of the space.
      It would follow the format of 0xXY:
      
       - X stands for the type of free running counters, which starts from 1.
      
       - Y stands for the index of free running counters of same type, which
         starts from 0.
      
       - The free running counters do different things. They can be
         categorized into several types, according to the MSR location, bit
         width and definition. E.g. there are three types of IIO free
         running counters on Skylake server to monitor IO CLOCKS, BANDWIDTH
         and UTILIZATION on different ports. This makes it easy to locate
         the free running counter of a specific type.

       - So far, there are at most 8 counters of each type.  There are still 8
         reserved spaces for extension.
      
      Introduce a new index to indicate the free running counters. One
      index is enough for all free running counters: because they are
      always active, and the event and free running counter are always 1:1
      mapped, no extra index is needed to indicate the assigned counter.

      Introduce a new data structure to store free running counter related
      information for each type. It includes the number of counters, bit
      width, base address, offset between counters and offset between boxes.

      Introduce several inline helpers to check the index for fixed counter
      and free running counter, validate a free running counter event, and
      retrieve the free running counter information according to box and
      event.
      Signed-off-by: Kan Liang <kan.liang@intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: acme@kernel.org
      Cc: eranian@google.com
      Link: http://lkml.kernel.org/r/1525371913-10597-4-git-send-email-kan.liang@intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
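The encoding and the per-type data structure described in this commit message can be sketched in C roughly as follows. This is an illustration under stated assumptions, not the patch's code: it assumes the event-code sits in bits 0-7 of the event config and the umask-code in bits 8-15, and every identifier (ev_code, is_freerunning_event, struct freerunning_info, freerunning_addr, and so on) is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define EVENT_CODE_SHARED       0xff  /* fixed and free running counters */
#define UMASK_FREERUNNING_START 0x10  /* umask 0x0X is reserved for fixed */

/* Assumption: event-code in bits 0-7, umask-code in bits 8-15. */
static inline unsigned int ev_code(uint64_t config)
{
    return (unsigned int)(config & 0xff);
}

static inline unsigned int ev_umask(uint64_t config)
{
    return (unsigned int)((config >> 8) & 0xff);
}

static inline int is_fixed_event(uint64_t config)
{
    return ev_code(config) == EVENT_CODE_SHARED &&
           ev_umask(config) < UMASK_FREERUNNING_START;
}

static inline int is_freerunning_event(uint64_t config)
{
    return ev_code(config) == EVENT_CODE_SHARED &&
           ev_umask(config) >= UMASK_FREERUNNING_START;
}

/* X in 0xXY, converted to a zero-based type (encoded types start from 1).
 * Only meaningful when is_freerunning_event() is true. */
static inline unsigned int freerunning_type(uint64_t config)
{
    return (ev_umask(config) >> 4) - 1;
}

/* Y in 0xXY: index of the counter within its type, starting from 0. */
static inline unsigned int freerunning_index(uint64_t config)
{
    return ev_umask(config) & 0xf;
}

/* Hypothetical per-type description holding the fields the text lists. */
struct freerunning_info {
    unsigned int num_counters;   /* number of counters of this type */
    unsigned int bits;           /* bit width */
    uint64_t base;               /* base address */
    uint64_t counter_offset;     /* offset between counters */
    uint64_t box_offset;         /* offset between boxes */
};

/* Locate one counter from the box number and the counter index. */
static inline uint64_t freerunning_addr(const struct freerunning_info *info,
                                        unsigned int box, unsigned int idx)
{
    assert(idx < info->num_counters);
    return info->base + info->counter_offset * idx + info->box_offset * box;
}
```

Under these assumptions, config 0x21ff decodes to a free running counter of the second type (zero-based type 1) with index 1, while config 0x00ff is the fixed counter.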
    • K
      perf/x86/intel/uncore: Correct fixed counter index check in generic code · 4749f819
      Kan Liang committed
      There is no index which is bigger than UNCORE_PMC_IDX_FIXED. The only
      exception is the client IMC uncore, which has been specially handled.
      For generic code, it is not correct to use >= to check for the fixed
      counter. This code quality issue will cause problems when a new
      counter index is introduced.
      Signed-off-by: Kan Liang <kan.liang@intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: acme@kernel.org
      Cc: eranian@google.com
      Link: http://lkml.kernel.org/r/1525371913-10597-3-git-send-email-kan.liang@intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
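A small C sketch of why the >= check becomes a problem once another index is added after the fixed one. The index values and the names PMC_IDX_MAX_GENERIC, PMC_IDX_FIXED, PMC_IDX_NEW, is_fixed_loose and is_fixed_strict are hypothetical and chosen only for illustration; they are not the kernel's definitions.

```c
#include <assert.h>

/* Hypothetical index layout: generic counters first, then the single
 * fixed counter, then a newly introduced index. */
enum {
    PMC_IDX_MAX_GENERIC = 8,
    PMC_IDX_FIXED       = PMC_IDX_MAX_GENERIC,
    PMC_IDX_NEW         = PMC_IDX_FIXED + 1,
};

/* The old style of check: anything at or above the fixed index. */
static inline int is_fixed_loose(int idx)  { return idx >= PMC_IDX_FIXED; }

/* The corrected style of check: exactly the fixed index. */
static inline int is_fixed_strict(int idx) { return idx == PMC_IDX_FIXED; }

/* Classify an index using the strict check. */
static inline const char *classify(int idx)
{
    assert(idx >= 0 && idx <= PMC_IDX_NEW);
    if (is_fixed_strict(idx))
        return "fixed";
    if (idx == PMC_IDX_NEW)
        return "new";
    return "generic";
}
```

With the loose check, PMC_IDX_NEW would be treated as a fixed counter; with the strict check it is not, so a later index can be added without touching every call site.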
    • K
      perf/x86/intel/uncore: Correct fixed counter index check for NHM · d71f11c0
      Kan Liang committed
      For Nehalem and Westmere, there is only one fixed counter for W-Box.
      There is no index which is bigger than UNCORE_PMC_IDX_FIXED.
      It is not correct to use >= to check for the fixed counter.
      This code quality issue will cause problems when a new counter index
      is introduced.
      Signed-off-by: Kan Liang <kan.liang@intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: acme@kernel.org
      Cc: eranian@google.com
      Link: http://lkml.kernel.org/r/1525371913-10597-2-git-send-email-kan.liang@intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • K
      perf/x86/intel/uncore: Introduce customized event_read() for client IMC uncore · 2da33146
      Kan Liang committed
      There are two free-running counters for client IMC uncore. The
      customized event_init() function hard codes their index to
      'UNCORE_PMC_IDX_FIXED' and 'UNCORE_PMC_IDX_FIXED + 1'.
      To support the index 'UNCORE_PMC_IDX_FIXED + 1', the generic
      uncore_perf_event_update is obscurely hacked.
      This code quality issue will cause problems when a new counter index is
      introduced into the generic code, for example, a new index for
      free-running counter.

      Introduce a customized event_read() function for client IMC uncore.
      The customized function is copied from the previous generic
      uncore_pmu_event_read().
      The index 'UNCORE_PMC_IDX_FIXED + 1' will be isolated for client IMC
      uncore only.
      Signed-off-by: Kan Liang <kan.liang@intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: acme@kernel.org
      Cc: eranian@google.com
      Link: http://lkml.kernel.org/r/1525371913-10597-1-git-send-email-kan.liang@intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • E
      crypto: x86/salsa20 - remove x86 salsa20 implementations · b7b73cd5
      Eric Biggers committed
      The x86 assembly implementations of Salsa20 use the frame base pointer
      register (%ebp or %rbp), which breaks frame pointer convention and
      breaks stack traces when unwinding from an interrupt in the crypto code.
      Recent (v4.10+) kernels will warn about this, e.g.
      
      WARNING: kernel stack regs at 00000000a8291e69 in syzkaller047086:4677 has bad 'bp' value 000000001077994c
      [...]
      
      But after looking into it, I believe there's very little reason to still
      retain the x86 Salsa20 code.  First, these are *not* vectorized
      (SSE2/SSSE3/AVX2) implementations, which would be needed to get anywhere
      close to the best Salsa20 performance on any remotely modern x86
      processor; they're just regular x86 assembly.  Second, it's still
      unclear that anyone is actually using the kernel's Salsa20 at all,
      especially given that now ChaCha20 is supported too, and with much more
      efficient SSSE3 and AVX2 implementations.  Finally, in benchmarks I did
      on both Intel and AMD processors with both gcc 8.1.0 and gcc 4.9.4, the
      x86_64 salsa20-asm is actually slightly *slower* than salsa20-generic
      (~3% slower on Skylake, ~10% slower on Zen), while the i686 salsa20-asm
      is only slightly faster than salsa20-generic (~15% faster on Skylake,
      ~20% faster on Zen).  The gcc version made little difference.
      
      So, the x86_64 salsa20-asm is pretty clearly useless.  That leaves just
      the i686 salsa20-asm, which based on my tests provides a 15-20% speed
      boost.  But that's without updating the code to not use %ebp.  And given
      the maintenance cost, the small speed difference vs. salsa20-generic,
      the fact that few people still use i686 kernels, the doubt that anyone
      is even using the kernel's Salsa20 at all, and the fact that a SSE2
      implementation would almost certainly be much faster on any remotely
      modern x86 processor yet no one has cared enough to add one yet, I don't
      think it's worthwhile to keep.
      
      Thus, just remove both the x86_64 and i686 salsa20-asm implementations.
      
      Reported-by: syzbot+ffa3a158337bbc01ff09@syzkaller.appspotmail.com
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    • O
      crypto: morus - Mark MORUS SIMD glue as x86-specific · 2808f173
      Ondrej Mosnacek committed
      Commit 56e8e57f ("crypto: morus - Add common SIMD glue code for
      MORUS") accidentally considered the glue code to be usable by different
      architectures, but it seems to be only usable on x86.
      
      This patch moves it under arch/x86/crypto and adds 'depends on X86' to
      the Kconfig options and also removes the prompt to hide these internal
      options from the user.
      Reported-by: kbuild test robot <lkp@intel.com>
      Signed-off-by: Ondrej Mosnacek <omosnacek@gmail.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
  5. 29 May, 2018 3 commits
    • M
      kconfig: add basic helper macros to scripts/Kconfig.include · e1cfdc0e
      Masahiro Yamada committed
      Kconfig got text processing tools like we see in Make.  Add Kconfig
      helper macros to scripts/Kconfig.include like we collect Makefile
      macros in scripts/Kbuild.include.
      Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
      Reviewed-by: Kees Cook <keescook@chromium.org>
      Reviewed-by: Ulf Magnusson <ulfalizer@gmail.com>
    • M
      kconfig: show compiler version text in the top comment · 21c54b77
      Masahiro Yamada committed
      The kernel configuration phase is now tightly coupled with the compiler
      in use.  It will be nice to show the compiler information in Kconfig.
      
      The compiler information will be displayed like this:
      
        $ make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- config
        scripts/kconfig/conf  --oldaskconfig Kconfig
        *
        * Linux/arm64 4.16.0-rc1 Kernel Configuration
        *
        *
        * Compiler: aarch64-linux-gnu-gcc (Linaro GCC 7.2-2017.11) 7.2.1 20171011
        *
        *
        * General setup
        *
        Compile also drivers which will not load (COMPILE_TEST) [N/y/?]
      
      If you use GUI methods such as menuconfig, it will be displayed in the
      top menu.
      
      This is simply implemented by using the 'comment' statement.  So, it
      will be saved into the .config file as well.
      
      This commit has a very important meaning.  If the compiler is upgraded,
      Kconfig must be re-run since different compilers have different sets
      of supported options.
      
      All referenced environments are written to include/config/auto.conf.cmd
      so that any environment change triggers syncconfig and prompts the user
      to input new values if needed.

      With this commit, something like the following will be added to
      include/config/auto.conf.cmd:
      
        ifneq "$(CC_VERSION_TEXT)" "aarch64-linux-gnu-gcc (Linaro GCC 7.2-2017.11) 7.2.1 20171011"
        include/config/auto.conf: FORCE
        endif
      Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
      Reviewed-by: Kees Cook <keescook@chromium.org>
    • M
      kconfig: reference environment variables directly and remove 'option env=' · 104daea1
      Masahiro Yamada committed
      To get access to environment variables, Kconfig needs to define a
      symbol using "option env=" syntax.  It is tedious to add a symbol entry
      for each environment variable given that we need to define many more
      such as 'CC', 'AS', 'srctree' etc. to evaluate the compiler capability
      in Kconfig.
      
      Adding '$' for symbol references is grammatically inconsistent.
      Looking at the code, the symbols prefixed with '$' are expanded by:
       - conf_expand_value()
         This is used to expand 'arch/$ARCH/defconfig' and 'defconfig_list'
       - sym_expand_string_value()
         This is used to expand strings in 'source' and 'mainmenu'
      
      All of them are fixed values independent of user configuration.  So,
      they can be changed into the direct expansion instead of symbols.
      
      This change makes the code much cleaner.  The bounce symbols 'SRCARCH',
      'ARCH', 'SUBARCH', 'KERNELVERSION' are gone.
      
      sym_init() hard-coding 'UNAME_RELEASE' is also gone.  'UNAME_RELEASE'
      should be replaced with an environment variable.
      
      ARCH_DEFCONFIG is a normal symbol, so it should be simply referenced
      without '$' prefix.
      
      The new syntax is modeled on Make.  The variable reference needs
      parentheses, like $(FOO), but you can omit them for single-letter
      variables, like $F.  Yet, in Makefiles, people tend to use the
      parenthetical form for consistency / clarification.
      
      At this moment, only the environment variable is supported, but I will
      extend the concept of 'variable' later on.
      
      The variables are expanded in the lexer so we can simplify the token
      handling on the parser side.
      
      For example, the following code works.
      
      [Example code]
      
        config MY_TOOLCHAIN_LIST
                string
                default "My tools: CC=$(CC), AS=$(AS), CPP=$(CPP)"
      
      [Result]
      
        $ make -s alldefconfig && tail -n 1 .config
        CONFIG_MY_TOOLCHAIN_LIST="My tools: CC=gcc, AS=as, CPP=gcc -E"
      Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
      Reviewed-by: Kees Cook <keescook@chromium.org>
  6. 28 May, 2018 3 commits
  7. 27 May, 2018 1 commit
  8. 26 May, 2018 1 commit
  9. 25 May, 2018 4 commits
  10. 24 May, 2018 2 commits
    • W
      KVM: x86: Update cpuid properly when CR4.OSXAVE or CR4.PKE is changed · c4d21882
      Wei Huang committed
      The CPUID bits of OSXSAVE (function=0x1) and OSPKE (func=0x7, leaf=0x0)
      allow user apps to detect if the OS has set CR4.OSXSAVE or CR4.PKE. KVM is
      supposed to update these CPUID bits when CR4 is updated. Current KVM
      code doesn't handle some special cases when updates come from the emulator.
      Here is one example:
      
        Step 1: guest boots
        Step 2: guest OS enables XSAVE ==> CR4.OSXSAVE=1 and CPUID.OSXSAVE=1
        Step 3: guest hot reboot ==> QEMU resets CR4 to 0, but CPUID.OSXSAVE==1
        Step 4: guest OS checks CPUID.OSXSAVE, detects 1, then executes xgetbv

      Step 4 above will cause a #UD and a guest crash because the guest OS hasn't
      turned on OSXSAVE yet. This patch solves the problem by comparing
      old_cr4 with cr4. If the related bits have been changed,
      kvm_update_cpuid() needs to be called.
      Signed-off-by: Wei Huang <wei@redhat.com>
      Reviewed-by: Bandan Das <bsd@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
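The comparison of old_cr4 with cr4 can be sketched in plain C as below. CR4.OSXSAVE is bit 18 and CR4.PKE is bit 22; the struct vcpu_model, the helper names, and the cached cpuid_osxsave flag are simplified assumptions standing in for KVM's state and for kvm_update_cpuid(), not the actual implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define X86_CR4_OSXSAVE (UINT64_C(1) << 18)
#define X86_CR4_PKE     (UINT64_C(1) << 22)

/* True when a CR4 write flipped a bit that is mirrored into CPUID. */
static inline int cr4_change_needs_cpuid_update(uint64_t old_cr4, uint64_t cr4)
{
    return ((old_cr4 ^ cr4) & (X86_CR4_OSXSAVE | X86_CR4_PKE)) != 0;
}

/* Hypothetical, heavily simplified vCPU state. */
struct vcpu_model {
    uint64_t cr4;
    int cpuid_osxsave;   /* cached CPUID.OSXSAVE bit seen by the guest */
};

/* Apply a CR4 write from any source and refresh the cached CPUID bit
 * only when one of the mirrored bits actually changed. */
static inline void model_set_cr4(struct vcpu_model *v, uint64_t cr4)
{
    assert(v != NULL);
    uint64_t old_cr4 = v->cr4;

    v->cr4 = cr4;
    if (cr4_change_needs_cpuid_update(old_cr4, cr4))
        v->cpuid_osxsave = (cr4 & X86_CR4_OSXSAVE) != 0;
}
```

In this model, resetting CR4 to 0 after the guest enabled XSAVE clears the cached CPUID.OSXSAVE bit, so the rebooted guest would not see a stale 1.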
    • D
      x86/kvm: fix LAPIC timer drift when guest uses periodic mode · d8f2f498
      David Vrabel committed
      Since 4.10, commit 8003c9ae (KVM: LAPIC: add APIC Timer
      periodic/oneshot mode VMX preemption timer support), guests using
      periodic LAPIC timers (such as FreeBSD 8.4) would see their timers
      drift significantly over time.
      
      Differences in the underlying clocks and numerical errors mean the
      periods of the two timers (hv and sw) are not the same. This
      difference will accumulate with every expiry, resulting in a large
      error between the hv and sw timers.

      This means the sw timer may be running slow when compared to the hv
      timer. When the timer is switched from hv to sw, the now active sw
      timer will expire late. The guest VCPU is reentered and it switches to
      using the hv timer. This timer catches up, injecting multiple IRQs
      into the guest (of which the guest only sees one as it does not get to
      run until the hv timer has caught up) and thus the guest's timer rate
      is low (and becomes increasingly slower over time as the sw timer lags
      further and further behind).
      
      I believe a similar problem would occur if the hv timer is the slower
      one, but I have not observed this.
      
      Fix this by synchronizing the deadlines for both timers to the same
      time source on every tick. This prevents the errors from accumulating.
      
      Fixes: 8003c9ae
      Cc: Wanpeng Li <wanpeng.li@hotmail.com>
      Signed-off-by: David Vrabel <david.vrabel@nutanix.com>
      Cc: stable@vger.kernel.org
      Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
      Reviewed-by: Wanpeng Li <wanpengli@tencent.com>
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
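The accumulation effect can be shown with a toy model in C. This is only an arithmetic sketch under simplifying assumptions: two timers with constant but slightly different periods, the hv timer arbitrarily chosen as the common time source, and hypothetical function names. It is not the KVM fix itself.

```c
#include <assert.h>
#include <stdint.h>

/* Deadline error (sw minus hv) after n expiries when each timer keeps
 * advancing its own deadline independently. */
static inline int64_t drift_unsynced(int64_t period_hv, int64_t period_sw, int n)
{
    assert(n >= 0);
    int64_t hv = 0, sw = 0;

    for (int i = 0; i < n; i++) {
        hv += period_hv;
        sw += period_sw;
    }
    return sw - hv;
}

/* Same, but on every tick both deadlines are recomputed from one
 * common base, so the error cannot accumulate. */
static inline int64_t drift_synced(int64_t period_hv, int64_t period_sw, int n)
{
    assert(n >= 0);
    int64_t hv = 0, sw = 0;

    for (int i = 0; i < n; i++) {
        int64_t base = hv;   /* assumption: hv is the common time source */
        hv = base + period_hv;
        sw = base + period_sw;
    }
    return sw - hv;
}
```

With periods of 1000000 ns and 1000010 ns, the unsynchronized error after 1000 expiries is 10000 ns, while the synchronized error stays at the per-tick difference of 10 ns.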
  11. 23 May, 2018 3 commits
  12. 20 May, 2018 1 commit
    • T
      x86/Hyper-V/hv_apic: Build the Hyper-V APIC conditionally · 2d2ccf24
      Thomas Gleixner committed
      The Hyper-V APIC code is built when CONFIG_HYPERV is enabled but the actual
      code in that file is guarded with CONFIG_X86_64. There is no point in doing
      this. Neither is there a point in having the CONFIG_HYPERV guard in there
      because the containing directory is not built when CONFIG_HYPERV=n.
      
      Furthermore, for the hv_init_apic() function a stub is provided only for
      CONFIG_HYPERV=n, which is pointless as the callsite is not compiled at
      all. But for X86_32 the stub is missing and the build fails.

      Clean that up:

        - Compile hv_apic.c only when CONFIG_X86_64=y
        - Make the stub for hv_init_apic() available when CONFIG_X86_64=n
      
      Fixes: 6b48cb5f ("X86/Hyper-V: Enlighten APIC access")
      Reported-by: kbuild test robot <lkp@intel.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: K. Y. Srinivasan <kys@microsoft.com>
      Cc: Michael Kelley <mikelley@microsoft.com>
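The stub arrangement described in the cleanup can be sketched in a single C file as follows. In the kernel the real function would live in a separately compiled file and only the declaration or the stub would sit in a header; here both branches are shown inline for illustration. The flag apic_hooks_installed and the caller platform_init() are hypothetical, and CONFIG_X86_64 is deliberately left undefined to simulate a 32-bit build.

```c
#include <assert.h>

/* Assumption: define CONFIG_X86_64 here to simulate a 64-bit build. */
/* #define CONFIG_X86_64 1 */

/* Hypothetical flag recording whether the APIC hooks were installed. */
static int apic_hooks_installed;

#ifdef CONFIG_X86_64
/* Stand-in for the real implementation, built only for 64-bit. */
static inline void hv_init_apic(void)
{
    assert(!apic_hooks_installed);   /* guard against double init */
    apic_hooks_installed = 1;
}
#else
/* Empty stub so that callers still compile and link on 32-bit. */
static inline void hv_init_apic(void) { }
#endif

/* Hypothetical call site: needs no #ifdef of its own. */
static inline void platform_init(void)
{
    hv_init_apic();
}
```

Keying the stub on the same symbol that controls whether the implementation is built keeps the call site free of preprocessor conditionals and avoids the missing-symbol build failure on 32-bit.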
  13. 19 May, 2018 9 commits