1. 02 Mar 2017, 2 commits
  2. 28 Feb 2017, 2 commits
  3. 25 Feb 2017, 2 commits
  4. 24 Feb 2017, 1 commit
    • objtool: Improve detection of BUG() and other dead ends · d1091c7f
      Committed by Josh Poimboeuf
      The BUG() macro's use of __builtin_unreachable() via the unreachable()
      macro tells gcc that the instruction is a dead end, and that it's safe
      to assume the current code path will not execute past the previous
      instruction.
      
      On x86, the BUG() macro is implemented with the 'ud2' instruction.  When
      objtool's branch analysis sees that instruction, it knows the current
      code path has come to a dead end.
      
      Peter Zijlstra has been working on a patch to change the WARN macros to
      use 'ud2'.  That patch will break objtool's assumption that 'ud2' is
      always a dead end.
      
      Generally it's best for objtool to avoid making those kinds of
      assumptions anyway.  The more ignorant it is of kernel code internals,
      the better.
      
      So create a more generic way for objtool to detect dead ends by adding
      an annotation to the unreachable() macro.  The annotation stores a
      pointer to the end of the unreachable code path in an '__unreachable'
      section.  Objtool can read that section to find the dead ends.
      Tested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/41a6d33971462ebd944a1c60ad4bf5be86c17b77.1487712920.git.jpoimboe@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  5. 21 Feb 2017, 3 commits
    • x86/kvm: Provide optimized version of vcpu_is_preempted() for x86-64 · dd0fd8bc
      Committed by Waiman Long
      When running a fio sequential write test against an XFS ramdisk in a
      KVM guest on a 2-socket x86-64 system, the %CPU times reported by
      perf were as follows:
      
       69.75%  0.59%  fio  [k] down_write
       69.15%  0.01%  fio  [k] call_rwsem_down_write_failed
       67.12%  1.12%  fio  [k] rwsem_down_write_failed
       63.48% 52.77%  fio  [k] osq_lock
        9.46%  7.88%  fio  [k] __raw_callee_save___kvm_vcpu_is_preempt
        3.93%  3.93%  fio  [k] __kvm_vcpu_is_preempted
      
      Making vcpu_is_preempted() a callee-save function has a relatively
      high cost on x86-64 primarily due to at least one more cacheline of
      data access from the saving and restoring of registers (8 of them)
      to and from stack as well as one more level of function call.
      
      To reduce this performance overhead, an optimized assembly version
      of the __raw_callee_save___kvm_vcpu_is_preempt() function is
      provided for x86-64.
      
      With this patch applied on a KVM guest on a 2-socket 16-core 32-thread
      system with 16 parallel jobs (8 on each socket), the aggregate
      bandwidth of the fio test on an XFS ramdisk was as follows:
      
         I/O Type      w/o patch    with patch
         --------      ---------    ----------
         random read   8141.2 MB/s  8497.1 MB/s
         seq read      8229.4 MB/s  8304.2 MB/s
         random write  1675.5 MB/s  1701.5 MB/s
         seq write     1681.3 MB/s  1699.9 MB/s
      
      The patch produces small increases in the aggregate bandwidth.
      
      The perf data now became:
      
       70.78%  0.58%  fio  [k] down_write
       70.20%  0.01%  fio  [k] call_rwsem_down_write_failed
       69.70%  1.17%  fio  [k] rwsem_down_write_failed
       59.91% 55.42%  fio  [k] osq_lock
       10.14% 10.14%  fio  [k] __kvm_vcpu_is_preempted
      
      The assembly code was verified by using a test kernel module to
      compare the output of C __kvm_vcpu_is_preempted() and that of assembly
      __raw_callee_save___kvm_vcpu_is_preempt() to verify that they matched.
      Suggested-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Waiman Long <longman@redhat.com>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • x86/paravirt: Change vcpu_is_preempted() arg type to long · 6c62985d
      Committed by Waiman Long
      The cpu argument in the function prototype of vcpu_is_preempted()
      is changed from int to long. That makes it easier to provide a better
      optimized assembly version of that function.
      
      For Xen, vcpu_is_preempted(long) calls xen_vcpu_stolen(int); the
      downcast from long to int is not a problem, as the vCPU number won't
      exceed 32 bits.
      Signed-off-by: Waiman Long <longman@redhat.com>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • x86/kvm/vmx: Defer TR reload after VM exit · b7ffc44d
      Committed by Andy Lutomirski
      Intel's VMX is daft and resets the hidden TSS limit register to 0x67
      on VMX reload, and the 0x67 is not configurable.  KVM currently
      reloads TR using the LTR instruction on every exit, but this is quite
      slow because LTR is serializing.
      
      The 0x67 limit is entirely harmless unless ioperm() is in use, so
      defer the reload until a task using ioperm() is actually running.
      
      Here's some poorly done benchmarking using kvm-unit-tests:
      
      Before:
      
      cpuid 1313
      vmcall 1195
      mov_from_cr8 11
      mov_to_cr8 17
      inl_from_pmtimer 6770
      inl_from_qemu 6856
      inl_from_kernel 2435
      outl_to_kernel 1402
      
      After:
      
      cpuid 1291
      vmcall 1181
      mov_from_cr8 11
      mov_to_cr8 16
      inl_from_pmtimer 6457
      inl_from_qemu 6209
      inl_from_kernel 2339
      outl_to_kernel 1391
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      [Force-reload TR in invalidate_tss_limit. - Paolo]
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  6. 14 Feb 2017, 1 commit
  7. 11 Feb 2017, 1 commit
  8. 10 Feb 2017, 2 commits
    • x86/tsc: Make the TSC ADJUST sanitizing work for tsc_reliable · 5f2e71e7
      Committed by Thomas Gleixner
      When the TSC is marked reliable, the synchronization check is skipped,
      but that also skips the TSC ADJUST sanitizing code. So on a machine
      with a wrecked BIOS, TSC deviation between CPUs might go unnoticed.
      
      Let the TSC adjust sanitizing code run unconditionally and just skip the
      expensive synchronization checks when TSC is marked reliable.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Olof Johansson <olof@lixom.net>
      Link: http://lkml.kernel.org/r/20170209151231.491189912@linutronix.de
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • x86/tsc: Avoid the large time jump when sanitizing TSC ADJUST · f2e04214
      Committed by Thomas Gleixner
      Olof reported that on a machine whose TSC was wrecked by the BIOS,
      the timestamps in dmesg make a large jump because the TSC value jumps
      forward after the TSC ADJUST register is reset to a sane value.
      
      This can be avoided by calling the TSC ADJUST sanitizing function
      before initializing the per-CPU sched clock machinery. That takes the
      offset into account and avoids the time jump.
      
      What cannot be avoided is that the 'Firmware Bug' warnings on the
      secondary CPUs are printed with the large time offsets, because it
      would be too much effort and ugly hackery to print those warnings
      into a buffer and emit them after the adjustment on the starting
      CPUs. It's a firmware bug and should be fixed in firmware. The weird
      timestamps are collateral damage and just illustrate the silliness
      of the BIOS folks:
      
      [    0.397445] smp: Bringing up secondary CPUs ...
      [    0.402100] x86: Booting SMP configuration:
      [    0.406343] .... node  #0, CPUs:      #1
      [1265776479.930667] [Firmware Bug]: TSC ADJUST differs: Reference CPU0: -2978888639075328 CPU1: -2978888639183101
      [1265776479.944664] TSC ADJUST synchronize: Reference CPU0: 0 CPU1: -2978888639183101
      [    0.508119]  #2
      [1265776480.032346] [Firmware Bug]: TSC ADJUST differs: Reference CPU0: -2978888639075328 CPU2: -2978888639183677
      [1265776480.044192] TSC ADJUST synchronize: Reference CPU0: 0 CPU2: -2978888639183677
      [    0.607643]  #3
      [1265776480.131874] [Firmware Bug]: TSC ADJUST differs: Reference CPU0: -2978888639075328 CPU3: -2978888639184530
      [1265776480.143720] TSC ADJUST synchronize: Reference CPU0: 0 CPU3: -2978888639184530
      [    0.707108] smp: Brought up 1 node, 4 CPUs
      [    0.711271] smpboot: Total of 4 processors activated (21698.88 BogoMIPS)
      Reported-by: Olof Johansson <olof@lixom.net>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20170209151231.411460506@linutronix.de
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
  9. 09 Feb 2017, 2 commits
  10. 07 Feb 2017, 5 commits
  11. 06 Feb 2017, 1 commit
  12. 05 Feb 2017, 3 commits
  13. 04 Feb 2017, 4 commits
  14. 01 Feb 2017, 9 commits
  15. 31 Jan 2017, 1 commit
    • x86/mm: Remove CONFIG_DEBUG_NX_TEST · 3ad38ceb
      Committed by Kees Cook
      CONFIG_DEBUG_NX_TEST has been broken since CONFIG_DEBUG_SET_MODULE_RONX=y
      was added in v2.6.37 via:
      
        84e1c6bb ("x86: Add RO/NX protection for loadable kernel modules")
      
      because the exception table was then made read-only.
      
      Additionally, the manually constructed extables were never fixed when
      relative extables were introduced in v3.5 via:
      
        70627654 ("x86, extable: Switch to relative exception table entries")
      
      However, relative extables won't work for test_nx.c, since test instruction
      memory areas may be more than INT_MAX away from an executable fixup
      (e.g. stack and heap too far away from executable memory with the fixup).
      
      Since clearly no one has been using this code for a while now, and similar
      tests exist in LKDTM, this should just be removed entirely.
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jinbum Park <jinb.park7@gmail.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20170131003711.GA74048@beast
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  16. 30 Jan 2017, 1 commit
    • x86/microcode: Do not access the initrd after it has been freed · 24c25032
      Committed by Borislav Petkov
      When we look for microcode blobs, we first try the builtin ones and,
      if that doesn't succeed, we fall back to the initrd supplied to the
      kernel.
      
      However, at some point during boot, that initrd gets jettisoned and
      we shouldn't access it anymore. But we do, as the KASAN report below
      shows. That's because find_microcode_in_initrd() doesn't check
      whether the initrd is still valid.
      
      So do that.
      
        ==================================================================
        BUG: KASAN: use-after-free in find_cpio_data
        Read of size 1 by task swapper/1/0
        page:ffffea0000db9d40 count:0 mapcount:0 mapping:          (null) index:0x1
        flags: 0x100000000000000()
        raw: 0100000000000000 0000000000000000 0000000000000001 00000000ffffffff
        raw: dead000000000100 dead000000000200 0000000000000000 0000000000000000
        page dumped because: kasan: bad access detected
        CPU: 1 PID: 0 Comm: swapper/1 Tainted: G        W       4.10.0-rc5-debug-00075-g2dbde22 #3
        Hardware name: Dell Inc. XPS 13 9360/0839Y6, BIOS 1.2.3 12/01/2016
        Call Trace:
         dump_stack
         ? _atomic_dec_and_lock
         ? __dump_page
         kasan_report_error
         ? pointer
         ? find_cpio_data
         __asan_report_load1_noabort
         ? find_cpio_data
         find_cpio_data
         ? vsprintf
         ? dump_stack
         ? get_ucode_user
         ? print_usage_bug
         find_microcode_in_initrd
         __load_ucode_intel
         ? collect_cpu_info_early
         ? debug_check_no_locks_freed
         load_ucode_intel_ap
         ? collect_cpu_info
         ? trace_hardirqs_on
         ? flat_send_IPI_mask_allbutself
         load_ucode_ap
         ? get_builtin_firmware
         ? flush_tlb_func
         ? do_raw_spin_trylock
         ? cpumask_weight
         cpu_init
         ? trace_hardirqs_off
         ? play_dead_common
         ? native_play_dead
         ? hlt_play_dead
         ? syscall_init
         ? arch_cpu_idle_dead
         ? do_idle
         start_secondary
         start_cpu
        Memory state around the buggy address:
         ffff880036e74f00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
         ffff880036e74f80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
        >ffff880036e75000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
                           ^
         ffff880036e75080: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
         ffff880036e75100: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
        ==================================================================
      Reported-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Tested-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20170126165833.evjemhbqzaepirxo@pd.tnic
      Signed-off-by: Ingo Molnar <mingo@kernel.org>