1. 23 Mar, 2015 7 commits
  2. 20 Mar, 2015 1 commit
    • R
      Revert "x86/PCI: Refine the way to release PCI IRQ resources" · 9e8ce4b9
      Rafael J. Wysocki committed
      Commit b4b55cda (Refine the way to release PCI IRQ resources)
      introduced a regression in the PCI IRQ resource management by causing
      the IRQ resource of a device, established when pci_enable_device()
      is called on a fully disabled device, to be released when the driver
      is unbound from the device, regardless of the enable_cnt.
      
      This leads to the situation that an ill-behaved driver can now make a
      device unusable to subsequent drivers by an imbalance in their use of
      pci_enable/disable_device().  That is a serious problem for secondary
      drivers like vfio-pci, which are innocent of the transgressions of
      the previous driver.
      
      Since the solution of this problem is not immediate and requires
      further discussion, revert commit b4b55cda. The issue it was
      supposed to address (a bug related to xen-pciback) will be taken
      care of in a different way going forward.
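
      The regression described above can be illustrated with a toy model. This is only a sketch: `struct toy_dev` and every `toy_*` function are hypothetical stand-ins, not the kernel's `struct pci_dev` or its PCI API. It assumes the IRQ resource is set up on the first enable and torn down on the last disable, and compares an unbind hook that releases the resource unconditionally against one that leaves it alone.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a PCI device; not the kernel's struct pci_dev. */
struct toy_dev {
    int enable_cnt;   /* number of outstanding enable calls */
    bool irq_mapped;  /* whether the IRQ resource is currently set up */
};

/* The first enable on a fully disabled device establishes the IRQ resource. */
static void toy_enable(struct toy_dev *d)
{
    if (d->enable_cnt++ == 0)
        d->irq_mapped = true;
}

/* The last disable releases the IRQ resource. */
static void toy_disable(struct toy_dev *d)
{
    assert(d->enable_cnt > 0);
    if (--d->enable_cnt == 0)
        d->irq_mapped = false;
}

/* Models the reverted behaviour: release on unbind, ignoring enable_cnt. */
static void toy_unbind_release_always(struct toy_dev *d)
{
    d->irq_mapped = false;
}

/*
 * Driver A enables twice but disables once (an imbalance), then is unbound.
 * Driver B then enables the device. Returns whether B sees an IRQ resource.
 */
static bool toy_second_driver_has_irq(bool release_on_unbind)
{
    struct toy_dev d = { 0, false };

    toy_enable(&d);
    toy_enable(&d);
    toy_disable(&d);                 /* enable_cnt is still 1 */
    if (release_on_unbind)
        toy_unbind_release_always(&d);
    toy_enable(&d);                  /* not a first enable, so no setup */
    return d.irq_mapped;
}
```

      In this model, `toy_second_driver_has_irq(true)` returns false while `toy_second_driver_has_irq(false)` returns true, which matches the complaint that an innocent second driver inherits the first driver's imbalance.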
      Reported-by: Alex Williamson <alex.williamson@redhat.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
  3. 17 Mar, 2015 13 commits
  4. 16 Mar, 2015 1 commit
    • B
      Revert "x86/mm/ASLR: Propagate base load address calculation" · 69797daf
      Borislav Petkov committed
      This reverts commit:
      
        f47233c2 ("x86/mm/ASLR: Propagate base load address calculation")
      
      The main reason for the revert is that the new boot flag does not work
      at all currently, and in order to make this work, we need non-trivial
      changes to the x86 boot code which we didn't manage to get done in
      time for merging.
      
      And even if we did, they would've been too risky, so instead of
      rushing things and breaking booting 4.1 on boxes left and right, we
      will be very strict and conservative and will take our time with
      this to fix and test it properly.
      Reported-by: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: H. Peter Anvin <hpa@linux.intel.com>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Junjie Mao <eternal.n08@gmail.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Matt Fleming <matt.fleming@intel.com>
      Link: http://lkml.kernel.org/r/20150316100628.GD22995@pd.tnic
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  5. 13 Mar, 2015 5 commits
    • W
      KVM: VMX: Set msr bitmap correctly if vcpu is in guest mode · 670125bd
      Wincy Van committed
      In commit 3af18d9c ("KVM: nVMX: Prepare for using hardware MSR bitmap"),
      we set MSR_BITMAP in prepare_vmcs02 if the hardware MSR bitmap should be
      used. This is not enough, since the field is modified afterwards by the
      following vmx_set_efer call.
      
      Fix this by setting vmx_msr_bitmap_nested in vmx_set_msr_bitmap if vcpu is
      in guest mode.
      Signed-off-by: Wincy Van <fanwenyi0529@gmail.com>
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
    • O
      x86/fpu: drop_fpu() should not assume that tsk equals current · f4c36863
      Oleg Nesterov committed
      drop_fpu() does clear_used_math() and usually this is correct
      because tsk == current.
      
      However switch_fpu_finish()->restore_fpu_checking() is called before
      __switch_to() updates the "current_task" variable. If it fails,
      we will wrongly clear the PF_USED_MATH flag of the previous task.
      
      So use clear_stopped_child_used_math() instead.
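
      A toy model can make the failure mode concrete. In the sketch below, `struct toy_task`, `TOY_USED_MATH` and the `toy_*` helpers are hypothetical stand-ins for the kernel's task structure, `PF_USED_MATH`, `clear_used_math()` and `clear_stopped_child_used_math()`. It assumes only what the message states: the failing restore runs while the "current" pointer still refers to the previous task.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_USED_MATH 0x1u

/* Toy stand-in for a task; not the kernel's struct task_struct. */
struct toy_task {
    unsigned int flags;
};

/* Toy stand-in for the per-CPU "current" pointer. */
static struct toy_task *toy_current;

/* Models clear_used_math(): always acts on whatever "current" points at. */
static void toy_clear_used_math(void)
{
    toy_current->flags &= ~TOY_USED_MATH;
}

/* Models clear_stopped_child_used_math(tsk): acts on the task passed in. */
static void toy_clear_task_used_math(struct toy_task *tsk)
{
    tsk->flags &= ~TOY_USED_MATH;
}

/*
 * The FPU restore for 'next' fails while "current" still refers to 'prev'.
 * Returns true if 'prev' keeps its used-math flag afterwards.
 */
static bool toy_prev_keeps_flag(bool clear_via_tsk)
{
    struct toy_task prev = { TOY_USED_MATH };
    struct toy_task next = { TOY_USED_MATH };
    bool kept;

    toy_current = &prev;             /* the switch has not updated it yet */
    if (clear_via_tsk)
        toy_clear_task_used_math(&next);
    else
        toy_clear_used_math();       /* hits 'prev' by mistake */
    kept = (prev.flags & TOY_USED_MATH) != 0;
    toy_current = NULL;              /* do not leave a dangling pointer */
    return kept;
}
```

      Here `toy_prev_keeps_flag(false)` returns false, because the flag is cleared on the wrong task, and `toy_prev_keeps_flag(true)` returns true.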
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Reviewed-by: Rik van Riel <riel@redhat.com>
      Cc: <stable@vger.kernel.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Pekka Riikonen <priikone@iki.fi>
      Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
      Cc: Suresh Siddha <sbsiddha@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20150309171041.GB11388@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • O
      x86/fpu: Avoid math_state_restore() without used_math() in __restore_xstate_sig() · a7c80ebc
      Oleg Nesterov committed
      math_state_restore() assumes it is called with irqs disabled,
      but this is not true if the caller is __restore_xstate_sig().
      
      This means that if ia32_fxstate == T and __copy_from_user()
      fails, __restore_xstate_sig() returns with irqs disabled too.
      
      This triggers:
      
        BUG: sleeping function called from invalid context at kernel/locking/rwsem.c:41
         dump_stack
         ___might_sleep
         ? _raw_spin_unlock_irqrestore
         __might_sleep
         down_read
         ? _raw_spin_unlock_irqrestore
         print_vma_addr
         signal_fault
         sys32_rt_sigreturn
      
      Change __restore_xstate_sig() to call set_used_math()
      unconditionally. This avoids enabling and disabling interrupts
      in math_state_restore(). If copy_from_user() fails, we can
      simply do fpu_finit() by hand.
      
      [ Note: this is only the first step. math_state_restore() should
              not check used_math(), it should set this flag. While
              init_fpu() should simply die. ]
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: <stable@vger.kernel.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Pekka Riikonen <priikone@iki.fi>
      Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Suresh Siddha <sbsiddha@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20150307153844.GB25954@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • S
      crypto: aesni - fix memory usage in GCM decryption · ccfe8c3f
      Stephan Mueller committed
      The kernel crypto API logic requires the caller to provide the
      length of (ciphertext || authentication tag) as cryptlen for the
      AEAD decryption operation. Thus, the cipher implementation must
      calculate the size of the plaintext output itself and cannot simply use
      cryptlen.
      
      The RFC4106 GCM decryption operation tries to overwrite cryptlen memory
      in req->dst. As the destination buffer for decryption only needs to hold
      the plaintext memory but cryptlen references the input buffer holding
      (ciphertext || authentication tag), the assumption of the destination
      buffer length in RFC4106 GCM operation leads to a too large size. This
      patch simply uses the already calculated plaintext size.
      
      In addition, this patch fixes the offset calculation of the AAD buffer
      pointer: as mentioned before, cryptlen already includes the size of the
      tag. Thus, the tag does not need to be added. With the addition, the AAD
      will be written beyond the already allocated buffer.
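
      The two length calculations discussed above can be sketched in isolation. The helpers below are hypothetical and are not the aesni driver's code. They assume a scratch buffer of `cryptlen + assoclen` bytes that holds (ciphertext || authentication tag) followed by the AAD, which is one layout consistent with the description.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Plaintext size for AEAD decryption: cryptlen covers ciphertext plus tag,
 * so the tag size has to be subtracted to get the output size.
 */
static size_t toy_plaintext_len(size_t cryptlen, size_t authsize)
{
    assert(cryptlen >= authsize);
    return cryptlen - authsize;
}

/*
 * Number of bytes written past the end of a buffer of size
 * cryptlen + assoclen when assoclen bytes of AAD are placed at aad_offset.
 */
static size_t toy_aad_overrun(size_t cryptlen, size_t assoclen,
                              size_t aad_offset)
{
    size_t buf_size = cryptlen + assoclen;
    size_t end = aad_offset + assoclen;

    return end > buf_size ? end - buf_size : 0;
}
```

      With a 16-byte tag, `toy_plaintext_len(32, 16)` is 16. Placing 20 bytes of AAD at offset `cryptlen` gives no overrun, whereas placing them at `cryptlen + 16` overruns the buffer by exactly the tag size.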
      
      Note, this fixes a kernel crash that can be triggered from user space
      via AF_ALG(aead) -- simply use the libkcapi test application
      from [1] and update it to use rfc4106-gcm-aes.
      
      Using [1], the changes were tested using CAVS vectors to demonstrate
      that the crypto operation still delivers the right results.
      
      [1] http://www.chronox.de/libkcapi.html
      
      Cc: Tadeusz Struk <tadeusz.struk@intel.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Stephan Mueller <smueller@chronox.de>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    • P
      kvm: x86: i8259: return initialized data on invalid-size read · c1a6bff2
      Petr Matousek committed
      If data is read from PIC with invalid access size, the return data stays
      uninitialized even though success is returned.
      
      Fix this by always initializing the data.
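
      The pattern can be sketched as follows. `toy_pic_read()` is a hypothetical byte-wide device read handler, not the KVM i8259 code. It assumes, as the message says, that success is reported even for an unsupported access size, so the output buffer has to be initialized on that path as well.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical read handler for a device that only supports 1-byte reads.
 * Returns 0 (success) in every case, so 'val' must never be left untouched.
 */
static int toy_pic_read(uint8_t reg, size_t len, void *val)
{
    assert(val != NULL);

    if (len != 1) {
        memset(val, 0, len);   /* never hand back stale caller memory */
        return 0;
    }

    *(uint8_t *)val = reg;
    return 0;
}
```

      A caller that passes a 4-byte buffer pre-filled with 0xAA gets four zero bytes back instead of whatever happened to be in the buffer.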
      Signed-off-by: Petr Matousek <pmatouse@redhat.com>
      Reported-by: Nadav Amit <nadav.amit@gmail.com>
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
  6. 12 Mar, 2015 2 commits
  7. 11 Mar, 2015 1 commit
  8. 10 Mar, 2015 4 commits
  9. 07 Mar, 2015 3 commits
    • D
      x86/asm: Optimize unnecessarily wide TEST instructions · 3e1aa7cb
      Denys Vlasenko committed
      By the nature of the TEST operation, it is often possible to test
      a narrower part of the operand:
      
          "testl $3,  mem"  ->  "testb $3, mem",
          "testq $3, %rcx"  ->  "testb $3, %cl"
      
      This results in shorter instructions, because the TEST instruction
      has no sign-extending byte-immediate forms, unlike other ALU ops.
      
      Note that this change does not create any LCP (Length-Changing Prefix)
      stalls. Those happen when a 0x66 prefix is added, as it is when
      16-bit immediates are used, which changes such TEST instructions from:
      
        [test_opcode] [modrm] [imm32]
      
      to:
      
        [0x66] [test_opcode] [modrm] [imm16]
      
      where [imm16] has a *different length* now: 2 bytes instead of 4.
      This confuses the decoder and slows down execution.
      
      REX prefixes were carefully designed to almost never hit this case:
      adding a REX prefix does not change instruction length, except for
      the MOVABS and MOV [addr],RAX instructions.
      
      This patch does not add instructions which would use a 0x66 prefix;
      the code changes in assembly are:
      
          -48 f7 07 01 00 00 00 	testq  $0x1,(%rdi)
          +f6 07 01             	testb  $0x1,(%rdi)
          -48 f7 c1 01 00 00 00 	test   $0x1,%rcx
          +f6 c1 01             	test   $0x1,%cl
          -48 f7 c1 02 00 00 00 	test   $0x2,%rcx
          +f6 c1 02             	test   $0x2,%cl
          -41 f7 c2 01 00 00 00 	test   $0x1,%r10d
          +41 f6 c2 01          	test   $0x1,%r10b
          -48 f7 c1 04 00 00 00 	test   $0x4,%rcx
          +f6 c1 04             	test   $0x4,%cl
          -48 f7 c1 08 00 00 00 	test   $0x8,%rcx
          +f6 c1 08             	test   $0x8,%cl
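
      The reason the narrowing is safe can be checked in C. This is a sketch with hypothetical helper names, and it models only the zero/non-zero outcome of TEST (the ZF result): when every set bit of the mask lies in the low 8 bits, ANDing the full 64-bit value and ANDing its low byte agree on whether the result is zero. Other flags are not modelled. For example, a mask with bit 7 set yields a different sign flag in the byte form.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Models "testq $mask, v": true when (v & mask) is non-zero. */
static bool toy_test_wide(uint64_t v, uint64_t mask)
{
    return (v & mask) != 0;
}

/* Models "testb $mask, low byte of v". */
static bool toy_test_low_byte(uint64_t v, uint8_t mask)
{
    return ((uint8_t)v & mask) != 0;
}

/* The byte form is equivalent only when the mask fits in the low 8 bits. */
static bool toy_can_narrow(uint64_t mask)
{
    return mask <= 0xff;
}
```

      In the listing above, each narrowed instruction that previously needed a REX.W prefix and a 4-byte immediate shrinks from 7 bytes to 3.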
      
      Linus further notes:
      
         "There are no stalls from using 8-bit instruction forms.
      
          Now, changing from 64-bit or 32-bit 'test' instructions to 8-bit ones
          *could* cause problems if it ends up having forwarding issues, so that
          instead of just forwarding the result, you end up having to wait for
          it to be stable in the L1 cache (or possibly the register file). The
          forwarding from the store buffer is simplest and most reliable if the
          read is done at the exact same address and the exact same size as the
          write that gets forwarded.
      
          But that's true only if:
      
           (a) the write was very recent and is still in the write queue. I'm
               not sure that's the case here anyway.
      
           (b) on at least most Intel microarchitectures, you have to test a
               different byte than the lowest one (so forwarding a 64-bit write
               to a 8-bit read ends up working fine, as long as the 8-bit read
               is of the low 8 bits of the written data).
      
          A very similar issue *might* show up for registers too, not just
          memory writes, if you use 'testb' with a high-byte register (where
          instead of forwarding the value from the original producer it needs to
          go through the register file and then shifted). But it's mainly a
          problem for store buffers.
      
          But afaik, the way Denys changed the test instructions, neither of the
          above issues should be true.
      
          The real problem for store buffer forwarding tends to be "write 8
          bits, read 32 bits". That can be really surprisingly expensive,
          because the read ends up having to wait until the write has hit the
          cacheline, and we might talk tens of cycles of latency here. But
          "write 32 bits, read the low 8 bits" *should* be fast on pretty much
          all x86 chips, afaik."
      Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
      Acked-by: Andy Lutomirski <luto@amacapital.net>
      Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: H. Peter Anvin <hpa@linux.intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Will Drewry <wad@chromium.org>
      Link: http://lkml.kernel.org/r/1425675332-31576-1-git-send-email-dvlasenk@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • A
      x86/asm/entry: Replace this_cpu_sp0() with current_top_of_stack() and fix it on x86_32 · a7fcf28d
      Andy Lutomirski committed
      I broke 32-bit kernels.  The implementation of sp0 was correct
      as far as I can tell, but sp0 was much weirder on x86_32 than I
      realized.  It has the following issues:
      
       - Init's sp0 is inconsistent with everything else's: non-init tasks
         are offset by 8 bytes.  (I have no idea why, and the comment is unhelpful.)
      
       - vm86 does crazy things to sp0.
      
      Fix it up by replacing this_cpu_sp0() with
      current_top_of_stack() and using a new percpu variable to track
      the top of the stack on x86_32.
      Signed-off-by: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Fixes: 75182b16 ("x86/asm/entry: Switch all C consumers of kernel_stack to this_cpu_sp0()")
      Link: http://lkml.kernel.org/r/d09dbe270883433776e0cbee3c7079433349e96d.1425692936.git.luto@amacapital.net
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • A
      x86/asm/entry: Delay loading sp0 slightly on task switch · b27559a4
      Andy Lutomirski committed
      The change:
      
        75182b16 ("x86/asm/entry: Switch all C consumers of kernel_stack to this_cpu_sp0()")
      
      had the unintended side effect of changing the return value of
      current_thread_info() during part of the context switch process.
      Change it back.
      
      This has no effect as far as I can tell -- it's just for
      consistency.
      Signed-off-by: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/9fcaa47dd8487db59eed7a3911b6ae409476763e.1425692936.git.luto@amacapital.net
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  10. 06 Mar, 2015 3 commits