1. 28 1月, 2013 6 次提交
  2. 25 1月, 2013 1 次提交
    • A
      x86/MSI: Support multiple MSIs in presense of IRQ remapping · 51906e77
      Alexander Gordeev 提交于
      The MSI specification has several constraints in comparison with
      MSI-X, most notable of them is the inability to configure MSIs
      independently. As a result, it is impossible to dispatch
      interrupts from different queues to different CPUs. This is
      largely devalues the support of multiple MSIs in SMP systems.
      
      Also, a necessity to allocate a contiguous block of vector
      numbers for devices capable of multiple MSIs might cause a
      considerable pressure on x86 interrupt vector allocator and
      could lead to fragmentation of the interrupt vectors space.
      
      This patch overcomes both drawbacks in presense of IRQ remapping
      and lets devices take advantage of multiple queues and per-IRQ
      affinity assignments.
      Signed-off-by: NAlexander Gordeev <agordeev@redhat.com>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Suresh Siddha <suresh.b.siddha@intel.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Matthew Wilcox <willy@linux.intel.com>
      Cc: Jeff Garzik <jgarzik@pobox.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/c8bd86ff56b5fc118257436768aaa04489ac0a4c.1353324359.git.agordeev@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      51906e77
  3. 24 1月, 2013 2 次提交
  4. 23 1月, 2013 1 次提交
    • O
      ptrace: ensure arch_ptrace/ptrace_request can never race with SIGKILL · 9899d11f
      Oleg Nesterov 提交于
      putreg() assumes that the tracee is not running and pt_regs_access() can
      safely play with its stack.  However a killed tracee can return from
      ptrace_stop() to the low-level asm code and do RESTORE_REST, this means
      that debugger can actually read/modify the kernel stack until the tracee
      does SAVE_REST again.
      
      set_task_blockstep() can race with SIGKILL too and in some sense this
      race is even worse, the very fact the tracee can be woken up breaks the
      logic.
      
      As Linus suggested we can clear TASK_WAKEKILL around the arch_ptrace()
      call, this ensures that nobody can ever wakeup the tracee while the
      debugger looks at it.  Not only this fixes the mentioned problems, we
      can do some cleanups/simplifications in arch_ptrace() paths.
      
      Probably ptrace_unfreeze_traced() needs more callers, for example it
      makes sense to make the tracee killable for oom-killer before
      access_process_vm().
      
      While at it, add the comment into may_ptrace_stop() to explain why
      ptrace_stop() still can't rely on SIGKILL and signal_pending_state().
      Reported-by: NSalman Qazi <sqazi@google.com>
      Reported-by: NSuleiman Souhlal <suleiman@google.com>
      Suggested-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NOleg Nesterov <oleg@redhat.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      9899d11f
  5. 17 1月, 2013 1 次提交
    • A
      xen: Fix stack corruption in xen_failsafe_callback for 32bit PVOPS guests. · 9174adbe
      Andrew Cooper 提交于
      This fixes CVE-2013-0190 / XSA-40
      
      There has been an error on the xen_failsafe_callback path for failed
      iret, which causes the stack pointer to be wrong when entering the
      iret_exc error path.  This can result in the kernel crashing.
      
      In the classic kernel case, the relevant code looked a little like:
      
              popl %eax      # Error code from hypervisor
              jz 5f
              addl $16,%esp
              jmp iret_exc   # Hypervisor said iret fault
      5:      addl $16,%esp
                             # Hypervisor said segment selector fault
      
      Here, there are two identical addls on either option of a branch which
      appears to have been optimised by hoisting it above the jz, and
      converting it to an lea, which leaves the flags register unaffected.
      
      In the PVOPS case, the code looks like:
      
              popl_cfi %eax         # Error from the hypervisor
              lea 16(%esp),%esp     # Add $16 before choosing fault path
              CFI_ADJUST_CFA_OFFSET -16
              jz 5f
              addl $16,%esp         # Incorrectly adjust %esp again
              jmp iret_exc
      
      It is possible unprivileged userspace applications to cause this
      behaviour, for example by loading an LDT code selector, then changing
      the code selector to be not-present.  At this point, there is a race
      condition where it is possible for the hypervisor to return back to
      userspace from an interrupt, fault on its own iret, and inject a
      failsafe_callback into the kernel.
      
      This bug has been present since the introduction of Xen PVOPS support
      in commit 5ead97c8 (xen: Core Xen implementation), in 2.6.23.
      Signed-off-by: NFrediano Ziglio <frediano.ziglio@citrix.com>
      Signed-off-by: NAndrew Cooper <andrew.cooper3@citrix.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      9174adbe
  6. 16 1月, 2013 1 次提交
  7. 14 1月, 2013 2 次提交
  8. 12 1月, 2013 1 次提交
    • J
      x86/Sandy Bridge: reserve pages when integrated graphics is present · a9acc536
      Jesse Barnes 提交于
      SNB graphics devices have a bug that prevent them from accessing certain
      memory ranges, namely anything below 1M and in the pages listed in the
      table.  So reserve those at boot if set detect a SNB gfx device on the
      CPU to avoid GPU hangs.
      
      Stephane Marchesin had a similar patch to the page allocator awhile
      back, but rather than reserving pages up front, it leaked them at
      allocation time.
      
      [ hpa: made a number of stylistic changes, marked arrays as static
        const, and made less verbose; use "memblock=debug" for full
        verbosity. ]
      Signed-off-by: NJesse Barnes <jbarnes@virtuousgeek.org>
      Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      a9acc536
  9. 10 1月, 2013 1 次提交
    • D
      perf x86: revert 20b279 - require exclude_guest to use PEBS - kernel side · a706d965
      David Ahern 提交于
      This patch is brought to you by the letter 'H'.
      
      Commit 20b279 breaks compatiblity with older perf binaries when run with
      precise modifier (:p or :pp) by requiring the exclude_guest attribute to be
      set. Older binaries default exclude_guest to 0 (ie., wanting guest-based
      samples) unless host only profiling is requested (:H modifier). The workaround
      for older binaries is to add H to the modifier list (e.g., -e cycles:ppH -
      toggles exclude_guest to 1). This was deemed unacceptable by Linus:
      
      https://lkml.org/lkml/2012/12/12/570
      
      Between family in town and the fresh snow in Breckenridge there is no time left
      to be working on the proper fix for this over the holidays. In the New Year I
      have more pressing problems to resolve -- like some memory leaks in perf which
      are proving to be elusive -- although the aforementioned snow is probably why
      they are proving to be elusive. Either way I do not have any spare time to work
      on this and from the time I have managed to spend on it the solution is more
      difficult than just moving to a new exclude_guest flag (does not work) or
      flipping the logic to include_guest (which is not as trivial as one would
      think).
      
      So, two options: silently force exclude_guest on as suggested by Gleb which
      means no impact to older perf binaries or revert the original patch which
      caused the breakage.
      
      This patch does the latter -- reverts the original patch that introduced the
      regression. The problem can be revisited in the future as time allows.
      Signed-off-by: NDavid Ahern <dsahern@gmail.com>
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Gleb Natapov <gleb@redhat.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Robert Richter <robert.richter@amd.com>
      Link: http://lkml.kernel.org/r/1356749767-17322-1-git-send-email-dsahern@gmail.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      a706d965
  10. 08 1月, 2013 1 次提交
  11. 04 1月, 2013 1 次提交
    • G
      X86: drivers: remove __dev* attributes. · a18e3690
      Greg Kroah-Hartman 提交于
      CONFIG_HOTPLUG is going away as an option.  As a result, the __dev*
      markings need to be removed.
      
      This change removes the use of __devinit, __devexit_p, __devinitconst,
      and __devexit from these drivers.
      
      Based on patches originally written by Bill Pemberton, but redone by me
      in order to handle some of the coding style issues better, by hand.
      
      Cc: Bill Pemberton <wfp5p@virginia.edu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Daniel Drake <dsd@laptop.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a18e3690
  12. 27 12月, 2012 1 次提交
  13. 21 12月, 2012 1 次提交
  14. 20 12月, 2012 7 次提交
  15. 19 12月, 2012 2 次提交
  16. 18 12月, 2012 5 次提交
  17. 16 12月, 2012 2 次提交
  18. 15 12月, 2012 1 次提交
  19. 14 12月, 2012 1 次提交
    • K
      module: add syscall to load module from fd · 34e1169d
      Kees Cook 提交于
      As part of the effort to create a stronger boundary between root and
      kernel, Chrome OS wants to be able to enforce that kernel modules are
      being loaded only from our read-only crypto-hash verified (dm_verity)
      root filesystem. Since the init_module syscall hands the kernel a module
      as a memory blob, no reasoning about the origin of the blob can be made.
      
      Earlier proposals for appending signatures to kernel modules would not be
      useful in Chrome OS, since it would involve adding an additional set of
      keys to our kernel and builds for no good reason: we already trust the
      contents of our root filesystem. We don't need to verify those kernel
      modules a second time. Having to do signature checking on module loading
      would slow us down and be redundant. All we need to know is where a
      module is coming from so we can say yes/no to loading it.
      
      If a file descriptor is used as the source of a kernel module, many more
      things can be reasoned about. In Chrome OS's case, we could enforce that
      the module lives on the filesystem we expect it to live on.  In the case
      of IMA (or other LSMs), it would be possible, for example, to examine
      extended attributes that may contain signatures over the contents of
      the module.
      
      This introduces a new syscall (on x86), similar to init_module, that has
      only two arguments. The first argument is used as a file descriptor to
      the module and the second argument is a pointer to the NULL terminated
      string of module arguments.
      Signed-off-by: NKees Cook <keescook@chromium.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (merge fixes)
      34e1169d
  20. 13 12月, 2012 2 次提交