1. 09 Sep 2014 (3 commits)
  2. 04 Sep 2014 (1 commit)
    • seccomp,x86,arm,mips,s390: Remove nr parameter from secure_computing · a4412fc9
      Committed by Andy Lutomirski
      The secure_computing function took a syscall number parameter, but
      it only paid any attention to that parameter if seccomp mode 1 was
      enabled.  Rather than coming up with a kludge to get the parameter
      to work in mode 2, just remove the parameter.
      
      To avoid churn in arches that don't have seccomp filters (and may
      not even support syscall_get_nr right now), this leaves the
      parameter in secure_computing_strict, which is now a real function.
      
      For ARM, this is a bit ugly due to the fact that ARM conditionally
      supports seccomp filters.  Fixing that would probably only be a
      couple of lines of code, but it should be coordinated with the audit
      maintainers.
      
      This will be a slight slowdown on some arches.  The right fix is to
      pass in all of seccomp_data instead of trying to make just the
      syscall nr part be fast.
      
      This is a prerequisite for making two-phase seccomp work cleanly.
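
      A minimal sketch of the resulting interfaces, assuming declarations close to what this change produces (details hedged):

          /* include/linux/seccomp.h -- sketch of the split described above */

          /* Filter-mode entry point: no nr argument; the syscall number is
           * fetched internally (eventually via a full seccomp_data). */
          extern int secure_computing(void);

          /* Strict-mode entry point, now a real function, which keeps the
           * syscall number parameter for arches without filter support. */
          extern void secure_computing_strict(int this_syscall);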
      
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: linux-s390@vger.kernel.org
      Cc: x86@kernel.org
      Cc: Kees Cook <keescook@chromium.org>
      Signed-off-by: Andy Lutomirski <luto@amacapital.net>
      Signed-off-by: Kees Cook <keescook@chromium.org>
      a4412fc9
  3. 07 3月, 2014 2 次提交
    • x86: Keep thread_info on thread stack in x86_32 · 198d208d
      Committed by Steven Rostedt
      x86_64 uses a per_cpu variable kernel_stack that always points to
      the thread stack of current. This is where the thread_info is stored,
      and it is accessed from this location even when the irq or exception
      stack is in use. This removes the complexity of having to maintain the
      thread_info on the stack when interrupts are running and having to
      copy the preempt_count and other fields to the interrupt stack.
      
      x86_32 uses the old method of copying the thread_info from the thread
      stack to the exception stack just before executing the exception.
      
      Having two different methods requires #ifdefs, and the x86_32 way
      is a bit of a pain to maintain. By converting x86_32 to the same
      method as x86_64, we can remove the #ifdefs, clean up the x86_32
      code a little, and remove the overhead of the copy.
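
      A sketch of the x86_64-style lookup that x86_32 adopts here, assuming the era's per-cpu kernel_stack scheme (hedged, not the literal patch):

          /* thread_info stays on the thread stack and is located via the
           * per-cpu kernel_stack pointer, even from irq/exception stacks. */
          DECLARE_PER_CPU(unsigned long, kernel_stack);

          static inline struct thread_info *current_thread_info(void)
          {
                  return (struct thread_info *)
                          (this_cpu_read_stable(kernel_stack) +
                           KERNEL_STACK_OFFSET - THREAD_SIZE);
          }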
      
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Brian Gerst <brgerst@gmail.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/r/20110806012354.263834829@goodmis.org
      Link: http://lkml.kernel.org/r/20140206144321.852942014@goodmis.org
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
      198d208d
    • x86: Prepare removal of previous_esp from i386 thread_info structure · 0788aa6a
      Committed by Steven Rostedt
      The i386 thread_info contains a previous_esp field that is used
      to daisy chain the different stacks for dump_stack()
      (i.e. irq, softirq, thread stacks).
      
      The goal is to eventually make the i386 handling of thread_info the
      same as x86_64's, which means that the thread_info will not be on the
      stack but in a per_cpu variable. We will no longer depend on
      thread_info being able to daisy chain different stacks, as it will
      only exist in one location (the thread stack).
      
      By moving previous_esp to the end of thread_info and referencing
      it as an offset instead of using a thread_info field, this becomes
      a stepping stone to moving the thread_info.
      
      The offset used to reach the previous stack is rather ugly in this
      patch, but it is only temporary; prev_esp will be changed in the
      next commit. This commit mainly serves as a sanity check for the
      change.
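
      A hypothetical sketch of the offset-based access this stepping stone introduces (the helper and exact offset are illustrative, not the patch's code):

          /* previous_esp now sits at the end of thread_info, so reach it
           * by computing its slot instead of naming a struct field. */
          static inline unsigned long *prev_esp_slot(struct thread_info *ti)
          {
                  return (unsigned long *)(ti + 1) - 1; /* last word */
          }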
      
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Robert Richter <rric@kernel.org>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/r/20110806012353.891757693@goodmis.org
      Link: http://lkml.kernel.org/r/20140206144321.608754481@goodmis.org
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
      0788aa6a
  4. 10 Jul 2013 (6 commits)
  5. 15 Feb 2013 (1 commit)
  6. 01 Dec 2012 (1 commit)
    • context_tracking: New context tracking subsystem · 91d1aa43
      Committed by Frederic Weisbecker
      Create a new subsystem that probes on kernel boundaries
      to keep track of the transitions between low-level contexts,
      with two basic initial contexts: user or kernel.
      
      This is an abstraction of some RCU code that uses such tracking
      to implement its userspace extended quiescent state.
      
      We need to pull this up from RCU into this new level of indirection
      because this tracking is also going to be used to implement "on
      demand" generic virtual cputime accounting, a necessary step toward
      shutting down the tick while still accounting the cputime.
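
      A sketch of how the boundary probes are used, assuming the user_enter()/user_exit() hooks this subsystem exposes:

          #include <linux/context_tracking.h>

          /* A kernel entry/exit boundary informs the subsystem of the
           * transition so RCU (and later cputime accounting) can react. */
          static void boundary_sketch(void)
          {
                  user_exit();   /* userspace -> kernel context */
                  /* ... kernel work runs in tracked kernel context ... */
                  user_enter();  /* kernel -> userspace context */
          }
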
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Gilad Ben-Yossef <gilad@benyossef.com>
      Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
      [ paulmck: fix whitespace error and email address. ]
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      91d1aa43
  7. 21 Nov 2012 (2 commits)
    • x86-32: Export kernel_stack_pointer() for modules · cb57a2b4
      Committed by H. Peter Anvin
      Modules, in particular oprofile (and possibly other similar tools)
      need kernel_stack_pointer(), so export it using EXPORT_SYMBOL_GPL().
      
      Cc: Yang Wei <wei.yang@windriver.com>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Jun Zhang <jun.zhang@intel.com>
      Cc: <stable@vger.kernel.org>
      Link: http://lkml.kernel.org/r/20120912135059.GZ8285@erda.amd.com
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
      cb57a2b4
    • x86-32: Fix invalid stack address while in softirq · 10226238
      Committed by Robert Richter
      On 32-bit, the stack address provided by kernel_stack_pointer() may
      point to an invalid range, causing NULL pointer accesses or page faults
      while in NMI (see the trace below). This happens if it is called in
      softirq context and the stack is empty. The address at &regs->sp is
      then out of range.
      
      Fix this by checking whether regs and &regs->sp are in the same stack
      context. Otherwise return the previous stack pointer stored in struct
      thread_info. If that address is invalid too, return the address of regs.
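
      The shape of the fix, condensed from the description above (close to the actual patch):

          unsigned long kernel_stack_pointer(struct pt_regs *regs)
          {
                  unsigned long context = (unsigned long)regs & ~(THREAD_SIZE - 1);
                  unsigned long sp = (unsigned long)&regs->sp;
                  struct thread_info *tinfo;

                  /* &regs->sp is valid only if it lies on the same stack as regs */
                  if (context == (sp & ~(THREAD_SIZE - 1)))
                          return sp;

                  /* otherwise fall back to the previous stack pointer ... */
                  tinfo = (struct thread_info *)context;
                  if (tinfo->previous_esp)
                          return tinfo->previous_esp;

                  /* ... and if that is invalid too, at least return regs itself */
                  return (unsigned long)regs;
          }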
      
       BUG: unable to handle kernel NULL pointer dereference at 0000000a
       IP: [<c1004237>] print_context_stack+0x6e/0x8d
       *pde = 00000000
       Oops: 0000 [#1] SMP
       Modules linked in:
       Pid: 4434, comm: perl Not tainted 3.6.0-rc3-oprofile-i386-standard-g4411a05 #4 Hewlett-Packard HP xw9400 Workstation/0A1Ch
       EIP: 0060:[<c1004237>] EFLAGS: 00010093 CPU: 0
       EIP is at print_context_stack+0x6e/0x8d
       EAX: ffffe000 EBX: 0000000a ECX: f4435f94 EDX: 0000000a
       ESI: f4435f94 EDI: f4435f94 EBP: f5409ec0 ESP: f5409ea0
        DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
       CR0: 8005003b CR2: 0000000a CR3: 34ac9000 CR4: 000007d0
       DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
       DR6: ffff0ff0 DR7: 00000400
       Process perl (pid: 4434, ti=f5408000 task=f5637850 task.ti=f4434000)
       Stack:
        000003e8 ffffe000 00001ffc f4e39b00 00000000 0000000a f4435f94 c155198c
        f5409ef0 c1003723 c155198c f5409f04 00000000 f5409edc 00000000 00000000
        f5409ee8 f4435f94 f5409fc4 00000001 f5409f1c c12dce1c 00000000 c155198c
       Call Trace:
        [<c1003723>] dump_trace+0x7b/0xa1
        [<c12dce1c>] x86_backtrace+0x40/0x88
        [<c12db712>] ? oprofile_add_sample+0x56/0x84
        [<c12db731>] oprofile_add_sample+0x75/0x84
        [<c12ddb5b>] op_amd_check_ctrs+0x46/0x260
        [<c12dd40d>] profile_exceptions_notify+0x23/0x4c
        [<c1395034>] nmi_handle+0x31/0x4a
        [<c1029dc5>] ? ftrace_define_fields_irq_handler_entry+0x45/0x45
        [<c13950ed>] do_nmi+0xa0/0x2ff
        [<c1029dc5>] ? ftrace_define_fields_irq_handler_entry+0x45/0x45
        [<c13949e5>] nmi_stack_correct+0x28/0x2d
        [<c1029dc5>] ? ftrace_define_fields_irq_handler_entry+0x45/0x45
        [<c1003603>] ? do_softirq+0x4b/0x7f
        <IRQ>
        [<c102a06f>] irq_exit+0x35/0x5b
        [<c1018f56>] smp_apic_timer_interrupt+0x6c/0x7a
        [<c1394746>] apic_timer_interrupt+0x2a/0x30
       Code: 89 fe eb 08 31 c9 8b 45 0c ff 55 ec 83 c3 04 83 7d 10 00 74 0c 3b 5d 10 73 26 3b 5d e4 73 0c eb 1f 3b 5d f0 76 1a 3b 5d e8 73 15 <8b> 13 89 d0 89 55 e0 e8 ad 42 03 00 85 c0 8b 55 e0 75 a6 eb cc
       EIP: [<c1004237>] print_context_stack+0x6e/0x8d SS:ESP 0068:f5409ea0
       CR2: 000000000000000a
       ---[ end trace 62afee3481b00012 ]---
       Kernel panic - not syncing: Fatal exception in interrupt
      
      V2:
      * add comments to kernel_stack_pointer()
      * always return a valid stack address by falling back to the address
        of regs
      Reported-by: Yang Wei <wei.yang@windriver.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Robert Richter <robert.richter@amd.com>
      Link: http://lkml.kernel.org/r/20120912135059.GZ8285@erda.amd.com
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
      Cc: Jun Zhang <jun.zhang@intel.com>
      10226238
  8. 28 Oct 2012 (1 commit)
  9. 26 Sep 2012 (1 commit)
    • x86: Syscall hooks for userspace RCU extended QS · bf5a3c13
      Committed by Frederic Weisbecker
      Add syscall slow path hooks to notify syscall entry
      and exit on CPUs that want to support userspace RCU
      extended quiescent state.
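
      A sketch of where the hooks land in the x86 slow paths (condensed; exact placement hedged):

          long syscall_trace_enter(struct pt_regs *regs)
          {
                  rcu_user_exit();        /* leave the userspace extended QS */
                  /* ... existing entry work: audit, tracing, seccomp ... */
                  return regs->orig_ax;   /* simplified */
          }

          void syscall_trace_leave(struct pt_regs *regs)
          {
                  /* ... existing exit work ... */
                  rcu_user_enter();       /* re-enter the userspace extended QS */
          }
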
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Alessio Igor Bogani <abogani@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Geoff Levand <geoff@infradead.org>
      Cc: Gilad Ben Yossef <gilad@benyossef.com>
      Cc: Hakan Akkan <hakanakkan@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Kevin Hilman <khilman@ti.com>
      Cc: Max Krasnyansky <maxk@qualcomm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephen Hemminger <shemminger@vyatta.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Sven-Thorsten Dietrich <thebigcorporation@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
      bf5a3c13
  10. 19 Sep 2012 (1 commit)
    • x86, fpu: Unify signal handling code paths for x86 and x86_64 kernels · 72a671ce
      Committed by Suresh Siddha
      Currently for x86 and x86_32 binaries, fpstate in the user sigframe is copied
      to/from the fpstate in the task struct.
      
      And in the case of signal delivery for x86_64 binaries, if the fpstate is live
      in the CPU registers, then the live state is copied directly to the user
      sigframe. Otherwise the fpstate in the task struct is copied to the user sigframe.
      During restore, fpstate in the user sigframe is restored directly to the live
      CPU registers.
      
      Historically, different code paths led to different bugs. For example,
      the x86_64 code path was not preemption-safe until recently. Also there is
      a lot of code duplication for the support of new features like xsave etc.
      
      Unify signal handling code paths for x86 and x86_64 kernels.
      
      New strategy is as follows:
      
      Signal delivery: Both for 32/64-bit frames, align the core math frame area to
      64 bytes as needed by xsave (this is where the main fpu/extended state gets
      copied to; it excludes the legacy compatibility fsave header for the 32-bit
      [f]xsave frames). If the state is live, copy the register state directly to
      the user frame. If not live, copy the state in the thread struct to the user
      frame. And for 32-bit [f]xsave frames, construct the fsave header separately
      before the actual [f]xsave area.
      
      Signal return: As the 32-bit frames with [f]xstate have an additional
      'fsave' header, copy everything back from the user sigframe to the
      fpstate in the task structure and reconstruct the fxstate from the 'fsave'
      header (also, user-passed pointers may not be correctly aligned for
      any attempt to directly restore any partial state). At the next fpstate usage,
      everything will be restored to the live CPU registers.
      For all the 64-bit frames and the 32-bit fsave frame, restore the state from
      the user sigframe directly to the live CPU registers. 64-bit signals always
      restored the math frame directly, so we can expect the math frame pointer
      to be correctly aligned. For 32-bit fsave frames, there are no alignment
      requirements, so we can restore the state directly.
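
      A condensed sketch of the delivery-side branch described above (the copy helpers are illustrative names, not the patch's functions):

          /* Signal delivery: pick the source of the saved FPU state. */
          static int save_fpu_to_sigframe_sketch(void __user *buf_fx)
          {
                  if (user_has_fpu())
                          /* live registers -> user frame (xsave/fxsave) */
                          return save_live_xstate(buf_fx);        /* illustrative */

                  /* not live: copy the state saved in the thread struct */
                  return copy_thread_xstate_to_user(buf_fx);      /* illustrative */
          }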
      
      "lat_sig catch" microbenchmark numbers (for x86, x86_64, x86_32 binaries) are
      with in the noise range with this change.
      Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
      Link: http://lkml.kernel.org/r/1343171129-2747-4-git-send-email-suresh.b.siddha@intel.com
      [ Merged in compilation fix ]
      Link: http://lkml.kernel.org/r/1344544736.8326.17.camel@sbsiddha-desk.sc.intel.com
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
      72a671ce
  11. 02 Jun 2012 (1 commit)
  12. 14 Apr 2012 (1 commit)
    • x86: Enable HAVE_ARCH_SECCOMP_FILTER · c6cfbeb4
      Committed by Will Drewry
      Enable support for seccomp filter on x86:
      - syscall_get_arch()
      - syscall_get_arguments()
      - syscall_rollback()
      - syscall_set_return_value()
      - SIGSYS siginfo_t support
      - secure_computing is called from a ptrace_event()-safe context
      - secure_computing return value is checked (see below).
      
      SECCOMP_RET_TRACE and SECCOMP_RET_TRAP may result in seccomp needing to
      skip a system call without killing the process.  This is done by
      returning a non-zero (-1) value from secure_computing.  This change
      makes x86 respect that return value.
      
      To ensure that minimal kernel code is exposed, a non-zero return value
      results in an immediate return to user space (with an invalid syscall
      number).
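
      Roughly what "checked" means in the entry path (condensed sketch of the syscall_trace_enter() change):

          /* do the secure computing check first */
          if (secure_computing(regs->orig_ax)) {
                  /* seccomp wants the syscall skipped: go straight back to
                   * user space with an invalid syscall number */
                  ret = -1L;
          }
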
      Signed-off-by: Will Drewry <wad@chromium.org>
      Reviewed-by: H. Peter Anvin <hpa@zytor.com>
      Acked-by: Eric Paris <eparis@redhat.com>
      Reviewed-by: Kees Cook <keescook@chromium.org>
      
      v18: rebase and tweaked change description, acked-by
      v17: added reviewed by and rebased
      v..: all rebases since original introduction.
      Signed-off-by: James Morris <james.l.morris@oracle.com>
      c6cfbeb4
  13. 29 Mar 2012 (1 commit)
  14. 13 Mar 2012 (1 commit)
  15. 06 Mar 2012 (1 commit)
    • x32: Add ptrace for x32 · 55283e25
      Committed by H.J. Lu
      X32 ptrace is a hybrid of 64-bit ptrace and compat ptrace, with 32-bit
      addresses and longs.  It uses 64-bit ptrace to access the full 64-bit
      registers.  PTRACE_PEEKUSR and PTRACE_POKEUSR are only allowed to access
      segment and debug registers.  PTRACE_PEEKUSR returns the lower 32 bits
      and PTRACE_POKEUSR zero-extends the 32-bit value to 64 bits.  This works
      because the upper 32 bits of the segment and debug registers of an x32
      process are always zero.  GDB only uses PTRACE_PEEKUSR and PTRACE_POKEUSR
      to access segment and debug registers.
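
      A hypothetical sketch of that width handling (getreg()/putreg() are the usual x86 ptrace helpers; this is not the patch's literal code):

          static long x32_peekpoke_sketch(struct task_struct *child, long request,
                                          unsigned long addr, unsigned long data)
          {
                  switch (request) {
                  case PTRACE_PEEKUSR:
                          /* read the full 64-bit register, return the low half */
                          return put_user((__u32)getreg(child, addr),
                                          (__u32 __user *)compat_ptr(data));
                  case PTRACE_POKEUSR:
                          /* a 32-bit value is zero-extended into the register */
                          return putreg(child, addr, (__u32)data);
                  }
                  return -EIO;
          }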
      
      [ hpa: changed TIF_X32 test to use !is_ia32_task() instead, and moved
        the system call number to the now-unused 521 slot. ]
      Signed-off-by: N"H.J. Lu" <hjl.tools@gmail.com>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      Cc: Roland McGrath <roland@hack.frob.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Link: http://lkml.kernel.org/r/1329696488-16970-1-git-send-email-hpa@zytor.com
      55283e25
  16. 22 Feb 2012 (1 commit)
    • i387: Split up <asm/i387.h> into exported and internal interfaces · 1361b83a
      Committed by Linus Torvalds
      While various modules include <asm/i387.h> to get access to things we
      actually *intend* for them to use, most of that header file was really
      pretty low-level internal stuff that we really don't want to expose to
      others.
      
      So split the header file into two: the small exported interfaces remain
      in <asm/i387.h>, while the internal definitions that are only used by
      core architecture code are now in <asm/fpu-internal.h>.
      
      The guiding principle for this was to expose functions that we export to
      modules, and leave them in <asm/i387.h>, while stuff that is used by
      task switching or was marked GPL-only is in <asm/fpu-internal.h>.
      
      The fpu-internal.h file could be further split up too, especially since
      arch/x86/kvm/ uses some of the remaining stuff for its module.  But that
      kvm usage should probably be abstracted out a bit, and at least now the
      internal FPU accessor functions are much more contained.  Even if it
      isn't perhaps as contained as it _could_ be.
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1202211340330.5354@i5.linux-foundation.org
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
      1361b83a
  17. 18 Jan 2012 (2 commits)
    • audit: inline audit_syscall_entry to reduce burden on archs · b05d8447
      Committed by Eric Paris
      Every arch calls:
      
      if (unlikely(current->audit_context))
      	audit_syscall_entry()
      
      which requires knowledge about audit (the existence of audit_context) in
      the arch code.  Just do it all in a static inline in audit.h so that arches
      can remain blissfully ignorant.
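
      The resulting wrapper, roughly (shape of the static inline this patch introduces in audit.h; the argument list is hedged):

          static inline void audit_syscall_entry(int arch, int major,
                                                 unsigned long a0, unsigned long a1,
                                                 unsigned long a2, unsigned long a3)
          {
                  /* the audit_context test now lives here, not in each arch */
                  if (unlikely(current->audit_context))
                          __audit_syscall_entry(arch, major, a0, a1, a2, a3);
          }
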
      Signed-off-by: Eric Paris <eparis@redhat.com>
      b05d8447
    • Audit: push audit success and retcode into arch ptrace.h · d7e7528b
      Committed by Eric Paris
      The audit system previously expected arches calling audit_syscall_exit to
      supply as arguments whether the syscall was a success and what the return
      code was.  Audit also provided a helper, AUDITSC_RESULT, which was supposed
      to simplify things by converting negative retcodes to an audit-internal
      magic value stating success or failure.  This helper was wrong and could
      indicate that a valid pointer returned to userspace was a failed syscall.
      The fix is to fix the layering foolishness.  We now pass audit_syscall_exit
      a struct pt_regs, and it in turn calls back into arch code to collect the
      return value and to determine whether the syscall was a success or failure.
      We also define a generic is_syscall_success() macro which determines
      success/failure based on whether the value falls in the -MAX_ERRNO..-1
      errno range.  This works for arches like x86 which do not use a
      separate mechanism to indicate syscall failure.
      
      We make both is_syscall_success() and regs_return_value() static inlines
      instead of macros.  The reason is that the audit function must take a void*
      for the regs (uml calls theirs struct uml_pt_regs instead of just struct
      pt_regs, so audit_syscall_exit can't take a struct pt_regs).  Since the audit
      function takes a void*, we need static inlines to cast it back to the
      arch-correct structure before dereferencing it.
      
      The other major change is that on some arches, like ia64, MIPS and ppc, we
      change regs_return_value() to give us the negative value on syscall failure.
      The only other user of this macro, kretprobe_example.c, won't notice, and it
      makes the value signed consistently for the audit functions across all archs.
      
      In arch/sh/kernel/ptrace_64.c I see that we were using regs[9] in the old
      audit code as the return value.  But the ptrace_64.h code defined the macro
      regs_return_value() as regs[3].  I have no idea which one is correct, but this
      patch now uses the regs_return_value() function, so it now uses regs[3].
      
      For powerpc we previously used regs->result but now use the
      regs_return_value() function, which uses regs->gprs[3].  regs->gprs[3] is
      always positive, so regs_return_value(), much like on ia64, makes it
      negative before calling the audit code when appropriate.
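
      The generic helpers, roughly as described (x86-flavoured sketch; exact placement is per-arch):

          /* x86: the return value lives in regs->ax */
          static inline long regs_return_value(struct pt_regs *regs)
          {
                  return regs->ax;
          }

          /* failure iff the value falls in the errno range, i.e. it is
           * >= (unsigned long)-MAX_ERRNO when viewed unsigned */
          static inline bool is_syscall_success(struct pt_regs *regs)
          {
                  return !IS_ERR_VALUE((unsigned long)regs_return_value(regs));
          }
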
      Signed-off-by: Eric Paris <eparis@redhat.com>
      Acked-by: H. Peter Anvin <hpa@zytor.com> [for x86 portion]
      Acked-by: Tony Luck <tony.luck@intel.com> [for ia64]
      Acked-by: Richard Weinberger <richard@nod.at> [for uml]
      Acked-by: David S. Miller <davem@davemloft.net> [for sparc]
      Acked-by: Ralf Baechle <ralf@linux-mips.org> [for mips]
      Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [for ppc]
      d7e7528b
  18. 05 Dec 2011 (1 commit)
  19. 01 Jul 2011 (2 commits)
    • perf: Add context field to perf_event · 4dc0da86
      Committed by Avi Kivity
      The perf_event overflow handler does not receive any caller-derived
      argument, so many callers need to resort to looking up the perf_event
      in their local data structure.  This is ugly and doesn't scale if a
      single callback services many perf_events.
      
      Fix by adding a context parameter to perf_event_create_kernel_counter()
      (and derived hardware breakpoints APIs) and storing it in the perf_event.
      The field can be accessed from the callback as event->overflow_handler_context.
      All callers are updated.
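
      The extended creation API, per the description above (the context pointer is the new final parameter):

          extern struct perf_event *
          perf_event_create_kernel_counter(struct perf_event_attr *attr,
                                           int cpu,
                                           struct task_struct *task,
                                           perf_overflow_handler_t callback,
                                           void *context);  /* caller's cookie */

          /* In the overflow callback the cookie is available as
           * event->overflow_handler_context. */
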
      Signed-off-by: Avi Kivity <avi@redhat.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1309362157-6596-2-git-send-email-avi@redhat.com
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      4dc0da86
    • perf: Remove the nmi parameter from the swevent and overflow interface · a8b0ca17
      Committed by Peter Zijlstra
      The nmi parameter indicated if we could do wakeups from the current
      context, if not, we would set some state and self-IPI and let the
      resulting interrupt do the wakeup.
      
      For the various event classes:
      
        - hardware: nmi=0; PMI is in fact an NMI or we run irq_work_run from
          the PMI-tail (ARM etc.)
        - tracepoint: nmi=0; since tracepoint could be from NMI context.
        - software: nmi=[0,1]; some, like the schedule thing cannot
          perform wakeups, and hence need 0.
      
      As one can see, there is very little nmi=1 usage, and the down-side of
      not using it is that on some platforms some software events can have a
      jiffy delay in wakeup (when arch_irq_work_raise isn't implemented).
      
      The up-side however is that we can remove the nmi parameter and save a
      bunch of conditionals in fast paths.
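
      The interface change in a before/after sketch (overflow entry point; the swevent helpers change the same way):

          /* before: callers had to say whether they ran in NMI context */
          int perf_event_overflow(struct perf_event *event, int nmi,
                                  struct perf_sample_data *data,
                                  struct pt_regs *regs);

          /* after: the nmi parameter is gone; deferred wakeups go through
           * irq_work instead */
          int perf_event_overflow(struct perf_event *event,
                                  struct perf_sample_data *data,
                                  struct pt_regs *regs);
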
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Michael Cree <mcree@orcon.net.nz>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
      Cc: Anton Blanchard <anton@samba.org>
      Cc: Eric B Munson <emunson@mgebm.net>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jason Wessel <jason.wessel@windriver.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Link: http://lkml.kernel.org/n/tip-agjev8eu666tvknpb3iaj0fg@git.kernel.org
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      a8b0ca17
  20. 24 May 2011 (1 commit)
  21. 25 Apr 2011 (1 commit)
  22. 28 Oct 2010 (2 commits)
  23. 01 May 2010 (1 commit)
  24. 30 Mar 2010 (1 commit)
    • include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h · 5a0e3ad6
      Committed by Tejun Heo
      
      percpu.h is included by sched.h and module.h and thus ends up being
      included when building most .c files.  percpu.h includes slab.h which
      in turn includes gfp.h making everything defined by the two files
      universally available and complicating inclusion dependencies.
      
      percpu.h -> slab.h dependency is about to be removed.  Prepare for
      this change by updating users of gfp and slab facilities to include
      those headers directly instead of assuming availability (a miniature
      example of such an edit follows the list below).  As this conversion
      needs to touch a large number of source files, the following script
      is used as the basis of conversion.
      
        http://userweb.kernel.org/~tj/misc/slabh-sweep.py
      
      The script does the following.
      
      * Scan files for gfp and slab usages and update includes such that
        only the necessary includes are there.  ie. if only gfp is used,
        gfp.h, if slab is used, slab.h.
      
      * When the script inserts a new include, it looks at the include
        blocks and tries to put the new include such that its order conforms
        to its surroundings.  It's put in the include block which contains
        core kernel includes, in the same order that the rest are ordered -
        alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
        doesn't seem to be any matching order.
      
      * If the script can't find a place to put a new include (mostly
        because the file doesn't have a fitting include block), it prints out
        an error message indicating which .h file needs to be added to the
        file.
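
      The kind of edit the script makes, in miniature (illustrative file):

          /* before: kmalloc() compiled only because percpu.h (via sched.h
           * or module.h) dragged in slab.h implicitly */
          #include <linux/module.h>

          /* after: the user of slab facilities includes the header itself */
          #include <linux/module.h>
          #include <linux/slab.h>         /* for kmalloc()/kfree() */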
      
      The conversion was done in the following steps.
      
      1. The initial automatic conversion of all .c files updated slightly
         over 4000 files, deleting around 700 includes and adding ~480 gfp.h
         and ~3000 slab.h inclusions.  The script emitted errors for ~400
         files.
      
      2. Each error was manually checked.  Some didn't need the inclusion,
         some needed manual addition while adding it to implementation .h or
         embedding .c file was more appropriate for others.  This step added
         inclusions to around 150 files.
      
      3. The script was run again and the output was compared to the edits
         from #2 to make sure no file was left behind.
      
      4. Several build tests were done and a couple of problems were fixed.
         e.g. lib/decompress_*.c used malloc/free() wrappers around slab
         APIs requiring slab.h to be added manually.
      
      5. The script was run on all .h files but without automatically
         editing them as sprinkling gfp.h and slab.h inclusions around .h
         files could easily lead to inclusion dependency hell.  Most gfp.h
         inclusion directives were ignored as stuff from gfp.h was usually
         wildly available and often used in preprocessor macros.  Each
         slab.h inclusion directive was examined and added manually as
         necessary.
      
      6. percpu.h was updated not to include slab.h.
      
      7. Build test were done on the following configurations and failures
         were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
         distributed build env didn't work with gcov compiles) and a few
         more options had to be turned off depending on archs to make things
         build (like ipr on powerpc/64 which failed due to missing writeq).
      
         * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
         * powerpc and powerpc64 SMP allmodconfig
         * sparc and sparc64 SMP allmodconfig
         * ia64 SMP allmodconfig
         * s390 SMP allmodconfig
         * alpha SMP allmodconfig
         * um on x86_64 SMP allmodconfig
      
      8. percpu.h modifications were reverted so that it could be applied as
         a separate patch and serve as bisection point.
      
      Given the fact that I had only a couple of failures from tests on step
      6, I'm fairly confident about the coverage of this conversion patch.
      If there is a breakage, it's likely to be something in one of the arch
      headers which should be easily discoverable on most builds of
      the specific arch.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
      5a0e3ad6
  25. 26 Mar 2010 (1 commit)
    • x86, perf, bts, mm: Delete the never used BTS-ptrace code · faa4602e
      Committed by Peter Zijlstra
      Support for the PMU's BTS features has been upstreamed in
      v2.6.32, but we still have the old and disabled ptrace-BTS,
      as Linus noticed it not so long ago.
      
      It's buggy: TIF_DEBUGCTLMSR is trampling all over that MSR without
      regard for other uses (perf) and doesn't provide the flexibility
      needed for perf either.
      
      Its only users are ptrace-block-step and ptrace-bts; ptrace-bts
      was never used, and ptrace-block-step can be implemented using a
      much simpler approach.
      
      So axe all 3000 lines of it. That includes the *locked_memory*()
      APIs in mm/mlock.c as well.
      Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Roland McGrath <roland@redhat.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Markus Metzger <markus.t.metzger@intel.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      LKML-Reference: <20100325135413.938004390@chello.nl>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      faa4602e
  26. 20 Feb 2010 (1 commit)
    • hw-breakpoint: Keep track of dr7 local enable bits · 326264a0
      Committed by Frederic Weisbecker
      When the user enables breakpoints through dr7, he can choose
      between "local" or "global" enable bits, but given how Linux is
      implemented, both have the same effect.
      
      That said, we don't keep track of how the user enabled the breakpoints,
      so when the user requests the dr7 value, we only translate the
      "enabled" status using the global enabled bits. It means that if
      the user enabled a breakpoint using the local enabled bit, reading
      back dr7 will set the global bit and clear the local one.
      
      Apps like Wine expect a full dr7 POKEUSER/PEEKUSER match for emulated
      software that implements old reverse-engineering protection schemes.
      
      We fix that by keeping track of the whole dr7 value given by the user
      in the thread structure, which drops this bug. We'll think about
      something more proper later.
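
      A sketch of the bookkeeping (field and helper names hedged, not the patch's literal code):

          /* remember the user's dr7 verbatim, echo it back on reads */
          static int ptrace_set_dr7_sketch(struct task_struct *tsk,
                                           unsigned long data)
          {
                  int rc = apply_dr7_breakpoints(tsk, data); /* illustrative */

                  if (!rc)
                          tsk->thread.ptrace_dr7 = data; /* exact user value */
                  return rc;
          }

          static unsigned long ptrace_get_dr7_sketch(struct task_struct *tsk)
          {
                  return tsk->thread.ptrace_dr7; /* local/global bits intact */
          }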
      
      This fixes a 2.6.32 - 2.6.33-x ptrace regression.
      Reported-and-tested-by: Michael Stefaniuc <mstefani@redhat.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Acked-by: K.Prasad <prasad@linux.vnet.ibm.com>
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Cc: Maneesh Soni <maneesh@linux.vnet.ibm.com>
      Cc: Alexandre Julliard <julliard@winehq.org>
      Cc: Rafael J. Wysocki <rjw@sisk.pl>
      Cc: Maciej Rutecki <maciej.rutecki@gmail.com>
      326264a0
  27. 12 Feb 2010 (1 commit)
    • x86, ptrace: regset extensions to support xstate · 5b3efd50
      Committed by Suresh Siddha
      Add the xstate regset support which helps extend the kernel ptrace and the
      core-dump interfaces to support AVX state etc.
      
      This regset interface is designed to support all the future state that gets
      supported using xsave/xrstor infrastructure.
      
      Looking at the memory layout saved by "xsave", one can't say which state
      is represented in the memory layout. This is because if a particular state
      is in its init state, it can be represented by bit '0' in the xsave header,
      so the xsave header alone can't tell whether a state is in its init state
      or simply not saved in the memory layout.
      
      Hence the xsave memory layout available through this regset
      interface uses the SW-usable bytes [464..511] to convey what state is
      represented in the memory layout.
      
      The first 8 bytes of the SW-usable area, sw_usable_bytes[464..471], will be
      set to the OS-enabled xstate mask (which is the same as the 64-bit mask
      returned by xgetbv's xCR0).
      
      The note NT_X86_XSTATE represents the extended state information in the
      core file, using the above mentioned memory layout.
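
      The regset wiring, roughly (the note type is NT_X86_XSTATE per the patch; the callback names follow the usual regset pattern and are hedged):

          /* arch/x86/kernel/ptrace.c -- rough shape of the new regset entry */
          [REGSET_XSTATE] = {
                  .core_note_type = NT_X86_XSTATE,  /* new core-dump note */
                  .size = sizeof(u64),
                  .align = sizeof(u64),
                  .active = xstateregs_active,
                  .get = xstateregs_get,   /* exports the xsave layout with the
                                            * SW-usable bytes filled in */
                  .set = xstateregs_set,
          },
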
      Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
      LKML-Reference: <20100211195614.802495327@sbs-t61.sc.intel.com>
      Signed-off-by: Hongjiu Lu <hjl.tools@gmail.com>
      Cc: Roland McGrath <roland@redhat.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      5b3efd50
  28. 05 Feb 2010 (1 commit)