1. 19 2月, 2011 7 次提交
    • T
      genirq: Rremove redundant check · b008207c
      Thomas Gleixner 提交于
      IRQ_NO_BALANCING is already checked in irq_can_set_affinity() above,
      no need to check it again.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      b008207c
    • T
      genirq: Simplify affinity related code · 1fa46f1f
      Thomas Gleixner 提交于
      There is lot of #ifdef CONFIG_GENERIC_PENDING_IRQ along with
      duplicated code in the irq core. Move the #ifdeffery into one place
      and cleanup the code so it's readable. No functional change.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      1fa46f1f
    • T
      genirq: Namespace cleanup · a0cd9ca2
      Thomas Gleixner 提交于
      The irq namespace has become quite convoluted. My bad.  Clean it up
      and deprecate the old functions. All new functions follow the scheme:
      
      irq number based:
          irq_set/get/xxx/_xxx(unsigned int irq, ...)
      
      irq_data based:
      	 irq_data_set/get/xxx/_xxx(struct irq_data *d, ....)
      
      irq_desc based:
      	 irq_desc_get_xxx(struct irq_desc *desc)
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      a0cd9ca2
    • T
      genirq: Add missing buslock to set_irq_type(), set_irq_wake() · 43abe43c
      Thomas Gleixner 提交于
      chips behind a slow bus cannot update the chip under desc->lock, but
      we miss the chip_buslock/chip_bus_sync_unlock() calls around the set
      type and set wake functions.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      43abe43c
    • T
      genirq: Make nr_irqs runtime expandable · e7bcecb7
      Thomas Gleixner 提交于
      We face more and more the requirement to expand nr_irqs at
      runtime. The reason are irq expanders which can not be detected in the
      early boot stage. So we speculate nr_irqs to have enough room. Further
      Xen needs extra irq numbers and we really want to avoid adding more
      "detection" code into the early boot. There is no real good reason why
      we need to limit nr_irqs at early boot.
      
      Allow the allocation code to expand nr_irqs. We have already 8k extra
      number space in the allocation bitmap, so lets use it.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      e7bcecb7
    • T
      genirq: Disable the SHIRQ_DEBUG call in request_threaded_irq for now · 6d83f94d
      Thomas Gleixner 提交于
      With CONFIG_SHIRQ_DEBUG=y we call a newly installed interrupt handler
      in request_threaded_irq().
      
      The original implementation (commit a304e1b8) called the handler
      _BEFORE_ it was installed, but that caused problems with handlers
      calling disable_irq_nosync(). See commit 377bf1e4.
      
      It's braindead in the first place to call disable_irq_nosync in shared
      handlers, but ....
      
      Moving this call after we installed the handler looks innocent, but it
      is very subtle broken on SMP.
      
      Interrupt handlers rely on the fact, that the irq core prevents
      reentrancy.
      
      Now this debug call violates that promise because we run the handler
      w/o the IRQ_INPROGRESS protection - which we cannot apply here because
      that would result in a possibly forever masked interrupt line.
      
      A concurrent real hardware interrupt on a different CPU results in
      handler reentrancy and can lead to complete wreckage, which was
      unfortunately observed in reality and took a fricking long time to
      debug.
      
      Leave the code here for now. We want this debug feature, but that's
      not easy to fix. We really should get rid of those
      disable_irq_nosync() abusers and remove that function completely.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Anton Vorontsov <avorontsov@ru.mvista.com>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: stable@kernel.org # .28 -> .37
      6d83f94d
    • T
      genirq: Prevent access beyond allocated_irqs bitmap · c1ee6264
      Thomas Gleixner 提交于
      Lars-Peter Clausen pointed out:
      
         I stumbled upon this while looking through the existing archs using
         SPARSE_IRQ.  Even with SPARSE_IRQ the NR_IRQS is still the upper
         limit for the number of IRQs.
      
         Both PXA and MMP set NR_IRQS to IRQ_BOARD_START, with
         IRQ_BOARD_START being the number of IRQs used by the core.
      
         In various machine files the nr_irqs field of the ARM machine
         defintion struct is then set to "IRQ_BOARD_START + NR_BOARD_IRQS".
      
         As a result "nr_irqs" will greater then NR_IRQS which then again
         causes the "allocated_irqs" bitmap in the core irq code to be
         accessed beyond its size overwriting unrelated data.
      
      The core code really misses a sanity check there.
      
      This went unnoticed so far as by chance the compiler/linker places
      data behind that bitmap which gets initialized later on those affected
      platforms.
      
      So the obvious fix would be to add a sanity check in early_irq_init()
      and break all affected platforms. Though that check wants to be
      backported to stable as well, which will require to fix all known
      problematic platforms and probably some more yet not known ones as
      well. Lots of churn.
      
      A way simpler solution is to allocate a slightly larger bitmap and
      avoid the whole churn w/o breaking anything. Add a few warnings when
      an arch returns utter crap.
      Reported-by: NLars-Peter Clausen <lars@metafoo.de>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: stable@kernel.org # .37
      Cc: Haojian Zhuang <haojian.zhuang@marvell.com>
      Cc: Eric Miao <eric.y.miao@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      c1ee6264
  2. 17 2月, 2011 3 次提交
  3. 14 2月, 2011 1 次提交
  4. 12 2月, 2011 2 次提交
    • K
      timer debug: Hide kernel addresses via %pK in /proc/timer_list · f5903085
      Kees Cook 提交于
      In the continuing effort to avoid kernel addresses leaking to
      unprivileged users, this patch switches to %pK for
      /proc/timer_list reporting.
      Signed-off-by: NKees Cook <kees.cook@canonical.com>
      Cc: John Stultz <johnstul@us.ibm.com>
      Cc: Dan Rosenberg <drosenberg@vsecurity.com>
      Cc: Eugene Teo <eugeneteo@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      LKML-Reference: <20110212032125.GA23571@outflux.net>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      f5903085
    • T
      ptrace: use safer wake up on ptrace_detach() · 01e05e9a
      Tejun Heo 提交于
      The wake_up_process() call in ptrace_detach() is spurious and not
      interlocked with the tracee state.  IOW, the tracee could be running or
      sleeping in any place in the kernel by the time wake_up_process() is
      called.  This can lead to the tracee waking up unexpectedly which can be
      dangerous.
      
      The wake_up is spurious and should be removed but for now reduce its
      toxicity by only waking up if the tracee is in TRACED or STOPPED state.
      
      This bug can possibly be used as an attack vector.  I don't think it
      will take too much effort to come up with an attack which triggers oops
      somewhere.  Most sleeps are wrapped in condition test loops and should
      be safe but we have quite a number of places where sleep and wakeup
      conditions are expected to be interlocked.  Although the window of
      opportunity is tiny, ptrace can be used by non-privileged users and with
      some loading the window can definitely be extended and exploited.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NRoland McGrath <roland@redhat.com>
      Acked-by: NOleg Nesterov <oleg@redhat.com>
      Cc: <stable@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      01e05e9a
  5. 11 2月, 2011 3 次提交
  6. 10 2月, 2011 1 次提交
  7. 09 2月, 2011 2 次提交
  8. 08 2月, 2011 4 次提交
  9. 04 2月, 2011 1 次提交
  10. 03 2月, 2011 6 次提交
    • S
      tracing: Replace syscall_meta_data struct array with pointer array · 3d56e331
      Steven Rostedt 提交于
      Currently the syscall_meta structures for the syscall tracepoints are
      placed in the __syscall_metadata section, and at link time, the linker
      makes one large array of all these syscall metadata structures. On boot
      up, this array is read (much like the initcall sections) and the syscall
      data is processed.
      
      The problem is that there is no guarantee that gcc will place complex
      structures nicely together in an array format. Two structures in the
      same file may be placed awkwardly, because gcc has no clue that they
      are suppose to be in an array.
      
      A hack was used previous to force the alignment to 4, to pack the
      structures together. But this caused alignment issues with other
      architectures (sparc).
      
      Instead of packing the structures into an array, the structures' addresses
      are now put into the __syscall_metadata section. As pointers are always the
      natural alignment, gcc should always pack them tightly together
      (otherwise initcall, extable, etc would also fail).
      
      By having the pointers to the structures in the section, we can still
      iterate the trace_events without causing unnecessary alignment problems
      with other architectures, or depending on the current behaviour of
      gcc that will likely change in the future just to tick us kernel developers
      off a little more.
      
      The __syscall_metadata section is also moved into the .init.data section
      as it is now only needed at boot up.
      Suggested-by: NDavid Miller <davem@davemloft.net>
      Acked-by: NDavid S. Miller <davem@davemloft.net>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      3d56e331
    • M
      tracepoints: Fix section alignment using pointer array · 65498646
      Mathieu Desnoyers 提交于
      Make the tracepoints more robust, making them solid enough to handle compiler
      changes by not relying on anything based on compiler-specific behavior with
      respect to structure alignment. Implement an approach proposed by David Miller:
      use an array of const pointers to refer to the individual structures, and export
      this pointer array through the linker script rather than the structures per se.
      It will consume 32 extra bytes per tracepoint (24 for structure padding and 8
      for the pointers), but are less likely to break due to compiler changes.
      
      History:
      
      commit 7e066fb8 tracepoints: add DECLARE_TRACE() and DEFINE_TRACE()
      added the aligned(32) type and variable attribute to the tracepoint structures
      to deal with gcc happily aligning statically defined structures on 32-byte
      multiples.
      
      One attempt was to use a 8-byte alignment for tracepoint structures by applying
      both the variable and type attribute to tracepoint structures definitions and
      declarations. It worked fine with gcc 4.5.1, but broke with gcc 4.4.4 and 4.4.5.
      
      The reason is that the "aligned" attribute only specify the _minimum_ alignment
      for a structure, leaving both the compiler and the linker free to align on
      larger multiples. Because tracepoint.c expects the structures to be placed as an
      array within each section, up-alignment cause NULL-pointer exceptions due to the
      extra unexpected padding.
      
      (this patch applies on top of -tip)
      Signed-off-by: NMathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Acked-by: NDavid S. Miller <davem@davemloft.net>
      LKML-Reference: <20110126222622.GA10794@Krystal>
      CC: Frederic Weisbecker <fweisbec@gmail.com>
      CC: Ingo Molnar <mingo@elte.hu>
      CC: Thomas Gleixner <tglx@linutronix.de>
      CC: Andrew Morton <akpm@linux-foundation.org>
      CC: Peter Zijlstra <peterz@infradead.org>
      CC: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      65498646
    • P
      sched: Fix update_curr_rt() · 06c3bc65
      Peter Zijlstra 提交于
      cpu_stopper_thread()
        migration_cpu_stop()
          __migrate_task()
            deactivate_task()
              dequeue_task()
                dequeue_task_rq()
                  update_curr_rt()
      
      Will call update_curr_rt() on rq->curr, which at that time is
      rq->stop. The problem is that rq->stop.prio matches an RT prio and
      thus falsely assumes its a rt_sched_class task.
      Reported-Debuged-Tested-Acked-by: NThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <new-submission>
      Cc: stable@kernel.org # .37
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      06c3bc65
    • P
      perf: Fix reading in perf_event_read() · 542e72fc
      Peter Zijlstra 提交于
      It is quite possible for the event to have been disabled between
      perf_event_read() sending the IPI and the CPU servicing the IPI and
      calling __perf_event_read(), hence revalidate the state.
      Reported-by: NStephane Eranian <eranian@google.com>
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      542e72fc
    • S
      tracing: Replace trace_event struct array with pointer array · e4a9ea5e
      Steven Rostedt 提交于
      Currently the trace_event structures are placed in the _ftrace_events
      section, and at link time, the linker makes one large array of all
      the trace_event structures. On boot up, this array is read (much like
      the initcall sections) and the events are processed.
      
      The problem is that there is no guarantee that gcc will place complex
      structures nicely together in an array format. Two structures in the
      same file may be placed awkwardly, because gcc has no clue that they
      are suppose to be in an array.
      
      A hack was used previous to force the alignment to 4, to pack the
      structures together. But this caused alignment issues with other
      architectures (sparc).
      
      Instead of packing the structures into an array, the structures' addresses
      are now put into the _ftrace_event section. As pointers are always the
      natural alignment, gcc should always pack them tightly together
      (otherwise initcall, extable, etc would also fail).
      
      By having the pointers to the structures in the section, we can still
      iterate the trace_events without causing unnecessary alignment problems
      with other architectures, or depending on the current behaviour of
      gcc that will likely change in the future just to tick us kernel developers
      off a little more.
      
      The _ftrace_event section is also moved into the .init.data section
      as it is now only needed at boot up.
      Suggested-by: NDavid Miller <davem@davemloft.net>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Acked-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      e4a9ea5e
    • T
      genirq: Prevent irq storm on migration · f1a06390
      Thomas Gleixner 提交于
      move_native_irq() masks and unmasks the interrupt line
      unconditionally, but the interrupt line might be masked due to a
      threaded oneshot handler in progress. Unmasking the line in that case
      can lead to interrupt storms. Observed on PREEMPT_RT.
      
      Originally-from: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: stable@kernel.org
      f1a06390
  11. 31 1月, 2011 5 次提交
  12. 28 1月, 2011 1 次提交
    • E
      perf: Fix alloc_callchain_buffers() · 88d4f0db
      Eric Dumazet 提交于
      Commit 927c7a9e ("perf: Fix race in callchains") introduced
      a mismatch in the sizing of struct callchain_cpus_entries.
      
      nr_cpu_ids must be used instead of num_possible_cpus(), or we
      might get out of bound memory accesses on some machines.
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: David Miller <davem@davemloft.net>
      Cc: Stephane Eranian <eranian@google.com>
      CC: stable@kernel.org
      LKML-Reference: <1295980851.3588.351.camel@edumazet-laptop>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      88d4f0db
  13. 26 1月, 2011 4 次提交