1. 17 3月, 2011 2 次提交
  2. 15 3月, 2011 3 次提交
  3. 14 3月, 2011 3 次提交
  4. 13 3月, 2011 1 次提交
    • T
      posix-clocks: Check write permissions in posix syscalls · 6e6823d1
      Torben Hohn 提交于
      pc_clock_settime() and pc_clock_adjtime() do not check whether the fd
      was opened in write mode, so a clock can be set with a read only fd.
      
      [ tglx: We deliberately do not return -EPERM as we want this to be
        	distingushable from the capability based permission check ]
      Signed-off-by: NTorben Hohn <torbenh@gmx.de>
      LKML-Reference: <1299173174-348-4-git-send-email-torbenh@gmx.de>
      Cc: Richard Cochran <richard.cochran@omicron.at>
      Cc: John Stultz <johnstul@us.ibm.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      6e6823d1
  5. 12 3月, 2011 3 次提交
    • T
      genirq: Add chip flag to force mask on suspend · d209a699
      Thomas Gleixner 提交于
      On suspend we disable all interrupts in the core code, but this does
      not mask the interrupt line in the default implementation as we use a
      lazy disable approach. That means we mark the interrupt disabled, but
      leave the hardware unmasked. That's an optimization because we avoid
      the hardware access for the common case where no interrupt happens
      after we marked it disabled. If an interrupt happens, then the
      interrupt flow handler masks the line at the hardware level and marks
      it pending.
      
      Suspend makes use of this delayed disable as it "disables" all
      interrupts when preparing the suspend transition. Right before the
      system goes into hardware suspend state it checks whether one of the
      interrupts which is marked as a wakeup interrupt came in after
      disabling it.
      
      Most interrupt chips have a separate register which selects the
      interrupts which can wake up the system from suspend, so we don't have
      to mask any on the non wakeup interrupts.
      
      But now we have to deal with brilliant designed hardware which lacks
      such a wakeup configuration facility. For such hardware it's necessary
      to mask all non wakeup interrupts before going into suspend in order
      to avoid the wakeup from random interrupts.
      
      Rather than working around this in the affected interrupt chip
      implementations we can solve this elegant in the core code itself.
      
      Add a flag IRQCHIP_MASK_ON_SUSPEND which can be set by the irq chip
      implementation to indicate, that the interrupts which are not selected
      as wakeup sources must be masked in the suspend path. Mask them in the
      loop which checks the wakeup interrupts pending flag.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NAbhijeet Dharmapurikar <adharmap@codeaurora.org>
      LKML-Reference: <alpine.LFD.2.00.1103112112310.2787@localhost6.localdomain6>
      d209a699
    • L
      futex,plist: Remove debug lock assignment from plist_node · 017f2b23
      Lai Jiangshan 提交于
      The original code uses &plist_node->plist as the fake head of
      the priority list for plist_del(), these debug locks in
      the fake head are needed for CONFIG_DEBUG_PI_LIST.
      
      But now we always pass the real head to plist_del(), the debug locks
      in plist_node will not be used, so we remove these assignments.
      Acked-by: NDarren Hart <dvhart@linux.intel.com>
      Signed-off-by: NLai Jiangshan <laijs@cn.fujitsu.com>
      LKML-Reference: <4D10797E.7040803@cn.fujitsu.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      017f2b23
    • L
      futex,plist: Pass the real head of the priority list to plist_del() · 2e12978a
      Lai Jiangshan 提交于
      Some plist_del()s in kernel/futex.c are passed a faked head of the
      priority list.
      
      It does not fail because the current code does not require the real head
      in plist_del(). The current code of plist_del() just uses the head for checking,
      so it will not cause a bad result even when we use a faked head.
      
      But it is undocumented usage:
      
      /**
       * plist_del - Remove a @node from plist.
       *
       * @node:	&struct plist_node pointer - entry to be removed
       * @head:	&struct plist_head pointer - list head
       */
      
      The document says that the @head is the "list head" head of the priority list.
      
      In futex code, several places use "plist_del(&q->list, &q->list.plist);",
      they pass a fake head. We need to fix them all.
      
      Thanks to Darren Hart for many suggestions.
      Acked-by: NDarren Hart <dvhart@linux.intel.com>
      Signed-off-by: NLai Jiangshan <laijs@cn.fujitsu.com>
      LKML-Reference: <4D11984A.5030203@cn.fujitsu.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      2e12978a
  6. 11 3月, 2011 5 次提交
    • M
      futex: Sanitize cmpxchg_futex_value_locked API · 37a9d912
      Michel Lespinasse 提交于
      The cmpxchg_futex_value_locked API was funny in that it returned either
      the original, user-exposed futex value OR an error code such as -EFAULT.
      This was confusing at best, and could be a source of livelocks in places
      that retry the cmpxchg_futex_value_locked after trying to fix the issue
      by running fault_in_user_writeable().
          
      This change makes the cmpxchg_futex_value_locked API more similar to the
      get_futex_value_locked one, returning an error code and updating the
      original value through a reference argument.
      Signed-off-by: NMichel Lespinasse <walken@google.com>
      Acked-by: Chris Metcalf <cmetcalf@tilera.com>  [tile]
      Acked-by: Tony Luck <tony.luck@intel.com>  [ia64]
      Acked-by: NThomas Gleixner <tglx@linutronix.de>
      Tested-by: Michal Simek <monstr@monstr.eu>  [microblaze]
      Acked-by: David Howells <dhowells@redhat.com> [frv]
      Cc: Darren Hart <darren@dvhart.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      LKML-Reference: <20110311024851.GC26122@google.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      37a9d912
    • T
      futex: Avoid redudant evaluation of task_pid_vnr() · c0c9ed15
      Thomas Gleixner 提交于
      The result is not going to change under us, so no need to reevaluate
      this over and over. Seems to be a leftover from the mechanical mass
      conversion of task->pid to task_pid_vnr(tsk).
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      c0c9ed15
    • T
      SUNRPC: Close a race in __rpc_wait_for_completion_task() · bf294b41
      Trond Myklebust 提交于
      Although they run as rpciod background tasks, under normal operation
      (i.e. no SIGKILL), functions like nfs_sillyrename(), nfs4_proc_unlck()
      and nfs4_do_close() want to be fully synchronous. This means that when we
      exit, we want all references to the rpc_task to be gone, and we want
      any dentry references etc. held by that task to be released.
      
      For this reason these functions call __rpc_wait_for_completion_task(),
      followed by rpc_put_task() in the expectation that the latter will be
      releasing the last reference to the rpc_task, and thus ensuring that the
      callback_ops->rpc_release() has been called synchronously.
      
      This patch fixes a race which exists due to the fact that
      rpciod calls rpc_complete_task() (in order to wake up the callers of
      __rpc_wait_for_completion_task()) and then subsequently calls
      rpc_put_task() without ensuring that these two steps are done atomically.
      
      In order to avoid adding new spin locks, the patch uses the existing
      waitqueue spin lock to order the rpc_task reference count releases between
      the waiting process and rpciod.
      The common case where nobody is waiting for completion is optimised for by
      checking if the RPC_TASK_ASYNC flag is cleared and/or if the rpc_task
      reference count is 1: in those cases we drop trying to grab the spin lock,
      and immediately free up the rpc_task.
      
      Those few processes that need to put the rpc_task from inside an
      asynchronous context and that do not care about ordering are given a new
      helper: rpc_put_task_async().
      Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
      bf294b41
    • M
      futex: Update futex_wait_setup comments about locking · 8fe8f545
      Michel Lespinasse 提交于
      Reviving a cleanup I had done about a year ago as part of a larger
      futex_set_wait proposal. Over the years, the locking of the hashed
      futex queue got improved, so that some of the "rare but normal" race
      conditions described in comments can't actually happen anymore.
      Signed-off-by: NMichel Lespinasse <walken@google.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Darren Hart <dvhltc@us.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      LKML-Reference: <20110307020750.GA31188@google.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      8fe8f545
    • T
      hrtimer: Remove empty hrtimer_init_hres_timer() · a9e7acff
      Thomas Gleixner 提交于
      Leftover from earlier implementation. All empty, remove it.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      a9e7acff
  7. 10 3月, 2011 9 次提交
    • S
      tracing: Fix irqoff selftest expanding max buffer · 4a0b1665
      Steven Rostedt 提交于
      If the kernel command line declares a tracer "ftrace=sometracer" and
      that tracer is either not defined or is enabled after irqsoff,
      then the irqs off selftest will fail with the following error:
      
      Testing tracer irqsoff:
      ------------[ cut here ]------------
      WARNING: at /home/rostedt/work/autotest/nobackup/linux-test.git/kernel/trace/tra
      ce.c:713 update_max_tr_single+0xfa/0x11b()
      Hardware name:
      Modules linked in:
      Pid: 1, comm: swapper Not tainted 2.6.38-rc8-test #1
      Call Trace:
       [<c0441d9d>] ? warn_slowpath_common+0x65/0x7a
       [<c049adb2>] ? update_max_tr_single+0xfa/0x11b
       [<c0441dc1>] ? warn_slowpath_null+0xf/0x13
       [<c049adb2>] ? update_max_tr_single+0xfa/0x11b
       [<c049e454>] ? stop_critical_timing+0x154/0x204
       [<c049b54b>] ? trace_selftest_startup_irqsoff+0x5b/0xc1
       [<c049b54b>] ? trace_selftest_startup_irqsoff+0x5b/0xc1
       [<c049b54b>] ? trace_selftest_startup_irqsoff+0x5b/0xc1
       [<c049e529>] ? time_hardirqs_on+0x25/0x28
       [<c0468bca>] ? trace_hardirqs_on_caller+0x18/0x12f
       [<c0468cec>] ? trace_hardirqs_on+0xb/0xd
       [<c049b54b>] ? trace_selftest_startup_irqsoff+0x5b/0xc1
       [<c049b6b8>] ? register_tracer+0xf8/0x1a3
       [<c14e93fe>] ? init_irqsoff_tracer+0xd/0x11
       [<c040115e>] ? do_one_initcall+0x71/0x121
       [<c14e93f1>] ? init_irqsoff_tracer+0x0/0x11
       [<c14ce3a9>] ? kernel_init+0x13a/0x1b6
       [<c14ce26f>] ? kernel_init+0x0/0x1b6
       [<c0403842>] ? kernel_thread_helper+0x6/0x10
      ---[ end trace e93713a9d40cd06c ]---
      .. no entries found ..FAILED!
      
      What happens is the "ftrace=..." will expand the ring buffer to its
      default size (from its minimum size) but it will not expand the
      max ring buffer (the ring buffer to store maximum latencies).
      When the irqsoff test runs, it will call the ring buffer swap routine
      that checks if the max ring buffer is the same size as the normal
      ring buffer, and will fail if it is not. This causes the test to fail.
      
      The solution is to expand the max ring buffer before running the self
      test if the max ring buffer is used by that tracer and the normal ring
      buffer is expanded. The max ring buffer should be shrunk again after
      the test is done to save space.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      4a0b1665
    • S
      tracing: Align 4 byte ints together in struct tracer · 9a24470b
      Steven Rostedt 提交于
      Move elements in struct tracer for better alignment.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      9a24470b
    • Y
      tracing: Export trace_set_clr_event() · 56355b83
      Yuanhan Liu 提交于
      Trace events belonging to a module only exists when the module is
      loaded. Well, we can use trace_set_clr_event funtion to enable some
      trace event at the module init routine, so that we will not miss
      something while loading then module.
      
      So, Export the trace_set_clr_event function so that module can use it.
      Signed-off-by: NYuanhan Liu <yuanhan.liu@linux.intel.com>
      LKML-Reference: <1289196312-25323-1-git-send-email-yuanhan.liu@linux.intel.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      56355b83
    • J
      tracing: Explain about unstable clock on resume with ring buffer warning · 31274d72
      Jiri Olsa 提交于
      The "Delta way too big" warning might appear on a system with a
      unstable shed clock right after the system is resumed and tracing
      was enabled at time of suspend.
      
      Since it's not realy a bug, and the unstable sched clock is working
      fast and reliable otherwise, Steven suggested to keep using the
      sched clock in any case and just to make note in the warning itself.
      
      v2 changes:
      - added #ifdef CONFIG_HAVE_UNSTABLE_SCHED_CLOCK
      Signed-off-by: NJiri Olsa <jolsa@redhat.com>
      LKML-Reference: <20110218145219.GD2604@jolsa.brq.redhat.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      31274d72
    • D
      tracing: Adjust conditional expression latency formatting. · 10da37a6
      David Sharp 提交于
      Formatting change only to improve code readability. No code changes except to
      introduce intermediate variables.
      Signed-off-by: NDavid Sharp <dhsharp@google.com>
      LKML-Reference: <1291421609-14665-13-git-send-email-dhsharp@google.com>
      
      [ Keep variable declarations and assignment separate ]
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      10da37a6
    • D
      tracing: Fix event alignment: ftrace:context_switch and ftrace:wakeup · 140e4f2d
      David Sharp 提交于
      Signed-off-by: NDavid Sharp <dhsharp@google.com>
      LKML-Reference: <1291421609-14665-6-git-send-email-dhsharp@google.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      140e4f2d
    • S
      tracing: Remove lock_depth from event entry · e6e1e259
      Steven Rostedt 提交于
      The lock_depth field in the event headers was added as a temporary
      data point for help in removing the BKL. Now that the BKL is pretty
      much been removed, we can remove this field.
      
      This in turn changes the header from 12 bytes to 8 bytes,
      removing the 4 byte buffer that gcc would insert if the first field
      in the data load was 8 bytes in size.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      e6e1e259
    • D
      ring-buffer: Remove unused #include <linux/trace_irq.h> · de29be5e
      David Sharp 提交于
      Signed-off-by: NDavid Sharp <dhsharp@google.com>
      LKML-Reference: <1291421609-14665-3-git-send-email-dhsharp@google.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      de29be5e
    • D
      tracing: Add an 'overwrite' trace_option. · 750912fa
      David Sharp 提交于
      Add an "overwrite" trace_option for ftrace to control whether the buffer should
      be overwritten on overflow or not. The default remains to overwrite old events
      when the buffer is full. This patch adds the option to instead discard newest
      events when the buffer is full. This is useful to get a snapshot of traces just
      after enabling traces. Dropping the current event is also a simpler code path.
      Signed-off-by: NDavid Sharp <dhsharp@google.com>
      LKML-Reference: <1291844807-15481-1-git-send-email-dhsharp@google.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      750912fa
  8. 09 3月, 2011 1 次提交
  9. 08 3月, 2011 2 次提交
    • S
      debugobjects: Add hint for better object identification · 99777288
      Stanislaw Gruszka 提交于
      In complex subsystems like mac80211 structures can contain several
      timers and work structs, so identifying a specific instance from the
      call trace and object type output of debugobjects can be hard.
      
      Allow the subsystems which support debugobjects to provide a hint
      function. This function returns a pointer to a kernel address
      (preferrably the objects callback function) which is printed along
      with the debugobjects type.
      
      Add hint methods for timer_list, work_struct and hrtimer.
      
      [ tglx: Massaged changelog, made it compile ]
      Signed-off-by: NStanislaw Gruszka <sgruszka@redhat.com>
      LKML-Reference: <20110307085809.GA9334@redhat.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      99777288
    • A
      unfuck proc_sysctl ->d_compare() · dfef6dcd
      Al Viro 提交于
      a) struct inode is not going to be freed under ->d_compare();
      however, the thing PROC_I(inode)->sysctl points to just might.
      Fortunately, it's enough to make freeing that sucker delayed,
      provided that we don't step on its ->unregistering, clear
      the pointer to it in PROC_I(inode) before dropping the reference
      and check if it's NULL in ->d_compare().
      
      b) I'm not sure that we *can* walk into NULL inode here (we recheck
      dentry->seq between verifying that it's still hashed / fetching
      dentry->d_inode and passing it to ->d_compare() and there's no
      negative hashed dentries in /proc/sys/*), but if we can walk into
      that, we really should not have ->d_compare() return 0 on it!
      Said that, I really suspect that this check can be simply killed.
      Nick?
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      dfef6dcd
  10. 05 3月, 2011 6 次提交
  11. 04 3月, 2011 5 次提交