1. 29 Aug, 2013 3 commits
    • cgroup: fix rmdir EBUSY regression in 3.11 · bb78a92f
      Hugh Dickins committed
      On 3.11-rc we are seeing cgroup directories left behind when they should
      have been removed.  Here's a trivial reproducer:
      
      cd /sys/fs/cgroup/memory
      mkdir parent parent/child; rmdir parent/child parent
      rmdir: failed to remove `parent': Device or resource busy
      
      It's because cgroup_destroy_locked() (step 1 of destruction) leaves
      cgroup on parent's children list, letting cgroup_offline_fn() (step 2 of
      destruction) remove it; but step 2 is run by work queue, which may not
      yet have removed the children when parent destruction checks the list.
      
      Fix that by checking through a non-empty list of children: if every one
      of them has already been marked CGRP_DEAD, then it's safe to proceed:
      those children are invisible to userspace, and should not obstruct rmdir.
      
      (I didn't see any reason to keep the cgrp->children checks under the
      unrelated css_set_lock, so moved them out.)
      
      tj: Flattened nested ifs a bit and updated comment so that it's
          correct on both for-3.11-fixes and for-3.12.
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
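The check described in the commit above (a non-empty children list should not block rmdir when every child is already marked dead) can be sketched in userspace C. This is only an illustration under stated assumptions: `struct cgroup_sketch`, its fields, and `only_dead_children()` are hypothetical stand-ins, not the kernel's data structures or the actual patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a cgroup: only what the sketch needs. */
struct cgroup_sketch {
    bool dead;                        /* models the CGRP_DEAD flag */
    struct cgroup_sketch *children;   /* first child, or NULL */
    struct cgroup_sketch *sibling;    /* next child of the same parent */
};

/*
 * Returns true when destruction of cgrp may proceed: either it has no
 * children, or every child still on the list has already been marked dead
 * (step 1 done, step 2 still pending on the work queue).
 */
bool only_dead_children(const struct cgroup_sketch *cgrp)
{
    assert(cgrp != NULL);
    for (const struct cgroup_sketch *c = cgrp->children; c != NULL; c = c->sibling)
        if (!c->dead)
            return false;   /* a live child must still cause EBUSY */
    return true;
}
```

With this shape, a parent whose only child is dead but not yet unlinked is treated the same as a parent with an empty list.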
    • workqueue: cond_resched() after processing each work item · b22ce278
      Tejun Heo committed
      If !PREEMPT, a kworker running work items back to back can hog CPU.
      This becomes dangerous when a self-requeueing work item which is
      waiting for something to happen races against stop_machine.  Such
      self-requeueing work item would requeue itself indefinitely hogging
      the kworker and CPU it's running on while stop_machine would wait for
      that CPU to enter stop_machine while preventing anything else from
      happening on all other CPUs.  The two would deadlock.
      
      Jamie Liu reports that this deadlock scenario exists around
      scsi_requeue_run_queue() and libata port multiplier support, where one
      port may exclude command processing from other ports.  With the right
      timing, scsi_requeue_run_queue() can end up requeueing itself trying
      to execute an IO which is asked to be retried while another device has
      an exclusive access, which in turn can't make forward progress due to
      stop_machine.
      
      Fix it by invoking cond_resched() after executing each work item.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-by: Jamie Liu <jamieliu@google.com>
      References: http://thread.gmane.org/gmane.linux.kernel/1552567
      Cc: stable@vger.kernel.org
      --
       kernel/workqueue.c |    9 +++++++++
       1 file changed, 9 insertions(+)
    • timer_list: correct the iterator for timer_list · 84a78a65
      Nathan Zimmer committed
      Correct an issue with /proc/timer_list reported by Holger.
      
      When reading from the proc file with a sufficiently small buffer (2k, so
      not really that small), one could get hung trying to read the
      file a chunk at a time.
      
      The timer_list_start function failed to account for the possibility that
      the offset was adjusted outside the timer_list_next.
      Signed-off-by: Nathan Zimmer <nzimmer@sgi.com>
      Reported-by: Holger Hans Peter Freyther <holger@freyther.de>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Berke Durak <berke.durak@xiphos.com>
      Cc: Jeff Layton <jlayton@redhat.com>
      Tested-by: Al Viro <viro@zeniv.linux.org.uk>
      Cc: <stable@vger.kernel.org> # 3.10.x
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  2. 28 Aug, 2013 1 commit
  3. 21 Aug, 2013 1 commit
    • cpuset: fix a regression in validating config change · 1c09b195
      Li Zefan committed
      It's not allowed to clear masks of a cpuset if there're tasks in it,
      but it's broken:
      
        # mkdir /cgroup/sub
        # echo 0 > /cgroup/sub/cpuset.cpus
        # echo 0 > /cgroup/sub/cpuset.mems
        # echo $$ > /cgroup/sub/tasks
        # echo > /cgroup/sub/cpuset.cpus
        (should fail)
      
      This bug was introduced by commit 88fa523b
      ("cpuset: allow to move tasks to empty cpusets").
      
      tj: Dropped temp bool variables and nested the conditionals directly.
      Signed-off-by: Li Zefan <lizefan@huawei.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
  4. 20 Aug, 2013 1 commit
  5. 14 Aug, 2013 2 commits
    • microblaze: fix clone syscall · dfa9771a
      Michal Simek committed
      Fix inadvertent breakage in the clone syscall ABI for Microblaze that
      was introduced in commit f3268edb ("microblaze: switch to generic
      fork/vfork/clone").
      
      The Microblaze syscall ABI for clone takes the parent tid address in the
      4th argument; the third argument slot is used for the stack size.  The
      incorrectly-used CLONE_BACKWARDS type assigned parent tid to the 3rd
      slot.
      
      This commit restores the original ABI so that existing userspace libc
      code will work correctly.
      
      All kernel versions from v3.8-rc1 were affected.
      Signed-off-by: Michal Simek <michal.simek@xilinx.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • PM / QoS: Fix workqueue deadlock when using pm_qos_update_request_timeout() · 40fea92f
      Stephen Boyd committed
      pm_qos_update_request_timeout() updates a qos and then schedules
      a delayed work item to bring the qos back down to the default
      after the timeout. When the work item runs, pm_qos_work_fn() will
      call pm_qos_update_request() and deadlock because it tries to
      cancel itself via cancel_delayed_work_sync(). Future callers of
      that qos will also hang waiting to cancel the work that is
      canceling itself. Let's extract the little bit of code that does
      the real work of pm_qos_update_request() and call it from the
      work function so that we don't deadlock.
      
      Before ed1ac6e9 (PM: don't use [delayed_]work_pending()) this didn't
      happen because the work function wouldn't try to cancel itself.
      Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
      Reviewed-by: Tejun Heo <tj@kernel.org>
      Cc: 3.9+ <stable@vger.kernel.org> # 3.9+
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
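The refactoring described in the commit above (pull the "real work" out of the public update path so the work function can call it without cancelling itself) might look roughly like the following userspace model. All names here (`qos_req_sketch`, `cancel_work_sync_sketch()`, and so on) are hypothetical; the counter merely records where the kernel would have deadlocked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a QoS request with a pending timeout work item. */
struct qos_req_sketch {
    int value;                 /* current QoS value */
    bool work_pending;         /* a delayed work item is queued */
    bool in_work_fn;           /* the work function is currently running */
    int self_cancel_attempts;  /* times the work item tried to wait on itself */
};

/* Models a synchronous cancel: waiting for the work item from inside it never returns. */
void cancel_work_sync_sketch(struct qos_req_sketch *req)
{
    if (req->in_work_fn)
        req->self_cancel_attempts++;   /* in the kernel this would be the deadlock */
    req->work_pending = false;
}

/* The extracted helper: only updates the value, never cancels anything. */
void update_value_sketch(struct qos_req_sketch *req, int new_value)
{
    assert(req != NULL);
    if (req->value != new_value)
        req->value = new_value;
}

/* Public update path: cancel any pending timeout first, then update. */
void update_request_sketch(struct qos_req_sketch *req, int new_value)
{
    cancel_work_sync_sketch(req);
    update_value_sketch(req, new_value);
}

/* Work function after the fix: calls the helper directly, so no self-cancel. */
void work_fn_sketch(struct qos_req_sketch *req, int default_value)
{
    req->in_work_fn = true;
    update_value_sketch(req, default_value);
    req->work_pending = false;
    req->in_work_fn = false;
}
```

Had `work_fn_sketch()` called `update_request_sketch()` instead, the counter would be non-zero, which is the situation the commit removes.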
  6. 13 Aug, 2013 2 commits
    • sched: fix the theoretical signal_wake_up() vs schedule() race · e0acd0a6
      Oleg Nesterov committed
      This is only theoretical, but after try_to_wake_up(p) was changed
      to check p->state under p->pi_lock the code like
      
      	__set_current_state(TASK_INTERRUPTIBLE);
      	schedule();
      
      can miss a signal. This is the special case of wait-for-condition,
      it relies on try_to_wake_up/schedule interaction and thus it does
      not need mb() between __set_current_state() and if(signal_pending).
      
      However, this __set_current_state() can move into the critical
      section protected by rq->lock, now that try_to_wake_up() takes
      another lock we need to ensure that it can't be reordered with
      "if (signal_pending(current))" check inside that section.
      
      The patch is actually a one-liner: it simply adds smp_wmb() before
      spin_lock_irq(rq->lock). This is what try_to_wake_up() already
      does for the same reason.
      
      We turn this wmb() into the new helper, smp_mb__before_spinlock(),
      for better documentation and to allow the architectures to change
      the default implementation.
      
      While at it, kill smp_mb__after_lock(), it has no callers.
      
      Perhaps we can also add smp_mb__before/after_spinunlock() for
      prepare_to_wait().
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • cpuset: fix the return value of cpuset_write_u64() · a903f086
      Li Zefan committed
      Writing to this file always returns -ENODEV:
      
        # echo 1 > cpuset.memory_pressure_enabled
        -bash: echo: write error: No such device
      Signed-off-by: Li Zefan <lizefan@huawei.com>
      Cc: <stable@vger.kernel.org> # 3.9+
      Signed-off-by: Tejun Heo <tj@kernel.org>
  7. 09 Aug, 2013 1 commit
  8. 07 Aug, 2013 3 commits
  9. 03 Aug, 2013 3 commits
  10. 02 Aug, 2013 1 commit
  11. 01 Aug, 2013 10 commits
    • workqueue: copy workqueue_attrs with all fields · 2865a8fb
      Shaohua Li committed
       $ echo '0' > /sys/bus/workqueue/devices/xxx/numa
       $ cat /sys/bus/workqueue/devices/xxx/numa
      
      I got 1, but it should be 0. The reason is that copy_workqueue_attrs(),
      called in apply_workqueue_attrs(), doesn't copy the no_numa field.
      
      Fix it by making copy_workqueue_attrs() copy ->no_numa too.  This
      would also make get_unbound_pool() set a pool's ->no_numa attribute
      according to the workqueue attributes used when the pool was created.
      While harmless, as ->no_numa isn't a pool attribute, this is a bit
      confusing.  Clear it explicitly.
      
      tj: Updated description and comments a bit.
      Signed-off-by: Shaohua Li <shli@fusionio.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: stable@vger.kernel.org
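The two changes described in the commit above (copy every field, then clear `no_numa` where the attributes are reused for a pool) can be illustrated with a small userspace sketch. The struct layout and function names are assumptions made for illustration; in particular the kernel uses a real cpumask type rather than a plain integer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified attribute set. */
struct wq_attrs_sketch {
    int nice;
    unsigned long cpumask;  /* plain bitmask standing in for a cpumask */
    bool no_numa;           /* workqueue-only attribute */
};

/* Copy all fields, including the one the old copy routine skipped. */
void copy_attrs_sketch(struct wq_attrs_sketch *to, const struct wq_attrs_sketch *from)
{
    assert(to != NULL && from != NULL);
    to->nice = from->nice;
    to->cpumask = from->cpumask;
    to->no_numa = from->no_numa;
}

/*
 * Pool setup: reuse the copy, then clear no_numa explicitly because it is
 * not meaningful for a pool and would otherwise be confusing.
 */
void init_pool_attrs_sketch(struct wq_attrs_sketch *pool, const struct wq_attrs_sketch *wq)
{
    copy_attrs_sketch(pool, wq);
    pool->no_numa = false;
}
```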
    • tracing/kprobes: Fail to unregister if probe event files are in use · 40c32592
      Steven Rostedt (Red Hat) committed
      When a probe is being removed, it cleans up the event files that correspond
      to the probe. But there is a race between writing to one of these files
      and deleting the probe. This is especially true for the "enable" file.
      
      	CPU 0				CPU 1
      	-----				-----
      
      				  fd = open("enable",O_WRONLY);
      
        probes_open()
        release_all_trace_probes()
        unregister_trace_probe()
        if (trace_probe_is_enabled(tp))
      	return -EBUSY
      
      				   write(fd, "1", 1)
      				   __ftrace_set_clr_event()
      				   call->class->reg()
      				    (kprobe_register)
      				     enable_trace_probe(tp)
      
        __unregister_trace_probe(tp);
        list_del(&tp->list)
        unregister_probe_event(tp) <-- fails!
        free_trace_probe(tp)
      
      				   write(fd, "0", 1)
      				   __ftrace_set_clr_event()
      				   call->class->unreg
      				    (kprobe_register)
      				    disable_trace_probe(tp) <-- BOOM!
      
      A test program was written that used two threads to simulate the
      above scenario, adding a nanosleep() interval to vary the timing.
      After several thousand runs, it was able to trigger this bug
      and crash:
      
      BUG: unable to handle kernel paging request at 00000005000000f9
      IP: [<ffffffff810dee70>] probes_open+0x3b/0xa7
      PGD 7808a067 PUD 0
      Oops: 0000 [#1] PREEMPT SMP
      Dumping ftrace buffer:
      ---------------------------------
      Modules linked in: ipt_MASQUERADE sunrpc ip6t_REJECT nf_conntrack_ipv6
      CPU: 1 PID: 2070 Comm: test-kprobe-rem Not tainted 3.11.0-rc3-test+ #47
      Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./To be filled by O.E.M., BIOS SDBLI944.86P 05/08/2007
      task: ffff880077756440 ti: ffff880076e52000 task.ti: ffff880076e52000
      RIP: 0010:[<ffffffff810dee70>]  [<ffffffff810dee70>] probes_open+0x3b/0xa7
      RSP: 0018:ffff880076e53c38  EFLAGS: 00010203
      RAX: 0000000500000001 RBX: ffff88007844f440 RCX: 0000000000000003
      RDX: 0000000000000003 RSI: 0000000000000003 RDI: ffff880076e52000
      RBP: ffff880076e53c58 R08: ffff880076e53bd8 R09: 0000000000000000
      R10: ffff880077756440 R11: 0000000000000006 R12: ffffffff810dee35
      R13: ffff880079250418 R14: 0000000000000000 R15: ffff88007844f450
      FS:  00007f87a276f700(0000) GS:ffff88007d480000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      CR2: 00000005000000f9 CR3: 0000000077262000 CR4: 00000000000007e0
      Stack:
       ffff880076e53c58 ffffffff81219ea0 ffff88007844f440 ffffffff810dee35
       ffff880076e53ca8 ffffffff81130f78 ffff8800772986c0 ffff8800796f93a0
       ffffffff81d1b5d8 ffff880076e53e04 0000000000000000 ffff88007844f440
      Call Trace:
       [<ffffffff81219ea0>] ? security_file_open+0x2c/0x30
       [<ffffffff810dee35>] ? unregister_trace_probe+0x4b/0x4b
       [<ffffffff81130f78>] do_dentry_open+0x162/0x226
       [<ffffffff81131186>] finish_open+0x46/0x54
       [<ffffffff8113f30b>] do_last+0x7f6/0x996
       [<ffffffff8113cc6f>] ? inode_permission+0x42/0x44
       [<ffffffff8113f6dd>] path_openat+0x232/0x496
       [<ffffffff8113fc30>] do_filp_open+0x3a/0x8a
       [<ffffffff8114ab32>] ? __alloc_fd+0x168/0x17a
       [<ffffffff81131f4e>] do_sys_open+0x70/0x102
       [<ffffffff8108f06e>] ? trace_hardirqs_on_caller+0x160/0x197
       [<ffffffff81131ffe>] SyS_open+0x1e/0x20
       [<ffffffff81522742>] system_call_fastpath+0x16/0x1b
      Code: e5 41 54 53 48 89 f3 48 83 ec 10 48 23 56 78 48 39 c2 75 6c 31 f6 48 c7
      RIP  [<ffffffff810dee70>] probes_open+0x3b/0xa7
       RSP <ffff880076e53c38>
      CR2: 00000005000000f9
      ---[ end trace 35f17d68fc569897 ]---
      
      The unregister_trace_probe() must be done first, and if it fails it must
      fail the removal of the kprobe.
      
      Several changes have already been made by Oleg Nesterov and Masami Hiramatsu
      to allow moving the unregister_probe_event() before the removal of
      the probe and exit the function if it fails. This prevents the tp
      structure from being used after it is freed.
      
      Link: http://lkml.kernel.org/r/20130704034038.819592356@goodmis.org
      Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • printk: rename struct log to struct printk_log · 62e32ac3
      Joe Perches committed
      Rename the struct to enable moving portions of
      printk.c to separate files.
      
      The rename changes output of /proc/vmcoreinfo.
      Signed-off-by: Joe Perches <joe@perches.com>
      Cc: Samuel Thibault <samuel.thibault@ens-lyon.org>
      Cc: Ming Lei <ming.lei@canonical.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • printk: use pointer for console_cmdline indexing · 23475408
      Joe Perches committed
      Make the code a bit more compact by always using a pointer for the active
      console_cmdline.
      
      Move overly indented code to correct indent level.
      Signed-off-by: Joe Perches <joe@perches.com>
      Cc: Samuel Thibault <samuel.thibault@ens-lyon.org>
      Cc: Ming Lei <ming.lei@canonical.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • printk: move braille console support into separate braille.[ch] files · bbeddf52
      Joe Perches committed
      Create files with prototypes and static inlines for braille support.  Make
      braille_console functions return 1 on success.
      
      Corrected CONFIG_A11Y_BRAILLE_CONSOLE=n _braille_console_setup
      return value to NULL.
      Signed-off-by: Joe Perches <joe@perches.com>
      Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
      Cc: Ming Lei <ming.lei@canonical.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • printk: add console_cmdline.h · d197c43d
      Joe Perches committed
      Add an include file for the console_cmdline struct so that the braille
      console driver can be separated.
      Signed-off-by: Joe Perches <joe@perches.com>
      Cc: Samuel Thibault <samuel.thibault@ens-lyon.org>
      Cc: Ming Lei <ming.lei@canonical.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • printk: move to separate directory for easier modification · b9ee979e
      Joe Perches committed
      Make it easier to break up printk into bite-sized chunks.
      
      Remove printk path/filename from comment.
      Signed-off-by: Joe Perches <joe@perches.com>
      Cc: Samuel Thibault <samuel.thibault@ens-lyon.org>
      Cc: Ming Lei <ming.lei@canonical.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: sched: numa: fix NUMA balancing when !SCHED_DEBUG · 10e84b97
      Dave Kleikamp committed
      Commit 3105b86a ("mm: sched: numa: Control enabling and disabling of
      NUMA balancing if !SCHED_DEBUG") defined numabalancing_enabled to
      control the enabling and disabling of automatic NUMA balancing, but it
      is never used.
      
      I believe the intention was to use this in place of sched_feat_numa(NUMA).
      
      Currently, if SCHED_DEBUG is not defined, sched_feat_numa(NUMA) will
      never be changed from the initial "false".
      Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
      Acked-by: Mel Gorman <mgorman@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • tracing: Add comment to describe special break case in probe_remove_event_call() · 2ba64035
      Steven Rostedt (Red Hat) committed
      The "break" used in the do_for_each_event_file() is used as an optimization
      as the loop is really a double loop. The loop searches all event files
      for each trace_array. There's only one matching event file per trace_array
      and after we find the event file for the trace_array, the break is used
      to jump to the next trace_array and start the search there.
      
      As this is not a standard way of using "break" in C code, it requires
      a comment right before the break to let people know what is going on.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
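The unusual `break` described in the commit above is easier to see with a toy double-loop macro. The macros and function below are hypothetical and only modelled on the description (an outer loop over trace arrays, an inner loop over event files, at most one match per array); they are not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

#define N_ARRAYS 3   /* stand-in for the number of trace arrays */
#define N_FILES  4   /* stand-in for event files per trace array */

/* Hypothetical macro pair that hides a nested loop behind one "loop" body. */
#define do_for_each_file_sketch(a, f)              \
    for ((a) = 0; (a) < N_ARRAYS; (a)++) {         \
        for ((f) = 0; (f) < N_FILES; (f)++)
#define while_for_each_file_sketch()               \
    }

/*
 * Counts arrays that contain the wanted id, and records how many inner
 * iterations ran so the effect of the break is observable.
 */
int count_matches_sketch(int files[N_ARRAYS][N_FILES], int wanted, int *inner_steps)
{
    int a, f, matches = 0;

    assert(inner_steps != NULL);
    *inner_steps = 0;
    do_for_each_file_sketch(a, f) {
        (*inner_steps)++;
        if (files[a][f] != wanted)
            continue;
        matches++;
        /*
         * This break only leaves the hidden inner loop; the outer loop
         * then moves on to the next array. It is an optimization, not
         * an exit from the whole search.
         */
        break;
    } while_for_each_file_sketch();

    return matches;
}
```

Because the inner loop is invisible at the call site, a reader could easily assume the `break` ends the whole search, which is why a comment next to it is worthwhile.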
    • tracing: trace_remove_event_call() should fail if call/file is in use · 2816c551
      Oleg Nesterov committed
      Change trace_remove_event_call(call) to return the error if this
      call is active. This is what the callers assume but can't verify
      outside of the tracing locks. Both trace_kprobe.c/trace_uprobe.c
      need the additional changes, unregister_trace_probe() should abort
      if trace_remove_event_call() fails.
      
      The caller is going to free this call/file so we must ensure that
      nobody can use them after trace_remove_event_call() succeeds.
      debugfs should be fine after the previous changes and event_remove()
      does TRACE_REG_UNREGISTER, but still there are 2 reasons why we need
      the additional checks:
      
      - There could be a perf_event(s) attached to this tp_event, so the
        patch checks ->perf_refcount.
      
      - TRACE_REG_UNREGISTER can be suppressed by FTRACE_EVENT_FL_SOFT_MODE,
        so we simply check FTRACE_EVENT_FL_ENABLED protected by event_mutex.
      
      Link: http://lkml.kernel.org/r/20130729175033.GB26284@redhat.com
      Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
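The two checks listed in the commit above, and the requirement that the caller abort when removal fails, can be sketched as follows. This is a userspace illustration with hypothetical names (`event_call_sketch`, `remove_event_call_sketch()`, `unregister_probe_sketch()`); the real code performs these checks under the tracing locks.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of an event call and the state that marks it "in use". */
struct event_call_sketch {
    int perf_refcount;   /* models ->perf_refcount: perf events attached */
    bool enabled;        /* models the ENABLED flag on one of its files */
    bool registered;     /* still visible to the tracing core */
};

/* Refuse to remove a call that is still in use; otherwise unregister it. */
int remove_event_call_sketch(struct event_call_sketch *call)
{
    assert(call != NULL);
    if (call->perf_refcount > 0)
        return -EBUSY;          /* a perf event is attached */
    if (call->enabled)
        return -EBUSY;          /* an event file is still enabled */
    call->registered = false;
    return 0;
}

/* Caller side: only free the probe once removal has succeeded. */
int unregister_probe_sketch(struct event_call_sketch *call, bool *freed)
{
    int ret = remove_event_call_sketch(call);

    if (ret)
        return ret;             /* abort: the structure must stay alive */
    *freed = true;              /* stands in for freeing the probe */
    return 0;
}
```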
  12. 31 Jul, 2013 4 commits
    • cgroup: fix a leak when percpu_ref_init() fails · da0a12ca
      Li Zefan committed
      ss->css_free() is not called when percpu_ref_init() fails.
      Signed-off-by: Li Zefan <lizefan@huawei.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • ftrace: Check module functions being traced on reload · 8c4f3c3f
      Steven Rostedt (Red Hat) committed
      There's been a nasty bug that would show up and not give much info.
      The bug displayed the following warning:
      
       WARNING: at kernel/trace/ftrace.c:1529 __ftrace_hash_rec_update+0x1e3/0x230()
       Pid: 20903, comm: bash Tainted: G           O 3.6.11+ #38405.trunk
       Call Trace:
        [<ffffffff8103e5ff>] warn_slowpath_common+0x7f/0xc0
        [<ffffffff8103e65a>] warn_slowpath_null+0x1a/0x20
        [<ffffffff810c2ee3>] __ftrace_hash_rec_update+0x1e3/0x230
        [<ffffffff810c4f28>] ftrace_hash_move+0x28/0x1d0
        [<ffffffff811401cc>] ? kfree+0x2c/0x110
        [<ffffffff810c68ee>] ftrace_regex_release+0x8e/0x150
        [<ffffffff81149f1e>] __fput+0xae/0x220
        [<ffffffff8114a09e>] ____fput+0xe/0x10
        [<ffffffff8105fa22>] task_work_run+0x72/0x90
        [<ffffffff810028ec>] do_notify_resume+0x6c/0xc0
        [<ffffffff8126596e>] ? trace_hardirqs_on_thunk+0x3a/0x3c
        [<ffffffff815c0f88>] int_signal+0x12/0x17
       ---[ end trace 793179526ee09b2c ]---
      
      It was finally narrowed down to unloading a module that was being traced.
      
      It was actually more than that. When functions are being traced, there's
      a table of all functions that have a ref count of the number of active
      tracers attached to that function. When a function trace callback is
      registered to a function, the function's record ref count is incremented.
      When it is unregistered, the function's record ref count is decremented.
      If an inconsistency is detected (ref count goes below zero) the above
      warning is shown and the function tracing is permanently disabled until
      reboot.
      
      The ftrace callback ops holds a hash of functions that it filters on
      (and/or filters off). If the hash is empty, the default means to filter
      all functions (for the filter_hash) or to disable no functions (for the
      notrace_hash).
      
      When a module is unloaded, it frees the function records that represent
      the module functions. These records exist on their own pages, that is
      function records for one module will not exist on the same page as
      function records for other modules or even the core kernel.
      
      Now when a module unloads, the records that represent its functions are
      freed. When the module is loaded again, the records are recreated with
      a default ref count of zero (unless there's a callback that traces all
      functions, then they will also be traced, and the ref count will be
      incremented).
      
      The problem is that if an ftrace callback hash includes functions of the
      module being unloaded, those hash entries will not be removed. If the
      module is reloaded in the same location, the hash entries still point
      to the functions of the module but the module's ref counts do not reflect
      that.
      
      With the help of Steve and Joern, we found a reproducer:
      
       Using uinput module and uinput_release function.
      
       cd /sys/kernel/debug/tracing
       modprobe uinput
       echo uinput_release > set_ftrace_filter
       echo function > current_tracer
       rmmod uinput
       modprobe uinput
       # check /proc/modules to see if loaded in same addr, otherwise try again
       echo nop > current_tracer
      
       [BOOM]
      
      The above loads the uinput module, which creates a table of functions that
      can be traced within the module.
      
      We add uinput_release to the filter_hash to trace just that function.
      
      Enable function tracing, which increments the ref count of the record
      associated to uinput_release.
      
      Remove uinput, which frees the records including the one that represents
      uinput_release.
      
      Load the uinput module again (and make sure it's at the same address).
      This recreates the function records all with a ref count of zero,
      including uinput_release.
      
      Disable function tracing, which will decrement the ref count for uinput_release
      which is now zero because of the module removal and reload, and we have
      a mismatch (below zero ref count).
      
      The solution is to check all currently tracing ftrace callbacks to see if any
      are tracing any of the module's functions when a module is loaded (it already does
      that with callbacks that trace all functions). If a callback happens to have
      a module function being traced, it increments that record's ref count and starts
      tracing that function.
      
      There may be a strange side effect with this, where tracing module functions
      on unload and then reloading a new module may have that new module's functions
      being traced. This may be something that confuses the user, but it's not
      a big deal. Another approach is to disable all callback hashes on module unload,
      but this leaves some ftrace callbacks that may not be registered, but can
      still have hashes tracing the module's function where ftrace doesn't know about
      it. That situation can cause the same bug. This solution solves that case too.
      Another benefit of this solution, is it is possible to trace a module's
      function on unload and load.
      
      Link: http://lkml.kernel.org/r/20130705142629.GA325@redhat.com
      Reported-by: Jörn Engel <joern@logfs.org>
      Reported-by: Dave Jones <davej@redhat.com>
      Reported-by: Steve Hodgson <steve@purestorage.com>
      Tested-by: Steve Hodgson <steve@purestorage.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
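The ref count mismatch walked through in the commit above can be replayed with a tiny userspace model. Everything here is a simplifying assumption: one function record, one callback, and hypothetical names throughout. The `with_fix` flag models the described solution of re-checking active callbacks when a module is loaded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: one module function record and one ftrace callback. */
struct ftrace_state_sketch {
    int rec_refcount;     /* ref count of the module function's record */
    bool hash_has_func;   /* the callback's filter hash lists the function */
    bool tracing;         /* the callback is currently registered */
};

/* Registering the callback bumps the record's ref count. */
void start_tracing_sketch(struct ftrace_state_sketch *s)
{
    assert(s != NULL && !s->tracing);
    s->tracing = true;
    if (s->hash_has_func)
        s->rec_refcount++;
}

/*
 * Unload plus reload at the same address: the record is recreated with a
 * ref count of zero while the stale hash entry survives. With the fix,
 * active callbacks whose hash covers the function are re-attached.
 */
void module_reload_sketch(struct ftrace_state_sketch *s, bool with_fix)
{
    s->rec_refcount = 0;
    if (with_fix && s->tracing && s->hash_has_func)
        s->rec_refcount++;
}

/* Unregistering drops the ref count; returns false on the "below zero" case. */
bool stop_tracing_sketch(struct ftrace_state_sketch *s)
{
    s->tracing = false;
    if (!s->hash_has_func)
        return true;
    if (s->rec_refcount == 0)
        return false;           /* the inconsistency behind the warning */
    s->rec_refcount--;
    return true;
}
```

Running start, reload, stop without the fix hits the inconsistent branch; with the fix the count returns cleanly to zero.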
    • mutex: Fix w/w mutex deadlock injection · 85f48961
      Maarten Lankhorst committed
      The check needs to be for > 1, because ctx->acquired is already incremented.
      This will prevent ww_mutex_lock_slow from returning -EDEADLK and not locking
      the mutex. It caused a lot of false gpu lockups on radeon with
      CONFIG_DEBUG_WW_MUTEX_SLOWPATH=y because a function that shouldn't be able
      to return -EDEADLK did.
      Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Alex Deucher <alexander.deucher@amd.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/51F775B5.201@canonical.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • sched: Ensure update_cfs_shares() is called for parents of continuously-running tasks · bf0bd948
      Peter Zijlstra committed
      We typically update a task_group's shares within the dequeue/enqueue
      path.  However, continuously running tasks sharing a CPU are not
      subject to these updates as they are only put/picked.  Unfortunately,
      when we reverted f269ae04 (in 17bc14b7), we lost the augmenting
      periodic update that was supposed to account for this; resulting in a
      potential loss of fairness.
      
      To fix this, re-introduce the explicit update in
      update_cfs_rq_blocked_load() [called via entity_tick()].
      Reported-by: Max Hailperin <max@gustavus.edu>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Reviewed-by: Paul Turner <pjt@google.com>
      Link: http://lkml.kernel.org/n/tip-9545m3apw5d93ubyrotrj31y@git.kernel.org
      Cc: <stable@kernel.org>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  13. 30 Jul, 2013 8 commits