1. 14 June 2013, 11 commits
    • cgroup: update sane_behavior documentation · f63674fd
      Committed by Tejun Heo
      f12dc020 ("cgroup: mark "tasks" cgroup file as insane") and
      cc5943a7 ("cgroup: mark "notify_on_release" and "release_agent"
      cgroup files insane") forgot to update the changed behavior
      documentation in cgroup.h.  Update it.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      f63674fd
    • cgroup: use percpu refcnt for cgroup_subsys_states · d3daf28d
      Committed by Tejun Heo
      A css (cgroup_subsys_state) is how each cgroup is represented to a
      controller.  As such, it can be used in hot paths across the various
      subsystems different controllers are associated with.
      
      One of the common operations is reference counting, which up until now
      has been implemented using a global atomic counter and can have
      significant adverse impact on scalability.  For example, css refcnt
      can be gotten and put multiple times by blkcg for each IO request.
      For high-IOPS configurations which try to do as much per-cpu as
      possible, frequent refcounting on a global counter can be very expensive.
      
      In general, given the various and hugely diverse paths css's end up
      being used from, we need to make it cheap and highly scalable.  In its
      usage, css refcnting isn't very different from module refcnting.
      
      This patch converts css refcnting to use the recently added
      percpu_ref.  css_get/tryget/put() directly maps to the matching
      percpu_ref operations and the deactivation logic is no longer
      necessary as percpu_ref already has refcnt killing.
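
      A rough sketch of what the fast-path side of this mapping looks like
      (simplified; "css_like" and its helpers below are illustrative
      stand-ins rather than the exact cgroup code):

        #include <linux/percpu-refcount.h>

        /* illustrative stand-in for struct cgroup_subsys_state */
        struct css_like {
                struct percpu_ref refcnt;   /* replaces the global atomic counter */
        };

        static inline void css_like_get(struct css_like *css)
        {
                percpu_ref_get(&css->refcnt);   /* per-cpu increment, no shared cacheline */
        }

        static inline bool css_like_tryget(struct css_like *css)
        {
                return percpu_ref_tryget(&css->refcnt); /* fails once the ref is killed */
        }

        static inline void css_like_put(struct css_like *css)
        {
                percpu_ref_put(&css->refcnt);   /* release callback runs on the final put */
        }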
      
      The only complication is that as the refcnt is per-cpu,
      percpu_ref_kill() in itself doesn't ensure that further tryget
      operations will fail, which we need to guarantee before invoking
      ->css_offline()'s.  This is resolved by collecting kill confirmation
      using percpu_ref_kill_and_confirm() and initiating the offline phase
      of destruction only after the css refcnt is confirmed to be seen as
      killed on all CPUs.  The previous patches already split destruction
      into two phases, so percpu_ref_kill_and_confirm() can be hooked up
      easily.
      
      This patch removes css_refcnt() which is used for rcu dereference
      sanity check in css_id().  While we can add a percpu refcnt API to ask
      the same question, css_id() itself is scheduled to be removed fairly
      soon, so let's not bother with it.  Just drop the sanity check and use
      rcu_dereference_raw() instead.
      
      v2: - init_cgroup_css() was calling percpu_ref_init() without checking
            the return value.  This causes two problems - the obvious lack
            of error handling and percpu_ref_init() being called from
            cgroup_init_subsys() before the allocators are up, which
            triggers warnings but doesn't cause actual problems as the
            refcnt isn't used for roots anyway.  Fix both by moving
            percpu_ref_init() to cgroup_create().
      
          - The base references were put too early by
            percpu_ref_kill_and_confirm() and cgroup_offline_fn() put the
            refs one extra time.  This wasn't noticeable because css's go
            through another RCU grace period before being freed.  Update
            cgroup_destroy_locked() to grab an extra reference before
            killing the refcnts.  This problem was noticed by Kent.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Kent Overstreet <koverstreet@google.com>
      Acked-by: Li Zefan <lizefan@huawei.com>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Mike Snitzer <snitzer@redhat.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: "Alasdair G. Kergon" <agk@redhat.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Mikulas Patocka <mpatocka@redhat.com>
      Cc: Glauber Costa <glommer@gmail.com>
      d3daf28d
    • cgroup: split cgroup destruction into two steps · ea15f8cc
      Committed by Tejun Heo
      Split cgroup_destroy_locked() into two steps and put the latter half
      into cgroup_offline_fn() which is executed from a work item.  The
      latter half is responsible for offlining the css's, removing the
      cgroup from internal lists, and propagating release notification to
      the parent.  The separation is to allow using percpu refcnt for css.
      
      Note that this allows for other cgroup operations to happen between
      the first and second halves of destruction, including creating a new
      cgroup with the same name.  As the target cgroup is marked DEAD in the
      first half and cgroup internals don't care about the names of cgroups,
      this should be fine.  A comment explaining this will be added by the
      next patch which implements the actual percpu refcnting.
      
      As RCU freeing is guaranteed to happen after the second step of
      destruction, we can use the same work item for both.  This patch
      renames cgroup->free_work to ->destroy_work and uses it for both
      purposes.  INIT_WORK() is now performed right before queueing the work
      item.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
      ea15f8cc
    • percpu-refcount: implement percpu_tryget() along with percpu_ref_kill_and_confirm() · dbece3a0
      Committed by Tejun Heo
      Implement percpu_tryget() which stops giving out references once the
      percpu_ref is visible as killed.  Because the refcnt is per-cpu,
      different CPUs will start to see a refcnt as killed at different
      points in time and tryget() may continue to succeed on a subset of
      CPUs for a while after percpu_ref_kill() returns.
      
      For use cases where it's necessary to know when all CPUs start to see
      the refcnt as dead, percpu_ref_kill_and_confirm() is added.  The new
      function takes an extra argument @confirm_kill which is invoked when
      the refcnt is guaranteed to be viewed as killed on all CPUs.
      
      While this isn't the prettiest interface, it doesn't force synchronous
      wait and is much safer than requiring the caller to do its own
      call_rcu().
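
      A rough usage sketch of the confirmed-kill interface (the object,
      callbacks, and work item here are hypothetical, not taken from an
      in-tree user):

        #include <linux/percpu-refcount.h>
        #include <linux/workqueue.h>

        struct my_obj {
                struct percpu_ref ref;
                struct work_struct offline_work;
        };

        static void my_obj_release(struct percpu_ref *ref)
        {
                /* final reference dropped: free the object */
        }

        static void my_obj_confirm_kill(struct percpu_ref *ref)
        {
                struct my_obj *obj = container_of(ref, struct my_obj, ref);

                /* all CPUs now see the ref as killed; tryget can no longer succeed */
                schedule_work(&obj->offline_work);
        }

        static void my_obj_shutdown(struct my_obj *obj)
        {
                percpu_ref_kill_and_confirm(&obj->ref, my_obj_confirm_kill);
                /* my_obj_confirm_kill() runs later, once the kill is globally visible */
        }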
      
      v2: Patch description rephrased to emphasize that tryget() may
          continue to succeed on some CPUs after kill() returns as suggested
          by Kent.
      
      v3: Function comment in percpu_ref_kill_and_confirm() updated warning
          people to not depend on the implied RCU grace period from the
          confirm callback as it's an implementation detail.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Slightly-Grumpily-Acked-by: Kent Overstreet <koverstreet@google.com>
      dbece3a0
    • percpu-refcount: implement percpu_ref_cancel_init() · bc497bd3
      Committed by Tejun Heo
      Normally, percpu_ref_init() initializes and percpu_ref_kill()
      initiates destruction which completes asynchronously.  The
      asynchronous destruction can be problematic in the init failure path,
      where the caller wants to destroy a half-constructed object -
      distinguishing half-constructed objects in the usual release method
      can be painful for complex objects.
      
      This patch implements percpu_ref_cancel_init() which synchronously
      destroys the percpu_ref without invoking release.  To avoid
      unintentional misuse, the function requires a ref that has finished
      percpu_ref_init() but has never been used, and triggers a WARN otherwise.
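
      A sketch of the intended init-failure usage (my_obj_*, including the
      later setup step, are hypothetical names):

        static int my_obj_init(struct my_obj *obj)
        {
                int ret;

                ret = percpu_ref_init(&obj->ref, my_obj_release);
                if (ret)
                        return ret;

                ret = my_obj_setup_rest(obj);   /* hypothetical later init step */
                if (ret) {
                        /* half-constructed object: tear the ref down synchronously
                         * without ever invoking my_obj_release() */
                        percpu_ref_cancel_init(&obj->ref);
                        return ret;
                }

                return 0;
        }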
      
      v2: Explain the weird name and usage restriction in the function
          comment.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Kent Overstreet <koverstreet@google.com>
      bc497bd3
    • percpu-refcount: add __must_check to percpu_ref_init() and don't use ACCESS_ONCE() in percpu_ref_kill_rcu() · acac7883
      Committed by Tejun Heo
      
      Two small changes.
      
      * Unlike most init functions, percpu_ref_init() allocates memory and
        may fail.  Let's mark it with __must_check in case the caller
        forgets.
      
      * percpu_ref_kill_rcu() is unnecessarily using ACCESS_ONCE() to
        dereference @ref->pcpu_count, which can be misleading.  The pointer
        is guaranteed to be valid and visible and can't change underneath
        the function.  Drop ACCESS_ONCE().
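
      The first change boils down to the declaration looking roughly like
      this, so a caller that ignores the return value now gets a compiler
      warning:

        int __must_check percpu_ref_init(struct percpu_ref *ref,
                                         percpu_ref_func_t *release);
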
      Signed-off-by: Tejun Heo <tj@kernel.org>
      acac7883
    • cgroup: remove cgroup->count and use · 6f3d828f
      Committed by Tejun Heo
      cgroup->count tracks the number of css_sets associated with the cgroup
      and is used only to verify that no css_set is associated when the cgroup
      is being destroyed.  It's superfluous as the destruction path can
      simply check whether cgroup->cset_links is empty instead.
      
      Drop cgroup->count and check ->cset_links directly from
      cgroup_destroy_locked().
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
      6f3d828f
    • cgroup: rename CGRP_REMOVED to CGRP_DEAD · 54766d4a
      Committed by Tejun Heo
      We will add another flag indicating that the cgroup is in the process
      of being killed.  REMOVING / REMOVED is more difficult to distinguish
      and cgroup_is_removing()/cgroup_is_removed() are a bit awkward.  Also,
      later percpu_ref usage will involve "kill"ing the refcnt.
      
       s/CGRP_REMOVED/CGRP_DEAD/
       s/cgroup_is_removed()/cgroup_is_dead()/
      
      This patch is purely cosmetic.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
      54766d4a
    • cgroup: clean up css_[try]get() and css_put() · 5de0107e
      Committed by Tejun Heo
      * __css_get() isn't used by anyone.  Fold it into css_get().
      
      * Add proper function comments to all css reference functions.
      
      This patch is purely cosmetic.
      
      v2: Typo fix as per Li.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
      5de0107e
    • cgroup: bring some sanity to naming around cg_cgroup_link · 69d0206c
      Committed by Tejun Heo
      cgroups and css_sets are mapped M:N and this M:N mapping is
      represented by struct cg_cgroup_link which forms linked lists on both
      sides.  The naming around this mapping is already confusing and struct
      cg_cgroup_link exacerbates the situation quite a bit.
      
      From the cgroup side, it starts off ->css_sets and runs through
      ->cgrp_link_list.  From css_set side, it starts off ->cg_links and
      runs through ->cg_link_list.  This is rather reversed as
      cgrp_link_list is used to iterate css_sets and cg_link_list cgroups.
      Also, this is the only place which is still using the confusing "cg"
      for css_sets.  This patch cleans it up a bit.
      
      * s/cgroup->css_sets/cgroup->cset_links/
        s/css_set->cg_links/css_set->cgrp_links/
        s/cgroup_iter->cg_link/cgroup_iter->cset_link/
      
      * s/cg_cgroup_link/cgrp_cset_link/
      
      * s/cgrp_cset_link->cg/cgrp_cset_link->cset/
        s/cgrp_cset_link->cgrp_link_list/cgrp_cset_link->cset_link/
        s/cgrp_cset_link->cg_link_list/cgrp_cset_link->cgrp_link/
      
      * s/init_css_set_link/init_cgrp_cset_link/
        s/free_cg_links/free_cgrp_cset_links/
        s/allocate_cg_links/allocate_cgrp_cset_links/
      
      * s/cgl[12]/link[12]/ in compare_css_sets()
      
      * s/saved_link/tmp_link/ s/tmp/tmp_links/ and a couple of similar
        adjustments.
      
      * Comment and whitespace adjustments.
      
      After the changes, we have
      
      	list_for_each_entry(link, &cont->cset_links, cset_link) {
      		struct css_set *cset = link->cset;
      
      instead of
      
      	list_for_each_entry(link, &cont->css_sets, cgrp_link_list) {
      		struct css_set *cset = link->cg;
      
      This patch is purely cosmetic.
      
      v2: Fix broken sentences in the patch description.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
      69d0206c
    • cgroup: remove now unused css_depth() · 3fc3db9a
      Committed by Tejun Heo
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
      3fc3db9a
  2. 13 June 2013, 2 commits
  3. 04 June 2013, 1 commit
    • percpu: implement generic percpu refcounting · 215e262f
      Committed by Kent Overstreet
      This implements a refcount with similar semantics to
      atomic_get()/atomic_dec_and_test() - but percpu.
      
      It also implements two stage shutdown, as we need it to tear down the
      percpu counts.  Before dropping the initial refcount, you must call
      percpu_ref_kill(); this puts the refcount in "shutting down mode" and
      switches back to a single atomic refcount with the appropriate
      barriers (synchronize_rcu()).
      
      It's also legal to call percpu_ref_kill() multiple times - it only
      returns true once, so callers don't have to reimplement shutdown
      synchronization.
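
      A minimal lifecycle sketch under these semantics (the surrounding
      object and its allocation are hypothetical):

        #include <linux/percpu-refcount.h>
        #include <linux/slab.h>

        struct my_obj {
                struct percpu_ref ref;
        };

        static void my_obj_release(struct percpu_ref *ref)
        {
                kfree(container_of(ref, struct my_obj, ref));
        }

        static struct my_obj *my_obj_create(void)
        {
                struct my_obj *obj = kzalloc(sizeof(*obj), GFP_KERNEL);

                if (!obj || percpu_ref_init(&obj->ref, my_obj_release)) {
                        kfree(obj);
                        return NULL;
                }
                return obj;                     /* holds the initial reference */
        }

        static void my_obj_shutdown(struct my_obj *obj)
        {
                percpu_ref_kill(&obj->ref);     /* stage 1: enter shutdown mode */
                percpu_ref_put(&obj->ref);      /* stage 2: drop the initial ref */
        }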
      
      [akpm@linux-foundation.org: fix build]
      [akpm@linux-foundation.org: coding-style tweak]
      Signed-off-by: Kent Overstreet <koverstreet@google.com>
      Cc: Zach Brown <zab@redhat.com>
      Cc: Felipe Balbi <balbi@ti.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Asai Thambi S P <asamymuthupa@micron.com>
      Cc: Selvan Mani <smani@micron.com>
      Cc: Sam Bradshaw <sbradshaw@micron.com>
      Cc: Jeff Moyer <jmoyer@redhat.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Benjamin LaHaise <bcrl@kvack.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Reviewed-by: "Theodore Ts'o" <tytso@mit.edu>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      215e262f
  4. 31 May 2013, 1 commit
  5. 25 May 2013, 4 commits
    • linux/kernel.h: fix kernel-doc warning · 7450231f
      Committed by Randy Dunlap
      Fix kernel-doc warning in <linux/kernel.h>:
      
        Warning(include/linux/kernel.h:590): No description found for parameter 'ip'
      
      scripts/kernel-doc cannot handle macros, functions, or function
      prototypes between the function or macro that is being documented and
      its definition, so move these prototypes above the function that is
      being documented.
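
      The pattern being applied, on a made-up example (all names below are
      illustrative):

        /* prototypes moved above the kernel-doc comment so that
         * scripts/kernel-doc does not stop scanning before it reaches
         * the documented definition */
        int helper_one(unsigned long ip);
        int helper_two(unsigned long ip);

        /**
         * traced_entry - the function carrying the kernel-doc comment
         * @ip: instruction pointer passed through to the helpers
         */
        static inline int traced_entry(unsigned long ip)
        {
                return helper_one(ip) + helper_two(ip);
        }
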
      Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      7450231f
    • wait: fix false timeouts when using wait_event_timeout() · 4c663cfc
      Committed by Imre Deak
      Many callers of the wait_event_timeout() and
      wait_event_interruptible_timeout() expect that the return value will be
      positive if the specified condition becomes true before the timeout
      elapses.  However, at the moment this isn't guaranteed.  If the wake-up
      handler is delayed enough, the time remaining until timeout will be
      calculated as 0 - and passed back as a return value - even if the
      condition became true before the timeout has passed.
      
      Fix this by returning at least 1 if the condition becomes true.  This
      semantic is in line with what wait_for_completion_timeout() does; see
      commit bb10ed09 ("sched: fix wait_for_completion_timeout() spurious
      failure under heavy load").
      
      Daniel said "We have 3 instances of this bug in drm/i915.  One case even
      where we switch between the interruptible and not interruptible
      wait_event_timeout variants, foolishly presuming they have the same
      semantics.  I very much like this."
      
      One such bug is reported at
        https://bugs.freedesktop.org/show_bug.cgi?id=64133
      Signed-off-by: Imre Deak <imre.deak@intel.com>
      Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      Acked-by: David Howells <dhowells@redhat.com>
      Acked-by: Jens Axboe <axboe@kernel.dk>
      Cc: "Paul E.  McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Dave Jones <davej@redhat.com>
      Cc: Lukas Czerner <lczerner@redhat.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      4c663cfc
    • rapidio: add enumeration/discovery start from user space · bc8fcfea
      Committed by Alexandre Bounine
      Add RapidIO enumeration/discovery start from user space.  Starting
      from user space allows the RapidIO fabric scan to be deferred until all
      participating endpoints are initialized, avoiding a mandatory
      synchronized start of all endpoints (which may be challenging in
      systems with a large number of RapidIO endpoints).
      Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
      Cc: Matt Porter <mporter@kernel.crashing.org>
      Cc: Li Yang <leoli@freescale.com>
      Cc: Kumar Gala <galak@kernel.crashing.org>
      Cc: Andre van Herk <andre.van.herk@Prodrive.nl>
      Cc: Micha Nelissen <micha.nelissen@Prodrive.nl>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      bc8fcfea
    • rapidio: make enumeration/discovery configurable · a11650e1
      Committed by Alexandre Bounine
      Systems that use a RapidIO fabric may need to implement their own
      enumeration and discovery methods which are better suited to the needs
      of the target application.
      
      The following set of patches is intended to simplify process of
      introduction of new RapidIO fabric enumeration/discovery methods.
      
      The first patch offers the ability to add new RapidIO
      enumeration/discovery methods using kernel configuration options.  This
      new configuration option mechanism allows selecting statically linked
      or modular enumeration/discovery method(s) from the list of existing
      methods, or using external module(s).
      
      This patch also updates the currently existing enumeration/discovery
      code to be used as a statically linked or modular method.
      
      The corresponding configuration option is named the "Basic
      enumeration/discovery" method.  This is the only configuration
      option available today, but new methods are expected to be introduced
      after these patches are adopted.
      
      The second patch addresses a long-standing complaint of RapidIO
      subsystem users regarding the fabric enumeration/discovery start
      sequence.  The existing implementation offers only a boot-time
      enumeration/discovery start, which requires a synchronized boot of all
      endpoints in the RapidIO network.  While this works for small closed
      configurations with a limited number of endpoints, using this approach
      in systems with a large number of endpoints is quite challenging.
      
      To eliminate the requirement for a synchronized start, the second patch
      introduces RapidIO enumeration/discovery start from user space.
      
      For compatibility with the existing RapidIO subsystem implementation,
      automatic boot-time enumeration/discovery start can be enabled by
      specifying the "rio-scan.scan=1" command line parameter when the
      statically linked basic enumeration method is selected.
      
      This patch:
      
      Rework to implement RapidIO enumeration/discovery method selection
      combined with the ability to use enumeration/discovery as a kernel module.

      This patch adds the ability to introduce new RapidIO enumeration/discovery
      methods using kernel configuration options.  The configuration option
      mechanism allows selecting a statically linked or modular
      enumeration/discovery method from the list of existing methods, or using
      external modules.  If a modular enumeration/discovery method is selected,
      each RapidIO mport device can have its own method attached to it.
      
      The existing enumeration/discovery code was updated to be used as a
      statically linked or modular method.  The configuration option for it
      is named the "Basic enumeration/discovery" method.
      
      Several common routines have been moved out of rio-scan.c to make them
      available to other enumeration methods and to reduce the number of
      exported symbols.
      Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
      Cc: Matt Porter <mporter@kernel.crashing.org>
      Cc: Li Yang <leoli@freescale.com>
      Cc: Kumar Gala <galak@kernel.crashing.org>
      Cc: Andre van Herk <andre.van.herk@Prodrive.nl>
      Cc: Micha Nelissen <micha.nelissen@Prodrive.nl>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      a11650e1
  6. 24 May 2013, 4 commits
    • cgroup: update iterators to use cgroup_next_sibling() · 75501a6d
      Committed by Tejun Heo
      This patch converts cgroup_for_each_child(),
      cgroup_next_descendant_pre/post() and thus
      cgroup_for_each_descendant_pre/post() to use cgroup_next_sibling()
      instead of manually dereferencing ->sibling.next.
      
      The only reason the iterators couldn't allow dropping the RCU read
      lock while iteration is in progress was that they couldn't determine
      the next sibling safely once the RCU read lock was dropped.  Using
      cgroup_next_sibling() removes that problem and enables all iterators
      to allow dropping the RCU read lock mid-iteration.  Comments are
      updated accordingly.
      
      This makes the iterators easier to use and will simplify controllers.
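
      A rough controller-side sketch of what this enables (the subsystem
      id, css lookup, and blocking work below are hypothetical):

        struct cgroup *child;

        rcu_read_lock();
        cgroup_for_each_child(child, parent_cgrp) {
                struct cgroup_subsys_state *css = child->subsys[my_subsys_id];

                if (!css_tryget(css))
                        continue;               /* child is being destroyed; skip it */
                rcu_read_unlock();

                do_blocking_work(css);          /* sleeping is now allowed mid-iteration */

                rcu_read_lock();
                css_put(css);
        }
        rcu_read_unlock();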
      
      Note that @cgroup argument is renamed to @cgrp in
      cgroup_for_each_child() because it conflicts with "struct cgroup" used
      in the new macro body.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
      Reviewed-by: Michal Hocko <mhocko@suse.cz>
      75501a6d
    • cgroup: add cgroup->serial_nr and implement cgroup_next_sibling() · 53fa5261
      Committed by Tejun Heo
      Currently, there's no easy way to find out the next sibling cgroup
      unless it's known that the current cgroup is accessed from the
      parent's children list in a single RCU critical section.  This in turn
      forces all iterators to require whole iteration to be enclosed in a
      single RCU critical section, which sometimes is too restrictive.  This
      patch implements cgroup_next_sibling() which can reliably determine
      the next sibling regardless of the state of the current cgroup as long
      as it's accessible.
      
      It currently is impossible to determine the next sibling after
      dropping the RCU read lock because the cgroup being iterated could be
      removed at any time; once the RCU read lock is dropped, nothing
      guarantees its ->sibling.next pointer is accessible.  A removed cgroup would
      continue to point to its next sibling for RCU accesses but stop
      receiving updates from the sibling.  IOW, the next sibling could be
      removed and then complete its grace period while RCU read lock is
      dropped, making it unsafe to dereference ->sibling.next after dropping
      and re-acquiring RCU read lock.
      
      This can be solved by adding a way to traverse to the next sibling
      without dereferencing ->sibling.next.  This patch adds a monotonically
      increasing cgroup serial number, cgroup->serial_nr, which guarantees
      that all cgroup->children lists are kept in increasing serial_nr
      order.  A new function, cgroup_next_sibling(), is implemented, which,
      if CGRP_REMOVED is not set on the current cgroup, follows
      ->sibling.next; otherwise, traverses the parent's ->children list
      until it sees a sibling with higher ->serial_nr.
      
      This allows the function to always return the next sibling regardless
      of the state of the current cgroup without adding overhead in the fast
      path.
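
      Condensed, the traversal logic described above looks roughly like
      this (simplified from the actual patch):

        struct cgroup *next_sibling(struct cgroup *pos)
        {
                struct cgroup *next;

                if (!test_bit(CGRP_REMOVED, &pos->flags)) {
                        /* pos is still on its parent's ->children list */
                        next = list_entry_rcu(pos->sibling.next,
                                              struct cgroup, sibling);
                        return &next->sibling != &pos->parent->children ? next : NULL;
                }

                /* pos was removed: find the first sibling with a higher serial_nr */
                list_for_each_entry_rcu(next, &pos->parent->children, sibling)
                        if (next->serial_nr > pos->serial_nr)
                                return next;
                return NULL;
        }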
      
      Further patches will update the iterators to use cgroup_next_sibling()
      so that they allow dropping RCU read lock and blocking while iteration
      is in progress which in turn will be used to simplify controllers.
      
      v2: Typo fix as per Serge.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
      53fa5261
    • cgroup: make cgroup_is_removed() static · bdc7119f
      Committed by Tejun Heo
      cgroup_is_removed() no longer has external users and it shouldn't grow
      any - controllers should deal with cgroup_subsys_state on/offline
      state instead of cgroup removal state.  Make it static.
      
      While at it, make it return bool.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      bdc7119f
    • cgroup: fix a subtle bug in descendant pre-order walk · 7805d000
      Committed by Tejun Heo
      When cgroup_next_descendant_pre() initiates a walk, it checks whether
      the subtree root has any children and, if not, returns NULL.  Later
      code assumes that the subtree isn't empty.  This is broken because the
      subtree may become empty in between, which can lead to the
      traversal escaping the subtree by walking to the sibling of the
      subtree root.
      
      There's no reason to have the early exit path.  Remove it along with
      the later assumption that the subtree isn't empty.  This simplifies
      the code a bit and fixes the subtle bug.
      
      While at it, fix the comment of cgroup_for_each_descendant_pre() which
      was incorrectly referring to ->css_offline() instead of
      ->css_online().
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Michal Hocko <mhocko@suse.cz>
      Cc: stable@vger.kernel.org
      7805d000
  7. 22 May 2013, 1 commit
    • Add include dependencies to <linux/printk.h>. · 154c2670
      Committed by Ralf Baechle
      If <linux/linkage.h> has not been included before <linux/printk.h>,
      a build error like the one below will result:
      
        CC      arch/mips/kernel/idle.o
      In file included from arch/mips/kernel/idle.c:17:0:
      include/linux/printk.h:109:1: error: data definition has no type or storage class [-Werror]
      include/linux/printk.h:109:1: error: type defaults to ‘int’ in declaration of ‘asmlinkage’ [-Werror=implicit-int]
      include/linux/printk.h:110:1: error: ‘format’ attribute only applies to function types [-Werror=attributes]
      include/linux/printk.h:110:1: error: expected ‘,’ or ‘;’ before ‘int’
      include/linux/printk.h:114:1: error: data definition has no type or storage class [-Werror]
      include/linux/printk.h:114:1: error: type defaults to ‘int’ in declaration of ‘asmlinkage’ [-Werror=implicit-int]
      include/linux/printk.h:115:1: error: ‘format’ attribute only applies to function types [-Werror=attributes]
      include/linux/printk.h:115:1: error: expected ‘,’ or ‘;’ before ‘int’
      include/linux/printk.h:117:1: error: data definition has no type or storage class [-Werror]
      include/linux/printk.h:117:1: error: type defaults to ‘int’ in declaration of ‘asmlinkage’ [-Werror=implicit-int]
      include/linux/printk.h:118:1: error: ‘format’ attribute only applies to function types [-Werror=attributes]
      include/linux/printk.h:118:1: error: ‘__cold__’ attribute ignored [-Werror=attributes]
      include/linux/printk.h:118:1: error: expected ‘,’ or ‘;’ before ‘asmlinkage’
      include/linux/printk.h:122:1: error: data definition has no type or storage class [-Werror]
      include/linux/printk.h:122:1: error: type defaults to ‘int’ in declaration of ‘asmlinkage’ [-Werror=implicit-int]
      include/linux/printk.h:123:1: error: ‘format’ attribute only applies to function types [-Werror=attributes]
      include/linux/printk.h:123:1: error: ‘__cold__’ attribute ignored [-Werror=attributes]
      include/linux/printk.h:123:1: error: expected ‘,’ or ‘;’ before ‘int’
      In file included from include/linux/kernel.h:14:0,
                       from include/linux/sched.h:15,
                       from arch/mips/kernel/idle.c:18:
      include/linux/dynamic_debug.h: In function ‘ddebug_dyndbg_module_param_cb’:
      include/linux/dynamic_debug.h:124:3: error: implicit declaration of function ‘printk’ [-Werror=implicit-function-declaration]
      
      Fixed by including <linux/linkage.h>.
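
      In other words, the header gains something like the following near
      its top (the printk() declaration shown is one of the asmlinkage
      users the errors above point at):

        /* in include/linux/printk.h */
        #include <linux/linkage.h>      /* provides asmlinkage for the declarations below */

        asmlinkage __printf(1, 2) __cold
        int printk(const char *fmt, ...);
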
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
      154c2670
  8. 21 May 2013, 1 commit
    • tty/vt: Fix vc_deallocate() lock order · 421b40a6
      Committed by Peter Hurley
      Now that the tty port owns the flip buffers and i/o is allowed
      from the driver even when no tty is attached, the destruction
      of the tty port (and the flip buffers) must ensure that no
      outstanding work is pending.
      
      Unfortunately, this creates a lock order problem with the
      console_lock (see attached lockdep report [1] below).
      
      For single-console deallocation, drop the console_lock prior
      to port destruction.  For multiple-console deallocation,
      defer port destruction until the consoles have been
      deallocated.
      
      tty_port_destroy() is not required if the port has not
      been used; remove it from the vc_allocate() failure path.
      
      [1] lockdep report from Dave Jones <davej@redhat.com>
      
       ======================================================
       [ INFO: possible circular locking dependency detected ]
       3.9.0+ #16 Not tainted
       -------------------------------------------------------
       (agetty)/26163 is trying to acquire lock:
       blocked:  ((&buf->work)){+.+...}, instance: ffff88011c8b0020, at: [<ffffffff81062065>] flush_work+0x5/0x2e0
      
       but task is already holding lock:
       blocked:  (console_lock){+.+.+.}, instance: ffffffff81c2fde0, at: [<ffffffff813bc201>] vt_ioctl+0xb61/0x1230
      
       which lock already depends on the new lock.
      
       the existing dependency chain (in reverse order) is:
      
       -> #1 (console_lock){+.+.+.}:
              [<ffffffff810b3f74>] lock_acquire+0xa4/0x210
              [<ffffffff810416c7>] console_lock+0x77/0x80
              [<ffffffff813c3dcd>] con_flush_chars+0x2d/0x50
              [<ffffffff813b32b2>] n_tty_receive_buf+0x122/0x14d0
              [<ffffffff813b7709>] flush_to_ldisc+0x119/0x170
              [<ffffffff81064381>] process_one_work+0x211/0x700
              [<ffffffff8106498b>] worker_thread+0x11b/0x3a0
              [<ffffffff8106ce5d>] kthread+0xed/0x100
              [<ffffffff81601cac>] ret_from_fork+0x7c/0xb0
      
       -> #0 ((&buf->work)){+.+...}:
              [<ffffffff810b349a>] __lock_acquire+0x193a/0x1c00
              [<ffffffff810b3f74>] lock_acquire+0xa4/0x210
              [<ffffffff810620ae>] flush_work+0x4e/0x2e0
              [<ffffffff81065305>] __cancel_work_timer+0x95/0x130
              [<ffffffff810653b0>] cancel_work_sync+0x10/0x20
              [<ffffffff813b8212>] tty_port_destroy+0x12/0x20
              [<ffffffff813c65e8>] vc_deallocate+0xf8/0x110
              [<ffffffff813bc20c>] vt_ioctl+0xb6c/0x1230
              [<ffffffff813b01a5>] tty_ioctl+0x285/0xd50
              [<ffffffff811ba825>] do_vfs_ioctl+0x305/0x530
              [<ffffffff811baad1>] sys_ioctl+0x81/0xa0
              [<ffffffff81601d59>] system_call_fastpath+0x16/0x1b
      
       other info that might help us debug this:
      
       [ 6760.076175]  Possible unsafe locking scenario:
      
              CPU0                    CPU1
              ----                    ----
         lock(console_lock);
                                      lock((&buf->work));
                                      lock(console_lock);
         lock((&buf->work));
      
        *** DEADLOCK ***
      
       1 lock on stack by (agetty)/26163:
        #0: blocked:  (console_lock){+.+.+.}, instance: ffffffff81c2fde0, at: [<ffffffff813bc201>] vt_ioctl+0xb61/0x1230
       stack backtrace:
       Pid: 26163, comm: (agetty) Not tainted 3.9.0+ #16
       Call Trace:
        [<ffffffff815edb14>] print_circular_bug+0x200/0x20e
        [<ffffffff810b349a>] __lock_acquire+0x193a/0x1c00
        [<ffffffff8100a269>] ? sched_clock+0x9/0x10
        [<ffffffff8100a269>] ? sched_clock+0x9/0x10
        [<ffffffff8100a200>] ? native_sched_clock+0x20/0x80
        [<ffffffff810b3f74>] lock_acquire+0xa4/0x210
        [<ffffffff81062065>] ? flush_work+0x5/0x2e0
        [<ffffffff810620ae>] flush_work+0x4e/0x2e0
        [<ffffffff81062065>] ? flush_work+0x5/0x2e0
        [<ffffffff810b15db>] ? mark_held_locks+0xbb/0x140
        [<ffffffff8113c8a3>] ? __free_pages_ok.part.57+0x93/0xc0
        [<ffffffff810b15db>] ? mark_held_locks+0xbb/0x140
        [<ffffffff810652f2>] ? __cancel_work_timer+0x82/0x130
        [<ffffffff81065305>] __cancel_work_timer+0x95/0x130
        [<ffffffff810653b0>] cancel_work_sync+0x10/0x20
        [<ffffffff813b8212>] tty_port_destroy+0x12/0x20
        [<ffffffff813c65e8>] vc_deallocate+0xf8/0x110
        [<ffffffff813bc20c>] vt_ioctl+0xb6c/0x1230
        [<ffffffff810aec41>] ? lock_release_holdtime.part.30+0xa1/0x170
        [<ffffffff813b01a5>] tty_ioctl+0x285/0xd50
        [<ffffffff812b00f6>] ? inode_has_perm.isra.46.constprop.61+0x56/0x80
        [<ffffffff811ba825>] do_vfs_ioctl+0x305/0x530
        [<ffffffff812b04db>] ? selinux_file_ioctl+0x5b/0x110
        [<ffffffff811baad1>] sys_ioctl+0x81/0xa0
        [<ffffffff81601d59>] system_call_fastpath+0x16/0x1b
      
      Cc: Dave Jones <davej@redhat.com>
      Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      421b40a6
  9. 20 May 2013, 2 commits
  10. 18 May 2013, 2 commits
  11. 17 May 2013, 4 commits
  12. 16 May 2013, 1 commit
  13. 15 May 2013, 5 commits
  14. 14 May 2013, 1 commit