1. 27 Sep 2013, 1 commit
    • sysfs: Allow mounting without CONFIG_NET · 667b4102
      Eric W. Biederman authored
      In kobj_ns_current_may_mount the default should be to allow the
      mount.  The test is only for a single kobj_ns_type at a time, and
      unless there is a reason to prevent it, mounting sysfs should be
      allowed.  Subsystems that are not registered are not involved, and
      so cannot have a reason to prevent mounting sysfs.
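
      A minimal sketch of the fixed helper, assuming it looks roughly like
      kobj_ns_current_may_mount() in lib/kobject.c (table and lock names as
      used there):

        bool kobj_ns_current_may_mount(enum kobj_ns_type type)
        {
                /* Default to allowing the mount ... */
                bool may_mount = true;

                spin_lock(&kobj_ns_type_lock);
                /* ... and only let a *registered* subsystem veto it. */
                if ((type > KOBJ_NS_TYPE_NONE) && (type < KOBJ_NS_TYPES) &&
                    kobj_ns_ops_tbl[type])
                        may_mount = kobj_ns_ops_tbl[type]->current_may_mount();
                spin_unlock(&kobj_ns_type_lock);

                return may_mount;
        }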
      
      This is a bug-fix to:
          commit 7dc5dbc8
          Author: Eric W. Biederman <ebiederm@xmission.com>
          Date:   Mon Mar 25 20:07:01 2013 -0700
      
              sysfs: Restrict mounting sysfs
      
              Don't allow mounting sysfs unless the caller has CAP_SYS_ADMIN rights
              over the net namespace.  The principle here is if you create or have
              capabilities over it you can mount it, otherwise you get to live with
              what other people have mounted.
      
              Instead of testing this with a straight forward ns_capable call,
              perform this check the long and torturous way with kobject helpers,
              this keeps direct knowledge of namespaces out of sysfs, and preserves
              the existing sysfs abstractions.
      Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      
      That came in via the userns tree during the 3.12 merge window.
      Reported-by: James Hogan <james.hogan@imgtec.com>
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  2. 21 Sep 2013, 1 commit
    • lockref: use cmpxchg64 explicitly for lockless updates · 8f4c3446
      Will Deacon authored
      The cmpxchg() function tends not to support 64-bit arguments on 32-bit
      architectures.  This could be either due to use of unsigned long
      arguments (like on ARM) or lack of instruction support (cmpxchgq on
      x86).  However, these architectures may implement a specific cmpxchg64()
      function to provide 64-bit cmpxchg support instead.
      
      Since the lockref code requires a 64-bit cmpxchg and relies on the
      architecture selecting ARCH_USE_CMPXCHG_LOCKREF, move to using cmpxchg64
      instead of cmpxchg and allow 32-bit architectures to make use of the
      lockless lockref implementation.
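
      A simplified rendering of lockref's update loop after this change
      (condensed from the CMPXCHG_LOOP() macro in lib/lockref.c, with the
      per-operation CODE/SUCCESS fragments inlined for lockref_get()):

        struct lockref old;

        old.lock_count = ACCESS_ONCE(lockref->lock_count);
        while (likely(arch_spin_value_unlocked(old.lock.rlock.raw_lock))) {
                struct lockref new = old, prev = old;

                new.count++;    /* the lockref_get() update */

                /*
                 * cmpxchg64() rather than cmpxchg(): lock_count is a u64,
                 * and 32-bit architectures may only provide 64-bit
                 * compare-and-exchange via an explicit cmpxchg64().
                 */
                old.lock_count = cmpxchg64(&lockref->lock_count,
                                           old.lock_count, new.lock_count);
                if (likely(old.lock_count == prev.lock_count))
                        return;         /* lockless update succeeded */
        }
        /* otherwise fall back to taking lockref->lock */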
      
      Cc: Waiman Long <Waiman.Long@hp.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  3. 13 Sep 2013, 1 commit
  4. 12 Sep 2013, 11 commits
  5. 10 Sep 2013, 1 commit
    • idr: Percpu ida · 798ab48e
      Kent Overstreet authored
      Percpu frontend for allocating ids. With percpu allocation (that works),
      it's impossible to guarantee it will always be possible to allocate all
      nr_tags - typically, some will be stuck on a remote percpu freelist
      where the current job can't get to them.
      
      We do guarantee that it will always be possible to allocate at least
      (nr_tags / 2) tags - this is done by keeping track of which and how many
      cpus have tags on their percpu freelists. On allocation failure if
      enough cpus have tags that there could potentially be (nr_tags / 2) tags
      stuck on remote percpu freelists, we then pick a remote cpu at random to
      steal from.
      
      Note that there's no cpu hotplug notifier - we don't care, because
      steal_tags() will eventually get the down cpu's tags. We _could_ satisfy
      more allocations if we had a notifier - but we'll still meet our
      guarantees and it's absolutely not a correctness issue, so I don't think
      it's worth the extra code.
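
      A hedged usage sketch of the API as introduced (signatures per
      include/linux/percpu_ida.h at the time; percpu_ida_alloc() later
      grew different arguments):

        #include <linux/percpu_ida.h>

        static struct percpu_ida tag_pool;

        static int tag_pool_example(void)
        {
                int tag;

                if (percpu_ida_init(&tag_pool, 128))    /* nr_tags = 128 */
                        return -ENOMEM;

                /*
                 * With GFP_KERNEL this may sleep until a tag is free; a
                 * non-waiting gfp instead returns a negative errno when
                 * no tag can be found.
                 */
                tag = percpu_ida_alloc(&tag_pool, GFP_KERNEL);
                if (tag >= 0)
                        percpu_ida_free(&tag_pool, tag);

                percpu_ida_destroy(&tag_pool);
                return 0;
        }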
      
      From akpm:
      
          "It looks OK to me (that's as close as I get to an ack :))
      
      v6 changes:
        - Add #include <linux/cpumask.h> to include/linux/percpu_ida.h to
          make alpha/arc builds happy (Fengguang)
        - Move second (cpu >= nr_cpu_ids) check inside of first check scope
          in steal_tags() (akpm + nab)
      
      v5 changes:
        - Change percpu_ida->cpus_have_tags to cpumask_t (kmo + akpm)
        - Add comment for percpu_ida_cpu->lock + ->nr_free (kmo + akpm)
        - Convert steal_tags() to use cpumask_weight() + cpumask_next() +
          cpumask_first() + cpumask_clear_cpu() (kmo + akpm)
        - Add comment for alloc_global_tags() (kmo + akpm)
        - Convert percpu_ida_alloc() to use cpumask_set_cpu() (kmo + akpm)
        - Convert percpu_ida_free() to use cpumask_set_cpu() (kmo + akpm)
        - Drop percpu_ida->cpus_have_tags allocation in percpu_ida_init()
          (kmo + akpm)
        - Drop percpu_ida->cpus_have_tags kfree in percpu_ida_destroy()
          (kmo + akpm)
        - Add comment for percpu_ida_alloc @ gfp (kmo + akpm)
        - Move to percpu_ida.c + percpu_ida.h (kmo + akpm + nab)
      
      v4 changes:
      
        - Fix tags.c reference in percpu_ida_init (akpm)
      Signed-off-by: Kent Overstreet <kmo@daterainc.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: "Nicholas A. Bellinger" <nab@linux-iscsi.org>
      Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
  6. 08 Sep 2013, 2 commits
    • lockref: add ability to mark lockrefs "dead" · e7d33bb5
      Linus Torvalds authored
      The only actual current lockref user (dcache) uses zero reference counts
      even for perfectly live dentries, because it's a cache: there may not be
      any users, but that doesn't mean that we want to throw away the dentry.
      
      At the same time, the dentry cache does have a notion of a truly "dead"
      dentry that we must not even increment the reference count of, because
      we have pruned it and it is not valid.
      
      Currently that distinction is not visible in the lockref itself, and the
      dentry cache validation uses "lockref_get_or_lock()" to either get a new
      reference to a dentry that already had existing references (and thus
      cannot be dead), or get the dentry lock so that we can then verify the
      dentry and increment the reference count under the lock if that
      verification was successful.
      
      That's all somewhat complicated.
      
      This adds the concept of being "dead" to the lockref itself, by simply
      using a count that is negative.  This allows a usage scenario where we
      can increment the refcount of a dentry without having to validate it,
      and pushing the special "we killed it" case into the lockref code.
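
      Concretely, "dead" is just a negative sentinel count.  A sketch of
      the two new helpers, showing only the locked slow paths (the in-tree
      versions in lib/lockref.c also have lockless cmpxchg fast paths):

        void lockref_mark_dead(struct lockref *lockref)
        {
                assert_spin_locked(&lockref->lock);
                lockref->count = -128;          /* negative count == dead */
        }

        int lockref_get_not_dead(struct lockref *lockref)
        {
                int retval = 0;

                spin_lock(&lockref->lock);
                if (lockref->count >= 0) {      /* never resurrect the dead */
                        lockref->count++;
                        retval = 1;
                }
                spin_unlock(&lockref->lock);
                return retval;
        }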
      
      The dentry code itself doesn't actually use this yet, and it's probably
      too late in the merge window to do that code (the dentry_kill() code
      with its "should I decrement the count" logic really is pretty complex
      code), but let's introduce the concept at the lockref level now.
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • lockref: fix docbook argument names · 44a0cf92
      Linus Torvalds authored
      The code got rewritten, but the comments got copied as-is from older
      versions, and as a result the argument name in the comment didn't
      actually match the code any more.
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  7. 07 Sep 2013, 1 commit
  8. 05 Sep 2013, 1 commit
    • Kconfig.debug: Add FRAME_POINTER anti-dependency for ARC · cc80ae38
      Vineet Gupta authored
      The frame pointer on ARC doesn't serve the conventional purpose of
      stack unwinding, due to the way the ABI designates its usage.
      Thus its explicit usage on ARC is discouraged (gcc remains free to
      use it for some tricky stack frames, even with -fomit-frame-pointer).
      
      Hence no point enabling it for ARC.
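
      In Kconfig terms the anti-dependency is simply a negated architecture
      symbol on FRAME_POINTER's depends line; an illustrative sketch, not
      the verbatim lib/Kconfig.debug hunk:

        config FRAME_POINTER
                bool "Compile the kernel with frame pointers"
                depends on DEBUG_KERNEL && !ARC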
      
      References: http://www.spinics.net/lists/kernel/msg1593937.html
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: "Paul E. McKenney" <paul.mckenney@linaro.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Michel Lespinasse <walken@google.com>
      Cc: linux-kernel@vger.kernel.org
  9. 04 Sep 2013, 2 commits
  10. 03 Sep 2013, 2 commits
    • lockref: implement lockless reference count updates using cmpxchg() · bc08b449
      Linus Torvalds authored
      Instead of taking the spinlock, the lockless versions atomically check
      that the lock is not taken, and do the reference count update using a
      cmpxchg() loop.  This is semantically identical to doing the reference
      count update protected by the lock, but avoids the "wait for lock"
      contention that you get when accesses to the reference count are
      contended.
      
      Note that a "lockref" is absolutely _not_ equivalent to an atomic_t.
      Even when the lockref reference counts are updated atomically with
      cmpxchg, the fact that they also verify the state of the spinlock means
      that the lockless updates can never happen while somebody else holds the
      spinlock.
      
      So while "lockref_put_or_lock()" looks a lot like just another name for
      "atomic_dec_and_lock()", and both optimize to lockless updates, they are
      fundamentally different: the decrement done by atomic_dec_and_lock() is
      truly independent of any lock (as long as it doesn't decrement to zero),
      so a locked region can still see the count change.
      
      The lockref structure, in contrast, really is a *locked* reference
      count.  If you hold the spinlock, the reference count will be stable and
      you can modify the reference count without using atomics, because even
      the lockless updates will see and respect the state of the lock.
      
      In order to enable the cmpxchg lockless code, the architecture needs to
      do three things:
      
       (1) Make sure that the "arch_spinlock_t" and an "unsigned int" can fit
           in an aligned u64, and have a "cmpxchg()" implementation that works
           on such a u64 data type.
      
       (2) Define a helper function to test for a spinlock being unlocked
           ("arch_spin_value_unlocked()").

       (3) Select the "ARCH_USE_CMPXCHG_LOCKREF" config variable in its
           Kconfig file.
      
      This enables it for x86-64 (but not 32-bit, we'd need to make sure
      cmpxchg() turns into the proper cmpxchg8b in order to enable it for
      32-bit mode).
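
      Requirement (1) is visible in the lockref type itself; a sketch of
      the union (per include/linux/lockref.h, with CMPXCHG_LOCKREF derived
      from ARCH_USE_CMPXCHG_LOCKREF in lib/Kconfig):

        struct lockref {
                union {
        #ifdef CONFIG_CMPXCHG_LOCKREF
                        /* spinlock and count overlaid in one aligned u64,
                         * so a single cmpxchg() can update them together */
                        aligned_u64 lock_count;
        #endif
                        struct {
                                spinlock_t lock;
                                unsigned int count;
                        };
                };
        };
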
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • lockref: uninline lockref helper functions · 2f4f12e5
      Linus Torvalds authored
      They aren't very good to inline, since they already call external
      functions (the spinlock code), and we're going to create rather more
      complicated versions of them that can do the reference count updates
      locklessly.
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  11. 29 Aug 2013, 2 commits
  12. 27 Aug 2013, 2 commits
  13. 24 Aug 2013, 1 commit
  14. 23 Aug 2013, 1 commit
    • math64: New separate div64_u64_rem helper · eb18cba7
      Mike Snitzer authored
      Commit f7926850 ("math64: New
      div64_u64_rem helper") implemented div64_u64 in terms of div64_u64_rem.
      But div64_u64_rem was removed because it slowed down div64_u64 (and
      there were no other users of div64_u64_rem).
      
      Device Mapper's I/O statistics support has a need for div64_u64_rem;
      reintroduce this helper as a separate method that doesn't slow down
      div64_u64, especially on 32-bit systems.
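
      On 64-bit systems the reintroduced helper is trivial; a sketch
      consistent with the include/linux/math64.h definition (32-bit gets an
      out-of-line version in lib/div64.c so div64_u64() itself stays fast):

        static inline u64 div64_u64_rem(u64 dividend, u64 divisor,
                                        u64 *remainder)
        {
                *remainder = dividend % divisor;
                return dividend / divisor;
        }
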
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Cc: Stanislaw Gruszka <sgruszka@redhat.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
  15. 19 Aug 2013, 1 commit
    • debugobjects: Make debug_object_activate() return status · b778ae25
      Paul E. McKenney authored
      In order to better respond to things like duplicate invocations
      of call_rcu(), RCU needs to see the status of a call to
      debug_object_activate().  This would allow RCU to leak the callback in
      order to avoid adding freelist-reuse mischief to the duplicate invocations.
      This commit therefore makes debug_object_activate() return status,
      zero for success and -EINVAL for failure.
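
      A sketch of how a caller can now react, modeled on the duplicate
      call_rcu() scenario described above (rcuhead_debug_descr is RCU's
      debug-objects descriptor):

        if (debug_object_activate(head, &rcuhead_debug_descr)) {
                /*
                 * -EINVAL: this rcu_head is already queued, i.e. a
                 * duplicate call_rcu().  Leak the callback rather than
                 * risk freelist-reuse corruption.
                 */
                WARN_ONCE(1, "duplicate call_rcu()");
                return;
        }
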
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Cc: Sedat Dilek <sedat.dilek@gmail.com>
      Cc: Davidlohr Bueso <davidlohr.bueso@hp.com>
      Cc: Rik van Riel <riel@surriel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
  16. 15 Aug 2013, 1 commit
  17. 09 Aug 2013, 1 commit
  18. 07 Aug 2013, 1 commit
  19. 26 Jul 2013, 1 commit
    • kobject: delayed kobject release: help find buggy drivers · c817a67e
      Russell King authored
      Implement debugging for kobject release functions.  kobjects are
      reference counted, so the drop of the last reference to them is not
      predictable. However, the common case is for the last reference to be
      the kobject's removal from a subsystem, which results in the release
      function being immediately called.
      
      This can hide subtle bugs, which can occur when another thread holds a
      reference to the kobject at the same time that a kobject is removed.
      This results in the release method being delayed.
      
      In order to make these kinds of problems more visible, the following
      patch implements a delayed release; this has the effect that the
      release function will be out of order with respect to the removal of
      the kobject in the same manner that it would be if a reference was
      being held.
      
      This provides us with an easy way to allow driver writers to debug
      their drivers and fix otherwise hidden problems.
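
      A sketch of the mechanism (cf. kobject_release() in lib/kobject.c
      under CONFIG_DEBUG_KOBJECT_RELEASE; the exact message and delay may
      differ):

        static void kobject_release(struct kref *kref)
        {
                struct kobject *kobj = container_of(kref, struct kobject, kref);
        #ifdef CONFIG_DEBUG_KOBJECT_RELEASE
                pr_info("kobject: '%s' (%p): %s, parent %p (delayed)\n",
                        kobject_name(kobj), kobj, __func__, kobj->parent);
                /* Defer the cleanup, so code that wrongly assumes release()
                 * runs synchronously with removal fails visibly. */
                INIT_DELAYED_WORK(&kobj->release, kobject_delayed_release);
                schedule_delayed_work(&kobj->release, HZ);
        #else
                kobject_cleanup(kobj);
        #endif
        }
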
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  20. 24 Jul 2013, 1 commit
  21. 20 Jul 2013, 1 commit
  22. 15 Jul 2013, 1 commit
    • kernel: delete __cpuinit usage from all core kernel files · 0db0628d
      Paul Gortmaker authored
      The __cpuinit type of throwaway sections might have made sense
      some time ago when RAM was more constrained, but now the savings
      do not offset the cost and complications.  For example, the fix in
      commit 5e427ec2 ("x86: Fix bit corruption at CPU resume time")
      is a good example of the nasty type of bugs that can be created
      with improper use of the various __init prefixes.
      
      After a discussion on LKML[1] it was decided that cpuinit should go
      the way of devinit and be phased out.  Once all the users are gone,
      we can then finally remove the macros themselves from linux/init.h.
      
      This removes all the uses of the __cpuinit macros from C files in
      the core kernel directories (kernel, init, lib, mm, and include)
      that don't really have a specific maintainer.
      
      [1] https://lkml.org/lkml/2013/5/20/589
      Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
  23. 13 Jul 2013, 1 commit
    • llist: fix/simplify llist_add() and llist_add_batch() · fb4214db
      Oleg Nesterov authored
      1. This is mostly theoretical, but llist_add*() need ACCESS_ONCE().
      
         Otherwise it is not guaranteed that the first cmpxchg() uses the
         same value for old_entry and new_last->next.
      
      2. These helpers cache the result of cmpxchg() and read the initial
         value of head->first before the main loop. I do not think this
         makes sense. In the likely case cmpxchg() succeeds, otherwise
         it doesn't hurt to reload head->first.
      
         I think it would be better to simplify the code and simply read
         ->first before cmpxchg(); see the sketch below.
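
      The simplified helper then reads roughly as in lib/llist.c after this
      change (llist_add() follows the same pattern for a single node):

        bool llist_add_batch(struct llist_node *new_first,
                             struct llist_node *new_last,
                             struct llist_head *head)
        {
                struct llist_node *first;

                do {
                        /* ACCESS_ONCE() guarantees the value stored in
                         * new_last->next and the value passed to cmpxchg()
                         * as "old" really are the same ... */
                        new_last->next = first = ACCESS_ONCE(head->first);
                        /* ... and ->first is (re)read directly before each
                         * cmpxchg() attempt. */
                } while (cmpxchg(&head->first, first, new_first) != first);

                return !first;  /* true if the list was previously empty */
        }
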
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Andrey Vagin <avagin@openvz.org>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Huang Ying <ying.huang@intel.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  24. 12 Jul 2013, 1 commit
  25. 10 Jul 2013, 1 commit