1. 25 September 2014 (5 commits)
  2. 24 September 2014 (1 commit)
    • blk-mq, percpu_ref: implement a kludge for SCSI blk-mq stall during probe · 0a30288d
      Committed by Tejun Heo
      blk-mq uses percpu_ref for its usage counter, which tracks the number
      of in-flight commands and is used to synchronously drain the queue on
      freeze.  percpu_ref shutdown takes measurable wallclock time as it
      involves a sched RCU grace period.  This means that draining a blk-mq
      queue takes measurable wallclock time.  One would think this shouldn't
      matter, as queue shutdown should be a rare event which takes place
      asynchronously w.r.t. userland.
      
      Unfortunately, SCSI probing involves synchronously setting up and then
      tearing down a lot of request_queues back-to-back for non-existent
      LUNs.  This means that SCSI probing may take more than ten seconds
      when scsi-mq is used.
      
      This will be properly fixed by implementing a mechanism to keep
      q->mq_usage_counter in atomic mode till genhd registration; however,
      that involves rather big updates to percpu_ref which are difficult to
      apply late in the devel cycle (v3.17-rc6 at the moment).  As a
      stop-gap measure till the proper fix can be implemented in the next
      cycle, this patch introduces __percpu_ref_kill_expedited() and makes
      blk_mq_freeze_queue() use it.  This is heavy-handed but should work
      for testing the experimental SCSI blk-mq implementation.
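      
      As a rough, hedged sketch of the stop-gap described above (not the
      actual patch; the freeze-depth bookkeeping and the drain wait around
      the call are only indicated in comments), blk_mq_freeze_queue() simply
      switches from the normal kill to the expedited variant:
      
      	void blk_mq_freeze_queue(struct request_queue *q)
      	{
      		/* ... bump the freeze depth under q->queue_lock ... */
      
      		/* was: percpu_ref_kill(&q->mq_usage_counter); */
      		__percpu_ref_kill_expedited(&q->mq_usage_counter);
      
      		/* ... run the hw queues and wait for in-flight requests
      		 * to drain before returning ... */
      	}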
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-by: Christoph Hellwig <hch@infradead.org>
      Link: http://lkml.kernel.org/g/20140919113815.GA10791@lst.de
      Fixes: add703fd ("blk-mq: use percpu_ref for mq usage count")
      Cc: Kent Overstreet <kmo@daterainc.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Tested-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@fb.com>
  3. 20 September 2014 (3 commits)
    • percpu-refcount: make percpu_ref based on longs instead of ints · e625305b
      Committed by Tejun Heo
      percpu_ref is currently based on ints, so the number of refs it can
      cover is (1 << 31).  This makes it impossible to use a percpu_ref to
      count memory objects or pages on 64-bit machines, as it may overflow.
      This forces those users to somehow aggregate the references before
      contributing to the percpu_ref, which is often cumbersome and makes it
      challenging to reach the same level of performance as using the
      percpu_ref directly.
      
      While using ints for the percpu counters makes them pack tighter on
      64-bit machines, the possible gain from using ints instead of longs is
      extremely small compared to the overall gain from per-cpu operation.
      This patch makes percpu_ref based on longs so that it can be used to
      directly count memory objects or pages.
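      
      To make the overflow concrete, here is a small self-contained
      illustration (plain userspace C; the numbers are only for
      illustration): counting 4 KiB pages on a machine with 8 TiB of memory
      already needs one more than INT_MAX, while a long on a 64-bit machine
      has ample headroom.
      
      	#include <limits.h>
      	#include <stdio.h>
      
      	int main(void)
      	{
      		/* 8 TiB of 4 KiB pages = 2^31 pages, one more than INT_MAX */
      		unsigned long pages_in_8tib = (8UL << 40) / 4096;
      
      		printf("pages in 8 TiB: %lu\n", pages_in_8tib); /* 2147483648 */
      		printf("INT_MAX       : %d\n", INT_MAX);        /* 2147483647 */
      		printf("LONG_MAX      : %ld\n", LONG_MAX);      /* 2^63 - 1 on 64-bit */
      		return 0;
      	}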
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Kent Overstreet <kmo@daterainc.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
    • percpu-refcount: improve WARN messages · 4843c332
      Committed by Tejun Heo
      percpu_ref's WARN messages can be a lot more helpful by indicating
      who the culprit is.  Make them report the release function that the
      offending percpu-refcount is associated with.  This should make it a
      lot easier to track down the reported invalid refcounting operations.
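      
      A hedged sketch of the kind of WARN this describes (the condition,
      wording and variable names are illustrative, not copied from the
      patch; %pf is the kernel's format specifier for function pointers):
      
      	/* before: no hint about which refcount misbehaved */
      	WARN_ONCE(count == 0, "percpu ref underflow");
      
      	/* after: name the release callback so the owner is identifiable */
      	WARN_ONCE(count == 0,
      		  "percpu ref underflow on ref with release function %pf!",
      		  ref->release);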
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Kent Overstreet <kmo@daterainc.com>
    • lib: rhashtable: remove second linux/log2.h inclusion · b3f2512e
      Committed by Fabian Frederick
      linux/log2.h was included twice.
      Signed-off-by: Fabian Frederick <fabf@skynet.be>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  4. 14 September 2014 (1 commit)
    • Make ARCH_HAS_FAST_MULTIPLIER a real config variable · 72d93104
      Committed by Linus Torvalds
      It used to be an ad-hoc hack defined by the x86 version of
      <asm/bitops.h> that enabled a couple of library routines to know whether
      an integer multiply is faster than repeated shifts and additions.
      
      This just makes it use the real Kconfig system instead, and makes x86
      (which was the only architecture that did this) select the option.
      
      NOTE! Even for x86, this really is kind of wrong.  If we cared, we would
      probably not enable this for builds optimized for netburst (P4), where
      shifts-and-adds are generally faster than multiplies.  This patch does
      *not* change that kind of logic, though; it is purely a syntactic change
      with no code changes.
      
      This was triggered by the fact that we have other places that really
      want to know "do I want to expand multiplications by constants by hand
      or not", particularly the hash generation code.
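      
      For context, the option gates code along the lines of the
      population-count helpers in lib/hweight.c; the sketch below is
      illustrative of the pattern rather than a verbatim copy.  With a fast
      multiplier the per-byte partial sums are combined with one multiply,
      otherwise with a few more shifts and adds.
      
      	unsigned int hweight32_sketch(unsigned int w)
      	{
      		w -= (w >> 1) & 0x55555555;                      /* 2-bit sums */
      		w  = (w & 0x33333333) + ((w >> 2) & 0x33333333); /* 4-bit sums */
      		w  = (w + (w >> 4)) & 0x0f0f0f0f;                /* byte sums  */
      	#ifdef CONFIG_ARCH_HAS_FAST_MULTIPLIER
      		return (w * 0x01010101) >> 24;  /* one multiply adds the bytes */
      	#else
      		w += w >> 8;                    /* shifts and adds instead     */
      		w += w >> 16;
      		return w & 0xff;
      	#endif
      	}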
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  5. 12 September 2014 (1 commit)
    • KEYS: Fix termination condition in assoc array garbage collection · 95389b08
      Committed by David Howells
      This fixes CVE-2014-3631.
      
      It is possible for an associative array to end up with a shortcut node at the
      root of the tree if there are more than fan-out leaves in the tree, but they
      all crowd into the same slot in the lowest level (i.e. they all have the same
      first nibble of their index keys).
      
      When assoc_array_gc() returns back up the tree after scanning some leaves, it
      can fall off the root and crash because it assumes that the back pointer
      from a shortcut (after label ascend_old_tree) must point to a normal node -
      which isn't true of a shortcut node at the root.
      
      Should we find we're ascending rootwards over a shortcut, we should check
      whether the backpointer is zero - and if it is, we have completed the scan.
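      
      In code terms, the fix amounts to a check roughly like the following
      when ascending out of a shortcut (identifiers and the label are
      illustrative; see assoc_array_gc() for the real code):
      
      	ptr = shortcut->back_pointer;
      	if (!ptr) {
      		/* The shortcut was the root: there is nothing left to
      		 * ascend into, so the GC scan is complete. */
      		goto gc_complete;
      	}
      	/* otherwise keep walking up into the parent node as before */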
      
      This particular bug cannot occur if the root node is not a shortcut - i.e. if
      you have fewer than 17 keys in a keyring, or if you have at least two keys
      that sit in separate slots (e.g. a keyring and a non-keyring).
      
      This can be reproduced by:
      
      	ring=`keyctl newring bar @s`
      	for ((i=1; i<=18; i++)); do last_key=`keyctl newring foo$i $ring`; done
      	keyctl timeout $last_key 2
      
      Doing this:
      
      	echo 3 >/proc/sys/kernel/keys/gc_delay
      
      first will speed things up.
      
      If we do fall off the top of the tree, we get the following oops:
      
      BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
      IP: [<ffffffff8136cea7>] assoc_array_gc+0x2f7/0x540
      PGD dae15067 PUD cfc24067 PMD 0
      Oops: 0000 [#1] SMP
      Modules linked in: xt_nat xt_mark nf_conntrack_netbios_ns nf_conntrack_broadcast ip6t_rpfilter ip6t_REJECT xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_ni
      CPU: 0 PID: 26011 Comm: kworker/0:1 Not tainted 3.14.9-200.fc20.x86_64 #1
      Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
      Workqueue: events key_garbage_collector
      task: ffff8800918bd580 ti: ffff8800aac14000 task.ti: ffff8800aac14000
      RIP: 0010:[<ffffffff8136cea7>] [<ffffffff8136cea7>] assoc_array_gc+0x2f7/0x540
      RSP: 0018:ffff8800aac15d40  EFLAGS: 00010206
      RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff8800aaecacc0
      RDX: ffff8800daecf440 RSI: 0000000000000001 RDI: ffff8800aadc2bc0
      RBP: ffff8800aac15da8 R08: 0000000000000001 R09: 0000000000000003
      R10: ffffffff8136ccc7 R11: 0000000000000000 R12: 0000000000000000
      R13: 0000000000000000 R14: 0000000000000070 R15: 0000000000000001
      FS:  0000000000000000(0000) GS:ffff88011fc00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      CR2: 0000000000000018 CR3: 00000000db10d000 CR4: 00000000000006f0
      Stack:
       ffff8800aac15d50 0000000000000011 ffff8800aac15db8 ffffffff812e2a70
       ffff880091a00600 0000000000000000 ffff8800aadc2bc3 00000000cd42c987
       ffff88003702df20 ffff88003702dfa0 0000000053b65c09 ffff8800aac15fd8
      Call Trace:
       [<ffffffff812e2a70>] ? keyring_detect_cycle_iterator+0x30/0x30
       [<ffffffff812e3e75>] keyring_gc+0x75/0x80
       [<ffffffff812e1424>] key_garbage_collector+0x154/0x3c0
       [<ffffffff810a67b6>] process_one_work+0x176/0x430
       [<ffffffff810a744b>] worker_thread+0x11b/0x3a0
       [<ffffffff810a7330>] ? rescuer_thread+0x3b0/0x3b0
       [<ffffffff810ae1a8>] kthread+0xd8/0xf0
       [<ffffffff810ae0d0>] ? insert_kthread_work+0x40/0x40
       [<ffffffff816ffb7c>] ret_from_fork+0x7c/0xb0
       [<ffffffff810ae0d0>] ? insert_kthread_work+0x40/0x40
      Code: 08 4c 8b 22 0f 84 bf 00 00 00 41 83 c7 01 49 83 e4 fc 41 83 ff 0f 4c 89 65 c0 0f 8f 5a fe ff ff 48 8b 45 c0 4d 63 cf 49 83 c1 02 <4e> 8b 34 c8 4d 85 f6 0f 84 be 00 00 00 41 f6 c6 01 0f 84 92
      RIP  [<ffffffff8136cea7>] assoc_array_gc+0x2f7/0x540
       RSP <ffff8800aac15d40>
      CR2: 0000000000000018
      ---[ end trace 1129028a088c0cbd ]---
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: Don Zickus <dzickus@redhat.com>
      Signed-off-by: James Morris <james.l.morris@oracle.com>
  6. 08 September 2014 (4 commits)
    • percpu-refcount: add @gfp to percpu_ref_init() · a34375ef
      Committed by Tejun Heo
      The percpu allocator now supports an allocation mask.  Add @gfp to
      percpu_ref_init() so that !GFP_KERNEL allocation masks can be used
      with percpu_refs too.
      
      This patch doesn't make any functional difference.
      
      v2: blk-mq conversion was missing.  Updated.
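      
      A hedged usage sketch of the new signature (the caller and release
      function shown are illustrative of a blk-mq-style user, not quoted
      from the patch):
      
      	/* before: ret = percpu_ref_init(&q->mq_usage_counter,
      	 *				 blk_mq_usage_counter_release); */
      	ret = percpu_ref_init(&q->mq_usage_counter,
      			      blk_mq_usage_counter_release, GFP_KERNEL);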
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Kent Overstreet <koverstreet@google.com>
      Cc: Benjamin LaHaise <bcrl@kvack.org>
      Cc: Li Zefan <lizefan@huawei.com>
      Cc: Nicholas A. Bellinger <nab@linux-iscsi.org>
      Cc: Jens Axboe <axboe@kernel.dk>
    • proportions: add @gfp to init functions · 20ae0079
      Committed by Tejun Heo
      The percpu allocator now supports an allocation mask.  Add @gfp to the
      [flex_]proportions init functions so that !GFP_KERNEL allocation masks
      can be used with them too.
      
      This patch doesn't make any functional difference.
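      
      A hedged sketch of a converted caller (the fprop_global instance is
      hypothetical; the extra gfp argument is the point):
      
      	static struct fprop_global my_props;	/* hypothetical user */
      
      	/* before: err = fprop_global_init(&my_props); */
      	err = fprop_global_init(&my_props, GFP_KERNEL);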
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Cc: Peter Zijlstra <peterz@infradead.org>
    • percpu_counter: add @gfp to percpu_counter_init() · 908c7f19
      Committed by Tejun Heo
      The percpu allocator now supports an allocation mask.  Add @gfp to
      percpu_counter_init() so that !GFP_KERNEL allocation masks can be used
      with percpu_counters too.
      
      We could have left percpu_counter_init() alone and added
      percpu_counter_init_gfp(); however, the number of users isn't that
      high and introducing _gfp variants to all percpu data structures would
      be quite ugly, so let's just do the conversion.  This is the one with
      the most users.  Other percpu data structures are a lot easier to
      convert.
      
      This patch doesn't make any functional difference.
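      
      A hedged before/after sketch of a typical caller conversion (the
      counter name is hypothetical):
      
      	/* before */
      	err = percpu_counter_init(&sbi->free_blocks, 0);
      
      	/* after: same call site, now with an explicit allocation mask */
      	err = percpu_counter_init(&sbi->free_blocks, 0, GFP_KERNEL);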
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Jan Kara <jack@suse.cz>
      Acked-by: "David S. Miller" <davem@davemloft.net>
      Cc: x86@kernel.org
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Andrew Morton <akpm@linux-foundation.org>
    • percpu_counter: make percpu_counters_lock irq-safe · ebd8fef3
      Committed by Tejun Heo
      percpu_counter is scheduled to grow @gfp support to allow atomic
      initialization.  This patch makes percpu_counters_lock irq-safe so
      that it can be safely used from atomic contexts.
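      
      A hedged sketch of what irq-safety means here (the list manipulation
      shown is illustrative): the plain spin_lock()/spin_unlock() pairs
      around the global counter list become the irqsave variants so they
      are safe even when the caller runs with interrupts disabled or in
      atomic context.
      
      	unsigned long flags;
      
      	/* was: spin_lock(&percpu_counters_lock); */
      	spin_lock_irqsave(&percpu_counters_lock, flags);
      	list_add(&fbc->list, &percpu_counters);
      	spin_unlock_irqrestore(&percpu_counters_lock, flags);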
      Signed-off-by: Tejun Heo <tj@kernel.org>
  7. 03 September 2014 (1 commit)
  8. 30 August 2014 (1 commit)
  9. 28 August 2014 (1 commit)
  10. 15 August 2014 (2 commits)
  11. 09 August 2014 (6 commits)
  12. 07 August 2014 (14 commits)