1. 16 Aug, 2016 1 commit
    • rhashtable: fix shift by 64 when shrinking · 12311959
      Vegard Nossum committed
      I got this:
      
          ================================================================================
          UBSAN: Undefined behaviour in ./include/linux/log2.h:63:13
          shift exponent 64 is too large for 64-bit type 'long unsigned int'
          CPU: 1 PID: 721 Comm: kworker/1:1 Not tainted 4.8.0-rc1+ #87
          Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
          Workqueue: events rht_deferred_worker
           0000000000000000 ffff88011661f8d8 ffffffff82344f50 0000000041b58ab3
           ffffffff84f98000 ffffffff82344ea4 ffff88011661f900 ffff88011661f8b0
           0000000000000001 ffff88011661f6b8 dffffc0000000000 ffffffff867f7640
          Call Trace:
           [<ffffffff82344f50>] dump_stack+0xac/0xfc
           [<ffffffff82344ea4>] ? _atomic_dec_and_lock+0xc4/0xc4
           [<ffffffff8242f5b8>] ubsan_epilogue+0xd/0x8a
           [<ffffffff82430c41>] __ubsan_handle_shift_out_of_bounds+0x255/0x29a
           [<ffffffff824309ec>] ? __ubsan_handle_out_of_bounds+0x180/0x180
           [<ffffffff84003436>] ? nl80211_req_set_reg+0x256/0x2f0
           [<ffffffff812112ba>] ? print_context_stack+0x8a/0x160
           [<ffffffff81200031>] ? amd_pmu_reset+0x341/0x380
           [<ffffffff823af808>] rht_deferred_worker+0x1618/0x1790
           [<ffffffff823af808>] ? rht_deferred_worker+0x1618/0x1790
           [<ffffffff823ae1f0>] ? rhashtable_jhash2+0x370/0x370
           [<ffffffff8134c12d>] ? process_one_work+0x6fd/0x1970
           [<ffffffff8134c1cf>] process_one_work+0x79f/0x1970
           [<ffffffff8134c12d>] ? process_one_work+0x6fd/0x1970
           [<ffffffff8134ba30>] ? try_to_grab_pending+0x4c0/0x4c0
           [<ffffffff8134d564>] ? worker_thread+0x1c4/0x1340
           [<ffffffff8134d8ff>] worker_thread+0x55f/0x1340
           [<ffffffff845e904f>] ? __schedule+0x4df/0x1d40
           [<ffffffff8134d3a0>] ? process_one_work+0x1970/0x1970
           [<ffffffff8134d3a0>] ? process_one_work+0x1970/0x1970
           [<ffffffff813642f7>] kthread+0x237/0x390
           [<ffffffff813640c0>] ? __kthread_parkme+0x280/0x280
           [<ffffffff845f8c93>] ? _raw_spin_unlock_irq+0x33/0x50
           [<ffffffff845f95df>] ret_from_fork+0x1f/0x40
           [<ffffffff813640c0>] ? __kthread_parkme+0x280/0x280
          ================================================================================
      
      roundup_pow_of_two() is undefined when called with an argument of 0, so
      let's avoid the call and just fall back to ht->p.min_size (which should
      never be smaller than HASH_MIN_SIZE).
      
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
      Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
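      A minimal user-space sketch of the guard described in this commit
      message. The helper names roundup_pow2 and shrink_target_size are
      hypothetical, and the 3/2 headroom factor is an assumption that the
      message does not state. The only point illustrated is that the rounding
      helper is never reached with an argument of 0.

      ```c
      #include <stddef.h>

      /*
       * Stand-in for the kernel's roundup_pow_of_two(). This loop version is
       * well defined for every input; the kernel helper is not defined for 0,
       * which is what the UBSAN report above complains about.
       */
      static unsigned long roundup_pow2(unsigned long n)
      {
          unsigned long p = 1;

          while (p < n)
              p <<= 1;
          return p;
      }

      /*
       * Hypothetical helper: pick the table size to shrink to.
       * With nelems == 0 the rounding step is skipped entirely and the
       * result falls back to min_size.
       */
      static unsigned long shrink_target_size(unsigned long nelems,
                                              unsigned long min_size)
      {
          unsigned long size = 0;

          if (nelems)
              size = roundup_pow2(nelems * 3 / 2); /* assumed headroom factor */
          if (size < min_size)
              size = min_size;
          return size;
      }
      ```

      With an empty table, shrink_target_size(0, 4) returns the minimum size
      without ever computing a power of two.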
  2. 15 Aug, 2016 1 commit
    • rhashtable: avoid large lock-array allocations · 4cf0b354
      Florian Westphal committed
      Sander reports the following splat after netfilter nat bysrc table got
      converted to rhashtable:
      
      swapper/0: page allocation failure: order:3, mode:0x2084020(GFP_ATOMIC|__GFP_COMP)
       CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.8.0-rc1 [..]
       [<ffffffff811633ed>] warn_alloc_failed+0xdd/0x140
       [<ffffffff811638b1>] __alloc_pages_nodemask+0x3e1/0xcf0
       [<ffffffff811a72ed>] alloc_pages_current+0x8d/0x110
       [<ffffffff8117cb7f>] kmalloc_order+0x1f/0x70
       [<ffffffff811aec19>] __kmalloc+0x129/0x140
       [<ffffffff8146d561>] bucket_table_alloc+0xc1/0x1d0
       [<ffffffff8146da1d>] rhashtable_insert_rehash+0x5d/0xe0
       [<ffffffff819fcfff>] nf_nat_setup_info+0x2ef/0x400
      
      The failure happens when allocating the spinlock array.
      Even with GFP_KERNEL it's unlikely for such a large allocation
      to succeed.
      
      Thomas Graf pointed me at inet_ehash_locks_alloc(), so in addition
      to adding NOWARN for atomic allocations this also makes the bucket-array
      sizing more conservative.
      
      In commit 095dc8e0 ("tcp: fix/cleanup inet_ehash_locks_alloc()"),
      Eric Dumazet says: "Budget 2 cache lines per cpu worth of 'spinlocks'".
      IOW, consider size needed by a single spinlock when determining
      number of locks per cpu.  So with 64 bytes per cacheline and 4 bytes per
      spinlock this gives 32 locks per cpu.
      
      Resulting size of the lock-array (sizeof(spinlock) == 4):
      
      cpus:    1   2   4   8   16   32   64
      old:    1k  1k  4k  8k  16k  16k  16k
      new:   128 256 512  1k   2k   4k   8k
      
      8k allocation should have decent chance of success even
      with GFP_ATOMIC, and should not fail with GFP_KERNEL.
      
      With 72-byte spinlock (LOCKDEP):
      cpus :   1   2
      old:    9k 18k
      new:   ~2k ~4k
      Reported-by: Sander Eikelenboom <linux@eikelenboom.it>
      Suggested-by: Thomas Graf <tgraf@suug.ch>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
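      The "new" rows of the tables in this commit message can be reproduced
      with a short calculation. This is a sketch only: the function names are
      hypothetical, and the cap of 64 counted CPUs is an assumption read off
      the table, which stops at 64. The count of 32 locks per CPU comes from
      the message itself (two 64-byte cache lines divided by a 4-byte
      spinlock) and is kept fixed when the lock type grows.

      ```c
      #include <stddef.h>

      #define CACHE_LINE_BYTES 64u /* cache line size used in the message */
      #define MAX_CPUS_COUNTED 64u /* assumption: the table stops at 64 CPUs */

      /* Round n up to the next power of two (n >= 1). */
      static unsigned int roundup_pow2(unsigned int n)
      {
          unsigned int p = 1;

          while (p < n)
              p <<= 1;
          return p;
      }

      /* "Budget 2 cache lines per cpu worth of spinlocks". */
      static unsigned int locks_per_cpu(unsigned int lock_bytes)
      {
          return 2u * CACHE_LINE_BYTES / lock_bytes;
      }

      /* Size in bytes of the lock array for a given CPU count and lock size. */
      static size_t lock_array_bytes(unsigned int cpus, unsigned int lock_bytes)
      {
          unsigned int per_cpu = locks_per_cpu(4); /* 32, from the 4-byte case */
          unsigned int nlocks;

          if (cpus > MAX_CPUS_COUNTED)
              cpus = MAX_CPUS_COUNTED;
          nlocks = roundup_pow2(cpus * per_cpu);
          return (size_t)nlocks * lock_bytes;
      }
      ```

      For example, lock_array_bytes(64, 4) gives 8192 bytes (the 8k entry)
      and lock_array_bytes(1, 72) gives 2304 bytes (the ~2k LOCKDEP entry).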
  3. 05 Apr, 2016 1 commit
  4. 19 Dec, 2015 1 commit
  5. 17 Dec, 2015 1 commit
    • rhashtable: Fix walker list corruption · c6ff5268
      Herbert Xu committed
      The commit ba7c95ea ("rhashtable:
      Fix sleeping inside RCU critical section in walk_stop") introduced
      a new spinlock for the walker list.  However, it did not convert
      all existing users of the list over to the new spin lock.  Some
      continued to use the old mutex for this purpose.  This obviously
      led to corruption of the list.
      
      The fix is to use the spin lock everywhere where we touch the list.
      
      This also allows us to do rcu_read_lock before we take the lock in
      rhashtable_walk_start.  With the old mutex this would've deadlocked
      but it's safe with the new spin lock.
      
      Fixes: ba7c95ea ("rhashtable: Fix sleeping inside RCU...")
      Reported-by: Colin Ian King <colin.king@canonical.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  6. 16 Dec, 2015 1 commit
  7. 09 Dec, 2015 1 commit
  8. 06 Dec, 2015 1 commit
  9. 05 Dec, 2015 2 commits
  10. 23 Sep, 2015 1 commit
  11. 09 Jul, 2015 1 commit
    • rhashtable: fix for resize events during table walk · 142b942a
      Phil Sutter committed
      If rhashtable_walk_next detects a resize operation in progress, it jumps
      to the new table and continues walking that one. But it fails to drop
      the reference to its current item. As a result it continues traversing
      the bucket of the new table into which the current item is sorted and,
      after reaching the end of that bucket, moves on to the new table's
      second bucket instead of the first one, thereby potentially missing
      items.
      
      This fixes the rhashtable runtime test for me. Bug probably introduced
      by Herbert Xu's patch eddee5ba ("rhashtable: Fix walker behaviour during
      rehash") although not explicitly tested.
      
      Fixes: eddee5ba ("rhashtable: Fix walker behaviour during rehash")
      Signed-off-by: Phil Sutter <phil@nwl.cc>
      Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  12. 07 Jun, 2015 1 commit
  13. 17 May, 2015 1 commit
    • rhashtable: Add cap on number of elements in hash table · 07ee0722
      Herbert Xu committed
      We currently have no limit on the number of elements in a hash table.
      This is a problem because some users (tipc) set a ceiling on the
      maximum table size and when that is reached the hash table may
      degenerate.  Others may encounter OOM when growing and if we allow
      insertions when that happens the hash table performance may also
      suffer.
      
      This patch adds a new parameter insecure_max_entries which becomes
      the cap on the table.  If unset it defaults to max_size * 2.  If
      it is also zero it means that there is no cap on the number of
      elements in the table.  However, the table will grow whenever the
      utilisation hits 100% and if that growth fails, you will get ENOMEM
      on insertion.
      
      As allowing oversubscription is potentially dangerous, the name
      contains the word insecure.
      
      Note that the cap is not a hard limit.  This is done for performance
      reasons as enforcing a hard limit will result in use of atomic ops
      that are heavier than the ones we currently use.
      
      The reasoning is that we're only guarding against a gross over-
      subscription of the table, rather than a small breach of the limit.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
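      The defaulting rule for the cap can be sketched as follows. The struct
      and function names are hypothetical stand-ins, not the kernel's
      definitions. The sketch only shows the rule stated in the message: an
      unset insecure_max_entries falls back to max_size * 2, and a result of
      zero means no cap.

      ```c
      #include <stdbool.h>

      /* Hypothetical subset of the table parameters. */
      struct toy_params {
          unsigned int max_size;             /* 0 means no maximum table size */
          unsigned int insecure_max_entries; /* 0 means "unset" */
      };

      /* Resolve the cap: explicit value first, else max_size * 2. */
      static unsigned int effective_max_entries(const struct toy_params *p)
      {
          if (p->insecure_max_entries)
              return p->insecure_max_entries;
          return p->max_size * 2; /* stays 0 when max_size is also 0 */
      }

      /* True when an insertion should be refused because the cap is reached. */
      static bool too_many_entries(const struct toy_params *p,
                                   unsigned int nelems)
      {
          unsigned int cap = effective_max_entries(p);

          return cap && nelems >= cap;
      }
      ```

      Because the real check is not done under a lock, concurrent inserters
      can overshoot the cap slightly, which is the soft-limit behaviour the
      message describes.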
  14. 06 May, 2015 1 commit
  15. 23 Apr, 2015 2 commits
  16. 26 Mar, 2015 1 commit
  17. 25 Mar, 2015 4 commits
  18. 24 Mar, 2015 7 commits
  19. 21 Mar, 2015 3 commits
    • rhashtable: Rip out obsolete out-of-line interface · dc0ee268
      Herbert Xu committed
      Now that all rhashtable users have been converted over to the
      inline interface, this patch removes the unused out-of-line
      interface.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • rhashtable: Allow hash/comparison functions to be inlined · 02fd97c3
      Herbert Xu committed
      This patch deals with the complaint that we make indirect function
      calls on the fast paths unnecessarily in rhashtable.  We resolve
      it by moving the fast paths into inline functions that take struct
      rhashtable_param (which obviously must be the same set of parameters
      supplied to rhashtable_init) as an argument.
      
      The only remaining indirect call is to obj_hashfn (or key_hashfn if
      obj_hashfn is unset) on the rehash as well as the
      insert-during-rehash slow path.
      
      This patch also extends the support of variable-length keys to
      include those where the key is fixed but scattered in the object.
      For example, in netlink we want to key off the namespace and the
      portid but they're not next to each other.
      
      This patch does this by directly using the object hash function
      as the indicator of whether the key is accessible or not.  It
      also adds a new function obj_cmpfn to compare a key against an
      object.  This means that the caller no longer needs to supply
      explicit compare functions.
      
      All this is done in a backwards compatible manner so no existing
      users are affected until they convert to the new interface.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
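      The idea of passing the parameter struct into an inline fast path can
      be illustrated with a toy example. Everything here is hypothetical
      (toy_params, toy_fnv1a, toy_key_hash, my_key_hash), and FNV-1a is only
      a stand-in hash, not what rhashtable uses. When the struct is a
      compile-time constant and the wrapper is inlined, a compiler is free to
      turn the call through the function pointer into a direct call.

      ```c
      #include <stddef.h>
      #include <stdint.h>

      /* Hypothetical parameter block supplied by the table user. */
      struct toy_params {
          size_t key_len;
          uint32_t (*hashfn)(const void *key, size_t len, uint32_t seed);
      };

      /* Stand-in hash function (32-bit FNV-1a mixed with a seed). */
      static uint32_t toy_fnv1a(const void *key, size_t len, uint32_t seed)
      {
          const unsigned char *p = key;
          uint32_t h = 2166136261u ^ seed;

          for (size_t i = 0; i < len; i++) {
              h ^= p[i];
              h *= 16777619u;
          }
          return h;
      }

      /*
       * Inline fast path: params is passed by value, so a constant argument
       * is visible to the optimizer at every call site.
       */
      static inline uint32_t toy_key_hash(const void *key,
                                          const struct toy_params params,
                                          uint32_t seed)
      {
          return params.hashfn(key, params.key_len, seed);
      }

      /* One user of the table with its own constant parameters. */
      static const struct toy_params my_params = {
          .key_len = sizeof(uint32_t),
          .hashfn = toy_fnv1a,
      };

      static uint32_t my_key_hash(const uint32_t *key)
      {
          return toy_key_hash(key, my_params, 0);
      }
      ```

      The caller must pass the same parameters it used when setting up the
      table, which matches the constraint noted in the commit message.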
    • rhashtable: Make rhashtable_init params argument const · 488fb86e
      Herbert Xu committed
      This patch marks the rhashtable_init params argument const as
      there is no reason to modify it since we will always make a copy
      of it in the rhashtable.
      
      This patch also fixes a bug where we don't actually round up the
      value of min_size unless it is less than HASH_MIN_SIZE.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Acked-by: Thomas Graf <tgraf@suug.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  20. 20 Mar, 2015 1 commit
  21. 19 Mar, 2015 3 commits
  22. 17 Mar, 2015 2 commits
  23. 16 Mar, 2015 2 commits
    • rhashtable: Fix rhashtable_remove failures · 565e8640
      Herbert Xu committed
      The commit 9d901bc0 ("rhashtable:
      Free bucket tables asynchronously after rehash") causes gratuitous
      failures in rhashtable_remove.
      
      The reason is that it inadvertently introduced multiple rehashing
      from the perspective of readers.  IOW it is now possible to see
      more than two tables during a single RCU critical section.
      
      Fortunately the other reader rhashtable_lookup already deals with
      this correctly thanks to c4db8848
      ("rhashtable: Move future_tbl into struct bucket_table")
      so only rhashtable_remove is broken by this change.
      
      This patch fixes this by looping over every table from the first
      one to the last or until we find the element that we were trying
      to delete.
      
      Incidentally the simple test for detecting rehashing to prevent
      starting another shrinking no longer works.  Since it isn't needed
      anyway (the work queue and the mutex serve as a natural barrier
      to unnecessary rehashes) I've simply killed the test.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
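      The "loop over every table" approach can be sketched with a toy chain
      of tables. The types and functions below are hypothetical and leave out
      hashing, RCU and locking altogether. Only the future_tbl link name is
      taken from the commit messages. The sketch shows the control flow: try
      each table in turn, from the first to the last, and stop as soon as the
      element is found.

      ```c
      #include <stdbool.h>
      #include <stddef.h>

      /* Toy table: a tiny unordered key set plus a link to the next table. */
      struct toy_table {
          int keys[4];
          size_t nkeys;
          struct toy_table *future_tbl; /* NULL when no rehash is pending */
      };

      /* Remove key from a single table; returns true if it was present. */
      static bool toy_table_remove(struct toy_table *tbl, int key)
      {
          for (size_t i = 0; i < tbl->nkeys; i++) {
              if (tbl->keys[i] == key) {
                  /* Fill the hole with the last key and shrink the set. */
                  tbl->keys[i] = tbl->keys[--tbl->nkeys];
                  return true;
              }
          }
          return false;
      }

      /* Walk the whole chain until the key is found or the chain ends. */
      static bool toy_remove(struct toy_table *first, int key)
      {
          for (struct toy_table *tbl = first; tbl; tbl = tbl->future_tbl) {
              if (toy_table_remove(tbl, key))
                  return true;
          }
          return false;
      }
      ```

      A version that checked only the first two tables would fail for a key
      that lives in a third table, which is the failure mode this commit
      describes.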
    • rhashtable: Fix use-after-free in rhashtable_walk_stop · 963ecbd4
      Herbert Xu committed
      The commit c4db8848 ("rhashtable:
      Move future_tbl into struct bucket_table") introduced a use-after-
      free bug in rhashtable_walk_stop because it dereferences tbl after
      dropping the RCU read lock.
      
      This patch fixes it by moving the RCU read unlock down to the bottom
      of rhashtable_walk_stop.  In fact this was how I had it originally
      but it got dropped while rearranging patches because this one
      depended on the async freeing of bucket_table.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>