1. 09 4月, 2019 8 次提交
  2. 08 4月, 2019 6 次提交
    • V
      net: sched: flower: insert filter to ht before offloading it to hw · 1f17f774
      Vlad Buslov 提交于
      John reports:
      
      Recent refactoring of fl_change aims to use the classifier spinlock to
      avoid the need for rtnl lock. In doing so, the fl_hw_replace_filer()
      function was moved to before the lock is taken. This can create problems
      for drivers if duplicate filters are created (commmon in ovs tc offload
      due to filters being triggered by user-space matches).
      
      Drivers registered for such filters will now receive multiple copies of
      the same rule, each with a different cookie value. This means that the
      drivers would need to do a full match field lookup to determine
      duplicates, repeating work that will happen in flower __fl_lookup().
      Currently, drivers do not expect to receive duplicate filters.
      
      To fix this, verify that filter with same key is not present in flower
      classifier hash table and insert the new filter to the flower hash table
      before offloading it to hardware. Implement helper function
      fl_ht_insert_unique() to atomically verify/insert a filter.
      
      This change makes filter visible to fast path at the beginning of
      fl_change() function, which means it can no longer be freed directly in
      case of error. Refactor fl_change() error handling code to deallocate the
      filter with rcu timeout.
      
      Fixes: 620da486 ("net: sched: flower: refactor fl_change")
      Reported-by: NJohn Hurley <john.hurley@netronome.com>
      Signed-off-by: NVlad Buslov <vladbu@mellanox.com>
      Acked-by: NJiri Pirko <jiri@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1f17f774
    • D
      Merge branch 'rhashtable-bitlocks' · 9186c90b
      David S. Miller 提交于
      NeilBrown says:
      
      ====================
      Convert rhashtable to use bitlocks
      
      This series converts rhashtable to use a per-bucket bitlock
      rather than a separate array of spinlocks.
      This:
        reduces memory usage
        results in slightly fewer memory accesses
        slightly improves parallelism
        makes a configuration option unnecessary
      
      The main change from previous version is to use a distinct type for
      the pointer in the bucket which has a bit-lock in it.  This
      helped find two places where rht_ptr() was missed, one
      in  rhashtable_free_and_destroy() in print_ht in the test code.
      ====================
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9186c90b
    • N
      rhashtable: add lockdep tracking to bucket bit-spin-locks. · 149212f0
      NeilBrown 提交于
      Native bit_spin_locks are not tracked by lockdep.
      
      The bit_spin_locks used for rhashtable buckets are local
      to the rhashtable implementation, so there is little opportunity
      for the sort of misuse that lockdep might detect.
      However locks are held while a hash function or compare
      function is called, and if one of these took a lock,
      a misbehaviour is possible.
      
      As it is quite easy to add lockdep support this unlikely
      possibility seems to be enough justification.
      
      So create a lockdep class for bucket bit_spin_lock and attach
      through a lockdep_map in each bucket_table.
      
      Without the 'nested' annotation in rhashtable_rehash_one(), lockdep
      correctly reports a possible problem as this lock is taken
      while another bucket lock (in another table) is held.  This
      confirms that the added support works.
      With the correct nested annotation in place, lockdep reports
      no problems.
      Signed-off-by: NNeilBrown <neilb@suse.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      149212f0
    • N
      rhashtable: use bit_spin_locks to protect hash bucket. · 8f0db018
      NeilBrown 提交于
      This patch changes rhashtables to use a bit_spin_lock on BIT(1) of the
      bucket pointer to lock the hash chain for that bucket.
      
      The benefits of a bit spin_lock are:
       - no need to allocate a separate array of locks.
       - no need to have a configuration option to guide the
         choice of the size of this array
       - locking cost is often a single test-and-set in a cache line
         that will have to be loaded anyway.  When inserting at, or removing
         from, the head of the chain, the unlock is free - writing the new
         address in the bucket head implicitly clears the lock bit.
         For __rhashtable_insert_fast() we ensure this always happens
         when adding a new key.
       - even when lockings costs 2 updates (lock and unlock), they are
         in a cacheline that needs to be read anyway.
      
      The cost of using a bit spin_lock is a little bit of code complexity,
      which I think is quite manageable.
      
      Bit spin_locks are sometimes inappropriate because they are not fair -
      if multiple CPUs repeatedly contend of the same lock, one CPU can
      easily be starved.  This is not a credible situation with rhashtable.
      Multiple CPUs may want to repeatedly add or remove objects, but they
      will typically do so at different buckets, so they will attempt to
      acquire different locks.
      
      As we have more bit-locks than we previously had spinlocks (by at
      least a factor of two) we can expect slightly less contention to
      go with the slightly better cache behavior and reduced memory
      consumption.
      
      To enhance type checking, a new struct is introduced to represent the
        pointer plus lock-bit
      that is stored in the bucket-table.  This is "struct rhash_lock_head"
      and is empty.  A pointer to this needs to be cast to either an
      unsigned lock, or a "struct rhash_head *" to be useful.
      Variables of this type are most often called "bkt".
      
      Previously "pprev" would sometimes point to a bucket, and sometimes a
      ->next pointer in an rhash_head.  As these are now different types,
      pprev is NULL when it would have pointed to the bucket. In that case,
      'blk' is used, together with correct locking protocol.
      Signed-off-by: NNeilBrown <neilb@suse.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8f0db018
    • N
      rhashtable: allow rht_bucket_var to return NULL. · ff302db9
      NeilBrown 提交于
      Rather than returning a pointer to a static nulls, rht_bucket_var()
      now returns NULL if the bucket doesn't exist.
      This will make the next patch, which stores a bitlock in the
      bucket pointer, somewhat cleaner.
      
      This change involves introducing __rht_bucket_nested() which is
      like rht_bucket_nested(), but doesn't provide the static nulls,
      and changing rht_bucket_nested() to call this and possible
      provide a static nulls - as is still needed for the non-var case.
      Signed-off-by: NNeilBrown <neilb@suse.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ff302db9
    • N
      rhashtable: use cmpxchg() in nested_table_alloc() · 7a41c294
      NeilBrown 提交于
      nested_table_alloc() relies on the fact that there is
      at most one spinlock allocated for every slot in the top
      level nested table, so it is not possible for two threads
      to try to allocate the same table at the same time.
      
      This assumption is a little fragile (it is not explicit) and is
      unnecessary as cmpxchg() can be used instead.
      
      A future patch will replace the spinlocks by per-bucket bitlocks,
      and then we won't be able to protect the slot pointer with a spinlock.
      
      So replace rcu_assign_pointer() with cmpxchg() - which has equivalent
      barrier properties.
      If it the cmp fails, free the table that was just allocated.
      Signed-off-by: NNeilBrown <neilb@suse.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7a41c294
  3. 07 4月, 2019 26 次提交