- 07 Feb 2015, 7 commits
-
-
Committed by Thomas Graf

The remove logic properly searched the remaining chain for a matching entry with an identical hash, but it did so while searching from both the old and new table. Instead, in order to not leave stale references behind, we need to:

1. When growing and searching from the new table: search the remaining chain for an entry with the same hash, to avoid having the new table point directly to an entry with a different hash.

2. When shrinking and searching from the old table: check whether the element after the removed one would create a cross reference, and avoid it if so.

These bugs were present from the beginning in nft_hash. Also, both insert functions calculated the hash based on the mask of the new table. This worked while growing. While shrinking, the mask of the new table is smaller than the mask of the old table. This led to a bit not being taken into account when selecting the bucket lock, and thus caused the wrong bucket to be locked eventually. Fixes: 7e1e7763 ("lib: Resizable, Scalable, Concurrent Hash Table") Fixes: 97defe1e ("rhashtable: Per bucket locks & deferred expansion/shrinking") Reported-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Thomas Graf

During a resize, when two buckets in the larger table map to a single bucket in the smaller table and the new table has already been (partially) linked to the old table, removal of an element may cause the bucket in the larger table to point to entries which all hash to a value different from the bucket index. This causes two buckets to point to the same sub-chain after unzipping, which is legal *during* the resize phase but not after it has completed. Keep the old table around until all of the unzipping is done, to allow the removal code to only search for matching hashed entries during this special period. Reported-by: Ying Xue <ying.xue@windriver.com> Fixes: 97defe1e ("rhashtable: Per bucket locks & deferred expansion/shrinking") Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Thomas Graf

Catch hash miscalculations which result in hard-to-track-down race conditions. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Thomas Graf

This simplifies debugging of locking violations if compiled with CONFIG_PROVE_LOCKING. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Thomas Graf

We need to wait for all RCU readers to complete after the last bit of unzipping has been completed. Otherwise the old table is freed prematurely. Fixes: 7e1e7763 ("lib: Resizable, Scalable, Concurrent Hash Table") Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
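A minimal sketch of the ordering this fix enforces; bucket_table_free() follows the helper naming in lib/rhashtable.c of this era, hedged as an assumption:

```c
/* After the final unzip pass, wait for all RCU readers to finish
 * before the old bucket table may be released. */
synchronize_rcu();
bucket_table_free(old_tbl);
```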
-
Committed by Thomas Graf

rhashtable currently allows the use of one bucket lock per bucket. This requires multiple levels of complicated nested locking, because when resizing, a single bucket of the smaller table will map to two buckets in the larger table. So far rhashtable has explicitly locked both buckets in the larger table. By excluding the highest bit of the hash from the bucket lock map, and thus only allowing locks in a ratio of one lock per two buckets, the locking can be simplified a lot without losing the benefits of multiple locks. Larger tables which benefit from multiple locks will not have a single lock per bucket anyway. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
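A sketch of the lock selection this scheme allows; it stays close to the bucket_lock() helper in lib/rhashtable.c, but treat the exact field names as assumptions:

```c
/* Because locks_mask covers at most half the bucket mask, the two
 * buckets of the larger table that map onto one bucket of the
 * smaller table always share a lock. */
static spinlock_t *bucket_lock(const struct bucket_table *tbl, u32 hash)
{
	return &tbl->locks[hash & tbl->locks_mask];
}
```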
-
Committed by Thomas Graf

The value computed by key_hashfn() is used by rhashtable_lookup_compare() to traverse both tables during a resize. key_hashfn() must therefore return the hash value without the bucket mask applied, so it can be masked to the size of each individual table. Fixes: 97defe1e ("rhashtable: Per bucket locks & deferred expansion/shrinking") Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
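To illustrate, a hedged sketch of why the unmasked hash is needed: one 32-bit hash is computed once and then masked per table, since the two tables differ in size during a resize:

```c
/* Sketch: key_hashfn() yields the full 32-bit hash; each table
 * applies its own size mask to pick the bucket index. */
u32 hash    = key_hashfn(ht, key, ht->p.key_len);
u32 old_idx = hash & (old_tbl->size - 1);
u32 new_idx = hash & (new_tbl->size - 1);
```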
-
- 05 Feb 2015, 2 commits
-
-
Committed by Herbert Xu

Some existing rhashtable users get too intimate with it by walking the buckets directly. This prevents us from easily changing the internals of rhashtable. This patch adds the helpers rhashtable_walk_init/exit/start/next/stop, which will replace these custom walkers. They are meant to be usable both for procfs seq_file walks and for netlink dumps. The iterator structure should fit inside a netlink dump cb structure, with at least one element to spare. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
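A usage sketch of the walker API as introduced here; dump_all() is a hypothetical caller, and the -EAGAIN convention (a resize interfered, walk remains valid) is hedged from this era of the API:

```c
static int dump_all(struct rhashtable *ht)
{
	struct rhashtable_iter iter;
	void *obj;
	int err;

	err = rhashtable_walk_init(ht, &iter);
	if (err)
		return err;

	err = rhashtable_walk_start(&iter);
	if (err == -EAGAIN)
		err = 0;	/* resize notification; walk is still valid */
	if (err)
		goto exit;

	while ((obj = rhashtable_walk_next(&iter)) != NULL) {
		if (IS_ERR(obj)) {
			if (PTR_ERR(obj) == -EAGAIN)
				continue;	/* a resize occurred; keep walking */
			err = PTR_ERR(obj);
			break;
		}
		/* process obj here */
	}

	rhashtable_walk_stop(&iter);
exit:
	rhashtable_walk_exit(&iter);
	return err;
}
```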
-
Committed by Herbert Xu

The current being_destroyed check in rhashtable_expand is not enough, since starting a shrinking process after freeing all elements in the table is also going to crash. This patch adds a being_destroyed check to the deferred worker thread so that we bail out as soon as we take the lock. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 31 Jan 2015, 1 commit
-
-
Committed by Geert Uytterhoeven

Allow the selftest on the resizable hash table to be built modular, just like all other tests that do not depend on DEBUG_KERNEL. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 27 Jan 2015, 1 commit
-
-
Committed by Thomas Graf

As removals can occur during resizes, entries may be referred to from both tbl and future_tbl when the removal is requested. Therefore rhashtable_remove() must unlink the entry in both tables if this is the case. The existing code did search both tables but stopped when it hit the first match. Failing to unlink in both tables resulted in a use-after-free. Fixes: 97defe1e ("rhashtable: Per bucket locks & deferred expansion/shrinking") Reported-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 16 Jan 2015, 1 commit
-
-
Committed by Ying Xue

When we put our declared work task on the global workqueue with schedule_delayed_work(), its delay parameter is always zero. Therefore, we should define a regular work item in the rhashtable structure instead of a delayed one. In addition, we add a check that the resizing functions are non-NULL before cancelling the work, to avoid cancelling an uninitialized work item. Lastly, rhashtable_destroy() takes ht->mutex before waiting, via cancel_delayed_work(), for all previously submitted work items to run to completion. cancel_delayed_work() does not return until all work items are finished, and when a scheduled work item runs, its function, rht_deferred_worker(), also needs to acquire the same lock; a deadlock can therefore occur because the lock is already held. Moving the cancel call out of the scope covered by the lock avoids this deadlock. Fixes: 97defe1e ("rhashtable: Per bucket locks & deferred expansion/shrinking") Signed-off-by: Ying Xue <ying.xue@windriver.com> Cc: Thomas Graf <tgraf@suug.ch> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
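A hedged sketch of the resulting teardown order in rhashtable_destroy(); field and helper names follow lib/rhashtable.c of this period, but treat them as assumptions:

```c
/* Cancel the deferred resize work *before* taking ht->mutex, since
 * rht_deferred_worker() itself acquires that mutex. */
if (ht->p.grow_decision || ht->p.shrink_decision)
	cancel_work_sync(&ht->run_work);

mutex_lock(&ht->mutex);
bucket_table_free(rht_dereference(ht->tbl, ht));
mutex_unlock(&ht->mutex);
```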
-
- 14 Jan 2015, 2 commits
-
-
Committed by Thomas Graf

Each per-bucket lock covers a configurable number of buckets. While shrinking, two buckets in the old table contain entries for a single bucket in the new table. We need to lock down both while linking. Check whether they are protected by different locks to avoid taking the same lock recursively. Fixes: 97defe1e ("rhashtable: Per bucket locks & deferred expansion/shrinking") Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
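A sketch of the check described above; the lock variable names are illustrative and the nesting subclass is an assumption:

```c
/* Lock both source buckets, but only once if one lock covers both. */
spin_lock_bh(old_bucket_lock1);
if (old_bucket_lock2 != old_bucket_lock1)
	spin_lock_bh_nested(old_bucket_lock2, SINGLE_DEPTH_NESTING);
```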
-
Committed by Ying Xue

Introduce a new function called rhashtable_lookup_compare_insert(), which is very similar to rhashtable_lookup_insert(). But the former makes use of a caller-supplied compare function to look for an object, and then inserts it into the hash table if the object is not found. As the entire process of search and insertion happens under the protection of the per-bucket lock, this helps users avoid involving an extra lock. Signed-off-by: Ying Xue <ying.xue@windriver.com> Cc: Thomas Graf <tgraf@suug.ch> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
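A hedged usage sketch; struct my_obj and my_obj_cmp() are illustrative names, and the bool return convention (true on successful insert) is an assumption about this era of the API:

```c
/* Compare callback: ptr is the candidate table entry, arg is the
 * caller-provided argument passed through unchanged. */
static bool my_obj_cmp(void *ptr, void *arg)
{
	struct my_obj *obj = ptr;	/* hypothetical object type */
	u32 *id = arg;

	return obj->id == *id;
}

/* Atomically: insert obj unless an entry matching my_obj_cmp exists. */
if (!rhashtable_lookup_compare_insert(ht, &obj->node, my_obj_cmp, &obj->id))
	pr_debug("duplicate entry, not inserted\n");
```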
-
- 09 Jan 2015, 6 commits
-
-
Committed by Ying Xue

Signed-off-by: Ying Xue <ying.xue@windriver.com> Cc: Thomas Graf <tgraf@suug.ch> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Ying Xue

Move the checks of whether the hash table size exceeds its maximum threshold or has reached its minimum threshold from the resizing functions into the resize decision functions, avoiding unnecessary wakeups of the worker queue thread. Signed-off-by: Ying Xue <ying.xue@windriver.com> Cc: Thomas Graf <tgraf@suug.ch> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Ying Xue

When removing an object from the hash table, we currently only traverse the old bucket table to check whether the object exists. If the object is not found there, we try again. But in the second search loop we still search the old table instead of the future table. As a result, the object may not be removed from the hash table, especially when a resize is in progress and the object has just been saved in the future table. Signed-off-by: Ying Xue <ying.xue@windriver.com> Cc: Thomas Graf <tgraf@suug.ch> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Ying Xue

Introduce a new function called rhashtable_lookup_insert(), which makes the lookup and the insertion atomic under bucket lock protection, helping us avoid introducing an extra lock when searching for and inserting an object into the hash table. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Thomas Graf <tgraf@suug.ch> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
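A usage sketch; obj->node, an embedded struct rhash_head, is an illustrative name, and the bool return (true when no entry with the same key existed) is hedged:

```c
/* Lookup-or-insert in one atomic step under the bucket lock. */
if (rhashtable_lookup_insert(ht, &obj->node)) {
	/* inserted: no entry with the same key was present */
} else {
	/* an entry with this key already exists; obj was not added */
}
```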
-
Committed by Ying Xue

Introduce the rhashtable_wakeup_worker() helper function to reduce the duplicated code where the worker is woken up. Additionally, as long as both the future_tbl and tbl bucket table pointers point to the same bucket array, we should try to wake up the resizing worker thread; if they differ, the work of resizing the hash table is not finished yet. However, we currently wake up the worker thread only when the two pointers point to different bucket arrays, which is obviously wrong. So that issue is fixed in this patch as well. Signed-off-by: Ying Xue <ying.xue@windriver.com> Cc: Thomas Graf <tgraf@suug.ch> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
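A sketch of the corrected condition, following the shape of the helper in lib/rhashtable.c but with field names hedged:

```c
static void rhashtable_wakeup_worker(struct rhashtable *ht)
{
	struct bucket_table *tbl = rht_dereference_rcu(ht->tbl, ht);
	struct bucket_table *new_tbl = rht_dereference_rcu(ht->future_tbl, ht);
	size_t size = tbl->size;

	/* Only schedule a resize if none is currently in progress. */
	if (tbl == new_tbl &&
	    ((ht->p.grow_decision && ht->p.grow_decision(ht, size)) ||
	     (ht->p.shrink_decision && ht->p.shrink_decision(ht, size))))
		schedule_work(&ht->run_work);
}
```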
-
Committed by Ying Xue

Define an internal compare function and the corresponding compare argument, and then make use of rhashtable_lookup_compare() to look up keys in the hash table, reducing the duplicated code between rhashtable_lookup() and rhashtable_lookup_compare(). Signed-off-by: Ying Xue <ying.xue@windriver.com> Cc: Thomas Graf <tgraf@suug.ch> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
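A sketch of the internal wrapper described here, close to the lib/rhashtable.c code of this era but hedged on the exact names:

```c
struct rhashtable_compare_arg {
	struct rhashtable *ht;
	const void *key;
};

static bool rhashtable_compare(void *ptr, void *arg)
{
	struct rhashtable_compare_arg *x = arg;
	struct rhashtable *ht = x->ht;

	/* Fixed-size key comparison at the configured key offset. */
	return !memcmp(ptr + ht->p.key_offset, x->key, ht->p.key_len);
}
```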
-
- 04 Jan 2015, 7 commits
-
-
Committed by Thomas Graf

In order to allow for wider usage of rhashtable, use a special nulls marker to terminate each chain. The reason for not using the existing nulls_list is that the prev pointer usage would not be valid, as entries can be linked in two different buckets at the same time. The 4 nulls base bits can be set through the rhashtable_params structure like this: struct rhashtable_params params = { [...] .nulls_base = (1U << RHT_BASE_SHIFT), }; This reduces the hash length from 32 bits to 27 bits. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
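For context, a hedged sketch of how a chain walker distinguishes a nulls terminator from a real entry; the odd-pointer encoding follows the kernel's general nulls convention:

```c
/* A nulls marker is an odd pointer value; real rhash_head pointers
 * are even, so the low bit identifies the end of a chain. */
static inline bool rht_is_a_nulls(const struct rhash_head *ptr)
{
	return ((unsigned long) ptr & 1);
}
```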
-
Committed by Thomas Graf

Introduces an array of spinlocks to protect bucket mutations. The number of spinlocks per CPU is configurable and selected based on the hash of the bucket. This allows for parallel insertions and removals of entries which do not share a lock. The patch also defers expansion and shrinking to a worker queue, which allows insertion and removal from atomic context. Insertions and deletions may occur in parallel to it and are only held up briefly while the particular bucket is linked or unzipped. Mutations of the bucket table pointer are protected by a new mutex; read access is RCU protected. In the event of an expansion or shrinking, the newly allocated bucket table is exposed as a so-called future table as soon as the resize process starts. Lookups, deletions, and insertions will briefly use both tables. The future table becomes the main table after an RCU grace period and after the initial linking of the old to the new table has been performed. Optimization of the chains to make use of the new number of buckets follows only once the new table is in use. The side effect of this is that during that RCU grace period, a bucket traversal using any rht_for_each() variant on the main table will not see any insertions performed during the RCU grace period, as those would at that point land in the future table. A lookup will see them, as it searches both tables if needed. Having multiple insertions and removals occur in parallel requires nelems to become an atomic counter. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
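A hedged sketch of the dual-table lookup this design implies; lookup_in_table() is a hypothetical helper, not the kernel's actual function name:

```c
/* Search the current table first; if a resize is in flight, the
 * future table may already hold the entry. */
struct bucket_table *tbl = rht_dereference_rcu(ht->tbl, ht);
struct bucket_table *future_tbl = rht_dereference_rcu(ht->future_tbl, ht);

void *obj = lookup_in_table(tbl, key);		/* hypothetical helper */
if (!obj && future_tbl != tbl)
	obj = lookup_in_table(future_tbl, key);
```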
-
Committed by Thomas Graf

The removal function of nft_hash currently stores a reference to the previous element during lookup, which is used to optimize removal later on. This was possible because a lock is held throughout calling rhashtable_lookup() and rhashtable_remove(). With the introduction of deferred table resizing in parallel to lookups and insertions, the nftables lock will no longer synchronize all table mutations and the stored pprev may become invalid. Removing this optimization makes removal slightly more expensive on average, but allows taking the resize cost out of the insert and remove paths. Signed-off-by: Thomas Graf <tgraf@suug.ch> Cc: netfilter-devel@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Thomas Graf

Subsequent patches will require access to the bucket tail. Access to the tail is relatively cheap, as the automatic resizing of the table should keep the number of entries per bucket to no more than 0.75 on average. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Thomas Graf

This patch is in preparation for the introduction of per-bucket spinlocks. It extends all iterator macros to take the bucket table and bucket index. It also introduces a new rht_dereference_bucket() to handle protected accesses to buckets. It introduces a barrier() in the RCU iterators to prevent the compiler from caching the first element. The lockdep verifier is introduced as a stub which always succeeds; it is properly implemented in the next patch, when the locks are introduced. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
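A sketch of the extended iterator signature described above; the surrounding variable setup and the rht_bucket_index() helper name are hedged:

```c
struct bucket_table *tbl = rht_dereference_rcu(ht->tbl, ht);
u32 hash = rht_bucket_index(tbl, key_hash);	/* hedged helper name */
struct rhash_head *pos;

/* Iterators now name the table and bucket index explicitly. */
rht_for_each_rcu(pos, tbl, hash) {
	/* pos is dereferenced via rht_dereference_bucket(); the
	 * barrier() keeps the compiler from caching the chain head. */
}
```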
-
Committed by Thomas Graf

Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Thomas Graf

Hash the key inside of rhashtable_lookup_compare(), like rhashtable_lookup() does. This allows us to simplify the hashing functions and keep them private. Signed-off-by: Thomas Graf <tgraf@suug.ch> Cc: netfilter-devel@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 11 Dec 2014, 1 commit
-
-
Committed by Daniel Borkmann

This patch effectively reverts commit 500f8087 ("net: ovs: use CRC32 accelerated flow hash if available"), and other remaining arch_fast_hash() users, such as nfsd via commit 6282cd56 ("NFSD: Don't hand out delegations for 30 seconds after recalling them.") where it has been used as a hash function for bloom filtering. While we think that these users are actually not much of a concern, it has been requested to remove the arch_fast_hash() library bits that arose from [1] entirely, as per recent discussion [2]. The main argument is that using it as a hash may introduce bias due to its linearity (see avalanche criterion) and thus makes it less clear (though we tried to document that) when this security/performance trade-off is actually acceptable for a general-purpose library function. Let's therefore avoid any further confusion on this matter and remove it to prevent any future accidental misuse. For the time being, this is going to make hashing of flow keys a bit more expensive in the ovs case, but future work could reevaluate a different hashing discipline. [1] https://patchwork.ozlabs.org/patch/299369/ [2] https://patchwork.ozlabs.org/patch/418756/ Cc: Neil Brown <neilb@suse.de> Cc: Francesco Fusco <fusco@ntop.org> Cc: Jesse Gross <jesse@nicira.com> Cc: Thomas Graf <tgraf@suug.ch> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 25 Nov 2014, 1 commit
-
-
Committed by Thomas Graf

Verify whether both the lock- and RCU-protected iterators see all test entries before and after expanding and shrinking have been performed. Also verify whether the number of entries in the hashtable remains stable during expansion and shrinking. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 14 Nov 2014, 4 commits
-
-
Committed by Thomas Graf

Reallocation is only required for shrinking and expanding; both rely on a mutex for synchronization, and callers of rhashtable_init() are in non-atomic context. Therefore, there is no reason to continue passing allocation hints through the API. Instead, use GFP_KERNEL and add __GFP_NOWARN | __GFP_NORETRY to allow falling back silently to vzalloc() without the OOM killer jumping in, as pointed out by Eric Dumazet and Eric W. Biederman. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
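A sketch of the allocation pattern this describes; the struct layout is abbreviated and hedged:

```c
static struct bucket_table *bucket_table_alloc(size_t nbuckets)
{
	size_t size = sizeof(struct bucket_table) +
		      nbuckets * sizeof(struct rhash_head __rcu *);
	struct bucket_table *tbl;

	/* Quiet, no-retry kmalloc attempt first ... */
	tbl = kzalloc(size, GFP_KERNEL | __GFP_NOWARN | __GFP_NORETRY);
	if (tbl == NULL)
		tbl = vzalloc(size);	/* ... then fall back to vmalloc space */

	return tbl;
}
```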
-
Committed by Herbert Xu

Currently mutex_is_held can only test locks that are global, since it takes no arguments. This prevents rhashtable from being used in places where locks are local, e.g., per-namespace locks. This patch adds a parent field to mutex_is_held and rhashtable_params so that local locks can be used (and tested). Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
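A hedged sketch of what the parent argument enables; my_subsys_is_held() and the per-namespace lock name are illustrative, not real kernel identifiers:

```c
/* The callback receives the parent pointer registered in the params,
 * so it can test a lock that is local to that object. */
static int my_subsys_is_held(void *parent)
{
	struct net *net = parent;

	return lockdep_is_held(&net->my_subsys_lock);	/* hypothetical lock */
}

/* Registration, assuming 'net' and 'params' are in scope: */
params.mutex_is_held = my_subsys_is_held;
params.parent        = net;
```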
-
Committed by Herbert Xu

The rhashtable function mutex_is_held is only used when PROVE_LOCKING is enabled. This patch makes the mutex_is_held field in rhashtable optional, depending on PROVE_LOCKING. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
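Roughly, the params struct then carries the field only under the config option (a sketch, not the verbatim header):

```c
struct rhashtable_params {
	/* ... other fields ... */
#ifdef CONFIG_PROVE_LOCKING
	int	(*mutex_is_held)(void *parent);
	void	*parent;
#endif
};
```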
-
Committed by Herbert Xu

My editor spewed garbage that looked like memory corruption on my screen. It turns out that a number of occurrences of "fi" got turned into a ligature. This patch replaces these ligatures with the ASCII letters "fi". Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 20 Sep 2014, 1 commit
-
-
Committed by Fabian Frederick

linux/log2.h was included twice. Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 04 Sep 2014, 1 commit
-
-
Committed by Ying Xue

Although the rhashtable library allows the user to specify a fairly large size for their hash table, the table may be shrunk to a very small size, HASH_MIN_SIZE (4), after an object is removed from the table for the first time. Subsequently, even if the total number of objects stored in the table stays well below the user's initial setting for a long time, the hash table size is still dynamically adjusted by rhashtable_shrink() or rhashtable_expand() every time an object is inserted or removed. However, as synchronize_rcu() has to be called when the table is shrunk or expanded by those two functions, we should permit the user to set a minimum table size, by configuring the minimum number of shifts according to their specific requirements, avoiding these expensive shrink and expand operations and their synchronize_rcu() calls. Signed-off-by: Ying Xue <ying.xue@windriver.com> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
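A usage sketch of the knob this adds; the .min_shift field name follows this patch, while the surrounding values are illustrative:

```c
struct rhashtable_params params = {
	/* ... key/head offsets, hashfn, etc. ... */
	.min_shift = 8,	/* table never shrinks below 2^8 = 256 buckets */
};
```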
-
- 03 Sep 2014, 1 commit
-
-
Committed by Pablo Neira Ayuso

No need for rht_dereference() from rhashtable_destroy(), since the existing callers don't hold the mutex when invoking this function, namely: 1) Netlink, where this is called in case of memory allocation errors in the initialization path, with no nl_sk_hash_lock held. 2) Netfilter, where this is called from an RCU callback, with no nfnl_lock held either. I think it's reasonable to assume that the caller has to make sure that no hash resizing may happen before releasing the bucket array. Therefore, the caller should be responsible for releasing this in a safe way; document this to make people aware of it. This resolves an RCU lockdep splat in nft_hash:

===============================
[ INFO: suspicious RCU usage. ]
3.16.0+ #178 Not tainted
-------------------------------
lib/rhashtable.c:596 suspicious rcu_dereference_protected() usage!

other info that might help us debug this:

rcu_scheduler_active = 1, debug_locks = 1
1 lock held by ksoftirqd/2/18:
#0: (rcu_callback){......}, at: [<ffffffff810918fd>] rcu_process_callbacks+0x27e/0x4c7

stack backtrace:
CPU: 2 PID: 18 Comm: ksoftirqd/2 Not tainted 3.16.0+ #178
Hardware name: LENOVO 23259H1/23259H1, BIOS G2ET32WW (1.12 ) 05/30/2012
 0000000000000001 ffff88011706bb68 ffffffff8143debc 0000000000000000
 ffff880117062610 ffff88011706bb98 ffffffff81077515 ffff8800ca041a50
 0000000000000004 ffff8800ca386480 ffff8800ca041a00 ffff88011706bbb8
Call Trace:
 [<ffffffff8143debc>] dump_stack+0x4e/0x68
 [<ffffffff81077515>] lockdep_rcu_suspicious+0xfa/0x103
 [<ffffffff81228b1b>] rhashtable_destroy+0x46/0x52
 [<ffffffffa06f21a7>] nft_hash_destroy+0x73/0x82 [nft_hash]

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: Thomas Graf <tgraf@suug.ch>
-
- 26 Aug 2014, 1 commit
-
-
Committed by Geert Uytterhoeven

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Thomas Graf <tgraf@suug.ch> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
-
- 15 Aug 2014, 2 commits
-
-
Committed by Thomas Graf

No need to export rht_obj(); all inner-to-outer object translations occur internally. It was intended to be used with rht_for_each(), which now primarily serves as the iterator for rhashtable_remove_pprev() to effectively flush and free the full table. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Thomas Graf

Properly annotate the next pointers, as access is RCU-protected in the lookup path. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
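Concretely, the chain pointer gains the __rcu annotation, along the lines of:

```c
struct rhash_head {
	struct rhash_head __rcu *next;
};
```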
-
- 03 Aug 2014, 1 commit
-
-
Committed by Thomas Graf

Generic implementation of a resizable, scalable, concurrent hash table based on [0]. The implementation supports both fixed-size keys, specified via an offset and length, and arbitrary keys via the user's own hash and compare functions. Lookups are lockless and protected as RCU read-side critical sections. Automatic growing/shrinking based on user-configurable watermarks is available while allowing concurrent lookups to take place. Objects to be hashed must include a struct rhash_head. The reason for not using the existing struct hlist_head is that expansion and shrinking will have two buckets point to a single entry, which would lead to obscure reverse chaining behaviour. Code includes a boot selftest if CONFIG_TEST_RHASHTABLE is defined. [0] https://www.usenix.org/legacy/event/atc11/tech/final_files/Triplett.pdf Signed-off-by: Thomas Graf <tgraf@suug.ch> Reviewed-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
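A minimal initialization sketch for the API as introduced here; the object layout is illustrative, and jhash (from <linux/jhash.h>) matches the expected hash function signature:

```c
#include <linux/jhash.h>
#include <linux/rhashtable.h>

struct test_obj {
	int			value;	/* the key */
	struct rhash_head	node;	/* must be embedded in the object */
};

struct rhashtable_params params = {
	.nelem_hint	= 8,
	.head_offset	= offsetof(struct test_obj, node),
	.key_offset	= offsetof(struct test_obj, value),
	.key_len	= sizeof(int),
	.hashfn		= jhash,
};
struct rhashtable ht;

/* Initialize, then insert/lookup/remove objects by their embedded node. */
rhashtable_init(&ht, &params);
```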
-