- 29 7月, 2020 1 次提交
-
-
由 Herbert Xu 提交于
This patch restores the RCU marking on bucket_table->buckets as it really does need RCU protection. Its removal had led to a fatal bug. Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 07 6月, 2020 1 次提交
-
-
由 Herbert Xu 提交于
This patch replaces some unnecessary uses of rcu_dereference_raw in the rhashtable code with rcu_dereference_protected. The top-level nested table entry is only marked as RCU because it shares the same type as the tree entries underneath it. So it doesn't need any RCU protection. We also don't need RCU protection when we're freeing a nested RCU table because by this stage we've long passed a memory barrier when anyone could change the nested table. Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 19 6月, 2019 1 次提交
-
-
由 Thomas Gleixner 提交于
Based on 2 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation # extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 4122 file(s). Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NEnrico Weigelt <info@metux.net> Reviewed-by: NKate Stewart <kstewart@linuxfoundation.org> Reviewed-by: NAllison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190604081206.933168790@linutronix.deSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 17 5月, 2019 2 次提交
-
-
由 Herbert Xu 提交于
As cmpxchg is a non-RCU mechanism it will cause sparse warnings when we use it for RCU. This patch adds explicit casts to silence those warnings. This should probably be moved to RCU itself in future. Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Herbert Xu 提交于
The opaque type rhash_lock_head should not be marked with __rcu because it can never be dereferenced. We should apply the RCU marking when we turn it into a pointer which can be dereferenced. This patch does exactly that. This fixes a number of sparse warnings as well as getting rid of some unnecessary RCU checking. Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 13 4月, 2019 5 次提交
-
-
由 NeilBrown 提交于
As reported by Guenter Roeck, the new bit-locking using BIT(1) doesn't work on the m68k architecture. m68k only requires 2-byte alignment for words and longwords, so there is only one unused bit in pointers to structs - We current use two, one for the NULLS marker at the end of the linked list, and one for the bit-lock in the head of the list. The two uses don't need to conflict as we never need the head of the list to be a NULLS marker - the marker is only needed to check if an object has moved to a different table, and the bucket head cannot move. The NULLS marker is only needed in a ->next pointer. As we already have different types for the bucket head pointer (struct rhash_lock_head) and the ->next pointers (struct rhash_head), it is fairly easy to treat the lsb differently in each. So: Initialize buckets heads to NULL, and use the lsb for locking. When loading the pointer from the bucket head, if it is NULL (ignoring the lock big), report as being the expected NULLS marker. When storing a value into a bucket head, if it is a NULLS marker, store NULL instead. And convert all places that used bit 1 for locking, to use bit 0. Fixes: 8f0db018 ("rhashtable: use bit_spin_locks to protect hash bucket.") Reported-by: NGuenter Roeck <linux@roeck-us.net> Tested-by: NGuenter Roeck <linux@roeck-us.net> Signed-off-by: NNeilBrown <neilb@suse.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 NeilBrown 提交于
The only times rht_ptr_locked() is used, it is to store a new value in a bucket-head. This is the only time it makes sense to use it too. So replace it by a function which does the whole task: Sets the lock bit and assigns to a bucket head. Signed-off-by: NNeilBrown <neilb@suse.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 NeilBrown 提交于
Rather than dereferencing a pointer to a bucket and then passing the result to rht_ptr(), we now pass in the pointer and do the dereference in rht_ptr(). This requires that we pass in the tbl and hash as well to support RCU checks, and means that the various rht_for_each functions can expect a pointer that can be dereferenced without further care. There are two places where we dereference a bucket pointer where there is no testable protection - in each case we know that we much have exclusive access without having taken a lock. The previous code used rht_dereference() to pretend that holding the mutex provided protects, but holding the mutex never provides protection for accessing buckets. So instead introduce rht_ptr_exclusive() that can be used when there is known to be exclusive access without holding any locks. Signed-off-by: NNeilBrown <neilb@suse.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 NeilBrown 提交于
With these annotations, the rhashtable now gets no warnings when compiled with "C=1" for sparse checking. Signed-off-by: NNeilBrown <neilb@suse.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Gustavo A. R. Silva 提交于
One of the more common cases of allocation size calculations is finding the size of a structure that has a zero-sized array at the end, along with memory for some number of elements for that array. For example: struct foo { int stuff; struct boo entry[]; }; size = sizeof(struct foo) + count * sizeof(struct boo); instance = kvzalloc(size, GFP_KERNEL); Instead of leaving these open-coded and prone to type mistakes, we can now use the new struct_size() helper: instance = kvzalloc(struct_size(instance, entry, count), GFP_KERNEL); This code was detected with the help of Coccinelle. Signed-off-by: NGustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 08 4月, 2019 4 次提交
-
-
由 NeilBrown 提交于
Native bit_spin_locks are not tracked by lockdep. The bit_spin_locks used for rhashtable buckets are local to the rhashtable implementation, so there is little opportunity for the sort of misuse that lockdep might detect. However locks are held while a hash function or compare function is called, and if one of these took a lock, a misbehaviour is possible. As it is quite easy to add lockdep support this unlikely possibility seems to be enough justification. So create a lockdep class for bucket bit_spin_lock and attach through a lockdep_map in each bucket_table. Without the 'nested' annotation in rhashtable_rehash_one(), lockdep correctly reports a possible problem as this lock is taken while another bucket lock (in another table) is held. This confirms that the added support works. With the correct nested annotation in place, lockdep reports no problems. Signed-off-by: NNeilBrown <neilb@suse.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 NeilBrown 提交于
This patch changes rhashtables to use a bit_spin_lock on BIT(1) of the bucket pointer to lock the hash chain for that bucket. The benefits of a bit spin_lock are: - no need to allocate a separate array of locks. - no need to have a configuration option to guide the choice of the size of this array - locking cost is often a single test-and-set in a cache line that will have to be loaded anyway. When inserting at, or removing from, the head of the chain, the unlock is free - writing the new address in the bucket head implicitly clears the lock bit. For __rhashtable_insert_fast() we ensure this always happens when adding a new key. - even when lockings costs 2 updates (lock and unlock), they are in a cacheline that needs to be read anyway. The cost of using a bit spin_lock is a little bit of code complexity, which I think is quite manageable. Bit spin_locks are sometimes inappropriate because they are not fair - if multiple CPUs repeatedly contend of the same lock, one CPU can easily be starved. This is not a credible situation with rhashtable. Multiple CPUs may want to repeatedly add or remove objects, but they will typically do so at different buckets, so they will attempt to acquire different locks. As we have more bit-locks than we previously had spinlocks (by at least a factor of two) we can expect slightly less contention to go with the slightly better cache behavior and reduced memory consumption. To enhance type checking, a new struct is introduced to represent the pointer plus lock-bit that is stored in the bucket-table. This is "struct rhash_lock_head" and is empty. A pointer to this needs to be cast to either an unsigned lock, or a "struct rhash_head *" to be useful. Variables of this type are most often called "bkt". Previously "pprev" would sometimes point to a bucket, and sometimes a ->next pointer in an rhash_head. As these are now different types, pprev is NULL when it would have pointed to the bucket. In that case, 'blk' is used, together with correct locking protocol. Signed-off-by: NNeilBrown <neilb@suse.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 NeilBrown 提交于
Rather than returning a pointer to a static nulls, rht_bucket_var() now returns NULL if the bucket doesn't exist. This will make the next patch, which stores a bitlock in the bucket pointer, somewhat cleaner. This change involves introducing __rht_bucket_nested() which is like rht_bucket_nested(), but doesn't provide the static nulls, and changing rht_bucket_nested() to call this and possible provide a static nulls - as is still needed for the non-var case. Signed-off-by: NNeilBrown <neilb@suse.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 NeilBrown 提交于
nested_table_alloc() relies on the fact that there is at most one spinlock allocated for every slot in the top level nested table, so it is not possible for two threads to try to allocate the same table at the same time. This assumption is a little fragile (it is not explicit) and is unnecessary as cmpxchg() can be used instead. A future patch will replace the spinlocks by per-bucket bitlocks, and then we won't be able to protect the slot pointer with a spinlock. So replace rcu_assign_pointer() with cmpxchg() - which has equivalent barrier properties. If it the cmp fails, free the table that was just allocated. Signed-off-by: NNeilBrown <neilb@suse.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 22 3月, 2019 3 次提交
-
-
由 NeilBrown 提交于
The pattern set by list.h is that for_each..continue() iterators start at the next entry after the given one, while for_each..from() iterators start at the given entry. The rht_for_each*continue() iterators are documented as though the start at the 'next' entry, but actually start at the given entry, and they are used expecting that behaviour. So fix the documentation and change the names to *from for consistency with list.h Acked-by: NHerbert Xu <herbert@gondor.apana.org.au> Acked-by: NMiguel Ojeda <miguel.ojeda.sandonis@gmail.com> Signed-off-by: NNeilBrown <neilb@suse.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 NeilBrown 提交于
rhashtable_try_insert() currently holds a lock on the bucket in the first table, while also locking buckets in subsequent tables. This is unnecessary and looks like a hold-over from some earlier version of the implementation. As insert and remove always lock a bucket in each table in turn, and as insert only inserts in the final table, there cannot be any races that are not covered by simply locking a bucket in each table in turn. When an insert call reaches that last table it can be sure that there is no matchinf entry in any other table as it has searched them all, and insertion never happens anywhere but in the last table. The fact that code tests for the existence of future_tbl while holding a lock on the relevant bucket ensures that two threads inserting the same key will make compatible decisions about which is the "last" table. This simplifies the code and allows the ->rehash field to be discarded. We still need a way to ensure that a dead bucket_table is never re-linked by rhashtable_walk_stop(). This can be achieved by calling call_rcu() inside the locked region, and checking with rcu_head_after_call_rcu() in rhashtable_walk_stop() to see if the bucket table is empty and dead. Acked-by: NHerbert Xu <herbert@gondor.apana.org.au> Reviewed-by: NPaul E. McKenney <paulmck@linux.ibm.com> Signed-off-by: NNeilBrown <neilb@suse.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Herbert Xu 提交于
As it stands if a shrink is delayed because of an outstanding rehash, we will go into a rescheduling loop without ever doing the rehash. This patch fixes this by still carrying out the rehash and then rescheduling so that we can shrink after the completion of the rehash should it still be necessary. The return value of EEXIST captures this case and other cases (e.g., another thread expanded/rehashed the table at the same time) where we should still proceed with the rehash. Fixes: da20420f ("rhashtable: Add nested tables") Reported-by: NJosh Elsasser <jelsasser@appneta.com> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Tested-by: NJosh Elsasser <jelsasser@appneta.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 22 2月, 2019 1 次提交
-
-
由 Herbert Xu 提交于
The rhashtable_walk_init function has been obsolete for more than two years. This patch finally converts its last users over to rhashtable_walk_enter and removes it. Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
-
- 04 12月, 2018 1 次提交
-
-
由 NeilBrown 提交于
Some users of rhashtables might need to move an object from one table to another - this appears to be the reason for the incomplete usage of NULLS markers. To support these, we store a unique NULLS_MARKER at the end of each chain, and when a search fails to find a match, we check if the NULLS marker found was the expected one. If not, the search may not have examined all objects in the target bucket, so it is repeated. The unique NULLS_MARKER is derived from the address of the head of the chain. As this cannot be derived at load-time the static rhnull in rht_bucket_nested() needs to be initialised at run time. Any caller of a lookup function must still be prepared for the possibility that the object returned is in a different table - it might have been there for some time. Note that this does NOT provide support for other uses of NULLS_MARKERs such as allocating with SLAB_TYPESAFE_BY_RCU or changing the key of an object and re-inserting it in the same table. These could only be done safely if new objects were inserted at the *start* of a hash chain, and that is not currently the case. Signed-off-by: NNeilBrown <neilb@suse.com> Acked-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 23 8月, 2018 2 次提交
-
-
由 Davidlohr Bueso 提交于
rhashtable_init() may fail due to -ENOMEM, thus making the entire api unusable. This patch removes this scenario, however unlikely. In order to guarantee memory allocation, this patch always ends up doing GFP_KERNEL|__GFP_NOFAIL for both the tbl as well as alloc_bucket_spinlocks(). Upon the first table allocation failure, we shrink the size to the smallest value that makes sense and retry with __GFP_NOFAIL semantics. With the defaults, this means that from 64 buckets, we retry with only 4. Any later issues regarding performance due to collisions or larger table resizing (when more memory becomes available) is the least of our problems. Link: http://lkml.kernel.org/r/20180712185241.4017-9-manfred@colorfullife.comSigned-off-by: NDavidlohr Bueso <dbueso@suse.de> Signed-off-by: NManfred Spraul <manfred@colorfullife.com> Acked-by: NHerbert Xu <herbert@gondor.apana.org.au> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Kees Cook <keescook@chromium.org> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Michal Hocko <mhocko@suse.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Davidlohr Bueso 提交于
As of ce91f6ee ("mm: kvmalloc does not fallback to vmalloc for incompatible gfp flags") we can simplify the caller and trust kvzalloc() to just do the right thing. For the case of the GFP_ATOMIC context, we can drop the __GFP_NORETRY flag for obvious reasons, and for the __GFP_NOWARN case, however, it is changed such that the caller passes the flag instead of making bucket_table_alloc() handle it. This slightly changes the gfp flags passed on to nested_table_alloc() as it will now also use GFP_ATOMIC | __GFP_NOWARN. However, I consider this a positive consequence as for the same reasons we want nowarn semantics in bucket_table_alloc(). [manfred@colorfullife.com: commit id extended to 12 digits, line wraps updated] Link: http://lkml.kernel.org/r/20180712185241.4017-8-manfred@colorfullife.comSigned-off-by: NDavidlohr Bueso <dbueso@suse.de> Signed-off-by: NManfred Spraul <manfred@colorfullife.com> Acked-by: NMichal Hocko <mhocko@suse.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Kees Cook <keescook@chromium.org> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 21 8月, 2018 1 次提交
-
-
由 Yue Haibing 提交于
Remove duplicated include. Signed-off-by: NYue Haibing <yuehaibing@huawei.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 19 7月, 2018 1 次提交
-
-
由 Davidlohr Bueso 提交于
rhashtable_init() currently does not take into account the user-passed min_size parameter unless param->nelem_hint is set as well. As such, the default size (number of buckets) will always be HASH_DEFAULT_SIZE even if the smallest allowed size is larger than that. Remediate this by unconditionally calling into rounded_hashtable_size() and handling things accordingly. Signed-off-by: NDavidlohr Bueso <dbueso@suse.de> Acked-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 10 7月, 2018 1 次提交
-
-
由 Taehee Yoo 提交于
rhashtable_free_and_destroy() cancels re-hash deferred work then walks and destroys elements. at this moment, some elements can be still in future_tbl. that elements are not destroyed. test case: nft_rhash_destroy() calls rhashtable_free_and_destroy() to destroy all elements of sets before destroying sets and chains. But rhashtable_free_and_destroy() doesn't destroy elements of future_tbl. so that splat occurred. test script: %cat test.nft table ip aa { map map1 { type ipv4_addr : verdict; elements = { 0 : jump a0, 1 : jump a0, 2 : jump a0, 3 : jump a0, 4 : jump a0, 5 : jump a0, 6 : jump a0, 7 : jump a0, 8 : jump a0, 9 : jump a0, } } chain a0 { } } flush ruleset table ip aa { map map1 { type ipv4_addr : verdict; elements = { 0 : jump a0, 1 : jump a0, 2 : jump a0, 3 : jump a0, 4 : jump a0, 5 : jump a0, 6 : jump a0, 7 : jump a0, 8 : jump a0, 9 : jump a0, } } chain a0 { } } flush ruleset %while :; do nft -f test.nft; done Splat looks like: [ 200.795603] kernel BUG at net/netfilter/nf_tables_api.c:1363! [ 200.806944] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI [ 200.812253] CPU: 1 PID: 1582 Comm: nft Not tainted 4.17.0+ #24 [ 200.820297] Hardware name: To be filled by O.E.M. To be filled by O.E.M./Aptio CRB, BIOS 5.6.5 07/08/2015 [ 200.830309] RIP: 0010:nf_tables_chain_destroy.isra.34+0x62/0x240 [nf_tables] [ 200.838317] Code: 43 50 85 c0 74 26 48 8b 45 00 48 8b 4d 08 ba 54 05 00 00 48 c7 c6 60 6d 29 c0 48 c7 c7 c0 65 29 c0 4c 8b 40 08 e8 58 e5 fd f8 <0f> 0b 48 89 da 48 b8 00 00 00 00 00 fc ff [ 200.860366] RSP: 0000:ffff880118dbf4d0 EFLAGS: 00010282 [ 200.866354] RAX: 0000000000000061 RBX: ffff88010cdeaf08 RCX: 0000000000000000 [ 200.874355] RDX: 0000000000000061 RSI: 0000000000000008 RDI: ffffed00231b7e90 [ 200.882361] RBP: ffff880118dbf4e8 R08: ffffed002373bcfb R09: ffffed002373bcfa [ 200.890354] R10: 0000000000000000 R11: ffffed002373bcfb R12: dead000000000200 [ 200.898356] R13: dead000000000100 R14: ffffffffbb62af38 R15: dffffc0000000000 [ 200.906354] FS: 00007fefc31fd700(0000) GS:ffff88011b800000(0000) knlGS:0000000000000000 [ 200.915533] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 200.922355] CR2: 0000557f1c8e9128 CR3: 0000000106880000 CR4: 00000000001006e0 [ 200.930353] Call Trace: [ 200.932351] ? nf_tables_commit+0x26f6/0x2c60 [nf_tables] [ 200.939525] ? nf_tables_setelem_notify.constprop.49+0x1a0/0x1a0 [nf_tables] [ 200.947525] ? nf_tables_delchain+0x6e0/0x6e0 [nf_tables] [ 200.952383] ? nft_add_set_elem+0x1700/0x1700 [nf_tables] [ 200.959532] ? nla_parse+0xab/0x230 [ 200.963529] ? nfnetlink_rcv_batch+0xd06/0x10d0 [nfnetlink] [ 200.968384] ? nfnetlink_net_init+0x130/0x130 [nfnetlink] [ 200.975525] ? debug_show_all_locks+0x290/0x290 [ 200.980363] ? debug_show_all_locks+0x290/0x290 [ 200.986356] ? sched_clock_cpu+0x132/0x170 [ 200.990352] ? find_held_lock+0x39/0x1b0 [ 200.994355] ? sched_clock_local+0x10d/0x130 [ 200.999531] ? memset+0x1f/0x40 V2: - free all tables requested by Herbert Xu Signed-off-by: NTaehee Yoo <ap420073@gmail.com> Acked-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 03 7月, 2018 1 次提交
-
-
由 Rishabh Bhatnagar 提交于
In file lib/rhashtable.c line 777, skip variable is assigned to itself. The following error was observed: lib/rhashtable.c:777:41: warning: explicitly assigning value of variable of type 'int' to itself [-Wself-assign] error, forbidden warning: rhashtable.c:777 This error was found when compiling with Clang 6.0. Change it to iter->skip. Signed-off-by: NRishabh Bhatnagar <rishabhb@codeaurora.org> Acked-by: NHerbert Xu <herbert@gondor.apana.org.au> Reviewed-by: NNeilBrown <neilb@suse.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 22 6月, 2018 6 次提交
-
-
由 NeilBrown 提交于
Using rht_dereference_bucket() to dereference ->future_tbl looks like a type error, and could be confusing. Using rht_dereference_rcu() to test a pointer for NULL adds an unnecessary barrier - rcu_access_pointer() is preferred for NULL tests when no lock is held. This uses 3 different ways to access ->future_tbl. - if we know the mutex is held, use rht_dereference() - if we don't hold the mutex, and are only testing for NULL, use rcu_access_pointer() - otherwise (using RCU protection for true dereference), use rht_dereference_rcu(). Note that this includes a simplification of the call to rhashtable_last_table() - we don't do an extra dereference before the call any more. Acked-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NNeilBrown <neilb@suse.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 NeilBrown 提交于
Rather than borrowing one of the bucket locks to protect ->future_tbl updates, use cmpxchg(). This gives more freedom to change how bucket locking is implemented. Acked-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NNeilBrown <neilb@suse.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 NeilBrown 提交于
Now that we don't use the hash value or shift in nested_table_alloc() there is room for simplification. We only need to pass a "is this a leaf" flag to nested_table_alloc(), and don't need to track as much information in rht_bucket_nested_insert(). Note there is another minor cleanup in nested_table_alloc() here. The number of elements in a page of "union nested_tables" is most naturally PAGE_SIZE / sizeof(ntbl[0]) The previous code had PAGE_SIZE / sizeof(ntbl[0].bucket) which happens to be the correct value only because the bucket uses all the space in the union. Acked-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NNeilBrown <neilb@suse.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 NeilBrown 提交于
The 'ht' and 'hash' arguments to INIT_RHT_NULLS_HEAD() are no longer used - so drop them. This allows us to also remove the nhash argument from nested_table_alloc(). Acked-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NNeilBrown <neilb@suse.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 NeilBrown 提交于
This "feature" is unused, undocumented, and untested and so doesn't really belong. A patch is under development to properly implement support for detecting when a search gets diverted down a different chain, which the common purpose of nulls markers. This patch actually fixes a bug too. The table resizing allows a table to grow to 2^31 buckets, but the hash is truncated to 27 bits - any growth beyond 2^27 is wasteful an ineffective. This patch results in NULLS_MARKER(0) being used for all chains, and leaves the use of rht_is_a_null() to test for it. Acked-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NNeilBrown <neilb@suse.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 NeilBrown 提交于
Due to the use of rhashtables in net namespaces, rhashtable.h is included in lots of the kernel, so a small changes can required a large recompilation. This makes development painful. This patch splits out rhashtable-types.h which just includes the major type declarations, and does not include (non-trivial) inline code. rhashtable.h is no longer included by anything in the include/ directory. Common include files only include rhashtable-types.h so a large recompilation is only triggered when that changes. Acked-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NNeilBrown <neilb@suse.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 25 4月, 2018 3 次提交
-
-
由 NeilBrown 提交于
When a walk of an rhashtable is interrupted with rhastable_walk_stop() and then rhashtable_walk_start(), the location to restart from is based on a 'skip' count in the current hash chain, and this can be incorrect if insertions or deletions have happened. This does not happen when the walk is not stopped and started as iter->p is a placeholder which is safe to use while holding the RCU read lock. In rhashtable_walk_start() we can revalidate that 'p' is still in the same hash chain. If it isn't then the current method is still used. With this patch, if a rhashtable walker ensures that the current object remains in the table over a stop/start period (possibly by elevating the reference count if that is sufficient), it can be sure that a walk will not miss objects that were in the hashtable for the whole time of the walk. rhashtable_walk_start() may not find the object even though it is still in the hashtable if a rehash has moved it to a new table. In this case it will (eventually) get -EAGAIN and will need to proceed through the whole table again to be sure to see everything at least once. Acked-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NNeilBrown <neilb@suse.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 NeilBrown 提交于
The documentation claims that when rhashtable_walk_start_check() detects a resize event, it will rewind back to the beginning of the table. This is not true. We need to set ->slot and ->skip to be zero for it to be true. Acked-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NNeilBrown <neilb@suse.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 NeilBrown 提交于
Neither rhashtable_walk_enter() or rhltable_walk_enter() sleep, though they do take a spinlock without irq protection. So revise the comments to accurately state the contexts in which these functions can be called. Acked-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NNeilBrown <neilb@suse.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 01 4月, 2018 1 次提交
-
-
由 Eric Dumazet 提交于
Rehashing and destroying large hash table takes a lot of time, and happens in process context. It is safe to add cond_resched() in rhashtable_rehash_table() and rhashtable_free_and_destroy() Signed-off-by: NEric Dumazet <edumazet@google.com> Acked-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 07 3月, 2018 1 次提交
-
-
由 Paul Blakey 提交于
When inserting duplicate objects (those with the same key), current rhlist implementation messes up the chain pointers by updating the bucket pointer instead of prev next pointer to the newly inserted node. This causes missing elements on removal and travesal. Fix that by properly updating pprev pointer to point to the correct rhash_head next pointer. Issue: 1241076 Change-Id: I86b2c140bcb4aeb10b70a72a267ff590bb2b17e7 Fixes: ca26893f ('rhashtable: Add rhlist interface') Signed-off-by: NPaul Blakey <paulb@mellanox.com> Acked-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 11 12月, 2017 3 次提交
-
-
由 Tom Herbert 提交于
To allocate the array of bucket locks for the hash table we now call library function alloc_bucket_spinlocks. This function is based on the old alloc_bucket_locks in rhashtable and should produce the same effect. Signed-off-by: NTom Herbert <tom@quantonium.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Tom Herbert 提交于
This function is like rhashtable_walk_next except that it only returns the current element in the inter and does not advance the iter. This patch also creates __rhashtable_walk_find_next. It finds the next element in the table when the entry cached in iter is NULL or at the end of a slot. __rhashtable_walk_find_next is called from rhashtable_walk_next and rhastable_walk_peek. end_of_table is an added field to the iter structure. This indicates that the end of table was reached (walker.tbl being NULL is not a sufficient condition for end of table). Signed-off-by: NTom Herbert <tom@quantonium.net> Acked-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Tom Herbert 提交于
Most callers of rhashtable_walk_start don't care about a resize event which is indicated by a return value of -EAGAIN. So calls to rhashtable_walk_start are wrapped wih code to ignore -EAGAIN. Something like this is common: ret = rhashtable_walk_start(rhiter); if (ret && ret != -EAGAIN) goto out; Since zero and -EAGAIN are the only possible return values from the function this check is pointless. The condition never evaluates to true. This patch changes rhashtable_walk_start to return void. This simplifies code for the callers that ignore -EAGAIN. For the few cases where the caller cares about the resize event, particularly where the table can be walked in mulitple parts for netlink or seq file dump, the function rhashtable_walk_start_check has been added that returns -EAGAIN on a resize event. Signed-off-by: NTom Herbert <tom@quantonium.net> Acked-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 20 9月, 2017 1 次提交
-
-
由 Andreas Gruenbacher 提交于
Clarify that rhashtable_walk_{stop,start} will not reset the iterator to the beginning of the hash table. Confusion between rhashtable_walk_enter and rhashtable_walk_start has already lead to a bug. Signed-off-by: NAndreas Gruenbacher <agruenba@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-