1. 11 Jul 2016, 1 commit
  2. 24 Jun 2016, 1 commit
    • netfilter: nft_hash: support deletion of inactive elements · 8eee54be
      Pablo Neira Ayuso authored
      New elements are inactive in the preparation phase, and their
      NFT_SET_ELEM_BUSY_MASK flag is set.
      
      This busy flag doesn't allow us to delete them from the same transaction,
      in a sequence like:
      
      	begin transaction
      	add element X
      	delete element X
      	end transaction
      
      This sequence is valid and may be triggered by robots. To resolve this
      problem, allow deactivating elements that are active in the current
      generation (i.e. those that have just been added in this batch).
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
  3. 15 Jun 2016, 1 commit
    • netfilter: nf_tables: reject loops from set element jump to chain · 8588ac09
      Pablo Neira Ayuso authored
      Liping Zhang says:
      
      "Users may add such a wrong nft rules successfully, which will cause an
      endless jump loop:
      
        # nft add rule filter test tcp dport vmap {1: jump test}
      
      This is because before we commit, the element in the current anonymous
      set is inactive, so ops->walk will skip this element and miss the
      validate check."
      
      To resolve this problem, this patch passes the generation mask to the
      walk function through the iter container structure depending on the code
      path:
      
      1) If we're dumping the elements, then we have to check if the element
         is active in the current generation. Thus, we check for the current
         bit in the genmask.
      
      2) If we're checking for loops, then we have to check if the element is
         active in the next generation, as we're in the middle of a
         transaction. Thus, we check for the next bit in the genmask.
      
      Based on original patch from Liping Zhang.
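
      A minimal sketch of this choice, assuming a hypothetical helper built on the
      existing nft_genmask_cur()/nft_genmask_next() helpers (illustrative only, not
      the upstream wiring):

      	#include <net/netfilter/nf_tables.h>

      	/* Pick which generation bit the walker tests, depending on the
      	 * code path described above. */
      	static u8 iter_genmask(const struct net *net, bool checking_loops)
      	{
      		/* loop check runs mid-transaction: test the next bit */
      		if (checking_loops)
      			return nft_genmask_next(net);
      		/* dump path: only currently active elements matter */
      		return nft_genmask_cur(net);
      	}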
      Reported-by: Liping Zhang <liping.zhang@spreadtrum.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Tested-by: Liping Zhang <liping.zhang@spreadtrum.com>
  4. 05 Apr 2016, 1 commit
  5. 13 Apr 2015, 4 commits
  6. 08 Apr 2015, 2 commits
  7. 01 Apr 2015, 1 commit
  8. 26 Mar 2015, 7 commits
  9. 25 Mar 2015, 2 commits
  10. 21 Mar 2015, 1 commit
    • netfilter: Convert nft_hash to inlined rhashtable · fa377321
      Herbert Xu authored
      This patch converts nft_hash to the inlined rhashtable interface.
      
      This patch also replaces the call to rhashtable_lookup_compare with
      a straight rhashtable_lookup_fast because it's simply doing a memcmp
      (in fact nft_hash_lookup already uses memcmp instead of nft_data_cmp).
      
      Furthermore, the compare function is only meant to compare; it is not
      supposed to have side effects. The current side-effect code can simply
      be moved into nft_hash_get.
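
      A hedged usage sketch of the inlined interface with rhashtable_lookup_fast();
      the element structure and parameters below are illustrative, not the actual
      nft_hash definitions:

      	#include <linux/rhashtable.h>

      	struct demo_elem {			/* hypothetical element type */
      		struct rhash_head	node;
      		u32			key;
      	};

      	static const struct rhashtable_params demo_params = {
      		.head_offset	= offsetof(struct demo_elem, node),
      		.key_offset	= offsetof(struct demo_elem, key),
      		.key_len	= sizeof(u32),	/* fixed-size key, plain memcmp() */
      	};

      	static struct demo_elem *demo_lookup(struct rhashtable *ht, const u32 *key)
      	{
      		/* no compare callback and no side effects in the lookup path */
      		return rhashtable_lookup_fast(ht, key, demo_params);
      	}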
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  11. 13 Mar 2015, 1 commit
  12. 28 Feb 2015, 1 commit
  13. 05 Feb 2015, 1 commit
  14. 04 Jan 2015, 4 commits
    • rhashtable: Per bucket locks & deferred expansion/shrinking · 97defe1e
      Thomas Graf authored
      Introduces an array of spinlocks to protect bucket mutations. The number
      of spinlocks per CPU is configurable and selected based on the hash of
      the bucket. This allows for parallel insertions and removals of entries
      which do not share a lock.
      
      The patch also defers expansion and shrinking to a worker queue which
      allows insertion and removal from atomic context. Insertions and
      deletions may occur in parallel to it and are only held up briefly
      while the particular bucket is linked or unzipped.
      
      Mutations of the bucket table pointer are protected by a new mutex; read
      access is RCU protected.
      
      In the event of an expansion or shrinking, the newly allocated bucket
      table is exposed as a so-called future table as soon as the resize
      process starts. Lookups, deletions, and insertions will briefly use both
      tables. The future table becomes the main table after an RCU grace
      period and the initial linking of the old to the new table has been
      performed. Optimization of the chains to make use of the new number of
      buckets follows only once the new table is in use.
      
      The side effect of this is that during that RCU grace period, a bucket
      traversal using any rht_for_each() variant on the main table will not see
      any insertions performed during the RCU grace period which would at that
      point land in the future table. The lookup will see them as it searches
      both tables if needed.
      
      Having multiple insertions and removals occur in parallel requires nelems
      to become an atomic counter.
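
      A hedged sketch of the bucket-lock selection described above; the field and
      function names are illustrative and the real bucket table layout differs:

      	#include <linux/spinlock.h>

      	struct demo_bucket_table {
      		unsigned int	size;		/* number of buckets */
      		unsigned int	locks_mask;	/* nr_locks - 1, nr_locks a power of two */
      		spinlock_t	*locks;		/* one lock covers several buckets */
      	};

      	/* Two mutations hitting different buckets usually map to different
      	 * locks and can therefore proceed in parallel. */
      	static inline spinlock_t *demo_bucket_lock(const struct demo_bucket_table *tbl,
      						   u32 hash)
      	{
      		return &tbl->locks[hash & tbl->locks_mask];
      	}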
      Signed-off-by: Thomas Graf <tgraf@suug.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • nft_hash: Remove rhashtable_remove_pprev() · 897362e4
      Thomas Graf authored
      The removal function of nft_hash currently stores a reference to the
      previous element during lookup which is used to optimize removal later
      on. This was possible because a lock is held throughout calling
      rhashtable_lookup() and rhashtable_remove().
      
      With the introduction of deferred table resizing in parallel to lookups
      and insertions, the nftables lock will no longer synchronize all table
      mutations and the stored pprev may become invalid.
      
      Removing this optimization makes removal slightly more expensive on
      average but allows taking the resize cost out of the insert and
      remove path.
      Signed-off-by: Thomas Graf <tgraf@suug.ch>
      Cc: netfilter-devel@vger.kernel.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • rhashtable: Convert bucket iterators to take table and index · 88d6ed15
      Thomas Graf authored
      This patch is in preparation to introduce per bucket spinlocks. It
      extends all iterator macros to take the bucket table and bucket
      index. It also introduces a new rht_dereference_bucket() to
      handle protected accesses to buckets.
      
      It introduces a barrier() to the RCU iterators to prevent the compiler
      from caching the first element.
      
      The lockdep verifier is introduced as a stub which always succeeds; it is
      properly implemented in the next patch, when the locks are introduced.
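
      A hedged usage sketch of the extended iterators, assuming the post-patch form
      rht_for_each_entry(tpos, pos, tbl, hash, member); the element struct is
      illustrative and the caller is assumed to hold the bucket lock:

      	#include <linux/rhashtable.h>

      	struct demo_elem {			/* hypothetical element type */
      		struct rhash_head	node;
      		u32			key;
      	};

      	static struct demo_elem *demo_bucket_find(struct bucket_table *tbl,
      						  unsigned int hash, u32 key)
      	{
      		struct demo_elem *elem;
      		struct rhash_head *pos;

      		/* the bucket table and bucket index are now passed explicitly */
      		rht_for_each_entry(elem, pos, tbl, hash, node) {
      			if (elem->key == key)
      				return elem;
      		}
      		return NULL;
      	}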
      Signed-off-by: Thomas Graf <tgraf@suug.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • rhashtable: Do hashing inside of rhashtable_lookup_compare() · 8d24c0b4
      Thomas Graf authored
      Hash the key inside of rhashtable_lookup_compare() like
      rhashtable_lookup() does. This makes it possible to simplify the hashing
      functions and keep them private.
      Signed-off-by: Thomas Graf <tgraf@suug.ch>
      Cc: netfilter-devel@vger.kernel.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
  15. 14 Nov 2014, 3 commits
  16. 03 Sep 2014, 1 commit
    • netfilter: nft_hash: no need for rcu in the hash set destroy path · 39f39016
      Pablo Neira Ayuso authored
      The sets are released from the rcu callback, after the rule is removed
      from the chain list, which implies that nfnetlink cannot update the
      hashes (thus, no resizing may occur) and no packets are walking on the
      set anymore.
      
      This resolves a lockdep splat in the nft_hash_destroy() path since the
      nfnl mutex is not held there.
      
      ===============================
      [ INFO: suspicious RCU usage. ]
      3.16.0-rc2+ #168 Not tainted
      -------------------------------
      net/netfilter/nft_hash.c:362 suspicious rcu_dereference_protected() usage!
      
      other info that might help us debug this:
      
      rcu_scheduler_active = 1, debug_locks = 1
      1 lock held by ksoftirqd/0/3:
       #0:  (rcu_callback){......}, at: [<ffffffff81096393>] rcu_process_callbacks+0x27e/0x4c7
      
      stack backtrace:
      CPU: 0 PID: 3 Comm: ksoftirqd/0 Not tainted 3.16.0-rc2+ #168
      Hardware name: LENOVO 23259H1/23259H1, BIOS G2ET32WW (1.12 ) 05/30/2012
       0000000000000001 ffff88011769bb98 ffffffff8142c922 0000000000000006
       ffff880117694090 ffff88011769bbc8 ffffffff8107c3ff ffff8800cba52400
       ffff8800c476bea8 ffff8800c476bea8 ffff8800cba52400 ffff88011769bc08
      Call Trace:
       [<ffffffff8142c922>] dump_stack+0x4e/0x68
       [<ffffffff8107c3ff>] lockdep_rcu_suspicious+0xfa/0x103
       [<ffffffffa079931e>] nft_hash_destroy+0x50/0x137 [nft_hash]
       [<ffffffffa078cd57>] nft_set_destroy+0x11/0x2a [nf_tables]
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Acked-by: Thomas Graf <tgraf@suug.ch>
  17. 03 Aug 2014, 1 commit
  18. 05 Jun 2014, 1 commit
  19. 03 Apr 2014, 2 commits
    • netfilter: nft_hash: use set global element counter instead of private one · 2c96c25d
      Patrick McHardy authored
      Now that nf_tables performs global accounting of set elements, a private
      counter is not needed in the hash type anymore.
      Signed-off-by: Patrick McHardy <kaber@trash.net>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: nf_tables: implement proper set selection · c50b960c
      Patrick McHardy authored
      The current set selection simply chooses the first set type that provides
      the requested features, which always results in the rbtree being chosen
      by virtue of being the first set in the list.
      
      What we actually want to do is choose the implementation that can provide
      the requested features and is optimal from either a performance or memory
      perspective depending on the characteristics of the elements and the
      preferences specified by the user.
      
      The elements are not known when creating a set. Even if we provided them
      for anonymous (literal) sets, we'd still have standalone sets where the
      elements are not known in advance. We therefore need an abstract
      description of the data characteristics.
      
      The kernel already knows the size of the key, this patch starts by
      introducing a nested set description which so far contains only the maximum
      amount of elements. Based on this the set implementations are changed to
      provide an estimate of the required amount of memory and the lookup
      complexity class.
      
      The set ops have a new callback ->estimate() that is invoked during set
      selection. It receives a structure containing the attributes known to the
      kernel and is supposed to populate a struct nft_set_estimate with the
      complexity class and, in case the size is known, the complete amount of
      memory required, or the amount of memory required per element otherwise.
      
      Based on the policy specified by the user (performance/memory, defaulting
      to performance) the kernel will then select the best suited implementation.
      
      Even if the set implementation would allow adding more than the specified
      maximum number of elements, the limit is enforced, since implementations
      might not be able to hold more than the maximum based on which they were
      selected.
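
      A hedged sketch of an ->estimate() callback along these lines; the element
      layout is hypothetical and the exact upstream signature may differ slightly:

      	#include <net/netfilter/nf_tables.h>

      	struct demo_hash_elem {			/* hypothetical element layout */
      		struct demo_hash_elem	*next;
      		u8			key[];
      	};

      	static bool demo_hash_estimate(const struct nft_set_desc *desc, u32 features,
      				       struct nft_set_estimate *est)
      	{
      		if (desc->size)		/* maximum number of elements is known */
      			est->size = desc->size *
      				    (sizeof(struct demo_hash_elem) + desc->klen);
      		else			/* unknown: report the per-element cost */
      			est->size = sizeof(struct demo_hash_elem) + desc->klen;

      		est->class = NFT_SET_CLASS_O_1;	/* average-case O(1) lookups */
      		return true;
      	}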
      Signed-off-by: Patrick McHardy <kaber@trash.net>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
  20. 19 Mar 2014, 1 commit
  21. 07 Mar 2014, 1 commit
    • netfilter: nft_hash: bug fixes and resizing · ce6eb0d7
      Patrick McHardy authored
      The hash set type is very broken and was never meant to be merged in this
      state. Missing RCU synchronization on element removal, leaking chain
      refcounts when used as a verdict map, races during lookups, a fixed table
      size are probably just some of the problems. Luckily it is currently
      never chosen by the kernel when the rbtree type is also available.
      
      Rewrite it to be usable.
      
      The new implementation supports automatic hash table resizing using RCU,
      based on Paul McKenney's and Josh Triplett's algorithm "Optimized Resizing
      For RCU-Protected Hash Tables" described in [1].
      
      Resizing doesn't require a second list head in the elements; it works by
      choosing a hash function that remaps elements to a predictable set of
      buckets, only resizing by integral factors, and
      
      - during expansion: linking new buckets to the old bucket that contains
        elements for any of the new buckets, thereby creating imprecise chains,
        then incrementally separating the elements until the new buckets only
        contain elements that hash directly to them.
      
      - during shrinking: linking the hash chains of all old buckets that hash
        to the same new bucket to form a single chain.
      
      Expansion requires at most as many grace periods as there are elements in
      the longest hash chain; shrinking requires a single grace period.
      
      Due to the requirement of having hash chains/elements linked to multiple
      buckets during resizing, hand-rolled singly linked lists are used instead
      of the existing list helpers, which don't support this in a clean fashion.
      As a side effect, the amount of memory required per element is reduced by
      one pointer.
      
      Expansion is triggered when the load factor exceeds 75%, shrinking when
      the load factor goes below 30%. Both operations are allowed to fail and
      will be retried on the next insertion or removal if their respective
      conditions still hold.
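
      These trigger conditions boil down to simple integer arithmetic; a hedged
      sketch with illustrative names (the element and bucket counts would come
      from the hash table state):

      	#include <linux/types.h>

      	/* load factor above 75%: schedule an expansion */
      	static bool demo_needs_expansion(unsigned int nelems, unsigned int nbuckets)
      	{
      		return nelems > nbuckets * 3 / 4;
      	}

      	/* load factor below 30%: schedule a shrink */
      	static bool demo_needs_shrinking(unsigned int nelems, unsigned int nbuckets)
      	{
      		return nelems * 10 < nbuckets * 3;
      	}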
      
      [1] http://dl.acm.org/citation.cfm?id=2002181.2002192

      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
      Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Patrick McHardy <kaber@trash.net>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
  22. 06 Jan 2014, 1 commit
  23. 20 Dec 2013, 1 commit