Commits · 4152d146ee2169653297e03b9fa2e0f476923959 · openeuler / Kernel

02 Jun, 2020 3 commits

net: remove indirect block netdev event registration · 709ffbe1

Committed by Pablo Neira Ayuso on May 29, 2020

Drivers do not register to netdev events to set up indirect blocks
anymore. Remove __flow_indr_block_cb_register() and
__flow_indr_block_cb_unregister().

The frontends set up the callbacks through flow_indr_dev_setup_block().
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

709ffbe1

net: use flow_indr_dev_setup_offload() · 0fdcf78d

Committed by Pablo Neira Ayuso on May 29, 2020

Update existing frontends to use flow_indr_dev_setup_offload().

This new function must be called if ->ndo_setup_tc is unset to deal
with tunnel devices.

If there is no driver that is subscribed to new tunnel device
flow_block bindings, then this function bails out with EOPNOTSUPP.

If the driver module is removed, the ->cleanup() callback removes the
entries that belong to this tunnel device. This cleanup procedure is
triggered when the device unregisters the tunnel device offload handler.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

0fdcf78d

netfilter: nf_flowtable: expose nf_flow_table_gc_cleanup() · a8284c68

Committed by Pablo Neira Ayuso on May 29, 2020

This function schedules the flow teardown state and it forces a gc run.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

a8284c68

28 May, 2020 9 commits

netfilter: nf_tables: skip flowtable hooknum and priority on device updates · 5b6743fb

Committed by Pablo Neira Ayuso on May 23, 2020

On device updates, the hooknum and priority attributes are not required.
This patch makes these two netlink attributes optional.

Moreover, bail out with EOPNOTSUPP if userspace tries to update the
hooknum and priority for existing flowtables.

While at it, turn EINVAL into EOPNOTSUPP in case the hooknum is not
ingress. EINVAL is reserved for missing netlink attributes / malformed
netlink messages.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

5b6743fb

netfilter: nf_tables: allow to register flowtable with no devices · 05abe445

Committed by Pablo Neira Ayuso on May 20, 2020

A flowtable might be composed of dynamic interfaces only. Such dynamic
interfaces might show up at a later stage. This patch allows users to
register a flowtable with no devices. Once the dynamic interface becomes
available, the user adds the dynamic devices to the flowtable.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

05abe445

netfilter: nf_tables: delete devices from flowtable · abadb2f8

Committed by Pablo Neira Ayuso on May 20, 2020

This patch allows users to delete devices from existing flowtables.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

abadb2f8

netfilter: nf_tables: add devices to existing flowtable · 78d9f48f

Committed by Pablo Neira Ayuso on May 20, 2020

This patch allows users to add devices to an existing flowtable.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

78d9f48f

netfilter: nf_tables: pass hook list to flowtable event notifier · c42d8bda

Committed by Pablo Neira Ayuso on May 20, 2020

Update the flowtable netlink notifier to take the list of hooks as input.
This allows reusing this function in incremental flowtable hook updates.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

c42d8bda

netfilter: nf_tables: add nft_flowtable_hooks_destroy() · 389a2cbc

Committed by Pablo Neira Ayuso on May 20, 2020

This patch adds a helper function to destroy the flowtable hooks.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

389a2cbc

netfilter: nf_tables: pass hook list to nft_{un,}register_flowtable_net_hooks() · f9382669

Committed by Pablo Neira Ayuso on May 19, 2020

This patch prepares for incremental flowtable hook updates.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

f9382669

netfilter: nf_tables: generalise flowtable hook parsing · d9246a53

Committed by Pablo Neira Ayuso on May 20, 2020

Update nft_flowtable_parse_hook() to take the flowtable hook list as
parameter. This allows reusing this function to update the hooks.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

d9246a53

netfilter: ctnetlink: add kernel side filtering for dump · cb8aa9a3

Committed by Romain Bellan on May 04, 2020

Conntrack dump does not support kernel side filtering (only get exists,
but it returns only one entry, and the user has to give a full valid
tuple).

It means that userspace has to implement filtering after receiving many
irrelevant entries, consuming resources (the conntrack table is sometimes
very large, much larger than a routing table for example).

This patch adds filtering on the kernel side. To achieve this goal, we:

 * Add a new CTA_FILTER netlink attribute, actually a flag list to
   parameterize filtering
 * Convert some *nlattr_to_tuple() functions, to allow a partial parsing
   of CTA_TUPLE_ORIG and CTA_TUPLE_REPLY (so nf_conntrack_tuple is not
   fully set)

Filtering is now possible on:
 * IP SRC/DST values
 * Ports for TCP and UDP flows
 * ICMP(v6) codes, types and IDs

Filtering is done as an "AND" operator. For example, when flags
PROTO_SRC_PORT, PROTO_NUM and IP_SRC are set, only entries matching all
values are dumped.

Changes since v1:
  Set NLM_F_DUMP_FILTERED in nlm flags if entries are filtered

Changes since v2:
  Move several constants to nf_internals.h
  Move a fix on netlink values check in a separate patch
  Add a check on not-supported flags
  Return EOPNOTSUPP if CTA_FILTER is set in ctnetlink_flush_conntrack
  (not yet implemented)
  Code style issues

Changes since v3:
  Fix compilation warning reported by kbuild test robot

Changes since v4:
  Fix a regression introduced in v3 (returned EINVAL for valid netlink
  messages without CTA_MARK)

Changes since v5:
  Change definition of CTA_FILTER_F_ALL
  Fix a regression when CTA_TUPLE_ZONE is not set
Signed-off-by: Romain Bellan <romain.bellan@wifirst.fr>
Signed-off-by: Florent Fourcot <florent.fourcot@wifirst.fr>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

cb8aa9a3
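
The "AND" semantics described for cb8aa9a3 can be illustrated with a small userspace sketch. The flag names, `struct flow_key` and `filter_match()` below are hypothetical and are not the kernel's CTA_FILTER definitions; they only show how a flag list can select which fields of a partially filled tuple take part in the comparison.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits; the real filter flags are defined by the kernel. */
enum {
    FILTER_F_IP_SRC    = 1u << 0,
    FILTER_F_IP_DST    = 1u << 1,
    FILTER_F_PROTO_NUM = 1u << 2,
    FILTER_F_SRC_PORT  = 1u << 3,
    FILTER_F_DST_PORT  = 1u << 4,
};

/* Simplified stand-in for a conntrack tuple (IPv4 only). */
struct flow_key {
    uint32_t ip_src, ip_dst;
    uint8_t  proto;
    uint16_t src_port, dst_port;
};

/*
 * Return true only if every field selected in 'flags' is equal in
 * 'entry' and 'want'. Fields whose flag is not set are ignored, so
 * 'want' may be only partially filled in.
 */
static bool filter_match(const struct flow_key *entry,
                         const struct flow_key *want, unsigned int flags)
{
    if ((flags & FILTER_F_IP_SRC) && entry->ip_src != want->ip_src)
        return false;
    if ((flags & FILTER_F_IP_DST) && entry->ip_dst != want->ip_dst)
        return false;
    if ((flags & FILTER_F_PROTO_NUM) && entry->proto != want->proto)
        return false;
    if ((flags & FILTER_F_SRC_PORT) && entry->src_port != want->src_port)
        return false;
    if ((flags & FILTER_F_DST_PORT) && entry->dst_port != want->dst_port)
        return false;
    return true;
}
```

With no flags set, every entry matches, which corresponds to an unfiltered dump in this sketch.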

27 May, 2020 3 commits

netfilter: nf_conntrack_pptp: fix compilation warning with W=1 build · 4946ea5c

Committed by Pablo Neira Ayuso on May 27, 2020

>> include/linux/netfilter/nf_conntrack_pptp.h:13:20: warning: 'const' type qualifier on return type has no effect [-Wignored-qualifiers]
extern const char *const pptp_msg_name(u_int16_t msg);
^~~~~~
Reported-by: kbuild test robot <lkp@intel.com>
Fixes: 4c559f15 ("netfilter: nf_conntrack_pptp: prevent buffer overflows in debug code")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

4946ea5c

netfilter: conntrack: comparison of unsigned in cthelper confirmation · 94945ad2

Committed by Pablo Neira Ayuso on May 27, 2020

net/netfilter/nf_conntrack_core.c: In function nf_confirm_cthelper:
net/netfilter/nf_conntrack_core.c:2117:15: warning: comparison of unsigned expression in < 0 is always false [-Wtype-limits]
 2117 |   if (protoff < 0 || (frag_off & htons(~0x7)) != 0)
      |               ^

ipv6_skip_exthdr() returns a signed integer.
Reported-by: Colin Ian King <colin.king@canonical.com>
Fixes: 703acd70 ("netfilter: nfnetlink_cthelper: unbreak userspace helper support")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

94945ad2
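
The warning fixed in 94945ad2 comes from storing a possibly negative return value in an unsigned variable. A minimal userspace sketch of the effect follows; `skip_hdr()` is a hypothetical stand-in for a function that, like ipv6_skip_exthdr(), returns a signed integer that can be negative on failure.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical stand-in: returns an offset, or -1 on failure. */
static int skip_hdr(bool ok)
{
    return ok ? 40 : -1;
}

/*
 * Buggy pattern: the result is kept in an unsigned variable.
 * A failure value of -1 wraps to UINT_MAX, so a later
 * "protoff < 0" test can never be true.
 */
static unsigned int offset_unsigned(bool ok)
{
    unsigned int protoff = (unsigned int)skip_hdr(ok);

    return protoff;
}

/* Fixed pattern: keep the result signed so the error check can fire. */
static bool skip_failed(bool ok)
{
    int protoff = skip_hdr(ok);

    return protoff < 0;
}
```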

netfilter: conntrack: Pass value of ctinfo to __nf_conntrack_update · 46c1e062

Committed by Nathan Chancellor on May 27, 2020

Clang warns:

net/netfilter/nf_conntrack_core.c:2068:21: warning: variable 'ctinfo' is
uninitialized when used here [-Wuninitialized]
        nf_ct_set(skb, ct, ctinfo);
                           ^~~~~~
net/netfilter/nf_conntrack_core.c:2024:2: note: variable 'ctinfo' is
declared here
        enum ip_conntrack_info ctinfo;
        ^
1 warning generated.

nf_conntrack_update was split up into nf_conntrack_update and
__nf_conntrack_update, where the assignment of ctinfo is in
nf_conntrack_update but it is used in __nf_conntrack_update.

Pass the value of ctinfo from nf_conntrack_update to
__nf_conntrack_update so that uninitialized memory is not used
and everything works properly.

Fixes: ee04805f ("netfilter: conntrack: make conntrack userspace helpers work again")
Link: https://github.com/ClangBuiltLinux/linux/issues/1039
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

46c1e062

26 May, 2020 4 commits

netfilter: nfnetlink_cthelper: unbreak userspace helper support · 703acd70

Committed by Pablo Neira Ayuso on May 24, 2020

Restore helper data size initialization and fix memcopy of the helper
data size.

Fixes: 157ffffe ("netfilter: nfnetlink_cthelper: reject too large userspace allocation requests")
Reviewed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

703acd70

netfilter: conntrack: make conntrack userspace helpers work again · ee04805f

Committed by Pablo Neira Ayuso on May 24, 2020

Florian Westphal says:

"Problem is that after the helper hook was merged back into the confirm
one, the queueing itself occurs from the confirm hook, i.e. we queue
from the last netfilter callback in the hook-list.

Therefore, on return, the packet bypasses the confirm action and the
connection is never committed to the main conntrack table.

To fix this there are several ways:
1. revert the 'Fixes' commit and have an extra helper hook again.
   Works, but has the drawback of adding another indirect call for
   everyone.

2. Special case this: split the hooks only when userspace helper
   gets added, so queueing occurs at a lower priority again,
   and normal enqueue reinject would eventually call the last hook.

3. Extend the existing nf_queue ct update hook to allow a forced
   confirmation (plus run the seqadj code).

This goes for 3)."

Fixes: 827318fe ("netfilter: conntrack: remove helper hook again")
Reviewed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

ee04805f

netfilter: nf_conntrack_pptp: prevent buffer overflows in debug code · 4c559f15

Committed by Pablo Neira Ayuso on May 14, 2020

Dan Carpenter says: "Smatch complains that the value for "cmd" comes
from the network and can't be trusted."

Add pptp_msg_name() helper function that checks for the array boundary.

Fixes: f09943fe ("[NETFILTER]: nf_conntrack/nf_nat: add PPTP helper port")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

4c559f15
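
The idea behind the boundary check in 4c559f15 can be sketched in userspace as follows. The table contents and the names `msg_names`, `MSG_MAX` and `msg_name()` are hypothetical simplifications, not the kernel's actual PPTP message table; the point is only that an index taken from the network is validated before it is used to index the array.

```c
#include <stdint.h>

/* Hypothetical name table; entry 0 doubles as the fallback. */
static const char *const msg_names[] = {
    "UNKNOWN_MESSAGE",
    "START_SESSION_REQUEST",
    "START_SESSION_REPLY",
    "STOP_SESSION_REQUEST",
};

/* Highest valid index into msg_names[]. */
#define MSG_MAX (sizeof(msg_names) / sizeof(msg_names[0]) - 1)

/*
 * Map an untrusted message id to a printable name. Out-of-range ids
 * fall back to entry 0 instead of reading past the end of the array.
 */
static const char *msg_name(uint16_t msg)
{
    if (msg > MSG_MAX)
        return msg_names[0];

    return msg_names[msg];
}
```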

netfilter: ipset: Fix subcounter update skip · a164b95a

Committed by Phil Sutter on May 14, 2020

If IPSET_FLAG_SKIP_SUBCOUNTER_UPDATE is set, user requested to not
update counters in sub sets. Therefore IPSET_FLAG_SKIP_COUNTER_UPDATE
must be set, not unset.

Fixes: 6e01781d ("netfilter: ipset: set match: add support to match the counters")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

a164b95a

12 May, 2020 2 commits

netfilter: nft_set_rbtree: Add missing expired checks · 340eaff6

Committed by Phil Sutter on May 11, 2020

Expired intervals would still match and be dumped to user space until
garbage collection wiped them out. Make sure they stop matching and
disappear (from users' perspective) as soon as they expire.

Fixes: 8d8540c4 ("netfilter: nft_set_rbtree: add timeout support")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

340eaff6
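
The behaviour described for 340eaff6 (expired intervals stop matching before garbage collection removes them) can be sketched as below. This is a userspace illustration that uses a plain array scan rather than an rbtree; `struct interval`, `interval_expired()` and `interval_lookup()` are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct interval {
    uint32_t start, end;   /* inclusive range */
    uint64_t expires;      /* absolute expiry time, 0 means no timeout */
};

static bool interval_expired(const struct interval *iv, uint64_t now)
{
    return iv->expires != 0 && now >= iv->expires;
}

/*
 * Return the first live interval containing 'key', or NULL.
 * Expired entries are skipped even though they are still stored.
 */
static const struct interval *interval_lookup(const struct interval *tbl,
                                              size_t n, uint32_t key,
                                              uint64_t now)
{
    for (size_t i = 0; i < n; i++) {
        if (key < tbl[i].start || key > tbl[i].end)
            continue;
        if (interval_expired(&tbl[i], now))
            continue;   /* treat as absent until gc removes it */
        return &tbl[i];
    }
    return NULL;
}
```

A dump routine would apply the same `interval_expired()` test so that expired entries also disappear from listings.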

netfilter: flowtable: set NF_FLOW_TEARDOWN flag on entry expiration · 9ed81c8e

Committed by Pablo Neira Ayuso on May 11, 2020

If the flow timer expires, the gc sets the NF_FLOW_TEARDOWN flag.
Otherwise, the flowtable software path might race to refresh the
timeout, leaving the state machine in an inconsistent state.

Fixes: c29f74e0 ("netfilter: nf_flow_table: hardware offload support")
Reported-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

9ed81c8e

11 May, 2020 4 commits

netfilter: conntrack: fix infinite loop on rmmod · 54ab49fd

Committed by Florian Westphal on May 10, 2020

'rmmod nf_conntrack' can hang forever, because the netns exit
gets stuck in nf_conntrack_cleanup_net_list():

i_see_dead_people:
 busy = 0;
 list_for_each_entry(net, net_exit_list, exit_list) {
  nf_ct_iterate_cleanup(kill_all, net, 0, 0);
  if (atomic_read(&net->ct.count) != 0)
   busy = 1;
 }
 if (busy) {
  schedule();
  goto i_see_dead_people;
 }

When nf_ct_iterate_cleanup iterates the conntrack table, all nf_conn
structures can be found twice:
once for the original tuple and once for the conntrack's reply tuple.

get_next_corpse() only calls the iterator when the entry is
in original direction -- the idea was to avoid unneeded invocations
of the iterator callback.

When support for clashing entries was added, the assumption that
all nf_conn objects are added twice, once in original, once for reply
tuple no longer holds -- NF_CLASH_BIT entries are only added in
the non-clashing reply direction.

Thus, if at least one NF_CLASH entry is in the list then
nf_conntrack_cleanup_net_list() always skips it completely.

During normal netns destruction, this causes a hang of several
seconds, until the gc worker removes the entry (NF_CLASH entries
always have a 1 second timeout).

But in the rmmod case, the gc worker has already been stopped, so
ct.count never becomes 0.

We can fix this in two ways:

1. Add a second test for CLASH_BIT and call iterator for those
   entries as well, or:
2. Skip the original tuple direction and use the reply tuple.

2) is simpler, so do that.

Fixes: 6a757c07 ("netfilter: conntrack: allow insertion of clashing entries")
Reported-by: Chen Yi <yiche@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

54ab49fd

netfilter: flowtable: Remove WQ_MEM_RECLAIM from workqueue · 1d10da0e

Committed by Roi Dayan on May 10, 2020

This workqueue is in charge of handling offloaded flow tasks like
add/del/stats; we should not use the WQ_MEM_RECLAIM flag.
The flag can result in the following warning.

[  485.557189] ------------[ cut here ]------------
[  485.562976] workqueue: WQ_MEM_RECLAIM nf_flow_table_offload:flow_offload_worr
[  485.562985] WARNING: CPU: 7 PID: 3731 at kernel/workqueue.c:2610 check_flush0
[  485.590191] Kernel panic - not syncing: panic_on_warn set ...
[  485.597100] CPU: 7 PID: 3731 Comm: kworker/u112:8 Not tainted 5.7.0-rc1.21802
[  485.606629] Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS 2.4.3 01/177
[  485.615487] Workqueue: nf_flow_table_offload flow_offload_work_handler [nf_f]
[  485.624834] Call Trace:
[  485.628077]  dump_stack+0x50/0x70
[  485.632280]  panic+0xfb/0x2d7
[  485.636083]  ? check_flush_dependency+0x110/0x130
[  485.641830]  __warn.cold.12+0x20/0x2a
[  485.646405]  ? check_flush_dependency+0x110/0x130
[  485.652154]  ? check_flush_dependency+0x110/0x130
[  485.657900]  report_bug+0xb8/0x100
[  485.662187]  ? sched_clock_cpu+0xc/0xb0
[  485.666974]  do_error_trap+0x9f/0xc0
[  485.671464]  do_invalid_op+0x36/0x40
[  485.675950]  ? check_flush_dependency+0x110/0x130
[  485.681699]  invalid_op+0x28/0x30

Fixes: 7da182a9 ("netfilter: flowtable: Use work entry per offload command")
Reported-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

1d10da0e

netfilter: flowtable: Add pending bit for offload work · 2c889795

Committed by Paul Blakey on May 06, 2020

The gc step can queue offloaded flow del work or stats work.
Those work items can race with each other, and a flow could be freed
before the stats work that queries it has executed.
To avoid that, add a pending bit: if a work item already exists for a
flow, don't queue another one for it.
This also avoids adding multiple stats work items in case the stats work
didn't complete before the gc step started again.
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

2c889795
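
The pending-bit scheme from 2c889795 can be approximated in userspace with C11 atomics. This is a sketch under assumptions: `struct flow`, `FLOW_F_PENDING`, `flow_work_try_queue()` and `flow_work_done()` are hypothetical names, and the kernel uses its own bit operations rather than `<stdatomic.h>`.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define FLOW_F_PENDING (1u << 0)   /* hypothetical "work queued" bit */

struct flow {
    atomic_uint flags;
};

/*
 * Atomically set the pending bit. Returns true if the caller won the
 * right to queue a work item, false if one is already outstanding.
 */
static bool flow_work_try_queue(struct flow *f)
{
    unsigned int old = atomic_fetch_or(&f->flags, FLOW_F_PENDING);

    return !(old & FLOW_F_PENDING);
}

/* Called by the worker once the work item has finished. */
static void flow_work_done(struct flow *f)
{
    atomic_fetch_and(&f->flags, ~FLOW_F_PENDING);
}
```

A gc pass would call `flow_work_try_queue()` before queueing del or stats work and skip the flow when it returns false, so at most one work item per flow is in flight.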

netfilter: conntrack: avoid gcc-10 zero-length-bounds warning · 2c407aca

Committed by Arnd Bergmann on Apr 30, 2020

gcc-10 warns around a suspicious access to an empty struct member:

net/netfilter/nf_conntrack_core.c: In function '__nf_conntrack_alloc':
net/netfilter/nf_conntrack_core.c:1522:9: warning: array subscript 0 is outside the bounds of an interior zero-length array 'u8[0]' {aka 'unsigned char[0]'} [-Wzero-length-bounds]
 1522 |  memset(&ct->__nfct_init_offset[0], 0,
      |         ^~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from net/netfilter/nf_conntrack_core.c:37:
include/net/netfilter/nf_conntrack.h:90:5: note: while referencing '__nfct_init_offset'
   90 |  u8 __nfct_init_offset[0];
      |     ^~~~~~~~~~~~~~~~~~

The code is correct but a bit unusual. Rework it slightly in a way that
does not trigger the warning, using an empty struct instead of an empty
array. There are probably more elegant ways to do this, but this is the
smallest change.

Fixes: c41884ce ("netfilter: conntrack: avoid zeroing timer")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

2c407aca

01 May, 2020 1 commit

docs: networking: convert tproxy.txt to ReST · 4ac0b122

Committed by Mauro Carvalho Chehab on Apr 30, 2020

- add SPDX header;
- adjust title markup;
- mark code blocks and literals as such;
- adjust indentation, whitespaces and blank lines where needed;
- add to networking/index.rst.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

4ac0b122

30 Apr, 2020 1 commit

netfilter: nf_osf: avoid passing pointer to local var · c165d57b

Committed by Arnd Bergmann on Apr 29, 2020

gcc-10 points out that a code path exists where a pointer to a stack
variable may be passed back to the caller:

net/netfilter/nfnetlink_osf.c: In function 'nf_osf_hdr_ctx_init':
cc1: warning: function may return address of local variable [-Wreturn-local-addr]
net/netfilter/nfnetlink_osf.c:171:16: note: declared here
  171 |  struct tcphdr _tcph;
      |                ^~~~~

I am not sure whether this can happen in practice, but moving the
variable declaration into the callers avoids the problem.

Fixes: 31a9c292 ("netfilter: nf_osf: add struct nf_osf_hdr_ctx")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

c165d57b
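
The approach taken in c165d57b (moving the variable declaration into the callers) can be sketched in userspace as follows. All names here (`struct hdr`, `struct hdr_ctx`, `hdr_ctx_init()`) are hypothetical; the sketch only shows that when the caller owns the scratch storage, the pointer saved in the context remains valid after the init function returns, which would not hold if the storage were a local variable of the init function.

```c
#include <stddef.h>
#include <string.h>

struct hdr {
    unsigned short sport, dport;
};

struct hdr_ctx {
    const struct hdr *h;   /* points at caller-owned storage */
};

/*
 * Copy a header out of 'pkt' into 'storage' and remember it in 'ctx'.
 * Because 'storage' belongs to the caller, 'ctx->h' does not dangle
 * once this function has returned.
 */
static const struct hdr *hdr_ctx_init(struct hdr_ctx *ctx,
                                      const unsigned char *pkt, size_t len,
                                      struct hdr *storage)
{
    if (len < sizeof(*storage))
        return NULL;

    memcpy(storage, pkt, sizeof(*storage));
    ctx->h = storage;
    return storage;
}
```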

29 Apr, 2020 2 commits

netfilter: add audit table unregister actions · a45d8853

Committed by Richard Guy Briggs on Apr 22, 2020

Audit the action of unregistering ebtables and x_tables.

See: https://github.com/linux-audit/audit-kernel/issues/44
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>

a45d8853

audit: tidy and extend netfilter_cfg x_tables · c4dad0aa

Committed by Richard Guy Briggs on Apr 22, 2020

NETFILTER_CFG record generation was inconsistent for x_tables and
ebtables configuration changes.  The call was needlessly messy and there
were supporting records missing at times while they were produced when
not requested.  Simplify the logging call into a new audit_log_nfcfg
call.  Honour the audit_enabled setting while more consistently
recording information including supporting records by tidying up dummy
checks.

Add an op= field that indicates the operation being performed (register
or replace).

Here is the enhanced sample record:
  type=NETFILTER_CFG msg=audit(1580905834.919:82970): table=filter family=2 entries=83 op=replace

Generate audit NETFILTER_CFG records on ebtables table registration.
Previously this was being done for x_tables registration and replacement
operations and ebtables table replacement only.

See: https://github.com/linux-audit/audit-kernel/issues/25
See: https://github.com/linux-audit/audit-kernel/issues/35
See: https://github.com/linux-audit/audit-kernel/issues/43
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>

c4dad0aa

28 Apr, 2020 5 commits

netfilter: nft_nat: add netmap support · 3ff7ddb1

Committed by Pablo Neira Ayuso on Apr 24, 2020

This patch allows you to NAT the network address prefix onto another
network address prefix, a.k.a. netmapping.

Userspace must specify the NF_NAT_RANGE_NETMAP flag and the prefix
address through the NFTA_NAT_REG_ADDR_MIN and NFTA_NAT_REG_ADDR_MAX
netlink attributes.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

3ff7ddb1

netfilter: nft_nat: add helper function to set up NAT address and protocol · acd766e3

Committed by Pablo Neira Ayuso on Apr 24, 2020

This patch adds nft_nat_setup_addr() and nft_nat_setup_proto() to set up
the NAT mangling.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

acd766e3

netfilter: nft_nat: set flags from initialization path · 4566aa44

Committed by Pablo Neira Ayuso on Apr 24, 2020

This patch sets the NAT flags from the control plane path.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

4566aa44

netfilter: nft_nat: return EOPNOTSUPP if type or flags are not supported · 0d7c8346

Committed by Pablo Neira Ayuso on Apr 24, 2020

Instead of EINVAL which should be used for malformed netlink messages.

Fixes: eb31628e ("netfilter: nf_tables: Add support for IPv6 NAT")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

0d7c8346

netfilter: nf_tables: allow up to 64 bytes in the set element data area · fdb9c405

Committed by Pablo Neira Ayuso on Apr 24, 2020

So far, the set elements could store up to 128-bits in the data area.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

fdb9c405

27 Apr, 2020 3 commits

sysctl: pass kernel pointers to ->proc_handler · 32927393

Committed by Christoph Hellwig on Apr 24, 2020

Instead of having all the sysctl handlers deal with user pointers, which
is rather hairy in terms of the BPF interaction, copy the input to and
from userspace in common code.  This also means that the strings are
always NUL-terminated by the common code, making the API a little bit
safer.

As most handlers just pass through the data to one of the common handlers
a lot of the changes are mechanical.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

32927393

netfilter: nat: never update the UDP checksum when it's 0 · ea64d8d6

Committed by Guillaume Nault on Apr 21, 2020

If the UDP header of a local VXLAN endpoint is NAT-ed, and the VXLAN
device has disabled UDP checksums and enabled Tx checksum offloading,
then the skb passed to udp_manip_pkt() has hdr->check == 0 (outer
checksum disabled) and skb->ip_summed == CHECKSUM_PARTIAL (inner packet
checksum offloaded).

Because of the ->ip_summed value, udp_manip_pkt() tries to update the
outer checksum with the new address and port, leading to an invalid
checksum sent on the wire, as the original null checksum obviously
didn't take the old address and port into account.

So, we can't take ->ip_summed into account in udp_manip_pkt(), as it
might not refer to the checksum we're acting on. Instead, we can base
the decision to update the UDP checksum entirely on the value of
hdr->check, because it's null if and only if checksum is disabled:

  * A fully computed checksum can't be 0, since a 0 checksum is
    represented by the CSUM_MANGLED_0 value instead.

  * A partial checksum can't be 0, since the pseudo-header always adds
    at least one non-zero value (the UDP protocol type 0x11) and adding
    more values to the sum can't make it wrap to 0 as the carry is then
    added to the wrapped number.

  * A disabled checksum uses the special value 0.

The problem seems to be there from day one, although it was probably
not visible before UDP tunnels were implemented.

Fixes: 5b1158e9 ("[NETFILTER]: Add NAT support for nf_conntrack")
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Reviewed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

ea64d8d6
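
The rule from ea64d8d6 (decide on the value of the checksum field alone, and never touch a zero checksum) can be sketched in userspace as below. This is a simplified illustration, not the kernel's udp_manip_pkt(): `struct udp_hdr`, `csum_replace16()` and `udp_rewrite_sport()` are hypothetical, fields are treated as plain 16-bit values, and the incremental update follows the one's-complement arithmetic of RFC 1624.

```c
#include <stdint.h>

/* Simplified UDP header; only the fields the sketch needs. */
struct udp_hdr {
    uint16_t sport, dport;
    uint16_t check;   /* 0 means "checksum disabled" */
};

/* Incrementally update a one's-complement checksum for one 16-bit field. */
static uint16_t csum_replace16(uint16_t check, uint16_t old, uint16_t new_val)
{
    uint32_t sum = (uint16_t)~check;

    sum += (uint16_t)~old;
    sum += new_val;
    sum = (sum & 0xffff) + (sum >> 16);   /* fold carries back in */
    sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Rewrite the source port, fixing up the checksum only if it is in use. */
static void udp_rewrite_sport(struct udp_hdr *h, uint16_t new_port)
{
    if (h->check) {
        h->check = csum_replace16(h->check, h->sport, new_port);
        if (!h->check)
            h->check = 0xffff;   /* a computed 0 is sent as all-ones */
    }
    h->sport = new_port;
}
```

A header with `check == 0` leaves the function with `check` still 0, so a disabled checksum is never replaced by an invalid one.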

netfilter: nf_conntrack: add IPS_HW_OFFLOAD status bit · 74f99482

Committed by Bodong Wang on Apr 21, 2020

This bit indicates that the conntrack entry is offloaded to the hardware
flow table. The nf_conntrack entry will be tagged with [HW_OFFLOAD] if
it's offloaded to hardware.

cat /proc/net/nf_conntrack
	ipv4 2 tcp 6 \
	src=1.1.1.17 dst=1.1.1.16 sport=56394 dport=5001 \
	src=1.1.1.16 dst=1.1.1.17 sport=5001 dport=56394 [HW_OFFLOAD] \
	mark=0 zone=0 use=3

Note that HW_OFFLOAD/OFFLOAD/ASSURED are mutually exclusive.

Changelog:

* V1->V2:
- Remove check of lastused from stats. It was meant for cases such
  as removing driver module while traffic still running. Better to
  handle such cases from garbage collector.
Signed-off-by: Bodong Wang <bodong@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

74f99482

19 Apr, 2020 1 commit

netfilter: nat: fix error handling upon registering inet hook · b4faef17

Committed by Hillf Danton on Apr 18, 2020

A case of warning was reported by syzbot.

------------[ cut here ]------------
WARNING: CPU: 0 PID: 19934 at net/netfilter/nf_nat_core.c:1106
nf_nat_unregister_fn+0x532/0x5c0 net/netfilter/nf_nat_core.c:1106
Kernel panic - not syncing: panic_on_warn set ...
CPU: 0 PID: 19934 Comm: syz-executor.5 Not tainted 5.6.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x188/0x20d lib/dump_stack.c:118
 panic+0x2e3/0x75c kernel/panic.c:221
 __warn.cold+0x2f/0x35 kernel/panic.c:582
 report_bug+0x27b/0x2f0 lib/bug.c:195
 fixup_bug arch/x86/kernel/traps.c:175 [inline]
 fixup_bug arch/x86/kernel/traps.c:170 [inline]
 do_error_trap+0x12b/0x220 arch/x86/kernel/traps.c:267
 do_invalid_op+0x32/0x40 arch/x86/kernel/traps.c:286
 invalid_op+0x23/0x30 arch/x86/entry/entry_64.S:1027
RIP: 0010:nf_nat_unregister_fn+0x532/0x5c0 net/netfilter/nf_nat_core.c:1106
Code: ff df 48 c1 ea 03 80 3c 02 00 75 75 48 8b 44 24 10 4c 89 ef 48 c7 00 00 00 00 00 e8 e8 f8 53 fb e9 4d fe ff ff e8 ee 9c 16 fb <0f> 0b e9 41 fe ff ff e8 e2 45 54 fb e9 b5 fd ff ff 48 8b 7c 24 20
RSP: 0018:ffffc90005487208 EFLAGS: 00010246
RAX: 0000000000040000 RBX: 0000000000000004 RCX: ffffc9001444a000
RDX: 0000000000040000 RSI: ffffffff865c94a2 RDI: 0000000000000005
RBP: ffff88808b5cf000 R08: ffff8880a2620140 R09: fffffbfff14bcd79
R10: ffffc90005487208 R11: fffffbfff14bcd78 R12: 0000000000000000
R13: 0000000000000001 R14: 0000000000000001 R15: 0000000000000000
 nf_nat_ipv6_unregister_fn net/netfilter/nf_nat_proto.c:1017 [inline]
 nf_nat_inet_register_fn net/netfilter/nf_nat_proto.c:1038 [inline]
 nf_nat_inet_register_fn+0xfc/0x140 net/netfilter/nf_nat_proto.c:1023
 nf_tables_register_hook net/netfilter/nf_tables_api.c:224 [inline]
 nf_tables_addchain.constprop.0+0x82e/0x13c0 net/netfilter/nf_tables_api.c:1981
 nf_tables_newchain+0xf68/0x16a0 net/netfilter/nf_tables_api.c:2235
 nfnetlink_rcv_batch+0x83a/0x1610 net/netfilter/nfnetlink.c:433
 nfnetlink_rcv_skb_batch net/netfilter/nfnetlink.c:543 [inline]
 nfnetlink_rcv+0x3af/0x420 net/netfilter/nfnetlink.c:561
 netlink_unicast_kernel net/netlink/af_netlink.c:1303 [inline]
 netlink_unicast+0x537/0x740 net/netlink/af_netlink.c:1329
 netlink_sendmsg+0x882/0xe10 net/netlink/af_netlink.c:1918
 sock_sendmsg_nosec net/socket.c:652 [inline]
 sock_sendmsg+0xcf/0x120 net/socket.c:672
 ____sys_sendmsg+0x6bf/0x7e0 net/socket.c:2362
 ___sys_sendmsg+0x100/0x170 net/socket.c:2416
 __sys_sendmsg+0xec/0x1b0 net/socket.c:2449
 do_syscall_64+0xf6/0x7d0 arch/x86/entry/common.c:295
 entry_SYSCALL_64_after_hwframe+0x49/0xb3

and to quiesce it, unregister NFPROTO_IPV6 hook instead of NFPROTO_INET
in case of failing to register NFPROTO_IPV4 hook.
Reported-by: syzbot <syzbot+33e06702fd6cffc24c40@syzkaller.appspotmail.com>
Fixes: d164385e ("netfilter: nat: add inet family nat support")
Cc: Florian Westphal <fw@strlen.de>
Cc: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Hillf Danton <hdanton@sina.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

b4faef17
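
The error-handling rule behind b4faef17 is that a failed second registration must undo exactly the first one. A minimal userspace sketch follows; the registry, `enum family`, the `fail_ipv4` knob and the function names are all hypothetical and only model the register/unregister ordering described in the message.

```c
#include <stdbool.h>

enum family { FAM_IPV4, FAM_IPV6, FAM_MAX };

static bool registered[FAM_MAX];
static bool fail_ipv4;   /* test knob: force IPv4 registration to fail */

static int register_family(enum family f)
{
    if (f == FAM_IPV4 && fail_ipv4)
        return -1;
    registered[f] = true;
    return 0;
}

static void unregister_family(enum family f)
{
    registered[f] = false;
}

/*
 * Register IPv6 first, then IPv4. If IPv4 fails, roll back the IPv6
 * registration specifically, since that is the only one that succeeded.
 */
static int register_inet(void)
{
    int ret = register_family(FAM_IPV6);

    if (ret)
        return ret;

    ret = register_family(FAM_IPV4);
    if (ret)
        unregister_family(FAM_IPV6);

    return ret;
}
```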

16 Apr, 2020 1 commit

netfilter: Avoid assigning 'const' pointer to non-const pointer · 514cc55b

Committed by Will Deacon on Dec 17, 2019

nf_remove_net_hook() uses WRITE_ONCE() to assign a 'const' pointer to a
'non-const' pointer. Cleanups to the implementation of WRITE_ONCE() mean
that this will give rise to a compiler warning, just like a plain old
assignment would do:

  | In file included from ./include/linux/export.h:43,
  |                  from ./include/linux/linkage.h:7,
  |                  from ./include/linux/kernel.h:8,
  |                  from net/netfilter/core.c:9:
  | net/netfilter/core.c: In function ‘nf_remove_net_hook’:
  | ./include/linux/compiler.h:216:30: warning: assignment discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]
  |   *(volatile typeof(x) *)&(x) = (val);  \
  |                               ^
  | net/netfilter/core.c:379:3: note: in expansion of macro ‘WRITE_ONCE’
  |    WRITE_ONCE(orig_ops[i], &dummy_ops);
  |    ^~~~~~~~~~

Follow the pattern used elsewhere in this file and add a cast to 'void *'
to squash the warning.

Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Jozsef Kadlecsik <kadlec@netfilter.org>
Cc: Florian Westphal <fw@strlen.de>
Cc: "David S. Miller" <davem@davemloft.net>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Will Deacon <will@kernel.org>

514cc55b

15 Apr, 2020 1 commit

netfilter: flowtable: Free block_cb when being deleted · bc8e7131

Committed by Roi Dayan on Apr 12, 2020

Free block_cb memory when asked to be deleted.

Fixes: 978703f4 ("netfilter: flowtable: Add API for registering to flow table events")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

bc8e7131

openeuler / Kernel synced successfully about 1 year ago