提交 · b74494872555d1f7888dfd9225700a363f4a84fc · openeuler / Kernel

31 7月, 2019 1 次提交

hrtimer: Remove task argument from hrtimer_init_sleeper() · b7449487

由 Thomas Gleixner 提交于 5年前

All callers hand in 'current' and that's the only task pointer which
actually makes sense. Remove the task argument and set current in the
function.
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Reviewed-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20190726185752.791885290@linutronix.de

b7449487

22 7月, 2019 3 次提交

tcp: be more careful in tcp_fragment() · b617158d

由 Eric Dumazet 提交于 5年前

Some applications set tiny SO_SNDBUF values and expect
TCP to just work. Recent patches to address CVE-2019-11478
broke them in case of losses, since retransmits might
be prevented.

We should allow these flows to make progress.

This patch allows the first and last skb in retransmit queue
to be split even if memory limits are hit.

It also adds the some room due to the fact that tcp_sendmsg()
and tcp_sendpage() might overshoot sk_wmem_queued by about one full
TSO skb (64KB size). Note this allowance was already present
in stable backports for kernels < 4.15

Note for < 4.15 backports :
 tcp_rtx_queue_tail() will probably look like :

static inline struct sk_buff *tcp_rtx_queue_tail(const struct sock *sk)
{
	struct sk_buff *skb = tcp_send_head(sk);

	return skb ? tcp_write_queue_prev(sk, skb) : tcp_write_queue_tail(sk);
}

Fixes: f070ef2a ("tcp: tcp_fragment() should apply sane memory limits")
Signed-off-by: NEric Dumazet <edumazet@google.com>
Reported-by: NAndrew Prout <aprout@ll.mit.edu>
Tested-by: NAndrew Prout <aprout@ll.mit.edu>
Tested-by: NJonathan Lemon <jonathan.lemon@gmail.com>
Tested-by: NMichal Kubecek <mkubecek@suse.cz>
Acked-by: NNeal Cardwell <ncardwell@google.com>
Acked-by: NYuchung Cheng <ycheng@google.com>
Acked-by: NChristoph Paasch <cpaasch@apple.com>
Cc: Jonathan Looney <jtl@netflix.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b617158d

net: sched: verify that q!=NULL before setting q->flags · 503d81d4

由 Vlad Buslov 提交于 5年前

In function int tc_new_tfilter() q pointer can be NULL when adding filter
on a shared block. With recent change that resets TCQ_F_CAN_BYPASS after
filter creation, following NULL pointer dereference happens in case parent
block is shared:

[  212.925060] BUG: kernel NULL pointer dereference, address: 0000000000000010
[  212.925445] #PF: supervisor write access in kernel mode
[  212.925709] #PF: error_code(0x0002) - not-present page
[  212.925965] PGD 8000000827923067 P4D 8000000827923067 PUD 827924067 PMD 0
[  212.926302] Oops: 0002 [#1] SMP KASAN PTI
[  212.926539] CPU: 18 PID: 2617 Comm: tc Tainted: G    B             5.2.0+ #512
[  212.926938] Hardware name: Supermicro SYS-2028TP-DECR/X10DRT-P, BIOS 2.0b 03/30/2017
[  212.927364] RIP: 0010:tc_new_tfilter+0x698/0xd40
[  212.927633] Code: 74 0d 48 85 c0 74 08 48 89 ef e8 03 aa 62 00 48 8b 84 24 a0 00 00 00 48 8d 78 10 48 89 44 24 18 e8 4d 0c 6b ff 48 8b 44 24 18 <83> 60 10 f
b 48 85 ed 0f 85 3d fe ff ff e9 4f fe ff ff e8 81 26 f8
[  212.928607] RSP: 0018:ffff88884fd5f5d8 EFLAGS: 00010296
[  212.928905] RAX: 0000000000000000 RBX: 0000000000000000 RCX: dffffc0000000000
[  212.929201] RDX: 0000000000000007 RSI: 0000000000000004 RDI: 0000000000000297
[  212.929402] RBP: ffff88886bedd600 R08: ffffffffb91d4b51 R09: fffffbfff7616e4d
[  212.929609] R10: fffffbfff7616e4c R11: ffffffffbb0b7263 R12: ffff88886bc61040
[  212.929803] R13: ffff88884fd5f950 R14: ffffc900039c5000 R15: ffff88835e927680
[  212.929999] FS:  00007fe7c50b6480(0000) GS:ffff88886f980000(0000) knlGS:0000000000000000
[  212.930235] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  212.930394] CR2: 0000000000000010 CR3: 000000085bd04002 CR4: 00000000001606e0
[  212.930588] Call Trace:
[  212.930682]  ? tc_del_tfilter+0xa40/0xa40
[  212.930811]  ? __lock_acquire+0x5b5/0x2460
[  212.930948]  ? find_held_lock+0x85/0xa0
[  212.931081]  ? tc_del_tfilter+0xa40/0xa40
[  212.931201]  rtnetlink_rcv_msg+0x4ab/0x5f0
[  212.931332]  ? rtnl_dellink+0x490/0x490
[  212.931454]  ? lockdep_hardirqs_on+0x260/0x260
[  212.931589]  ? netlink_deliver_tap+0xab/0x5a0
[  212.931717]  ? match_held_lock+0x1b/0x240
[  212.931844]  netlink_rcv_skb+0xd0/0x200
[  212.931958]  ? rtnl_dellink+0x490/0x490
[  212.932079]  ? netlink_ack+0x440/0x440
[  212.932205]  ? netlink_deliver_tap+0x161/0x5a0
[  212.932335]  ? lock_downgrade+0x360/0x360
[  212.932457]  ? lock_acquire+0xe5/0x210
[  212.932579]  netlink_unicast+0x296/0x350
[  212.932705]  ? netlink_attachskb+0x390/0x390
[  212.932834]  ? _copy_from_iter_full+0xe0/0x3a0
[  212.932976]  netlink_sendmsg+0x394/0x600
[  212.937998]  ? netlink_unicast+0x350/0x350
[  212.943033]  ? move_addr_to_kernel.part.0+0x90/0x90
[  212.948115]  ? netlink_unicast+0x350/0x350
[  212.953185]  sock_sendmsg+0x96/0xa0
[  212.958099]  ___sys_sendmsg+0x482/0x520
[  212.962881]  ? match_held_lock+0x1b/0x240
[  212.967618]  ? copy_msghdr_from_user+0x250/0x250
[  212.972337]  ? lock_downgrade+0x360/0x360
[  212.976973]  ? rwlock_bug.part.0+0x60/0x60
[  212.981548]  ? __mod_node_page_state+0x1f/0xa0
[  212.986060]  ? match_held_lock+0x1b/0x240
[  212.990567]  ? find_held_lock+0x85/0xa0
[  212.994989]  ? do_user_addr_fault+0x349/0x5b0
[  212.999387]  ? lock_downgrade+0x360/0x360
[  213.003713]  ? find_held_lock+0x85/0xa0
[  213.007972]  ? __fget_light+0xa1/0xf0
[  213.012143]  ? sockfd_lookup_light+0x91/0xb0
[  213.016165]  __sys_sendmsg+0xba/0x130
[  213.020040]  ? __sys_sendmsg_sock+0xb0/0xb0
[  213.023870]  ? handle_mm_fault+0x337/0x470
[  213.027592]  ? page_fault+0x8/0x30
[  213.031316]  ? lockdep_hardirqs_off+0xbe/0x100
[  213.034999]  ? mark_held_locks+0x24/0x90
[  213.038671]  ? do_syscall_64+0x1e/0xe0
[  213.042297]  do_syscall_64+0x74/0xe0
[  213.045828]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  213.049354] RIP: 0033:0x7fe7c527c7b8
[  213.052792] Code: 89 02 48 c7 c0 ff ff ff ff eb bb 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 8d 05 65 8f 0c 00 8b 00 85 c0 75 17 b8 2e 00 00 00 0f 05 <48> 3d 00 f
0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 48 83 ec 28 89 54
[  213.060269] RSP: 002b:00007ffc3f7908a8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
[  213.064144] RAX: ffffffffffffffda RBX: 000000005d34716f RCX: 00007fe7c527c7b8
[  213.068094] RDX: 0000000000000000 RSI: 00007ffc3f790910 RDI: 0000000000000003
[  213.072109] RBP: 0000000000000000 R08: 0000000000000001 R09: 00007fe7c5340cc0
[  213.076113] R10: 0000000000404ec2 R11: 0000000000000246 R12: 0000000000000080
[  213.080146] R13: 0000000000480640 R14: 0000000000000080 R15: 0000000000000000
[  213.084147] Modules linked in: act_gact cls_flower sch_ingress nfsv3 nfs_acl nfs lockd grace fscache bridge stp llc sunrpc intel_rapl_msr intel_rapl_common
[<1;69;32Msb_edac rdma_ucm rdma_cm x86_pkg_temp_thermal iw_cm intel_powerclamp ib_cm coretemp kvm_intel kvm irqbypass mlx5_ib ib_uverbs ib_core crct10dif_pclmul crc32_pc
lmul crc32c_intel ghash_clmulni_intel mlx5_core intel_cstate intel_uncore iTCO_wdt igb iTCO_vendor_support mlxfw mei_me ptp ses intel_rapl_perf mei pcspkr ipmi
_ssif i2c_i801 joydev enclosure pps_core lpc_ich ioatdma wmi dca ipmi_si ipmi_devintf ipmi_msghandler acpi_power_meter acpi_pad ast i2c_algo_bit drm_vram_helpe
r ttm drm_kms_helper drm mpt3sas raid_class scsi_transport_sas
[  213.112326] CR2: 0000000000000010
[  213.117429] ---[ end trace adb58eb0a4ee6283 ]---

Verify that q pointer is not NULL before setting the 'flags' field.

Fixes: 3f05e688 ("net_sched: unset TCQ_F_CAN_BYPASS when adding filters")
Signed-off-by: NVlad Buslov <vladbu@mellanox.com>
Acked-by: NJiri Pirko <jiri@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

503d81d4

tipc: Fix a typo · bad7f869

由 Christophe JAILLET 提交于 5年前

s/tipc_toprsv_listener_data_ready/tipc_topsrv_listener_data_ready/
(r and s switched in topsrv)
Signed-off-by: NChristophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

bad7f869

21 7月, 2019 2 次提交

mac80211: don't warn about CW params when not using them · d2b3fe42

由 Brian Norris 提交于 5年前

ieee80211_set_wmm_default() normally sets up the initial CW min/max for
each queue, except that it skips doing this if the driver doesn't
support ->conf_tx. We still end up calling drv_conf_tx() in some cases
(e.g., ieee80211_reconfig()), which also still won't do anything
useful...except it complains here about the invalid CW parameters.

Let's just skip the WARN if we weren't going to do anything useful with
the parameters.
Signed-off-by: NBrian Norris <briannorris@chromium.org>
Link: https://lore.kernel.org/r/20190718015712.197499-1-briannorris@chromium.orgSigned-off-by: NJohannes Berg <johannes.berg@intel.com>

d2b3fe42

mac80211: fix possible memory leak in ieee80211_assign_beacon · bcc27fab

由 Lorenzo Bianconi 提交于 5年前

Free new beacon_data in ieee80211_assign_beacon whenever
ieee80211_assign_beacon fails

Fixes: 8860020e ("cfg80211: restructure AP/GO mode API")
Fixes: bc847970 ("mac80211: support FTM responder configuration/statistic")
Signed-off-by: NLorenzo Bianconi <lorenzo@kernel.org>
Link: https://lore.kernel.org/r/770285772543c9fca33777bb4ad4760239e56256.1562105631.git.lorenzo@kernel.orgSigned-off-by: NJohannes Berg <johannes.berg@intel.com>

bcc27fab

20 7月, 2019 5 次提交

net: flow_offload: add flow_block structure and use it · 14bfb13f

由 Pablo Neira Ayuso 提交于 5年前

This object stores the flow block callbacks that are attached to this
block. Update flow_block_cb_lookup() to take this new object.

This patch restores the block sharing feature.

Fixes: da3eeb90 ("net: flow_offload: add list handling functions")
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
Acked-by: NJiri Pirko <jiri@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

14bfb13f

net: flow_offload: rename tc_setup_cb_t to flow_setup_cb_t · a7323311

由 Pablo Neira Ayuso 提交于 5年前

Rename this type definition and adapt users.
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
Acked-by: NJiri Pirko <jiri@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a7323311

net: flow_offload: remove netns parameter from flow_block_cb_alloc() · 0c7294dd

由 Pablo Neira Ayuso 提交于 5年前

No need to annotate the netns on the flow block callback object,
flow_block_cb_is_busy() already checks for used blocks.

Fixes: d63db30c ("net: flow_offload: add flow_block_cb_alloc() and flow_block_cb_free()")
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
Acked-by: NJiri Pirko <jiri@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

0c7294dd

net: openvswitch: rename flow_stats to sw_flow_stats · aef833c5

由 Pablo Neira Ayuso 提交于 5年前

There is a flow_stats structure defined in include/net/flow_offload.h
and a follow up patch adds #include <net/flow_offload.h> to
net/sch_generic.h.

This breaks compilation since OVS codebase includes net/sock.h which
pulls in linux/filter.h which includes net/sch_generic.h.

In file included from ./include/net/sch_generic.h:18:0,
                 from ./include/linux/filter.h:25,
                 from ./include/net/sock.h:59,
                 from ./include/linux/tcp.h:19,
                 from net/openvswitch/datapath.c:24

This definition takes precedence on OVS since it is placed in the
networking core, so rename flow_stats in OVS to sw_flow_stats since
this structure is contained in sw_flow.
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
Acked-by: NJiri Pirko <jiri@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

aef833c5

netfilter: bridge: make NF_TABLES_BRIDGE tristate · dfee0e99

由 Arnd Bergmann 提交于 5年前

The new nft_meta_bridge code fails to link as built-in when NF_TABLES
is a loadable module.

net/bridge/netfilter/nft_meta_bridge.o: In function `nft_meta_bridge_get_eval':
nft_meta_bridge.c:(.text+0x1e8): undefined reference to `nft_meta_get_eval'
net/bridge/netfilter/nft_meta_bridge.o: In function `nft_meta_bridge_get_init':
nft_meta_bridge.c:(.text+0x468): undefined reference to `nft_meta_get_init'
nft_meta_bridge.c:(.text+0x49c): undefined reference to `nft_parse_register'
nft_meta_bridge.c:(.text+0x4cc): undefined reference to `nft_validate_register_store'
net/bridge/netfilter/nft_meta_bridge.o: In function `nft_meta_bridge_module_exit':
nft_meta_bridge.c:(.exit.text+0x14): undefined reference to `nft_unregister_expr'
net/bridge/netfilter/nft_meta_bridge.o: In function `nft_meta_bridge_module_init':
nft_meta_bridge.c:(.init.text+0x14): undefined reference to `nft_register_expr'
net/bridge/netfilter/nft_meta_bridge.o:(.rodata+0x60): undefined reference to `nft_meta_get_dump'
net/bridge/netfilter/nft_meta_bridge.o:(.rodata+0x88): undefined reference to `nft_meta_set_eval'

This can happen because the NF_TABLES_BRIDGE dependency itself is just a
'bool'.  Make the symbol a 'tristate' instead so Kconfig can propagate the
dependencies correctly.

Fixes: 30e103fe ("netfilter: nft_meta: move bridge meta keys into nft_meta_bridge")
Signed-off-by: NArnd Bergmann <arnd@arndb.de>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

dfee0e99

19 7月, 2019 11 次提交

tcp: fix tcp_set_congestion_control() use from bpf hook · 8d650cde

由 Eric Dumazet 提交于 5年前

Neal reported incorrect use of ns_capable() from bpf hook.

bpf_setsockopt(...TCP_CONGESTION...)
  -> tcp_set_congestion_control()
   -> ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN)
    -> ns_capable_common()
     -> current_cred()
      -> rcu_dereference_protected(current->cred, 1)

Accessing 'current' in bpf context makes no sense, since packets
are processed from softirq context.

As Neal stated : The capability check in tcp_set_congestion_control()
was written assuming a system call context, and then was reused from
a BPF call site.

The fix is to add a new parameter to tcp_set_congestion_control(),
so that the ns_capable() call is only performed under the right
context.

Fixes: 91b5b21c ("bpf: Add support for changing congestion control")
Signed-off-by: NEric Dumazet <edumazet@google.com>
Cc: Lawrence Brakmo <brakmo@fb.com>
Reported-by: NNeal Cardwell <ncardwell@google.com>
Acked-by: NNeal Cardwell <ncardwell@google.com>
Acked-by: NLawrence Brakmo <brakmo@fb.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

8d650cde

proc/sysctl: add shared variables for range check · eec4844f

由 Matteo Croce 提交于 5年前

In the sysctl code the proc_dointvec_minmax() function is often used to
validate the user supplied value between an allowed range.  This
function uses the extra1 and extra2 members from struct ctl_table as
minimum and maximum allowed value.

On sysctl handler declaration, in every source file there are some
readonly variables containing just an integer which address is assigned
to the extra1 and extra2 members, so the sysctl range is enforced.

The special values 0, 1 and INT_MAX are very often used as range
boundary, leading duplication of variables like zero=0, one=1,
int_max=INT_MAX in different source files:

    $ git grep -E '\.extra[12].*&(zero|one|int_max)' |wc -l
    248

Add a const int array containing the most commonly used values, some
macros to refer more easily to the correct array member, and use them
instead of creating a local one for every object file.

This is the bloat-o-meter output comparing the old and new binary
compiled with the default Fedora config:

    # scripts/bloat-o-meter -d vmlinux.o.old vmlinux.o
    add/remove: 2/2 grow/shrink: 0/2 up/down: 24/-188 (-164)
    Data                                         old     new   delta
    sysctl_vals                                    -      12     +12
    __kstrtab_sysctl_vals                          -      12     +12
    max                                           14      10      -4
    int_max                                       16       -     -16
    one                                           68       -     -68
    zero                                         128      28    -100
    Total: Before=20583249, After=20583085, chg -0.00%

[mcroce@redhat.com: tipc: remove two unused variables]
  Link: http://lkml.kernel.org/r/20190530091952.4108-1-mcroce@redhat.com
[akpm@linux-foundation.org: fix net/ipv6/sysctl_net_ipv6.c]
[arnd@arndb.de: proc/sysctl: make firmware loader table conditional]
  Link: http://lkml.kernel.org/r/20190617130014.1713870-1-arnd@arndb.de
[akpm@linux-foundation.org: fix fs/eventpoll.c]
Link: http://lkml.kernel.org/r/20190430180111.10688-1-mcroce@redhat.comSigned-off-by: NMatteo Croce <mcroce@redhat.com>
Signed-off-by: NArnd Bergmann <arnd@arndb.de>
Acked-by: NKees Cook <keescook@chromium.org>
Reviewed-by: NAaron Tomlin <atomlin@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

eec4844f

net: dsa: sja1105: Fix missing unlock on error in sk_buff() · 008cfbaa

由 Wei Yongjun 提交于 5年前

Add the missing unlock before return from function sk_buff()
in the error handling case.

Fixes: f3097be2 ("net: dsa: sja1105: Add a state machine for RX timestamping")
Signed-off-by: NWei Yongjun <weiyongjun1@huawei.com>
Reviewed-by: NVladimir Oltean <olteanv@gmail.com>
Reviewed-by: NVivien Didelot <vivien.didelot@gmail.com>
Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

008cfbaa

ipv6: Unlink sibling route in case of failure · 54851aa9

由 Ido Schimmel 提交于 5年前

When a route needs to be appended to an existing multipath route,
fib6_add_rt2node() first appends it to the siblings list and increments
the number of sibling routes on each sibling.

Later, the function notifies the route via call_fib6_entry_notifiers().
In case the notification is vetoed, the route is not unlinked from the
siblings list, which can result in a use-after-free.

Fix this by unlinking the route from the siblings list before returning
an error.

Audited the rest of the call sites from which the FIB notification chain
is called and could not find more problems.

Fixes: 2233000c ("net/ipv6: Move call_fib6_entry_notifiers up for route adds")
Signed-off-by: NIdo Schimmel <idosch@mellanox.com>
Reported-by: NAlexander Petrovskiy <alexpe@mellanox.com>
Reviewed-by: NDavid Ahern <dsahern@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

54851aa9

netfilter: bridge: NF_CONNTRACK_BRIDGE does not depend on NF_TABLES_BRIDGE · fc2f14f8

由 Pablo Neira Ayuso 提交于 5年前

Place NF_CONNTRACK_BRIDGE away from the NF_TABLES_BRIDGE dependency.

Fixes: 3c171f49 ("netfilter: bridge: add connection tracking system")
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

fc2f14f8

netfilter: nft_meta: skip EAGAIN if nft_meta_bridge is not a module · 78e21eb6

由 Pablo Neira Ayuso 提交于 5年前

If it is a module, request this module. Otherwise, if it is compiled
built-in or not selected, skip this.

Fixes: 0ef1efd1 ("netfilter: nf_tables: force module load in case select_ops() returns -EAGAIN")
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

78e21eb6

netfilter: synproxy: fix rst sequence number mismatch · e971ceb8

由 Fernando Fernandez Mancera 提交于 5年前

14:51:00.024418 IP 192.168.122.1.41462 > netfilter.90: Flags [S], seq
4023580551,
14:51:00.024454 IP netfilter.90 > 192.168.122.1.41462: Flags [S.], seq
727560212, ack 4023580552,
14:51:00.024524 IP 192.168.122.1.41462 > netfilter.90: Flags [.], ack 1,

Note: here, synproxy will send a SYN to the real server, as the 3whs was
completed sucessfully. Instead of a syn/ack that we can intercept, we instead
received a reset packet from the real backend, that we forward to the original
client. However, we don't use the correct sequence number, so the reset is not
effective in closing the connection coming from the client.

14:51:00.024550 IP netfilter.90 > 192.168.122.1.41462: Flags [R.], seq
3567407084,
14:51:00.231196 IP 192.168.122.1.41462 > netfilter.90: Flags [.], ack 1,
14:51:00.647911 IP 192.168.122.1.41462 > netfilter.90: Flags [.], ack 1,
14:51:01.474395 IP 192.168.122.1.41462 > netfilter.90: Flags [.], ack 1,

Fixes: 48b1de4c ("netfilter: add SYNPROXY core/target")
Signed-off-by: NFernando Fernandez Mancera <ffmancera@riseup.net>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

e971ceb8

udp: Fix typo in net/ipv4/udp.c · 666a3d6e

由 Su Yanjun 提交于 5年前

Signed-off-by: NSu Yanjun <suyj.fnst@cn.fujitsu.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

666a3d6e

SUNRPC: Optimise transport balancing code · f554af28

由 Trond Myklebust 提交于 5年前

Moves the balancing code to avoid doing cursor changes on every search
iteration.
Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>

f554af28

SUNRPC: Ensure the bvecs are reset when we re-encode the RPC request · 75369089

由 Trond Myklebust 提交于 5年前

The bvec tracks the list of pages, so if the number of pages changes
due to a re-encode, we need to reset the bvec as well.

Fixes: 277e4ab7 ("SUNRPC: Simplify TCP receive code by switching...")
Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
Cc: stable@vger.kernel.org # v4.20+

75369089

netfilter: nf_tables: Support auto-loading for inet nat · b4f1483c

由 Phil Sutter 提交于 5年前

Trying to create an inet family nat chain would not cause
nft_chain_nat.ko module to auto-load due to missing module alias. Add a
proper one with hard-coded family value 1 for the pseudo-family
NFPROTO_INET.

Fixes: d164385e ("netfilter: nat: add inet family nat support")
Signed-off-by: NPhil Sutter <phil@nwl.cc>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

b4f1483c

18 7月, 2019 15 次提交

SUNRPC: Fix up backchannel slot table accounting · 7402a4fe

由 Trond Myklebust 提交于 5年前

Add a per-transport maximum limit in the socket case, and add
helpers to allow the NFSv4 code to discover that limit.
Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>

7402a4fe

SUNRPC: Fix initialisation of struct rpc_xprt_switch · 9f98effc

由 Trond Myklebust 提交于 5年前

Ensure that we do initialise the fields xps_nactive, xps_queuelen
and xps_net.
Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>

9f98effc

ipv6: rt6_check should return NULL if 'from' is NULL · 49d05fe2

由 David Ahern 提交于 5年前

Paul reported that l2tp sessions were broken after the commit referenced
in the Fixes tag. Prior to this commit rt6_check returned NULL if the
rt6_info 'from' was NULL - ie., the dst_entry was disconnected from a FIB
entry. Restore that behavior.

Fixes: 93531c67 ("net/ipv6: separate handling of FIB entries from dst based routes")
Reported-by: NPaul Donohue <linux-kernel@PaulSD.com>
Tested-by: NPaul Donohue <linux-kernel@PaulSD.com>
Signed-off-by: NDavid Ahern <dsahern@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

49d05fe2

tipc: initialize 'validated' field of received packets · 866e5fd8

由 Jon Maloy 提交于 5年前

The tipc_msg_validate() function leaves a boolean flag 'validated' in
the validated buffer's control block, to avoid performing this action
more than once. However, at reception of new packets, the position of
this field may already have been set by lower layer protocols, so
that the packet is erroneously perceived as already validated by TIPC.

We fix this by initializing the said field to 'false' before performing
the initial validation.
Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

866e5fd8

fib: relax source validation check for loopback packets · 66f82095

由 Cong Wang 提交于 5年前

In a rare case where we redirect local packets from veth to lo,
these packets fail to pass the source validation when rp_filter
is turned on, as the tracing shows:

  <...>-311708 [040] ..s1 7951180.957825: fib_table_lookup: table 254 oif 0 iif 1 src 10.53.180.130 dst 10.53.180.130 tos 0 scope 0 flags 0
  <...>-311708 [040] ..s1 7951180.957826: fib_table_lookup_nh: nexthop dev eth0 oif 4 src 10.53.180.130

So, the fib table lookup returns eth0 as the nexthop even though
the packets are local and should be routed to loopback nonetheless,
but they can't pass the dev match check in fib_info_nh_uses_dev()
without this patch.

It should be safe to relax this check for this special case, as
normally packets coming out of loopback device still have skb_dst
so they won't even hit this slow path.

Cc: Julian Anastasov <ja@ssi.bg>
Cc: David Ahern <dsahern@gmail.com>
Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
Reviewed-by: NDavid Ahern <dsahern@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

66f82095

net_sched: unset TCQ_F_CAN_BYPASS when adding filters · 3f05e688

由 Cong Wang 提交于 5年前

For qdisc's that support TC filters and set TCQ_F_CAN_BYPASS,
notably fq_codel, it makes no sense to let packets bypass the TC
filters we setup in any scenario, otherwise our packets steering
policy could not be enforced.

This can be reproduced easily with the following script:

 ip li add dev dummy0 type dummy
 ifconfig dummy0 up
 tc qd add dev dummy0 root fq_codel
 tc filter add dev dummy0 parent 8001: protocol arp basic action mirred egress redirect dev lo
 tc filter add dev dummy0 parent 8001: protocol ip basic action mirred egress redirect dev lo
 ping -I dummy0 192.168.112.1

Without this patch, packets are sent directly to dummy0 without
hitting any of the filters. With this patch, packets are redirected
to loopback as expected.

This fix is not perfect, it only unsets the flag but does not set it back
because we have to save the information somewhere in the qdisc if we
really want that. Note, both fq_codel and sfq clear this flag in their
->bind_tcf() but this is clearly not sufficient when we don't use any
class ID.

Fixes: 23624935 ("net_sched: TCQ_F_CAN_BYPASS generalization")
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
Reviewed-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3f05e688

net/rds: Initialize ic->i_fastreg_wrs upon allocation · aa494893

由 Gerd Rausch 提交于 5年前

Otherwise, if an IB connection is torn down before "rds_ib_setup_qp"
is called, the value of "ic->i_fastreg_wrs" is still at zero
(as it wasn't initialized by "rds_ib_setup_qp").
Consequently "rds_ib_conn_path_shutdown" will spin forever,
waiting for it to go back to "RDS_IB_DEFAULT_FR_WR",
which of course will never happen as there are no
outstanding work requests.
Signed-off-by: NGerd Rausch <gerd.rausch@oracle.com>
Acked-by: NSantosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

aa494893

net/rds: Keep track of and wait for FRWR segments in use upon shutdown · 3a2886cc

由 Gerd Rausch 提交于 5年前

Since "rds_ib_free_frmr" and "rds_ib_free_frmr_list" simply put
the FRMR memory segments on the "drop_list" or "free_list",
and it is the job of "rds_ib_flush_mr_pool" to reap those entries
by ultimately issuing a "IB_WR_LOCAL_INV" work-request,
we need to trigger and then wait for all those memory segments
attached to a particular connection to be fully released before
we can move on to release the QP, CQ, etc.

So we make "rds_ib_conn_path_shutdown" wait for one more
atomic_t called "i_fastreg_inuse_count" that keeps track of how
many FRWR memory segments are out there marked "FRMR_IS_INUSE"
(and also wake_up rds_ib_ring_empty_wait, as they go away).
Signed-off-by: NGerd Rausch <gerd.rausch@oracle.com>
Acked-by: NSantosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3a2886cc

net/rds: Set fr_state only to FRMR_IS_FREE if IB_WR_LOCAL_INV had been successful · 9547dff1

由 Gerd Rausch 提交于 5年前

Fix a bug where fr_state first goes to FRMR_IS_STALE, because of a failure
of operation IB_WR_LOCAL_INV, but then gets set back to "FRMR_IS_FREE"
uncoditionally, even though the operation failed.
Signed-off-by: NGerd Rausch <gerd.rausch@oracle.com>
Acked-by: NSantosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9547dff1

net/rds: Fix NULL/ERR_PTR inconsistency · aea01a22

由 Gerd Rausch 提交于 5年前

Make function "rds_ib_try_reuse_ibmr" return NULL in case
memory region could not be allocated, since callers
simply check if the return value is not NULL.
Signed-off-by: NGerd Rausch <gerd.rausch@oracle.com>
Acked-by: NSantosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

aea01a22

net/rds: Wait for the FRMR_IS_FREE (or FRMR_IS_STALE) transition after posting IB_WR_LOCAL_INV · 5f33141d

由 Gerd Rausch 提交于 5年前

In order to:
1) avoid a silly bouncing between "clean_list" and "drop_list"
   triggered by function "rds_ib_reg_frmr" as it is releases frmr
   regions whose state is not "FRMR_IS_FREE" right away.

2) prevent an invalid access error in a race from a pending
   "IB_WR_LOCAL_INV" operation with a teardown ("dma_unmap_sg", "put_page")
   and de-registration ("ib_dereg_mr") of the corresponding
   memory region.
Signed-off-by: NGerd Rausch <gerd.rausch@oracle.com>
Acked-by: NSantosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5f33141d

net/rds: Get rid of "wait_clean_list_grace" and add locking · c9467447

由 Gerd Rausch 提交于 5年前

Waiting for activity on the "clean_list" to quiesce is no substitute
for proper locking.

We can have multiple threads competing for "llist_del_first"
via "rds_ib_reuse_mr", and a single thread competing
for "llist_del_all" and "llist_del_first" via "rds_ib_flush_mr_pool".

Since "llist_del_first" depends on "list->first->next" not to change
in the midst of the operation, simply waiting for all current calls
to "rds_ib_reuse_mr" to quiesce across all CPUs is woefully inadequate:

By the time "wait_clean_list_grace" is done iterating over all CPUs to see
that there is no concurrent caller to "rds_ib_reuse_mr", a new caller may
have just shown up on the first CPU.

Furthermore, <linux/llist.h> explicitly calls out the need for locking:
 * Cases where locking is needed:
 * If we have multiple consumers with llist_del_first used in one consumer,
 * and llist_del_first or llist_del_all used in other consumers,
 * then a lock is needed.

Also, while at it, drop the unused "pool" parameter
from "list_to_llist_nodes".
Signed-off-by: NGerd Rausch <gerd.rausch@oracle.com>
Acked-by: NSantosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c9467447

net/rds: Give fr_state a chance to transition to FRMR_IS_FREE · 2c7da8e6

由 Gerd Rausch 提交于 5年前

In the context of FRMR (ib_frmr.c):

Memory regions make it onto the "clean_list" via "rds_ib_flush_mr_pool",
after the memory region has been posted for invalidation via
"rds_ib_post_inv".

At that point in time, "fr_state" may still be in state "FRMR_IS_INUSE",
since the only place where "fr_state" transitions to "FRMR_IS_FREE"
is in "rds_ib_mr_cqe_handler", which is triggered by a tasklet.

So in case we notice that "fr_state != FRMR_IS_FREE" (see below),
we wait for "fr_inv_done" to trigger with a maximum of 10msec.
Then we check again, and only put the memory region onto the drop_list
(via "rds_ib_free_frmr") in case the situation remains unchanged.

This avoids the problem of memory-regions bouncing between "clean_list"
and "drop_list" before they even have a chance to be properly invalidated.
Signed-off-by: NGerd Rausch <gerd.rausch@oracle.com>
Acked-by: NSantosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

2c7da8e6

net/sched: Make NET_ACT_CT depends on NF_NAT · f11fe1da

由 YueHaibing 提交于 5年前

If NF_NAT is m and NET_ACT_CT is y, build fails:

net/sched/act_ct.o: In function `tcf_ct_act':
act_ct.c:(.text+0x21ac): undefined reference to `nf_ct_nat_ext_add'
act_ct.c:(.text+0x229a): undefined reference to `nf_nat_icmp_reply_translation'
act_ct.c:(.text+0x233a): undefined reference to `nf_nat_setup_info'
act_ct.c:(.text+0x234a): undefined reference to `nf_nat_alloc_null_binding'
act_ct.c:(.text+0x237c): undefined reference to `nf_nat_packet'
Reported-by: NHulk Robot <hulkci@huawei.com>
Fixes: b57dc7c1 ("net/sched: Introduce action ct")
Signed-off-by: NYueHaibing <yuehaibing@huawei.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f11fe1da

net: sctp: fix warning "NULL check before some freeing functions is not needed" · 86fda90a

由 Hariprasad Kelam 提交于 5年前

This patch removes NULL checks before calling kfree.

fixes below issues reported by coccicheck
net/sctp/sm_make_chunk.c:2586:3-8: WARNING: NULL check before some
freeing functions is not needed.
net/sctp/sm_make_chunk.c:2652:3-8: WARNING: NULL check before some
freeing functions is not needed.
net/sctp/sm_make_chunk.c:2667:3-8: WARNING: NULL check before some
freeing functions is not needed.
net/sctp/sm_make_chunk.c:2684:3-8: WARNING: NULL check before some
freeing functions is not needed.
Signed-off-by: NHariprasad Kelam <hariprasad.kelam@gmail.com>
Acked-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: NNeil Horman <nhorman@tuxdriver.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

86fda90a

17 7月, 2019 2 次提交

fix: taprio: Change type of txtime-delay parameter to u32 · a5b64700

由 Vedang Patel 提交于 5年前

During the review of the iproute2 patches for txtime-assist mode, it was
pointed out that it does not make sense for the txtime-delay parameter to
be negative. So, change the type of the parameter from s32 to u32.

Fixes: 4cfd5779 ("taprio: Add support for txtime-assist mode")
Reported-by: NStephen Hemminger <stephen@networkplumber.org>
Signed-off-by: NVedang Patel <vedang.patel@intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a5b64700

skbuff: fix compilation warnings in skb_dump() · db8051f3

由 Qian Cai 提交于 5年前

The commit 6413139d ("skbuff: increase verbosity when dumping skb
data") introduced a few compilation warnings.

net/core/skbuff.c:766:32: warning: format specifies type 'unsigned
short' but the argument has type 'unsigned int' [-Wformat]
                       level, sk->sk_family, sk->sk_type,
sk->sk_protocol);
                                             ^~~~~~~~~~~
net/core/skbuff.c:766:45: warning: format specifies type 'unsigned
short' but the argument has type 'unsigned int' [-Wformat]
                       level, sk->sk_family, sk->sk_type,
sk->sk_protocol);
^~~~~~~~~~~~~~~

Fix them by using the proper types.

Fixes: 6413139d ("skbuff: increase verbosity when dumping skb data")
Signed-off-by: NQian Cai <cai@lca.pw>
Reviewed-by: NNathan Chancellor <natechancellor@gmail.com>
Acked-by: NWillem de Bruijn <willemb@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

db8051f3

16 7月, 2019 1 次提交

netfilter: nft_hash: fix symhash with modulus one · 28b1d6ef

由 Laura Garcia Liebana 提交于 5年前

The rule below doesn't work as the kernel raises -ERANGE.

nft add rule netdev nftlb lb01 ip daddr set \
	symhash mod 1 map { 0 : 192.168.0.10 } fwd to "eth0"

This patch allows to use the symhash modulus with one
element, in the same way that the other types of hashes and
algorithms that uses the modulus parameter.
Signed-off-by: NLaura Garcia Liebana <nevola@gmail.com>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

28b1d6ef

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功