1. 19 Sep, 2020 13 commits
  2. 18 Sep, 2020 10 commits
  3. 17 Sep, 2020 3 commits
  4. 16 Sep, 2020 6 commits
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · d5d325ea
      David S. Miller committed
      Alexei Starovoitov says:
      
      ====================
      pull-request: bpf 2020-09-15
      
      The following pull-request contains BPF updates for your *net* tree.
      
      We've added 12 non-merge commits during the last 19 day(s) which contain
      a total of 10 files changed, 47 insertions(+), 38 deletions(-).
      
      The main changes are:
      
      1) docs/bpf fixes, from Andrii.
      
      2) ld_abs fix, from Daniel.
      
      3) socket casting helpers fix, from Martin.
      
      4) hash iterator fixes, from Yonghong.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bpf: Fix a rcu warning for bpffs map pretty-print · ce880cb8
      Yonghong Song committed
      Running selftest
        ./test_btf -p
      the kernel had the following warning:
        [   51.528185] WARNING: CPU: 3 PID: 1756 at kernel/bpf/hashtab.c:717 htab_map_get_next_key+0x2eb/0x300
        [   51.529217] Modules linked in:
        [   51.529583] CPU: 3 PID: 1756 Comm: test_btf Not tainted 5.9.0-rc1+ #878
        [   51.530346] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3-1.el7.centos 04/01/2014
        [   51.531410] RIP: 0010:htab_map_get_next_key+0x2eb/0x300
        ...
        [   51.542826] Call Trace:
        [   51.543119]  map_seq_next+0x53/0x80
        [   51.543528]  seq_read+0x263/0x400
        [   51.543932]  vfs_read+0xad/0x1c0
        [   51.544311]  ksys_read+0x5f/0xe0
        [   51.544689]  do_syscall_64+0x33/0x40
        [   51.545116]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      The related source code in kernel/bpf/hashtab.c:
        709 static int htab_map_get_next_key(struct bpf_map *map, void *key, void *next_key)
        710 {
        711         struct bpf_htab *htab = container_of(map, struct bpf_htab, map);
        712         struct hlist_nulls_head *head;
        713         struct htab_elem *l, *next_l;
        714         u32 hash, key_size;
        715         int i = 0;
        716
        717         WARN_ON_ONCE(!rcu_read_lock_held());
      
      In kernel/bpf/inode.c, bpffs map pretty print calls map->ops->map_get_next_key()
      without holding a rcu_read_lock(), hence causing the above warning.
      To fix the issue, surround the map->ops->map_get_next_key() call with
      rcu_read_lock()/rcu_read_unlock().
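
The locking pattern described above can be illustrated with a small user-space model. This is only a sketch: every `model_*` name and both `seq_next_*` callers are hypothetical stand-ins, not kernel APIs, and the "warning" is just a counter instead of WARN_ON_ONCE().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space stand-ins for the RCU read-side API. */
static int rcu_depth;   /* nesting depth of read-side critical sections */
static int warnings;    /* counts calls made without the read lock held */

static void model_rcu_read_lock(void) { rcu_depth++; }

static void model_rcu_read_unlock(void)
{
    assert(rcu_depth > 0);
    rcu_depth--;
}

static bool model_rcu_read_lock_held(void) { return rcu_depth > 0; }

/* Stand-in for a map_get_next_key() implementation that expects the
 * caller to hold the read lock and complains otherwise. */
static int model_get_next_key(int key, int *next_key)
{
    if (!model_rcu_read_lock_held())
        warnings++;
    *next_key = key + 1;
    return 0;
}

/* Caller before the fix: no read-side critical section. */
static int seq_next_unfixed(int key, int *next_key)
{
    return model_get_next_key(key, next_key);
}

/* Caller after the fix: the call is wrapped in a read-side section. */
static int seq_next_fixed(int key, int *next_key)
{
    int err;

    model_rcu_read_lock();
    err = model_get_next_key(key, next_key);
    model_rcu_read_unlock();
    return err;
}
```

Calling `seq_next_unfixed()` bumps the `warnings` counter, while `seq_next_fixed()` leaves it unchanged and returns with `rcu_depth` back at zero.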
      
      Fixes: a26ca7c9 ("bpf: btf: Add pretty print support to the basic arraymap")
      Reported-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Yonghong Song <yhs@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Andrii Nakryiko <andriin@fb.com>
      Cc: Martin KaFai Lau <kafai@fb.com>
      Link: https://lore.kernel.org/bpf/20200916004401.146277-1-yhs@fb.com
    • bpf: Bpf_skc_to_* casting helpers require a NULL check on sk · 8c33dadc
      Martin KaFai Lau committed
      The bpf_skc_to_* type casting helpers are available to
      BPF_PROG_TYPE_TRACING.  The traced PTR_TO_BTF_ID may be NULL.
      For example, the skb->sk may be NULL.  Thus, these casting helpers
      need to check "!sk" also and this patch fixes them.
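
As a rough illustration of the check described above, the following user-space sketch models a casting helper that rejects a NULL pointer before looking at any field. The `model_*` types and names are hypothetical and much simpler than the kernel's socket structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified socket model (not the kernel's struct sock). */
enum model_proto { MODEL_PROTO_TCP = 6, MODEL_PROTO_UDP = 17 };

struct model_sock {
    int protocol;
};

/* Cast helper: returns sk if it is a TCP socket, NULL otherwise.
 * The "!sk" test comes first so a NULL input is never dereferenced. */
static struct model_sock *model_skc_to_tcp_sock(struct model_sock *sk)
{
    if (!sk)
        return NULL;
    if (sk->protocol != MODEL_PROTO_TCP)
        return NULL;
    return sk;
}

/* Usage example: a NULL input (think of skb->sk being NULL) is handled. */
static void example_usage(void)
{
    struct model_sock tcp = { MODEL_PROTO_TCP };

    assert(model_skc_to_tcp_sock(NULL) == NULL);
    assert(model_skc_to_tcp_sock(&tcp) == &tcp);
}
```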
      
      Fixes: 0d4fad3e ("bpf: Add bpf_skc_to_udp6_sock() helper")
      Fixes: 478cfbdf ("bpf: Add bpf_skc_to_{tcp, tcp_timewait, tcp_request}_sock() helpers")
      Fixes: af7ec138 ("bpf: Add bpf_skc_to_tcp6_sock() helper")
      Signed-off-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Yonghong Song <yhs@fb.com>
      Acked-by: Song Liu <songliubraving@fb.com>
      Link: https://lore.kernel.org/bpf/20200915182959.241101-1-kafai@fb.com
    • ipv4: Update exception handling for multipath routes via same device · 2fbc6e89
      David Ahern committed
      Kfir reported that pmtu exceptions are not created properly for
      deployments where multipath routes use the same device.
      
      After some digging I see 2 compounding problems:
      1. ip_route_output_key_hash_rcu is updating the flowi4_oif *after*
         the route lookup. This is the second use case where this has
         been a problem (the first is related to use of vti devices with
         VRF). I cannot find any reason for the oif to be changed after the
         lookup; the code goes back to the start of git. It does not seem
         logical so remove it.
      
      2. fib_lookups for exceptions do not call fib_select_path to handle
         multipath route selection based on the hash.
      
      The end result is that the fib_lookup used to add the exception
      always creates it using the first leg of the route.
      
      An example topology showing the problem:
      
                       |  host1
                   +------+
                   | eth0 |  .209
                   +------+
                       |
                   +------+
           switch  | br0  |
                   +------+
                       |
             +---------+---------+
             | host2             |  host3
         +------+             +------+
         | eth0 | .250        | eth0 | 192.168.252.252
         +------+             +------+
      
         +-----+             +-----+
         | vti | .2          | vti | 192.168.247.3
         +-----+             +-----+
             \                  /
       =================================
       tunnels
               192.168.247.1/24
      
      for h in host1 host2 host3; do
              ip netns add ${h}
              ip -netns ${h} link set lo up
              ip netns exec ${h} sysctl -wq net.ipv4.ip_forward=1
      done
      
      ip netns add switch
      ip -netns switch li set lo up
      ip -netns switch link add br0 type bridge stp 0
      ip -netns switch link set br0 up
      
      for n in 1 2 3; do
              ip -netns switch link add eth-sw type veth peer name eth-h${n}
              ip -netns switch li set eth-h${n} master br0 up
              ip -netns switch li set eth-sw netns host${n} name eth0
      done
      
      ip -netns host1 addr add 192.168.252.209/24 dev eth0
      ip -netns host1 link set dev eth0 up
      ip -netns host1 route add 192.168.247.0/24 \
              nexthop via 192.168.252.250 dev eth0 nexthop via 192.168.252.252 dev eth0
      
      ip -netns host2 addr add 192.168.252.250/24 dev eth0
      ip -netns host2 link set dev eth0 up
      
      ip -netns host3 addr add 192.168.252.252/24 dev eth0
      ip -netns host3 link set dev eth0 up
      
      ip netns add tunnel
      ip -netns tunnel li set lo up
      ip -netns tunnel li add br0 type bridge
      ip -netns tunnel li set br0 up
      for n in $(seq 11 20); do
              ip -netns tunnel addr add dev br0 192.168.247.${n}/24
      done
      
      for n in 2 3
      do
              ip -netns tunnel link add vti${n} type veth peer name eth${n}
              ip -netns tunnel link set eth${n} mtu 1360 master br0 up
              ip -netns tunnel link set vti${n} netns host${n} mtu 1360 up
              ip -netns host${n} addr add dev vti${n} 192.168.247.${n}/24
      done
      ip -netns tunnel ro add default nexthop via 192.168.247.2 nexthop via 192.168.247.3
      
      ip netns exec host1 ping -M do -s 1400 -c3 -I 192.168.252.209 192.168.247.11
      ip netns exec host1 ping -M do -s 1400 -c3 -I 192.168.252.209 192.168.247.15
      ip -netns host1 ro ls cache
      
      Before this patch the cache always shows exceptions against the first
      leg in the multipath route; 192.168.252.250 per this example. Since the
      hash has an initial random seed, you may need to vary the final octet
      more than what is listed. In my tests, using addresses between 11 and 19
      usually found 1 that used both legs.
      
      With this patch, the cache will have exceptions for both legs.
      
      Fixes: 4895c771 ("ipv4: Add FIB nexthop exceptions")
      Reported-by: Kfir Itzhak <mastertheknife@gmail.com>
      Signed-off-by: David Ahern <dsahern@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: tipc: kerneldoc fixes · 2e5117ba
      Lu Wei committed
      Fix parameter description of tipc_link_bc_create()
      Reported-by: Hulk Robot <hulkci@huawei.com>
      Fixes: 16ad3f40 ("tipc: introduce variable window congestion control")
      Signed-off-by: Lu Wei <luwei32@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ibmvnic: update MAINTAINERS · d3f2ef18
      Dany Madden committed
      Update supporters for IBM Power SRIOV Virtual NIC Device Driver.
      Thomas Falcon is moving on to other work. Dany Madden, Lijun Pan
      and Sukadev Bhattiprolu are the current supporters.
      Signed-off-by: Dany Madden <drt@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  5. 15 Sep, 2020 8 commits
    • batman-adv: mcast: fix duplicate mcast packets from BLA backbone to mesh · 2369e827
      Linus Lüssing committed
      Scenario:
      * Multicast frame sent from BLA backbone gateways (multiple nodes
        with their bat0 bridged together, with BLA enabled) sharing the same
        LAN to nodes in the mesh
      
      Issue:
      * Nodes receive the frame multiple times on bat0 from the mesh,
        once from each foreign BLA backbone gateway which shares the same LAN
        with another
      
      For multicast frames via batman-adv broadcast packets coming from the
      same BLA backbone but from different backbone gateways duplicates are
      currently detected via a CRC history of previously received packets.
      
      However, so far this CRC check was not performed for multicast frames
      received via batman-adv unicast packets. Fix this by applying the same
      check to such packets, too.
      
      Room for improvements in the future: Ideally we would introduce the
      possibility to not only claim a client, but a complete originator, too.
      This would allow us to only send a multicast-in-unicast packet from a BLA
      backbone gateway claiming the node and by that avoid potential redundant
      transmissions in the first place.
      
      Fixes: 279e89b2 ("batman-adv: add broadcast duplicate check")
      Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
      Signed-off-by: Sven Eckelmann <sven@narfation.org>
      Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
    • batman-adv: mcast: fix duplicate mcast packets in BLA backbone from mesh · 74c09b72
      Linus Lüssing committed
      Scenario:
      * Multicast frame sent from mesh to a BLA backbone (multiple nodes
        with their bat0 bridged together, with BLA enabled)
      
      Issue:
      * BLA backbone nodes receive the frame multiple times on bat0,
        once from mesh->bat0 and once from each backbone_gw from LAN
      
      For unicast, a node will send only to the best backbone gateway
      according to the TQ. However for multicast we currently cannot determine
      if multiple destination nodes share the same backbone if they don't share
      the same backbone with us. So we need to keep sending the unicasts to
      all backbone gateways and let the backbone gateways decide which one
      will forward the frame. We can use the CLAIM mechanism to make this
      decision.
      
      One catch: The batman-adv gateway feature for DHCP packets potentially
      sends multicast packets in the same batman-adv unicast header as the
      multicast optimizations code. And we are not allowed to drop those even
      if we did not claim the source address of the sender, as for such
      packets there is only this one multicast-in-unicast packet.
      
      How can we distinguish the two cases?
      
      The gateway feature uses a batman-adv unicast 4 address header. While
      the multicast-to-unicasts feature uses a simple, 3 address batman-adv
      unicast header. So let's use this to distinguish.
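
The decision described above can be sketched as a small predicate. This is an illustrative model only: the enum, the function name and its parameters are hypothetical and do not mirror the batman-adv code, and non-multicast frames are simply left to whatever handling already exists.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical header kinds: plain (3 address) vs. 4 address unicast. */
enum model_pkt_type { MODEL_UNICAST, MODEL_UNICAST_4ADDR };

/* Should a backbone gateway drop a frame that arrived in a batman-adv
 * unicast packet, instead of forwarding it to bat0?
 *
 * - 4 address header: gateway (DHCP) feature, this is the only copy of
 *   the frame, so it is never dropped here.
 * - 3 address header carrying multicast: several backbone gateways got a
 *   copy, so only the one that claimed the sender forwards it. */
static bool model_backbone_gw_should_drop(enum model_pkt_type type,
                                          bool is_multicast,
                                          bool sender_claimed)
{
    if (!is_multicast)
        return false;               /* not covered by this check */
    if (type == MODEL_UNICAST_4ADDR)
        return false;               /* only copy, must be kept */
    return !sender_claimed;
}

/* Usage example covering the two cases discussed above. */
static void example_usage(void)
{
    assert(model_backbone_gw_should_drop(MODEL_UNICAST, true, false));
    assert(!model_backbone_gw_should_drop(MODEL_UNICAST_4ADDR, true, false));
}
```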
      
      Fixes: fe2da6ff ("batman-adv: check incoming packet type for bla")
      Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
      Signed-off-by: Sven Eckelmann <sven@narfation.org>
      Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
    • batman-adv: mcast: fix duplicate mcast packets in BLA backbone from LAN · 3236d215
      Linus Lüssing committed
      Scenario:
      * Multicast frame sent from a BLA backbone (multiple nodes with
        their bat0 bridged together, with BLA enabled)
      
      Issue:
      * BLA backbone nodes receive the frame multiple times on bat0
      
      For multicast frames received via batman-adv broadcast packets the
      originator of the broadcast packet is checked before decapsulating and
      forwarding the frame to bat0 (batadv_bla_is_backbone_gw()->
      batadv_recv_bcast_packet()). If it came from a node which shares the
      same BLA backbone with us then it is not forwarded to bat0 to avoid a
      loop.
      
      When sending a multicast frame in a non-4-address batman-adv unicast
      packet we are currently missing this check - and cannot do so because
      the batman-adv unicast packet has no originator address field.
      
      However, we can simply fix this on the sender side by only sending the
      multicast frame via unicasts to interested nodes which do not share the
      same BLA backbone with us. This also nicely avoids some unnecessary
      transmissions on mesh side.
      
      Note that no infinite loop was observed, probably because of dropping
      via batadv_interface_tx()->batadv_bla_tx(). However the duplicates still
      utterly confuse switches/bridges, ICMPv6 duplicate address detection and
      neighbor discovery and therefore lead to long delays before being able
      to establish TCP connections, for instance. And it also leads to the Linux
      bridge printing messages like:
      "br-lan: received packet on eth1 with own address as source address ..."
      
      Fixes: 2d3f6ccc ("batman-adv: Modified forwarding behaviour for multicast packets")
      Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
      Signed-off-by: Sven Eckelmann <sven@narfation.org>
      Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
    • docs/bpf: Remove source code links · 65dce596
      Andrii Nakryiko committed
      Make path to bench_ringbufs.c just a text, not a special link.
      
      Fixes: 97abb2b3 ("docs/bpf: Add BPF ring buffer design notes")
      Reported-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
      Signed-off-by: Andrii Nakryiko <andriin@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20200915005031.2748397-1-andriin@fb.com
    • xsk: Fix number of pinned pages/umem size discrepancy · 2b1667e5
      Björn Töpel committed
      For AF_XDP sockets, there was a discrepancy between the number of
      pinned pages and the size of the umem region.
      
      The size of the umem region is used to validate the AF_XDP descriptor
      addresses. The logic that pinned the pages covered by the region only
      took whole pages into consideration, creating a mismatch between the
      size and pinned pages. A user could then pass AF_XDP addresses outside
      the range of pinned pages, but still within the size of the region,
      crashing the kernel.
      
      This change correctly calculates the number of pages to be
      pinned. Further, the size check for the aligned mode is
      simplified. Now the code simply checks if the size is divisible by the
      chunk size.
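
The arithmetic behind this can be sketched in user space as follows. The 4096-byte page size and all function names are assumptions made for illustration; the kernel code uses its own helpers and types.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096u   /* assumed page size, for illustration only */

/* Behaviour described as the problem: only whole pages are counted, so a
 * trailing partial page is part of the region but never pinned. */
static uint64_t pages_whole_only(uint64_t size)
{
    return size / MODEL_PAGE_SIZE;
}

/* Rounded-up count: a trailing partial page is pinned as well, so every
 * address below size falls inside a pinned page. */
static uint64_t pages_to_pin(uint64_t size)
{
    uint64_t npgs = size / MODEL_PAGE_SIZE;

    if (size % MODEL_PAGE_SIZE)
        npgs++;
    return npgs;
}

/* Aligned mode: the region size must be a multiple of the chunk size. */
static int aligned_size_ok(uint64_t size, uint32_t chunk_size)
{
    assert(chunk_size != 0);
    return size % chunk_size == 0;
}
```

For a 6144-byte region with 2048-byte chunks, `pages_whole_only()` yields 1 page while `pages_to_pin()` yields 2, which is the mismatch the commit message describes: offset 5000 is within the region size but lies in the second page.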
      
      Fixes: bbff2f32 ("xsk: new descriptor addressing scheme")
      Reported-by: Ciara Loftus <ciara.loftus@intel.com>
      Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Tested-by: Ciara Loftus <ciara.loftus@intel.com>
      Acked-by: Song Liu <songliubraving@fb.com>
      Link: https://lore.kernel.org/bpf/20200910075609.7904-1-bjorn.topel@gmail.com
    • net: sched: initialize with 0 before setting erspan md->u · 8e1b3ac4
      Xin Long committed
      In fl_set_erspan_opt(), all bits of the erspan md are set to 1, as this
      function is also used to set the opt MASK. However, when setting
      md->u.index for the opt VALUE, the remaining bits of the union md->u are
      left as 1. This makes the match on the whole md fail when the version
      is 1 and only the index is set.
      
      This patch fixes it by initializing md->u with 0 before setting it.
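
The effect can be reproduced with a simplified user-space model. The layout below is an assumption for illustration: `u` is a plain byte array standing in for the union, with the version 1 index in bytes 0..3, and all struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for the erspan metadata. u models the union: the
 * v1 index uses bytes 0..3, the v2 fields would use all 8 bytes. */
struct model_erspan_md {
    int version;
    unsigned char u[8];
};

static void store_index(struct model_erspan_md *md, uint32_t index)
{
    assert(md != NULL);
    memcpy(md->u, &index, sizeof(index));
}

/* Before the fix: md starts as all ones (the mask default) and only the
 * index bytes are overwritten, so bytes 4..7 of u stay 0xff. */
static void set_v1_value_buggy(struct model_erspan_md *md, uint32_t index)
{
    memset(md, 0xff, sizeof(*md));
    md->version = 1;
    store_index(md, index);
}

/* After the fix: u is zeroed before the index is stored. */
static void set_v1_value_fixed(struct model_erspan_md *md, uint32_t index)
{
    memset(md, 0xff, sizeof(*md));
    md->version = 1;
    memset(md->u, 0, sizeof(md->u));
    store_index(md, index);
}

/* Compare the whole metadata, the way a match on the full md behaves. */
static int md_equal(const struct model_erspan_md *a,
                    const struct model_erspan_md *b)
{
    return a->version == b->version &&
           memcmp(a->u, b->u, sizeof(a->u)) == 0;
}
```

With packet metadata that has version 1, index 5 and zeroes elsewhere, a key built by `set_v1_value_buggy()` does not compare equal, while one built by `set_v1_value_fixed()` does.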
      Reported-by: Shuang Li <shuali@redhat.com>
      Fixes: 79b1011c ("net: sched: allow flower to match erspan options")
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'net-improve-vxlan-option-process-in-net_sched-and-lwtunnel' · ad7b27c9
      David S. Miller committed
      Xin Long says:
      
      ====================
      net: improve vxlan option process in net_sched and lwtunnel
      
      This patchset applies a mask when setting the vxlan option in net_sched
      and lwtunnel, so that only the available bits can be set in vxlan md gbp.
      
      This helps when users don't know vxlan's gbp bits exactly, and avoids
      mismatches caused by unavailable bits being set by users.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • lwtunnel: only keep the available bits when setting vxlan md->gbp · 681d2cfb
      Xin Long committed
      As we can see from vxlan_build/parse_gbp_hdr(), when processing metadata
      on the vxlan rx/tx path, only the dont_learn/policy_applied/policy_id
      fields can be set in or parsed from the packet for the vxlan gbp option.
      
      So apply the mask when setting it in lwtunnel, as is done in
      act_tunnel_key and cls_flower.
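
A minimal sketch of such masking is shown below. The bit positions are written as I recall them from include/net/vxlan.h and should be treated as assumptions to verify against the kernel headers; the `MODEL_*` names and `sanitize_gbp()` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit layout of the gbp metadata word (verify against the
 * kernel's vxlan header before relying on these values). */
#define MODEL_GBP_DONT_LEARN     (UINT32_C(1) << 22)  /* dont_learn flag */
#define MODEL_GBP_POLICY_APPLIED (UINT32_C(1) << 19)  /* policy_applied flag */
#define MODEL_GBP_ID_MASK        UINT32_C(0xFFFF)     /* policy_id field */
#define MODEL_GBP_MASK \
    (MODEL_GBP_DONT_LEARN | MODEL_GBP_POLICY_APPLIED | MODEL_GBP_ID_MASK)

/* Keep only the bits that can actually appear in the gbp option. */
static uint32_t sanitize_gbp(uint32_t user_gbp)
{
    return user_gbp & MODEL_GBP_MASK;
}

/* Usage example: stray bits set by the user are dropped, valid ones kept. */
static void example_usage(void)
{
    assert(sanitize_gbp(UINT32_C(0xFFFFFFFF)) == MODEL_GBP_MASK);
    assert(sanitize_gbp(UINT32_C(0x1234)) == UINT32_C(0x1234));
}
```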
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>