1. 06 Jul, 2019 2 commits
  2. 05 Jul, 2019 5 commits
    • D
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next · c4cde580
      David S. Miller committed
      Daniel Borkmann says:
      
      ====================
      pull-request: bpf-next 2019-07-03
      
      The following pull-request contains BPF updates for your *net-next* tree.
      
      There is a minor merge conflict in mlx5 due to 8960b389 ("linux/dim:
      Rename externally used net_dim members") which has been pulled into your
      tree in the meantime, but resolution seems not that bad ... getting current
      bpf-next out now before there's coming more on mlx5. ;) I'm Cc'ing Saeed
      just so he's aware of the resolution below:
      
      ** First conflict in drivers/net/ethernet/mellanox/mlx5/core/en_main.c:
      
        <<<<<<< HEAD
        static int mlx5e_open_cq(struct mlx5e_channel *c,
                                 struct dim_cq_moder moder,
                                 struct mlx5e_cq_param *param,
                                 struct mlx5e_cq *cq)
        =======
        int mlx5e_open_cq(struct mlx5e_channel *c, struct net_dim_cq_moder moder,
                          struct mlx5e_cq_param *param, struct mlx5e_cq *cq)
        >>>>>>> e5a3e259
      
      Resolution is to take the second chunk and rename net_dim_cq_moder into
      dim_cq_moder. Also the signature for mlx5e_open_cq() in ...
      
        drivers/net/ethernet/mellanox/mlx5/core/en.h +977
      
      ... and in mlx5e_open_xsk() ...
      
        drivers/net/ethernet/mellanox/mlx5/core/en/xsk/setup.c +64
      
      ... needs the same rename from net_dim_cq_moder into dim_cq_moder.
      
      ** Second conflict in drivers/net/ethernet/mellanox/mlx5/core/en_main.c:
      
        <<<<<<< HEAD
                int cpu = cpumask_first(mlx5_comp_irq_get_affinity_mask(priv->mdev, ix));
                struct dim_cq_moder icocq_moder = {0, 0};
                struct net_device *netdev = priv->netdev;
                struct mlx5e_channel *c;
                unsigned int irq;
        =======
                struct net_dim_cq_moder icocq_moder = {0, 0};
        >>>>>>> e5a3e259
      
      Take the second chunk and rename net_dim_cq_moder into dim_cq_moder
      as well.
      
      Let me know if you run into any issues. Anyway, the main changes are:
      
      1) Long-awaited AF_XDP support for mlx5e driver, from Maxim.
      
      2) Addition of two new per-cgroup BPF hooks for getsockopt and
         setsockopt along with a new sockopt program type which allows more
         fine-grained pass/reject settings for containers. Also add a sock_ops
         callback that can be selectively enabled on a per-socket basis and is
         executed for every RTT to help tracking TCP statistics, both features
         from Stanislav.
      
      3) Follow-up fix for loops in precision tracking, which was not propagating
         precision marks; as a result the verifier assumed that some branches were
         not taken and wrongly removed them as dead code, from Alexei.
      
      4) Fix BPF cgroup release synchronization race which could lead to a
         double-free if a leaf's cgroup_bpf object is released and a new BPF
         program is attached to one of the ancestor cgroups in parallel, from Roman.
      
      5) Support for bulking XDP_TX on veth devices which improves performance
         in some cases by around 9%, from Toshiaki.
      
      6) Allow for lookups into BPF devmap and improve feedback when calling into
         bpf_redirect_map() as lookup is now performed right away in the helper
         itself, from Toke.
      
      7) Add support for fq's Earliest Departure Time to the Host Bandwidth
         Manager (HBM) sample BPF program, from Lawrence.
      
      8) Various cleanups and minor fixes all over the place from many others.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c4cde580
    • R
      net: ethernet: mediatek: Fix overlapping capability bits. · e2c74694
      René van Dorst committed
      Both MTK_TRGMII_MT7621_CLK and MTK_PATH_BIT are defined as bit 10.
      
      This can cause issues on non-MT7621 devices which have the
      MTK_PATH_BIT(MTK_ETH_PATH_GMAC1_RGMII) and MTK_TRGMII capabilities set:
      the wrong TRGMII setup code can be executed. The wrongly executed code
      currently does no harm on MT7623, and the TRGMII setup for the MT7623
      SoC side is done in the MT7530 driver, so it wasn’t noticed in testing.
      
      Move all capability bits in one enum so that they are all unique and easy
      to expand in the future.
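
The idea of keeping every capability bit in a single enum can be sketched as follows. This is an illustration only: the `EX_*` names and `ex_has_caps()` are hypothetical and are not the driver's real identifiers. Because each bit position comes from one enum, two capabilities cannot end up sharing a bit by accident.

```c
#include <stdint.h>

/*
 * Hypothetical capability bits, for illustration only.
 * The enumerator values are assigned by the compiler, so each
 * capability gets a distinct bit position automatically.
 */
enum example_cap_bit {
	EX_CAP_RGMII_BIT = 0,
	EX_CAP_TRGMII_BIT,
	EX_CAP_TRGMII_MT7621_CLK_BIT,
	EX_CAP_PATH_GMAC1_RGMII_BIT,
	EX_CAP_PATH_GMAC1_TRGMII_BIT,
	EX_CAP_COUNT /* number of bits in use; must stay <= 32 here */
};

/* Turn a bit position into a mask. */
#define EX_CAP(bit) (UINT32_C(1) << (bit))

/* Return non-zero only if every bit in 'wanted' is set in 'caps'. */
static inline int ex_has_caps(uint32_t caps, uint32_t wanted)
{
	return (caps & wanted) == wanted;
}
```

With this layout, a device that sets the TRGMII and GMAC1 RGMII path bits no longer appears to have the MT7621 clock capability.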
      
      Because mtk_eth_path enum is merged in to mkt_eth_capabilities, the
      variable path value is no longer between 0 to number of paths,
      mtk_eth_path_name can’t be used anymore in this form. Convert the
      mtk_eth_path_name array to a function to lookup the pathname.
      
      The old code walked thru the mtk_eth_path enum, which is also merged
      with mkt_eth_capabilities. Expand array mtk_eth_muxc so it can store the
      name and capability bit of the mux. Convert the code so it can walk thru
      the mtk_eth_muxc array.
      
      Fixes: 8efaa653 ("net: ethernet: mediatek: Add MT7621 TRGMII mode support")
      Signed-off-by: René van Dorst <opensource@vdorst.com>
      
      v1->v2:
      - Move all capability bits in one enum, suggested by Willem de Bruijn
      - Convert the mtk_eth_path_name array to a function to lookup the pathname
      - Expand array mtk_eth_muxc so it can also store the name and capability
        bit of the mux
      - Updated commit message
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e2c74694
    • W
      net: stmmac: Enable dwmac4 jumbo frame more than 8KiB · c3efed5a
      Weifeng Voon committed
      Enable GMAC v4.xx and beyond to support 16KiB buffer.
      Signed-off-by: Weifeng Voon <weifeng.voon@intel.com>
      Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c3efed5a
    • V
      bonding: add an option to specify a delay between peer notifications · 07a4ddec
      Vincent Bernat committed
      Currently, gratuitous ARP/ND packets are sent every `miimon'
      milliseconds. This commit allows a user to specify a custom delay
      through a new option, `peer_notif_delay'.
      
      Like for `updelay' and `downdelay', this delay should be a multiple of
      `miimon' to avoid managing an additional work queue. The configuration
      logic is copied from `updelay' and `downdelay'. However, the default
      value cannot be set using a module parameter: Netlink or sysfs should
      be used to configure this feature.
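
Expressing the delay as a multiple of `miimon' means it can be stored as a count of monitor intervals and decremented from the existing periodic work. A minimal sketch of that conversion, assuming the value is rounded down when it is not an exact multiple (the function name is hypothetical and this is not the driver's code):

```c
/*
 * Convert a delay in milliseconds into whole monitor intervals.
 * Rounds down when delay_ms is not a multiple of miimon_ms (an
 * assumption made for this sketch). Returns -1 if monitoring is
 * disabled (miimon_ms == 0), since the delay cannot be expressed.
 */
static int delay_to_intervals(unsigned int delay_ms, unsigned int miimon_ms)
{
	if (miimon_ms == 0)
		return -1;
	return (int)(delay_ms / miimon_ms);
}
```

With `miimon' at 100 and a delay of 500, this gives 5 intervals between notifications.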
      
      When setting `miimon' to 100 and `peer_notif_delay' to 500, we can
      observe the 500 ms delay is respected:
      
          20:30:19.354693 ARP, Request who-has 203.0.113.10 tell 203.0.113.10, length 28
          20:30:19.874892 ARP, Request who-has 203.0.113.10 tell 203.0.113.10, length 28
          20:30:20.394919 ARP, Request who-has 203.0.113.10 tell 203.0.113.10, length 28
          20:30:20.914963 ARP, Request who-has 203.0.113.10 tell 203.0.113.10, length 28
      
      In bond_mii_monitor(), I have tried to keep the lock logic readable.
      The change is due to the fact we cannot rely on a notification to
      lower the value of `bond->send_peer_notif' as `NETDEV_NOTIFY_PEERS' is
      only triggered once every N times, while we need to decrement the
      counter each time.
      
      iproute2 also needs to be updated to be able to specify this new
      attribute through `ip link'.
      Signed-off-by: Vincent Bernat <vincent@bernat.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      07a4ddec
    • C
      net: ethernet: sun: remove redundant assignment to variable err · 2368a870
      Colin Ian King committed
      The variable err is being assigned with a value that is never
      read and it is being updated in the next statement with a new value.
      The assignment is redundant and can be removed.
      
      Addresses-Coverity: ("Unused value")
      Signed-off-by: Colin Ian King <colin.king@canonical.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2368a870
  3. 04 Jul, 2019 12 commits
  4. 03 Jul, 2019 21 commits
    • D
      Merge branch 'bpf-tcp-rtt-hook' · e5a3e259
      Daniel Borkmann committed
      Stanislav Fomichev says:
      
      ====================
      Congestion control team would like to have a periodic callback to
      track some TCP statistics. Let's add a sock_ops callback that can be
      selectively enabled on a socket by socket basis and is executed for
      every RTT. BPF program frequency can be further controlled by calling
      bpf_ktime_get_ns and bailing out early.
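
The "bail out early" pattern can be sketched in plain C as below. This is a userspace illustration under stated assumptions: the struct and function names are hypothetical, the timestamp is passed in by the caller, and in a real BPF program it would come from bpf_ktime_get_ns() with the state kept somewhere persistent such as a map.

```c
#include <stdint.h>

/* Run the expensive part at most once per interval (1 second here). */
#define DUMP_INTERVAL_NS 1000000000ULL

/* Hypothetical per-socket state for this sketch. */
struct throttle_state {
	uint64_t next_run_ns; /* earliest time the next run is allowed */
};

/*
 * Called on every RTT callback. Returns 1 if the caller should do
 * its work now, 0 if it should bail out early.
 */
static int should_run(struct throttle_state *st, uint64_t now_ns)
{
	if (now_ns < st->next_run_ns)
		return 0;
	st->next_run_ns = now_ns + DUMP_INTERVAL_NS;
	return 1;
}
```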
      
      I ran neper tcp_stream and tcp_rr tests with the sample program
      from the last patch and didn't observe any noticeable performance
      difference.
      
      v2:
      * add a comment about second accept() in selftest (Yonghong Song)
      * refer to tcp_bpf.readme in sample program (Yonghong Song)
      ====================
      Suggested-by: Eric Dumazet <edumazet@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Priyaranjan Jha <priyarjha@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Cc: Soheil Hassas Yeganeh <soheil@google.com>
      Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
      Acked-by: Yuchung Cheng <ycheng@google.com>
      Acked-by: Yonghong Song <yhs@fb.com>
      Acked-by: Lawrence Brakmo <brakmo@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      e5a3e259
    • S
      samples/bpf: fix tcp_bpf.readme detach command · d78e3f06
      Stanislav Fomichev committed
      Copy-paste, should be detach, not attach.
      Signed-off-by: Stanislav Fomichev <sdf@google.com>
      Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
      Acked-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      d78e3f06
    • S
      samples/bpf: add sample program that periodically dumps TCP stats · 39533884
      Stanislav Fomichev committed
      Uses new RTT callback to dump stats every second.
      
      $ mkdir -p /tmp/cgroupv2
      $ mount -t cgroup2 none /tmp/cgroupv2
      $ mkdir -p /tmp/cgroupv2/foo
      $ echo $$ >> /tmp/cgroupv2/foo/cgroup.procs
      $ bpftool prog load ./tcp_dumpstats_kern.o /sys/fs/bpf/tcp_prog
      $ bpftool cgroup attach /tmp/cgroupv2/foo sock_ops pinned /sys/fs/bpf/tcp_prog
      $ bpftool prog tracelog
      $ # run neper/netperf/etc
      
      Used neper to compare performance with and without this program attached
      and didn't see any noticeable performance impact.
      
      Sample output:
        <idle>-0     [015] ..s.  2074.128800: 0: dsack_dups=0 delivered=242526
        <idle>-0     [015] ..s.  2074.128808: 0: delivered_ce=0 icsk_retransmits=0
        <idle>-0     [015] ..s.  2075.130133: 0: dsack_dups=0 delivered=323599
        <idle>-0     [015] ..s.  2075.130138: 0: delivered_ce=0 icsk_retransmits=0
        <idle>-0     [005] .Ns.  2076.131440: 0: dsack_dups=0 delivered=404648
        <idle>-0     [005] .Ns.  2076.131447: 0: delivered_ce=0 icsk_retransmits=0
      
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Priyaranjan Jha <priyarjha@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Cc: Soheil Hassas Yeganeh <soheil@google.com>
      Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
      Acked-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: Stanislav Fomichev <sdf@google.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      39533884
    • S
      selftests/bpf: test BPF_SOCK_OPS_RTT_CB · b5587398
      Stanislav Fomichev committed
      Make sure the callback is invoked for syn-ack and data packet.
      
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Priyaranjan Jha <priyarjha@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Cc: Soheil Hassas Yeganeh <soheil@google.com>
      Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
      Acked-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: Stanislav Fomichev <sdf@google.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      b5587398
    • S
      bpf/tools: sync bpf.h · 692cbaa9
      Stanislav Fomichev committed
      Sync new bpf_tcp_sock fields and new BPF_PROG_TYPE_SOCK_OPS RTT callback.
      
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Priyaranjan Jha <priyarjha@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Cc: Soheil Hassas Yeganeh <soheil@google.com>
      Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
      Acked-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: Stanislav Fomichev <sdf@google.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      692cbaa9
    • S
      bpf: add icsk_retransmits to bpf_tcp_sock · c2cb5e82
      Stanislav Fomichev committed
      Add some inet_connection_sock fields to bpf_tcp_sock that might be useful
      for debugging congestion control issues.
      
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Priyaranjan Jha <priyarjha@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Cc: Soheil Hassas Yeganeh <soheil@google.com>
      Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
      Acked-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: Stanislav Fomichev <sdf@google.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      c2cb5e82
    • S
      bpf: add dsack_dups/delivered{, _ce} to bpf_tcp_sock · 0357746d
      Stanislav Fomichev committed
      Add more fields to bpf_tcp_sock that might be useful for debugging
      congestion control issues.
      
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Priyaranjan Jha <priyarjha@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Cc: Soheil Hassas Yeganeh <soheil@google.com>
      Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
      Acked-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: Stanislav Fomichev <sdf@google.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      0357746d
    • S
      bpf: split shared bpf_tcp_sock and bpf_sock_ops implementation · 2377b81d
      Stanislav Fomichev committed
      We've added bpf_tcp_sock member to bpf_sock_ops and don't expect
      any new tcp_sock fields in bpf_sock_ops. Let's remove
      CONVERT_COMMON_TCP_SOCK_FIELDS so bpf_tcp_sock can be independently
      extended.
      
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Priyaranjan Jha <priyarjha@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Cc: Soheil Hassas Yeganeh <soheil@google.com>
      Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
      Acked-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: Stanislav Fomichev <sdf@google.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      2377b81d
    • S
      bpf: add BPF_CGROUP_SOCK_OPS callback that is executed on every RTT · 23729ff2
      Stanislav Fomichev committed
      Performance impact should be minimal because it's under a new
      BPF_SOCK_OPS_RTT_CB_FLAG flag that has to be explicitly enabled.
      Suggested-by: Eric Dumazet <edumazet@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Priyaranjan Jha <priyarjha@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Cc: Soheil Hassas Yeganeh <soheil@google.com>
      Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
      Acked-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: Stanislav Fomichev <sdf@google.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      23729ff2
    • J
      selftests: bpf: standardize to static __always_inline · d2f5bbbc
      Jiri Benc committed
      The progs for bpf selftests use several different notations to force
      function inlining. Standardize to what most of them use,
      static __always_inline.
      Suggested-by: Song Liu <liu.song.a23@gmail.com>
      Signed-off-by: Jiri Benc <jbenc@redhat.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      d2f5bbbc
    • B
      bpf: Add support for fq's EDT to HBM · 71634d7f
      brakmo committed
      Adds support for fq's Earliest Departure Time to HBM (Host Bandwidth
      Manager). Includes a new BPF program supporting EDT, and also updates
      corresponding programs.
      
      It will drop packets with an EDT of more than 500us in the future
      unless the packet belongs to a flow with less than 2 packets in flight.
      This is done so each flow has at least 2 packets in flight, so they
      will not starve, and also to help prevent delayed ACK timeouts.
      
      It will also work with ECN enabled traffic, where the packets will be
      CE marked if their EDT is more than 50us in the future.
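
The drop and mark rules described above can be sketched as a small decision function. This is a simplified illustration, not the sample program itself: the names are hypothetical, and the ordering of the checks (drop first, then CE mark) is an assumption of this sketch.

```c
#include <stdint.h>

#define EX_DROP_THRESH_NS 500000ULL /* 500us, from the text above */
#define EX_MARK_THRESH_NS  50000ULL /* 50us, from the text above */

enum ex_edt_action { EX_EDT_PASS, EX_EDT_MARK_CE, EX_EDT_DROP };

/*
 * Decide what to do with a packet whose earliest departure time is
 * edt_ns, given the current time and the flow's packets in flight.
 */
static enum ex_edt_action ex_edt_decide(uint64_t now_ns, uint64_t edt_ns,
					unsigned int pkts_in_flight,
					int ecn_capable)
{
	/* How far in the future the packet is scheduled (0 if already due). */
	uint64_t ahead_ns = edt_ns > now_ns ? edt_ns - now_ns : 0;

	/* Drop only when the flow keeps at least 2 packets in flight. */
	if (ahead_ns > EX_DROP_THRESH_NS && pkts_in_flight >= 2)
		return EX_EDT_DROP;

	/* ECN-enabled traffic gets a CE mark at the lower threshold. */
	if (ecn_capable && ahead_ns > EX_MARK_THRESH_NS)
		return EX_EDT_MARK_CE;

	return EX_EDT_PASS;
}
```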
      
      The table below shows some performance numbers. The flows are
      back-to-back RPCs: one server sending to another, with either 2 or 4
      flows. One flow is a 10KB RPC, the rest are 1MB RPCs. When there is more
      than one flow of a given RPC size, the numbers represent averages.
      
      The rate limit applies to all flows (they are in the same cgroup).
      Tests ending with "-edt" ran with the new BPF program supporting EDT.
      Tests ending with "-htb" ran on top of an HTB qdisc with the specified
      rate (i.e. no HBM). The other tests ran with the HBM BPF program included
      in the HBM patch-set.
      
      EDT has limited value when using DCTCP, but it helps in many cases when
      using Cubic. It usually achieves larger link utilization and lower
      99% latencies for the 1MB RPCs.
      HBM ends up queueing a lot of packets with its default parameter values,
      reducing the goodput of the 10KB RPCs and increasing their latency. Also,
      the RTTs seen by the flows are quite large.
      
                               Aggr              10K  10K  10K   1MB  1MB  1MB
               Limit           rate drops  RTT  rate  P90  P99  rate  P90  P99
      Test      rate  Flows    Mbps   %     us  Mbps   us   us  Mbps   ms   ms
      --------  ----  -----    ---- -----  ---  ---- ---- ----  ---- ---- ----
      cubic       1G    2       904  0.02  108   257  511  539   647 13.4 24.5
      cubic-edt   1G    2       982  0.01  156   239  656  967   743 14.0 17.2
      dctcp       1G    2       977  0.00  105   324  408  744   653 14.5 15.9
      dctcp-edt   1G    2       981  0.01  142   321  417  811   660 15.7 17.0
      cubic-htb   1G    2       919  0.00 1825    40 2822 4140   879  9.7  9.9
      
      cubic     200M    2       155  0.30  220    81  532  655    74  283  450
      cubic-edt 200M    2       188  0.02  222    87 1035 1095   101   84   85
      dctcp     200M    2       188  0.03  111    77  912  939   111   76  325
      dctcp-edt 200M    2       188  0.03  217    74 1416 1738   114   76   79
      cubic-htb 200M    2       188  0.00 5015     8 14ms 15ms   180   48   50
      
      cubic       1G    4       952  0.03  110   165  516  546   262   38  154
      cubic-edt   1G    4       973  0.01  190   111 1034 1314   287   65   79
      dctcp       1G    4       951  0.00  103   180  617  905   257   37   38
      dctcp-edt   1G    4       967  0.00  163   151  732 1126   272   43   55
      cubic-htb   1G    4       914  0.00 3249    13  7ms  8ms   300   29   34
      
      cubic       5G    4      4236  0.00  134   305  490  624  1310   10   17
      cubic-edt   5G    4      4865  0.00  156   306  425  759  1520   10   16
      dctcp       5G    4      4936  0.00  128   485  221  409  1484    7    9
      dctcp-edt   5G    4      4924  0.00  148   390  392  623  1508   11   26
      
      v1 -> v2: Incorporated Andrii's suggestions
      v2 -> v3: Incorporated Yonghong's suggestions
      v3 -> v4: Removed credit update that is not needed
      Signed-off-by: Lawrence Brakmo <brakmo@fb.com>
      Acked-by: Yonghong Song <yhs@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      71634d7f
    • L
      bpf, libbpf, smatch: Fix potential NULL pointer dereference · 33bae185
      Leo Yan committed
      Based on the following report from Smatch, fix the potential NULL
      pointer dereference check:
      
        tools/lib/bpf/libbpf.c:3493
        bpf_prog_load_xattr() warn: variable dereferenced before check 'attr'
        (see line 3483)
      
        3479 int bpf_prog_load_xattr(const struct bpf_prog_load_attr *attr,
        3480                         struct bpf_object **pobj, int *prog_fd)
        3481 {
        3482         struct bpf_object_open_attr open_attr = {
        3483                 .file           = attr->file,
        3484                 .prog_type      = attr->prog_type,
                                               ^^^^^^
        3485         };
      
      At the head of the function, it directly accesses 'attr' without checking
      whether it is a NULL pointer. This patch moves the value assignments after
      validating 'attr' and 'attr->file'.
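
The general shape of the fix is to validate the pointer before reading through it. A minimal sketch with hypothetical types (`ex_load_attr`, `ex_open_attr` and `ex_prepare_open()` are invented for this example and are not the libbpf API):

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the attribute structures. */
struct ex_load_attr { const char *file; int prog_type; };
struct ex_open_attr { const char *file; int prog_type; };

/*
 * Fill 'out' from 'attr'. The fields of 'attr' are read only after
 * 'attr' and 'attr->file' have been checked, so a NULL 'attr' yields
 * -EINVAL instead of a NULL pointer dereference.
 */
static int ex_prepare_open(const struct ex_load_attr *attr,
			   struct ex_open_attr *out)
{
	if (!attr || !attr->file || !out)
		return -EINVAL;

	out->file = attr->file;
	out->prog_type = attr->prog_type;
	return 0;
}
```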
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Acked-by: Yonghong Song <yhs@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      33bae185
    • A
      libbpf: fix GCC8 warning for strncpy · cdfc7f88
      Andrii Nakryiko committed
      GCC8 started emitting a warning about using strncpy with a number of bytes
      exactly equal to the destination size, which is generally unsafe, as it can
      lead to a non-zero-terminated string being copied. Use IFNAMSIZ - 1 as the
      number of bytes to ensure the name is always zero-terminated.
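
A small standalone sketch of the pattern follows. `EX_NAME_SIZE`, the struct and the function are hypothetical; the size is defined locally (16, the value of IFNAMSIZ on Linux) so the example does not depend on kernel headers. Unlike the patch, which relies on the destination already being zeroed, this sketch also writes the terminator explicitly.

```c
#include <string.h>

/* Local stand-in for IFNAMSIZ so the example is self-contained. */
#define EX_NAME_SIZE 16

struct ex_ifname {
	char name[EX_NAME_SIZE];
};

/*
 * Copy at most EX_NAME_SIZE - 1 bytes so the last byte of the buffer
 * is never overwritten by source data, then terminate explicitly.
 */
static void ex_set_ifname(struct ex_ifname *dst, const char *src)
{
	strncpy(dst->name, src, EX_NAME_SIZE - 1);
	dst->name[EX_NAME_SIZE - 1] = '\0';
}
```

Over-long names are truncated to 15 characters rather than left unterminated.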
      Signed-off-by: Andrii Nakryiko <andriin@fb.com>
      Cc: Magnus Karlsson <magnus.karlsson@intel.com>
      Acked-by: Yonghong Song <yhs@fb.com>
      Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      cdfc7f88
    • A
      bpf: fix precision tracking · a3ce685d
      Alexei Starovoitov committed
      When equivalent state is found the current state needs to propagate precision marks.
      Otherwise the verifier will prune the search incorrectly.
      
      There is a price for correctness:
                            before      before    broken    fixed
                            cnst spill  precise   precise
      bpf_lb-DLB_L3.o       1923        8128      1863      1898
      bpf_lb-DLB_L4.o       3077        6707      2468      2666
      bpf_lb-DUNKNOWN.o     1062        1062      544       544
      bpf_lxc-DDROP_ALL.o   166729      380712    22629     36823
      bpf_lxc-DUNKNOWN.o    174607      440652    28805     45325
      bpf_netdev.o          8407        31904     6801      7002
      bpf_overlay.o         5420        23569     4754      4858
      bpf_lxc_jit.o         39389       359445    50925     69631
      Overall precision tracking is still very effective.
      
      Fixes: b5dc0163 ("bpf: precise scalar_value tracking")
      Reported-by: Lawrence Brakmo <brakmo@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Andrii Nakryiko <andriin@fb.com>
      Tested-by: Lawrence Brakmo <brakmo@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      a3ce685d
    • P
      mlxsw: spectrum_ptp: Fix validation in mlxsw_sp1_ptp_packet_finish() · dbcdb61a
      Petr Machata committed
      Before mlxsw_sp1_ptp_packet_finish() sends the packet back, it validates
      whether the corresponding port is still valid. However the condition is
      incorrect: when mlxsw_sp_port == NULL, the code dereferences the port to
      compare it to skb->dev.
      
      The condition needs to check whether the port is present and skb->dev still
      refers to that port (or else is NULL). If that does not hold, bail out.
      Add a pair of parentheses to fix the condition.
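
The intended condition can be sketched with hypothetical stand-in types (the `ex_*` names are invented for this example and do not mirror the driver's structures). The inner parentheses group the two `skb->dev` alternatives, so the port pointer is tested before it is ever dereferenced.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the device, port and packet. */
struct ex_dev { int id; };
struct ex_port { struct ex_dev *dev; };
struct ex_skb { struct ex_dev *dev; };

/*
 * Non-zero when the port exists and the packet either has no device
 * or still refers to that port's device. The short-circuit on 'port'
 * guarantees port->dev is only read when port is non-NULL.
 */
static int ex_port_still_valid(const struct ex_port *port,
			       const struct ex_skb *skb)
{
	return port && (!skb->dev || skb->dev == port->dev);
}
```

A caller would bail out when this returns zero.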
      
      Fixes: d92e4e6e ("mlxsw: spectrum: PTP: Support timestamping on Spectrum-1")
      Reported-by: Colin Ian King <colin.king@canonical.com>
      Signed-off-by: Petr Machata <petrm@mellanox.com>
      Signed-off-by: Colin Ian King <colin.king@canonical.com>
      Reviewed-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      dbcdb61a
    • H
      r8169: add random MAC address fallback · c782e204
      Heiner Kallweit committed
      It was reported that the GPD MicroPC is broken in a way that no valid
      MAC address can be read from the network chip. The vendor driver deals
      with this by assigning a random MAC address as fallback. So let's do
      the same.
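
A random fallback address has to be a valid unicast address and should be flagged as locally administered. A userspace sketch of that bit handling, with a hypothetical function name and the random bytes supplied by the caller (in the kernel, a helper such as eth_random_addr() handles this):

```c
#include <stdint.h>
#include <string.h>

/*
 * Build a MAC address from 6 caller-supplied random bytes.
 * Bit 0 of the first octet is cleared (unicast, not multicast) and
 * bit 1 is set (locally administered, not vendor-assigned).
 */
static void ex_make_random_mac(uint8_t addr[6], const uint8_t rnd[6])
{
	memcpy(addr, rnd, 6);
	addr[0] &= 0xfe; /* clear the multicast bit */
	addr[0] |= 0x02; /* set the locally administered bit */
}
```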
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c782e204
    • H
      Revert "r8169: improve handling VLAN tag" · 7424edbb
      Heiner Kallweit committed
      This reverts commit 759d0957.
      
      The patch was based on a misunderstanding. As Al Viro pointed out [0]
      it's simply wrong on big endian. So let's revert it.
      
      [0] https://marc.info/?t=156200975600004&r=1&w=2

      Reported-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7424edbb
    • M
      net: stmmac: make "snps,reset-delays-us" optional again · cc5e92c2
      Martin Blumenstingl committed
      Commit 760f1dc2 ("net: stmmac: add sanity check to
      device_property_read_u32_array call") introduced error checking of the
      device_property_read_u32_array() call in stmmac_mdio_reset().
      This results in the following error when the "snps,reset-delays-us"
      property is not defined in devicetree:
        invalid property snps,reset-delays-us
      
      This sanity check made sense until commit 84ce4d0f ("net: stmmac:
      initialize the reset delay array") ensured that there are fallback
      values for the reset delay if the "snps,reset-delays-us" property is
      absent. That was at the cost of making that property mandatory though.
      
      Drop the sanity check for device_property_read_u32_array() and thus make
      the "snps,reset-delays-us" property optional again (avoiding the error
      message while loading the stmmac driver with a .dtb where the property
      is absent).
      
      Fixes: 760f1dc2 ("net: stmmac: add sanity check to device_property_read_u32_array call")
      Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      cc5e92c2
    • E
      bonding/main: fix NULL dereference in bond_select_active_slave() · b8bd72d3
      Eric Dumazet committed
      A bonding master can be up while best_slave is NULL.
      
      [12105.636318] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
      [12105.638204] mlx4_en: eth1: Linkstate event 1 -> 1
      [12105.648984] IP: bond_select_active_slave+0x125/0x250
      [12105.653977] PGD 0 P4D 0
      [12105.656572] Oops: 0000 [#1] SMP PTI
      [12105.660487] gsmi: Log Shutdown Reason 0x03
      [12105.664620] Modules linked in: kvm_intel loop act_mirred uhaul vfat fat stg_standard_ftl stg_megablocks stg_idt stg_hdi stg elephant_dev_num stg_idt_eeprom w1_therm wire i2c_mux_pca954x i2c_mux mlx4_i2c i2c_usb cdc_acm ehci_pci ehci_hcd i2c_iimc mlx4_en mlx4_ib ib_uverbs ib_core mlx4_core [last unloaded: kvm_intel]
      [12105.685686] mlx4_core 0000:03:00.0: dispatching link up event for port 2
      [12105.685700] mlx4_en: eth2: Linkstate event 2 -> 1
      [12105.685700] mlx4_en: eth2: Link Up (linkstate)
      [12105.724452] Workqueue: bond0 bond_mii_monitor
      [12105.728854] RIP: 0010:bond_select_active_slave+0x125/0x250
      [12105.734355] RSP: 0018:ffffaf146a81fd88 EFLAGS: 00010246
      [12105.739637] RAX: 0000000000000003 RBX: ffff8c62b03c6900 RCX: 0000000000000000
      [12105.746838] RDX: 0000000000000000 RSI: ffffaf146a81fd08 RDI: ffff8c62b03c6000
      [12105.754054] RBP: ffffaf146a81fdb8 R08: 0000000000000001 R09: ffff8c517d387600
      [12105.761299] R10: 00000000001075d9 R11: ffffffffaceba92f R12: 0000000000000000
      [12105.768553] R13: ffff8c8240ae4800 R14: 0000000000000000 R15: 0000000000000000
      [12105.775748] FS:  0000000000000000(0000) GS:ffff8c62bfa40000(0000) knlGS:0000000000000000
      [12105.783892] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [12105.789716] CR2: 0000000000000000 CR3: 0000000d0520e001 CR4: 00000000001626f0
      [12105.796976] Call Trace:
      [12105.799446]  [<ffffffffac31d387>] bond_mii_monitor+0x497/0x6f0
      [12105.805317]  [<ffffffffabd42643>] process_one_work+0x143/0x370
      [12105.811225]  [<ffffffffabd42c7a>] worker_thread+0x4a/0x360
      [12105.816761]  [<ffffffffabd48bc5>] kthread+0x105/0x140
      [12105.821865]  [<ffffffffabd42c30>] ? rescuer_thread+0x380/0x380
      [12105.827757]  [<ffffffffabd48ac0>] ? kthread_associate_blkcg+0xc0/0xc0
      [12105.834266]  [<ffffffffac600241>] ret_from_fork+0x51/0x60
      
      Fixes: e2a7420d ("bonding/main: convert to using slave printk macros")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: John Sperbeck <jsperbeck@google.com>
      Cc: Jarod Wilson <jarod@redhat.com>
      CC: Jay Vosburgh <j.vosburgh@gmail.com>
      CC: Veaceslav Falico <vfalico@gmail.com>
      CC: Andy Gospodarek <andy@greyhouse.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b8bd72d3
    • X
      tipc: remove ub->ubsock checks · d2c3a4ba
      Xin Long committed
      Both tipc_udp_enable and tipc_udp_disable are called under rtnl_lock,
      ub->ubsock could never be NULL in tipc_udp_disable and cleanup_bearer,
      so remove the check.
      
      Also remove the one in tipc_udp_enable by adding "free" label.
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d2c3a4ba
    • S
      ipv4: Fix off-by-one in route dump counter without netlink strict checking · 885b8b4d
      Stefano Brivio committed
      In commit ee28906f ("ipv4: Dump route exceptions if requested") I
      added a counter of per-node dumped routes (including actual routes and
      exceptions), analogous to the existing counter for dumped nodes. Dumping
      exceptions means we need to also keep track of how many routes are dumped
      for each node: this would be just one route per node, without exceptions.
      
      When netlink strict checking is not enabled, we dump both routes and
      exceptions at the same time: the RTM_F_CLONED flag is not used as a
      filter. In this case, the per-node counter 'i_fa' is incremented by one
      to track the single dumped route, then also incremented by one for each
      exception dumped, and then stored as netlink callback argument as skip
      counter, 's_fa', to be used when a partial dump operation restarts.
      
      The per-node counter needs to be increased by one also when we skip a
      route (exception) due to a previous non-zero skip counter, because it
      needs to match the existing skip counter, if we are dumping both routes
      and exceptions. I missed this, and only incremented the counter, for
      regular routes, if the previous skip counter was zero. This means that,
      in case of a mixed dump, partial dump operations after the first one
      will start with a mismatching skip counter value, one less than expected.
      
      This means in turn that the first exception for a given node is skipped
      every time a partial dump operation restarts, if netlink strict checking
      is not enabled (iproute < 5.0).
      
      It turns out I didn't repeat the test in its final version, commit
      de755a85 ("selftests: pmtu: Introduce list_flush_ipv4_exception test
      case"), which also counts the number of route exceptions returned, with
      iproute2 versions < 5.0 -- I was instead using the equivalent of the IPv6
      test as it was before commit b964641e ("selftests: pmtu: Make
      list_flush_ipv6_exception test more demanding").
      
      Always increment the per-node counter by one if we previously dumped
      a regular route, so that it matches the current skip counter.
      
      Fixes: ee28906f ("ipv4: Dump route exceptions if requested")
      Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      885b8b4d