1. 05 Oct, 2017 18 commits
    • net: Add extack to netdev_notifier_info · 51d0c047
      David Ahern committed
      Add netlink_ext_ack to netdev_notifier_info to allow notifier
      handlers to return errors to userspace.
      
      Clean up the initialization in dev.c such that extack is easily
      added in subsequent patches where relevant. Specifically, remove
      the init call in call_netdevice_notifiers_info and have callers
      initialize on stack when info is declared.
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • dev: advertise the new nsid when the netns iface changes · 6621dd29
      Nicolas Dichtel committed
      x-netns interfaces are bound to two netns: the link netns and the upper
      netns. Usually, this kind of interface is created in the link netns and
      then moved to the upper netns. In the end, the interface is visible only
      in the upper netns. The link nsid is advertised via netlink in the upper
      netns, so the user always knows where the link part is.
      
      There is no such mechanism in the link netns. When the interface is moved
      to another netns, the user cannot "follow" it.
      This patch adds a new netlink attribute which helps to follow an interface
      which moves to another netns. When the interface is unregistered, the new
      nsid is advertised. If the interface is an x-netns interface (i.e.
      rtnl_link_ops->get_link_net is defined), the nsid is allocated if needed.
      
      CC: Jason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'bpf-cgroup-multi-prog' · b295edc5
      David S. Miller committed
      Alexei Starovoitov says:
      
      ====================
      bpf: multi prog support for cgroup-bpf
      
      v1->v2:
      - fixed two accidentally swapped lines that kept static_key from going to zero
      - addressed Martin's feedback and changed prog_query to be consistent
        with verifier output: return -enospc and fill supplied buffer instead
        of just returning -enospc when buffer is too small to fit all prog_ids
      
      v1:
      cgroup-bpf use cases are getting more advanced and running only
      one program per cgroup is no longer enough. Therefore introduce
      support for attaching multiple programs per cgroup and running
      a set of effective programs.
      
      These patches introduce the BPF_F_ALLOW_MULTI flag for the BPF_PROG_ATTACH cmd.
      The default is still NONE and behavior of BPF_F_ALLOW_OVERRIDE flag
      is unchanged.
      The difference between three possible flags for BPF_PROG_ATTACH command:
      - NONE(default): No further bpf programs allowed in the subtree.
      - BPF_F_ALLOW_OVERRIDE: If a sub-cgroup installs some bpf program,
        the program in this cgroup yields to sub-cgroup program.
      - BPF_F_ALLOW_MULTI: If a sub-cgroup installs some bpf program,
        that cgroup program gets run in addition to the program in this cgroup.
      
      Most of the logic is in patch 1. Even when a cgroup doesn't have
      any programs attached, its set of effective programs can be non-empty.
      To execute them quickly and avoid penalizing cgroups without
      any effective programs, introduce 'struct bpf_prog_array',
      which has an optimization for cgroups with zero effective programs.
      
      Patch 2 introduces the BPF_PROG_QUERY command for introspection.
      Patch 3 makes verifier more strict for cgroup-bpf program types.
      Patch 4+ are tests.
      
      More details in individual patches
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
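The v2 note in the cover letter above changes prog_query so that a too-small buffer is still filled before -ENOSPC is returned. A minimal user-space model of that behavior might look like the sketch below. The function name `prog_query_model`, its tuple return value, and the reporting of the total program count are assumptions made for illustration, not the kernel or libbpf interface.

```python
import errno


def prog_query_model(attached_ids, buf_len):
    """Model of the v2 query behavior (hypothetical helper, not a kernel API).

    Copies as many program ids as fit into a buffer of buf_len entries.
    Returns (ret, filled, total): ret is -ENOSPC when the buffer could not
    hold every id and 0 otherwise. Reporting the total is an assumption.
    """
    total = len(attached_ids)
    filled = list(attached_ids[:buf_len])
    ret = -errno.ENOSPC if total > buf_len else 0
    return ret, filled, total
```

With three attached ids and room for two, the model returns -ENOSPC together with the first two ids, so a caller can still inspect a partial result and retry with a larger buffer.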
    • samples/bpf: use bpf_prog_query() interface · dfc06999
      Alexei Starovoitov committed
      use BPF_PROG_QUERY command to strengthen test coverage
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • libbpf: add support for BPF_PROG_QUERY · 5d0cbf9b
      Alexei Starovoitov committed
      add support for BPF_PROG_QUERY command to libbpf
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • libbpf: sync bpf.h · defd9c47
      Alexei Starovoitov committed
      tools/include/uapi/linux/bpf.h got out of sync with actual kernel header.
      Update it.
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • samples/bpf: add multi-prog cgroup test case · 39323e78
      Alexei Starovoitov committed
      create 5 cgroups, attach 6 progs and check that progs are executed as:
      cgrp1 (MULTI progs A, B) ->
         cgrp2 (OVERRIDE prog C) ->
           cgrp3 (MULTI prog D) ->
             cgrp4 (OVERRIDE prog E) ->
               cgrp5 (NONE prog F)
      the event in cgrp5 triggers execution of F,D,A,B in that order.
      if prog F is detached, the execution is E,D,A,B
      if prog F and D are detached, the execution is E,A,B
      if prog F, E and D are detached, the execution is C,A,B
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
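The execution orders listed in this test case can be reproduced with a small user-space model. The sketch below assumes one rule: walking from the cgroup that sees the event up to the root, a cgroup contributes its programs if nothing has been collected yet or if it was attached with MULTI. The `Cgroup` class and the helper names are hypothetical, and this is a model of the described behavior rather than the kernel code.

```python
from dataclasses import dataclass, field
from typing import List, Optional

NONE, OVERRIDE, MULTI = "NONE", "OVERRIDE", "MULTI"


@dataclass
class Cgroup:
    """Hypothetical stand-in for a cgroup with attached programs."""
    name: str
    flag: str
    progs: List[str] = field(default_factory=list)
    parent: Optional["Cgroup"] = None


def effective_progs(cgrp):
    """Collect the programs that run for an event in cgrp (assumed rule)."""
    order = []
    node = cgrp
    while node is not None:
        # A cgroup contributes if nothing was collected yet, or if it is MULTI.
        if not order or node.flag == MULTI:
            order.extend(node.progs)
        node = node.parent
    return order


def build_test_hierarchy():
    """Build the five-level hierarchy from the test description."""
    cgrp1 = Cgroup("cgrp1", MULTI, ["A", "B"])
    cgrp2 = Cgroup("cgrp2", OVERRIDE, ["C"], cgrp1)
    cgrp3 = Cgroup("cgrp3", MULTI, ["D"], cgrp2)
    cgrp4 = Cgroup("cgrp4", OVERRIDE, ["E"], cgrp3)
    cgrp5 = Cgroup("cgrp5", NONE, ["F"], cgrp4)
    return {c.name: c for c in (cgrp1, cgrp2, cgrp3, cgrp4, cgrp5)}


hier = build_test_hierarchy()
print(effective_progs(hier["cgrp5"]))  # ['F', 'D', 'A', 'B']
```

Removing "F", then "D", then "E" from the corresponding `progs` lists yields E,D,A,B, then E,A,B, then C,A,B, matching the sequence in the commit message.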
    • libbpf: introduce bpf_prog_detach2() · 244d20ef
      Alexei Starovoitov committed
      introduce bpf_prog_detach2() that takes one more argument prog_fd
      vs bpf_prog_detach() that takes only attach_fd and type.
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bpf: enforce return code for cgroup-bpf programs · 390ee7e2
      Alexei Starovoitov committed
      with addition of tnum logic the verifier got smart enough and
      we can enforce return codes at program load time.
      For now do so for cgroup-bpf program types.
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
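To illustrate how tristate numbers (tnums) allow a return-code check at load time, here is a user-space sketch of a value/mask pair with a range constructor and a containment test. The commit message does not state the accepted range; the default of 0 to 1 below is an assumption, and all names are illustrative rather than taken from the patch.

```python
from typing import NamedTuple

U64_MASK = (1 << 64) - 1


class Tnum(NamedTuple):
    """Known bits live in value; bits set in mask are unknown."""
    value: int
    mask: int


TNUM_UNKNOWN = Tnum(0, U64_MASK)


def tnum_const(v):
    """A fully known 64-bit constant."""
    return Tnum(v & U64_MASK, 0)


def tnum_range(lo, hi):
    """Smallest tnum covering every value in [lo, hi]."""
    bits = (lo ^ hi).bit_length()
    if bits > 63:
        return TNUM_UNKNOWN
    delta = (1 << bits) - 1
    return Tnum(lo & ~delta & U64_MASK, delta)


def tnum_in(a, b):
    """True if every value b may take is also representable by a."""
    if b.mask & ~a.mask & U64_MASK:
        return False
    return a.value == (b.value & ~a.mask & U64_MASK)


def return_code_ok(reg, allowed=tnum_range(0, 1)):
    """Assumed check: the return register must stay inside `allowed`."""
    return tnum_in(allowed, reg)
```

Under these assumptions a program whose return register is the constant 0 or 1, or has only its lowest bit unknown, passes the check, while a constant 2 or a fully unknown value is rejected.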
    • bpf: introduce BPF_PROG_QUERY command · 468e2f64
      Alexei Starovoitov committed
      introduce BPF_PROG_QUERY command to retrieve either the set of
      programs attached to a given cgroup or the set of effective programs
      that will execute for events within a cgroup
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Martin KaFai Lau <kafai@fb.com>
      for cgroup bits
      Acked-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bpf: multi program support for cgroup+bpf · 324bda9e
      Alexei Starovoitov committed
      introduce BPF_F_ALLOW_MULTI flag that can be used to attach multiple
      bpf programs to a cgroup.
      
      The difference between three possible flags for BPF_PROG_ATTACH command:
      - NONE(default): No further bpf programs allowed in the subtree.
      - BPF_F_ALLOW_OVERRIDE: If a sub-cgroup installs some bpf program,
        the program in this cgroup yields to sub-cgroup program.
      - BPF_F_ALLOW_MULTI: If a sub-cgroup installs some bpf program,
        that cgroup program gets run in addition to the program in this cgroup.
      
      NONE and BPF_F_ALLOW_OVERRIDE existed before. This patch doesn't
      change their behavior. It only clarifies the semantics in relation
      to the new flag.
      
      Only one program is allowed to be attached to a cgroup with
      NONE or BPF_F_ALLOW_OVERRIDE flag.
      Multiple programs are allowed to be attached to a cgroup with
      BPF_F_ALLOW_MULTI flag. They are executed in FIFO order
      (those that were attached first run first).
      The programs of sub-cgroup are executed first, then programs of
      this cgroup and then programs of parent cgroup.
      All eligible programs are executed regardless of return code from
      earlier programs.
      
      To allow efficient execution of multiple programs attached to a cgroup
      and to avoid penalizing cgroups without any programs attached
      introduce 'struct bpf_prog_array' which is RCU protected array
      of pointers to bpf programs.
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Martin KaFai Lau <kafai@fb.com>
      for cgroup bits
      Acked-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
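The statement that all eligible programs run regardless of earlier return codes can be pictured with the sketch below. It assumes that each program returns 1 to allow and 0 to deny, and that verdicts are combined with a bitwise AND; the commit message does not spell out that combination rule, so treat it as an assumption. `run_prog_array` is a hypothetical helper, not a kernel function.

```python
def run_prog_array(progs, ctx):
    """Run every program in order and AND the verdicts together.

    No early exit: a deny (0) from one program does not stop later ones.
    The 1 = allow / 0 = deny convention is an assumption of this model.
    """
    verdict = 1
    for prog in progs:
        verdict &= prog(ctx)
    return verdict


def make_prog(name, result, log):
    """Build a toy program that records its execution and returns result."""
    def prog(ctx):
        log.append(name)
        return result
    return prog


log = []
progs = [
    make_prog("first", 1, log),
    make_prog("second", 0, log),
    make_prog("third", 1, log),
]
final = run_prog_array(progs, ctx=None)
```

After this runs, `log` holds all three names in attach order even though the second program denied, and `final` is 0.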
    • net: cache skb_shinfo() in skb_try_coalesce() · c818fa9e
      Eric Dumazet committed
      The compiler does not really know that skb_shinfo(to|from) are constants
      in skb_try_coalesce(), so let's cache their values to shrink code.
      
      We might even take care of skb_zcopy() calls later.
      
      $ size net/core/skbuff.o.before net/core/skbuff.o
         text	   data	    bss	    dec	    hex	filename
        40727	   1298	      0	  42025	   a429	net/core/skbuff.o.before
        40631	   1298	      0	  41929	   a3c9	net/core/skbuff.o
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • selftests: rtnetlink: try concurrent change of ifalias · e9b871ee
      Florian Westphal committed
      to make sure this is serialized correctly.
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • rtnetlink: remove __rtnl_af_unregister · 5c45121d
      Florian Westphal committed
      switch the only caller to rtnl_af_unregister.
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • rtnetlink: remove slave_validate callback · e774d96b
      Florian Westphal committed
      no users in the tree.
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • cxgb4vf: make a couple of functions static · ebf6b131
      Colin Ian King committed
      The functions t4vf_link_down_rc_str and t4vf_handle_get_port_info are
      local to the source and do not need to be in global scope, so make
      them static.
      
      Cleans up sparse warnings:
      symbol 't4vf_link_down_rc_str' was not declared. Should it be static?
      symbol 't4vf_handle_get_port_info' was not declared. Should it be static?
      Signed-off-by: Colin Ian King <colin.king@canonical.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: core: fix kerneldoc comment · 20e88320
      Florian Westphal committed
      net/core/dev.c:1306: warning: No description found for parameter 'name'
      net/core/dev.c:1306: warning: Excess function parameter 'alias' description in 'dev_get_alias'
      
      Fixes: 6c557001 ("net: core: decouple ifalias get/set from rtnl lock")
      Reported-by: kbuild test robot <fengguang.wu@intel.com>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ravb: RX checksum offload · 4d86d381
      Simon Horman committed
      Add support for RX checksum offload. This is enabled by default and
      may be disabled and re-enabled using ethtool:
      
       # ethtool -K eth0 rx off
       # ethtool -K eth0 rx on
      
      The RAVB provides a simple checksumming scheme which appears to be
      completely compatible with CHECKSUM_COMPLETE: sum of all packet data after
      the L2 header is appended to packet data; this may be trivially read by the
      driver and used to update the skb accordingly.
      
      In terms of performance, throughput is close to gigabit line-rate both with
      and without RX checksum offload enabled. Perf output, however, appears to
      indicate that significantly less time is spent in do_csum(). This is as
      expected.
      
      Test results with RX checksum offload enabled:
       # /usr/bin/perf_3.16 record -o /run/perf.data -a netperf -t TCP_MAERTS -H 10.4.3.162
       MIGRATED TCP MAERTS TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.4.3.162 () port 0 AF_INET : demo
       enable_enobufs failed: getprotobyname
       Recv   Send    Send
       Socket Socket  Message  Elapsed
       Size   Size    Size     Time     Throughput
       bytes  bytes   bytes    secs.    10^6bits/sec
      
        87380  16384  16384    10.00     937.54
      
       Summary of output of perf report:
          18.28%      ksoftirqd/0  [kernel.kallsyms]  [k] _raw_spin_unlock_irqrestore
          10.34%      ksoftirqd/0  [kernel.kallsyms]  [k] __pi_memcpy
           9.83%      ksoftirqd/0  [kernel.kallsyms]  [k] ravb_poll
           7.89%      ksoftirqd/0  [kernel.kallsyms]  [k] skb_put
           4.01%      ksoftirqd/0  [kernel.kallsyms]  [k] dev_gro_receive
           3.37%          netperf  [kernel.kallsyms]  [k] __arch_copy_to_user
           3.17%          swapper  [kernel.kallsyms]  [k] arch_cpu_idle
           2.55%          swapper  [kernel.kallsyms]  [k] tick_nohz_idle_enter
           2.04%      ksoftirqd/0  [kernel.kallsyms]  [k] __pi___inval_dcache_area
           2.03%          swapper  [kernel.kallsyms]  [k] _raw_spin_unlock_irq
           1.96%      ksoftirqd/0  [kernel.kallsyms]  [k] __netdev_alloc_skb
           1.59%      ksoftirqd/0  [kernel.kallsyms]  [k] __slab_alloc.isra.83
      
      Test results without RX checksum offload enabled:
       # /usr/bin/perf_3.16 record -o /run/perf.data -a netperf -t TCP_MAERTS -H 10.4.3.162
       MIGRATED TCP MAERTS TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.4.3.162 () port 0 AF_INET : demo
       enable_enobufs failed: getprotobyname
       Recv   Send    Send
       Socket Socket  Message  Elapsed
       Size   Size    Size     Time     Throughput
       bytes  bytes   bytes    secs.    10^6bits/sec
      
        87380  16384  16384    10.00     940.20
      
       Summary of output of perf report:
          17.10%    ksoftirqd/0  [kernel.kallsyms]  [k] _raw_spin_unlock_irqrestore
          10.99%    ksoftirqd/0  [kernel.kallsyms]  [k] __pi_memcpy
           8.87%    ksoftirqd/0  [kernel.kallsyms]  [k] ravb_poll
           8.16%    ksoftirqd/0  [kernel.kallsyms]  [k] skb_put
           7.42%    ksoftirqd/0  [kernel.kallsyms]  [k] do_csum
           3.91%    ksoftirqd/0  [kernel.kallsyms]  [k] dev_gro_receive
           2.31%        swapper  [kernel.kallsyms]  [k] arch_cpu_idle
           2.16%    ksoftirqd/0  [kernel.kallsyms]  [k] __pi___inval_dcache_area
           2.14%    ksoftirqd/0  [kernel.kallsyms]  [k] __netdev_alloc_skb
           1.93%        netperf  [kernel.kallsyms]  [k] __arch_copy_to_user
           1.79%        swapper  [kernel.kallsyms]  [k] tick_nohz_idle_enter
           1.63%    ksoftirqd/0  [kernel.kallsyms]  [k] __slab_alloc.isra.83
      
      Above results collected on an R-Car Gen 3 Salvator-X/r8a7796 ES1.0.
      Also tested on an R-Car Gen 3 Salvator-X/r8a7795 ES1.0.
      
      By inspection this also appears to be compatible with the ravb found
      on R-Car Gen 2 SoCs; however, this patch is currently untested on such
      hardware.
      Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
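The checksumming scheme described in this commit (a sum over the data after the L2 header, appended to the packet and read back by the driver) can be modeled in user space as sketched below. The 2-byte trailer width, the little-endian byte order, the 14-byte Ethernet header length, and the use of a 16-bit ones' complement sum are all assumptions made for this illustration; the function names are hypothetical and this is not the driver code.

```python
import struct

ETH_HLEN = 14      # assumed L2 header length (untagged Ethernet)
HW_CSUM_LEN = 2    # assumed width of the appended checksum


def csum16(data: bytes) -> int:
    """16-bit ones' complement sum of data (not inverted)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return total


def hw_receive(frame: bytes) -> bytes:
    """Model the hardware: append the sum of everything after the L2 header."""
    return frame + struct.pack("<H", csum16(frame[ETH_HLEN:]))


def driver_rx_csum(buf: bytes):
    """Model the driver: read the trailing checksum and trim it off."""
    if len(buf) < HW_CSUM_LEN:
        return buf, None
    (csum,) = struct.unpack("<H", buf[-HW_CSUM_LEN:])
    return buf[:-HW_CSUM_LEN], csum
```

In this model the driver does no summing of its own: it takes the two trailing bytes as the checksum and hands the trimmed packet up the stack. Skipping that work in software is consistent with do_csum() dropping out of the perf report when offload is enabled.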
  2. 04 Oct, 2017 22 commits