1. 04 5月, 2018 9 次提交
    • D
      bpf: add skb_load_bytes_relative helper · 4e1ec56c
      Daniel Borkmann 提交于
      This adds a small BPF helper similar to bpf_skb_load_bytes() that
      is able to load relative to mac/net header offset from the skb's
      linear data. Compared to bpf_skb_load_bytes(), it takes a fifth
      argument namely start_header, which is either BPF_HDR_START_MAC
      or BPF_HDR_START_NET. This allows for a more flexible alternative
      compared to LD_ABS/LD_IND with negative offset. It's enabled for
      tc BPF programs as well as sock filter program types where it's
      mainly useful in reuseport programs to ease access to lower header
      data.
      
      Reference: https://lists.iovisor.org/pipermail/iovisor-dev/2017-March/000698.htmlSigned-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: NAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
      4e1ec56c
    • D
      bpf: implement ld_abs/ld_ind in native bpf · e0cea7ce
      Daniel Borkmann 提交于
      The main part of this work is to finally allow removal of LD_ABS
      and LD_IND from the BPF core by reimplementing them through native
      eBPF instead. Both LD_ABS/LD_IND were carried over from cBPF and
      keeping them around in native eBPF caused way more trouble than
      actually worth it. To just list some of the security issues in
      the past:
      
        * fdfaf64e ("x86: bpf_jit: support negative offsets")
        * 35607b02 ("sparc: bpf_jit: fix loads from negative offsets")
        * e0ee9c12 ("x86: bpf_jit: fix two bugs in eBPF JIT compiler")
        * 07aee943 ("bpf, sparc: fix usage of wrong reg for load_skb_regs after call")
        * 6d59b7db ("bpf, s390x: do not reload skb pointers in non-skb context")
        * 87338c8e ("bpf, ppc64: do not reload skb pointers in non-skb context")
      
      For programs in native eBPF, LD_ABS/LD_IND are pretty much legacy
      these days due to their limitations and more efficient/flexible
      alternatives that have been developed over time such as direct
      packet access. LD_ABS/LD_IND only cover 1/2/4 byte loads into a
      register, the load happens in host endianness and its exception
      handling can yield unexpected behavior. The latter is explained
      in depth in f6b1b3bf ("bpf: fix subprog verifier bypass by
      div/mod by 0 exception") with similar cases of exceptions we had.
      In native eBPF more recent program types will disable LD_ABS/LD_IND
      altogether through may_access_skb() in verifier, and given the
      limitations in terms of exception handling, it's also disabled
      in programs that use BPF to BPF calls.
      
      In terms of cBPF, the LD_ABS/LD_IND is used in networking programs
      to access packet data. It is not used in seccomp-BPF but programs
      that use it for socket filtering or reuseport for demuxing with
      cBPF. This is mostly relevant for applications that have not yet
      migrated to native eBPF.
      
      The main complexity and source of bugs in LD_ABS/LD_IND is coming
      from their implementation in the various JITs. Most of them keep
      the model around from cBPF times by implementing a fastpath written
      in asm. They use typically two from the BPF program hidden CPU
      registers for caching the skb's headlen (skb->len - skb->data_len)
      and skb->data. Throughout the JIT phase this requires to keep track
      whether LD_ABS/LD_IND are used and if so, the two registers need
      to be recached each time a BPF helper would change the underlying
      packet data in native eBPF case. At least in eBPF case, available
      CPU registers are rare and the additional exit path out of the
      asm written JIT helper makes it also inflexible since not all
      parts of the JITer are in control from plain C. A LD_ABS/LD_IND
      implementation in eBPF therefore allows to significantly reduce
      the complexity in JITs with comparable performance results for
      them, e.g.:
      
      test_bpf             tcpdump port 22             tcpdump complex
      x64      - before    15 21 10                    14 19  18
               - after      7 10 10                     7 10  15
      arm64    - before    40 91 92                    40 91 151
               - after     51 64 73                    51 62 113
      
      For cBPF we now track any usage of LD_ABS/LD_IND in bpf_convert_filter()
      and cache the skb's headlen and data in the cBPF prologue. The
      BPF_REG_TMP gets remapped from R8 to R2 since it's mainly just
      used as a local temporary variable. This allows to shrink the
      image on x86_64 also for seccomp programs slightly since mapping
      to %rsi is not an ereg. In callee-saved R8 and R9 we now track
      skb data and headlen, respectively. For normal prologue emission
      in the JITs this does not add any extra instructions since R8, R9
      are pushed to stack in any case from eBPF side. cBPF uses the
      convert_bpf_ld_abs() emitter which probes the fast path inline
      already and falls back to bpf_skb_load_helper_{8,16,32}() helper
      relying on the cached skb data and headlen as well. R8 and R9
      never need to be reloaded due to bpf_helper_changes_pkt_data()
      since all skb access in cBPF is read-only. Then, for the case
      of native eBPF, we use the bpf_gen_ld_abs() emitter, which calls
      the bpf_skb_load_helper_{8,16,32}_no_cache() helper unconditionally,
      does neither cache skb data and headlen nor has an inlined fast
      path. The reason for the latter is that native eBPF does not have
      any extra registers available anyway, but even if there were, it
      avoids any reload of skb data and headlen in the first place.
      Additionally, for the negative offsets, we provide an alternative
      bpf_skb_load_bytes_relative() helper in eBPF which operates
      similarly as bpf_skb_load_bytes() and allows for more flexibility.
      Tested myself on x64, arm64, s390x, from Sandipan on ppc64.
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: NAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
      e0cea7ce
    • D
      bpf: migrate ebpf ld_abs/ld_ind tests to test_verifier · 93731ef0
      Daniel Borkmann 提交于
      Remove all eBPF tests involving LD_ABS/LD_IND from test_bpf.ko. Reason
      is that the eBPF tests from test_bpf module do not go via BPF verifier
      and therefore any instruction rewrites from verifier cannot take place.
      
      Therefore, move them into test_verifier which runs out of user space,
      so that verfier can rewrite LD_ABS/LD_IND internally in upcoming patches.
      It will have the same effect since runtime tests are also performed from
      there. This also allows to finally unexport bpf_skb_vlan_{push,pop}_proto
      and keep it internal to core kernel.
      
      Additionally, also add further cBPF LD_ABS/LD_IND test coverage into
      test_bpf.ko suite.
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: NAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
      93731ef0
    • D
      bpf: prefix cbpf internal helpers with bpf_ · b390134c
      Daniel Borkmann 提交于
      No change in functionality, just remove the '__' prefix and replace it
      with a 'bpf_' prefix instead. We later on add a couple of more helpers
      for cBPF and keeping the scheme with '__' is suboptimal there.
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: NAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
      b390134c
    • M
      dev: packet: make packet_direct_xmit a common function · 865b03f2
      Magnus Karlsson 提交于
      The new dev_direct_xmit will be used by AF_XDP in later commits.
      Signed-off-by: NMagnus Karlsson <magnus.karlsson@intel.com>
      Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
      865b03f2
    • B
      xsk: wire up XDP_SKB side of AF_XDP · 02671e23
      Björn Töpel 提交于
      This commit wires up the xskmap to XDP_SKB layer.
      Signed-off-by: NBjörn Töpel <bjorn.topel@intel.com>
      Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
      02671e23
    • B
      xsk: wire up XDP_DRV side of AF_XDP · 1b1a251c
      Björn Töpel 提交于
      This commit wires up the xskmap to XDP_DRV layer.
      Signed-off-by: NBjörn Töpel <bjorn.topel@intel.com>
      Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
      1b1a251c
    • B
      xsk: add Rx receive functions and poll support · c497176c
      Björn Töpel 提交于
      Here the actual receive functions of AF_XDP are implemented, that in a
      later commit, will be called from the XDP layers.
      
      There's one set of functions for the XDP_DRV side and another for
      XDP_SKB (generic).
      
      A new XDP API, xdp_return_buff, is also introduced.
      
      Adding xdp_return_buff, which is analogous to xdp_return_frame, but
      acts upon an struct xdp_buff. The API will be used by AF_XDP in future
      commits.
      
      Support for the poll syscall is also implemented.
      
      v2: xskq_validate_id did not update cons_tail.
          The entries variable was calculated twice in xskq_nb_avail.
          Squashed xdp_return_buff commit.
      Signed-off-by: NBjörn Töpel <bjorn.topel@intel.com>
      Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
      c497176c
    • B
      net: initial AF_XDP skeleton · 68e8b849
      Björn Töpel 提交于
      Buildable skeleton of AF_XDP without any functionality. Just what it
      takes to register a new address family.
      Signed-off-by: NBjörn Töpel <bjorn.topel@intel.com>
      Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
      68e8b849
  2. 02 5月, 2018 2 次提交
    • F
      net: core: Inline netdev_features_size_check() · e283de3a
      Florian Fainelli 提交于
      We do not require this inline function to be used in multiple different
      locations, just inline it where it gets used in register_netdevice().
      Suggested-by: NDavid Miller <davem@davemloft.net>
      Suggested-by: NStephen Hemminger <stephen@networkplumber.org>
      Signed-off-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e283de3a
    • W
      ethtool: fix a potential missing-check bug · d656fe49
      Wenwen Wang 提交于
      In ethtool_get_rxnfc(), the object "info" is firstly copied from
      user-space. If the FLOW_RSS flag is set in the member field flow_type of
      "info" (and cmd is ETHTOOL_GRXFH), info needs to be copied again from
      user-space because FLOW_RSS is newer and has new definition, as mentioned
      in the comment. However, given that the user data resides in user-space, a
      malicious user can race to change the data after the first copy. By doing
      so, the user can inject inconsistent data. For example, in the second
      copy, the FLOW_RSS flag could be cleared in the field flow_type of "info".
      In the following execution, "info" will be used in the function
      ops->get_rxnfc(). Such inconsistent data can potentially lead to unexpected
      information leakage since ops->get_rxnfc() will prepare various types of
      data according to flow_type, and the prepared data will be eventually
      copied to user-space. This inconsistent data may also cause undefined
      behaviors based on how ops->get_rxnfc() is implemented.
      
      This patch simply re-verifies the flow_type field of "info" after the
      second copy. If the value is not as expected, an error code will be
      returned.
      Signed-off-by: NWenwen Wang <wang6495@umn.edu>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d656fe49
  3. 01 5月, 2018 3 次提交
  4. 30 4月, 2018 2 次提交
  5. 28 4月, 2018 1 次提交
  6. 27 4月, 2018 5 次提交
  7. 26 4月, 2018 2 次提交
  8. 25 4月, 2018 2 次提交
    • W
      bpf: clear the ip_tunnel_info. · 5540fbf4
      William Tu 提交于
      The percpu metadata_dst might carry the stale ip_tunnel_info
      and cause incorrect behavior.  When mixing tests using ipv4/ipv6
      bpf vxlan and geneve tunnel, the ipv6 tunnel info incorrectly uses
      ipv4's src ip addr as its ipv6 src address, because the previous
      tunnel info does not clean up.  The patch zeros the fields in
      ip_tunnel_info.
      Signed-off-by: NWilliam Tu <u9012063@gmail.com>
      Reported-by: NYifeng Sun <pkusunyifeng@gmail.com>
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      5540fbf4
    • E
      bpf: add helper for getting xfrm states · 12bed760
      Eyal Birger 提交于
      This commit introduces a helper which allows fetching xfrm state
      parameters by eBPF programs attached to TC.
      
      Prototype:
      bpf_skb_get_xfrm_state(skb, index, xfrm_state, size, flags)
      
      skb: pointer to skb
      index: the index in the skb xfrm_state secpath array
      xfrm_state: pointer to 'struct bpf_xfrm_state'
      size: size of 'struct bpf_xfrm_state'
      flags: reserved for future extensions
      
      The helper returns 0 on success. Non zero if no xfrm state at the index
      is found - or non exists at all.
      
      struct bpf_xfrm_state currently includes the SPI, peer IPv4/IPv6
      address and the reqid; it can be further extended by adding elements to
      its end - indicating the populated fields by the 'size' argument -
      keeping backwards compatibility.
      
      Typical usage:
      
      struct bpf_xfrm_state x = {};
      bpf_skb_get_xfrm_state(skb, 0, &x, sizeof(x), 0);
      ...
      Signed-off-by: NEyal Birger <eyal.birger@gmail.com>
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      12bed760
  9. 24 4月, 2018 1 次提交
  10. 23 4月, 2018 2 次提交
  11. 20 4月, 2018 2 次提交
  12. 19 4月, 2018 4 次提交
    • J
      bpf: reserve xdp_frame size in xdp headroom · 97e19cce
      Jesper Dangaard Brouer 提交于
      Commit 6dfb970d ("xdp: avoid leaking info stored in frame data on
      page reuse") tried to allow user/bpf_prog to (re)use area used by
      xdp_frame (stored in frame headroom), by memset clearing area when
      bpf_xdp_adjust_head give bpf_prog access to headroom area.
      
      The mentioned commit had two bugs. (1) Didn't take bpf_xdp_adjust_meta
      into account. (2) a combination of bpf_xdp_adjust_head calls, where
      xdp->data is moved into xdp_frame section, can cause clearing
      xdp_frame area again for area previously granted to bpf_prog.
      
      After discussions with Daniel, we choose to implement a simpler
      solution to the problem, which is to reserve the headroom used by
      xdp_frame info.
      
      This also avoids the situation where bpf_prog is allowed to adjust/add
      headers, and then XDP_REDIRECT later drops the packet due to lack of
      headroom for the xdp_frame.  This would likely confuse the end-user.
      
      Fixes: 6dfb970d ("xdp: avoid leaking info stored in frame data on page reuse")
      Signed-off-by: NJesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      97e19cce
    • S
      hv_netvsc: propogate Hyper-V friendly name into interface alias · 0fe554a4
      Stephen Hemminger 提交于
      This patch implement the 'Device Naming' feature of the Hyper-V
      network device API. In Hyper-V on the host through the GUI or PowerShell
      it is possible to enable the device naming feature which causes
      the host to make available to the guest the name of the device.
      This shows up in the RNDIS protocol as the friendly name.
      
      The name has no particular meaning and is limited to 256 characters.
      The value can only be set via PowerShell on the host, but could
      be scripted for mass deployments. The default value is the
      string 'Network Adapter' and since that is the same for all devices
      and useless, the driver ignores it.
      
      In Windows, the value goes into a registry key for use in SNMP
      ifAlias. For Linux, this patch puts the value in the network
      device alias property; where it is visible in ip tools and SNMP.
      
      The host provided ifAlias is just a suggestion, and can be
      overridden by later ip commands.
      
      Also requires exporting dev_set_alias in netdev core.
      Signed-off-by: NStephen Hemminger <sthemmin@microsoft.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0fe554a4
    • N
      bpf: make generic xdp compatible w/ bpf_xdp_adjust_tail · 198d83bb
      Nikita V. Shirokov 提交于
      w/ bpf_xdp_adjust_tail helper xdp's data_end pointer could be changed as
      well (only "decrease" of pointer's location is going to be supported).
      changing of this pointer will change packet's size.
      for generic XDP we need to reflect this packet's length change by
      adjusting skb's tail pointer
      Acked-by: NAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: NNikita V. Shirokov <tehnerd@tehnerd.com>
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      198d83bb
    • N
      bpf: adding bpf_xdp_adjust_tail helper · b32cc5b9
      Nikita V. Shirokov 提交于
      Adding new bpf helper which would allow us to manipulate
      xdp's data_end pointer, and allow us to reduce packet's size
      indended use case: to generate ICMP messages from XDP context,
      where such message would contain truncated original packet.
      Signed-off-by: NNikita V. Shirokov <tehnerd@tehnerd.com>
      Acked-by: NAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      b32cc5b9
  13. 18 4月, 2018 3 次提交
    • D
      net/ipv6: move metrics from dst to rt6_info · d4ead6b3
      David Ahern 提交于
      Similar to IPv4, add fib metrics to the fib struct, which at the moment
      is rt6_info. Will be moved to fib6_info in a later patch. Copy metrics
      into dst by reference using refcount.
      
      To make the transition:
      - add dst_metrics to rt6_info. Default to dst_default_metrics if no
        metrics are passed during route add. No need for a separate pmtu
        entry; it can reference the MTU slot in fib6_metrics
      
      - ip6_convert_metrics allocates memory in the FIB entry and uses
        ip_metrics_convert to copy from netlink attribute to metrics entry
      
      - the convert metrics call is done in ip6_route_info_create simplifying
        the route add path
        + fib6_commit_metrics and fib6_copy_metrics and the temporary
          mx6_config are no longer needed
      
      - add fib6_metric_set helper to change the value of a metric in the
        fib entry since dst_metric_set can no longer be used
      
      - cow_metrics for IPv6 can drop to dst_cow_metrics_generic
      
      - rt6_dst_from_metrics_check is no longer needed
      
      - rt6_fill_node needs the FIB entry and dst as separate arguments to
        keep compatibility with existing output. Current dst address is
        renamed to dest.
        (to be consistent with IPv4 rt6_fill_node really should be split
        into 2 functions similar to fib_dump_info and rt_fill_info)
      
      - rt6_fill_node no longer needs the temporary metrics variable
      Signed-off-by: NDavid Ahern <dsahern@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d4ead6b3
    • D
      net: Handle null dst in rtnl_put_cacheinfo · 3940746d
      David Ahern 提交于
      Need to keep expires time for IPv6 routes in a dump of FIB entries.
      Update rtnl_put_cacheinfo to allow dst to be NULL in which case
      rta_cacheinfo will only contain non-dst data.
      Signed-off-by: NDavid Ahern <dsahern@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3940746d
    • T
      vlan: Fix reading memory beyond skb->tail in skb_vlan_tagged_multi · 7ce23672
      Toshiaki Makita 提交于
      Syzkaller spotted an old bug which leads to reading skb beyond tail by 4
      bytes on vlan tagged packets.
      This is caused because skb_vlan_tagged_multi() did not check
      skb_headlen.
      
      BUG: KMSAN: uninit-value in eth_type_vlan include/linux/if_vlan.h:283 [inline]
      BUG: KMSAN: uninit-value in skb_vlan_tagged_multi include/linux/if_vlan.h:656 [inline]
      BUG: KMSAN: uninit-value in vlan_features_check include/linux/if_vlan.h:672 [inline]
      BUG: KMSAN: uninit-value in dflt_features_check net/core/dev.c:2949 [inline]
      BUG: KMSAN: uninit-value in netif_skb_features+0xd1b/0xdc0 net/core/dev.c:3009
      CPU: 1 PID: 3582 Comm: syzkaller435149 Not tainted 4.16.0+ #82
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
        __dump_stack lib/dump_stack.c:17 [inline]
        dump_stack+0x185/0x1d0 lib/dump_stack.c:53
        kmsan_report+0x142/0x240 mm/kmsan/kmsan.c:1067
        __msan_warning_32+0x6c/0xb0 mm/kmsan/kmsan_instr.c:676
        eth_type_vlan include/linux/if_vlan.h:283 [inline]
        skb_vlan_tagged_multi include/linux/if_vlan.h:656 [inline]
        vlan_features_check include/linux/if_vlan.h:672 [inline]
        dflt_features_check net/core/dev.c:2949 [inline]
        netif_skb_features+0xd1b/0xdc0 net/core/dev.c:3009
        validate_xmit_skb+0x89/0x1320 net/core/dev.c:3084
        __dev_queue_xmit+0x1cb2/0x2b60 net/core/dev.c:3549
        dev_queue_xmit+0x4b/0x60 net/core/dev.c:3590
        packet_snd net/packet/af_packet.c:2944 [inline]
        packet_sendmsg+0x7c57/0x8a10 net/packet/af_packet.c:2969
        sock_sendmsg_nosec net/socket.c:630 [inline]
        sock_sendmsg net/socket.c:640 [inline]
        sock_write_iter+0x3b9/0x470 net/socket.c:909
        do_iter_readv_writev+0x7bb/0x970 include/linux/fs.h:1776
        do_iter_write+0x30d/0xd40 fs/read_write.c:932
        vfs_writev fs/read_write.c:977 [inline]
        do_writev+0x3c9/0x830 fs/read_write.c:1012
        SYSC_writev+0x9b/0xb0 fs/read_write.c:1085
        SyS_writev+0x56/0x80 fs/read_write.c:1082
        do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287
        entry_SYSCALL_64_after_hwframe+0x3d/0xa2
      RIP: 0033:0x43ffa9
      RSP: 002b:00007fff2cff3948 EFLAGS: 00000217 ORIG_RAX: 0000000000000014
      RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 000000000043ffa9
      RDX: 0000000000000001 RSI: 0000000020000080 RDI: 0000000000000003
      RBP: 00000000006cb018 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000217 R12: 00000000004018d0
      R13: 0000000000401960 R14: 0000000000000000 R15: 0000000000000000
      
      Uninit was created at:
        kmsan_save_stack_with_flags mm/kmsan/kmsan.c:278 [inline]
        kmsan_internal_poison_shadow+0xb8/0x1b0 mm/kmsan/kmsan.c:188
        kmsan_kmalloc+0x94/0x100 mm/kmsan/kmsan.c:314
        kmsan_slab_alloc+0x11/0x20 mm/kmsan/kmsan.c:321
        slab_post_alloc_hook mm/slab.h:445 [inline]
        slab_alloc_node mm/slub.c:2737 [inline]
        __kmalloc_node_track_caller+0xaed/0x11c0 mm/slub.c:4369
        __kmalloc_reserve net/core/skbuff.c:138 [inline]
        __alloc_skb+0x2cf/0x9f0 net/core/skbuff.c:206
        alloc_skb include/linux/skbuff.h:984 [inline]
        alloc_skb_with_frags+0x1d4/0xb20 net/core/skbuff.c:5234
        sock_alloc_send_pskb+0xb56/0x1190 net/core/sock.c:2085
        packet_alloc_skb net/packet/af_packet.c:2803 [inline]
        packet_snd net/packet/af_packet.c:2894 [inline]
        packet_sendmsg+0x6444/0x8a10 net/packet/af_packet.c:2969
        sock_sendmsg_nosec net/socket.c:630 [inline]
        sock_sendmsg net/socket.c:640 [inline]
        sock_write_iter+0x3b9/0x470 net/socket.c:909
        do_iter_readv_writev+0x7bb/0x970 include/linux/fs.h:1776
        do_iter_write+0x30d/0xd40 fs/read_write.c:932
        vfs_writev fs/read_write.c:977 [inline]
        do_writev+0x3c9/0x830 fs/read_write.c:1012
        SYSC_writev+0x9b/0xb0 fs/read_write.c:1085
        SyS_writev+0x56/0x80 fs/read_write.c:1082
        do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287
        entry_SYSCALL_64_after_hwframe+0x3d/0xa2
      
      Fixes: 58e998c6 ("offloading: Force software GSO for multiple vlan tags.")
      Reported-and-tested-by: syzbot+0bbe42c764feafa82c5a@syzkaller.appspotmail.com
      Signed-off-by: NToshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7ce23672
  14. 17 4月, 2018 2 次提交
    • J
      xdp: avoid leaking info stored in frame data on page reuse · 6dfb970d
      Jesper Dangaard Brouer 提交于
      The bpf infrastructure and verifier goes to great length to avoid
      bpf progs leaking kernel (pointer) info.
      
      For queueing an xdp_buff via XDP_REDIRECT, xdp_frame info stores
      kernel info (incl pointers) in top part of frame data (xdp->data_hard_start).
      Checks are in place to assure enough headroom is available for this.
      
      This info is not cleared, and if the frame is reused, then a
      malicious user could use bpf_xdp_adjust_head helper to move
      xdp->data into this area.  Thus, making this area readable.
      
      This is not super critical as XDP progs requires root or
      CAP_SYS_ADMIN, which are privileged enough for such info.  An
      effort (is underway) towards moving networking bpf hooks to the
      lesser privileged mode CAP_NET_ADMIN, where leaking such info
      should be avoided.  Thus, this patch to clear the info when
      needed.
      Signed-off-by: NJesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6dfb970d
    • J
      xdp: transition into using xdp_frame for ndo_xdp_xmit · 44fa2dbd
      Jesper Dangaard Brouer 提交于
      Changing API ndo_xdp_xmit to take a struct xdp_frame instead of struct
      xdp_buff.  This brings xdp_return_frame and ndp_xdp_xmit in sync.
      
      This builds towards changing the API further to become a bulk API,
      because xdp_buff is not a queue-able object while xdp_frame is.
      
      V4: Adjust for commit 59655a5b ("tuntap: XDP_TX can use native XDP")
      V7: Adjust for commit d9314c47 ("i40e: add support for XDP_REDIRECT")
      Signed-off-by: NJesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      44fa2dbd