1. 09 3月, 2016 11 次提交
    • K
      net_sched: dsmark: use qdisc_dequeue_peeked() · f8b33d8e
      Kyeong Yoo 提交于
      This fix is for dsmark similar to commit 3557619f
      ("net_sched: prio: use qdisc_dequeue_peeked")
      and makes use of qdisc_dequeue_peeked() instead of direct dequeue() call.
      
      First time, wrr peeks dsmark, which will then peek into sfq.
      sfq dequeues an skb and it's stored in sch->gso_skb.
      Next time, wrr tries to dequeue from dsmark, which will call sfq dequeue
      directly. This results skipping the previously peeked skb.
      
      So changed dsmark dequeue to call qdisc_dequeue_peeked() instead to use
      peeked skb if exists.
      Signed-off-by: NKyeong Yoo <kyeong.yoo@alliedtelesis.co.nz>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f8b33d8e
    • D
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next · 4c38cd61
      David S. Miller 提交于
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter/IPVS updates for net-next
      
      The following patchset contains Netfilter updates for your net-next tree,
      they are:
      
      1) Remove useless debug message when deleting IPVS service, from
         Yannick Brosseau.
      
      2) Get rid of compilation warning when CONFIG_PROC_FS is unset in
         several spots of the IPVS code, from Arnd Bergmann.
      
      3) Add prandom_u32 support to nft_meta, from Florian Westphal.
      
      4) Remove unused variable in xt_osf, from Sudip Mukherjee.
      
      5) Don't calculate IP checksum twice from netfilter ipv4 defrag hook
         since fixing af_packet defragmentation issues, from Joe Stringer.
      
      6) On-demand hook registration for iptables from netns. Instead of
         registering the hooks for every available netns whenever we need
         one of the support tables, we register this on the specific netns
         that needs it, patchset from Florian Westphal.
      
      7) Add missing port range selection to nf_tables masquerading support.
      
      BTW, just for the record, there is a typo in the description of
      5f6c253e ("netfilter: bridge: register hooks only when bridge
      interface is added") that refers to the cluster match as deprecated, but
      it is actually the CLUSTERIP target (which registers hooks
      inconditionally) the one that is scheduled for removal.
      ====================
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4c38cd61
    • D
      Merge branch 'bpf-next' · d24ad3fc
      David S. Miller 提交于
      Daniel Borkmann says:
      
      ====================
      BPF updates
      
      Couple of misc updates to BPF, besides others this series adds
      bpf_csum_diff() to be used with L3 csums, allows for managing
      tunnel options for collect meta data mode, and enabling ipv6
      traffic class for collect meta data in vxlan specifically (geneve
      already supports it). For more details, please see individual
      patches.
      
      The series requires net to be merged into net-next first to
      avoid any further pending merge conflicts.
      ====================
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d24ad3fc
    • D
      vxlan: allow setting ipv6 traffic class · 1400615d
      Daniel Borkmann 提交于
      We can already do that for IPv4, but IPv6 support was missing. Add
      it for vxlan, so it can be used with collect metadata frontends.
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1400615d
    • D
      bpf, vxlan, geneve, gre: fix usage of dst_cache on xmit · db3c6139
      Daniel Borkmann 提交于
      The assumptions from commit 0c1d70af ("net: use dst_cache for vxlan
      device"), 468dfffc ("geneve: add dst caching support") and 3c1cb4d2
      ("net/ipv4: add dst cache support for gre lwtunnels") on dst_cache usage
      when ip_tunnel_info is used is unfortunately not always valid as assumed.
      
      While it seems correct for ip_tunnel_info front-ends such as OVS, eBPF
      however can fill in ip_tunnel_info for consumers like vxlan, geneve or gre
      with different remote dsts, tos, etc, therefore they cannot be assumed as
      packet independent.
      
      Right now vxlan, geneve, gre would cache the dst for eBPF and every packet
      would reuse the same entry that was first created on the initial route
      lookup. eBPF doesn't store/cache the ip_tunnel_info, so each skb may have
      a different one.
      
      Fix it by adding a flag that checks the ip_tunnel_info. Also the !tos test
      in vxlan needs to be handeled differently in this context as it is currently
      inferred from ip_tunnel_info as well if present. ip_tunnel_dst_cache_usable()
      helper is added for the three tunnel cases, which checks if we can use dst
      cache.
      
      Fixes: 0c1d70af ("net: use dst_cache for vxlan device")
      Fixes: 468dfffc ("geneve: add dst caching support")
      Fixes: 3c1cb4d2 ("net/ipv4: add dst cache support for gre lwtunnels")
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: NPaolo Abeni <pabeni@redhat.com>
      Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      db3c6139
    • D
      bpf: support for access to tunnel options · 14ca0751
      Daniel Borkmann 提交于
      After eBPF being able to programmatically access/manage tunnel key meta
      data via commit d3aa45ce ("bpf: add helpers to access tunnel metadata")
      and more recently also for IPv6 through c6c33454 ("bpf: support ipv6
      for bpf_skb_{set,get}_tunnel_key"), this work adds two complementary
      helpers to generically access their auxiliary tunnel options.
      
      Geneve and vxlan support this facility. For geneve, TLVs can be pushed,
      and for the vxlan case its GBP extension. I.e. setting tunnel key for geneve
      case only makes sense, if we can also read/write TLVs into it. In the GBP
      case, it provides the flexibility to easily map the group policy ID in
      combination with other helpers or maps.
      
      I chose to model this as two separate helpers, bpf_skb_{set,get}_tunnel_opt(),
      for a couple of reasons. bpf_skb_{set,get}_tunnel_key() is already rather
      complex by itself, and there may be cases for tunnel key backends where
      tunnel options are not always needed. If we would have integrated this
      into bpf_skb_{set,get}_tunnel_key() nevertheless, we are very limited with
      remaining helper arguments, so keeping compatibility on structs in case of
      passing in a flat buffer gets more cumbersome. Separating both also allows
      for more flexibility and future extensibility, f.e. options could be fed
      directly from a map, etc.
      
      Moreover, change geneve's xmit path to test only for info->options_len
      instead of TUNNEL_GENEVE_OPT flag. This makes it more consistent with vxlan's
      xmit path and allows for avoiding to specify a protocol flag in the API on
      xmit, so it can be protocol agnostic. Having info->options_len is enough
      information that is needed. Tested with vxlan and geneve.
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: NAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      14ca0751
    • D
      bpf: allow to propagate df in bpf_skb_set_tunnel_key · 22080870
      Daniel Borkmann 提交于
      Added by 9a628224 ("ip_tunnel: Add dont fragment flag."), allow to
      feed df flag into tunneling facilities (currently supported on TX by
      vxlan, geneve and gre) as a hint from eBPF's bpf_skb_set_tunnel_key()
      helper.
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: NAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      22080870
    • D
      bpf: make helper function protos static · 577c50aa
      Daniel Borkmann 提交于
      They are only used here, so there's no reason they should not be static.
      Only the vlan push/pop protos are used in the test_bpf suite.
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: NAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      577c50aa
    • D
      bpf: add flags to bpf_skb_store_bytes for clearing hash · 8afd54c8
      Daniel Borkmann 提交于
      When overwriting parts of the packet with bpf_skb_store_bytes() that
      were fed previously into skb->hash calculation, we should clear the
      current hash with skb_clear_hash(), so that a next skb_get_hash() call
      can determine the correct hash related to this skb.
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: NAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8afd54c8
    • D
      bpf: allow bpf_csum_diff to feed bpf_l3_csum_replace as well · 8050c0f0
      Daniel Borkmann 提交于
      Commit 7d672345 ("bpf: add generic bpf_csum_diff helper") added a
      generic checksum diff helper that can feed bpf_l4_csum_replace() with
      a target __wsum diff that is to be applied to the L4 checksum. This
      facility is very flexible, can be cascaded, allows for adding, removing,
      or diffing data, or for calculating the pseudo header checksum from
      scratch, but it can also be reused for working with the IPv4 header
      checksum.
      
      Thus, analogous to bpf_l4_csum_replace(), add a case for header field
      value of 0 to change the checksum at a given offset through a new helper
      csum_replace_by_diff(). Also, in addition to that, this provides an
      easy to use interface for feeding precalculated diffs f.e. coming from
      a map. It nicely complements bpf_l3_csum_replace() that currently allows
      only for csum updates of 2 and 4 byte diffs.
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: NAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8050c0f0
    • D
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 810813c4
      David S. Miller 提交于
      Several cases of overlapping changes, as well as one instance
      (vxlan) of a bug fix in 'net' overlapping with code movement
      in 'net-next'.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      810813c4
  2. 08 3月, 2016 28 次提交
  3. 07 3月, 2016 1 次提交