1. 16 Jul, 2016 1 commit
  2. 15 Jul, 2016 2 commits
  3. 08 Jul, 2016 2 commits
  4. 06 Jul, 2016 4 commits
  5. 05 Jul, 2016 5 commits
  6. 03 Jul, 2016 3 commits
    • netfilter: Convert FWINV<[foo]> macros and uses to NF_INVF · c37a2dfa
      Joe Perches committed
      netfilter uses multiple FWINV #defines with identical form that hide a
      specific structure variable and dereference it with an invflags member.
      
      $ git grep "#define FWINV"
      include/linux/netfilter_bridge/ebtables.h:#define FWINV(bool,invflg) ((bool) ^ !!(info->invflags & invflg))
      net/bridge/netfilter/ebtables.c:#define FWINV2(bool, invflg) ((bool) ^ !!(e->invflags & invflg))
      net/ipv4/netfilter/arp_tables.c:#define FWINV(bool, invflg) ((bool) ^ !!(arpinfo->invflags & (invflg)))
      net/ipv4/netfilter/ip_tables.c:#define FWINV(bool, invflg) ((bool) ^ !!(ipinfo->invflags & (invflg)))
      net/ipv6/netfilter/ip6_tables.c:#define FWINV(bool, invflg) ((bool) ^ !!(ip6info->invflags & (invflg)))
      net/netfilter/xt_tcpudp.c:#define FWINVTCP(bool, invflg) ((bool) ^ !!(tcpinfo->invflags & (invflg)))
      
      Consolidate these macros into a single NF_INVF macro.
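
The consolidation can be illustrated with a minimal user-space sketch. The macro body follows the FWINV form quoted above, except that the structure pointer is passed explicitly instead of being a hidden local variable. The argument order, `struct demo_info`, `DEMO_INV_SRCIP`, and `demo_src_mismatch()` are assumptions made for illustration and are not taken from the patch itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a match-info structure; only invflags matters. */
struct demo_info {
	uint8_t invflags;
};

/* Hypothetical inversion flag, for illustration only. */
#define DEMO_INV_SRCIP 0x01

/*
 * One macro that takes the structure pointer as an argument, in place of
 * several FWINV variants that each hid a differently named local variable.
 * The result is the test outcome, flipped when the inversion flag is set.
 */
#define NF_INVF(ptr, flag, boolean) \
	((boolean) ^ !!((ptr)->invflags & (flag)))

/* Returns true when the (possibly inverted) source-address test fails. */
static bool demo_src_mismatch(const struct demo_info *info, bool addr_differs)
{
	return NF_INVF(info, DEMO_INV_SRCIP, addr_differs);
}
```

With the flag clear, `demo_src_mismatch()` simply returns `addr_differs`; with the flag set, it returns the opposite.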
      
      Miscellanea:
      
      o Neaten the alignment around these uses
      o A few lines are > 80 columns for intelligibility
      Signed-off-by: Joe Perches <joe@perches.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • net/mlx5e: Create NIC global resources only once · b50d292b
      Hadar Hen Zion committed
      To allow creating more than one netdev over the same PCI function, we
      change the driver such that global NIC resources are created once and
      later shared among all the mlx5e netdevs running over that port.
      
      Move the CQ UAR, PD (pdn), Transport Domain (tdn), MKey resources from
      being kept in the mlx5e priv part to a new resources structure
      (mlx5e_resources) placed under the mlx5_core device.
      
      This patch doesn't add any new functionality.
      Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
      Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx5: Introduce offloads steering namespace · acbc2004
      Or Gerlitz committed
      Add a new namespace (MLX5_FLOW_NAMESPACE_OFFLOADS) to be populated
      with flow steering rules that have to be executed before the EN NIC
      steering rules are matched.
      
      The namespace is located after the bypass name-space and before the
      kernel name-space. Therefore, it precedes the HW processing done for
      rules set for the kernel NIC name-space.
      
      Under SRIOV, it would allow us to match on e-switch missed packets
      and forward them to the relevant VF representor TIR.
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: Amir Vadai <amir@vadai.me>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  7. 02 Jul, 2016 5 commits
    • cgroup: Add cgroup_get_from_fd · 1f3fe7eb
      Martin KaFai Lau committed
      Add a helper function to get a cgroup2 from a fd.  It will be
      stored in a bpf array (BPF_MAP_TYPE_CGROUP_ARRAY) which will
      be introduced in a later patch.
      Signed-off-by: Martin KaFai Lau <kafai@fb.com>
      Cc: Alexei Starovoitov <ast@fb.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Tejun Heo <tj@kernel.org>
      Acked-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net_sched: fix mirrored packets checksum · 82a31b92
      WANG Cong committed
      Similar to commit 9b368814 ("net: fix bridge multicast packet checksum validation"),
      we need to fix up the checksum for CHECKSUM_COMPLETE when
      pushing skb on RX path. Otherwise we get similar splats.
      
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: Tom Herbert <tom@herbertland.com>
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • packet: Use symmetric hash for PACKET_FANOUT_HASH. · eb70db87
      David S. Miller committed
      People who use PACKET_FANOUT_HASH want a symmetric hash, meaning that
      they want packets going in both directions on a flow to hash to the
      same bucket.
      
      The core kernel SKB hash became non-symmetric when the ipv6 flow label
      and other entities were incorporated into the standard flow hash in
      order to increase entropy.
      
      But there are no users of PACKET_FANOUT_HASH who want an asymmetric
      hash; they all want a symmetric one.
      
      Therefore, use the flow dissector to compute a flat symmetric hash
      over only the protocol, addresses and ports.  This hash does not get
      installed into and override the normal skb hash, so this change has
      no effect whatsoever on the rest of the stack.
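
As a rough illustration of what "symmetric" means here, the following user-space sketch puts the two endpoints of a flow into a canonical order before hashing, so both directions produce the same value. It is not the kernel's flow dissector code: `struct demo_flow`, `demo_mix()`, `demo_symmetric_hash()`, and the FNV-1a style mixing are all assumptions chosen for brevity.

```c
#include <stdint.h>

/* Hypothetical flat flow key: protocol, IPv4 addresses, and ports only. */
struct demo_flow {
	uint32_t saddr, daddr;
	uint16_t sport, dport;
	uint8_t proto;
};

/* Fold one 32-bit value into the running hash, byte by byte (FNV-1a style). */
static uint32_t demo_mix(uint32_t h, uint32_t v)
{
	for (int i = 0; i < 4; i++) {
		h ^= (v >> (8 * i)) & 0xffu;
		h *= 16777619u;
	}
	return h;
}

/* Hash a flow so that A->B and B->A yield the same value. */
static uint32_t demo_symmetric_hash(struct demo_flow f)
{
	/* Canonical order: the smaller (address, port) endpoint goes first. */
	if (f.saddr > f.daddr || (f.saddr == f.daddr && f.sport > f.dport)) {
		uint32_t a = f.saddr;
		uint16_t p = f.sport;

		f.saddr = f.daddr;
		f.daddr = a;
		f.sport = f.dport;
		f.dport = p;
	}

	uint32_t h = 2166136261u;

	h = demo_mix(h, f.proto);
	h = demo_mix(h, f.saddr);
	h = demo_mix(h, f.daddr);
	h = demo_mix(h, ((uint32_t)f.sport << 16) | f.dport);
	return h;
}
```

Because the result is returned to the caller rather than stored back into the packet, a sketch like this mirrors the point made above: the symmetric value is used only for fanout bucket selection and does not replace any other hash.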
      Reported-by: Eric Leblond <eric@regit.org>
      Tested-by: Eric Leblond <eric@regit.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bpf: refactor bpf_prog_get and type check into helper · 113214be
      Daniel Borkmann committed
      Since bpf_prog_get() and the program type check are used in a couple of places,
      refactor this into a small helper function that we can make use of. Since
      the non RO prog->aux part is not used in performance critical paths and a
      program destruction via RCU is very unlikely when doing the put, we
      shouldn't have an issue just doing the bpf_prog_get() + prog->type != type
      check, but actually not taking the ref at all (due to being in fdget() /
      fdput() section of the bpf fd) is even cleaner and makes the diff smaller
      as well, so just go for that. Callsites are changed to make use of the new
      helper where possible.
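
The pattern described above, checking the type before any reference is taken, can be sketched in user space as follows. This is only a mock: `struct demo_prog`, `demo_fd_table`, and `demo_prog_get_type()` are hypothetical names, and a plain integer stands in for the kernel's reference counter.

```c
#include <stddef.h>

/* Hypothetical program types, for illustration only. */
enum demo_prog_type { DEMO_PROG_SOCKET_FILTER, DEMO_PROG_KPROBE };

struct demo_prog {
	enum demo_prog_type type;
	int refcnt; /* plain int stands in for an atomic counter */
};

/* Hypothetical fd table mapping small integers to programs. */
#define DEMO_MAX_FDS 8
static struct demo_prog *demo_fd_table[DEMO_MAX_FDS];

/*
 * Look up a program by fd and take a reference only if its type matches.
 * On a mismatch no reference is taken, so the caller has nothing to put.
 */
static struct demo_prog *demo_prog_get_type(int fd, enum demo_prog_type type)
{
	struct demo_prog *prog;

	if (fd < 0 || fd >= DEMO_MAX_FDS)
		return NULL;

	prog = demo_fd_table[fd];
	if (!prog || prog->type != type)
		return NULL;

	prog->refcnt++;
	return prog;
}
```

A caller that previously did a get, compared the type, and then put on mismatch would collapse to a single call and a NULL check.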
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bpf: generally move prog destruction to RCU deferral · 1aacde3d
      Daniel Borkmann committed
      Jann Horn reported the following analysis, which could potentially result
      in a very hard to trigger (if not impossible) UAF race. To quote his
      event timeline:
      
       - Set up a process with threads T1, T2 and T3
       - Let T1 set up a socket filter F1 that invokes another filter F2
         through a BPF map [tail call]
       - Let T1 trigger the socket filter via a unix domain socket write,
         don't wait for completion
       - Let T2 call PERF_EVENT_IOC_SET_BPF with F2, don't wait for completion
       - Now T2 should be behind bpf_prog_get(), but before bpf_prog_put()
       - Let T3 close the file descriptor for F2, dropping the reference
         count of F2 to 2
       - At this point, T1 should have looked up F2 from the map, but not
         finished executing it
       - Let T3 remove F2 from the BPF map, dropping the reference count of
         F2 to 1
       - Now T2 should call bpf_prog_put() (wrong BPF program type), dropping
         the reference count of F2 to 0 and scheduling bpf_prog_free_deferred()
         via schedule_work()
       - At this point, the BPF program could be freed
       - BPF execution is still running in a freed BPF program
      
      While at PERF_EVENT_IOC_SET_BPF time it's only guaranteed that the perf
      event fd we're doing the syscall on doesn't disappear from underneath us
      for the whole syscall time, that may not be the case for the bpf fd, which
      is used only as an argument, after we did the put. It needs to be a valid fd
      pointing to a BPF program at the time of the call to make the bpf_prog_get(),
      and while T2 gets preempted, F2 must have dropped its reference count to 1 on
      the other CPU. The fput() from the close() in T3 should also add additional
      delay to the reference drop via exit_task_work() when bpf_prog_release() gets
      called, as well as scheduling bpf_prog_free_deferred().
      
      That said, it nevertheless makes sense to move the BPF prog destruction
      generally after an RCU grace period to guarantee that neither the scenario
      above nor others, as recently fixed in ceb56070 ("bpf, perf: delay release
      of BPF prog after grace period") with regards to tail calls, can happen.
      Integrating bpf_prog_free_deferred() directly into the RCU callback is
      not allowed since the invocation might happen from either softirq or
      process context, so we're not permitted to block. Reviewing all bpf_prog_put()
      invocations from the eBPF side (note, cBPF -> eBPF progs don't use this for
      their destruction), using call_rcu() looks good to me.
      
      Since we don't know at the time of attaching the program whether we're
      already part of a tail call map, we need to use the RCU variant. However,
      this won't put significantly more stress on the RCU callback queue:
      situations with above bpf_prog_get() and bpf_prog_put() combo in practice
      normally won't lead to releases, but even if they would, enough effort/
      cycles have to be put into loading a BPF program into the kernel already.
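
The ordering that the change aims for (the last put does not free the program; the free happens only after a grace period) can be modelled with a small single-threaded sketch. This is not RCU and not the kernel implementation: `struct demo_rcu_prog`, `demo_prog_put()`, and `demo_grace_period_elapsed()` are hypothetical names, and the "grace period" is simply an explicit function call.

```c
#include <stdlib.h>

/* Hypothetical program object with a plain (non-atomic) reference count. */
struct demo_rcu_prog {
	int refcnt;
	struct demo_rcu_prog *next_deferred;
};

/* Programs whose last reference is gone but which are not yet freed. */
static struct demo_rcu_prog *demo_deferred_list;
static int demo_freed_count;

/* Allocate a program holding one initial reference. */
static struct demo_rcu_prog *demo_prog_alloc(void)
{
	struct demo_rcu_prog *prog = calloc(1, sizeof(*prog));

	if (prog)
		prog->refcnt = 1;
	return prog;
}

/*
 * Drop one reference. When the count reaches zero the program is only
 * queued for destruction, so code still executing it stays safe until
 * the simulated grace period ends.
 */
static void demo_prog_put(struct demo_rcu_prog *prog)
{
	if (--prog->refcnt == 0) {
		prog->next_deferred = demo_deferred_list;
		demo_deferred_list = prog;
	}
}

/* Stand-in for "all readers are done": free everything that was queued. */
static void demo_grace_period_elapsed(void)
{
	while (demo_deferred_list) {
		struct demo_rcu_prog *prog = demo_deferred_list;

		demo_deferred_list = prog->next_deferred;
		free(prog);
		demo_freed_count++;
	}
}
```

In the timeline quoted above, T2's final put would correspond to `demo_prog_put()` queuing the program, while T1's still-running execution would finish before `demo_grace_period_elapsed()` is reached.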
      Reported-by: Jann Horn <jannh@google.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  8. 01 Jul, 2016 8 commits
  9. 30 Jun, 2016 8 commits
  10. 29 Jun, 2016 2 commits