1. 18 Jun, 2017 13 commits
    • ipv6: mark DST_NOGC and remove the operation of dst_free() · 587fea74
      Committed by Wei Wang
      With the previous preparation patches, we are ready to get rid of the
      dst gc operation in ipv6 code and release dst based on refcnt only.
      So this patch adds the DST_NOGC flag to all IPv6 dst and removes the calls
      to dst_free() and its related functions.
      At this point, no dst created in ipv6 code uses the dst gc
      anymore; each one is destroyed at the point when its refcnt drops to 0.
      
      Also, as the icmp6 dst route is refcounted during creation and will be freed
      by the user through its call to dst_release(), there is no need to add this
      dst to the icmp6 gc list either.
      Instead, we need to add it to the uncached list so that when a
      NETDEV_DOWN/NETDEV_UNREGISTER event comes, we can go through
      these icmp6 dst as well and release the net device properly.
      Signed-off-by: Wei Wang <weiwan@google.com>
      Acked-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: call dst_hold_safe() properly · ad65a2f0
      Committed by Wei Wang
      Similar to ipv4, the ipv6 path also needs to call dst_hold_safe() when
      necessary to avoid a double free of the dst.
      Signed-off-by: Wei Wang <weiwan@google.com>
      Acked-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: call dst_dev_put() properly · 9514528d
      Committed by Wei Wang
      As the intent of this patch series is to completely remove dst gc,
      we need to call dst_dev_put() to release the reference to dst->dev
      when removing routes from fib because we won't keep the gc list anymore
      and will lose the dst pointer right after removing the routes.
      Without the gc list, there is no way to find all the dst's that have
      dst->dev pointing to the going-down dev.
      Hence, we are doing dst_dev_put() immediately before we lose the last
      reference of the dst from the routing code. The next dst_check() will
      trigger a route re-lookup to find another route (if there is any).
      Signed-off-by: Wei Wang <weiwan@google.com>
      Acked-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: take dst->__refcnt for insertion into fib6 tree · 1cfb71ee
      Committed by Wei Wang
      In IPv6 routing code, struct rt6_info is created for each static route
      and RTF_CACHE route and inserted into fib6 tree. In both cases, dst
      ref count is not taken.
      As explained in the previous patch, this leads to the need for the dst
      garbage collector.
      
      This patch holds ref count of dst before inserting the route into fib6
      tree and properly releases the dst when deleting it from the fib6 tree
      as a preparation in order to fully get rid of dst gc later.
      
      Also, correct the fib6_age() logic to check that dst->__refcnt is 1, which
      indicates that no user is referencing the dst.
      
      And remove dst_hold() in vrf_rt6_create() as ip6_dst_alloc() already sets
      dst->__refcnt to 1.
      Signed-off-by: Wei Wang <weiwan@google.com>
      Acked-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv4: mark DST_NOGC and remove the operation of dst_free() · b838d5e1
      Committed by Wei Wang
      With the previous preparation patches, we are ready to get rid of the
      dst gc operation in ipv4 code and release dst based on refcnt only.
      So this patch adds the DST_NOGC flag to all IPv4 dst and removes the calls
      to dst_free().
      At this point, no dst created in ipv4 code uses the dst gc
      anymore; each one is destroyed at the point when its refcnt drops to 0.
      Signed-off-by: Wei Wang <weiwan@google.com>
      Acked-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv4: call dst_hold_safe() properly · 9df16efa
      Committed by Wei Wang
      This patch checks all the calls to
      dst_hold()/skb_dst_force()/dst_clone()/dst_use() to see if
      dst_hold_safe() is needed to avoid a double free once dst
      gc is removed and dst_release() directly destroys the dst when
      dst->__refcnt drops to 0.
      
      In the tx path, TCP holds the sk->sk_rx_dst ref count and also holds
      sock_lock(). UDP and other similar protocols always hold a refcount for
      skb->_skb_refdst. So both paths seem to be safe.
      
      In the rx path, as it is lockless and skb_dst_set_noref() is likely to be
      used, dst_hold_safe() should always be used when trying to hold the dst.
      
      In the routing code, if the dst is held inside an rcu read-side critical
      section, it is necessary to call dst_hold_safe() as the current dst might
      be in its rcu grace period.
      Signed-off-by: Wei Wang <weiwan@google.com>
      Acked-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv4: call dst_dev_put() properly · 95c47f9c
      Committed by Wei Wang
      As the intent of this patch series is to completely remove dst gc,
      we need to call dst_dev_put() to release the reference to dst->dev
      when removing routes from fib because we won't keep the gc list anymore
      and will lose the dst pointer right after removing the routes.
      Without the gc list, there is no way to find all the dst's that have
      dst->dev pointing to the going-down dev.
      Hence, we are doing dst_dev_put() immediately before we lose the last
      reference of the dst from the routing code. The next dst_check() will
      trigger a route re-lookup to find another route (if there is any).
      Signed-off-by: Wei Wang <weiwan@google.com>
      Acked-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv4: take dst->__refcnt when caching dst in fib · 0830106c
      Committed by Wei Wang
      In IPv4 routing code, fib_nh and fib_nh_exception can hold pointers
      to struct rtable but they never increment dst->__refcnt.
      This leads to the need for the dst garbage collector: when a user
      is done with this dst and calls dst_release(), it can only decrement
      dst->__refcnt and cannot free the dst even if it sees dst->__refcnt
      drop from 1 to 0 (unless the DST_NOCACHE flag is set), because the routing
      code might still hold a reference to it.
      And when the routing code tries to delete a route, it has to put the
      dst on the gc_list if dst->__refcnt is not yet 0 and have a gc thread
      run periodically to check dst->__refcnt and finally free the dst
      when the refcnt becomes 0.
      
      This patch increments dst->__refcnt when
      fib_nh/fib_nh_exception holds a reference to this dst and properly releases
      the dst when fib_nh/fib_nh_exception has been updated with a new dst.
      
      This patch is a preparation in order to fully get rid of dst gc later.
      Signed-off-by: Wei Wang <weiwan@google.com>
      Acked-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
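The ownership rule described in this commit (the cache slot takes its own reference and drops the reference of whatever it replaces) can be sketched in plain userspace C. This is only an illustrative model, not the kernel code: `model_rt`, `model_nh`, `model_rt_release()` and `model_cache_route()` are hypothetical names, the counter is a plain `int` rather than an atomic, and "freeing" is modeled by setting a flag.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a cached route (struct rtable in the kernel). */
struct model_rt {
    int refcnt;  /* models dst->__refcnt */
    int freed;   /* set where the real code would free the dst */
};

/* Hypothetical stand-in for a next hop that caches one route pointer. */
struct model_nh {
    struct model_rt *cached;
};

/* Drop one reference; the last reference "frees" the route. */
static void model_rt_release(struct model_rt *rt)
{
    assert(rt->refcnt > 0);
    if (--rt->refcnt == 0)
        rt->freed = 1;
}

/*
 * Store rt in the cache slot. The slot takes its own reference on the
 * new route and releases the reference it held on the old one, so the
 * route can be freed purely by refcount with no gc list involved.
 */
static void model_cache_route(struct model_nh *nh, struct model_rt *rt)
{
    struct model_rt *old = nh->cached;

    if (rt)
        rt->refcnt++;
    nh->cached = rt;
    if (old)
        model_rt_release(old);
}
```

With this rule, a user dropping its reference while the route is still cached leaves the count at 1, and the route is only freed when the cache slot is later overwritten.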
    • net: introduce a new function dst_dev_put() · 4a6ce2b6
      Committed by Wei Wang
      This function should be called when removing routes from fib tree after
      the dst gc is no longer in use.
      We first mark DST_OBSOLETE_DEAD on this dst to make sure next
      dst_ops->check() fails and returns NULL.
      Secondly, as we no longer keep the gc_list, we need to properly
      release dst->dev right at the moment when the dst is removed from
      the fib/fib6 tree.
      It does the following:
      1. change dst->input and output pointers to dst_discard/dst_discard_out to
         discard all packets
      2. replace dst->dev with loopback interface
      Signed-off-by: Wei Wang <weiwan@google.com>
      Acked-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
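The steps this commit message lists for dst_dev_put() can be illustrated with a small userspace model. It is a sketch under stated assumptions rather than the kernel implementation: all `model_*` names are hypothetical, device refcounts are plain integers, the loopback device is passed in explicitly instead of being looked up from the network namespace, and the numeric value of `MODEL_OBSOLETE_DEAD` is arbitrary.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct net_device. */
struct model_dev {
    int refcnt;
};

struct model_dst;
typedef int (*model_handler)(struct model_dst *dst);

/* Hypothetical stand-in for struct dst_entry. */
struct model_dst {
    int obsolete;
    struct model_dev *dev;
    model_handler input;
    model_handler output;
};

enum { MODEL_OBSOLETE_DEAD = 2 };  /* illustrative marker value */

/* Models dst_discard()/dst_discard_out(): the packet is dropped. */
static int model_discard(struct model_dst *dst) { (void)dst; return 0; }

/* Models a normal handler that would pass the packet on. */
static int model_forward(struct model_dst *dst) { (void)dst; return 1; }

/* Models dst_ops->check(): a dead dst fails the check. */
static int model_dst_check(const struct model_dst *dst)
{
    return dst->obsolete != MODEL_OBSOLETE_DEAD;
}

static void model_dst_dev_put(struct model_dst *dst, struct model_dev *loopback)
{
    struct model_dev *old = dst->dev;

    assert(old != NULL && loopback != NULL);
    dst->obsolete = MODEL_OBSOLETE_DEAD;  /* next check fails, forcing a re-lookup */
    dst->input = model_discard;           /* step 1: discard all packets */
    dst->output = model_discard;
    dst->dev = loopback;                  /* step 2: point at loopback instead */
    loopback->refcnt++;                   /* the dst now holds loopback */
    old->refcnt--;                        /* and no longer pins the old device */
}
```

After the call, the dst no longer pins the original device, so that device can go down even while other users still hold references to the dst.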
    • net: introduce DST_NOGC in dst_release() to destroy dst based on refcnt · 5f56f409
      Committed by Wei Wang
      The current mechanism of freeing dst is a bit complicated. dst has its
      ref count, and when a user grabs a reference to the dst, the ref count is
      properly taken in most cases, except in the IPv4/IPv6/decnet/xfrm routing
      code, for historical reasons.
      
      If the reference to dst is always taken properly, we should be able to
      simplify the logic in dst_release() to destroy dst when dst->__refcnt
      drops from 1 to 0. And this should be the only condition to determine
      if we can call dst_destroy().
      And as dst is always ref counted, there is no need for a dst garbage
      list to hold the dst entries that have already been removed by the routing
      code but are still held by other users. The task that periodically
      checks the list to free a dst once its ref count becomes 0 is also no
      longer needed.
      
      This patch introduces a temporary flag, DST_NOGC (no garbage collector).
      If it is set in the dst, dst_release() will call dst_destroy() when
      dst->__refcnt drops to 0. dst_hold_safe() will also check for this flag
      and do atomic_inc_not_zero(), as it does for DST_NOCACHE, to avoid a
      double free.
      This temporary flag is mainly used so that we can make the transition
      component by component without breaking other parts.
      This flag will be removed after all components are properly transitioned.
      
      This patch also introduces a new function dst_release_immediate() which
      destroys dst without waiting on the rcu when refcnt drops to 0. It will
      be used in later patches.
      
      Follow-up patches will correct all the places to properly take ref count
      on dst and mark DST_NOGC. dst_release() or dst_release_immediate() will
      be used to release the dst instead of dst_free() and its related
      functions.
      A final clean-up patch will then remove the DST_NOGC flag.
      Signed-off-by: Wei Wang <weiwan@google.com>
      Acked-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
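The interaction between dst_release() and dst_hold_safe() described above can be modeled in userspace with C11 atomics. This is a minimal sketch, not the kernel code: `model_dst` and the `model_*` functions are hypothetical, destruction is represented by a flag, and the RCU deferral that the commit message mentions is left out entirely.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct dst_entry. */
struct model_dst {
    atomic_int refcnt;  /* models dst->__refcnt */
    bool nogc;          /* models the DST_NOGC flag */
    bool destroyed;     /* set where the kernel would call dst_destroy() */
};

/* Take a reference only if the count is still non-zero. */
static bool model_inc_not_zero(atomic_int *v)
{
    int old = atomic_load(v);

    while (old != 0) {
        /* On failure, old is refreshed with the current value. */
        if (atomic_compare_exchange_weak(v, &old, old + 1))
            return true;
    }
    return false;
}

/*
 * With the flag set, a count of 0 means the dst is already on its way
 * to destruction, so taking a new reference must fail instead of
 * resurrecting it (which would later cause a double free).
 */
static bool model_dst_hold_safe(struct model_dst *dst)
{
    if (dst->nogc)
        return model_inc_not_zero(&dst->refcnt);
    atomic_fetch_add(&dst->refcnt, 1);
    return true;
}

/* Drop a reference; with the flag set, the last drop destroys the dst. */
static void model_dst_release(struct model_dst *dst)
{
    int newcnt = atomic_fetch_sub(&dst->refcnt, 1) - 1;

    assert(newcnt >= 0);
    if (newcnt == 0 && dst->nogc)
        dst->destroyed = true;
}
```

The point of the inc-not-zero variant is that a reader which found the dst through a lockless lookup cannot revive an entry whose last reference has already been dropped.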
    • net: use loopback dev when generating blackhole route · 1dbe3252
      Committed by Wei Wang
      Existing ipv4/6_blackhole_route() code generates a blackhole route
      with dst->dev pointing to the passed in dst->dev.
      It is not necessary to hold reference to the passed in dst->dev
      because the packets going through this route are dropped anyway.
      A loopback interface is good enough so that we don't need to worry about
      releasing this dst->dev when this dev is going down.
      Signed-off-by: Wei Wang <weiwan@google.com>
      Acked-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • udp: call dst_hold_safe() in udp_sk_rx_set_dst() · d24406c8
      Committed by Wei Wang
      In udp_v4/6_early_demux() code, we try to hold dst->__refcnt for
      dst with DST_NOCACHE flag. This is because later in udp_sk_rx_dst_set()
      function, we will try to cache this dst in sk for connected case.
      However, a better way to achieve this is to not try to hold dst in
      early_demux(), but to call dst_hold_safe() in udp_sk_rx_dst_set(). This
      approach is also more consistent with how tcp handles it, and it
      will make later changes simpler.
      Signed-off-by: Wei Wang <weiwan@google.com>
      Acked-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: remove unnecessary dst_hold() in ip6_fragment() · 1758fd46
      Committed by Wei Wang
      In the ipv6 tx path, rcu_read_lock() is taken so that the dst won't get
      freed during the execution of ip6_fragment(). Hence, there is no need to
      hold the dst in it.
      Signed-off-by: Wei Wang <weiwan@google.com>
      Acked-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 17 Jun, 2017 25 commits
  3. 16 Jun, 2017 2 commits
    • Merge branch 'bpf-xdp-Report-bpf_prog-ID-in-IFLA_XDP' · c52e6098
      Committed by David S. Miller
      Martin KaFai Lau says:
      
      ====================
      bpf: xdp: Report bpf_prog ID in IFLA_XDP
      
      This is the first usage of the new bpf_prog ID.  It is for
      reporting the ID of an xdp_prog through netlink.
      
      It rides on the existing IFLA_XDP.  This patch adds IFLA_XDP_PROG_ID
      for the bpf_prog ID reporting.
      
      It starts with changing the generic_xdp first.  After that,
      the hardware driver is changed one by one.  Jakub Kicinski mentioned
      that he will soon introduce XDP_ATTACHED_HW (on top of the existing
      XDP_ATTACHED_DRV and XDP_ATTACHED_SKB)
      and he is going to reuse the prog_attached for this purpose.
      Hence, this patch set keeps the prog_attached even though
      !!prog_id also implies that an xdp_prog is attached.
      
      I have tested with generic_xdp, mlx4 and mlx5.
      
      v3:
      1. Replace 'if' by '?' when checking the xdp_prog pointer
         as suggested by Jakub Kicinski (thanks!)
      
      v2:
      1. Remove READ_ONCE since it is already under rtnl lock
      2. Keep prog_attached in 'struct netdev_xdp' as
         requested by Jakub Kicinski.  The existing prog_attached
         and the new prog_id are put under a struct for XDP_QUERY_PROG.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
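The query behaviour this cover letter describes (report the program ID with a `?:` check on the xdp_prog pointer, while still keeping prog_attached alongside it) might look roughly like the following userspace sketch. The struct and function names are hypothetical and do not mirror the kernel's `struct netdev_xdp` layout; the sketch also assumes that a valid program ID is never 0, which is what makes `!!prog_id` equivalent to "attached".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an attached XDP program. */
struct model_prog {
    unsigned int id;  /* assumed non-zero for any valid program */
};

/* Hypothetical reply carrying both fields side by side. */
struct model_xdp_query {
    unsigned char prog_attached;
    unsigned int prog_id;
};

/* Fill in the query reply; prog is NULL when nothing is attached. */
static void model_xdp_query_prog(const struct model_prog *prog,
                                 struct model_xdp_query *q)
{
    assert(q != NULL);
    q->prog_id = prog ? prog->id : 0;  /* '?' instead of 'if', as in v3 */
    q->prog_attached = !!prog;         /* kept even though !!prog_id implies it */
}
```

Keeping both fields is redundant for this two-state case, but it leaves room for prog_attached to carry more than a yes/no answer later, as the cover letter anticipates.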
    • bpf: qede: Report bpf_prog ID during XDP_QUERY_PROG · 22e0d75f
      Committed by Martin KaFai Lau
      Add support to qede to report bpf_prog ID during XDP_QUERY_PROG.
      Signed-off-by: Martin KaFai Lau <kafai@fb.com>
      Cc: Mintz Yuval <Yuval.Mintz@cavium.com>
      Acked-by: Alexei Starovoitov <ast@fb.com>
      Acked-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>