1. 11 Jun, 2016 5 commits
    • net_sched: remove generic throttled management · 45f50bed
      Eric Dumazet committed
      __QDISC_STATE_THROTTLED bit manipulation is rather expensive
      for HTB and a few others.
      
      I already removed it for sch_fq in commit f2600cf0
      ("net: sched: avoid costly atomic operation in fq_dequeue()")
      and so far nobody complained.
      
      When one or more packets are stuck in one or more throttled
      HTB classes, an HTB dequeue() performs two atomic operations
      to clear/set the __QDISC_STATE_THROTTLED bit, while the root qdisc
      lock is held.
      
      Removing this pair of atomic operations brings me an 8% performance
      increase on 200 TCP_RR tests, in the presence of throttled classes.
      
      This patch has no side effect, since nothing actually uses
      qdisc_is_throttled() anymore.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net_sched: netem: remove qdisc_is_throttled() use · 42117927
      Eric Dumazet committed
      Looks like it is only there as some optimization attempt.
      
      Since __QDISC_STATE_THROTTLED set/unset is way too expensive,
      and netem is the last user, just remove this check.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net_sched: cbq: remove a flaky use of qdisc_is_throttled() · cca605dd
      Eric Dumazet committed
      So far no qdisc ever unset the throttled bit at enqueue() time,
      so CBQ usage of qdisc_is_throttled() was flaky.
      
      Since __QDISC_STATE_THROTTLED set/unset is way too expensive,
      considering that only CBQ ended up caring about this status,
      it would make sense to implement a Qdisc ops ->is_throttled()
      if we find that this is needed.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net_sched: sch_plug: use a private throttled status · 8fe6a79f
      Eric Dumazet committed
      We want to get rid of generic qdisc throttled management,
      so this qdisc has to use a private flag.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
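The idea of a private throttled status can be sketched in userspace C. This is a simplified model under stated assumptions, not the kernel code: the type `plug_model` and the functions `plug_init`, `plug_enqueue`, `plug_dequeue`, `plug_release`, and `plug_block` are hypothetical names, and the model assumes the caller already serializes access (as the qdisc lock would), which is why a plain `bool` can stand in for an atomic state bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PLUG_CAP 8

/* Hypothetical model of a plug-style queue's private data. */
struct plug_model {
    int    pkts[PLUG_CAP]; /* FIFO of packet ids */
    size_t head;           /* index of the oldest packet */
    size_t len;            /* number of queued packets */
    bool   throttled;      /* private flag, no shared atomic state bit */
};

static void plug_init(struct plug_model *q)
{
    assert(q != NULL);
    q->head = 0;
    q->len = 0;
    q->throttled = true;   /* assumption: the queue starts plugged */
}

/* Queue a packet; returns false when the FIFO is full. */
static bool plug_enqueue(struct plug_model *q, int pkt)
{
    if (q->len == PLUG_CAP)
        return false;
    q->pkts[(q->head + q->len) % PLUG_CAP] = pkt;
    q->len++;
    return true;
}

/* Returns -1 when throttled or empty. The check is a plain load. */
static int plug_dequeue(struct plug_model *q)
{
    int pkt;

    if (q->throttled || q->len == 0)
        return -1;
    pkt = q->pkts[q->head];
    q->head = (q->head + 1) % PLUG_CAP;
    q->len--;
    return pkt;
}

/* Unplug and re-plug: plain stores, valid only under the caller's lock. */
static void plug_release(struct plug_model *q) { q->throttled = false; }
static void plug_block(struct plug_model *q)   { q->throttled = true; }
```

Because the flag is only touched by code that already holds the lock, the dequeue path pays for one ordinary load instead of a pair of atomic read-modify-write operations.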
    • net, cls: allow for deleting all filters for given parent · ea7f8277
      Daniel Borkmann committed
      Add the possibility for the user to specify just the parent, so that
      all filters under that parent are purged. Currently, for example
      for scripting, one needs to specify pref/prio to have a well-defined
      number for the 'tc filter del' command to address the previously
      created instance, or additionally the filter handle in case the
      priorities are the same. Improve usability by allowing tc to
      specify only the parent and remove the whole chain for that
      parent.
      
      Example usage after patch, no tc changes required:
      
        # tc qdisc replace dev foo clsact
        # tc filter add dev foo egress bpf da obj ./bpf.o
        # tc filter add dev foo egress bpf da obj ./bpf.o
        # tc filter show dev foo egress
        filter protocol all pref 49151 bpf
        filter protocol all pref 49151 bpf handle 0x1 bpf.o:[classifier] direct-action
        filter protocol all pref 49152 bpf
        filter protocol all pref 49152 bpf handle 0x1 bpf.o:[classifier] direct-action
        # tc filter del dev foo egress
        # tc filter show dev foo egress
        #
      
      Previously, RTM_DELTFILTER requests with invalid prio of 0 were
      rejected, so only netlink requests with RTM_NEWTFILTER and NLM_F_CREATE
      flag were allowed where the kernel would auto-generate a pref/prio.
      We can piggyback on that and use prio of 0 as a wildcard for
      requests of RTM_DELTFILTER.
      
      For notifying tc netlink monitoring users (e.g. libnl uses this
      for caching), there are two options: sending individual
      tfilter_notify() notifications for each tcf_proto, or sending a
      single one indicating wildcard removal. I tried both, and there
      are pros and cons for each. Eventually I decided on sending
      individual tfilter_notify() notifications, so that user space can
      support this seamlessly and there is no need to change each and
      every application to keep its expectations of the kernel from
      breaking when it does not understand a single wildcard
      notification. Since linear chains don't really scale, I expect only
      a handful of classifiers to be attached at most for a given parent
      anyway.
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
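The wildcard semantics described in this commit (prio 0 on delete means "every filter under the parent", with one notification per removed classifier) can be sketched as a small userspace model. This is an illustration under stated assumptions, not the kernel implementation: `filter_model`, `chain_add`, `chain_del`, `notify_fn`, and `count_notify` are hypothetical names, and the model keeps a single entry per prio without protocol or handle matching.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for one classifier instance on a parent's chain. */
struct filter_model {
    uint32_t prio;
    struct filter_model *next;
};

/* Hypothetical notification hook, called once per removed filter. */
typedef void (*notify_fn)(uint32_t prio, void *ctx);

/* Insert a filter, keeping the chain sorted by ascending prio.
 * Returns 0 on success, -1 if allocation fails. */
static int chain_add(struct filter_model **chain, uint32_t prio)
{
    struct filter_model *f = malloc(sizeof(*f));

    if (!f)
        return -1;
    f->prio = prio;
    while (*chain && (*chain)->prio < prio)
        chain = &(*chain)->next;
    f->next = *chain;
    *chain = f;
    return 0;
}

/* Delete filters from the chain. prio == 0 acts as a wildcard and
 * removes every entry; any other value removes only matching entries.
 * Each removal triggers its own notification. Returns the number removed. */
static unsigned chain_del(struct filter_model **chain, uint32_t prio,
                          notify_fn notify, void *ctx)
{
    unsigned removed = 0;

    assert(chain != NULL);
    while (*chain) {
        struct filter_model *f = *chain;

        if (prio == 0 || f->prio == prio) {
            *chain = f->next;       /* unlink before notifying and freeing */
            if (notify)
                notify(f->prio, ctx);
            free(f);
            removed++;
        } else {
            chain = &f->next;
        }
    }
    return removed;
}

/* Example hook: counts notifications in an unsigned pointed to by ctx. */
static void count_notify(uint32_t prio, void *ctx)
{
    (void)prio;
    ++*(unsigned *)ctx;
}
```

Emitting one notification per entry keeps a listener's view consistent without requiring it to understand a new "everything was removed" message, which is the trade-off the commit message describes.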
  2. 10 Jun, 2016 1 commit
  3. 09 Jun, 2016 7 commits
  4. 08 Jun, 2016 13 commits
  5. 07 Jun, 2016 1 commit
  6. 04 Jun, 2016 5 commits
  7. 25 May, 2016 2 commits
  8. 18 May, 2016 1 commit
    • net_sched: close another race condition in tcf_mirred_release() · dc327f89
      WANG Cong committed
      We saw the following extra refcount release on veth device:
      
        kernel: [7957821.463992] unregister_netdevice: waiting for mesos50284 to become free. Usage count = -1
      
      Since we heavily use mirred action to redirect packets to veth, I think
      this is caused by the following race condition:
      
      CPU0:
      tcf_mirred_release(): (in RCU callback)
      	struct net_device *dev = rcu_dereference_protected(m->tcfm_dev, 1);
      
      CPU1:
      mirred_device_event():
              spin_lock_bh(&mirred_list_lock);
              list_for_each_entry(m, &mirred_list, tcfm_list) {
                      if (rcu_access_pointer(m->tcfm_dev) == dev) {
                              dev_put(dev);
                              /* Note : no rcu grace period necessary, as
                               * net_device are already rcu protected.
                               */
                              RCU_INIT_POINTER(m->tcfm_dev, NULL);
                      }
              }
              spin_unlock_bh(&mirred_list_lock);
      
      CPU0:
      tcf_mirred_release():
              spin_lock_bh(&mirred_list_lock);
              list_del(&m->tcfm_list);
              spin_unlock_bh(&mirred_list_lock);
              if (dev)               // <======== Still refers to the old m->tcfm_dev
                      dev_put(dev);  // <======== dev_put() is called on it again
      
      The action init code path is good because it is impossible to modify
      an action that is being removed.
      
      So, fix this by moving everything under the spinlock.
      
      Fixes: 2ee22a90 ("net_sched: act_mirred: remove spinlock in fast path")
      Fixes: 6bd00b85 ("act_mirred: fix a race condition on mirred_list")
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
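The interleaving described in this commit can be replayed deterministically in a single-threaded userspace model. This is only a sketch under stated assumptions: `dev_model`, `mirred_model`, `dev_put_model`, `device_event`, `release_racy_read`, `release_racy_finish`, and `release_fixed` are hypothetical names, there is no real lock here, and the racy release is split into two functions so that a caller can place the device event between them, which is what the other CPU does in the trace above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device with a bare reference counter. */
struct dev_model { int refcnt; };

/* Hypothetical mirred action holding one reference on its target device. */
struct mirred_model {
    struct dev_model *dev;
    bool on_list;          /* models membership in mirred_list */
};

static void dev_put_model(struct dev_model *d)
{
    assert(d != NULL);
    d->refcnt--;
}

/* Device-unregister event: drops the action's reference and clears the
 * pointer. In the kernel this runs with mirred_list_lock held. */
static void device_event(struct mirred_model *m, struct dev_model *d)
{
    if (m->on_list && m->dev == d) {
        dev_put_model(d);
        m->dev = NULL;
    }
}

/* Racy release, step 1: the pointer is sampled outside the lock. */
static struct dev_model *release_racy_read(struct mirred_model *m)
{
    return m->dev;
}

/* Racy release, step 2: unlink, then put the possibly stale snapshot. */
static void release_racy_finish(struct mirred_model *m,
                                struct dev_model *snapshot)
{
    m->on_list = false;
    if (snapshot)
        dev_put_model(snapshot);
}

/* Fixed release: unlink, read, and put form one indivisible step, which
 * the patch achieves by doing all of it under the spinlock. */
static void release_fixed(struct mirred_model *m)
{
    struct dev_model *d;

    m->on_list = false;
    d = m->dev;            /* read after any concurrent event has finished */
    if (d)
        dev_put_model(d);
}
```

Replaying read, event, finish on a device whose only reference belongs to the action drives the counter to -1, the same value reported in the log line above; with `release_fixed()` the counter ends at 0 whichever side runs first.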
  9. 17 May, 2016 4 commits
  10. 11 May, 2016 1 commit