1. 26 9月, 2014 1 次提交
    • E
      net: sched: use pinned timers · 4a8e320c
      Eric Dumazet 提交于
      While using a MQ + NETEM setup, I had confirmation that the default
      timer migration ( /proc/sys/kernel/timer_migration ) is killing us.
      
      Installing this on a receiver side of a TCP_STREAM test, (NIC has 8 TX
      queues) :
      
      EST="est 1sec 4sec"
      for ETH in eth1
      do
       tc qd del dev $ETH root 2>/dev/null
       tc qd add dev $ETH root handle 1: mq
       tc qd add dev $ETH parent 1:1 $EST netem limit 70000 delay 6ms
       tc qd add dev $ETH parent 1:2 $EST netem limit 70000 delay 8ms
       tc qd add dev $ETH parent 1:3 $EST netem limit 70000 delay 10ms
       tc qd add dev $ETH parent 1:4 $EST netem limit 70000 delay 12ms
       tc qd add dev $ETH parent 1:5 $EST netem limit 70000 delay 14ms
       tc qd add dev $ETH parent 1:6 $EST netem limit 70000 delay 16ms
       tc qd add dev $ETH parent 1:7 $EST netem limit 80000 delay 18ms
       tc qd add dev $ETH parent 1:8 $EST netem limit 90000 delay 20ms
      done
      
      We can see that timers get migrated into a single cpu, presumably idle
      at the time timers are set up.
      Then all qdisc dequeues run from this cpu and huge lock contention
      happens. This single cpu is stuck in softirq mode and cannot dequeue
      fast enough.
      
          39.24%  [kernel]          [k] _raw_spin_lock
           2.65%  [kernel]          [k] netem_enqueue
           1.80%  [kernel]          [k] netem_dequeue
           1.63%  [kernel]          [k] copy_user_enhanced_fast_string
           1.45%  [kernel]          [k] _raw_spin_lock_bh
      
      By pinning qdisc timers on the cpu running the qdisc, we respect proper
      XPS setting and remove this lock contention.
      
           5.84%  [kernel]          [k] netem_enqueue
           4.83%  [kernel]          [k] _raw_spin_lock
           2.92%  [kernel]          [k] copy_user_enhanced_fast_string
      
      Current Qdiscs that benefit from this change are :
      
      	netem, cbq, fq, hfsc, tbf, htb.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4a8e320c
  2. 23 9月, 2014 4 次提交
  3. 20 9月, 2014 2 次提交
  4. 17 9月, 2014 7 次提交
    • J
      net: sched: cls_cgroup need tcf_exts_init in all cases · 9f6c38e7
      John Fastabend 提交于
      This ensures the tcf_exts_init() is called for all cases.
      
      Fixes: 952313bd ("net: sched: cls_cgroup use RCU")
      Signed-off-by: NJohn Fastabend <john.r.fastabend@intel.com>
      Acked-by: NCong Wang <cwang@twopensource.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9f6c38e7
    • J
      net: sched: cls_fw: add missing tcf_exts_init call in fw_change() · e1f93eb0
      John Fastabend 提交于
      When allocating a new structure we also need to call tcf_exts_init
      to initialize exts.
      
      A follow up patch might be in order to remove some of this code
      and do tcf_exts_assign(). With this we could remove the
      tcf_exts_init/tcf_exts_change pattern for some of the classifiers.
      As part of the future tcf_actions RCU series this will need to be
      done. For now fix the call here.
      
      Fixes e35a8ee5 ("net: sched: fw use RCU")
      Signed-off-by: NJohn Fastabend <john.r.fastabend@intel.com>
      Acked-by: NCong Wang <cwang@twopensource.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e1f93eb0
    • J
      net: sched: cls_cgroup fix possible memory leak of 'new' · d14cbfc8
      John Fastabend 提交于
      tree:   git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git master
      head:   54996b52
      commit: c7953ef2 [625/646] net: sched: cls_cgroup use RCU
      
      net/sched/cls_cgroup.c:130 cls_cgroup_change() warn: possible memory leak of 'new'
      net/sched/cls_cgroup.c:135 cls_cgroup_change() warn: possible memory leak of 'new'
      net/sched/cls_cgroup.c:139 cls_cgroup_change() warn: possible memory leak of 'new'
      
      Fixes: c7953ef2 ("net: sched: cls_cgroup use RCU")
      Reported-by: NDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: NJohn Fastabend <john.r.fastabend@intel.com>
      Acked-by: NCong Wang <cwang@twopensource.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d14cbfc8
    • J
      net: sched: cls_u32 add missing rcu_assign_pointer and annotation · a96366bf
      John Fastabend 提交于
      Add missing rcu_assign_pointer and missing  annotation for ht_up
      in cls_u32.c
      
      Caught by kbuild bot,
      
      >> net/sched/cls_u32.c:378:36: sparse: incorrect type in initializer (different address spaces)
         net/sched/cls_u32.c:378:36:    expected struct tc_u_hnode *ht
         net/sched/cls_u32.c:378:36:    got struct tc_u_hnode [noderef] <asn:4>*ht_up
      >> net/sched/cls_u32.c:610:54: sparse: incorrect type in argument 4 (different address spaces)
         net/sched/cls_u32.c:610:54:    expected struct tc_u_hnode *ht
         net/sched/cls_u32.c:610:54:    got struct tc_u_hnode [noderef] <asn:4>*ht_up
      >> net/sched/cls_u32.c:684:18: sparse: incorrect type in assignment (different address spaces)
         net/sched/cls_u32.c:684:18:    expected struct tc_u_hnode [noderef] <asn:4>*ht_up
         net/sched/cls_u32.c:684:18:    got struct tc_u_hnode *[assigned] ht
      >> net/sched/cls_u32.c:359:18: sparse: dereference of noderef expression
      Signed-off-by: NJohn Fastabend <john.r.fastabend@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a96366bf
    • J
      net: sched: fix unsued cpu variable · 80aab73d
      John Fastabend 提交于
      kbuild test robot reported an unused variable cpu in cls_u32.c
      after the patch below. This happens when PERF and MARK config
      variables are disabled
      
      Fix this is to use separate variables for perf and mark
      and define the cpu variable inside the ifdef logic.
      
      Fixes: 459d5f62 ("net: sched: make cls_u32 per cpu")'
      Signed-off-by: NJohn Fastabend <john.r.fastabend@intel.com>
      Acked-by: NCong Wang <cwang@twopensource.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      80aab73d
    • W
      net_sched: fix a null pointer dereference in tcindex_set_parms() · 69301eaa
      WANG Cong 提交于
      This patch fixes the following crash:
      
      [   42.199159] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
      [   42.200027] IP: [<ffffffff817e3fc4>] tcindex_set_parms+0x45c/0x526
      [   42.200027] PGD d2319067 PUD d4ffe067 PMD 0
      [   42.200027] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
      [   42.200027] CPU: 0 PID: 541 Comm: tc Not tainted 3.17.0-rc4+ #603
      [   42.200027] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
      [   42.200027] task: ffff8800d22d2670 ti: ffff8800ce790000 task.ti: ffff8800ce790000
      [   42.200027] RIP: 0010:[<ffffffff817e3fc4>]  [<ffffffff817e3fc4>] tcindex_set_parms+0x45c/0x526
      [   42.200027] RSP: 0018:ffff8800ce793898  EFLAGS: 00010202
      [   42.200027] RAX: 0000000000000001 RBX: ffff8800d1786498 RCX: 0000000000000000
      [   42.200027] RDX: ffffffff82114ec8 RSI: ffffffff82114ec8 RDI: ffffffff82114ec8
      [   42.200027] RBP: ffff8800ce793958 R08: 00000000000080d0 R09: 0000000000000001
      [   42.200027] R10: ffff8800ce7939a0 R11: 0000000000000246 R12: ffff8800d017d238
      [   42.200027] R13: 0000000000000018 R14: ffff8800d017c6a0 R15: ffff8800d1786620
      [   42.200027] FS:  00007f4e24539740(0000) GS:ffff88011a600000(0000) knlGS:0000000000000000
      [   42.200027] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   42.200027] CR2: 0000000000000018 CR3: 00000000cff38000 CR4: 00000000000006f0
      [   42.200027] Stack:
      [   42.200027]  ffff8800ce0949f0 0000000000000000 0000000200000003 ffff880000000000
      [   42.200027]  ffff8800ce7938b8 ffff8800ce7938b8 0000000600000007 0000000000000000
      [   42.200027]  ffff8800ce7938d8 ffff8800ce7938d8 0000000600000007 ffff8800ce0949f0
      [   42.200027] Call Trace:
      [   42.200027]  [<ffffffff817e4169>] tcindex_change+0xdb/0xee
      [   42.200027]  [<ffffffff817c16ca>] tc_ctl_tfilter+0x44d/0x63f
      [   42.200027]  [<ffffffff8179d161>] rtnetlink_rcv_msg+0x181/0x194
      [   42.200027]  [<ffffffff8179cf9d>] ? rtnl_lock+0x17/0x19
      [   42.200027]  [<ffffffff8179cfe0>] ? __rtnl_unlock+0x17/0x17
      [   42.200027]  [<ffffffff817ee296>] netlink_rcv_skb+0x49/0x8b
      [   43.462494]  [<ffffffff8179cfc2>] rtnetlink_rcv+0x23/0x2a
      [   43.462494]  [<ffffffff817ec8df>] netlink_unicast+0xc7/0x148
      [   43.462494]  [<ffffffff817ed413>] netlink_sendmsg+0x5cb/0x63d
      [   43.462494]  [<ffffffff810ad781>] ? mark_lock+0x2e/0x224
      [   43.462494]  [<ffffffff817757b8>] __sock_sendmsg_nosec+0x25/0x27
      [   43.462494]  [<ffffffff81778165>] sock_sendmsg+0x57/0x71
      [   43.462494]  [<ffffffff81152bbd>] ? might_fault+0x57/0xa4
      [   43.462494]  [<ffffffff81152c06>] ? might_fault+0xa0/0xa4
      [   43.462494]  [<ffffffff81152bbd>] ? might_fault+0x57/0xa4
      [   43.462494]  [<ffffffff817838fd>] ? verify_iovec+0x69/0xb7
      [   43.462494]  [<ffffffff817784f8>] ___sys_sendmsg+0x21d/0x2bb
      [   43.462494]  [<ffffffff81009db3>] ? native_sched_clock+0x35/0x37
      [   43.462494]  [<ffffffff8109ab53>] ? sched_clock_local+0x12/0x72
      [   43.462494]  [<ffffffff810ad781>] ? mark_lock+0x2e/0x224
      [   43.462494]  [<ffffffff8109ada4>] ? sched_clock_cpu+0xa0/0xb9
      [   43.462494]  [<ffffffff810aee37>] ? __lock_acquire+0x5fe/0xde4
      [   43.462494]  [<ffffffff8119f570>] ? rcu_read_lock_held+0x36/0x38
      [   43.462494]  [<ffffffff8119f75a>] ? __fcheck_files.isra.7+0x4b/0x57
      [   43.462494]  [<ffffffff8119fbf2>] ? __fget_light+0x30/0x54
      [   43.462494]  [<ffffffff81779012>] __sys_sendmsg+0x42/0x60
      [   43.462494]  [<ffffffff81779042>] SyS_sendmsg+0x12/0x1c
      [   43.462494]  [<ffffffff819d24d2>] system_call_fastpath+0x16/0x1b
      
      'p->h' could be NULL while 'cp->h' is always update to date.
      
      Fixes: commit 331b7292 ("net: sched: RCU cls_tcindex")
      Cc: John Fastabend <john.fastabend@gmail.com>
      Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
      Acked-By: NJohn Fastabend <john.r.fastabend@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      69301eaa
    • W
      net_sched: fix memory leak in cls_tcindex · 44b75e43
      WANG Cong 提交于
      Fixes: commit 331b7292 ("net: sched: RCU cls_tcindex")
      Cc: John Fastabend <john.fastabend@gmail.com>
      Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
      Acked-By: NJohn Fastabend <john.r.fastabend@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      44b75e43
  5. 16 9月, 2014 4 次提交
  6. 14 9月, 2014 22 次提交