1. 23 10月, 2018 1 次提交
  2. 17 9月, 2018 2 次提交
  3. 01 9月, 2018 1 次提交
  4. 22 8月, 2018 1 次提交
  5. 14 8月, 2018 1 次提交
  6. 12 8月, 2018 1 次提交
  7. 08 7月, 2018 5 次提交
  8. 28 3月, 2018 1 次提交
  9. 22 3月, 2018 1 次提交
    • D
      net/sched: fix idr leak in the error path of tcf_act_police_init() · 5bf7f818
      Davide Caratti 提交于
      tcf_act_police_init() can fail after the idr has been successfully
      reserved (e.g., qdisc_get_rtab() may return NULL). When this happens,
      subsequent attempts to configure a police rule using the same idr value
      systematiclly fail with -ENOSPC:
      
       # tc action add action police rate 1000 burst 1000 drop index 100
       RTNETLINK answers: Cannot allocate memory
       We have an error talking to the kernel
       # tc action add action police rate 1000 burst 1000 drop index 100
       RTNETLINK answers: No space left on device
       We have an error talking to the kernel
       # tc action add action police rate 1000 burst 1000 drop index 100
       RTNETLINK answers: No space left on device
       ...
      
      Fix this in the error path of tcf_act_police_init(), calling
      tcf_idr_release() in place of tcf_idr_cleanup().
      
      Fixes: 65a206c0 ("net/sched: Change act_api and act_xxx modules to use IDR")
      Acked-by: NJamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: NDavide Caratti <dcaratti@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5bf7f818
  10. 28 2月, 2018 1 次提交
    • K
      net: Convert tc_action_net_init() and tc_action_net_exit() based pernet_operations · 685ecfb1
      Kirill Tkhai 提交于
      These pernet_operations are from net/sched directory, and they call only
      tc_action_net_init() and tc_action_net_exit():
      
      bpf_net_ops
      connmark_net_ops
      csum_net_ops
      gact_net_ops
      ife_net_ops
      ipt_net_ops
      xt_net_ops
      mirred_net_ops
      nat_net_ops
      pedit_net_ops
      police_net_ops
      sample_net_ops
      simp_net_ops
      skbedit_net_ops
      skbmod_net_ops
      tunnel_key_net_ops
      vlan_net_ops
      
      1)tc_action_net_init() just allocates and initializes per-net memory.
      2)There should not be in-flight packets at the time of tc_action_net_exit()
      call, or another pernet_operations send packets to dying net (except
      netlink). So, it seems they can be marked as async.
      Signed-off-by: NKirill Tkhai <ktkhai@virtuozzo.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      685ecfb1
  11. 17 2月, 2018 4 次提交
  12. 22 12月, 2017 1 次提交
  13. 14 12月, 2017 1 次提交
  14. 09 11月, 2017 1 次提交
  15. 03 11月, 2017 1 次提交
  16. 31 8月, 2017 1 次提交
  17. 15 6月, 2017 1 次提交
  18. 14 4月, 2017 1 次提交
  19. 06 12月, 2016 1 次提交
    • E
      net_sched: gen_estimator: complete rewrite of rate estimators · 1c0d32fd
      Eric Dumazet 提交于
      1) Old code was hard to maintain, due to complex lock chains.
         (We probably will be able to remove some kfree_rcu() in callers)
      
      2) Using a single timer to update all estimators does not scale.
      
      3) Code was buggy on 32bit kernel (WRITE_ONCE() on 64bit quantity
         is not supposed to work well)
      
      In this rewrite :
      
      - I removed the RB tree that had to be scanned in
        gen_estimator_active(). qdisc dumps should be much faster.
      
      - Each estimator has its own timer.
      
      - Estimations are maintained in net_rate_estimator structure,
        instead of dirtying the qdisc. Minor, but part of the simplification.
      
      - Reading the estimator uses RCU and a seqcount to provide proper
        support for 32bit kernels.
      
      - We reduce memory need when estimators are not used, since
        we store a pointer, instead of the bytes/packets counters.
      
      - xt_rateest_mt() no longer has to grab a spinlock.
        (In the future, xt_rateest_tg() could be switched to per cpu counters)
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1c0d32fd
  20. 18 11月, 2016 1 次提交
    • A
      netns: make struct pernet_operations::id unsigned int · c7d03a00
      Alexey Dobriyan 提交于
      Make struct pernet_operations::id unsigned.
      
      There are 2 reasons to do so:
      
      1)
      This field is really an index into an zero based array and
      thus is unsigned entity. Using negative value is out-of-bound
      access by definition.
      
      2)
      On x86_64 unsigned 32-bit data which are mixed with pointers
      via array indexing or offsets added or subtracted to pointers
      are preffered to signed 32-bit data.
      
      "int" being used as an array index needs to be sign-extended
      to 64-bit before being used.
      
      	void f(long *p, int i)
      	{
      		g(p[i]);
      	}
      
        roughly translates to
      
      	movsx	rsi, esi
      	mov	rdi, [rsi+...]
      	call 	g
      
      MOVSX is 3 byte instruction which isn't necessary if the variable is
      unsigned because x86_64 is zero extending by default.
      
      Now, there is net_generic() function which, you guessed it right, uses
      "int" as an array index:
      
      	static inline void *net_generic(const struct net *net, int id)
      	{
      		...
      		ptr = ng->ptr[id - 1];
      		...
      	}
      
      And this function is used a lot, so those sign extensions add up.
      
      Patch snipes ~1730 bytes on allyesconfig kernel (without all junk
      messing with code generation):
      
      	add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730)
      
      Unfortunately some functions actually grow bigger.
      This is a semmingly random artefact of code generation with register
      allocator being used differently. gcc decides that some variable
      needs to live in new r8+ registers and every access now requires REX
      prefix. Or it is shifted into r12, so [r12+0] addressing mode has to be
      used which is longer than [r8]
      
      However, overall balance is in negative direction:
      
      	add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730)
      	function                                     old     new   delta
      	nfsd4_lock                                  3886    3959     +73
      	tipc_link_build_proto_msg                   1096    1140     +44
      	mac80211_hwsim_new_radio                    2776    2808     +32
      	tipc_mon_rcv                                1032    1058     +26
      	svcauth_gss_legacy_init                     1413    1429     +16
      	tipc_bcbase_select_primary                   379     392     +13
      	nfsd4_exchange_id                           1247    1260     +13
      	nfsd4_setclientid_confirm                    782     793     +11
      		...
      	put_client_renew_locked                      494     480     -14
      	ip_set_sockfn_get                            730     716     -14
      	geneve_sock_add                              829     813     -16
      	nfsd4_sequence_done                          721     703     -18
      	nlmclnt_lookup_host                          708     686     -22
      	nfsd4_lockt                                 1085    1063     -22
      	nfs_get_client                              1077    1050     -27
      	tcf_bpf_init                                1106    1076     -30
      	nfsd4_encode_fattr                          5997    5930     -67
      	Total: Before=154856051, After=154854321, chg -0.00%
      Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c7d03a00
  21. 20 9月, 2016 2 次提交
  22. 18 8月, 2016 2 次提交
  23. 26 7月, 2016 2 次提交
    • W
      net_sched: get rid of struct tcf_common · ec0595cc
      WANG Cong 提交于
      After the previous patch, struct tc_action should be enough
      to represent the generic tc action, tcf_common is not necessary
      any more. This patch gets rid of it to make tc action code
      more readable.
      
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ec0595cc
    • W
      net_sched: move tc_action into tcf_common · a85a970a
      WANG Cong 提交于
      struct tc_action is confusing, currently we use it for two purposes:
      1) Pass in arguments and carry out results from helper functions
      2) A generic representation for tc actions
      
      The first one is error-prone, since we need to make sure we don't
      miss anything. This patch aims to get rid of this use, by moving
      tc_action into tcf_common, so that they are allocated together
      in hashtable and can be cast'ed easily.
      
      And together with the following patch, we could really make
      tc_action a generic representation for all tc actions and each
      type of action can inherit from it.
      
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a85a970a
  24. 15 6月, 2016 1 次提交
  25. 08 6月, 2016 3 次提交
    • W
      act_police: fix a crash during removal · a03e6fe5
      WANG Cong 提交于
      The police action is using its own code to initialize tcf hash
      info, which makes us to forgot to initialize a->hinfo correctly.
      Fix this by calling the helper function tcf_hash_create() directly.
      
      This patch fixed the following crash:
      
       BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
       IP: [<ffffffff810c099f>] __lock_acquire+0xd3/0xf91
       PGD d3c34067 PUD d3e18067 PMD 0
       Oops: 0000 [#1] SMP
       CPU: 2 PID: 853 Comm: tc Not tainted 4.6.0+ #87
       Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
       task: ffff8800d3e28040 ti: ffff8800d3f6c000 task.ti: ffff8800d3f6c000
       RIP: 0010:[<ffffffff810c099f>]  [<ffffffff810c099f>] __lock_acquire+0xd3/0xf91
       RSP: 0000:ffff88011b203c80  EFLAGS: 00010002
       RAX: 0000000000000046 RBX: 0000000000000000 RCX: 0000000000000000
       RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000028
       RBP: ffff88011b203d40 R08: 0000000000000001 R09: 0000000000000000
       R10: ffff88011b203d58 R11: ffff88011b208000 R12: 0000000000000001
       R13: ffff8800d3e28040 R14: 0000000000000028 R15: 0000000000000000
       FS:  0000000000000000(0000) GS:ffff88011b200000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: 0000000000000028 CR3: 00000000d4be1000 CR4: 00000000000006e0
       Stack:
        ffff8800d3e289c0 0000000000000046 000000001b203d60 ffffffff00000000
        0000000000000000 ffff880000000000 0000000000000000 ffffffff00000000
        ffffffff8187142c ffff88011b203ce8 ffff88011b203ce8 ffffffff8101dbfc
       Call Trace:
        <IRQ>
        [<ffffffff8187142c>] ? __tcf_hash_release+0x77/0xd1
        [<ffffffff8101dbfc>] ? native_sched_clock+0x1a/0x35
        [<ffffffff8101dbfc>] ? native_sched_clock+0x1a/0x35
        [<ffffffff810a9604>] ? sched_clock_local+0x11/0x78
        [<ffffffff810bf6a1>] ? mark_lock+0x24/0x201
        [<ffffffff810c1dbd>] lock_acquire+0x120/0x1b4
        [<ffffffff810c1dbd>] ? lock_acquire+0x120/0x1b4
        [<ffffffff8187142c>] ? __tcf_hash_release+0x77/0xd1
        [<ffffffff81aad89f>] _raw_spin_lock_bh+0x3c/0x72
        [<ffffffff8187142c>] ? __tcf_hash_release+0x77/0xd1
        [<ffffffff8187142c>] __tcf_hash_release+0x77/0xd1
        [<ffffffff81871a27>] tcf_action_destroy+0x49/0x7c
        [<ffffffff81870b1c>] tcf_exts_destroy+0x20/0x2d
        [<ffffffff8189273b>] u32_destroy_key+0x1b/0x4d
        [<ffffffff81892788>] u32_delete_key_freepf_rcu+0x1b/0x1d
        [<ffffffff810de3b8>] rcu_process_callbacks+0x610/0x82e
        [<ffffffff8189276d>] ? u32_destroy_key+0x4d/0x4d
        [<ffffffff81ab0bc1>] __do_softirq+0x191/0x3f4
      
      Fixes: ddf97ccd ("net_sched: add network namespace support for tc actions")
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a03e6fe5
    • E
      net: sched: do not acquire qdisc spinlock in qdisc/class stats dump · edb09eb1
      Eric Dumazet 提交于
      Large tc dumps (tc -s {qdisc|class} sh dev ethX) done by Google BwE host
      agent [1] are problematic at scale :
      
      For each qdisc/class found in the dump, we currently lock the root qdisc
      spinlock in order to get stats. Sampling stats every 5 seconds from
      thousands of HTB classes is a challenge when the root qdisc spinlock is
      under high pressure. Not only the dumps take time, they also slow
      down the fast path (queue/dequeue packets) by 10 % to 20 % in some cases.
      
      An audit of existing qdiscs showed that sch_fq_codel is the only qdisc
      that might need the qdisc lock in fq_codel_dump_stats() and
      fq_codel_dump_class_stats()
      
      In v2 of this patch, I now use the Qdisc running seqcount to provide
      consistent reads of packets/bytes counters, regardless of 32/64 bit arches.
      
      I also changed rate estimators to use the same infrastructure
      so that they no longer need to lock root qdisc lock.
      
      [1]
      http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43838.pdfSigned-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Cong Wang <xiyou.wangcong@gmail.com>
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: John Fastabend <john.fastabend@gmail.com>
      Cc: Kevin Athey <kda@google.com>
      Cc: Xiaotian Pei <xiaotian@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      edb09eb1
    • J
      net sched actions: introduce timestamp for firsttime use · 53eb440f
      Jamal Hadi Salim 提交于
      Useful to know when the action was first used for accounting
      (and debugging)
      Signed-off-by: NJamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      53eb440f
  26. 25 5月, 2016 1 次提交
  27. 26 2月, 2016 1 次提交