1. 01 1月, 2015 8 次提交
    • A
      fib_trie: Update meaning of pos to represent unchecked bits · e9b44019
      Alexander Duyck 提交于
      This change moves the pos value to the other side of the "bits" field.  By
      doing this it actually simplifies a significant amount of code in the trie.
      
      For example when halving a tree we know that the bit lost exists at
      oldnode->pos, and if we inflate the tree the new bit being add is at
      tn->pos.  Previously to find those bits you would have to subtract pos and
      bits from the keylength or start with a value of (1 << 31) and then shift
      that.
      
      There are a number of spots throughout the code that benefit from this.  In
      the case of the hot-path searches the main advantage is that we can drop 2
      or more operations from the search path as we no longer need to compute the
      value for the index to be shifted by and can instead just use the raw pos
      value.
      
      In addition the tkey_extract_bits is now defunct and can be replaced by
      get_index since the two operations were doing the same thing, but now
      get_index does it much more quickly as it is only an xor and shift versus a
      pair of shifts and a subtraction.
      Signed-off-by: NAlexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e9b44019
    • A
      fib_trie: Optimize fib_table_insert · 836a0123
      Alexander Duyck 提交于
      This patch updates the fib_table_insert function to take advantage of the
      changes made to improve the performance of fib_table_lookup.  As a result
      the code should be smaller and run faster then the original.
      Signed-off-by: NAlexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      836a0123
    • A
      fib_trie: Optimize fib_find_node · 939afb06
      Alexander Duyck 提交于
      This patch makes use of the same features I made use of for
      fib_table_lookup to streamline fib_find_node.  The resultant code should be
      smaller and run faster than the original.
      Signed-off-by: NAlexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      939afb06
    • A
      fib_trie: Optimize fib_table_lookup to avoid wasting time on loops/variables · 9f9e636d
      Alexander Duyck 提交于
      This patch is meant to reduce the complexity of fib_table_lookup by reducing
      the number of variables to the bare minimum while still keeping the same if
      not improved functionality versus the original.
      
      Most of this change was started off by the desire to rid the function of
      chopped_off and current_prefix_length as they actually added very little to
      the function since they only applied when computing the cindex.  I was able
      to replace them mostly with just a check for the prefix match.  As long as
      the prefix between the key and the node being tested was the same we know
      we can search the tnode fully versus just testing cindex 0.
      
      The second portion of the change ended up being a massive reordering.
      Originally the calls to check_leaf were up near the start of the loop, and
      the backtracing and descending into lower levels of tnodes was later.  This
      didn't make much sense as the structure of the tree means the leaves are
      always the last thing to be tested.  As such I reordered things so that we
      instead have a loop that will delve into the tree and only exit when we
      have either found a leaf or we have exhausted the tree.  The advantage of
      rearranging things like this is that we can fully inline check_leaf since
      there is now only one reference to it in the function.
      Signed-off-by: NAlexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9f9e636d
    • A
      fib_trie: Merge leaf into tnode · adaf9816
      Alexander Duyck 提交于
      This change makes it so that leaf and tnode are the same struct.  As a
      result there is no need for rt_trie_node anymore since everyting can be
      merged into tnode.
      
      On 32b systems this results in the leaf being 4 bytes larger, however I
      don't know if that is really an issue as this and an eariler patch that
      added bits & pos have increased the size from 20 to 28.  If I am not
      mistaken slub/slab allocate on power of 2 sizes so 20 was likely being
      rounded up to 32 anyway.
      Signed-off-by: NAlexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      adaf9816
    • A
      fib_trie: Merge tnode_free and leaf_free into node_free · 37fd30f2
      Alexander Duyck 提交于
      Both the leaf and the tnode had an rcu_head in them, but they had them in
      slightly different places.  Since we now have them in the same spot and
      know that any node with bits == 0 is a leaf and the rest are either vmalloc
      or kmalloc tnodes depending on the value of bits it makes it easy to combine
      the functions and reduce overhead.
      
      In addition I have taken advantage of the rcu_head pointer to go ahead and
      put together a simple linked list instead of using the tnode pointer as
      this way we can merge either type of structure for freeing.
      Signed-off-by: NAlexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      37fd30f2
    • A
      fib_trie: Make leaf and tnode more uniform · 64c9b6fb
      Alexander Duyck 提交于
      This change makes some fundamental changes to the way leaves and tnodes are
      constructed.  The big differences are:
      1.  Leaves now populate pos and bits indicating their full key size.
      2.  Trie nodes now mask out their lower bits to be consistent with the leaf
      3.  Both structures have been reordered so that rt_trie_node now consisists
          of a much larger region including the pos, bits, and rcu portions of
          the tnode structure.
      
      On 32b systems this will result in the leaf being 4B larger as the pos and
      bits values were added to a hole created by the key as it was only 4B in
      length.
      Signed-off-by: NAlexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      64c9b6fb
    • A
      fib_trie: Update usage stats to be percpu instead of global variables · 8274a97a
      Alexander Duyck 提交于
      The trie usage stats were currently being shared by all threads that were
      calling fib_table_lookup.  As a result when multiple threads were
      performing lookups simultaneously the trie would begin to cache bounce
      between those threads.
      
      In order to prevent this I have updated the usage stats to use a set of
      percpu variables.  By doing this we should be able to avoid the cache
      bouncing and still make use of these stats.
      Signed-off-by: NAlexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8274a97a
  2. 12 12月, 2014 1 次提交
    • A
      fib_trie: Fix trie balancing issue if new node pushes down existing node · e962f302
      Alexander Duyck 提交于
      This patch addresses an issue with the level compression of the fib_trie.
      Specifically in the case of adding a new leaf that triggers a new node to
      be added that takes the place of the old node.  The result is a trie where
      the 1 child tnode is on one side and one leaf is on the other which gives
      you a very deep trie.  Below is the script I used to generate a trie on
      dummy0 with a 10.X.X.X family of addresses.
      
        ip link add type dummy
        ipval=184549374
        bit=2
        for i in `seq 1 23`
        do
          ifconfig dummy0:$bit $ipval/8
          ipval=`expr $ipval - $bit`
          bit=`expr $bit \* 2`
        done
        cat /proc/net/fib_triestat
      
      Running the script before the patch:
      
      	Local:
      		Aver depth:     10.82
      		Max depth:      23
      		Leaves:         29
      		Prefixes:       30
      		Internal nodes: 27
      		  1: 26  2: 1
      		Pointers: 56
      	Null ptrs: 1
      	Total size: 5  kB
      
      After applying the patch and repeating:
      
      	Local:
      		Aver depth:     4.72
      		Max depth:      9
      		Leaves:         29
      		Prefixes:       30
      		Internal nodes: 12
      		  1: 3  2: 2  3: 7
      		Pointers: 70
      	Null ptrs: 30
      	Total size: 4  kB
      
      What this fix does is start the rebalance at the newly created tnode
      instead of at the parent tnode.  This way if there is a gap between the
      parent and the new node it doesn't prevent the new tnode from being
      coalesced with any pre-existing nodes that may have been pushed into one
      of the new nodes child branches.
      Signed-off-by: NAlexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e962f302
  3. 07 8月, 2014 1 次提交
    • K
      list: fix order of arguments for hlist_add_after(_rcu) · 1d023284
      Ken Helias 提交于
      All other add functions for lists have the new item as first argument
      and the position where it is added as second argument.  This was changed
      for no good reason in this function and makes using it unnecessary
      confusing.
      
      The name was changed to hlist_add_behind() to cause unconverted code to
      generate a compile error instead of using the wrong parameter order.
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: NKen Helias <kenhelias@firemail.de>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	[intel driver bits]
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      1d023284
  4. 05 6月, 2014 1 次提交
    • S
      net: Revert "fib_trie: use seq_file_net rather than seq->private" · f830b022
      Sasha Levin 提交于
      This reverts commit 30f38d2f.
      
      fib_triestat is surrounded by a big lie: while it claims that it's a
      seq_file (fib_triestat_seq_open, fib_triestat_seq_show), it isn't:
      
      	static const struct file_operations fib_triestat_fops = {
      	        .owner  = THIS_MODULE,
      	        .open   = fib_triestat_seq_open,
      	        .read   = seq_read,
      	        .llseek = seq_lseek,
      	        .release = single_release_net,
      	};
      
      Yes, fib_triestat is just a regular file.
      
      A small detail (assuming CONFIG_NET_NS=y) is that while for seq_files
      you could do seq_file_net() to get the net ptr, doing so for a regular
      file would be wrong and would dereference an invalid pointer.
      
      The fib_triestat lie claimed a victim, and trying to show the file would
      be bad for the kernel. This patch just reverts the issue and fixes
      fib_triestat, which still needs a rewrite to either be a seq_file or
      stop claiming it is.
      Signed-off-by: NSasha Levin <sasha.levin@oracle.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f830b022
  5. 03 6月, 2014 1 次提交
  6. 15 11月, 2013 1 次提交
  7. 10 10月, 2013 1 次提交
  8. 03 10月, 2013 1 次提交
    • B
      fib_trie: avoid a redundant bit judgement in inflate · bbe34cf8
      baker.zhang 提交于
      Because 'node' is the i'st child of 'oldnode',
      thus, here 'i' equals
      tkey_extract_bits(node->key, oldtnode->pos, oldtnode->bits)
      
      we just get 1 more bit,
      and need not care the detail value of this bits.
      
      I apologize for the mistake.
      
      I generated the patch on a branch version,
      and did not notice the put_child has been changed.
      
      I have redone the test on HEAD version with my patch.
      
      two cases are used.
      case 1. inflate a node which has a leaf child node.
      case 2: inflate a node which has a an child node with skipped bits
      
      test env:
        ip link set eth0 up
        ip a add dev eth0 192.168.11.1/32
      here, we just focus on route table(MAIN),
      so I use a "192.168.11.1/32" address to simplify the test case.
      
      call trace:
      + fib_insert_node
      + + trie_rebalance
      + + + resize
      + + + + inflate
      
      Test case 1:  inflate a node which has a leaf child node.
      
      ===========================================================
      step 1. prepare a fib trie
      ------------------------------------------
        ip r a 192.168.0.0/24 via 192.168.11.1
        ip r a 192.168.1.0/24 via 192.168.11.1
      
      we get a fib trie.
      root@baker:~# cat /proc/net/fib_trie
      Main:
        +-- 192.168.0.0/23 1 0 0
         |-- 192.168.0.0
          /24 universe UNICAST
         |-- 192.168.1.0
          /24 universe UNICAST
      Local:
      .....
      
      step 2. Add the third route
      ------------------------------------------
      root@baker:~# ip r a 192.168.2.0/24 via 192.168.11.1
      
      A fib_trie leaf will be inserted in fib_insert_node before trie_rebalance.
      
      For function 'inflate':
      'inflate' is called with following trie.
        +-- 192.168.0.0/22 1 1 0 <=== tn node
          +-- 192.168.0.0/23 1 0 0    <== node a
              |-- 192.168.0.0
                /24 universe UNICAST
              |-- 192.168.1.0
                /24 universe UNICAST
            |-- 192.168.2.0          <== leaf(node b)
      
      When process node b, which is a leaf. here:
      i is 1,
      node key "192.168.2.0"
      oldnode is (pos:22, bits:1)
      
      unpatch source:
      tkey_extract_bits(node->key, oldtnode->pos + oldtnode->bits, 1)
      it equals:
      tkey_extract_bits("192.168,2,0", 22 + 1, 1)
      
      thus got 0, and call put_child(tn, 2*i, node); <== 2*i=2.
      
      patched source:
      tkey_extract_bits(node->key, oldtnode->pos, oldtnode->bits + 1),
      tkey_extract_bits("192.168,2,0", 22, 1 + 1)  <== get 2.
      
      Test case 2:  inflate a node which has a an child node with skipped bits
      ==========================================================================
      step 1. prepare a fib trie.
        ip link set eth0 up
        ip a add dev eth0 192.168.11.1/32
        ip r a 192.168.128.0/24 via 192.168.11.1
        ip r a 192.168.0.0/24  via 192.168.11.1
        ip r a 192.168.16.0/24   via 192.168.11.1
        ip r a 192.168.32.0/24  via 192.168.11.1
        ip r a 192.168.48.0/24  via 192.168.11.1
        ip r a 192.168.144.0/24   via 192.168.11.1
        ip r a 192.168.160.0/24   via 192.168.11.1
        ip r a 192.168.176.0/24   via 192.168.11.1
      
      check:
      root@baker:~# cat /proc/net/fib_trie
      Main:
        +-- 192.168.0.0/16 1 0 0
           +-- 192.168.0.0/18 2 0 0
              |-- 192.168.0.0
                 /24 universe UNICAST
              |-- 192.168.16.0
                 /24 universe UNICAST
              |-- 192.168.32.0
                 /24 universe UNICAST
              |-- 192.168.48.0
                 /24 universe UNICAST
           +-- 192.168.128.0/18 2 0 0
              |-- 192.168.128.0
                 /24 universe UNICAST
              |-- 192.168.144.0
                 /24 universe UNICAST
              |-- 192.168.160.0
                 /24 universe UNICAST
              |-- 192.168.176.0
                 /24 universe UNICAST
      Local:
        ...
      
      step 2. add a route to trigger inflate.
        ip r a 192.168.96.0/24   via 192.168.11.1
      
      This command will call serveral times inflate.
      In the first time, the fib_trie is:
      ________________________
      +-- 192.168.128.0/(16, 1) <== tn node
       +-- 192.168.0.0/(17, 1)  <== node a
        +-- 192.168.0.0/(18, 2)
         |-- 192.168.0.0
         |-- 192.168.16.0
         |-- 192.168.32.0
         |-- 192.168.48.0
        |-- 192.168.96.0
       +-- 192.168.128.0/(18, 2) <== node b.
        |-- 192.168.128.0
        |-- 192.168.144.0
        |-- 192.168.160.0
        |-- 192.168.176.0
      
      NOTE: node b is a interal node with skipped bits.
      here,
      i:1,
      node->key "192.168.128.0",
      oldnode:(pos:16, bits:1)
      so
      tkey_extract_bits(node->key, oldtnode->pos + oldtnode->bits, 1)
      it equals:
      tkey_extract_bits("192.168,128,0", 16 + 1, 1) <=== 0
      
      tkey_extract_bits(node->key, oldtnode->pos, oldtnode->bits, 1)
      it equals:
      tkey_extract_bits("192.168,128,0", 16, 1+1) <=== 2
      
      2*i + 0 == 2, so the result is same.
      Signed-off-by: Nbaker.zhang <baker.kernel@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      bbe34cf8
  9. 06 8月, 2013 1 次提交
  10. 25 7月, 2013 1 次提交
  11. 06 5月, 2013 1 次提交
  12. 28 2月, 2013 1 次提交
    • S
      hlist: drop the node parameter from iterators · b67bfe0d
      Sasha Levin 提交于
      I'm not sure why, but the hlist for each entry iterators were conceived
      
              list_for_each_entry(pos, head, member)
      
      The hlist ones were greedy and wanted an extra parameter:
      
              hlist_for_each_entry(tpos, pos, head, member)
      
      Why did they need an extra pos parameter? I'm not quite sure. Not only
      they don't really need it, it also prevents the iterator from looking
      exactly like the list iterator, which is unfortunate.
      
      Besides the semantic patch, there was some manual work required:
      
       - Fix up the actual hlist iterators in linux/list.h
       - Fix up the declaration of other iterators based on the hlist ones.
       - A very small amount of places were using the 'node' parameter, this
       was modified to use 'obj->member' instead.
       - Coccinelle didn't handle the hlist_for_each_entry_safe iterator
       properly, so those had to be fixed up manually.
      
      The semantic patch which is mostly the work of Peter Senna Tschudin is here:
      
      @@
      iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host;
      
      type T;
      expression a,c,d,e;
      identifier b;
      statement S;
      @@
      
      -T b;
          <+... when != b
      (
      hlist_for_each_entry(a,
      - b,
      c, d) S
      |
      hlist_for_each_entry_continue(a,
      - b,
      c) S
      |
      hlist_for_each_entry_from(a,
      - b,
      c) S
      |
      hlist_for_each_entry_rcu(a,
      - b,
      c, d) S
      |
      hlist_for_each_entry_rcu_bh(a,
      - b,
      c, d) S
      |
      hlist_for_each_entry_continue_rcu_bh(a,
      - b,
      c) S
      |
      for_each_busy_worker(a, c,
      - b,
      d) S
      |
      ax25_uid_for_each(a,
      - b,
      c) S
      |
      ax25_for_each(a,
      - b,
      c) S
      |
      inet_bind_bucket_for_each(a,
      - b,
      c) S
      |
      sctp_for_each_hentry(a,
      - b,
      c) S
      |
      sk_for_each(a,
      - b,
      c) S
      |
      sk_for_each_rcu(a,
      - b,
      c) S
      |
      sk_for_each_from
      -(a, b)
      +(a)
      S
      + sk_for_each_from(a) S
      |
      sk_for_each_safe(a,
      - b,
      c, d) S
      |
      sk_for_each_bound(a,
      - b,
      c) S
      |
      hlist_for_each_entry_safe(a,
      - b,
      c, d, e) S
      |
      hlist_for_each_entry_continue_rcu(a,
      - b,
      c) S
      |
      nr_neigh_for_each(a,
      - b,
      c) S
      |
      nr_neigh_for_each_safe(a,
      - b,
      c, d) S
      |
      nr_node_for_each(a,
      - b,
      c) S
      |
      nr_node_for_each_safe(a,
      - b,
      c, d) S
      |
      - for_each_gfn_sp(a, c, d, b) S
      + for_each_gfn_sp(a, c, d) S
      |
      - for_each_gfn_indirect_valid_sp(a, c, d, b) S
      + for_each_gfn_indirect_valid_sp(a, c, d) S
      |
      for_each_host(a,
      - b,
      c) S
      |
      for_each_host_safe(a,
      - b,
      c, d) S
      |
      for_each_mesh_entry(a,
      - b,
      c, d) S
      )
          ...+>
      
      [akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c]
      [akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c]
      [akpm@linux-foundation.org: checkpatch fixes]
      [akpm@linux-foundation.org: fix warnings]
      [akpm@linux-foudnation.org: redo intrusive kvm changes]
      Tested-by: NPeter Senna Tschudin <peter.senna@gmail.com>
      Acked-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: NSasha Levin <sasha.levin@oracle.com>
      Cc: Wu Fengguang <fengguang.wu@intel.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Gleb Natapov <gleb@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b67bfe0d
  13. 19 2月, 2013 2 次提交
  14. 19 9月, 2012 1 次提交
  15. 11 9月, 2012 1 次提交
  16. 08 9月, 2012 1 次提交
  17. 15 8月, 2012 1 次提交
  18. 09 8月, 2012 1 次提交
  19. 08 8月, 2012 1 次提交
  20. 30 7月, 2012 2 次提交
  21. 13 7月, 2012 1 次提交
  22. 11 6月, 2012 1 次提交
  23. 04 6月, 2012 1 次提交
    • J
      net: Remove casts to same type · e3192690
      Joe Perches 提交于
      Adding casts of objects to the same type is unnecessary
      and confusing for a human reader.
      
      For example, this cast:
      
      	int y;
      	int *p = (int *)&y;
      
      I used the coccinelle script below to find and remove these
      unnecessary casts.  I manually removed the conversions this
      script produces of casts with __force and __user.
      
      @@
      type T;
      T *p;
      @@
      
      -	(T *)p
      +	p
      Signed-off-by: NJoe Perches <joe@perches.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e3192690
  24. 11 5月, 2012 1 次提交
  25. 29 3月, 2012 1 次提交
  26. 12 3月, 2012 1 次提交
    • J
      net: Convert printks to pr_<level> · 058bd4d2
      Joe Perches 提交于
      Use a more current kernel messaging style.
      
      Convert a printk block to print_hex_dump.
      Coalesce formats, align arguments.
      Use %s, __func__ instead of embedding function names.
      
      Some messages that were prefixed with <foo>_close are
      now prefixed with <foo>_fini.  Some ah4 and esp messages
      are now not prefixed with "ip ".
      
      The intent of this patch is to later add something like
        #define pr_fmt(fmt) "IPv4: " fmt.
      to standardize the output messages.
      
      Text size is trivially reduced. (x86-32 allyesconfig)
      
      $ size net/ipv4/built-in.o*
         text	   data	    bss	    dec	    hex	filename
       887888	  31558	 249696	1169142	 11d6f6	net/ipv4/built-in.o.new
       887934	  31558	 249800	1169292	 11d78c	net/ipv4/built-in.o.old
      Signed-off-by: NJoe Perches <joe@perches.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      058bd4d2
  27. 13 1月, 2012 1 次提交
  28. 05 12月, 2011 1 次提交
  29. 01 11月, 2011 1 次提交
  30. 02 8月, 2011 1 次提交
  31. 19 7月, 2011 1 次提交