1. 26 Jan, 2015 4 commits
    • fib_trie: Add collapse() and should_collapse() to resize · 95f60ea3
      Alexander Duyck committed
      This patch really does two things.
      
      First it pulls the logic for determining if we should collapse one node out
      of the tree and the actual code doing the collapse into a separate pair of
      functions.  This helps to make the changes to these areas more readable.
      
      Second it encodes the upper 32b of the empty_children value onto the
      full_children value in the case of bits == KEYLENGTH.  By doing this we are
      able to handle the case of a 32b node where empty_children would appear to
      be 0 when it was actually 1ul << 32.
      Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
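The second point of commit 95f60ea3 can be illustrated with a small userspace model. This is only a sketch: the struct `node_counts`, the helper names, and the exact carry convention are assumptions made for this example, not the kernel's definitions. Only `KEYLENGTH`, `bits`, `empty_children`, and `full_children` are taken from the commit message.

```c
#include <assert.h>
#include <stdint.h>

#define KEYLENGTH 32

/* Hypothetical stand-in for the child counters kept in a tnode. */
struct node_counts {
	uint8_t  bits;            /* node has 2^bits child slots */
	uint32_t empty_children;  /* low 32 bits of the empty-slot count */
	uint32_t full_children;   /* doubles as the carry when bits == KEYLENGTH */
};

/* Build the counters for a freshly allocated node: every slot is empty. */
static struct node_counts node_counts_init(uint8_t bits)
{
	struct node_counts n = { .bits = bits, .empty_children = 0, .full_children = 0 };

	assert(bits >= 1 && bits <= KEYLENGTH);
	if (bits == KEYLENGTH)
		n.full_children = 1;   /* 1 << 32 does not fit in 32 bits: keep the carry here */
	else
		n.empty_children = (uint32_t)1 << bits;
	return n;
}

/* Recover the real number of empty slots as a 64-bit value. */
static uint64_t node_counts_empty(const struct node_counts *n)
{
	uint64_t empty = n->empty_children;

	if (n->bits == KEYLENGTH)
		empty += (uint64_t)n->full_children << 32;
	return empty;
}

/* Slots in use = total slots minus empty slots. */
static uint64_t node_counts_used(const struct node_counts *n)
{
	return ((uint64_t)1 << n->bits) - node_counts_empty(n);
}
```

Without the carry, a freshly created 32b node would report `empty_children == 0` and look completely full, which is the case the commit message describes.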
    • fib_trie: Fall back to slen update on inflate/halve failure · a80e89d4
      Alexander Duyck committed
      This change corrects an issue where if inflate or halve fails we were
      exiting the resize function without at least updating the slen for the
      node.  To correct this I have moved the update of max_size into the while
      loop so that it is only decremented on a successful call to either inflate
      or halve.
      Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • fib_trie: Fix RCU bug and merge similar bits of inflate/halve · 69fa57b1
      Alexander Duyck committed
      This patch addresses two issues.
      
      The first issue is that I believe I had the RCU freeing sequence
      slightly out of order.  As a result we could get into an issue if a caller
      went into a child of a child of the new node, then backtraced into the
      to-be-freed parent, and then attempted to access a child of a child that may
      have been consumed in a resize of one of the new node's children.  To resolve
      this I have moved the resize to after we have freed the oldtnode.  The only
      side effect of this is that we will now be calling resize on more nodes in
      the case of inflate, because we don't have a good way to test whether a
      full_tnode on the new node was there before or after the allocation.  This
      should have minimal impact, however, since the node should already be
      correctly sized, so the only cost we take on is calling should_inflate
      on the node, which is only a couple of cycles.
      
      The second issue is the fact that inflate and halve were essentially doing
      the same thing after the new node was added to the trie replacing the old
      one.  As such it wasn't really necessary to keep the code in both functions
      so I have split it out into two other functions, called replace and
      update_children.
      Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • fib_trie: Use index & (~0ul << n->bits) instead of index >> n->bits · b3832117
      Alexander Duyck committed
      In doing performance testing and analysis of the changes I recently found
      that by shifting the index I had created an unnecessary dependency.
      
      I have updated the code so that we instead shift a mask by bits and then
      just test against that as that should save us about 2 CPU cycles since we
      can generate the mask while the key and pos are being processed.
      Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
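The two forms of the range test named in the title of commit b3832117 can be compared side by side. This is a minimal userspace sketch: the function names are hypothetical, and `uint64_t` is used in place of `unsigned long` (an assumption made so the example behaves the same on 32 bit and 64 bit hosts).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Old form: shift the index down and see whether anything is left over. */
static bool index_out_of_range_shift(uint64_t index, unsigned int bits)
{
	assert(bits < 64);  /* shifting by the full width would be undefined */
	return (index >> bits) != 0;
}

/* New form: build a mask of everything above 'bits' and test against it.
 * The mask depends only on 'bits', not on 'index', so it can be computed
 * while the index itself is still being produced. */
static bool index_out_of_range_mask(uint64_t index, unsigned int bits)
{
	assert(bits < 64);
	return (index & (~UINT64_C(0) << bits)) != 0;
}
```

Both functions return the same answer for every input; the difference the commit message points at is only in which values the final test has to wait for.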
  2. 01 Jan, 2015 17 commits
    • fib_trie: Add tracking value for suffix length · 5405afd1
      Alexander Duyck committed
      This change adds a tracking value for the maximum suffix length of all
      prefixes stored in any given tnode.  With this value we can determine if we
      need to backtrace or not based on if the suffix is greater than the pos
      value.
      
      By doing this we can reduce the CPU overhead for lookups in the local table
      as many of the prefixes there are 32b long and have a suffix length of 0
      meaning we can immediately backtrace to the root node without needing to
      test any of the nodes between it and where we ended up.
      Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • fib_trie: Remove checks for index >= tnode_child_length from tnode_get_child · 21d1f11d
      Alexander Duyck committed
      For some reason the compiler doesn't seem to understand that when we are in
      a loop that runs from tnode_child_length - 1 to 0 we don't expect the value
      of tn->bits to change.  As such every call to tnode_get_child was rerunning
      tnode_child_length, which ended up consuming quite a bit of space in the
      resultant assembly code.
      
      I have gone through and verified that in all cases where tnode_get_child
      is used we are either winding through a fixed loop from tnode_child_length -
      1 to 0, or are in a fastpath case where we are verifying the value by
      either checking for any remaining bits after shifting index by bits and
      testing for leaf, or by using tnode_child_length.
      
      size net/ipv4/fib_trie.o
      Before:
         text	   data	    bss	    dec	    hex	filename
        15506	    376	      8	  15890	   3e12	net/ipv4/fib_trie.o
      
      After:
         text	   data	    bss	    dec	    hex	filename
        14827	    376	      8	  15211	   3b6b	net/ipv4/fib_trie.o
      Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • fib_trie: inflate/halve nodes in a more RCU friendly way · 12c081a5
      Alexander Duyck committed
      This change pulls the node_set_parent functionality out of put_child_reorg
      and instead leaves that to the function to take care of as well.  By doing
      this we can fully construct the new cluster of tnodes and all of the
      pointers out of it before we start routing pointers into it.
      
      I suspect this will fix some concurrency issues, though I don't
      have a good test to show as such.
      Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • fib_trie: Push tnode flushing down to inflate/halve · fc86a93b
      Alexander Duyck committed
      This change pushes the tnode freeing down into the inflate and halve
      functions.  It makes more sense here as we have a better grasp of what is
      going on and when a given cluster of nodes is ready to be freed.
      
      I believe this may address a bug in the freeing logic as well.  For some
      reason if the freelist got to a certain size we would call
      synchronize_rcu().  I'm assuming that what they meant to do is call
      synchronize_rcu() after they had handed off that much memory via
      call_rcu().  As such that is what I have updated the behavior to be.
      Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • fib_trie: Push assignment of child to parent down into inflate/halve · ff181ed8
      Alexander Duyck committed
      This change makes it so that the assignment of the tnode to the parent is
      handled directly within whatever function is currently handling the node be
      it inflate, halve, or resize.  By doing this we can avoid some of the need
      to set NULL pointers in the tree while we are resizing the subnodes.
      Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • fib_trie: Add functions should_inflate and should_halve · f05a4819
      Alexander Duyck committed
      This change pulls the logic for if we should inflate/halve the nodes out
      into separate functions.  It also addresses what I believe is a bug where 1
      full node is all that is needed to keep a node from ever being halved.
      
      Simple script to reproduce the issue:
      	modprobe dummy;	ifconfig dummy0 up
      	for i in `seq 0 255`; do ifconfig dummy0:$i 10.0.${i}.1/24 up; done
      	ifconfig dummy0:256 10.0.255.33/16 up
      	for i in `seq 0 254`; do ifconfig dummy0:$i down; done
      
      Results from /proc/net/fib_triestat
      Before:
      	Local:
      		Aver depth:     3.00
      		Max depth:      4
      		Leaves:         17
      		Prefixes:       18
      		Internal nodes: 11
      		  1: 8  2: 2  10: 1
      		Pointers: 1048
      	Null ptrs: 1021
      	Total size: 11  kB
      After:
      	Local:
      		Aver depth:     3.41
      		Max depth:      5
      		Leaves:         17
      		Prefixes:       18
      		Internal nodes: 12
      		  1: 8  2: 3  3: 1
      		Pointers: 36
      	Null ptrs: 8
      	Total size: 3  kB
      Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • fib_trie: Move resize to after inflate/halve · cf3637bb
      Alexander Duyck committed
      This change consists of a cut/paste of resize to behind inflate and halve
      so that I could remove the two function prototypes.
      Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • fib_trie: Push rcu_read_lock/unlock to callers · 345e9b54
      Alexander Duyck committed
      This change is to start cleaning up some of the rcu_read_lock/unlock
      handling.  I realized while reviewing the code there are several spots that
      I don't believe are being handled correctly or are masking warnings by
      locally calling rcu_read_lock/unlock instead of calling them at the correct
      level.
      
      A common example is a call to fib_get_table followed by fib_table_lookup.
      The rcu_read_lock/unlock ought to wrap both but there are several spots where
      they were not wrapped.
      Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • fib_trie: Use unsigned long for anything dealing with a shift by bits · 98293e8d
      Alexander Duyck committed
      This change makes it so that anything that can be shifted by, or compared
      to a value shifted by bits is updated to be an unsigned long.  This is
      mostly a precaution against an insanely huge address space that somehow
      starts coming close to the 2^32 root node size which would require
      something like 1.5 billion addresses.
      
      I chose unsigned long instead of unsigned long long since I do not believe
      it is possible to allocate a 32 bit tnode on a 32 bit system as the memory
      consumed would be 16GB + 28B which exceeds the addressable space for any
      one process.
      Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
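The 16GB figure in commit 98293e8d follows from simple arithmetic, assuming 4-byte child pointers on a 32 bit system and taking the 28B tnode header size from the commit message as given:

```latex
\[
2^{32}\ \text{pointers} \times 4\,\mathrm{B} + 28\,\mathrm{B}
  = 2^{34}\,\mathrm{B} + 28\,\mathrm{B}
  = 16\,\mathrm{GiB} + 28\,\mathrm{B}
  \;>\; 2^{32}\,\mathrm{B} = 4\,\mathrm{GiB}
\]
```

The right-hand side is the largest space a 32 bit pointer can address, so such a node could never be allocated there.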
    • fib_trie: Update meaning of pos to represent unchecked bits · e9b44019
      Alexander Duyck committed
      This change moves the pos value to the other side of the "bits" field.  By
      doing this it actually simplifies a significant amount of code in the trie.
      
      For example when halving a tree we know that the bit lost exists at
      oldnode->pos, and if we inflate the tree the new bit being added is at
      tn->pos.  Previously to find those bits you would have to subtract pos and
      bits from the keylength or start with a value of (1 << 31) and then shift
      that.
      
      There are a number of spots throughout the code that benefit from this.  In
      the case of the hot-path searches the main advantage is that we can drop 2
      or more operations from the search path as we no longer need to compute the
      value for the index to be shifted by and can instead just use the raw pos
      value.
      
      In addition the tkey_extract_bits is now defunct and can be replaced by
      get_index since the two operations were doing the same thing, but now
      get_index does it much more quickly as it is only an xor and shift versus a
      pair of shifts and a subtraction.
      Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
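The "xor and shift" index computation mentioned in commit e9b44019 can be sketched in userspace as follows. The struct `tnode_sketch` and both function names are hypothetical and carry only the fields this example needs. The sketch assumes that `pos` counts the key bits below the node's index field and that the node's key has those low bits cleared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t t_key;

/* Hypothetical, trimmed-down node: only what the index computation needs. */
struct tnode_sketch {
	t_key key;           /* prefix, with the low (pos + bits) bits cleared */
	unsigned char pos;   /* number of key bits below this node's index field */
	unsigned char bits;  /* width of the index field */
};

/* One xor and one shift: bits that match the node's prefix cancel out,
 * and the shift drops the bits this node does not look at. */
static unsigned long get_index_sketch(t_key key, const struct tnode_sketch *tn)
{
	assert(tn->pos < 32);
	return (unsigned long)(key ^ tn->key) >> tn->pos;
}

/* If the key differs from the prefix above the index field, the xor leaves
 * a bit set above 'bits' and the index falls outside the child array. */
static bool key_in_node_sketch(t_key key, const struct tnode_sketch *tn)
{
	return get_index_sketch(key, tn) < (1ul << tn->bits);
}
```

For a node covering 192.168.0.0/22 with `pos = 8` and `bits = 2`, the key 192.168.2.0 yields index 2, while 192.169.0.0 yields an index well above 3 and is rejected.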
    • fib_trie: Optimize fib_table_insert · 836a0123
      Alexander Duyck committed
      This patch updates the fib_table_insert function to take advantage of the
      changes made to improve the performance of fib_table_lookup.  As a result
      the code should be smaller and run faster than the original.
      Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • fib_trie: Optimize fib_find_node · 939afb06
      Alexander Duyck committed
      This patch makes use of the same features I made use of for
      fib_table_lookup to streamline fib_find_node.  The resultant code should be
      smaller and run faster than the original.
      Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • fib_trie: Optimize fib_table_lookup to avoid wasting time on loops/variables · 9f9e636d
      Alexander Duyck committed
      This patch is meant to reduce the complexity of fib_table_lookup by reducing
      the number of variables to the bare minimum while still keeping the same if
      not improved functionality versus the original.
      
      Most of this change was started off by the desire to rid the function of
      chopped_off and current_prefix_length as they actually added very little to
      the function since they only applied when computing the cindex.  I was able
      to replace them mostly with just a check for the prefix match.  As long as
      the prefix between the key and the node being tested was the same we know
      we can search the tnode fully versus just testing cindex 0.
      
      The second portion of the change ended up being a massive reordering.
      Originally the calls to check_leaf were up near the start of the loop, and
      the backtracing and descending into lower levels of tnodes was later.  This
      didn't make much sense as the structure of the tree means the leaves are
      always the last thing to be tested.  As such I reordered things so that we
      instead have a loop that will delve into the tree and only exit when we
      have either found a leaf or we have exhausted the tree.  The advantage of
      rearranging things like this is that we can fully inline check_leaf since
      there is now only one reference to it in the function.
      Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • fib_trie: Merge leaf into tnode · adaf9816
      Alexander Duyck committed
      This change makes it so that leaf and tnode are the same struct.  As a
      result there is no need for rt_trie_node anymore since everything can be
      merged into tnode.
      
      On 32b systems this results in the leaf being 4 bytes larger, however I
      don't know if that is really an issue as this and an earlier patch that
      added bits & pos have increased the size from 20 to 28.  If I am not
      mistaken slub/slab allocate on power of 2 sizes so 20 was likely being
      rounded up to 32 anyway.
      Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • fib_trie: Merge tnode_free and leaf_free into node_free · 37fd30f2
      Alexander Duyck committed
      Both the leaf and the tnode had an rcu_head in them, but they had them in
      slightly different places.  Since we now have them in the same spot and
      know that any node with bits == 0 is a leaf and the rest are either vmalloc
      or kmalloc tnodes depending on the value of bits it makes it easy to combine
      the functions and reduce overhead.
      
      In addition I have taken advantage of the rcu_head pointer to go ahead and
      put together a simple linked list instead of using the tnode pointer as
      this way we can merge either type of structure for freeing.
      Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • fib_trie: Make leaf and tnode more uniform · 64c9b6fb
      Alexander Duyck committed
      This change makes some fundamental changes to the way leaves and tnodes are
      constructed.  The big differences are:
      1.  Leaves now populate pos and bits indicating their full key size.
      2.  Trie nodes now mask out their lower bits to be consistent with the leaf
      3.  Both structures have been reordered so that rt_trie_node now consists
          of a much larger region including the pos, bits, and rcu portions of
          the tnode structure.
      
      On 32b systems this will result in the leaf being 4B larger as the pos and
      bits values were added to a hole created by the key as it was only 4B in
      length.
      Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • fib_trie: Update usage stats to be percpu instead of global variables · 8274a97a
      Alexander Duyck committed
      The trie usage stats were currently being shared by all threads that were
      calling fib_table_lookup.  As a result when multiple threads were
      performing lookups simultaneously the trie would begin to cache bounce
      between those threads.
      
      In order to prevent this I have updated the usage stats to use a set of
      percpu variables.  By doing this we should be able to avoid the cache
      bouncing and still make use of these stats.
      Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 12 Dec, 2014 1 commit
    • fib_trie: Fix trie balancing issue if new node pushes down existing node · e962f302
      Alexander Duyck committed
      This patch addresses an issue with the level compression of the fib_trie.
      Specifically in the case of adding a new leaf that triggers a new node to
      be added that takes the place of the old node.  The result is a trie where
      the 1 child tnode is on one side and one leaf is on the other which gives
      you a very deep trie.  Below is the script I used to generate a trie on
      dummy0 with a 10.X.X.X family of addresses.
      
        ip link add type dummy
        ipval=184549374
        bit=2
        for i in `seq 1 23`
        do
          ifconfig dummy0:$bit $ipval/8
          ipval=`expr $ipval - $bit`
          bit=`expr $bit \* 2`
        done
        cat /proc/net/fib_triestat
      
      Running the script before the patch:
      
      	Local:
      		Aver depth:     10.82
      		Max depth:      23
      		Leaves:         29
      		Prefixes:       30
      		Internal nodes: 27
      		  1: 26  2: 1
      		Pointers: 56
      	Null ptrs: 1
      	Total size: 5  kB
      
      After applying the patch and repeating:
      
      	Local:
      		Aver depth:     4.72
      		Max depth:      9
      		Leaves:         29
      		Prefixes:       30
      		Internal nodes: 12
      		  1: 3  2: 2  3: 7
      		Pointers: 70
      	Null ptrs: 30
      	Total size: 4  kB
      
      What this fix does is start the rebalance at the newly created tnode
      instead of at the parent tnode.  This way if there is a gap between the
      parent and the new node it doesn't prevent the new tnode from being
      coalesced with any pre-existing nodes that may have been pushed into one
      of the new node's child branches.
      Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  4. 07 Aug, 2014 1 commit
    • list: fix order of arguments for hlist_add_after(_rcu) · 1d023284
      Ken Helias committed
      All other add functions for lists have the new item as first argument
      and the position where it is added as second argument.  This was changed
      for no good reason in this function and makes using it unnecessarily
      confusing.
      
      The name was changed to hlist_add_behind() to cause unconverted code to
      generate a compile error instead of using the wrong parameter order.
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: Ken Helias <kenhelias@firemail.de>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	[intel driver bits]
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
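The argument order described in commit 1d023284 (new item first, position second) can be seen in a small userspace re-creation of an hlist. This is a sketch modelled on the kernel's list helpers, not a copy of them; the struct layouts and function bodies here are assumptions written for illustration.

```c
#include <assert.h>
#include <stddef.h>

struct hlist_node {
	struct hlist_node *next;
	struct hlist_node **pprev;  /* address of the pointer that points at us */
};

struct hlist_head {
	struct hlist_node *first;
};

/* Insert n at the front of list h: new item first, position second. */
static void hlist_add_head(struct hlist_node *n, struct hlist_head *h)
{
	assert(n && h);
	n->next = h->first;
	if (h->first)
		h->first->pprev = &n->next;
	h->first = n;
	n->pprev = &h->first;
}

/* Insert n directly after prev, using the same argument order as above. */
static void hlist_add_behind(struct hlist_node *n, struct hlist_node *prev)
{
	assert(n && prev);
	n->next = prev->next;
	prev->next = n;
	n->pprev = &prev->next;
	if (n->next)
		n->next->pprev = &n->next;
}
```

Because both helpers take two pointers of the same type, swapping the arguments would still compile, which is why the commit renamed the function rather than silently changing its parameter order.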
  5. 05 Jun, 2014 1 commit
    • net: Revert "fib_trie: use seq_file_net rather than seq->private" · f830b022
      Sasha Levin committed
      This reverts commit 30f38d2f.
      
      fib_triestat is surrounded by a big lie: while it claims that it's a
      seq_file (fib_triestat_seq_open, fib_triestat_seq_show), it isn't:
      
      	static const struct file_operations fib_triestat_fops = {
      	        .owner  = THIS_MODULE,
      	        .open   = fib_triestat_seq_open,
      	        .read   = seq_read,
      	        .llseek = seq_lseek,
      	        .release = single_release_net,
      	};
      
      Yes, fib_triestat is just a regular file.
      
      A small detail (assuming CONFIG_NET_NS=y) is that while for seq_files
      you could do seq_file_net() to get the net ptr, doing so for a regular
      file would be wrong and would dereference an invalid pointer.
      
      The fib_triestat lie claimed a victim, and trying to show the file would
      be bad for the kernel. This patch just reverts the issue and fixes
      fib_triestat, which still needs a rewrite to either be a seq_file or
      stop claiming it is.
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  6. 03 Jun, 2014 1 commit
  7. 15 Nov, 2013 1 commit
  8. 10 Oct, 2013 1 commit
  9. 03 Oct, 2013 1 commit
    • fib_trie: avoid a redundant bit judgement in inflate · bbe34cf8
      baker.zhang committed
      Because 'node' is the i-th child of 'oldtnode',
      'i' equals
      tkey_extract_bits(node->key, oldtnode->pos, oldtnode->bits)
      
      so we only need to get 1 more bit,
      and need not care about the exact value of those bits.
      
      I apologize for the mistake.
      
      I generated the patch on a branch version,
      and did not notice the put_child has been changed.
      
      I have redone the test on HEAD version with my patch.
      
      Two cases are used:
      case 1: inflate a node which has a leaf child node.
      case 2: inflate a node which has a child node with skipped bits.
      
      test env:
        ip link set eth0 up
        ip a add dev eth0 192.168.11.1/32
      here, we just focus on route table(MAIN),
      so I use a "192.168.11.1/32" address to simplify the test case.
      
      call trace:
      + fib_insert_node
      + + trie_rebalance
      + + + resize
      + + + + inflate
      
      Test case 1:  inflate a node which has a leaf child node.
      
      ===========================================================
      step 1. prepare a fib trie
      ------------------------------------------
        ip r a 192.168.0.0/24 via 192.168.11.1
        ip r a 192.168.1.0/24 via 192.168.11.1
      
      we get a fib trie.
      root@baker:~# cat /proc/net/fib_trie
      Main:
        +-- 192.168.0.0/23 1 0 0
         |-- 192.168.0.0
          /24 universe UNICAST
         |-- 192.168.1.0
          /24 universe UNICAST
      Local:
      .....
      
      step 2. Add the third route
      ------------------------------------------
      root@baker:~# ip r a 192.168.2.0/24 via 192.168.11.1
      
      A fib_trie leaf will be inserted in fib_insert_node before trie_rebalance.
      
      For function 'inflate':
      'inflate' is called with following trie.
        +-- 192.168.0.0/22 1 1 0 <=== tn node
          +-- 192.168.0.0/23 1 0 0    <== node a
              |-- 192.168.0.0
                /24 universe UNICAST
              |-- 192.168.1.0
                /24 universe UNICAST
            |-- 192.168.2.0          <== leaf(node b)
      
      When processing node b, which is a leaf, we have:
      i is 1,
      node key "192.168.2.0"
      oldnode is (pos:22, bits:1)
      
      unpatched source:
      tkey_extract_bits(node->key, oldtnode->pos + oldtnode->bits, 1)
      it equals:
      tkey_extract_bits("192.168.2.0", 22 + 1, 1)
      
      thus we get 0, and call put_child(tn, 2*i, node); <== 2*i=2.
      
      patched source:
      tkey_extract_bits(node->key, oldtnode->pos, oldtnode->bits + 1),
      tkey_extract_bits("192.168.2.0", 22, 1 + 1)  <== get 2.
      
      Test case 2:  inflate a node which has a child node with skipped bits
      ==========================================================================
      step 1. prepare a fib trie.
        ip link set eth0 up
        ip a add dev eth0 192.168.11.1/32
        ip r a 192.168.128.0/24 via 192.168.11.1
        ip r a 192.168.0.0/24  via 192.168.11.1
        ip r a 192.168.16.0/24   via 192.168.11.1
        ip r a 192.168.32.0/24  via 192.168.11.1
        ip r a 192.168.48.0/24  via 192.168.11.1
        ip r a 192.168.144.0/24   via 192.168.11.1
        ip r a 192.168.160.0/24   via 192.168.11.1
        ip r a 192.168.176.0/24   via 192.168.11.1
      
      check:
      root@baker:~# cat /proc/net/fib_trie
      Main:
        +-- 192.168.0.0/16 1 0 0
           +-- 192.168.0.0/18 2 0 0
              |-- 192.168.0.0
                 /24 universe UNICAST
              |-- 192.168.16.0
                 /24 universe UNICAST
              |-- 192.168.32.0
                 /24 universe UNICAST
              |-- 192.168.48.0
                 /24 universe UNICAST
           +-- 192.168.128.0/18 2 0 0
              |-- 192.168.128.0
                 /24 universe UNICAST
              |-- 192.168.144.0
                 /24 universe UNICAST
              |-- 192.168.160.0
                 /24 universe UNICAST
              |-- 192.168.176.0
                 /24 universe UNICAST
      Local:
        ...
      
      step 2. add a route to trigger inflate.
        ip r a 192.168.96.0/24   via 192.168.11.1
      
      This command will call inflate several times.
      The first time, the fib_trie is:
      ________________________
      +-- 192.168.128.0/(16, 1) <== tn node
       +-- 192.168.0.0/(17, 1)  <== node a
        +-- 192.168.0.0/(18, 2)
         |-- 192.168.0.0
         |-- 192.168.16.0
         |-- 192.168.32.0
         |-- 192.168.48.0
        |-- 192.168.96.0
       +-- 192.168.128.0/(18, 2) <== node b.
        |-- 192.168.128.0
        |-- 192.168.144.0
        |-- 192.168.160.0
        |-- 192.168.176.0
      
      NOTE: node b is an internal node with skipped bits.
      here,
      i:1,
      node->key "192.168.128.0",
      oldnode:(pos:16, bits:1)
      so
      tkey_extract_bits(node->key, oldtnode->pos + oldtnode->bits, 1)
      it equals:
      tkey_extract_bits("192.168.128.0", 16 + 1, 1) <=== 0
      
      tkey_extract_bits(node->key, oldtnode->pos, oldtnode->bits + 1)
      it equals:
      tkey_extract_bits("192.168.128.0", 16, 1+1) <=== 2
      
      2*i + 0 == 2, so the result is the same.
      Signed-off-by: baker.zhang <baker.kernel@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
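The equivalence argued in commit bbe34cf8 can be checked with a short userspace program. This is a sketch under stated assumptions: the body of `tkey_extract_bits` is a reconstruction of the helper named in the commit message (it extracts `bits` bits starting `offset` bits from the most significant end of the key), and the two `new_index_*` functions are hypothetical names for the unpatched and patched expressions.

```c
#include <assert.h>
#include <stdint.h>

#define KEYLENGTH 32
typedef uint32_t t_key;

/* Extract 'bits' bits of 'a', starting 'offset' bits from the top.
 * Reconstruction for illustration; requires 1 <= bits <= KEYLENGTH. */
static t_key tkey_extract_bits(t_key a, unsigned int offset, unsigned int bits)
{
	assert(bits >= 1 && bits <= KEYLENGTH);
	if (offset >= KEYLENGTH)
		return 0;
	return (t_key)(a << offset) >> (KEYLENGTH - bits);
}

/* Unpatched form: take the old child index i and append one freshly
 * extracted bit. */
static t_key new_index_unpatched(t_key key, unsigned int pos,
				 unsigned int bits, t_key i)
{
	return 2 * i + tkey_extract_bits(key, pos + bits, 1);
}

/* Patched form: extract bits + 1 bits in a single step. */
static t_key new_index_patched(t_key key, unsigned int pos, unsigned int bits)
{
	return tkey_extract_bits(key, pos, bits + 1);
}
```

With key 192.168.2.0 (0xC0A80200), `pos = 22`, `bits = 1`, and `i = 1`, both forms give 2, matching test case 1. With key 192.168.128.0 (0xC0A88000), `pos = 16`, `bits = 1`, and `i = 1`, both again give 2, matching test case 2.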
  10. 06 Aug, 2013 1 commit
  11. 25 Jul, 2013 1 commit
  12. 06 May, 2013 1 commit
  13. 28 Feb, 2013 1 commit
    • hlist: drop the node parameter from iterators · b67bfe0d
      Sasha Levin committed
      I'm not sure why, but the hlist for each entry iterators were conceived
      differently from the list ones.  The list iterator takes three arguments:
      
              list_for_each_entry(pos, head, member)
      
      The hlist ones were greedy and wanted an extra parameter:
      
              hlist_for_each_entry(tpos, pos, head, member)
      
      Why did they need an extra pos parameter? I'm not quite sure. Not only
      do they not really need it, it also prevents the iterator from looking
      exactly like the list iterator, which is unfortunate.
      
      Besides the semantic patch, there was some manual work required:
      
       - Fix up the actual hlist iterators in linux/list.h
       - Fix up the declaration of other iterators based on the hlist ones.
       - A very small number of places were using the 'node' parameter; these
       were modified to use 'obj->member' instead.
       - Coccinelle didn't handle the hlist_for_each_entry_safe iterator
       properly, so those had to be fixed up manually.
      
      The semantic patch which is mostly the work of Peter Senna Tschudin is here:
      
      @@
      iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host;
      
      type T;
      expression a,c,d,e;
      identifier b;
      statement S;
      @@
      
      -T b;
          <+... when != b
      (
      hlist_for_each_entry(a,
      - b,
      c, d) S
      |
      hlist_for_each_entry_continue(a,
      - b,
      c) S
      |
      hlist_for_each_entry_from(a,
      - b,
      c) S
      |
      hlist_for_each_entry_rcu(a,
      - b,
      c, d) S
      |
      hlist_for_each_entry_rcu_bh(a,
      - b,
      c, d) S
      |
      hlist_for_each_entry_continue_rcu_bh(a,
      - b,
      c) S
      |
      for_each_busy_worker(a, c,
      - b,
      d) S
      |
      ax25_uid_for_each(a,
      - b,
      c) S
      |
      ax25_for_each(a,
      - b,
      c) S
      |
      inet_bind_bucket_for_each(a,
      - b,
      c) S
      |
      sctp_for_each_hentry(a,
      - b,
      c) S
      |
      sk_for_each(a,
      - b,
      c) S
      |
      sk_for_each_rcu(a,
      - b,
      c) S
      |
      sk_for_each_from
      -(a, b)
      +(a)
      S
      + sk_for_each_from(a) S
      |
      sk_for_each_safe(a,
      - b,
      c, d) S
      |
      sk_for_each_bound(a,
      - b,
      c) S
      |
      hlist_for_each_entry_safe(a,
      - b,
      c, d, e) S
      |
      hlist_for_each_entry_continue_rcu(a,
      - b,
      c) S
      |
      nr_neigh_for_each(a,
      - b,
      c) S
      |
      nr_neigh_for_each_safe(a,
      - b,
      c, d) S
      |
      nr_node_for_each(a,
      - b,
      c) S
      |
      nr_node_for_each_safe(a,
      - b,
      c, d) S
      |
      - for_each_gfn_sp(a, c, d, b) S
      + for_each_gfn_sp(a, c, d) S
      |
      - for_each_gfn_indirect_valid_sp(a, c, d, b) S
      + for_each_gfn_indirect_valid_sp(a, c, d) S
      |
      for_each_host(a,
      - b,
      c) S
      |
      for_each_host_safe(a,
      - b,
      c, d) S
      |
      for_each_mesh_entry(a,
      - b,
      c, d) S
      )
          ...+>
      
      [akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c]
      [akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c]
      [akpm@linux-foundation.org: checkpatch fixes]
      [akpm@linux-foundation.org: fix warnings]
      [akpm@linux-foundation.org: redo intrusive kvm changes]
      Tested-by: Peter Senna Tschudin <peter.senna@gmail.com>
      Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
      Cc: Wu Fengguang <fengguang.wu@intel.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Gleb Natapov <gleb@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
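The three-argument iterator shape that commit b67bfe0d moves to can be sketched in userspace. This is an illustrative re-creation, not the kernel macro: `entry_or_null`, `struct route`, and `sum_metrics` are hypothetical names, the node struct is simplified to a single `next` pointer, and `__typeof__` is a GCC/Clang extension that the sketch assumes is available.

```c
#include <assert.h>
#include <stddef.h>

struct hlist_node { struct hlist_node *next; };
struct hlist_head { struct hlist_node *first; };

/* Map a node pointer back to its containing entry, or NULL at list end. */
#define entry_or_null(ptr, type, member) \
	((ptr) ? (type *)((char *)(ptr) - offsetof(type, member)) : NULL)

/* Three-argument form: the entry cursor 'pos' is the only loop variable,
 * so no separate struct hlist_node cursor has to be declared. */
#define hlist_for_each_entry(pos, head, member)                              \
	for (pos = entry_or_null((head)->first, __typeof__(*(pos)), member); \
	     pos;                                                            \
	     pos = entry_or_null((pos)->member.next, __typeof__(*(pos)), member))

/* Hypothetical entry type used only for this example. */
struct route {
	int metric;
	struct hlist_node node;
};

/* Walk the list and add up every entry's metric. */
static int sum_metrics(struct hlist_head *head)
{
	struct route *r;
	int sum = 0;

	assert(head != NULL);
	hlist_for_each_entry(r, head, node)
		sum += r->metric;
	return sum;
}
```

The call site now has the same shape as `list_for_each_entry(pos, head, member)`, which is the symmetry the commit message asks for.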
  14. 19 Feb, 2013 2 commits
  15. 19 Sep, 2012 1 commit
  16. 11 Sep, 2012 1 commit
  17. 08 Sep, 2012 1 commit
  18. 15 Aug, 2012 1 commit
  19. 09 Aug, 2012 1 commit
  20. 08 Aug, 2012 1 commit