1. 18 7月, 2011 1 次提交
  2. 17 7月, 2011 4 次提交
  3. 14 7月, 2011 1 次提交
    • D
      net: Embed hh_cache inside of struct neighbour. · f6b72b62
      David S. Miller 提交于
      Now that there is a one-to-one correspondance between neighbour
      and hh_cache entries, we no longer need:
      
      1) dynamic allocation
      2) attachment to dst->hh
      3) refcounting
      
      Initialization of the hh_cache entry is indicated by hh_len
      being non-zero, and such initialization is always done with
      the neighbour's lock held as a writer.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f6b72b62
  4. 13 7月, 2011 2 次提交
  5. 11 7月, 2011 2 次提交
  6. 10 6月, 2011 1 次提交
    • G
      rtnetlink: Compute and store minimum ifinfo dump size · c7ac8679
      Greg Rose 提交于
      The message size allocated for rtnl ifinfo dumps was limited to
      a single page.  This is not enough for additional interface info
      available with devices that support SR-IOV and caused a bug in
      which VF info would not be displayed if more than approximately
      40 VFs were created per interface.
      
      Implement a new function pointer for the rtnl_register service that will
      calculate the amount of data required for the ifinfo dump and allocate
      enough data to satisfy the request.
      Signed-off-by: NGreg Rose <gregory.v.rose@intel.com>
      Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
      c7ac8679
  7. 21 1月, 2011 1 次提交
  8. 20 12月, 2010 1 次提交
  9. 21 10月, 2010 1 次提交
  10. 12 10月, 2010 2 次提交
    • E
      neigh: Protect neigh->ha[] with a seqlock · 0ed8ddf4
      Eric Dumazet 提交于
      Add a seqlock in struct neighbour to protect neigh->ha[], and avoid
      dirtying neighbour in stress situation (many different flows / dsts)
      
      Dirtying takes place because of read_lock(&n->lock) and n->used writes.
      
      Switching to a seqlock, and writing n->used only on jiffies changes
      permits less dirtying.
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0ed8ddf4
    • E
      neigh: speedup neigh_hh_init() · 34d101dd
      Eric Dumazet 提交于
      When a new dst is used to send a frame, neigh_resolve_output() tries to
      associate an struct hh_cache to this dst, calling neigh_hh_init() with
      the neigh rwlock write locked.
      
      Most of the time, hh_cache is already known and linked into neighbour,
      so we find it and increment its refcount.
      
      This patch changes the logic so that we call neigh_hh_init() with
      neighbour lock read locked only, so that fast path can be run in
      parallel by concurrent cpus.
      
      This brings part of the speedup we got with commit c7d4426a
      (introduce DST_NOCACHE flag) for non cached dsts, even for cached ones,
      removing one of the contention point that routers hit on multiqueue
      enabled machines.
      
      Further improvements would need to use a seqlock instead of an rwlock to
      protect neigh->ha[], to not dirty neigh too often and remove two atomic
      ops.
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      34d101dd
  11. 07 10月, 2010 1 次提交
    • E
      neigh: RCU conversion of struct neighbour · 767e97e1
      Eric Dumazet 提交于
      This is the second step for neighbour RCU conversion.
      
      (first was commit d6bf7817 : RCU conversion of neigh hash table)
      
      neigh_lookup() becomes lockless, but still take a reference on found
      neighbour. (no more read_lock()/read_unlock() on tbl->lock)
      
      struct neighbour gets an additional rcu_head field and is freed after an
      RCU grace period.
      
      Future work would need to eventually not take a reference on neighbour
      for temporary dst (DST_NOCACHE), but this would need dst->_neighbour to
      use a noref bit like we did for skb->_dst.
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      767e97e1
  12. 06 10月, 2010 2 次提交
  13. 04 10月, 2010 1 次提交
    • E
      net: introduce DST_NOCACHE flag · c7d4426a
      Eric Dumazet 提交于
      While doing stress tests with IP route cache disabled, and multi queue
      devices, I noticed a very high contention on one rwlock used in
      neighbour code.
      
      When many cpus are trying to send frames (possibly using a high
      performance multiqueue device) to the same neighbour, they fight for the
      neigh->lock rwlock in order to call neigh_hh_init(), and fight on
      hh->hh_refcnt (a pair of atomic_inc/atomic_dec_and_test())
      
      But we dont need to call neigh_hh_init() for dst that are used only
      once. It costs four atomic operations at least, on two contended cache
      lines, plus the high contention on neigh->lock rwlock.
      
      Introduce a new dst flag, DST_NOCACHE, that is set when dst was not
      inserted in route cache.
      
      With the stress test bench, sending 160000000 frames on one neighbour,
      results are :
      
      Before patch:
      
      real	2m28.406s
      user	0m11.781s
      sys	36m17.964s
      
      
      After patch:
      
      real	1m26.532s
      user	0m12.185s
      sys	20m3.903s
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c7d4426a
  14. 24 9月, 2010 1 次提交
  15. 15 7月, 2010 1 次提交
    • D
      net/core: neighbour update Oops · 91a72a70
      Doug Kehn 提交于
      When configuring DMVPN (GRE + openNHRP) and a GRE remote
      address is configured a kernel Oops is observed.  The
      obserseved Oops is caused by a NULL header_ops pointer
      (neigh->dev->header_ops) in neigh_update_hhs() when
      
      void (*update)(struct hh_cache*, const struct net_device*, const unsigned char *)
      = neigh->dev->header_ops->cache_update;
      
      is executed.  The dev associated with the NULL header_ops is
      the GRE interface.  This patch guards against the
      possibility that header_ops is NULL.
      
      This Oops was first observed in kernel version 2.6.26.8.
      Signed-off-by: NDoug Kehn <rdkehn@yahoo.com>
      Acked-by: NEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      91a72a70
  16. 28 5月, 2010 1 次提交
  17. 30 3月, 2010 1 次提交
    • T
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking... · 5a0e3ad6
      Tejun Heo 提交于
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
      
      percpu.h is included by sched.h and module.h and thus ends up being
      included when building most .c files.  percpu.h includes slab.h which
      in turn includes gfp.h making everything defined by the two files
      universally available and complicating inclusion dependencies.
      
      percpu.h -> slab.h dependency is about to be removed.  Prepare for
      this change by updating users of gfp and slab facilities include those
      headers directly instead of assuming availability.  As this conversion
      needs to touch large number of source files, the following script is
      used as the basis of conversion.
      
        http://userweb.kernel.org/~tj/misc/slabh-sweep.py
      
      The script does the followings.
      
      * Scan files for gfp and slab usages and update includes such that
        only the necessary includes are there.  ie. if only gfp is used,
        gfp.h, if slab is used, slab.h.
      
      * When the script inserts a new include, it looks at the include
        blocks and try to put the new include such that its order conforms
        to its surrounding.  It's put in the include block which contains
        core kernel includes, in the same order that the rest are ordered -
        alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
        doesn't seem to be any matching order.
      
      * If the script can't find a place to put a new include (mostly
        because the file doesn't have fitting include block), it prints out
        an error message indicating which .h file needs to be added to the
        file.
      
      The conversion was done in the following steps.
      
      1. The initial automatic conversion of all .c files updated slightly
         over 4000 files, deleting around 700 includes and adding ~480 gfp.h
         and ~3000 slab.h inclusions.  The script emitted errors for ~400
         files.
      
      2. Each error was manually checked.  Some didn't need the inclusion,
         some needed manual addition while adding it to implementation .h or
         embedding .c file was more appropriate for others.  This step added
         inclusions to around 150 files.
      
      3. The script was run again and the output was compared to the edits
         from #2 to make sure no file was left behind.
      
      4. Several build tests were done and a couple of problems were fixed.
         e.g. lib/decompress_*.c used malloc/free() wrappers around slab
         APIs requiring slab.h to be added manually.
      
      5. The script was run on all .h files but without automatically
         editing them as sprinkling gfp.h and slab.h inclusions around .h
         files could easily lead to inclusion dependency hell.  Most gfp.h
         inclusion directives were ignored as stuff from gfp.h was usually
         wildly available and often used in preprocessor macros.  Each
         slab.h inclusion directive was examined and added manually as
         necessary.
      
      6. percpu.h was updated not to include slab.h.
      
      7. Build test were done on the following configurations and failures
         were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
         distributed build env didn't work with gcov compiles) and a few
         more options had to be turned off depending on archs to make things
         build (like ipr on powerpc/64 which failed due to missing writeq).
      
         * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
         * powerpc and powerpc64 SMP allmodconfig
         * sparc and sparc64 SMP allmodconfig
         * ia64 SMP allmodconfig
         * s390 SMP allmodconfig
         * alpha SMP allmodconfig
         * um on x86_64 SMP allmodconfig
      
      8. percpu.h modifications were reverted so that it could be applied as
         a separate patch and serve as bisection point.
      
      Given the fact that I had only a couple of failures from tests on step
      6, I'm fairly confident about the coverage of this conversion patch.
      If there is a breakage, it's likely to be something in one of the arch
      headers which should be easily discoverable easily on most builds of
      the specific arch.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Guess-its-ok-by: NChristoph Lameter <cl@linux-foundation.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
      5a0e3ad6
  18. 10 3月, 2010 1 次提交
  19. 17 2月, 2010 1 次提交
  20. 23 1月, 2010 1 次提交
  21. 26 11月, 2009 1 次提交
  22. 12 11月, 2009 1 次提交
    • E
      sysctl net: Remove unused binary sysctl code · f8572d8f
      Eric W. Biederman 提交于
      Now that sys_sysctl is a compatiblity wrapper around /proc/sys
      all sysctl strategy routines, and all ctl_name and strategy
      entries in the sysctl tables are unused, and can be
      revmoed.
      
      In addition neigh_sysctl_register has been modified to no longer
      take a strategy argument and it's callers have been modified not
      to pass one.
      
      Cc: "David Miller" <davem@davemloft.net>
      Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
      Cc: netdev@vger.kernel.org
      Signed-off-by: NEric W. Biederman <ebiederm@xmission.com>
      f8572d8f
  23. 03 8月, 2009 1 次提交
    • E
      neigh: Convert garbage collection from softirq to workqueue · e4c4e448
      Eric Dumazet 提交于
      Current neigh_periodic_timer() function is fired by timer IRQ, and
      scans one hash bucket each round (very litle work in fact)
      
      As we are supposed to scan whole hash table in 15 seconds, this means
      neigh_periodic_timer() can be fired very often. (depending on the number
      of concurrent hash entries we stored in this table)
      
      Converting this to a workqueue permits scanning whole table, minimizing
      icache pollution, and firing this work every 15 seconds, independantly
      of hash table size.
      
      This 15 seconds delay is not a hard number, as work is a deferrable one.
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e4c4e448
  24. 14 7月, 2009 1 次提交
  25. 11 6月, 2009 1 次提交
    • T
      neigh: fix state transition INCOMPLETE->FAILED via Netlink request · 5ef12d98
      Timo Teras 提交于
      The current code errors out the INCOMPLETE neigh entry skb queue only from
      the timer if maximum probes have been attempted and there has been no reply.
      This also causes the transtion to FAILED state.
      
      However, the neigh entry can be also updated via Netlink to inform that the
      address is unavailable.  Currently, neigh_update() just stops the timers and
      leaves the pending skb's unreleased. This results that the clean up code in
      the timer callback is never called, preventing also proper garbage collection.
      
      This fixes neigh_update() to process the pending skb queue immediately if
      INCOMPLETE -> FAILED state transtion occurs due to a Netlink request.
      Signed-off-by: NTimo Teras <timo.teras@iki.fi>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5ef12d98
  26. 03 6月, 2009 1 次提交
  27. 04 3月, 2009 1 次提交
  28. 27 2月, 2009 1 次提交
  29. 25 2月, 2009 1 次提交
    • P
      netlink: change nlmsg_notify() return value logic · 1ce85fe4
      Pablo Neira Ayuso 提交于
      This patch changes the return value of nlmsg_notify() as follows:
      
      If NETLINK_BROADCAST_ERROR is set by any of the listeners and
      an error in the delivery happened, return the broadcast error;
      else if there are no listeners apart from the socket that
      requested a change with the echo flag, return the result of the
      unicast notification. Thus, with this patch, the unicast
      notification is handled in the same way of a broadcast listener
      that has set the NETLINK_BROADCAST_ERROR socket flag.
      
      This patch is useful in case that the caller of nlmsg_notify()
      wants to know the result of the delivery of a netlink notification
      (including the broadcast delivery) and take any action in case
      that the delivery failed. For example, ctnetlink can drop packets
      if the event delivery failed to provide reliable logging and
      state-synchronization at the cost of dropping packets.
      
      This patch also modifies the rtnetlink code to ignore the return
      value of rtnl_notify() in all callers. The function rtnl_notify()
      (before this patch) returned the error of the unicast notification
      which makes rtnl_set_sk_err() reports errors to all listeners. This
      is not of any help since the origin of the change (the socket that
      requested the echoing) notices the ENOBUFS error if the notification
      fails and should resync itself.
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      Acked-by: NPatrick McHardy <kaber@trash.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1ce85fe4
  30. 06 2月, 2009 1 次提交
  31. 30 12月, 2008 1 次提交
  32. 21 11月, 2008 1 次提交
  33. 12 11月, 2008 1 次提交