1. 02 June 2010 (3 commits)
    • net: add additional lock to qdisc to increase throughput · 79640a4c
      By Eric Dumazet
      When many cpus compete for sending frames on a given qdisc, the qdisc
      spinlock suffers from very high contention.
      
      The CPU owning the __QDISC_STATE_RUNNING bit has the same priority as
      the other CPUs when acquiring the lock, and cannot dequeue packets fast
      enough, since it must wait for this lock for each dequeued packet.
      
      One solution to this problem is to force all CPUs to spin on a second
      lock before trying to take the main lock, when/if they see
      __QDISC_STATE_RUNNING already set.
      
      The owning CPU then competes with at most one other CPU for the main
      lock, allowing for a higher dequeue rate.
      
      Based on a previous patch from Alexander Duyck. I added the heuristic
      to avoid the atomic operation in the fast path, and put the new lock
      far away from the cache line used by the dequeue worker. I also try to
      release the busylock as late as possible.
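      
      In C, the pattern looks roughly like the sketch below. It is modeled on
      the patched transmit path (__dev_xmit_skb()); the names qdisc_lock(),
      qdisc_is_running() and busylock follow the kernel sources, but the body
      is simplified and illustrative, not the exact patch:
      
      static int sketch_xmit(struct sk_buff *skb, struct Qdisc *q)
      {
              spinlock_t *root_lock = qdisc_lock(q);
              bool contended;
      
              /* Heuristic: only pay for the busylock when another CPU
               * visibly owns the dequeue role right now. */
              contended = qdisc_is_running(q);
              if (unlikely(contended))
                      spin_lock(&q->busylock);
      
              spin_lock(root_lock);
              /* ... enqueue skb and possibly run the qdisc ... */
              spin_unlock(root_lock);
      
              /* Released as late as possible: the owning CPU competes
               * with at most one other CPU for root_lock. */
              if (unlikely(contended))
                      spin_unlock(&q->busylock);
              return NET_XMIT_SUCCESS;
      }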
      
      Tests with the following script gave a boost from ~50,000 pps to
      ~600,000 pps on a dual quad-core machine (E5450 @ 3.00GHz) with the tg3
      driver. (A single netperf flow can reach ~800,000 pps on this platform.)
      
      for j in `seq 0 3`; do
        for i in `seq 0 7`; do
          netperf -H 192.168.0.1 -t UDP_STREAM -l 60 -N -T $i -- -m 6 &
        done
      done
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Acked-by: Alexander Duyck <alexander.h.duyck@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: QDISC_STATE_RUNNING dont need atomic bit ops · 37112105
      By Eric Dumazet
      __QDISC_STATE_RUNNING is always changed while qdisc lock is held.
      
      We can avoid two atomic operations in the xmit path if we move this bit
      into a new __state container.
      
      The location of this __state container is carefully chosen so that the
      fast path only dirties a single qdisc cache line.
      
      The THROTTLED bit could later be moved into this __state location too,
      to avoid dirtying the first qdisc cache line.
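      
      With the bit moved out of the atomic state word, the accessors
      introduced by commit bc135b23 (below) reduce to plain loads and stores.
      A sketch, with field and flag names close to (but not guaranteed to
      match exactly) the final code:
      
      /* Plain (non-atomic) ops are safe here: every caller already
       * holds the qdisc spinlock while touching __state. */
      static inline bool qdisc_is_running(struct Qdisc *qdisc)
      {
              return (qdisc->__state & __QDISC___STATE_RUNNING) != 0;
      }
      
      static inline bool qdisc_run_begin(struct Qdisc *qdisc)
      {
              if (qdisc_is_running(qdisc))
                      return false;
              qdisc->__state |= __QDISC___STATE_RUNNING;
              return true;
      }
      
      static inline void qdisc_run_end(struct Qdisc *qdisc)
      {
              qdisc->__state &= ~__QDISC___STATE_RUNNING;
      }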
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: Define accessors to manipulate QDISC_STATE_RUNNING · bc135b23
      By Eric Dumazet
      Define three helpers to manipulate the QDISC_STATE_RUNNING flag, which
      a second patch will move to another location.
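      
      A sketch of the three helpers in their initial form, still using atomic
      bit operations on qdisc->state (commit 37112105 above later replaces
      these with plain accesses to a separate __state word):
      
      static inline bool qdisc_is_running(struct Qdisc *qdisc)
      {
              return test_bit(__QDISC_STATE_RUNNING, &qdisc->state);
      }
      
      /* Returns true if the caller acquired the dequeue role. */
      static inline bool qdisc_run_begin(struct Qdisc *qdisc)
      {
              return !test_and_set_bit(__QDISC_STATE_RUNNING, &qdisc->state);
      }
      
      static inline void qdisc_run_end(struct Qdisc *qdisc)
      {
              clear_bit(__QDISC_STATE_RUNNING, &qdisc->state);
      }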
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 27 May 2010 (1 commit)
    • net: fix lock_sock_bh/unlock_sock_bh · 8a74ad60
      By Eric Dumazet
      This new sock lock primitive was introduced to speed up some
      user-context socket manipulation. But it is unsafe for protecting two
      threads from each other when one uses regular lock_sock()/release_sock()
      and the other uses lock_sock_bh()/unlock_sock_bh().
      
      This patch changes lock_sock_bh() to be careful about the 'owned'
      state: if owned is found to be set, we must take the slow path.
      lock_sock_bh() now returns a boolean saying whether the slow path was
      taken, and this boolean is used at unlock_sock_bh() time to call the
      appropriate unlock function.
      
      After this change, BHs are either disabled or enabled during the
      lock_sock_bh/unlock_sock_bh protected section, depending on which path
      was taken. Since the _bh suffix is therefore misleading, we rename
      these functions to lock_sock_fast()/unlock_sock_fast().
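      
      A simplified sketch of the resulting pair, close to the mainline
      implementation but with lockdep and debugging details omitted:
      
      /* Returns true if the slow path was taken; pass the result to
       * unlock_sock_fast() so the matching unlock is used. */
      bool lock_sock_fast(struct sock *sk)
      {
              spin_lock_bh(&sk->sk_lock.slock);
      
              if (!sk->sk_lock.owned)
                      /* Fast path: BHs stay disabled until unlock. */
                      return false;
      
              /* Slow path: a process context owns the socket. Wait for
               * it, then run with BHs enabled, like lock_sock() does. */
              __lock_sock(sk);
              sk->sk_lock.owned = 1;
              spin_unlock(&sk->sk_lock.slock);
              local_bh_enable();
              return true;
      }
      
      static inline void unlock_sock_fast(struct sock *sk, bool slow)
      {
              if (slow)
                      release_sock(sk);
              else
                      spin_unlock_bh(&sk->sk_lock.slock);
      }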
      Reported-by: Anton Blanchard <anton@samba.org>
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Tested-by: Anton Blanchard <anton@samba.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 26 May 2010 (2 commits)
  4. 25 May 2010 (5 commits)
  5. 24 May 2010 (3 commits)
    • Revert "ath9k: Group Key fix for VAPs" · a69eee49
      By Linus Torvalds
      This reverts commit 03ceedea, since it
      breaks resume from suspend-to-RAM on Rafael's Acer Ferrari One.
      NetworkManager thinks everything is OK, but it can't connect to the AP
      to get an IP address after the resume.
      
      In fact, it even breaks resume for non-ath9k chipsets: reverting it also
      fixes Rafael's Toshiba Protege R500 with the iwlagn driver.  As Johannes
      says:
      
        "Indeed, this patch needs to be reverted. That mac80211 change is wrong
         and completely unnecessary."
      Reported-and-requested-by: Rafael J. Wysocki <rjw@sisk.pl>
      Acked-by: Johannes Berg <johannes@sipsolutions.net>
      Cc: Daniel Yingqiang Ma <yma.cool@gmail.com>
      Cc: John W. Linville <linville@tuxdriver.com>
      Cc: David Miller <davem@davemloft.net>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • cls_cgroup: Store classid in struct sock · f8451725
      By Herbert Xu
      Up until now cls_cgroup has relied on fetching the classid out of
      the currently executing thread.  This runs into trouble when packet
      processing is delayed, in which case it may execute in the context of
      another thread.
      
      Furthermore, even when a packet is not delayed we may fail to
      classify it if soft IRQs have been disabled, because this scenario
      is indistinguishable from one where a packet unrelated to the
      current thread is processed by a real soft IRQ.
      
      In fact, the current semantics are inherently broken, as a single
      skb may be constructed out of the writes of two different tasks.
      A different manifestation of this problem is when the TCP stack
      transmits in response to an incoming ACK; this traffic is currently
      unclassified.
      
      As we already have a concept of packet ownership for accounting
      purposes in the skb->sk pointer, this is a natural place to store
      the classid in a persistent manner.
      
      This patch adds the cls_cgroup classid to struct sock, filling an
      existing hole on 64-bit :)
      
      The value is set at socket creation time, so all sockets created
      via socket(2) automatically gain the ID of the thread creating them.
      Whenever another process touches the socket by either reading from or
      writing to it, we change the socket's classid to that of the process,
      provided it has a valid (non-zero) classid.
      
      For sockets created on inbound connections through accept(2), we
      inherit the classid of the original listening socket through
      sk_clone, possibly preceding the actual accept(2) call.
      
      In order to minimise risks, I have not made this the authoritative
      classid.  For now it is only used as a backup when we execute
      with soft IRQs disabled.  Once we're completely happy with its
      semantics we can use it as the sole classid.
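      
      In code, the scheme boils down to roughly the following sketch.
      sock_update_classid() follows the patch; pick_classid() is a
      hypothetical condensation of the classifier's fallback logic, not a
      function from the patch:
      
      /* Called when a task creates, reads from or writes to a socket. */
      void sock_update_classid(struct sock *sk)
      {
              u32 classid = task_cls_classid(current);
      
              if (classid && classid != sk->sk_classid)
                      sk->sk_classid = classid; /* adopt the toucher's id */
      }
      
      /* Classifier side: trust `current' only outside soft IRQ context;
       * otherwise fall back to the classid stored in the socket. */
      static u32 pick_classid(const struct sk_buff *skb)
      {
              if (in_serving_softirq())
                      return skb->sk ? skb->sk->sk_classid : 0;
              return task_cls_classid(current);
      }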
      
      Footnote: I have rearranged the error path on cls_cgroup module
      creation.  If we didn't do this, there would be a window during which
      someone could create a tc rule using cls_cgroup before the cgroup
      subsystem has been registered.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • caif: Bugfix - use standard Linux lists · 7aecf494
      By Sjur Braendeland
      Discovered a bug when running a high number of parallel connect
      requests. Replace the buggy home-brewed list with linux/list.h.
      Signed-off-by: Sjur Braendeland <sjur.brandeland@stericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  6. 22 May 2010 (2 commits)
  7. 20 May 2010 (1 commit)
  8. 19 May 2010 (1 commit)
  9. 18 May 2010 (6 commits)
  10. 16 May 2010 (3 commits)
  11. 13 May 2010 (2 commits)
  12. 11 May 2010 (4 commits)
    • ipv6: ip6mr: support multiple tables · d1db275d
      By Patrick McHardy
      This patch adds support for multiple independent multicast routing
      instances, named "tables".
      
      Userspace multicast routing daemons can bind to a specific table instance by
      issuing a setsockopt call using a new option MRT6_TABLE. The table number is
      stored in the raw socket data and affects all following ip6mr setsockopt(),
      getsockopt() and ioctl() calls. By default, a single table (RT6_TABLE_DFLT)
      is created with a default routing rule pointing to it. Newly created pim6reg
      devices have the table number appended ("pim6regX"), with the exception of
      devices created in the default table, which are named just "pim6reg" for
      compatibility reasons.
      
      Packets are directed to a specific table instance using routing rules,
      similar to how regular routing rules work. Currently iif, oif and mark
      are supported as keys; source and destination addresses could be
      supported additionally.
      
      Example usage:
      
      - bind pimd/xorp/... to a specific table:
      
      uint32_t table = 123;
      setsockopt(fd, SOL_IPV6, MRT6_TABLE, &table, sizeof(table));
      
      - create routing rules directing packets to the new table:
      
      # ip -6 mrule add iif eth0 lookup 123
      # ip -6 mrule add oif eth0 lookup 123
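      
      Fleshed out, the daemon-side call might look like the sketch below.
      Error handling is minimal; the raw ICMPv6 socket is the kind of control
      socket pim6sd-style daemons use, and (per the patch) MRT6_TABLE should
      be set before MRT6_INIT:
      
      #include <stdint.h>
      #include <stdio.h>
      #include <sys/socket.h>
      #include <netinet/in.h>
      #include <linux/mroute6.h>   /* MRT6_TABLE */
      
      int main(void)
      {
              /* ip6mr control socket (requires CAP_NET_ADMIN). */
              int fd = socket(AF_INET6, SOCK_RAW, IPPROTO_ICMPV6);
              if (fd < 0) {
                      perror("socket");
                      return 1;
              }
      
              uint32_t table = 123; /* must match the `lookup 123' mrules */
              if (setsockopt(fd, SOL_IPV6, MRT6_TABLE,
                             &table, sizeof(table)) < 0) {
                      perror("setsockopt(MRT6_TABLE)");
                      return 1;
              }
              /* ... continue with MRT6_INIT etc. ... */
              return 0;
      }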
      Signed-off-by: Patrick McHardy <kaber@trash.net>
    • 6bd52143
    • f30a7784
    • ipv6: ip6mr: move unres_queue and timer to per-namespace data · c476efbc
      By Patrick McHardy
      The unres_queue is currently shared between all namespaces. Following
      patches will additionally allow creating multiple multicast routing
      tables in each namespace. Having a single shared queue for all these
      users seems excessive, so move the queue and the cleanup timer into the
      per-namespace data to unshare them.
      
      As a side-effect, this fixes a bug in the seq file iteration functions:
      the first entry returned is always from the current namespace, while
      entries returned after that may belong to any namespace.
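      
      The shape of the change, as a sketch with illustrative names (the
      actual members added to the per-namespace structures may differ):
      
      /* Before: one queue and one timer shared by every namespace. */
      static struct mfc6_cache *unres_queue;
      static struct timer_list ipmr_expire_timer;
      
      /* After: each namespace carries its own copy, so unresolved
       * entries from different namespaces can no longer interleave. */
      struct netns_ipv6 {
              /* ... existing members ... */
              struct mfc6_cache *mfc6_unres_queue;
              struct timer_list ipmr_expire_timer;
      };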
      Signed-off-by: Patrick McHardy <kaber@trash.net>
  13. 10 May 2010 (7 commits)