1. 17 Oct, 2010 10 commits
    • fib: avoid false sharing on fib_table_hash · 10da66f7
      Eric Dumazet committed
      While doing profile analysis, I found that fib_hash_table was sometimes in a
      cache line shared with a possibly often-written kernel structure.

      (CONFIG_IP_ROUTE_MULTIPATH || !CONFIG_IPV6_MULTIPLE_TABLES)

      It's hard to detect because it is not easily reproducible.

      Make sure we allocate a full cache line, so that the table stays shared in
      all CPUs' caches.
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
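The idea of giving the table a full cache line can be sketched in user-space C. This is only an illustration, not the kernel change itself: `CACHE_LINE_BYTES`, `round_up_to_line()` and `alloc_line_aligned_table()` are hypothetical names, and the 64-byte line size is an assumption that holds on many x86 systems.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumption: 64-byte cache lines (common on x86; other CPUs differ). */
#define CACHE_LINE_BYTES ((size_t)64)

/* Round a byte count up to a whole number of cache lines. */
static size_t round_up_to_line(size_t bytes)
{
    return (bytes + CACHE_LINE_BYTES - 1) / CACHE_LINE_BYTES * CACHE_LINE_BYTES;
}

/*
 * Allocate a zeroed pointer table that starts on a cache line boundary and
 * occupies whole lines, so the allocator cannot place an unrelated,
 * frequently written object in the same line.
 * Returns NULL on failure; release with free().
 */
static void **alloc_line_aligned_table(size_t nslots)
{
    size_t bytes = round_up_to_line(nslots * sizeof(void *));
    void **table = aligned_alloc(CACHE_LINE_BYTES, bytes);

    if (table)
        memset(table, 0, bytes);
    return table;
}
```

A two-slot table would normally need only a few bytes; here it is padded to one whole line, trading a little memory for the guarantee that the read-mostly table is not invalidated by writes to a neighbour.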
    • fib_trie: use fls() instead of open coded loop · 874ffa8f
      Eric Dumazet committed
      fib_table_lookup() can use fls() to speed up an open-coded loop.
      
      Noticed while doing a profile analysis.
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
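As a rough user-space illustration of why fls() can stand in for a bit-scanning loop, the sketch below compares an open-coded loop with a count-leading-zeros builtin. It is not the code from fib_table_lookup(): `fls_loop()` and `fls_clz()` are hypothetical names, and `__builtin_clz` is a GCC/Clang extension assumed to be available.

```c
#include <limits.h>

/* Open-coded loop: 1-based index of the highest set bit, 0 if x == 0. */
static int fls_loop(unsigned int x)
{
    int pos = 0;

    while (x) {
        pos++;
        x >>= 1;
    }
    return pos;
}

/* Same result computed from a count-leading-zeros builtin. */
static int fls_clz(unsigned int x)
{
    if (x == 0)
        return 0;   /* __builtin_clz(0) is undefined, so handle it here */
    return (int)(sizeof(x) * CHAR_BIT) - __builtin_clz(x);
}
```

The loop runs once per bit position up to the highest set bit, while the builtin typically compiles to a single instruction on CPUs that provide one.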
    • fib: remove a useless synchronize_rcu() call · a0a4a85a
      Eric Dumazet committed
      fib_nl_delrule() calls synchronize_rcu() for no apparent reason,
      while rtnl is held.
      
      I suspect it was done to avoid an atomic_inc_not_zero() in
      fib_rules_lookup(), which commit 7fa7cb71 added anyway.
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • fib6: use FIB_LOOKUP_NOREF in fib6_rule_lookup() · 2c1c0004
      Eric Dumazet committed
      Avoid two atomic ops on the found rule in fib6_rule_lookup().
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sundance: Add initial ethtool stats support · 725a4a46
      Denis Kirjanov committed
      Add ethtool stats support.
      Signed-off-by: Denis Kirjanov <dkirjanov@kernel.org>
      Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • pch_gbe: fix if condition in set_settings() · 89980827
      Dan Carpenter committed
      There were no curly braces in this if condition so it always enabled full
      duplex.
      
      And ecmd->speed is an unsigned short so it is never equal to -1.  The
      effect is that mii_ethtool_sset() fails with -EINVAL and an error is
      printed to dmesg.
      Signed-off-by: Dan Carpenter <error27@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
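The remark that an unsigned short can never equal -1 follows from C's integer promotion rules, and a small stand-alone sketch can show it. This is not the driver code: the helper names are hypothetical, and the sketch assumes `unsigned short` is narrower than `int`, as on common platforms.

```c
#include <stdbool.h>

/* Make the implicit integer promotion explicit: the result is never negative. */
static int promote(unsigned short v)
{
    return v;
}

/* Comparing the promoted value with -1 can never succeed. */
static bool matches_minus_one(unsigned short v)
{
    return promote(v) == -1;
}

/* Comparing against the converted constant (0xffff for 16 bits) does work. */
static bool matches_all_ones(unsigned short v)
{
    return v == (unsigned short)-1;
}
```

Storing -1 in an unsigned short yields the all-ones value, which promotes to a positive int, so only the second comparison can detect it.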
    • dnet: mark methods static and annotate for correct endianness · 35f2516f
      Harvey Harrison committed
      There don't appear to be bugs in the endianness handling here; just get the
      annotations right to keep sparse happy.
      
      Suppresses the following sparse warnings:
      drivers/net/dnet.c:30:5: warning: symbol 'dnet_readw_mac' was not declared. Should it be static?
      drivers/net/dnet.c:49:6: warning: symbol 'dnet_writew_mac' was not declared. Should it be static?
      drivers/net/dnet.c:364:5: warning: symbol 'dnet_phy_marvell_fixup' was not declared. Should it be static?
      drivers/net/dnet.c:66:13: warning: incorrect type in assignment (different base types)
      drivers/net/dnet.c:66:13:    expected unsigned short [unsigned] [usertype] tmp
      drivers/net/dnet.c:66:13:    got restricted __be16 [usertype] <noident>
      drivers/net/dnet.c:68:13: warning: incorrect type in assignment (different base types)
      drivers/net/dnet.c:68:13:    expected unsigned short [unsigned] [usertype] tmp
      drivers/net/dnet.c:68:13:    got restricted __be16 [usertype] <noident>
      drivers/net/dnet.c:70:13: warning: incorrect type in assignment (different base types)
      drivers/net/dnet.c:70:13:    expected unsigned short [unsigned] [usertype] tmp
      drivers/net/dnet.c:70:13:    got restricted __be16 [usertype] <noident>
      drivers/net/dnet.c:92:27: warning: cast to restricted __be16
      drivers/net/dnet.c:94:33: warning: cast to restricted __be16
      drivers/net/dnet.c:96:33: warning: cast to restricted __be16
      Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • cxgb4vf: make single bit signed bitfields unsigned · 65495745
      Harvey Harrison committed
      Single bit signed bitfields don't make a lot of sense, noticed by sparse:
      drivers/net/cxgb4vf/t4vf_common.h:135:31: error: dubious one-bit signed bitfield
      drivers/net/cxgb4vf/t4vf_common.h:136:36: error: dubious one-bit signed bitfield
      drivers/net/cxgb4vf/t4vf_common.h:137:36: error: dubious one-bit signed bitfield
      drivers/net/cxgb4vf/t4vf_common.h:138:36: error: dubious one-bit signed bitfield
      drivers/net/cxgb4vf/t4vf_common.h:139:36: error: dubious one-bit signed bitfield
      drivers/net/cxgb4vf/t4vf_common.h:140:31: error: dubious one-bit signed bitfield
      drivers/net/cxgb4vf/t4vf_common.h:141:31: error: dubious one-bit signed bitfield
      drivers/net/cxgb4vf/t4vf_common.h:142:35: error: dubious one-bit signed bitfield
      drivers/net/cxgb4vf/t4vf_common.h:143:35: error: dubious one-bit signed bitfield
      drivers/net/cxgb4vf/t4vf_common.h:154:27: error: dubious one-bit signed bitfield
      drivers/net/cxgb4vf/t4vf_common.h:155:26: error: dubious one-bit signed bitfield
      drivers/net/cxgb4vf/t4vf_common.h:156:27: error: dubious one-bit signed bitfield
      drivers/net/cxgb4vf/t4vf_common.h:157:26: error: dubious one-bit signed bitfield
      Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
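Why sparse calls a one-bit signed bitfield "dubious" can be seen in a short stand-alone sketch. The struct and function names below are hypothetical and unrelated to t4vf_common.h. The value stored for an out-of-range assignment is implementation-defined; the -1 result described in the comment is what GCC and Clang produce.

```c
/* One-bit fields: the signed one can only hold 0 and -1. */
struct caps_signed   { signed int enabled : 1; };
struct caps_unsigned { unsigned int enabled : 1; };

/* Store v in a signed one-bit field and read it back. */
static int read_back_signed(int v)
{
    struct caps_signed c;

    c.enabled = v;      /* 1 does not fit; GCC and Clang store it as -1 */
    return c.enabled;
}

/* Store the low bit of v in an unsigned one-bit field and read it back. */
static int read_back_unsigned(int v)
{
    struct caps_unsigned c;

    c.enabled = (unsigned int)v & 1u;
    return (int)c.enabled;
}
```

With the signed field, a test such as `if (c.enabled == 1)` never succeeds after assigning 1, which is the kind of surprise the unsigned declaration avoids.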
    • net: allocate skbs on local node · 564824b0
      Eric Dumazet committed
      commit b30973f8 (node-aware skb allocation) spread a wrong habit of
      allocating net driver skbs on a given memory node: the one closest to
      the NIC hardware. This is wrong because as soon as we try to scale the
      network stack, we need to use many CPUs to handle traffic, and we hit
      slub/slab management on cross-node allocations/frees when these CPUs
      have to alloc/free skbs bound to a central node.

      skbs allocated in the RX path are ephemeral; they have a very short
      lifetime, so the extra cost of maintaining NUMA affinity is too high. What
      appeared to be a nice idea four years ago is in fact a bad one.

      In 2010, NIC hardware is multiqueue, or we use RPS to spread the load,
      and two 10Gb NICs might deliver more than 28 million packets per second,
      needing all the available CPUs.

      The cost of cross-node handling in the network and vm stacks outweighs the
      small benefit the hardware had when doing its DMA transfer into its 'local'
      memory node at RX time. Even trying to differentiate the two allocations
      done for one skb (the sk_buff on the local node, the data part on the NIC
      hardware node) is not enough to bring good performance.
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Acked-by: Tom Herbert <therbert@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • r8169: use 50% less ram for RX ring · 6f0333b8
      Eric Dumazet committed
      Using standard skb allocations in r8169 leads to order-3 allocations (if
      PAGE_SIZE=4096), because the NIC needs 16383 bytes, and skb overhead makes
      this bigger than 16384 -> 32768 bytes per "skb".

      Using kmalloc() reduces the memory requirements of one r8169 NIC by
      4 Mbytes (256 frames * 16 Kbytes). This is fine since a hardware bug
      requires us to copy incoming frames, so we build a real skb when doing
      this copy.
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
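The 4 Mbyte figure can be checked with a little arithmetic. The sketch below models allocation sizes as rounding up to the next power of two, which is a simplification of the real allocator's size classes. `SKB_OVERHEAD_GUESS` is a hypothetical placeholder, since any overhead of two bytes or more pushes the request past 16384.

```c
#include <stddef.h>

/* Hypothetical per-skb overhead; the exact value does not change the result. */
#define SKB_OVERHEAD_GUESS ((size_t)320)

/* Smallest power of two that is >= n (simplified model of size classes). */
static size_t next_pow2(size_t n)
{
    size_t p = 1;

    while (p < n)
        p <<= 1;
    return p;
}

/* Total bytes consumed by a ring of nbufs buffers of the requested size. */
static size_t ring_bytes(size_t request, size_t nbufs)
{
    return next_pow2(request) * nbufs;
}
```

Under these assumptions, `ring_bytes(16383 + SKB_OVERHEAD_GUESS, 256)` is 8 Mbytes and `ring_bytes(16383, 256)` is 4 Mbytes, which matches both the 4 Mbyte saving and the "50% less" in the title.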
  2. 16 Oct, 2010 1 commit
  3. 15 Oct, 2010 11 commits
  4. 14 Oct, 2010 4 commits
  5. 13 Oct, 2010 3 commits
  6. 12 Oct, 2010 11 commits
    • dccp: cosmetics - warning format · 2f34b329
      Gerrit Renker committed
      This omits the redundant "DCCP:" in warning messages, since DCCP_WARN() already
      echoes the function name, avoiding messages like
      
         kernel: [10988.766503] dccp_close: DCCP: ABORT -- 209 bytes unread
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
    • dccp: schedule an Ack when receiving timestamps · ecdfbdab
      Gerrit Renker committed
      This schedules an Ack when receiving a timestamp, exploiting the
      existing inet_csk_schedule_ack() function, saving one case in the
      `dccp_ack_pending()' function.
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
    • dccp: generalise data-loss condition · d196c9a5
      Ivo Calado committed
      This patch generalises the task of determining data loss from RFC 4340, 7.7.1.
      
      Let S_A, S_B be sequence numbers such that S_B is "after" S_A, and let
      N_B be the NDP count of packet S_B. Then, using modulo-2^48 arithmetic,
       D = S_B - S_A - 1  is an upper bound of the number of lost data packets,
       D - N_B            is an approximation of the number of lost data packets
                          (there are cases where this is not exact).
      
      The patch implements this as
       dccp_loss_count(S_A, S_B, N_B) := max(S_B - S_A - 1 - N_B, 0)
      Signed-off-by: Ivo Calado <ivocalado@embedded.ufcg.edu.br>
      Signed-off-by: Erivaldo Xavier <desadoc@gmail.com>
      Signed-off-by: Leandro Sales <leandroal@gmail.com>
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
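The formula above can be tried out in a small user-space sketch. This is an illustration of the arithmetic only, not the kernel's dccp_loss_count(): `SEQ48_MASK`, `seq_distance()` and `loss_count()` are hypothetical names, and the caller is assumed to pass an S_B that is "after" S_A.

```c
#include <stdint.h>

/* DCCP sequence numbers are 48 bits wide. */
#define SEQ48_MASK ((UINT64_C(1) << 48) - 1)

/* Distance from a to b in modulo-2^48 arithmetic. */
static uint64_t seq_distance(uint64_t a, uint64_t b)
{
    return (b - a) & SEQ48_MASK;
}

/*
 * max(S_B - S_A - 1 - N_B, 0), computed without going negative.
 * Assumes s_b is "after" s_a in sequence space.
 */
static uint64_t loss_count(uint64_t s_a, uint64_t s_b, uint64_t ndp)
{
    uint64_t dist = seq_distance(s_a, s_b);
    uint64_t gap = dist ? dist - 1 : 0;     /* D: upper bound on lost packets */

    return gap > ndp ? gap - ndp : 0;
}
```

For S_A = 10 and S_B = 15 there are four sequence numbers in between, so `loss_count(10, 15, 0)` gives 4, and an NDP count of 3 reduces the estimate to 1. The mask also makes the calculation work across the 2^48 wrap-around.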
    • dccp: remove unused argument in CCID tx function · baf9e782
      Gerrit Renker committed
      This removes the argument `more' from ccid_hc_tx_packet_sent, since it was
      not used anywhere in the code.
      
      (Btw, this argument was not even used in the original KAME code where the
       function initially came from; compare the variable moreToSend in the
       freebsd61-dccp-kame-28.08.2006.patch kept by Emmanuel Lochin.)
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
    • dccp: merge now-reduced connect_init() function · 93344af4
      Gerrit Renker committed
      After moving the assignment of GAR/ISS from dccp_connect_init() to
      dccp_transmit_skb(), the former function becomes very small, so that
      a merger with dccp_connect() suggests itself.
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
    • dccp: fix the adjustments to AWL and SWL · 0b53d460
      Gerrit Renker committed
      This fixes a problem and a potential loophole with regard to seqno/ackno
      validity: currently the initial adjustments to AWL/SWL are only performed
      once at the beginning of the connection, during the handshake.
      
      Since the Sequence Window feature is always greater than Wmin=32 (7.5.2),
      it is however necessary to perform these adjustments at least for the first
      W/W' (variables as per 7.5.1) packets in the lifetime of a connection.
      
      This requirement is complicated by the fact that W/W' can change at any time
      during the lifetime of a connection.
      
      Therefore it is better to perform that safety check each time SWL/AWL are
      updated, as implemented by the patch.
      
      A second problem solved by this patch is that the remote/local Sequence Window
      feature values (which set the bounds for AWL/SWL/SWH) are undefined until the
      feature negotiation has completed.
      
      During the initial handshake we have more stringent sequence number protection;
      the changes added by this patch ensure that {A,S}W{L,H} are within the correct
      bounds at the instant that feature negotiation completes (since the SeqWin
      feature activation handlers call dccp_update_gsr/gss()).
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
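The adjustment described above, clamping the lower window bounds to the initial sequence numbers on every update, can be sketched as follows. The formulas SWL = max(GSR + 1 - floor(W/4), ISR) and AWL = max(GSS - W' + 1, ISS) are taken from RFC 4340, section 7.5.1. The helper names are hypothetical and this is not the patch's code.

```c
#include <stdbool.h>
#include <stdint.h>

#define SEQ48_MASK ((UINT64_C(1) << 48) - 1)

static uint64_t add48(uint64_t a, uint64_t b) { return (a + b) & SEQ48_MASK; }
static uint64_t sub48(uint64_t a, uint64_t b) { return (a - b) & SEQ48_MASK; }

/* True if a comes strictly before b in circular 48-bit sequence space. */
static bool before48(uint64_t a, uint64_t b)
{
    uint64_t d = sub48(b, a);

    return d != 0 && d < (UINT64_C(1) << 47);
}

/* SWL = max(GSR + 1 - floor(W/4), ISR), re-evaluated on every GSR update. */
static uint64_t compute_swl(uint64_t gsr, uint64_t seq_win, uint64_t isr)
{
    uint64_t swl = sub48(add48(gsr, 1), seq_win / 4);

    return before48(swl, isr) ? isr : swl;
}

/* AWL = max(GSS - W' + 1, ISS), re-evaluated on every GSS update. */
static uint64_t compute_awl(uint64_t gss, uint64_t seq_win, uint64_t iss)
{
    uint64_t awl = add48(sub48(gss, seq_win), 1);

    return before48(awl, iss) ? iss : awl;
}
```

Early in a connection (for example GSR = 5 with W = 100 and ISR = 0) the unclamped SWL would wrap to a value before ISR, so the clamp returns ISR. Once GSR has advanced far enough, the clamp no longer has any effect.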
    • bnx2: Enable AER on PCIE devices only · c239f279
      Michael Chan committed
      To prevent an unnecessary error message. pci_save_state() is also moved to
      the end of ->probe() so that all PCI config, including AER state, will be
      saved.
      
      Update version to 2.0.18.
      Signed-off-by: Michael Chan <mchan@broadcom.com>
      Reviewed-by: Benjamin Li <mchan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bnx2: Update firmware to 6.0.x. · 22fa159d
      Michael Chan committed
      - Improved flow control and simplified interface
      - Use hardware RSS indirection table instead of the slower firmware-
        based table
      - Lower latency interrupt on 5709
      Signed-off-by: Michael Chan <mchan@broadcom.com>
      Reviewed-by: Benjamin Li <benli@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • neigh: reorder struct neighbour fields · e37ef961
      Eric Dumazet committed
      On Tuesday 12 October 2010 at 00:02 +0200, Eric Dumazet wrote:
      > Here is the followup patch.
      >
      > Thanks !
      >
      
      Oops, this was an old version; the up-to-date one also takes care of the
      "used" field.

      I guess it's time for some sleep, sorry again.
      
      [PATCH net-next V2] neigh: reorder struct neighbour fields
      
      (refcnt) and (ha_lock, ha, used, dev, output, ops, primary_key) should
      be placed on separate cache lines.

      refcnt can be written often, while the other fields are mostly read.

      This gave me good results on a stress test:
      
      before:
      
      real    0m45.570s
      user    0m15.525s
      sys     9m56.669s
      
      After:
      
      real    0m41.841s
      user    0m15.261s
      sys     8m45.949s
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
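Separating a frequently written counter from read-mostly fields can be illustrated with C11 alignment specifiers. The struct below is a hypothetical stand-in, not the real struct neighbour, and the 64-byte line size is an assumption.

```c
#include <stdalign.h>
#include <stddef.h>

/* Assumption: 64-byte cache lines. */
#define CACHE_LINE_BYTES 64

struct neigh_sketch {
    /* Read-mostly fields, grouped together. */
    unsigned char ha[32];
    unsigned long used;
    void *dev;
    /* Frequently written counter, forced onto the start of a new line. */
    alignas(CACHE_LINE_BYTES) int refcnt;
};

/* Index of the cache line an offset falls in, relative to the struct start. */
static size_t line_of(size_t offset)
{
    return offset / CACHE_LINE_BYTES;
}
```

Because the member alignment raises the alignment of the whole struct, statically or automatically allocated instances start on a line boundary. Heap instances would need an aligned allocator for the same guarantee. Writers bumping `refcnt` then no longer invalidate the line that readers of `ha` and `dev` depend on.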
    • net dst: use a percpu_counter to track entries · fc66f95c
      Eric Dumazet committed
      struct dst_ops tracks the number of allocated dst entries in an atomic_t field,
      subject to high cache line contention under stress workloads.

      Switch to a percpu_counter to reduce the number of times we need to dirty a
      central location. Place it on a separate cache line to avoid dirtying
      read-only fields.
      
      Stress test :
      
      (Sending 160.000.000 UDP frames,
      IP route cache disabled, dual E5540 @2.53GHz,
      32bit kernel, FIB_TRIE, SLUB/NUMA)
      
      Before:
      
      real    0m51.179s
      user    0m15.329s
      sys     10m15.942s
      
      After:
      
      real	0m45.570s
      user	0m15.525s
      sys	9m56.669s
      
      With a small reordering of struct neighbour fields, subject of a
      following patch, (to separate refcnt from other read mostly fields)
      
      real	0m41.841s
      user	0m15.261s
      sys	8m45.949s
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
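The batching idea behind a per-CPU counter can be modelled in a few lines. This is a single-threaded sketch with hypothetical names (`sketch_counter`, `NSLOTS`, `BATCH`), not the kernel's percpu_counter implementation. It has no locking and no real per-CPU storage, and the slot count and batch size are arbitrary.

```c
#define NSLOTS 4    /* stands in for the number of CPUs */
#define BATCH  32   /* local delta at which a slot is folded into the total */

struct sketch_counter {
    long global;            /* central value, touched rarely */
    long local[NSLOTS];     /* per-slot deltas, touched on every update */
};

/* Add to one slot; fold into the central value only when the delta is large. */
static void counter_add(struct sketch_counter *c, int slot, long amount)
{
    long v = c->local[slot] + amount;

    if (v >= BATCH || v <= -BATCH) {
        c->global += v;
        c->local[slot] = 0;
    } else {
        c->local[slot] = v;
    }
}

/* Cheap approximate read: central value only. */
static long counter_read_fast(const struct sketch_counter *c)
{
    return c->global;
}

/* Exact read: central value plus every outstanding local delta. */
static long counter_sum(const struct sketch_counter *c)
{
    long total = c->global;

    for (int i = 0; i < NSLOTS; i++)
        total += c->local[i];
    return total;
}
```

After 40 single increments on one slot, the central value has been written once (at 32) instead of 40 times. The fast read reports 32 while the exact sum reports 40, which is the accuracy traded for fewer writes to the shared location.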
    • neigh: Protect neigh->ha[] with a seqlock · 0ed8ddf4
      Eric Dumazet committed
      Add a seqlock in struct neighbour to protect neigh->ha[], and avoid
      dirtying the neighbour in stress situations (many different flows / dsts).

      Dirtying takes place because of read_lock(&n->lock) and n->used writes.

      Switching to a seqlock, and writing n->used only when jiffies changes,
      reduces dirtying.
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
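The reason a seqlock lets readers avoid dirtying the structure can be seen in a minimal user-space sketch using C11 atomics. It is not the kernel's seqlock_t: the names are hypothetical, only a single writer at a time is assumed (writers must be serialised by other means), and the 6-byte address length is an assumption for the example.

```c
#include <stdatomic.h>

#define HA_LEN 6    /* assumption: Ethernet-sized hardware address */

struct ha_seq {
    atomic_uint seq;                    /* odd while a write is in progress */
    _Atomic unsigned char ha[HA_LEN];   /* the protected address bytes */
};

static void ha_init(struct ha_seq *s)
{
    atomic_init(&s->seq, 0);
    for (int i = 0; i < HA_LEN; i++)
        atomic_init(&s->ha[i], 0);
}

/* Writer: bump to odd, update the data, bump to even. Single writer only. */
static void ha_write(struct ha_seq *s, const unsigned char *src)
{
    unsigned start = atomic_load_explicit(&s->seq, memory_order_relaxed);

    atomic_store_explicit(&s->seq, start + 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
    for (int i = 0; i < HA_LEN; i++)
        atomic_store_explicit(&s->ha[i], src[i], memory_order_relaxed);
    atomic_store_explicit(&s->seq, start + 2, memory_order_release);
}

/* Reader: never writes shared memory; retries if a write overlapped the copy. */
static void ha_read(struct ha_seq *s, unsigned char *dst)
{
    unsigned before, after;

    do {
        before = atomic_load_explicit(&s->seq, memory_order_acquire);
        for (int i = 0; i < HA_LEN; i++)
            dst[i] = atomic_load_explicit(&s->ha[i], memory_order_relaxed);
        atomic_thread_fence(memory_order_acquire);
        after = atomic_load_explicit(&s->seq, memory_order_relaxed);
    } while ((before & 1u) || before != after);
}
```

Unlike a reader taking a read lock, `ha_read()` performs only loads, so concurrent readers on different CPUs can keep the cache line in a shared state. The cost is that a reader may have to repeat its copy when it races with a writer.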